    Feat/model registry (#51) · 8d722b45
    Philipp authored
* Fixes #26: A Model is now retrieved via the backends.get_model_for(model_spec) method, which performs a unification operation against the existing ModelSpecs in a model registry. The first unifying model spec is returned; if none unifies, the given spec is retained. A ModelSpec must define a "backend" (name) so that a Model can be loaded; the backend name must match a backend file like <name>_api.py.
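    A minimal sketch of how such a lookup by unification could work, assuming a list-of-objects model_registry.json; the ModelSpec.unify helper and the backend module's get_model_for function are illustrative assumptions, not the verbatim project code:

    ```python
    import importlib
    import json


    class ModelSpec(dict):
        """Illustrative spec type; 'model_name', 'backend', etc. are plain dict entries."""

        def unify(self, other: dict) -> "ModelSpec":
            """Merge with `other` if all shared keys agree, otherwise raise."""
            merged = dict(other)
            for key, value in self.items():
                if key in other and other[key] != value:
                    raise ValueError(f"conflict on '{key}'")
                merged[key] = value
            return ModelSpec(merged)


    def get_model_for(model_spec: ModelSpec, registry_path: str = "model_registry.json"):
        with open(registry_path) as f:
            registry = [ModelSpec(entry) for entry in json.load(f)]
        for entry in registry:
            try:
                spec = model_spec.unify(entry)  # first unifying registry entry wins
                break
            except ValueError:
                continue
        else:
            spec = model_spec  # nothing unified: retain the spec as given
        if "backend" not in spec:
            raise ValueError("ModelSpec must define a 'backend' to load a Model")
        backend = importlib.import_module(f"{spec['backend']}_api")  # lazy-load <name>_api.py
        return backend.get_model_for(spec)
    ```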
    
    Changes:
    - Backends now provide Models
    - Models are defined by ModelSpecs
    - ModelSpecs are described in a model_registry.json (an example entry is shown after this list)
    - generation arguments (temperature, max_tokens) are attached directly to the Model
    - backends are now lazily loaded; the backend to use must be specified in the ModelSpec
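    For illustration, a registry entry might look like the following (only model_name and backend are taken from this commit message; any further fields are project-specific):

    ```json
    [
      {
        "model_name": "gpt3-turbo",
        "backend": "openai"
      }
    ]
    ```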
    
    New Feature:
    - the benchmark now tries to parse the -m option as JSON to create a ModelSpec (plain model names still work)
    - this looks like: python3 scripts/cli.py run -g taboo -m "{'model_name':'gpt3-turbo','backend':'openai'}"
    - note that single quotes must be used (they are replaced to produce proper JSON); a parsing sketch follows below
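    A hedged sketch of that parsing step (the actual CLI code may differ):

    ```python
    import json


    def parse_model_arg(value: str) -> dict:
        """Parse -m as a ModelSpec: either a JSON-like dict string or a plain model name."""
        if value.strip().startswith("{"):
            # single quotes from the shell command line are replaced so that
            # json.loads receives proper JSON
            return json.loads(value.replace("'", '"'))
        return {"model_name": value}


    print(parse_model_arg("{'model_name':'gpt3-turbo','backend':'openai'}"))
    # {'model_name': 'gpt3-turbo', 'backend': 'openai'}
    ```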
    
    Aside:
    - adjusted all pre-defined benchmark games to the newly introduced classes
    - removed text-davinci-003 (no longer listed at https://api.openai.com/v1/models)
    - prototyping: additional model specs can be defined in a model_registry_custom.json (not version controlled)
    - prototyping: use model.set_gen_args(arg0=1, arg1=2) to set generation arguments (a usage sketch follows below)
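    A usage sketch for this prototyping workflow; the exact call signatures and argument names are assumptions based on the description above:

    ```python
    import backends  # module assumed from backends.get_model_for above

    # unify against model_registry.json (and model_registry_custom.json, if present)
    model = backends.get_model_for({"model_name": "gpt3-turbo", "backend": "openai"})

    # attach generation arguments directly to the Model
    model.set_gen_args(temperature=0.0, max_tokens=100)
    ```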