libs/ai-endpoints/v0.1.0
What's Changed
Full Changelog: libs/ai-endpoints/v0.0.20...libs/ai-endpoints/v0.1.0
this is version 0.1.0 of the connectors with two primary changes -
- [user visible] deprecated or unavailable functionality removed
- [not user visible] use of the api catalog (integrate.api.nvidia.com and ai.api.nvidia.com) instead of nvcf (api.nvcf.nvidia.com) for inference
all playground_* model endpoints have been decommissioned.
functionality removed -
- models: playground_mamba_chat, playground_smaug_72b, playground_nemotron_qa_8b, playground_nemotron_steerlm_8b, playground_steerlm_llama_70b, playground_yi_34b, playground_nvolveqa_40k
- methods: available_functions, get_available_functions, get_model_details, get_binding_model, mode, validate_model, validate_base_url, reset_method_cache, validate_client, aifm_deprecated, aifm_bad_deprecated, aifm_labels_deprecated, custom_preprocess, preprocess_msg, custom_postprocess, get_generation, get_stream, get_astream, get_payload, prep_payload, prep_msg, deprecate_max_length
- properties: ChatNVIDIA.bad, ChatNVIDIA.labels, ChatNVIDIA.infer_endpoint, ChatNVIDIA.client, ChatNVIDIA.streaming, NVIDIAEmbeddings.infer_endpoint, NVIDIAEmbeddings.client, NVIDIAEmbeddings.max_length
functionality deprecated -
- NVIDIAEmbeddings.model_type: instead of setting model_type="query" or model_type="passage", use NVIDIAEmbeddings.embed_query and NVIDIAEmbeddings.embed_documents (see the sketch below)
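a minimal sketch of the replacement pattern (the model name and input strings here are illustrative, not part of this release):

```python
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings

# before (deprecated): choosing the embedding mode at construction time
# embedder = NVIDIAEmbeddings(model="NV-Embed-QA", model_type="query")

# after: one instance, pick the mode per call via the standard LangChain methods
embedder = NVIDIAEmbeddings(model="NV-Embed-QA")
query_vector = embedder.embed_query("what is a NIM?")  # query-style embedding
doc_vectors = embedder.embed_documents(["NIMs are inference microservices."])  # passage-style embeddings
```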
migration guide -
- `ChatNVIDIA().mode("nim", base_url="http://...")` must use `ChatNVIDIA(base_url="http://...")`
- `NVIDIAEmbeddings().mode("nim", base_url="http://...")` must use `NVIDIAEmbeddings(base_url="http://...")`
- `NVIDIARerank().mode("nim", base_url="http://...")` must use `NVIDIARerank(base_url="http://...")` (see the sketch after the compatibility note below)
- compatibility for the playground_nvolveqa_40k (aka nvolveqa_40k) model. when specifying model="nvolveqa_40k", the NV-Embed-QA model will be used and truncate="END" will be set. note: the old nvolveqa_40k model endpoint would silently truncate input, while all available endpoints raise an error instead of truncating.
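a sketch of the mode() migration above, assuming a self-hosted NIM at a placeholder base_url:

```python
from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings, NVIDIARerank

# before (removed in 0.1.0): switching to a self-hosted endpoint via .mode()
# llm = ChatNVIDIA().mode("nim", base_url="http://localhost:8000/v1")

# after: pass base_url directly to the constructor
llm = ChatNVIDIA(base_url="http://localhost:8000/v1")
embedder = NVIDIAEmbeddings(base_url="http://localhost:8000/v1")
reranker = NVIDIARerank(base_url="http://localhost:8000/v1")
```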
model migration - specifying one of the following models will raise a warning and the listed alternative will be used (see the example after the table) -
| model | alternative |
|---|---|
| playground_llama2_13b | meta/llama2-70b |
| playground_llama2_code_13b | meta/codellama-70b |
| playground_llama2_code_34b | meta/codellama-70b |
| playground_nv_llama2_rlhf_70b | meta/llama2-70b |
| playground_phi2 | microsoft/phi-3-mini-4k-instruct |
| playground_llama2_code_70b | meta/codellama-70b |
| playground_gemma_2b | google/gemma-2b |
| playground_gemma_7b | google/gemma-7b |
| playground_llama2_70b | meta/llama2-70b |
| playground_mistral_7b | mistralai/mistral-7b-instruct-v0.2 |
| playground_mixtral_8x7b | mistralai/mixtral-8x7b-instruct-v0.1 |
| playground_deplot | google/deplot |
| playground_fuyu_8b | adept/fuyu-8b |
| playground_kosmos_2 | microsoft/kosmos-2 |
| playground_neva_22b | nvidia/neva-22b |
| playground_nvolveqa_40k | NV-Embed-QA |
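a hedged illustration of the table above (the exact warning text is not reproduced here):

```python
from langchain_nvidia_ai_endpoints import ChatNVIDIA

# requesting a decommissioned playground model emits a warning and the
# listed alternative is used for inference instead
llm = ChatNVIDIA(model="playground_llama2_70b")  # warns, serves meta/llama2-70b
```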