
libs/ai-endpoints/v0.1.0

@github-actions github-actions released this 31 May 15:24
· 397 commits to main since this release
2719ca5

What's Changed

  • update to 0.1, remove deprecated functionality and focus on api catalog backend by @mattf in #48

Full Changelog: libs/ai-endpoints/v0.0.20...libs/ai-endpoints/v0.1.0

this is version 0.1.0 of the connectors with two primary changes -

  1. [user visible] deprecated or unavailable functionality removed
  2. [not user visible] use of api catalog (integrate.api.nvidia.com and ai.api.nvidia.com) instead of nvcf (api.nvcf.nvidia.com) for inference

all playground_* model endpoints have been decommissioned.

functionality removed -

  • models: playground_mamba_chat, playground_smaug_72b, playground_nemotron_qa_8b, playground_nemotron_steerlm_8b, playground_steerlm_llama_70b, playground_yi_34b, playground_nvolveqa_40k
  • methods: available_functions, get_available_functions, get_model_details, get_binding_model, mode, validate_model, validate_base_url, reset_method_cache, validate_client, aifm_deprecated, aifm_bad_deprecated, aifm_labels_deprecated, custom_preprocess, preprocess_msg, custom_postprocess, get_generation, get_stream, get_astream, get_payload, prep_payload, prep_msg, deprecate_max_length
  • properties: ChatNVIDIA.bad, ChatNVIDIA.labels, ChatNVIDIA.infer_endpoint, ChatNVIDIA.client, ChatNVIDIA.streaming, NVIDIAEmbeddings.infer_endpoint, NVIDIAEmbeddings.client, NVIDIAEmbeddings.max_length

functionality deprecated -

  • NVIDIAEmbeddings.model_type, instead of setting model_type="query" or "passage" use NVIDIAEmbeddings.embed_query and NVIDIAEmbeddings.embed_documents
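a minimal sketch of the replacement pattern: call embed_documents for passages and embed_query for the query instead of switching model_type. the `Embedder` protocol and the `embed_for_retrieval` helper are hypothetical names used only for illustration; any object exposing the two methods, such as an NVIDIAEmbeddings instance, should fit.

```python
from typing import List, Protocol, Tuple


class Embedder(Protocol):
    """Hypothetical structural type: anything with the two embedding methods."""

    def embed_query(self, text: str) -> List[float]: ...

    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...


def embed_for_retrieval(
    embedder: Embedder, passages: List[str], question: str
) -> Tuple[List[List[float]], List[float]]:
    """Embed passages and a query without touching model_type."""
    # Previously: model_type="passage" followed by an embedding call.
    passage_vectors = embedder.embed_documents(passages)
    # Previously: model_type="query" followed by an embedding call.
    query_vector = embedder.embed_query(question)
    return passage_vectors, query_vector
```

with the package installed, the helper would be called as `embed_for_retrieval(NVIDIAEmbeddings(), passages, question)`.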

migration guide -

  • ChatNVIDIA().mode("nim", base_url="http://...") must use ChatNVIDIA(base_url="http://...")
  • NVIDIAEmbeddings().mode("nim", base_url="http://...") must use NVIDIAEmbeddings(base_url="http://...")
  • NVIDIARerank().mode("nim", base_url="http://...") must use NVIDIARerank(base_url="http://...")
  • compatibility is kept for the playground_nvolveqa_40k (aka nvolveqa_40k) model. when specifying model="nvolveqa_40k", the NV-Embed-QA model will be used and truncate="END" will be set. note: the old nvolveqa_40k model endpoint silently truncated input, while all available endpoints raise an error instead of truncating.
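the constructor changes above can be sketched as follows. this assumes the langchain-nvidia-ai-endpoints package at 0.1.0 or later is installed and that a NIM is reachable at the given address; `NIM_BASE_URL` and `connect_to_nim` are hypothetical names, and the URL is a placeholder rather than a real endpoint.

```python
# Hypothetical placeholder; replace with the address of your own NIM.
NIM_BASE_URL = "http://localhost:8000/v1"


def connect_to_nim(base_url: str = NIM_BASE_URL):
    """Build the three connectors against a self-hosted endpoint.

    The import is deferred so that this module still loads where the
    package is not installed.
    """
    from langchain_nvidia_ai_endpoints import (
        ChatNVIDIA,
        NVIDIAEmbeddings,
        NVIDIARerank,
    )

    # Before 0.1.0: ChatNVIDIA().mode("nim", base_url=base_url)
    chat = ChatNVIDIA(base_url=base_url)
    # Before 0.1.0: NVIDIAEmbeddings().mode("nim", base_url=base_url)
    embedder = NVIDIAEmbeddings(base_url=base_url)
    # Before 0.1.0: NVIDIARerank().mode("nim", base_url=base_url)
    reranker = NVIDIARerank(base_url=base_url)
    return chat, embedder, reranker
```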

model migration, the following models will raise a warning and use an alternative (model changes in italics) -

model                          alternative
playground_llama2_13b          meta/llama2-70b
playground_llama2_code_13b     meta/codellama-70b
playground_llama2_code_34b     meta/codellama-70b
playground_nv_llama2_rlhf_70b  meta/llama2-70b
playground_phi2                microsoft/phi-3-mini-4k-instruct
playground_llama2_code_70b     meta/codellama-70b
playground_gemma_2b            google/gemma-2b
playground_gemma_7b            google/gemma-7b
playground_llama2_70b          meta/llama2-70b
playground_mistral_7b          mistralai/mistral-7b-instruct-v0.2
playground_mixtral_8x7b        mistralai/mixtral-8x7b-instruct-v0.1
playground_deplot              google/deplot
playground_fuyu_8b             adept/fuyu-8b
playground_kosmos_2            microsoft/kosmos-2
playground_neva_22b            nvidia/neva-22b
playground_nvolveqa_40k        NV-Embed-QA
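
the warn-and-substitute behavior described for these models can be illustrated with a small lookup. this is a sketch of the idea only, not the connectors' actual implementation; `PLAYGROUND_ALTERNATIVES` and `resolve_model` are hypothetical names, and the warning text is invented for the example.

```python
import warnings

# Mapping copied from the table above: old playground name -> alternative.
PLAYGROUND_ALTERNATIVES = {
    "playground_llama2_13b": "meta/llama2-70b",
    "playground_llama2_code_13b": "meta/codellama-70b",
    "playground_llama2_code_34b": "meta/codellama-70b",
    "playground_nv_llama2_rlhf_70b": "meta/llama2-70b",
    "playground_phi2": "microsoft/phi-3-mini-4k-instruct",
    "playground_llama2_code_70b": "meta/codellama-70b",
    "playground_gemma_2b": "google/gemma-2b",
    "playground_gemma_7b": "google/gemma-7b",
    "playground_llama2_70b": "meta/llama2-70b",
    "playground_mistral_7b": "mistralai/mistral-7b-instruct-v0.2",
    "playground_mixtral_8x7b": "mistralai/mixtral-8x7b-instruct-v0.1",
    "playground_deplot": "google/deplot",
    "playground_fuyu_8b": "adept/fuyu-8b",
    "playground_kosmos_2": "microsoft/kosmos-2",
    "playground_neva_22b": "nvidia/neva-22b",
    "playground_nvolveqa_40k": "NV-Embed-QA",
}


def resolve_model(name: str) -> str:
    """Return the alternative for a decommissioned name, warning the caller."""
    alternative = PLAYGROUND_ALTERNATIVES.get(name)
    if alternative is None:
        # Not a decommissioned playground_* name; pass it through unchanged.
        return name
    warnings.warn(
        f"{name} has been decommissioned; using {alternative} instead",
        UserWarning,
        stacklevel=2,
    )
    return alternative
```

a lookup like this can also be run over existing configuration ahead of an upgrade to find model names that would be substituted.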