Release libs/ai-endpoints/v0.1.0 · langchain-ai/langchain-nvidia

What's Changed

update to 0.1, remove deprecated functionality and focus on api catalog backend by @mattf in #48

Full Changelog: libs/ai-endpoints/v0.0.20...libs/ai-endpoints/v0.1.0

this is version 0.1.0 of the connectors with two primary changes -

[user visible] deprecated or unavailable functionality removed
[not user visible] use of api catalog (integrate.api.nvidia.com and ai.api.nvidia.com) instead of nvcf (api.nvcf.nvidia.com) for inference

all playground_* model endpoints have been decommissioned.

functionality removed -

models: playground_mamba_chat, playground_smaug_72b, playground_nemotron_qa_8b, playground_nemotron_steerlm_8b, playground_steerlm_llama_70b, playground_yi_34b, playground_nvolveqa_40k
methods: available_functions, get_available_functions, get_model_details, get_binding_model, mode, validate_model, validate_base_url, reset_method_cache, validate_client, aifm_deprecated, aifm_bad_deprecated, aifm_labels_deprecated, custom_preprocess, preprocess_msg, custom_postprocess, get_generation, get_stream, get_astream, get_payload, prep_payload, prep_msg, deprecate_max_length
properties: ChatNVIDIA.bad, ChatNVIDIA.labels, ChatNVIDIA.infer_endpoint, ChatNVIDIA.client, ChatNVIDIA.streaming, NVIDIAEmbeddings.infer_endpoint, NVIDIAEmbeddings.client, NVIDIAEmbeddings.max_length

functionality deprecated -

NVIDIAEmbeddings.model_type: instead of setting model_type="query" or model_type="passage", use NVIDIAEmbeddings.embed_query and NVIDIAEmbeddings.embed_documents, respectively
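
a minimal sketch of the replacement pattern for model_type. the helper name embed_for_retrieval and the Embedder protocol are hypothetical, introduced only for illustration; the sketch assumes nothing about the embedder beyond the embed_query and embed_documents methods named above.

```python
from typing import List, Protocol, Tuple


class Embedder(Protocol):
    """Anything exposing the two methods named in the deprecation note."""

    def embed_query(self, text: str) -> List[float]: ...

    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...


def embed_for_retrieval(
    embedder: Embedder, passages: List[str], query: str
) -> Tuple[List[List[float]], List[float]]:
    """Embed passages and a query without touching model_type.

    Hypothetical helper: the method that is called selects the behaviour,
    so no model_type="passage" / model_type="query" is needed.
    """
    # Previously: model_type="passage"
    passage_vectors = embedder.embed_documents(passages)
    # Previously: model_type="query"
    query_vector = embedder.embed_query(query)
    return passage_vectors, query_vector
```

with the connector installed, an NVIDIAEmbeddings instance (for example NVIDIAEmbeddings(model="NV-Embed-QA")) should be usable as the embedder argument, since it provides both methods.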

migration guide -

ChatNVIDIA().mode("nim", base_url="http://...") becomes ChatNVIDIA(base_url="http://...")
NVIDIAEmbeddings().mode("nim", base_url="http://...") becomes NVIDIAEmbeddings(base_url="http://...")
NVIDIARerank().mode("nim", base_url="http://...") becomes NVIDIARerank(base_url="http://...")
compatibility for the playground_nvolveqa_40k (aka nvolveqa_40k) model: when specifying model="nvolveqa_40k", the NV-Embed-QA model will be used and truncate="END" will be set. note: the old nvolveqa_40k model endpoint would silently truncate input, while all available endpoints raise an error instead of truncating.
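
the constructor-based form from the migration guide can be sketched as below. the function name make_nim_clients is hypothetical, and the sketch assumes base_url is the only argument your deployment needs; the import is done lazily so the snippet can be loaded without the package installed.

```python
def make_nim_clients(base_url: str):
    """Build the three connectors against one endpoint (hypothetical helper).

    Replaces the removed pattern X().mode("nim", base_url=...) with
    X(base_url=...), as described in the migration guide.
    """
    # Imported here so that merely defining the helper has no dependency.
    from langchain_nvidia_ai_endpoints import (
        ChatNVIDIA,
        NVIDIAEmbeddings,
        NVIDIARerank,
    )

    chat = ChatNVIDIA(base_url=base_url)
    embeddings = NVIDIAEmbeddings(base_url=base_url)
    reranker = NVIDIARerank(base_url=base_url)
    return chat, embeddings, reranker
```

pass the address of your own endpoint as base_url. any other constructor arguments (model name, credentials) depend on the deployment and are left out of this sketch.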

model migration: the following models will raise a warning and use an alternative (in several cases the alternative is a different model, not just a renamed endpoint) -

model	alternative
playground_llama2_13b	meta/llama2-70b
playground_llama2_code_13b	meta/codellama-70b
playground_llama2_code_34b	meta/codellama-70b
playground_nv_llama2_rlhf_70b	meta/llama2-70b
playground_phi2	microsoft/phi-3-mini-4k-instruct
playground_llama2_code_70b	meta/codellama-70b
playground_gemma_2b	google/gemma-2b
playground_gemma_7b	google/gemma-7b
playground_llama2_70b	meta/llama2-70b
playground_mistral_7b	mistralai/mistral-7b-instruct-v0.2
playground_mixtral_8x7b	mistralai/mixtral-8x7b-instruct-v0.1
playground_deplot	google/deplot
playground_fuyu_8b	adept/fuyu-8b
playground_kosmos_2	microsoft/kosmos-2
playground_neva_22b	nvidia/neva-22b
playground_nvolveqa_40k	NV-Embed-QA
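
the connectors are described as applying this mapping themselves. for updating names in your own configuration ahead of time, the table can be written as a lookup. this is a sketch only: the names PLAYGROUND_ALTERNATIVES and resolve_model are hypothetical and are not part of the package.

```python
import warnings

# Old playground_* name -> alternative, copied from the table above.
PLAYGROUND_ALTERNATIVES = {
    "playground_llama2_13b": "meta/llama2-70b",
    "playground_llama2_code_13b": "meta/codellama-70b",
    "playground_llama2_code_34b": "meta/codellama-70b",
    "playground_nv_llama2_rlhf_70b": "meta/llama2-70b",
    "playground_phi2": "microsoft/phi-3-mini-4k-instruct",
    "playground_llama2_code_70b": "meta/codellama-70b",
    "playground_gemma_2b": "google/gemma-2b",
    "playground_gemma_7b": "google/gemma-7b",
    "playground_llama2_70b": "meta/llama2-70b",
    "playground_mistral_7b": "mistralai/mistral-7b-instruct-v0.2",
    "playground_mixtral_8x7b": "mistralai/mixtral-8x7b-instruct-v0.1",
    "playground_deplot": "google/deplot",
    "playground_fuyu_8b": "adept/fuyu-8b",
    "playground_kosmos_2": "microsoft/kosmos-2",
    "playground_neva_22b": "nvidia/neva-22b",
    "playground_nvolveqa_40k": "NV-Embed-QA",
}


def resolve_model(name: str) -> str:
    """Return the alternative for an old playground_* name, warning once.

    Names that are not in the table are returned unchanged.
    """
    alternative = PLAYGROUND_ALTERNATIVES.get(name)
    if alternative is None:
        return name
    warnings.warn(f"{name} is no longer available, use {alternative}", stacklevel=2)
    return alternative
```

for example, resolve_model("playground_phi2") returns "microsoft/phi-3-mini-4k-instruct" and emits a warning, while a name that is not in the table is passed through untouched.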