This project uses Llama-3.2-11b-vision-preview by integrating it with ChatGroq for advanced vision-model applications, and showcases its capabilities through practical implementation and testing.
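A minimal sketch of the integration, assuming Groq's OpenAI-compatible chat endpoint (`/openai/v1/chat/completions`) and a `GROQ_API_KEY` environment variable; the helper names here are illustrative, not part of any library. The image is inlined as a base64 data URL alongside the text prompt:

```python
import base64
import json
import os
import urllib.request

# Model name from this write-up; endpoint is Groq's OpenAI-compatible chat API.
MODEL = "llama-3.2-11b-vision-preview"
API_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_vision_payload(prompt: str, image_bytes: bytes, mime: str = "image/jpeg") -> dict:
    """Package a text prompt plus an inline base64 image into one chat request body."""
    data_url = f"data:{mime};base64,{base64.b64encode(image_bytes).decode('ascii')}"
    return {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
    }

def ask_model(payload: dict) -> str:
    """POST the payload to Groq; requires GROQ_API_KEY in the environment."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The same message shape works through LangChain's `ChatGroq` wrapper, which accepts the identical `text` / `image_url` content parts inside a `HumanMessage`.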
Llama 3.2-Vision is intended for commercial and research use. Instruction-tuned models are intended for visual recognition, image reasoning, captioning, and assistant-like chat with images, whereas pretrained models can be adapted for a variety of image-reasoning tasks.
- Visual Question Answering (VQA) and Visual Reasoning:
VQA lets the model look at an image and answer natural-language questions about it, combining visual perception with reasoning over what the picture shows.
- Image Captioning:
Image captioning bridges the gap between vision and language: the model extracts details, understands the scene, and then crafts a sentence or two that tells the story.
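The captioning flow above can be steered entirely through the prompt. A small sketch, assuming the OpenAI-style content format these vision endpoints accept; the instruction wording and helper name are illustrative. Folding the captioning instructions into the user turn (rather than a separate system message) keeps the request compatible with vision endpoints that restrict system prompts alongside images:

```python
import base64

def build_caption_messages(image_bytes: bytes, mime: str = "image/png") -> list:
    """Build an OpenAI-style message list that asks for a short image caption."""
    # Instructions mirror the captioning pipeline: extract details, understand
    # the scene, then write a compact caption.
    instruction = (
        "Identify the key objects, the scene, and any action in this image, "
        "then write a one- or two-sentence caption."
    )
    data_url = f"data:{mime};base64,{base64.b64encode(image_bytes).decode('ascii')}"
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": instruction},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }]
```

The returned list drops straight into the `messages` field of a chat-completions request, or into a `ChatGroq` call via LangChain's message wrappers.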