Use Meta Llama 3.2 90B Vision and 11B Vision in OCI Generative AI

OCI Generative AI now supports Meta's 90-billion-parameter and 11-billion-parameter Llama 3.2 vision models, which combine advanced image and text understanding in a single multimodal model.

Key Highlights
  • Both models support visual understanding and image recognition tasks.
  • Both models offer a 128,000-token context length.
  • Llama 3.2 90B Vision includes the text-based capabilities of the previous Llama 3.1 70B model.
  • Llama 3.2 11B Vision provides robust multimodal capabilities in a more compact form.
  • Both models are available for dedicated hosting, with Llama 3.2 90B Vision also offered for on-demand inference (see the serving-mode sketch after this list).
  • Llama 3.1 70B remains available for on-demand, dedicated hosting, and fine-tuning.
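
In the inference API, these hosting options correspond to serving modes. The snippet below is a minimal sketch assuming a recent OCI Python SDK (oci package) and its generative_ai_inference models; the model name and endpoint OCID are illustrative placeholders, not confirmed values.

    from oci.generative_ai_inference import models

    # On-demand inference references the pretrained model directly.
    # The model ID is an assumed name; check the service's model list.
    on_demand = models.OnDemandServingMode(
        model_id="meta.llama-3.2-90b-vision-instruct"
    )

    # Dedicated hosting references the OCID of your own hosting endpoint.
    dedicated = models.DedicatedServingMode(
        endpoint_id="ocid1.generativeaiendpoint.oc1..example"
    )
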
Available Regions

US Midwest (Chicago), UK South (London), and Brazil East (Sao Paulo). Use of these models is restricted in the European Union; see the note that follows.

New Features and Improvements
  • Multimodal support: Text and image as inputs with text output
    • Image understanding geared to enterprise use cases, such as interpreting charts and graphs
    • Advanced image captioning and visual grounding capabilities
  • Multilingual support for text-only queries in eight languages: English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai

You can now use these multimodal models without managing any infrastructure. Access them through the chat interface, the API, or dedicated endpoints; a minimal API sketch follows.
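
The following sketch shows what a chat call pairing an image with a text prompt might look like, assuming a recent OCI Python SDK with its generative_ai_inference client. The region endpoint, model ID, compartment OCID, image path, and response access path are illustrative assumptions, not confirmed values.

    import base64

    import oci
    from oci.generative_ai_inference import GenerativeAiInferenceClient, models

    # Authenticate from the default config file (~/.oci/config).
    config = oci.config.from_file()

    # Assumed region endpoint; use a region where the model is offered.
    client = GenerativeAiInferenceClient(
        config,
        service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    )

    # Encode a local image (placeholder path) as a base64 data URL.
    with open("chart.png", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    # Build a multimodal message: text and image in, text out.
    chat_request = models.GenericChatRequest(
        messages=[
            models.UserMessage(
                content=[
                    models.TextContent(text="Summarize the trend in this chart."),
                    models.ImageContent(
                        image_url=models.ImageUrl(
                            url=f"data:image/png;base64,{image_b64}"
                        )
                    ),
                ]
            )
        ],
        max_tokens=500,
    )

    response = client.chat(
        models.ChatDetails(
            compartment_id="ocid1.compartment.oc1..example",  # placeholder OCID
            serving_mode=models.OnDemandServingMode(
                model_id="meta.llama-3.2-90b-vision-instruct"  # assumed model name
            ),
            chat_request=chat_request,
        )
    )

    # Response shape is an assumption; inspect response.data for the exact structure.
    print(response.data.chat_response.choices[0].message.content[0].text)

For a dedicated endpoint, swap the serving mode for the DedicatedServingMode shown earlier; the rest of the call is unchanged.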

Important Note: Per Meta's Llama 3.2 Acceptable Use Policy, restrictions apply to individuals and companies in the European Union regarding use of the Llama 3.2 multimodal models. Review the full policy for details.

For a list of offered models, see Pretrained Foundational Models in Generative AI. For information about the service, see the Generative AI documentation.