Use Meta Llama 3.2 90B Vision and 11B Vision in OCI Generative AI

OCI Generative AI now supports Meta's 90-billion-parameter and 11-billion-parameter Llama 3.2 vision models, which combine advanced image and text understanding in a single multimodal model.

Key Highlights
  • Both models support visual understanding and image recognition tasks.
  • Both models offer a 128,000-token context length.
  • Llama 3.2 90B Vision includes the text-based capabilities of the previous Llama 3.1 70B model.
  • Llama 3.2 11B Vision provides robust multimodal capabilities in a more compact form.
  • Both models are available for dedicated hosting, with Llama 3.2 90B Vision also offered for on-demand inference (see the serving-mode sketch after this list).
  • Llama 3.1 70B remains available for on-demand, dedicated hosting, and fine-tuning.
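
In the inference API, these hosting options correspond to serving modes. The snippet below is a minimal sketch assuming a recent OCI Python SDK (oci package) and its generative_ai_inference models; the model name and endpoint OCID are illustrative placeholders, not confirmed values.

    from oci.generative_ai_inference import models

    # On-demand inference references the pretrained model directly.
    # The model ID is an assumed name; check the service's model list.
    on_demand = models.OnDemandServingMode(
        model_id="meta.llama-3.2-90b-vision-instruct"
    )

    # Dedicated hosting references the OCID of your own hosting endpoint.
    dedicated = models.DedicatedServingMode(
        endpoint_id="ocid1.generativeaiendpoint.oc1..example"
    )
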
Available Regions

US Midwest (Chicago), UK South (London), and Brazil East (Sao Paulo). Use of these models is restricted in the European Union; see the note that follows.

New Features and Improvements
  • Multimodal support: Text and image as inputs with text output
    • Image understanding geared to enterprise use cases, such as interpreting charts and graphs
    • Advanced image captioning and visual grounding capabilities
  • Multilingual support for text-only queries in eight languages: English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai

You can now use these multimodal models without managing any infrastructure. Access them through the chat interface, the API, or dedicated endpoints; a minimal API sketch follows.
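
The following sketch shows what a chat call pairing an image with a text prompt might look like, assuming a recent OCI Python SDK with its generative_ai_inference client. The region endpoint, model ID, compartment OCID, image path, and response access path are illustrative assumptions, not confirmed values.

    import base64

    import oci
    from oci.generative_ai_inference import GenerativeAiInferenceClient, models

    # Authenticate from the default config file (~/.oci/config).
    config = oci.config.from_file()

    # Assumed region endpoint; use a region where the model is offered.
    client = GenerativeAiInferenceClient(
        config,
        service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    )

    # Encode a local image (placeholder path) as a base64 data URL.
    with open("chart.png", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    # Build a multimodal message: text and image in, text out.
    chat_request = models.GenericChatRequest(
        messages=[
            models.UserMessage(
                content=[
                    models.TextContent(text="Summarize the trend in this chart."),
                    models.ImageContent(
                        image_url=models.ImageUrl(
                            url=f"data:image/png;base64,{image_b64}"
                        )
                    ),
                ]
            )
        ],
        max_tokens=500,
    )

    response = client.chat(
        models.ChatDetails(
            compartment_id="ocid1.compartment.oc1..example",  # placeholder OCID
            serving_mode=models.OnDemandServingMode(
                model_id="meta.llama-3.2-90b-vision-instruct"  # assumed model name
            ),
            chat_request=chat_request,
        )
    )

    # Response shape is an assumption; inspect response.data for the exact structure.
    print(response.data.chat_response.choices[0].message.content[0].text)

For a dedicated endpoint, swap the serving mode for the DedicatedServingMode shown earlier; the rest of the call is unchanged.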

Important Note: Per Meta's Llama 3.2 Acceptable Use Policy, restrictions apply to individuals and companies in the European Union regarding use of the Llama 3.2 multimodal models. Review the full policy for details.

For a list of offered models, see Pretrained Foundational Models in Generative AI. For information about the service, see the Generative AI documentation.