Introducing AI21's Jamba 1.5 Models: A Leap Forward in Large Language Models

AI21 Labs is advancing the state of large language models (LLMs) with its latest offerings, Jamba 1.5 Mini and Jamba 1.5 Large. These models tackle a long-standing limitation of traditional Transformer models, efficient handling of very long context windows, through an innovative hybrid architecture.

Architecture Innovations

The Jamba 1.5 models employ a hybrid approach that combines the strengths of Transformer and Mamba (Structured State Space) architectures. This hybrid design facilitates efficient processing of long sequences of data:

  • Mamba-Transformer Layers: Interleaved Mamba (structured state space) layers process long sequences efficiently with a small memory footprint, while Transformer attention layers preserve output quality, balancing speed, memory use, and quality.
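
Below is a minimal, illustrative PyTorch sketch of this interleaving idea. It is not AI21's implementation: the GRU is only a stand-in for a real Mamba (state space) layer, and the roughly one-attention-layer-per-eight-layers ratio, hidden size, and layer ordering are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Toy hybrid stack: mostly recurrent (SSM-style) mixers with an occasional
# attention layer, loosely mirroring the Mamba-Transformer interleaving
# described for Jamba. The nn.GRU is only a stand-in for a real Mamba layer.
class ToyHybridLM(nn.Module):
    def __init__(self, d_model=256, n_layers=8, n_heads=4, attn_every=8):
        super().__init__()
        self.layers = nn.ModuleList()
        self.kinds = []
        for i in range(n_layers):
            if i % attn_every == attn_every - 1:   # one attention layer per block
                self.layers.append(
                    nn.MultiheadAttention(d_model, n_heads, batch_first=True))
                self.kinds.append("attention")
            else:                                   # cheap sequence mixer elsewhere
                self.layers.append(nn.GRU(d_model, d_model, batch_first=True))
                self.kinds.append("ssm_stand_in")
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(n_layers))

    def forward(self, x):                           # x: (batch, seq_len, d_model)
        for layer, kind, norm in zip(self.layers, self.kinds, self.norms):
            h = norm(x)
            if kind == "attention":
                h, _ = layer(h, h, h, need_weights=False)
            else:
                h, _ = layer(h)
            x = x + h                               # residual connection
        return x

tokens = torch.randn(2, 128, 256)                   # (batch, seq_len, hidden)
model = ToyHybridLM()
print(model(tokens).shape)                          # torch.Size([2, 128, 256])
print(model.kinds)                                  # seven recurrent mixers, one attention
```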

Model Specifications

AI21 Labs offers two variants to cater to different use cases:

  • Jamba 1.5 Mini: With 12 billion active parameters and a total of 52 billion parameters, this model is ideal for customer support, document summarization, and text generation.
  • Jamba 1.5 Large: Featuring 94 billion active parameters and a total of 398 billion parameters, this model excels in advanced reasoning tasks, financial analysis, and complex document summarization.

Key Features

The Jamba 1.5 models come packed with features that set them apart:

  • Context Window: Both models support a 256,000-token (256K) context window, the largest available under an open model license, and they maintain quality across the full declared window rather than degrading at its edges.
  • Performance: The models have demonstrated superior performance on the RULER benchmark, which covers multihop tracing, retrieval, aggregation, and question-answering tasks. Jamba 1.5 Large, in particular, is reported to be twice as fast as similar models at the longest context lengths.
  • Developer Features: Support for function calling, Retrieval-Augmented Generation (RAG) optimizations, JSON mode, citation mode, and structured document objects.
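
As a concrete example of these developer features, here is a minimal sketch assuming the AI21 Python SDK (`pip install ai21`) and its chat-completions interface. The model identifier and the exact request parameters that enable the dedicated JSON and citation modes are assumptions to verify against AI21's current documentation; this sketch simply asks for JSON output in the prompt.

```python
import os

from ai21 import AI21Client
from ai21.models.chat import ChatMessage

# Assumes the AI21 Python SDK and an API key in the AI21_API_KEY environment
# variable. The model name "jamba-1.5-mini" is an assumption; check AI21's
# docs for the exact identifier and the flags that enable JSON/citation mode.
client = AI21Client(api_key=os.environ["AI21_API_KEY"])

response = client.chat.completions.create(
    model="jamba-1.5-mini",
    messages=[
        ChatMessage(role="system",
                    content="You summarize support tickets and answer only in JSON."),
        ChatMessage(role="user",
                    content="Summarize this ticket as JSON with keys 'issue', "
                            "'severity', and 'next_step': <ticket text here>"),
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)
```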

Availability and Integration

The Jamba 1.5 models are available on multiple platforms, making them accessible to a wide range of users:

  • Platforms: Available on Google Cloud's Vertex AI, Microsoft Azure AI, Hugging Face, LangChain, LlamaIndex, and Together AI (a minimal Hugging Face loading sketch follows this list).
  • Deployment: Offered as Models-as-a-Service (MaaS) on Azure AI and Google Cloud, allowing for pay-as-you-go inference APIs without the need to manage underlying infrastructure.
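
For self-managed deployments, the open weights can be pulled from Hugging Face. The sketch below assumes the checkpoint name `ai21labs/AI21-Jamba-1.5-Mini` and the standard `transformers` auto classes (plus `accelerate` for `device_map="auto"`); verify the exact repository ID on the Hub, and note that even the Mini model needs substantial GPU memory.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The repository ID below is an assumption; confirm it on the Hugging Face Hub.
# device_map="auto" (via accelerate) shards the model across available GPUs.
model_id = "ai21labs/AI21-Jamba-1.5-Mini"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto")

prompt = "Summarize the key obligations in the following contract:\n<contract text>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```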

Use Cases

The Jamba 1.5 models are versatile and can be deployed in various enterprise applications:

  • Summarizing lengthy documents
  • Powering RAG-based solutions (a prompt-assembly sketch follows this list)
  • Customer service
  • Financial analysis
  • Content creation
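
To illustrate the RAG use case mentioned above, here is a small, model-agnostic sketch of how a 256K-token window changes prompt assembly: many retrieved passages can be packed into a single request instead of being aggressively truncated. The retrieval step, chunk contents, and character budget are hypothetical placeholders.

```python
# Hypothetical long-context RAG prompt assembly. With a 256K-token window,
# far more retrieved material fits into one request; the character budget
# below is a rough stand-in for a proper token count.
def build_long_context_prompt(question: str, retrieved_chunks: list[str],
                              max_chars: int = 800_000) -> str:
    """Concatenate retrieved passages into one prompt, up to a rough size budget."""
    context_parts: list[str] = []
    used = 0
    for i, chunk in enumerate(retrieved_chunks):
        if used + len(chunk) > max_chars:
            break
        context_parts.append(f"[Document {i + 1}]\n{chunk}")
        used += len(chunk)
    context = "\n\n".join(context_parts)
    return ("Answer the question using only the documents below. "
            f"Cite document numbers.\n\n{context}\n\nQuestion: {question}")

prompt = build_long_context_prompt(
    "What were the main cost drivers last quarter?",
    ["Q3 financial report text ...", "Earnings-call transcript ..."],
)
print(prompt[:200])
```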

Security and Compliance

AI21 Labs integrates the Jamba models with security features and compliance certifications designed to protect data privacy and security in enterprise deployments.

Partnerships

AI21 Labs is collaborating with major cloud providers such as Amazon Web Services (AWS), Google Cloud, and Microsoft Azure to ensure seamless deployment and integration of the Jamba models.
