Exploring the Latest Advancements in Large Language Models on Vertex AI
As the artificial intelligence landscape continues to evolve, Vertex AI remains at the forefront with its latest offerings in Large Language Models (LLMs) and generative AI. This update takes a closer look at the new models and enhancements now available on Vertex AI.
Introducing Llama 3.1
The Llama 3.1 405B model is now in preview on Vertex AI. It excels in synthetic data generation, model distillation, steerability, and multilingual translation, and its proficiency in mathematics and tool use makes it a versatile choice for a wide range of applications.
Gemma 2 Joins the Model Garden
The Gemma 2 2B model, developed by Google DeepMind, is the latest addition to the Model Garden. As an open-weight foundation LLM, it enhances the flexibility and adaptability of AI solutions built on Vertex AI.
Mistral AI Models Now Available
Vertex AI has expanded its offerings with managed models from Mistral AI, including the generally available Mistral Large (24.11). These models, which can also be deployed from Hugging Face, provide users with a wide array of options to suit specific needs.
Efficiency with Hex-LLM
Hex-LLM is a high-efficiency solution for serving large language models. It combines advanced parallelism strategies, quantization, and dynamic LoRA to deliver high throughput and low latency, and it supports both dense and sparse LLMs.
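To give a sense of the quantization technique such serving stacks rely on, here is a minimal, generic sketch of symmetric per-tensor int8 quantization using NumPy. This is an illustration of the general idea only, not Hex-LLM's actual implementation; the function names are hypothetical.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats into [-127, 127].

    Illustrative sketch only -- not Hex-LLM's implementation.
    """
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

# Rounding error per weight is bounded by half the quantization step.
max_err = float(np.abs(w - w_hat).max())
print(q.dtype, max_err <= scale / 2 + 1e-6)
```

Storing weights as int8 rather than float32 cuts memory and bandwidth roughly 4x, which is a key part of how high-throughput serving is achieved.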
Enhanced Fine-Tuning Capabilities
Fine-tuning has been significantly enhanced for models such as Gemma 2 and Llama 3.1. These updates include improved GPU utilization, support for input token masking, and fixes for out-of-memory errors, making for a smoother fine-tuning experience.
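Input token masking means the training loss is computed only on the response tokens, not the prompt. A minimal, framework-agnostic sketch of the idea (the helper name and token IDs are illustrative; -100 is the ignore-index convention used by common loss implementations):

```python
# Sketch of input-token masking for instruction fine-tuning:
# prompt positions get the ignore label so the loss skips them.
IGNORE_INDEX = -100  # convention used by common cross-entropy implementations

def build_masked_labels(prompt_ids, response_ids):
    """Concatenate prompt and response; mask prompt positions in the labels."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Hypothetical token IDs for a prompt/response pair.
prompt = [101, 2054, 2003]
response = [1996, 3437, 102]
ids, labels = build_masked_labels(prompt, response)
print(ids)     # full sequence fed to the model
print(labels)  # loss target: prompt masked, response kept
```

With the prompt masked out, gradient updates reflect only how well the model reproduces the desired response, which typically yields better instruction-following behavior.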
Diverse Model Library
Vertex AI continues to expand its library with additional models such as Qwen2 by Alibaba Cloud and Phi-3 by Microsoft. This expansion provides users with diverse options to leverage the best tools for their unique applications.
With competitive pricing at $0.20 per million input tokens and $0.60 per million output tokens, and a context window of 128,000 tokens, Vertex AI remains a cost-effective and powerful platform for deploying advanced LLMs. As the technology evolves, Vertex AI is committed to providing cutting-edge solutions that drive innovation across industries.
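At the per-million-token rates quoted above, estimating the cost of a workload is straightforward arithmetic. A small sketch (the function name is hypothetical; the rates come from this update):

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_rate: float = 0.20,
                      output_rate: float = 0.60) -> float:
    """Estimate USD cost given token counts and per-million-token rates."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# Example workload: 10M input tokens and 2M output tokens.
# 10 * $0.20 + 2 * $0.60 = $3.20
print(estimate_cost_usd(10_000_000, 2_000_000))
```

Because output tokens cost three times as much as input tokens at these rates, workloads that generate long completions are dominated by the output term.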