Introducing Vertex AI's text-multilingual-embedding-002: A Game Changer for Multilingual Text Embeddings

Vertex AI's text-multilingual-embedding-002 model generates text embeddings for a wide range of languages beyond English, which makes it a practical choice for applications that serve a multilingual audience.

Supported Languages

The model has been evaluated on the following languages: Arabic, Bengali, English, Spanish, German, Persian, Finnish, French, Hindi, Indonesian, Japanese, Korean, Russian, Swahili, Telugu, Thai, Yoruba, and Chinese. It also supports many other languages beyond this evaluated set.

Usage

With the text-multilingual-embedding-002 model, you can generate dense vector representations of text. These embeddings are essential for tasks that require a deep understanding of the text's meaning rather than just direct word or syntax matches, such as search, recommendation systems, and natural language understanding.
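To make "meaning rather than word matches" concrete, here is a minimal sketch of how embeddings are typically compared in a search setting: rank documents by cosine similarity to a query vector. The 4-dimensional vectors and document names below are made-up stand-ins, not real model output; real embeddings from this model would be much longer.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for real embeddings.
query = [0.9, 0.1, 0.0, 0.2]
documents = {
    "doc_about_cats_es": [0.8, 0.2, 0.1, 0.3],
    "doc_about_taxes_de": [0.0, 0.9, 0.4, 0.0],
}

# Most similar document first.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
print(ranked)
```

Because the comparison happens in vector space, a query in one language can in principle match a document in another, provided the model places texts with similar meaning close together.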

API and SDK Access

You can call the model through the Vertex AI API or the Vertex AI SDK for Python. In either case, specify the model ID, text-multilingual-embedding-002, in the request or SDK call and pass the texts you want to embed.
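As a rough illustration of the API route, the sketch below only builds the request URL and JSON body; it does not send anything. The URL pattern and the "instances"/"content" field names reflect the publisher-model predict format as an assumption, and "my-project" and "us-central1" are placeholder values, so check all of them against the official documentation before use.

```python
import json

MODEL_ID = "text-multilingual-embedding-002"


def build_predict_request(
    project_id: str, location: str, texts: list[str]
) -> tuple[str, str]:
    """Return (url, json_body) for an embedding request. Nothing is sent."""
    # Assumed endpoint pattern for a Google publisher model on Vertex AI.
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"publishers/google/models/{MODEL_ID}:predict"
    )
    # Assumed body shape: one instance per input text.
    body = {"instances": [{"content": text} for text in texts]}
    # ensure_ascii=False keeps non-English text readable in the payload.
    return url, json.dumps(body, ensure_ascii=False)


# Placeholder project and region, for illustration only.
url, body = build_predict_request(
    "my-project", "us-central1", ["Hello world", "Hola mundo"]
)
print(url)
print(body)
```

Sending the request additionally requires an authenticated HTTP client (for example, an OAuth access token in the Authorization header), which is left out here.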

Token Limit and Auto-Truncation

Each input text is limited to 2048 tokens. By default, longer inputs are silently truncated to that length. Setting autoTruncate to false disables the truncation, so an over-length input causes the request to fail with an error instead of being cut off without notice.
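If you turn truncation off, long inputs need to be split before they are sent. The sketch below shows one simple way to do that, under loud assumptions: split_into_chunks is a hypothetical helper, the 1000-word budget is an arbitrary conservative choice, and word count is only a crude proxy for tokens (the real tokenizer will differ, especially for languages written without spaces between words).

```python
def split_into_chunks(text: str, max_words: int) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i : i + max_words])
        for i in range(0, len(words), max_words)
    ]


# Request parameters that disable silent truncation
# (field name as given in the text above).
parameters = {"autoTruncate": False}

# A synthetic 2500-word input, split with a hypothetical 1000-word budget.
long_text = "word " * 2500
chunks = split_into_chunks(long_text, max_words=1000)
print(len(chunks), [len(chunk.split()) for chunk in chunks])
```

Each chunk can then be embedded separately. How to combine the per-chunk vectors (for example, averaging them or indexing each chunk on its own) depends on the application.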

Dimensionality Options

By default, the model returns 768-dimensional dense vectors. You can also request a smaller output size, such as 256 or 128 dimensions, to save storage and memory. Because a shorter vector can carry less information, verify retrieval quality on your own data before committing to a smaller size.
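The storage saving is easy to estimate. The following back-of-the-envelope sketch assumes each component is stored as a 4-byte float32 and ignores any index overhead; both the storage format and the one-million-vector corpus size are assumptions chosen for illustration.

```python
BYTES_PER_FLOAT32 = 4  # assumed storage format


def index_size_bytes(num_vectors: int, dimensions: int) -> int:
    """Raw size of the vectors alone, without index or metadata overhead."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


for dims in (768, 256, 128):
    size_gb = index_size_bytes(1_000_000, dims) / 1e9
    print(f"{dims:>3} dims: {size_gb:.3f} GB per million vectors")
```

Under these assumptions, going from 768 to 256 dimensions cuts raw vector storage to one third, and 128 dimensions cuts it to one sixth.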

Stay Updated

To get the most out of Vertex AI, use the latest available model versions. For multilingual text embeddings, text-multilingual-embedding-002 is one of the more recent options.

For detailed usage instructions and examples, refer to the official Google Cloud documentation and tutorials.
