Exploring the Capabilities of Perplexity/Mixtral-8x7B-Instruct: A New Era of LLMs

Perplexity/Mixtral-8x7B-Instruct is reshaping the landscape of large language models (LLMs) with its sparse mixture of experts (MoE) architecture. Developed by Mistral AI, the model packs roughly 46.7 billion total parameters, yet each of its MLP layers holds 8 expert feed-forward blocks and a router that activates only 2 of them per token, so only about 12.9 billion parameters are used for any given token. The result is the knowledge capacity of a much larger network at the inference cost of a mid-sized one.
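To make the routing idea concrete, here is a minimal PyTorch sketch of a sparse top-2 MoE layer. The layer sizes, module names, and the plain SiLU experts are illustrative assumptions, not Mixtral's actual implementation (whose experts are much larger, gated MLPs), but the control flow is the same: score all experts, keep the top 2, and run only those.

```python
# Minimal sketch of sparse top-2 mixture-of-experts routing, as used per
# feed-forward layer in Mixtral-style models. Sizes and names are
# illustrative assumptions, not the real Mixtral implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, hidden_size=512, ffn_size=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden_size, ffn_size),
                          nn.SiLU(),
                          nn.Linear(ffn_size, hidden_size))
            for _ in range(num_experts)
        ])

    def forward(self, x):                      # x: (tokens, hidden_size)
        logits = self.router(x)                # (tokens, num_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the chosen experts only
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e   # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Only 2 of the 8 expert MLPs run for each token, which is why the active
# parameter count is far smaller than the total parameter count.
layer = SparseMoELayer()
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512])
```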

Performance-wise, Mixtral-8x7B outperforms the Llama 2 70B model on most standard benchmarks while delivering roughly six times faster inference, making it an attractive option for applications that need high throughput at lower cost.

For those interested in accessing this advanced model, several platforms provide convenient options:

  • Mistral AI Platform: Sign up for beta access to explore the model via the 'mistral-small' endpoint, along with the 'mistral-medium' model.
  • Perplexity Labs: Utilize the instruction-tuned version through the chat interface, available from the model selection dropdown.
  • Hugging Face: Download ready-to-use checkpoints directly from the Hugging Face Hub, or convert the raw released checkpoints yourself (a loading sketch follows this list).
  • Together.AI: Access the model via API, optimized for speedy inference using FlashAttention.
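
As a concrete starting point for the Hugging Face route, here is a minimal sketch using the transformers library. The repository id mistralai/Mixtral-8x7B-Instruct-v0.1, the precision choice, and the generation settings are common defaults, but treat them as assumptions to verify against the model card; the full-precision weights need tens of gigabytes of GPU memory, so quantization is often used in practice.

```python
# Sketch of running the instruction-tuned checkpoint from the Hugging Face Hub.
# Assumes the "mistralai/Mixtral-8x7B-Instruct-v0.1" repository and enough GPU
# memory; 4-bit quantization (bitsandbytes) is a common way to shrink it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to reduce memory
    device_map="auto",           # spread layers across available GPUs/CPU
)

messages = [{"role": "user",
             "content": "Explain mixture-of-experts models in one paragraph."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```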

In terms of pricing, Perplexity AI charges $0.14 per million input tokens and $0.56 per million output tokens, while Together AI offers a flat rate of $0.6 per million tokens. Which option is cheaper depends on how your workload splits between prompt and completion tokens, as the rough comparison below illustrates.
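
The trade-off is easy to quantify. The sketch below plugs the rates quoted above into a hypothetical workload; the request count and token counts are made-up assumptions, so substitute your own numbers.

```python
# Back-of-the-envelope cost comparison at the rates quoted above.
# The workload figures (requests, tokens per request) are assumptions.
PPLX_INPUT_PER_M = 0.14    # USD per million input tokens (Perplexity)
PPLX_OUTPUT_PER_M = 0.56   # USD per million output tokens (Perplexity)
TOGETHER_PER_M = 0.60      # USD per million tokens, input and output (Together AI)

input_tokens = 1_000 * 800     # e.g. 1,000 requests with ~800 prompt tokens each
output_tokens = 1_000 * 300    # and ~300 generated tokens each

pplx = (input_tokens * PPLX_INPUT_PER_M + output_tokens * PPLX_OUTPUT_PER_M) / 1e6
together = (input_tokens + output_tokens) * TOGETHER_PER_M / 1e6

print(f"Perplexity:  ${pplx:.2f}")      # $0.28 for this prompt-heavy workload
print(f"Together AI: ${together:.2f}")  # $0.66
```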

The model's architecture incorporates Sliding Window Attention, Grouped Query Attention (GQA), and a byte-fallback BPE tokenizer, which hold down memory use and inference cost without limiting the text it can represent. With a training context length of 8k tokens and a theoretical attention span of 128K tokens, it is well suited to long-input tasks; the short check below shows how to read these settings directly from the published checkpoint.
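
These architectural settings can be inspected straight from the model's published configuration without downloading any weights. The field names below are those used by the transformers Mixtral config class and should be treated as assumptions to verify against your installed version; the printed values may differ from the figures quoted above depending on the checkpoint.

```python
# Inspect the published model configuration to confirm the architectural
# details described above. This fetches only the small config file, not weights.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")

print("experts per MLP layer:", cfg.num_local_experts)      # expected: 8
print("experts used per token:", cfg.num_experts_per_tok)   # expected: 2
print("attention heads / KV heads:",
      cfg.num_attention_heads, "/", cfg.num_key_value_heads)  # GQA ratio
print("max position embeddings:", cfg.max_position_embeddings)
print("sliding window:", cfg.sliding_window)  # may be None if the checkpoint uses full attention
```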

Moreover, Mixtral-8x7B is released under the Apache 2.0 license, so its weights can be freely used, modified, and deployed commercially, opening the door to diverse applications and innovations. On platforms like Perplexity, the serving context window is extended to 16k tokens, further broadening its practical range.

In conclusion, the Mixtral-8x7B-Instruct model offers a robust tool for developers and researchers seeking to leverage advanced AI functionalities. By tapping into its potential, users can enhance their AI applications, pushing the boundaries of what is possible with LLMs.
