firefunction-v2

Introducing Firefunction-V2: Fireworks AI's Next-Generation LLM for Real-World Applications

Tal Peretz

15 Jul 2024 — 2 min read

Fireworks AI is proud to announce the release of Firefunction-V2, our latest large language model (LLM) engineered to excel in real-world applications. Designed with a focus on function calling and multi-turn conversations, Firefunction-V2 brings remarkable advancements to the AI landscape.

Key Features and Capabilities

Function Calling

Firefunction-V2 is optimized for function calling, which enables it to interact with external APIs effectively. Unlike its predecessor, Firefunction-V2 supports parallel function calling, significantly enhancing its utility.

Performance

Our new model competes head-to-head with GPT-4o in function-calling tasks, scoring 0.81 on public benchmarks, slightly higher than GPT-4o’s 0.80. Impressively, Firefunction-V2 operates at 2.5 times the speed and 10% of the cost of GPT-4o, making it both faster and more cost-effective.

General Reasoning and Conversation

Firefunction-V2 balances function calling with general reasoning abilities, maintaining the conversational and instruction-following capabilities found in Llama 3.

Cost and Speed

Firefunction-V2 is highly cost-effective, priced at $0.9 per output token compared to GPT-4o’s $15 per token. It processes 180 tokens per second, compared to GPT-4o’s 69 tokens per second, ensuring rapid and affordable performance.

Integration and Accessibility

Accessible through Fireworks AI’s platform, Firefunction-V2 offers a speed-optimized setup with an OpenAI-compatible API, facilitating easy integration into existing systems. Users can also explore the model using a demo app and UI playground.

Development and Fine-Tuning

Firefunction-V2 was fine-tuned from the Llama3-70b-instruct base model using a curated dataset that included function calling and general conversation data. User feedback played a crucial role in its development, ensuring the model meets real-world application needs.

Practical Applications

Firefunction-V2 excels in multi-turn conversations and instruction following, making intelligent decisions about when to call functions and executing them accurately. It supports up to 30 function specifications, a significant improvement over its predecessor.

Market and User Reception

Fireworks AI has seen significant enterprise adoption, with companies like Uber, DoorDash, and AI-native startups building their stack around Fireworks. The developer community has responded positively, with beta testers praising its performance and cost-effectiveness.

Future Ambitions

Fireworks AI is committed to building the best platform for enterprises to deploy AI into production. We focus on latency, cost, quality, ownership, and developer experience. The team will continue iterating on Firefunction models based on user feedback and a commitment to practical solutions for developers.

Discover the future of AI with Firefunction-V2, and see how it can transform your real-world applications.

Introducing Gemini 2.0 Flash Preview Image Generation: Google's Next-Step Generative AI Model

Google’s Gemini 2.0 Flash Preview Image Generation is the latest breakthrough in generative AI, introducing robust multimodal capabilities that enable intuitive, context-aware image generation and editing. This model builds upon the powerful Gemini 2.0 Flash architecture, providing developers and creators with a versatile tool for visually expressive

Exploring Google's Gemini 2.5 Flash Preview TTS: Powerful, Cost-Efficient Text-to-Speech

Google continues to set the pace in generative AI with the introduction of Gemini 2.5 Flash Preview TTS, a sophisticated text-to-speech model designed for structured workflows demanding high control, transparency, and cost-efficiency. Released as part of Google's Gemini 2.5 series, this model builds upon previous iterations

Introducing Vertex AI Gemini-2.5-Pro-Preview-TTS: Google's New Flagship LLM Explained

Google continues to push the boundaries of artificial intelligence with the recent release of its highly anticipated Vertex AI Gemini-2.5-Pro-Preview-TTS model. As part of the Vertex AI ecosystem, Gemini 2.5 Pro represents a significant leap forward in AI capabilities, offering advanced reasoning, exceptional coding proficiency, and unparalleled multimodal

Introducing Gemini 2.5 Pro Preview TTS: Google's Next-Generation Multimodal AI

Google DeepMind's Gemini 2.5 Pro Preview TTS is the latest breakthrough in large language models (LLMs), designed to deliver exceptional performance across reasoning, coding, multimodal capabilities, and text-to-speech (TTS) quality. Let's explore the key features, capabilities, and practical applications of this advanced AI model. Key