Mistral: Advanced AI Language Model with 671B Parameters
Experience the next generation of language models: groundbreaking efficiency combined with strong reasoning, coding, and mathematical computation
Free Website Integration
Own a website? Embed our chat interface for free with a simple iframe snippet. No registration required.
Try Mistral Free Chat Without Registration
Key Features
Discover the powerful capabilities that make Mistral stand out
Advanced MoE Architecture
Revolutionary 671B-parameter model with only 37B activated per token, achieving high efficiency through an innovative load-balancing strategy
- Multi-head Latent Attention (MLA)
- Auxiliary-loss-free load balancing
- MistralMoE architecture
- Multi-token prediction objective
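The idea behind sparse activation can be illustrated with a toy routing function: each token is scored against every expert, but only the top-k experts actually run. This is a minimal sketch of top-k gating in general, not Mistral's actual implementation; the scores and expert count here are made up for illustration.

```python
import math

def top_k_routing(token_scores, k=2):
    """Pick the k highest-scoring experts for one token (toy sketch)."""
    ranked = sorted(range(len(token_scores)),
                    key=lambda i: token_scores[i], reverse=True)
    chosen = ranked[:k]
    # Softmax over the chosen experts' scores gives the mixing weights.
    exps = [math.exp(token_scores[i]) for i in chosen]
    total = sum(exps)
    weights = [e / total for e in exps]
    return chosen, weights

# One token's affinity score for each of 8 toy experts.
scores = [0.1, 2.0, -0.5, 1.2, 0.0, 0.3, -1.0, 0.8]
experts, weights = top_k_routing(scores, k=2)
```

Only the chosen experts execute for that token, which is why a 671B-parameter model can run with just 37B parameters active per token.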
State-of-the-Art Performance
Exceptional results across multiple benchmarks, including MMLU (87.1%) and BBH (87.5%), as well as mathematical reasoning tasks
- Top scores in coding competitions
- Advanced mathematical computation
- Multilingual capabilities
- Complex reasoning tasks
Efficient Training
Groundbreaking training approach requiring only 2.788M H800 GPU hours, at a remarkably low cost of about $5.5M
- FP8 mixed precision training
- Optimized training framework
- Stable training process
- No rollbacks required
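A back-of-envelope calculation shows why lower-precision formats matter at this scale. This sketch assumes 1 byte per parameter for FP8 and 2 bytes for BF16, and counts weights only (activations, optimizer state, and KV cache are ignored):

```python
def weight_footprint_gb(n_params, bytes_per_param):
    """Approximate weight storage in GB (1 GB = 1e9 bytes); weights only."""
    return n_params * bytes_per_param / 1e9

TOTAL_PARAMS = 671e9  # 671B total parameters

fp8_gb = weight_footprint_gb(TOTAL_PARAMS, 1)   # FP8: 1 byte per parameter
bf16_gb = weight_footprint_gb(TOTAL_PARAMS, 2)  # BF16: 2 bytes per parameter
```

Halving the bytes per parameter halves the weight footprint, which directly reduces memory traffic and storage during both training and inference.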
Versatile Deployment
Multiple deployment options supporting NVIDIA, AMD GPUs and Huawei Ascend NPUs for flexible integration
- Cloud deployment ready
- Local inference support
- Multiple hardware platforms
- Optimized serving options
Advanced Coding Capabilities
Superior performance in programming tasks, excelling in both competitive coding and real-world development scenarios
- Multi-language support
- Code completion
- Bug detection
- Code optimization
Enterprise-Ready Security
Comprehensive security measures and compliance features for enterprise deployment and integration
- Access control
- Data encryption
- Audit logging
- Compliance ready
Extensive Training Data
Pre-trained on 14.8T diverse and high-quality tokens, ensuring broad knowledge and capabilities
- Diverse data sources
- Quality-filtered content
- Multiple domains
- Regular updates
Innovation Leadership
Pioneering advancements in AI technology through open collaboration and continuous innovation
- Research leadership
- Open collaboration
- Community driven
- Regular improvements
Mistral in the Media
Breaking new ground in open-source AI development
Breakthrough Performance
Mistral outperforms both open and closed AI models in coding competitions, particularly excelling in Codeforces contests and Aider Polyglot tests.
Massive Scale
Built with 671 billion parameters and trained on 14.8 trillion tokens, making it 1.6 times larger than Meta's Llama 3.1 405B.
Cost-Effective Development
Trained in just two months using Nvidia H800 GPUs, with a remarkably efficient development cost of $5.5 million.
Mistral in Action
Watch how Mistral revolutionizes open-source AI capabilities
Mistral: Revolutionary Open Source AI
An in-depth look at Mistral's capabilities and performance compared to other leading AI models.
Mistral Performance Metrics
Mistral Language Understanding
Mistral Coding
Mistral Mathematics
Technical Specifications
Explore the advanced technical capabilities and architecture that power Mistral
Mistral Architecture Details
Advanced neural architecture designed for optimal performance and efficiency
Mistral Research
Advancing the boundaries of language model capabilities
Novel Architecture
Innovative Mixture-of-Experts (MoE) architecture with auxiliary-loss-free load balancing strategy
Training Methodology
Advanced FP8 mixed precision training framework validated on large-scale model training
Technical Paper
Read our comprehensive technical paper detailing the architecture, training process, and evaluation results of Mistral.
Read the Paper
About Mistral
Pioneering the future of open-source AI development
Company Background
Backed by High-Flyer Capital Management, Mistral aims to achieve breakthrough advances in AI technology through open collaboration and innovation.
Infrastructure
Utilizing advanced computing clusters including 10,000 Nvidia A100 GPUs, Mistral demonstrates exceptional capabilities in large-scale model training.
Download Mistral Models
Choose between the base and chat-tuned versions of Mistral
Mistral Base Model
The foundation model with 671B parameters (37B activated)
- Pre-trained on 14.8T tokens
- 128K context length
- FP8 weights
- 671B total parameters
Mistral Chat Model
Fine-tuned model optimized for dialogue and interaction
- Enhanced reasoning
- 128K context length
- Improved instruction following
- 671B total parameters
Installation Instructions
Download using Git LFS (recommended method):
# For Base Model
git lfs install
git clone https://huggingface.co/Mistral-ai/Mistral-V3-Base
# For Chat Model
git lfs install
git clone https://huggingface.co/Mistral-ai/Mistral-V3
Mistral Deployment Options
Mistral Local Deployment
Run locally with Mistral-Infer Demo supporting FP8 and BF16 inference
- Simple setup
- Lightweight demo
- Multiple precision options
Mistral Cloud Integration
Deploy on cloud platforms with SGLang and LMDeploy support
- Cloud-native deployment
- Scalable infrastructure
- Enterprise-ready
Mistral Hardware Support
Compatible with NVIDIA, AMD GPUs and Huawei Ascend NPUs
- Multi-vendor support
- Optimized performance
- Flexible deployment
How to Use Mistral
Start chatting with Mistral in three simple steps
Visit Chat Page
Click the "Try Chat" button at the top of the page to enter the chat interface
Enter Your Question
Type your question in the chat input box
Wait for Response
Mistral will quickly generate a response, usually within seconds
FAQ
Learn more about Mistral
What makes Mistral unique?
Mistral features a 671B-parameter MoE architecture, incorporating innovations such as multi-token prediction and auxiliary-loss-free load balancing, and delivers exceptional performance across a wide range of tasks.
How can I access Mistral?
You can access Mistral through our online demo platform and API service, or download the model weights for local deployment.
In which tasks does Mistral excel?
Mistral excels in mathematics, programming, reasoning, and multilingual tasks, consistently achieving top scores in benchmark evaluations.
What are the hardware requirements for running Mistral?
Mistral supports various deployment options, including NVIDIA GPUs, AMD GPUs, and Huawei Ascend NPUs, with multiple framework choices for optimal performance.
Is Mistral available for commercial use?
Yes, Mistral is available for commercial use. Please refer to the model license agreement for specific terms of use.
How does Mistral compare to other language models?
Mistral outperforms other open-source models in various benchmarks and achieves performance comparable to leading closed-source models.
Which deployment frameworks does Mistral support?
Mistral can be deployed using various frameworks including SGLang, LMDeploy, TensorRT-LLM, vLLM, and supports FP8 and BF16 inference modes.
What is the context window size of Mistral?
Mistral has a 128K context window, enabling effective processing and understanding of complex tasks and long-form content.
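When input text may exceed the window, it has to be split before being sent to the model. The helper below is a rough sketch: the 4-characters-per-token ratio is a crude heuristic, not Mistral's real tokenizer, and `reserve` is an assumed headroom budget for the prompt and response.

```python
def split_for_context(text, context_tokens=131072, chars_per_token=4,
                      reserve=1024):
    """Split text into chunks that should fit a 128K-token context window.

    chars_per_token is a rough heuristic, not the model's actual tokenizer;
    `reserve` leaves headroom for the system prompt and the response.
    """
    max_chars = (context_tokens - reserve) * chars_per_token
    return [text[i:i + max_chars]
            for i in range(0, len(text), max_chars)] or [""]

# A 1M-character document splits into two chunks under these assumptions.
chunks = split_for_context("x" * 1_000_000)
```

For production use, counting tokens with the model's own tokenizer gives far more reliable results than a character heuristic.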
Get Started with Mistral
Try Mistral API
Access Mistral's capabilities through our developer-friendly API platform
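As a sketch of what a client integration might look like, the snippet below assembles an OpenAI-style chat completion payload, a convention many serving frameworks (including SGLang and vLLM) follow. The endpoint URL, API key, and model name here are placeholders, not official values from this page.

```python
import json
from urllib import request

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
API_KEY = "YOUR_API_KEY"                                 # placeholder key

def build_chat_request(user_message, model="mistral-chat", temperature=0.7):
    """Assemble an OpenAI-style chat completion payload (hypothetical model name)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }

payload = build_chat_request("Explain Mixture-of-Experts in one paragraph.")

# Uncomment to send the request against a real endpoint:
# req = request.Request(
#     API_URL,
#     data=json.dumps(payload).encode(),
#     headers={"Authorization": f"Bearer {API_KEY}",
#              "Content-Type": "application/json"},
# )
# print(request.urlopen(req).read().decode())
```

Consult the official API documentation for the actual base URL, model identifiers, and authentication scheme.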
Start Building
Try Mistral Chat
Experience Mistral's capabilities directly through our interactive chat interface
Start Chatting