Mistral announces Codestral, its first programming-focused AI model




Today, Paris-based Mistral, the AI startup that raised Europe’s largest-ever seed round a year ago and has since become a rising star in the global AI domain, marked its entry into the programming and development space with the launch of Codestral, its first-ever code-centric large language model (LLM).

Available today under a non-commercial license, Codestral is a 22B parameter, open-weight generative AI model that specializes in coding tasks, from code generation to completion.

According to Mistral, the model covers more than 80 programming languages, making it a strong option for software developers looking to design advanced AI applications.

The company claims Codestral already outperforms previous models designed for coding tasks, including CodeLlama 70B and DeepSeek Coder 33B, and is being used by several industry partners, including JetBrains, Sourcegraph and LlamaIndex.




A performant model for all things coding

At its core, Codestral 22B has a context length of 32K and lets developers write and interact with code across various coding environments and projects.

The model has been trained on a dataset of more than 80 programming languages, which makes it suitable for a diverse range of coding tasks, including generating code from scratch, completing coding functions, writing tests and completing any partial code using a fill-in-the-middle mechanism. The programming languages it covers include popular ones such as SQL, Python, Java, C and C++ as well as more specific ones like Swift and Fortran.
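To make the fill-in-the-middle idea concrete, here is a minimal sketch of the data flow an editor integration might follow: split the file at the cursor into a prefix and a suffix, ask a model for the missing middle, then splice the result back in. The names `FimRequest`, `split_at_cursor` and `splice` are hypothetical helpers invented for this illustration, and the "model output" is hard-coded; nothing here reflects how Codestral is implemented internally.

```python
from dataclasses import dataclass


@dataclass
class FimRequest:
    """Hypothetical container for the two halves of a fill-in-the-middle task."""
    prefix: str  # code before the cursor
    suffix: str  # code after the cursor


def split_at_cursor(source: str, cursor: int) -> FimRequest:
    """Split source text at a cursor offset into prefix and suffix."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor out of range")
    return FimRequest(prefix=source[:cursor], suffix=source[cursor:])


def splice(request: FimRequest, middle: str) -> str:
    """Insert the generated middle between the original prefix and suffix."""
    return request.prefix + middle + request.suffix


# A function whose body is still empty; the cursor sits after the indentation.
source = "def add(a, b):\n    \n\nprint(add(2, 3))\n"
cursor = source.index("    ") + 4

req = split_at_cursor(source, cursor)

# Stand-in for a model response (assumption: a real model would return this text).
completed = splice(req, "return a + b")
print(completed)
```

The point of sending the suffix as well as the prefix is that the model can see what the code after the gap expects, which plain left-to-right completion cannot.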

Mistral says Codestral can help developers ‘level up their coding game’ to accelerate workflows and save a significant amount of time and effort when building applications. It can also help reduce the risk of errors and bugs, the company says.

While the model has just been launched and is yet to be tested publicly, Mistral claims it already outperforms existing code-centric models, including CodeLlama 70B, DeepSeek Coder 33B, and Llama 3 70B, on most programming languages.

Codestral performance on HumanEval across different programming languages

On RepoBench, which is designed to evaluate long-range repository-level Python code completion, Codestral outperformed all three models with an accuracy score of 34%. Similarly, on HumanEval, which evaluates Python code generation, and CruxEval, which tests Python output prediction, the model bested the competition with scores of 81.1% and 51.3%, respectively. It also outperformed the models on HumanEval for Bash, Java and PHP.

Notably, the model’s performance on HumanEval for C++, C and TypeScript was not the best, but its average score across all tests combined was the highest at 61.5%, sitting just ahead of Llama 3 70B’s 61.2%. On the Spider assessment for SQL performance, it placed second with a score of 63.5%.

Several popular tools for developer productivity and AI application development have already started testing Codestral. These include big names such as LlamaIndex, LangChain, Continue.dev, Tabnine and JetBrains.

“From our initial testing, it’s a great option for code generation workflows because it’s fast, has a favorable context window, and the instruct version supports tool use. We tested with LangGraph for self-corrective code generation using the instruct Codestral tool use for output, and it worked really well out-of-the-box,” Harrison Chase, CEO and co-founder of LangChain, said in a statement.

How to get started with Codestral?

Mistral is offering Codestral 22B on Hugging Face under its own non-production license, which allows developers to use the technology for non-commercial purposes, testing and research.

The company is also making the model available via two API endpoints: codestral.mistral.ai and api.mistral.ai.

The former is designed for users looking to use Codestral’s Instruct or Fill-In-the-Middle routes inside their IDE. It comes with an API key managed at the personal level without usual organization rate limits and is free to use during a beta period of eight weeks. Meanwhile, the latter is the usual endpoint for broader research, batch queries or third-party application development, with queries billed per token.
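As a rough sketch of how the two endpoints differ in practice, the snippet below builds (but does not send) a fill-in-the-middle request against either host. The two hostnames come from the article; the `/v1/fim/completions` path, the `codestral-latest` model name, the `prompt`/`suffix` fields and bearer-token authentication are assumptions based on Mistral's public API documentation at the time of writing and should be checked against the current docs. The helper `build_fim_request` and the API key are hypothetical placeholders.

```python
import json
import urllib.request

# Hostnames from the article; which one you use depends on your key type.
BASE_URLS = {
    "ide": "https://codestral.mistral.ai",   # personal key, free during beta
    "platform": "https://api.mistral.ai",    # usual endpoint, billed per token
}


def build_fim_request(endpoint: str, api_key: str, prompt: str, suffix: str,
                      model: str = "codestral-latest") -> urllib.request.Request:
    """Build an unsent POST request for a fill-in-the-middle completion.

    Assumption: path, model name and JSON field names follow Mistral's docs.
    """
    url = BASE_URLS[endpoint] + "/v1/fim/completions"
    body = json.dumps({"model": model, "prompt": prompt, "suffix": suffix})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key; a real key would come from the matching Mistral account page.
fim_request = build_fim_request(
    "ide", "sk-placeholder", "def add(a, b):\n    ", "\n\nprint(add(2, 3))\n"
)
print(fim_request.get_method(), fim_request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` with a valid key would send it. Note that a key issued for one host is not expected to work on the other, which is why the endpoint is an explicit argument here.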

Further, interested developers can also test Codestral’s capabilities by chatting with an instructed version of the model on Le Chat, Mistral’s free conversational interface. 

Mistral’s move to introduce Codestral gives enterprise researchers another notable option to accelerate software development, but it remains to be seen how the model performs against other code-centric models in the market, including the recently introduced StarCoder2 as well as offerings from OpenAI and Amazon.

OpenAI offers Codex, which powers the GitHub Copilot service, while Amazon has its CodeWhisperer tool. OpenAI’s ChatGPT has also been used by programmers as a coding tool, and the company’s GPT-4 Turbo model powers Devin, the semi-autonomous coding agent service from Cognition.

There’s also strong competition from Replit, which has a few small AI coding models on Hugging Face, and from Codeium, which recently nabbed $65 million in Series B funding at a valuation of $500 million.


