On May 2, 2026, Mistral AI released Mistral Medium 3.5 — a 128-billion-parameter dense model with a 256K-token context window, available as open weights on Hugging Face under a modified MIT license. The model became the default in Le Chat and Vibe, Mistral's coding IDE, and shipped alongside a new ‘remote agents’ feature that runs async cloud-based coding sessions from inside Vibe.
Medium 3.5 collapses three previously separate model categories — instruction-following, reasoning, and code — into a single set of weights. Mistral reports 77.6% on SWE-Bench Verified, beating Devstral 2 and Qwen3.5 397B A17B at a fraction of the parameter count. Reasoning effort is configurable per request, so the same checkpoint can run as a fast instant-reply tool or a deeper test-time-compute reasoner. List pricing is $1.50 per million input tokens and $7.50 per million output, and the model runs on as few as four GPUs.
The release tightens an open-weights race that already includes DeepSeek V4 Pro, Moonshot's FlashKDA, and Meta's Llama line. Vibe's remote agents follow the pattern Anthropic, Cognition, and OpenAI have all converged on — long-running coding tasks delegated to a sandboxed environment while the developer keeps working. The differentiator Mistral is leaning on is sovereign deployment: open weights, European base, on-prem-friendly footprint.
Takeaway for learners — ‘open weights’ no longer means ‘small and behind.’ A 128B dense model with a 256K context, configurable reasoning, and SWE-Bench numbers in frontier-lab range is something a student or solo engineer can actually run, fine-tune, or fork on a modest GPU cluster. If you're building anything where you need to control cost or keep data on-prem, Medium 3.5 is the new reference point worth benchmarking against.