On May 1, 2026, Nebius Group agreed to acquire Eigen AI for approximately $643 million in a mix of cash and Nebius Class A shares. Eigen AI is a 20-person inference-optimization startup founded by Ryan Hanrui Wang and Wei-Chen Wang, both alumni of Professor Song Han's HAN Lab at MIT. Eigen's optimization stack will fold directly into Nebius Token Factory, the company's managed inference platform.

The price tag, roughly $32M per employee ($643 million across 20 people), signals where the AI infrastructure market believes margin lives in 2026. Raw compute access has commoditized; the differentiator is how cheaply and quickly each token gets served. Eigen's system-, model-, and kernel-level techniques are designed to extract more throughput from the same hardware, which translates directly into lower cost-per-inference for Token Factory customers and faster time-to-production for new model releases.
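To make the throughput-to-cost link concrete, here is a minimal Python sketch. The per-employee figure uses the deal numbers quoted above; the GPU hourly price and the tokens-per-second values are hypothetical, picked only to show the shape of the relationship, and do not describe Nebius, Eigen AI, or any real deployment.

```python
def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Serving cost in USD per one million tokens on a single GPU."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


# Figures quoted in the article: deal size and headcount.
price_per_employee = 643e6 / 20  # about 32 million USD per person

# Hypothetical inputs, for illustration only.
GPU_HOURLY_USD = 3.00    # assumed rental price of one GPU per hour
baseline_tps = 1000.0    # assumed throughput before optimization
optimized_tps = 2000.0   # assumed throughput after optimization

baseline_cost = cost_per_million_tokens(GPU_HOURLY_USD, baseline_tps)
optimized_cost = cost_per_million_tokens(GPU_HOURLY_USD, optimized_tps)

print(f"Price per employee: ${price_per_employee / 1e6:.2f}M")
print(f"Baseline:  ${baseline_cost:.3f} per 1M tokens")
print(f"Optimized: ${optimized_cost:.3f} per 1M tokens")
```

With the hardware price held fixed, cost per token is inversely proportional to throughput, so under these assumed numbers doubling throughput halves the serving cost.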

This is the inference layer doing what the training layer did two years ago: consolidating into a few platforms with deep optimization expertise. Together AI, Fireworks, Groq, Cerebras, and Anyscale are running variants of the same playbook. Nebius, which is listed on Nasdaq and was formerly Yandex N.V., the Dutch parent company of Yandex, is using its public-market currency to buy the optimization talent that hyperscalers grew internally. Expect more $500M-plus inference acquisitions through the rest of 2026.

Takeaway for learners: inference engineering is now its own discipline, distinct from ‘ML engineering’ broadly defined. Kernel-level CUDA work, speculative decoding, KV-cache optimization, and quantization-aware deployment are the specific skills the market is paying roughly $32M per head for. If you've been told to specialize in ML and want a more concrete target, this is one of the most lucrative niches in AI infrastructure right now.