“We processed over 100T tokens this quarter, up 5x year over year, including a record 50T tokens last month alone.”
If the market harbored any doubt about the insatiable demand for AI, this statement during Microsoft’s quarterly earnings call yesterday quashed it.
What might this imply for a run rate? Using some basic assumptions¹, this implies:
Scenario | Model mix (% of total tokens) | Monthly run rate after 20% discount ($M) | Annual run rate ($M) | % of Azure revenue (assuming $21B annual) |
---|---|---|---|---|
High | OpenAI 70% • Claude 20% • Other 10% | 382.9 | 4,594.8 | 21.88% |
Medium | OpenAI 65% • Claude 20% • Other 15% | 110.5 | 1,326.0 | 6.31% |
Low | OpenAI 60% • Claude 20% • Other 20% | 27.3 | 327.6 | 1.56% |
So AI is roughly 2% to 22% of Azure revenue. The error bars here are quite large, though.
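The annual and share columns follow directly from the monthly figures. Here is a minimal sketch of that arithmetic, assuming the table's run rates are in millions of dollars and using the $21B annual Azure revenue from the table header. The variable and function names are illustrative and not taken from the original analysis.

```python
# Monthly run rates from the table, assumed to be in $ millions.
monthly_run_rate_musd = {"High": 382.9, "Medium": 110.5, "Low": 27.3}

# Annual Azure revenue assumed in the table header: $21B, in $ millions.
AZURE_ANNUAL_REVENUE_MUSD = 21_000.0


def annualize(monthly_musd: float) -> float:
    """Annual run rate is twelve times the monthly run rate."""
    return monthly_musd * 12


def share_of_azure(annual_musd: float) -> float:
    """Annual run rate as a percentage of the assumed Azure revenue."""
    return annual_musd / AZURE_ANNUAL_REVENUE_MUSD * 100


results = {}
for scenario, monthly in monthly_run_rate_musd.items():
    annual = annualize(monthly)
    share = share_of_azure(annual)
    results[scenario] = (round(annual, 1), round(share, 2))
    print(f"{scenario}: ${annual:,.1f}M per year, {share:.2f}% of Azure revenue")
```

Because both columns scale linearly with the monthly figure, the 14x spread between the low and high scenarios carries straight through to the revenue share.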
A major contributor to this increased demand is performance, especially with reasoning models.
Combined with some of the big reductions in inference costs, especially with small, open-source models like the Phi-4 models that Microsoft released yesterday, the margins on AI inference should continue to surge.
“…our cost per token, which has more than halved.”
“You see this in our supply chain where we have reduced dock to lead times for new GPUs by nearly 20% across our blended fleet where we have increased AI performance by nearly 30% ISO power…”
Jevons Paradox in full force.
“The real outperformance in Azure this quarter was in our non-AI business.”
This was a surprise, but it is likely the result of additional demands placed on adjacent systems. AI doesn’t exist in a vacuum. It needs databases, storage, orchestration, and observability to succeed.
“PostgreSQL usage accelerated for the third consecutive quarter… Cosmos DB revenue growth also accelerated again this quarter…”
A later quote in the analyst call reinforces this point, highlighting two database systems, Cosmos DB (a MongoDB-like document data store) and PostgreSQL, both of which are transactional databases.
100 trillion tokens, up 5x y/y. Next year, could we see a quadrillion?
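A quick back-of-the-envelope check on the quadrillion question, using only the figures quoted from the call. The assumption that the 5x year-over-year growth simply repeats is for illustration only and is not a forecast from Microsoft.

```python
TRILLION = 10**12
QUADRILLION = 10**15

tokens_this_quarter = 100 * TRILLION  # "over 100T tokens this quarter"
record_month = 50 * TRILLION          # "a record 50T tokens last month alone"
yoy_growth = 5                        # "up 5x year over year"; assumed to repeat

# Volume in the same quarter next year if the growth rate repeats.
next_year_quarter = tokens_this_quarter * yoy_growth

# Annualized volume if every month matched the record month.
annualized_at_record_pace = record_month * 12

# Growth multiple needed for a single quarter to reach a quadrillion tokens.
multiple_for_quadrillion_quarter = QUADRILLION / tokens_this_quarter

print(next_year_quarter / QUADRILLION)          # quadrillions per quarter next year
print(annualized_at_record_pace / QUADRILLION)  # quadrillions per year at the record pace
print(multiple_for_quadrillion_quarter)         # required growth multiple
```

Under these assumptions, a quadrillion tokens over a full year looks within reach, while a quadrillion in a single quarter would require growth well above the quoted 5x.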
¹ A 20:1 input-to-output token ratio; a model usage mix of 60-70% OpenAI, 20% Anthropic, and the remainder other models; and a 20% discount to public prices. See the work here