Microsoft reportedly building a 500B LLM called MAI-1

Published on:

Based on a report by The Data, Microsoft is engaged on a 500B parameter LLM referred to as MAI-1 that might tackle GPT-4 and Google’s Gemini fashions.

We lately reported on Microsoft’s Phi-3 Mini household of small language fashions starting from 3.8B to 14B parameters. At 500B parameters, MAI-1 is ready to be the biggest mannequin Microsoft has deployed.

Its measurement places it in the identical ballpark as GPT-4 and Google’s larger Gemini fashions. GPT-4 is rumored to have 1.76T parameters nevertheless it’s a Combination of Specialists (MoE) mannequin so solely round 280B parameters are in play throughout inference.

- Advertisement -

There isn’t any info out there concerning the structure of MAI-1, but when it’s a dense mannequin, versus MoE, then it’s going to be fairly highly effective. Meta’s anticipated Llama 3 mannequin is predicted to have 400B parameters.

The event of MAI-1 is being led by Mustafa Suleyman, co-founder and former head of utilized AI at DeepMind.

Mustafa left DeepMind to co-found AI startup Inflection in 2022. In March this yr, Microsoft employed nearly all of the Inflection’s workers and paid $650 million for the rights to the corporate’s IP.

MAI-1 is outwardly a totally new Microsoft challenge somewhat than a continuation of an present Inflection challenge. There’s no phrase on a launch date however we’d get to see a preview of MAI-1 on 16 Might at Microsoft’s Construct developer convention.

- Advertisement -

Microsoft is OpenAI’s greatest investor so the truth that it’s creating its personal LLMs to rival these of OpenAI is a bit stunning to some. Is Microsoft hedging its bets, pursuing a number of improvement methods, or one thing else totally?

See also  Interview: Cinderella Amar - Web4, AI, and Blockchain evangelist

Microsoft’s CTO Kevin Scott tried to downplay the problem. In a submit on LinkedIn Scott mentioned, “I’m unsure why that is information, however simply to summarize the apparent: we construct large supercomputers to coach AI fashions; our accomplice Open AI makes use of these supercomputers to coach frontier-defining fashions; after which we each make these fashions out there in services and products in order that plenty of individuals can profit from them. We somewhat like this association.”

Scott could also be honest on this assertion, however when MAI-1 is launched it may put Microsoft squarely in competitors with the corporate by which it has invested billions of {dollars}.

Will MAI-1 be launched simply in time for OpenAI to upstage it by releasing GPT-5? OpenAI scheduled an occasion for this Thursday the place the corporate was anticipated to share updates and product demonstrations however the occasion has since been postponed.

With mysterious GPT-2 chatbots showing, disappearing, and now reappearing, Microsoft constructing enormous fashions, and OpenAI maintaining us guessing, the AI drama is relentless.

- Advertisment -

Related

- Advertisment -

Leave a Reply

Please enter your comment!
Please enter your name here