OpenAI Announces a New AI Model, Code-Named Strawberry, That Solves Difficult Problems Step by Step

OpenAI made the last big breakthrough in artificial intelligence by increasing the size of its models to dizzying proportions, when it introduced GPT-4 last year. The company today announced a new advance that signals a shift in approach: a model that can “reason” logically through many difficult problems and is significantly smarter than existing AI without a major scale-up.

The new model, dubbed OpenAI o1, can solve problems that stump existing AI models, including OpenAI’s most powerful existing model, GPT-4o. Rather than summon up an answer in one step, as a large language model normally does, it reasons through the problem, effectively thinking out loud as a person might, before arriving at the right result.

“This is what we consider the new paradigm in these models,” Mira Murati, OpenAI’s chief technology officer, tells WIRED. “It’s much better at tackling very complex reasoning tasks.”

The new model was code-named Strawberry within OpenAI, and it isn’t a successor to GPT-4o but rather a complement to it, the company says.

Murati says that OpenAI is currently building its next master model, GPT-5, which will be considerably larger than its predecessor. But while the company still believes that scale will help wring new abilities out of AI, GPT-5 is likely to also include the reasoning technology introduced today. “There are two paradigms,” Murati says. “The scaling paradigm and this new paradigm. We expect that we will bring them together.”

LLMs typically conjure their answers from huge neural networks fed vast quantities of training data. They can exhibit remarkable linguistic and logical abilities, but traditionally struggle with surprisingly simple problems such as rudimentary math questions that involve reasoning.

Murati says OpenAI o1 uses reinforcement learning, which involves giving a model positive feedback when it gets answers right and negative feedback when it doesn’t, in order to improve its reasoning process. “The model sharpens its thinking and fine-tunes the strategies that it uses to get to the answer,” she says. Reinforcement learning has enabled computers to play games with superhuman skill and do useful tasks like designing computer chips. The technique is also a key ingredient for turning an LLM into a useful and well-behaved chatbot.
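The feedback loop Murati describes can be illustrated with a toy example. The sketch below is not OpenAI's training method, which the company has not detailed here. It is a minimal epsilon-greedy value update in Python, and every name in it (`train`, the two strategies, the reward values, the parameters) is an assumption made up for illustration. A learner picks between two hypothetical strategies for adding two digits, receives +1 for a right answer and -1 for a wrong one, and nudges its estimate of each strategy toward the rewards it sees.

```python
import random


def train(steps=200, epsilon=0.1, learning_rate=0.2, seed=0):
    """Toy reward-feedback loop; all names and numbers are illustrative assumptions."""
    rng = random.Random(seed)
    # Two hypothetical strategies: one blurts out a fixed answer,
    # the other actually works out the sum.
    strategies = {
        "guess": lambda a, b: 0,
        "step_by_step": lambda a, b: a + b,
    }
    # Running estimate of how rewarding each strategy has been so far.
    values = {name: 0.0 for name in strategies}
    for _ in range(steps):
        a, b = rng.randint(1, 9), rng.randint(1, 9)
        if rng.random() < epsilon:
            # Occasionally explore a strategy at random.
            name = rng.choice(list(strategies))
        else:
            # Otherwise use the strategy with the highest estimate.
            name = max(values, key=values.get)
        # Positive feedback for a right answer, negative for a wrong one.
        reward = 1.0 if strategies[name](a, b) == a + b else -1.0
        # Move the estimate a small step toward the observed reward.
        values[name] += learning_rate * (reward - values[name])
    return values


values = train()
print(values)
```

In this toy setup the fixed guess is never right, so its estimate can only fall, while the strategy that works the sum out is rewarded every time it is tried and ends up preferred.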

Mark Chen, vice president of research at OpenAI, demonstrated the new model to WIRED, using it to solve several problems that its prior model, GPT-4o, cannot. These included an advanced chemistry question and the following mind-bending mathematical puzzle: “A princess is as old as the prince will be when the princess is twice as old as the prince was when the princess’s age was half the sum of their present age. What is the age of the prince and princess?” (The correct answer is that the prince is 30 and the princess is 40.)
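The stated answer can be checked mechanically. The sketch below is an illustration in Python, not something taken from OpenAI or the demonstration. The function name `satisfies_puzzle`, the 1-to-100 search range, and the reading of “was” and “will be” as strictly past and strictly future are all assumptions. It translates the riddle clause by clause and searches for whole-number ages that fit.

```python
from fractions import Fraction


def satisfies_puzzle(princess, prince):
    """Return True if the two present ages fit the riddle (hypothetical helper)."""
    # Years ago when the princess's age was half the sum of their present ages.
    years_ago = princess - Fraction(princess + prince, 2)
    # The prince's age at that earlier moment.
    prince_then = prince - years_ago
    # Years from now until the princess is twice that earlier prince age.
    years_ahead = 2 * prince_then - princess
    # The prince's age at that future moment.
    prince_future = prince + years_ahead
    # Assumption: "was" means strictly in the past, "will be" strictly in the future.
    return years_ago > 0 and years_ahead > 0 and prince_future == princess


# Search whole-number ages from 1 to 100 (an arbitrary range chosen for illustration).
solutions = [
    (prince, princess)
    for princess in range(1, 101)
    for prince in range(1, 101)
    if satisfies_puzzle(princess, prince)
]
print(solutions)
```

Under these assumptions the search finds the quoted pair of 30 and 40 alongside other pairs in the same 3:4 ratio, because the riddle as worded fixes only the ratio of the two ages.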

“The [new] model is learning to think for itself, rather than kind of trying to imitate the way humans would think,” as a conventional LLM does, Chen says.

OpenAI says its new model performs markedly better on a number of problem sets, including ones focused on coding, math, physics, biology, and chemistry. On the American Invitational Mathematics Examination (AIME), a test for math students, GPT-4o solved on average 12 percent of the problems while o1 got 83 percent right, according to the company.
