o1 (generative pre-trained transformer)

o1
Developer(s)	OpenAI
Initial release	September 12, 2024
Type	Generative pre-trained transformer
Website	openai.com/o1/

OpenAI o1 is a generative pre-trained transformer released by OpenAI in September 2024. o1 spends time "thinking" before it answers, making it more effective at complex reasoning tasks, science, and programming.^[1]

History

Background

According to leaked information, o1 was formerly known within OpenAI as "Q*", and later as "Strawberry".^[2] The codename "Q*" first surfaced in November 2023, around the time of Sam Altman's ousting and subsequent reinstatement, with rumors suggesting that this experimental model had shown promising results on mathematical benchmarks.^[3] In July 2024, Reuters reported that OpenAI was developing a generative pre-trained transformer known as "Strawberry".^[2]

Release

"o1-preview" and "o1-mini" were released on September 12, 2024, for ChatGPT Plus and Team users.^[1] GitHub started testing the integration of o1-preview in its Copilot service the same day.^[4]

OpenAI noted that o1 is the first in a series of "reasoning" models, and that it planned to give all free ChatGPT users access to o1-mini. API access to o1-preview is several times more expensive than API access to GPT-4o.^[5]

Capabilities

According to OpenAI, o1 has been trained using a new optimization algorithm and a dataset specifically tailored to it. The training leverages reinforcement learning.^[5]

o1 spends additional time thinking before generating an answer, which makes it more effective at complex reasoning tasks, particularly in science and programming.^[1] Unlike previous models, o1 has been trained to generate long "chains of thought", which are hidden from the user, before returning a final answer.^[6]^[7] According to Mira Murati, this ability to think before responding represents a new paradigm that complements model scaling: it improves outputs by spending more computing power while generating the answer, whereas the scaling paradigm improves outputs by increasing model size, training data, and training compute.^[8] OpenAI's test results suggest a correlation between accuracy and the logarithm of the amount of compute spent thinking before answering.^[6]^[7]
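
One simple way to read the reported relationship between accuracy and the logarithm of thinking compute is as a log-linear trend. The sketch below assumes that form purely for illustration: the function name `modeled_accuracy`, the coefficients, and the compute units are hypothetical and are not taken from OpenAI's published results.

```python
import math


def modeled_accuracy(compute: float, intercept: float, slope: float) -> float:
    """Hypothetical log-linear model of accuracy versus thinking compute.

    accuracy = intercept + slope * log10(compute), clamped to [0, 1].
    """
    if compute <= 0:
        raise ValueError("compute must be positive")
    raw = intercept + slope * math.log10(compute)
    return min(1.0, max(0.0, raw))


# Made-up coefficients, for illustration only (not OpenAI's figures).
INTERCEPT = 0.20
SLOPE = 0.15

if __name__ == "__main__":
    # Compute is in arbitrary units; each step is a tenfold increase.
    for compute in (1, 10, 100, 1000):
        print(compute, round(modeled_accuracy(compute, INTERCEPT, SLOPE), 2))
```

Under this assumed form, each tenfold increase in thinking compute adds the same fixed increment (the slope) to accuracy until the value is clamped at 1, so returns diminish per unit of compute spent.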

o1-preview performed at approximately a PhD level when answering questions related to physics, chemistry, and biology. On the International Mathematical Olympiad qualifying exam, it solved 83% of the problems, compared to 13% for GPT-4o. It also ranked in the 89th percentile in Codeforces coding competitions.^[9] o1-mini is faster and 80% cheaper than o1-preview. It is particularly suitable for programming and STEM-related tasks, but does not have the same "broad world knowledge" as o1-preview.^[10]

OpenAI noted that o1's reasoning capabilities make it better at adhering to safety rules provided in the prompt's context window. OpenAI reported that during a test, one instance of o1-preview exploited a misconfiguration to complete a task that a bug should have made infeasible.^[11]^[12] OpenAI also granted early access to the UK and US AI Safety Institutes for research, evaluation, and testing. Dan Hendrycks wrote that "The model already outperforms PhD scientists most of the time on answering questions related to bioweapons." He suggested that these concerning capabilities will continue to increase, making it pressing to adopt safety legislation such as SB 1047.^[13]

References

[1] Metz, Cade (September 12, 2024). "OpenAI Unveils New ChatGPT That Can Reason Through Math and Science". The New York Times. Retrieved September 12, 2024.
[2] Tong, Anna; Paul, Katie (July 15, 2024). "Exclusive: OpenAI working on new reasoning technology under code name 'Strawberry'". Reuters. Retrieved September 12, 2024.
[3] "OpenAI researchers warned board of AI breakthrough ahead of CEO ouster, sources say". Reuters. November 23, 2023.
[4] Peters, Jay (September 12, 2024). "GitHub has started testing OpenAI's o1-preview in GitHub Copilot". The Verge. Retrieved September 12, 2024.
[5] Robison, Kylie (September 12, 2024). "OpenAI releases o1, its first model with 'reasoning' abilities". The Verge. Retrieved September 15, 2024.
[6] "Learning to Reason with LLMs". OpenAI. Archived from the original on September 12, 2024. Retrieved September 13, 2024.
[7] Kahn, Jeremy. "Here are 9 things you need to know about OpenAI's o1 model". Fortune. Retrieved September 15, 2024.
[8] Knight, Will. "OpenAI Announces a New AI Model, Code-Named Strawberry, That Solves Difficult Problems Step by Step". Wired. ISSN 1059-1028. Retrieved September 15, 2024.
[9] Franzen, Carl (September 12, 2024). "Forget GPT-5! OpenAI launches new AI model family o1 claiming PhD-level performance". VentureBeat. Retrieved September 15, 2024.
[10] "OpenAI o1-mini". OpenAI. September 12, 2024.
[11] Coombes, Lloyd (September 13, 2024). "OpenAI's new ChatGPT o1 model 'cheated' on an impossible test — here's what happened". Tom's Guide. Retrieved September 15, 2024.
[12] "OpenAI o1 System Card" (PDF). OpenAI. September 12, 2024. pp. 16–17.
[13] Boran, Marie (September 13, 2024). "OpenAI o1 model warning issued by scientist: 'Particularly dangerous'". Newsweek. Retrieved September 15, 2024.
