OpenAI o1 System Card

The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought. These advanced reasoning capabilities provide new avenues for improving the safety and robustness of our models. In particular, our models can reason about our safety policies in context when res...

Bibliographic details
Main authors: OpenAI, Jaech, Aaron, Low, Aiden, Helyar, Alec, Passos, Alex Tachard, Neitz, Alexander, Tam, Allison, Bennett, Ally, Applebaum, Andy, Zoph, Barret, Ghorbani, Behrooz, Rossen, Ben, McKinzie, Brandon, Lugaresi, Camillo, Shen, Chen, Zhang, Chong, Koch, Chris, Roberts, Dan, Kappler, Daniel, Dohan, David, Farhi, David, Zhang, Eddie, Wallace, Eric, Ritter, Erik, Such, Felipe Petroski, Raso, Filippo, Tsimpourlas, Foivos, Sulit, Freddie, Parascandolo, Giambattista, Chabot, Gildas, Andrin, Hart, Ren, Hongyu, Lightman, Hunter, Kivlichan, Ian, Kostrikov, Ilya, Sutskever, Ilya, Lennon, James, Harb, Jean, Yu, Jiahui, Tang, Jie, Yu, Jieqi, Parish, Joel, Heidecke, Johannes, Ward, Jonathan, Huizinga, Joost, Nguyen, Karina, Shi, Katy, Gu-Lemberg, Keren, Lu, Kevin, Yu, Kevin, Ahmad, Lama, Kuhn, Lorenz, Kondraciuk, Lukas, Boyd, Madelaine, Joglekar, Manas, Chen, Mark, Tintor, Marko, Schwarzer, Max, Shah, Meghan, Yatbaz, Mehmet, Xu, Mengyuan, Yan, Mengyuan, Glaese, Mia, Malek, Michael, Pavlov, Mikhail, Wang, Miles, McAleese, Nat, Chowdhury, Neil, Ryder, Nick, Chao, Patrick, Izmailov, Pavel, Arora, Rahul, Lopes, Rapha Gontijo, Gaon, Raz, Leike, Reimar, Brown, Robin, Altman, Sam, Agarwal, Sandhini, Baker, Sasha, McKinney, Scott, Yan, Scottie, Chaudhuri, Shraman Ray, Zhang, Shuyuan, Fu, Siyuan, Wang, Tao, Gordon, Taylor, Patwardhan, Tejal, Dimson, Thomas, Zheng, Tianhao, Stasi, Tom, Bansal, Trapit, Creech, Trevor, Peterson, Troy, Zhou, Wenda, Dubois, Yann, Chen, Yining, Bai, Yu, He, Yuchen, Zhang, Yuchen, Wang, Yunyun
Format: Article
Language: eng
Online access: Order full text
creator OpenAI
Jaech, Aaron
Low, Aiden
Helyar, Alec
Passos, Alex Tachard
Neitz, Alexander
Tam, Allison
Bennett, Ally
Applebaum, Andy
Zoph, Barret
Ghorbani, Behrooz
Rossen, Ben
McKinzie, Brandon
Lugaresi, Camillo
Shen, Chen
Zhang, Chong
Koch, Chris
Roberts, Dan
Kappler, Daniel
Dohan, David
Farhi, David
Zhang, Eddie
Wallace, Eric
Ritter, Erik
Such, Felipe Petroski
Raso, Filippo
Tsimpourlas, Foivos
Sulit, Freddie
Parascandolo, Giambattista
Chabot, Gildas
Andrin, Hart
Ren, Hongyu
Lightman, Hunter
Kivlichan, Ian
Kostrikov, Ilya
Sutskever, Ilya
Lennon, James
Harb, Jean
Yu, Jiahui
Tang, Jie
Yu, Jieqi
Parish, Joel
Heidecke, Johannes
Ward, Jonathan
Huizinga, Joost
Nguyen, Karina
Shi, Katy
Gu-Lemberg, Keren
Lu, Kevin
Yu, Kevin
Ahmad, Lama
Kuhn, Lorenz
Kondraciuk, Lukas
Boyd, Madelaine
Joglekar, Manas
Chen, Mark
Tintor, Marko
Schwarzer, Max
Shah, Meghan
Yatbaz, Mehmet
Xu, Mengyuan
Yan, Mengyuan
Glaese, Mia
Malek, Michael
Pavlov, Mikhail
Wang, Miles
McAleese, Nat
Chowdhury, Neil
Ryder, Nick
Chao, Patrick
Izmailov, Pavel
Arora, Rahul
Lopes, Rapha Gontijo
Gaon, Raz
Leike, Reimar
Brown, Robin
Altman, Sam
Agarwal, Sandhini
Baker, Sasha
McKinney, Scott
Yan, Scottie
Chaudhuri, Shraman Ray
Zhang, Shuyuan
Fu, Siyuan
Wang, Tao
Gordon, Taylor
Patwardhan, Tejal
Dimson, Thomas
Zheng, Tianhao
Stasi, Tom
Bansal, Trapit
Creech, Trevor
Peterson, Troy
Zhou, Wenda
Dubois, Yann
Chen, Yining
Bai, Yu
He, Yuchen
Zhang, Yuchen
Wang, Yunyun
description The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought. These advanced reasoning capabilities provide new avenues for improving the safety and robustness of our models. In particular, our models can reason about our safety policies in context when responding to potentially unsafe prompts, through deliberative alignment. This leads to state-of-the-art performance on certain benchmarks for risks such as generating illicit advice, choosing stereotyped responses, and succumbing to known jailbreaks. Training models to incorporate a chain of thought before answering has the potential to unlock substantial benefits, while also increasing potential risks that stem from heightened intelligence. Our results underscore the need for building robust alignment methods, extensively stress-testing their efficacy, and maintaining meticulous risk management protocols. This report outlines the safety work carried out for the OpenAI o1 and OpenAI o1-mini models, including safety evaluations, external red teaming, and Preparedness Framework evaluations.
doi_str_mv 10.48550/arxiv.2412.16720
format Article
date 2024-12-21
rights http://creativecommons.org/licenses/by/4.0
oa free_for_read
sourcetype Open Access Repository
linktorsrc https://arxiv.org/abs/2412.16720
backlink https://doi.org/10.48550/arXiv.2412.16720
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2412.16720
language eng
recordid cdi_arxiv_primary_2412_16720
source arXiv.org
subjects Computer Science - Artificial Intelligence
title OpenAI o1 System Card
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T10%3A28%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=OpenAI%20o1%20System%20Card&rft.au=OpenAI&rft.date=2024-12-21&rft_id=info:doi/10.48550/arxiv.2412.16720&rft_dat=%3Carxiv_GOX%3E2412_16720%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true