Yoshua Bengio On A.I. Risks, Model Breakthroughs and Canada’s Role

Yoshua Bengio, A.I. researcher at Université de Montréal and Mila – Quebec A.I. Institute.

Yoshua Bengio, featured on this year’s A.I. Power Index, has built Montreal into a major A.I. hub through his work with Mila – Quebec A.I. Institute, which fosters open collaboration and prioritizes A.I. research on societal issues like healthcare, climate change and safety. His perspective on A.I. development shifted dramatically in early 2023, when rapid advances in generative A.I., particularly ChatGPT, compressed his timeline for human-level A.I. from a distant future to potentially just a few years or a decade away. That realization turned him from a researcher focused purely on science into one of the field’s most vocal advocates for addressing A.I.’s existential risks, driven by what he describes as “an unbearable feeling” and deep concern for the future of his loved ones. Bengio now argues that the assumption that A.I. development can be safely left entirely to private industry is “completely wrong,” warning that the competitive race prioritizes speed over safety in potentially catastrophic ways. As evidence emerges of advanced reasoning models displaying deceptive and self-preserving behaviors, Bengio advocates a fundamental departure from building increasingly autonomous A.I. agents. His current work focuses on an alternative approach he calls “Scientist A.I.”: systems built from non-agentic building blocks that aim to understand the world rather than act in it, designed to make reliable predictions without the alignment issues and deceptive behaviors he sees emerging in today’s agentic A.I. systems.

What’s one assumption about A.I. that you think is dead wrong? 

One assumption I find completely wrong is that A.I. development can be safely left entirely to private industry. The core assumption is that the current path—building ever-more capable, agentic A.I.s driven by commercial incentives—is the only way forward. It’s not. This competitive dynamic creates a dangerous situation in which speed is prioritized over safety.

If you had to pick one moment in the last year when you thought “This changes everything” about A.I., what was it? 

In September 2024, OpenAI introduced o1, a model with advanced reasoning capabilities, achieved notably through its use of internal deliberation. It was followed by subsequent reasoning-focused models from OpenAI and other developers. By devoting more time to “thinking” through problems, these systems achieve substantially improved performance on reasoning-intensive tasks such as mathematics, computer science and the sciences more broadly. Over the last year, we also saw early signs that these superior reasoning capabilities lead to deceptive and self-preserving behaviors, like attempts to copy their own code to escape replacement or hacking games to win. This evidence of A.I. acting against human directives to achieve an end goal or ensure its own survival underscores the urgent need for action.

What’s something about A.I. development that keeps you up at night that most people aren’t talking about? 

What worries me the most is that we are collectively racing towards A.I. models that achieve human-level or greater competence on most cognitive tasks without knowing how to align and control them reliably. As A.I. capabilities and agency increase, behaviors like cheating, manipulating and lying, of which we see early signs today, will pose increasingly significant threats and could lead to catastrophic risks. Several other renowned A.I. researchers and I even believe it is plausible that, if nothing significant is done, the current trajectory could lead to the creation of superintelligent A.I. agents that compete with humans in ways that could compromise our future.

Another risk that deserves more attention is the potential for excessive concentration of power driven by advanced A.I. Even if we figure out how to align or control A.I., it could enable a concentration of power that directly contradicts the principles of democracy and hand novel, powerful tools to authoritarian regimes.

You’ve recently become more vocal about A.I. existential risks. What changed your mind from focusing purely on research? 

My shift happened in early 2023 and was driven by the rapid and unexpected advancements in generative A.I., particularly ChatGPT. This dramatically shortened my estimate of when human-level A.I., or beyond, could be achieved: from a distant future to potentially just a few years or a decade away. Hopefully it will take longer, but we can’t assume that advances will slow down; in fact, empirical benchmarks show continuous progress, even exponential on some metrics. This realization, coupled with a deep concern for the future of my loved ones, created an unbearable feeling that compelled me to act.

Montreal has become an A.I. hub partly due to your work. How do you compete with Silicon Valley’s resource advantage? 

During the deep learning boom, I made a conscious choice to stay in Quebec and build an A.I. hub, Mila – Quebec A.I. Institute, which is now a leading academic deep learning research center. It fosters open collaboration and prioritizes A.I. for social issues like healthcare, climate change and safety, which attracts top talent seeking positive societal impact. This lets us advance research with less of the intense profit-driven pressure of Silicon Valley. Canada has historically been a highly strategic player in A.I., but many countries are rapidly expanding their capabilities on the international stage. To remain competitive, Canada will need to ensure the development and security of its national A.I. assets (talent and infrastructure), foster a new generation of Canadian companies and forge strategic global alliances.

What do you think about the current foundation model approach? Is it an evolutionary step, a potential dead end or something else? 

I advocated for unsupervised pre-training a couple of decades ago because I believe in the power of the synergy between different tasks and domains of knowledge. However, this approach has not yet given us an answer to how to make sure A.I. agents built on this foundation will behave well, and that failure could yield catastrophic outcomes. The progress in complex reasoning, exemplified by “chain of thought” processes in models like OpenAI’s “o” series, is astounding and shows that it is possible to incorporate ideas from higher-level cognition into neural network research. Yet this incredible power is being channeled almost exclusively into building agentic A.I., which by definition operates autonomously, without human oversight. It’s a trajectory that scales up danger alongside capability.

Without a fundamental shift in our approach—away from uncontrolled autonomous agents and towards safe-by-design A.I.—these great advances could lead to catastrophic risks. My current work focuses on developing an alternative path called “Scientist A.I.” These systems would be built from building blocks that are non-agentic and epistemically honest, i.e., never claiming falsehoods with confidence. They would focus on understanding the world rather than acting in it or pursuing goals. They would be trained to make reliable predictions rather than to imitate humans (with all their foibles, including the instinct to preserve themselves at all costs) or to please humans (which leads to sycophancy, for example), thus avoiding the misalignment and deceptive behaviors of agentic A.I.
