The Information : OpenAI Co-Founder Sutskever Joins the Skeptics

As Sri and I head to San Diego for the annual Neural Information Processing Systems conference this week (get in touch if you’ll also be there!), we’re excited to learn more about reinforcement learning, the model training technique du jour at all the major AI developers.

There’s rising skepticism among researchers, including OpenAI co-founder Ilya Sutskever, about the effectiveness of RL and whether it can advance AI to the level of artificial general intelligence, on par with human experts in scientific research, healthcare and other domains.

Sutskever, who left OpenAI last year to start his own AI lab, explained in a rare interview on the Dwarkesh podcast why AI models are struggling to handle real-world tasks that aren’t part of the evaluations that researchers use when they develop the models.

He said researchers use RL to help the models ace the evaluations, but that doesn’t improve the way the models generalize, or handle a wide variety of tasks. (We covered this topic last week in the context of OpenAI versus Google.)

His lukewarm opinion of RL puts him in the same camp as another OpenAI co-founder, Andrej Karpathy.

Sutskever also questioned the definition of artificial general intelligence, the concept of AI that outperforms humans at most economically valuable work. Sutskever said AGI should instead refer to AI that can learn how to excel at tasks after it starts working on them, much like a teenager who can learn how to drive in the real world after just a few hours behind the wheel.

Sutskever said nobody knows how to develop this kind of machine intelligence yet. And that’s why we’re entering what he calls an “age of research,” meaning smaller-scale experiments with new methods, rather than the “age of scaling,” in which developers put more computing power behind existing methods.

The comments suggest Sutskever disagrees with the current approach used by many leading AI developers. That approach includes relying on people to evaluate thousands of AI responses to specific queries or giving models simulated versions of popular apps so they can try hundreds of times to complete tasks successfully in the apps. Billions of dollars have been spent on those techniques this year!

Still, he acknowledged that even if today’s advanced AI developers stall out, their technology can still make “stupendous” revenue along the way, though he cautioned that profits would be harder to come by as they “work hard to differentiate [from] each other.”

Sutskever also had a lot of thoughts on existential risks from AI, which is unsurprising since it was likely one of his reasons for leaving OpenAI in the first place, given his current lab’s focus on safe superintelligence. One potential solution for ensuring AI acts in the best interest of humans is to essentially merge human bodies with AI, which he referred to as a more advanced version of Elon Musk’s Neuralink efforts, Sutskever said. That way, humans would know everything an AI is thinking, and the AI would want to protect the human body on which it depends.

This sounds somewhat in line with Ray Kurzweil’s Singularity vision in which humans and AI will merge by 2045!

Before then, making AI that learns on the fly will likely require a completely different way of developing models, Sutskever said. For instance, he believes models are missing something similar to the emotions that guide human judgment and decision-making. Models also need a “value function,” something humans have that tells us whether a given decision will lead to a good or bad outcome, he said.

If you’re wondering how long it might take to get to this kind of AGI, don’t hold your breath. Sutskever estimated it could take anywhere from five to 20 years.