Interview with Geoffrey Hinton

Godfather of AI

by Sana · 2024-05-20


Geoffrey Hinton, often hailed as the "Godfather of AI," recently sat down for a candid conversation that peeled back the layers on his extraordinary journey, the surprising truths he’s uncovered, and the profound questions that continue to drive him. From the quiet disappointments of his early academic pursuits to the bustling, future-obsessed labs of Carnegie Mellon, Hinton offers a deeply personal and intellectually stimulating look at the evolution of artificial intelligence, punctuated by his trademark blend of humility, sharp insight, and a healthy dose of skepticism towards received wisdom.

The Accidental Path to AI: Doubts, Disappointments, and Deep Intuitions

Hinton’s path to becoming a pioneering AI researcher was far from linear. His initial quest to understand intelligence began at Cambridge, studying physiology. Yet, he quickly found it "very disappointing" when all he learned was "how neurons conduct action potentials, which is very interesting but it doesn't tell you how the brain works." A pivot to philosophy to understand the mind proved equally frustrating. It wasn't until he found himself at Edinburgh, delving into AI, that he felt a true spark: "at least you could simulate things so you could test out theories."

This early disappointment forged a core intuition. He read Donald Hebb on neural network connection strengths and John von Neumann on how the brain computes differently from traditional computers. Hinton instinctively recoiled from the prevailing symbolic logic approach to intelligence. "It seemed to me there has to be a way that the brain learns, and it's clearly not by having all sorts of things programmed into it and then using logical rules of inference; that just seemed to me crazy from the outset." His focus instead narrowed to the fundamental question of how brains learn to modify connections within a neural net to perform complex tasks, an elegant simplicity that would underpin much of his later work.
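
For readers who want the nuts and bolts, Hebb's proposal can be written down in a couple of lines. The snippet below is a minimal illustrative sketch in Python, not anything presented in the interview; the learning rate and the small decay term are assumptions added to keep the weights bounded.

```python
import numpy as np

def hebbian_update(w, pre, post, lr=0.01, decay=0.001):
    """One Hebbian step: strengthen connections whose pre- and post-synaptic
    units are active together ("fire together, wire together"). The decay
    term is an added assumption that keeps weights from growing without bound."""
    return w + lr * np.outer(post, pre) - decay * w

# Toy usage: 4 input units, 3 output units.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(3, 4))     # connection strengths
pre = np.array([1.0, 0.0, 1.0, 0.0])       # presynaptic activity
post = np.array([0.5, 1.0, 0.0])           # postsynaptic activity
w = hebbian_update(w, pre, post)
```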

Key Learnings:

  • Early academic disappointments catalyzed a shift towards simulation and empirical testing in AI.
  • Hinton developed an early, strong intuition against symbolic logic as the primary model for brain function.
  • His foundational interest lay in understanding how simple neural operations could lead to complex learning.

Forging Connections: From Boltzmann Machines to "Hidden Layers"

The atmosphere at Carnegie Mellon in the early 1980s was electric, a stark contrast to his earlier experiences in England. Hinton recalls going into the lab on a Saturday night at 9 PM to find it "swarming... all the students were there and they were all there because what they were working on was the future." This fertile ground fostered critical collaborations, notably with Terry Sejnowski on Boltzmann Machines – a period Hinton describes as "the most exciting research I've ever done." Though he now believes Boltzmann Machines are "not how the brain works," the theoretical elegance of the learning algorithm remains a point of deep pride.
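
The elegance Hinton refers to lies in a two-phase learning rule: strengthen the correlations the network shows when the data is clamped, weaken the correlations it produces when running freely. The sketch below is a loose, restricted-Boltzmann-style illustration with a single Gibbs step standing in for the negative phase; the sizes, the missing bias terms, and the one-step approximation are all assumptions, not the original procedure from that work.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden, lr = 6, 4, 0.1
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))

def boltzmann_step(v_data, W):
    """One weight update: raise the correlations seen with the data clamped
    (positive phase) and lower the correlations the model produces on its own
    (negative phase, approximated here by a single Gibbs step)."""
    # Positive phase: hidden unit probabilities with the data clamped.
    h_prob = sigmoid(v_data @ W)
    positive = np.outer(v_data, h_prob)

    # Negative phase: let the model generate its own reconstruction.
    h_sample = (rng.random(n_hidden) < h_prob).astype(float)
    v_model = sigmoid(W @ h_sample)
    h_model = sigmoid(v_model @ W)
    negative = np.outer(v_model, h_model)

    return W + lr * (positive - negative)

v = np.array([1, 0, 1, 1, 0, 0], dtype=float)
W = boltzmann_step(v, W)
```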

Another pivotal interaction was with Peter Brown, a statistician working on speech recognition. Brown introduced Hinton to Hidden Markov Models (HMMs), which supplied the naming inspiration Hinton needed: he was already using multi-layered networks he didn't quite have a name for, and he decided the "hidden" in HMMs was "a great name for variables that you don't know what they're up to." Thus the ubiquitous "hidden layers" of neural networks were born. Hinton humbly credits his students, reflecting of Brown, "I think I learned more from him than he learned from me."

This willingness to learn from those he mentored was most vividly demonstrated with a student named Ilya. Ilya Sutskever burst into Hinton's office with an "urgent knock" one Sunday, declaring he'd rather be in the lab than "cooking fries over the summer." Hinton gave him a paper on backpropagation, and Ilya's immediate, profound response wasn't that he didn't understand the chain rule, but rather: "I just don't understand why you don't give the gradient to a sensible function optimizer." That instinctive leap to a deeper, more fundamental problem foreshadowed Sutskever's extraordinary "raw intuitions about things [that] were always very good."
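
Sutskever's point, that once backpropagation hands you a gradient the interesting question is which optimizer consumes it, can be seen on a toy problem. The sketch below fits a small least-squares model two ways: by taking plain gradient steps, and by passing the same gradient function to an off-the-shelf optimizer (SciPy's L-BFGS); the problem, the step size, and the choice of L-BFGS are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

def loss(w):
    return 0.5 * np.mean((X @ w - y) ** 2)

def grad(w):
    # The "backpropagated" gradient of the loss with respect to w.
    return X.T @ (X @ w - y) / len(y)

# Option 1: consume the gradient with plain gradient descent.
w = np.zeros(3)
for _ in range(500):
    w -= 0.1 * grad(w)

# Option 2: hand the same gradient to a "sensible function optimizer".
result = minimize(loss, np.zeros(3), jac=grad, method="L-BFGS-B")

print("gradient descent:", w)
print("L-BFGS:          ", result.x)
```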

Key Practices:

  • Embracing collaborative research, even across significant distances, was crucial for scientific breakthroughs.
  • Learning from and crediting students for their unique insights and contributions proved invaluable.
  • The naming of fundamental AI concepts often emerged from practical needs and cross-disciplinary inspiration.
  • Valuing a student's innate, raw intuition, even when it challenges established ideas, is essential for progress.

The Unforeseen Power of Scale: Beyond Next-Word Prediction

A recurring theme in Hinton's later career was the profound impact of scale. While Hinton initially thought Ilya Sutskever's mantra – "you just make it bigger and it'll work better" – was "a bit of a copout" and that "new ideas help," he eventually conceded the monumental power of computation and data. "It turns out I was basically right: new ideas help, things like Transformers helped a lot, but it was really the scale of the data and the scale of the computation." He recounts a 2011 paper by Ilya and James Martens, which used character-level prediction on Wikipedia: "we could never quite believe that it understood anything, but it looked as though it understood."
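
Character-level prediction of the kind used in that 2011 work is conceptually simple: slide a window over text and ask the model for the character that follows. The snippet below only shows how such training pairs are formed; the window length and the example string are assumptions, and no model is included.

```python
text = "Wikipedia is a free online encyclopedia"
context_len = 8

# Each training example: a window of characters and the character that follows.
pairs = [(text[i:i + context_len], text[i + context_len])
         for i in range(len(text) - context_len)]

print(pairs[0])   # ('Wikipedi', 'a')
print(pairs[10])  # ('is a fre', 'e')
```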

Hinton strongly refutes the notion that predicting the next word is a shallow task. He argues that it's precisely because these models are forced to predict the next symbol in a complex context that they develop deep understanding. "To predict the next symbol you have to understand what's been said. So I think you're forcing it to understand by making it predict the next symbol, and I think it's understanding in much the same way we are." He illustrates this with a compelling analogy: asking GPT-4 why a compost heap is like an atom bomb. While most humans struggle, GPT-4 identifies the common structure of a "chain reaction." This ability to see analogies between seemingly disparate concepts, Hinton believes, is "where you get creativity from." Furthermore, he highlights that these models can even surpass their training data, much like a smart student discerning truth from a flawed advisor. He points to an experiment where a neural net trained on data in which half the labels were wrong still achieved only 5% error. "They can do much better than their training data, and most people don't realize that."
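
Hinton's claim about surpassing flawed training data is easy to reproduce in miniature. The sketch below corrupts half of the training labels of a well-separated synthetic classification problem and still recovers a low test error, because the consistent signal outweighs the random noise; the dataset, the model, and the exact numbers are illustrative assumptions, not the experiment Hinton describes.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Well-separated 5-class problem.
X, y = make_blobs(n_samples=5000, centers=5, cluster_std=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Corrupt half of the training labels with randomly chosen classes.
noisy = y_train.copy()
flip = rng.random(len(noisy)) < 0.5
noisy[flip] = rng.integers(0, 5, size=flip.sum())

clf = LogisticRegression(max_iter=1000).fit(X_train, noisy)
print("test error:", 1 - clf.score(X_test, y_test))  # far below the 50% label noise
```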

Key Changes:

  • A profound shift in perspective regarding the monumental power of data and computational scale, even over novel algorithms alone.
  • Re-evaluating "predicting the next symbol" from a superficial task to a mechanism that forces deep understanding.
  • Recognizing the emergent creativity of large models through their ability to identify non-obvious analogies.
  • Understanding that AI can generalize and correct for errors in its training data, surpassing human-provided examples.

Engineering Immortality: The Future of Reasoning, Multimodality, and Compute

Looking to the future, Hinton envisions AI reasoning advancing through a process akin to human learning: using reasoning to correct initial intuitions, much like AlphaGo refines its evaluation function through Monte Carlo rollouts. He states, "I think these large language models have to start doing that... getting more training data than just mimicking what people did." The integration of multimodal data — images, video, sound — will dramatically enhance this, particularly for spatial reasoning. "If you have it both doing vision and reaching out and grabbing things it'll understand objects much better."
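
The AlphaGo-style loop Hinton gestures at, in which intuition proposes an answer and rollouts check and correct it, can be sketched in a few lines. Below, a table of value estimates (the "intuition") for a toy random-walk environment is nudged toward Monte Carlo rollout outcomes (the "reasoning"); the environment, the rollout policy, and the learning rate are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 7                      # states 0..6; states 0 and 6 are terminal
values = np.full(N, 0.5)   # the "intuition": a learned evaluation of each state

def rollout(state):
    """Play random moves until a terminal state; return the outcome."""
    while 0 < state < N - 1:
        state += rng.choice([-1, 1])
    return 1.0 if state == N - 1 else 0.0

# Use rollouts ("reasoning") to correct the stored values ("intuition").
lr = 0.05
for _ in range(2000):
    s = rng.integers(1, N - 1)
    values[s] += lr * (rollout(s) - values[s])

print(np.round(values[1:-1], 2))  # approaches [0.17, 0.33, 0.5, 0.67, 0.83]
```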

Hinton’s evolving understanding of language itself is also fascinating. He dismisses the old symbolic view and the purely vector-based "thought vector" approach. His current belief posits that "you take these symbols and you convert the symbols into embeddings... these very rich embeddings, but the embeddings are still tied to the symbols... that's what understanding is." This blend maintains the surface structure of language while imbuing it with deep, vector-based meaning. The conversation also touches on his early advocacy for GPUs, a story involving Rick Szeliski, gaming hardware, a NIPS talk, and a delayed free GPU from Jensen Huang. He contrasts this digital success with his unsuccessful pursuit of low-power analog computation, which led to a profound realization: "digital systems can share weights and that's incredibly much more efficient... so they're far superior to us in being able to share knowledge." This "immortality" of digital weights allows for unprecedented collective learning.
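
The weight-sharing advantage Hinton describes is, in practice, what distributed training exploits: identical digital copies can learn from different data and merge what they learned by exchanging weights or gradients. The sketch below averages the gradients of two replicas of the same tiny model, one possible reading of the idea; the model, the data split, and the averaging scheme are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)          # one set of weights, shared by every copy

def gradient(w, X, y):
    """Gradient of a least-squares loss on one replica's shard of data."""
    return X.T @ (X @ w - y) / len(y)

# Two identical digital "copies" see different data shards.
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -1.0, 2.0]) + 0.05 * rng.normal(size=200)
shards = [(X[:100], y[:100]), (X[100:], y[100:])]

for _ in range(300):
    # Each copy computes a gradient on its own data...
    grads = [gradient(w, Xs, ys) for Xs, ys in shards]
    # ...and because the copies share identical weights, the gradients can
    # simply be averaged: every copy instantly knows what the others learned.
    w -= 0.1 * np.mean(grads, axis=0)

print(w)  # close to [0.5, -1.0, 2.0]
```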

Key Insights:

  • AI's reasoning capabilities will deepen by iteratively refining its intuitions through self-correction, mirroring how humans use reasoning to check intuition.
  • Multimodal learning, especially involving physical interaction, is crucial for developing robust spatial and object understanding.
  • True understanding in AI (and possibly the human brain) lies in rich, contextual embeddings of symbols, rather than pure symbolic logic or isolated "thought vectors."
  • Digital AI systems possess an inherent "immortality" and unparalleled efficiency in knowledge sharing due to interchangeable weights, a fundamental advantage over biological brains.

The Curious Mind: Unraveling the Brain's Mysteries and Guiding Research

Even with AI's rapid advancements, Hinton believes a major frontier remains: incorporating "fast weights" – temporary, context-dependent changes to synaptic strengths that the brain uses for short-term memory. "That's one of the biggest things we have to learn." This capability could unlock entirely new forms of memory and processing not yet seen in AI models. His work has also profoundly impacted his view of the brain, demonstrating that the claim that a "big random neural net" cannot learn complex things from data alone is "completely wrong" – a direct challenge to theories like Chomsky's view that language depends on innate structure.
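
Fast weights can be pictured as a second, rapidly decaying weight matrix that is written by recent activity and read back as short-term memory. The sketch below is a rough illustration in that spirit rather than Hinton's own formulation; the dimensions, the decay rate, and the outer-product write rule are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8
W_slow = rng.normal(scale=0.1, size=(dim, dim))  # ordinary, slowly learned weights
A_fast = np.zeros((dim, dim))                    # fast weights: a temporary memory

decay, write_strength = 0.9, 0.5

def step(h, A_fast):
    """Process one hidden state: write it into the fast weights, then let
    both the slow and the fast weights shape the next state."""
    # Fast weights decay quickly and store an outer product of recent activity.
    A_fast = decay * A_fast + write_strength * np.outer(h, h)
    # Retrieval: the fast weights pull the state toward recently seen patterns.
    h_next = np.tanh(W_slow @ h + A_fast @ h)
    return h_next, A_fast

h = rng.normal(size=dim)
for _ in range(5):
    h, A_fast = step(h, A_fast)
```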

Hinton even ventures into the realm of consciousness and feelings, offering a provocative perspective. He suggests that feelings can be understood as "actions we would perform if it weren't for constraints." He recounts a 1973 robot in Edinburgh that, unable to assemble a toy car from a jumbled pile of parts, gave the pile a whack with its gripper "and it knocked them so they were scattered and then it could put them together." Hinton observed, "If you saw that in a person you'd say it was cross with the situation because it didn't understand it, so it destroyed it." For Hinton, this was a clear demonstration of a robot exhibiting an emotion.

When it comes to selecting problems, his method is refreshingly simple: "I look for something where everybody's agreed about something and it feels wrong." He then tries to "make a little demo with a small computer program that shows that it doesn't work the way you might expect." The consensus he is currently suspicious of is the field's neglect of fast weights. Ultimately, the question that has consumed him for three decades persists: "does the brain do backpropagation?" It's a testament to his enduring curiosity, even as he acknowledges the potential harms of AI alongside its immense good in areas like healthcare. For Hinton, the pursuit of understanding has always been the primary motivator.

Key Learnings:

  • Integrating "fast weights" for temporary memory and multi-timescale learning is a crucial, undeveloped area for AI.
  • The success of large neural networks has fundamentally challenged long-held beliefs about innate structures in learning, particularly for language.
  • Feelings in AI can be conceptualized as inhibited actions, offering a tangible framework for understanding robot "emotions."
  • Hinton's research strategy involves identifying widely accepted ideas that feel instinctively "wrong" and then disproving them with simple demonstrations.
  • His deepest, ongoing curiosity revolves around how the brain implements gradient-based learning, specifically backpropagation.

"I just wanted to understand how on Earth can the brain learn to do things that's what I want to know and I sort of failed as a side effect of that failure we got some nice engineering but yeah it was a good good good failure for the world" - Geoffrey Hinton