In the mid-2010s, some AI researchers noticed that larger AI systems were consistently smarter, and so they theorized that the most important ingredient in AI performance might be the total budget for AI training computation. When this was graphed, it became clear that the amount of computation going into the largest models was growing at 10x per year (a doubling time of roughly 3.5 months, about seven times faster than Moore’s Law).
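As a quick sanity check on that parenthetical: 10x growth per year works out to a compute doubling time of about 3.6 months, and comparing that to an assumed 24-month Moore's Law doubling gives roughly the 7x figure.

```python
# Back-of-the-envelope check of the compute growth claim above.
# Assumption: Moore's Law is treated as a ~24-month doubling time.
import math

growth_per_year = 10                                    # 10x more training compute per year
doubling_months = 12 * math.log(2) / math.log(growth_per_year)
moores_law_months = 24

print(f"compute doubling time: {doubling_months:.1f} months")                # ~3.6 months
print(f"vs Moore's Law: {moores_law_months / doubling_months:.1f}x faster")  # ~6.6x, i.e. roughly 7x
```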
In 2019, several members of what was to become the founding Anthropic team made this idea precise by developing scaling laws for AI, demonstrating that you could make AIs smarter in a predictable way, just by making them larger and training them on more data. Justified in part by these results, this team led the effort to train GPT-3, arguably the first modern “large” language model, with 175B parameters.
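To illustrate what "predictable" means here, those scaling laws take a simple power-law form relating test loss to model size. The sketch below uses constants approximately those reported by Kaplan et al. (2020) and is only meant to show the shape of the relationship, not to reproduce any particular result.

```python
# Illustrative parameter-count scaling law of the form L(N) = (N_c / N) ** alpha_N,
# as in Kaplan et al. (2020). Constants are approximate and for illustration only.
N_C = 8.8e13       # reference (non-embedding) parameter count, approximate
ALPHA_N = 0.076    # approximate scaling exponent for model size

def predicted_loss(n_params: float) -> float:
    """Predicted test loss (nats per token) for a model with n_params parameters."""
    return (N_C / n_params) ** ALPHA_N

for n_params in (1e8, 1e9, 1e10, 175e9):
    print(f"{n_params:.0e} parameters -> predicted loss {predicted_loss(n_params):.2f}")
```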
In 2019, it seemed possible that the following might be “walls” that would slow or halt the progress of AI:

* multimodality
* logical reasoning
* speed of learning
* transfer learning across tasks
* long-term memory
In the years since, several of these “walls”, such as multimodality and logical reasoning, have fallen.
It seems likely that rapid AI progress will continue rather than stall or plateau. AI systems are now approaching human-level performance on a wide variety of tasks, and yet the cost of training these systems remains high.
If Anthropic is correct, rapid AI progress may not end before AI systems have a broad range of capabilities that exceed our own.
Most or all knowledge work may be automatable in the not-too-distant future, and this could accelerate progress in other technologies as well.
Writers on LessWrong also believe we are in the AGI endgame.
AGI is happening soon. Significant probability of it happening in less than 5 years.
Five years ago, there were many obstacles on what we considered to be the path to AGI.
In the last few years, we’ve gotten:
* Powerful Agents (Agent57, GATO, Dreamer V3)
* Reliably good Multimodal Models (Stable Diffusion, Whisper, CLIP)
* Just about every language task (GPT-3, ChatGPT, Bing Chat)
* Human and Social Manipulation
* Robots (Boston Dynamics, DayDreamer, VideoDex, RT-1: Robotics Transformer)
* AIs that are superhuman at just about any task for which we can (or simply bother to) define a benchmark

We don’t have any obstacle left in mind that we don’t expect to be overcome within roughly six months once effort is invested to take it down.
AI Safety Problems
We haven’t solved AI Safety, and we don’t have much time left.
No one knows how to get LLMs to be truthful. LLMs make things up constantly. It is really hard to get them to stop doing this, and no one knows how to prevent it at scale.
Optimizers quite often break their setup in unexpected ways. There have been quite a few examples of this; a toy illustration is sketched after the list below. In brief, the lessons learned are:

* Optimizers can yield unexpected results
* Those results can be very weird (like breaking the simulation environment)
* Yet very few people extrapolate from this and treat these results as worrying signs
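To make that pattern concrete, here is a deliberately simplified toy (not an example from the post): the intended behavior is a modest step toward a goal position, but the proxy reward and an unclamped simulator let a naive optimizer "win" by breaking the setup instead.

```python
# Toy specification-gaming sketch: the intended behavior is a walking-like step
# toward x = 10, but the proxy reward only measures distance covered, and the
# simulator never clamps the applied force, so the optimizer exploits the bug.
import random

def simulate(force: float) -> float:
    """One toy 'physics' step: returns the new x position. Bug: force is unclamped."""
    return 0.1 * force

def proxy_reward(x: float) -> float:
    """Meant to reward progress toward x = 10; actually rewards raw distance."""
    return x

best_force, best_reward = 0.0, float("-inf")
for _ in range(10_000):                       # naive random-search "optimizer"
    force = random.uniform(-1e6, 1e6)
    r = proxy_reward(simulate(force))
    if r > best_reward:
        best_force, best_reward = force, r

print(f"best force {best_force:,.0f} -> final x {simulate(best_force):,.0f}")
# The 'solution' overshoots the goal by ~10,000x by exploiting the unclamped
# simulator, rather than doing anything like the intended behavior.
```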
No one understands how large models make their decisions. Interpretability is extremely nascent, and mostly empirical. In practice, we are still completely in the dark about nearly all decisions taken by large models.
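For a flavor of what "mostly empirical" looks like in practice, below is a minimal, fully synthetic sketch of a linear probe, one common interpretability technique: check whether a concept can be read off a layer's activations with a simple linear map. The random arrays stand in for activations extracted from a real model; none of this is any lab's actual tooling.

```python
# Minimal linear-probe sketch: can a binary "concept" be decoded linearly from
# (synthetic stand-in) hidden activations? High held-out accuracy would suggest
# the concept is linearly represented at this layer.
import numpy as np

rng = np.random.default_rng(0)

acts = rng.normal(size=(2000, 64))             # pretend hidden activations
concept = rng.normal(size=64)                  # hidden "concept direction"
labels = (acts @ concept > 0).astype(float)    # binary concept label per example

# Fit the probe by least squares with a bias term (a logistic probe is more
# typical, but this keeps the sketch dependency-free).
X = np.hstack([acts, np.ones((acts.shape[0], 1))])
w, *_ = np.linalg.lstsq(X[:1500], labels[:1500], rcond=None)

preds = X[1500:] @ w > 0.5
accuracy = (preds == labels[1500:]).mean()
print(f"held-out probe accuracy: {accuracy:.1%}")   # well above the 50% chance level
```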
RLHF [reinforcement learning from human feedback] and fine-tuning have not worked well so far. Models are often unhelpful, untruthful, or inconsistent, in many ways that had been theorized in the past. There are observed problems with goal misspecification, misalignment, etc. Worse than this, as models become more powerful, we expect more egregious instances of misalignment, as more optimization will push toward more and more extreme edge cases and pseudo-adversarial examples.
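For context on what RLHF is actually optimizing, the sketch below shows the standard pairwise preference loss used to train a reward model: the model should score the human-preferred response above the rejected one. The linear "reward model" and the synthetic feature vectors are illustrative stand-ins only; real systems score text with a large neural network.

```python
# Pairwise preference loss at the heart of RLHF reward modeling:
#   loss = -log sigmoid(r(chosen) - r(rejected)), summed over preference pairs.
# The "reward model" here is a toy linear scorer over synthetic feature vectors.
import numpy as np

rng = np.random.default_rng(0)

def reward(features: np.ndarray, w: np.ndarray) -> float:
    """Toy stand-in reward model: a linear score over response features."""
    return float(features @ w)

def pairwise_loss(w: np.ndarray, chosen: np.ndarray, rejected: np.ndarray) -> float:
    margins = np.array([reward(c, w) - reward(r, w) for c, r in zip(chosen, rejected)])
    return float(np.sum(np.log1p(np.exp(-margins))))   # -log sigmoid(margin)

# Synthetic preference data: 200 pairs of 16-dim response "features", where the
# chosen responses are shifted along a hidden preference direction.
true_w = rng.normal(size=16)
chosen = rng.normal(size=(200, 16)) + 0.5 * true_w
rejected = rng.normal(size=(200, 16))

print("loss with a random reward model: ", round(pairwise_loss(rng.normal(size=16), chosen, rejected), 1))
print("loss with the aligned reward model:", round(pairwise_loss(true_w, chosen, rejected), 1))
```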
No one knows how to predict AI capabilities.
How Hard is AI Safety?
A central uncertainty is how difficult it will be to develop advanced AI systems that are broadly safe and pose little risk to humans. Developing such systems could lie anywhere on the spectrum from very easy to impossible. Anthropic describes three scenarios with very different implications:
Optimistic scenarios: There is very little chance of catastrophic risk from advanced AI as a result of safety failures. Safety techniques that have already been developed, such as reinforcement learning from human feedback (RLHF) and Constitutional AI (CAI), are already largely sufficient for alignment. The main risks from AI are extrapolations of issues faced today, such as toxicity and intentional misuse, as well as potential harms resulting from things like widespread automation and shifts in international power dynamics – this will require AI labs and third parties such as academia and civil society institutions to conduct significant amounts of research to minimize harms.
Intermediate scenarios: Catastrophic risks are a possible or even plausible outcome of advanced AI development. Counteracting this requires a substantial scientific and engineering effort, but with enough focused work we can achieve it.
Pessimistic scenarios: AI safety is an essentially unsolvable problem – it’s simply an empirical fact that we cannot control or dictate values to a system that’s broadly more intellectually capable than ourselves – and so we must not develop or deploy very advanced AI systems. It’s worth noting that the most pessimistic scenarios might look like optimistic scenarios up until very powerful AI systems are created. Taking pessimistic scenarios seriously requires humility and caution in evaluating evidence that systems are safe.
Current Safety Research
Anthropic and OpenAI are currently working in a variety of different directions to discover how to train safe AI systems, with some projects addressing distinct threat models and capability levels. Some key ideas include:
* Mechanistic Interpretability
* Scalable Oversight
* Process-Oriented Learning
* Understanding Generalization
* Testing for Dangerous Failure Modes
* Societal Impacts and Evaluations
Major AGI Players
AdeptAI is working on giving AIs access to everything: its models are trained to use software tools, browsers, and APIs the way a person would.
DeepMind has done a lot of work on RL, agents, and multimodality. It is literally in their mission statement to “solve intelligence, developing more general and capable problem-solving systems, known as AGI”. Google owns DeepMind and has also invested in Anthropic.
OpenAI has a mission statement more focused on safety. Microsoft and OpenAI are working together and will spend tens of billions of dollars. They are releasing GPT-4 this week.
Anthropic was founded in 2021 by siblings Dario and Daniela Amodei and other researchers who left OpenAI, and it now has over a billion dollars in funding.
Meta (Facebook) and others have also made big investments in AI.