Google DeepMind Intros Generalist AI Which May Lead to AGI


Arxiv – DeepMind introduces Gato, a generalist AI agent that could be a path to AGI (artificial general intelligence).

Inspired by progress in large-scale language modeling, DeepMind applies a similar approach to building a single generalist agent beyond the realm of text outputs. The agent, which DeepMind refers to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm, and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens. In this report, DeepMind describes the model and the data and documents Gato's current capabilities.

A generalist agent. Gato can sense and act with different embodiments across a wide range of environments using a single neural network with the same set of weights. Gato was trained on 604 distinct tasks with varying modalities, observations and action specifications.

Transformer sequence models are effective as multi-task multi-embodiment policies, including for real-world text, vision and robotics tasks. They show promise as well in few-shot out-of-distribution task learning. In the future, such models could be used as a default starting point via prompting or fine-tuning to learn new behaviors, rather than training from scratch.
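To fold such disparate modalities into one token sequence, continuous observations and actions must be discretized into a shared vocabulary alongside text tokens. Below is a minimal sketch of one such scheme: mu-law companding followed by uniform binning, with the resulting bins offset past a text vocabulary. The constants (mu, bin count, offset) are illustrative assumptions, not Gato's documented values.

```python
import numpy as np

def mu_law_encode(x, mu=100, m=256):
    # Compand continuous values into [-1, 1]; constants are illustrative.
    return np.sign(x) * np.log(np.abs(x) * mu + 1.0) / np.log(m * mu + 1.0)

def tokenize_continuous(x, num_bins=1024, offset=32000):
    # Clip, discretize into uniform bins, and shift past an assumed text
    # vocabulary so continuous tokens and text tokens share one sequence.
    y = np.clip(mu_law_encode(np.asarray(x, dtype=np.float64)), -1.0, 1.0)
    bins = ((y + 1.0) / 2.0 * (num_bins - 1)).round().astype(int)
    return bins + offset

tokens = tokenize_continuous([-0.5, 0.0, 0.25])
print(tokens)  # three integers in [32000, 33023]
```

Once every modality maps into one integer vocabulary, a standard decoder-only transformer can be trained on the flattened sequences with the usual next-token objective.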

Given scaling-law trends, performance across all tasks, including dialogue, should increase with scale in parameters, data, and compute. Better hardware and network architectures will allow training bigger models while maintaining real-time robot control capability. By scaling up and iterating on this same basic approach, DeepMind can build a useful general-purpose agent.

GATO Robotics – RGB Stacking Benchmark (real and sim)

As a testbed for taking physical actions in the real world, they chose the robotic block stacking environment introduced by Lee et al. (2021). The environment consists of a Sawyer robot arm with 3-DoF Cartesian velocity control, an additional rotational DoF for wrist velocity, and a discrete gripper action. The robot’s workspace contains three plastic blocks colored red, green and blue with varying shapes. The available observations include two 128 × 128 camera images, robot arm and gripper joint angles, and the robot’s end-effector pose. Notably, ground-truth state information for the three objects in the basket is not observed by the agent. Episodes have a fixed length of 400 timesteps at 20 Hz, for a total of 20 seconds, and at the end of an episode the blocks are randomly re-positioned within the workspace. The robot in action is shown in Figure 4. There are two challenges in this benchmark:


1. Skill Mastery (where the agent is provided data from the 5 test object triplets it is later tested on), and

2. Skill Generalization (where data can only be obtained from a set of training objects that excludes the 5 test sets).
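The interface described above can be summarized roughly as follows. The field names and camera labels are illustrative assumptions for this sketch, not the benchmark's actual API.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Observation:
    # Two RGB cameras at 128x128 plus proprioception; no object state.
    camera_a_rgb: np.ndarray       # shape (128, 128, 3), uint8
    camera_b_rgb: np.ndarray       # shape (128, 128, 3), uint8
    joint_angles: np.ndarray       # arm + gripper joint positions
    end_effector_pose: np.ndarray  # position + orientation of the tool tip

@dataclass
class Action:
    # 3-DoF Cartesian velocity, one rotational DoF, discrete gripper command.
    cartesian_velocity: np.ndarray  # shape (3,)
    wrist_velocity: float
    gripper: int                    # e.g. 0 = open, 1 = close

EPISODE_STEPS = 400  # fixed-length episodes
CONTROL_HZ = 20      # 400 steps / 20 Hz = 20 s per episode
```

Note that because the agent never sees ground-truth object poses, any stacking policy must infer block positions from the two camera streams alone.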

They used several sources of training data for these tasks. In Skill Generalization, for both simulation and real, they use data collected by the best generalist sim2real agent from Lee et al. (2021). They collected data only when interacting with the designated RGB-stacking training objects (this amounts to a total of 387k successful trajectories in simulation and 15k trajectories in real). For Skill Mastery, DeepMind used data from the best per-group experts from Lee et al. (2021) in simulation and from the best sim2real policy on the real robot (amounting to 219k trajectories in total). Note that this data is only included for specific Skill Mastery experiments.

AI Progress Via Deep Learning and Other Recent AI Developments

Geoff Hinton joins Pieter in a two-part season finale for a wide-ranging discussion inspired by insights gleaned from Hinton’s journey from academia to Google Brain. The episode covers how existing neural networks and backpropagation models operate differently than how the brain actually works; the purpose of sleep; and why it’s better to grow our computers than manufacture them.

What’s in this episode:

00:00:00 – Introduction

00:02:48 – Understanding how the brain works

00:06:59 – Why we need unsupervised local objective functions

00:09:39 – Masked auto-encoders

00:10:55 – Current methods in end-to-end learning

00:18:36 – Spiking neural networks

00:23:00 – Leveraging spike times

00:29:55 – The story behind AlexNet

00:36:15 – Transition from pure academia to Google

00:40:23 – The secret auction of Hinton’s company at NeurIPS

00:44:18 – Hinton’s start in psychology and carpentry

00:54:34 – Why computers should be grown rather than manufactured

01:06:57 – The function of sleep and Boltzmann Machines

01:11:49 – Need for negative data

01:19:35 – Visualizing data using t-SNE

Complex Reasoning With Large Language Models

Arxiv – Least-to-Most Prompting Enables Complex Reasoning in Large Language Models

We propose a novel prompting strategy, least-to-most prompting, that enables large language models to better perform multi-step reasoning tasks. Least-to-most prompting first reduces a complex problem into a list of subproblems, and then sequentially solves the subproblems, whereby solving a given subproblem is facilitated by the model’s answers to previously solved subproblems. Experiments on symbolic manipulation, compositional generalization and numerical reasoning demonstrate that least-to-most prompting can generalize to examples that are harder than those seen in the prompt context, outperforming other prompting-based approaches by a large margin. A notable empirical result is that the GPT-3 code-davinci-002 model with least-to-most prompting can solve the SCAN benchmark with an accuracy of 99.7% using 14 examples. As a comparison, the neural-symbolic models in the literature specialized for solving SCAN are trained with the full training set of more than 15,000 examples.

Meta AI's LeCun Proposes Six-Module Common-Sense AI

Meta's LeCun proposed an architecture of six separate, differentiable modules that can easily compute gradient estimates of an objective function with respect to their inputs and propagate that information to upstream modules. This common-sense architecture could help AI systems achieve autonomous intelligence. The six modules are the configurator, perception, world model, short-term memory, actor, and cost.

The configurator module is for executive control, such as executing a given task. It is also responsible for pre-configuring the perception, world model, cost, and actor modules by modulating their parameters.

The perception module receives signals from sensors and estimates the current state of the world, but only a small subset of the perceived state of the world is relevant and valuable for a given task.

The world model module has two roles, and it is the most complex piece of the architecture. The first role is to estimate missing information about the state of the world that perception does not provide. The second role is to predict plausible future states of the world, including its natural evolution. The world model module acts as a simulator for the task at hand, representing multiple possible predictions.

The cost module predicts the level of discomfort of the agent and has two submodules: the intrinsic cost and the critic. The former submodule is immutable and computes discomforts such as damage to the agent, violation of hard-coded behavioral constraints, etc. The latter submodule is trainable and predicts future values of the intrinsic cost.

The actor module computes proposals for action sequences. “The actor can find an optimal action sequence that minimizes the estimated future cost and output the first action in the optimal sequence, in a fashion similar to classical optimal control,” LeCun says.

The short-term memory module keeps track of the current and predicted world state and associated costs.
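One way to read the six modules above as a perceive-plan-act loop is sketched below. All class and method names here are illustrative assumptions, not an API from LeCun's proposal.

```python
class Agent:
    """Illustrative wiring of the six proposed modules."""

    def __init__(self, configurator, perception, world_model,
                 cost, actor, memory):
        self.configurator = configurator  # executive control
        self.perception = perception      # sensors -> world-state estimate
        self.world_model = world_model    # fills in missing state, predicts futures
        self.cost = cost                  # intrinsic cost + trainable critic
        self.actor = actor                # proposes candidate action sequences
        self.memory = memory              # short-term memory of states and costs

    def step(self, sensors, task):
        # The configurator pre-conditions the other modules for the task.
        self.configurator.configure(task, [self.perception, self.world_model,
                                           self.cost, self.actor])
        state = self.perception.estimate(sensors)
        # Score candidate action sequences by simulating their outcomes with
        # the world model, keeping the plan with the lowest predicted cost.
        best = min(self.actor.propose(state),
                   key=lambda plan: self.cost.evaluate(
                       self.world_model.predict(state, plan)))
        self.memory.store(state, best)
        return best[0]  # execute only the first action, then re-plan
```

Returning only the first action of the minimizing sequence mirrors the receding-horizon style of classical optimal control that LeCun invokes for the actor module.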
