Google DeepMind enables robots to perform novel tasks


New Delhi, Jul 29 (IANS): Google has demonstrated its first vision-language-action (VLA) model for robot control, which shows improved generalisation as well as semantic and visual understanding beyond the robotic data it was exposed to.

This includes interpreting new commands and responding to user requests with rudimentary reasoning, such as reasoning about object categories or high-level descriptions.

 

The Robotic Transformer 2 (RT-2) is a novel vision-language-action model that learns from both web and robotics data and translates this knowledge into generalised instructions for robotic control, according to Google DeepMind.

 

A traditionally trained robot that can pick up a ball may stumble when asked to pick up a cube.

 

RT-2's more flexible approach enables a robot trained on picking up a ball to figure out how to adjust its extremities to pick up a cube or another toy it has never seen before.

 

“We also show that incorporating chain-of-thought reasoning allows RT-2 to perform multi-stage semantic reasoning, like deciding which object could be used as an improvised hammer (a rock), or which type of drink is best for a tired person (an energy drink),” said the DeepMind team.
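
The chain-of-thought behaviour is easier to picture with a concrete format. The sketch below is only an illustration of the general idea, under the assumption that the model emits a short natural-language "plan" before a tokenised low-level action; the function name parse_vla_response and the example strings are hypothetical, not DeepMind's actual interface.

```python
# Illustrative only: a toy parser for a "Plan: ... Action: ..." style response,
# assuming a chain-of-thought VLA model writes a short plan before the action.

def parse_vla_response(response: str) -> dict:
    """Split a 'Plan: ... Action: ...' response into its plan and action parts."""
    plan_part, action_part = response.split("Action:", 1)
    return {
        "plan": plan_part.replace("Plan:", "", 1).strip(),
        "action": action_part.strip(),
    }

if __name__ == "__main__":
    instruction = "Pick up the object that could be used as an improvised hammer."
    # Hypothetical model output: a short plan followed by a tokenised action.
    response = "Plan: pick rock. Action: 1 128 91 241 5 101 127"
    parsed = parse_vla_response(response)
    print(instruction)
    print("Plan:  ", parsed["plan"])
    print("Action:", parsed["action"])
```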

 

The latest model builds upon Robotic Transformer 1 (RT-1), which was trained on multi-task demonstrations.

 

The team performed a series of qualitative and quantitative experiments on RT-2 models across more than 6,000 robotic trials.

 

“Across all categories, we observed increased generalisation performance (more than 3x improvement) compared to previous baselines,” the team said.

 

The RT-2 model shows that, by combining VLM pre-training with robotic data, vision-language models (VLMs) can be transformed into powerful vision-language-action (VLA) models that can directly control a robot.
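
How text output from a language model can directly control a robot is easiest to see with a toy decoder. The sketch below assumes the widely reported VLA recipe of emitting the action as a short string of discretised integer tokens; the bin count, value range and field layout are illustrative assumptions rather than RT-2's published details.

```python
import numpy as np

# Minimal sketch of the common VLA recipe in which a robot action is emitted as a
# short string of discretised integer tokens that a language model can produce
# alongside ordinary text. The bin count, value range and field layout here are
# illustrative assumptions, not RT-2's published configuration.

NUM_BINS = 256
LOW, HIGH = -1.0, 1.0  # assumed normalised range for each action dimension

def detokenize_action(token_string: str) -> dict:
    """Map a token string such as '0 128 91 241 5 101 127 217' back to continuous values."""
    bins = np.array([int(t) for t in token_string.split()])
    terminate = bool(bins[0])  # assumed convention: first token flags end of episode
    # Undo uniform binning for the remaining dimensions (bin centre -> value).
    values = LOW + (bins[1:] + 0.5) * (HIGH - LOW) / NUM_BINS
    return {
        "terminate": terminate,
        "delta_position": values[:3],   # assumed x, y, z translation
        "delta_rotation": values[3:6],  # assumed roll, pitch, yaw
        "gripper": float(values[6]),    # assumed gripper command
    }

if __name__ == "__main__":
    print(detokenize_action("0 128 91 241 5 101 127 217"))
```

On this reading, the same model produces both the natural-language reasoning and the action tokens, which is what lets the web-scale pre-training carry over to robot control.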

“RT-2 is not only a simple and effective modification over existing VLM models, but also shows the promise of building a general-purpose physical robot that can reason, problem solve, and interpret information for performing a diverse range of tasks in the real world,” said Google DeepMind.