Google’s New Robot Learned to Take Orders by Scraping the Web
Last weekend, Google research scientist Fei Xia sat in the center of a bright, open-plan kitchen and typed a command into a laptop connected to a one-armed, wheeled robot resembling a large floor lamp. “I am hungry,” he wrote. The robot promptly rolled over to a nearby counter, carefully picked up a bag of multigrain chips with a large plastic pincer, and returned to Xia to offer him the snack.
The most impressive thing about that demonstration, held in Google’s robotics lab in Mountain View, California, was that no human programmer had told the robot what to do in response to Xia’s command. Its control software had learned to translate a spoken phrase into a sequence of physical actions using millions of pages of text pulled from the web.
That means a person doesn’t have to use specific pre-approved words to give commands, as can be necessary with virtual assistants like Alexa or Siri. Tell the robot “I’m parched” and it will try to find you something to drink; tell it “Oops, I just spilled my drink” and it should come back with a sponge.
“To cope with the diversity of the real world, robots need to be able to adapt and learn from their experiences,” said Karol Hausman, a senior research scientist at Google, during the demo, which included the robot carrying a sponge over to clean up a spill. To interact with humans, machines must learn to grasp how words can be put together in multiple ways to create different meanings. “Understanding all the subtleties and intricacies of language is up to the robot,” Hausman said.
Google’s demo is a step toward the long-term goal of creating robots capable of interacting with humans in complex environments. Over the past few years, researchers have discovered that feeding large amounts of text pulled from books or the web into large machine learning models can yield programs with impressive language skills, including OpenAI’s text generator GPT-3. By digesting the many forms of writing found online, such software can pick up the ability to summarize or answer questions about a text, write coherent articles on a given topic, or even hold focused conversations.
Google and other Big Tech companies are making extensive use of these large language models for search and advertising. Several companies offer the technology through cloud APIs, and new services have emerged that apply AI language capabilities to tasks such as generating code or writing advertising copy. Google engineer Blake Lemoine was recently fired after publicly warning that a chatbot powered by the technology, called LaMDA, might be sentient. A Google vice president who still works at the company wrote in The Economist that chatting with the bot felt like “talking to something intelligent.”
Despite those strides, AI programs are still prone to becoming confused or repeating meaningless text. Language models trained on web text also lack a grasp of truth and often reproduce the biases or hateful language found in their training data, suggesting that careful engineering may be required to reliably guide a robot without it running amok.
The robot demonstrated by Hausman is powered by the most powerful language model Google has announced to date, called PaLM. It can do many tricks, including explaining, in natural language, how it comes to a particular conclusion when answering a question. The same approach is used to generate the sequence of steps the robot will take to perform a given task.