Robots adopt human-like learning techniques by watching us

Robots that previously needed meticulous programming may soon operate with a level of autonomy that was once exclusive to humans. Welcome to the future — where robots learn not by code, but by imitation.
Industries: General
  • DeepMind’s new AI develops new skills just by watching people
  • Carnegie Mellon’s robots learn household chores through video observation
  • With Toyota’s new AI, robots learn to peel vegetables or pour liquids
  • UC Berkeley’s robot folds your laundry faster than you

In the rapidly advancing world of robotics, a new era is upon us — one where robots learn not by code, but by imitation. This leap forward is powered by generative AI, an ingenious force that enables machines to observe and replicate complex human behaviours. These robots can now acquire skills in real-time, from human gestures to nuanced tasks, by analysing and imitating actions seen in the physical world or through digital mediums like videos. This shift from programmed instructions to observational learning opens a myriad of possibilities. Imagine robots in healthcare observing and learning from surgeons to assist in operations, or in art studios, capturing the essence of creativity from seasoned artists. The potential for applications is as limitless as it is breathtaking.

At the heart of this revolution is reinforcement learning — a process where robots refine their abilities through trial and error, much like humans do when learning a new skill. The blend of generative AI with this learning approach results in robots that not only mimic but also enhance and evolve in the tasks they learn. In this article, we will examine specific instances where this technology is already in use — and even excelling. These examples illustrate a significant transformation in robotics: machines that previously needed meticulous programming may soon operate with a level of autonomy that was once exclusive to humans. This isn’t just a step towards more intelligent machines; it’s a stride into the future of autonomous learning, where robots could potentially create solutions to problems we haven’t even encountered yet.
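The trial-and-error loop at the heart of reinforcement learning can be made concrete with a toy example. The sketch below is a minimal tabular Q-learning agent in a hypothetical one-dimensional corridor; it illustrates the general technique only, and is not code from any of the labs discussed here:

```python
import random

def train_corridor_agent(length=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning on a corridor: start at cell 0, reward at the last cell.
    Actions: 0 = step left, 1 = step right."""
    random.seed(0)
    q = [[0.0, 0.0] for _ in range(length)]
    for _ in range(episodes):
        state = 0
        while state != length - 1:
            # Explore occasionally, otherwise exploit the current estimate
            if random.random() < epsilon:
                action = random.randint(0, 1)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else min(length - 1, state + 1)
            reward = 1.0 if next_state == length - 1 else 0.0
            # Trial-and-error update: nudge the estimate toward the observed outcome
            q[state][action] += alpha * (reward + gamma * max(q[next_state]) - q[state][action])
            state = next_state
    return q

q_table = train_corridor_agent()
# After training, the agent prefers "right" in every non-terminal cell
policy = ["right" if q[1] >= q[0] else "left" for q in q_table[:-1]]
```

Each update nudges the agent’s value estimate toward what it actually experienced — which is precisely the “trial and error” the paragraph above refers to.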

DeepMind’s new AI develops new skills just by watching people

In a groundbreaking move by Google DeepMind, artificial intelligence has taken a significant stride towards human-like adaptability. Unlike the conventional AIs that require extensive training with countless examples to perform a simple task, DeepMind’s latest creation is learning in real-time, picking up new skills by merely observing human demonstrators. It’s a technique reminiscent of one of our most profound abilities — the power to quickly learn from each other, a skill that’s been instrumental in human progress. The DeepMind researchers have pushed the boundaries of what we thought possible by developing agents that can navigate a virtual world with the dexterity and insight of a human being. “Our agents succeed at real-time imitation of a human in novel contexts without using any pre-collected human data. We identify a surprisingly simple set of ingredients sufficient for generating cultural transmission”, they note, an accomplishment that opens the door to endowing machines with our cultural wisdom.

These AI agents reside in a digital playground named GoalCycle3D, where they encounter an ever-changing landscape filled with hurdles to overcome. Here, they learn not by rote, but by watching an expert — either a coded entity or a human-controlled one — swiftly navigate through the course. By doing so, they grasp the nuances of the task at hand, honing their skills through reinforcement learning, a trial-and-error approach similar to how humans learn. After their training, these agents were able to emulate the expert’s actions independently, a clear indication that they’re not just memorising sequences but actually understanding the tasks. The researchers have equipped the AI with a predictive focus and a memory module to facilitate this. This setup allows the AI to remember and emulate the expert’s moves, even when these experts are not present, which means they adapt to a broad spectrum of tasks in varying environments.
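The watch-then-reproduce behaviour described above can be sketched schematically. The toy agent below is an invented illustration, not DeepMind’s actual architecture: it uses a simple memory module to record the goal order it observes an expert visit, then replays that cycle once the expert is gone:

```python
class ImitatingAgent:
    """Toy stand-in for the GoalCycle3D setup: the agent cannot see the
    correct goal order directly, but can watch an expert and memorise it."""

    def __init__(self):
        self.memory = []  # remembered goal sequence (the 'memory module')

    def observe_expert(self, expert_path):
        # Watching phase: record the order in which the expert visits goals
        self.memory = list(expert_path)

    def act(self, num_steps):
        # Solo phase: replay the memorised cycle even though the expert is absent
        if not self.memory:
            raise RuntimeError("nothing observed yet")
        return [self.memory[i % len(self.memory)] for i in range(num_steps)]

# Hypothetical course: goals must be visited in the cycle A -> C -> B
expert_demo = ["A", "C", "B"]
agent = ImitatingAgent()
agent.observe_expert(expert_demo)
solo_run = agent.act(6)  # the expert is no longer present
```

The real agents, of course, learn a policy via reinforcement learning rather than literal replay — the point here is only the separation between an observation phase and an expert-free execution phase.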

While these are promising developments, translating this tech from virtual environments to real-life situations isn’t straightforward. The AI’s learning was tested under a consistent human guide during the trials, which raises the question — can it adapt to the unpredictable variety of human instruction? Moreover, the dynamic and random nature of the training environment in GoalCycle3D is challenging to replicate in the less predictable real world. Plus, the tasks were fairly basic, not requiring the kind of fine motor skills that real-world tasks often demand. Still, the progress in AI’s ability to learn socially is an exciting development. As we look ahead to a future shared with intelligent machines, it’s crucial to devise ways to transfer our knowledge to them efficiently and effectively. DeepMind’s AI, with its novel learning capabilities, offers a tantalising glimpse of a future where machines could learn directly from the wealth of human experience, potentially transforming every aspect of how we interact with them. It’s a balancing act between the formal rigours of AI development and the informal, adaptable learning that characterises human intelligence, and DeepMind is walking this tightrope with promising agility.

Carnegie Mellon’s robots learn household chores through video observation

At Carnegie Mellon University, researchers are breaking new ground by teaching robots to perform household chores just by watching videos. Imagine a robot that learns to navigate your home, opening drawers, answering phones, and even cooking for you, simply by observing everyday life. This cutting-edge research promises to completely change the way we integrate robots into our daily routines, transforming them into helpful assistants for tasks we do every day. Deepak Pathak, an assistant professor at CMU, spearheads this project that allows robots to understand how humans interact with their environment. “The robot can learn where and how humans interact with different objects through watching videos. From this knowledge, we can train a model that enables two robots to complete similar tasks in varied environments”, Pathak explains. This learning method is a significant leap from traditional techniques that required humans to manually demonstrate tasks or robots to undergo extensive training in simulated settings — methods that were not only laborious but also fraught with failure.

Pathak’s team has advanced past their earlier work, known as In-the-Wild Human Imitating Robot Learning (WHIRL), which necessitated human demonstrations in the robot’s presence. Their new model, the Vision-Robotics Bridge (VRB), streamlines this process, removing the need for direct human guidance and identical environments. Now, a robot can pick up a new task in a mere 25 minutes of practice. Shikhar Bahl, a PhD student in robotics, shares the team’s excitement: “We were able to take robots around campus and do all sorts of tasks. Robots can use this model to curiously explore the world around them”. This curiosity is directed by a sophisticated understanding of ‘affordances’ — a term borrowed from psychology, indicating the potential actions available to an individual in their environment. By identifying contact points and motions, such as a human opening a drawer, a robot can deduce how to perform this action on any drawer.
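The notion of an affordance can be made concrete with a small data structure. The sketch below is purely illustrative — VRB itself learns affordances with a vision model rather than hand-written rules. It represents an affordance as a contact point plus a post-contact motion, and transfers it to a drawer the robot has never seen:

```python
from dataclasses import dataclass

@dataclass
class Affordance:
    """Where to make contact and how to move afterwards, as extracted from video."""
    contact_offset: tuple   # grasp point relative to the object's origin
    motion: list            # post-contact displacement per step, as (dx, dy) pairs

def apply_affordance(aff, object_origin):
    """Plan a motion for a new object instance from a learned affordance."""
    x = object_origin[0] + aff.contact_offset[0]
    y = object_origin[1] + aff.contact_offset[1]
    path = [(x, y)]
    for dx, dy in aff.motion:
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

# Hypothetical affordance learned from videos of people opening drawers:
# grasp the handle, then pull straight back in small steps
open_drawer = Affordance(contact_offset=(0.0, 0.3), motion=[(0.0, 0.1)] * 3)

# The same affordance applies to a drawer at a location the robot has never seen
plan = apply_affordance(open_drawer, object_origin=(2.0, 1.0))
```

The transfer step — reusing one learned contact-plus-motion pattern on any drawer — is what lets the robot generalise beyond the exact scene it watched.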

Leveraging large datasets like Ego4D, with thousands of hours of first-person footage, and Epic Kitchens, which captures kitchen activities, the team trains their robots to learn from a diverse range of human behaviours. These datasets, previously intended for training computer vision models, are now instrumental in teaching robots to learn from the plethora of internet and YouTube videos that are available. This work is not just about teaching robots — it’s about unlocking their potential to learn autonomously from the boundless resources of the digital world. It’s a future where robots don’t just work for us; they learn from us, becoming more intuitive and helpful in our homes.

With Toyota’s new AI, robots learn to peel vegetables or pour liquids

Toyota Research Institute (TRI) is reshaping the future of robotics with a groundbreaking generative AI technique called Diffusion Policy, which rapidly and reliably equips robots with new skills. With TRI’s method, robots can now learn complex tasks such as flipping pancakes or peeling vegetables, going from novices to skilled performers in mere hours. This efficiency is achieved through a human operator who teleoperates the robot, providing it with a variety of demonstrations to learn from. The robot then processes this information autonomously, applying its new skills to tasks without further human intervention. It’s a game-changer that significantly reduces the time and complexity traditionally involved in training robots.

The Diffusion Policy technique has already empowered robots at TRI to master more than 60 different skills, from manipulating deformable materials to pouring liquids. And it’s done without having to write new code — just by feeding robots fresh demonstration data. The approach draws on the same diffusion processes used in cutting-edge image generation, reaping benefits like adaptability to multi-modal demonstrations, the capability to plan high-dimensional actions over time, and stability in training without the need for constant real-world adjustments. Russ Tedrake, TRI’s vice president of Robotics, celebrates the achievement, noting that their robots are now performing tasks that are “simply amazing” and beyond reach only a year ago. The speed and reliability with which these robots learn new skills, especially in tricky areas like handling soft or irregular materials and objects, mark a significant milestone.
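To give a flavour of the diffusion idea, the sketch below is a deliberately simplified caricature: it starts from random noise and iteratively refines it into a coherent action sequence. In a real diffusion policy, the refinement comes from a trained neural network conditioned on the robot’s observations; here, purely for illustration, the nudge comes from a hypothetical demonstrated trajectory:

```python
import random

def denoise_actions(target, steps=50, seed=0):
    """Caricature of a diffusion policy's inference loop: begin with pure
    noise and iteratively refine it into an action sequence. In a real
    diffusion policy the per-step 'nudge' is predicted by a trained
    network, not computed from the target itself."""
    rng = random.Random(seed)
    actions = [rng.gauss(0.0, 1.0) for _ in target]  # start from noise
    for _ in range(steps):
        # Each denoising step moves the sample a little closer to plausible actions
        actions = [a + 0.2 * (t - a) for a, t in zip(actions, target)]
    return actions

# Hypothetical demonstrated trajectory for a pouring motion (wrist angle per step)
demo = [0.0, 0.2, 0.5, 0.9, 1.2]
trajectory = denoise_actions(demo)
```

Generating a whole trajectory at once, rather than one action at a time, is part of what gives the approach its stability on long, high-dimensional tasks.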

However, the journey doesn’t end here. While TRI’s robots have shown impressive progress, they sometimes falter when applying their skills in varying conditions. Acknowledging these challenges, TRI is committed to developing a broad spectrum of robot behaviours through both physical and simulated training environments. Their vision includes a ‘large behaviour model’ that parallels the sophistication of large language models in understanding and generating written content. With the ambitious aim of teaching robots a thousand skills by 2024, TRI is forging a path toward a future where robots with a rich repertoire of behaviours work seamlessly alongside humans. While there’s still a road to travel before these robots become an everyday presence, Diffusion Policy has certainly set the wheels of innovation in motion, hinting at a new era where the robots of tomorrow are not just tools, but partners in our daily lives.

UC Berkeley’s robot folds your laundry faster than you

At UC Berkeley, a robotic revolution is unfolding in the laundry room — literally. The university’s AUTOLAB has unveiled SpeedFolding, a breakthrough methodology that trains robots to tackle the mundane yet surprisingly complex task of folding laundry. In a world where washing and drying machines have long taken the drudgery out of laundry day, the final frontier of folding has remained stubbornly manual. SpeedFolding aims to consign this chore to the annals of history, boasting a record-breaking speed of folding 30-40 garments per hour, an impressive increase from the previous benchmark of a mere 3-6 folds per hour.

Folding laundry is not as straightforward as it seems. The variety of clothing, from the thin and slippery to those prone to static cling, presents a significant challenge, not to mention the randomness of a laundry basket’s contents. Yet, UC Berkeley’s approach has mastered these hurdles with a system that mimics the human process of folding — analysing over 4,300 actions to predict the shape of laundry items from a crumpled heap, smoothing them out, and then folding them along precise lines.

Central to this innovation is the BiManual Manipulation Network (BiMaMa), a neural network that guides two robotic arms in a harmonious dance of laundry folding. This dual-arm system mimics the coordination found in human limbs, allowing for the efficient smoothing and folding of garments into recognisable shapes. While this method doesn’t exactly enable robots to copy human actions by watching them, it is designed based on our understanding of human movements to achieve similar results. The success rate? An impressive 93 per cent, with most pieces neatly folded in less than two minutes. While this system wasn’t benchmarked against the average teenager’s folding prowess, its efficiency and accuracy are undoubtedly superior. SpeedFolding is more than just a record-breaking feat; it’s a glimpse into a future where robots relieve us of the repetitive and tedious tasks that pepper our daily lives. As UC Berkeley continues to refine this methodology, the day when robots seamlessly integrate into our homes to assist with everyday chores grows ever closer, promising a world where our time is freed up for more engaging pursuits.
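The two-armed folding strategy can be illustrated with a toy planner. The sketch below uses invented names and an idealised rectangular garment; BiMaMa itself is a learned neural network, not hand-written geometry:

```python
def plan_bimanual_fold(width, height):
    """Toy bimanual fold plan for a flattened rectangular garment: fold in
    half along the horizontal midline, with each arm grasping one bottom
    corner and placing it on the matching top corner, mirroring the
    coordination of two human hands."""
    midline = height / 2.0
    left_arm = {"grasp": (0.0, 0.0), "place": (0.0, height)}
    right_arm = {"grasp": (width, 0.0), "place": (width, height)}
    # Both arms execute their pick-and-place simultaneously
    return {"fold_line_y": midline, "left": left_arm, "right": right_arm}

# Hypothetical smoothed shirt, 0.6 m wide and 1.0 m tall
plan = plan_bimanual_fold(width=0.6, height=1.0)
```

Coordinating both grasps in a single plan, rather than moving one arm at a time, is what the dual-arm setup buys: the garment stays taut while it is folded.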

In closing

Today’s robotics technology is advancing so quickly that it’s set to dramatically change how we do many of our everyday tasks. With DeepMind, Carnegie Mellon, Toyota, and UC Berkeley leading the charge, robots are stepping out of the realm of science fiction and into our living rooms, kitchens, and laundry rooms. The implications of these developments extend far beyond mere convenience; they invite us to reconsider our relationship with machines and, perhaps more importantly, with our own time and creativity.

As we marvel at all the incredible things advanced robots can do, it’s crucial to ponder the broader picture as well. For instance, what does all of this mean for employment in sectors that traditionally rely on human labour? And how do we ensure that ethical considerations keep pace with all of these technological advancements, especially when it comes to privacy and autonomy? These questions aren’t just rhetorical; they’re essential checkpoints on our journey forward. The democratisation of such technology also remains a challenge. As the potential for a societal divide based on access to robotic assistance is a real concern, it will be critical to ensure that robotic advancements are accessible to people from all walks of life and to avoid a future dominated by technological elitism.

That said, the potential benefits are undeniably tantalising. Imagine a world where the tedious tasks that consume our days are taken care of by robotic assistants, freeing us humans to pursue more creative, fulfilling endeavours. These developments could enable us to augment our lives in a myriad of ways and completely redefine productivity and leisure. At the same time, it’s clear that the evolution of robotics is not just about what machines can do for us but also about what this means for the future of human potential. The robotics innovations we see now are just the beginning of what could be a remarkable display of human creativity. Whether these advances will benefit everyone or only a few depends on the decisions we make at this very moment. It will be paramount to move forward with optimism as well as caution, and ensure that as our robotic companions learn from us, we also learn from them — particularly about the kind of world we want to create. Not only for ourselves, but also for our children and the generations after them.
