Humans can predict how an object will feel with just a quick glance at it, or even imagine how an object looks by running their fingers over it. But what seems easy for us can be a formidable challenge for machines. Now, a team of engineers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) is developing a robot with that same ability.

To start with, the team used a KUKA robot arm fitted with a tactile sensor called GelSight, developed by Ted Adelson’s group at CSAIL. The information picked up by the sensor was then fed to an artificial intelligence (AI) system so it could learn the relationship between visual and tactile information.


To train the AI to identify objects by touch, the team had the robotic arm touch 200 household objects, such as fabrics, tools, and household products, more than 12,000 times, recording the visual and tactile data generated by each touch. From these recordings, the team compiled a dataset of more than 3 million visual/tactile-paired images, called VisGel.
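As a rough illustration of that pairing step, the dataset can be thought of as a list of touch events, each contributing one visual frame and one tactile frame. The record names and structure below are assumptions for illustration only, not the actual VisGel schema:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Touch:
    """One recorded touch event (hypothetical format, not the real VisGel schema)."""
    object_id: str        # which household object was touched
    visual_frame: bytes   # camera image of the scene (placeholder bytes)
    tactile_frame: bytes  # GelSight reading for the same moment (placeholder bytes)

def build_pairs(touches: List[Touch]) -> List[Tuple[bytes, bytes]]:
    """Pair each visual frame with the tactile frame from the same touch,
    producing the visual/tactile training examples."""
    return [(t.visual_frame, t.tactile_frame) for t in touches]

# Example: two recorded touches yield two visual/tactile pairs.
touches = [
    Touch("mug", b"img0", b"gel0"),
    Touch("fabric", b"img1", b"gel1"),
]
pairs = build_pairs(touches)
print(len(pairs))  # 2
```

In the actual project, roughly 12,000 touches across 200 objects were expanded into over 3 million such paired images; the sketch above only shows the shape of the pairing, not the scale.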

“By looking at the scene, our model can imagine the feeling of touching a flat surface or a sharp edge,” says Yunzhu Li, CSAIL PhD student and lead author of a new paper about the system, in a news release. “By blindly touching around, our model can predict the interaction with the environment purely from tactile feelings. Bringing these two senses together could empower the robot and reduce the data we might need for tasks involving manipulating and grasping objects.”

At the moment, the proposed AI’s interactions are limited to a controlled environment. Collecting data in more unstructured environments, or using data from the MIT-designed “scalable tactile glove” (STAG), could help it work in different settings.

“This is the first method that can convincingly translate between visual and touch signals,” says Andrew Owens, a postdoctoral researcher at the University of California, Berkeley. “Methods like this have the potential to be very useful for robotics, where you need to answer questions like ‘is this object hard or soft?’, or ‘if I lift this mug by its handle, how good will my grip be?’ This is a very challenging problem, since the signals are so different, and this model has demonstrated great capability.”