As humans, we take for granted how easy it is to grab something and put it in its rightful place. If a friend asks you to grab a mug, for example, and place it on a hook, it takes very little thinking to do so. Robots, on the other hand, would need a long list of instructions to perform the same act: identify the mug, visually locate the handle, understand that the handle is how it's picked up, use the correct number of contact points to do so, and so on.
Engineers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a new system that allows robots to pick and place a host of different objects they have never encountered before. The system, known as KPAM (Keypoint Affordance Manipulation), collects a series of 3D keypoints (coordinates) on an object and uses them as a visual roadmap to determine what to do with it.
As shown in the video below, the robot needed only three keypoints for the mug: the center of the mug's side, the handle, and the bottom. It used that data to place the mug on a holder. For the shoes, KPAM needed six keypoints to pick and place more than 20 different pairs. The engineers plan to continue developing KPAM for greater generalizability, so it can take on more complex tasks such as unloading dishwashers and cleaning kitchen counters.
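The core idea of acting on a sparse set of 3D keypoints can be sketched in a few lines. The code below is a hypothetical simplification, not MIT's implementation: given a handful of keypoints detected on an object and the positions where those keypoints should end up (e.g. so a mug hangs on a hook), it solves for the rigid transform (rotation plus translation) that maps one set onto the other, using the standard Kabsch alignment. The keypoint coordinates are made up for illustration.

```python
import numpy as np

def solve_placement(detected, target):
    """Find the rotation R and translation t that move the detected
    keypoints onto the target keypoints (Kabsch alignment)."""
    detected = np.asarray(detected, dtype=float)
    target = np.asarray(target, dtype=float)
    cd, ct = detected.mean(axis=0), target.mean(axis=0)
    # Cross-covariance of the centered keypoint sets.
    H = (detected - cd).T @ (target - ct)
    U, _, Vt = np.linalg.svd(H)
    # Sign correction so R is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = ct - R @ cd
    return R, t

# Three illustrative keypoints on a mug: bottom, handle, side-center.
mug_keypoints = [[0.00, 0.00, 0.00],
                 [0.06, 0.00, 0.05],
                 [0.00, 0.00, 0.10]]
# Where those keypoints should end up so the mug sits on the holder
# (here, simply shifted by a made-up offset in the robot's frame).
goal_keypoints = [[0.50, 0.20, 0.30],
                  [0.56, 0.20, 0.35],
                  [0.50, 0.20, 0.40]]

R, t = solve_placement(mug_keypoints, goal_keypoints)
moved = np.asarray(mug_keypoints) @ R.T + t
print(np.allclose(moved, goal_keypoints))  # True: keypoints land on the goal
```

The appeal of this representation is that the motion is specified entirely by where a few points should go, so the same target keypoints work for any mug the detector can label, regardless of its exact shape.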
“Understanding just a little bit more about the object — the location of a few key points — is enough to enable a wide range of useful manipulation tasks, and this particular representation works magically well with today’s state-of-the-art machine learning perception and planning algorithms.” — MIT Professor Russ Tedrake