MIT CSAIL scientists created an algorithm to solve one of the hardest tasks in computer vision: assigning a label to every pixel in the world, without human supervision.
Labeling data can be a chore. It’s the main source of sustenance for computer-vision models; without it, they’d have a lot of difficulty identifying objects, people, and other important image characteristics. Yet producing just an hour of tagged and labeled data can take a whopping 800 hours of human time. Our high-fidelity understanding of the world develops as machines can better perceive and interact with our surroundings. But they need more help.
Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), Microsoft, and Cornell University have attempted to solve this problem plaguing vision models by creating “STEGO,” an algorithm that can jointly discover and segment objects without any human labels at all, down to the pixel.