With explicit feedback, AI needs less data than you think

October 24, 2022 Artificial Intelligence

We’ve all come to appreciate that AI and machine learning are the magic sauce powering large-scale consumer internet properties. Facebook, Amazon and Instacart boast enormous datasets and huge user counts. Common wisdom suggests that this scale advantage is a powerful competitive moat: it enables far better personalization, better recommendations and, ultimately, a better user experience. In this article, I will show you that this moat is shallower than it seems, and that alternative approaches to personalization can produce outstanding outcomes without relying on billions of data points.

Most of today’s user data is from implicit behaviors

How do Instagram and TikTok understand what you like and don’t like? Sure, there are explicit signals, such as likes and comments. But the vast majority of your interactions aren’t explicit: they are your scrolling behavior, “read more” clicks, and video interactions. Users consume far more content than they produce, so social media platforms rely mostly on those implicit cues to determine what you liked and didn’t like. Did you unmute that Instagram video and watch it for a whopping 30 seconds? Instagram can infer that you’re interested. Did you scroll right past it? OK, not so much.
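
To make the idea of inferring interest from implicit cues concrete, here is a minimal Python sketch. It is not how Instagram or TikTok actually score content. The `ViewEvent` fields, the 30-second cap, and the 0.7/0.3 weights are all assumptions made up for illustration; the only point is that watch time and unmuting can be folded into a rough interest estimate, while an explicit like overrides the guesswork.

```python
from dataclasses import dataclass


@dataclass
class ViewEvent:
    """One hypothetical interaction with a video (all fields are illustrative)."""
    watch_seconds: float  # how long the video stayed on screen
    unmuted: bool         # implicit cue: the user turned the sound on
    liked: bool           # explicit cue: the user tapped "like"


def inferred_interest(event: ViewEvent) -> float:
    """Return a rough interest score between 0 and 1.

    The weights and the 30-second cap are arbitrary assumptions,
    not values taken from any real platform.
    """
    if event.liked:
        # An explicit signal needs no inference.
        return 1.0
    # Scale watch time to [0, 1], treating 30 seconds or more as "fully watched".
    watch_part = min(max(event.watch_seconds, 0.0) / 30.0, 1.0)
    unmute_part = 1.0 if event.unmuted else 0.0
    return 0.7 * watch_part + 0.3 * unmute_part


if __name__ == "__main__":
    watched = ViewEvent(watch_seconds=30.0, unmuted=True, liked=False)
    skipped = ViewEvent(watch_seconds=0.5, unmuted=False, liked=False)
    print(inferred_interest(watched) > inferred_interest(skipped))
```

Notice how much guessing goes into the implicit path: the score depends entirely on weights someone had to pick or learn from large volumes of behavior, whereas the explicit branch is a single unambiguous data point.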