Artificial intelligence system learns concepts shared across video

A machine-learning model can identify the action in a video clip and label it, without the help of humans.

Humans observe the world through a combination of different modalities, like vision, hearing, and our understanding of language. Machines, on the other hand, interpret the world through data that algorithms can process.

So, when a machine “sees” a photo, it must encode that photo into data it can use to perform a task like image classification. This process becomes more complicated when inputs come in multiple formats, like videos, audio clips, and images.

“The main challenge here is, how can a machine align those different modalities? As humans, this is easy for us. We see a car and then hear the sound of a car driving by, and we know these are the same thing. But for machine learning, it is not that straightforward,” says Alexander Liu, a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and first author of a paper tackling this problem.

Liu and his collaborators developed an artificial intelligence technique that learns to represent data in a way that captures concepts shared between visual and audio modalities. For instance, their method can learn that the action of a baby crying in a video is related to the spoken word “crying” in an audio clip.
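To make the idea of a shared representation more concrete, here is a minimal sketch, not the researchers' actual method. It assumes that video clips and audio clips have already been encoded as vectors in one common space, and it uses cosine similarity to find the audio clip closest to a given video. The vectors, the clip names, and the helper functions are all hypothetical and hand-picked for illustration; a real system would learn such vectors from data.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, hand-picked for illustration only.
# In a learned shared space, related concepts would end up close together.
video_embeddings = {
    "baby_crying_clip": [0.9, 0.1, 0.2],
    "car_driving_clip": [0.1, 0.8, 0.3],
}
audio_embeddings = {
    "spoken_word_crying": [0.8, 0.2, 0.1],
    "spoken_word_car": [0.2, 0.9, 0.2],
}

def best_match(query_vec, candidates):
    """Return the name of the candidate vector most similar to query_vec."""
    return max(
        candidates,
        key=lambda name: cosine_similarity(query_vec, candidates[name]),
    )

if __name__ == "__main__":
    for clip_name, clip_vec in video_embeddings.items():
        print(clip_name, "->", best_match(clip_vec, audio_embeddings))
```

With these toy vectors, the crying video sits closest to the spoken word “crying” and the car video sits closest to the spoken word “car,” which is the kind of cross-modal alignment the article describes.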
