Researchers create a mathematical framework to evaluate explanations of machine-learning models and quantify how well people understand them.
Modern machine-learning models, such as neural networks, are often referred to as “black boxes” because they are so complex that even the researchers who design them can’t fully understand how they make predictions.
To provide some insights, researchers use explanation methods that seek to describe individual model decisions. For example, they may highlight words in a movie review that influenced the model’s decision that the review was positive.
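To make the idea concrete, the sketch below shows one common way such a local explanation can be computed: score each word by how much the model's prediction changes when that word is removed. The toy lexicon-based "model" is an illustrative assumption, not the model or the explanation method used by the MIT team.

```python
# Minimal sketch of leave-one-out word attribution for a toy sentiment model.
# The lexicon and scoring function are assumptions for illustration only.

def sentiment_score(words):
    """Toy sentiment model: sums per-word weights from a small lexicon."""
    lexicon = {"great": 2.0, "fun": 1.5, "boring": -2.0, "bad": -1.5, "not": -1.0}
    return sum(lexicon.get(w, 0.0) for w in words)

def word_importance(words):
    """Importance of each word = drop in score when that word is removed."""
    base = sentiment_score(words)
    scores = {}
    for i, w in enumerate(words):
        without = words[:i] + words[i + 1:]
        scores[w] = base - sentiment_score(without)
    return scores

review = "the plot was great and the acting was fun".split()
for word, weight in sorted(word_importance(review).items(), key=lambda kv: -abs(kv[1])):
    if weight:  # highlight only words that actually moved the prediction
        print(f"{word:>8}: {weight:+.1f}")
```

Run on the sample review, this highlights "great" and "fun" as the words pushing the prediction toward positive, which is the kind of per-decision explanation the researchers are evaluating.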
But these explanation methods don’t do any good if humans can’t easily understand them, or if people misinterpret what they mean. So, MIT researchers created a mathematical framework to formally quantify and evaluate the understandability of explanations for machine-learning models. This can help pinpoint insights about model behavior that might be missed if the researcher evaluates only a handful of individual explanations to try to understand the entire model.
“With this framework, we can have a very clear picture of not only what we know about the model from these local explanations, but more importantly what we don’t know about it,” says Yilun Zhou, an electrical engineering and computer science graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author of a paper presenting this framework.
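The sketch below illustrates the general idea of moving from individual explanations to a statement about the whole model: state a human-readable rule ("negative words always receive negative attribution"), then measure how broadly the rule applies and how often it holds when it does. The rule, the sample data, and the metric names are illustrative assumptions, not the paper's actual formalism.

```python
# Toy sketch: check how well a human-readable rule summarizes many local explanations.
# "coverage" = fraction of explanations the rule applies to;
# "validity" = fraction of those where the rule's prediction is correct.
from typing import Dict, List

# Per-review word attributions, e.g. produced by the leave-one-out sketch above.
explanations: List[Dict[str, float]] = [
    {"great": 2.0, "fun": 1.5},
    {"boring": -2.0, "plot": 0.3},
    {"bad": -1.5, "not": -1.0, "great": 0.5},
]

NEGATIVE_WORDS = {"boring", "bad", "not"}  # assumed word list for the toy rule

applicable = 0  # explanations the rule says something about
valid = 0       # explanations where the rule's claim holds

for attribution in explanations:
    negative_hits = [w for w in attribution if w in NEGATIVE_WORDS]
    if not negative_hits:
        continue  # rule does not apply to this explanation
    applicable += 1
    if all(attribution[w] < 0 for w in negative_hits):
        valid += 1

print(f"coverage: {applicable / len(explanations):.0%}")  # how broadly the rule applies
print(f"validity: {valid / applicable:.0%}")              # how often it is right when it applies
```

A rule with high coverage and high validity tells the researcher something trustworthy about the model as a whole, while gaps in either metric flag the "what we don’t know" that Zhou describes.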