Could machine learning fuel a reproducibility crisis in science?

machine learning science

From biomedicine to political sciences, researchers increasingly use machine learning as a tool to make predictions on the basis of patterns in their data. But the claims in many such studies are likely to be overblown, according to a pair of researchers at Princeton University in New Jersey. They want to sound an alarm about what they call a “brewing reproducibility crisis” in machine-learning-based sciences.

Machine learning is being sold as a tool that researchers can learn in a few hours and use by themselves — and many follow that advice, says Sayash Kapoor, a machine-learning researcher at Princeton. “But you wouldn’t expect a chemist to be able to learn how to run a lab using an online course,” he says. And few scientists realize that the problems they encounter when applying artificial intelligence (AI) algorithms are common to other fields, says Kapoor, who has co-authored a preprint on the ‘crisis’¹. Peer reviewers do not have the time to scrutinize these models, so academia currently lacks mechanisms to root out irreproducible papers, he says. Kapoor and his co-author Arvind Narayanan created guidelines for scientists to avoid such pitfalls, including an explicit checklist to submit with each paper.

What is reproducibility?

Kapoor and Narayanan’s definition of reproducibility is wide. It says that other teams should be able to replicate the results of a model, given the full details on data, code and conditions — often termed computational reproducibility, something that is already a concern for machine-learning scientists. The pair also define a model as irreproducible when researchers make errors in data analysis that mean that the model is not as predictive as claimed.

Judging such errors is subjective and often requires deep knowledge of the field in which machine learning is being applied. Some researchers whose work has been critiqued by the team disagree that their papers are flawed, or say Kapoor’s claims are too strong. In social studies, for example, researchers have developed machine-learning models that aim to predict when a country is likely to slide into civil war. Kapoor and Narayanan claim that, once errors are corrected, these models perform no better than standard statistical techniques. But David Muchlinski, a political scientist at the Georgia Institute of Technology in Atlanta, whose paper² was examined by the pair, says that the field of conflict prediction has been unfairly maligned and that follow-up studies back up his work.

Mots-clés : cybersécurité, sécurité informatique, protection des données, menaces cybernétiques, veille cyber, analyse de vulnérabilités, sécurité des réseaux, cyberattaques, conformité RGPD, NIS2, DORA, PCIDSS, DEVSECOPS, eSANTE, intelligence artificielle, IA en cybersécurité, apprentissage automatique, deep learning, algorithmes de sécurité, détection des anomalies, systèmes intelligents, automatisation de la sécurité, IA pour la prévention des cyberattaques.

Veille-cyber

Next The Future of Generative Adversarial Networks in Deepfakes »

Previous « Open source isn’t working for AI

Published by

Veille-cyber

4 ans ago

Bots et IA biaisées : menaces pour la cybersécurité

Bots et IA biaisées : une menace silencieuse pour la cybersécurité des entreprises Introduction Les…

3 mois ago

Cybersécurité

Cloudflare en Panne

Cloudflare en Panne : Causes Officielles, Impacts et Risques pour les Entreprises Le 5 décembre…

3 mois ago

Cybersécurité

Alerte sur le Malware Brickstorm : Une Menace pour les Infrastructures Critiques

Introduction La cybersécurité est aujourd’hui une priorité mondiale. Récemment, la CISA (Cybersecurity and Infrastructure Security…

3 mois ago

Cybersécurité

Cloud Computing : État de la menace et stratégies de protection

La transformation numérique face aux nouvelles menaces Le cloud computing s’impose aujourd’hui comme un…

3 mois ago

Cybersécurité

Attaque DDoS record : Cloudflare face au botnet Aisuru – Une analyse de l’évolution des cybermenaces

Les attaques par déni de service distribué (DDoS) continuent d'évoluer en sophistication et en ampleur,…

3 mois ago

Cybersécurité

Poèmes Pirates : La Nouvelle Arme Contre Votre IA

Face à l'adoption croissante des technologies d'IA dans les PME, une nouvelle menace cybersécuritaire émerge…

3 mois ago

This website uses cookies.

Could machine learning fuel a reproducibility crisis in science?

What is reproducibility?

Recent Posts

Bots et IA biaisées : menaces pour la cybersécurité

Cloudflare en Panne

Alerte sur le Malware Brickstorm : Une Menace pour les Infrastructures Critiques

Cloud Computing : État de la menace et stratégies de protection

Attaque DDoS record : Cloudflare face au botnet Aisuru – Une analyse de l’évolution des cybermenaces

Poèmes Pirates : La Nouvelle Arme Contre Votre IA