A vast team of over 400 researchers recently released a new open-access study on the performance of recent, popular text-based AI architectures such as GPT, the Pathways Language Model (PaLM), the (recently controversial) LaMDA architecture, and sparse expert models. The study, titled "Beyond the Imitation Game" (BIG-bench), aims to provide a general benchmark for the state of text-based AI: how it compares to humans on the same tasks, and how model size affects the ability to perform each task.
First, many of the results were interesting though not surprising:
● In all categories, the best humans outdid the best AIs (though that edge was smallest on translation problems from the International Linguistics Olympiad).
● Bigger models generally showed better results.
● For some tasks, the improvement was linear with model size. These were primarily knowledge-based tasks where the explicit answer was already somewhere in the training data.
● Some tasks (“breakthrough” tasks) required a very large AI model to even get started. These were mostly what the team called “composite” tasks — where two different skills must be combined or multiple steps followed to get the right answer.
However, some results were a little more interesting. Essentially, the researchers found that models of all sizes were highly sensitive to how a question was phrased. For some phrasings, answers improved with model size; for others, results were no better than random chance, no matter how large the model.
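To make the phrasing sensitivity concrete, here is a minimal sketch of how the same underlying question can be posed under two different framings. The prompt templates and the question itself are illustrative assumptions, not examples taken from the study:

```python
# Two framings of the same underlying question. BIG-bench-style
# evaluations found model accuracy could differ sharply between
# such framings, even though the task is identical.

question = "Is 7 a prime number?"

# Framing A: open-ended Q/A style.
prompt_a = f"Q: {question}\nA:"

# Framing B: multiple-choice style, same question.
prompt_b = (
    f"{question}\n"
    "Choose one:\n"
    "(a) yes\n"
    "(b) no\n"
    "Answer:"
)

print(prompt_a)
print(prompt_b)
```

A benchmark probing this effect would send both prompts to the same model and compare accuracy across framings; only the presentation differs, not the knowledge required.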