Meta’s AI translation breaks 200-language barrier

July 8, 2022 Metaverse


Meta’s quest to translate underserved languages has marked its first victory with the open-source release of a language model able to decipher 202 languages.

Dubbed NLLB-200 after Meta’s No Language Left Behind initiative, the model is the first able to translate so many languages, according to its makers. Its goal is to improve translation for languages overlooked by similar projects.

"The vast majority of improvements made in machine translation in the last decades have been for high-resource languages," Meta researchers wrote in a paper [PDF]. "While machine translation continues to grow, the fruits it bears are unevenly distributed," they said.

According to the announcement of NLLB-200, the model can translate 55 African languages "with high-quality results." Before NLLB-200, Meta said, fewer than 25 African languages were covered by widely used translation tools. When scored with the BLEU metric, Meta said, NLLB-200 showed an average improvement of 44 percent over other state-of-the-art translation models. For some African and Indian languages, the improvement reportedly went as high as 70 percent.