PARIS — This is as close as you can get to a rock concert in AI research. Inside the supercomputing center of the French National Center for Scientific Research, on the outskirts of Paris, rows and rows of what look like black fridges hum at a deafening 100 decibels.
They form part of a supercomputer that has spent 117 days gestating a new large language model (LLM) called BLOOM that its creators hope represents a radical departure from the way AI is usually developed.
Unlike other, more famous large language models such as OpenAI’s GPT-3 and Google’s LaMDA, BLOOM (which stands for BigScience Large Open-science Open-access Multilingual Language Model) is designed to be as transparent as possible, with researchers sharing details about the data it was trained on, the challenges in its development, and the way they evaluated its performance. OpenAI and Google have not shared their code or made their models available to the public, and external researchers have very little understanding of how these models are trained.
BLOOM was created over the last year by over 1,000 volunteer researchers in a project called BigScience, which was coordinated by AI startup Hugging Face using funding from the French government. It officially launched on July 12. The researchers hope developing an open-access LLM that performs as well as other leading models will lead to long-lasting changes in the culture of AI development and help democratize access to cutting-edge AI technology for researchers around the world.