Training an advanced AI model takes time, money and high-quality data. It also takes energy — a lot of it.
Between storing data in large-scale data centers and then using that data to train a machine learning or deep learning model, AI energy consumption is high. While an AI system may pay off monetarily, AI poses a problem environmentally.
AI energy consumption during training
Take some of the most popular language models, for example.
OpenAI trained its GPT-3 model on 45 terabytes of data. To train the final version of MegatronLM, a language model similar to but smaller than GPT-3, Nvidia ran 512 V100 GPUs over nine days.
A single V100 GPU can consume between 250 and 300 watts. If we assume 250 watts, then 512 V100 GPUs consume 128,000 watts, or 128 kilowatts (kW). Running for nine days, or 216 hours, means MegatronLM’s training used 27,648 kilowatt-hours (kWh).
The average U.S. household uses 10,649 kWh annually, according to the U.S. Energy Information Administration. Therefore, training the final version of MegatronLM used about 2.6 times that amount, or nearly as much energy as three homes use in a year.
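The arithmetic behind those figures can be reproduced in a few lines. This is a rough sketch rather than a measurement: it assumes every GPU draws a constant 250 watts for the full nine days and ignores cooling, CPUs and other data center overhead. The function and constant names are illustrative.

```python
# Figures taken from the article; the 250 W draw is the low end of the
# 250-300 W range and is assumed constant for the whole run.
GPU_COUNT = 512
WATTS_PER_GPU = 250
TRAINING_DAYS = 9
HOUSEHOLD_KWH_PER_YEAR = 10_649  # U.S. EIA average cited above


def training_energy_kwh(gpu_count, watts_per_gpu, days):
    """Return the energy in kWh for GPUs running nonstop for `days` days."""
    power_kw = gpu_count * watts_per_gpu / 1000  # watts -> kilowatts
    return power_kw * days * 24                  # kW x hours = kWh


energy_kwh = training_energy_kwh(GPU_COUNT, WATTS_PER_GPU, TRAINING_DAYS)
household_years = energy_kwh / HOUSEHOLD_KWH_PER_YEAR

print(f"{energy_kwh:,.0f} kWh, about {household_years:.1f} household-years")
```

Passing 300 watts instead of 250 raises the estimate to roughly 33,000 kWh, or a little more than three household-years, so the "three homes" comparison holds across the quoted power range.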
New training techniques reduce the amount of data needed to train machine learning and deep learning models, but many models still need a huge amount of data to complete an initial training phase, and additional data to stay up to date.
Data center energy usage
As AI becomes more complex, expect some models to use even more data. That’s a problem, because data centers use an incredible amount of energy.
“Data centers are going to be one of the most impactful things on the environment,” said Alan Pelz-Sharpe, founder of analyst firm Deep Analysis.
IBM’s The Weather Company processes around 400 terabytes of data per day to enable its models to predict the weather days in advance around the globe. Facebook generates about 4 petabytes (4,000 terabytes) of data per day.
People generated 64.2 zettabytes of data in 2020, market research company IDC estimated. That’s 64.2 billion terabytes in decimal units, or about 58,389,559,853 tebibytes when counted in binary units of 2^40 bytes.
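The terabyte figure for that estimate depends on whether storage is counted in decimal units (1 terabyte = 10^12 bytes) or binary units (1 tebibyte = 2^40 bytes). A minimal sketch of the conversion, assuming the 64.2-zettabyte estimate is itself decimal (1 zettabyte = 10^21 bytes); the variable names are illustrative:

```python
# Unit sizes in bytes.
ZETTABYTE = 10**21   # decimal (SI) zettabyte
TERABYTE = 10**12    # decimal (SI) terabyte
TEBIBYTE = 2**40     # binary tebibyte

# 64.2 ZB written as an integer to avoid floating-point rounding.
data_bytes = 642 * ZETTABYTE // 10

decimal_terabytes = data_bytes // TERABYTE
binary_tebibytes = data_bytes // TEBIBYTE

print(f"{decimal_terabytes:,} TB (decimal)")
print(f"{binary_tebibytes:,} TiB (binary)")
```

The decimal count is exactly 64.2 billion terabytes, while the binary count comes to roughly 58.4 billion tebibytes, a gap of about 9% that comes purely from the choice of unit.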
Data centers store that data around the world.
Meanwhile, the largest data centers require more than 100 megawatts of power capacity, which is enough to power some 80,000 U.S. households, according to energy and climate think tank Energy Innovation.
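As a rough cross-check, the 100-megawatt and 80,000-household figures can be compared with the average household consumption cited earlier. The sketch below assumes a household's 10,649 kWh is spread evenly over an 8,760-hour year, which ignores daily and seasonal peaks.

```python
# Figures from the article.
DATA_CENTER_MW = 100
HOUSEHOLDS = 80_000
HOUSEHOLD_KWH_PER_YEAR = 10_649
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years

# Power capacity implied per household by the 100 MW / 80,000 comparison.
per_household_kw = DATA_CENTER_MW * 1000 / HOUSEHOLDS

# Average continuous draw of a household using 10,649 kWh a year.
average_household_kw = HOUSEHOLD_KWH_PER_YEAR / HOURS_PER_YEAR

print(f"Implied capacity per household: {per_household_kw:.2f} kW")
print(f"Average household draw: {average_household_kw:.2f} kW")
```

The two results, 1.25 kW and about 1.22 kW, are close, so the household comparison is consistent with the EIA average.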
With about 600 hyperscale data centers — data centers that exceed 5,000 servers and 10,000 square feet — in the world, it’s unclear how much energy is required to store all of our data, but the number is likely staggering.
From an environmental standpoint, data center and AI energy consumption is also a nightmare.