Microsoft, Google, and Amazon have all acknowledged the difficulty of building and developing computationally
intensive artificial intelligence (AI)
infrastructure while simultaneously trying to
meet their net-zero and sustainability goals.
The ever-growing energy demands of AI stand in stark
contrast with global efforts to reduce carbon
emissions and minimize waste. Interest in green AI
has surged as researchers and companies seek to raise
awareness of, and reduce, the environmental impact of
these technologies.
To better understand the environmental impact of
AI, we teamed up with the Collaborative Innovation
Program at the Wharton School of the University
of Pennsylvania. After an in-depth review and
interviews with multiple senior executives across the
technology, climate/ESG, and business spaces, this
report identified three key areas with the greatest
energy implications: data centers, hardware, and
algorithmic optimization.
Part 1 - Carbon, computation, and the cloud: The ecological footprint of data centers
The International Energy Agency (IEA) estimates that data centers and data transmission networks are responsible for approximately 1% of energy-related
greenhouse gas emissions globally. As AI demand increases, so too will the need to build out and maintain data center warehouses, which are often powered by “dirty” electricity grids, including in Virginia’s “data center alley”, the site of 70% of the
world’s internet traffic in 2019.
The energy usage of AI and data centers is shifting
the long-term thinking of many technology
companies. The ability of generative AI (GenAI) to produce complex data differs drastically from that of discriminative AI, the latter being models designed for classification purposes, such as binary approve/decline decisions on loan applications. Because generating outputs is inherently more complex, training GenAI models requires 10 to 15 times more energy; this training relies on graphics processing units (GPUs) rather than traditional central processing units (CPUs) because of GPUs' superiority in computationally intensive tasks. These rapidly accelerating energy needs are shifting the calculus of technology companies, which are now exploring previously untenable sources such as nuclear fusion and small modular reactors.
To understand these dynamics better, three insights are offered to help companies approach data center selection. First, as use cases of AI soar, so do the energy and water consumption required to run data centers. Currently, data centers use 6% of all electricity in the U.S., a figure that is expected to double by 2026. This will strain energy, water, and resource capabilities as the world transitions to a low-carbon economy, and critical electrical components such as semiconductors may face shortages similar to those experienced during the COVID-19 pandemic.
Insight 1. As use of AI soars, so does the energy and water it requires
Second, operational emissions represent the bulk
of environmental impact from data centers. This
is becoming a priority for technology companies.
For example, due to such large emissions,
Microsoft has pledged four primary actions to
address this issue:
- reducing direct operational emissions
for Scope 1 (direct emissions owned by a
company) and Scope 2 (indirect emissions
from power sources used by the company);
- accelerating its carbon removal efforts;
- designing and optimizing for circularity in
reusing cloud hardware; and
- improving biodiversity and protecting more
land than it uses.
Renewable energy and attendant investments will
play an important role in creating a green, circular
ecosystem, but fossil fuels will largely run the
initial advancements of AI.
Insight 2. Operational emissions represent the bulk of environmental impact from data centers
- Carbon-intensive electricity sources drive operational emissions
- AI companies prioritize innovation over sustainability to beat the competition
- Renewable investments decoupled from data centers limit hourly emission reductions
Figure: Operational vs. non-operational emissions in a data center life-cycle analysis. Operational emissions (97%) comprise company vehicles, business travel, diesel generators, and purchases of electricity, heat, steam, etc.; non-operational emissions (3%) comprise transportation and distribution, end-of-life treatment of sold products, processing of sold products, and use of sold products.
Source: Operational vs. Non-Operational Emissions (Data Centre Life Cycle analysis), “Towards a Systematic Survey for Carbon Neutral Data Centers”
And third, selecting the right data center location can cut operational emissions by at least 60%. Key considerations for site selection include power purchase agreements (PPAs) and access to carbon-free energy (CFE) sources such as solar, wind, hydroelectric, and geothermal. Google's most recent (2024) sustainability report disclosed that its total greenhouse gas emissions increased by 13% year over year, primarily driven by data center energy consumption and supply chain emissions in “hard-to-decarbonize” regions such as the Asia Pacific, where CFE isn't readily available.
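The arithmetic behind location choice is simple: the same electricity load produces very different emissions depending on grid carbon intensity. The sketch below uses a hypothetical facility load and two illustrative grid intensities (in kg CO2e/kWh, spanning a carbon-heavy mix and a CFE-rich mix); the specific numbers are assumptions for illustration, not reported figures.

```python
# Sketch: how grid carbon intensity at a data center site drives
# operational emissions. All inputs are illustrative assumptions.

def operational_emissions_tco2e(energy_mwh: float, grid_intensity: float) -> float:
    """tCO2e = MWh x 1000 (kWh/MWh) x kgCO2e/kWh / 1000 (kg/t)."""
    return energy_mwh * 1000 * grid_intensity / 1000

annual_energy_mwh = 50_000   # hypothetical hyperscale facility load
coal_heavy_grid = 0.545      # kg CO2e/kWh, carbon-intensive mix
cfe_rich_grid = 0.079        # kg CO2e/kWh, hydro/wind-rich mix

dirty = operational_emissions_tco2e(annual_energy_mwh, coal_heavy_grid)
clean = operational_emissions_tco2e(annual_energy_mwh, cfe_rich_grid)
reduction = 1 - clean / dirty
print(f"{dirty:.0f} tCO2e vs {clean:.0f} tCO2e ({reduction:.0%} lower)")
```

Under these assumptions the CFE-rich site cuts operational emissions by well over the 60% threshold cited above, with no change to the workload itself.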
The recent explosion of large language models (LLMs) and the attendant data center build-out have forced a radical rethinking of how tech companies and countries approach the electrical grid. The need to add power generation capacity has caused countries to reassess their larger net-zero and decarbonization goals. With the IEA projecting that global data center electricity demand will more than double by 2026, largely driven by LLMs, there is a need to rethink the very nature of electricity consumption in matching supply and demand.
Insight 3. Choice of data center location can reduce operating emissions by at least 60%
Figure: Google's Data Center Emissions per Region
The intermittent nature of renewable energy will require greater coordination between technology companies and electrical utilities in mapping out a grid that can absorb the large electricity demand of hyperscale data centers, which are purpose-built for extreme scalability, networked infrastructure, and the large-scale workloads of GenAI models.
The creation and deployment of new technologies and standards, such as a Green AI Code of Conduct; environmental, social, and governance (ESG) protocols; specialized hardware accelerators; water cooling systems; 3D chips; and non-silicon semiconductors, may bring computation to where renewable energy is plentiful, and competitive behavior will be rewarded by consumers through the use of more environmentally friendly chatbots.
Part 2 - Hardware selection: The Landscape — CPUs vs. GPUs vs. TPUs vs. NPUs
Hardware selection ultimately rests on the
functions and roles researchers and developers
require when building AI applications. The right
processors to power an architecture depend on the
cost, efficiency, scalability, and purpose of the
AI project, and are chosen when building and
training models. These processing units are the
fundamental computing engines of the hardware that
powers deep learning and high-performance
inferencing tasks, and they can have a material
impact on the sustainability and environmental
footprint of the technology.
The processing units that perform the complicated
tasks involved in AI are central processing units
(CPUs), graphics processing units (GPUs), tensor
processing units (TPUs), and neural processing
units (NPUs). Choosing which processing unit is
needed for which operation means striking a delicate
balance among complexity, cost-efficiency for
real-world applications, and environmental impact.
Insight 4. Why are Nvidia GPUs preferred?
CPUs have multiple cores and are commonly known as the brain of the computer, executing the commands needed for a computer’s operating system. Due to their versatility, cost-effectiveness, and wide availability, they can handle simple and general-purpose computing tasks; however, CPUs can face bandwidth and memory issues. A lack of dedicated hardware for powerful, specific machine learning operations makes CPUs an inferior processing unit compared with GPUs and TPUs.
In recent years, GPUs (more specifically, Nvidia’s Ampere, Hopper, Lovelace, and Blackwell GPUs) have surpassed the roles traditionally required of CPUs due to their superior abilities in computing power and attendant operations. Designed for parallel processing and to accelerate the rendering of 3D graphics, GPUs are now used in high performance computing (HPC), deep learning, and training and inference. Working in conjunction with CPUs, GPU parallel computing helps to accelerate some of the CPUs’ functions, with both sharing similar internal components such as core, memory, and control units.
Google created its TPUs as an AI accelerator
application-specific integrated circuit (ASIC) for use
in neural network machine learning based on its own
TensorFlow software. TPUs differ from GPUs in that
TPUs' specialized feature is their use of dedicated
matrix-multiplication units for AI training and
inference, whereas GPUs are general-purpose
accelerators for algorithms that process the large
datasets found in AI workloads. GPUs remain the primary compute hardware for AI applications, but specialized AI hardware such as Google's TPUs offers greater energy efficiency, being tailor-made for AI tasks.
On-device NPUs have an architecture that simulates the
brain's neural network. Unlike general-purpose CPUs
and GPUs, NPUs are optimized for handling AI-related tasks, and they differ from TPUs and other ASICs. While ASICs are designed for a singular purpose, NPUs offer more flexibility while still being tailored to neural network computations. As demands for processing performance increased, NPUs came to be regarded as a specialized solution for new AI tasks that CPUs and GPUs were not built for.
The AI hardware landscape is rapidly expanding with
new entrants such as the Cerebras AI processor,
Ampere CPU, and Graphcore IPU, driven by the
burgeoning use of AI. With the industry measuring energy efficiency in TOPS/W (tera-operations per second per watt), specialized hardware options have demonstrated up to 1.5 times the energy efficiency of GPUs. Despite this, Nvidia maintains market dominance thanks to its comprehensive ecosystem of DGX hardware (enterprise AI combining software, infrastructure, and expertise) and CUDA software, the latter being Nvidia’s proprietary parallel computing platform, developed around the company’s market-leading GPUs.
Insight 5. Architecture comparison

Architectures such as Google TPUs are optimized for tensor operations:
- Higher performance for large neural network training
- More energy efficient than GPUs for AI workloads
Source: WTW and Wharton

| Feature | Nvidia H100 | Google TPUv5 |
| Architecture | Hopper | TPUv5 |
| Tensor cores | 80 | 64 |
| Floating point performance | 180 TFLOPS | 180 TFLOPS |
| Power consumption | 450 W | 300 W |
| Efficiency | 4 TFLOPS/W | 6 TFLOPS/W |

Other emerging options: Cerebras AI processors, Ampere CPUs …
- Designed for maximum performance/watt at scale
- Architectural, software, and cooling advantages
Source: WTW and Wharton

| Spec | Cerebras CS-3 | B200 | DGX B200 | GB200 NVL72 |
| FP16 PFLOPs | 125 | 4.4 | 36 | 360 |
| Memory (GB) | 1,200,000 | 192 | 1,536 | 13,500 |
| NVLink / Fabric Bandwidth (TB/s) | 26,750 | 1.8 | 14.4 | 130 |
| Power (Watts) | 23,000 | 1,000 | 14,300 | 120,000 |
| PFLOPs/W | 0.005 | 0.004 | 0.003 | 0.003 |
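The efficiency row of the table above can be reproduced directly from the throughput and power rows, which is a useful sanity check when comparing vendor figures:

```python
# Recompute the PFLOPs-per-watt column of the table above from its
# FP16 throughput and power rows (values taken from the table).

systems = {
    "Cerebras CS-3": (125, 23_000),
    "B200":          (4.4, 1_000),
    "DGX B200":      (36, 14_300),
    "GB200 NVL72":   (360, 120_000),
}

for name, (pflops, watts) in systems.items():
    # performance per watt = throughput / power draw
    print(f"{name}: {pflops / watts:.3f} PFLOPs/W")
```

Rounded to three decimal places, the computed values match the table's efficiency row.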
AI models undergo training (the first phase for an AI
model where the model is shown desired inputs and
outputs) and inference (the process that follows AI
training where the model recognizes inputted data
and makes predictions). Initially, it was believed that
the cost of training AI models was higher than inference
costs. However, companies such as Nvidia and
Amazon now believe that inference can exceed the
cost of training and may account for
up to 90% of machine learning costs for AI systems, while Google estimates that 60% of energy used goes toward inference and 40% toward training.
Cost-effective AI workloads will depend on utilizing
CPUs or GPUs (or a combination of the two) in
a system architecture with clear goals aimed at
accomplishing specific and/or complex tasks across
multiple industries and platforms. For instance, it
is estimated that OpenAI’s ChatGPT was trained on
over 20,000 Nvidia A100 GPUs and future ChatGPT
versions will require over 30,000 H100 GPUs.
Insight 6. Reduction in gross CO2 emissions since 2017
Given the high cost and large carbon footprint
of such computational power, start-ups and
alternatives in the LLM and chip space are
challenging the established dominance of
ChatGPT and Nvidia, respectively.
Given these numbers, new means and methods
have been devised to reduce the carbon footprint
of these models. While GPUs remain preferable for
training AI models, inference tasks are increasingly
shifting to specialized hardware, yielding significant
efficiency improvements. Federated learning,
neuromorphic computing, and implementing the 4M best
practices – what Google refers to as Model, Machine, Mechanization, and Map – can help reduce energy usage by up to 100 times and carbon emissions by up to 1,000 times.
Insight 7. New Opportunities in Federated Learning
- Federated learning decentralizes model training, allowing diverse data insights and preserving privacy
- This method supports efficient collaboration, requiring only model updates, not the full datasets, to be transmitted
- Smaller AI models collaborate across devices, adapting more dynamically to localized data for tailored solutions
- Federated learning catalyzes breakthroughs in AI, leveraging wider data sources for robust, adaptable models
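The second bullet above, transmitting only model updates rather than full datasets, is the core of the federated-averaging (FedAvg) step. A minimal sketch, with illustrative weight vectors and sample counts (real systems would exchange full model tensors):

```python
# Minimal sketch of federated averaging: each client sends only a
# model update (a weight vector), never its raw data; the server
# combines updates weighted by each client's sample count.

def federated_average(updates: list[tuple[list[float], int]]) -> list[float]:
    """updates: (client_weights, num_local_samples) pairs."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    # weighted mean of each coordinate across clients
    return [
        sum(w[i] * n for w, n in updates) / total
        for i in range(dim)
    ]

# Three clients with different amounts of local data
client_updates = [
    ([1.0, 2.0], 100),
    ([3.0, 4.0], 300),
    ([5.0, 6.0], 100),
]
global_weights = federated_average(client_updates)
print(global_weights)  # pulled toward the 300-sample client: [3.0, 4.0]
```

Only the short weight vectors cross the network, which is what preserves privacy and reduces transmission energy relative to centralizing the raw data.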
Part 3 - Green AI: Optimizing algorithms for energy efficiency and sustainability
AI's energy issues can be tackled partly by optimizing hardware, but further miniaturization of microelectronics is not feasible in the long term. Since GenAI’s training process – using LLMs – consumes considerable energy, optimization must also focus on algorithms. Enhancements in data collection and processing techniques, choosing more efficient libraries, and improving training algorithm efficiency are essential.
There are four valuable insights for guiding companies' developers in writing eco-friendly code. First, using efficient AI models helps decrease energy use and carbon emissions. To gauge a machine learning model's carbon footprint, look at the energy-intensive stages: model training, inference execution, and the production of computing hardware and data center infrastructure. Among these three areas, training costs exceed inference costs in the initial stages of a non-deployed LLM. Training just one LLM can emit an estimated 300 tons of CO2.
Insight 8. Employing efficient AI models can reduce energy needs and carbon emissions
- Carbon footprint in ML includes training the model, running inference, and the production of computing hardware and data center capabilities
- More parameters and training data mean more energy consumption and carbon generation
- Model training is the most energy-intensive component in AI (training a single LLM can emit an estimated 300 tons of CO2)
kWh = (hours to train × number of processors × average power per processor (W) × PUE) ÷ 1,000
tCO2e = (kWh × kg CO2e per kWh) ÷ 1,000
Source: WTW and Wharton
| Model | GPT-3 | Bloom | LLaMA | LLaMA-2 | T5 | PaLM |
| Developer | OpenAI | BigScience | Meta | Meta | Google | Google |
| Model size (# of parameters) | 175B | 175B | 7B, 13B, 33B, 65B | 7B, 13B, 34B, 70B | 11B | 540B |
| Training data (# of tokens) | 300B | 350B | 1.4T | 2T | 34B | 795B |
| Training compute (FLOPS) | 3.2E+23 | 3.7E+23 | 9.9E+23 | 1.5E+24 | 2.2E+21 | 2.6E+24 |
| Processor hours | 3,552,000 | 1,082,990 | 1,770,394 | 3,311,616 | 245,760 | 8,404,992 |
| Grid carbon intensity (kg CO2e/kWh) | 0.429 | 0.057 | 0.385 | 0.423 | 0.545 | 0.079 |
| Data center efficiency (PUE) | 1.1 | 1.2 | 1.1 | 1.1 | 1.12 | 1.08 |
| Energy consumption (MWh) | 1,287 | 520 | 779 | 1,400 | 86 | 3,436 |
| Carbon emissions (tCO2e) | 552 | 30 | 300 | 593 | 47 | 271 |
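The two formulas above can be checked against the GPT-3 column of this table. The table reports processor-hours directly (hours to train × number of processors), so only an average per-processor power draw needs to be assumed; 330 W is an assumed figure, not taken from the table:

```python
# Apply the report's kWh and tCO2e formulas to the GPT-3 column.
# Processor-hours already combine "hours to train x number of processors".

processor_hours = 3_552_000   # from the table
avg_power_w = 330             # ASSUMPTION: average draw per processor (W)
pue = 1.1                     # data center efficiency, from the table
grid_intensity = 0.429        # kg CO2e/kWh, from the table

kwh = processor_hours * avg_power_w * pue / 1000
tco2e = kwh * grid_intensity / 1000
print(f"{kwh / 1000:,.0f} MWh, {tco2e:,.0f} tCO2e")  # ~1,289 MWh, ~553 tCO2e
```

Under the 330 W assumption this lands within rounding distance of the table's reported 1,287 MWh and 552 tCO2e for GPT-3.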
Second, when evaluating a model, it is critical
to assess its generality, as this provides an
understanding of its energy consumption. The
broader a model's capabilities, the larger its energy
consumption will be. Multi-purpose, generative
frameworks consume more energy than those
designed for specific tasks. For example,
task-specific systems include voice assistants,
recommendation algorithms, autonomous vehicles,
and image recognition tools, whereas general-purpose
generative AI systems include ChatGPT,
DALL-E, and Google Bard.
Third, developers must review each task their system
performs, as certain operations demand more
energy. Factors influencing energy consumption
include the task's complexity, the length of generated
text, and whether an image is produced. Employing
skilled, conscientious programmers will facilitate this aspect of energy efficiency.
When collaborating with developers, it is
recommended to integrate sustainability
considerations from the start, alongside discussions
on model expectations, accuracy, and governance.
Rushing the planning process can lead to hasty
development and poor outcomes in the long-term.
Last, effective prompt engineering is crucial
for decreasing AI's computational needs and carbon
footprint. Prompt engineering optimizes inputs to yield better outputs from a generative AI model.
Higher-quality inputs lead to more accurate and efficient responses, improving model performance and sustainability. Techniques such as contextual prompts, prompt compression, caching, and prompt reuse and optimization can help achieve more pertinent outputs while cutting down on
energy consumption.
Insight 9. General algorithms use more energy than task-specific systems
“Generality” comes at a steep
cost to the environment, given
the amount of energy these
systems require. Multi-purpose,
generative architectures are
more energy expensive than
task-specific systems.
Explore task-specific AI tools
rather than general-purpose
generative AI.
General algorithms use more energy than task-specific systems. The following techniques are listed in order of increasing energy consumption and carbon emissions:

| Technique | Description |
| Prompt engineering | Design and craft prompts to guide the model's responses effectively |
| Retrieval augmented generation | Retrieve data from outside the model and augment the prompts by adding the relevant retrieved data in context |
| Parameter-efficient tuning | Fine-tune the model with a minimal number of parameters |
| Full fine-tuning | Fine-tune the model by updating all the parameters |
| Training from scratch | Build your own model |
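The gap between parameter-efficient tuning and full fine-tuning in the list above comes down to how many weights get updated. A common parameter-efficient technique trains a low-rank adapter, two thin matrices, instead of the full weight matrix; the dimensions below are hypothetical, chosen only to illustrate the scale of the savings:

```python
# Why parameter-efficient tuning costs less than full fine-tuning:
# a low-rank adapter trains two thin matrices (d x r and r x k)
# instead of the full d x k weight matrix. Dimensions are hypothetical.

d, k = 4096, 4096   # hypothetical weight-matrix shape
rank = 8            # hypothetical adapter rank

full_params = d * k                    # every weight updated
adapter_params = d * rank + rank * k   # low-rank factors only

print(f"full: {full_params:,}")        # 16,777,216
print(f"adapter: {adapter_params:,}")  # 65,536
print(f"fraction trained: {adapter_params / full_params:.2%}")  # 0.39%
```

Under these assumptions the adapter updates well under 1% of the weights of that matrix, which is what pushes the technique down the energy ladder.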
Insight 10. Some tasks are more energy intensive than others
But even with task-specific AI tools, some tasks can be more energy intensive than others. The table below shows the mean and standard deviation of energy per 1,000 queries for the 10 tasks examined.
Source: WTW and Wharton
| Task | Mean inference energy (kWh) | Std |
| text classification | 0.002 | 0.001 |
| extractive QA | 0.003 | 0.001 |
| masked language modeling | 0.003 | 0.001 |
| token classification | 0.004 | 0.002 |
| image classification | 0.007 | 0.001 |
| object detection | 0.038 | 0.02 |
| text generation | 0.047 | 0.03 |
| summarization | 0.049 | 0.01 |
| image captioning | 0.063 | 0.02 |
| image generation | 2.907 | 3.31 |
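The per-1,000-query means above can be turned into a rough workload estimate. The query volumes below are hypothetical, invented only to show how a small share of image-generation traffic can dominate a system's inference energy:

```python
# Estimate daily inference energy from the per-1,000-query means in the
# table above. Query volumes are hypothetical.

energy_per_1k_queries = {   # kWh per 1,000 queries (table means)
    "text classification": 0.002,
    "text generation": 0.047,
    "image generation": 2.907,
}
daily_queries = {           # hypothetical workload mix
    "text classification": 1_000_000,
    "text generation": 200_000,
    "image generation": 50_000,
}

total_kwh = sum(
    energy_per_1k_queries[task] * n / 1000
    for task, n in daily_queries.items()
)
print(f"{total_kwh:.1f} kWh/day")
```

In this hypothetical mix, image generation accounts for over 90% of the energy despite being only 4% of the queries.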
Insight 11. Green prompt engineering is crucial for reducing AI's computational needs and carbon footprint
1. The Art of Prompt Engineering: Craft inputs that elicit effective and efficient responses, improving model performance and sustainability.
2. Strategies to Reduce Computation Load: Use contextual prompts, prompt compression, caching, and prompt reuse.
3. Prompt Optimization: Optimize prompts in order to achieve more relevant output and reduce energy use.
4. Efficient Prompting Guidelines: Keep prompts concise, experiment gradually with different prompts, and use reproducible prompts.
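Caching and prompt reuse are the most mechanical of these strategies: identical prompts should never trigger a second round of inference. A minimal sketch, where `run_model` is a hypothetical stand-in for a real model call and the counter shows how many invocations caching avoids:

```python
# Sketch of prompt caching: identical prompts hit a local cache instead
# of re-running inference. `run_model` is a hypothetical placeholder.

from functools import lru_cache

calls = 0

@lru_cache(maxsize=1024)
def run_model(prompt: str) -> str:
    global calls
    calls += 1                        # counts only cache misses
    return f"response to: {prompt}"   # placeholder for real inference

prompts = ["summarize Q3 report", "summarize Q3 report",
           "list key risks", "summarize Q3 report"]
responses = [run_model(p) for p in prompts]
print(f"{len(prompts)} prompts served with {calls} model calls")
```

Four prompts are served with only two model invocations; every avoided invocation is inference energy saved.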
When developing a system, consider the following:
- Process improvements
- Model choices
- Training and tools
Process improvements: Begin with framing problems and scopes with sustainability in mind. Concurrently, monitor utilization metrics and refine configurations for performance with a low carbon footprint.
Model choices: When selecting a model, aim for lightweight base models – also referred to as task-specific models in this article. These models have fewer layers and parameters, which reduces computational overhead and allows for easy deployment across various hardware platforms. They can be adjusted in scale according to the task requirements and available resources. Additionally, employ prompt engineering and parameter-efficient fine-tuning during customization.
Consulting your organization’s technology team can help to evaluate how the computational load from model training will be allocated across specialized hardware such as GPUs and TPUs. As mentioned above in the section on hardware selection, GPUs excel at parallel operations, while TPUs are specifically designed for the matrix mathematics of machine learning tasks. Essentially, inquire about what hardware is being utilized and whether it is being used according to its strengths. Addressing these questions can directly reduce the computational burden of a system, thereby lowering overall energy consumption.
Training and tools: Finally, using tools such as CodeCarbon to obtain real-time metrics on the model’s carbon footprint can help in reducing overall carbon emissions. CodeCarbon, a Python package, helps developers reduce emissions by optimizing their code and utilizing cloud infrastructure in regions that rely on renewable energy. These tools assist in evaluating algorithms from an environmental perspective, allowing developers to actively analyze and validate their code.
Think carefully about the business need and the task at hand, and choose an algorithm that meets just that task. Not every business need requires a generative AI solution.
Insight 12. What are some best practices to utilize when selecting sustainable, energy-efficient algorithms?

Process Improvements
- Frame problems and model scope with sustainability in mind.
- Monitor utilization metrics.
- Refine configurations for performance with a low carbon footprint.

Model Choices
- Opt for lightweight base models.
- Use prompt engineering and parameter efficient fine tuning during customization.

Training and Tools
- Distribute the training procedures across specialized hardware.
- Leverage tools like CodeCarbon to calculate algorithmic carbon footprint in real time.
The increase in carbon emissions over the past few
years by the world’s leading technology companies
has provided an impetus for society to search for new
sources of clean energy to power the AI revolution.
The internet’s insatiable appetite for data is causing
companies such as Microsoft, and tech leaders such
as Jeff Bezos and Bill Gates, to fund more
sustainable and circular energy sources, most notably
nuclear energy in the form of small modular reactors,
to power the world’s 7,000 data centers.
Electricity demand is no longer easily predictable: national electric loads are now growing significantly faster than grid planners have forecast, with the load-growth curve soaring due to growing industrial demand and the construction of new data centers to handle AI’s explosive electricity usage. Scaling up investments in clean energy has become not only a sustainability necessity but also an economic one, as companies now compete in the green tech space to fuel their technological advancements. While innovation in the AI and LLM space was given precedence over sustainability imperatives in recent years, companies are now trying to outdo each other in unlocking powerfully efficient green tech to meet their hyperscale ambitions, hoping to achieve limitless zero-carbon energy.
Conclusion
Green AI is both a technological challenge and an environmental necessity. With AI becoming increasingly embedded in our daily lives, its considerable energy use and carbon emissions must be addressed.
The strategies discussed in this report – ranging from algorithm and hardware optimization to embedding sustainability in development practices – offer a guide for mitigating AI's environmental effects. As societal and regulatory pressure builds, consumers will reward companies based not just on their environmental and sustainability pledges but also on their actionable results in “greenifying” their data architecture systems.