Energy and Cost Estimation of Local Large Language Models: An Empirical Study for Business Decision-Making

Description

Thesis available as full text in PDF format.
The rapid adoption of large language models (LLMs) in business organizations has created a growing need to understand the cost of running these tools locally or in the cloud. Although online LLM services are widely available, organizations often prefer to deploy open-source models on their own hardware to protect data privacy and reduce their exposure to price changes. However, there is no practical framework to help businesses calculate the electricity cost of running LLMs locally. This study measures the energy consumption of 18 locally deployed LLMs across three common business task types: text summarization, code generation, and financial analysis. The experiment was conducted at the WSTAR datacenter in Vaasa, Finland, using two NVIDIA L40S GPUs. Energy consumption was recorded in real time with the CodeCarbon library in Python, and the results were translated into practical cost estimates for organizations of different sizes. The findings show that enabling chain-of-thought reasoning increases the number of output tokens by a factor of three to five and raises total energy use accordingly, while energy per token remains constant. Based on Finnish energy prices, the monthly electricity cost of inference with local LLMs ranges from about one euro for micro-sized companies to about 300 euros for enterprise-level organizations. When local deployment was compared with online API services on a cost-per-million-token basis, the online services were more cost-effective for almost all tested models. The analysis indicates that energy cost savings are of minor importance in the decision between local and cloud deployment; privacy and price risk mitigation are more likely to be the deciding factors.
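The cost arithmetic described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the thesis's own code: the function names are hypothetical, and every number in the usage section is a placeholder chosen for easy arithmetic, not a measured result or an actual Finnish electricity price. The measured energy in kWh is assumed to come from a tracking tool such as CodeCarbon.

```python
def energy_per_token_wh(energy_kwh: float, output_tokens: int) -> float:
    """Energy per generated token in watt-hours."""
    if output_tokens <= 0:
        raise ValueError("output_tokens must be positive")
    return energy_kwh * 1000.0 / output_tokens


def cost_per_million_tokens(
    energy_kwh: float, output_tokens: int, price_eur_per_kwh: float
) -> float:
    """Electricity cost in euros for one million generated tokens."""
    if output_tokens <= 0:
        raise ValueError("output_tokens must be positive")
    kwh_per_million = energy_kwh / output_tokens * 1_000_000
    return kwh_per_million * price_eur_per_kwh


def monthly_cost_eur(
    tokens_per_month: int,
    energy_kwh: float,
    output_tokens: int,
    price_eur_per_kwh: float,
) -> float:
    """Scale the per-million-token cost to a monthly token volume."""
    per_million = cost_per_million_tokens(
        energy_kwh, output_tokens, price_eur_per_kwh
    )
    return per_million * tokens_per_month / 1_000_000


if __name__ == "__main__":
    # Placeholder inputs (assumptions, not values from the study).
    measured_kwh = 0.5          # energy recorded for one benchmark run
    generated_tokens = 1_000_000  # output tokens produced in that run
    price = 0.10                # electricity price in EUR per kWh

    print(energy_per_token_wh(measured_kwh, generated_tokens))
    print(cost_per_million_tokens(measured_kwh, generated_tokens, price))
    print(monthly_cost_eur(20_000_000, measured_kwh, generated_tokens, price))
```

Because the energy per token is treated as constant, a reasoning mode that produces several times as many output tokens raises `tokens_per_month`, and with it the monthly cost, by the same factor. The resulting cost per million tokens can then be set against an online provider's published per-million-token price.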
