The recent boom in generative AI technologies such as ChatGPT and other large language or foundation models has been nothing short of phenomenal. However, the costs of creating and maintaining such technologies can be incredibly high. Analysts and technologists estimate that the critical process of training a large language model such as OpenAI's GPT-3 could cost more than $4 million, and even more advanced language models could cost over "the high-single-digit millions" to train, according to a Forrester analyst who focuses on AI and machine learning.
For example, Meta's largest LLaMA model used 2,048 Nvidia A100 GPUs to train on 1.4 trillion tokens, taking about 21 days and roughly 1 million GPU hours. At dedicated prices from AWS, that would cost over $2.4 million. And at 65 billion parameters, LLaMA is smaller than OpenAI's current GPT models, such as GPT-3, which has 175 billion parameters.
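The training-cost arithmetic above can be sketched as a back-of-envelope calculation. The GPU count and duration come from the figures in the text; the $2.40-per-GPU-hour dedicated A100 rate is an assumption for illustration, not an official AWS price.

```python
# Back-of-envelope estimate of LLaMA-65B training cost, using the
# figures cited above. The per-GPU-hour rate is an assumed dedicated
# AWS price for illustration only.
num_gpus = 2_048          # Nvidia A100 GPUs used for training
days = 21                 # approximate training duration
rate_per_gpu_hour = 2.40  # assumed dedicated AWS rate, USD

gpu_hours = num_gpus * days * 24
cost = gpu_hours * rate_per_gpu_hour

print(f"{gpu_hours:,} GPU hours")  # prints 1,032,192 GPU hours
print(f"${cost:,.0f}")             # prints $2,477,261
```

The result lines up with the roughly 1 million GPU hours and the more-than-$2.4-million figure cited above.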
Because retraining is so expensive, organizations that build large language models must be judicious about how often they retrain the software to improve its abilities. These models are not retrained constantly, which is why some, such as ChatGPT, lack knowledge of recent events: ChatGPT's knowledge stops in 2021.
To make predictions or generate text with a trained machine learning model, engineers run it in a process called "inference," which can be far more expensive than training because a popular product may need to run the model millions of times. For a product as popular as ChatGPT, which investment firm UBS estimates reached 100 million monthly active users in January, analysts believe it could have cost OpenAI $40 million to process the millions of prompts people fed into the software that month.
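As a rough illustration of how those inference figures relate, one can divide the estimated monthly bill by the estimated user base. Both numbers come from the text; the per-user result is only an order-of-magnitude sketch, since actual usage varies widely between users.

```python
# Rough per-user inference cost for ChatGPT in January, dividing the
# analysts' $40M monthly estimate by UBS's estimate of 100M monthly
# active users. An order-of-magnitude sketch, not an official figure.
monthly_inference_cost = 40_000_000  # USD, analyst estimate
monthly_active_users = 100_000_000   # UBS estimate for January

cost_per_user = monthly_inference_cost / monthly_active_users
print(f"${cost_per_user:.2f} per user per month")  # prints $0.40 per user per month
```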
The cost to develop and maintain the software can be extraordinarily high, both for the firms that develop the underlying technologies, generally referred to as large language or foundation models, and for those that use the AI to power their own software. The pricey AI bills of Latitude, maker of the game AI Dungeon, underscore an unpleasant truth behind the recent boom in generative AI: the margin for AI applications may be permanently smaller than previous software-as-a-service margins because of the high cost of computing.
To train these models, organizations require specialized hardware. Traditional computer processors can run machine learning models, but they are not efficient enough to keep up with the demands of training a large language model. Most training and inference now takes place on graphics processing units, or GPUs, which were originally designed for 3D gaming but have become the standard for AI applications because they can perform many simple calculations simultaneously. Nvidia makes most of the GPUs for the AI industry, and its primary data center workhorse chip, the A100, costs $10,000.
The high cost of training large language models is a structural expense that differs from previous computing booms. Even after a model is built and trained, running it still requires a massive amount of computing power, and those costs skyrocket when the tools are used billions of times a day. For instance, financial analysts estimate that Microsoft's Bing AI chatbot, which is powered by an OpenAI ChatGPT model, needs at least $4 billion of infrastructure to serve responses to all Bing users.
The high cost of machine learning is an uncomfortable reality in the industry as venture capitalists eye companies that could potentially be worth trillions. Big companies such as Microsoft, Meta, and Google use their considerable capital to build a lead in the technology that smaller challengers struggle to match. But if the margin for AI applications is permanently smaller than previous software-as-a-service margins because of the high cost of computing, it could put a damper on the current boom.
Many entrepreneurs see risks in relying on potentially subsidized AI models that they don't control and merely pay for on a per-use basis. Some startups have therefore turned the high cost of AI into a business opportunity. For instance, D-Matrix is a startup building a system to save money on inference by doing more processing in the computer's memory rather than on a GPU. Similarly, Hugging Face CEO Clement Delangue believes that more companies would be better served by focusing on smaller, specific models that are cheaper to train and run, instead of the large language models that are garnering most of the attention.
The high cost of AI has led some startups and organizations to adopt open-source and free language models to lower costs. Latitude, for example, switched from OpenAI's GPT software to cheaper but still capable language software from startup AI21 Labs and incorporated open-source models into its service.
It's not all doom and gloom, though. Companies making the foundation models, semiconductor makers, and startups all see business opportunities in reducing the cost of running AI software. Nvidia continues to develop more powerful GPUs designed specifically for machine learning, even as improvements in overall chip performance across the industry have slowed in recent years. Nvidia CEO Jensen Huang believes that in 10 years, AI will be "a million times" more efficient, thanks to improvements not only in chips but also in software and other computer components.
Meanwhile, OpenAI has announced that it’s lowering the cost for companies to access its GPT models. It now charges one-fifth of one cent for about 750 words of output. This has caught the attention of AI Dungeon-maker Latitude, which says it’s constantly evaluating how it can deliver the best experience to users.
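OpenAI's quoted rate, one-fifth of one cent ($0.002) for about 750 words of output, can be turned into a simple cost estimate. The rate comes from the text; the helper function and the sample word count are mine, for illustration.

```python
# Estimating API cost at OpenAI's quoted rate: one-fifth of one cent
# ($0.002) for about 750 words of output, as cited above.
PRICE_PER_750_WORDS = 0.002  # USD

def estimated_cost(words: int) -> float:
    """Approximate cost in USD for a given number of output words."""
    return words / 750 * PRICE_PER_750_WORDS

# e.g. generating a million words of output
print(f"${estimated_cost(1_000_000):.2f}")  # prints $2.67
```

At that rate, even very large volumes of generated text cost only a few dollars, which helps explain why the price cut caught the attention of heavy users like Latitude.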
While the costs of creating and maintaining generative AI technologies such as ChatGPT can be extraordinarily high, there is optimism that prices will come down over time as the industry develops. With foundation-model companies, semiconductor makers, and startups all working to reduce the cost of training and running AI, the economics of the current boom may yet improve.