Token price comparison

Wednesday 13th November 2024 02:57 AM

An overview of the prices of the main large language models (LLMs), based on the number of tokens processed through their APIs. Prices vary widely from one offering to another.

LLMs are billed by the “token”, a unit of measurement representing a segment of text, that is, a word or a fragment of a word. In general, LLM providers price input tokens (those sent to the model) and output tokens (those generated in response) separately. This billing method lets users estimate their cost from the volume of data processed by LLMs exposed through an API. According to OpenAI's definition, one token corresponds to roughly 0.75 words in English. The table below summarizes the estimated token prices for the main LLMs on the market, for both input and output.
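As a rough illustration of the 0.75 words per token figure, the sketch below converts an English word count into an approximate token count. The function name `estimate_tokens` and the constant `WORDS_PER_TOKEN` are hypothetical names chosen for this example, and the ratio is only a rule of thumb: real counts depend on each model's tokenizer and on the language of the text.

```python
import math

# Rule of thumb quoted above: one token is about 0.75 English words.
# This is an approximation, not the output of an actual tokenizer.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count: int) -> int:
    """Return an approximate token count for an English text of `word_count` words."""
    # Round up so the estimate does not understate the billed volume.
    return math.ceil(word_count / WORDS_PER_TOKEN)


if __name__ == "__main__":
    for words in (750, 1500, 3000):
        print(f"{words} words is roughly {estimate_tokens(words)} tokens")
```

For an exact count, the provider's own tokenizer should be used instead of this ratio.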

Prices of the main LLMs by number of tokens
Model	Provider	Price per 1,000 input tokens	Price per 1,000 output tokens	Total tokens supported
GPT-4o (omni)	OpenAI	$0.05	$0.15	128,000
GPT-4 Turbo	OpenAI	$0.1	$0.3	128,000
GPT-4	OpenAI	$0.3	$0.6	8,000
Claude 3 Haiku	Anthropic	$0.0025	$0.0125	200,000
Claude 3 Sonnet	Anthropic	$0.03	$0.15	200,000
Claude 3 Opus	Anthropic	$0.15	$0.75	200,000
Llama 3 70b	Meta (via AWS)	$0.00265	$0.0035	8,000
Llama 2 70b	Meta (via AWS)	$0.00195	$0.00256	4,000
Gemini 1.0 Pro	Google	$0.005	$0.015	32,000
Gemini 1.5 Pro	Google	$0.07	$0.21	1,000,000
Command	Cohere	$0.1	$0.2	4,000
Command R	Cohere	$0.005	$0.015	132,000
Command R+	Cohere	$0.03	$0.15	128,000
Mixtral 8x7B	Mistral AI (via Anyscale)	$0.005	$0.005	32,000
Mistral Small	Mistral AI	$0.02	$0.06	32,000
Mistral Large	Mistral AI	$0.08	$0.24	32,000
GPT-3.5 Turbo	OpenAI	$0.12	$0.16	4,000
PaLM 2	Google	$0.02	$0.02	8,000
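Because input and output tokens are priced separately, the cost of a request is the sum of the two volumes, each multiplied by its own rate per 1,000 tokens. The sketch below applies that arithmetic to three rows copied from the table above. The names `PRICES_PER_1K` and `estimate_cost`, and the request size of 2,000 input and 500 output tokens, are assumptions made for illustration and are not part of any provider's API.

```python
# Prices in US dollars per 1,000 tokens, as (input, output),
# copied from three rows of the table above.
PRICES_PER_1K = {
    "GPT-4o": (0.05, 0.15),
    "Claude 3 Haiku": (0.0025, 0.0125),
    "Mistral Large": (0.08, 0.24),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in dollars of one request for a model listed in PRICES_PER_1K."""
    price_in, price_out = PRICES_PER_1K[model]
    # Input and output volumes are billed at different rates, then summed.
    return input_tokens / 1000 * price_in + output_tokens / 1000 * price_out


if __name__ == "__main__":
    # Hypothetical request: 2,000 tokens sent, 500 tokens generated.
    for model in PRICES_PER_1K:
        cost = estimate_cost(model, input_tokens=2000, output_tokens=500)
        print(f"{model}: about ${cost:.5f} per request")
```

Multiplying the per-request figure by an expected number of requests gives a first approximation of a monthly budget, provided the listed prices still match the provider's current offer.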

Model Analysis

GPT-4o and GPT-4 Turbo (OpenAI) – Within OpenAI's LLM range, GPT-4o is a high-performance multimodal model at a competitive price, while GPT-4 Turbo is a lighter option at reduced cost. GPT-4 remains the main version for complex tasks, but at a higher cost.
Gemini 1.5 and 1.5 Pro (Google DeepMind) – Developed by Google, the Gemini models are optimized for text and images. The Pro version is more powerful, with a higher token capacity for advanced use cases.
Claude 3.5 Sonnet, Haiku and Opus (Anthropic) – Anthropic offers different versions of its LLM for specific needs: rapid responses (Haiku) or long, in-depth interactions (Sonnet and Opus). Prices vary depending on the capabilities of each model.
Llama 3 70b and Llama 2 70b (Meta) – Developed by Meta, these models are offered notably through Amazon's cloud (AWS). They provide increased flexibility for custom tasks.
Command, Command R and Command R+ (Cohere) – Cohere offers a full range of models for search and analysis, from cost-effective options for basic data retrieval to LLMs tailored for more complex analyses.
Mixtral and Mistral – The models from French company Mistral AI are optimized for specific linguistic tasks (Mixtral) or quick interactions (Mistral Small).
GPT-3.5 Turbo – This is a cost-effective option among OpenAI's models, suitable for simple tasks with limited token capacity, and a good alternative for less intensive needs.