Embedding Model Comparison
Side-by-side pricing, dimensions, and quality scores for popular embedding models.
| Model | Provider | Price / 1M Tokens | Dimensions | Max Tokens | MTEB Score |
|---|---|---|---|---|---|
| text-embedding-005 | Google | Free* | 768 | 2,048 | 63 |
| text-embedding-3-small | OpenAI | $0.02 | 1,536 | 8,191 | 62 |
| voyage-3-lite | Voyage | $0.02 | 512 | 32,000 | 62 |
| jina-embeddings-v3 | Jina | $0.02 | 1,024 | 8,192 | 66 |
| voyage-3 | Voyage | $0.06 | 1,024 | 32,000 | 67 |
| embed-v4 | Cohere | $0.10 | 1,024 | 512 | 64 |
| mistral-embed | Mistral | $0.10 | 1,024 | 8,192 | 61 |
| text-embedding-3-large | OpenAI | $0.13 | 3,072 | 8,191 | 65 |
* Google text-embedding-005 has a generous free tier. Check current limits at cloud.google.com.
Cost vs. Quality Tradeoff
The cheapest model isn't always the best choice. Consider your use case: for high-volume RAG pipelines where marginal quality differences compound, a higher-quality model like Voyage-3 or OpenAI text-embedding-3-large may deliver better end-user results despite higher costs. For simple classification or clustering tasks, budget models work well.
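The price gap is easier to weigh as a concrete dollar figure. A minimal sketch of the arithmetic, using per-million-token prices from the table above; the corpus size is a hypothetical example:

```python
# Per-million-token prices taken from the comparison table above.
PRICES_PER_M_TOKENS = {
    "text-embedding-3-small": 0.02,
    "voyage-3": 0.06,
    "text-embedding-3-large": 0.13,
}

def embedding_cost(total_tokens: int, price_per_m: float) -> float:
    """Dollar cost to embed `total_tokens` tokens at `price_per_m` per 1M."""
    return total_tokens / 1_000_000 * price_per_m

corpus_tokens = 500_000_000  # hypothetical 500M-token corpus
for model, price in PRICES_PER_M_TOKENS.items():
    print(f"{model}: ${embedding_cost(corpus_tokens, price):,.2f}")
```

Even at the high end, embedding a 500M-token corpus once costs tens of dollars; for one-time indexing the quality difference usually matters more than the price difference, while continuously re-embedded pipelines feel the per-token price much more.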
Key considerations
- Dimensions affect storage cost in your vector database. Fewer dimensions = less storage and faster search.
- Max tokens determine how much text you can embed in a single call. Longer contexts reduce chunking complexity.
- MTEB scores are approximate and vary by task. Always benchmark on your own data.
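The storage point in the list above is worth quantifying. A back-of-envelope sketch, assuming float32 vectors (4 bytes per dimension) and ignoring index overhead; the 10M-chunk corpus is a hypothetical example:

```python
def storage_gb(num_vectors: int, dimensions: int, bytes_per_dim: int = 4) -> float:
    """Raw storage in GB for `num_vectors` embeddings of `dimensions` floats."""
    return num_vectors * dimensions * bytes_per_dim / 1e9

chunks = 10_000_000  # hypothetical 10M-chunk corpus
print(f"voyage-3-lite (512 dims):            {storage_gb(chunks, 512):.2f} GB")
print(f"text-embedding-3-large (3,072 dims): {storage_gb(chunks, 3072):.2f} GB")
```

At 10M chunks, the 512-dimension model needs roughly 20 GB of raw vector storage while the 3,072-dimension model needs roughly 123 GB, a 6x difference that carries through to RAM-resident indexes and search latency.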