OpenAI’s text-embedding-3-small is one of the most widely used embedding models for AI applications. It’s the go-to choice for developers building search, RAG (Retrieval-Augmented Generation), and classification systems. Here’s everything you need to know.
What It Is
text-embedding-3-small is an embedding model from OpenAI that converts text into numerical vectors (embeddings). These vectors capture the semantic meaning of the text, enabling similarity search, clustering, and classification.
When you send text to the model, it returns a vector of 1,536 dimensions (by default). Texts with similar meanings produce vectors that are close together in this high-dimensional space.
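"Close together" is usually measured with cosine similarity. Here's a minimal sketch using toy 4-dimensional vectors as stand-ins for real 1,536-dimensional embeddings (the vector values are made up for illustration):

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings
cat = [0.9, 0.1, 0.0, 0.2]
kitten = [0.8, 0.2, 0.1, 0.3]
invoice = [0.0, 0.1, 0.9, 0.1]

print(cosine_similarity(cat, kitten))   # high: similar meaning
print(cosine_similarity(cat, invoice))  # low: unrelated meaning
```

The same arithmetic works unchanged on real 1,536-dimensional vectors; only the loop gets longer.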
Key Specifications
Dimensions: 1,536 (default), can be reduced to as low as 256 using Matryoshka representation learning. Reducing dimensions saves storage and speeds up search with minimal quality loss.
Max input: 8,191 tokens (~6,000 words). Long enough for most documents and passages.
Performance: Strong results on standard benchmarks (MTEB). Not the absolute best available, but excellent for its size and cost.
Cost: $0.02 per million tokens. Extremely cheap — embedding a million words costs about 3 cents.
text-embedding-3-small vs. text-embedding-3-large
OpenAI offers two embedding models in the v3 family:
text-embedding-3-small: 1,536 dimensions, $0.02/M tokens. Good performance, very cheap.
text-embedding-3-large: 3,072 dimensions, $0.13/M tokens. Better performance, 6.5x more expensive.
For most applications, text-embedding-3-small is the better choice. The quality difference is small, and the cost savings are significant. Use text-embedding-3-large only when you need maximum retrieval accuracy and cost isn’t a concern.
Common Use Cases
Semantic search. Convert documents and queries into embeddings, then find the most similar documents for any query. This powers search features in AI applications, knowledge bases, and documentation sites.
RAG (Retrieval-Augmented Generation). The most common use case. Embed your documents, store them in a vector database, and retrieve relevant context when users ask questions. The retrieved context is then passed to an LLM to generate accurate answers.
Classification. Use embeddings as features for text classification. The embeddings capture semantic meaning, making classification more accurate than keyword-based approaches.
Clustering. Group similar documents together based on their embeddings. Useful for organizing large document collections, identifying topics, and detecting duplicates.
Recommendation. Find similar items (products, articles, content) based on embedding similarity. More nuanced than keyword matching because it understands semantic relationships.
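At their core, semantic search, RAG retrieval, and recommendation all reduce to the same operation: rank stored embeddings by similarity to a query embedding. A minimal sketch, using hypothetical pre-computed toy vectors (real embeddings would come from the API and have 1,536 dimensions):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical pre-computed document embeddings (toy 3-d vectors)
corpus = {
    "refund policy": [0.9, 0.1, 0.1],
    "shipping times": [0.2, 0.9, 0.1],
    "api reference": [0.1, 0.2, 0.9],
}
query_embedding = [0.85, 0.15, 0.05]  # e.g. the embedding of "how do I get my money back"

# Rank documents by similarity to the query
ranked = sorted(corpus, key=lambda doc: cosine(query_embedding, corpus[doc]), reverse=True)
print(ranked[0])  # most relevant document
```

In production, a vector database does this ranking for you with an approximate nearest-neighbor index instead of a full scan.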
How to Use It
Using the OpenAI API:
Call the embeddings endpoint with your text and the model name “text-embedding-3-small”. The API returns a vector that you can store in a vector database (Pinecone, Weaviate, ChromaDB, pgvector) or use directly for similarity calculations.
For dimension reduction, pass the “dimensions” parameter with your desired size (e.g., 256, 512, 1024). The model uses Matryoshka representation learning to produce shorter vectors that retain most of the semantic information.
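Both calls can be wrapped in a small helper. This is a sketch assuming the official openai Python SDK (v1.x) is installed and OPENAI_API_KEY is set in your environment; it is not runnable without an API key:

```python
# Sketch using the official openai Python SDK (v1.x).
from openai import OpenAI

def embed(text, dimensions=None):
    """Return the embedding vector for `text` from text-embedding-3-small."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    kwargs = {"model": "text-embedding-3-small", "input": text}
    if dimensions is not None:
        kwargs["dimensions"] = dimensions  # Matryoshka truncation, e.g. 256
    response = client.embeddings.create(**kwargs)
    return response.data[0].embedding

# vector = embed("hello world")                  # 1,536 floats by default
# short = embed("hello world", dimensions=256)   # 256 floats
```

The `embed` helper name is my own; the SDK itself exposes the call as `client.embeddings.create(...)`.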
Tips for Best Results
Chunk your documents. Don’t embed entire documents as single vectors. Split them into chunks of 200-500 tokens for better retrieval accuracy.
Use meaningful chunks. Split at paragraph or section boundaries rather than arbitrary token counts. Semantic coherence within chunks improves retrieval quality.
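Paragraph-boundary chunking can be sketched in a few lines. This toy chunker uses whitespace word count as a rough token proxy; a real pipeline would count tokens with a tokenizer such as tiktoken, and the function name and budget are my own invention:

```python
# Toy chunker: split on paragraph boundaries, then pack paragraphs into
# chunks of at most ~max_tokens (crudely estimated as whitespace words).
def chunk_text(text, max_tokens=300):
    chunks, current, current_len = [], [], 0
    for para in (p.strip() for p in text.split("\n\n") if p.strip()):
        para_len = len(para.split())
        # Start a new chunk when adding this paragraph would exceed the budget
        if current and current_len + para_len > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(para)
        current_len += para_len
    if current:
        chunks.append("\n\n".join(current))
    return chunks

doc = ("First paragraph about refunds. " * 30 + "\n\n" +
       "Second paragraph about shipping. " * 30)
print(len(chunk_text(doc, max_tokens=150)))  # each paragraph lands in its own chunk
```

Because it only splits between paragraphs, each chunk stays semantically coherent, which is the point of the tip above.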
Consider dimension reduction. For large-scale applications, reducing dimensions from 1,536 to 512 or 256 can significantly reduce storage costs and speed up search with minimal quality loss.
Normalize vectors. For cosine similarity search, normalize your vectors. Most vector databases handle this automatically.
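Normalization itself is one line of math: scale the vector to unit length, after which cosine similarity reduces to a plain dot product. A minimal sketch:

```python
import math

def normalize(v):
    # Scale v to unit length so that dot product equals cosine similarity
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

v = [3.0, 4.0]
unit = normalize(v)
print(unit)                                 # [0.6, 0.8]
print(sum(x * x for x in unit))             # 1.0 (unit length)
```

If your database already stores normalized vectors, using dot-product distance gives the same ranking as cosine similarity at lower cost.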
Alternatives
Cohere Embed v3: Competitive quality, supports multiple languages well.
Voyage AI: Strong performance, particularly for code and technical content.
BGE (BAAI): Open-source, can be run locally. Good quality for a free option.
Nomic Embed: Open-source with competitive performance.
My Take
text-embedding-3-small is the default choice for most AI applications. It’s cheap, fast, easy to use, and good enough for the vast majority of use cases. Start here, and only consider alternatives if you have specific requirements (better multilingual support, local deployment, or maximum accuracy) that justify the switch.
Originally published: March 13, 2026