Embedding
LLM & Language Models

A numerical representation of text (or images or audio) as a list of numbers, allowing AI to understand meaning and find similarities.
An embedding converts human-readable content into a list of numbers (a vector) that captures its meaning. Similar concepts end up with nearby vectors, so 'dog' and 'puppy' would have embeddings close to each other, while 'dog' and 'refrigerator' would be far apart.
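The idea of "close" and "far apart" is usually measured with cosine similarity. Here is a minimal sketch using toy 4-dimensional vectors (real embedding models produce hundreds or thousands of dimensions, and these particular values are made up for illustration):

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings, invented for illustration only.
dog          = [0.9, 0.8, 0.1, 0.0]
puppy        = [0.8, 0.9, 0.2, 0.1]
refrigerator = [0.0, 0.1, 0.9, 0.8]

print(cosine_similarity(dog, puppy))         # high: related concepts
print(cosine_similarity(dog, refrigerator))  # low: unrelated concepts
```

With these toy vectors, 'dog' vs. 'puppy' scores far higher than 'dog' vs. 'refrigerator', which is exactly the property semantic search relies on.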
Embeddings are the secret sauce behind semantic search, recommendation systems, and RAG. When you search with a vague description and the AI still finds what you meant, embeddings are doing the heavy lifting — matching meaning, not just keywords.
In practice, you create embeddings using models like OpenAI's text-embedding-3 or open-source alternatives like Sentence-BERT. These embeddings are stored in vector databases (Pinecone, Weaviate, Chroma) and used for similarity search, clustering, and retrieval.
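At its core, the similarity search a vector database performs is nearest-neighbor lookup over stored embeddings. The sketch below uses hand-made toy vectors and a brute-force linear scan; in a real pipeline the vectors would come from an embedding model (e.g. Sentence-BERT's `model.encode(text)`) and a vector database would replace the scan with approximate indexes for scale:

```python
from math import sqrt

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

# Toy document "index": text mapped to invented 3-dimensional embeddings.
# A real system would store model-generated vectors in a vector database.
index = {
    "Dogs make loyal pets":        [0.9, 0.8, 0.1],
    "Puppies need daily training": [0.8, 0.9, 0.2],
    "How to defrost a freezer":    [0.1, 0.1, 0.9],
}

def search(query_vector, top_k=2):
    """Brute-force nearest-neighbor search: score every document against
    the query and return the top_k best matches."""
    scored = [(cosine_similarity(query_vector, vec), doc)
              for doc, vec in index.items()]
    return [doc for _, doc in sorted(scored, reverse=True)[:top_k]]

# A toy query vector representing something dog-related: both dog
# documents outrank the freezer one, even with no shared keywords.
print(search([0.85, 0.85, 0.1]))
```

This is why semantic search matches meaning rather than keywords: the query never needs to contain the word "dog" for dog-related documents to score highest.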
Real-World Example
When Perplexity finds relevant sources for your question, it uses embeddings to match your query's meaning against millions of documents — not just keyword matching.
FAQ
What is Embedding?
A numerical representation of text (or images or audio) as a list of numbers, allowing AI to understand meaning and find similarities.
How is Embedding used in practice?
When Perplexity finds relevant sources for your question, it uses embeddings to match your query's meaning against millions of documents — not just keyword matching.
What concepts are related to Embedding?
Key related concepts include Vector Database, RAG (Retrieval-Augmented Generation), Semantic Search, Token, and Transformer. Understanding these together gives a more complete picture of how embeddings fit into the AI landscape.