Top-p (Nucleus Sampling)
A generation parameter that controls which tokens the model considers — only tokens within the top probability mass are eligible, filtering out unlikely choices.
Top-p (also called nucleus sampling) is a parameter complementary to temperature that controls randomness in text generation. While temperature rescales the whole probability distribution, top-p sets a cumulative probability threshold: the model samples only from the smallest set of tokens whose combined probability reaches the top-p value.
With top-p = 0.9, the model only considers tokens that together make up 90% of the probability mass, ignoring the long tail of unlikely tokens. With top-p = 0.1, only the very top choices are considered, making output more deterministic.
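The filtering step itself is simple. A minimal sketch in plain Python, using an invented toy distribution over next tokens for illustration:

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    # Sort token probabilities from most to least likely.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break  # the "nucleus" is complete
    # Renormalize so the surviving probabilities sum to 1 before sampling.
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

# Toy next-token distribution (assumed for illustration).
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "xylophone": 0.05}

print(top_p_filter(probs, p=0.9))  # "xylophone" falls outside the nucleus
```

With p = 0.9, the top three tokens already cover 95% of the mass, so the long-tail token is dropped and the rest are renormalized before sampling.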
In practice, most users adjust either temperature OR top-p, not both simultaneously (as they can interact unpredictably). OpenAI recommends adjusting one while keeping the other at its default. Many AI tools don't expose top-p to users at all, handling it internally.
Real-World Example
If an AI's output is too random or nonsensical, lowering top-p to 0.8 or 0.9 constrains it to more probable word choices without making it completely rigid.
FAQ
What concepts are related to Top-p (Nucleus Sampling)?
Key related concepts include Temperature, Token, LLM (Large Language Model), and Inference. Understanding these together gives a more complete picture of where Top-p (Nucleus Sampling) fits in the text-generation pipeline.