
Top-p (Nucleus Sampling)

LLM & Language Models

A generation parameter that controls which tokens the model considers — only tokens within the top probability mass are eligible, filtering out unlikely choices.

Top-p (also called nucleus sampling) is a generation parameter, complementary to temperature, that controls randomness in text generation. While temperature rescales the entire probability distribution, top-p sets a cumulative probability threshold: only the smallest set of highest-probability tokens whose combined probability reaches the top-p value remains eligible for sampling.

With top-p = 0.9, the model only considers tokens that together make up 90% of the probability mass, ignoring the long tail of unlikely tokens. With top-p = 0.1, only the very top choices are considered, making output more deterministic.
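The filtering step can be sketched in plain Python. This is a minimal illustration, not any particular library's implementation; the token names and probabilities are made up for the example.

```python
def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches top_p, then renormalize. Illustrative only."""
    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus, cumulative = {}, 0.0
    for token, p in ranked:
        nucleus[token] = p
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus now covers at least top_p of the mass
    # Renormalize so the surviving probabilities sum to 1.
    total = sum(nucleus.values())
    return {token: p / total for token, p in nucleus.items()}

# Hypothetical next-token distribution:
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "xylophone": 0.05}
print(top_p_filter(probs, top_p=0.9))
# "xylophone" falls outside the 90% nucleus and is dropped.
```

With top_p = 0.9 here, the first three tokens reach 95% cumulative probability, so the long-tail token is excluded; the model then samples only from the renormalized survivors.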

In practice, most users adjust either temperature OR top-p, not both simultaneously (as they can interact unpredictably). OpenAI recommends adjusting one while keeping the other at its default. Many AI tools don't expose top-p to users at all, handling it internally.

Real-World Example

If an AI's output is too random or nonsensical, lowering top-p to 0.8 or 0.9 constrains it to more probable word choices without making it completely rigid.


FAQ

What is Top-p (Nucleus Sampling)?

A generation parameter that controls which tokens the model considers — only tokens within the top probability mass are eligible, filtering out unlikely choices.

How is Top-p (Nucleus Sampling) used in practice?

If an AI's output is too random or nonsensical, lowering top-p to 0.8 or 0.9 constrains it to more probable word choices without making it completely rigid.

What concepts are related to Top-p (Nucleus Sampling)?

Key related concepts include Temperature, Token, LLM (Large Language Model), Inference. Understanding these together gives a more complete picture of how Top-p (Nucleus Sampling) fits into the AI landscape.