Prompt Safe

Verified

Token-safe prompt assembly with memory orchestration. Use for any agent that needs to construct LLM prompts with memory retrieval. Guarantees no API failure due to token overflow. Implements two-phase context construction, memory safety valve, and hard limits on memory injection.

Install

Claude Code

Add to .claude/skills/

About This Skill

# Prompt Assemble

Overview

A standardized, token-safe prompt assembly framework that guarantees API stability. Implements Two-Phase Context Construction and Memory Safety Valve to prevent token overflow while maximizing relevant context.

Design Goals:

  • ✅ Never fail due to memory-related token overflow
  • ✅ Memory is always a discardable enhancement, never a rigid dependency
  • ✅ Token budget decisions are centralized in the prompt assembly layer
When to Use

Use this skill when:

  1. Building or modifying any agent that constructs prompts
  2. Implementing memory retrieval systems
  3. Adding new prompt-related logic to existing agents
  4. Handling any scenario where token budget safety is required

Core Workflow

```
User Input
    ↓
Need-Memory Decision
    ↓
Minimal Context Build
    ↓
Memory Retrieval (Optional)
    ↓
Memory Summarization
    ↓
Token Estimation
    ↓
Safety Valve Decision
    ↓
Final Prompt → LLM Call
```

Phase Details

Phase 0: Base Configuration

```python
# Model context windows (as of 2026-02-04)
# - MiniMax-M2.1: 204,000 tokens (default)
# - Claude 3.5 Sonnet: 200,000 tokens
# - GPT-4o: 128,000 tokens

MAX_TOKENS = 204000                     # Set to your model's context limit
SAFETY_MARGIN = int(0.75 * MAX_TOKENS)  # Conservative 75% threshold = 153,000 tokens
MEMORY_TOP_K = 3                        # Max 3 memories
MEMORY_SUMMARY_MAX = 3                  # Max 3 lines per memory
```

Design Philosophy:

  • Leave a 25% buffer for safety (model overhead, estimation errors, spikes)
  • Better to underutilize capacity than to overflow
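The 75% threshold can be worked out for each of the context windows listed in the configuration. The snippet below is a minimal sketch of that arithmetic; the `safety_margin` helper and the `CONTEXT_WINDOWS` table are illustrative names, not part of the skill's script.

```python
# Context windows as listed in the Phase 0 configuration.
CONTEXT_WINDOWS = {
    "MiniMax-M2.1": 204_000,
    "Claude 3.5 Sonnet": 200_000,
    "GPT-4o": 128_000,
}
SAFETY_RATIO = 0.75  # the 75% threshold from the configuration


def safety_margin(max_tokens: int, ratio: float = SAFETY_RATIO) -> int:
    """Return the token count above which memory is skipped (hypothetical helper)."""
    return int(max_tokens * ratio)


for model, window in CONTEXT_WINDOWS.items():
    threshold = safety_margin(window)
    # The buffer is the 25% of the window that is deliberately left unused.
    print(f"{model}: threshold {threshold:,} tokens, buffer {window - threshold:,} tokens")
```

Switching models only requires changing the window size; the ratio stays the same.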

Phase 1: Minimal Context

- System prompt
- Recent N messages (N=3, trimmed)
- Current user input
- **No memory by default**
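A minimal sketch of how this base context might be built is shown below. It assumes that messages are plain strings and that "trimmed" means capping each history message at a character limit; the function name, the `max_chars` parameter, and the `[trimmed]` marker are hypothetical and may differ from `prompt_assemble.py`.

```python
def build_minimal_context(system_prompt, history, user_input, recent_n=3, max_chars=500):
    """Phase 1 sketch: system prompt + last N trimmed messages + current user input."""
    context = [system_prompt]  # the system prompt is always included in full
    # Guard recent_n == 0, because history[-0:] would return the whole list.
    recent = history[-recent_n:] if recent_n > 0 else []
    for message in recent:
        # Only history messages are trimmed (assumption: a simple character cap).
        if len(message) > max_chars:
            message = message[:max_chars] + " [trimmed]"
        context.append(message)
    context.append(user_input)  # the user input is never truncated
    return context  # note: no memory is added at this stage
```

For example, `build_minimal_context("You are helpful.", ["a", "b", "c", "d"], "hi")` keeps only the last three history messages between the system prompt and the user input.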

Phase 2: Memory Need Decision

```python
def need_memory(user_input):
    """Return True if the input contains a phrase that refers to earlier context."""
    triggers = [
        "previously",
        "earlier we discussed",
        "do you remember",
        "as I mentioned before",
        "continuing from",
        "before we",
        "last time",
        "previously mentioned",
    ]
    text = user_input.lower()
    # Case-insensitive substring match against each trigger phrase
    return any(trigger.lower() in text for trigger in triggers)
```

Phase 3: Memory Retrieval (Optional)

```python
summarized_memories = []
if need_memory(user_input):  # Retrieval runs only when Phase 2 asks for it
    memories = memory_search(query=user_input, top_k=MEMORY_TOP_K)
    for mem in memories:
        summarized_memories.append(summarize(mem, max_lines=MEMORY_SUMMARY_MAX))
```
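The `summarize` helper is not defined in this section. As a stand-in, the sketch below enforces the per-memory line limit by keeping the first non-empty lines; this is plain truncation, assumed here for illustration, and the real implementation may condense content differently.

```python
def summarize(memory_text, max_lines=3):
    """Hypothetical stand-in: keep at most max_lines non-empty lines of a memory."""
    # Drop blank lines and surrounding whitespace so the limit counts real content.
    lines = [line.strip() for line in memory_text.splitlines() if line.strip()]
    return "\n".join(lines[:max_lines])
```

Whatever summarization strategy is used, the point is that each memory has a hard upper bound on its size before it reaches token estimation.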

Phase 4: Token Estimation

Estimate the total token count of base_context + summarized_memories.
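One simple way to estimate tokens without loading a tokenizer is a character-count heuristic. The sketch below assumes roughly four characters per token, a common rule of thumb for English text; the ratio is an assumption, varies by model and language, and is one reason the 25% buffer exists. The function name is illustrative.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb for English text


def estimate_tokens(parts):
    """Estimate the token count of a list of prompt strings from their length."""
    total_chars = sum(len(part) for part in parts)
    # Round up so the estimate errs on the side of overcounting.
    return math.ceil(total_chars / CHARS_PER_TOKEN)
```

For exact counts, use the target model's own tokenizer; `references/token_estimation.md` covers token counting strategies.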

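Phase 5: Safety Valve (Critical)

```python
# Inside the prompt-building function:
if estimated_tokens > SAFETY_MARGIN:
    # Over budget: drop the memory layer and keep the base context intact
    base_context.append("[System Notice] Relevant memory skipped due to token budget.")
    return assemble(base_context)
```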

Hard Rules:

  • ❌ Never downgrade the system prompt
  • ❌ Never truncate user input
  • ❌ No "lucky splicing"
  • ✅ Only the memory layer is expendable
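The safety valve and these rules can be sketched as a single function. This is an illustration under stated assumptions: the context is a list of strings, the token estimator is passed in as a callable, and memory is dropped as a whole rather than trimmed piece by piece (one reading of the rules above). The function and parameter names are hypothetical.

```python
NOTICE = "[System Notice] Relevant memory skipped due to token budget."


def apply_safety_valve(base_context, summarized_memories, estimate_tokens, safety_margin):
    """Return the final list of prompt parts, dropping memory if over budget."""
    candidate = base_context + summarized_memories
    if estimate_tokens(candidate) > safety_margin:
        # Over budget: the memory layer is the only thing discarded.
        # The system prompt and user input in base_context stay untouched.
        return base_context + [NOTICE]
    return candidate


# Usage with a toy estimator that counts characters instead of tokens.
count_chars = lambda parts: sum(len(part) for part in parts)
base = ["system prompt", "user question"]
memories = ["m" * 100]
print(apply_safety_valve(base, memories, count_chars, safety_margin=50))
```

Building a new list instead of appending to `base_context` keeps the caller's base context unchanged, which makes the function easier to test.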

Phase 6: Final Assembly

```python
# Inside the prompt-building function, when the safety valve did not trigger:
final_prompt = assemble(base_context + summarized_memories)
return final_prompt
```

Memory Data Standards

Allowed in Long-Term Memory

- ✅ User preferences / identity / long-term goals
- ✅ Confirmed important conclusions
- ✅ System-level settings and rules

Forbidden in Long-Term Memory

- ❌ Raw conversation logs
- ❌ Reasoning traces
- ❌ Temporary discussions
- ❌ Information recoverable from chat history
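These standards can be enforced at write time with an allowlist. The sketch below assumes each candidate memory is tagged with a `kind` label; the labels and the `may_store` function are hypothetical names chosen to mirror the two lists above, not identifiers from the skill.

```python
# Hypothetical kind labels mirroring the "Allowed" list.
ALLOWED_KINDS = {
    "preference",
    "identity",
    "long_term_goal",
    "confirmed_conclusion",
    "system_rule",
}


def may_store(kind):
    """Return True only for kinds on the allowlist; everything else is rejected."""
    return kind in ALLOWED_KINDS


candidates = [
    ("preference", "User prefers concise answers."),
    ("raw_log", "user: hi\nassistant: hello"),
    ("reasoning_trace", "First I considered..."),
]
# Keep only entries whose kind is allowed in long-term memory.
stored = [text for kind, text in candidates if may_store(kind)]
print(stored)
```

An allowlist rejects unknown kinds by default, so a new category stays out of long-term memory until it is explicitly approved.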

Quick Start

Copy `scripts/prompt_assemble.py` to your agent and use:

```python
from prompt_assemble import build_prompt

# In your agent's prompt construction:
final_prompt = build_prompt(user_input, memory_search_fn, get_recent_dialog_fn)
```

Resources

scripts/

- `prompt_assemble.py` - Complete implementation with all phases (PromptAssembler class)

references/

- `memory_standards.md` - Detailed memory content guidelines
- `token_estimation.md` - Token counting strategies

Use Cases

  • Enhance and optimize AI prompts for better response quality
  • Generate structured prompts from templates or natural language descriptions
  • Assemble complex prompts from reusable components and templates
  • Iterate on prompt designs with systematic testing and refinement
  • Manage prompt libraries for consistent AI interaction patterns across projects

Pros & Cons

Pros

  • + Well-adopted with 1,948+ downloads showing reliable real-world usage
  • + Leverages AI models for intelligent automation beyond simple rule-based tools
  • + Configurable parameters allow tuning for different quality and cost tradeoffs

Cons

  • - Depends on external AI model APIs which may incur usage costs
  • - Output quality varies based on input specificity and model capabilities

Frequently Asked Questions

What platforms support Prompt Safe?

Prompt Safe is available on Claude Code and OpenClaw.
