
AI Engineer


AI/ML engineering specialist for building intelligent features, RAG systems, LLM integrations, data pipelines, vector search, and AI-powered applications. Us...

$ Add to .claude/skills/

About This Skill

# AI Engineer

Build practical AI systems that work in production. Data-driven, systematic, performance-focused.

Core Capabilities

  • LLM Integration: OpenAI, Anthropic, local models (Ollama, llama.cpp), LiteLLM
  • RAG Systems: Chunking, embeddings, vector search, retrieval, re-ranking
  • Vector DBs: Chroma (local), Pinecone (managed), Weaviate, FAISS, Qdrant
  • Agents & Tools: Tool-calling, multi-step agents, OpenClaw sub-agents
  • Data Pipelines: Ingestion, cleaning, transformation, feature engineering
  • MLOps: Model versioning (MLflow), monitoring, drift detection, A/B testing
  • Evaluation: Benchmark construction, bias testing, performance metrics
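The evaluation capability above can start much smaller than a full benchmark: a plain list of (question, expected) pairs scored by substring match. A deliberately minimal sketch (real evals usually add LLM-as-judge or semantic-similarity scoring; `run_eval` is an illustrative name):

```python
def run_eval(system, eval_set):
    """system: fn(question) -> answer; eval_set: list of (question, expected).

    Returns the fraction of questions whose answer contains the expected string.
    """
    hits = sum(expected.lower() in system(q).lower() for q, expected in eval_set)
    return hits / len(eval_set)

eval_set = [("What is the capital of France?", "Paris")]
print(run_eval(lambda q: "The capital is Paris.", eval_set))  # → 1.0
```

Even a ten-item set like this, versioned alongside the code, catches regressions that eyeballing outputs misses.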

Decision Framework

Which LLM provider?

  • **Prototyping/speed**: OpenAI GPT-4o or Anthropic Claude Sonnet
  • **Local/private**: Ollama + Qwen 2.5 32B or Llama 3.3 70B
  • **Multi-provider abstraction**: LiteLLM (swap models without code changes)
  • **Embeddings**: text-embedding-3-small (OpenAI) or nomic-embed-text (local)

Which vector DB?

  • **Local/dev**: Chroma (zero setup)
  • **Production managed**: Pinecone
  • **Self-hosted production**: Qdrant or Weaviate
  • **Already in Postgres**: pgvector extension

RAG or fine-tuning?

  • **RAG first** — always try RAG before fine-tuning; it is enough in roughly 90% of cases
  • Fine-tune only when a style/tone change is needed, the domain vocabulary is highly specialized, or latency must be minimal

RAG Workflow

1. Ingest

```python
# Chunk documents (rule of thumb: 512 tokens, 50 overlap)
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=50)
chunks = splitter.split_documents(docs)
```

2. Embed + store

```python
import os

import chromadb
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction

client = chromadb.PersistentClient(path="./chroma_db")
ef = OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"],
    model_name="text-embedding-3-small",
)
collection = client.get_or_create_collection("docs", embedding_function=ef)
collection.add(
    documents=[c.page_content for c in chunks],
    ids=[str(i) for i in range(len(chunks))],
)
```

3. Retrieve + generate

```python
from openai import OpenAI

results = collection.query(query_texts=[user_query], n_results=5)
context = "\n\n".join(results["documents"][0])

llm = OpenAI()  # separate client; `client` above is the Chroma client
response = llm.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": f"Answer based on this context:\n{context}"},
        {"role": "user", "content": user_query},
    ],
)
```

See `references/rag-patterns.md` for advanced patterns: re-ranking, hybrid search, HyDE, eval.
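Of those patterns, re-ranking is the simplest to sketch: over-fetch candidates from the vector store, then re-score them against the query with a second signal and keep the best few. The cosine scorer below is a stand-in for a real cross-encoder re-ranker; `rerank` and the toy vectors are illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rerank(query_vec, candidates, top_k=2):
    """candidates: list of (doc_id, embedding). Returns top_k ids by similarity."""
    scored = sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

docs = [("a", [1.0, 0.0]), ("b", [0.9, 0.1]), ("c", [0.0, 1.0])]
print(rerank([1.0, 0.05], docs, top_k=2))  # → ['a', 'b']
```

In a real pipeline you would fetch, say, 20 candidates from the vector DB and re-rank down to 5 before building the prompt context.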

LLM Tool Calling (Agents)

```python
from openai import OpenAI

tools = [{
    "type": "function",
    "function": {
        "name": "search_docs",
        "description": "Search internal documentation",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

client = OpenAI()
response = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
```

See `references/agent-patterns.md` for multi-step agent loops, error handling, tool schemas.
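The multi-step loop itself can be sketched in a few lines: call the model, execute any tools it requested, append the results, and repeat until it answers directly. A hedged sketch — `call_model` stands in for the chat-completion call above, `search_docs` is a hypothetical tool, and the response objects are assumed to be OpenAI-style (`.tool_calls`, `.content`):

```python
import json

def search_docs(query: str) -> str:
    # Hypothetical tool; a real one would query the vector store.
    return f"results for: {query}"

TOOLS = {"search_docs": search_docs}

def run_agent(messages, call_model, max_steps=5):
    """call_model(messages) -> OpenAI-style message with .tool_calls / .content."""
    for _ in range(max_steps):
        msg = call_model(messages)
        if not msg.tool_calls:               # no tool requests -> final answer
            return msg.content
        messages.append(msg)                 # keep the assistant turn in history
        for tc in msg.tool_calls:            # execute each requested tool
            args = json.loads(tc.function.arguments)
            result = TOOLS[tc.function.name](**args)
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": result})
    raise RuntimeError("agent did not finish within max_steps")
```

The `max_steps` cap and the `RuntimeError` are the minimum viable error handling; the reference file covers retries, invalid-arguments recovery, and memory.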

Critical Rules

  • Evaluate early — build an eval set before you build the system
  • RAG before fine-tuning — always
  • Log everything — prompts, completions, latency, token usage from day one
  • Test for bias — especially for user-facing classification or scoring systems
  • Never hardcode API keys — use env vars or secret managers
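"Log everything" can start as small as a wrapper that records prompt, latency, and token usage for every call. A sketch under stated assumptions: `log_llm_call` and the JSONL path are illustrative, and the wrapped function is assumed to return an OpenAI-style response with a `.usage` attribute:

```python
import json
import time

def log_llm_call(fn, path="llm_calls.jsonl"):
    """Wrap an LLM call, appending one JSON line per call with latency + usage."""
    def wrapped(**kwargs):
        start = time.perf_counter()
        response = fn(**kwargs)
        record = {
            "model": kwargs.get("model"),
            "messages": kwargs.get("messages"),
            "latency_s": round(time.perf_counter() - start, 3),
            "usage": getattr(response, "usage", None) and vars(response.usage),
        }
        with open(path, "a") as f:
            f.write(json.dumps(record, default=str) + "\n")
        return response
    return wrapped

# Usage (assuming an OpenAI-style client):
# chat = log_llm_call(client.chat.completions.create)
# chat(model="gpt-4o", messages=[...])
```

From day one this gives you the raw material for cost tracking, latency budgets, and building eval sets from real traffic.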

References

  • `references/rag-patterns.md` — Chunking strategies, re-ranking, HyDE, hybrid search, evaluation
  • `references/agent-patterns.md` — Tool calling, multi-step loops, memory, error handling

Use Cases

  • Integrate LLM APIs from OpenAI, Anthropic, or local models into production applications
  • Build RAG systems with chunking, embedding, vector search, and re-ranking pipelines
  • Design and implement AI agent architectures with tool use, memory, and planning capabilities
  • Optimize model inference for latency and cost using quantization, caching, and batching
  • Set up evaluation frameworks to measure AI system accuracy, safety, and performance
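The caching item in the optimization use case can be as simple as memoizing on the exact (model, messages) pair — only safe for deterministic calls (temperature 0). A minimal sketch; `cached_complete` is an illustrative name:

```python
import hashlib
import json

_cache = {}

def cached_complete(call_model, model, messages):
    """Memoize completions keyed on (model, messages); safe only at temperature=0."""
    key = hashlib.sha256(
        json.dumps([model, messages], sort_keys=True).encode()
    ).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(model=model, messages=messages)
    return _cache[key]
```

Repeated identical prompts (health checks, common user queries, retried pipelines) then cost one API call instead of many.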

Pros & Cons

Pros

  • +Comprehensive coverage of the full AI engineering stack — LLMs, RAG, agents, fine-tuning, and eval
  • +Production-focused mindset with emphasis on performance optimization and systematic testing
  • +Multi-model support including cloud APIs and local models via Ollama and llama.cpp

Cons

  • -Broad scope means shallow coverage on any single topic — more of a generalist than a specialist
  • -No runnable code or scripts included — provides guidance but not executable implementations
  • -Requires significant existing ML/AI knowledge to fully leverage the recommendations

FAQ

What does AI Engineer do?
AI/ML engineering specialist for building intelligent features, RAG systems, LLM integrations, data pipelines, vector search, and AI-powered applications. Us...
What platforms support AI Engineer?
AI Engineer is available on Claude Code, OpenClaw.
What are the use cases for AI Engineer?
Integrate LLM APIs from OpenAI, Anthropic, or local models into production applications. Build RAG systems with chunking, embedding, vector search, and re-ranking pipelines. Design and implement AI agent architectures with tool use, memory, and planning capabilities.

