Pandas Assistant

Caution

Optimizes Python pandas workflows by writing efficient DataFrame operations, fixing common performance pitfalls, and converting between pandas, polars, and SQL.

By Data Skills Lab 3,950 v2.0.0 Updated 2026-03-10

Install

Claude Code

Copy the SKILL.md file to your project's .claude/skills/ directory

About This Skill

Pandas Assistant optimizes your Python data analysis workflows. It rewrites slow pandas code into efficient vectorized operations, fixes common anti-patterns, and helps you choose between pandas, polars, and SQL for different scale requirements.

How It Works

  1. Code review — Identifies pandas anti-patterns: iterrows(), apply() with lambdas, string concatenation in loops, repeated DataFrame copies
  2. Vectorization — Rewrites iterative code to use built-in pandas/numpy vectorized operations
  3. Memory optimization — Downcasts numeric types, converts strings to categoricals, uses sparse types where appropriate
  4. API modernization — Updates deprecated pandas APIs and adopts Copy-on-Write mode for pandas 2.x
  5. Alternative suggestion — Recommends polars or DuckDB when pandas hits performance ceilings
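As a rough illustration of the vectorization step, the sketch below rewrites an iterrows() loop as whole-column operations. The `orders` data, the column names, and the 10% discount rule are all made up for the example; they are not taken from the skill itself.

```python
import numpy as np
import pandas as pd

# Hypothetical order data, invented for illustration.
orders = pd.DataFrame({
    "price": [10.0, 25.0, 4.0, 120.0],
    "quantity": [3, 1, 10, 2],
})


def total_with_iterrows(df: pd.DataFrame) -> pd.Series:
    """Anti-pattern: iterrows() builds one Series object per row."""
    totals = []
    for _, row in df.iterrows():
        total = row["price"] * row["quantity"]
        # Assumed business rule: 10% discount on totals above 100.
        if total > 100:
            total *= 0.9
        totals.append(total)
    return pd.Series(totals, index=df.index)


def total_vectorized(df: pd.DataFrame) -> pd.Series:
    """Same logic expressed as operations on entire columns."""
    total = df["price"] * df["quantity"]
    # np.where applies the discount element-wise without a Python loop.
    discounted = np.where(total > 100, total * 0.9, total)
    return pd.Series(discounted, index=df.index)


slow = total_with_iterrows(orders)
fast = total_vectorized(orders)
print(fast.tolist())
```

Both functions should return the same values; the vectorized version simply moves the loop out of Python and into NumPy.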

Best For

  • Speeding up slow Jupyter notebook analyses
  • Refactoring pandas code from iterative to vectorized style
  • Reducing memory footprint for large datasets
  • Migrating pandas codebases to polars for performance

Performance Guidelines

Typical speedups: 10-100x when replacing iterrows() with vectorized operations and 5-10x when replacing apply() with vectorized operations. Dtype optimization typically reduces memory use by 50-80%. For DataFrames over 10M rows, the skill recommends polars or DuckDB.
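A minimal sketch of dtype optimization, assuming a small synthetic frame (the column names and sizes are hypothetical). The savings you see depend on your data. Note that downcasting floats to float32 trades precision for memory.

```python
import numpy as np
import pandas as pd

n = 10_000
# Synthetic data, invented for illustration.
df = pd.DataFrame({
    "user_id": np.arange(n, dtype="int64"),
    "score": np.linspace(0.0, 1.0, n),
    "country": np.tile(["US", "DE", "JP", "BR"], n // 4),
})


def shrink(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with smaller dtypes where the values allow it."""
    out = frame.copy()
    # Non-negative integers: pick the smallest unsigned type that fits.
    out["user_id"] = pd.to_numeric(out["user_id"], downcast="unsigned")
    # Floats: downcast to float32 when values stay close enough.
    out["score"] = pd.to_numeric(out["score"], downcast="float")
    # Low-cardinality strings: store integer codes plus a small lookup table.
    out["country"] = out["country"].astype("category")
    return out


before = df.memory_usage(deep=True).sum()
small = shrink(df)
after = small.memory_usage(deep=True).sum()
print(f"{before:,} bytes -> {after:,} bytes")
```

Categoricals help most when a column has few distinct values relative to its length; for nearly unique strings they can cost more memory than they save.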

Use Cases

  • Rewrite iterative pandas code to vectorized operations
  • Optimize memory usage with proper dtypes and categorical columns
  • Convert complex pandas pipelines to polars for speed
  • Build multi-step data analysis workflows with method chaining
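The method-chaining use case might look like the sketch below: a filter, a derived column, an aggregation, and a sort written as one pipeline. The `sales` data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical sales data, invented for illustration.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "east", "south"],
    "units": [5, 3, 8, 0, 7],
    "unit_price": [2.0, 4.0, 2.0, 9.0, 4.0],
})

summary = (
    sales
    .query("units > 0")  # drop rows with no units sold
    .assign(revenue=lambda d: d["units"] * d["unit_price"])  # derived column
    .groupby("region", as_index=False)
    .agg(total_revenue=("revenue", "sum"), orders=("revenue", "count"))
    .sort_values("total_revenue", ascending=False)
    .reset_index(drop=True)
)
print(summary)
```

Each step returns a new DataFrame, so the original `sales` frame is left unchanged and no intermediate variables are needed.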

Pros & Cons

Pros

  • Identifies and fixes the most common pandas performance killers
  • Provides concrete speedup estimates for each optimization
  • Covers migration path from pandas to polars
  • Memory optimization techniques for large-scale analysis

Cons

  • Cannot profile actual execution times without running code
  • Some domain-specific operations resist vectorization
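Because the estimates are not measured, you may want to time a rewrite yourself. The sketch below uses the standard-library timeit module on synthetic data; the frame size and function names are assumptions, and the numbers will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic data, invented for illustration.
df = pd.DataFrame({"a": rng.random(2_000), "b": rng.random(2_000)})


def row_wise(frame: pd.DataFrame) -> pd.Series:
    """Row-by-row apply with a lambda (the slow pattern)."""
    return frame.apply(lambda row: row["a"] + row["b"], axis=1)


def vectorized(frame: pd.DataFrame) -> pd.Series:
    """Column arithmetic (the rewritten pattern)."""
    return frame["a"] + frame["b"]


# Time a few repetitions of each version on the same input.
t_slow = timeit.timeit(lambda: row_wise(df), number=3)
t_fast = timeit.timeit(lambda: vectorized(df), number=3)
print(f"apply: {t_slow:.4f}s, vectorized: {t_fast:.4f}s")
```

Checking that both versions return the same result before comparing timings helps confirm the rewrite did not change behavior.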
