# MetriLLM — Find the Best LLM for Your Hardware
Test any local model and get a clear verdict: is it worth running on your machine?
## Prerequisites
- Node.js 20+ (check with `node -v`)
- Ollama or LM Studio installed and running
  - Ollama: install from ollama.com, then run `ollama serve`
  - LM Studio: install from lmstudio.ai, then load a model and start the server
- MetriLLM CLI, installed globally:
```bash
npm install -g metrillm
```
## Usage

### List available models
```bash
ollama list
```
### Run a full benchmark
```bash
metrillm bench --model $ARGUMENTS --json
```
This measures:
- Performance: tokens/second, time to first token, memory usage
- Quality: reasoning, math, coding, instruction following, structured output, multilingual
- Fitness verdict: EXCELLENT / GOOD / MARGINAL / NOT RECOMMENDED
### Performance-only benchmark (faster)
```bash
metrillm bench --model $ARGUMENTS --perf-only --json
```
Skips quality evaluation; measures speed and memory only.
### View previous results
```bash
ls ~/.metrillm/results/
```
Read any JSON file to see full benchmark details.
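If `jq` isn't installed, a rough way to pull a single number out of a saved result file is sketched below. The flat JSON layout and the field names (e.g. `tokensPerSecond`) are assumptions based on the metrics listed under Interpreting Results, not a verified description of MetriLLM's output format; check an actual file before relying on this.

```shell
# Sketch: extract one numeric field from a result file without jq.
# Assumes flat (unnested) JSON and the field names used in this README.
extract() {  # usage: extract <field> <file>
  grep -o "\"$1\":[^,}]*" "$2" | head -n1 | cut -d: -f2 | tr -d ' "'
}

# Example against a mock result file (real files live in ~/.metrillm/results/):
sample=$(mktemp)
printf '%s\n' '{"model":"llama3:8b","tokensPerSecond":42.5,"ttft":310}' > "$sample"
echo "tok/s: $(extract tokensPerSecond "$sample")"   # tok/s: 42.5
rm -f "$sample"
```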
### Share to the public leaderboard
```bash
metrillm bench --model $ARGUMENTS --share
```
Uploads your result to the MetriLLM community leaderboard, an open, community-driven ranking of local LLM performance across real hardware. Compare your results with others and help the community find the best models for every setup. Shared data includes the model name, scores, and hardware specs (CPU, RAM, GPU). No personal data is sent.
## Interpreting Results
| Verdict | Score | Meaning |
|---|---|---|
| EXCELLENT | >= 80 | Fast and accurate; a great fit |
| GOOD | >= 60 | Solid; suitable for most tasks |
| MARGINAL | >= 40 | Usable, but with tradeoffs |
| NOT RECOMMENDED | < 40 | Too slow or inaccurate |
Key metrics to highlight:

- `tokensPerSecond` > 30: good for interactive use
- `ttft` < 500 ms: responsive
- `memoryUsedGB` vs. available RAM: will the model fit?
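The score-to-verdict thresholds above can be expressed as a small helper, e.g. when scripting around `--json` output. The function itself is illustrative and not part of the MetriLLM CLI; only the thresholds come from the table.

```shell
# Map a MetriLLM overall score (0-100) to its verdict tier.
# Thresholds mirror the table above; the helper is illustrative only.
verdict() {
  local score=$1
  if   (( score >= 80 )); then echo "EXCELLENT"
  elif (( score >= 60 )); then echo "GOOD"
  elif (( score >= 40 )); then echo "MARGINAL"
  else                         echo "NOT RECOMMENDED"
  fi
}

verdict 72   # GOOD
```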
## Tips
- Use `--perf-only` for quick tests
- Close GPU-intensive apps before benchmarking
- Benchmark duration varies depending on model speed and response length
## Open Source
MetriLLM is free and open source (Apache 2.0). Contributions, issues, and feedback are welcome: github.com/MetriLLM/metrillm
## Use Cases
- Test and benchmark local LLMs for speed, quality, and RAM usage
- Determine if a specific model is worth running on your hardware
- Compare local model performance across different quantization levels
- Evaluate inference quality and latency for local AI model selection
- Build automated model testing pipelines for local LLM deployment decisions
## Pros & Cons

### Pros

- Compatible with multiple platforms, including claude-code and openclaw
- Well documented, with detailed usage instructions and examples
- Open source with permissive licensing

### Cons

- No built-in analytics or usage-metrics dashboard
- Configuration may require familiarity with AI and machine-learning concepts