RSS Aggregator

Caution

Build RSS/Atom feed aggregators with polling schedules, content deduplication, full-text extraction, AI summarization, and multi-format output.

By community 1,600 v1.0.1 Updated 2026-03-08

Install

Claude Code

Copy the SKILL.md file to .claude/skills/rss-aggregator.md

About This Skill

RSS Aggregator builds feed ingestion pipelines that collect, deduplicate, enrich, and distribute content from multiple RSS/Atom sources.

Feed Parsing

Handles RSS 2.0, Atom 1.0, and JSON Feed formats using feedparser (Python) or rss-parser (Node.js). Normalizes disparate feed schemas into a unified article model with consistent fields: title, url, author, published_at, summary, content.
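
The unified article model could look roughly like the sketch below. It uses only the standard library and takes an already-parsed entry as a plain mapping. The `Article` class, the `normalize` and `parse_date` helpers, and the list of source keys tried for each field are illustrative assumptions, not the skill's actual schema code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


@dataclass
class Article:
    """Hypothetical unified model with the fields named above."""
    title: str
    url: str
    author: Optional[str]
    published_at: Optional[datetime]
    summary: str
    content: str


def parse_date(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 (Atom, JSON Feed) or RFC 822 (RSS) timestamp into UTC."""
    if not value:
        return None
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        try:
            parsed = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return None  # unparseable dates are dropped rather than guessed
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc)


def normalize(entry: Mapping[str, object]) -> Article:
    """Map a parsed feed entry onto the unified model, whatever its source format."""

    def first(*keys: str) -> str:
        # Return the first non-empty string found under any of the given keys.
        for key in keys:
            value = entry.get(key)
            if isinstance(value, str) and value.strip():
                return value.strip()
        return ""

    summary = first("summary", "description")
    return Article(
        title=first("title"),
        url=first("link", "url", "id"),
        author=first("author", "dc:creator") or None,
        published_at=parse_date(
            first("published", "pubDate", "updated", "date_published")
        ),
        summary=summary,
        content=first("content", "content_html", "content_text") or summary,
    )
```

Falling back from `content` to `summary` keeps the model usable for summary-only feeds until full-text extraction fills in the body.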

Polling & Scheduling

Per-feed poll intervals are configured with cron expressions. The poller honors `Cache-Control` lifetimes and replays stored `ETag`/`Last-Modified` values as conditional GET requests, so unchanged feeds are answered with `304 Not Modified` instead of a full body. For cooperative sources, conditional GET can reduce bandwidth by 80-90%.
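
A minimal sketch of the conditional GET step, using only `urllib` from the standard library. `FeedState`, `conditional_headers`, `fetch_if_changed`, and the user-agent string are hypothetical names chosen for illustration; cron scheduling and `Cache-Control` handling are left out.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class FeedState:
    """Validators remembered from the previous successful fetch of one feed."""
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def conditional_headers(state: FeedState) -> Dict[str, str]:
    """Build request headers that let the server answer 304 Not Modified."""
    headers = {"User-Agent": "rss-aggregator-sketch/0.1"}  # assumed UA string
    if state.etag:
        headers["If-None-Match"] = state.etag
    if state.last_modified:
        headers["If-Modified-Since"] = state.last_modified
    return headers


def fetch_if_changed(
    url: str, state: FeedState, timeout: float = 10.0
) -> Tuple[Optional[bytes], FeedState]:
    """Return (body, new_state), or (None, state) when the feed is unchanged."""
    request = urllib.request.Request(url, headers=conditional_headers(state))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read()
            new_state = FeedState(
                etag=response.headers.get("ETag"),
                last_modified=response.headers.get("Last-Modified"),
            )
            return body, new_state
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return None, state
        raise
```

The caller would persist the returned `FeedState` per feed and pass it back on the next poll.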

Deduplication

URL-based and content-fingerprint deduplication. SimHash similarity detection catches near-duplicate articles from wire services (AP, Reuters) that appear across multiple outlets.
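
Both deduplication layers can be sketched with the standard library. The block below canonicalizes URLs for exact matching and computes a 64-bit SimHash over word 3-grams for near-duplicate detection. The shingle size, the choice of BLAKE2b as the token hash, the `utm_` filter, and all function names are assumptions made for this example.

```python
import hashlib
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BITS = 64  # fingerprint width; matches the 8-byte digest used below


def canonical_url(url: str) -> str:
    """Normalize a URL so trivially different links compare equal."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")  # drop common tracking parameters
    ]
    path = parts.path.rstrip("/") or "/"
    # The fragment is discarded: it never changes which document is served.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def simhash(text: str) -> int:
    """Compute a 64-bit SimHash over lowercase word 3-grams."""
    tokens = re.findall(r"\w+", text.lower())
    shingles = [" ".join(tokens[i:i + 3]) for i in range(max(len(tokens) - 2, 1))]
    weights = [0] * BITS
    for shingle in shingles:
        digest = hashlib.blake2b(shingle.encode("utf-8"), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        for bit in range(BITS):
            # Each shingle votes +1 or -1 on every bit position.
            weights[bit] += 1 if (value >> bit) & 1 else -1
    return sum(1 << bit for bit in range(BITS) if weights[bit] > 0)


def hamming_distance(a: int, b: int) -> int:
    """Count differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: str, b: str, max_distance: int = 3) -> bool:
    """Treat two texts as near-duplicates when few fingerprint bits differ."""
    return hamming_distance(simhash(a), simhash(b)) <= max_distance
```

The `max_distance` threshold is a tuning knob rather than a fixed constant: a lower value misses more reworded wire copy, a higher value merges unrelated articles, so it is worth calibrating against a labeled sample of real feed items.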

Full-Text Extraction

For feeds that only provide summaries, Readability.js or newspaper3k extracts the full article body from the source page. The page fetcher respects robots.txt and implements polite crawl delays.
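
The robots.txt check and crawl delay might be gated as in this sketch, which relies on `urllib.robotparser` from the standard library. The `PoliteGate` class, the user-agent token, and the 2-second default delay are assumptions for illustration; the extraction call itself (Readability.js or newspaper3k) is not shown.

```python
import time
from typing import Dict, Iterable, Optional
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "rss-aggregator-sketch"  # assumed crawler token
DEFAULT_DELAY = 2.0  # seconds between hits when robots.txt sets no Crawl-delay


class PoliteGate:
    """Tracks robots.txt rules and the last request time for each host."""

    def __init__(self) -> None:
        self._robots: Dict[str, RobotFileParser] = {}
        self._last_hit: Dict[str, float] = {}

    def load_robots(self, host: str, lines: Optional[Iterable[str]] = None) -> None:
        """Load rules from the given lines, or download them when lines is None."""
        parser = RobotFileParser()
        if lines is not None:
            parser.parse(lines)
        else:
            parser.set_url(f"https://{host}/robots.txt")
            parser.read()  # network call; may raise URLError
        self._robots[host] = parser

    def allowed(self, url: str) -> bool:
        """Check whether robots.txt permits fetching this URL."""
        host = urlsplit(url).netloc
        if host not in self._robots:
            self.load_robots(host)
        return self._robots[host].can_fetch(USER_AGENT, url)

    def delay_for(self, host: str) -> float:
        """Use the site's Crawl-delay when present, else the default."""
        parser = self._robots.get(host)
        delay = parser.crawl_delay(USER_AGENT) if parser else None
        return float(delay) if delay is not None else DEFAULT_DELAY

    def wait_turn(self, host: str) -> None:
        """Sleep until the per-host delay has elapsed, then record the hit."""
        elapsed = time.monotonic() - self._last_hit.get(host, float("-inf"))
        remaining = self.delay_for(host) - elapsed
        if remaining > 0:
            time.sleep(remaining)
        self._last_hit[host] = time.monotonic()
```

A fetch loop would call `allowed(url)` first, skip the article when it returns `False`, and call `wait_turn(host)` immediately before each page request.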

AI Enrichment

Optional LLM pipeline: 3-sentence summary, keyword extraction, category classification, sentiment scoring, and named entity recognition for people/organizations/topics.
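
One way to keep the enrichment step provider-agnostic is to inject the model call as a plain function and validate its JSON reply, as sketched below. Everything here is hypothetical: the prompt wording, the `Enrichment` fields, the sentiment range of -1 to 1, and the `complete` callable that stands in for whichever LLM client is used.

```python
import json
from dataclasses import dataclass
from typing import Callable, List

INSTRUCTIONS = (
    "Summarize the article in exactly 3 sentences, then reply with one JSON "
    'object with keys "summary" (string), "keywords" (list of strings), '
    '"category" (string), "sentiment" (number from -1 to 1), and "entities" '
    "(list of strings naming people, organizations, or topics). "
    "Reply with JSON only."
)


@dataclass
class Enrichment:
    summary: str
    keywords: List[str]
    category: str
    sentiment: float
    entities: List[str]


def build_prompt(title: str, body: str, max_chars: int = 4000) -> str:
    """Truncate the body so per-article token cost stays bounded."""
    return f"{INSTRUCTIONS}\n\nTitle: {title}\n\nArticle:\n{body[:max_chars]}"


def parse_enrichment(raw: str) -> Enrichment:
    """Validate the model reply; raises ValueError if it is not valid JSON."""
    data = json.loads(raw)
    # Clamp the score in case the model strays outside the requested range.
    sentiment = max(-1.0, min(1.0, float(data.get("sentiment", 0.0))))
    return Enrichment(
        summary=str(data.get("summary", "")).strip(),
        keywords=[str(k) for k in data.get("keywords") or []],
        category=str(data.get("category", "")).strip(),
        sentiment=sentiment,
        entities=[str(e) for e in data.get("entities") or []],
    )


def enrich(title: str, body: str, complete: Callable[[str], str]) -> Enrichment:
    """Run one article through the injected model call."""
    return parse_enrichment(complete(build_prompt(title, body)))
```

Because `complete` is just a function from prompt to text, the pipeline can be exercised with a stub before any API key or billing is involved.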

Output

SQLite or PostgreSQL storage, REST API for querying articles, email digest renderer, and JSON Feed re-export for downstream consumption.
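
A minimal sketch of the SQLite path and the JSON Feed re-export, using only the standard library. The table layout, column names, and function names are assumptions; the REST API and email digest renderer are out of scope here.

```python
import json
import sqlite3
from typing import Optional

SCHEMA = """
CREATE TABLE IF NOT EXISTS articles (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL UNIQUE,
    title TEXT NOT NULL,
    summary TEXT NOT NULL DEFAULT '',
    published_at TEXT
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the article store."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_article(
    conn: sqlite3.Connection,
    url: str,
    title: str,
    summary: str = "",
    published_at: Optional[str] = None,
) -> bool:
    """Insert an article; return False when the URL is already stored."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO articles (url, title, summary, published_at) "
        "VALUES (?, ?, ?, ?)",
        (url, title, summary, published_at),
    )
    conn.commit()
    return cursor.rowcount == 1


def export_json_feed(conn: sqlite3.Connection, feed_title: str, limit: int = 50) -> str:
    """Serialize the newest articles as a JSON Feed 1.1 document."""
    rows = conn.execute(
        "SELECT url, title, summary, published_at FROM articles "
        "ORDER BY published_at DESC, id DESC LIMIT ?",
        (limit,),
    ).fetchall()
    items = []
    for url, title, summary, published_at in rows:
        item = {"id": url, "url": url, "title": title, "summary": summary}
        if published_at:
            item["date_published"] = published_at  # expected as an ISO 8601 string
        items.append(item)
    feed = {
        "version": "https://jsonfeed.org/version/1.1",
        "title": feed_title,
        "items": items,
    }
    return json.dumps(feed, indent=2)
```

The `UNIQUE` constraint on `url` gives the storage layer a last line of defense against duplicates that slip past the deduplication stage.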

Use Cases

  • Aggregating news feeds from 50+ sources into a unified reader with deduplication
  • Building a tech news digest with AI-generated summaries delivered via email
  • Monitoring competitor blogs and generating weekly intelligence reports
  • Creating a personal knowledge management feed that exports to Notion or Obsidian

Pros & Cons

Pros

  • Conditional GET reduces unnecessary bandwidth for cooperative feed sources
  • SimHash deduplication catches near-duplicate wire service articles
  • Full-text extraction works even for summary-only feeds
  • AI enrichment pipeline adds structured metadata without manual tagging

Cons

  • Full-text extraction may violate terms of service for some publishers
  • LLM enrichment costs accumulate quickly with high feed volume
