Gemini Video Analyzer
Native video analysis using the Google Gemini API. Upload and analyze video files: describe scenes, extract text/UI, answer questions about content, transcribe...
Analyze videos natively using Google Gemini's multimodal API. No frame extraction is needed: Gemini samples the video at 1 FPS and interprets motion, audio, and visuals together.
Quick Start
```bash
# Analyze a video with default prompt (full description)
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4

# Ask a specific question
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4 "What text is visible on screen?"

# Manage uploaded files
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py list
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py cleanup
```
Supported Formats
MP4, AVI, MOV, MKV, WebM, FLV, MPEG, MPG, WMV, 3GP — up to 2GB per file.
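A pre-flight check can catch an unsupported file before any upload starts. The sketch below is illustrative only: `check_video`, `SUPPORTED_EXTENSIONS`, and `MAX_BYTES` are hypothetical names, not part of the skill's scripts, and reading "2GB" as 2 GiB is an assumption.

```python
from pathlib import Path
from typing import List, Optional

# Extensions and size cap come from the format list above.
SUPPORTED_EXTENSIONS = {
    ".mp4", ".avi", ".mov", ".mkv", ".webm",
    ".flv", ".mpeg", ".mpg", ".wmv", ".3gp",
}
MAX_BYTES = 2 * 1024**3  # assumption: "2GB" interpreted as 2 GiB


def check_video(path: str, size_bytes: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    p = Path(path)
    problems = []
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    if size_bytes is None:
        # Read the size from disk; raises FileNotFoundError if the file is missing.
        size_bytes = p.stat().st_size
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems
```

Passing `size_bytes` explicitly makes the check usable without touching the filesystem, for example when the size is already known from a directory listing.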
How It Works
- Video uploads to Google's Files API (temporary, auto-deletes after 48h)
- Gemini processes at 1 frame/sec — understands motion, transitions, audio context
- Model generates response based on your prompt
- Because motion and audio are preserved, this captures temporal content better than analyzing extracted frames
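The upload, wait, and generate flow above can be sketched roughly as follows. This is not the skill's actual `analyze.py`: it assumes the `client` argument behaves like `google.genai.Client` from the `google-genai` SDK (`files.upload`, `files.get`, `models.generate_content`). The function name `analyze_video`, the default prompt text, and the polling and timeout values are hypothetical.

```python
import time


def analyze_video(client, path, prompt="Describe this video in detail.",
                  model="gemini-2.5-flash", poll_seconds=5.0, timeout_seconds=600.0):
    """Upload a video, wait until it has been processed, then query the model.

    `client` is assumed to expose the google-genai client interface.
    """
    # Step 1: upload to the Files API (temporary storage).
    uploaded = client.files.upload(file=path)

    # Step 2: video is processed asynchronously, so poll until it leaves PROCESSING.
    deadline = time.monotonic() + timeout_seconds
    while uploaded.state.name == "PROCESSING":
        if time.monotonic() > deadline:
            raise TimeoutError(f"{uploaded.name} was still processing after {timeout_seconds}s")
        time.sleep(poll_seconds)
        uploaded = client.files.get(name=uploaded.name)

    if uploaded.state.name != "ACTIVE":
        raise RuntimeError(f"video processing ended in state {uploaded.state.name}")

    # Step 3: send the processed file together with the prompt.
    response = client.models.generate_content(model=model, contents=[uploaded, prompt])
    return response.text
```

With the SDK installed, a call might look like `analyze_video(genai.Client(api_key=key), "/path/to/video.mp4")` after `from google import genai`. Taking the client as a parameter keeps the function easy to exercise with a stub.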
Use Cases
| Task | Example Prompt |
|------|----------------|
| General description | *(default; no prompt needed)* |
| UI/text extraction | `"What text and UI elements are visible?"` |
| Tutorial summary | `"Summarize the steps shown in this tutorial"` |
| Bug report from video | `"Describe what went wrong in this screen recording"` |
| Meeting notes | `"Summarize the key points discussed"` |
| Content comparison | Upload 2 videos, ask for differences |
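For repeated use, the single-video prompts in the table above could be kept as presets and turned into an `analyze.py` command line. This is a small sketch: the preset keys and `build_command` are hypothetical names, and the script path is passed in by the caller because `{baseDir}` depends on the install location.

```python
import shlex

# Prompts copied from the table; None means "use the script's default prompt".
PROMPTS = {
    "description": None,
    "ui_text": "What text and UI elements are visible?",
    "tutorial": "Summarize the steps shown in this tutorial",
    "bug_report": "Describe what went wrong in this screen recording",
    "meeting": "Summarize the key points discussed",
}


def build_command(script, video, task="description"):
    """Build the argv list for one analyze.py run."""
    argv = ["python3", script, video]
    prompt = PROMPTS[task]  # raises KeyError for an unknown task
    if prompt is not None:
        argv.append(prompt)
    return argv


if __name__ == "__main__":
    # Print a shell-quoted command for copy and paste.
    print(shlex.join(build_command("scripts/analyze.py", "demo.mp4", "meeting")))
```

The list form can be passed directly to `subprocess.run`, which avoids shell-quoting problems with prompts that contain spaces or quotes.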
Configuration
Set `GOOGLE_AI_API_KEY` in your environment or `.env` file. Get a free key at aistudio.google.com.
Default model: `gemini-2.5-flash` (fast, cheap, excellent vision). Override with `--model gemini-2.5-pro` for complex analysis.
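The configuration rules above (key from the environment or `.env`, default model with a `--model` override) might be implemented along these lines. This is a sketch under assumptions, not the skill's real code: `load_api_key` and `build_parser` are hypothetical names, and the `.env` parsing handles only simple `NAME=value` lines.

```python
import argparse
import os
from pathlib import Path

DEFAULT_MODEL = "gemini-2.5-flash"


def load_api_key(environ=None, dotenv_path=".env"):
    """Return GOOGLE_AI_API_KEY from the environment, falling back to a .env file."""
    environ = os.environ if environ is None else environ
    key = environ.get("GOOGLE_AI_API_KEY")
    if key:
        return key
    env_file = Path(dotenv_path)
    if env_file.is_file():
        for line in env_file.read_text().splitlines():
            name, sep, value = line.strip().partition("=")
            if sep and name.strip() == "GOOGLE_AI_API_KEY":
                # Drop surrounding whitespace and optional quotes.
                return value.strip().strip("\"'")
    raise RuntimeError("GOOGLE_AI_API_KEY is not set in the environment or .env")


def build_parser():
    """CLI shape assumed from Quick Start: video path, optional prompt, --model."""
    parser = argparse.ArgumentParser(description="Analyze a video with Gemini")
    parser.add_argument("video", help="path to the video file")
    parser.add_argument("prompt", nargs="?", default=None, help="optional question")
    parser.add_argument("--model", default=DEFAULT_MODEL, help="Gemini model name")
    return parser
```

The environment is checked first so that a key exported in the shell takes precedence over one stored in `.env`; that ordering is a design choice of this sketch.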
API Reference
See references/gemini-files-api.md for file upload limits, processing details, and advanced options.
Credits
Built by M. Abidi.
Use Cases
- Perform detailed video analysis with content understanding at the sampled-frame level (1 frame per second)
- Extract text, objects, and scenes from video content for data extraction
- Analyze video quality metrics including resolution, bitrate, and encoding
- Compare video content across multiple files for similarity detection
- Generate accessibility descriptions from video content for visual impairments
Pros & Cons
Pros
- Frame-level analysis provides detailed content understanding
- Multiple analysis modes: content, quality, and accessibility
- Accessibility description generation serves an important inclusion need
Cons
- Frame-by-frame analysis is computationally expensive for long videos
- Only available on claude-code and openclaw platforms
- Requires Gemini API access with video processing capabilities