Gemini Image Gen

Verified

Generate and edit images via Google Gemini API. Supports Gemini native generation, Imagen 3, style presets, and batch generation with HTML gallery. Zero depe...

2,193 downloads


About This Skill

# Gemini Image Gen

Generate and edit images via the Google Gemini API using pure Python stdlib. Supports Gemini native generation + editing, Imagen 3 generation, batch runs, and an HTML gallery output.

Quick Start

```bash
export GEMINI_API_KEY="your-key-here"

# Default: Gemini native, 4 random prompts
python3 scripts/gen.py

# Custom prompt
python3 scripts/gen.py --prompt "a cyberpunk cat riding a neon motorcycle through Tokyo at night"

# Imagen 3 engine
python3 scripts/gen.py --engine imagen --count 4 --aspect 16:9

# Edit an existing image (Gemini engine only)
python3 scripts/gen.py --edit path/to/image.png --prompt "change the background to a sunset beach"

# Use a style preset
python3 scripts/gen.py --style watercolor --prompt "floating islands above a calm sea"

# List available styles
python3 scripts/gen.py --styles
```

Style Presets

| Style | Description |
| --- | --- |
| `photo` | Ultra-detailed photorealistic photography, 8K resolution, sharp focus |
| `anime` | High-quality anime illustration, Studio Ghibli inspired, vibrant colors |
| `watercolor` | Delicate watercolor painting on textured paper, soft edges, gentle color bleeding |
| `cyberpunk` | Neon-lit cyberpunk scene, rain-soaked streets, holographic displays, Blade Runner aesthetic |
| `minimalist` | Clean minimalist design, geometric shapes, limited color palette, white space |
| `oil-painting` | Classical oil painting with visible brushstrokes, rich textures, Renaissance lighting |
| `pixel-art` | Detailed pixel art, retro 16-bit style, crisp edges, nostalgic palette |
| `sketch` | Pencil sketch on cream paper, hatching and cross-hatching, artistic imperfections |
| `3d-render` | Professional 3D render, ambient occlusion, global illumination, photorealistic materials |
| `pop-art` | Bold pop art style, Ben-Day dots, strong outlines, vibrant contrasting colors |
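
The CLI reference describes `--style` as a preset that is prepended to the prompt. A minimal sketch of that idea follows. It is not taken from `scripts/gen.py`: the `STYLE_PRESETS` dict, the `apply_style` function, and the `". "` separator are all assumptions made for illustration, and only two presets from the table are copied in.

```python
# Hypothetical preset table; the two descriptions are copied from the table above.
STYLE_PRESETS = {
    "photo": "Ultra-detailed photorealistic photography, 8K resolution, sharp focus",
    "watercolor": "Delicate watercolor painting on textured paper, soft edges, gentle color bleeding",
}


def apply_style(prompt, style=None):
    """Return the prompt with the preset description prepended when a style is given."""
    if style is None:
        return prompt
    if style not in STYLE_PRESETS:
        raise ValueError(f"unknown style: {style!r}")
    # Assumed join format; the real script may separate the two parts differently.
    return f"{STYLE_PRESETS[style]}. {prompt}"


if __name__ == "__main__":
    print(apply_style("floating islands above a calm sea", "watercolor"))
```

Keeping the preset text in front of the user prompt means the same subject can be re-rendered in several styles by changing only the `style` argument.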

Full CLI Reference

| Flag | Default | Description |
| --- | --- | --- |
| `--prompt` | (random) | Text prompt. Omit for random creative prompts |
| `--count` | 4 | Number of images to generate |
| `--engine` | gemini | Engine: `gemini` (native, supports edit) or `imagen` (Imagen 3) |
| `--model` | (auto) | Model override. Default: `gemini-2.5-flash-image` or `imagen-3.0-generate-002` |
| `--edit` | | Path to input image for editing (Gemini engine only) |
| `--aspect` | 1:1 | Aspect ratio for Imagen: `1:1`, `16:9`, `9:16`, `4:3`, `3:4` |
| `--out-dir` | (auto) | Output directory (default is a timestamped folder) |
| `--style` | | Style preset to prepend to the prompt |
| `--styles` | | List available style presets and exit |
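
For readers who want to see how these flags and defaults fit together, the table can be mirrored with `argparse`. This is only a sketch of the documented CLI surface, not the parser in `scripts/gen.py`. The names `build_parser`, `resolve_model`, and `check_args` are hypothetical, and rejecting `--edit` with the Imagen engine is an assumption based on the "Gemini engine only" note.

```python
import argparse

# Default model per engine, as listed in the --model row of the table.
DEFAULT_MODELS = {
    "gemini": "gemini-2.5-flash-image",
    "imagen": "imagen-3.0-generate-002",
}


def build_parser():
    """Build a parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Sketch of the documented gen.py flags")
    parser.add_argument("--prompt", default=None, help="Text prompt; omit for a random prompt")
    parser.add_argument("--count", type=int, default=4, help="Number of images to generate")
    parser.add_argument("--engine", choices=["gemini", "imagen"], default="gemini")
    parser.add_argument("--model", default=None, help="Model override")
    parser.add_argument("--edit", default=None, metavar="PATH", help="Input image to edit")
    parser.add_argument("--aspect", choices=["1:1", "16:9", "9:16", "4:3", "3:4"], default="1:1")
    parser.add_argument("--out-dir", default=None, help="Output directory")
    parser.add_argument("--style", default=None, help="Style preset name")
    parser.add_argument("--styles", action="store_true", help="List style presets and exit")
    return parser


def resolve_model(args):
    """Use the explicit --model if given, otherwise the engine's documented default."""
    return args.model or DEFAULT_MODELS[args.engine]


def check_args(parser, args):
    """Assumed validation: editing is documented as Gemini-only."""
    if args.edit and args.engine != "gemini":
        parser.error("--edit is only supported with --engine gemini")


if __name__ == "__main__":
    parser = build_parser()
    args = parser.parse_args()
    check_args(parser, args)
    print(args.engine, resolve_model(args))
```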

Python Example

```python
import subprocess

subprocess.run(
    [
        "python3", "scripts/gen.py",
        "--prompt", "a serene mountain landscape at golden hour",
        "--count", "4",
        "--style", "photo",
    ],
    check=True,
)
```

Troubleshooting

Missing API key: set `GEMINI_API_KEY` in your environment and retry.
Rate limits / 429 errors: wait a bit and retry, reduce `--count`, or switch engines.
Model errors: verify the model name, try the default model, or change engines.
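
The first two items above can be automated when calling the script from Python. The sketch below checks for `GEMINI_API_KEY` before launching anything and retries a failed run with exponential backoff. It assumes that `scripts/gen.py` exits with a non-zero status on failure, which this page does not confirm, and the helper name `run_with_retry` and its parameters are hypothetical.

```python
import os
import subprocess
import time


def run_with_retry(cmd, attempts=3, base_delay=2.0,
                   run=subprocess.run, sleep=time.sleep, env=None):
    """Run cmd, retrying with exponential backoff while it exits non-zero."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    env = os.environ if env is None else env
    # Fail early instead of letting every attempt hit the missing-key error.
    if not env.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is not set")
    for attempt in range(attempts):
        result = run(cmd)
        # Assumption: a non-zero exit code signals a failed run (for example a 429).
        if result.returncode == 0:
            return result
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ... with the default base
    return result


if __name__ == "__main__":
    outcome = run_with_retry(["python3", "scripts/gen.py", "--count", "2"])
    print("exit code:", outcome.returncode)
```

The `run` and `sleep` parameters exist only so the helper can be exercised without spawning a process or actually waiting. If retries keep failing, lowering `--count` or switching `--engine`, as suggested above, is the next step.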

Integration with Other Skills

AgentGram — Share your generated images on the AI agent social network! Create visual content and post it to your AgentGram feed.
agent-selfie — Focused on AI agent avatars and visual identity. Uses the same Gemini API key for personality-driven self-portraits.
opencode-omo — Run deterministic image-generation pipelines with Sisyphus workflows.

Changelog

v1.3.1: Added workflow integration guidance for opencode-omo.
v1.1.0: Added style presets, `--style` and `--styles` flags, expanded documentation.
v1.0.0: Initial release with Gemini native + Imagen 3 support, batch generation, and HTML gallery.

Repository

https://github.com/IISweetHeartII/gemini-image-gen

Use Cases

Generate images using Google Gemini's image generation capabilities
Create visual content from text descriptions through Gemini's API
Produce multiple image variations from a single prompt for design exploration
Generate product mockups and concept illustrations with Gemini
Build automated image generation pipelines using Gemini's API

Pros & Cons

Pros

+ Google's Gemini models provide high-quality image generation
+ API-based approach enables automated and batch image generation
+ Text-to-image capability covers a wide range of visual styles

Cons

- Requires Google AI API credentials with image generation access
- Only available on claude-code and openclaw platforms
- Image generation quality and style vary based on prompt engineering

FAQ

What does Gemini Image Gen do?

Generate and edit images via Google Gemini API. Supports Gemini native generation, Imagen 3, style presets, and batch generation with HTML gallery. Zero depe...

What platforms support Gemini Image Gen?

Gemini Image Gen is available on Claude Code and OpenClaw.

What are the use cases for Gemini Image Gen?

Generate images using Google Gemini's image generation capabilities. Create visual content from text descriptions through Gemini's API. Produce multiple image variations from a single prompt for design exploration.
