Self-hosting
Technical Infrastructure
Running AI models on your own hardware or private cloud instead of using a provider's API — giving you full control over data, costs, and customization.
Self-hosting means downloading an AI model and running it on infrastructure you control. This could be a beefy desktop PC with a good GPU, a dedicated server, or a private cloud instance. Your data never leaves your environment, and there are no per-request API fees.
Self-hosting is viable thanks to open-source models (Llama, Mistral, Stable Diffusion) and tools that make deployment easier (Ollama, vLLM, text-generation-webui, ComfyUI). Hardware requirements vary widely: a small, quantized Llama model runs on a MacBook, while models competitive with frontier APIs need 24-80GB of GPU VRAM.
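To make the hardware numbers concrete, here is a rough back-of-envelope sketch of how VRAM needs scale with model size and quantization. The formula (parameters times bytes per weight, plus an assumed ~20% overhead for activations and KV cache) is a common rule of thumb, not an exact measurement; the overhead factor is an assumption for illustration.

```python
def estimate_vram_gb(num_params_billions: float,
                     bits_per_weight: int,
                     overhead: float = 0.2) -> float:
    """Rough VRAM (in GB) needed to serve a model at a given quantization.

    Rule of thumb: weight bytes = params * (bits / 8), plus an assumed
    ~20% overhead for activations and the KV cache.
    """
    weight_bytes = num_params_billions * 1e9 * (bits_per_weight / 8)
    return weight_bytes * (1 + overhead) / 1e9

# A 7B model quantized to 4 bits fits on a laptop-class GPU:
print(round(estimate_vram_gb(7, 4), 1))    # ~4.2 GB
# A 70B model at full 16-bit precision needs multiple data-center GPUs:
print(round(estimate_vram_gb(70, 16), 1))  # ~168.0 GB
```

This is why quantization matters so much for self-hosters: dropping from 16-bit to 4-bit weights cuts memory needs by roughly 4x, often with modest quality loss.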
The tradeoffs: self-hosting gives you privacy, customization, and potentially lower costs at scale, but it demands technical expertise and upfront hardware investment, and you become responsible for updates, uptime, and security. Most individuals use APIs; enterprises with data-sensitivity requirements often self-host.
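The "lower costs at scale" claim can be checked with a simple break-even calculation. All the dollar figures below are illustrative assumptions, not real quotes from any provider:

```python
def breakeven_months(gpu_cost: float,
                     power_per_month: float,
                     api_cost_per_m_tokens: float,
                     m_tokens_per_month: float) -> float:
    """Months until self-hosted hardware pays for itself versus API fees.

    Compares upfront GPU cost against monthly API spend, net of the
    electricity cost of running the hardware yourself.
    """
    api_monthly = api_cost_per_m_tokens * m_tokens_per_month
    monthly_savings = api_monthly - power_per_month
    if monthly_savings <= 0:
        return float("inf")  # at this volume, the API stays cheaper
    return gpu_cost / monthly_savings

# Hypothetical numbers: $2,000 GPU, $30/month power,
# $5 per million tokens, 100M tokens/month:
print(round(breakeven_months(2000, 30, 5.0, 100), 1))  # ~4.3 months
# At low volume (1M tokens/month), the API never breaks even:
print(breakeven_months(2000, 30, 5.0, 1))              # inf
```

The shape of the result matches the prose: high, steady volume favors owning hardware; light or bursty usage favors pay-per-request APIs.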
Real-World Example
Running Llama locally with Ollama means your conversations never leave your machine — complete privacy with zero API costs, though you need decent hardware.
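As a sketch of what "locally" means in practice: Ollama exposes an HTTP API on `http://localhost:11434` by default. The snippet below only constructs the request payload; actually sending it assumes `ollama serve` is running and the named model has been pulled, and the model name `llama3` is an illustrative choice.

```python
import json

# Build a request body for Ollama's local /api/generate endpoint.
# Nothing here touches the network — every byte stays on your machine.
payload = {
    "model": "llama3",           # assumes this model was pulled locally
    "prompt": "Why self-host an AI model?",
    "stream": False,             # return one complete response
}
body = json.dumps(payload)
print(body)
# To send: POST this body to http://localhost:11434/api/generate
```

Because the endpoint is localhost, the prompt and response never cross the network boundary of your machine — that is the privacy guarantee in concrete terms.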
FAQ
What is Self-hosting?
Running AI models on your own hardware or private cloud instead of using a provider's API — giving you full control over data, costs, and customization.
How is Self-hosting used in practice?
Running Llama locally with Ollama means your conversations never leave your machine — complete privacy with zero API costs, though you need decent hardware.
What concepts are related to Self-hosting?
Key related concepts include Open Source (AI), GPU (Graphics Processing Unit), VRAM (Video RAM), and Ollama. Understanding these together gives a more complete picture of how self-hosting fits into the AI landscape.