Most major providers offer a free tier, but the rate limits, concurrency, expiry, and data-usage policies differ widely and are often buried deep in the docs. This page lays the key terms of each provider's free tier side by side, so you can pick what fits your needs and sidestep the common limits.

| Platform | What's free | Rate limit (nominal) | Peak performance | Card required | Used for training | Main limitations |
|---|---|---|---|---|---|---|
| Groq (LPU ultra-fast inference) | Open models: Llama / Qwen / GPT-OSS | 30 req/min; 14,400 req/day | Stable speed; runs on dedicated hardware | No | No | 5 concurrent requests; limits counted per model, and some models have lower daily caps |
| Google Gemini (AI Studio) | Gemini Flash / Flash-Lite (Pro is trial-tier) | 15 req/min; 1,500 req/day | Throttled at peak; shared compute, no SLA | No | Yes (free tier) | Free-tier requests may be used for model training; enabling billing cancels the free tier for that project and every call is then charged |
| OpenRouter (aggregates 400+ models) | Models with the :free suffix (25+ across 4 providers) | 20 req/min; 50 req/day (rises to 1,000 after a $10 top-up) | Throttled / queued at peak; free models get lower priority | No | Depends on the model | Failed requests still count toward your quota, so it burns down fast while debugging; accounts without a top-up are capped at 50/day |
| Together AI (open-model platform) | $1 free credit on signup + select free endpoints (Llama / Qwen / DeepSeek / Mixtral) | Varies by model; credit-based, then pay-as-you-go | Production-grade stability; dedicated serving | Optional | No | The $1 signup credit is one-time; once spent you pay standard rates, though select endpoints remain free |
| Cloudflare Workers AI (edge inference) | Generous daily allocation (10,000 Neurons/day); Llama / Mistral / open models | 10,000 Neurons/day; daily Neuron-based quota | Low latency; global edge network | No | No | Neurons reset daily; heavier models consume them faster, so a big model can exhaust the allocation quickly |
| DeepSeek (first-party) | New-user signup credit (DeepSeek V4 Flash / Pro) | One-time credit; V4 Flash $0.14/M afterward | Stable; official provider | No | No | Signup credit is one-time, then billed at standard rates; V4 Flash is among the cheapest paid rates globally, so the credit stretches a long way |
| Cohere (first-party) | Free trial API key (Command R / R+) | 1,000 calls/month; rate-limited trial key | Good for prototyping; official endpoint | No | Yes (trial) | Trial data may be used to improve models; trial keys are rate-limited and not intended for production |
| Mistral AI (La Plateforme) | Free experimentation tier (Mistral open models) | Limited rate; evaluation-oriented | EU-hosted; official endpoint | May be required | Check data policy | The free tier is intended for evaluation; confirm the current data-usage policy before sending sensitive content |
| Hugging Face (Inference API) | Free serverless inference for many open models | Shared pool; rate-limited, best-effort | Variable latency at peak; shared free pool, no SLA | No | No (public models) | Free tier is best-effort with no SLA; cold starts and queueing are common under load |
| GitHub Models (via GitHub account) | Free access to GPT / Llama / Phi for prototyping | Low per-model limits; dev/testing only | Strict rate limits; throttled under load | No | No | Intended for development and testing only, not production; rate limits are strict and enforced per model |

Rate limits are the nominal values from each provider's official docs; real-world experience depends on time of day, region, and account status. The "Peak performance" column is a qualitative read of public documentation and user reports, not live probe data. Policies change often (Gemini, for example, cut its free quota 50–80% in late 2025), so the provider's current docs are always the source of truth.
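
To make the paired per-minute and per-day caps in the table concrete, here is a minimal client-side sketch in Python. It is not any provider's SDK: `QuotaTracker`, `try_acquire`, and `minutes_to_exhaust` are hypothetical names, and the rolling 60-second and 24-hour windows are an assumption, since providers may instead reset at fixed times. Attempts are counted before the call is made, which matches the table's note that some providers charge failed requests against the quota.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QuotaTracker:
    """Hypothetical client-side tracker for a per-minute and a per-day cap."""
    per_minute: int
    per_day: int
    _minute_window: deque = field(default_factory=deque, init=False)
    _day_window: deque = field(default_factory=deque, init=False)

    def _expire(self, now: float) -> None:
        # Drop timestamps that have aged out of each rolling window.
        # Assumption: rolling windows; a provider may reset at fixed times.
        while self._minute_window and now - self._minute_window[0] >= 60:
            self._minute_window.popleft()
        while self._day_window and now - self._day_window[0] >= 86_400:
            self._day_window.popleft()

    def try_acquire(self, now: float) -> bool:
        """Record an attempt at time `now` (seconds) if both caps allow it."""
        self._expire(now)
        if len(self._minute_window) >= self.per_minute:
            return False
        if len(self._day_window) >= self.per_day:
            return False
        # Count the attempt up front, so a request that later fails
        # still consumes quota in this local tally.
        self._minute_window.append(now)
        self._day_window.append(now)
        return True


def minutes_to_exhaust(per_minute: int, per_day: int) -> float:
    """Minutes of sustained full-rate traffic before the daily cap is hit."""
    return per_day / per_minute


if __name__ == "__main__":
    # Nominal figures taken from the table above.
    print(minutes_to_exhaust(30, 14_400))  # Groq: 480.0 (8 hours)
    print(minutes_to_exhaust(15, 1_500))   # Gemini: 100.0
    print(minutes_to_exhaust(20, 50))      # OpenRouter, no top-up: 2.5
```

The arithmetic shows why the daily cap, not the per-minute rate, is usually the binding limit: at the full nominal rate, an OpenRouter account without a top-up would use its whole day's allowance in two and a half minutes.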