Model catalog
Browse every model. Filter to yours.
Frontier and open models with capabilities, pricing, latency, use-case fit, persona fit, and HIPAA / SOC2 / GDPR / FedRAMP — verified against provider docs. Unknown values stay blank.
GPT Realtime mini
Smaller, cheaper Realtime voice model for high-volume voice agents.
Claude Opus 4.5
Anthropic's most capable model. Best-in-class for agentic coding and tool use.
Claude Haiku 4.5
Fast, low-cost model with frontier-level coding for its tier.
Claude Sonnet 4.5
Balanced flagship: strong coding and agent performance at much lower cost than Opus.
GPT Realtime
Speech-to-speech model for low-latency voice agents. Native audio understanding and generation over the Realtime API.
DeepSeek-V3.1
Open MoE with hybrid thinking/non-thinking modes. Excellent price-to-performance.
GPT-5
OpenAI's flagship multimodal reasoning model with adaptive thinking depth.
GPT-5 mini
Smaller, cheaper GPT-5 with most of the reasoning quality.
GPT-5 nano
Ultra-fast, ultra-cheap GPT-5 variant for high-volume tasks.
Gemini 2.5 Flash-Lite
Cheapest Gemini for high-volume classification, extraction, and translation.
Grok 4
Reasoning-first model with native tool use and real-time X search integration.
Gemini 2.5 Pro
Top-tier multimodal model with very long context and native audio/video understanding.
Gemini 2.5 Flash
Fast, cheap multimodal workhorse with controllable thinking budget.
Gemini 2.5 Flash with Native Audio (Live API)
Real-time speech-in / speech-out conversational model powering Gemini Live. Native audio with affective dialog and tool use.
o3
Deliberate reasoning model that thinks before answering. Strong at math and science.
Llama 4 Maverick
Open-weights mixture-of-experts multimodal model. 17B active / 400B total params.
Llama 4 Scout
Smaller open MoE (17B active / 109B total) with up to 10M-token context.
Command A
Enterprise-focused model optimized for RAG, tool use, and private deployment.
Mistral Large 2
Flagship dense 123B model. Strong multilingual and code performance.