Model Catalog

Open Models, Ready to Run

OpenAI-compatible inference for the latest open models. Swap your base URL and go.

Browse 9 open models from 6 providers
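Because the API is OpenAI-compatible, any model id from the cards below drops into a standard `/chat/completions` request. The sketch below builds such a request with only the standard library; the base URL shown is a placeholder (substitute the endpoint from your Hoonify dashboard), and the request is constructed but not sent.

```python
import json
import urllib.request

# Placeholder endpoint -- substitute the base URL from your Hoonify dashboard.
BASE_URL = "https://api.hoonify.example/v1"
API_KEY = "YOUR_API_KEY"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible /chat/completions request (not yet sent)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Any catalog model id drops in unchanged:
req = build_chat_request("deepseek-v3-2", "Summarize MoE routing in two sentences.")
# urllib.request.urlopen(req)  # sends it; the response follows the OpenAI schema
```

Switching models is a one-string change to the `model` field; the rest of the request stays identical across every entry in the catalog.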

DeepSeek R1 0528

Just launched
Text · Reasoning · Code
Provider
DeepSeek
Context
128K
Parameters
671B MoE

Updated R1 with deeper reasoning and algorithmic optimizations. Performance approaches o3 and Gemini 2.5 Pro on math, coding, and logic benchmarks.


Input

$0.55 / 1M tokens

Output

$2.19 / 1M tokens

model="deepseek-r1-0528"

DeepSeek V3.2

Just launched
Text · Chat · Reasoning · Code
Provider
DeepSeek
Context
128K
Parameters
685B MoE

Balances computational efficiency with strong reasoning and agentic performance via Sparse Attention and a scalable reinforcement learning framework.


Input

$0.27 / 1M tokens

Output

$1.10 / 1M tokens

model="deepseek-v3-2"

GLM 4.7

Text · Chat · Code · Reasoning
Provider
GLM
Context
128K
Parameters
~9B

Code-focused model covering core coding, vibe coding, tool use, and complex reasoning. Designed as a developer-facing coding partner.


Input

$0.10 / 1M tokens

Output

$0.25 / 1M tokens

model="glm-4-7"

GLM 5

Just launched
Text · Chat · Code · Reasoning
Provider
GLM
Context
128K
Parameters
744B MoE

744B MoE (40B active) targeting complex systems engineering and long-horizon agentic tasks. Integrates DeepSeek Sparse Attention for reduced deployment cost.


Input

$0.40 / 1M tokens

Output

$1.50 / 1M tokens

model="glm-5"

GPT-OSS-120b

Just launched
Text · Chat · Reasoning · Code
Provider
OpenAI
Context
128K
Parameters
120B

OpenAI's open-weight 120B model designed for powerful reasoning, agentic tasks, and versatile developer use cases.


Input

$0.80 / 1M tokens

Output

$2.40 / 1M tokens

model="gpt-oss-120b"

Kimi K2.5

Multimodal · Chat · Reasoning
Provider
Kimi
Context
128K
Parameters
~72B

Native multimodal agentic model trained on ~15T mixed visual and text tokens. Integrates vision, language, and advanced agentic capabilities with instant and thinking modes.


Input

$0.20 / 1M tokens

Output

$0.60 / 1M tokens

model="kimi-k2-5"
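For multimodal models like Kimi K2.5, a reasonable assumption is that image inputs follow the OpenAI-style "content parts" message format, where a user message carries a list of text and image parts instead of a plain string. A minimal sketch of that payload shape (the image URL is illustrative):

```python
import json

# Assumes the endpoint accepts OpenAI-style multimodal content parts;
# check the Hoonify docs for the supported image formats and size limits.
payload = {
    "model": "kimi-k2-5",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/chart.png"},
                },
            ],
        }
    ],
}
body = json.dumps(payload)  # send as the POST body of /chat/completions
```

Text-only models in the catalog reject image parts, so keep the plain-string `content` form for those.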

Llama 4 Maverick 17B 128E

Multimodal · Chat · Reasoning
Provider
Meta
Context
128K
Parameters
17B active × 128 experts

Natively multimodal MoE model with 128 experts offering industry-leading text and image understanding. Part of Meta's next-generation Llama 4 series.


Input

$0.20 / 1M tokens

Output

$0.65 / 1M tokens

model="llama-4-maverick-17b-128e"

Llama 3.3 70B

Text · Chat · Code
Provider
Meta
Context
128K
Parameters
70B

Instruction-tuned 70B model optimized for multilingual dialogue, outperforming many open and closed chat models on industry benchmarks.


Input

$0.12 / 1M tokens

Output

$0.30 / 1M tokens

model="llama-3-3-70b"

MiniMax M2.5

Text · Chat · Code · Reasoning
Provider
MiniMax
Context
128K
Parameters

RL-trained across hundreds of thousands of real-world environments. Achieves 80.2% on SWE-bench Verified; state-of-the-art in coding, agentic tool use, and office tasks.


Input

$0.30 / 1M tokens

Output

$1.00 / 1M tokens

model="minimax-m2-5"

Early Access

Be first in line.

Sign up for early API access. Don't see a model you need? Request it and we'll prioritize based on demand.

Frequently Asked Questions

Everything you need to know about the Hoonify AI model catalog and API.