Serverless Inference

Run Any Model. Zero Infrastructure.

OpenAI-compatible serverless inference for the latest open models. No GPU management, no provisioning — just an API call.

Zero
Infrastructure Management
Day-one
New Model Access
12+
Open-source Models
Drop-in Compatible

One line to switch.
Everything else stays.

Swap your base_url and run the same SDK code you already have. Any OpenAI-compatible library, any language, any framework — it just works.

  • No GPU provisioning or maintenance
  • Auto-scales to any request volume
  • Cold starts measured in seconds
  • Powered by TurbOS — proven in HPC environments
inference.py · 1 line to swap
from openai import OpenAI

client = OpenAI(
+   base_url="https://api.hoonify.ai/v1",
    api_key="your-api-key",
)

response = client.chat.completions.create(
    model="qwen-3",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)

for chunk in response:
    # Guard against None deltas (e.g. the final chunk of a stream)
    print(chunk.choices[0].delta.content or "", end="")
Architecture

How Hoonify AI works

Your request travels through a purpose-built inference stack, from API gateway to GPU, in milliseconds.

Your Application
Any language · Any framework
Hoonify AI API
OpenAI-compatible REST endpoints
Model Runtime
12+ open-source LLMs
TurbOS Orchestration
Dynamic resource routing
GPU Infrastructure
High-performance compute clusters
01

Your application sends a request

Any language, any framework, any OpenAI SDK — the interface is identical. Swap your base URL and you're ready.

02

Hoonify AI routes to the right model

Our API layer authenticates, validates, and routes your request to the active model runtime with minimal overhead.

03

TurbOS schedules and dispatches

TurbOS — Hoonify's compute orchestration platform — intelligently allocates GPU resources, handles weight caching, and manages multi-tenant isolation.

04

Response streams back instantly

Completions stream back token-by-token. Your code receives a standard OpenAI response object. No surprises.
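The same holds for non-streaming calls: you get back a standard OpenAI response object and read the text the usual way. A minimal sketch, assuming the `qwen-3` model name and endpoint from the example above; `get_completion` is an illustrative helper, not part of any SDK:

```python
def get_completion(client, prompt: str, model: str = "qwen-3") -> str:
    """Send one non-streaming chat request and return the generated text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    # Standard OpenAI response shape: choices[0].message.content holds the text.
    return response.choices[0].message.content

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai

    client = OpenAI(base_url="https://api.hoonify.ai/v1", api_key="your-api-key")
    print(get_completion(client, "Hello"))
```

Because the helper takes the client as an argument, the same code runs unchanged against any OpenAI-compatible endpoint.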

Use Cases

Built for every AI workload

From rapid prototyping to production SaaS features, internal copilots, batch pipelines, and defense applications — serverless inference removes the infrastructure bottleneck at every stage.

Explore all use cases →
Prototyping & R&D
Internal Copilots
SaaS AI Features
Batch Pipelines
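A batch pipeline against a serverless endpoint reduces to fanning prompts out over the same API. A minimal sketch with a thread pool, again assuming the `qwen-3` model and endpoint from the example above; `run_batch` is an illustrative helper:

```python
from concurrent.futures import ThreadPoolExecutor

def run_batch(client, prompts, model: str = "qwen-3", max_workers: int = 8):
    """Send a list of prompts concurrently and return completions in input order."""
    def one(prompt):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with prompts
        return list(pool.map(one, prompts))

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai

    client = OpenAI(base_url="https://api.hoonify.ai/v1", api_key="your-api-key")
    for out in run_batch(client, ["Summarize document A", "Summarize document B"]):
        print(out)
```

Since capacity is provisioned server-side, scaling the batch is mostly a matter of raising `max_workers` to match your rate limits.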
Our Infrastructure

Hoonify AI inference runs on TurbOS — the compute orchestration platform developed by Hoonify, originally built to deploy and manage advanced compute environments for modeling, simulation, and HPC workloads, and now optimized for AI inference at scale.

The same orchestration technology powering GPU clusters for engineering and scientific computing now serves your AI inference requests — with the same reliability and operational discipline.

Learn more about TurbOS
Request Routing: TurbOS
GPU Scheduling: Dynamic
Weight Caching: In-memory
Multi-tenancy: Isolated
Data Retention: Zero
API Compatibility: OpenAI

We don't train on your prompts. We don't sell your data. Zero data retention on every inference request, by default.

Frequently Asked Questions

Everything you need to know about Hoonify AI serverless inference.

Early Access

Launching soon.

Join the waitlist and be among the first to swap one line of code and run open models in production.