NEAR AI Cloud
NEAR AI Cloud is a private, verifiable AI inference platform. All inference runs inside TEEs (Trusted Execution Environments) with attestation proofs from Intel TDX and NVIDIA TEE. The API is fully OpenAI-compatible.
API
Two endpoint modes:
- Gateway:
https://cloud-api.near.ai/v1— routes to any model - Direct:
https://{model-slug}.completions.near.ai/v1— connects to a specific model’s TEE
Authentication: Authorization: Bearer <API_KEY>. Keys are generated at cloud.near.ai/api-keys.
Supported endpoints: /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/images/generations, /v1/audio/transcriptions, /v1/rerank, /v1/models.
Available models (as of 2026-04-07)
LLMs
| Model | $/1M in | $/1M out | Context |
|---|---|---|---|
anthropic/claude-opus-4-6 | 5.00 | 25.00 | 200K |
anthropic/claude-sonnet-4-5 | 3.00 | 15.50 | 200K |
google/gemini-3-pro | 1.25 | 15.00 | 1M |
openai/gpt-5.2 | 1.80 | 15.50 | 400K |
openai/gpt-oss-120b | 0.15 | 0.55 | 131K |
Qwen/Qwen3.5-122B-A10B | 0.40 | 3.20 | 131K |
Qwen/Qwen3-30B-A3B-Instruct-2507 | 0.15 | 0.55 | 262K |
zai-org/GLM-5-FP8 | 0.85 | 3.30 | 203K |
Vision / multimodal
| Model | $/1M in | $/1M out | Context |
|---|---|---|---|
Qwen/Qwen3-VL-30B-A3B-Instruct | 0.15 | 0.55 | 256K |
Image generation
| Model | $/1M in | $/1M out | Context |
|---|---|---|---|
black-forest-labs/FLUX.2-klein-4B | 1.00 | 1.00 | 128K |
Audio
| Model | $/1M in | $/1M out | Context |
|---|---|---|---|
openai/whisper-large-v3 | 0.01 | 0.01 | 448 |
Embeddings and reranking
| Model | $/1M tokens | Dimensions | Context |
|---|---|---|---|
Qwen/Qwen3-Embedding-0.6B | 0.01 | 1024 | 40K |
Qwen/Qwen3-Reranker-0.6B | 0.01 | — | 40K |
Embedding quality evaluation
Tested Qwen3-Embedding-0.6B on knowledge base content (2026-04-07). Results: top-1 retrieval is correct for obvious queries, but score gaps between relevant and irrelevant results are dangerously small (0.005–0.05). The 0.6B model is too small for production semantic search.
The reranker (Qwen3-Reranker-0.6B) shows the same problem — correct top-1 ordering but scores compressed into a narrow band (0.75–0.86), making it unreliable for distinguishing relevant from irrelevant content at scale.
See Crypto Embedding Providers for alternatives.
Notes
- The old API at
api.near.aiwas retired on 2025-10-31. - Crypto payment is native — all billing is on-chain via NEAR protocol.
- The
Authorization: Bearerheader requires the key to be passed via file (-H @file) in some curl environments due to shell escaping issues.