NEAR AI Cloud

entity
near-aiinferencecryptoembeddingsapi

NEAR AI Cloud is a private, verifiable AI inference platform. All inference runs inside TEEs (Trusted Execution Environments) with attestation proofs from Intel TDX and NVIDIA TEE. The API is fully OpenAI-compatible.

API

Two endpoint modes:

  • Gateway: https://cloud-api.near.ai/v1 — routes to any model
  • Direct: https://{model-slug}.completions.near.ai/v1 — connects to a specific model’s TEE

Authentication: Authorization: Bearer <API_KEY>. Keys are generated at cloud.near.ai/api-keys.

Supported endpoints: /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/images/generations, /v1/audio/transcriptions, /v1/rerank, /v1/models.

Available models (as of 2026-04-07)

LLMs

Model$/1M in$/1M outContext
anthropic/claude-opus-4-65.0025.00200K
anthropic/claude-sonnet-4-53.0015.50200K
google/gemini-3-pro1.2515.001M
openai/gpt-5.21.8015.50400K
openai/gpt-oss-120b0.150.55131K
Qwen/Qwen3.5-122B-A10B0.403.20131K
Qwen/Qwen3-30B-A3B-Instruct-25070.150.55262K
zai-org/GLM-5-FP80.853.30203K

Vision / multimodal

Model$/1M in$/1M outContext
Qwen/Qwen3-VL-30B-A3B-Instruct0.150.55256K

Image generation

Model$/1M in$/1M outContext
black-forest-labs/FLUX.2-klein-4B1.001.00128K

Audio

Model$/1M in$/1M outContext
openai/whisper-large-v30.010.01448

Embeddings and reranking

Model$/1M tokensDimensionsContext
Qwen/Qwen3-Embedding-0.6B0.01102440K
Qwen/Qwen3-Reranker-0.6B0.0140K

Embedding quality evaluation

Tested Qwen3-Embedding-0.6B on knowledge base content (2026-04-07). Results: top-1 retrieval is correct for obvious queries, but score gaps between relevant and irrelevant results are dangerously small (0.005–0.05). The 0.6B model is too small for production semantic search.

The reranker (Qwen3-Reranker-0.6B) shows the same problem — correct top-1 ordering but scores compressed into a narrow band (0.75–0.86), making it unreliable for distinguishing relevant from irrelevant content at scale.

See Crypto Embedding Providers for alternatives.

Notes

  • The old API at api.near.ai was retired on 2025-10-31.
  • Crypto payment is native — all billing is on-chain via NEAR protocol.
  • The Authorization: Bearer header requires the key to be passed via file (-H @file) in some curl environments due to shell escaping issues.