The room is the spec.
Doctus is a confidential fine-tuning + sealed hosting service running on Targon (Bittensor SN4) confidential GPUs — Intel TDX on the CPU side, NVIDIA Confidential Computing on the GPU side, Protected PCIe between them. Your training data is decrypted only inside the enclave. The trained adapter is encrypted to a key only you hold. We hold ciphertext and nothing else.
The API is OpenAI-compatible at the inference surface, with three additional Doctus endpoints for the upstream half of the lifecycle (fine-tune creation, poll, adapter delivery).
Base URL: https://api.doctus.cloud
Health: GET /healthz → { "ok": true, "ts": … }
One username, one fingerprint.
Doctus has no email, no password, no OAuth. Auth is a single 32-character alphanumeric fingerprint the server generates when you create the account. We Argon2id-hash it and store the hash; we never see the raw value again. Save it. Lose it = lose the account.
Create
POST /v1/account/create
Content-Type: application/json
{ "username": "testertester",
"coldkey": "5GNg4Czv…aBCy" // optional, Bittensor SS58
}
→ 201 Created
{ "account_id": "…",
"username": "testertester",
"fingerprint": "vlEhtGe244fg9PlBhnQgSgcvCR1j1TbF",
"fp": "1TbF",
"warning": "Copy this fingerprint to a safe place or download it now…" }
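As a rough illustration of the call above, this Python sketch builds the account-creation request using only the standard library. The helper name build_create_account_request is hypothetical; the base URL, path, and field names are the ones shown on this page. It only constructs the request, so nothing is sent until you pass it to urllib.request.urlopen.

```python
import json
import urllib.request

BASE_URL = "https://api.doctus.cloud"  # base URL from this page


def build_create_account_request(username, coldkey=None):
    """Build (but do not send) POST /v1/account/create. Hypothetical helper."""
    body = {"username": username}
    if coldkey is not None:
        body["coldkey"] = coldkey  # optional Bittensor SS58 address
    return urllib.request.Request(
        BASE_URL + "/v1/account/create",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_create_account_request("testertester")
# To send: urllib.request.urlopen(req), then store the returned "fingerprint"
# somewhere safe, since the server keeps only its hash.
```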
Authenticate
Authorization: Bearer vlEhtGe244fg9PlBhnQgSgcvCR1j1TbF
# or, equivalently:
Authorization: Bearer doc-vlEhtGe244fg9PlBhnQgSgcvCR1j1TbF
Every API request that needs an identity carries this header. The doc- prefix is optional — we strip it.
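A client can accept either form and build the same header. The sketch below is one way to do that in Python; auth_header is a hypothetical helper name, and the client-side validation simply mirrors the "32-character alphanumeric" description above rather than any documented server rule.

```python
import re

# 32-character alphanumeric fingerprint, as described on this page.
_FINGERPRINT_RE = re.compile(r"[A-Za-z0-9]{32}")


def auth_header(fingerprint):
    """Return the Authorization header for a raw or doc-prefixed fingerprint."""
    raw = fingerprint.strip()
    if raw.startswith("doc-"):
        raw = raw[len("doc-"):]  # the server strips this prefix anyway
    if not _FINGERPRINT_RE.fullmatch(raw):
        raise ValueError("expected a 32-character alphanumeric fingerprint")
    return {"Authorization": "Bearer " + raw}
```

Both auth_header("vlEhtGe244fg9PlBhnQgSgcvCR1j1TbF") and the doc- prefixed variant yield the same header.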
Create. Encrypt. Submit. Poll.
A fine-tune job goes through four states: queued → running → attesting → done (or failed). You open the job, get back an attested public key from the enclave, encrypt your dataset to it, push the ciphertext, then poll until the attestation is signed and the adapter is sealed.
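The lifecycle above can be written down as a small transition table, which is handy for sanity-checking status values a client observes. This is a sketch, not server behaviour: the assumption that a job may move to failed from any non-terminal state is ours, and the names TRANSITIONS, is_terminal, and can_transition are hypothetical.

```python
# Forward transitions read off the lifecycle described above.
# Assumption: "failed" is reachable from every non-terminal state.
TRANSITIONS = {
    "queued": {"running", "failed"},
    "running": {"attesting", "failed"},
    "attesting": {"done", "failed"},
    "done": set(),
    "failed": set(),
}


def is_terminal(status):
    """True when no further transitions are expected (done or failed)."""
    return not TRANSITIONS[status]


def can_transition(old, new):
    """True when moving from old to new follows the table above."""
    return new in TRANSITIONS[old]
```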
Open a job
POST /v1/finetune/create
Authorization: Bearer …
{ "base_model": "Qwen/Qwen3-32B-TEE",
"lora_rank": 16,
"dataset_size_bytes": 312840,
"estimated_tokens": 280000 }
→ 201 Created
{ "job_id": "…",
"tvm_id": "…",
"attested_pubkey": "X25519:…",
"attestation_uri": "https://…",
"tier": "standard",
"price_tao": "0.018",
"price_usd": 5,
"hint": "encrypt your dataset to attested_pubkey…" }
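For illustration, the request and the attested_pubkey field of the response might be handled as below. Both helper names are hypothetical, the Content-Type header is an assumption (the example above shows only Authorization), and the key material after the X25519: prefix is left as an opaque string because its encoding is not specified here.

```python
import json
import urllib.request

BASE_URL = "https://api.doctus.cloud"  # base URL from this page


def build_finetune_request(fingerprint, base_model, lora_rank,
                           dataset_size_bytes, estimated_tokens):
    """Build (but do not send) POST /v1/finetune/create. Hypothetical helper."""
    body = {
        "base_model": base_model,
        "lora_rank": lora_rank,
        "dataset_size_bytes": dataset_size_bytes,
        "estimated_tokens": estimated_tokens,
    }
    return urllib.request.Request(
        BASE_URL + "/v1/finetune/create",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + fingerprint,
            "Content-Type": "application/json",  # assumed, not shown above
        },
        method="POST",
    )


def split_attested_pubkey(value):
    """Split '<algorithm>:<key material>' without decoding the key material."""
    algorithm, sep, material = value.partition(":")
    if not sep or not material:
        raise ValueError("expected '<algorithm>:<key material>'")
    return algorithm, material
```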
Submit the dataset
Encrypt your dataset client-side to attested_pubkey (X25519 → ChaCha20-Poly1305 is the canonical envelope; see § 06). Then push the ciphertext:
POST /v1/finetune/{job_id}/data
Content-Type: application/octet-stream
<ciphertext bytes>
Poll
GET /v1/finetune/{job_id}
→ 200 OK
{ "id": "…",
"status": "done",
"base_model": "Qwen/Qwen3-32B-TEE",
"attested_pubkey":"X25519:…",
"attestation_uri":"https://…/attest/abc123" }
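A polling loop over this endpoint could look like the following sketch. The function name poll_job, the five-second interval, and the polling budget are all assumptions; no recommended poll rate is stated here. The status fetcher is passed in as a callable so the loop itself stays free of network code.

```python
import time

TERMINAL = {"done", "failed"}  # terminal states from the job lifecycle


def poll_job(fetch_status, interval_s=5.0, max_polls=120, sleep=time.sleep):
    """Call fetch_status() until the job reaches a terminal state.

    fetch_status is any callable returning the decoded JSON body of
    GET /v1/finetune/{job_id}. Raises TimeoutError when max_polls is used up.
    """
    for _ in range(max_polls):
        job = fetch_status()
        if job["status"] in TERMINAL:
            return job
        sleep(interval_s)
    raise TimeoutError("job did not finish within the polling budget")
```

A caller should still check whether the returned status is done or failed before fetching the adapter.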
Encrypted to your key. Yours alone.
When the job completes, the trained adapter is encrypted inside the enclave to your wallet public key (the coldkey you supplied at account creation, or your fingerprint-derived key if none was supplied). The ciphertext adapter is staged at a one-time-use URL.
OpenAI-compatible. Sealed end-to-end.
Once an adapter is deployed (Phase 1 wires the deploy step), you hit a standard OpenAI-compatible /v1/chat/completions endpoint with your fingerprint as the bearer. The inference runs inside a CNTR enclave; we do not log prompts or completions.
POST /v1/chat/completions
Authorization: Bearer <your fingerprint>
{ "model": "yourname/qwen3-32b-foo",
"messages": [{ "role": "user", "content": "…" }] }
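A client-side check can catch the two 400 cases listed in the error table before a request leaves your machine. The sketch below is illustrative only; build_chat_body is a hypothetical helper, and the checks are inferred from the error list on this page, not from a published schema.

```python
import json


def build_chat_body(model, messages, **extra):
    """Return the JSON body for POST /v1/chat/completions as a string."""
    if not model or not messages:
        raise ValueError("model and messages are required")
    if extra.get("stream"):
        # v0 rejects streaming with a 400, so fail early on the client.
        raise ValueError("streaming is not supported in v0; omit stream")
    body = {"model": model, "messages": messages}
    body.update(extra)  # pass through any other OpenAI-style parameters
    return json.dumps(body)
```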
The receipt is the security claim.
Every fine-tune job emits a signed attestation covering: the enclave's boot image hash, the Intel TDX measurement, the NVIDIA Confidential Computing report, and the loaded training code hash. Callers SHOULD verify the receipt against the published Targon TVM measurement before treating the result as confidential. The attested_pubkey returned by /v1/finetune/create is the ephemeral key derived inside that attested enclave.
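The comparison step of that check can be sketched as below. This covers only matching measurement values against values you obtained from the published source; it does not verify the receipt's signature or the TDX and GPU quote chains, which a real verifier must also do. The field names used in the usage note (tdx_measurement, training_code_hash) are hypothetical, since the receipt schema is not given here.

```python
import hmac


def measurements_match(receipt, expected):
    """Compare each expected measurement to the receipt in constant time.

    receipt and expected are dicts of field name to string value.
    Returns False if any expected field is missing or differs.
    """
    if not expected:
        raise ValueError("no expected measurements supplied")
    for name, want in expected.items():
        got = receipt.get(name)
        if got is None:
            return False
        if not hmac.compare_digest(str(got).encode("utf-8"),
                                   str(want).encode("utf-8")):
            return False
    return True
```

For example, measurements_match(receipt, {"tdx_measurement": "…", "training_code_hash": "…"}) returns True only when every listed field is present and identical.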
Pay-per-run. No subscriptions.
- FIRST RUN — Free. ≤ 100k training examples.
- STANDARD — $5. ≤ 1M training tokens.
- EXTENDED — $25. ≤ 10M training tokens.
- HOSTING — $0.50 / M output tokens. Sealed serverless. No idle GPU charge.
Settlement is TAO at the prevailing oracle rate. There SHALL be no subscription, seat, or minimum charge.
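The arithmetic behind these tiers is simple enough to sketch. The function names are hypothetical, the free first-run tier is left out because its limit is stated in examples rather than tokens, and the USD-per-TAO rate is an input you would take from the oracle, not a value defined here.

```python
from decimal import Decimal

# (tier name, max training tokens, price in USD), from the list above.
TIERS = [
    ("standard", 1_000_000, Decimal("5")),
    ("extended", 10_000_000, Decimal("25")),
]
HOSTING_USD_PER_M_OUTPUT = Decimal("0.50")


def tier_for_tokens(training_tokens):
    """Return (tier, price_usd) for the smallest tier that fits."""
    for name, limit, price in TIERS:
        if training_tokens <= limit:
            return name, price
    raise ValueError("token count exceeds the largest listed tier")


def hosting_cost_usd(output_tokens):
    """Hosting cost for a number of output tokens."""
    return HOSTING_USD_PER_M_OUTPUT * Decimal(output_tokens) / Decimal(1_000_000)


def usd_to_tao(usd, usd_per_tao):
    """Convert a USD price to TAO at a caller-supplied oracle rate."""
    return Decimal(usd) / Decimal(usd_per_tao)
```

With the 280,000-token job from the earlier example, tier_for_tokens returns the standard tier at $5, which matches the tier and price_usd fields in that response.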
The room is honest about what went wrong.
- 401 invalid fingerprint: bearer missing or wrong
- 402 insufficient balance: top up TAO
- 400 model / messages required: body shape wrong
- 400 streaming not yet supported: drop stream:true in v0
- 404 job not found: wrong id or not yours
- 503 oracle unavailable: try in a moment
- 503 targon unavailable: provider-side; we'll surface a status uri here in Phase 1
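One way a client might act on these codes is to retry only the 503 cases and surface everything else immediately. The sketch below assumes that policy; the helper names and the exponential backoff parameters are our own choices, not documented guidance.

```python
# Only the 503 responses above describe transient conditions; the 400, 401,
# 402, and 404 cases need a change on the caller's side before retrying.
RETRYABLE = {503}


def should_retry(status_code):
    """True when the status code suggests trying again later."""
    return status_code in RETRYABLE


def backoff_delays(attempts, base_s=1.0, cap_s=30.0):
    """Exponential backoff delays in seconds, capped at cap_s."""
    return [min(cap_s, base_s * 2 ** i) for i in range(attempts)]
```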