§ 01 · OVERVIEW

The room is the spec.

Doctus is a confidential fine-tuning + sealed hosting service running on Targon (Bittensor SN4) confidential GPUs — Intel TDX on the CPU side, NVIDIA Confidential Computing on the GPU side, Protected PCIe between them. Your training data is decrypted only inside the enclave. The trained adapter is encrypted to a key only you hold. We hold ciphertext and nothing else.

The API is OpenAI-compatible at the inference surface, with additional Doctus endpoints for the upstream half of the lifecycle (fine-tune creation, dataset upload, polling, adapter delivery).

Base URL: https://api.doctus.cloud
Health:   GET /healthz → { "ok": true, "ts": … }

§ 02 · ACCOUNT & FINGERPRINT

One username, one fingerprint.

Doctus has no email, no password, no OAuth. Auth is a single 32-character alphanumeric fingerprint the server generates when you create the account. We Argon2id-hash it and store the hash; we never see the raw value again. Save it. Lose it = lose the account.

Create

POST /v1/account/create
Content-Type: application/json

{ "username": "testertester",
  "coldkey": "5GNg4Czv…aBCy"   // optional, Bittensor SS58
}

→ 201 Created
{ "account_id": "…",
  "username":   "testertester",
  "fingerprint": "vlEhtGe244fg9PlBhnQgSgcvCR1j1TbF",
  "fp": "1TbF",
  "warning": "Copy this fingerprint to a safe place or download it now…" }

Authenticate

Authorization: Bearer vlEhtGe244fg9PlBhnQgSgcvCR1j1TbF
   # or, equivalently:
Authorization: Bearer doc-vlEhtGe244fg9PlBhnQgSgcvCR1j1TbF

Every API request that needs an identity carries this header. The doc- prefix is optional — we strip it.
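A minimal Python sketch of the create and authenticate steps above, using only the standard library. The helper names are illustrative and not part of any Doctus SDK; the username rule (3–32 alphanumeric characters) is taken from the endpoint reference below. The request is built but not sent.

```python
import json
import re
import urllib.request

BASE_URL = "https://api.doctus.cloud"
USERNAME_RE = re.compile(r"[A-Za-z0-9]{3,32}")   # 3-32 alphanumeric characters
FINGERPRINT_RE = re.compile(r"[A-Za-z0-9]{32}")  # 32-character alphanumeric


def build_create_account_request(username, coldkey=None):
    """Build (but do not send) POST /v1/account/create."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError("username must be 3-32 alphanumeric characters")
    body = {"username": username}
    if coldkey is not None:
        body["coldkey"] = coldkey  # optional Bittensor SS58 address
    return urllib.request.Request(
        BASE_URL + "/v1/account/create",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def auth_header(token):
    """Return the bearer header, accepting the fingerprint with or without 'doc-'."""
    fingerprint = token[4:] if token.startswith("doc-") else token
    if not FINGERPRINT_RE.fullmatch(fingerprint):
        raise ValueError("fingerprint must be 32 alphanumeric characters")
    return {"Authorization": "Bearer " + fingerprint}
```

To send the request, pass it to `urllib.request.urlopen`. Write the returned fingerprint to durable storage before doing anything else, since it is returned once.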

POST /v1/account/create
  Create an account. Generates a fresh fingerprint and returns it ONCE. Usernames are 3–32 alphanumeric characters and must be unique; uniqueness is checked case-insensitively.

GET /v1/account
  Read your account. Returns username, balance (TAO + USD), deposit address (when ready), and linked coldkey.

POST /v1/account/topup-address
  Derive a per-account Bittensor SS58 deposit address. Phase 1 wires up the on-chain watcher.

§ 03 · FINE-TUNE

Create. Encrypt. Submit. Poll.

A fine-tune job goes through four states: queued → running → attesting → done (or failed). You open the job, get back an attested public key from the enclave, encrypt your dataset to it, push the ciphertext, then poll until the attestation is signed and the adapter is sealed.
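The state progression above can be sketched as a polling loop. This is a hedged illustration, not a Doctus client: `fetch_status` is a caller-supplied function that performs the poll request documented further down and returns the decoded JSON, and the interval and polling budget are arbitrary defaults.

```python
import time

TERMINAL_STATES = {"done", "failed"}
KNOWN_STATES = {"queued", "running", "attesting"} | TERMINAL_STATES


def wait_for_job(fetch_status, interval_s=5.0, max_polls=720, sleep=time.sleep):
    """Call fetch_status() until the job reaches a terminal state.

    fetch_status is expected to perform GET /v1/finetune/{job_id}
    and return the decoded JSON object.
    """
    for _ in range(max_polls):
        job = fetch_status()
        status = job.get("status")
        if status not in KNOWN_STATES:
            raise ValueError(f"unexpected status: {status!r}")
        if status in TERMINAL_STATES:
            return job
        sleep(interval_s)  # still queued, running, or attesting
    raise TimeoutError("job did not finish within the polling budget")
```

Injecting `sleep` keeps the loop testable without real delays. A job that returns `failed` is handed back to the caller rather than raised, so the caller can inspect the response.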

Open a job

POST /v1/finetune/create
Authorization: Bearer …

{ "base_model": "Qwen/Qwen3-32B-TEE",
  "lora_rank": 16,
  "dataset_size_bytes": 312840,
  "estimated_tokens": 280000 }

→ 201 Created
{ "job_id": "…",
  "tvm_id": "…",
  "attested_pubkey": "X25519:…",
  "attestation_uri": "https://…",
  "tier": "standard",
  "price_tao": "0.018",
  "price_usd": 5,
  "hint": "encrypt your dataset to attested_pubkey…" }

Submit the dataset

Encrypt your dataset client-side to attested_pubkey (X25519 → ChaCha20-Poly1305 is the canonical envelope; see § 06). Then push the ciphertext:

POST /v1/finetune/{job_id}/data
Content-Type: application/octet-stream
<ciphertext bytes>
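A sketch of the client-side encryption step, under stated assumptions. The text above names only X25519 and ChaCha20-Poly1305; the HKDF-SHA256 key derivation, the `INFO` label, the ephemeral sender key, and the `ephemeral_pub || nonce || ciphertext` layout are all assumptions made for illustration, so check the real envelope format before relying on this. It uses the third-party `cryptography` package and takes the enclave key as 32 raw bytes, because the encoding of the `attested_pubkey` string is not documented here.

```python
import os

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey,
    X25519PublicKey,
)
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

INFO = b"doctus-dataset-sketch"  # hypothetical context label, not from the Doctus docs


def _derive_key(shared_secret):
    # Assumed KDF: HKDF-SHA256 over the raw X25519 shared secret.
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None, info=INFO).derive(
        shared_secret
    )


def seal_dataset(enclave_pub_raw, plaintext):
    """Return ephemeral_pub (32 B) || nonce (12 B) || ciphertext+tag (assumed layout)."""
    enclave_pub = X25519PublicKey.from_public_bytes(enclave_pub_raw)
    ephemeral = X25519PrivateKey.generate()
    key = _derive_key(ephemeral.exchange(enclave_pub))
    nonce = os.urandom(12)
    ciphertext = ChaCha20Poly1305(key).encrypt(nonce, plaintext, None)
    ephemeral_pub = ephemeral.public_key().public_bytes(
        serialization.Encoding.Raw, serialization.PublicFormat.Raw
    )
    return ephemeral_pub + nonce + ciphertext


def open_dataset(enclave_priv, envelope):
    """Inverse of seal_dataset; included only to show the round trip locally."""
    ephemeral_pub = X25519PublicKey.from_public_bytes(envelope[:32])
    key = _derive_key(enclave_priv.exchange(ephemeral_pub))
    return ChaCha20Poly1305(key).decrypt(envelope[32:44], envelope[44:], None)
```

In production only the enclave holds the private key; `open_dataset` exists here so the sketch can be checked end to end with a locally generated key pair.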

Poll

GET /v1/finetune/{job_id}

→ 200 OK
{ "id":            "…",
  "status":         "done",
  "base_model":     "Qwen/Qwen3-32B-TEE",
  "attested_pubkey":"X25519:…",
  "attestation_uri":"https://…/attest/abc123" }
POST /v1/finetune/create
  Open a LoRA job. Returns the attested enclave pubkey for dataset encryption + the price quote.

POST /v1/finetune/{job_id}/data
  Push the ciphertext dataset. Body is raw bytes encrypted to the attested pubkey.

GET /v1/finetune/{job_id}
  Poll the job. Returns current status + attestation receipt when ready.

GET /v1/finetune
  List your fine-tune jobs.

§ 04 · ADAPTER DELIVERY

Encrypted to your key. Yours alone.

When the job completes, the trained adapter is encrypted inside the enclave to your wallet public key (the same key you passed at job creation, or your fingerprint-derived key if none was supplied). The ciphertext adapter is staged at a one-time-use URL.

GET /v1/finetune/{job_id}/adapter
  Download the encrypted adapter: a ChaCha20-Poly1305 envelope around the LoRA weights, typically about 50–200 MB.

§ 05 · HOSTED INFERENCE

OpenAI-compatible. Sealed end-to-end.

Once an adapter is deployed (Phase 1 wires the deploy step), you hit a standard OpenAI-compatible /v1/chat/completions endpoint with your fingerprint as the bearer. The inference runs inside a CNTR enclave; we do not log prompts or completions.
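A small Python sketch of building that call with the standard library. The helper name is illustrative; the request is constructed but not sent, and `stream` is deliberately left out because the error table lists streaming as unsupported in v0.

```python
import json
import urllib.request

BASE_URL = "https://api.doctus.cloud"


def build_chat_request(fingerprint, model, messages):
    """Build (but do not send) POST /v1/chat/completions."""
    if not model or not messages:
        # Mirrors the documented 400 "model / messages required".
        raise ValueError("model and messages are required")
    body = {"model": model, "messages": messages}  # no "stream" key in v0
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + fingerprint,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Because the surface is described as OpenAI-compatible, an existing OpenAI client pointed at the Doctus base URL may also work; that is untested here.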

POST /v1/chat/completions
Authorization: Bearer <your fingerprint>

{ "model": "yourname/qwen3-32b-foo",
  "messages": [{ "role": "user", "content": "…" }] }
POST /v1/chat/completions
  OpenAI-compatible chat. Routes to your hosted model. Per-token billing in TAO. Streaming arrives in Phase 1.

§ 06 · ATTESTATION

The receipt is the security claim.

Every fine-tune job emits a signed attestation covering the enclave's boot image hash, the Intel TDX measurement, the NVIDIA Confidential Computing report, and the loaded training code hash. Callers SHOULD verify the receipt against the published Targon TVM measurement before treating the result as confidential. The attested_pubkey returned by /v1/finetune/create is the ephemeral key derived inside that attested enclave.
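The comparison half of that check can be sketched as follows. The receipt schema is not documented in this section, so every field name below is hypothetical, and the sketch assumes the receipt's signature has already been verified by other means. It only shows comparing the four covered values against published ones.

```python
import hmac

# Hypothetical field names: this page lists what the receipt covers,
# not how the receipt is laid out.
REQUIRED_FIELDS = (
    "boot_image_hash",
    "tdx_measurement",
    "nvidia_cc_report_hash",
    "training_code_hash",
)


def measurements_match(receipt, expected):
    """Return True only if every covered measurement equals the published value."""
    ok = True
    for field in REQUIRED_FIELDS:
        got = receipt.get(field)
        want = expected.get(field)
        if not isinstance(got, str) or not isinstance(want, str):
            return False  # a missing field is treated as a failed check
        # Constant-time comparison; keep checking after a mismatch.
        ok &= hmac.compare_digest(got.encode(), want.encode())
    return ok
```

Treat a `False` result as "not confidential" and do not upload data to that job.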

§ 07 · PRICING

Pay-per-run. No subscriptions.

  • FIRST RUN — Free. ≤ 100k training examples.
  • STANDARD — $5. ≤ 1M training tokens.
  • EXTENDED — $25. ≤ 10M training tokens.
  • HOSTING — $0.50 / M output tokens. Sealed serverless. No idle GPU charge.

Settlement is TAO at the prevailing oracle rate. There SHALL be no subscription, seat, or minimum charge.
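The paid tiers and the hosting rate above reduce to a few lines of arithmetic. This sketch assumes the lowercase tier names match the `tier` field in the job response, takes the oracle rate as a caller-supplied USD-per-TAO value, and leaves out FIRST RUN because its limit is stated in examples rather than tokens.

```python
from decimal import Decimal

# (tier, max training tokens, price in USD), taken from the list above
PAID_TIERS = [
    ("standard", 1_000_000, Decimal("5")),
    ("extended", 10_000_000, Decimal("25")),
]
HOSTING_USD_PER_M_OUTPUT_TOKENS = Decimal("0.50")


def quote_training(estimated_tokens):
    """Pick the cheapest listed tier that covers the token count."""
    for name, max_tokens, usd in PAID_TIERS:
        if estimated_tokens <= max_tokens:
            return name, usd
    raise ValueError("no listed tier covers more than 10M training tokens")


def hosting_cost_usd(output_tokens):
    """Hosting is billed per million output tokens."""
    return HOSTING_USD_PER_M_OUTPUT_TOKENS * Decimal(output_tokens) / Decimal(1_000_000)


def usd_to_tao(usd, usd_per_tao):
    """Convert a USD price to TAO at a caller-supplied oracle rate."""
    return usd / Decimal(usd_per_tao)
```

`Decimal` is used instead of floats so small TAO amounts do not pick up binary rounding error.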

§ 08 · ERRORS

The room is honest about what went wrong.

401 invalid fingerprint            — bearer missing or wrong
402 insufficient balance           — top up TAO
400 model / messages required      — body shape wrong
400 streaming not yet supported    — drop stream:true in v0
404 job not found                  — wrong id or not yours
503 oracle unavailable             — try in a moment
503 targon unavailable             — provider-side; we'll surface a status uri here in Phase 1
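
A client can turn the table above into a small dispatch function. This is a sketch under assumptions: the error body format is not documented here, so the caller passes the message text in, the action names are made up for illustration, and the backoff schedule is one reasonable reading of "try in a moment".

```python
RETRYABLE = {503}  # "oracle unavailable" and "targon unavailable"


def next_action(status, message=""):
    """Map a documented error to a coarse client-side action."""
    if status in RETRYABLE:
        return "retry"
    if status == 401:
        return "check_fingerprint"
    if status == 402:
        return "top_up"
    if status == 404:
        return "check_job_id"
    if status == 400 and "streaming" in message:
        return "drop_stream_flag"
    if status == 400:
        return "fix_request_body"
    return "unknown"


def backoff_delays(attempts, base_s=1.0, cap_s=30.0):
    """Exponential delays in seconds for retryable errors, capped at cap_s."""
    return [min(cap_s, base_s * 2 ** i) for i in range(attempts)]
```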