Tuned Tensor

Runs

A run is a single end-to-end cycle: compile your behaviour spec into training data, augment examples with AI, fine-tune the model, and auto-evaluate the result.

The Run Object

{
  "id": "e0b7694b-2c65-4199-89a1-fc54a6a6010c",
  "behavior_spec_id": "cafd8799-...",
  "run_number": 1,
  "status": "completed",
  "spec_snapshot": { ... },
  "dataset_id": "dc66546b-...",
  "fine_tune_job_id": "b3e2b918-...",
  "model_id": "96e9f0d9-...",
  "hyperparameters": {
    "augment": true,
    "n_epochs": 4,
    "lora_rank": 8,
    "lora_alpha": 16
  },
  "eval_summary": {
    "total": 5,
    "avg_score": 0.82,
    "pass_rate": 0.8,
    "scoring_method": "llm_judge",
    "regressions": 0,
    "improvements": 3
  },
  "started_at": "2026-03-06T10:30:00.000Z",
  "completed_at": "2026-03-06T10:57:50.000Z"
}
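
As a small illustration of working with the object above, the sketch below derives a run's wall-clock duration from started_at and completed_at. The helper name run_duration_seconds is hypothetical and not part of the API; it assumes both timestamps are ISO 8601 strings in UTC with a trailing "Z", as in the example.

```python
from datetime import datetime
from typing import Optional


def run_duration_seconds(run: dict) -> Optional[float]:
    """Return seconds between started_at and completed_at, or None if either is missing."""
    started = run.get("started_at")
    completed = run.get("completed_at")
    if not started or not completed:
        return None
    # Swap the trailing "Z" for an explicit offset so fromisoformat
    # also accepts the string on Python versions before 3.11.
    start_dt = datetime.fromisoformat(started.replace("Z", "+00:00"))
    end_dt = datetime.fromisoformat(completed.replace("Z", "+00:00"))
    return (end_dt - start_dt).total_seconds()


# Timestamps copied from the example run object.
example_run = {
    "started_at": "2026-03-06T10:30:00.000Z",
    "completed_at": "2026-03-06T10:57:50.000Z",
}
duration = run_duration_seconds(example_run)  # 27 min 50 s
```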

Run Lifecycle

Status       Description
preparing    Compiling spec → augmenting examples → uploading to provider
training     Fine-tuning job running on Together AI
evaluating   Model being tested against the spec's examples
completed    Eval results available
failed       Error (check the error field)
cancelled    Manually cancelled
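
Because a run passes through several statuses before it finishes, a client usually polls until it reaches completed, failed, or cancelled. The following is a minimal sketch of such a loop, not an official client: wait_for_run, its parameters, and the polling interval are assumptions, and fetch_run stands in for whatever function performs GET /api/v1/runs/:id and returns the parsed JSON.

```python
import time
from typing import Callable

# Statuses from the lifecycle table after which a run no longer changes.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_run(
    fetch_run: Callable[[], dict],
    interval_s: float = 30.0,
    max_polls: int = 240,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_run() repeatedly until the run is in a terminal status."""
    for _ in range(max_polls):
        run = fetch_run()
        if run["status"] in TERMINAL_STATUSES:
            return run
        sleep(interval_s)
    raise TimeoutError("run did not reach a terminal status in time")


# Demo with a fake fetcher that walks through the lifecycle without any network calls.
_states = iter(["preparing", "training", "evaluating", "completed"])
final = wait_for_run(lambda: {"status": next(_states)}, sleep=lambda s: None)
```

Injecting fetch_run and sleep keeps the loop independent of any HTTP library and makes it easy to exercise offline.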

Spec Snapshot

Every run captures a spec_snapshot — a frozen copy of the behaviour spec at run time. You can freely edit your spec between runs; each run preserves exactly what it trained on.

Eval Summary

Field              Description
avg_score          Mean score across all examples (0–1)
pass_rate          Fraction of examples that passed (score ≥ 0.7)
exact_match_rate   Fraction of near-perfect scores (≥ 0.95)
avg_latency_ms     Mean inference latency per example
scoring_method     llm_judge or similarity
regressions        Examples that scored ≥ 0.1 worse than previous run
improvements       Examples that scored ≥ 0.1 better than previous run
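
To make the thresholds above concrete, here is a sketch that recomputes the main summary fields from per-example scores. It only illustrates the definitions in the table; how the service actually computes them is not documented here, and the function name summarize, the example IDs, and the rounding of score deltas are assumptions.

```python
from statistics import mean
from typing import Dict, Optional

PASS_THRESHOLD = 0.7    # score >= 0.7 counts as passed
EXACT_THRESHOLD = 0.95  # score >= 0.95 counts as near-perfect
DELTA_THRESHOLD = 0.1   # change of >= 0.1 counts as regression/improvement


def summarize(scores: Dict[str, float],
              previous: Optional[Dict[str, float]] = None) -> dict:
    """Compute summary fields from {example_id: score} maps for this and the previous run."""
    if not scores:
        raise ValueError("need at least one score")
    values = list(scores.values())
    total = len(values)
    improvements = regressions = 0
    for example_id, prev_score in (previous or {}).items():
        if example_id not in scores:
            continue
        # Round to sidestep float noise such as 0.6 - 0.5 != 0.1 (an assumption).
        delta = round(scores[example_id] - prev_score, 6)
        if delta >= DELTA_THRESHOLD:
            improvements += 1
        elif delta <= -DELTA_THRESHOLD:
            regressions += 1
    return {
        "total": total,
        "avg_score": mean(values),
        "pass_rate": sum(s >= PASS_THRESHOLD for s in values) / total,
        "exact_match_rate": sum(s >= EXACT_THRESHOLD for s in values) / total,
        "regressions": regressions,
        "improvements": improvements,
    }


# Hypothetical scores for five examples in two consecutive runs.
current = {"a": 0.9, "b": 0.75, "c": 1.0, "d": 0.5, "e": 0.95}
previous = {"a": 0.6, "b": 0.75, "c": 0.8, "d": 0.7, "e": 0.9}
summary = summarize(current, previous)
```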

Start a Run

POST /api/v1/behavior-specs/:id/runs

curl -X POST https://api.tunedtensor.com/v1/behavior-specs/:id/runs \
  -H "Authorization: Bearer tt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "augment": true,
    "hyperparameters": {
      "n_epochs": 4,
      "learning_rate": 0.00002,
      "lora_rank": 8,
      "lora_alpha": 16
    }
  }'

Parameter                       Default   Description
augment                         true      Use AI to expand examples into a larger training set
hyperparameters.n_epochs        4         Number of training epochs (1–20)
hyperparameters.learning_rate   auto      Learning rate
hyperparameters.batch_size      8         Training batch size (min 8)
hyperparameters.lora_rank       8         LoRA adapter rank
hyperparameters.lora_alpha      16        LoRA alpha scaling factor

The request returns immediately with the run in status preparing. The rest of the work happens asynchronously.
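
A client can check the documented limits before sending the request. The sketch below builds the JSON body and rejects values outside the ranges in the parameter table. build_run_payload is a hypothetical helper, and leaving learning_rate out of the body to get the auto default is an assumption about how the API treats a missing field.

```python
import json
from typing import Optional


def build_run_payload(
    augment: bool = True,
    n_epochs: int = 4,
    learning_rate: Optional[float] = None,
    batch_size: int = 8,
    lora_rank: int = 8,
    lora_alpha: int = 16,
) -> str:
    """Return the JSON request body for starting a run, validating documented limits."""
    if not 1 <= n_epochs <= 20:
        raise ValueError("n_epochs must be between 1 and 20")
    if batch_size < 8:
        raise ValueError("batch_size must be at least 8")
    hyperparameters = {
        "n_epochs": n_epochs,
        "batch_size": batch_size,
        "lora_rank": lora_rank,
        "lora_alpha": lora_alpha,
    }
    # Assumption: omitting learning_rate lets the service pick the "auto" default.
    if learning_rate is not None:
        hyperparameters["learning_rate"] = learning_rate
    return json.dumps({"augment": augment, "hyperparameters": hyperparameters})


body = build_run_payload(n_epochs=4, learning_rate=0.00002)
```

The resulting string can be passed as the -d body of the curl command shown earlier.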

List Runs for a Spec

GET /api/v1/behavior-specs/:id/runs

curl https://api.tunedtensor.com/v1/behavior-specs/:id/runs \
  -H "Authorization: Bearer tt_your_api_key"

List All Runs

GET /api/v1/runs

curl https://api.tunedtensor.com/v1/runs \
  -H "Authorization: Bearer tt_your_api_key"

Returns runs across all specs. Each run includes a _spec_name field for display.

Get Run Detail

GET /api/v1/runs/:id

curl https://api.tunedtensor.com/v1/runs/:id \
  -H "Authorization: Bearer tt_your_api_key"

Returns the full run with _evals — per-example results sorted by score (worst first). Each eval includes:

  • prompt, expected, actual
  • score (0–1), passed (boolean)
  • reasoning — LLM judge's explanation
  • latency_ms — inference time
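
As an example of using these fields, the sketch below pulls the failing examples out of a run detail response for review. The helper failing_evals and the sample data are hypothetical; it assumes _evals is a list of objects carrying the fields listed above, and it sorts again locally rather than relying on the order of the response.

```python
from typing import List


def failing_evals(run_detail: dict) -> List[dict]:
    """Return evals that did not pass, lowest score first."""
    evals = run_detail.get("_evals", [])
    failures = [e for e in evals if not e["passed"]]
    return sorted(failures, key=lambda e: e["score"])


# Hypothetical response fragment with only the fields used here.
sample_detail = {
    "_evals": [
        {"prompt": "p1", "score": 0.55, "passed": False, "reasoning": "Partly off-spec"},
        {"prompt": "p2", "score": 0.91, "passed": True, "reasoning": "Matches"},
        {"prompt": "p3", "score": 0.20, "passed": False, "reasoning": "Wrong format"},
    ]
}

for item in failing_evals(sample_detail):
    print(f'{item["score"]:.2f}  {item["prompt"]}: {item["reasoning"]}')
```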

Cancel a Run

POST /api/v1/runs/:id/cancel

curl -X POST https://api.tunedtensor.com/v1/runs/:id/cancel \
  -H "Authorization: Bearer tt_your_api_key"

Cancels a run that is in preparing, training, or evaluating status. If the provider fine-tuning job is running, it is cancelled as well.
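
Since only three statuses can be cancelled, a client may want to check the status before calling the endpoint. This is a minimal sketch: can_cancel and cancel_url are hypothetical helpers, and the base URL is taken from the curl examples on this page.

```python
# Statuses the cancel endpoint is documented to accept.
CANCELLABLE_STATUSES = frozenset({"preparing", "training", "evaluating"})

# Base URL as used in the curl examples (assumed to be the right prefix).
BASE_URL = "https://api.tunedtensor.com/v1"


def can_cancel(run: dict) -> bool:
    """Return True if the run is in a status that can still be cancelled."""
    return run.get("status") in CANCELLABLE_STATUSES


def cancel_url(run_id: str) -> str:
    """Build the URL for the cancel endpoint of a given run."""
    return f"{BASE_URL}/runs/{run_id}/cancel"


run = {"id": "e0b7694b-2c65-4199-89a1-fc54a6a6010c", "status": "training"}
if can_cancel(run):
    url = cancel_url(run["id"])  # POST to this URL with the Authorization header
```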