Runs
A run is a single end-to-end cycle: compile your behaviour spec into training data, augment examples with AI, fine-tune the model, and auto-evaluate the result.
The Run Object
{
  "id": "e0b7694b-2c65-4199-89a1-fc54a6a6010c",
  "behavior_spec_id": "cafd8799-...",
  "run_number": 1,
  "status": "completed",
  "spec_snapshot": { ... },
  "dataset_id": "dc66546b-...",
  "fine_tune_job_id": "b3e2b918-...",
  "model_id": "96e9f0d9-...",
  "hyperparameters": {
    "augment": true,
    "n_epochs": 4,
    "lora_rank": 8,
    "lora_alpha": 16
  },
  "eval_summary": {
    "total": 5,
    "avg_score": 0.82,
    "pass_rate": 0.8,
    "scoring_method": "llm_judge",
    "regressions": 0,
    "improvements": 3
  },
  "started_at": "2026-03-06T10:30:00.000Z",
  "completed_at": "2026-03-06T10:57:50.000Z"
}
Run Lifecycle
| Status | Description |
|---|---|
| preparing | Compiling spec → augmenting examples → uploading to provider |
| training | Fine-tuning job running on Together AI |
| evaluating | Model being tested against the spec's examples |
| completed | Eval results available |
| failed | Error: check the error field |
| cancelled | Manually cancelled |
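Because a run moves through these statuses asynchronously, a client will usually poll until it reaches a final one. The sketch below assumes that `completed`, `failed`, and `cancelled` are the only terminal statuses, as the table suggests. `fetch_run` is a hypothetical caller-supplied function (for example, a thin wrapper around the Get Run Detail endpoint), and the interval and timeout values are illustrative.

```python
import time
from typing import Callable

# Assumption: these three statuses from the table are terminal.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_run(
    fetch_run: Callable[[str], dict],
    run_id: str,
    interval_s: float = 30.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_run(run_id) until the run reaches a terminal status.

    fetch_run is a hypothetical callable that returns the run as a dict.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        run = fetch_run(run_id)
        if run["status"] in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"run {run_id} still {run['status']!r} after {timeout_s}s"
            )
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.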
Spec Snapshot
Every run captures a spec_snapshot — a frozen copy of the behaviour spec at run time. You can freely edit your spec between runs; each run preserves exactly what it trained on.
Eval Summary
| Field | Description |
|---|---|
| avg_score | Mean score across all examples (0–1) |
| pass_rate | Fraction of examples that passed (score ≥ 0.7) |
| exact_match_rate | Fraction of near-perfect scores (≥ 0.95) |
| avg_latency_ms | Mean inference latency per example |
| scoring_method | llm_judge or similarity |
| regressions | Examples that scored ≥ 0.1 worse than previous run |
| improvements | Examples that scored ≥ 0.1 better than previous run |
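To make the thresholds in this table concrete, here is a minimal sketch of how the summary fields could be derived from per-example scores. It is one reading of the table, not the service's actual implementation. Keying scores by an example id is an assumption made for illustration.

```python
from statistics import mean
from typing import Optional

PASS_THRESHOLD = 0.7          # from the pass_rate row
EXACT_MATCH_THRESHOLD = 0.95  # from the exact_match_rate row
CHANGE_THRESHOLD = 0.1        # from the regressions / improvements rows


def summarize(scores: dict[str, float],
              previous: Optional[dict[str, float]] = None) -> dict:
    """Aggregate per-example scores keyed by a (hypothetical) example id."""
    if not scores:
        raise ValueError("scores must not be empty")
    values = list(scores.values())
    previous = previous or {}
    # Only examples present in both runs can regress or improve.
    deltas = [scores[k] - previous[k] for k in scores if k in previous]
    return {
        "total": len(values),
        "avg_score": mean(values),
        "pass_rate": sum(v >= PASS_THRESHOLD for v in values) / len(values),
        "exact_match_rate": sum(v >= EXACT_MATCH_THRESHOLD for v in values) / len(values),
        "regressions": sum(d <= -CHANGE_THRESHOLD for d in deltas),
        "improvements": sum(d >= CHANGE_THRESHOLD for d in deltas),
    }
```

On a first run there is no previous run to compare against, so `regressions` and `improvements` both come out as 0 in this sketch.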
Start a Run
POST /api/v1/behavior-specs/:id/runs
curl -X POST https://api.tunedtensor.com/v1/behavior-specs/:id/runs \
  -H "Authorization: Bearer tt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "augment": true,
    "hyperparameters": {
      "n_epochs": 4,
      "learning_rate": 0.00002,
      "lora_rank": 8,
      "lora_alpha": 16
    }
  }'
| Parameter | Default | Description |
|---|---|---|
| augment | true | Use AI to expand examples into a larger training set |
| hyperparameters.n_epochs | 4 | Number of training epochs (1–20) |
| hyperparameters.learning_rate | auto | Learning rate |
| hyperparameters.batch_size | 8 | Training batch size (min 8) |
| hyperparameters.lora_rank | 8 | LoRA adapter rank |
| hyperparameters.lora_alpha | 16 | LoRA alpha scaling factor |
Returns immediately with status preparing. Work happens asynchronously.
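Checking the documented ranges on the client before sending the request can save a round trip. The sketch below builds the JSON body and validates only the two limits stated in the parameter table (`n_epochs` 1–20, `batch_size` at least 8). `build_run_payload` is a hypothetical helper, not part of any official client, and how the server itself reports invalid values is not covered here.

```python
import json


def build_run_payload(augment: bool = True, **hyperparameters) -> dict:
    """Build the JSON body for starting a run, checking documented ranges.

    Only hyperparameters the caller passes are included, so the service
    applies its own defaults for the rest.
    """
    n_epochs = hyperparameters.get("n_epochs")
    if n_epochs is not None and not 1 <= n_epochs <= 20:
        raise ValueError("n_epochs must be between 1 and 20")
    batch_size = hyperparameters.get("batch_size")
    if batch_size is not None and batch_size < 8:
        raise ValueError("batch_size must be at least 8")
    payload = {"augment": augment}
    if hyperparameters:
        payload["hyperparameters"] = hyperparameters
    return payload


# Serialized body, usable with curl -d or any HTTP client.
body = json.dumps(
    build_run_payload(n_epochs=4, learning_rate=2e-5, lora_rank=8, lora_alpha=16)
)
```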
List Runs for a Spec
GET /api/v1/behavior-specs/:id/runs
curl https://api.tunedtensor.com/v1/behavior-specs/:id/runs \
  -H "Authorization: Bearer tt_your_api_key"
List All Runs
GET /api/v1/runs
curl https://api.tunedtensor.com/v1/runs \
  -H "Authorization: Bearer tt_your_api_key"
Returns runs across all specs with _spec_name for display.
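One common use of the cross-spec listing is a dashboard showing the most recent run for each spec. The sketch below assumes the response has already been parsed into a plain list of run dicts, each carrying `_spec_name` and `run_number`. The exact response envelope is not documented here, so any unwrapping step is left to the caller.

```python
def latest_run_per_spec(runs: list[dict]) -> dict[str, dict]:
    """Map each _spec_name to its run with the highest run_number."""
    latest: dict[str, dict] = {}
    for run in runs:
        name = run["_spec_name"]
        if name not in latest or run["run_number"] > latest[name]["run_number"]:
            latest[name] = run
    return latest
```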
Get Run Detail
GET /api/v1/runs/:id
curl https://api.tunedtensor.com/v1/runs/:id \
  -H "Authorization: Bearer tt_your_api_key"
Returns the full run with _evals: per-example results sorted by score (worst first). Each eval includes:
- prompt, expected, actual
- score (0–1), passed (boolean)
- reasoning: LLM judge's explanation
- latency_ms: inference time
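The per-example results are most useful for triaging failures. As a rough sketch, assuming the run detail has been parsed into a dict with the `_evals` fields listed above, the helpers below pull out the lowest-scoring failed examples and format them for a log line. Both function names are hypothetical.

```python
def failing_evals(run: dict, limit: int = 5) -> list[dict]:
    """Return up to `limit` evals that did not pass, lowest score first."""
    failed = [e for e in run.get("_evals", []) if not e["passed"]]
    # The endpoint is documented as worst-first; sorting again is defensive.
    return sorted(failed, key=lambda e: e["score"])[:limit]


def format_eval(e: dict) -> str:
    """One-line summary: score, prompt, and the judge's reasoning."""
    return f"{e['score']:.2f}  {e['prompt']!r}: {e['reasoning']}"
```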
Cancel a Run
POST /api/v1/runs/:id/cancel
curl -X POST https://api.tunedtensor.com/v1/runs/:id/cancel \
  -H "Authorization: Bearer tt_your_api_key"
Cancels runs in preparing, training, or evaluating status. Also cancels the provider fine-tuning job if running.
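Since only runs in those three statuses can be cancelled, a client may want to check the status locally before issuing the call. The sketch below builds the request with Python's standard `urllib.request` but does not send it. The base URL is copied from the curl examples, `build_cancel_request` is a hypothetical helper, and what the server returns for a run that has already finished is not specified here.

```python
import urllib.request

API_BASE = "https://api.tunedtensor.com/v1"  # base URL as used in the curl examples
CANCELLABLE_STATUSES = {"preparing", "training", "evaluating"}


def build_cancel_request(run: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the cancel request for a run."""
    if run["status"] not in CANCELLABLE_STATUSES:
        raise ValueError(f"run is {run['status']!r} and cannot be cancelled")
    return urllib.request.Request(
        f"{API_BASE}/runs/{run['id']}/cancel",
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

To send it, pass the returned object to `urllib.request.urlopen`.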