@sanity/ailf-studio 0.1.17 → 0.1.18

package/README.md CHANGED
@@ -57,6 +57,28 @@ This registers:
  - The `ailf.evalRequest` document type (evaluation request triggers)
  - The **AI Literacy Framework** dashboard tool in the Studio sidebar
 
+ #### Document Actions
+
+ The plugin registers two document actions for triggering evaluations directly
+ from Studio:
+
+ - **Run Task Eval** (on `ailf.task` documents) — evaluates a single task. Click
+ ▶ in the document actions menu to run all test cases for the task against the
+ current documentation. The button shows the score when complete (~10–15 min).
+ No secrets needed — it creates an `ailf.evalRequest` document that a
+ server-side webhook dispatches to the pipeline.
+
+ - **Run AI Eval** (on content releases) — evaluates all tasks affected by a
+ content release. Appears in the release detail page's action bar. Answers "did
+ my doc changes help or hurt AI agent performance?" Shows score and delta vs
+ baseline when complete.
+
+ Both actions use the same mechanism: they create an `ailf.evalRequest` document
+ in the Content Lake with `status: "pending"`. A server-side Sanity webhook picks
+ up the document and dispatches the pipeline via GitHub Actions. The Studio
+ component polls for the resulting report and updates the button label with the
+ score.
+
  ### 3. Alternative: tool-only installation
 
  If you only want the dashboard tool without the document schemas (e.g., the
@@ -103,6 +125,21 @@ export default defineConfig({
  })
  ```
 
+ ## Task Execution Workflows
+
+ Tasks created in Studio are automatically included in every pipeline run — no
+ registration step needed. There are four ways to execute tasks:
+
+ | Method | Trigger | Scope |
+ | ------------------------------ | ----------------------------------------------------------------- | ----------------------------- |
+ | **Run Task Eval** action | Click ▶ on any `ailf.task` document | Single task |
+ | **Run AI Eval** release action | Click button on a content release page | Tasks affected by the release |
+ | **CLI pipeline** | `ailf pipeline` (with optional `--area`/`--task`/`--tag` filters) | All enabled tasks |
+ | **Scheduled pipeline** | GitHub Actions cron (daily + weekly) | All enabled tasks |
+
+ See [CONTRIBUTING_TASKS.md](../../docs/CONTRIBUTING_TASKS.md#running-your-task)
+ for the full execution flow and details on each method.
+
  ## Dashboard Views
 
  The plugin provides three tab views plus a detail drill-down, accessible from
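
Reviewer sketch (not part of the package): the request mechanism described under "Document Actions" in the README hunk above could look roughly like this. Only `_type: "ailf.evalRequest"` and `status: "pending"` come from the README text; the `scope` field, its shape, and the `buildEvalRequest` helper are hypothetical names chosen for illustration and may not match the plugin's actual schema.

```typescript
// Hypothetical scope shape: the README only says a request targets either
// a single task or the tasks affected by a content release.
type EvalScope =
  | { kind: "task"; taskId: string }
  | { kind: "release"; releaseId: string };

interface EvalRequestDraft {
  _type: "ailf.evalRequest"; // document type named in the README
  status: "pending"; // initial status named in the README
  scope: EvalScope; // assumed field, not confirmed by the diff
}

// Build the document payload that a document action would create.
function buildEvalRequest(scope: EvalScope): EvalRequestDraft {
  return { _type: "ailf.evalRequest", status: "pending", scope };
}

// One request per action: single-task eval and release-wide eval.
const taskRequest = buildEvalRequest({ kind: "task", taskId: "example-task" });
const releaseRequest = buildEvalRequest({
  kind: "release",
  releaseId: "example-release",
});
```

In a Studio action, a payload like this would presumably be written with the Sanity client's `create` method, after which the server-side webhook takes over; the Studio side needs no pipeline secrets, as the README notes.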
package/dist/index.d.ts CHANGED
@@ -436,10 +436,24 @@ declare const reportSchema: {
  * - A gold-standard implementation (reference solution)
  * - When/how the task runs (execution controls)
  *
+ * ## Execution paths
+ *
+ * Published tasks are automatically discovered by the pipeline — no
+ * registration step needed. There are four ways to execute a task:
+ *
+ * 1. **Run Task Eval** — click ▶ on any ailf.task document in Studio.
+ * Creates an ailf.evalRequest scoped to this task. Webhook dispatches
+ * the pipeline; score appears inline when complete (~10–15 min).
+ * 2. **Run AI Eval** — click on a content release page. Auto-scopes to
+ * tasks whose canonical docs are in the release.
+ * 3. **CLI** — `ailf pipeline --task <id>` or `ailf pipeline --area <area>`.
+ * 4. **Scheduled** — GitHub Actions cron (daily baseline, weekly full).
+ *
  * Tasks can be authored natively in Studio or mirrored from external
  * repositories. Mirrored tasks have a read-only `origin` block that
  * tracks their source repo provenance.
  *
+ * @see docs/CONTRIBUTING_TASKS.md#running-your-task — full execution guide
  * @see docs/design-docs/tasks-as-content.md
  * @see docs/design-docs/tasks-as-content.md#decision-8-domain-specific-assertion-types-not-a-promptfoo-subset
  */
package/dist/index.js CHANGED
@@ -2481,6 +2481,26 @@ var taskSchema = defineType5({
  }),
  // -----------------------------------------------------------------------
  // Execution controls
+ //
+ // Pipeline support status:
+ // - `enabled` → ✅ used in GROQ filter (execution.enabled != false)
+ // - `blocking` → 🔮 forward-looking — schema reserved, not yet enforced
+ // by the pipeline. PR check workflow reports scores but
+ // does not gate merges on this flag.
+ // - `trigger.branches`, `trigger.paths` → ✅ used by repo-based task
+ // threshold evaluator (RepoThresholdEvaluator)
+ // - `trigger.labels`, `trigger.schedule` → 🔮 forward-looking — schema
+ // reserved, not yet read by the pipeline GROQ query or
+ // mapped to TaskDefinition.
+ // - `threshold.score` → ✅ used by repo-based threshold evaluator
+ // - `threshold.dimensions` → 🔮 forward-looking — schema reserved, not
+ // yet read by the pipeline GROQ query or mapped to
+ // TaskDefinition.
+ //
+ // When implementing a forward-looking field, update the GROQ query in
+ // content-lake-task-source.ts, the ContentLakeTask interface, the
+ // mapToTaskDefinition() mapping, and (if pipeline-relevant) the
+ // TaskDefinition port type in @sanity/ailf-core.
  // -----------------------------------------------------------------------
  defineField5({
  description: "Controls when and how this task runs in the pipeline. Tasks without execution controls use defaults: enabled, non-blocking, no filters, no threshold.",
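
Reviewer sketch (not part of the package): the `enabled` note in the comment block above, read together with the field description's stated default ("enabled"), suggests that only an explicit `false` excludes a task from a run. A minimal TypeScript rendering of that rule follows, assuming tasks with no `execution` object or no `enabled` value count as enabled. The `TaskLike` type and `isTaskEnabled` helper are hypothetical; the pipeline's actual filter is the GROQ expression quoted in the comment, not this function.

```typescript
// Hypothetical minimal task shape; real ailf.task documents carry more fields.
interface TaskLike {
  execution?: { enabled?: boolean };
}

// A task is skipped only when `execution.enabled` is explicitly false.
// Missing `execution` or missing `enabled` falls back to the default (enabled).
function isTaskEnabled(task: TaskLike): boolean {
  return task.execution?.enabled !== false;
}

const tasks: TaskLike[] = [
  {}, // no execution controls at all
  { execution: {} }, // controls present, flag unset
  { execution: { enabled: true } }, // explicitly enabled
  { execution: { enabled: false } }, // explicitly disabled
];

// Number of tasks a run would include under the assumed default.
const enabledCount = tasks.filter(isTaskEnabled).length;
```

Filtering on "not false" rather than "is true" keeps existing tasks running when the `execution` block is added to the schema later, which matches the defaults listed in the field description.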
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@sanity/ailf-studio",
- "version": "0.1.17",
+ "version": "0.1.18",
  "description": "AI Literacy Framework — Sanity Studio dashboard plugin",
  "type": "module",
  "license": "MIT",