@sanity/ailf-studio 0.1.17 → 0.1.19

package/README.md CHANGED
@@ -1,5 +1,9 @@
  # @sanity/ailf-studio
 
+ > ⚠️ **Internal package.** This package is published publicly for convenience
+ > but is intended for internal Sanity use only. APIs and schemas may change
+ > without notice. No support is provided for external consumers.
+
  Sanity Studio dashboard plugin for the **AI Literacy Framework**. Visualizes
  evaluation reports, score trends, comparisons, and content impact — directly
  inside Sanity Studio with no external backend.
@@ -17,10 +21,6 @@ AILF reports are stored.
  pnpm add @sanity/ailf-studio
  ```
 
- > **Note:** The package is published with `restricted` access to the `@sanity`
- > npm scope. You need an npm token with read access — see the root
- > [README](../../README.md#obtain-secrets) for how to obtain one.
-
  #### Within the monorepo
 
  ```bash
@@ -57,6 +57,28 @@ This registers:
  - The `ailf.evalRequest` document type (evaluation request triggers)
  - The **AI Literacy Framework** dashboard tool in the Studio sidebar
 
+ #### Document Actions
+
+ The plugin registers two document actions for triggering evaluations directly
+ from Studio:
+
+ - **Run Task Eval** (on `ailf.task` documents) — evaluates a single task. Click
+ ▶ in the document actions menu to run all test cases for the task against the
+ current documentation. The button shows the score when complete (~10–15 min).
+ No secrets needed — it creates an `ailf.evalRequest` document that a
+ server-side webhook dispatches to the pipeline.
+
+ - **Run AI Eval** (on content releases) — evaluates all tasks affected by a
+ content release. Appears in the release detail page's action bar. Answers "did
+ my doc changes help or hurt AI agent performance?" Shows score and delta vs
+ baseline when complete.
+
+ Both actions use the same mechanism: they create an `ailf.evalRequest` document
+ in the Content Lake with `status: "pending"`. A server-side Sanity webhook picks
+ up the document and dispatches the pipeline via GitHub Actions. The Studio
+ component polls for the resulting report and updates the button label with the
+ score.
+
  ### 3. Alternative: tool-only installation
 
  If you only want the dashboard tool without the document schemas (e.g., the
@@ -103,6 +125,22 @@ export default defineConfig({
  })
  ```
 
+ ## Task Execution Workflows
+
+ Tasks created in Studio are automatically included in every pipeline run — no
+ registration step needed. There are four ways to execute tasks:
+
+ | Method | Trigger | Scope |
+ | ------------------------------ | ----------------------------------------------------------------- | ----------------------------- |
+ | **Run Task Eval** action | Click ▶ on any `ailf.task` document | Single task |
+ | **Run AI Eval** release action | Click button on a content release page | Tasks affected by the release |
+ | **CLI pipeline** | `ailf pipeline` (with optional `--area`/`--task`/`--tag` filters) | All enabled tasks |
+ | **Scheduled pipeline** | GitHub Actions cron (daily + weekly) | All enabled tasks |
+
+ See the
+ [CONTRIBUTING_TASKS](https://github.com/sanity-labs/ai-literacy-framework/blob/main/docs/CONTRIBUTING_TASKS.md#running-your-task)
+ guide for the full execution flow and details on each method.
+
  ## Dashboard Views
 
  The plugin provides three tab views plus a detail drill-down, accessible from
@@ -178,15 +216,13 @@ To point it at a dedicated report dataset, configure the Studio's dataset:
 
  ```ts
  export default defineConfig({
- projectId: "3do82whm",
- dataset: "my-report-dataset", // or use AILF_REPORT_DATASET
+ projectId: "<your-project-id>",
+ dataset: "my-report-dataset",
  plugins: [ailfPlugin()],
  })
  ```
 
- Reports are written by the evaluation pipeline (`ailf pipeline --publish`). See
- the [report store design docs](../../docs/design-docs/report-store/index.md) for
- the full architecture.
+ Reports are written by the evaluation pipeline (`ailf pipeline --publish`).
 
  ## Exported API
 
@@ -299,9 +335,9 @@ consuming Studio's bundler (Vite) handles the final bundle.
 
  ## Related Documentation
 
- - [Report Store Design](../../docs/design-docs/report-store/index.md) — full
- architecture and implementation plan
- - [Visibility & Workflows](../../docs/design-docs/report-store/visibility-workflows.md)
+ - [Report Store Design](https://github.com/sanity-labs/ai-literacy-framework/blob/main/docs/design-docs/report-store/index.md)
+ — full architecture and implementation plan
+ - [Visibility & Workflows](https://github.com/sanity-labs/ai-literacy-framework/blob/main/docs/design-docs/report-store/visibility-workflows.md)
  — design rationale for the dashboard views
- - [Report Store Architecture](../../docs/design-docs/report-store/architecture.md)
+ - [Report Store Architecture](https://github.com/sanity-labs/ai-literacy-framework/blob/main/docs/design-docs/report-store/architecture.md)
  — Sanity Content Lake as the system of record
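The Document Actions text added to the README in this diff says both actions create an `ailf.evalRequest` document with `status: "pending"`. A minimal TypeScript sketch of such a payload follows. Only `_type` and `status` come from the README; the `scope` and `requestedAt` fields and the `buildEvalRequest` helper are hypothetical names used for illustration, not the plugin's actual schema.

```ts
// Hypothetical request scope: one task, or all tasks touched by a release.
type EvalScope =
  | { kind: "task"; taskId: string }
  | { kind: "release"; releaseId: string };

// Only `_type` and `status` are taken from the README; the rest is assumed.
interface EvalRequestDraft {
  _type: "ailf.evalRequest";
  status: "pending";
  scope: EvalScope;
  requestedAt: string;
}

// Builds the document a Studio action might write to the Content Lake.
function buildEvalRequest(
  scope: EvalScope,
  now: Date = new Date(),
): EvalRequestDraft {
  return {
    _type: "ailf.evalRequest",
    status: "pending",
    scope,
    requestedAt: now.toISOString(),
  };
}

const draft = buildEvalRequest({ kind: "task", taskId: "task-123" });
console.log(JSON.stringify(draft, null, 2));
```

In a Studio context, a payload like this would presumably be written with the Sanity client's `create` method, after which the server-side webhook described in the README takes over.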
package/dist/index.d.ts CHANGED
@@ -436,10 +436,24 @@ declare const reportSchema: {
  * - A gold-standard implementation (reference solution)
  * - When/how the task runs (execution controls)
  *
+ * ## Execution paths
+ *
+ * Published tasks are automatically discovered by the pipeline — no
+ * registration step needed. There are four ways to execute a task:
+ *
+ * 1. **Run Task Eval** — click ▶ on any ailf.task document in Studio.
+ * Creates an ailf.evalRequest scoped to this task. Webhook dispatches
+ * the pipeline; score appears inline when complete (~10–15 min).
+ * 2. **Run AI Eval** — click on a content release page. Auto-scopes to
+ * tasks whose canonical docs are in the release.
+ * 3. **CLI** — `ailf pipeline --task <id>` or `ailf pipeline --area <area>`.
+ * 4. **Scheduled** — GitHub Actions cron (daily baseline, weekly full).
+ *
  * Tasks can be authored natively in Studio or mirrored from external
  * repositories. Mirrored tasks have a read-only `origin` block that
  * tracks their source repo provenance.
  *
+ * @see docs/CONTRIBUTING_TASKS.md#running-your-task — full execution guide
  * @see docs/design-docs/tasks-as-content.md
  * @see docs/design-docs/tasks-as-content.md#decision-8-domain-specific-assertion-types-not-a-promptfoo-subset
  */
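The JSDoc above notes that the score appears inline once the pipeline finishes, and the README says the Studio component polls for the resulting report. The sketch below shows one generic way to poll, assuming a caller-supplied `fetchOnce` function that resolves to `null` until a report exists. `pollUntil`, `fetchOnce`, and the interval and attempt values are hypothetical; the plugin's real polling code is not part of this diff.

```ts
// Repeatedly calls `fetchOnce` until it yields a non-null value or the
// attempt budget runs out. Returns null when the budget is exhausted.
async function pollUntil<T>(
  fetchOnce: () => Promise<T | null>,
  opts: { intervalMs: number; maxAttempts: number },
): Promise<T | null> {
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    const result = await fetchOnce();
    if (result !== null) return result;
    // Wait between attempts, but not after the final one.
    if (attempt < opts.maxAttempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
    }
  }
  return null;
}

// Stand-in for a report query: no report for two calls, then a score.
let calls = 0;
const fakeFetch = async (): Promise<{ score: number } | null> =>
  ++calls < 3 ? null : { score: 87 };

pollUntil(fakeFetch, { intervalMs: 10, maxAttempts: 5 }).then((report) => {
  // A button label could be derived from the result in the same way.
  console.log(report ? `Score: ${report.score}` : "Timed out");
});
```

A bounded attempt count matters here because a run takes roughly 10 to 15 minutes according to the README, so a real interval would be far longer than the 10 ms used in this demo.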
package/dist/index.js CHANGED
@@ -2481,6 +2481,26 @@ var taskSchema = defineType5({
  }),
  // -----------------------------------------------------------------------
  // Execution controls
+ //
+ // Pipeline support status:
+ // - `enabled` → ✅ used in GROQ filter (execution.enabled != false)
+ // - `blocking` → 🔮 forward-looking — schema reserved, not yet enforced
+ // by the pipeline. PR check workflow reports scores but
+ // does not gate merges on this flag.
+ // - `trigger.branches`, `trigger.paths` → ✅ used by repo-based task
+ // threshold evaluator (RepoThresholdEvaluator)
+ // - `trigger.labels`, `trigger.schedule` → 🔮 forward-looking — schema
+ // reserved, not yet read by the pipeline GROQ query or
+ // mapped to TaskDefinition.
+ // - `threshold.score` → ✅ used by repo-based threshold evaluator
+ // - `threshold.dimensions` → 🔮 forward-looking — schema reserved, not
+ // yet read by the pipeline GROQ query or mapped to
+ // TaskDefinition.
+ //
+ // When implementing a forward-looking field, update the GROQ query in
+ // content-lake-task-source.ts, the ContentLakeTask interface, the
+ // mapToTaskDefinition() mapping, and (if pipeline-relevant) the
+ // TaskDefinition port type in @sanity/ailf-core.
  // -----------------------------------------------------------------------
  defineField5({
  description: "Controls when and how this task runs in the pipeline. Tasks without execution controls use defaults: enabled, non-blocking, no filters, no threshold.",
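The comment block above says `enabled` is applied through the filter `execution.enabled != false`, and the field description says tasks without execution controls default to enabled. The sketch below mirrors that behaviour in plain TypeScript, assuming a missing flag counts as enabled. `TaskLike` and `isEnabled` are hypothetical, trimmed-down stand-ins; the real `ContentLakeTask` interface is not shown in this diff.

```ts
// Hypothetical minimal task shape, for illustration only.
interface TaskLike {
  _id: string;
  execution?: { enabled?: boolean; blocking?: boolean };
}

// A task is skipped only when `enabled` is explicitly false; a missing
// `execution` block or a missing flag falls back to the enabled default.
function isEnabled(task: TaskLike): boolean {
  return task.execution?.enabled !== false;
}

const tasks: TaskLike[] = [
  { _id: "a" }, // no execution controls
  { _id: "b", execution: { enabled: true } },
  { _id: "c", execution: { enabled: false } },
  { _id: "d", execution: { blocking: true } }, // flag omitted
];

const runnable = tasks.filter(isEnabled).map((t) => t._id);
console.log(runnable); // ["a", "b", "d"]
```

Comparing against `false` rather than testing for `true` is what keeps tasks with no execution block in the run, which matches the stated defaults.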
package/package.json CHANGED
@@ -1,12 +1,12 @@
  {
  "name": "@sanity/ailf-studio",
- "version": "0.1.17",
+ "version": "0.1.19",
  "description": "AI Literacy Framework — Sanity Studio dashboard plugin",
  "type": "module",
  "license": "MIT",
  "private": false,
  "publishConfig": {
- "access": "restricted"
+ "access": "public"
  },
  "repository": {
  "type": "git",