@machinespirits/eval 0.2.0 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (74)
  1. package/README.md +91 -9
  2. package/config/eval-settings.yaml +3 -3
  3. package/config/paper-manifest.json +486 -0
  4. package/config/providers.yaml +9 -6
  5. package/config/tutor-agents.yaml +2261 -0
  6. package/content/README.md +23 -0
  7. package/content/courses/479/course.md +53 -0
  8. package/content/courses/479/lecture-1.md +361 -0
  9. package/content/courses/479/lecture-2.md +360 -0
  10. package/content/courses/479/lecture-3.md +655 -0
  11. package/content/courses/479/lecture-4.md +530 -0
  12. package/content/courses/479/lecture-5.md +326 -0
  13. package/content/courses/479/lecture-6.md +346 -0
  14. package/content/courses/479/lecture-7.md +326 -0
  15. package/content/courses/479/lecture-8.md +273 -0
  16. package/content/courses/479/roadmap-slides.md +656 -0
  17. package/content/manifest.yaml +8 -0
  18. package/docs/research/build.sh +44 -20
  19. package/docs/research/figures/figure10.png +0 -0
  20. package/docs/research/figures/figure11.png +0 -0
  21. package/docs/research/figures/figure3.png +0 -0
  22. package/docs/research/figures/figure4.png +0 -0
  23. package/docs/research/figures/figure5.png +0 -0
  24. package/docs/research/figures/figure6.png +0 -0
  25. package/docs/research/figures/figure7.png +0 -0
  26. package/docs/research/figures/figure8.png +0 -0
  27. package/docs/research/figures/figure9.png +0 -0
  28. package/docs/research/header.tex +23 -2
  29. package/docs/research/paper-full.md +941 -285
  30. package/docs/research/paper-short.md +216 -585
  31. package/docs/research/references.bib +132 -0
  32. package/docs/research/slides-header.tex +188 -0
  33. package/docs/research/slides-pptx.md +363 -0
  34. package/docs/research/slides.md +531 -0
  35. package/docs/research/style-reference-pptx.py +199 -0
  36. package/package.json +6 -5
  37. package/scripts/analyze-eval-results.js +69 -17
  38. package/scripts/analyze-mechanism-traces.js +763 -0
  39. package/scripts/analyze-modulation-learning.js +498 -0
  40. package/scripts/analyze-prosthesis.js +144 -0
  41. package/scripts/analyze-run.js +264 -79
  42. package/scripts/assess-transcripts.js +853 -0
  43. package/scripts/browse-transcripts.js +854 -0
  44. package/scripts/check-parse-failures.js +73 -0
  45. package/scripts/code-dialectical-modulation.js +1320 -0
  46. package/scripts/download-data.sh +55 -0
  47. package/scripts/eval-cli.js +106 -18
  48. package/scripts/generate-paper-figures.js +663 -0
  49. package/scripts/generate-paper-figures.py +577 -76
  50. package/scripts/generate-paper-tables.js +299 -0
  51. package/scripts/qualitative-analysis-ai.js +3 -3
  52. package/scripts/render-sequence-diagram.js +694 -0
  53. package/scripts/test-latency.js +210 -0
  54. package/scripts/test-rate-limit.js +95 -0
  55. package/scripts/test-token-budget.js +332 -0
  56. package/scripts/validate-paper-manifest.js +670 -0
  57. package/services/__tests__/evalConfigLoader.test.js +2 -2
  58. package/services/__tests__/learnerRubricEvaluator.test.js +361 -0
  59. package/services/__tests__/learnerTutorInteractionEngine.test.js +326 -0
  60. package/services/evaluationRunner.js +975 -98
  61. package/services/evaluationStore.js +12 -4
  62. package/services/learnerTutorInteractionEngine.js +27 -2
  63. package/services/mockProvider.js +133 -0
  64. package/services/promptRewriter.js +1471 -5
  65. package/services/rubricEvaluator.js +55 -2
  66. package/services/transcriptFormatter.js +675 -0
  67. package/docs/EVALUATION-VARIABLES.md +0 -589
  68. package/docs/REPLICATION-PLAN.md +0 -577
  69. package/scripts/analyze-run.mjs +0 -282
  70. package/scripts/compare-runs.js +0 -44
  71. package/scripts/compare-suggestions.js +0 -80
  72. package/scripts/dig-into-run.js +0 -158
  73. package/scripts/show-failed-suggestions.js +0 -64
  74. package/scripts/{check-run.mjs → check-run.js} +0 -0
@@ -0,0 +1,656 @@
+ ---
+ title: "Learning Features Roadmap"
+ subtitle: "Developing Philosophers & Critical AI Thinkers"
+ type: presentation
+ ---
+
+ # Machine Spirits
+ ## Learning Features Roadmap
+
+ **Developing Philosophers & Critical AI Thinkers**
+
+ *Synthesized Analysis - December 2024*
+
+ ---
+
+ ## The Challenge
+
+ Our courses teach **philosophically dense material**:
+ - Hegel's *Phenomenology of Spirit*
+ - Master-servant dialectic
+ - Recognition theory
+ - Critical AI perspectives
+
+ **But**: Can undergraduates engage with 19th-century German philosophy without scaffolding?
+
+ ```notes
+ The core challenge: we have sophisticated infrastructure (multi-agent tutoring, simulations, qualitative analysis) but it's not well-connected to the actual course content. Students can click through lectures without ever being genuinely challenged.
+ ```
+
+ ---
+
+ ## Current State
+
+ ### What We Have
+
+ | Feature | Status |
+ |---------|--------|
+ | Multi-agent tutor | Built, but generic |
+ | Simulations | Built, but hidden |
+ | Qualitative analysis | Built, but disconnected |
+ | Learning map | Built, but activity-focused |
+ | 14 activity types | Built, but buried in YAML |
+
+ ### The Gap
+
+ Infrastructure exists. **Content integration doesn't.**
+
+ ```notes
+ Both Claude and ChatGPT analyses identified the same core problem: we have the technical capability but haven't wired it to the actual learning experience. Features are buried rather than surfaced at moments of need.
+ ```
+
+ ---
+
+ ## Guiding Principles
+
+ 1. **The learner must struggle**
+    - Passive consumption ≠ philosophy
+
+ 2. **Content and infrastructure must connect**
+    - Features without grounding fail
+
+ 3. **Scaffolding, not simplification**
+    - Make difficulty navigable
+
+ 4. **From consumption to construction**
+    - Build arguments, don't just read them
+
+ 5. **Dialectical practice**
+    - Thesis → Antithesis → Synthesis
+
+ ```notes
+ These principles come from the course content itself. If we're teaching Hegelian philosophy, shouldn't our platform embody Hegelian pedagogy? The servant becomes educated through labor on difficult texts, not through passive reception.
+ ```
+
+ ---
+
+ ## Phase 0: Infrastructure Wiring
+ ### Weeks 1-2
+
+ **Before adding features, make existing systems coherent.**
+
+ - [ ] Align text analysis to Markdown sources
+ - [ ] Generate and integrate `tutor-content.json`
+ - [ ] Create shared content index
+ - [ ] Verify tutor references real lecture IDs
+ - [ ] Fix simulation-content mismatches
+
+ **Success**: All existing features work with current content.
+
+ ```notes
+ This is the pragmatic first step that ChatGPT emphasized. No point building new features if the pipeline is broken. Two weeks of pure infrastructure work pays dividends for everything that follows.
+ ```
+
+ ---
+
+ ## Phase 1: Comprehension Foundation
+ ### Weeks 3-6
+
+ ### Living Glossary
+
+ Every complex term becomes clickable:
+ - Plain English explanation
+ - Original philosophical definition
+ - Historical context
+ - "I still don't get it" → deeper explanation
+
+ ### Multi-Lens Summarization
+
+ - One-sentence summary
+ - Technical lens
+ - Philosophical lens
+ - "ELI5" lens
+ - Policy lens
+
+ ```notes
+ This is the lowest-friction way to make dense philosophy accessible. Every barrier to understanding terminology is a potential dropout moment. Multiple summary lenses accommodate different learner backgrounds.
+ ```
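The clickable-term flow could be sketched roughly as below. This is a minimal illustration, not the package's actual API: the `GLOSSARY` list, the `data-term` attribute, and the function name are all invented for the example, and a real implementation would pull terms from course config rather than a hard-coded array.

```javascript
// Hypothetical sketch: wrap the first occurrence of each known glossary term
// so the reader UI can attach a click handler that opens the explanation panel.
const GLOSSARY = ['sublation', 'recognition', 'dialectic']; // assumed term list

function markGlossaryTerms(markdown) {
  let out = markdown;
  for (const term of GLOSSARY) {
    // Whole-word, case-insensitive, first occurrence only to avoid clutter.
    const re = new RegExp(`\\b(${term})\\b`, 'i');
    out = out.replace(re, '<span data-term="$1">$1</span>');
  }
  return out;
}
```

A naive pass like this can double-wrap terms that appear inside another term's markup; a production version would operate on the parsed Markdown AST instead of raw text.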
+
+ ---
+
+ ## Phase 1 Continued
+
+ ### Reading Checkpoints
+
+ 1-2 micro-questions per section:
+ - Low stakes
+ - Immediate feedback
+ - "Review this section" if struggling
+
+ ### Lecture Summary Panel
+
+ Persistent sidebar showing:
+ - Key concepts
+ - Prerequisites (linked)
+ - Connections to other lectures
+ - Learning objectives
+
+ **Metrics**: Terms clicked: 5+ | Summary usage: 60%+
+
+ ```notes
+ Reading checkpoints are micro-assessments that help learners calibrate their understanding without high-stakes pressure. The summary panel keeps context visible as learners scroll through dense material.
+ ```
+
+ ---
+
+ ## Phase 2: Active Reading & Personas
+ ### Weeks 7-10
+
+ ### The Philosopher's Lens (NEW)
+
+ *From Gemini: "Transforming passive reading into active critical analysis."*
+
+ Toggle different **critical lenses** while reading:
+
+ | Lens | Highlights |
+ |------|------------|
+ | **Phenomenologist** | Recognition, breakdown moments |
+ | **Materialist** | Labor, power, infrastructure |
+ | **Techno-Optimist** | Capabilities, potential |
+ | **Skeptic** | Assumptions, evidence gaps |
+
+ The LLM re-analyzes the current paragraph from that perspective.
+
+ ```notes
+ This is a key Gemini contribution. Instead of just reading passively, students can actively switch perspectives and see how different philosophical traditions would interpret the same text. Temporary highlights with commentary tooltips appear.
+ ```
+
+ ---
+
+ ## Philosopher Personas & The Seminar
+
+ ### Personas in Chat
+
+ | Persona | Perspective |
+ |---------|-------------|
+ | **Hegel** | Dialectic, recognition, Spirit |
+ | **Marx** | Alienation, labor, ideology |
+ | **Freud** | Unconscious, ego/superego |
+ | **Turing** | Machine intelligence |
+ | **Claude** | AI self-reflection |
+
+ ### "The Seminar" (NEW from Gemini)
+
+ Multi-agent discussion: Student + 2-3 AI philosophers
+
+ > *Student reads "The Bitter Lesson"*
+ > *Agent A (Sutton):* Defends scaling
+ > *Agent B (Hegel):* Critiques lack of self-movement
+ > *Student:* Mediates or chooses a side
+
+ ```notes
+ The Seminar extends personas from one-on-one chat to group discussion. This simulates the experience of a real philosophical seminar where different thinkers engage with each other and the student must navigate competing perspectives.
+ ```
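One round of Seminar turn-taking might be orchestrated like the sketch below. Everything here is an assumption for illustration: the `seminarRound` name, the persona/turn shapes, and the prompt wording are invented, and a real version would send each prompt to the tutoring provider instead of pushing a placeholder response.

```javascript
// Each AI persona responds to the latest contribution; the student closes the round.
function seminarRound(personas, transcript) {
  const turns = [];
  for (const p of personas) {
    const lastSpeaker = transcript.length
      ? transcript[transcript.length - 1].speaker
      : 'moderator';
    turns.push({
      speaker: p.name,
      prompt: `As ${p.name} (${p.stance}), respond to ${lastSpeaker}.`,
    });
    // Placeholder: a real implementation would store the LLM's reply here.
    transcript.push({ speaker: p.name, text: '(response)' });
  }
  turns.push({ speaker: 'student', prompt: 'Mediate or choose a side.' });
  return turns;
}

const turns = seminarRound(
  [{ name: 'Sutton', stance: 'defends scaling' },
   { name: 'Hegel', stance: 'critiques lack of self-movement' }],
  []
);
```

The point of the structure is that each agent sees who spoke last, so disagreement compounds across the round rather than producing parallel monologues.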
+
+ ---
+
+ ## Phase 2 Continued
+
+ ### Roleplay Activities
+
+ Learner takes a role:
+ - Philosopher defending a position
+ - Policymaker evaluating AI regulation
+ - Educator designing curriculum
+ - Critic challenging an argument
+
+ ### "Teach Me" Mode
+
+ - Learner explains a concept
+ - AI diagnoses gaps
+ - Targeted follow-up
+ - Builds metacognition
+
+ **Metrics**: Persona switches: 2+ | Roleplay completed: 1/course
+
+ ```notes
+ "Teach me" mode flips the traditional dynamic. Instead of the AI explaining to the learner, the learner explains to the AI, which then identifies misconceptions. This is powerful for developing true understanding vs. surface familiarity.
+ ```
+
+ ---
+
+ ## Metacognitive Prompts (NEW from Gemini)
+
+ ### Reading Behavior Detection
+
+ The system observes:
+ - Scroll speed (rapid vs. careful)
+ - Time-on-section
+ - Re-reading patterns
+ - Selection and annotation behavior
+
+ ### Intelligent Interventions
+
+ When rapid scrolling is detected:
+ > *"You've covered a lot of ground. Can you explain sublation in your own words?"*
+
+ When a learner struggles with a section:
+ > *"This is dense material. Would you like to try a different lens?"*
+
+ Prompts based on behavior, not just content.
+
+ ```notes
+ Gemini's insight: Use reading behavior to trigger metacognitive interventions. If a student is flying through Hegel without pausing, that's a signal - they may not be engaging deeply. A well-timed "stop and reflect" prompt can transform passive scrolling into active learning.
+ ```
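The rapid-scrolling trigger could look something like the sketch below. The threshold constant, field names, and function name are all invented for illustration; real thresholds would be tuned empirically against observed reading data.

```javascript
// Flag sections whose dwell time is implausibly short for their length,
// as candidates for a metacognitive prompt.
const MIN_SECONDS_PER_100_WORDS = 15; // assumption; would be tuned empirically

function rapidSections(sections) {
  return sections
    .filter(s => s.secondsOnSection / (s.wordCount / 100) < MIN_SECONDS_PER_100_WORDS)
    .map(s => s.id);
}

const flagged = rapidSections([
  { id: 'lordship-bondage', wordCount: 800, secondsOnSection: 40 }, // 5 s/100w → flagged
  { id: 'sublation', wordCount: 400, secondsOnSection: 90 },        // 22.5 s/100w → ok
]);
// flagged === ['lordship-bondage']
```

Normalizing by word count matters: forty seconds is careful reading for a short section but skimming for a long one.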
+
+ ---
+
+ ## Phase 3: Simulation Integration
+ ### Weeks 11-16
+
+ ### Current Problem
+
+ Simulations exist (recognition, alienation, dialectic, emergence).
+
+ **But**:
+ - Hidden in Research Lab
+ - Not connected to reading
+ - No guided observation
+
+ ### Solution: Simulation Discovery
+
+ "See this in action" button on relevant paragraphs:
+ - Pre-set parameters matching concept
+ - Observation prompts tied to lecture
+ - Compare simulation to philosophical claim
+
+ ```notes
+ We have beautiful agent-based models of Hegelian concepts, but learners don't know they exist when they're struggling with the text. Surfacing simulations at the moment of relevance transforms them from hidden tools to learning catalysts.
+ ```
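A "See this in action" hook might be little more than a lookup from a lecture anchor to a simulation preset, roughly as below. The anchor format, field names, and parameter values are illustrative assumptions, not the package's actual schema.

```javascript
// Hypothetical hook registry: lecture anchor → simulation preset + observation prompt.
const simulationHooks = {
  'lecture-3#mutual-recognition': {
    simulation: 'recognition',
    params: { agents: 30, recognition_threshold: 0.6 }, // invented preset values
    observe: 'Watch which agents stabilize: does one-sided recognition persist?',
  },
};

function hookFor(anchor) {
  return simulationHooks[anchor] ?? null;
}
```

Keeping the mapping declarative means content authors can add hooks without touching simulation code, and the reader UI only renders the button when `hookFor` returns a preset.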
+
+ ---
+
+ ## Phase 3 Continued
+
+ ### Low-Code ABM Builder
+
+ Visual tool to create simulations:
+ 1. Choose template (recognition, alienation...)
+ 2. Map concepts to parameters
+ 3. Auto-generate hypothesis
+ 4. Observation checklist
+ 5. Save and share
+
+ ### Natural Language ABM (NEW from Gemini)
+
+ Describe a simulation in plain English:
+
+ > *"Show me agents that only learn if recognized by a high-status agent"*
+
+ The system generates simulation code with:
+ - `recognition_threshold` parameter
+ - `status_distribution` parameter
+
+ **Metrics**: Simulations from content: 3x | User-created ABMs: 10+
+
+ ```notes
+ Gemini proposed natural-language-to-code generation for simulations. This lowers the barrier even further - students don't need to understand YAML or parameters, they just describe what they want to see. Requires sandboxed JS execution.
+ ```
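The parameter surface a generated model might expose could look like the sketch below. The real pipeline would ask an LLM to emit the simulation code; here only the two parameter names come from the slide, while the defaults, the `scaffoldSimulation` name, and the update rule are invented for illustration.

```javascript
// Sketch of what generated simulation code might expose for the prompt
// "agents learn only if recognized by a high-status agent".
function scaffoldSimulation(description) {
  return {
    description,
    params: {
      recognition_threshold: 0.5,     // assumed: min status needed to confer learning
      status_distribution: 'uniform', // assumed: how status is assigned to agents
    },
    step(agents) {
      // An agent learns this tick only if some OTHER agent above the
      // status threshold is present to recognize it.
      for (const a of agents) {
        const recognizers = agents.filter(
          b => b !== a && b.status >= this.params.recognition_threshold
        );
        if (recognizers.length > 0) a.knowledge += 1;
      }
      return agents;
    },
  };
}

const sim = scaffoldSimulation('agents learn only if recognized by a high-status agent');
const agents = sim.step([
  { status: 0.9, knowledge: 0 }, // no high-status peer → does not learn
  { status: 0.1, knowledge: 0 }, // recognized by the 0.9 agent → learns
]);
```

Exposing named parameters is the key design choice: students can then ask "what if recognition were harder to get?" and nudge `recognition_threshold` without reading the generated code.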
+
+ ---
+
+ ## Phase 4: Dialectical Practice
+ ### Weeks 17-22
+
+ ### The Missing Antithesis
+
+ | Hegel's Dialectic | Current Platform | Needed |
+ |-------------------|------------------|--------|
+ | Thesis | Present info | Present info |
+ | Antithesis | ??? | **Challenge understanding** |
+ | Synthesis | ??? | **Support integration** |
+
+ We teach dialectics but don't practice them!
+
+ ```notes
+ There's an irony: we teach dialectical philosophy through a platform that doesn't embody dialectical learning. When a learner thinks they understand sublation, where is the challenge that tests and refines that understanding?
+ ```
+
+ ---
+
+ ## Dialectical Challenge System
+
+ After reading, a structured challenge:
+
+ 1. **Thesis**: "What is Hegel's main claim about recognition?"
+
+ 2. **Evidence**: "Find 2-3 supporting quotes"
+
+ 3. **Antithesis**: AI presents a counterargument
+
+ 4. **Defense**: Learner responds
+
+ 5. **Synthesis**: AI helps integrate understanding
+
+ **AI evaluates engagement quality, not correctness.**
+
+ ```notes
+ This is the core of philosophical practice - not just comprehending arguments but constructing, defending, and refining them. The dialectical challenge makes visible the invisible skill of philosophical thinking.
+ ```
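The five steps form a simple linear state machine, which could be sketched as below. The stage names come from the slide; the `createChallenge`/`advance` API is an assumption, and a real version would have the AI judge engagement quality before allowing a transition.

```javascript
// Minimal state machine for the five-stage dialectical challenge.
const STAGES = ['thesis', 'evidence', 'antithesis', 'defense', 'synthesis'];

function createChallenge() {
  let i = 0;
  return {
    current: () => STAGES[i],
    advance() {
      // Assumption: a production version gates this on an AI assessment of
      // the learner's engagement, not on answer correctness.
      if (i < STAGES.length - 1) i += 1;
      return STAGES[i];
    },
  };
}

const c = createChallenge();
c.advance(); // → 'evidence'
c.advance(); // → 'antithesis'
```

Making the stages explicit also gives the analytics layer a natural event to count for the "70%+ challenges completed" metric: reaching `synthesis`.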
+
+ ---
+
+ ## Visual Argument Builder
+
+ ```
+ [CLAIM] ────────────────────────────────
+
+ ├── [EVIDENCE 1] ── [WARRANT]
+ │      └── Source: Lecture 3
+
+ ├── [EVIDENCE 2] ── [WARRANT]
+
+ └── [COUNTERARGUMENT] ── [REBUTTAL]
+ ```
+
+ - **Drag evidence from ArticleReader** (NEW from Gemini)
+ - Link to lecture sources
+ - Export as essay outline
+
+ ### AI Critic (NEW from Gemini)
+
+ An agent that specifically attacks:
+ - Weak warrants
+ - Missing counterarguments
+ - Evidence gaps
+ - Logical incoherence
+
+ **Metrics**: Challenges completed: 70%+ | Arguments built: 3+
+
+ ```notes
+ Gemini added two key UX details: evidence dragging from the reader, and an AI Critic that specifically targets weaknesses. The Critic isn't just giving feedback - it's adversarially attacking the argument to make it stronger.
+ ```
+
+ ---
+
+ ## Phase 5: Thematic Analysis
+ ### Weeks 23-28
+
+ ### Lecture-First Analysis
+
+ One-click "Analyze this lecture":
+ - Generate themes and codes
+ - Accept/reject suggestions
+ - Train personal theme vocabulary
+ - Compare to course themes
+
+ ### "My Themes" Sidebar (NEW from Gemini)
+
+ Shows how current reading connects to:
+ - Student's ongoing research questions
+ - Theme relevance scores
+ - Related annotations
+
+ ### Auto-Link with Vector Search (NEW from Gemini)
+
+ - Flag paragraphs matching student's themes
+ - Match **even without keyword overlap**
+ - Background semantic matching
+
+ ```notes
+ Gemini's key insight: connect the research dashboard to the reading experience. As students highlight text, the system auto-suggests codes from their existing taxonomy. Vector search finds conceptually related content even when exact words differ.
+ ```
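The semantic match behind auto-linking is essentially a cosine-similarity comparison between embeddings, as sketched below. The vectors here are hand-written stand-ins and the 0.8 threshold is an invented default; a real system would obtain embeddings from an embedding model and tune the cutoff.

```javascript
// Standard cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Flag themes whose embedding sits close to the paragraph's embedding,
// even with zero keyword overlap. Threshold is an assumption.
function matchThemes(paragraphVec, themes, threshold = 0.8) {
  return themes
    .filter(t => cosine(paragraphVec, t.vec) >= threshold)
    .map(t => t.name);
}
```

Because the comparison happens in embedding space, a paragraph about "unacknowledged labor" can match a student's "alienation" theme with no shared vocabulary, which is exactly the "match even without keyword overlap" behavior the slide describes.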
+
+ ---
+
+ ## Phase 6: AI System Analysis Lab
+ ### Weeks 29-34
+
+ ### Apply Philosophy to Real AI
+
+ 1. **Select System**: ChatGPT, Claude, DALL-E...
+
+ 2. **Select Framework**: Recognition, alienation, phenomenology
+
+ 3. **Guided Analysis**: Prompts for framework application
+
+ 4. **Collect Evidence**: Screenshots, transcripts
+
+ 5. **Synthesize**: AI-assisted writeup
+
+ 6. **Peer Review**: Share with classmates
+
+ **Metrics**: Analyses completed: 1/learner | Frameworks used: 2+
+
+ ```notes
+ This is where theory meets practice. Students aren't just learning about Hegelian recognition in the abstract - they're applying it to analyze actual AI systems they use every day. This creates genuine critical AI thinkers.
+ ```
+
+ ---
+
+ ## Phase 7: Mastery Overhaul
+ ### Weeks 35-42
+
+ ### Problem with Current Progress
+
+ The learning map shows:
+ - Activities completed ✓
+ - Lectures opened ✓
+
+ The learning map doesn't show:
+ - Concept understanding ✗
+ - Skill development ✗
+ - Epistemic growth ✗
+
+ **Completion ≠ Comprehension**
+
+ ```notes
+ Our current progress tracking is essentially a checklist. You can "complete" a course by clicking through everything without understanding anything. We need to track what actually matters.
+ ```
+
+ ---
+
+ ## Concept Mastery System
+
+ ### Mastery Levels
+ Exposed → Developing → Proficient → Mastered
+
+ ### Spaced Repetition
+ - Concepts decay without review
+ - System prompts re-engagement
+ - Connection challenges
+
+ ### Progress Beyond Completion
+ - Reading depth (not just opens)
+ - Concept confidence
+ - Explanation quality
+ - Argument construction
+
+ **Metrics**: Concepts at "Proficient": 60%+
+
+ ```notes
+ Spaced repetition is well-established for factual knowledge, but we're applying it to philosophical concepts. "Sublation" fades if not revisited and applied. The system should remind learners to re-engage with decaying concepts.
+ ```
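The decay-and-review mechanic could be sketched with a simple exponential half-life model, as below. The 14-day half-life, the 0.5 review threshold, and the function names are invented numbers for illustration; real values would be calibrated per concept and per learner.

```javascript
// Concept confidence decays exponentially since the last review.
const HALF_LIFE_DAYS = 14; // assumption; would be calibrated empirically

function confidence(initial, daysSinceReview) {
  return initial * Math.pow(0.5, daysSinceReview / HALF_LIFE_DAYS);
}

// Concepts whose decayed confidence drops below the threshold get a
// re-engagement prompt ("Can you still explain sublation?").
function dueForReview(concepts, threshold = 0.5) {
  return concepts
    .filter(c => confidence(c.confidence, c.daysSinceReview) < threshold)
    .map(c => c.name);
}

const due = dueForReview([
  { name: 'sublation', confidence: 0.9, daysSinceReview: 28 },  // 0.9 × 0.25 = 0.225 → due
  { name: 'recognition', confidence: 0.9, daysSinceReview: 0 }, // 0.9 → not due
]);
// due === ['sublation']
```

The same decayed score could also drive the mastery ladder: a concept only stays "Proficient" while its reviewed confidence holds above the band's threshold.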
+
+ ---
+
+ ## "Grand Narrative" View (NEW from Gemini)
+
+ ### Beyond Course Progress
+
+ The Learning Map currently shows:
+ - Lectures completed
+ - Activities finished
+ - Nodes unlocked
+
+ ### The Student's Intellectual Journey
+
+ A new map mode showing:
+ - **Evolution of ideas** - How the student's thinking has changed
+ - **Position shifts** - Tracked epistemic stances over time
+ - **Key insights** - Breakthrough moments captured
+ - **Personal themes** - What the learner cares about
+
+ *Not just "what did I click" but "how have I grown?"*
+
+ ```notes
+ Gemini's insight: "A version of the Learning Map that visualizes the evolution of the student's own ideas, not just course progress." This transforms progress visualization from a checklist into a narrative of intellectual development.
+ ```
+
+ ---
+
+ ## Phase 8: Community
+ ### Weeks 43-52
+
+ ### Philosophy Thrives on Dialogue
+
+ **Collaborative Wiki**
+ - Student explanations
+ - Worked examples
+ - Peer curation
+ - Archive across cohorts
+
+ **Dialogue Simulator**
+ - Defend a thesis against an AI examiner
+ - Historical scenarios (Turing vs Jefferson)
+ - Peer spectator mode
+
+ **Study Groups**
+ - Complementary strengths
+ - Debate pairing
+
+ ```notes
+ The Western philosophical tradition is fundamentally dialogical - Socrates to Plato, Hegel responding to Kant. Our platform should facilitate genuine intellectual exchange, not just individual consumption.
+ ```
+
+ ---
+
+ ## Timeline Overview
+
+ | Phase | Weeks | Focus |
+ |-------|-------|-------|
+ | 0 | 1-2 | Infrastructure wiring |
+ | 1 | 3-6 | Comprehension foundation |
+ | 2 | 7-10 | Philosopher personas |
+ | 3 | 11-16 | Simulation integration |
+ | 4 | 17-22 | Dialectical practice |
+ | 5 | 23-28 | Thematic analysis |
+ | 6 | 29-34 | AI analysis lab |
+ | 7 | 35-42 | Mastery overhaul |
+ | 8 | 43-52 | Community features |
+
+ ```notes
+ This is roughly a year-long roadmap. Each phase builds on the previous. We start with infrastructure (necessary but not sufficient), build comprehension tools, then progressively enable deeper philosophical practice.
+ ```
+
+ ---
+
+ ## Key Metrics
+
+ | What | Target |
+ |------|--------|
+ | Terms clicked/session | 5+ |
+ | Summary usage | 60%+ learners |
+ | Persona switches | 2+ per session |
+ | Simulations from content | 3x baseline |
+ | Arguments constructed | 3+ per learner |
+ | Dialectical challenges | 70%+ completed |
+ | Concepts "Proficient" | 60%+ |
+ | Course completion | +10% vs baseline |
+ | Satisfaction | 4.2/5.0 |
+ | AI cost/learner/week | <$5 |
+
+ ```notes
+ Metrics help us know if features are working. But note the balance: engagement metrics (clicks, usage) AND outcome metrics (completion, satisfaction). We care about both activity and results.
+ ```
+
+ ---
+
+ ## Risks
+
+ | Risk | Mitigation |
+ |------|------------|
+ | LLM cost explosion | Caching, model tiering |
+ | Content pipeline breaks | Phase 0 fixes, testing |
+ | Feature overwhelm | Progressive disclosure |
+ | Low feature adoption | In-context suggestions |
+ | AI feedback quality | Human review loops |
+
+ ```notes
+ The biggest risk is probably feature overwhelm. We're proposing many new capabilities. Without careful curation and progressive disclosure, learners may feel lost rather than supported. Each feature needs to surface at the right moment.
+ ```
+
+ ---
+
+ ## Immediate Next Steps
+ ### Week 1
+
+ 1. **Fix content pipeline**
+    - Align text analysis to Markdown
+
+ 2. **Generate tutor content**
+    - Run and integrate `tutor-content.json`
+
+ 3. **Deploy glossary MVP**
+    - Term detection + basic explanations
+
+ 4. **Pilot one persona**
+    - "Ask Hegel" in chat for EPOL 479
+
+ 5. **Add one simulation hook**
+    - Recognition simulation in Lecture 3
+
+ ```notes
+ Start small, prove value, then expand. Week 1 should produce visible improvements that learners can experience immediately. The glossary and first persona are high-impact, low-risk starting points.
+ ```
+
+ ---
+
+ ## The Ultimate Goal
+
+ Our learners should finish these courses:
+
+ - **Constructing** arguments, not just consuming them
+
+ - **Challenging** AI systems, not just using them
+
+ - **Recognizing** their own assumptions
+
+ - **Entering** philosophical conversations, not just observing
+
+ The platform's job is to create the conditions for this transformation.
+
+ ```notes
+ This is what it means to develop philosophers and critical AI thinkers. Not just knowledge transfer, but capability development. Not just completion, but transformation. The features are means to this end.
+ ```
+
+ ---
+
+ ## What Would Hegel Say?
+
+ > "The individual who has not risked his life may admittedly be recognized as a *person*, but he has not achieved the truth of being recognized as a self-sufficient self-consciousness."
+
+ The learner must **struggle**.
+ The platform must **scaffold that struggle**.
+
+ Not eliminate difficulty—
+ make it **navigable**.
+
+ ```notes
+ Returning to the course content itself for final guidance. Hegel's insight is that genuine development requires genuine challenge. Our job isn't to make philosophy easy - it's to make the difficulty productive rather than destructive.
+ ```
+
+ ---
+
+ ## Questions?
+
+ **Full Documents**:
+ - `LEARNING_FEATURES_ANALYSIS.md` (Claude)
+ - `plan-gpt.md` (ChatGPT)
+ - `ROADMAP_SYNTHESIZED.md` (Combined)
+
+ **Next Review**: Implementation priorities for Phase 0-1
+
+ ```notes
+ Three documents capture the full analysis. The synthesized roadmap combines Claude's philosophical depth with ChatGPT's operational pragmatism. Ready to discuss priorities and begin implementation.
+ ```
@@ -0,0 +1,8 @@
+ # Eval Content Package (minimal)
+ # Read-only subset of @machinespirits/content-philosophy for evaluation.
+
+ name: "@machinespirits/eval-content"
+ version: "1.0.0"
+
+ content:
+   courses: "./courses"