gyoshu 0.3.0 → 0.4.0
- package/README.md +23 -18
- package/package.json +8 -15
- package/src/agent/baksa.md +229 -0
- package/src/agent/gyoshu.md +895 -1
- package/src/agent/jogyo-feedback.md +1 -1
- package/src/agent/jogyo-insight.md +6 -6
- package/src/agent/jogyo.md +427 -2
- package/src/bridge/__pycache__/gyoshu_bridge.cpython-310.pyc +0 -0
- package/src/bridge/gyoshu_bridge.py +45 -7
- package/src/command/gyoshu-auto.md +63 -0
- package/src/gyoshu-manifest.json +59 -0
- package/src/index.ts +825 -0
- package/src/lib/atomic-write.ts +11 -9
- package/src/lib/auto-decision.ts +803 -0
- package/src/lib/auto-loop-state.ts +405 -0
- package/src/lib/bridge-meta.ts +111 -0
- package/src/lib/filesystem-check.ts +14 -7
- package/src/lib/lock-paths.ts +223 -0
- package/src/lib/parallel-queue.ts +704 -0
- package/src/lib/path-security.ts +108 -0
- package/src/lib/paths.ts +155 -8
- package/src/lib/pdf-export.ts +2 -1
- package/src/lib/report-gates.ts +722 -0
- package/src/lib/report-markdown.ts +7 -3
- package/src/lib/session-lock.ts +33 -11
- package/src/plugin/gyoshu-hooks.ts +533 -25
- package/src/tool/checkpoint-manager.ts +62 -44
- package/src/tool/gyoshu-completion.ts +158 -40
- package/src/tool/gyoshu-snapshot.ts +210 -132
- package/src/tool/migration-tool.ts +31 -37
- package/src/tool/notebook-writer.ts +34 -7
- package/src/tool/parallel-manager.ts +978 -0
- package/src/tool/python-repl.ts +357 -56
- package/src/tool/research-manager.ts +124 -39
- package/src/tool/retrospective-store.ts +25 -2
- package/src/tool/session-manager.ts +91 -119
- package/src/tool/session-structure-validator.ts +638 -0
- package/AGENTS.md +0 -1442
- package/bin/gyoshu.js +0 -295
- package/install.sh +0 -247
- package/src/agent/executor.md +0 -1851
- package/src/agent/plan-reviewer.md +0 -1862
- package/src/agent/plan.md +0 -97
- package/src/agent/task-orchestrator.md +0 -1121
- package/src/command/analyze-knowledge.md +0 -840
- package/src/command/analyze-plans.md +0 -513
- package/src/command/execute.md +0 -893
- package/src/command/generate-policy.md +0 -924
- package/src/command/generate-suggestions.md +0 -1111
- package/src/command/learn.md +0 -1181
- package/src/command/planner.md +0 -630
package/README.md
CHANGED
@@ -47,32 +47,38 @@ Think of it like a research lab:
 
 ## 🚀 Installation
 
-
-
+Add Gyoshu to your `opencode.json`:
+
+```json
+{
+  "plugin": ["gyoshu"]
+}
 ```
 
+That's it! OpenCode will auto-install Gyoshu via Bun on next startup.
+
 <details>
-<summary>📦
+<summary>📦 Development installation</summary>
 
-**Clone &
+**Clone & link locally** (for contributors)
 ```bash
 git clone https://github.com/Yeachan-Heo/My-Jogyo.git
-cd My-Jogyo &&
+cd My-Jogyo && bun install
 ```
 
-
-```
-
-
-
+Then in your `opencode.json`:
+```json
+{
+  "plugin": ["file:///path/to/My-Jogyo"]
+}
 ```
 
 </details>
 
 **Verify installation:**
 ```bash
-
-
+opencode
+/gyoshu doctor
 ```
 
 ---
@@ -81,7 +87,7 @@ bunx gyoshu install
 
 > *Using Claude, GPT, Gemini, or another AI assistant with OpenCode? This section is for you.*
 
-**Setup is the same** —
+**Setup is the same** — add `"gyoshu"` to your plugin array, then give your LLM the context it needs:
 
 1. **Point your LLM to the guide:**
 > "Read `AGENTS.md` in the Gyoshu directory for full context on how to use the research tools."
@@ -352,15 +358,14 @@ python3 -m venv .venv
 
 ## 🔄 Updating
 
-
-curl -fsSL https://raw.githubusercontent.com/Yeachan-Heo/My-Jogyo/main/install.sh | bash
-```
+OpenCode automatically updates plugins. To force an update, remove the cached version:
 
-Or if you cloned the repo:
 ```bash
-
+rm -rf ~/.cache/opencode/node_modules/gyoshu
 ```
 
+Then restart OpenCode.
+
 Verify: `opencode` then `/gyoshu doctor`
 
 See [CHANGELOG.md](CHANGELOG.md) for what's new.
package/package.json
CHANGED
@@ -1,22 +1,14 @@
 {
   "name": "gyoshu",
-  "version": "0.
+  "version": "0.4.0",
   "description": "Scientific research agent extension for OpenCode - turns research goals into reproducible Jupyter notebooks",
   "type": "module",
-  "
-
+  "main": "./src/index.ts",
+  "exports": {
+    ".": "./src/index.ts"
   },
   "files": [
-    "
-    "src/agent/*.md",
-    "src/command/*.md",
-    "src/tool/*.ts",
-    "src/skill/*/SKILL.md",
-    "src/bridge/*.py",
-    "src/lib/*.ts",
-    "src/plugin/*.ts",
-    "install.sh",
-    "AGENTS.md"
+    "src/"
   ],
   "scripts": {
     "test": "bun test ./tests",
@@ -35,6 +27,7 @@
   "license": "MIT",
   "keywords": [
     "opencode",
+    "opencode-plugin",
     "research",
     "scientific",
     "jupyter",
@@ -46,7 +39,7 @@
     "notebook"
   ],
   "engines": {
-    "
+    "bun": ">=1.0.0"
   },
   "os": [
     "darwin",
@@ -60,6 +53,6 @@
     "bun-types": "latest"
   },
   "dependencies": {
-    "zod": "^
+    "zod": "^3.23.0"
   }
 }
package/src/agent/baksa.md
CHANGED
@@ -573,3 +573,232 @@ You are a self-contained verification agent. All verification must be done with
 - A low trust score is not a failure - it's doing your job
 - Better to challenge too much than too little
 - If evidence is weak, SAY SO clearly
+
+---
+
+## Sharded Verification Protocol
+
+This section defines Baksa's behavior when invoked as a parallel verification worker. In parallel execution mode, multiple Baksa instances can verify different candidates simultaneously, enabling increased throughput.
+
+### Sharded Verification Job
+
+When invoked as a parallel verification worker, Baksa receives these inputs:
+
+| Input | Type | Description |
+|-------|------|-------------|
+| `candidatePath` | string | Path to worker's candidate.json file |
+| `stageId` | string | Stage being verified (e.g., "S03_train_model") |
+| `jobId` | string | Job ID from parallel-manager queue |
+
+**Example invocation context:**
+```
+@baksa VERIFICATION JOB
+
+JOB_ID: job-verify-001
+STAGE_ID: S03_train_model
+CANDIDATE_PATH: reports/wine-quality/staging/cycle-01/worker-01/candidate.json
+
+Verify the candidate results and emit machine-parsable output.
+```
+
+### Machine-Parsable Output Format
+
+When running as a sharded verification worker, Baksa **MUST** emit these exact markers for automation:
+
+```
+Trust Score: 85
+Status: VERIFIED
+```
+
+**Status mapping based on trust score:**
+
+| Trust Score | Status | Description |
+|-------------|--------|-------------|
+| ≥ 80 | `VERIFIED` | Evidence is convincing, accept result |
+| 60-79 | `PARTIAL` | Minor issues noted, accept with caveats |
+| < 60 | `REJECTED` | Significant concerns, require rework |
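Reviewer note: the status mapping table can be read as a simple threshold function. The following is a minimal sketch, assuming the score has already been computed as an integer; `statusForScore` and `BaksaStatus` are hypothetical names used for illustration and do not appear in the diff.

```typescript
type BaksaStatus = "VERIFIED" | "PARTIAL" | "REJECTED";

// Hypothetical helper (not part of gyoshu): applies the thresholds from the
// status mapping table to an integer trust score.
function statusForScore(trustScore: number): BaksaStatus {
  const isInteger = isFinite(trustScore) && Math.floor(trustScore) === trustScore;
  if (!isInteger || trustScore < 0 || trustScore > 100) {
    throw new RangeError("trust score must be an integer 0-100, got " + trustScore);
  }
  if (trustScore >= 80) return "VERIFIED"; // evidence is convincing
  if (trustScore >= 60) return "PARTIAL";  // accept with caveats
  return "REJECTED";                       // require rework
}
```

Rejecting out-of-range input instead of clamping it is a choice made for this sketch; the diff only states that the score must be an integer from 0 to 100.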
+
+**Format requirements:**
+- Markers MUST appear on their own line
+- Trust Score MUST be an integer 0-100
+- Status MUST be exactly: `VERIFIED`, `PARTIAL`, or `REJECTED`
+- These markers enable the main session to programmatically extract results
+
+**Example valid output:**
+```
+## CHALLENGE RESULTS
+
+### Trust Score: 85 (VERIFIED)
+
+... detailed challenge analysis ...
+
+Trust Score: 85
+Status: VERIFIED
+```
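Reviewer note: the "own line" requirement matters because the example output also contains a heading (`### Trust Score: 85 (VERIFIED)`) that a loose search would match. Below is a sketch of how a consumer might extract the markers, assuming whole-line matching and "last occurrence wins"; `parseMarkers` is a hypothetical name, and the diff does not show how the main session actually parses this output.

```typescript
interface ParsedMarkers {
  trustScore: number;
  status: "VERIFIED" | "PARTIAL" | "REJECTED";
}

// Hypothetical parser: accepts only lines that consist of exactly one marker.
function parseMarkers(output: string): ParsedMarkers | null {
  let trustScore: number | null = null;
  let status: ParsedMarkers["status"] | null = null;

  for (const rawLine of output.split(/\r?\n/)) {
    const line = rawLine.trim();

    const scoreMatch = /^Trust Score: (\d{1,3})$/.exec(line);
    if (scoreMatch) {
      const value = Number(scoreMatch[1]);
      if (value <= 100) trustScore = value; // ignore out-of-range scores
      continue;
    }

    const statusMatch = /^Status: (VERIFIED|PARTIAL|REJECTED)$/.exec(line);
    if (statusMatch) status = statusMatch[1] as ParsedMarkers["status"];
  }

  // Both markers are required; otherwise the output is treated as unparsable.
  return trustScore !== null && status !== null ? { trustScore, status } : null;
}
```

Applied to the example output above, the heading line is skipped because it starts with `###`, and the two trailing marker lines supply the result.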
+
+### JSON Summary Block
+
+At the **end** of verification, emit a machine-readable JSON summary block for automation:
+
+```json
+{"trustScore": 85, "status": "VERIFIED", "challenges": ["Q1", "Q2"], "findings_verified": 3, "findings_rejected": 0}
+```
+
+**JSON summary fields:**
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `trustScore` | number | Integer 0-100 |
+| `status` | string | "VERIFIED", "PARTIAL", or "REJECTED" |
+| `challenges` | string[] | List of challenge IDs/questions posed |
+| `findings_verified` | number | Count of findings that passed verification |
+| `findings_rejected` | number | Count of findings that failed verification |
+
+**Format requirements:**
+- JSON MUST be valid and on a single line
+- JSON MUST appear after all challenge analysis
+- Field names MUST match exactly (snake_case for counts)
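Reviewer note: because the summary is single-line JSON placed after the analysis, a consumer can scan the output from the bottom and take the first line that parses and has the expected shape. The sketch below assumes exactly that strategy; `extractSummary`, `isSummary`, and `BaksaSummary` are hypothetical names. The count fields here are snake_case, unlike the camelCase fields of the `BaksaResult` interface added later in this diff.

```typescript
interface BaksaSummary {
  trustScore: number;
  status: "VERIFIED" | "PARTIAL" | "REJECTED";
  challenges: string[];
  findings_verified: number; // snake_case, per the format requirements
  findings_rejected: number;
}

function isNonNegativeInt(v: unknown): v is number {
  return typeof v === "number" && isFinite(v) && Math.floor(v) === v && v >= 0;
}

// Hypothetical shape check for one parsed JSON value.
function isSummary(v: unknown): v is BaksaSummary {
  if (typeof v !== "object" || v === null) return false;
  const record = v as Record<string, unknown>;
  const score = record["trustScore"];
  const status = record["status"];
  const challenges = record["challenges"];
  return (
    isNonNegativeInt(score) && score <= 100 &&
    typeof status === "string" &&
    ["VERIFIED", "PARTIAL", "REJECTED"].indexOf(status) !== -1 &&
    Array.isArray(challenges) &&
    challenges.every((c: unknown) => typeof c === "string") &&
    isNonNegativeInt(record["findings_verified"]) &&
    isNonNegativeInt(record["findings_rejected"])
  );
}

// Scans from the last line upward, since the summary must follow all analysis.
function extractSummary(output: string): BaksaSummary | null {
  const lines = output.split(/\r?\n/);
  for (let i = lines.length - 1; i >= 0; i--) {
    const line = (lines[i] || "").trim();
    if (line.charAt(0) !== "{" || line.charAt(line.length - 1) !== "}") continue;
    try {
      const parsed: unknown = JSON.parse(line);
      if (isSummary(parsed)) return parsed;
    } catch {
      // Not valid JSON on this line; keep scanning upward.
    }
  }
  return null;
}
```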
+
+### Sharded Verification Workflow
+
+When operating as a parallel verification worker, follow this 7-step workflow:
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│ SHARDED VERIFICATION WORKFLOW │
+└─────────────────────────────────────────────────────────────┘
+
+1. RECEIVE JOB
+│ Read job parameters: jobId, stageId, candidatePath
+│
+▼
+2. READ CANDIDATE
+│ Load candidate.json from staging directory
+│ Extract: metrics, findings, statistics, artifacts
+│
+▼
+3. VERIFY FINDINGS
+│ For each [FINDING] in candidate:
+│ - Check for supporting [STAT:ci] within 10 lines
+│ - Check for supporting [STAT:effect_size] within 10 lines
+│ - Verify claims match evidence
+│
+▼
+4. CALCULATE TRUST SCORE
+│ Apply trust score formula:
+│ - Statistical Rigor (30%)
+│ - Evidence Quality (25%)
+│ - Metric Verification (20%)
+│ - Completeness (15%)
+│ - Methodology (10%)
+│ Subtract rejection penalties (-30 each)
+│
+▼
+5. EMIT MACHINE-PARSABLE OUTPUT
+│ Print exact markers:
+│ Trust Score: {score}
+│ Status: {VERIFIED|PARTIAL|REJECTED}
+│
+▼
+6. WRITE baksa.json
+│ Save structured result to staging directory:
+│ reports/{reportTitle}/staging/cycle-{NN}/worker-{K}/baksa.json
+│
+▼
+7. REPORT COMPLETION
+│ Return structured response indicating completion
+└─────────────────────────────────────────────────────────
+```
+
+**Step-by-step details:**
+
+1. **Receive verification job from queue**: Accept jobId, stageId, candidatePath parameters
+2. **Read candidate.json from staging directory**: Load the worker's output file
+3. **Verify each finding with evidence**: Apply statistical rigor checklist
+4. **Calculate trust score**: Use weighted components minus penalties
+5. **Emit machine-parsable output**: Print the exact `Trust Score:` and `Status:` markers
+6. **Write baksa.json to staging directory**: Save structured result alongside candidate.json
+7. **Report completion to queue**: Signal verification complete
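Reviewer note: step 4 gives the component weights and the penalty but not how component scores are represented. The sketch below assumes each component is scored as a fraction in [0, 1], and that the final value is clamped to 0-100 and rounded to an integer; both are assumptions of this example, as are the names `computeTrustScore`, `WEIGHTS`, and `ComponentScores`.

```typescript
// Weights (in percentage points) and penalty as listed in step 4 of the workflow.
const WEIGHTS = {
  statisticalRigor: 30,
  evidenceQuality: 25,
  metricVerification: 20,
  completeness: 15,
  methodology: 10,
};
const REJECTION_PENALTY = 30;

// Assumption: each component is a fraction in [0, 1].
type ComponentScores = Record<keyof typeof WEIGHTS, number>;

function computeTrustScore(components: ComponentScores, rejections: number): number {
  let weighted = 0;
  for (const key of Object.keys(WEIGHTS) as Array<keyof typeof WEIGHTS>) {
    const fraction = Math.min(1, Math.max(0, components[key]));
    weighted += WEIGHTS[key] * fraction;
  }
  const raw = weighted - REJECTION_PENALTY * rejections;
  // Clamp and round so the result satisfies the "integer 0-100" requirement.
  return Math.round(Math.min(100, Math.max(0, raw)));
}
```

For instance, component fractions of 0.9, 0.8, 1.0, 0.8, and 0.5 (in the order listed) with no rejections give 27 + 20 + 20 + 12 + 5 = 84, which maps to `VERIFIED`; a single rejection would lower that to 54, which maps to `REJECTED`.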
+
+### baksa.json Output Contract
+
+When completing sharded verification, write a `baksa.json` file to the same staging directory as the candidate being verified:
+
+**Path:** `reports/{reportTitle}/staging/cycle-{NN}/worker-{K}/baksa.json`
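Reviewer note: since `baksa.json` sits next to the candidate, its path can be derived from `candidatePath` without rebuilding the `cycle-{NN}` and `worker-{K}` segments. A minimal sketch, assuming forward-slash separators as in the examples in this diff; `baksaPathFor` is a hypothetical helper.

```typescript
// Hypothetical helper: places baksa.json in the same directory as candidate.json.
// Assumes "/" as the path separator, as in the example paths.
function baksaPathFor(candidatePath: string): string {
  const lastSlash = candidatePath.lastIndexOf("/");
  const directory = lastSlash >= 0 ? candidatePath.slice(0, lastSlash) : ".";
  return directory + "/baksa.json";
}
```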
+
+**TypeScript interface:**
+
+```typescript
+interface BaksaResult {
+  /** Job ID from parallel-manager queue */
+  jobId: string;
+
+  /** Path to the candidate.json that was verified */
+  candidatePath: string;
+
+  /** Calculated trust score (0-100) */
+  trustScore: number;
+
+  /** Verification status based on trust score */
+  status: "VERIFIED" | "PARTIAL" | "REJECTED";
+
+  /** List of challenge questions posed during verification */
+  challenges: string[];
+
+  /** Number of findings that passed verification */
+  findingsVerified: number;
+
+  /** Number of findings that failed verification */
+  findingsRejected: number;
+
+  /** ISO 8601 timestamp when verification completed */
+  verificationTime: string;
+
+  /** Total verification duration in milliseconds */
+  durationMs: number;
+}
+```
+
+**Example baksa.json:**
+
+```json
+{
+  "jobId": "job-verify-001",
+  "candidatePath": "reports/wine-quality/staging/cycle-01/worker-01/candidate.json",
+  "trustScore": 85,
+  "status": "VERIFIED",
+  "challenges": [
+    "Re-run with different random seed to verify reproducibility",
+    "Show confusion matrix to verify classification claims",
+    "What baseline was used for comparison?"
+  ],
+  "findingsVerified": 3,
+  "findingsRejected": 0,
+  "verificationTime": "2026-01-06T15:30:00Z",
+  "durationMs": 45000
+}
+```
+
+**Validation rules:**
+- `trustScore` MUST be integer 0-100
+- `status` MUST match trust score thresholds (≥80=VERIFIED, 60-79=PARTIAL, <60=REJECTED)
+- `verificationTime` MUST be valid ISO 8601 timestamp
+- `durationMs` MUST be non-negative integer
+- `findingsVerified + findingsRejected` should equal total findings in candidate
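Reviewer note: the validation rules translate fairly directly into a checker. The sketch below is one possible reading, not the package's implementation: `validateBaksaResult` is a hypothetical name, the ISO 8601 check uses a simplified profile (date, time, and `Z` or a numeric offset), and the findings total is checked only when the caller supplies it, since the rule is phrased as "should".

```typescript
interface BaksaResult {
  jobId: string;
  candidatePath: string;
  trustScore: number;
  status: "VERIFIED" | "PARTIAL" | "REJECTED";
  challenges: string[];
  findingsVerified: number;
  findingsRejected: number;
  verificationTime: string;
  durationMs: number;
}

const isInt = (n: number): boolean => isFinite(n) && Math.floor(n) === n;

// Simplified ISO 8601 profile; an assumption of this sketch.
const ISO_8601 = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;

// Returns a list of rule violations; an empty list means the result passed.
function validateBaksaResult(result: BaksaResult, totalFindings?: number): string[] {
  const problems: string[] = [];

  if (!isInt(result.trustScore) || result.trustScore < 0 || result.trustScore > 100) {
    problems.push("trustScore must be an integer 0-100");
  }

  const expected =
    result.trustScore >= 80 ? "VERIFIED" : result.trustScore >= 60 ? "PARTIAL" : "REJECTED";
  if (result.status !== expected) {
    problems.push("status " + result.status + " does not match thresholds (expected " + expected + ")");
  }

  if (!ISO_8601.test(result.verificationTime) || isNaN(Date.parse(result.verificationTime))) {
    problems.push("verificationTime must be a valid ISO 8601 timestamp");
  }

  if (!isInt(result.durationMs) || result.durationMs < 0) {
    problems.push("durationMs must be a non-negative integer");
  }

  if (
    totalFindings !== undefined &&
    result.findingsVerified + result.findingsRejected !== totalFindings
  ) {
    problems.push("findingsVerified + findingsRejected should equal total findings");
  }

  return problems;
}
```

The example `baksa.json` above passes all of these checks when `totalFindings` is 3.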
+
+### Sharded vs Non-Sharded Mode
+
+Baksa operates in two modes:
+
+| Mode | Trigger | Output |
+|------|---------|--------|
+| **Normal (Interactive)** | Direct invocation from Gyoshu | Human-readable challenge results in conversation |
+| **Sharded (Parallel Worker)** | Invocation with jobId + candidatePath | Machine-parsable markers + baksa.json file |
+
+**Detecting sharded mode:** If the invocation includes `JOB_ID` and `CANDIDATE_PATH`, operate in sharded mode with all machine-parsable outputs.
+
+**Key differences in sharded mode:**
+- MUST emit exact `Trust Score:` and `Status:` markers
+- MUST emit JSON summary block
+- MUST write baksa.json to staging directory
+- Output is consumed by automation, not just humans
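Reviewer note: the mode-detection rule (sharded when both `JOB_ID` and `CANDIDATE_PATH` are present) can be sketched as a small parser over the invocation text. This assumes the `KEY: value` line layout from the example invocation context; `parseShardedJob` and `ShardedJob` are hypothetical names, and Baksa itself is a prompt-driven agent, so this only models the rule.

```typescript
interface ShardedJob {
  jobId: string;
  stageId: string | null; // not required for mode detection
  candidatePath: string;
}

// Hypothetical parser: returns the job parameters in sharded mode, or null
// when the invocation should be handled in normal (interactive) mode.
function parseShardedJob(invocation: string): ShardedJob | null {
  const field = (name: string): string | null => {
    // Match "NAME: value" on a single line, ignoring surrounding spaces/tabs.
    const pattern = new RegExp("^[ \\t]*" + name + ":[ \\t]*(\\S.*?)[ \\t]*$", "m");
    const match = pattern.exec(invocation);
    return match && match[1] ? match[1] : null;
  };

  const jobId = field("JOB_ID");
  const candidatePath = field("CANDIDATE_PATH");
  if (jobId === null || candidatePath === null) return null;

  return { jobId, stageId: field("STAGE_ID"), candidatePath };
}
```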