gyoshu 0.2.5 → 0.3.0

@@ -1010,6 +1010,61 @@ When your research task reaches a conclusion point, signal completion using `gyo
  | `ABORTED` | User requested abort via /gyoshu-abort | None required |
  | `FAILED` | Unrecoverable execution errors | None required (explain in summary) |
 
+ ### Goal Verification Before SUCCESS (MANDATORY)
+
+ > **⚠️ CRITICAL RULE: NEVER signal SUCCESS unless acceptance criteria are demonstrably met.**
+ >
+ > Having evidence (cells executed, results obtained) is NOT sufficient. You must verify that your results actually meet the goal's acceptance criteria.
+
+ **Goal Comparison Checklist (BEFORE signaling SUCCESS):**
+
+ 1. **What was the original goal?** Re-read the acceptance criteria from the research objective.
+ 2. **What did I achieve?** Identify your actual measured results.
+ 3. **Do my results meet the criteria?** Compare quantitatively, not qualitatively.
+ 4. **If NO → signal PARTIAL, not SUCCESS.** Be honest about unmet goals.
+
+ #### Goal Verification Code Pattern
+
+ ```python
+ # BEFORE signaling SUCCESS, always check goal criteria
+
+ # 1. What was the goal?
+ goal_accuracy = 0.90  # From original goal: "achieve 90% accuracy"
+
+ # 2. What did I achieve?
+ actual_accuracy = cv_scores.mean()  # 0.75
+
+ # 3. Do I meet the criteria?
+ goal_met = actual_accuracy >= goal_accuracy  # False!
+
+ # 4. Signal appropriate status based on goal comparison
+ if goal_met:
+     print(f"[GOAL_MET] Target: {goal_accuracy}, Achieved: {actual_accuracy}")
+     status = "SUCCESS"
+ else:
+     print(f"[GOAL_NOT_MET] Target: {goal_accuracy}, Achieved: {actual_accuracy}")
+     status = "PARTIAL"  # NOT SUCCESS - be honest about the gap
+ ```
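The same comparison extends to goals with several acceptance criteria. The following is a minimal sketch and is not part of gyoshu: the metric names, thresholds, and the `compare_to_goal` helper are illustrative assumptions. It reports SUCCESS only when every criterion is met and treats a missing measurement as unmet.

```python
import operator

# Hypothetical acceptance criteria: metric name -> (comparison, target).
# Names and thresholds are illustrative assumptions, not gyoshu's API.
CRITERIA = {
    "accuracy": (operator.ge, 0.90),
    "latency_ms": (operator.le, 50.0),
}


def compare_to_goal(results, criteria=CRITERIA):
    """Return (status, per-criterion report) for the measured results."""
    report = {}
    for name, (compare, target) in criteria.items():
        achieved = results.get(name)
        # A missing measurement counts as an unmet criterion.
        met = achieved is not None and bool(compare(achieved, target))
        report[name] = {"target": target, "achieved": achieved, "met": met}
    status = "SUCCESS" if all(r["met"] for r in report.values()) else "PARTIAL"
    return status, report


status, report = compare_to_goal({"accuracy": 0.75, "latency_ms": 42.0})
print(status)  # PARTIAL
```

Keeping the per-criterion report makes it straightforward to state exactly which target was missed when PARTIAL is signaled.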
+
+ #### Evidence Must Include goalMet
+
+ When calling `gyoshu_completion`, include goal verification in evidence:
+
+ ```python
+ evidence = {
+     "executedCellIds": [...],
+     "keyResults": [...],
+     "goalMet": actual_accuracy >= goal_accuracy,  # REQUIRED for SUCCESS
+     "goalComparison": {
+         "criterion": "accuracy >= 0.90",
+         "target": 0.90,
+         "achieved": 0.75,
+         "met": False
+     }
+ }
+ # With goalMet=False, status MUST be PARTIAL, not SUCCESS
+ ```
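One way to keep the signaled status consistent with the evidence is a small guard that runs before `gyoshu_completion` is called. This is a minimal sketch, assuming the evidence dict has the `goalMet` and `goalComparison` fields described here; `resolve_status` is a hypothetical helper and not part of gyoshu.

```python
def resolve_status(requested_status, evidence):
    """Downgrade SUCCESS to PARTIAL unless the evidence shows the goal was met."""
    if requested_status != "SUCCESS":
        return requested_status  # other statuses pass through unchanged
    comparison = evidence.get("goalComparison", {})
    # Both flags must be truthy; a missing flag counts as unmet.
    goal_met = bool(evidence.get("goalMet")) and bool(comparison.get("met"))
    return "SUCCESS" if goal_met else "PARTIAL"


evidence = {
    "goalMet": False,
    "goalComparison": {
        "criterion": "accuracy >= 0.90",
        "target": 0.90,
        "achieved": 0.75,
        "met": False,
    },
}
print(resolve_status("SUCCESS", evidence))  # PARTIAL
```

Using `bool(...)` rather than an identity check against `True` means a NumPy boolean produced by a comparison such as `actual_accuracy >= goal_accuracy` is handled the same way as a plain Python bool.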
+
  ### Evidence Gathering
 
  Before calling `gyoshu_completion`, gather evidence of your work: