@laitszkin/apollo-toolkit 2.14.2 → 2.14.3

package/CHANGELOG.md CHANGED
@@ -4,6 +4,12 @@ All notable changes to this repository are documented in this file.
 
  ## [Unreleased]
 
+ ## [v2.14.3] - 2026-04-07
+
+ ### Changed
+ - Strengthen `systematic-debug` so runtime-pipeline investigations must anchor on one canonical run or artifact root, map failures to concrete stages, and separate toolchain/platform faults from application-logic faults before fixing.
+ - Strengthen `scheduled-runtime-health-check` so bounded runs must record the canonical run folder as soon as it materializes and use structured artifacts from that same run when analyzing health.
+
  ## [v2.14.2] - 2026-04-06
 
  ### Changed
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@laitszkin/apollo-toolkit",
- "version": "2.14.2",
+ "version": "2.14.3",
  "description": "Apollo Toolkit npm installer for managed skill copying across Codex, OpenClaw, and Trae.",
  "license": "MIT",
  "author": "LaiTszKin",
@@ -14,8 +14,8 @@ description: Use a background terminal to run a user-specified command immediate
 
  ## Standards
 
- - Evidence: Anchor every conclusion to the requested command, execution window, startup/shutdown timestamps, captured logs, and concrete runtime signals.
- - Execution: Collect the run contract, verify the real stop mechanism before launch, use a background terminal, optionally update the code only when the user asks, execute the requested command immediately or in the requested window, capture logs, stop cleanly when bounded, then delegate log review to `analyse-app-logs` only when findings are requested or needed.
+ - Evidence: Anchor every conclusion to the requested command, execution window, startup/shutdown timestamps, one canonical run folder or artifact root, captured logs, and concrete runtime signals.
+ - Execution: Collect the run contract, verify the real stop mechanism before launch, use a background terminal, optionally update the code only when the user asks, execute the requested command immediately or in the requested window, record the canonical run folder once the process materializes it, capture logs, stop cleanly when bounded, then delegate log review to `analyse-app-logs` only when findings are requested or needed.
  - Quality: Keep scheduling, execution, and shutdown deterministic; separate confirmed findings from hypotheses; and mark each assessed module healthy/degraded/failed/unknown with reasons.
  - Output: Return the run configuration, execution status, log locations, optional code-update result, optional module health by area, confirmed issues, potential issues, observability gaps, and scheduler status when applicable.
 
@@ -45,10 +45,12 @@ This skill is an orchestration layer. It owns the background terminal session, o
 
  - Prefer one bounded observation window over open-ended monitoring.
  - Use one dedicated background terminal session per requested run so execution and logs stay correlated.
+ - Record the canonical run directory, artifact root, or other generated output location as soon as it exists, and use that as the source of truth for later analysis.
  - Treat code update as optional and only perform it when the user explicitly requests it.
  - Treat startup, steady-state, and shutdown as part of the same investigation.
  - Do not call a module healthy unless there is at least one positive signal for it.
  - Separate scheduler failures, boot failures, runtime failures, and shutdown failures.
+ - For complex pipelines, identify the last successful stage before attributing the failure to application logic.
  - If logs cannot support a health judgment, mark the module as `unknown` instead of guessing.
 
  ## Required workflow
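To make the new canonical-run-directory guidance concrete: a minimal Node.js sketch, not part of the toolkit, in which the spawned command and the `run directory:` log pattern are illustrative assumptions.

```ts
import { spawn } from "node:child_process";

let canonicalRunDir: string | null = null;

// Hypothetical command; a real run uses the user-requested command instead.
const child = spawn("npm", ["run", "pipeline"], { stdio: ["ignore", "pipe", "pipe"] });
child.stdout?.setEncoding("utf8");
child.stdout?.on("data", (chunk: string) => {
  // Record the first announced output location and reuse it for every later check.
  const match = chunk.match(/run directory: (\S+)/);
  if (match && canonicalRunDir === null) {
    canonicalRunDir = match[1];
    console.log(`canonical evidence root: ${canonicalRunDir}`);
  }
});
```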
@@ -62,6 +64,7 @@ This skill is an orchestration layer. It owns the background terminal session, o
  - Create a dedicated run folder and record timezone, cwd, requested command, terminal session identifier, and any requested start/end boundaries.
  - Capture stdout and stderr from the beginning of the session so the full run stays auditable.
  - Identify and record the exact bounded-stop mechanism before launch: signal path, wrapper script, env var names, CLI flags, PID capture, and any project-specific shutdown helper.
+ - Decide in advance what the canonical evidence root will be if the command generates its own run directory, artifact bundle, database, or report file, so later diagnosis does not drift across multiple runs.
  3. Optionally update to the latest safe code state
  - Only do this step when the user explicitly asked to update the project before execution.
  - Prefer the repository's normal safe update path, such as `git pull --ff-only`, or the project's documented sync command if one exists.
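The run-contract step above (dedicated run folder, timezone, cwd, stop mechanism, expected evidence root) could be persisted like this. A hedged sketch: the `RunContract` shape, file name, and folder layout are all illustrative, not something the package defines.

```ts
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

interface RunContract {
  command: string;               // exact command to execute
  cwd: string;                   // working directory at launch
  timezone: string;              // resolved IANA timezone for timestamp interpretation
  terminalSession: string;       // background terminal session identifier
  startBoundary?: string;        // optional requested start time (ISO 8601)
  endBoundary?: string;          // optional requested end time (ISO 8601)
  stopMechanism: string;         // verified bounded-stop path
  expectedEvidenceRoot?: string; // where the command is expected to create artifacts
}

function recordRunContract(runFolder: string, contract: RunContract): string {
  mkdirSync(runFolder, { recursive: true });
  const contractPath = join(runFolder, "run-contract.json");
  writeFileSync(contractPath, JSON.stringify(contract, null, 2));
  return contractPath;
}

// Record everything before launch so the run stays auditable end to end.
recordRunContract("runs/2026-04-07T10-00-00Z", {
  command: "npm run smoke-test",
  cwd: process.cwd(),
  timezone: Intl.DateTimeFormat().resolvedOptions().timeZone,
  terminalSession: "bg-terminal-1",
  stopMechanism: "SIGTERM to captured PID, 10s grace, then SIGKILL",
  expectedEvidenceRoot: "runs/2026-04-07T10-00-00Z/artifacts",
});
```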
@@ -73,6 +76,7 @@ This skill is an orchestration layer. It owns the background terminal session, o
  - If the user requested a future start time and no reliable scheduler is available, fail closed and report the scheduling limitation instead of starting early.
  5. Run and capture readiness
  - Execute the requested command in the same background terminal.
+ - As soon as the command emits or creates its canonical run directory, artifact root, or equivalent output location, record that path and reuse it for every later check.
  - Wait for a concrete readiness signal when the command is expected to stay up, such as a health endpoint, listening-port log, worker boot line, or queue-consumer ready message.
  - If readiness never arrives, stop the run, preserve logs, and treat it as a failed startup window.
  6. Observe and stop when bounded
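A minimal sketch of the readiness wait in step 5, assuming an HTTP health endpoint; the URL, deadline, and poll interval are placeholders, and a `false` result maps to the failed-startup-window rule above.

```ts
// Poll a concrete readiness signal with a hard deadline (Node 18+ fetch).
async function waitForReady(healthUrl: string, deadlineMs: number): Promise<boolean> {
  const start = Date.now();
  while (Date.now() - start < deadlineMs) {
    try {
      const res = await fetch(healthUrl); // readiness signal: health endpoint
      if (res.ok) return true;
    } catch {
      // Not listening yet; keep polling until the deadline.
    }
    await new Promise((resolve) => setTimeout(resolve, 1000)); // 1s poll interval
  }
  return false; // readiness never arrived: a failed startup window
}

// Usage: a false result means stop the run and preserve logs.
waitForReady("http://127.0.0.1:8080/health", 60_000).then((ready) => {
  if (!ready) console.error("Readiness never arrived; treating as failed startup.");
});
```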
@@ -84,7 +88,8 @@ This skill is an orchestration layer. It owns the background terminal session, o
  7. Explain findings from logs when requested
  - If the user asked for findings after completion, wait for the run to finish before analyzing the captured logs.
  - Invoke `analyse-app-logs` on only the captured runtime window.
- - Pass the service or module names, environment, timezone, run folder, relevant log files, and the exact start/end boundaries.
+ - Pass the service or module names, environment, timezone, canonical run folder, relevant log files, and the exact start/end boundaries.
+ - When the command produced reports, databases, or other structured artifacts, compare them against the same run's logs before making a health judgment.
  - Reuse its confirmed issues, hypotheses, and monitoring improvements instead of rewriting a separate incident workflow.
  8. Produce the final report
  - Always summarize the actual command executed, actual start/end timestamps, execution status, and log locations.
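Step 7's new artifact-versus-log comparison could look like the following sketch; the file names `report.json` and `run.log`, the report shape, and the `ERROR`-counting heuristic are assumptions, not toolkit behavior.

```ts
import { readFileSync } from "node:fs";
import { join } from "node:path";

interface RunReport {
  startedAt: string; // ISO timestamp the report claims for the run
  failures: number;
}

function crossCheck(canonicalRunFolder: string): string[] {
  const report: RunReport = JSON.parse(
    readFileSync(join(canonicalRunFolder, "report.json"), "utf8"),
  );
  const log = readFileSync(join(canonicalRunFolder, "run.log"), "utf8");
  const discrepancies: string[] = [];

  // Weak but cheap check: the report and the captured log should describe
  // the same run window, so the claimed start time should appear in the log.
  if (!log.includes(report.startedAt)) {
    discrepancies.push("report start time not found in this run's log");
  }
  // A clean report with errors in the log (or vice versa) blocks a health call.
  const logErrors = (log.match(/\bERROR\b/g) ?? []).length;
  if (report.failures === 0 && logErrors > 0) {
    discrepancies.push(`report claims 0 failures but log has ${logErrors} ERROR lines`);
  }
  return discrepancies;
}
```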
@@ -114,7 +119,7 @@ Absence of errors alone is not enough for `healthy`.
  Use this structure in responses:
 
  1. Run summary
- - Workspace, command, schedule if any, actual start/end timestamps, duration if bounded, readiness result, shutdown result if applicable, and log locations.
+ - Workspace, command, schedule if any, actual start/end timestamps, duration if bounded, readiness result, shutdown result if applicable, canonical run folder or artifact root, and log locations.
  2. Execution result
  - Whether the command completed, stayed up for the requested window, or failed early.
  3. Code update result
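For reference, the run-summary fields enumerated above map naturally onto a typed record; this interface is an illustration of the structure, not a shape the package exports.

```ts
interface RunSummary {
  workspace: string;
  command: string;
  schedule?: string;                 // only when a start window was requested
  actualStart: string;               // ISO 8601
  actualEnd?: string;                // absent while the run is still up
  durationMs?: number;               // only for bounded runs
  readiness: "ready" | "never-ready";
  shutdown?: "clean" | "forced" | "failed";
  canonicalRunFolder?: string;       // or artifact root, when the command created one
  logLocations: string[];
}
```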
@@ -14,16 +14,18 @@ description: "Systematic debugging workflow for program issues: understand obser
 
  ## Standards
 
- - Evidence: Gather expected versus observed behavior from code and runtime facts before deciding on a cause.
- - Execution: Inspect the relevant paths, reproduce every plausible cause with tests, diagnose the real cause, then apply the minimal fix.
+ - Evidence: Gather expected versus observed behavior from code and runtime facts before deciding on a cause, and when the issue involves a runtime pipeline or bounded run, anchor the investigation to one canonical artifact root or run directory instead of mixed terminal snippets from multiple runs.
+ - Execution: Inspect the relevant paths, reproduce every plausible cause with tests or bounded reruns, map each observed failure to a concrete pipeline stage, distinguish toolchain/platform faults from application-logic faults, then apply the minimal fix.
  - Quality: Keep scope focused on the bug, prefer existing test patterns, and explicitly rule out hypotheses that could not be reproduced.
- - Output: Deliver the plausible-cause list, reproduction tests, validated fix summary, and passing-test confirmation.
+ - Output: Deliver the plausible-cause list, the canonical evidence source, reproduction tests or reruns, validated fix summary, and passing-test confirmation.
 
  ## Core Principles
 
  - Gather facts from user reports and code behavior before changing implementation.
  - Cover all plausible causes with reproducible tests instead of guessing a single cause.
  - Keep fixes minimal, focused, and validated by passing tests.
+ - When logs or runtime artifacts exist, treat one run as canonical and compare every conclusion against that same run's generated artifacts, not against ad hoc console recollection.
+ - When the failing flow crosses multiple layers, identify the last confirmed successful stage before assigning blame.
 
  ## Trigger Conditions
 
@@ -39,10 +41,11 @@ Also auto-invoke this skill when mismatch evidence appears during normal executi
 
  ## Required Workflow
 
- 1. **Understand and inspect**: Parse expected vs observed behavior, explore relevant code paths, and build a list of plausible root causes.
- 2. **Reproduce with tests**: Write or extend tests that reproduce every plausible cause.
- 3. **Diagnose and confirm**: Use reproduction evidence to confirm the true root cause and explicitly rule out non-causes.
- 4. **Fix and validate**: Implement focused fixes and iterate until all reproduction tests pass.
+ 1. **Understand and inspect**: Parse expected vs observed behavior, explore relevant code paths, record the canonical failing run or artifact root when runtime output is involved, and build a list of plausible root causes.
+ 2. **Map the failure boundary**: Break the flow into concrete stages such as setup, startup, readiness, steady-state execution, persistence, and shutdown, then identify the last stage that is confirmed to have succeeded.
+ 3. **Reproduce with tests or bounded reruns**: Write or extend tests that reproduce every plausible cause, and when the bug depends on runtime orchestration, rerun the same bounded command or scenario instead of switching contexts mid-investigation.
+ 4. **Diagnose and confirm**: Use reproduction evidence to confirm the true root cause, explicitly rule out non-causes, and classify whether the fault belongs to the toolchain/platform layer, the orchestration layer, or application logic.
+ 5. **Fix and validate**: Implement focused fixes and iterate until all reproduction tests or bounded reruns pass.
 
  ## Implementation Guidelines
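The new "Map the failure boundary" step can be read as a walk over ordered stages; a sketch in which the stage names and success predicates are illustrative.

```ts
type Stage = "setup" | "startup" | "readiness" | "steady-state" | "persistence" | "shutdown";

const stageOrder: Stage[] = ["setup", "startup", "readiness", "steady-state", "persistence", "shutdown"];

// Each predicate checks the canonical run's evidence for a positive success signal.
function lastSuccessfulStage(succeeded: (stage: Stage) => boolean): Stage | null {
  let last: Stage | null = null;
  for (const stage of stageOrder) {
    if (!succeeded(stage)) break; // first unconfirmed stage is the failure boundary
    last = stage;
  }
  return last;
}

// Example: logs confirm setup through readiness but nothing after, so the
// boundary sits at steady-state, not at application logic in general.
const boundary = lastSuccessfulStage((s) =>
  ["setup", "startup", "readiness"].includes(s),
);
console.log(`last confirmed successful stage: ${boundary}`); // "readiness"
```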
 
@@ -50,10 +53,13 @@ Also auto-invoke this skill when mismatch evidence appears during normal executi
  - Prefer existing test patterns and fixtures over creating new frameworks.
  - Keep the scope to bug reproduction and resolution; avoid unrelated refactors.
  - If a hypothesized cause cannot be reproduced, document why and deprioritize it explicitly.
+ - For long-running or generated-artifact workflows, record the exact command, timestamps, and artifact paths before inspecting outputs so later comparisons stay on the same evidence set.
+ - Do not mix baseline data and rerun data casually; compare the same scenario or command across runs and call out when a conclusion comes from a rerun rather than the original failure.
 
  ## Deliverables
 
  - Plausible root-cause list tied to concrete code paths
- - Reproduction tests for each plausible cause
+ - Canonical failing run or artifact root when runtime evidence exists
+ - Reproduction tests or bounded reruns for each plausible cause
  - Fix summary mapped to failing-then-passing tests
  - Final confirmation that all related tests pass
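Finally, the baseline-versus-rerun guideline amounts to bucketing evidence by run before drawing conclusions; a sketch with an assumed `Evidence` shape.

```ts
interface Evidence {
  runId: string;              // canonical run folder or artifact root
  kind: "baseline" | "rerun"; // original failure vs. later reproduction
  observation: string;
}

// Group observations by run so each conclusion cites exactly one bucket,
// and rerun-derived conclusions can be called out explicitly.
function conclusionsByRun(evidence: Evidence[]): Map<string, Evidence[]> {
  const byRun = new Map<string, Evidence[]>();
  for (const e of evidence) {
    const bucket = byRun.get(e.runId) ?? [];
    bucket.push(e);
    byRun.set(e.runId, bucket);
  }
  return byRun;
}
```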