mdcontext 0.0.1 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.changeset/README.md +28 -0
- package/.changeset/config.json +11 -0
- package/.claude/settings.local.json +25 -0
- package/.github/workflows/ci.yml +83 -0
- package/.github/workflows/claude-code-review.yml +44 -0
- package/.github/workflows/claude.yml +85 -0
- package/.github/workflows/release.yml +113 -0
- package/.tldrignore +112 -0
- package/BACKLOG.md +338 -0
- package/CONTRIBUTING.md +186 -0
- package/NOTES/NOTES +44 -0
- package/README.md +434 -11
- package/biome.json +36 -0
- package/cspell.config.yaml +14 -0
- package/dist/chunk-23UPXDNL.js +3044 -0
- package/dist/chunk-2W7MO2DL.js +1366 -0
- package/dist/chunk-3NUAZGMA.js +1689 -0
- package/dist/chunk-7TOWB2XB.js +366 -0
- package/dist/chunk-7XOTOADQ.js +3065 -0
- package/dist/chunk-AH2PDM2K.js +3042 -0
- package/dist/chunk-BNXWSZ63.js +3742 -0
- package/dist/chunk-BTL5DJVU.js +3222 -0
- package/dist/chunk-HDHYG7E4.js +104 -0
- package/dist/chunk-HLR4KZBP.js +3234 -0
- package/dist/chunk-IP3FRFEB.js +1045 -0
- package/dist/chunk-KHU56VDO.js +3042 -0
- package/dist/chunk-KRYIFLQR.js +88 -0
- package/dist/chunk-LBSDNLEM.js +287 -0
- package/dist/chunk-MNTQ7HCP.js +2643 -0
- package/dist/chunk-MUJELQQ6.js +1387 -0
- package/dist/chunk-MXJGMSLV.js +2199 -0
- package/dist/chunk-N6QJGC3Z.js +2636 -0
- package/dist/chunk-OBELGBPM.js +1713 -0
- package/dist/chunk-OT7R5XTA.js +3192 -0
- package/dist/chunk-P7X4RA2T.js +106 -0
- package/dist/chunk-PIDUQNC2.js +3185 -0
- package/dist/chunk-POGCDIH4.js +3187 -0
- package/dist/chunk-PSIEOQGZ.js +3043 -0
- package/dist/chunk-PVRT3IHA.js +3238 -0
- package/dist/chunk-QNN4TT23.js +1430 -0
- package/dist/chunk-RE3R45RJ.js +3042 -0
- package/dist/chunk-S7E6TFX6.js +803 -0
- package/dist/chunk-SG6GLU4U.js +1378 -0
- package/dist/chunk-SJCDV2ST.js +274 -0
- package/dist/chunk-SYE5XLF3.js +104 -0
- package/dist/chunk-T5VLYBZD.js +103 -0
- package/dist/chunk-TOQB7VWU.js +3238 -0
- package/dist/chunk-VFNMZ4ZQ.js +3228 -0
- package/dist/chunk-VVTGZNBT.js +1629 -0
- package/dist/chunk-W7Q4RFEV.js +104 -0
- package/dist/chunk-XTYYVRLO.js +3190 -0
- package/dist/chunk-Y6MDYVJD.js +3063 -0
- package/dist/cli/main.d.ts +1 -0
- package/dist/cli/main.js +5458 -0
- package/dist/index.d.ts +653 -0
- package/dist/index.js +79 -0
- package/dist/mcp/server.d.ts +1 -0
- package/dist/mcp/server.js +472 -0
- package/dist/schema-BAWSG7KY.js +22 -0
- package/dist/schema-E3QUPL26.js +20 -0
- package/dist/schema-EHL7WUT6.js +20 -0
- package/docs/019-USAGE.md +625 -0
- package/docs/020-current-implementation.md +364 -0
- package/docs/021-DOGFOODING-FINDINGS.md +175 -0
- package/docs/BACKLOG.md +80 -0
- package/docs/CONFIG.md +1123 -0
- package/docs/DESIGN.md +439 -0
- package/docs/ERRORS.md +383 -0
- package/docs/PROJECT.md +88 -0
- package/docs/ROADMAP.md +407 -0
- package/docs/summarization.md +320 -0
- package/docs/test-links.md +9 -0
- package/justfile +40 -0
- package/package.json +74 -9
- package/pnpm-workspace.yaml +5 -0
- package/research/INDEX.md +315 -0
- package/research/code-review/README.md +90 -0
- package/research/code-review/cli-error-handling-review.md +979 -0
- package/research/code-review/code-review-validation-report.md +464 -0
- package/research/code-review/main-ts-review.md +1128 -0
- package/research/config-analysis/01-current-implementation.md +470 -0
- package/research/config-analysis/02-strategy-recommendation.md +428 -0
- package/research/config-analysis/03-task-candidates.md +715 -0
- package/research/config-analysis/033-research-configuration-management.md +828 -0
- package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
- package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
- package/research/config-docs/SUMMARY.md +357 -0
- package/research/config-docs/TEST-RESULTS.md +776 -0
- package/research/config-docs/TODO.md +542 -0
- package/research/config-docs/analysis.md +744 -0
- package/research/config-docs/fix-validation.md +502 -0
- package/research/config-docs/help-audit.md +264 -0
- package/research/config-docs/help-system-analysis.md +890 -0
- package/research/dogfood/consolidated-tool-evaluation.md +373 -0
- package/research/dogfood/strategy-a/a-synthesis.md +184 -0
- package/research/dogfood/strategy-a/a1-docs.md +226 -0
- package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
- package/research/dogfood/strategy-a/a3-llm.md +164 -0
- package/research/dogfood/strategy-b/b-synthesis.md +228 -0
- package/research/dogfood/strategy-b/b1-architecture.md +207 -0
- package/research/dogfood/strategy-b/b2-gaps.md +258 -0
- package/research/dogfood/strategy-b/b3-workflows.md +250 -0
- package/research/dogfood/strategy-c/c-synthesis.md +451 -0
- package/research/dogfood/strategy-c/c1-explorer.md +192 -0
- package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
- package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
- package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
- package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
- package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
- package/research/effect-cli-error-handling.md +845 -0
- package/research/effect-errors-as-values.md +943 -0
- package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
- package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
- package/research/errors-task-analysis/embeddings-analysis.md +709 -0
- package/research/errors-task-analysis/index-search-analysis.md +812 -0
- package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
- package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
- package/research/issue-review.md +603 -0
- package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
- package/research/llm-summarization/alternative-providers-2026.md +1428 -0
- package/research/llm-summarization/anthropic-2026.md +367 -0
- package/research/llm-summarization/claude-cli-integration.md +1706 -0
- package/research/llm-summarization/cli-integration-patterns.md +3155 -0
- package/research/llm-summarization/openai-2026.md +473 -0
- package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
- package/research/llm-summarization/opencode-cli-integration.md +1552 -0
- package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
- package/research/llm-summarization/prototype-results.md +56 -0
- package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
- package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
- package/research/mdcontext-error-analysis.md +521 -0
- package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
- package/research/mdcontext-pudding/01-index-embed.md +956 -0
- package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
- package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
- package/research/mdcontext-pudding/02-search.md +970 -0
- package/research/mdcontext-pudding/03-context.md +779 -0
- package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
- package/research/mdcontext-pudding/04-tree.md +704 -0
- package/research/mdcontext-pudding/05-config.md +1038 -0
- package/research/mdcontext-pudding/06-links-summary.txt +87 -0
- package/research/mdcontext-pudding/06-links.md +679 -0
- package/research/mdcontext-pudding/07-stats.md +693 -0
- package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
- package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
- package/research/mdcontext-pudding/README.md +168 -0
- package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
- package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
- package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
- package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
- package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
- package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
- package/research/research-quality-review.md +834 -0
- package/research/semantic-search/002-research-embedding-models.md +490 -0
- package/research/semantic-search/003-research-rag-alternatives.md +523 -0
- package/research/semantic-search/004-research-vector-search.md +841 -0
- package/research/semantic-search/032-research-semantic-search.md +427 -0
- package/research/semantic-search/embedding-text-analysis.md +156 -0
- package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
- package/research/semantic-search/query-processing-analysis.md +207 -0
- package/research/semantic-search/root-cause-and-solution.md +114 -0
- package/research/semantic-search/threshold-validation-report.md +69 -0
- package/research/semantic-search/vector-search-analysis.md +63 -0
- package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
- package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
- package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
- package/research/task-management-2026/03-lightweight-file-based.md +567 -0
- package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
- package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
- package/research/task-management-2026/linear/02-api-integrations.md +930 -0
- package/research/task-management-2026/linear/03-ai-features.md +368 -0
- package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
- package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
- package/research/test-path-issues.md +276 -0
- package/review/ALP-76/1-error-type-design.md +962 -0
- package/review/ALP-76/2-error-handling-patterns.md +906 -0
- package/review/ALP-76/3-error-presentation.md +624 -0
- package/review/ALP-76/4-test-coverage.md +625 -0
- package/review/ALP-76/5-migration-completeness.md +440 -0
- package/review/ALP-76/6-effect-best-practices.md +755 -0
- package/scripts/apply-branch-protection.sh +47 -0
- package/scripts/branch-protection-templates.json +79 -0
- package/scripts/prototype-summarization.ts +346 -0
- package/scripts/rebuild-hnswlib.js +58 -0
- package/scripts/setup-branch-protection.sh +64 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
- package/src/cli/argv-preprocessor.test.ts +210 -0
- package/src/cli/argv-preprocessor.ts +202 -0
- package/src/cli/cli.test.ts +627 -0
- package/src/cli/commands/backlinks.ts +54 -0
- package/src/cli/commands/config-cmd.ts +642 -0
- package/src/cli/commands/context.ts +285 -0
- package/src/cli/commands/duplicates.ts +122 -0
- package/src/cli/commands/embeddings.ts +529 -0
- package/src/cli/commands/index-cmd.ts +480 -0
- package/src/cli/commands/index.ts +16 -0
- package/src/cli/commands/links.ts +52 -0
- package/src/cli/commands/search.ts +1281 -0
- package/src/cli/commands/stats.ts +149 -0
- package/src/cli/commands/tree.ts +128 -0
- package/src/cli/config-layer.ts +176 -0
- package/src/cli/error-handler.test.ts +235 -0
- package/src/cli/error-handler.ts +655 -0
- package/src/cli/flag-schemas.ts +341 -0
- package/src/cli/help.ts +588 -0
- package/src/cli/index.ts +9 -0
- package/src/cli/main.ts +435 -0
- package/src/cli/options.ts +41 -0
- package/src/cli/shared-error-handling.ts +199 -0
- package/src/cli/typo-suggester.test.ts +105 -0
- package/src/cli/typo-suggester.ts +130 -0
- package/src/cli/utils.ts +259 -0
- package/src/config/file-provider.test.ts +320 -0
- package/src/config/file-provider.ts +273 -0
- package/src/config/index.ts +72 -0
- package/src/config/integration.test.ts +667 -0
- package/src/config/precedence.test.ts +277 -0
- package/src/config/precedence.ts +451 -0
- package/src/config/schema.test.ts +414 -0
- package/src/config/schema.ts +603 -0
- package/src/config/service.test.ts +320 -0
- package/src/config/service.ts +243 -0
- package/src/config/testing.test.ts +264 -0
- package/src/config/testing.ts +110 -0
- package/src/core/index.ts +1 -0
- package/src/core/types.ts +113 -0
- package/src/duplicates/detector.test.ts +183 -0
- package/src/duplicates/detector.ts +414 -0
- package/src/duplicates/index.ts +18 -0
- package/src/embeddings/embedding-namespace.test.ts +300 -0
- package/src/embeddings/embedding-namespace.ts +947 -0
- package/src/embeddings/heading-boost.test.ts +222 -0
- package/src/embeddings/hnsw-build-options.test.ts +198 -0
- package/src/embeddings/hyde.test.ts +272 -0
- package/src/embeddings/hyde.ts +264 -0
- package/src/embeddings/index.ts +10 -0
- package/src/embeddings/openai-provider.ts +414 -0
- package/src/embeddings/pricing.json +22 -0
- package/src/embeddings/provider-constants.ts +204 -0
- package/src/embeddings/provider-errors.test.ts +967 -0
- package/src/embeddings/provider-errors.ts +565 -0
- package/src/embeddings/provider-factory.test.ts +240 -0
- package/src/embeddings/provider-factory.ts +225 -0
- package/src/embeddings/provider-integration.test.ts +788 -0
- package/src/embeddings/query-preprocessing.test.ts +187 -0
- package/src/embeddings/semantic-search-threshold.test.ts +508 -0
- package/src/embeddings/semantic-search.ts +1270 -0
- package/src/embeddings/types.ts +359 -0
- package/src/embeddings/vector-store.ts +708 -0
- package/src/embeddings/voyage-provider.ts +313 -0
- package/src/errors/errors.test.ts +845 -0
- package/src/errors/index.ts +533 -0
- package/src/index/ignore-patterns.test.ts +354 -0
- package/src/index/ignore-patterns.ts +305 -0
- package/src/index/index.ts +4 -0
- package/src/index/indexer.ts +684 -0
- package/src/index/storage.ts +260 -0
- package/src/index/types.ts +147 -0
- package/src/index/watcher.ts +189 -0
- package/src/index.ts +30 -0
- package/src/integration/search-keyword.test.ts +678 -0
- package/src/mcp/server.ts +612 -0
- package/src/parser/index.ts +1 -0
- package/src/parser/parser.test.ts +291 -0
- package/src/parser/parser.ts +394 -0
- package/src/parser/section-filter.test.ts +277 -0
- package/src/parser/section-filter.ts +392 -0
- package/src/search/__tests__/hybrid-search.test.ts +650 -0
- package/src/search/bm25-store.ts +366 -0
- package/src/search/cross-encoder.test.ts +253 -0
- package/src/search/cross-encoder.ts +406 -0
- package/src/search/fuzzy-search.test.ts +419 -0
- package/src/search/fuzzy-search.ts +273 -0
- package/src/search/hybrid-search.ts +448 -0
- package/src/search/path-matcher.test.ts +276 -0
- package/src/search/path-matcher.ts +33 -0
- package/src/search/query-parser.test.ts +260 -0
- package/src/search/query-parser.ts +319 -0
- package/src/search/searcher.test.ts +280 -0
- package/src/search/searcher.ts +724 -0
- package/src/search/wink-bm25.d.ts +30 -0
- package/src/summarization/cli-providers/claude.ts +202 -0
- package/src/summarization/cli-providers/detection.test.ts +273 -0
- package/src/summarization/cli-providers/detection.ts +118 -0
- package/src/summarization/cli-providers/index.ts +8 -0
- package/src/summarization/cost.test.ts +139 -0
- package/src/summarization/cost.ts +102 -0
- package/src/summarization/error-handler.test.ts +127 -0
- package/src/summarization/error-handler.ts +111 -0
- package/src/summarization/index.ts +102 -0
- package/src/summarization/pipeline.test.ts +498 -0
- package/src/summarization/pipeline.ts +231 -0
- package/src/summarization/prompts.test.ts +269 -0
- package/src/summarization/prompts.ts +133 -0
- package/src/summarization/provider-factory.test.ts +396 -0
- package/src/summarization/provider-factory.ts +178 -0
- package/src/summarization/types.ts +184 -0
- package/src/summarize/budget-bugs.test.ts +620 -0
- package/src/summarize/formatters.ts +419 -0
- package/src/summarize/index.ts +20 -0
- package/src/summarize/summarizer.test.ts +275 -0
- package/src/summarize/summarizer.ts +597 -0
- package/src/summarize/verify-bugs.test.ts +238 -0
- package/src/types/huggingface-transformers.d.ts +66 -0
- package/src/utils/index.ts +1 -0
- package/src/utils/tokens.test.ts +142 -0
- package/src/utils/tokens.ts +186 -0
- package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
- package/tests/fixtures/cli/.mdcontext/config.json +8 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
- package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
- package/tests/fixtures/cli/.mdcontext/indexes/sections.json +247 -0
- package/tests/fixtures/cli/README.md +9 -0
- package/tests/fixtures/cli/api-reference.md +11 -0
- package/tests/fixtures/cli/getting-started.md +11 -0
- package/tests/integration/embed-index.test.ts +712 -0
- package/tests/integration/search-context.test.ts +469 -0
- package/tests/integration/search-semantic.test.ts +522 -0
- package/tsconfig.json +26 -0
- package/vitest.config.ts +16 -0
- package/vitest.setup.ts +12 -0
@@ -0,0 +1,228 @@
# B-Synth: Strategy B Synthesis

## Executive Summary

The three Strategy B agents collectively identified a mature, self-aware specification that thoroughly documents what NOT to do (anti-patterns, invariants) but has significant gaps in terminology alignment, implementation guidance, and philosophical framing. The most critical finding is the HumanWork-Evolution.md document, which already synthesizes feedback into a phased improvement plan - agents B1-B3 largely validated and expanded on this existing gap analysis.

## Cross-Agent Patterns

**Theme 1: Semantic Search Underperformance**
All three agents found semantic search unreliable for multi-word conceptual queries. All fell back to keyword search frequently. This is the strongest cross-agent signal about the mdcontext tool.

**Theme 2: HumanWork-Evolution.md as Critical Source**
Both B1 and B2 independently discovered this document as the authoritative source for gaps and critiques, validating its importance.

**Theme 3: Human-First Philosophy with Acknowledged Tensions**
All agents found the docs emphasize human control, but B2 identified a philosophical gap: the spec frames human control as an end state rather than a transition phase toward "intelligence crystallization."

**Theme 4: Section-Level Context Extraction Praised**
All three agents highlighted the `--section` flag as highly effective for targeted retrieval.

**Theme 5: Checkpoint/Intervention Architecture**
B1 found checkpoints in anti-patterns, B3 found them in workflow design - the spec heavily emphasizes checkpoints as the key governance mechanism.

## Consolidated Findings

### Architecture Criticisms (from B1)

**External Criticisms (of traditional approaches):**

- Brittleness of pure automation (combinatorial explosion of rules)
- Coordination Trap (multiplies human translation work)
- Innovation Strangulation (automation-incompatible approaches avoided)
- Judgment Gap (80% flawless, 20% chaos)
- Context Collapse (context as configuration, not conversation)
- Observability Problem (black-box agents kill trust)

**Self-Imposed Constraints (internal guardrails):**

- 8 Architectural Invariants (no hidden state, no irreversible execution, etc.)
- 7 Memory Model Anti-Patterns
- 8 Workflow Anti-Patterns

**Open Questions (acknowledged gaps):**

- Alignment with human values at scale
- Limits of organizational intelligence
- Preventing organizational capture (self-perpetuation)

### Gaps Identified (from B2)

**Terminology Gaps:**

- Agent -> Actor (unified human/machine)
- Artifact -> Deliverable (business language)
- Event Memory -> The Ledger (IP capture emphasis)

**Missing Primitives:**

- Correction Event (captures human intelligence on modifications)
- Authority Gradient (replaces binary control)
- Pattern Crystallization (organizational learning mechanism)

**Architectural Gaps:**

- No geometric/semantic embeddings in Semantic Memory
- Cost model doesn't unify human hours and AI tokens
- Privacy model is "policy overlay" only
- No formal API specification

**Philosophical Gap:**

- Spec positions "human control" as goal
- Feedback suggests reframing as "intelligence extraction"
- Human corrections should become portable organizational intelligence

### Workflow Improvements (from B3)

**Core Philosophy:**

- Workflows as "guidance without control"
- Six concepts: Entry Signals, Roles, Phases, Activities, Checkpoints, Exit Conditions
- Checkpoints as primary governance mechanism

**Authority Gradient (4 modes):**

1. Instructional: Step-by-step human instructions
2. Consultative: Human defines goal, agent proposes
3. Supervisory: Agents execute, humans monitor
4. Exploratory: Alternating generation/testing
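
The four modes above could be encoded as a closed set of string literals. The sketch below is illustrative only: the names `AuthorityMode`, `MODE_DESCRIPTIONS`, and `describeMode` are hypothetical and do not come from the spec, and the descriptions simply repeat the list above.

```typescript
// Hypothetical encoding of the four Authority Gradient modes listed above.
// None of these identifiers are defined by the spec; they are assumptions.
type AuthorityMode =
  | "instructional"
  | "consultative"
  | "supervisory"
  | "exploratory";

// Descriptions copied from the list above.
const MODE_DESCRIPTIONS: Record<AuthorityMode, string> = {
  instructional: "Step-by-step human instructions",
  consultative: "Human defines goal, agent proposes",
  supervisory: "Agents execute, humans monitor",
  exploratory: "Alternating generation/testing",
};

// Formats a mode together with its description.
function describeMode(mode: AuthorityMode): string {
  return `${mode}: ${MODE_DESCRIPTIONS[mode]}`;
}
```

A union type keeps the gradient closed: adding a fifth mode would force every `Record<AuthorityMode, ...>` to be updated, which may or may not suit a spec that is still evolving.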

**Intervention Points:**

- Redirect, Override, Inject, Escalate

**Key Patterns:**

- Time Travel and Branching
- Parallel Exploration
- Immutable Workflow Versioning

**Organizational Transformation:**

- Choreographic Maturity Model (4 levels)
- Cultural shifts toward experimental mindsets

## Proposed Spec Changes (Prioritized)

### High Priority

- [ ] Rename Artifact -> Deliverable throughout (B2)
- [ ] Add Correction Event primitive (B2) - captures IP when humans modify outputs
- [ ] Add Authority Gradient to Execution Model (B2, B3) - instructional/consultative/supervisory/exploratory
- [ ] Expand Judgment Gap (80/20 problem) handling beyond "humans intervene" (B1)
- [ ] Add "Known Limitations and Trade-offs" section (B1) - what HumanWork sacrifices
- [ ] Unify cost model for Human + Machine Actors (B2)

### Medium Priority

- [ ] Add Actor primitive with type: Human | Machine (B2)
- [ ] Add Pattern Crystallization to Memory Model (B2)
- [ ] Rename Event Memory -> The Ledger (B2)
- [ ] Add cognitive telemetry to Checkpoints (B2) - deliberation_duration, confidence_signal, modification_depth
- [ ] Document concrete answers to Open Questions or mark as research priorities (B1)
- [ ] Create decision framework: when DAG-style execution IS appropriate (B1)
- [ ] Add explicit checkpoint requirements for high-stakes workflows (B3)
- [ ] Define minimum intervention points per workflow phase (B3)
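
Two of the proposals above, the Actor primitive and checkpoint telemetry, name concrete fields, so a rough type sketch may help. Only `Human | Machine` and the three telemetry field names come from the list; the interface names, the `id` field, and all field types and units are assumptions made for illustration.

```typescript
// Hypothetical shapes for the "Actor primitive" and "cognitive telemetry" proposals.
// Only the Human | Machine distinction and the three telemetry field names
// appear in the source; everything else here is assumed.
type ActorType = "Human" | "Machine";

interface Actor {
  id: string; // assumed identifier, not specified in the source
  type: ActorType;
}

interface CheckpointTelemetry {
  deliberation_duration: number; // assumed unit: milliseconds
  confidence_signal: number; // assumed scale: 0 to 1
  modification_depth: number; // assumed meaning: number of edits to the output
}

interface CheckpointRecord {
  actor: Actor;
  telemetry: CheckpointTelemetry;
}

// True when the checkpoint decision was made by a human actor.
function isHumanCheckpoint(record: CheckpointRecord): boolean {
  return record.actor.type === "Human";
}

const example: CheckpointRecord = {
  actor: { id: "reviewer-1", type: "Human" },
  telemetry: { deliberation_duration: 42000, confidence_signal: 0.8, modification_depth: 3 },
};
```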

### Low Priority

- [ ] Enhance Semantic Memory with geometric embeddings (B2)
- [ ] Add detection guidance for when Status Memory becomes authoritative (B1)
- [ ] Reframe "human control" as transition phase, not end state (B2)
- [ ] Adopt choreography language over orchestration (B2)
- [ ] Develop privacy model beyond "policy overlay" (B2)
- [ ] Create formal API specification (B2)
- [ ] Establish choreographic maturity assessment framework (B3)
- [ ] Create signals taxonomy (activity, outcome, attention, health) (B3)

## Tool Evaluation Synthesis

All three agents used the mdcontext tool extensively (38, 35, and 41 commands respectively = 114 total commands). Their assessments were remarkably consistent.

### Common Praise

- **Section-level context extraction** (`--section`) was universally praised as highly effective
- **Keyword search** was a reliable and essential fallback
- **Token budget control** (`-t`) helped manage context size
- **Tree command** gave a quick corpus overview
- **Fast embedding indexing** (~$0.003 cost)
- **Stats command** was useful for understanding corpus size

### Common Frustrations

- **Semantic search returned 0 results** for multi-word conceptual queries (all 3 agents)
- **Token truncation** without clear indication of what was excluded
- **No way to chain or aggregate searches** - had to run many separate commands
- **Multi-word keyword searches failed** (e.g., "issue challenge gap" = 0 results)
- **False positives** in keyword search
- **No semantic search threshold adjustment**
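
The multi-word keyword failure is what you would expect if every term, or the whole phrase, must match, though the report does not say how mdcontext actually treats such queries. The sketch below only illustrates the difference between requiring all terms and accepting any term. `matchesKeywords` and the sample sentence are hypothetical, not part of mdcontext.

```typescript
type MatchMode = "all" | "any";

// Returns true when the text contains every query term ("all")
// or at least one query term ("any"). Matching is a case-insensitive substring test.
function matchesKeywords(text: string, query: string, mode: MatchMode): boolean {
  const haystack = text.toLowerCase();
  const terms = query
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => term.length > 0);
  if (terms.length === 0) return false;
  const hit = (term: string): boolean => haystack.includes(term);
  return mode === "all" ? terms.every(hit) : terms.some(hit);
}

// Made-up sample text for illustration.
const section = "This gap in the spec is a known issue.";

console.log(matchesKeywords(section, "issue challenge gap", "all")); // false: "challenge" is absent
console.log(matchesKeywords(section, "issue challenge gap", "any")); // true: "issue" and "gap" are present
```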

### Suggested Improvements

- Add fuzzy/stemmed search (fail vs failure)
- Add "search within results" / progressive refinement
- Add context around keyword matches without re-running
- Add combined semantic+keyword hybrid mode
- Add cross-document synthesis
- Add batch context extraction for multiple sections/files
- Add "related sections" feature
- Add Boolean operators in keyword mode
- Add export/save functionality
- Add "what's undefined" query (terms used but not defined)
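
For the hybrid-mode suggestion, one widely used way to combine a semantic ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below shows the general technique and makes no claim about how mdcontext implements or would implement hybrid search. The function name, the constant `k = 60`, and the section identifiers are illustrative assumptions.

```typescript
// Reciprocal rank fusion: each result earns 1 / (k + rank) from every list
// it appears in (rank starts at 1), and results are ordered by total score.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical result lists; the identifiers are made up for illustration.
const semanticResults = ["workflows.md#checkpoints", "memory.md#anti-patterns"];
const keywordResults = ["memory.md#anti-patterns", "foundations.md#invariants"];

const fused = reciprocalRankFusion([semanticResults, keywordResults]);
console.log(fused);
// [ 'memory.md#anti-patterns', 'workflows.md#checkpoints', 'foundations.md#invariants' ]
```

The section that appears in both lists rises to the top even though neither search ranked the same result first, which is the behaviour a hybrid mode is usually after.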

### Quantitative Summary

| Agent | Commands | Confidence | Rating |
| ----- | -------- | ---------- | ------ |
| B1    | 38       | Medium     | 4/5    |
| B2    | 35       | High       | 4/5    |
| B3    | 41       | High       | 4/5    |

All agents rated the tool 4/5 and found it significantly faster than reading all files manually.

## Methodology Assessment

How well did Strategy B (divide by question) work?

### Strengths

- **Clear scope boundaries**: Each agent had a focused research question, avoiding overlap
- **Efficient parallelization**: Three agents could work simultaneously on different questions
- **Natural synthesis path**: Findings from each question type combined naturally into a coherent picture
- **Reduced redundancy**: Agents didn't repeat the same searches (unlike Strategy A file-based division)
- **Comprehensive coverage**: Architecture + Gaps + Workflows covers the spec from multiple angles
- **Discovery of key document**: Multiple agents independently found HumanWork-Evolution.md, validating its importance

### Weaknesses

- **Question boundaries can be fuzzy**: "Architecture criticisms" vs "gaps" had some overlap (e.g., observability problem)
- **Dependent insights split**: Authority Gradient appeared in both B2 (as gap) and B3 (as workflow improvement)
- **No shared discovery context**: B2 found HumanWork-Evolution.md, which would have helped B1's research
- **Variable scope difficulty**: Some questions (workflows) were more expansive than others (architecture criticisms)

### Would Recommend For

- **Documentation analysis** where questions naturally partition the content
- **Due diligence reviews** (legal, technical, financial angles)
- **Research synthesis** where multiple perspectives on the same corpus are needed
- **Gap analysis** where "what exists" vs "what's missing" are distinct questions
- **Any task where questions are more natural than file divisions**

### Not Recommended For

- **Code review** (files matter more than questions)
- **Tasks where answers span all questions** (high synthesis overhead)
- **Simple/small corpora** (parallelization overhead not worth it)

## Appendix: Agent Command Efficiency

| Metric              | B1  | B2  | B3  | Total |
| ------------------- | --- | --- | --- | ----- |
| Commands run        | 38  | 35  | 41  | 114   |
| Semantic searches   | 8   | 4   | 12  | 24    |
| Keyword searches    | 22  | 23  | 0   | 45    |
| Context extractions | 13  | 9   | 19  | 41    |
| Tree/Stats/Index    | 3   | 3   | 3   | 9     |

**Key observation**: B3 (workflows) used semantic search exclusively and found it more effective for their domain. B1 and B2 heavily relied on keyword search after semantic search failed. This suggests semantic search may work better for concrete concepts (workflows, collaboration) than abstract critiques (gaps, criticisms).
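
The row totals in the appendix table, and the search mix behind the key observation, can be recomputed from the per-agent counts. The snippet below is a small check using the figures exactly as reported. The helper names `rowTotals` and `semanticShare` are illustrative.

```typescript
// Per-agent counts copied from the appendix table, in the order [B1, B2, B3].
const counts: Record<string, [number, number, number]> = {
  "Commands run": [38, 35, 41],
  "Semantic searches": [8, 4, 12],
  "Keyword searches": [22, 23, 0],
  "Context extractions": [13, 9, 19],
  "Tree/Stats/Index": [3, 3, 3],
};

// Sums each metric across the three agents.
function rowTotals(table: Record<string, number[]>): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const [metric, values] of Object.entries(table)) {
    totals[metric] = values.reduce((sum, value) => sum + value, 0);
  }
  return totals;
}

// Fraction of an agent's searches (semantic + keyword) that were semantic.
function semanticShare(agentIndex: number): number {
  const semantic = counts["Semantic searches"][agentIndex];
  const keyword = counts["Keyword searches"][agentIndex];
  return semantic / (semantic + keyword);
}

console.log(rowTotals(counts));
// { 'Commands run': 114, 'Semantic searches': 24, 'Keyword searches': 45,
//   'Context extractions': 41, 'Tree/Stats/Index': 9 }
console.log(semanticShare(2)); // 1 (B3 ran no keyword searches)
```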
@@ -0,0 +1,207 @@
# Report: B1 - Architecture Critic Hunter

## Mission

Find architecture and design criticisms across all documentation

## Research Question

What architecture and design criticisms exist?

## Command Log

| #   | Command | Purpose | Result | Useful? |
| --- | ------- | ------- | ------ | ------- |
| 1   | `mdcontext --help` | Learn tool | Got full usage guide | Yes |
| 2   | `mdcontext index --embed --force` | Index all files with embeddings | 23 docs, 922 sections, 904 embeddings | Yes |
| 3   | `mdcontext search "architecture criticism problems design flaws limitations"` | Semantic search for criticisms | 1 result (ARCHITECTURAL_FOUNDATIONS) | Partial |
| 4   | `mdcontext search "design trade-offs weaknesses concerns issues"` | Semantic search | 0 results | No |
| 5   | `mdcontext search "failure problems complexity challenges"` | Semantic search | 0 results | No |
| 6   | `mdcontext search "failure" --mode keyword` | Keyword search | 10 results | Yes |
| 7   | `mdcontext context docs.amorphic/02-THE_FAILURE_OF_PURE_AUTOMATION.md -t 3000` | Get full failure analysis | Full document | Yes |
| 8   | `mdcontext search "limitations" --mode keyword` | Keyword search | 1 result | Yes |
| 9   | `mdcontext search "problem" --mode keyword` | Keyword search | 10 results | Yes |
| 10  | `mdcontext search "risk" --mode keyword` | Keyword search | 10 results | Yes |
| 11  | `mdcontext search "anti-pattern" --mode keyword` | Keyword search | 4 results | Yes |
| 12  | `mdcontext context docs/05-MEMORY_MODEL.md --section "Anti-Patterns"` | Get memory anti-patterns | 7 forbidden patterns | Yes |
| 13  | `mdcontext context docs/06-WORKFLOWS.md --section "Anti-Patterns"` | Get workflow anti-patterns | 8 forbidden patterns | Yes |
| 14  | `mdcontext search "concern" --mode keyword` | Keyword search | 10 results | Yes |
| 15  | `mdcontext search "brittle" --mode keyword` | Keyword search | 10 results | Yes |
| 16  | `mdcontext search "complexity" --mode keyword` | Keyword search | 9 results | Yes |
| 17  | `mdcontext search "overhead" --mode keyword` | Keyword search | 7 results | Yes |
| 18  | `mdcontext search "design patterns architecture decision"` | Semantic search | 10 results | Yes |
| 19  | `mdcontext context docs.amorphic/03-ARCHITECTURAL_FOUNDATIONS.md -t 2000` | Get architectural foundations | Full document | Yes |
|
|
34
|
+
| 20 | `mdcontext context docs.amorphic/05-TECHNICAL_IMPLEMENTATION_PATTERNS.md -t 2500` | Get implementation patterns | Full document | Yes |
|
|
35
|
+
| 21 | `mdcontext search "gap" --mode keyword` | Keyword search | 8 results | Yes |
|
|
36
|
+
| 22 | `mdcontext context docs.amorphic/02-THE_FAILURE_OF_PURE_AUTOMATION.md --section "Judgment Gap"` | Get judgment gap section | Detailed section | Yes |
|
|
37
|
+
| 23 | `mdcontext search "forbidden" --mode keyword` | Keyword search | 5 results | Yes |
|
|
38
|
+
| 24 | `mdcontext search "corrupt" --mode keyword` | Keyword search | 5 results | Yes |
|
|
39
|
+
| 25 | `mdcontext tree docs/01-ARCHITECTURE.md` | Get document outline | 47 sections | Yes |
|
|
40
|
+
| 26 | `mdcontext context docs/01-ARCHITECTURE.md --section "Why This Architecture Works"` | Get rationale | Brief justification | Yes |
|
|
41
|
+
| 27 | `mdcontext context docs/01-ARCHITECTURE.md --section "Architectural Invariants"` | Get invariants | 8 invariants | Yes |
|
|
42
|
+
| 28 | `mdcontext search "fail" --mode keyword` | Keyword search | 10 results | Yes |
|
|
43
|
+
| 29 | `mdcontext context docs/00-README.md --section "What Problem"` | Get problem statement | Core problem | Yes |
|
|
44
|
+
| 30 | `mdcontext search "cost" --mode keyword` | Keyword search | 10 results | Yes |
|
|
45
|
+
| 31 | `mdcontext context docs.llm/feedback.md -t 3000` | Get feedback document | Chat feedback analysis | Yes |
|
|
46
|
+
| 32 | `mdcontext search "traditional" --mode keyword` | Keyword search | 10 results | Yes |
|
|
47
|
+
| 33 | `mdcontext tree docs.llm/amorphic.md` | Get amorphic outline | Full outline | Yes |
|
|
48
|
+
| 34 | `mdcontext context docs.llm/amorphic.md --section "Open Questions"` | Get open questions | 3 open questions | Yes |
|
|
49
|
+
| 35 | `mdcontext context docs.llm/amorphic.md --section "Paradox of Automation"` | Get paradox section | Detailed section | Yes |
|
|
50
|
+
| 36 | `mdcontext search "wrong" --mode keyword` | Keyword search | 6 results | Yes |
|
|
51
|
+
| 37 | `mdcontext context docs.amorphic/04-THE_HUMAN-AGENT_COLLABORATION_MODEL.md --section "Observability Problem"` | Get observability problem | Key issue identified | Yes |
|
|
52
|
+
| 38 | `mdcontext search "scale" --mode keyword` | Keyword search | 10 results | Yes |
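A log like the one above can be tallied by command type with a few lines of code. This is a rough sketch: the `classify` helper is hypothetical, the sample is a handful of commands copied from the log, and it assumes (as the log's Purpose column does) that a `search` without `--mode keyword` is a semantic search.

```python
from collections import Counter


def classify(command: str) -> str:
    """Bucket one logged mdcontext command by what it was used for."""
    parts = command.split()
    sub = parts[1] if len(parts) > 1 else ""
    if sub == "search":
        # Keyword mode is requested explicitly; other searches count as semantic.
        return "keyword" if "--mode keyword" in command else "semantic"
    if sub == "context":
        return "context"
    return "other"  # tree, index, --help


# A small sample of commands taken verbatim from the log.
sample = [
    'mdcontext --help',
    'mdcontext index --embed --force',
    'mdcontext search "design trade-offs weaknesses concerns issues"',
    'mdcontext search "failure" --mode keyword',
    'mdcontext context docs/05-MEMORY_MODEL.md --section "Anti-Patterns"',
    'mdcontext tree docs/01-ARCHITECTURE.md',
]

tally = Counter(classify(c) for c in sample)
print(tally)
```

Running the same function over the full 38-row log would give per-category counts comparable to the efficiency appendix.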
|
|
53
|
+
|
|
54
|
+
## Findings
|
|
55
|
+
|
|
56
|
+
### Key Discoveries
|
|
57
|
+
|
|
58
|
+
#### 1. Criticisms of Traditional/Pure Automation (Major Theme)
|
|
59
|
+
|
|
60
|
+
The documentation extensively critiques traditional automation approaches:
|
|
61
|
+
|
|
62
|
+
- **Brittleness**: "The system becomes brittle not because any individual rule is wrong, but because the combinatorial explosion of rules creates a rigid lattice that cannot bend without breaking."
|
|
63
|
+
- **Coordination Trap**: Pure automation "multiplies coordination requirements by forcing human work into machine-readable formats that require constant translation and synchronization."
|
|
64
|
+
- **Innovation Strangulation**: "Teams avoid innovative approaches not because they're technically inferior, but because they're automation-incompatible."
|
|
65
|
+
- **Human Bottleneck Paradox**: Attempting to eliminate humans creates new bottlenecks in system configuration and exception handling.
|
|
66
|
+
- **Context Collapse**: Traditional systems treat "context as configuration rather than conversation."
|
|
67
|
+
- **Judgment Gap**: "Systems that handle 80% of cases flawlessly but create chaos in the remaining 20%."
|
|
68
|
+
|
|
69
|
+
#### 2. Agent System Criticisms (Self-Aware)
|
|
70
|
+
|
|
71
|
+
The documentation acknowledges problems with current agent systems:
|
|
72
|
+
|
|
73
|
+
> "Most agent systems fail at real work because they optimize for demos, single-shot tasks, and autonomous execution. They become opaque, brittle, hard to interrupt, impossible to rewind, and unsafe to scale."
|
|
74
|
+
> Source: docs/00-README.md
|
|
75
|
+
|
|
76
|
+
#### 3. Observability Problem
|
|
77
|
+
|
|
78
|
+
> "Most agent systems are black boxes. You send a request, wait, and get a result - with no visibility into what happened in between. When something goes wrong, you're left debugging phantom processes and mysterious failures. This opacity kills trust."
|
|
79
|
+
> Source: docs.amorphic/04-THE_HUMAN-AGENT_COLLABORATION_MODEL.md
|
|
80
|
+
|
|
81
|
+
#### 4. Anti-Patterns Explicitly Forbidden
|
|
82
|
+
|
|
83
|
+
**Memory Model Anti-Patterns:**
|
|
84
|
+
|
|
85
|
+
- Storing mutable state in Event Memory
|
|
86
|
+
- Treating Status Memory as authoritative
|
|
87
|
+
- Letting Semantic Memory drive execution
|
|
88
|
+
- Hiding Events from humans
|
|
89
|
+
- Creating circular dependencies between layers
|
|
90
|
+
- Bypassing Event Memory for "performance"
|
|
91
|
+
- Hard-deleting critical audit events
|
|
92
|
+
|
|
93
|
+
**Workflow Anti-Patterns:**
|
|
94
|
+
|
|
95
|
+
- Workflows that execute directly
|
|
96
|
+
- Workflows that mutate artifacts
|
|
97
|
+
- Workflows that allocate cost
|
|
98
|
+
- Workflows that own agents
|
|
99
|
+
- Hidden workflow state
|
|
100
|
+
- Workflows that become Turing-complete
|
|
101
|
+
- Mandatory workflows (at system level)
|
|
102
|
+
- Workflows that bypass Control Plane
|
|
103
|
+
|
|
104
|
+
#### 5. Architectural Invariants (Design Constraints)
|
|
105
|
+
|
|
106
|
+
The system explicitly maintains these constraints to avoid known issues:
|
|
107
|
+
|
|
108
|
+
- No hidden mutable state
|
|
109
|
+
- No irreversible execution
|
|
110
|
+
- No unobservable progress
|
|
111
|
+
- No agent-owned memory
|
|
112
|
+
- No loss of human authority
|
|
113
|
+
- No concurrent mutation of the same scope
|
|
114
|
+
- No execution without a Workspace
|
|
115
|
+
- No automatic flow from Org to Workspace
|
|
116
|
+
|
|
117
|
+
#### 6. Open Questions (Acknowledged Gaps)
|
|
118
|
+
|
|
119
|
+
> "How do we ensure HumanWork organizations remain aligned with human values as they become more autonomous?"
|
|
120
|
+
> "What are the limits of organizational intelligence? Are there problems that fundamentally require individual rather than collective cognition?"
|
|
121
|
+
> "How do we prevent organizational capture - scenarios where HumanWork systems optimize for their own perpetuation rather than their intended purposes?"
|
|
122
|
+
> Source: docs.llm/amorphic.md
|
|
123
|
+
|
|
124
|
+
#### 7. Substrate Problem
|
|
125
|
+
|
|
126
|
+
> "Implementation details leak into the conceptual model, making the workflow harder to reason about and modify."
|
|
127
|
+
> Source: docs.amorphic/03-ARCHITECTURAL_FOUNDATIONS.md
|
|
128
|
+
|
|
129
|
+
### Relevant Quotes/Sections Found
|
|
130
|
+
|
|
131
|
+
> "Pure automation assumes complete knowledge of the problem space. It requires that all possible states, transitions, and edge cases be enumerable at design time. This works beautifully for manufacturing widgets or processing financial transactions - domains where the rules are well-understood and the exceptions are genuinely exceptional. But knowledge work exists in a different regime entirely."
|
|
132
|
+
> Source: docs.amorphic/02-THE_FAILURE_OF_PURE_AUTOMATION.md, The Brittleness of Complete Systems
|
|
133
|
+
|
|
134
|
+
> "The paradox emerges when pure automation, in attempting to eliminate human bottlenecks, creates new bottlenecks in the form of system configuration, exception handling, and cross-system integration."
|
|
135
|
+
> Source: docs.amorphic/02-THE_FAILURE_OF_PURE_AUTOMATION.md, The Human Bottleneck Paradox
|
|
136
|
+
|
|
137
|
+
> "Traditional workflow systems model execution as directed acyclic graphs (DAGs) - nodes representing tasks, edges representing dependencies. This works well for batch processing and pipeline scenarios where the structure is known in advance. But it breaks down when workflows need to adapt their structure based on runtime conditions or accumulated learning."
|
|
138
|
+
> Source: docs.amorphic/03-ARCHITECTURAL_FOUNDATIONS.md, Component Relationships
|
|
139
|
+
|
|
140
|
+
> "If Status Memory cannot be rebuilt, it has become a source of truth and the system is corrupted."
|
|
141
|
+
> Source: docs/05-MEMORY_MODEL.md, The Hard Rule
|
|
142
|
+
|
|
143
|
+
### Answer to Research Question
|
|
144
|
+
|
|
145
|
+
**What architecture and design criticisms exist?**
|
|
146
|
+
|
|
147
|
+
The documentation contains extensive, self-aware architectural criticism organized into three categories:
|
|
148
|
+
|
|
149
|
+
1. **Criticisms of Traditional Approaches (external):** The docs thoroughly critique pure automation, traditional workflow systems (DAGs), black-box agent systems, and context-as-configuration approaches. These criticisms justify the HumanWork design decisions.
|
|
150
|
+
|
|
151
|
+
2. **Self-Imposed Constraints (internal guardrails):** The architecture explicitly forbids specific anti-patterns for both memory and workflows. These represent lessons learned about what NOT to do, and the docs treat a system that exhibits them as "corrupted".
|
|
152
|
+
|
|
153
|
+
3. **Acknowledged Open Questions (honest gaps):** The documentation admits uncertainty about alignment with human values at scale, limits of organizational intelligence, and preventing organizational capture.
|
|
154
|
+
|
|
155
|
+
The architectural philosophy is defensive - explicitly naming what can go wrong and building constraints to prevent it. The invariants and anti-patterns serve as architectural "unit tests" against known failure modes.
|
|
156
|
+
|
|
157
|
+
## Proposed Spec Changes
|
|
158
|
+
|
|
159
|
+
- [ ] Add section on "Known Limitations and Trade-offs" to acknowledge what HumanWork architecture sacrifices (e.g., raw execution speed for observability)
|
|
160
|
+
- [ ] Expand on how the Judgment Gap (80/20 problem) is specifically addressed beyond "humans intervene"
|
|
161
|
+
- [ ] Document concrete answers to the Open Questions or mark them as research priorities
|
|
162
|
+
- [ ] Add guidance on detecting when Status Memory has "become authoritative" before corruption
|
|
163
|
+
- [ ] Create decision framework for when DAG-style execution IS appropriate vs. adaptive execution
|
|
164
|
+
|
|
165
|
+
## Tool Evaluation
|
|
166
|
+
|
|
167
|
+
### What Worked Well
|
|
168
|
+
|
|
169
|
+
- Keyword search (`--mode keyword`) was highly effective for finding specific terms like "failure", "brittle", "anti-pattern"
|
|
170
|
+
- Section-targeted context (`--section "X"`) efficiently extracted exactly what I needed
|
|
171
|
+
- The `tree` command helped understand document structure before diving in
|
|
172
|
+
- Embedding indexing was fast and a one-time cost
|
|
173
|
+
- Token budget control (`-t`) helped manage context size
|
|
174
|
+
|
|
175
|
+
### What Was Frustrating
|
|
176
|
+
|
|
177
|
+
- Semantic search often returned 0 results for multi-word queries that should have matched
|
|
178
|
+
- Semantic search for "design trade-offs weaknesses concerns issues" returned nothing
|
|
179
|
+
- Semantic search for "failure problems complexity challenges" returned nothing
|
|
180
|
+
- Had to fall back to keyword search frequently after semantic search failed
|
|
181
|
+
- Multi-word keyword searches didn't work (e.g., "issue challenge gap" = 0 results)
|
|
182
|
+
- Unclear whether boolean operators are supported in keyword mode
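The manual fallback described in this list (try semantic search, then retry single terms in keyword mode) could be automated roughly as follows. This is a sketch under assumptions: `search_with_fallback` and the `run` callable are hypothetical, `run` stands in for whatever invokes the CLI and parses its output, and only the `search` subcommand and `--mode keyword` flag from the command log are used. `fake_run` is a stub with canned results so the example runs without mdcontext installed.

```python
from typing import Callable, List, Sequence

# A runner takes an argv list and returns the matching result identifiers.
Runner = Callable[[Sequence[str]], List[str]]


def search_with_fallback(
    query: str, fallback_terms: Sequence[str], run: Runner
) -> List[str]:
    """Try a semantic search first; if it is empty, merge single-term keyword searches."""
    results = run(["mdcontext", "search", query])
    if results:
        return results
    merged: List[str] = []
    for term in fallback_terms:
        for hit in run(["mdcontext", "search", term, "--mode", "keyword"]):
            if hit not in merged:  # keep first-seen order, drop duplicates
                merged.append(hit)
    return merged


def fake_run(argv: Sequence[str]) -> List[str]:
    """Stub runner: semantic searches return nothing, keyword searches use canned hits."""
    canned = {"failure": ["doc-a", "doc-b"], "brittle": ["doc-b", "doc-c"]}
    if "--mode" in argv:
        return canned.get(argv[2], [])
    return []


hits = search_with_fallback(
    "failure problems complexity challenges", ["failure", "brittle"], fake_run
)
print(hits)
```

Single terms are used for the fallback because, as noted above, multi-word keyword searches returned no results.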
|
|
183
|
+
|
|
184
|
+
### What Was Missing
|
|
185
|
+
|
|
186
|
+
- No fuzzy/stemmed search (had to search "fail" vs "failure" separately)
|
|
187
|
+
- No "search within results" or progressive refinement
|
|
188
|
+
- No way to get context around keyword matches without re-running with `context`
|
|
189
|
+
- Semantic search threshold/sensitivity adjustment not available
|
|
190
|
+
- No combined semantic+keyword hybrid mode
|
|
191
|
+
- Difficult to search for concepts without exact terms
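Until stemmed search exists, one workaround for the "fail" vs "failure" problem is to generate the variants up front and run one keyword search per variant. The sketch below is a naive suffix expansion, not real stemming; `keyword_variants`, `keyword_commands`, and the default suffix list are illustrative assumptions.

```python
def keyword_variants(stem, suffixes=("", "s", "ed", "ing", "ure")):
    """Naively expand a stem into surface forms by appending suffixes."""
    return [stem + suffix for suffix in suffixes]


def keyword_commands(stem):
    """Build one keyword-mode search command (as an argv list) per variant."""
    return [
        ["mdcontext", "search", term, "--mode", "keyword"]
        for term in keyword_variants(stem)
    ]


commands = keyword_commands("fail")
for argv in commands:
    print(" ".join(argv))
```

The suffix list will produce some non-words for other stems, so it is best treated as a per-term hint rather than a general rule.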
|
|
192
|
+
|
|
193
|
+
### Confidence Level
|
|
194
|
+
|
|
195
|
+
[X] Medium
|
|
196
|
+
|
|
197
|
+
The keyword search found the explicit criticisms comprehensively. However, I may have missed implicit criticisms or design concerns that don't use obvious negative terminology. Semantic search underperformed expectations.
|
|
198
|
+
|
|
199
|
+
### Would Use Again? (1-5)
|
|
200
|
+
|
|
201
|
+
**4** - Good for structured documentation analysis. Keyword search is reliable. Would use again but with clearer expectations that semantic search needs more work. The section-level context extraction is genuinely useful for targeted retrieval.
|
|
202
|
+
|
|
203
|
+
## Time & Efficiency
|
|
204
|
+
|
|
205
|
+
- Commands run: **38**
|
|
206
|
+
- Compared to reading all files: **Much less** - Would have taken 30+ minutes to read all docs manually. Tool-based search took approximately 15 minutes to find all relevant criticisms.
|
|
207
|
+
- Token efficiency: Reduced ~150k tokens of docs to targeted extracts totaling ~15k tokens of relevant content
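As a quick check of the token figures above, the approximate totals reported in this section work out as follows. The variable names are illustrative, and both inputs are the report's own rough estimates.

```python
# Approximate figures from the "Token efficiency" line.
full_corpus_tokens = 150_000
extracted_tokens = 15_000

# Fraction of the corpus that did not need to be read.
reduction = 1 - extracted_tokens / full_corpus_tokens
# How many times smaller the extracts are than the full corpus.
ratio = full_corpus_tokens / extracted_tokens

print(f"{reduction:.0%} fewer tokens, {ratio:.0f}x smaller")
```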
|