groundswell 0.0.1

Files changed (120)
  1. package/.claude/settings.local.json +9 -0
  2. package/.claude/system_prompts/task-breakdown.md +100 -0
  3. package/PRPs/001-hierarchical-workflow-engine.md +2438 -0
  4. package/PRPs/PRDs/001-hierarchical-workflow-engine.md +543 -0
  5. package/PRPs/PRDs/002-agent-prompt.md +390 -0
  6. package/PRPs/PRDs/003-agent-prompt.md +943 -0
  7. package/PRPs/PRDs/004-agent-prompt.md +1136 -0
  8. package/PRPs/PRDs/tasks-001.json +492 -0
  9. package/PRPs/README.md +83 -0
  10. package/PRPs/templates/prp_base.md +222 -0
  11. package/README.md +218 -0
  12. package/docs/agent.md +422 -0
  13. package/docs/prompt.md +419 -0
  14. package/docs/workflow.md +600 -0
  15. package/examples/README.md +244 -0
  16. package/examples/examples/01-basic-workflow.ts +100 -0
  17. package/examples/examples/02-decorator-options.ts +217 -0
  18. package/examples/examples/03-parent-child.ts +241 -0
  19. package/examples/examples/04-observers-debugger.ts +340 -0
  20. package/examples/examples/05-error-handling.ts +387 -0
  21. package/examples/examples/06-concurrent-tasks.ts +352 -0
  22. package/examples/examples/07-agent-loops.ts +432 -0
  23. package/examples/examples/08-sdk-features.ts +667 -0
  24. package/examples/examples/09-reflection.ts +573 -0
  25. package/examples/examples/10-introspection.ts +550 -0
  26. package/examples/index.ts +143 -0
  27. package/examples/utils/helpers.ts +57 -0
  28. package/llms_full.txt +5890 -0
  29. package/package.json +63 -0
  30. package/plan/P1P2/PRP.md +527 -0
  31. package/plan/P1P2/research/LRU_CACHE_BEST_PRACTICES.md +1929 -0
  32. package/plan/P1P2/research/LRU_CACHE_CODE_PATTERNS.md +857 -0
  33. package/plan/P1P2/research/LRU_CACHE_INTEGRATION_GUIDE.md +738 -0
  34. package/plan/P1P2/research/LRU_CACHE_RESEARCH_INDEX.md +424 -0
  35. package/plan/P1P2/research/REFLECTION_INDEX.md +291 -0
  36. package/plan/P1P2/research/REFLECTION_RESEARCH_REPORT.md +1342 -0
  37. package/plan/P1P2/research/RESEARCH_SUMMARY.md +342 -0
  38. package/plan/P1P2/research/anthropic-sdk.md +174 -0
  39. package/plan/P1P2/research/async-local-storage.md +200 -0
  40. package/plan/P1P2/research/reflection-code-patterns.md +1205 -0
  41. package/plan/P1P2/research/reflection-decision-matrix.md +421 -0
  42. package/plan/P1P2/research/reflection-implementation-guide.md +1341 -0
  43. package/plan/P1P2/research/reflection-integration-guide.md +834 -0
  44. package/plan/P1P2/research/reflection-patterns.md +1468 -0
  45. package/plan/P1P2/research/reflection-quick-reference.md +558 -0
  46. package/plan/P1P2/research/zod-schema.md +152 -0
  47. package/plan/P3P4/PRP.md +1388 -0
  48. package/plan/P3P4/research/caching-lru.md +116 -0
  49. package/plan/P3P4/research/introspection-tools.md +177 -0
  50. package/plan/P3P4/research/reflection-patterns.md +117 -0
  51. package/plan/P4P5/PRP.md +1136 -0
  52. package/plan/P4P5/research/RESEARCH_SUMMARY.md +151 -0
  53. package/plan/architecture/external_deps.md +358 -0
  54. package/plan/architecture/system_context.md +242 -0
  55. package/plan/backlog.json +867 -0
  56. package/plan/research/INTROSPECTION_RESEARCH_SUMMARY.md +378 -0
  57. package/plan/research/README-INTROSPECTION.md +352 -0
  58. package/plan/research/agent-introspection-patterns.md +1085 -0
  59. package/plan/research/introspection-security-guide.md +928 -0
  60. package/plan/research/introspection-tool-examples.md +875 -0
  61. package/scripts/generate-llms-full.ts +206 -0
  62. package/src/__tests__/integration/agent-workflow.test.ts +256 -0
  63. package/src/__tests__/integration/tree-mirroring.test.ts +114 -0
  64. package/src/__tests__/unit/agent.test.ts +169 -0
  65. package/src/__tests__/unit/cache-key.test.ts +182 -0
  66. package/src/__tests__/unit/cache.test.ts +172 -0
  67. package/src/__tests__/unit/context.test.ts +138 -0
  68. package/src/__tests__/unit/decorators.test.ts +100 -0
  69. package/src/__tests__/unit/introspection-tools.test.ts +277 -0
  70. package/src/__tests__/unit/prompt.test.ts +135 -0
  71. package/src/__tests__/unit/reflection.test.ts +210 -0
  72. package/src/__tests__/unit/tree-debugger.test.ts +85 -0
  73. package/src/__tests__/unit/workflow.test.ts +81 -0
  74. package/src/cache/cache-key.ts +244 -0
  75. package/src/cache/cache.ts +236 -0
  76. package/src/cache/index.ts +8 -0
  77. package/src/core/agent.ts +573 -0
  78. package/src/core/context.ts +119 -0
  79. package/src/core/event-tree.ts +260 -0
  80. package/src/core/factory.ts +123 -0
  81. package/src/core/index.ts +17 -0
  82. package/src/core/logger.ts +87 -0
  83. package/src/core/mcp-handler.ts +184 -0
  84. package/src/core/prompt.ts +150 -0
  85. package/src/core/workflow-context.ts +349 -0
  86. package/src/core/workflow.ts +302 -0
  87. package/src/debugger/index.ts +1 -0
  88. package/src/debugger/tree-debugger.ts +210 -0
  89. package/src/decorators/index.ts +3 -0
  90. package/src/decorators/observed-state.ts +95 -0
  91. package/src/decorators/step.ts +139 -0
  92. package/src/decorators/task.ts +96 -0
  93. package/src/examples/index.ts +2 -0
  94. package/src/examples/tdd-orchestrator.ts +65 -0
  95. package/src/examples/test-cycle-workflow.ts +64 -0
  96. package/src/index.ts +140 -0
  97. package/src/reflection/index.ts +5 -0
  98. package/src/reflection/reflection.ts +407 -0
  99. package/src/tools/index.ts +36 -0
  100. package/src/tools/introspection.ts +464 -0
  101. package/src/types/agent.ts +90 -0
  102. package/src/types/decorators.ts +25 -0
  103. package/src/types/error-strategy.ts +13 -0
  104. package/src/types/error.ts +20 -0
  105. package/src/types/events.ts +74 -0
  106. package/src/types/index.ts +55 -0
  107. package/src/types/logging.ts +24 -0
  108. package/src/types/observer.ts +18 -0
  109. package/src/types/prompt.ts +40 -0
  110. package/src/types/reflection.ts +117 -0
  111. package/src/types/sdk-primitives.ts +128 -0
  112. package/src/types/snapshot.ts +14 -0
  113. package/src/types/workflow-context.ts +163 -0
  114. package/src/types/workflow.ts +37 -0
  115. package/src/utils/id.ts +11 -0
  116. package/src/utils/index.ts +3 -0
  117. package/src/utils/observable.ts +77 -0
  118. package/tasks.json +0 -0
  119. package/tsconfig.json +22 -0
  120. package/vitest.config.ts +16 -0
@@ -0,0 +1,378 @@
1
+ # Agent Introspection Research Summary
2
+
3
+ **Comprehensive research on agent introspection and self-awareness patterns in AI orchestration frameworks**
4
+
5
+ ---
6
+
7
+ ## Documents Generated
8
+
9
+ This research package includes 4 comprehensive documents:
10
+
11
+ ### 1. **agent-introspection-patterns.md** (Main Reference)
12
+ - Complete introspection capability framework
13
+ - Anthropic Tool Format specifications (JSON Schema)
14
+ - Hierarchy inspection patterns
15
+ - Security boundaries and read-only access patterns
16
+ - Self-modification capabilities with safety guards
17
+ - Ready-to-implement patterns for Groundswell
18
+
19
+ ### 2. **introspection-tool-examples.md** (Practical Implementation)
20
+ - 7 complete tool definitions with JSON schemas
21
+ - Example invocations and responses for each tool
22
+ - Agent usage examples showing real-world patterns
23
+ - Integration patterns (4 common workflows)
24
+ - Security checklist for implementation
25
+
26
+ ### 3. **introspection-security-guide.md** (Security Focus)
27
+ - Detailed threat models with attack scenarios
28
+ - 4 major threat vectors with mitigations
29
+ - Code examples for security patterns
30
+ - Implementation checklist
31
+ - Operational recommendations
32
+ - Testing and incident response procedures
33
+
34
+ ### 4. **INTROSPECTION_RESEARCH_SUMMARY.md** (This File)
35
+ - Overview of all research
36
+ - Quick reference guide
37
+ - Implementation roadmap for Groundswell
38
+
39
+ ---
40
+
41
+ ## Key Findings
42
+
43
+ ### 1. Introspection is Essential but Risky
44
+
45
+ **Why Agents Need Introspection:**
46
+ - Adaptive decision-making (agents must understand their context)
47
+ - Error recovery (agents must know what failed and why)
48
+ - Resource optimization (agents must check cache status)
49
+ - Self-improvement (agents must learn from history)
50
+
51
+ **Key Risks:**
52
+ - Information leakage (secrets in ancestor state)
53
+ - Prompt injection (untrusted data in outputs)
54
+ - Privilege escalation (recursive workflow spawning)
55
+ - Denial of service (unbounded queries)
56
+
57
+ ### 2. Tool-Based Model is Superior
58
+
59
+ Introspection should be implemented as **explicit read-only tools**, not via emergent agent capabilities:
60
+
61
+ - Anthropic's introspection research describes models' emergent introspective ability as "highly unreliable and limited in scope"
62
+ - Explicit tools provide structured, validated data
63
+ - Tools enable access control and audit logging
64
+ - Tools reduce injection risk by returning typed data rather than free-form text
65
+
66
+ ### 3. Anthropic Tool Format is Standardized
67
+
68
+ All introspection tools follow Anthropic's tool definition schema:
69
+
70
+ ```json
71
+ {
72
+ "name": "tool_name",
73
+ "description": "Clear, detailed description",
74
+ "input_schema": {
75
+ "type": "object",
76
+ "properties": { /* JSON Schema */ },
77
+ "required": []
78
+ }
79
+ }
80
+ ```
81
+
82
+ This format:
83
+ - Is well-documented by Anthropic
84
+ - Works with MCP servers
85
+ - Integrates with Anthropic API
86
+ - Enables strict schema validation
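To make the schema shape above concrete, here is a minimal TypeScript sketch of one tool definition. Only the `name` / `description` / `input_schema` layout and the tool name `workflow_inspect_hierarchy` come from this summary; the `ToolDefinition` interface, the `max_depth` and `include_siblings` parameters, and the `isWellFormed` helper are assumptions made for illustration.

```typescript
// Hypothetical type mirroring the JSON shape shown above.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, unknown>; // JSON Schema fragments
    required: string[];
  };
}

// Illustrative definition; the parameters are assumed, not taken from the research docs.
const inspectHierarchyTool: ToolDefinition = {
  name: "workflow_inspect_hierarchy",
  description:
    "Return the calling workflow's position in the workflow tree (read-only).",
  input_schema: {
    type: "object",
    properties: {
      max_depth: { type: "integer", minimum: 1, maximum: 20 },
      include_siblings: { type: "boolean" },
    },
    required: [],
  },
};

// Minimal structural check: the tool is named and every required key
// is declared under properties.
function isWellFormed(tool: ToolDefinition): boolean {
  const declared = Object.keys(tool.input_schema.properties);
  return (
    tool.name.length > 0 &&
    tool.input_schema.required.every((key) => declared.includes(key))
  );
}
```

A check like `isWellFormed` only catches structural mistakes; full input validation would still need a JSON Schema validator.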
87
+
88
+ ### 4. Hierarchy Access Pattern
89
+
90
+ Safe hierarchy inspection requires:
91
+
92
+ ```
93
+ Agent → Asks: "Where am I in the tree?"
94
+ → Gets: HierarchyInfo (current, parent, ancestors, siblings)
95
+ → Can traverse: Up (ancestors), Down (children), Sideways (siblings)
96
+ → Cannot: Modify, execute, or see implementation details
97
+ ```
98
+
99
+ Key insight: **Read-only, structured data with explicit filters**
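The access pattern above could be sketched as follows. The `HierarchyInfo` name and its four fields come from this summary; the `WorkflowNode` shape, the `NodeSummary` type, and the `inspectHierarchy` function are hypothetical and may not match Groundswell's actual tree types.

```typescript
// Hypothetical in-memory node shape.
interface WorkflowNode {
  id: string;
  name: string;
  parent: WorkflowNode | null;
  children: WorkflowNode[];
}

// Only identifying data is exposed, never the node object itself.
interface NodeSummary {
  id: string;
  name: string;
}

interface HierarchyInfo {
  current: NodeSummary;
  parent: NodeSummary | null;
  ancestors: NodeSummary[]; // nearest ancestor first
  siblings: NodeSummary[];
}

const summarize = (node: WorkflowNode): NodeSummary => ({
  id: node.id,
  name: node.name,
});

// Walks upward at most maxDepth levels and returns a shallowly frozen copy,
// so callers cannot reach or mutate the live tree through the result.
function inspectHierarchy(
  node: WorkflowNode,
  maxDepth = 20,
): Readonly<HierarchyInfo> {
  const ancestors: NodeSummary[] = [];
  let cursor = node.parent;
  while (cursor !== null && ancestors.length < maxDepth) {
    ancestors.push(summarize(cursor));
    cursor = cursor.parent;
  }
  const siblings = node.parent
    ? node.parent.children.filter((child) => child !== node).map(summarize)
    : [];
  return Object.freeze({
    current: summarize(node),
    parent: node.parent ? summarize(node.parent) : null,
    ancestors,
    siblings,
  });
}
```

Returning summaries instead of node references is what keeps the result read-only in practice: the agent receives data about the tree, not handles into it.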
100
+
101
+ ### 5. Seven Core Tools Identified
102
+
103
+ ```
104
+ 1. workflow_inspect_hierarchy → "Where am I?"
105
+ 2. workflow_read_ancestor_outputs → "What did my parents do?"
106
+ 3. workflow_inspect_cache → "What's cached?"
107
+ 4. workflow_read_event_history → "What happened?"
108
+ 5. workflow_inspect_state_snapshot → "What was the state?"
109
+ 6. workflow_spawn_child → "Can I create a child?"
110
+ 7. workflow_generate_dynamic_prompt → "What prompt should child use?"
111
+ ```
112
+
113
+ Each has specific security considerations and limits.
114
+
115
+ ### 6. Security is Implementable
116
+
117
+ Research from AWS, Google, and Anthropic shows proven patterns:
118
+
119
+ - **Secret Protection**: Filter before returning (redaction patterns)
120
+ - **Injection Prevention**: Validate output data, treat as untrusted input
121
+ - **Privilege Escalation Prevention**: Template-based spawning, depth degradation
122
+ - **DoS Prevention**: Hard limits (depth, results, time, rate)
123
+
124
+ All patterns are concrete and implementable.
125
+
126
+ ---
127
+
128
+ ## Groundswell-Specific Recommendations
129
+
130
+ ### Phase 1: Foundation (Weeks 1-2)
131
+
132
+ Implement core introspection service without spawning:
133
+
134
+ 1. **Create `WorkflowIntrospectionService`** (src/core/introspection-service.ts)
135
+ - Leverage existing `EventTreeHandle`
136
+ - Add `HierarchyInfo` data structure
137
+ - Implement ancestor/sibling traversal
138
+
139
+ 2. **Create Introspection Tools** (src/core/introspection-tools.ts)
140
+ - Tools 1-5: Inspection tools (read-only)
141
+ - Register with Agent
142
+ - Add to Agent configuration
143
+
144
+ 3. **Add Secret Filtering**
145
+ - Redaction patterns for common secrets
146
+ - Filter applied before returning state snapshots
147
+ - Unit tests for secret detection
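A redaction filter of the kind described in step 3 might look like the sketch below. The key and value patterns are illustrative assumptions, not patterns from the research documents, and the `redact` function name is hypothetical. The sketch does not handle cyclic objects.

```typescript
// Illustrative patterns only; a real deployment would tune and extend these.
const SECRET_KEY_PATTERN = /(password|secret|token|api[_-]?key)/i;
const SECRET_VALUE_PATTERNS: RegExp[] = [
  /sk-[A-Za-z0-9_-]{16,}/g, // API-key-like strings
  /Bearer\s+[A-Za-z0-9._-]+/g, // bearer tokens
];

// Recursively copies a value, replacing anything that looks like a secret.
// Fields whose key matches SECRET_KEY_PATTERN are redacted wholesale;
// strings are scanned for secret-looking substrings.
function redact(value: unknown): unknown {
  if (typeof value === "string") {
    return SECRET_VALUE_PATTERNS.reduce(
      (text, pattern) => text.replace(pattern, "[REDACTED]"),
      value,
    );
  }
  if (Array.isArray(value)) {
    return value.map(redact);
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SECRET_KEY_PATTERN.test(key) ? "[REDACTED]" : redact(inner);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through
}
```

Because `redact` builds a copy, the original state is left untouched and the filter can be applied at the tool boundary just before a snapshot is returned.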
148
+
149
+ ### Phase 2: Safety (Weeks 3-4)
150
+
151
+ Add security protections:
152
+
153
+ 1. **Input Validation**
154
+ - Validate all tool inputs against schema
155
+ - Check all limit parameters
156
+ - Enforce time range limits
157
+
158
+ 2. **Output Sanitization**
159
+ - Ancestor output validation
160
+ - Injection detection
161
+ - Result truncation
162
+
163
+ 3. **Audit Logging**
164
+ - Log all introspection queries
165
+ - Track query metrics
166
+ - Set up alerting for suspicious patterns
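The input-validation and audit-logging steps above could be combined roughly as follows. Everything here is an assumption for illustration: the `HistoryQuery` shape, the `MAX_LIMIT` and `MAX_RANGE_MS` caps, and the in-memory `auditLog` array standing in for a real log sink.

```typescript
interface HistoryQuery {
  limit?: number;
  sinceMs?: number;
  untilMs?: number;
}

interface ValidatedQuery {
  limit: number;
  sinceMs: number;
  untilMs: number;
}

const MAX_LIMIT = 1000; // assumed cap on returned events
const MAX_RANGE_MS = 24 * 60 * 60 * 1000; // assumed 24-hour window

// Rejects malformed input, clamps the limit, and bounds the time range.
function validateHistoryQuery(query: HistoryQuery, nowMs: number): ValidatedQuery {
  const limit = query.limit ?? 100;
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const untilMs = query.untilMs ?? nowMs;
  const sinceMs = query.sinceMs ?? untilMs - MAX_RANGE_MS;
  if (sinceMs > untilMs) {
    throw new RangeError("sinceMs must not exceed untilMs");
  }
  if (untilMs - sinceMs > MAX_RANGE_MS) {
    throw new RangeError("time range too large");
  }
  return { limit: Math.min(limit, MAX_LIMIT), sinceMs, untilMs };
}

interface AuditEntry {
  tool: string;
  agentId: string;
  atMs: number;
  params: ValidatedQuery;
}

// In-memory stand-in for a real audit sink.
const auditLog: AuditEntry[] = [];

function recordQuery(
  tool: string,
  agentId: string,
  params: ValidatedQuery,
  atMs: number,
): void {
  auditLog.push({ tool, agentId, atMs, params });
}
```

Logging the validated parameters rather than the raw input keeps audit entries bounded in size and comparable across queries.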
167
+
168
+ ### Phase 3: Self-Modification (Weeks 5-6)
169
+
170
+ Add controlled spawning:
171
+
172
+ 1. **Workflow Templates**
173
+ - Define 3-5 templates for different use cases
174
+ - Specify capabilities and limits per template
175
+ - Hardcode to prevent arbitrary workflow creation
176
+
177
+ 2. **Spawn Tool**
178
+ - Implement `workflow_spawn_child`
179
+ - Template validation
180
+ - Resource enforcement
181
+
182
+ 3. **Dynamic Prompts**
183
+ - Implement `workflow_generate_dynamic_prompt`
184
+ - Template-based generation
185
+ - Safety validation
186
+
187
+ ### Phase 4: Operations (Weeks 7-8)
188
+
189
+ Make it production-ready:
190
+
191
+ 1. **Monitoring**
192
+ - Dashboard for introspection usage
193
+ - Alerts for anomalous patterns
194
+ - Query performance metrics
195
+
196
+ 2. **Documentation**
197
+ - Agent usage guide
198
+ - Security guidelines
199
+ - Troubleshooting guide
200
+
201
+ 3. **Testing**
202
+ - Unit tests (security focus)
203
+ - Integration tests (realistic workflows)
204
+ - Penetration testing (external)
205
+
206
+ ---
207
+
208
+ ## Implementation Checklist
209
+
210
+ ### Security-Critical Items
211
+
212
+ - [ ] All inspection tools (tools 1-5) are read-only (no state modification)
213
+ - [ ] Secrets filtered from state snapshots
214
+ - [ ] Ancestor outputs validated for injection
215
+ - [ ] Max limits enforced on all queries
216
+ - [ ] Spawning requires template approval
217
+ - [ ] All queries logged and audit-traceable
218
+
219
+ ### Required Code Changes
220
+
221
+ ```
222
+ src/core/
223
+ ├── introspection-service.ts (NEW)
224
+ ├── introspection-tools.ts (NEW)
225
+ └── workflow-introspection-limits.ts (NEW)
226
+
227
+ src/types/
228
+ ├── introspection.ts (NEW)
229
+ └── existing updates
230
+
231
+ src/__tests__/
232
+ ├── unit/introspection-service.test.ts (NEW)
233
+ ├── unit/introspection-security.test.ts (NEW)
234
+ └── integration/introspection-workflow.test.ts (NEW)
235
+ ```
236
+
237
+ ### Configuration Needed
238
+
239
+ ```typescript
240
+ // Default introspection limits
241
+ const limits = {
242
+ max_ancestry_depth: 20,
243
+ max_result_items: 10000,
244
+ max_result_bytes: 10 * 1024 * 1024,
245
+ max_query_time_ms: 5000,
246
+ max_concurrent_queries: 5
247
+ };
248
+
249
+ // Workflow templates
250
+ const templates = {
251
+ 'data_validation': { /* ... */ },
252
+ 'data_transformation': { /* ... */ },
253
+ 'data_analysis': { /* ... */ }
254
+ };
255
+
256
+ // State access policy
257
+ const statePolicy: Record<string, 'public' | 'sensitive' | 'secret'> = {
258
+ 'field_name': 'public' // one of 'public' | 'sensitive' | 'secret'
259
+ };
260
+ ```
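As a sketch of how a state access policy like the one above might be applied, the function below filters a snapshot field by field. The field names, the `filterSnapshot` function, and the choices to mask `sensitive` fields, omit `secret` fields, and treat unlisted fields as secret are all assumptions, not behavior described in the research documents.

```typescript
type Sensitivity = "public" | "sensitive" | "secret";

// Example policy with hypothetical field names.
const examplePolicy: Record<string, Sensitivity> = {
  status: "public",
  customerEmail: "sensitive",
  apiToken: "secret",
};

// public    -> returned as-is
// sensitive -> key kept, value masked
// secret    -> omitted entirely
// Fields missing from the policy are treated as secret (deny by default).
function filterSnapshot(
  state: Record<string, unknown>,
  policy: Record<string, Sensitivity>,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [field, value] of Object.entries(state)) {
    const level: Sensitivity = Object.prototype.hasOwnProperty.call(policy, field)
      ? (policy[field] as Sensitivity)
      : "secret";
    if (level === "public") {
      out[field] = value;
    } else if (level === "sensitive") {
      out[field] = "[REDACTED]";
    }
  }
  return out;
}
```

Deny-by-default means a newly added state field stays hidden from introspection until someone classifies it explicitly.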
261
+
262
+ ---
263
+
264
+ ## Research Sources
265
+
266
+ ### Primary Sources
267
+
268
+ 1. **Anthropic**
269
+ - Introspection Research: https://www.anthropic.com/research/introspection
270
+ - Tool Use Docs: https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview
271
+ - Agent SDK: https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk
272
+
273
+ 2. **Model Context Protocol**
274
+ - Tool Specification: https://modelcontextprotocol.io/docs/concepts/tools
275
+ - Server Implementation Guide
276
+
277
+ 3. **Security Research**
278
+ - AgentArmor (Prompt Injection Defense): https://arxiv.org/html/2508.01249v2
279
+ - Design Patterns for Security: https://arxiv.org/pdf/2506.08837
280
+ - Hierarchical Multi-Agent Systems: https://arxiv.org/html/2508.12683
281
+
282
+ 4. **Cloud Providers**
283
+ - AWS Bedrock Agents: https://aws.amazon.com/blogs/machine-learning/securing-amazon-bedrock-agents
284
+ - Google ADK Safety: https://google.github.io/adk-docs/safety/
285
+ - Azure AI Agent Service
286
+
287
+ 5. **Orchestration Frameworks**
288
+ - LangGraph Workflows: https://docs.langchain.com/oss/python/langgraph/workflows-agents
289
+ - CrewAI and LangChain patterns
290
+ - Microsoft Multi-Agent Intelligence
291
+
292
+ ### Secondary Sources
293
+
294
+ - Access Control in AI Era (Auth0)
295
+ - Secure Database Access Patterns
296
+ - Multi-Agent Security (DEV Community)
297
+ - Rate Limiting and DDoS Prevention
298
+ - Audit Logging Best Practices
299
+
300
+ ---
301
+
302
+ ## Next Steps
303
+
304
+ ### For Groundswell Maintainers
305
+
306
+ 1. **Review** all three detailed documents
307
+ 2. **Choose** implementation phases based on timeline
308
+ 3. **Create** GitHub issues for each phase
309
+ 4. **Assign** based on team capacity
310
+ 5. **Plan** security review before Phase 3
311
+
312
+ ### For Security Team
313
+
314
+ 1. **Review** threat models in security guide
315
+ 2. **Validate** mitigation strategies
316
+ 3. **Plan** penetration testing
317
+ 4. **Create** operational runbooks
318
+ 5. **Set up** monitoring and alerting
319
+
320
+ ### For Agent Developers
321
+
322
+ 1. **Familiarize** with tool specifications
323
+ 2. **Understand** limitations and best practices
324
+ 3. **Review** security guidelines
325
+ 4. **Test** introspection in test environments
326
+ 5. **Report** issues via standard channels
327
+
328
+ ---
329
+
330
+ ## Key Takeaways
331
+
332
+ 1. **Introspection is necessary** for adaptive agents, but must be carefully controlled
333
+ 2. **Tool-based implementation** is superior to emergent capabilities
334
+ 3. **Read-only access** prevents most misuse
335
+ 4. **Hard limits** on queries prevent DoS
336
+ 5. **Template-based spawning** prevents privilege escalation
337
+ 6. **Secret filtering** prevents data leakage
338
+ 7. **Audit logging** enables detection and response
339
+
340
+ ---
341
+
342
+ ## Questions and Answers
343
+
344
+ **Q: Do agents really need to introspect themselves?**
345
+ A: Yes. Modern agentic patterns (Reflexion, ReAct) require agents to reflect on their reasoning and execution. Without introspection tools, agents resort to unreliable emergent capabilities.
346
+
347
+ **Q: Isn't this just letting agents inspect arbitrary code?**
348
+ A: No. Introspection tools return **data** about workflow execution (hierarchy, outputs, events), not code. All data is validated, filtered, and limited.
349
+
350
+ **Q: Can an agent privilege escalate through spawning?**
351
+ A: Only if allowed by configuration. Template-based spawning with approval and depth degradation prevents escalation. Root workflows get the most capabilities; leaf workflows get none.
352
+
353
+ **Q: What if an ancestor gets compromised?**
354
+ A: Ancestor outputs are treated as untrusted input. All data is validated, injection patterns are detected, and sensitive fields are redacted. These layers are designed to stop a compromised ancestor from injecting prompts into its children.
355
+
356
+ **Q: How do you prevent information leakage?**
357
+ A: Multiple layers: secrets never stored in observed state, output sanitization with redaction patterns, state access policies, and audit logging of all queries.
358
+
359
+ **Q: Is this production-ready?**
360
+ A: The research is ready. Implementation will take 6-8 weeks in phases (6 weeks through Phase 3, 8 weeks including Phase 4). Phase 1 (inspection) is lowest risk; Phase 3 (spawning) requires the most scrutiny.
361
+
362
+ ---
363
+
364
+ ## Contact and Support
365
+
366
+ For questions about this research or implementation:
367
+
368
+ 1. Review the detailed documents first
369
+ 2. Check the security guide for threat models
370
+ 3. Reference the tool examples for implementation details
371
+ 4. File GitHub issues for clarification
372
+
373
+ ---
374
+
375
+ **Status:** Ready for Implementation
376
+ **Last Updated:** December 8, 2025
377
+ **Confidence Level:** HIGH (based on production research from Anthropic, AWS, Google)
378
+