rhachet-roles-bhrain 0.1.1 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/roles/architect/briefs/brains.replic/arc000.sources.[catalog].md +178 -0
- package/dist/roles/architect/briefs/brains.replic/arc101.concept.llm.[article].md +25 -0
- package/dist/roles/architect/briefs/brains.replic/arc102.concept.repl.[article].md +33 -0
- package/dist/roles/architect/briefs/brains.replic/arc103.concept.replic-brain.[article].md +35 -0
- package/dist/roles/architect/briefs/brains.replic/arc104.concept.context-window.[article].md +40 -0
- package/dist/roles/architect/briefs/brains.replic/arc105.concept.system-prompt.[article].md +44 -0
- package/dist/roles/architect/briefs/brains.replic/arc106.concept.tool-definition.[article].md +59 -0
- package/dist/roles/architect/briefs/brains.replic/arc107.concept.tool-call.[article].md +54 -0
- package/dist/roles/architect/briefs/brains.replic/arc108.concept.tool-result.[article].md +58 -0
- package/dist/roles/architect/briefs/brains.replic/arc109.concept.agentic-loop.[article].md +62 -0
- package/dist/roles/architect/briefs/brains.replic/arc110.concept.reasoning-trace.[article].md +58 -0
- package/dist/roles/architect/briefs/brains.replic/arc111.concept.react-pattern.[article].md +65 -0
- package/dist/roles/architect/briefs/brains.replic/arc112.concept.reflexion-pattern.[article].md +68 -0
- package/dist/roles/architect/briefs/brains.replic/arc113.concept.tree-of-thoughts.[article].md +76 -0
- package/dist/roles/architect/briefs/brains.replic/arc114.concept.self-consistency.[article].md +73 -0
- package/dist/roles/architect/briefs/brains.replic/arc115.concept.lats-pattern.[article].md +78 -0
- package/dist/roles/architect/briefs/brains.replic/arc116.concept.context-compaction.[article].md +71 -0
- package/dist/roles/architect/briefs/brains.replic/arc117.concept.subagent.[article].md +71 -0
- package/dist/roles/architect/briefs/brains.replic/arc118.concept.extended-thinking.[article].md +69 -0
- package/dist/roles/architect/briefs/brains.replic/arc119.concept.mcp.[article].md +78 -0
- package/dist/roles/architect/briefs/brains.replic/arc120.concept.session.[article].md +67 -0
- package/dist/roles/architect/briefs/brains.replic/arc121.concept.message.[article].md +79 -0
- package/dist/roles/architect/briefs/brains.replic/arc122.concept.plan-and-solve.[article].md +80 -0
- package/dist/roles/architect/briefs/brains.replic/arc150.concepts.treestruct.[article].md +126 -0
- package/dist/roles/architect/briefs/brains.replic/arc201.blueprint.claude-code.[article].md +417 -0
- package/dist/roles/architect/briefs/brains.replic/arc201.blueprint.claude-code.zoomin.reason.[article].md +507 -0
- package/dist/roles/architect/briefs/brains.replic/arc202.blueprint.codex.[article].md +354 -0
- package/dist/roles/architect/briefs/brains.replic/arc300.blueprints.comparison.[catalog].md +284 -0
- package/dist/roles/thinker/briefs/term=brain.atomic_vs_replic.md +8 -0
- package/package.json +3 -2

package/dist/roles/architect/briefs/brains.replic/arc000.sources.[catalog].md
@@ -0,0 +1,178 @@
# research sources catalog

## .what

a comprehensive catalog of sources documenting replic brain architectures, reasoning patterns, and performance benchmarks.

## .why

these sources form the foundation for understanding how replic brains (LLMs behind REPLs) are constructed, compared, and evaluated.

---

## foundational papers

### reasoning patterns

| # | title | authors | url | date | relevance |
|---|-------|---------|-----|------|-----------|
| 1 | ReAct: Synergizing Reasoning and Acting in Language Models | Yao et al. | [arXiv:2210.03629](https://arxiv.org/abs/2210.03629) | 2022-10, ICLR 2023 | foundational pattern for interleaved reasoning + action in agentic loops |
| 2 | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Wei et al. | [arXiv:2201.11903](https://arxiv.org/abs/2201.11903) | 2022-01, NeurIPS 2022 | foundational technique enabling step-by-step reasoning |
| 3 | Tree of Thoughts: Deliberate Problem Solving with Large Language Models | Yao et al. | [arXiv:2305.10601](https://arxiv.org/abs/2305.10601) | 2023-05, NeurIPS 2023 | exploration over reasoning paths, enables backtracking |
| 4 | Reflexion: Language Agents with Verbal Reinforcement Learning | Shinn et al. | [arXiv:2303.11366](https://arxiv.org/abs/2303.11366) | 2023-03 | self-reflection and iterative refinement for agents |
| 5 | Self-Consistency Improves Chain of Thought Reasoning | Wang et al. | [arXiv:2203.11171](https://arxiv.org/abs/2203.11171) | 2022-03 | multiple reasoning paths with majority voting |
| 6 | Language Agent Tree Search (LATS) | Zhou et al. | [arXiv:2310.04406](https://arxiv.org/abs/2310.04406) | 2023-10, ICML 2024 | Monte Carlo tree search for language agents |

### prompting techniques

| # | title | authors | url | date | relevance |
|---|-------|---------|-----|------|-----------|
| 7 | Plan-and-Solve Prompting | Wang et al. | [arXiv:2305.04091](https://arxiv.org/abs/2305.04091) | 2023-05 | zero-shot decomposition into planning + execution |
| 8 | Least-to-Most Prompting | Zhou et al. | [arXiv:2205.10625](https://arxiv.org/abs/2205.10625) | 2022-05 | decompose complex problems into simpler subproblems |
| 9 | Self-Ask: Measuring and Narrowing the Compositionality Gap | Press et al. | [arXiv:2210.03350](https://arxiv.org/abs/2210.03350) | 2022-10, ICLR 2023 | model asks follow-up questions before answering |
| 10 | Auto-CoT: Automatic Chain of Thought Prompting | Zhang et al. | [arXiv:2210.03493](https://arxiv.org/abs/2210.03493) | 2022-10 | automatic demonstration construction with diversity |
| 11 | Show Your Work: Scratchpads for Intermediate Computation | Nye et al. | [arXiv:2112.00114](https://arxiv.org/abs/2112.00114) | 2021-12 | intermediate computation steps improve multi-step tasks |

### tool use and function calling

| # | title | authors | url | date | relevance |
|---|-------|---------|-----|------|-----------|
| 12 | Toolformer: Language Models Can Teach Themselves to Use Tools | Schick et al. | [arXiv:2302.04761](https://arxiv.org/abs/2302.04761) | 2023-02 | self-supervised tool use learning |
| 13 | PAL: Program-Aided Language Models | Gao et al. | [arXiv:2211.10435](https://arxiv.org/abs/2211.10435) | 2022-11 | offload computation to program interpreter |

### embodied and agentic systems

| # | title | authors | url | date | relevance |
|---|-------|---------|-----|------|-----------|
| 14 | Inner Monologue: Embodied Reasoning through Planning with Language Models | Huang et al. | [arXiv:2207.05608](https://arxiv.org/abs/2207.05608) | 2022-07, CoRL | closed-loop feedback for robot planning |
| 15 | Generative Agents: Interactive Simulacra of Human Behavior | Park et al. | [arXiv:2304.03442](https://arxiv.org/abs/2304.03442) | 2023-04 | memory architecture for believable agent behavior |
| 16 | MemGPT: Towards LLMs as Operating Systems | Packer et al. | [arXiv:2310.08560](https://arxiv.org/abs/2310.08560) | 2023-10 | virtual context management, memory hierarchy |

### surveys and meta-analyses

| # | title | authors | url | date | relevance |
|---|-------|---------|-----|------|-----------|
| 17 | CoALA: Cognitive Architectures for Language Agents | Sumers et al. | [arXiv:2309.02427](https://arxiv.org/abs/2309.02427) | 2023-09 | systematic framework for language agent architectures |
| 18 | AgentBench: Evaluating LLMs as Agents | Liu et al. | [arXiv:2308.03688](https://arxiv.org/abs/2308.03688) | 2023-08 | comprehensive benchmark for evaluating LLM agents |
| 19 | Understanding the Planning of LLM Agents: A Survey | Huang et al. | [arXiv:2402.02716](https://arxiv.org/abs/2402.02716) | 2024-02 | taxonomy of planning approaches |
| 20 | ADaPT: As-Needed Decomposition and Planning | Prasad et al. | [arXiv:2311.05772](https://arxiv.org/abs/2311.05772) | 2023-11, NAACL 2024 | adaptive decomposition based on task complexity |
| 21 | Reasoning with Language Model Prompting: A Survey | Qiao et al. | [github.com/zjunlp](https://github.com/zjunlp/Prompt4ReasoningPapers) | 2023, ACL 2023 | comprehensive survey of prompting for reasoning |
| 22 | LLM-Based Agents for Tool Learning: A Survey | - | [Springer](https://link.springer.com/article/10.1007/s41019-025-00296-9) | 2024 | survey of tool learning approaches |
| 23 | A Survey of Task Planning with Large Language Models | - | [Intelligent Computing](https://spj.science.org/doi/10.34133/icomputing.0124) | 2024 | task planning and decomposition survey |

---

## claude-code architecture

| # | title | source | url | date | relevance |
|---|-------|--------|-----|------|-----------|
| 24 | Building Effective Agents | Anthropic blog | [anthropic.com/research](https://www.anthropic.com/research/building-effective-agents) | 2024-12 | official guidance on agent architecture patterns |
| 25 | Building Agents with the Claude Agent SDK | Anthropic engineering | [anthropic.com/engineering](https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk) | 2024 | SDK architecture and design principles |
| 26 | Advanced Tool Use on Claude | Anthropic engineering | [anthropic.com/engineering](https://www.anthropic.com/engineering/advanced-tool-use) | 2024 | tool search, programmatic calling, examples |
| 27 | Claude's Extended Thinking | Anthropic | [anthropic.com/news](https://www.anthropic.com/news/visible-extended-thinking) | 2025 | hybrid reasoning with thinking budgets |
| 28 | The "think" Tool: Enabling Claude to Stop and Think | Anthropic engineering | [anthropic.com/engineering](https://www.anthropic.com/engineering/claude-think-tool) | 2024 | structured thinking during tool use |
| 29 | Model Context Protocol (MCP) | Anthropic | [modelcontextprotocol.io](https://modelcontextprotocol.io/) | 2024 | standardized tool integration protocol |
| 30 | Claude Agent SDK Repository | GitHub | [github.com/anthropics/claude-agent-sdk-python](https://github.com/anthropics/claude-agent-sdk-python) | 2024 | reference implementation |
| 31 | Claude Code System Prompt Analysis | Zenn (community) | [zenn.dev](https://zenn.dev/) | 2024 | detailed analysis of system prompt structure |

---

## alternative architectures

### openai codex

| # | title | source | url | date | relevance |
|---|-------|--------|-----|------|-----------|
| 32 | Codex Cloud Architecture | OpenAI | [developers.openai.com/codex/cloud](https://developers.openai.com/codex/cloud/) | 2025 | cloud sandbox execution environment |
| 33 | Codex CLI Features | OpenAI | [developers.openai.com/codex/cli/features](https://developers.openai.com/codex/cli/features/) | 2025 | CLI capabilities and workflows |
| 34 | Codex Security Guide | OpenAI | [developers.openai.com/codex/security](https://developers.openai.com/codex/security/) | 2025 | sandboxing and network isolation |
| 35 | Codex SDK | OpenAI | [developers.openai.com/codex/sdk](https://developers.openai.com/codex/sdk/) | 2025 | programmatic integration |

### other coding assistants

| # | title | source | url | date | relevance |
|---|-------|--------|-----|------|-----------|
| 36 | Devin AI Architecture | various | (multiple sources) | 2024 | autonomous software engineer |
| 37 | Aider: AI Pair Programming | aider.chat | [aider.chat](https://aider.chat/) | 2024 | git-integrated coding assistant |
| 38 | Cursor AI | cursor.com | [cursor.com](https://cursor.com/) | 2024 | AI-augmented IDE |
| 39 | GitHub Copilot | GitHub | [github.blog](https://github.blog/) | 2024 | code completion and chat |

---

## context management

| # | title | source | url | date | relevance |
|---|-------|--------|-----|------|-----------|
| 40 | MemGPT: Virtual Context Management | Packer et al. | [arXiv:2310.08560](https://arxiv.org/abs/2310.08560) | 2023-10 | hierarchical memory for extended context |
| 41 | Extended Thinking Documentation | Anthropic | [support.claude.com](https://support.claude.com/en/articles/10574485-using-extended-thinking) | 2025 | thinking budgets and serial test-time compute |
| 42 | Context Window Management Strategies | various | (multiple sources) | 2024 | summarization and compaction techniques |

---

## benchmarks and performance

### code generation benchmarks

| # | title | source | url | date | relevance |
|---|-------|--------|-----|------|-----------|
| 43 | SWE-bench: Software Engineering Benchmark | Princeton NLP | [swebench.com](https://www.swebench.com/) | 2024 | real-world github issue resolution |
| 44 | HumanEval Benchmark | OpenAI | [paperswithcode.com](https://paperswithcode.com/dataset/humaneval) | 2021 | function synthesis from docstrings |
| 45 | HumanEval Pro and MBPP Pro | - | [arXiv:2412.21199](https://arxiv.org/abs/2412.21199) | 2024-12, ACL 2025 | self-invoking code generation |
| 46 | MBPP Benchmark | Google | [paperswithcode.com](https://paperswithcode.com/sota/code-generation-on-mbpp) | 2021 | mostly basic python problems |
| 47 | EvalPlus Leaderboard | - | [evalplus.github.io](https://evalplus.github.io/leaderboard.html) | 2024 | augmented test suites for HumanEval/MBPP |

### agent benchmarks

| # | title | source | url | date | relevance |
|---|-------|--------|-----|------|-----------|
| 48 | AgentBench | Liu et al. | [arXiv:2308.03688](https://arxiv.org/abs/2308.03688) | 2023-08 | multi-environment agent evaluation |
| 49 | LiveBench | - | [livebench.ai](https://livebench.ai/) | 2024 | continuously updated LLM evaluation |

### tool comparisons

| # | title | source | url | date | relevance |
|---|-------|--------|-----|------|-----------|
| 50 | Coding Agents Comparison | Artificial Analysis | [artificialanalysis.ai](https://artificialanalysis.ai/insights/coding-agents-comparison) | 2024 | Claude Code, Cursor, Copilot comparison |
| 51 | AI Coding Agents Benchmark 2025 | Render | [render.com/blog](https://render.com/blog/ai-coding-agents-benchmark) | 2025 | practical benchmark results |
| 52 | Best AI Code Apply Tools 2025 | Morph | [morphllm.com/comparisons](https://www.morphllm.com/comparisons) | 2025 | enterprise benchmarks |
| 53 | Claude Code vs Cursor Deep Comparison | Qodo | [qodo.ai/blog](https://www.qodo.ai/blog/claude-code-vs-cursor/) | 2025 | detailed feature comparison |

---

## reasoning strategy comparisons

| # | title | source | url | date | relevance |
|---|-------|--------|-----|------|-----------|
| 54 | ReAct vs CoT Performance | Google Research | [research.google/blog](https://research.google/blog/react-synergizing-reasoning-and-acting-in-language-models/) | 2022 | comparative analysis on HotPotQA, FEVER, ALFWorld |
| 55 | Comprehensive Guide to ReAct Prompting | Mercity | [mercity.ai/blog-post](https://www.mercity.ai/blog-post/react-prompting-and-react-based-agentic-systems) | 2024 | practical guide with performance notes |

---

## key performance findings

### reasoning strategies (from sources)

| strategy | benchmark | performance | source |
|----------|-----------|-------------|--------|
| ReAct | HotPotQA | competitive with CoT, better grounding | [1] |
| ReAct | ALFWorld | +34% over imitation learning | [1] |
| CoT | HotPotQA | higher success, but 56% hallucination in failures | [1] |
| Self-Consistency | various | significant improvement over single-path CoT | [5] |
| Tree of Thoughts | Game of 24 | 74% (vs 4% CoT) | [3] |
| LATS | HotPotQA | state-of-the-art with MCTS | [6] |
| Reflexion | ALFWorld | +22% improvement via self-reflection | [4] |

### code generation (from sources)

| model/tool | HumanEval | MBPP | SWE-bench | source |
|------------|-----------|------|-----------|--------|
| o1-mini | 96.2% | - | - | [45] |
| o1-mini (HumanEval Pro) | 76.2% | - | - | [45] |
| GPT-4o + planning | - | 84.8% | - | [46] |
| Claude Code | - | - | ~49% (verified) | [43] |

---

## access date

all sources accessed: 2025-12-23

package/dist/roles/architect/briefs/brains.replic/arc101.concept.llm.[article].md
@@ -0,0 +1,25 @@
# llm (large language model)

## .what

a neural network trained on vast text corpora that predicts the next token in a sequence, enabling it to generate coherent text, follow instructions, and perform reasoning.

## .why

the llm is the core intelligence of a replic brain. it provides the reasoning and generation capabilities that power all downstream behaviors — from understanding natural language to generating code to deciding which tools to invoke.

## dependsOn

- (none — foundational primitive)

## key characteristics

- **autoregressive generation**: produces output one token at a time, conditioning on all previous tokens
- **context window**: has a fixed maximum number of tokens it can process at once
- **emergent capabilities**: reasoning, instruction-following, and tool use emerge at scale
- **stochastic**: outputs are probabilistic, controlled via a temperature parameter (see the sketch after this list)
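
to make the first and last bullets concrete, a minimal sketch of temperature-controlled autoregressive sampling; `model` here is a hypothetical callable that returns next-token logits (real llms run a learned transformer over tokenized text):

```python
import math
import random

# `model(tokens)` is hypothetical: returns {token: logit} for the next position
def sample_next_token(logits: dict[str, float], temperature: float = 1.0) -> str:
    # temperature rescales logits: lower values sharpen the distribution
    scaled = [logit / temperature for logit in logits.values()]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # numerically stable softmax
    return random.choices(list(logits), weights=weights)[0]

def generate(model, prompt: list[str], max_tokens: int = 50) -> list[str]:
    tokens = list(prompt)
    for _ in range(max_tokens):                   # autoregressive: one token per step
        token = sample_next_token(model(tokens))  # conditions on all prior tokens
        if token == "<eos>":                      # stop at end-of-sequence
            break
        tokens.append(token)
    return tokens
```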

## sources

- [Chain-of-Thought Prompting](https://arxiv.org/abs/2201.11903) — demonstrates reasoning emergence in LLMs
- [CoALA: Cognitive Architectures for Language Agents](https://arxiv.org/abs/2309.02427) — positions LLM as central to agent architecture

package/dist/roles/architect/briefs/brains.replic/arc102.concept.repl.[article].md
@@ -0,0 +1,33 @@
# repl (read-eval-print-loop)

## .what

an interactive programming pattern where the system reads user input, evaluates it, prints the result, and loops back to read more input.

## .why

the repl pattern enables iterative, conversational interaction. when an llm operates behind a repl, it can receive feedback, execute actions, observe results, and refine its approach — the foundation of agentic behavior.

## dependsOn

- (none — foundational primitive)

## key characteristics

- **read**: accept input from user or environment
- **eval**: process and execute the input
- **print**: display results or output
- **loop**: repeat the cycle indefinitely

## in replic brains

the repl structure manifests as (see the sketch after this list):

1. **read**: receive user message or tool result
2. **eval**: llm generates response/action
3. **print**: emit response or execute tool
4. **loop**: continue until task complete
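
a minimal sketch of the bare pattern; the `evaluate` callable is a stand-in for whatever does the work (a language interpreter in a classic repl, an llm in a replic brain):

```python
def repl(evaluate) -> None:
    while True:                    # loop
        try:
            line = input(">>> ")   # read
        except EOFError:
            break                  # exit when input closes
        result = evaluate(line)    # eval
        print(result)              # print

# usage: a trivial evaluator that echoes input uppercased
repl(lambda line: line.upper())
```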

## sources

- unix shell, lisp repl — historical precedent
- [Building Effective Agents](https://www.anthropic.com/research/building-effective-agents) — describes agentic loop as repl-like

package/dist/roles/architect/briefs/brains.replic/arc103.concept.replic-brain.[article].md
@@ -0,0 +1,35 @@
# replic-brain

## .what

an llm operating behind a repl interface, forming an interactive system capable of iterative reasoning, tool use, and task completion through conversation.

## .why

the term "replic brain" names the specific architectural pattern where an llm's capabilities are harnessed through a read-eval-print-loop. this pattern enables the llm to act as an agent — receiving feedback, taking actions, and iterating toward goals.

## dependsOn

- `llm` — provides reasoning and generation
- `repl` — provides interactive loop structure

## key characteristics

- **conversational**: maintains dialogue state across turns
- **tool-capable**: can invoke external tools and process results
- **iterative**: refines approach based on feedback
- **goal-directed**: works toward completing user tasks

## examples

| name | organization | context |
|------|--------------|---------|
| claude code | anthropic | cli-based coding assistant |
| codex | openai | cloud sandbox coding agent |
| cursor | cursor inc | ide-integrated coding assistant |
| aider | open source | git-integrated pair programmer |

## sources

- [Building Effective Agents](https://www.anthropic.com/research/building-effective-agents) — describes the agentic loop pattern
- [CoALA](https://arxiv.org/abs/2309.02427) — systematic framework for language agent architectures

package/dist/roles/architect/briefs/brains.replic/arc104.concept.context-window.[article].md
@@ -0,0 +1,40 @@
# context-window

## .what

the fixed-size buffer of tokens that an llm can process in a single forward pass, representing its working memory during generation.

## .why

the context window is the fundamental constraint that shapes all replic brain architectures. it determines how much conversation history, code, and tool results can be processed simultaneously. strategies like context compaction, subagents, and memory hierarchies exist specifically to work within or around this limit.

## dependsOn

- `llm` — context window is a property of the llm

## key characteristics

- **fixed size**: measured in tokens (e.g., 200k tokens for claude)
- **attention-based**: all tokens attend to all other tokens (O(n²) complexity)
- **includes everything**: system prompt, conversation, tool calls, tool results
- **ephemeral**: cleared between sessions (no persistent memory)

## size examples (2024-2025)

| model | context window |
|-------|----------------|
| gpt-4o | 128k tokens |
| claude 3.5/4 | 200k tokens |
| gemini 1.5 | 1m+ tokens |

## implications for architecture

- longer context = more code/conversation in single pass
- summarization needed when context fills (a trimming sketch follows this list)
- subagents can isolate context usage
- caching optimizes repeated prompts
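
a minimal sketch of budget-based trimming, one common form of that fill-handling step; `count_tokens` is a crude stand-in for a real tokenizer:

```python
def count_tokens(text: str) -> int:
    return len(text.split())  # word count as a rough token proxy

def trim_to_budget(system_prompt: str, turns: list[str], budget: int) -> list[str]:
    # the system prompt is pinned; keep the newest turns that still fit
    used = count_tokens(system_prompt)
    kept: list[str] = []
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))  # restore chronological order
```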

## sources

- [MemGPT](https://arxiv.org/abs/2310.08560) — virtual context management
- [Claude Documentation](https://docs.anthropic.com) — 200k token context

package/dist/roles/architect/briefs/brains.replic/arc105.concept.system-prompt.[article].md
@@ -0,0 +1,44 @@
# system-prompt

## .what

the initial instructions provided to an llm at the start of a conversation that define its behavior, capabilities, constraints, and persona.

## .why

the system prompt is the primary mechanism for shaping a replic brain's behavior. it establishes what tools are available, how to use them, what style to adopt, and what constraints to follow. changes to the system prompt fundamentally alter the agent's capabilities.

## dependsOn

- `llm` — interprets and follows the system prompt
- `context-window` — system prompt consumes context tokens

## key characteristics

- **persistent**: remains in context for entire conversation
- **privileged**: typically cannot be overridden by user messages
- **declarative**: describes desired behavior, not code
- **composable**: can include tool definitions, examples, constraints

## typical contents (replic brains)

```
- persona/role definition
- available tools and their schemas
- behavioral constraints
- output format requirements
- examples of desired behavior
- safety guardrails
```
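
a minimal sketch of composing those parts into a single prompt string; the section layout is an assumption for illustration, not any vendor's actual format:

```python
def build_system_prompt(persona: str, tools: list[str], constraints: list[str]) -> str:
    sections = [
        persona,
        "## available tools\n" + "\n".join(f"- {tool}" for tool in tools),
        "## constraints\n" + "\n".join(f"- {rule}" for rule in constraints),
    ]
    return "\n\n".join(sections)

# usage
prompt = build_system_prompt(
    persona="you are a careful coding assistant.",
    tools=["Read(file_path): read a file", "Bash(command): run a shell command"],
    constraints=["ask before destructive commands", "prefer minimal diffs"],
)
```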
|
|
33
|
+
|
|
34
|
+
## examples
|
|
35
|
+
|
|
36
|
+
| system | system prompt size | notable elements |
|
|
37
|
+
|--------|-------------------|------------------|
|
|
38
|
+
| claude code | ~15k tokens | tool definitions, coding guidelines, git workflow |
|
|
39
|
+
| codex | varies | sandbox rules, security constraints |
|
|
40
|
+
|
|
41
|
+
## sources
|
|
42
|
+
|
|
43
|
+
- [Claude Code System Prompt Analysis](https://zenn.dev/) — detailed breakdown
|
|
44
|
+
- [Building Effective Agents](https://www.anthropic.com/research/building-effective-agents) — system prompt design
|
|

package/dist/roles/architect/briefs/brains.replic/arc106.concept.tool-definition.[article].md
@@ -0,0 +1,59 @@
# tool-definition

## .what

a schema that describes a tool's interface — its name, purpose, parameters, and expected behavior — enabling an llm to understand when and how to invoke it.

## .why

tool definitions are the contract between the llm and external capabilities. by providing structured schemas, the llm can reason about which tools apply to a task, what parameters to provide, and what results to expect. well-designed tool definitions are essential for reliable agent behavior.

## dependsOn

- `system-prompt` — tool definitions are typically included in the system prompt
- `llm` — interprets tool definitions and decides when to invoke

## key characteristics

- **name**: unique identifier for the tool
- **description**: natural language explanation of purpose
- **parameters**: typed schema (often JSON Schema) for inputs
- **returns**: description of expected output format

## example schema

```json
{
  "name": "Read",
  "description": "Reads a file from the filesystem",
  "parameters": {
    "type": "object",
    "properties": {
      "file_path": {
        "type": "string",
        "description": "Absolute path to the file"
      },
      "offset": {
        "type": "number",
        "description": "Line number to start reading from"
      }
    },
    "required": ["file_path"]
  }
}
```
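
a minimal sketch of checking a call's arguments against a schema like the one above; hand-rolled for illustration, where real systems typically use a JSON Schema validator:

```python
TYPE_MAP = {"string": str, "number": (int, float), "object": dict}

def validate_arguments(schema: dict, arguments: dict) -> list[str]:
    params = schema["parameters"]
    errors = []
    for name in params.get("required", []):
        if name not in arguments:
            errors.append(f"missing required parameter: {name}")
    for name, value in arguments.items():
        prop = params["properties"].get(name)
        if prop is None:
            errors.append(f"unknown parameter: {name}")
        elif not isinstance(value, TYPE_MAP[prop["type"]]):
            errors.append(f"wrong type for {name}: expected {prop['type']}")
    return errors

# usage: an empty list means the call is valid
# validate_arguments(read_schema, {"file_path": "/tmp/a.txt"}) -> []
```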

## tool categories (replic brains)

| category | examples |
|----------|----------|
| filesystem | Read, Write, Edit, Glob, Grep |
| execution | Bash, Task |
| communication | WebSearch, WebFetch |
| coordination | TodoWrite, AskUserQuestion |

## sources

- [Advanced Tool Use on Claude](https://www.anthropic.com/engineering/advanced-tool-use) — tool definition patterns
- [Toolformer](https://arxiv.org/abs/2302.04761) — self-supervised tool learning
- [MCP Specification](https://modelcontextprotocol.io/) — standardized tool protocol

package/dist/roles/architect/briefs/brains.replic/arc107.concept.tool-call.[article].md
@@ -0,0 +1,54 @@
# tool-call

## .what

a structured request generated by an llm to invoke a specific tool with given parameters, representing the agent's decision to take an external action.

## .why

tool calls are the mechanism by which replic brains interact with the world beyond text generation. when the llm determines that a task requires external action — reading a file, executing code, searching the web — it emits a tool call. this is the bridge between reasoning and action.

## dependsOn

- `tool-definition` — provides the schema for valid calls
- `llm` — generates the tool call
- `agentic-loop` — processes and executes the call

## key characteristics

- **structured**: follows the schema defined in the tool-definition
- **atomic**: one tool call = one action
- **declarative**: specifies what to do, not how
- **interruptible**: execution may require user approval

## example

```json
{
  "tool": "Read",
  "parameters": {
    "file_path": "/home/user/project/src/main.ts",
    "offset": 100,
    "limit": 50
  }
}
```

## lifecycle

1. llm generates tool call in response
2. system extracts and validates parameters
3. permission check (if required)
4. tool executes
5. result returned to context as `tool-result` (dispatch sketched below)
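
a minimal sketch of steps 2 and 4, with a registry of plain python callables standing in for real tools; `read_tool` is illustrative only, not any product's actual implementation:

```python
def read_tool(file_path: str, offset: int = 0, limit: int | None = None) -> str:
    with open(file_path) as f:
        lines = f.readlines()
    end = offset + limit if limit is not None else None
    return "".join(lines[offset:end])

TOOLS = {"Read": read_tool}  # registry of available tools

def dispatch(tool_call: dict) -> dict:
    name = tool_call["tool"]
    if name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "output": TOOLS[name](**tool_call["parameters"])}
    except Exception as exc:  # failures become error results the llm can read
        return {"ok": False, "error": str(exc)}
```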

## parallelism

some systems allow multiple tool calls in a single response:

- **independent calls**: can execute in parallel
- **dependent calls**: must execute sequentially

## sources

- [ReAct](https://arxiv.org/abs/2210.03629) — action in reasoning+acting loop
- [Building Effective Agents](https://www.anthropic.com/research/building-effective-agents) — tool call patterns

package/dist/roles/architect/briefs/brains.replic/arc108.concept.tool-result.[article].md
@@ -0,0 +1,58 @@
# tool-result

## .what

the output returned from executing a tool call, injected back into the llm's context as feedback for continued reasoning.

## .why

tool results close the feedback loop in agentic systems. they allow the llm to observe the consequences of its actions, verify success or failure, and adjust its approach. without tool results, the agent would be operating blind.

## dependsOn

- `tool-call` — produces the result
- `context-window` — result consumes context tokens

## key characteristics

- **structured or text**: depends on tool (json, stdout, file contents)
- **may be truncated**: large outputs compressed to fit context
- **includes metadata**: success/failure, error messages
- **grounds reasoning**: provides factual information

## example

```
Tool Result (Read):
1→import { Component } from 'react';
2→
3→export class App extends Component {
4→  render() {
5→    return <div>Hello World</div>;
6→  }
7→}
```

## handling large results

| strategy | description |
|----------|-------------|
| truncation | cut output at N characters/lines |
| summarization | llm summarizes before adding to context |
| pagination | return chunks with offset/limit |
| streaming | process incrementally |
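
a minimal sketch of the truncation strategy from the table above:

```python
def truncate_result(output: str, max_lines: int = 200) -> str:
    lines = output.splitlines()
    if len(lines) <= max_lines:
        return output  # small enough to inject verbatim
    kept = "\n".join(lines[:max_lines])
    omitted = len(lines) - max_lines
    return f"{kept}\n... [truncated: {omitted} of {len(lines)} lines omitted]"
```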

## error results

tool results may indicate failures:

- file not found
- permission denied
- timeout exceeded
- invalid parameters

the llm must interpret these and adapt its approach.

## sources

- [ReAct](https://arxiv.org/abs/2210.03629) — observation in ReAct loop
- [Inner Monologue](https://arxiv.org/abs/2207.05608) — environment feedback

package/dist/roles/architect/briefs/brains.replic/arc109.concept.agentic-loop.[article].md
@@ -0,0 +1,62 @@
# agentic-loop

## .what

the core execution pattern of replic brains: repeatedly generate llm response, check for tool calls, execute tools, append results, and continue until task completion.

## .why

the agentic loop is what transforms a stateless llm into a persistent, goal-directed agent. by iterating through generate → execute → observe cycles, the agent can accomplish complex multi-step tasks that exceed what a single generation could produce.

## dependsOn

- `llm` — generates responses and tool calls
- `repl` — the agentic loop is the repl pattern applied to llms
- `tool-call` — actions to execute
- `tool-result` — feedback from actions
- `context-window` — accumulates conversation history

## pseudocode

```python
# generate → execute → observe, until the llm stops requesting tools
def run_agentic_loop(llm, context_window, execute_tool):
    while True:
        response = llm.generate(context_window)

        if response.has_tool_calls:
            for tool_call in response.tool_calls:
                result = execute_tool(tool_call)   # act
                context_window.append(result)      # observe
        else:
            return response  # task complete, exit loop
```

## key characteristics

- **unbounded iteration**: continues until llm stops calling tools
- **context accumulation**: history grows with each turn
- **goal-directed**: llm decides when task is complete
- **interruptible**: user can inject messages mid-loop

## termination conditions

| condition | trigger |
| ------------------ | ------------------------------- |
| natural completion | llm responds without tool calls |
| context exhaustion | context window fills up |
| user interruption | user sends new message |
| timeout | max iterations or time exceeded |
| error | unrecoverable tool failure |

## variants

| variant | modification |
| ----------- | ------------------------------------ |
| react | interleave explicit reasoning traces |
| reflexion | add self-reflection after failures |
| tree search | explore multiple action branches |

## sources

- [Building Effective Agents](https://www.anthropic.com/research/building-effective-agents) — canonical loop description
- [ReAct](https://arxiv.org/abs/2210.03629) — reasoning-augmented loop
- [CoALA](https://arxiv.org/abs/2309.02427) — framework for loop variants

package/dist/roles/architect/briefs/brains.replic/arc110.concept.reasoning-trace.[article].md
@@ -0,0 +1,58 @@
# reasoning-trace

## .what

explicit intermediate reasoning steps that an llm produces before generating a final answer, making its thought process visible and improving accuracy on complex tasks.

## .why

reasoning traces transform black-box generation into interpretable problem-solving. by externalizing intermediate steps, the llm can break down complex problems, catch errors in its reasoning, and provide explanations that humans can verify. this is the foundation for chain-of-thought and related techniques.

## dependsOn

- `llm` — generates the reasoning trace
- `context-window` — traces consume tokens

## key characteristics

- **intermediate**: produced before final answer
- **natural language**: typically readable text
- **step-by-step**: decomposed reasoning
- **self-documenting**: explains the llm's approach

## example

```
Question: What is 23 × 17?

Reasoning trace:
- I need to multiply 23 by 17
- 23 × 17 = 23 × (10 + 7)
- = 23 × 10 + 23 × 7
- = 230 + 161
- = 391

Answer: 391
```

## elicitation methods

| method | technique |
|--------|-----------|
| zero-shot | "Let's think step by step" |
| few-shot | provide example traces |
| trained | fine-tune on trace datasets |
| scratchpad | dedicated trace output space |
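
a minimal sketch of the zero-shot method from the table; `complete` is a hypothetical function that sends a prompt to an llm and returns its completion:

```python
def elicit_trace(complete, question: str) -> str:
    # the appended cue elicits a step-by-step trace before the final answer
    prompt = f"Q: {question}\nA: Let's think step by step."
    return complete(prompt)
```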

## benefits

- improved accuracy on reasoning tasks
- reduced hallucination
- debuggable/auditable outputs
- enables self-correction

## sources

- [Chain-of-Thought Prompting](https://arxiv.org/abs/2201.11903) — seminal paper
- [Scratchpads](https://arxiv.org/abs/2112.00114) — intermediate computation
- [Extended Thinking](https://www.anthropic.com/news/visible-extended-thinking) — claude's thinking mode