@exulu/backend 1.54.0 → 1.55.0
- package/dist/index.cjs +1970 -1176
- package/dist/index.d.cts +6 -29
- package/dist/index.d.ts +6 -29
- package/dist/index.js +1963 -1164
- package/ee/agentic-retrieval/v3/agent-loop.ts +49 -3
- package/ee/agentic-retrieval/v3/classifier.ts +42 -37
- package/ee/agentic-retrieval/v3/index.ts +112 -18
- package/ee/agentic-retrieval/v3/session-tools-registry.ts +20 -0
- package/ee/agentic-retrieval/v3/strategies.ts +28 -24
- package/ee/agentic-retrieval/v3/tools.ts +226 -111
- package/ee/agentic-retrieval/v3/trajectory.ts +227 -14
- package/ee/invoke-skills/create-sandbox.ts +119 -0
- package/ee/python/documents/processing/doc_processor.ts +106 -14
- package/package.json +4 -2
- package/ee/agentic-retrieval/ANALYSIS.md +0 -658
- package/ee/agentic-retrieval/index.ts +0 -1109
- package/ee/agentic-retrieval/logs/README.md +0 -198
- package/ee/agentic-retrieval/v2.ts +0 -1628
- package/ee/agentic-retrieval/v4/agent-loop.ts +0 -121
- package/ee/agentic-retrieval/v4/embed-preprocessor.ts +0 -76
- package/ee/agentic-retrieval/v4/index.ts +0 -181
- package/ee/agentic-retrieval/v4/system-prompt.ts +0 -248
- package/ee/agentic-retrieval/v4/tools.ts +0 -241
- package/ee/agentic-retrieval/v4/types.ts +0 -29
@@ -1,198 +0,0 @@
# Agentic Retrieval Trajectory Logs

This directory contains detailed logs of every retrieval trajectory executed by the v2 agentic retrieval system.

## Purpose

These logs capture the complete "thinking process" of the retrieval agent, including:
- Initial query and detected language
- Each reasoning step and decision
- Every tool call with inputs and outputs
- Dynamic tools created during execution
- Token usage and performance metrics
- Final results and success/failure status

## Directory Structure

```
logs/
├── YYYY-MM-DD/              # Logs organized by date
│   ├── traj_*.json          # Individual trajectory logs
│   └── _daily_summary.jsonl # Daily summary (one line per trajectory)
└── README.md                # This file
```
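
As a rough illustration of the layout above, the sketch below derives the two file paths for a trajectory from its ID and timestamp. The helper names are hypothetical, and the choice of the UTC date for the folder name is an assumption; the README only shows that logs are grouped under `YYYY-MM-DD/`.

```typescript
// Hypothetical helpers, not part of the package: they only mirror the
// directory layout described in this README.

// Assumption: the date folder uses the UTC date of the trajectory timestamp.
function dateFolder(timestamp: string): string {
  // toISOString() always yields "YYYY-MM-DDTHH:mm:ss.sssZ" in UTC.
  return new Date(timestamp).toISOString().slice(0, 10);
}

// Path of the individual trajectory log (traj_*.json).
function trajectoryLogPath(trajectoryId: string, timestamp: string): string {
  return "logs/" + dateFolder(timestamp) + "/" + trajectoryId + ".json";
}

// Path of the daily summary that receives one line per trajectory.
function dailySummaryPath(timestamp: string): string {
  return "logs/" + dateFolder(timestamp) + "/_daily_summary.jsonl";
}
```

For the sample trajectory used elsewhere in this README, `trajectoryLogPath("traj_1234567890_abc123def", "2026-04-09T10:30:45.123Z")` yields `logs/2026-04-09/traj_1234567890_abc123def.json`.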

## Trajectory File Format

Each trajectory is saved as a JSON file with this structure:

```json
{
  "trajectory_id": "traj_1234567890_abc123def",
  "timestamp": "2026-04-09T10:30:45.123Z",
  "initial_query": "wie sind die Abmessungen vom Liftstarter 16kw",
  "detected_language": "deu",
  "available_contexts": ["techDoc", "vorschriften"],
  "enabled_contexts": ["techDoc", "vorschriften"],
  "reranker_used": "cohere-rerank-multilingual-v3",
  "custom_instructions": "...",

  "steps": [
    {
      "step_number": 1,
      "timestamp": "2026-04-09T10:30:45.234Z",

      "reasoning": {
        "text": "I must call tool search_content with...",
        "finished": false,
        "tokens_used": 1250,
        "duration_ms": 850
      },

      "tool_execution": {
        "tools_called": [
          {
            "tool_name": "search_content",
            "tool_id": "call_abc123",
            "input": {
              "query": "Liftstarter 16kw Abmessungen",
              "knowledge_base_ids": ["techDoc"],
              "searchMethod": "hybrid",
              "limit": 10
            },
            "output_summary": "[{\"item_name\":\"Liftstarter_16kw.pdf\",\"chunk_content\":\"Die Abmessungen...",
            "output_length": 15234,
            "success": true,
            "duration_ms": 1200
          }
        ],
        "chunks_retrieved": 10,
        "chunks_after_reranking": 8,
        "total_tokens_used": 3500
      },

      "dynamic_tools_created": [
        "get_more_content_from_Liftstarter_16kw_pdf",
        "get_Liftstarter_16kw_pdf_page_1_content"
      ]
    }
  ],

  "final_results": {
    "total_chunks": 8,
    "total_steps": 2,
    "total_tokens": 4750,
    "total_duration_ms": 3250,
    "success": true
  },

  "performance": {
    "tokens_per_step": [1250, 3500],
    "avg_tokens_per_step": 2375,
    "chunks_per_step": [10, 0],
    "tool_usage_frequency": {
      "search_content": 1,
      "count_items_or_chunks": 0,
      "save_search_results": 0
    }
  }
}
```
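
In the example above, the `performance` block agrees with `final_results`: two entries in `tokens_per_step`, a sum of 4750 and an average of 2375. The sketch below types those two blocks and checks the same relationships. The field names come from the example; the types are inferred from the sample values, and the assumption that these invariants hold for every real log is mine, not something the README states. `findInconsistencies` is a hypothetical name.

```typescript
// Field names are taken from the example trajectory; the types are inferred
// from the sample values and are an assumption, not a published schema.
interface FinalResults {
  total_chunks: number;
  total_steps: number;
  total_tokens: number;
  total_duration_ms: number;
  success: boolean;
}

interface Performance {
  tokens_per_step: number[];
  avg_tokens_per_step: number;
  chunks_per_step: number[];
  tool_usage_frequency: Record<string, number>;
}

// Hypothetical check: returns a human-readable message for each mismatch
// between the per-step metrics and the totals, or an empty list if none.
function findInconsistencies(finalResults: FinalResults, performance: Performance): string[] {
  const problems: string[] = [];
  const stepCount = performance.tokens_per_step.length;

  let tokenSum = 0;
  for (const tokens of performance.tokens_per_step) {
    tokenSum += tokens;
  }

  if (stepCount !== finalResults.total_steps) {
    problems.push("tokens_per_step length differs from total_steps");
  }
  if (tokenSum !== finalResults.total_tokens) {
    problems.push("tokens_per_step sum differs from total_tokens");
  }

  // Compare rounded values in case the logged average was rounded.
  const average = stepCount > 0 ? tokenSum / stepCount : 0;
  if (Math.round(average) !== Math.round(performance.avg_tokens_per_step)) {
    problems.push("avg_tokens_per_step differs from the computed average");
  }
  return problems;
}
```

Run against the `final_results` and `performance` values shown above, the function returns an empty list.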

## Daily Summary Format

The `_daily_summary.jsonl` file contains one JSON object per line (newline-delimited JSON):

```jsonl
{"trajectory_id":"traj_1234567890_abc123def","timestamp":"2026-04-09T10:30:45.123Z","query":"wie sind die Abmessungen vom Liftstarter 16kw","tokens":4750,"chunks":8,"steps":2,"duration_ms":3250,"success":true}
{"trajectory_id":"traj_1234567891_def456ghi","timestamp":"2026-04-09T11:15:22.456Z","query":"count all documents","tokens":1200,"chunks":0,"steps":1,"duration_ms":800,"success":true}
```
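
Because each line is an independent JSON object, the file can be parsed line by line. The following is a minimal sketch of such a parser; the `DailySummaryEntry` interface is inferred from the two sample lines above, and `parseDailySummary` is a hypothetical name rather than a function exported by the package.

```typescript
// Shape inferred from the sample lines above (an assumption, not a schema).
interface DailySummaryEntry {
  trajectory_id: string;
  timestamp: string;
  query: string;
  tokens: number;
  chunks: number;
  steps: number;
  duration_ms: number;
  success: boolean;
}

// Parses newline-delimited JSON: one object per non-empty line.
function parseDailySummary(jsonl: string): DailySummaryEntry[] {
  const entries: DailySummaryEntry[] = [];
  for (const rawLine of jsonl.split("\n")) {
    const line = rawLine.trim(); // also drops a trailing "\r" on Windows files
    if (line === "") {
      continue; // tolerate blank lines and the trailing newline
    }
    // JSON.parse throws on a malformed line; callers may want to catch that.
    entries.push(JSON.parse(line) as DailySummaryEntry);
  }
  return entries;
}
```

The cast to `DailySummaryEntry` does not validate the fields; a stricter reader would check each property before trusting it.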

## Using These Logs for Analysis

### 1. Analyze Successful vs Failed Retrievals

```bash
# Find all failed retrievals
cat logs/2026-04-09/_daily_summary.jsonl | jq 'select(.success == false)'

# Find trajectories that used many tokens
cat logs/2026-04-09/_daily_summary.jsonl | jq 'select(.tokens > 5000)'
```
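
The same two filters can be applied in code once the summary lines are parsed. The sketch below mirrors the `jq` selections and adds a success rate; the function names and the trimmed `SummaryRow` type are hypothetical and cover only the fields these filters need.

```typescript
// Only the summary fields used by the filters below (names from the JSONL sample).
interface SummaryRow {
  trajectory_id: string;
  tokens: number;
  success: boolean;
}

// Equivalent of: jq 'select(.success == false)'
function failedRetrievals(rows: SummaryRow[]): SummaryRow[] {
  return rows.filter((row) => row.success === false);
}

// Equivalent of: jq 'select(.tokens > 5000)' with a configurable threshold.
function tokenHeavy(rows: SummaryRow[], threshold: number): SummaryRow[] {
  return rows.filter((row) => row.tokens > threshold);
}

// Fraction of successful trajectories; 0 for an empty input.
function successRate(rows: SummaryRow[]): number {
  if (rows.length === 0) {
    return 0;
  }
  return rows.filter((row) => row.success).length / rows.length;
}
```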

### 2. Identify Common Tool Usage Patterns

```bash
# Extract tool usage frequency from all trajectories
for file in logs/2026-04-09/traj_*.json; do
  jq '.performance.tool_usage_frequency' "$file"
done
```
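
The loop above prints one frequency object per trajectory; spotting patterns usually means adding them up. A minimal sketch of that aggregation follows, assuming each object maps a tool name to a call count as in the example trajectory. `mergeToolUsage` is a hypothetical name.

```typescript
// One tool_usage_frequency object: tool name -> number of calls.
type ToolUsage = Record<string, number>;

// Sums the per-trajectory counts into a single frequency table.
function mergeToolUsage(perTrajectory: ToolUsage[]): ToolUsage {
  const totals: ToolUsage = {};
  for (const usage of perTrajectory) {
    for (const toolName of Object.keys(usage)) {
      // Tools missing from earlier trajectories start at zero.
      totals[toolName] = (totals[toolName] ?? 0) + (usage[toolName] ?? 0);
    }
  }
  return totals;
}
```

Tools that appear with a count of zero are kept in the result, which makes never-used tools visible in the aggregate.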

### 3. Find Queries That Needed Multiple Steps

```bash
# Trajectories with more than 2 steps
cat logs/2026-04-09/_daily_summary.jsonl | jq 'select(.steps > 2)'
```

### 4. Review Specific Trajectory

```bash
# Pretty-print a specific trajectory
jq '.' logs/2026-04-09/traj_1234567890_abc123def.json
```

### 5. Analyze Agent Reasoning

```bash
# Extract all reasoning text from a trajectory
jq '.steps[].reasoning.text' logs/2026-04-09/traj_1234567890_abc123def.json
```
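
When reviewing reasoning it often helps to see which tools each step called alongside the text. The sketch below builds such a transcript from the `steps` array; the trimmed `StepLike` type uses field names from the example trajectory, and treating `tool_execution` as optional is an assumption. `reasoningTranscript` is a hypothetical name.

```typescript
// Only the step fields needed for a transcript (names from the example trajectory).
interface StepLike {
  step_number: number;
  reasoning: { text: string };
  // Assumption: a step without tool calls may omit this block.
  tool_execution?: { tools_called: { tool_name: string }[] };
}

// Produces one line per step: the reasoning text plus the tools it called.
function reasoningTranscript(steps: StepLike[]): string[] {
  return steps.map((step) => {
    const tools = step.tool_execution
      ? step.tool_execution.tools_called.map((call) => call.tool_name)
      : [];
    const suffix = tools.length > 0 ? " [tools: " + tools.join(", ") + "]" : "";
    return "Step " + step.step_number + ": " + step.reasoning.text + suffix;
  });
}
```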

## Improvement Workflow

1. **Collect trajectories** over a period (e.g., 1 week)
2. **Identify patterns**:
   - Which queries consistently fail?
   - Which tool combinations work well?
   - Are there inefficient search strategies?
3. **Analyze specific failed cases**:
   - What did the agent try?
   - Why did it fail?
   - What should it have done instead?
4. **Update agent instructions** based on findings
5. **Compare before/after** trajectories to measure improvement
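
One way to make the before/after comparison in the last step concrete is to aggregate the daily summary rows for each period and look at the deltas. The sketch below is an illustration under that assumption; the metrics chosen (success rate, average tokens, average steps) and all names are hypothetical, not part of the package.

```typescript
// Summary fields used for the comparison (names from the JSONL sample).
interface PeriodRow {
  tokens: number;
  steps: number;
  success: boolean;
}

interface PeriodStats {
  count: number;
  successRate: number;
  avgTokens: number;
  avgSteps: number;
}

// Aggregates one collection period; all averages are 0 for an empty period.
function periodStats(rows: PeriodRow[]): PeriodStats {
  if (rows.length === 0) {
    return { count: 0, successRate: 0, avgTokens: 0, avgSteps: 0 };
  }
  let successes = 0;
  let tokens = 0;
  let steps = 0;
  for (const row of rows) {
    if (row.success) {
      successes += 1;
    }
    tokens += row.tokens;
    steps += row.steps;
  }
  return {
    count: rows.length,
    successRate: successes / rows.length,
    avgTokens: tokens / rows.length,
    avgSteps: steps / rows.length,
  };
}

// Positive successRateDelta and negative token/step deltas suggest improvement.
function compareStats(before: PeriodStats, after: PeriodStats) {
  return {
    successRateDelta: after.successRate - before.successRate,
    avgTokensDelta: after.avgTokens - before.avgTokens,
    avgStepsDelta: after.avgSteps - before.avgSteps,
  };
}
```

Averages over small samples can be misleading, so the `count` field is kept in the stats to show how many trajectories each period covers.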

## Example Analysis Questions

- **Token Efficiency**: Are certain query types using too many tokens?
- **Tool Selection**: Is the agent choosing the right tools for the job?
- **Search Strategy**: Is hybrid search always best, or do some queries benefit from keyword-only?
- **Multi-step Reasoning**: When does the agent need multiple steps vs single step?
- **Dynamic Tools**: Are get_more_content tools being used effectively?
- **Failure Patterns**: What causes retrieval failures?

## Feeding Trajectories Back to Claude Code

To analyze a trajectory:

1. Find the trajectory file (e.g., `logs/2026-04-09/traj_1234567890_abc123def.json`)
2. Share it with Claude Code for analysis:

```
I want you to analyze this retrieval trajectory and suggest improvements:
[paste contents of trajectory file]

Please analyze:
1. Was the agent's reasoning logical?
2. Did it choose the right tools?
3. Could it have been more efficient?
4. What would you change about the search strategy?
```

Claude Code can then provide specific recommendations for improving the agent's behavior.

## Privacy Note

Trajectory logs may contain sensitive information from user queries and retrieved content. Ensure proper access controls on the logs directory.