pikuri-core 0.0.5 → 0.0.7

Files changed (38)
  1. checksums.yaml +4 -4
  2. data/README.md +5 -3
  3. data/lib/pikuri/agent/chat_transport.rb +135 -11
  4. data/lib/pikuri/agent/configurator.rb +4 -4
  5. data/lib/pikuri/agent/context_window_detector.rb +103 -52
  6. data/lib/pikuri/agent/control/step_limit.rb +39 -7
  7. data/lib/pikuri/agent/event.rb +43 -16
  8. data/lib/pikuri/agent/extension.rb +31 -17
  9. data/lib/pikuri/agent/extension_context.rb +147 -0
  10. data/lib/pikuri/agent/listener/terminal.rb +30 -37
  11. data/lib/pikuri/agent/listener/token_log.rb +60 -13
  12. data/lib/pikuri/agent/listener.rb +12 -5
  13. data/lib/pikuri/agent/listener_list.rb +7 -17
  14. data/lib/pikuri/agent/synthesizer.rb +93 -67
  15. data/lib/pikuri/agent.rb +358 -403
  16. data/lib/pikuri/extractor/html.rb +303 -0
  17. data/lib/pikuri/extractor/passthrough.rb +64 -0
  18. data/lib/pikuri/extractor.rb +314 -0
  19. data/lib/pikuri/file_type.rb +74 -266
  20. data/lib/pikuri/sanitizer.rb +179 -0
  21. data/lib/pikuri/subprocess.rb +73 -2
  22. data/lib/pikuri/tool/calculator.rb +213 -41
  23. data/lib/pikuri/tool/fetch.rb +10 -9
  24. data/lib/pikuri/tool/parameters.rb +65 -2
  25. data/lib/pikuri/tool/scraper.rb +186 -0
  26. data/lib/pikuri/tool/search/brave.rb +32 -18
  27. data/lib/pikuri/tool/search/duckduckgo.rb +18 -7
  28. data/lib/pikuri/tool/search/engines.rb +72 -49
  29. data/lib/pikuri/tool/search/exa.rb +34 -22
  30. data/lib/pikuri/tool/web_scrape.rb +5 -5
  31. data/lib/pikuri/tool/web_search.rb +45 -26
  32. data/lib/pikuri/version.rb +1 -1
  33. data/lib/pikuri-core.rb +11 -10
  34. metadata +9 -66
  35. data/lib/pikuri/tool/scraper/fetch_error.rb +0 -16
  36. data/lib/pikuri/tool/scraper/html.rb +0 -285
  37. data/lib/pikuri/tool/scraper/pdf.rb +0 -54
  38. data/lib/pikuri/tool/scraper/simple.rb +0 -183
@@ -2,13 +2,13 @@
 
  module Pikuri
  class Agent
- # Step-exhaustion rescue. When an +Agent+'s
- # {Control::StepLimit} trips, +Agent#run_loop+ catches the
- # +Exceeded+ exception and hands off to {Synthesizer.run} so
- # the run still produces something useful — a tools-free
- # assistant turn that answers the user's question from
- # whatever evidence the failed agent collected before running
- # out of budget.
+ # Prompt builder for the step-exhaustion rescue. When an
+ # +Agent+'s {Control::StepLimit} trips with the +:synthesize+
+ # policy, +Agent#run_loop+ runs this module's prompt on a
+ # nested tools-free agent so the run still produces something
+ # useful — an assistant turn that answers the user's question
+ # from whatever evidence the failed agent collected before
+ # running out of budget.
  #
  # == Why this exists
  #
@@ -22,16 +22,24 @@ module Pikuri
  # answer is largely in the messages — it just needs a
  # tools-free pass to synthesize.
  #
+ # Salvage is the wrong move for some agents, which is why the
+ # policy lives on {Control::StepLimit} and defaults to
+ # +:raise+ — a coding agent's half-finished work can't be
+ # completed by a tools-free pass, only described. See
+ # {Control::StepLimit}'s class header.
+ #
  # == Seam discipline
  #
- # {Synthesizer.run} does not reference +RubyLLM::*+. +Agent+
- # constructs the synth chat itself (the one +RubyLLM.chat+
- # call lives in +lib/agent.rb+, same as the parent chat) and
- # passes it in. +Synthesizer+ only calls instance methods on
- # whatever +chat+ it receives +#with_instructions+,
- # +#ask+, +#messages+ and uses {Agent.wire_chat} for the
- # event-stream wiring so the synth chat emits events with
- # the same shape as the main chat.
+ # This module is pure prompt construction — no chat handling,
+ # no +RubyLLM.chat+ call, no event wiring. The execution side
+ # (constructing the nested agent, sharing the parent's
+ # listener stream and cancellable, capturing the answer) is
+ # +Agent#run_synthesizer+'s job: the synth is a regular
+ # tools-free +Agent+, the same construction shape the +agent+
+ # tool from +pikuri-subagents+ uses for sub-agents. The only
+ # +RubyLLM::*+ surface read here is the value-type
+ # +RubyLLM::Message+ / +ToolCall+ passthrough (per the
+ # value-type rule in CLAUDE.md).
  module Synthesizer
  # The synthesizer's system prompt. Strict and short: use
  # the evidence, don't apologize, admit gaps when present.
@@ -39,58 +47,6 @@ module Pikuri
  You are given evidence another agent collected before running out of steps. Answer the user's question using only this evidence. You have no tools. If the evidence is insufficient, state plainly what's missing and what partial answer you can give. Do not apologize or comment on the previous agent.
  PROMPT
 
- # Configure +chat+ for synthesis, run one turn against it,
- # and return the final assistant content. The chat is wired
- # for the event stream via {Agent.wire_chat} so the synth's
- # reasoning and answer flow through the same listener
- # surface the parent agent uses — terminal renders them
- # inline (padded under sub-agent), an in-memory recorder
- # picks them up, a TokenLog tags them with the synth id.
- #
- # @param chat [RubyLLM::Chat] a *fresh* chat with no tools.
- # The caller is responsible for constructing it with the
- # same model/provider configuration the parent used.
- # @param parent_messages [Array<RubyLLM::Message>] the
- # parent chat's full message history at the moment of
- # step exhaustion. Used to build the evidence transcript.
- # @param user_message [String] the user's original question
- # from the parent turn that exhausted.
- # @param listeners [Agent::ListenerList] listeners to wire
- # the synth chat into. Typically the parent agent's list
- # run through {ListenerList#for_sub_agent} with the
- # synth's +id:+ so any +TokenLog+ tags its lines with
- # the synth bracket and any +Terminal+ pads its output.
- # @param step_limit [Control::StepLimit, nil] defensive
- # step budget. The synth has no tools so it should never
- # trip +before_tool_call+, but a buggy provider that
- # somehow returned a tool call would loop without one.
- # Pass +nil+ to skip.
- # @param cancellable [Control::Cancellable, nil]
- # cancellation control. Typically the parent's instance,
- # shared by reference so a user cancel during synthesis
- # still works. Pass +nil+ to skip.
- # @param streaming [Boolean] mirror the parent agent's
- # +streaming+ flag. When +true+, {Agent.streaming_block}
- # is passed to +chat.ask+ so the synth's reasoning and
- # answer flow through the listener stream as deltas in
- # addition to the final {Event::Thinking} / {Event::Assistant}
- # bookends.
- # @return [String, nil] the synth's final assistant
- # content, or +nil+ if the synth somehow produced no
- # assistant message
- def self.run(chat:, parent_messages:, user_message:, listeners:,
- step_limit: nil, cancellable: nil, streaming: false)
- chat.with_instructions(SYSTEM_PROMPT)
- Agent.wire_chat(chat, listeners: listeners, step_limit: step_limit, cancellable: cancellable)
- prompt = build_prompt(parent_messages: parent_messages, user_message: user_message)
- if streaming
- chat.ask(prompt, &Agent.streaming_block(listeners: listeners, cancellable: cancellable))
- else
- chat.ask(prompt)
- end
- chat.messages.reverse.find { |m| m.role == :assistant }&.content
- end
-
  # Render the user's question plus an "Evidence gathered"
  # section built from +parent_messages+ as a single prompt
  # string. Pure function — no I/O, safe to test directly
@@ -140,6 +96,76 @@ module Pikuri
  lines.join("\n").rstrip
  end
  private_class_method :format_evidence
+
+ # The +:synthesize+ arm of the step-exhaustion policy (see the
+ # class header). Runs the {Synthesizer} prompt over the
+ # exhausted chat's history on a nested tools-free +Agent+ —
+ # the same construction shape the +agent+ tool from
+ # +pikuri-subagents+ uses for sub-agents, so the synth gets
+ # listener propagation, transport / context-window-cap /
+ # streaming inheritance, and teardown via +close+ for free.
+ # The synth's answer is returned.
+ #
+ # @param ctx [ExtensionContext]
+ # @param chat_messages [Array<RubyLLM::Message>] the
+ # exhausted chat's full message history, the evidence
+ # {.build_prompt} renders
+ # @param user_message [String] the user's original question
+ # from the turn that exhausted
+ # @raise [Control::Cancellable::Cancelled] when a cancel
+ # landed between the budget tripping and this rescue —
+ # cancellation wins over salvage
+ # @return [String] the synth answer
+ def self.run_synthesizer(ctx, chat_messages, user_message)
+ # Check the cancel flag *before* constructing the synth: the
+ # nested run_loop resets the shared cancellable at its turn
+ # boundary, which would erase a cancel requested in this
+ # window. The raise propagates without a parent-side
+ # {Event::Cancelled} — a cancel *during* synthesis emits it
+ # from the synth's own rescue (on the derived listener list)
+ # instead, so either way the stream sees at most one.
+ ctx.agent.cancellable&.check!
+
+ ctx.emit_event(Event::FallbackNotice.new(
+ reason: "agent exhausted #{ctx.agent.step_limit.max} steps; " \
+ 'synthesizing answer from gathered evidence'
+ ))
+
+ # Synth runs under this agent's identity but with a
+ # different system prompt, so it gets a distinct
+ # +_synthesizer+ suffix on the id — same +_+ separator the
+ # sub-agent generator uses, so main becomes +"synthesizer"+
+ # and a sub-agent +"researcher 0"+ becomes
+ # +"researcher 0_synthesizer"+. Any +TokenLog+ in the list
+ # tags the synth's prompt under that bracket so it's
+ # obvious from the log which turns were the rescue rather
+ # than the original loop.
+ synth_id = ctx.agent.id.empty? ? 'synthesizer' : "#{ctx.agent.id}_synthesizer"
+ synth = Agent.new(
+ # Carry the parent's resolved cap on the transport so the synth
+ # reuses it without a re-probe — the cap rides {ChatTransport}
+ # now, not an +Agent.new(context_window:)+ kwarg.
+ transport: ctx.agent.transport.with(context_window: ctx.agent.context_window_cap),
+ system_prompt: Synthesizer::SYSTEM_PROMPT,
+ # Defensive budget with the default :raise policy: the
+ # synth has no tools so it should never tick, but a buggy
+ # provider that somehow returns a tool call must not loop
+ # forever — and a synth that needs its own synth is a bug,
+ # not a rescue.
+ step_limit: Control::StepLimit.new(max: 1),
+ cancellable: ctx.agent.cancellable,
+ id: synth_id,
+ streaming: ctx.agent.streaming
+ ) { |c| c.add_listeners(ctx.sub_agent_listeners(id: synth_id)) }
+ begin
+ synth.run_loop(user_message: Synthesizer.build_prompt(
+ parent_messages: chat_messages, user_message: user_message
+ ))
+ synth.last_assistant_content
+ ensure
+ synth.close
+ end
+ end
  end
  end
  end
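
The id rule described in the comments of +run_synthesizer+ above can be tried in isolation. The following is a minimal sketch, not code from pikuri-core: the helper name +synth_id_for+ is hypothetical and only mirrors the inline ternary shown in the diff, assuming the parent id is a plain String.

```ruby
# Hypothetical helper (not part of pikuri-core): mirrors the inline
# ternary in run_synthesizer that derives the synth's id from the
# parent agent's id.
def synth_id_for(parent_id)
  # An empty id marks the main agent, which gets the bare name;
  # any other id gets the "_synthesizer" suffix appended.
  parent_id.empty? ? 'synthesizer' : "#{parent_id}_synthesizer"
end

puts synth_id_for('')             # main agent
puts synth_id_for('researcher 0') # sub-agent
```

Under these assumptions the two calls print "synthesizer" and "researcher 0_synthesizer", the same pair the diff's comment gives as examples.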