RubyGems - pikuri-core - Versions diffs - 0.0.5 → 0.0.7 - Mend

pikuri-core 0.0.5 → 0.0.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (38)

checksums.yaml +4 -4
data/README.md +5 -3
data/lib/pikuri/agent/chat_transport.rb +135 -11
data/lib/pikuri/agent/configurator.rb +4 -4
data/lib/pikuri/agent/context_window_detector.rb +103 -52
data/lib/pikuri/agent/control/step_limit.rb +39 -7
data/lib/pikuri/agent/event.rb +43 -16
data/lib/pikuri/agent/extension.rb +31 -17
data/lib/pikuri/agent/extension_context.rb +147 -0
data/lib/pikuri/agent/listener/terminal.rb +30 -37
data/lib/pikuri/agent/listener/token_log.rb +60 -13
data/lib/pikuri/agent/listener.rb +12 -5
data/lib/pikuri/agent/listener_list.rb +7 -17
data/lib/pikuri/agent/synthesizer.rb +93 -67
data/lib/pikuri/agent.rb +358 -403
data/lib/pikuri/extractor/html.rb +303 -0
data/lib/pikuri/extractor/passthrough.rb +64 -0
data/lib/pikuri/extractor.rb +314 -0
data/lib/pikuri/file_type.rb +74 -266
data/lib/pikuri/sanitizer.rb +179 -0
data/lib/pikuri/subprocess.rb +73 -2
data/lib/pikuri/tool/calculator.rb +213 -41
data/lib/pikuri/tool/fetch.rb +10 -9
data/lib/pikuri/tool/parameters.rb +65 -2
data/lib/pikuri/tool/scraper.rb +186 -0
data/lib/pikuri/tool/search/brave.rb +32 -18
data/lib/pikuri/tool/search/duckduckgo.rb +18 -7
data/lib/pikuri/tool/search/engines.rb +72 -49
data/lib/pikuri/tool/search/exa.rb +34 -22
data/lib/pikuri/tool/web_scrape.rb +5 -5
data/lib/pikuri/tool/web_search.rb +45 -26
data/lib/pikuri/version.rb +1 -1
data/lib/pikuri-core.rb +11 -10
metadata +9 -66
data/lib/pikuri/tool/scraper/fetch_error.rb +0 -16
data/lib/pikuri/tool/scraper/html.rb +0 -285
data/lib/pikuri/tool/scraper/pdf.rb +0 -54
data/lib/pikuri/tool/scraper/simple.rb +0 -183

data/lib/pikuri/agent/synthesizer.rb CHANGED

@@ -2,13 +2,13 @@
 module Pikuri
   class Agent
-    # Step-exhaustion rescue. When an +Agent+'s
-    # {Control::StepLimit} trips, +Agent#run_loop+ catches the
-    # +Exceeded+ exception and hands off to {Synthesizer.run} so
-    # the run still produces something useful — a tools-free
-    # assistant turn that answers the user's question from
-    # whatever evidence the failed agent collected before running
-    # out of budget.
+    # Prompt builder for the step-exhaustion rescue. When an
+    # +Agent+'s {Control::StepLimit} trips with the +:synthesize+
+    # policy, +Agent#run_loop+ runs this module's prompt on a
+    # nested tools-free agent so the run still produces something
+    # useful — an assistant turn that answers the user's question
+    # from whatever evidence the failed agent collected before
+    # running out of budget.
     #
     # == Why this exists
     #
@@ -22,16 +22,24 @@ module Pikuri
     # answer is largely in the messages — it just needs a
     # tools-free pass to synthesize.
     #
+    # Salvage is the wrong move for some agents, which is why the
+    # policy lives on {Control::StepLimit} and defaults to
+    # +:raise+ — a coding agent's half-finished work can't be
+    # completed by a tools-free pass, only described. See
+    # {Control::StepLimit}'s class header.
+    #
     # == Seam discipline
     #
-    # {Synthesizer.run} does not reference +RubyLLM::*+. +Agent+
-    # constructs the synth chat itself (the one +RubyLLM.chat+
-    # call lives in +lib/agent.rb+, same as the parent chat) and
-    # passes it in. +Synthesizer+ only calls instance methods on
-    # whatever +chat+ it receives — +#with_instructions+,
-    # +#ask+, +#messages+ — and uses {Agent.wire_chat} for the
-    # event-stream wiring so the synth chat emits events with
-    # the same shape as the main chat.
+    # This module is pure prompt construction — no chat handling,
+    # no +RubyLLM.chat+ call, no event wiring. The execution side
+    # (constructing the nested agent, sharing the parent's
+    # listener stream and cancellable, capturing the answer) is
+    # +Agent#run_synthesizer+'s job: the synth is a regular
+    # tools-free +Agent+, the same construction shape the +agent+
+    # tool from +pikuri-subagents+ uses for sub-agents. The only
+    # +RubyLLM::*+ surface read here is the value-type
+    # +RubyLLM::Message+ / +ToolCall+ passthrough (per the
+    # value-type rule in CLAUDE.md).
     module Synthesizer
       # The synthesizer's system prompt. Strict and short: use
       # the evidence, don't apologize, admit gaps when present.
@@ -39,58 +47,6 @@ module Pikuri
         You are given evidence another agent collected before running out of steps. Answer the user's question using only this evidence. You have no tools. If the evidence is insufficient, state plainly what's missing and what partial answer you can give. Do not apologize or comment on the previous agent.
       PROMPT
-      # Configure +chat+ for synthesis, run one turn against it,
-      # and return the final assistant content. The chat is wired
-      # for the event stream via {Agent.wire_chat} so the synth's
-      # reasoning and answer flow through the same listener
-      # surface the parent agent uses — terminal renders them
-      # inline (padded under sub-agent), an in-memory recorder
-      # picks them up, a TokenLog tags them with the synth id.
-      #
-      # @param chat [RubyLLM::Chat] a *fresh* chat with no tools.
-      #   The caller is responsible for constructing it with the
-      #   same model/provider configuration the parent used.
-      # @param parent_messages [Array<RubyLLM::Message>] the
-      #   parent chat's full message history at the moment of
-      #   step exhaustion. Used to build the evidence transcript.
-      # @param user_message [String] the user's original question
-      #   from the parent turn that exhausted.
-      # @param listeners [Agent::ListenerList] listeners to wire
-      #   the synth chat into. Typically the parent agent's list
-      #   run through {ListenerList#for_sub_agent} with the
-      #   synth's +id:+ so any +TokenLog+ tags its lines with
-      #   the synth bracket and any +Terminal+ pads its output.
-      # @param step_limit [Control::StepLimit, nil] defensive
-      #   step budget. The synth has no tools so it should never
-      #   trip +before_tool_call+, but a buggy provider that
-      #   somehow returned a tool call would loop without one.
-      #   Pass +nil+ to skip.
-      # @param cancellable [Control::Cancellable, nil]
-      #   cancellation control. Typically the parent's instance,
-      #   shared by reference so a user cancel during synthesis
-      #   still works. Pass +nil+ to skip.
-      # @param streaming [Boolean] mirror the parent agent's
-      #   +streaming+ flag. When +true+, {Agent.streaming_block}
-      #   is passed to +chat.ask+ so the synth's reasoning and
-      #   answer flow through the listener stream as deltas in
-      #   addition to the final {Event::Thinking} / {Event::Assistant}
-      #   bookends.
-      # @return [String, nil] the synth's final assistant
-      #   content, or +nil+ if the synth somehow produced no
-      #   assistant message
-      def self.run(chat:, parent_messages:, user_message:, listeners:,
-                   step_limit: nil, cancellable: nil, streaming: false)
-        chat.with_instructions(SYSTEM_PROMPT)
-        Agent.wire_chat(chat, listeners: listeners, step_limit: step_limit, cancellable: cancellable)
-        prompt = build_prompt(parent_messages: parent_messages, user_message: user_message)
-        if streaming
-          chat.ask(prompt, &Agent.streaming_block(listeners: listeners, cancellable: cancellable))
-        else
-          chat.ask(prompt)
-        end
-        chat.messages.reverse.find { |m| m.role == :assistant }&.content
-      end
       # Render the user's question plus an "Evidence gathered"
       # section built from +parent_messages+ as a single prompt
       # string. Pure function — no I/O, safe to test directly
@@ -140,6 +96,76 @@ module Pikuri
         lines.join("\n").rstrip
       end
       private_class_method :format_evidence
+      # The +:synthesize+ arm of the step-exhaustion policy (see the
+      # class header). Runs the {Synthesizer} prompt over the
+      # exhausted chat's history on a nested tools-free +Agent+ —
+      # the same construction shape the +agent+ tool from
+      # +pikuri-subagents+ uses for sub-agents, so the synth gets
+      # listener propagation, transport / context-window-cap /
+      # streaming inheritance, and teardown via +close+ for free.
+      # The synth's answer is returned.
+      #
+      # @param ctx [ExtensionContext]
+      # @param chat_messages [Array<RubyLLM::Message>] the
+      #   exhausted chat's full message history, the evidence
+      #   {.build_prompt} renders
+      # @param user_message [String] the user's original question
+      #   from the turn that exhausted
+      # @raise [Control::Cancellable::Cancelled] when a cancel
+      #   landed between the budget tripping and this rescue —
+      #   cancellation wins over salvage
+      # @return [String] the synth answer
+      def self.run_synthesizer(ctx, chat_messages, user_message)
+        # Check the cancel flag *before* constructing the synth: the
+        # nested run_loop resets the shared cancellable at its turn
+        # boundary, which would erase a cancel requested in this
+        # window. The raise propagates without a parent-side
+        # {Event::Cancelled} — a cancel *during* synthesis emits it
+        # from the synth's own rescue (on the derived listener list)
+        # instead, so either way the stream sees at most one.
+        ctx.agent.cancellable&.check!
+        ctx.emit_event(Event::FallbackNotice.new(
+                         reason: "agent exhausted #{ctx.agent.step_limit.max} steps; " \
+                                                 'synthesizing answer from gathered evidence'
+                       ))
+        # Synth runs under this agent's identity but with a
+        # different system prompt, so it gets a distinct
+        # +_synthesizer+ suffix on the id — same +_+ separator the
+        # sub-agent generator uses, so main becomes +"synthesizer"+
+        # and a sub-agent +"researcher 0"+ becomes
+        # +"researcher 0_synthesizer"+. Any +TokenLog+ in the list
+        # tags the synth's prompt under that bracket so it's
+        # obvious from the log which turns were the rescue rather
+        # than the original loop.
+        synth_id = ctx.agent.id.empty? ? 'synthesizer' : "#{ctx.agent.id}_synthesizer"
+        synth = Agent.new(
+          # Carry the parent's resolved cap on the transport so the synth
+          # reuses it without a re-probe — the cap rides {ChatTransport}
+          # now, not an +Agent.new(context_window:)+ kwarg.
+          transport: ctx.agent.transport.with(context_window: ctx.agent.context_window_cap),
+          system_prompt: Synthesizer::SYSTEM_PROMPT,
+          # Defensive budget with the default :raise policy: the
+          # synth has no tools so it should never tick, but a buggy
+          # provider that somehow returns a tool call must not loop
+          # forever — and a synth that needs its own synth is a bug,
+          # not a rescue.
+          step_limit: Control::StepLimit.new(max: 1),
+          cancellable: ctx.agent.cancellable,
+          id: synth_id,
+          streaming: ctx.agent.streaming
+        ) { |c| c.add_listeners(ctx.sub_agent_listeners(id: synth_id)) }
+        begin
+          synth.run_loop(user_message: Synthesizer.build_prompt(
+            parent_messages: chat_messages, user_message: user_message
+          ))
+          synth.last_assistant_content
+        ensure
+          synth.close
+        end
+      end
     end
   end
 end
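
The ordering that the added `run_synthesizer` comments describe (check for a pending cancel first, derive the synth id, run the tools-free pass, always close) can be sketched in isolation. This is a minimal illustration, not pikuri-core's implementation: `Cancelled`, `FakeCancellable`, `FakeSynth`, `synth_id_for`, and `salvage` are hypothetical stand-ins. Only the id rule and the order of the steps are taken from the diff above.

```ruby
# Hypothetical stand-ins for illustration; none of these are pikuri-core classes.
class Cancelled < StandardError; end

class FakeCancellable
  def initialize
    @cancelled = false
  end

  def cancel!
    @cancelled = true
  end

  # Raises if a cancel was requested before this call.
  def check!
    raise Cancelled if @cancelled
  end
end

class FakeSynth
  attr_reader :id, :closed

  def initialize(id:)
    @id = id
    @closed = false
  end

  # Stand-in for the tools-free pass over the evidence prompt.
  def run(prompt)
    "answer from #{prompt.length} chars of evidence"
  end

  def close
    @closed = true
  end
end

# Same rule as the diff: an empty parent id becomes "synthesizer",
# "researcher 0" becomes "researcher 0_synthesizer".
def synth_id_for(parent_id)
  parent_id.empty? ? 'synthesizer' : "#{parent_id}_synthesizer"
end

# `created` is an assumption added here so callers can inspect the synth afterwards.
def salvage(parent_id:, cancellable:, prompt:, created: [])
  cancellable&.check! # cancellation wins over salvage: no synth is built
  synth = FakeSynth.new(id: synth_id_for(parent_id))
  created << synth
  begin
    synth.run(prompt)
  ensure
    synth.close # teardown runs even if the pass raises
  end
end
```

Checking the flag before constructing the synth matters in the real code because, per the diff's comment, the nested run loop resets the shared cancellable at its turn boundary. The sketch models only the ordering, not that reset.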