pikuri-workspace 0.0.5 → 0.0.7

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: afd83aa622997eea8b09c700282cf401753d1925ace04a0c4ade3856c07ee8b4
4
- data.tar.gz: 4cd46c7de8d42112e1119e68bab0ce7937ed9e5c270a29ceb34e5089178abd4c
3
+ metadata.gz: 3918af9bb28c07c5706f2738899b05cda0a9c65a3a59b7e10799a98c8b8bc0b8
4
+ data.tar.gz: 142a335f7f42891c0f9f5cd9964dc31202691f1fa470fef526ad61280a2f8bff
5
5
  SHA512:
6
- metadata.gz: c818b87a2aca2f0f615d084cc7c6e0457d0a84e5cd5be8b061c36aba50a071ad2565c27170d33ae25ca810dee8a25b16b1a13b25477b21e1c00eb79d067f2bb0
7
- data.tar.gz: 7277b1ba878546589d4107136f4d49e3dc68b0ee5aff80d3eeff8e06f9d97b4fa569eb8caf186337e39c72e9f33812e3963fd433eeb35f926f82bb6b681bb579
6
+ metadata.gz: a20b8468fa2fc9cbed1aed620b645443cc6fb8969aad611de24e8620c88efe16b692265aabb5520d0c5231f5f6da2c31a0cae3f815eed2fd4e7840ef13e637d5
7
+ data.tar.gz: b5c6f5bc61a0f45641127a252c5f55bac9e0d3f5bb5ee577abeed31419859454daf7b7db26a0f5880cd638304ab3bd7da5d63b910093c4223372f21b5a97f2cb
data/README.md CHANGED
@@ -49,7 +49,7 @@ end
49
49
  `Workspace` is the "look-but-don't-leak" guard around filesystem
50
50
  access. Read tools route through `#resolve_for_read(path)`; mutating
51
51
  tools route through `#resolve_for_write(path)` + the Confirmer's
52
- `#confirm?(prompt:)`. Pass `temp: true` to mint an ephemeral
52
+ `#confirm?(request:)`. Pass `temp: true` to mint an ephemeral
53
53
  writable playground via `Dir.mktmpdir` — its path is exposed as
54
54
  `workspace.temp` and auto-removed at process exit.
55
55
 
@@ -1,5 +1,7 @@
1
1
  # frozen_string_literal: true
2
2
 
3
+ require 'rainbow'
4
+
3
5
  module Pikuri
4
6
  module Workspace
5
7
  # Port for asking the user to confirm a potentially destructive tool
@@ -25,25 +27,74 @@ module Pikuri
25
27
  # == Seam discipline
26
28
  #
27
29
  # Tools that need confirmation take a {Confirmer} via constructor and
28
- # invoke {#confirm?} with a fully-composed prompt String. Tools do
29
- # *not* call +gets+ / +puts+ directly; same lesson as listeners,
30
- # keep IO at the seam so a future TUI / web client can plug a
31
- # different implementation in without touching tool code.
30
+ # invoke {#confirm?} with a semantic {Request}: *what* is being
31
+ # asked, never *how* it should look. All presentation belongs to the
32
+ # confirmer implementation: color, the answer cue, answer parsing,
33
+ # and (security-relevant) neutralizing hostile bytes in
34
+ # LLM-supplied text. The chrome-independent half of that (escape
35
+ # control bytes, flag bidi / zero-width / homoglyph spoofs) is the
36
+ # shared {Pikuri::Sanitizer}; the medium-specific half stays with the
37
+ # renderer that knows its medium (a terminal prints the sanitized
38
+ # text directly; a web client wraps it in HTML-escaping). Tools do *not*
39
+ # call +gets+ / +puts+ directly — same lesson as listeners, keep IO
40
+ # at the seam so a TUI / web client can plug a different
41
+ # implementation in without touching tool code.
32
42
  class Confirmer
33
- # @param prompt [String] human-readable question composed by the
34
- # calling tool. The confirmer renders it and parses the answer;
35
- # it does NOT compose its own prompt content. Caller owns the
36
- # closing punctuation and any "(y/n)" cue.
43
+ # The semantic payload of one confirmation. Two fields:
44
+ #
45
+ # * +question+: one-line headline composed by the calling tool,
46
+ # e.g. +"OK to overwrite foo.rb: 120 → 245 bytes?"+. The caller
47
+ # owns the phrasing and punctuation; the renderer owns the
48
+ # answer cue (+"(y/n)?"+, buttons, ...).
49
+ # * +detail+ — optional preformatted body, possibly multi-line:
50
+ # the raw bash command for {Pikuri::Code::Bash}, +nil+ for
51
+ # {Write}. Renderers typically display it monospaced / dimmed.
52
+ #
53
+ # Both fields are RAW, straight from tool arguments the LLM
54
+ # composed (+detail+ is the command verbatim; +question+ may embed
55
+ # an LLM-written description). Renderers MUST neutralize them before
56
+ # display — route through {Pikuri::Sanitizer} (see {Terminal}), and
57
+ # additionally HTML-escape in a web client.
58
+ Request = Data.define(:question, :detail) do
59
+ # @param question [String] one-line headline; caller owns phrasing
60
+ # @param detail [String, nil] optional preformatted body
61
+ def initialize(question:, detail: nil)
62
+ super
63
+ end
64
+ end
65
+
66
+ # @param request [Request] semantic content composed by the
67
+ # calling tool. The confirmer renders it (escaping for its
68
+ # medium), poses the question, and parses the answer.
37
69
  # @return [Boolean] +true+ iff approved
38
70
  # @raise [NotImplementedError] in the abstract base
39
- def confirm?(prompt:)
71
+ def confirm?(request:)
40
72
  raise NotImplementedError, "#{self.class}#confirm? must be implemented"
41
73
  end
42
74
 
43
- # Stdin/stdout implementation: prints +prompt+ on its own line (a
44
- # leading +puts+ guarantees separation from any streamed output
45
- # the +Terminal+ listener may have produced just above), reads one
46
- # line from +$stdin+, parses it strictly:
75
+ # Stdin/stdout implementation. Renders the request as up to several
76
+ # lines (a leading +puts+ guarantees separation from any streamed
77
+ # output the +Terminal+ listener may have produced just above):
78
+ #
79
+ # 1. a bold-yellow warning block — one line per
80
+ # {Pikuri::Sanitizer::Warning} — shown only when the sanitizer
81
+ # flagged something suspicious in the question or detail
82
+ # 2. the question, bold
83
+ # 3. the detail, dim — omitted when +nil+
84
+ # 4. the +(y/n)?+ cue
85
+ #
86
+ # Both question and detail pass through {Pikuri::Sanitizer}, which
87
+ # neutralizes control bytes — without it, a model could craft a
88
+ # command or description containing +"\rrm -rf ~/"+ that visually
89
+ # overwrites the echoed line after the user has already read it —
90
+ # and reports *why* it was unsafe so the user reads the warning
91
+ # before answering. Colors come from Rainbow (already in the
92
+ # dependency closure via pikuri-core), which self-disables on
93
+ # non-TTY output; the bold-yellow warning rendering is this
94
+ # terminal chrome's call, not the sanitizer's (the +Warning+
95
+ # carries plain text only).
96
+ #
97
+ # Then reads one line from +$stdin+ and parses it strictly:
47
98
  #
48
99
  # * +"y"+ / +"yes"+ (case-insensitive, stripped) → +true+
49
100
  # * +"n"+ / +"no"+ → +false+
@@ -53,11 +104,21 @@ module Pikuri
53
104
  #
54
105
  # No retry cap; EOF eventually breaks adversarial input.
55
106
  class Terminal < Confirmer
56
- # @param prompt [String]
107
+ # @param request [Request]
57
108
  # @return [Boolean]
58
- def confirm?(prompt:)
109
+ def confirm?(request:)
110
+ question = Pikuri::Sanitizer.sanitize(request.question)
111
+ detail = request.detail ? Pikuri::Sanitizer.sanitize(request.detail) : nil
112
+ warnings = question.warnings + (detail ? detail.warnings : [])
113
+
59
114
  puts
60
- puts prompt
115
+ unless warnings.empty?
116
+ puts Rainbow('⚠ Suspicious content detected — read carefully before approving:').yellow.bold
117
+ warnings.each { |w| puts Rainbow(" ! #{w.explanation}").yellow }
118
+ end
119
+ puts Rainbow(question.text).bold
120
+ puts Rainbow(detail.text).dimgray if detail
121
+ puts '(y/n)?'
61
122
  $stdout.flush
62
123
  loop do
63
124
  line = $stdin.gets
@@ -78,9 +139,9 @@ module Pikuri
78
139
  # coordinate stdin. The name +AUTO_APPROVE+ matches the public
79
140
  # constant {AUTO_APPROVE}.
80
141
  class AutoApprove < Confirmer
81
- # @param prompt [String] ignored
142
+ # @param request [Request] ignored
82
143
  # @return [true]
83
- def confirm?(prompt:)
144
+ def confirm?(request:)
84
145
  true
85
146
  end
86
147
  end
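The confirmer hunks above move all presentation (escaping, the answer cue, answer parsing) behind `confirm?(request:)`. A minimal sketch of that seam follows, under stated assumptions: `Request` here is a plain `Struct` standing in for the `Data` class in the diff, `neutralize` is a hypothetical stand-in for `Pikuri::Sanitizer` (it only escapes control bytes and reports no warnings), `ScriptedTerminal` is an illustrative name, and treating EOF as a refusal is a guess, since the diff does not show that branch.

```ruby
require 'stringio'

# Hypothetical stand-in for Confirmer::Request (a Data class in the diff).
Request = Struct.new(:question, :detail, keyword_init: true)

# Hypothetical stand-in for Pikuri::Sanitizer: make control bytes visible
# (newline and tab are kept) so "\r" cannot repaint an already-printed line.
def neutralize(text)
  text.gsub(/[\x00-\x08\x0B-\x1F\x7F]/) { |c| format('\\x%02X', c.ord) }
end

# Strict parsing: y/yes => true, n/no => false, anything else => nil.
def parse_answer(line)
  case line.strip.downcase
  when 'y', 'yes' then true
  when 'n', 'no' then false
  end
end

# Terminal-style confirmer with injectable IO, so it can run without a TTY.
class ScriptedTerminal
  def initialize(input:, output:)
    @input = input
    @output = output
  end

  def confirm?(request:)
    @output.puts
    @output.puts neutralize(request.question)
    @output.puts neutralize(request.detail) if request.detail
    @output.puts '(y/n)?'
    loop do
      line = @input.gets
      return false if line.nil? # EOF: refusing is an assumption of this sketch

      answer = parse_answer(line)
      return answer unless answer.nil? # unrecognized input: ask again
    end
  end
end

# Usage: the tool composes *what* is asked; the confirmer decides *how*.
out = StringIO.new
confirmer = ScriptedTerminal.new(input: StringIO.new("maybe\nyes\n"), output: out)
request = Request.new(question: 'Run this command?', detail: "ls\rrm -rf ~/")
approved = confirmer.confirm?(request: request)
```

Because the tool never touches `$stdin` or `$stdout`, the same `Request` could be handed to a web or TUI confirmer without changing tool code.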
@@ -174,12 +174,6 @@ module Pikuri
174
174
  # raised at construction time for a denied project root.
175
175
  class Error < StandardError; end
176
176
 
177
- # Parent directory under which every workspace mints its
178
- # umbrella ({#internal_temp}). Honors +XDG_CACHE_HOME+ when set,
179
- # else +~/.cache+; the +pikuri+ subdir is owned by us.
180
- # +mkdir_p+'d lazily on first umbrella access.
181
- CACHE_BASE = File.join(ENV['XDG_CACHE_HOME'] || File.join(Dir.home, '.cache'), 'pikuri')
182
-
183
177
  # Umbrella dirs older than this are reaped by
184
178
  # {.sweep_stale_internal_temps!} at gem load. Generous enough
185
179
  # that a long-lived pikuri session in another shell isn't
@@ -279,7 +273,7 @@ module Pikuri
279
273
  end
280
274
 
281
275
  # Per-workspace ephemeral umbrella. Minted lazily on first call
282
- # under {CACHE_BASE}. Registered with {Pikuri::Finalizers} for
276
+ # under {Paths::cache}. Registered with {Pikuri::Finalizers} for
283
277
  # removal the moment it's minted, so anything subsequently placed inside
284
278
  # (the playground, {Pikuri::Code::Bash::Sandbox::Bubblewrap}'s
285
279
  # overlay state) gets wiped together. Callers that want
@@ -298,8 +292,8 @@ module Pikuri
298
292
  # one registry. The +path.exist?+ guard makes the removal a no-op
299
293
  # when the dir is already gone (test cleanup, manual rm).
300
294
  def self.mint_internal_temp
301
- FileUtils.mkdir_p(CACHE_BASE)
302
- path = Pathname.new(Dir.mktmpdir('workspace-', CACHE_BASE)).realpath
295
+ FileUtils.mkdir_p(Paths.cache)
296
+ path = Pathname.new(Dir.mktmpdir('workspace-', Paths.cache)).realpath
303
297
  Pikuri::Finalizers.register { FileUtils.remove_entry(path.to_s) if path.exist? }
304
298
  path
305
299
  end
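The hunks above replace the local `CACHE_BASE` constant with `Paths.cache`, whose definition is not part of this diff. The sketch below assumes it resolves the same way the removed constant did; `cache_base` is a hypothetical helper, and `at_exit` stands in for `Pikuri::Finalizers.register`.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Hypothetical stand-in for Paths.cache, mirroring the removed CACHE_BASE:
# XDG_CACHE_HOME when set, else ~/.cache, plus a "pikuri" subdirectory.
def cache_base(env = ENV)
  File.join(env['XDG_CACHE_HOME'] || File.join(Dir.home, '.cache'), 'pikuri')
end

# Mint a fresh "workspace-*" umbrella under +base+ and schedule its removal.
# The exist? guard makes the removal a no-op if the dir is already gone.
def mint_internal_temp(base)
  FileUtils.mkdir_p(base)
  path = Pathname.new(Dir.mktmpdir('workspace-', base)).realpath
  at_exit { FileUtils.remove_entry(path.to_s) if path.exist? }
  path
end
```

Registering the removal at the moment of minting means anything later placed inside the umbrella is cleaned up by the same hook.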
@@ -307,18 +301,18 @@ module Pikuri
307
301
  # Reap +workspace-*+ umbrella dirs that have outlived
308
302
  # {INTERNAL_TEMP_STALE_SECONDS}. Called once at gem load via
309
303
  # {Pikuri::Workspace} so each process boot inherits a tidy
310
- # {CACHE_BASE}. Failures (permission denied, racing concurrent
304
+ # {Paths::cache}. Failures (permission denied, racing concurrent
311
305
  # sweeper) are swallowed — best-effort cleanup; the
312
306
  # {Pikuri::Finalizers} removal is the load-bearing path.
313
307
  #
314
308
  # @return [void]
315
309
  def self.sweep_stale_internal_temps!
316
- return unless File.directory?(CACHE_BASE)
310
+ return unless File.directory?(Paths.cache)
317
311
 
318
312
  cutoff = Time.now - INTERNAL_TEMP_STALE_SECONDS
319
- Dir.children(CACHE_BASE).each do |entry|
313
+ Dir.children(Paths.cache).each do |entry|
320
314
  next unless entry.start_with?('workspace-')
321
- path = File.join(CACHE_BASE, entry)
315
+ path = File.join(Paths.cache, entry)
322
316
  next unless File.directory?(path)
323
317
  next if File.mtime(path) > cutoff
324
318
 
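The sweep shown in the hunk above is truncated after the mtime check. A self-contained sketch of the same best-effort reaping follows; `sweep_stale`, its parameters, and the returned list of removed entries are illustrative additions, not the gem's API.

```ruby
require 'fileutils'

# Remove "workspace-*" directories under +base+ whose mtime is older than
# +max_age_seconds+. Failures are swallowed: this is best-effort cleanup.
# Returns the names of the entries that were removed (sketch-only convenience).
def sweep_stale(base, max_age_seconds, now: Time.now)
  return [] unless File.directory?(base)

  cutoff = now - max_age_seconds
  removed = []
  Dir.children(base).each do |entry|
    next unless entry.start_with?('workspace-')

    path = File.join(base, entry)
    next unless File.directory?(path)
    next if File.mtime(path) > cutoff

    begin
      FileUtils.remove_entry(path)
      removed << entry
    rescue SystemCallError
      # Permission denied or a racing sweeper: ignore and move on.
    end
  end
  removed
end
```

Matching on the `workspace-` prefix keeps the sweep from touching unrelated entries that share the cache directory.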
@@ -27,7 +27,7 @@ module Pikuri
27
27
  #
28
28
  # The line/byte windowing is delegated to
29
29
  # {Pikuri::FileType.read_as_text_paged}, which returns a
30
- # {Pikuri::FileType::Page} this tool renders; the same windower
30
+ # {Pikuri::Extractor::Page} this tool renders; the same windower
31
31
  # backs +VectorDb::Tools::Read+. Two independent limits, whichever fires
32
32
  # first wins:
33
33
  #
@@ -38,27 +38,29 @@ module Pikuri
38
38
  # Additionally, individual lines longer than {MAX_LINE_LENGTH} chars
39
39
  # are truncated with {LINE_TRUNCATION_MARKER} appended; the model is
40
40
  # told to reach for +grep+ to find content inside such files. (These
41
- # constants alias the +PAGE_*+ ones on {Pikuri::FileType} — one
41
+ # constants alias the +PAGE_*+ ones on {Pikuri::Extractor} — one
42
42
  # source of truth, shared with +VectorDb::Tools::Read+.)
43
43
  #
44
- # == PDF extraction
44
+ # == PDF (and other extracted formats)
45
45
  #
46
- # PDFs are detected by their +%PDF-+ magic prefix in the sample bytes
47
- # and routed through {Pikuri::FileType.read_as_text_paged} instead of
48
- # the binary-refusal path. The extractor walks pages lazily via
49
- # +pdf-reader+, emitting one synthetic +"--- Page N ---"+ header line
50
- # per page followed by that page's text. The offset / limit /
51
- # MAX_BYTES contract is identical to the text path — extraction stops
52
- # as soon as the line or byte cap is hit, so reading the first window
53
- # of a 500-page PDF only parses the few pages needed. Line numbers in
54
- # PDF output are for citation back to the user only; PDFs are not
55
- # editable through {Edit}.
46
+ # Which formats read as text is the {Pikuri::Extractor} registry's
47
+ # business, not this tool's: with pikuri-pdf's extractor
48
+ # registered, PDFs are claimed by their +%PDF-+ magic prefix ahead
49
+ # of the binary refusal and extracted with one synthetic
50
+ # +"--- Page N ---"+ header line per page (see
51
+ # +Pikuri::Extractors::PDF+); a gem plugging another extractor
52
+ # into the registry extends this tool for free. Extraction is lazy
53
+ # where the format allows (+extract_lines+): reading the first
54
+ # window of a 500-page PDF parses only the pages the window needs.
55
+ # Formats without a lazy line shape (HTML) are extracted in full
56
+ # and then windowed. Line numbers in PDF output are for citation
57
+ # back to the user only; PDFs are not editable through {Edit}.
56
58
  #
57
59
  # PDFs with no extractable text (scanned images, empty documents) come
58
60
  # back with an LLM-actionable hint string rather than an empty
59
61
  # observation. Encrypted / malformed / XFA-form PDFs surface as
60
- # +"Error: cannot extract PDF text: ..."+ — same convention as other
61
- # tool errors the model can react to. No OCR.
62
+ # +"Error: ..."+ — same convention as other tool errors the model
63
+ # can react to. No OCR.
62
64
  #
63
65
  # == Image attachments
64
66
  #
@@ -96,36 +98,38 @@ module Pikuri
96
98
  # * Image larger than {MAX_IMAGE_BYTES} → +"Error: image too large…"+,
97
99
  # leaving the model to pick a different file or ask the user to
98
100
  # resize.
99
- # * Binary content → {Pikuri::FileType.binary?} on the sample; any
100
- # +NUL+ byte or a sample dense in control characters triggers
101
- # refusal. Catches archives and compiled artifacts without an
102
- # extension list to maintain. PDFs and supported images are
103
- # intercepted by their respective magic-byte checks via
104
- # {Pikuri::FileType.detect_mime} before the binary sniff — see
105
- # above.
101
+ # * Binary content → nothing in the {Pikuri::Extractor} registry
102
+ # claims it ({Pikuri::Extractor::Passthrough} declines on the
103
+ # {Pikuri::FileType.binary?} heuristic: any +NUL+ byte or a
104
+ # sample dense in control characters). Catches archives and
105
+ # compiled artifacts without an extension list to maintain.
106
+ # Registered extractors (pikuri-pdf's PDF, pikuri-extractors'
107
+ # office formats) claim their bytes ahead of that refusal;
108
+ # images are intercepted here via {Pikuri::FileType.detect_mime}
109
+ # before extraction is attempted — see above.
106
110
  # * Offset past EOF → +"Error: offset N is beyond end of file (M lines total)"+.
107
111
  class Read < Pikuri::Tool
108
- # The windowing constants live on {Pikuri::FileType} now (shared
112
+ # The windowing constants live on {Pikuri::Extractor} (shared
109
113
  # with +VectorDb::Tools::Read+); these aliases keep the names this tool's
110
114
  # description and specs reference pointing at the single source.
111
115
 
112
116
  # @return [Integer] default value of the +limit+ parameter (number
113
117
  # of lines to read per call).
114
- DEFAULT_LIMIT = Pikuri::FileType::PAGE_DEFAULT_LIMIT
118
+ DEFAULT_LIMIT = Pikuri::Extractor::PAGE_DEFAULT_LIMIT
115
119
 
116
120
  # @return [Integer] per-line character cap; longer lines are
117
121
  # truncated with {LINE_TRUNCATION_MARKER}.
118
- MAX_LINE_LENGTH = Pikuri::FileType::PAGE_MAX_LINE_LENGTH
122
+ MAX_LINE_LENGTH = Pikuri::Extractor::PAGE_MAX_LINE_LENGTH
119
123
 
120
124
  # @return [String] suffix appended to lines truncated by
121
125
  # {MAX_LINE_LENGTH}.
122
- LINE_TRUNCATION_MARKER = Pikuri::FileType::PAGE_LINE_TRUNCATION_MARKER
126
+ LINE_TRUNCATION_MARKER = Pikuri::Extractor::PAGE_LINE_TRUNCATION_MARKER
123
127
 
124
128
  # @return [Integer] hard byte cap on input content collected per
125
129
  # call. Counted on the line bytes (plus one for the joining
126
130
  # newline); the rendered output is slightly larger due to the
127
131
  # per-line +"%6d\t"+ prefix.
128
- MAX_BYTES = Pikuri::FileType::PAGE_MAX_BYTES
132
+ MAX_BYTES = Pikuri::Extractor::PAGE_MAX_BYTES
129
133
 
130
134
  # @return [String] human-readable form of {MAX_BYTES} for the
131
135
  # continuation marker.
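The comments above describe two independent window limits (line count and byte budget), per-line truncation, and a `"%6d\t"` prefix in the rendered output. The sketch below illustrates that contract only: the constant values, the 1-based `offset`, counting bytes after truncation, and the `window` / `render` names are all assumptions, not the `Pikuri::Extractor` implementation.

```ruby
# Hypothetical limits; the real ones are the PAGE_* constants on Pikuri::Extractor.
MAX_LINE_LENGTH = 20
MARKER = '... (line truncated)'

# Collect up to +limit+ lines starting at 1-based +offset+, stopping early when
# the byte budget would be exceeded. Works on any Enumerable, including a lazy
# one, so lines past the window are never pulled from the source.
def window(lines, offset: 1, limit: 3, max_bytes: 64)
  picked = []
  bytes = 0
  more = false
  lines.each_with_index do |raw, index|
    number = index + 1
    next if number < offset

    if picked.size >= limit
      more = true # line cap fired first
      break
    end
    text = raw.chomp
    text = text[0, MAX_LINE_LENGTH] + MARKER if text.length > MAX_LINE_LENGTH
    cost = text.bytesize + 1 # plus one for the joining newline
    if bytes + cost > max_bytes
      more = true # byte cap fired first
      break
    end
    bytes += cost
    picked << [number, text]
  end
  { lines: picked, more: more }
end

# cat -n style rendering: six-column line number, a tab, then the content.
def render(page)
  page[:lines].map { |number, text| format("%6d\t%s", number, text) }.join("\n")
end
```

Whichever limit fires first ends the window, and the `more` flag is what a trailer would use to tell the model to page on.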
@@ -221,11 +225,6 @@ module Pikuri
221
225
 
222
226
  mime = Pikuri::FileType.detect_mime(resolved)
223
227
  return format_image(path: path, resolved: resolved, mime: mime) if mime&.start_with?('image/')
224
- # PDFs are binary by the heuristic, so the PDF route (handled
225
- # inside read_as_text_paged) must win over the binary refusal.
226
- if mime != 'application/pdf' && Pikuri::FileType.binary?(resolved)
227
- return "Error: cannot read binary file: #{path}"
228
- end
229
228
 
230
229
  page = Pikuri::FileType.read_as_text_paged(
231
230
  resolved, offset: offset, limit: limit,
@@ -236,18 +235,24 @@ module Pikuri
236
235
  "Error: #{e.message}"
237
236
  rescue Errno::EACCES => e
238
237
  "Error: cannot read #{path}: #{e.message}"
238
+ rescue ArgumentError
239
+ # Nothing in the Extractor registry claimed the content —
240
+ # read_as_text_paged's binary refusal (directories and images
241
+ # were already handled above).
242
+ "Error: cannot read binary file: #{path}"
239
243
  rescue RuntimeError => e
240
- # Malformed / unsupported PDF surfaced by read_as_text_paged.
244
+ # Extraction failure (malformed / unsupported PDF, ...)
245
+ # surfaced by read_as_text_paged.
241
246
  "Error: #{e.message}"
242
247
  end
243
248
 
244
- # Render a {Pikuri::FileType::Page} as the cat-n observation: a
249
+ # Render a {Pikuri::Extractor::Page} as the cat-n observation: a
245
250
  # six-column line number, a tab, then the (already-truncated)
246
251
  # content, followed by a trailer that tells the model whether to
247
252
  # page on. PDF pages carry +"--- Page N ---"+ marker lines from
248
253
  # the extractor; the +kind+ only changes trailer wording here.
249
254
  #
250
- # @param page [Pikuri::FileType::Page]
255
+ # @param page [Pikuri::Extractor::Page]
251
256
  # @return [String]
252
257
  def self.render_page(page)
253
258
  if page.lines.empty?
@@ -281,7 +286,7 @@ module Pikuri
281
286
  # text-free PDF gets an LLM-actionable hint rather than the
282
287
  # plain-file "(Empty file)".
283
288
  #
284
- # @param page [Pikuri::FileType::Page]
289
+ # @param page [Pikuri::Extractor::Page]
285
290
  # @return [String]
286
291
  def self.empty_message(page)
287
292
  if page.kind == :pdf
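The read-tool hunks above move the binary refusal out of the tool: registered extractors claim content first, the passthrough declines on the binary heuristic, and an unclaimed file surfaces as `ArgumentError`. A toy registry showing that flow is sketched below; every name in it (`Extractor`, `REGISTRY`, `pick_extractor`, `read_observation`) and the 30% control-byte threshold are hypothetical and do not reflect the real `Pikuri::Extractor` API.

```ruby
# Heuristic in the spirit of the comments above: any NUL byte, or a sample
# dense in control characters. The 0.3 density threshold is an assumption.
def binary?(sample)
  return false if sample.empty?
  return true if sample.each_byte.any?(&:zero?)

  control = sample.each_byte.count { |b| b < 32 && ![9, 10, 13].include?(b) }
  control.fdiv(sample.bytesize) > 0.3
end

# A toy extractor: a kind plus a predicate that claims a byte sample.
Extractor = Struct.new(:kind, :claims)

# Order matters: format-specific extractors are asked before the passthrough,
# so a PDF is claimed by its magic prefix ahead of the binary refusal.
REGISTRY = [
  Extractor.new(:pdf, ->(sample) { sample.start_with?('%PDF-') }),
  Extractor.new(:text, ->(sample) { !binary?(sample) })
].freeze

def pick_extractor(sample)
  REGISTRY.find { |extractor| extractor.claims.call(sample) } ||
    raise(ArgumentError, 'no extractor claims this content')
end

# Tool-side view: an unclaimed file becomes the binary-file error string.
def read_observation(path, sample)
  "would extract #{path} as #{pick_extractor(sample).kind}"
rescue ArgumentError
  "Error: cannot read binary file: #{path}"
end
```

With this shape, adding a format means appending an extractor to the registry; the tool's rescue clause stays unchanged.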
@@ -111,8 +111,10 @@ module Pikuri
111
111
  'file and try again.'
112
112
  end
113
113
 
114
- prompt = "OK to overwrite #{path}: #{existing.bytesize} → #{content.bytesize} bytes? (y/n)"
115
- return "Error: user declined the write to #{path}." unless confirmer.confirm?(prompt: prompt)
114
+ request = Confirmer::Request.new(
115
+ question: "OK to overwrite #{path}: #{existing.bytesize} → #{content.bytesize} bytes?"
116
+ )
117
+ return "Error: user declined the write to #{path}." unless confirmer.confirm?(request: request)
116
118
 
117
119
  write_bytes(resolved, content)
118
120
  "Updated #{path} (#{existing.bytesize} → #{content.bytesize} bytes)"
metadata CHANGED
@@ -1,14 +1,13 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: pikuri-workspace
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.5
4
+ version: 0.0.7
5
5
  platform: ruby
6
6
  authors:
7
7
  - Martin Vysny
8
- autorequire:
9
8
  bindir: bin
10
9
  cert_chain: []
11
- date: 2026-06-04 00:00:00.000000000 Z
10
+ date: 1980-01-02 00:00:00.000000000 Z
12
11
  dependencies:
13
12
  - !ruby/object:Gem::Dependency
14
13
  name: pikuri-core
@@ -16,14 +15,14 @@ dependencies:
16
15
  requirements:
17
16
  - - '='
18
17
  - !ruby/object:Gem::Version
19
- version: 0.0.5
18
+ version: 0.0.7
20
19
  type: :runtime
21
20
  prerelease: false
22
21
  version_requirements: !ruby/object:Gem::Requirement
23
22
  requirements:
24
23
  - - '='
25
24
  - !ruby/object:Gem::Version
26
- version: 0.0.5
25
+ version: 0.0.7
27
26
  description: |
28
27
  pikuri-workspace adds "operate on a directory tree" to pikuri-core
29
28
  agents: the +Pikuri::Workspace::Filesystem+ class that scopes
@@ -59,7 +58,6 @@ metadata:
59
58
  changelog_uri: https://codeberg.org/mvysny/pikuri/src/branch/master/CHANGELOG.md
60
59
  bug_tracker_uri: https://codeberg.org/mvysny/pikuri/issues
61
60
  rubygems_mfa_required: 'true'
62
- post_install_message:
63
61
  rdoc_options: []
64
62
  require_paths:
65
63
  - lib
@@ -74,8 +72,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
74
72
  - !ruby/object:Gem::Version
75
73
  version: '0'
76
74
  requirements: []
77
- rubygems_version: 3.5.22
78
- signing_key:
75
+ rubygems_version: 3.6.7
79
76
  specification_version: 4
80
77
  summary: Filesystem tools (Read/Write/Edit/Grep/Glob) + Workspace + Confirmer seams
81
78
  for pikuri.