ace-compressor 0.24.1
- checksums.yaml +7 -0
- data/.ace-defaults/compressor/config.yml +11 -0
- data/.ace-defaults/nav/protocols/tmpl-sources/ace-compressor.yml +10 -0
- data/CHANGELOG.md +357 -0
- data/README.md +46 -0
- data/Rakefile +15 -0
- data/exe/ace-compressor +13 -0
- data/handbook/templates/agent/minify-single-source.template.md +34 -0
- data/lib/ace/compressor/atoms/canonical_block_transformer.rb +341 -0
- data/lib/ace/compressor/atoms/compact_policy_classifier.rb +130 -0
- data/lib/ace/compressor/atoms/markdown_parser.rb +190 -0
- data/lib/ace/compressor/atoms/retention_reporter.rb +111 -0
- data/lib/ace/compressor/cli/commands/benchmark.rb +51 -0
- data/lib/ace/compressor/cli/commands/compress.rb +89 -0
- data/lib/ace/compressor/cli.rb +23 -0
- data/lib/ace/compressor/models/context_pack.rb +175 -0
- data/lib/ace/compressor/molecules/cache_store.rb +301 -0
- data/lib/ace/compressor/molecules/input_resolver.rb +98 -0
- data/lib/ace/compressor/organisms/agent_compressor.rb +325 -0
- data/lib/ace/compressor/organisms/benchmark_runner.rb +172 -0
- data/lib/ace/compressor/organisms/compact_compressor.rb +470 -0
- data/lib/ace/compressor/organisms/compression_runner.rb +315 -0
- data/lib/ace/compressor/organisms/exact_compressor.rb +187 -0
- data/lib/ace/compressor/version.rb +7 -0
- data/lib/ace/compressor.rb +109 -0
- metadata +156 -0
checksums.yaml
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
1
|
+
---
|
|
2
|
+
SHA256:
|
|
3
|
+
metadata.gz: f4997ce4637fdebe201f78de3865aec56aa7b781ad6cc244c33d850c980ffdf5
|
|
4
|
+
data.tar.gz: 8aa267c479867137824bfb6ae1cdb238d356a62756f94248f0d6bed9530bf376
|
|
5
|
+
SHA512:
|
|
6
|
+
metadata.gz: d2bce8e96654bb2d35db475c126e9b10f82bda99d8b8d0b520473c6a87de02823c589dd653bd7055c869f1233d0815fb64f744af9779c05bb3f72dcccfb1dbb2
|
|
7
|
+
data.tar.gz: ac83609da32295f432ae0e98f75194ab4ce38c4f9edb22b93b6bb725e2ae6392d4b8770a6b63c91b1c6fa8ec8e7ad19b2dd42fc7c6350ff54ce1c902c6cfe383
|
data/.ace-defaults/compressor/config.yml
ADDED
|
@@ -0,0 +1,11 @@
|
|
|
1
|
+
default_mode: exact
|
|
2
|
+
default_format: path
|
|
3
|
+
cache_dir: .ace-local/compressor
|
|
4
|
+
shared_cache_dir: ~/.ace/cache/compressor
|
|
5
|
+
shared_cache_scope: workflow_only
|
|
6
|
+
|
|
7
|
+
# Model alias or provider:model used for agent-mode prompt minification.
|
|
8
|
+
agent_model: glite
|
|
9
|
+
|
|
10
|
+
# Template resource used to compose the agent-mode system prompt via ace-bundle.
|
|
11
|
+
agent_template_uri: tmpl://agent/minify-single-source
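The defaults above are plain YAML, so a caller could layer project overrides on top of them. The following is a minimal sketch, assuming a hypothetical `effective_config` helper and an inline copy of the shipped defaults; the gem's own config loader is not part of this hunk and may behave differently.

```ruby
require "yaml"

# Inline copy of the shipped defaults shown above (comments omitted).
DEFAULTS_YAML = <<~YAML
  default_mode: exact
  default_format: path
  cache_dir: .ace-local/compressor
  shared_cache_dir: ~/.ace/cache/compressor
  shared_cache_scope: workflow_only
  agent_model: glite
  agent_template_uri: tmpl://agent/minify-single-source
YAML

# Hypothetical helper (not from the gem): overrides win over shipped defaults.
def effective_config(overrides = {})
  defaults = YAML.safe_load(DEFAULTS_YAML)
  defaults.merge(overrides.transform_keys(&:to_s))
end

config = effective_config(default_mode: "compact")
puts config["default_mode"]
puts config["cache_dir"]
```

Keeping the merge shallow seems sufficient here because every shipped key is a scalar.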
|
data/.ace-defaults/nav/protocols/tmpl-sources/ace-compressor.yml
ADDED
|
@@ -0,0 +1,10 @@
|
|
|
1
|
+
---
|
|
2
|
+
# Template Sources Protocol Configuration for ace-compressor gem
|
|
3
|
+
name: ace-compressor
|
|
4
|
+
type: gem
|
|
5
|
+
description: Templates from ace-compressor gem
|
|
6
|
+
priority: 10
|
|
7
|
+
config:
|
|
8
|
+
relative_path: handbook/templates
|
|
9
|
+
pattern: "**/*.template.md"
|
|
10
|
+
enabled: true
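The `pattern` key above selects template files beneath `relative_path`. As a rough illustration of which paths such a glob would pick up, here is a sketch using Ruby's standard `File.fnmatch`; the `template_path?` helper is hypothetical, and the actual resolver used by the `tmpl://` protocol is not shown in this diff.

```ruby
# Pattern copied from the config above.
PATTERN = "**/*.template.md"

# Hypothetical helper: true when a path relative to handbook/templates
# would be matched by the configured pattern.
def template_path?(relative_path)
  File.fnmatch(PATTERN, relative_path, File::FNM_PATHNAME)
end

puts template_path?("agent/minify-single-source.template.md")
puts template_path?("agent/notes.md")
```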
|
data/CHANGELOG.md
ADDED
|
@@ -0,0 +1,357 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable changes to this project will be documented in this file.
|
|
4
|
+
|
|
5
|
+
## [Unreleased]
|
|
6
|
+
|
|
7
|
+
## [0.24.1] - 2026-03-23
|
|
8
|
+
|
|
9
|
+
### Fixed
|
|
10
|
+
- Report true compression gains for ace-bundle inputs by reading the `.meta.json` sidecar instead of comparing against already-compressed content.
|
|
11
|
+
- Skip re-compression when ace-bundle has already compressed the content, preventing output size inflation.
|
|
12
|
+
|
|
13
|
+
## [0.24.0] - 2026-03-23
|
|
14
|
+
|
|
15
|
+
### Changed
|
|
16
|
+
- Refreshed the package README to the current ACE layout pattern with use-case-led structure, integration links, feature summary, and updated quick-start guidance aligned to the current CLI surface.
|
|
17
|
+
|
|
18
|
+
## [0.23.2] - 2026-03-22
|
|
19
|
+
|
|
20
|
+
### Changed
|
|
21
|
+
- Normalized installation and quick-start README examples to fenced code blocks and updated command examples to consistent `mise exec --` execution format.
|
|
22
|
+
|
|
23
|
+
## [0.23.1] - 2026-03-22
|
|
24
|
+
|
|
25
|
+
### Changed
|
|
26
|
+
- Refreshed the README structure with dedicated purpose/install sections, preserved quick-start command coverage, and added a canonical ACE footer link.
|
|
27
|
+
|
|
28
|
+
## [0.23.0] - 2026-03-21
|
|
29
|
+
|
|
30
|
+
### Changed
|
|
31
|
+
- Added initial `TS-COMP-001` value-gated smoke E2E coverage for `ace-compressor`, including scenario runner/verifier contracts and an ADD/SKIP decision record with unit-coverage evidence.
|
|
32
|
+
|
|
33
|
+
## [0.22.1] - 2026-03-18
|
|
34
|
+
|
|
35
|
+
### Changed
|
|
36
|
+
- Migrated CLI namespace from `Ace::Core::CLI::*` to `Ace::Support::Cli::*` (ace-support-cli is now the canonical home for CLI infrastructure).
|
|
37
|
+
|
|
38
|
+
|
|
39
|
+
## [0.22.0] - 2026-03-18
|
|
40
|
+
|
|
41
|
+
### Changed
|
|
42
|
+
- Removed legacy backward-compatibility behavior as part of the 0.10 cleanup release.
|
|
43
|
+
|
|
44
|
+
|
|
45
|
+
## [0.21.3] - 2026-03-15
|
|
46
|
+
|
|
47
|
+
### Changed
|
|
48
|
+
- Migrated CLI framework from dry-cli to ace-support-cli.
|
|
49
|
+
|
|
50
|
+
## [0.21.2] - 2026-03-13
|
|
51
|
+
|
|
52
|
+
### Fixed
|
|
53
|
+
- Preserved markdown frontmatter-only files during exact compression by falling back to raw content emission when no parseable blocks remain.
|
|
54
|
+
|
|
55
|
+
## [0.21.1] - 2026-03-09
|
|
56
|
+
|
|
57
|
+
### Fixed
|
|
58
|
+
- Preserved stable logical source identity for ACE-native inputs resolved through `ace-bundle`, so preset/protocol/config runs no longer key cache entries or emit `FILE|...` records from ephemeral temp bundle paths.
|
|
59
|
+
|
|
60
|
+
### Changed
|
|
61
|
+
- Separated source identity from content file resolution in the runner/cache pipeline so compression still operates on concrete files while cache manifests and output records use stable user-facing source paths.
|
|
62
|
+
|
|
63
|
+
### Technical
|
|
64
|
+
- Expanded resolver, cache-store, runner, and command regression coverage for repeated ACE-native input runs, stable cache hits, and deterministic emitted source records.
|
|
65
|
+
|
|
66
|
+
## [0.21.0] - 2026-03-09
|
|
67
|
+
|
|
68
|
+
### Added
|
|
69
|
+
- Added `mode: "agent"` support to `Ace::Compressor.compress_text`, routing in-memory text compression through the agent engine while preserving the content-only return contract.
|
|
70
|
+
|
|
71
|
+
### Changed
|
|
72
|
+
- Changed fenced markdown handling to pass through nested `ContextPack` records directly instead of re-encoding them as opaque `CODE|markdown|...` payloads during recompression.
|
|
73
|
+
|
|
74
|
+
### Technical
|
|
75
|
+
- Added regression coverage for nested `ContextPack` passthrough and agent-mode `compress_text` behavior.
|
|
76
|
+
|
|
77
|
+
## [0.20.0] - 2026-03-09
|
|
78
|
+
|
|
79
|
+
### Added
|
|
80
|
+
- Added optional `labels:` parameter to `CacheStore#manifest` for stable cache keys independent of filesystem paths, enabling cache reuse across tmpdir-based callers.
|
|
81
|
+
|
|
82
|
+
## [0.19.2] - 2026-03-09
|
|
83
|
+
|
|
84
|
+
### Added
|
|
85
|
+
- Added `compress_text` convenience method to `ExactCompressor` and top-level `Ace::Compressor` module for in-memory text compression without filesystem access.
|
|
86
|
+
|
|
87
|
+
## [0.19.1] - 2026-03-09
|
|
88
|
+
|
|
89
|
+
### Fixed
|
|
90
|
+
- Fixed temporary directory leak in `InputResolver` — each CLI run now cleans up its temp dir via `ensure` block in `CompressionRunner`.
|
|
91
|
+
- Removed unused `@resolved_files` instance variable from `InputResolver`.
|
|
92
|
+
|
|
93
|
+
## [0.19.0] - 2026-03-09
|
|
94
|
+
|
|
95
|
+
### Added
|
|
96
|
+
- Added `--source-scope` with `merged|per-source` modes so `ace-compressor compress` can emit one output per resolved source while preserving existing merged behavior by default.
|
|
97
|
+
- Added per-source runner behavior and regression coverage to keep per-source output ordering stable and deterministic.
|
|
98
|
+
|
|
99
|
+
### Changed
|
|
100
|
+
- Changed input resolution so protocol URLs like `wfi://...` are routed through `ace-bundle` resolution instead of being treated as missing filesystem paths.
|
|
101
|
+
- Updated usage docs with per-source examples, option documentation, and output-path constraints for multi-input per-source runs.
|
|
102
|
+
|
|
103
|
+
### Technical
|
|
104
|
+
- Expanded command, runner, and resolver tests to cover invalid source-scope errors, per-source path emission ordering, and unresolved protocol URL failures.
|
|
105
|
+
|
|
106
|
+
## [0.18.0] - 2026-03-09
|
|
107
|
+
|
|
108
|
+
### Added
|
|
109
|
+
- Added ACE-native source input resolution so `ace-compressor compress` accepts preset names and YAML bundle config files directly.
|
|
110
|
+
- Added an input resolver molecule and focused molecule-level tests for preset/config detection, mixed-source resolution, and failure messaging.
|
|
111
|
+
|
|
112
|
+
### Changed
|
|
113
|
+
- Changed compression runner flow to normalize inputs before mode dispatch, preserving existing ContextPack output contracts across exact/compact/agent modes.
|
|
114
|
+
- Updated usage documentation with preset/config examples, mixed-source behavior, and explicit resolver failure conditions.
|
|
115
|
+
|
|
116
|
+
### Fixed
|
|
117
|
+
- Fixed cache stem generation for resolver-produced sources outside the repository root so preset/config flows no longer crash during canonical cache path derivation.
|
|
118
|
+
|
|
119
|
+
### Technical
|
|
120
|
+
- Expanded command and molecule regression coverage for resolved preset/config paths and external-source cache canonicalization.
|
|
121
|
+
|
|
122
|
+
## [0.17.0] - 2026-03-09
|
|
123
|
+
|
|
124
|
+
### Added
|
|
125
|
+
- Added `ace-compressor benchmark` to compare `exact`, `compact`, and `agent` on live files using byte/line deltas plus retention coverage against the exact baseline.
|
|
126
|
+
- Added shared per-machine workflow cache support with configurable `shared_cache_dir` and `shared_cache_scope` so stable `wfi://...` sources can be reused across worktrees on the same machine.
|
|
127
|
+
|
|
128
|
+
### Changed
|
|
129
|
+
- Changed the CLI entrypoint to route benchmark runs separately while preserving existing `compress` behavior and output paths.
|
|
130
|
+
- Changed cache handling so eligible shared-cache hits hydrate back into the normal local canonical cache path.
|
|
131
|
+
|
|
132
|
+
### Technical
|
|
133
|
+
- Added retention reporting and benchmark runner internals to keep comparison/reporting logic out of the normal compression path.
|
|
134
|
+
- Expanded command and organism regression coverage for benchmark output and cross-project shared workflow cache reuse.
|
|
135
|
+
|
|
136
|
+
## [0.16.0] - 2026-03-09
|
|
137
|
+
|
|
138
|
+
### Changed
|
|
139
|
+
- Changed exact-mode workflow encoding to compact long natural-language list items into shorter phrase slugs while preserving stable `LIST|...` structure and item order.
|
|
140
|
+
- Changed exact-mode shell fence handling so script-like bash blocks collapse into single `CODE|bash|...` records instead of many verbose `CMD|...` lines.
|
|
141
|
+
|
|
142
|
+
### Technical
|
|
143
|
+
- Added regression coverage for compact narrative list slugs and script-style shell block normalization.
|
|
144
|
+
- Bumped cache contracts so exact, compact, and agent workflow reruns refresh artifacts generated with the previous list/shell encoding.
|
|
145
|
+
|
|
146
|
+
## [0.15.0] - 2026-03-09
|
|
147
|
+
|
|
148
|
+
### Added
|
|
149
|
+
- Added deterministic post-LLM list-item compaction so agent mode shortens verbose list payloads while preserving item identity and ordering.
|
|
150
|
+
|
|
151
|
+
### Changed
|
|
152
|
+
- Changed exact-mode table encoding to store semantic `cols=` and `rows=` data instead of escaped markdown table syntax, dramatically reducing table-heavy pack size.
|
|
153
|
+
- Changed compact-mode table parsing to understand structured table records while preserving existing strategy and loss metadata behavior.
|
|
154
|
+
|
|
155
|
+
### Technical
|
|
156
|
+
- Expanded exact/compact/command regression coverage for structured table records and deterministic list-item compaction.
|
|
157
|
+
- Bumped cache contracts so exact, compact, and agent reruns refresh stale artifacts after the table/list encoding changes.
|
|
158
|
+
|
|
159
|
+
## [0.14.0] - 2026-03-08
|
|
160
|
+
|
|
161
|
+
### Fixed
|
|
162
|
+
- Moved agent-mode model and prompt-template defaults out of Ruby and into `ace-compressor` config/protocol defaults.
|
|
163
|
+
- Registered `ace-compressor` template sources under gem-local `.ace-defaults/nav/protocols/tmpl-sources/` so agent prompts resolve via `tmpl://`.
|
|
164
|
+
- Rebuilt agent mode as payload-only rewriting over exact output so the LLM rewrites `SUMMARY|`, `FACT|`, and long `LIST|...` values without regenerating `ContextPack` structure or leaking prompt scaffolding into packs.
|
|
165
|
+
|
|
166
|
+
### Technical
|
|
167
|
+
- Added regression coverage for config-driven `agent_model`, `agent_template_uri`, and deprecated `agent_provider` fallback behavior.
|
|
168
|
+
- Bumped the agent cache contract to invalidate stale refusal/fallback artifacts from earlier agent-mode implementations.
|
|
169
|
+
|
|
170
|
+
## [0.13.0] - 2026-03-08
|
|
171
|
+
|
|
172
|
+
### Added
|
|
173
|
+
- Added explicit degraded-success fallback metadata (`FALLBACK|source=...|from=agent|to=exact|...`) for `--mode agent` provider/validation failure paths.
|
|
174
|
+
- Added runner/command plumbing for fallback detection and human-readable degraded notices while preserving machine-readable fallback records.
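The `FALLBACK|source=...|from=agent|to=exact|...` record above is a pipe-delimited line of `key=value` fields. A consumer could detect a degraded run with something like the sketch below. The parser name, the assumption that values contain no literal `|`, and the `reason` field in the sample are all illustrative and not taken from the gem.

```ruby
# Hypothetical parser for a key=value metadata record.
# Assumes field values contain no literal "|" characters.
def parse_metadata_record(line)
  type, *fields = line.chomp.split("|")
  attrs = fields.to_h do |field|
    key, value = field.split("=", 2)
    [key, value]
  end
  { type: type, attrs: attrs }
end

# The `reason` field below is invented for illustration.
sample = "FALLBACK|source=docs/vision.md|from=agent|to=exact|reason=provider_unavailable"
record = parse_metadata_record(sample)
degraded = record[:type] == "FALLBACK" && record[:attrs]["to"] == "exact"
puts degraded
```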
|
|
175
|
+
|
|
176
|
+
### Changed
|
|
177
|
+
- Changed agent failure handling to degrade to exact-mode output with fidelity failure evidence instead of refusal artifacts.
|
|
178
|
+
- Changed usage contract/docs to describe agent degraded fallback (`FALLBACK|...`) and exit `0` behavior.
|
|
179
|
+
- Changed agent-mode cache manifest contract keying to refresh stale pre-fallback artifacts.
|
|
180
|
+
|
|
181
|
+
### Technical
|
|
182
|
+
- Expanded organism/runner/command regression coverage for degraded agent fallback behavior and zero-exit verification.
|
|
183
|
+
|
|
184
|
+
## [0.12.0] - 2026-03-08
|
|
185
|
+
|
|
186
|
+
### Added
|
|
187
|
+
- Added a dedicated single-source agent minification prompt template (`handbook/templates/agent/minify-single-source.template.md`) for resource-driven prompt composition.
|
|
188
|
+
|
|
189
|
+
### Changed
|
|
190
|
+
- Hardened `agent` single-source output contract by replacing plan-template prompt composition with a dedicated minification template and stronger success-path fidelity checks.
|
|
191
|
+
- Expanded agent-mode validation to reject summary-only output, enforce numeric fidelity, and require compressed output to be smaller than exact baseline.
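The numeric-fidelity and size-gate checks described above can be pictured roughly as follows. This is only a sketch under stated assumptions: the helper names and the numeric-token regular expression are hypothetical, and the gem's real validator may use different rules.

```ruby
# Hypothetical rule: every numeric token in the exact baseline must still
# appear in the candidate, and the candidate must be strictly smaller.
NUMERIC_TOKEN = /\d+(?:\.\d+)*/

def numeric_tokens(text)
  text.scan(NUMERIC_TOKEN).uniq
end

def acceptable_rewrite?(baseline, candidate)
  missing = numeric_tokens(baseline) - numeric_tokens(candidate)
  missing.empty? && candidate.bytesize < baseline.bytesize
end

baseline = "FACT|Retry up to 3 times with a 1.5 second delay between attempts"
puts acceptable_rewrite?(baseline, "FACT|Retry 3x, 1.5 s delay")
puts acceptable_rewrite?(baseline, "FACT|Retry 3x with delay")
```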
|
|
192
|
+
|
|
193
|
+
### Technical
|
|
194
|
+
- Added agent-mode regression coverage for required command/example retention, numeric token preservation, summary-collapse rejection, and size gate behavior.
|
|
195
|
+
|
|
196
|
+
## [0.11.0] - 2026-03-08
|
|
197
|
+
|
|
198
|
+
### Added
|
|
199
|
+
- Added `agent` compression mode with protocol-composed prompt flow (`ace-bundle`), `ace-llm` invocation, and validator-visible concept inventory markers.
|
|
200
|
+
- Added `AgentCompressor` organism and targeted tests for pass/fail/provider-unavailable spike behavior.
|
|
201
|
+
|
|
202
|
+
### Changed
|
|
203
|
+
- Extended CLI and compression runner mode contracts to support `--mode agent` routing and mode-aware refusal messaging.
|
|
204
|
+
- Updated usage documentation with the single-source agent spike contract, expected output markers, and refusal/fallback guidance.
|
|
205
|
+
|
|
206
|
+
### Technical
|
|
207
|
+
- Expanded command and runner regression coverage for agent-mode acceptance, refusal handling, and metadata contract stability.
|
|
208
|
+
|
|
209
|
+
## [0.10.3] - 2026-03-07
|
|
210
|
+
|
|
211
|
+
### Technical
|
|
212
|
+
- Applied shine-cycle polish to compact mode internals: added high-level class documentation and replaced table-strategy magic numbers with named constants.
|
|
213
|
+
|
|
214
|
+
## [0.10.2] - 2026-03-07
|
|
215
|
+
|
|
216
|
+
### Technical
|
|
217
|
+
- Completed fit-cycle review/apply-feedback/release flow for PR #243; no actionable findings remained after feedback synthesis retries.
|
|
218
|
+
|
|
219
|
+
## [0.10.1] - 2026-03-07
|
|
220
|
+
|
|
221
|
+
### Technical
|
|
222
|
+
- Completed valid-cycle review/apply-feedback/release flow for PR #243; no medium+ correctness findings required code changes.
|
|
223
|
+
|
|
224
|
+
## [0.10.0] - 2026-03-07
|
|
225
|
+
|
|
226
|
+
### Added
|
|
227
|
+
- Added explicit compact-mode reduction metadata records: `LOSS|...` and `EXAMPLE_REF|...`.
|
|
228
|
+
- Added cross-source example deduplication that collapses duplicate examples to references with provenance.
|
|
229
|
+
|
|
230
|
+
### Changed
|
|
231
|
+
- Changed compact table encoding to emit explicit per-table strategy metadata (`preserve`, `schema_plus_key_rows`, `summarize_with_loss`).
|
|
232
|
+
- Changed compact table reduction to report data-row-only retained/original counts and preserve sensitive table content.
|
|
233
|
+
- Updated compact usage documentation to include the new `TABLE|...|strategy=...`, `LOSS|...`, and `EXAMPLE_REF|...` contract.
|
|
234
|
+
|
|
235
|
+
### Technical
|
|
236
|
+
- Expanded compact organism/command regression coverage for table strategy selection, loss signaling, example-ref collapse, and mimicry-required example preservation.
|
|
237
|
+
|
|
238
|
+
## [0.9.0] - 2026-03-07
|
|
239
|
+
|
|
240
|
+
### Added
|
|
241
|
+
- Added mixed-source compact-mode behavior with per-source policy classes (`narrative-heavy`, `mixed`, `rule-heavy`) and fidelity/refusal metadata records (`FIDELITY|`, `REFUSAL|`, `GUIDANCE|`).
|
|
242
|
+
- Added rule-preservation fidelity checks for mixed documents (`compact_with_exact_rule_sections`) so policy-bearing records can pass compact mode without forced refusal.
|
|
243
|
+
|
|
244
|
+
### Changed
|
|
245
|
+
- Changed compact-mode execution to preserve safe-source output even when other sources refuse and to return non-zero outcome when refusal metadata is present.
|
|
246
|
+
- Added compatibility for interface-contract invocations with optional leading `compress` verb (`ace-compressor compress ...`).
|
|
247
|
+
|
|
248
|
+
### Technical
|
|
249
|
+
- Expanded classifier, compact organism, and command test coverage for mixed-doc pass/fail paths, partial-refusal output retention, and refusal-driven exit semantics.
|
|
250
|
+
|
|
251
|
+
## [0.8.0] - 2026-03-07
|
|
252
|
+
|
|
253
|
+
### Added
|
|
254
|
+
- Added compact-mode narrative policy classification with runtime `POLICY|class=...|action=...` metadata records.
|
|
255
|
+
- Added `CompactCompressor` and classifier atoms for `narrative-heavy` aggressive compaction and `unknown` conservative fallback.
|
|
256
|
+
|
|
257
|
+
### Changed
|
|
258
|
+
- Extended CLI/runtime mode support from exact-only to `exact|compact`, including mode-aware validation and dispatch.
|
|
259
|
+
- Generalized explicit binary/empty-input error messaging to reflect the active compression mode.
|
|
260
|
+
|
|
261
|
+
### Technical
|
|
262
|
+
- Added compact-mode organism/command/atom tests and regression checks for policy emission, fallback behavior, and size reduction versus exact mode.
|
|
263
|
+
- Updated README and usage docs with compact-mode command examples and output contract details.
|
|
264
|
+
|
|
265
|
+
## [0.7.1] - 2026-03-07
|
|
266
|
+
|
|
267
|
+
### Fixed
|
|
268
|
+
- Emitted each `FILE|...` record inline with its source records so multi-file exact packs have unambiguous file scope.
|
|
269
|
+
- Canonicalized prose `Example: ...` markers into `EXAMPLE|tool=...` records instead of leaving them as plain facts.
|
|
270
|
+
- Replaced ad-hoc section-derived list records with stable `LIST|section|[...]` output while still promoting problem-context lists to `PROBLEMS|[...]`.
|
|
271
|
+
|
|
272
|
+
### Technical
|
|
273
|
+
- Updated exact-mode regression tests, usage docs, and changelog text to match the finalized ContextPack/3 contract.
|
|
274
|
+
|
|
275
|
+
## [0.7.0] - 2026-03-07
|
|
276
|
+
|
|
277
|
+
### Changed
|
|
278
|
+
- Migrated exact-mode output to ContextPack/3 with semantic canonical encoding for headings, prose,
|
|
279
|
+
lists, and fenced/table content.
|
|
280
|
+
- Introduced section-scoped output (`FILE|`, `SEC|`) and typed semantic records (`SUMMARY|`, `FACT|`, `RULE|`,
|
|
281
|
+
`CONSTRAINT|`, `PROBLEMS|`, `LIST|section|[...]`, `EXAMPLE|`, `CMD|`, `FILES|`, `TREE|`, `CODE|`) in the
|
|
282
|
+
exact-mode wire format.
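To illustrate how section-scoped records might be consumed, the sketch below walks a pack line by line and attaches the most recent `FILE|` and `SEC|` scope to each typed record. The sample pack, the payload shapes, and the `scoped_records` helper are assumptions made for illustration; the full ContextPack/3 grammar (headers, escaping) is not specified in this changelog.

```ruby
# Hypothetical reader: attaches the current FILE and SEC scope to each record.
def scoped_records(pack_text)
  file = nil
  section = nil
  pack_text.each_line(chomp: true).filter_map do |line|
    next if line.empty?

    type, payload = line.split("|", 2)
    case type
    when "FILE"
      file = payload
      section = nil # a new file resets the section scope
      next
    when "SEC"
      section = payload
      next
    end
    { file: file, section: section, type: type, payload: payload }
  end
end

# Invented sample; real packs may carry header lines and other record types.
sample = <<~PACK
  FILE|docs/vision.md
  SEC|Goals
  SUMMARY|Keep packs small
  RULE|Never drop commands
  SEC|Setup
  CMD|bundle install
PACK

scoped_records(sample).each { |record| puts record.inspect }
```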
|
|
283
|
+
|
|
284
|
+
### Added
|
|
285
|
+
- Added a canonical block transformation layer between markdown parsing and pack encoding for deterministic markdown normalization.
|
|
286
|
+
|
|
287
|
+
### Technical
|
|
288
|
+
- Fixed exact-mode source scoping so each `FILE|...` record now directly precedes that source's records.
|
|
289
|
+
- Updated tests, CLI help text, and docs to describe the ContextPack/3 contract.
|
|
290
|
+
|
|
291
|
+
## [0.6.0] - 2026-03-07
|
|
292
|
+
|
|
293
|
+
### Changed
|
|
294
|
+
- Switched exact-mode pack output from verbose `ContextPack/1` key-value records to compact `ContextPack/2` fixed-position records with a source table and implicit section context.
|
|
295
|
+
|
|
296
|
+
### Technical
|
|
297
|
+
- Reduced repeated exact-mode overhead by removing per-record `src=`, `id=`, and `sec=` fields.
|
|
298
|
+
- Updated cache keys, tests, and usage docs for the `ContextPack/2` wire format.
|
|
299
|
+
|
|
300
|
+
## [0.5.0] - 2026-03-07
|
|
301
|
+
|
|
302
|
+
### Changed
|
|
303
|
+
- Switched `ace-compressor` to a single-command CLI: `ace-compressor [SOURCES...]` no longer requires the `compress` subcommand.
|
|
304
|
+
- Added `--output` for explicit pack save destinations and `--format path|stdio|stats` for console rendering.
|
|
305
|
+
- Default command behavior now writes/reads a canonical cache artifact under `.ace-local/compressor` and prints the saved path.
|
|
306
|
+
- Reworked `--format stats` into a human-readable summary showing cache state plus original-vs-packed byte and line deltas.
|
|
307
|
+
|
|
308
|
+
### Technical
|
|
309
|
+
- Added canonical cache manifests and metadata sidecars keyed by source content SHA-256 plus mode.
|
|
310
|
+
- Reused cached packs for unchanged source sets instead of recompressing on repeat runs.
|
|
311
|
+
- Backfilled missing stats totals into existing cache metadata on cache hits so older cache entries remain usable.
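A cache key built from source content SHA-256 plus mode, as described above, could look something like the sketch below. The `cache_key` helper and the exact way the digests are combined are assumptions; the gem's manifest format is not shown in this changelog.

```ruby
require "digest"

# Hypothetical key derivation: hash each source's content, then hash the
# sorted per-source digests together with the mode.
def cache_key(contents_by_source, mode)
  content_digests = contents_by_source
    .sort_by { |source, _content| source }
    .map { |source, content| "#{source}:#{Digest::SHA256.hexdigest(content)}" }
  Digest::SHA256.hexdigest(([mode] + content_digests).join("\n"))
end

sources = { "docs/a.md" => "# A\n", "docs/b.md" => "# B\n" }
puts cache_key(sources, "exact")
puts cache_key(sources, "compact")
```

Sorting by source path keeps the key stable regardless of input order, so an unchanged source set maps to the same cached pack.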
|
|
312
|
+
|
|
313
|
+
## [0.4.3] - 2026-03-07
|
|
314
|
+
### Technical
|
|
315
|
+
- Removed dead `return 0` from `Compress#call` (dry-cli ignores the return value).
|
|
316
|
+
- Expanded README with quick-start examples and link to `docs/usage.md`.
|
|
317
|
+
|
|
318
|
+
## [0.4.2] - 2026-03-07
|
|
319
|
+
|
|
320
|
+
### Technical
|
|
321
|
+
- Removed redundant `uniq` pass in directory traversal — `Find.find` never yields duplicate paths.
|
|
322
|
+
|
|
323
|
+
## [0.4.1] - 2026-03-07
|
|
324
|
+
|
|
325
|
+
### Fixed
|
|
326
|
+
- Binary files with supported extensions (`.md`, `.txt`) in directories are now correctly skipped during traversal instead of being silently included and producing garbage output.
|
|
327
|
+
|
|
328
|
+
### Technical
|
|
329
|
+
- Added usage documentation (`docs/usage.md`) covering all CLI commands, output format, scenarios, error conditions, and troubleshooting.
|
|
330
|
+
|
|
331
|
+
## [0.4.0] - 2026-03-07
|
|
332
|
+
|
|
333
|
+
### Added
|
|
334
|
+
- Added explicit unresolved markers for image-only markdown references in exact mode output.
|
|
335
|
+
- Added explicit fallback markers for fenced-code blocks in exact mode output.
|
|
336
|
+
- Added table-preservation records so markdown tables are represented structurally in output.
|
|
337
|
+
|
|
338
|
+
### Fixed
|
|
339
|
+
- Preserved imperative modality and numeric facts with dedicated command-level regression tests.
|
|
340
|
+
|
|
341
|
+
### Technical
|
|
342
|
+
- Expanded exact-mode command and organism test coverage for unresolved/fallback/table hardening.
|
|
343
|
+
|
|
344
|
+
## [0.3.0] - 2026-03-06
|
|
345
|
+
|
|
346
|
+
### Added
|
|
347
|
+
- Added exact-mode support for multi-file and directory inputs with deterministic source ordering.
|
|
348
|
+
- Added merged pack output with per-record source provenance (`src=...`) and multi-source header metadata.
|
|
349
|
+
|
|
350
|
+
### Fixed
|
|
351
|
+
- Added loud failures for explicit binary inputs and directories with no supported markdown/text files.
|
|
352
|
+
- Added duplicate explicit source collapse so repeated paths emit once.
|
|
353
|
+
|
|
354
|
+
## [0.2.0] - 2026-03-06
|
|
355
|
+
|
|
356
|
+
### Added
|
|
357
|
+
- Bootstrapped a runnable exact-mode single-file compression path.
|
data/README.md
ADDED
|
@@ -0,0 +1,46 @@
|
|
|
1
|
+
<div align="center">
|
|
2
|
+
<h1> ACE - Compressor </h1>
|
|
3
|
+
|
|
4
|
+
Compress Markdown and text into ContextPack/3 artifacts for efficient LLM context loading.
|
|
5
|
+
|
|
6
|
+
<img src="https://raw.githubusercontent.com/cs3b/ace/main/docs/brand/AgenticCodingEnvironment.Logo.XS.jpg" alt="ACE Logo" width="480">
|
|
7
|
+
<br><br>
|
|
8
|
+
|
|
9
|
+
<a href="https://rubygems.org/gems/ace-compressor"><img alt="Gem Version" src="https://img.shields.io/gem/v/ace-compressor.svg" /></a>
|
|
10
|
+
<a href="https://www.ruby-lang.org"><img alt="Ruby" src="https://img.shields.io/badge/Ruby-3.2+-CC342D?logo=ruby" /></a>
|
|
11
|
+
<a href="https://opensource.org/licenses/MIT"><img alt="License: MIT" src="https://img.shields.io/badge/License-MIT-blue.svg" /></a>
|
|
12
|
+
|
|
13
|
+
</div>
|
|
14
|
+
|
|
15
|
+
> Works with: Claude Code, Codex CLI, OpenCode, Gemini CLI, pi-agent, and more.
|
|
16
|
+
|
|
17
|
+
[Getting Started](docs/getting-started.md) | [Usage Guide](docs/usage.md)
|
|
18
|
+
|
|
19
|
+

|
|
20
|
+
|
|
21
|
+
`ace-compressor` provides deterministic context compression for one or more sources, with exact extraction, policy-driven compact output, and agent-assisted payload rewriting while preserving record structure. It integrates with [ace-bundle](../ace-bundle) for input resolution and [ace-llm](../ace-llm) for agent-mode rewriting. Compression returns are moderate — expect ~10% byte reduction in exact mode and ~25% in agent mode — with the main wins coming from line count reduction and structured record normalization.
|
|
22
|
+
|
|
23
|
+
## How It Works
|
|
24
|
+
|
|
25
|
+
1. Feed one or more Markdown or text sources into a compression mode.
|
|
26
|
+
2. The compressor extracts structured records, applies mode-specific reduction policies, and produces a stable `ContextPack/3` artifact — a pipe-delimited record format preserving document structure in minimal tokens.
|
|
27
|
+
3. Output goes to a cache-backed path for reuse or inline to stdio, with optional stats summaries and benchmark comparisons.
|
|
28
|
+
|
|
29
|
+
| Mode | What it does | Tradeoff |
|
|
30
|
+
|------|-------------|----------|
|
|
31
|
+
| `exact` | Canonical semantic extraction — headings, prose, lists, and code become compact typed records. Fully deterministic, no LLM involved. | ~10% byte reduction; preserves all content |
|
|
32
|
+
| `compact` | Policy-driven narrative compaction that classifies sections and applies aggressive reduction rules. Emits explicit `LOSS|` markers for anything dropped. | Higher compression; may lose detail |
|
|
33
|
+
| `agent` | Runs exact extraction first, then uses an LLM to rewrite selected payloads (`SUMMARY|`, `FACT|`, long `LIST|` values) while keeping record structure deterministic. | ~25% byte reduction; requires [ace-llm](../ace-llm) |
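The byte and line reductions quoted in the table can be computed from an original text and its packed form. The sketch below shows the arithmetic only; `reduction_percent` and `size_stats` are hypothetical helpers, and the stats output of the CLI itself is not reproduced here.

```ruby
# Hypothetical helper: percentage saved going from original to packed.
def reduction_percent(original, packed)
  return 0.0 if original.zero?

  ((original - packed) * 100.0 / original).round(1)
end

# Hypothetical helper: byte and line reduction for a pair of texts.
def size_stats(original_text, packed_text)
  {
    bytes: reduction_percent(original_text.bytesize, packed_text.bytesize),
    lines: reduction_percent(original_text.lines.size, packed_text.lines.size)
  }
end

puts size_stats("a\nb\nc\nd\n", "ab\ncd\n").inspect
```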
|
|
34
|
+
|
|
35
|
+
## Use Cases
|
|
36
|
+
|
|
37
|
+
**Prepare large docs for agent workflows** - run [`ace-compressor docs/vision.md --mode exact`](docs/usage.md) to reduce source size while keeping provenance and section structure intact for [ace-bundle](../ace-bundle) payloads.
|
|
38
|
+
|
|
39
|
+
**Compare compression strategies on real files** - run benchmark mode across `exact`, `compact`, and `agent` to inspect retention and size tradeoffs before committing to a compression approach.
|
|
40
|
+
|
|
41
|
+
**Build repeatable context artifacts** - generate stable output paths and reuse cache-backed pack artifacts across runs in CI pipelines and multi-agent loops.
|
|
42
|
+
|
|
43
|
+
**Rewrite payloads with LLM assistance** - use agent mode with [ace-llm](../ace-llm) to rewrite payload text while keeping the deterministic `ContextPack/3` structure, producing more concise context for downstream consumers.
|
|
44
|
+
|
|
45
|
+
---
|
|
46
|
+
[Getting Started](docs/getting-started.md) | [Usage Guide](docs/usage.md) | Part of [ACE](https://github.com/cs3b/ace)
|
data/Rakefile
ADDED
|
@@ -0,0 +1,15 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "bundler/gem_tasks"
|
|
4
|
+
require "minitest/test_task"
|
|
5
|
+
|
|
6
|
+
desc "Run tests using ace-test"
|
|
7
|
+
task :test do
|
|
8
|
+
sh "ace-test"
|
|
9
|
+
end
|
|
10
|
+
|
|
11
|
+
desc "Run tests directly (CI mode)"
|
|
12
|
+
Minitest::TestTask.create(:ci)
|
|
13
|
+
|
|
14
|
+
task spec: :test
|
|
15
|
+
task default: :test
|
data/exe/ace-compressor
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
#!/usr/bin/env ruby
|
|
2
|
+
# frozen_string_literal: true
|
|
3
|
+
|
|
4
|
+
require_relative "../lib/ace/compressor"
|
|
5
|
+
|
|
6
|
+
args = ARGV.empty? ? ["--help"] : ARGV
|
|
7
|
+
|
|
8
|
+
begin
|
|
9
|
+
Ace::Compressor::CLI.start(args)
|
|
10
|
+
rescue Ace::Support::Cli::Error => e
|
|
11
|
+
warn e.message
|
|
12
|
+
exit(e.exit_code)
|
|
13
|
+
end
|
data/handbook/templates/agent/minify-single-source.template.md
ADDED
|
@@ -0,0 +1,34 @@
|
|
|
1
|
+
---
|
|
2
|
+
doc-type: template
|
|
3
|
+
title: Agent Payload Rewriter
|
|
4
|
+
purpose: Documentation for ace-compressor/handbook/templates/agent/minify-single-source.template.md
|
|
5
|
+
ace-docs:
|
|
6
|
+
last-updated: 2026-03-09
|
|
7
|
+
last-checked: 2026-03-21
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# Agent Payload Rewriter
|
|
11
|
+
|
|
12
|
+
You rewrite only payload data for `ace-compressor` agent mode.
|
|
13
|
+
|
|
14
|
+
Return strict JSON only. Do not return ContextPack records, markdown, explanations, or code fences.
|
|
15
|
+
|
|
16
|
+
Output format:
|
|
17
|
+
|
|
18
|
+
{
|
|
19
|
+
"records": [
|
|
20
|
+
{"id": "r1", "payload": "shorter text"},
|
|
21
|
+
{"id": "r2", "items": ["short_item_one", "short_item_two"]}
|
|
22
|
+
]
|
|
23
|
+
}
|
|
24
|
+
|
|
25
|
+
Rules:
|
|
26
|
+
- Return every input `id` exactly once.
|
|
27
|
+
- For `SUMMARY` and `FACT`, rewrite `payload` shorter while preserving meaning.
|
|
28
|
+
- For `LIST`, return the same number of `items` in the same order.
|
|
29
|
+
- Preserve explicit identities: tool names, ADR numbers, path fragments, acronyms, command names, and distinguishing nouns.
|
|
30
|
+
- Remove repeated phrasing and boilerplate.
|
|
31
|
+
- Prefer concise, information-dense wording.
|
|
32
|
+
- Make list items aggressively short while keeping item identity. Prefer `config` over `configuration`, `docs` over `documentation`, `arch` over `architecture`, and drop filler tokens like `with`, `for`, `and`, `the`.
|
|
33
|
+
- Do not invent new facts.
|
|
34
|
+
- Do not emit any wrapper text before or after the JSON object.
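A caller receiving the model's reply could check the structural rules above before applying any rewrite. The following sketch is hypothetical: the `valid_agent_response?` helper and the shape of `inputs` (record id mapped to its list item count, or `nil` for `SUMMARY`/`FACT` payload records) are assumptions, not the gem's validator.

```ruby
require "json"

# Hypothetical validator for the JSON contract described in the template.
# `inputs` maps each record id to its list item count (nil for payload records).
def valid_agent_response?(json_text, inputs)
  data = JSON.parse(json_text)
  return false unless data.is_a?(Hash) && data["records"].is_a?(Array)

  records = data["records"]
  return false unless records.all?(Hash)

  ids = records.map { |record| record["id"] }
  # Every input id must come back exactly once.
  return false unless ids.all?(String) && ids.sort == inputs.keys.sort

  records.all? do |record|
    expected_items = inputs[record["id"]]
    if expected_items
      record["items"].is_a?(Array) && record["items"].size == expected_items
    else
      record["payload"].is_a?(String)
    end
  end
rescue JSON::ParserError
  false # wrapper text or malformed JSON fails the contract
end

inputs = { "r1" => nil, "r2" => 2 }
reply = '{"records": [{"id": "r1", "payload": "shorter text"}, {"id": "r2", "items": ["short_item_one", "short_item_two"]}]}'
puts valid_agent_response?(reply, inputs)
```

This covers only the shape of the reply; meaning preservation and identity retention would need separate checks.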
|