headson 0.4.0__cp310-abi3-macosx_11_0_arm64.whl → 0.5.1__cp310-abi3-macosx_11_0_arm64.whl

Potentially problematic release.


This version of headson might be problematic.

headson/__init__.py CHANGED
@@ -1,4 +1,6 @@
- # Re-export the public API from the compiled extension submodule
+ from __future__ import annotations
+
+ # Directly re-export the compiled extension function with the final signature.
  from .headson import summarize # type: ignore
 
  __all__ = ["summarize"]
headson/headson.abi3.so CHANGED
Binary file
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: headson
- Version: 0.4.0
+ Version: 0.5.1
  Classifier: Programming Language :: Python
  Classifier: Programming Language :: Python :: 3
  Classifier: Programming Language :: Rust
@@ -66,7 +66,8 @@ Common flags:
  - `--no-space`: no space after `:` in objects
  - `--indent <STR>`: indentation unit (default: two spaces)
  - `--string-cap <N>`: max graphemes to consider per string (default: 500)
- - `--tail`: prefer the end of arrays when truncating. Strings are unaffected. In `pseudo`/`js` templates the omission marker appears at the start; `json` remains strict JSON with no annotations.
+ - `--head`: prefer the beginning of arrays when truncating (keep first N). Strings are unaffected. In `pseudo`/`js` templates the omission marker appears near the end; `json` remains strict. Mutually exclusive with `--tail`.
+ - `--tail`: prefer the end of arrays when truncating (keep last N). Strings are unaffected. In `pseudo`/`js` templates the omission marker appears at the start; `json` remains strict. Mutually exclusive with `--head`.
 
  Notes:
 
@@ -77,6 +78,7 @@ Notes:
  - Using `--global-budget` may truncate or omit entire files to respect the total budget.
  - The tool finds the largest preview that fits the budget; if even the tiniest preview exceeds it, you still get a minimal, valid preview.
  - When passing file paths, directories and binary files are ignored; a notice is printed to stderr for each (e.g., `Ignored binary file: ./path/to/file`). Stdin mode reads the stream as-is.
+ - Head vs Tail sampling: these options bias which part of arrays are kept before rendering. They guarantee the kept segment is contiguous at the chosen side (prefix for `--head`, suffix for `--tail`). Display templates may still insert additional internal gap markers inside that kept segment to honor very small budgets; `json` remains strict and unannotated.
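The contiguity guarantee in the note above can be illustrated with a minimal sketch. This is not headson's implementation: the function name `keep_segment` and its parameters are hypothetical, and the sketch only shows what "prefix for `--head`, suffix for `--tail`" means while keeping original indices alongside the values.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def keep_segment(items: Sequence[T], keep: int, side: str) -> List[Tuple[int, T]]:
    """Return (original_index, value) pairs for a contiguous prefix or suffix.

    Hypothetical helper for illustration only; `side` is "head" or "tail".
    """
    if side not in ("head", "tail"):
        raise ValueError(f"unknown side: {side!r}")
    n = len(items)
    keep = max(0, min(keep, n))  # clamp to the valid range
    # Head keeps the first `keep` items; tail keeps the last `keep` items.
    start = 0 if side == "head" else n - keep
    return [(i, items[i]) for i in range(start, start + keep)]


head = keep_segment(list(range(10)), 3, "head")
tail = keep_segment(list(range(10)), 3, "tail")
```

Keeping the original indices is what would let a renderer report how many items were omitted before or after the kept segment.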
 
  Quick one‑liners:
 
@@ -137,10 +139,10 @@ A thin Python extension module is available on PyPI as `headson`.
 
  - Install: `pip install headson` (ABI3 wheels for Python 3.10+ on Linux/macOS/Windows).
  - API:
- - `headson.summarize(text: str, *, template: str = "pseudo", character_budget: int | None = None, tail: bool = False) -> str`
+ - `headson.summarize(text: str, *, template: str = "pseudo", character_budget: int | None = None, skew: str = "balanced") -> str`
  - `template`: one of `"json" | "pseudo" | "js"`
  - `character_budget`: maximum output size in characters (default: 500)
- - `tail`: prefer the end of arrays when truncating; strings unaffected. Affects only display templates (`pseudo`/`js`); `json` remains strict.
+ - `skew`: one of `"balanced" | "head" | "tail"` (focus arrays on start vs end; only affects display templates; `json` remains strict).
 
  Example:
 
@@ -158,11 +160,56 @@ print(
  json.dumps(list(range(100))),
  template="pseudo",
  character_budget=80,
- tail=True,
+ skew="tail",
  )
  )
  ```
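Callers written against 0.4.0 pass `tail=True`, which the 0.5.1 signature shown above no longer accepts. A small adapter can map the old flag to the new keyword. The helper name `skew_from_tail` is hypothetical, and mapping `tail=False` to `"balanced"` is an assumption based only on both being the respective defaults.

```python
def skew_from_tail(tail: bool = False) -> str:
    """Translate the 0.4.0 `tail` flag into a 0.5.1 `skew` value.

    Assumption: the old default (tail=False) corresponds to the new
    default ("balanced"); the diff does not state this explicitly.
    """
    return "tail" if tail else "balanced"


legacy_kwargs = {"template": "pseudo", "character_budget": 80, "tail": True}

# Build 0.5.1-style keyword arguments from 0.4.0-style ones.
new_kwargs = {k: v for k, v in legacy_kwargs.items() if k != "tail"}
new_kwargs["skew"] = skew_from_tail(legacy_kwargs.get("tail", False))
```

The resulting dictionary can then be passed as `headson.summarize(text, **new_kwargs)`.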
 
+ # Algorithm
+
+ ```mermaid
+ %%{init: {"themeCSS": ".cluster > rect { fill: transparent; stroke: transparent; } .clusterLabel > text { font-size: 16px; font-weight: 600; } .clusterLabel span { padding: 6px 10px; font-size: 16px; font-weight: 600; }"}}%%
+ flowchart TD
+ subgraph Deserialization
+ direction TB
+ A["Input file(s)"]
+ A -- Single --> C["Parse into optimized tree (with array pre‑sampling) ¹"]
+ A -- Multiple --> D["Parse each file and wrap into a fileset object"]
+ D --> C
+ end
+ subgraph Prioritization
+ direction TB
+ E["Build priority order ²"]
+ F["Choose top N nodes ³"]
+ end
+ subgraph Serialization
+ direction TB
+ G["Render attempt ⁴"]
+ H["Output preview string"]
+ end
+ C --> E
+ E --> F
+ F --> G
+ G --> F
+ F --> H
+ %% Color classes for categories
+ classDef des fill:#eaf2ff,stroke:#3b82f6,stroke-width:1px,color:#0f172a;
+ classDef prio fill:#ecfdf5,stroke:#10b981,stroke-width:1px,color:#064e3b;
+ classDef ser fill:#fff1f2,stroke:#f43f5e,stroke-width:1px,color:#7f1d1d;
+ class A,C,D des;
+ class E,F prio;
+ class G,H ser;
+ style Deserialization fill:transparent,stroke:transparent
+ style Prioritization fill:transparent,stroke:transparent
+ style Serialization fill:transparent,stroke:transparent
+ ```
+
+ ## Footnotes
+ - <sup><b>[1]</b></sup> <b>Optimized tree representation</b>: An arena‑style tree stored in flat, contiguous buffers. Each node records its kind and value plus index ranges into shared child and key arrays. Arrays are ingested in a single pass and may be deterministically pre‑sampled: the first element is always kept; additional elements are selected via a fixed per‑index inclusion test; for kept elements, original indices are stored and full lengths are counted. This enables accurate omission info and internal gap markers later, while minimizing pointer chasing.
+ - <sup><b>[2]</b></sup> <b>Priority order</b>: Nodes are scored so previews surface representative structure and values first. Arrays can favor head/mid/tail coverage (default) or strictly the head; tail preference flips head/tail when configured. Object properties are ordered by key, and strings expand by grapheme with early characters prioritized over very deep expansions.
+ - <sup><b>[3]</b></sup> <b>Choose top N nodes (binary search)</b>: Iteratively picks N so that the rendered preview fits within the character budget, looping between “choose N” and a render attempt to converge quickly.
+ - <sup><b>[4]</b></sup> <b>Render attempt</b>: Serializes the currently included nodes using the selected template. Omission summaries and per-file section headers appear in display templates (pseudo/js); json remains strict. For arrays, display templates may insert internal gap markers between non‑contiguous kept items using original indices.
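The loop between "choose N" and a render attempt described in footnote 3 can be sketched as a binary search over N. This is a sketch under stated assumptions, not headson's actual code: `largest_fitting_preview` and `toy_render` are hypothetical names, and the search assumes the rendered length never shrinks as N grows.

```python
from typing import Callable


def largest_fitting_preview(
    total_nodes: int, budget: int, render: Callable[[int], str]
) -> str:
    """Binary-search the largest N whose rendering fits in `budget` characters.

    Assumes total_nodes >= 1 and that len(render(n)) is non-decreasing in n.
    """
    # Start from the minimal preview so something valid is always returned,
    # even when the smallest rendering already exceeds the budget.
    best = render(1)
    lo, hi = 1, total_nodes
    while lo <= hi:
        mid = (lo + hi) // 2
        attempt = render(mid)  # one "render attempt" per iteration
        if len(attempt) <= budget:
            best = attempt  # fits: remember it and try to include more nodes
            lo = mid + 1
        else:
            hi = mid - 1  # too long: include fewer nodes
    return best


def toy_render(n: int) -> str:
    """Stand-in renderer: the first n integers, comma-separated."""
    return ",".join(str(i) for i in range(n))


preview = largest_fitting_preview(100, 10, toy_render)
```

With 100 candidate nodes the search needs only about seven render attempts, which is why looping between selection and rendering stays cheap.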
+
  ## License
 
  MIT
@@ -0,0 +1,6 @@
+ headson-0.5.1.dist-info/METADATA,sha256=pv3wYywpSE8OuULv5-LgGycmOJi2kFukurzkLUGdyIY,9371
+ headson-0.5.1.dist-info/WHEEL,sha256=-lwEpi49KOTCcgx48T3fLSP8Dxynwa-iRMZNo-JZaqc,103
+ headson-0.5.1.dist-info/licenses/LICENSE,sha256=GZ9row3L2LsnOSbEuGMQZ0zKOIEd5tHr76cZHpg4KK8,1072
+ headson/__init__.py,sha256=8DXFB8ahlywyQXJsscl3w_wgbcQi7sj7zEuV28wR60E,187
+ headson/headson.abi3.so,sha256=IOPVBfgMnRLzVUeY-JHJAFOPnW-x_vP6aiskgtG6pcY,604752
+ headson-0.5.1.dist-info/RECORD,,
@@ -1,6 +0,0 @@
- headson-0.4.0.dist-info/METADATA,sha256=ynLBqMBpx40wDmbk1JnRxLj1IIc6fM7MGgJTUnVFSpw,5862
- headson-0.4.0.dist-info/WHEEL,sha256=-lwEpi49KOTCcgx48T3fLSP8Dxynwa-iRMZNo-JZaqc,103
- headson-0.4.0.dist-info/licenses/LICENSE,sha256=GZ9row3L2LsnOSbEuGMQZ0zKOIEd5tHr76cZHpg4KK8,1072
- headson/__init__.py,sha256=PnXEkHuT6aEqKi8lL11uZU2IZ5cGgFqfO43xShmpros,137
- headson/headson.abi3.so,sha256=KB8ibXCt41RWTqa-ne-PNuJ3Y6ILMftQN4cU3_96yps,604944
- headson-0.4.0.dist-info/RECORD,,