gcf-python (PyPI): 0.2.0 tar.gz → 0.3.1 tar.gz

Files changed (24)

{gcf_python-0.2.0 → gcf_python-0.3.1}/CHANGELOG.md RENAMED

@@ -1,5 +1,9 @@
 # Changelog
+## v0.3.0 (2026-06-05)
+- `encode_generic`: primitive arrays inlined as `name[N]: val1,val2,val3`
 ## v0.2.0 (2026-06-05)
 - **Breaking**: `encode()` now emits `edges=N` in header line

{gcf_python-0.2.0 → gcf_python-0.3.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: gcf-python
-Version: 0.2.0
+Version: 0.3.1
 Summary: Python implementation of GCF (Graph Compact Format): token-optimized wire format for LLM tool responses
 Project-URL: Homepage, https://github.com/blackwell-systems/gcf-python
 Project-URL: Documentation, https://blackwell-systems.github.io/gcf/
@@ -32,7 +32,7 @@ Description-Content-Type: text/markdown
 Python implementation of [GCF (Graph Compact Format)](https://gcformat.com/) — the most token-efficient wire format for LLMs. A drop-in alternative to JSON and TOON for any structured data.
-**79% fewer input tokens than JSON. 75% fewer output tokens. 52% smaller than TOON. 100% LLM comprehension at 500 symbols, where JSON fails at 66.7%.**
+**79% fewer input tokens than JSON. 75% fewer output tokens. 52% smaller than TOON. 100% LLM comprehension at 500 symbols, where JSON scores 76.9% and TOON scores 92.3%.**
 Docs: [gcformat.com](https://gcformat.com/) · [Playground](https://gcformat.com/playground.html) · [GCF vs TOON](https://gcformat.com/guide/vs-toon.html)
@@ -189,15 +189,17 @@ Works on dicts, lists, and primitives. Lists of uniform dicts get tabular rows.
 ## Comprehension Eval
-Rigorous 3-way benchmark (GCF vs TOON vs JSON) at 500 symbols, 200 edges. Six structured extraction questions sent to an LLM:
+Rigorous 3-way benchmark (GCF vs TOON vs JSON) at 500 symbols, 200 edges. 13 structured extraction questions sent to an LLM with zero format instructions:
 | Format | Accuracy | Tokens | vs JSON |
 |--------|----------|--------|---------|
-| **GCF** | **100%** (6/6) | **11,090** | **79% fewer** |
-| TOON | 100% (6/6) | 16,378 | 69% fewer |
-| JSON | 66.7% (4/6) | 53,341 | baseline |
+| **GCF** | **100%** (13/13) | **11,090** | **79% fewer** |
+| TOON | 92.3% (12/13) | 16,378 | 69% fewer |
+| JSON | 76.9% (10/13) | 53,341 | baseline |
-JSON failed on counting tasks. GCF and TOON both achieved perfect accuracy. GCF does it in 32% fewer tokens.
+GCF is the only format with perfect accuracy at scale, at 32% fewer tokens than TOON.
+Reproduce: `git clone https://github.com/blackwell-systems/gcf-go && cd gcf-go/eval && GOWORK=off go test -run TestComprehension -v -timeout 0`
 ## Token Efficiency (TOON's Own Benchmark)
@@ -205,13 +207,13 @@ Running [TOON's benchmark harness](https://github.com/blackwell-systems/toon/tre
 | Track | GCF | TOON | Result |
 |-------|-----|------|--------|
-| Mixed-structure (nested, semi-uniform) | 169,554 | 227,896 | **GCF 34% smaller** |
-| Flat-only (tabular) | 66,026 | 67,837 | **GCF 3% smaller** |
-| Semi-uniform event logs | 107,269 | 154,032 | **GCF 44% smaller** |
+| Mixed-structure (nested, semi-uniform) | 170,367 | 227,896 | **GCF 34% smaller** |
+| Flat-only (tabular) | 66,029 | 67,837 | **GCF 3% smaller** |
+| Semi-uniform event logs | 108,158 | 154,032 | **GCF 42% smaller** |
-GCF wins on every dataset except deeply nested config (75 tokens on a 618-token payload). On semi-uniform data, GCF uses 44% fewer tokens than TOON.
+GCF wins all 6 datasets. On semi-uniform data (the most common real-world pattern), GCF uses 42% fewer tokens than TOON.
-Reproducible: [blackwell-systems/toon@gcf-comparison](https://github.com/blackwell-systems/toon/tree/gcf-comparison)
+Reproduce: `git clone https://github.com/blackwell-systems/toon && cd toon && git checkout gcf-comparison && cd benchmarks && pnpm install && pnpm benchmark:tokens`
 ## Links

{gcf_python-0.2.0 → gcf_python-0.3.1}/README.md RENAMED

@@ -7,7 +7,7 @@
 Python implementation of [GCF (Graph Compact Format)](https://gcformat.com/) — the most token-efficient wire format for LLMs. A drop-in alternative to JSON and TOON for any structured data.
-**79% fewer input tokens than JSON. 75% fewer output tokens. 52% smaller than TOON. 100% LLM comprehension at 500 symbols, where JSON fails at 66.7%.**
+**79% fewer input tokens than JSON. 75% fewer output tokens. 52% smaller than TOON. 100% LLM comprehension at 500 symbols, where JSON scores 76.9% and TOON scores 92.3%.**
 Docs: [gcformat.com](https://gcformat.com/) · [Playground](https://gcformat.com/playground.html) · [GCF vs TOON](https://gcformat.com/guide/vs-toon.html)
@@ -164,15 +164,17 @@ Works on dicts, lists, and primitives. Lists of uniform dicts get tabular rows.
 ## Comprehension Eval
-Rigorous 3-way benchmark (GCF vs TOON vs JSON) at 500 symbols, 200 edges. Six structured extraction questions sent to an LLM:
+Rigorous 3-way benchmark (GCF vs TOON vs JSON) at 500 symbols, 200 edges. 13 structured extraction questions sent to an LLM with zero format instructions:
 | Format | Accuracy | Tokens | vs JSON |
 |--------|----------|--------|---------|
-| **GCF** | **100%** (6/6) | **11,090** | **79% fewer** |
-| TOON | 100% (6/6) | 16,378 | 69% fewer |
-| JSON | 66.7% (4/6) | 53,341 | baseline |
+| **GCF** | **100%** (13/13) | **11,090** | **79% fewer** |
+| TOON | 92.3% (12/13) | 16,378 | 69% fewer |
+| JSON | 76.9% (10/13) | 53,341 | baseline |
-JSON failed on counting tasks. GCF and TOON both achieved perfect accuracy. GCF does it in 32% fewer tokens.
+GCF is the only format with perfect accuracy at scale, at 32% fewer tokens than TOON.
+Reproduce: `git clone https://github.com/blackwell-systems/gcf-go && cd gcf-go/eval && GOWORK=off go test -run TestComprehension -v -timeout 0`
 ## Token Efficiency (TOON's Own Benchmark)
@@ -180,13 +182,13 @@ Running [TOON's benchmark harness](https://github.com/blackwell-systems/toon/tre
 | Track | GCF | TOON | Result |
 |-------|-----|------|--------|
-| Mixed-structure (nested, semi-uniform) | 169,554 | 227,896 | **GCF 34% smaller** |
-| Flat-only (tabular) | 66,026 | 67,837 | **GCF 3% smaller** |
-| Semi-uniform event logs | 107,269 | 154,032 | **GCF 44% smaller** |
+| Mixed-structure (nested, semi-uniform) | 170,367 | 227,896 | **GCF 34% smaller** |
+| Flat-only (tabular) | 66,029 | 67,837 | **GCF 3% smaller** |
+| Semi-uniform event logs | 108,158 | 154,032 | **GCF 42% smaller** |
-GCF wins on every dataset except deeply nested config (75 tokens on a 618-token payload). On semi-uniform data, GCF uses 44% fewer tokens than TOON.
+GCF wins all 6 datasets. On semi-uniform data (the most common real-world pattern), GCF uses 42% fewer tokens than TOON.
-Reproducible: [blackwell-systems/toon@gcf-comparison](https://github.com/blackwell-systems/toon/tree/gcf-comparison)
+Reproduce: `git clone https://github.com/blackwell-systems/toon && cd toon && git checkout gcf-comparison && cd benchmarks && pnpm install && pnpm benchmark:tokens`
 ## Links
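
Note on the benchmark percentages in the two tables above: the sketch below recomputes them from the listed token counts. The helper names `pct_fewer` and `pct_more` are illustrative only and are not part of gcf-python. By this arithmetic, the comprehension-table figures (79%, 69%, 32%) match a reduction measured against the larger format. The 34% and 42% figures in the token-efficiency table instead appear to match the difference divided by the GCF count. With TOON as the baseline, the same rows come to roughly 25% and 30%.

```python
def pct_fewer(smaller: int, larger: int) -> float:
    """Percent reduction with the larger count as the baseline."""
    return 100 * (larger - smaller) / larger


def pct_more(smaller: int, larger: int) -> float:
    """Percent overhead with the smaller count as the baseline."""
    return 100 * (larger - smaller) / smaller


# Token counts copied from the tables in the diff (0.3.1 side).
comprehension = {"GCF": 11_090, "TOON": 16_378, "JSON": 53_341}
tracks = {
    "Mixed-structure": (170_367, 227_896),
    "Flat-only": (66_029, 67_837),
    "Semi-uniform event logs": (108_158, 154_032),
}

if __name__ == "__main__":
    gcf, toon, json_ = (comprehension[k] for k in ("GCF", "TOON", "JSON"))
    print(f"GCF vs JSON:  {pct_fewer(gcf, json_):.1f}% fewer")
    print(f"TOON vs JSON: {pct_fewer(toon, json_):.1f}% fewer")
    print(f"GCF vs TOON:  {pct_fewer(gcf, toon):.1f}% fewer")
    for name, (gcf_tokens, toon_tokens) in tracks.items():
        print(
            f"{name}: {pct_fewer(gcf_tokens, toon_tokens):.1f}% fewer than TOON, "
            f"TOON is {pct_more(gcf_tokens, toon_tokens):.1f}% larger than GCF"
        )
```

The two helpers differ only in the denominator, which is why "GCF N% smaller" and "TOON N% larger" give different numbers for the same pair of counts.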

{gcf_python-0.2.0 → gcf_python-0.3.1}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "gcf-python"
-version = "0.2.0"
+version = "0.3.1"
 description = "Python implementation of GCF (Graph Compact Format): token-optimized wire format for LLM tool responses"
 readme = "README.md"
 license = {text = "MIT"}

{gcf_python-0.2.0 → gcf_python-0.3.1}/src/gcf/generic.py RENAMED

@@ -59,6 +59,10 @@ def _encode_array(items: list, name: str, lines: list[str], depth: int) -> None:
     if _is_uniform_dict_list(items):
         _encode_tabular(items, name, lines, depth)
+    elif all(not isinstance(item, (dict, list)) for item in items):
+        # Primitive array: inline as comma-separated values.
+        vals = ",".join(_format_value(item) for item in items)
+        lines.append(f"{prefix}{name}[{len(items)}]: {vals}")
     else:
         lines.append(f"{prefix}## {name} [{len(items)}]")
         for i, item in enumerate(items):
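
To illustrate the new `elif` branch in the hunk above, here is a minimal standalone sketch of the primitive-array inlining. It does not import the package. `format_value` is a hypothetical stand-in for the private `_format_value` (whose real behavior is not shown in this diff), and the two-space-per-depth `prefix` is likewise an assumption.

```python
def format_value(value) -> str:
    # Hypothetical stand-in for gcf's private _format_value; the real
    # helper may quote or escape values differently.
    return str(value)


def is_primitive_array(items: list) -> bool:
    # Same predicate as the added branch: no element is a dict or a list.
    return all(not isinstance(item, (dict, list)) for item in items)


def encode_primitive_array(items: list, name: str, depth: int = 0) -> str:
    # Assumption: indentation is two spaces per nesting level.
    prefix = "  " * depth
    vals = ",".join(format_value(item) for item in items)
    return f"{prefix}{name}[{len(items)}]: {vals}"


if __name__ == "__main__":
    ids = [1, 2, 3]
    if is_primitive_array(ids):
        print(encode_primitive_array(ids, "ids"))  # ids[3]: 1,2,3
```

Two points may be worth checking against the real implementation. First, `all()` over an empty list returns `True`, so an empty array would take this branch unless an earlier check claims it. Second, a string value that itself contains a comma would be ambiguous in the inlined form unless `_format_value` quotes it.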