gcf-python 0.1.2__tar.gz → 0.2.0__tar.gz

Files changed (24)
  1. {gcf_python-0.1.2 → gcf_python-0.2.0}/CHANGELOG.md +13 -0
  2. {gcf_python-0.1.2 → gcf_python-0.2.0}/PKG-INFO +17 -10
  3. {gcf_python-0.1.2 → gcf_python-0.2.0}/README.md +16 -9
  4. {gcf_python-0.1.2 → gcf_python-0.2.0}/pyproject.toml +1 -1
  5. {gcf_python-0.1.2 → gcf_python-0.2.0}/src/gcf/__init__.py +1 -1
  6. {gcf_python-0.1.2 → gcf_python-0.2.0}/src/gcf/decode.py +7 -0
  7. {gcf_python-0.1.2 → gcf_python-0.2.0}/src/gcf/encode.py +13 -7
  8. {gcf_python-0.1.2 → gcf_python-0.2.0}/src/gcf/generic.py +8 -7
  9. {gcf_python-0.1.2 → gcf_python-0.2.0}/src/gcf/session.py +13 -7
  10. {gcf_python-0.1.2 → gcf_python-0.2.0}/tests/test_decode.py +4 -4
  11. {gcf_python-0.1.2 → gcf_python-0.2.0}/tests/test_encode.py +6 -5
  12. {gcf_python-0.1.2 → gcf_python-0.2.0}/tests/test_generic.py +2 -2
  13. {gcf_python-0.1.2 → gcf_python-0.2.0}/.github/workflows/ci.yml +0 -0
  14. {gcf_python-0.1.2 → gcf_python-0.2.0}/.github/workflows/publish.yml +0 -0
  15. {gcf_python-0.1.2 → gcf_python-0.2.0}/.gitignore +0 -0
  16. {gcf_python-0.1.2 → gcf_python-0.2.0}/LICENSE +0 -0
  17. {gcf_python-0.1.2 → gcf_python-0.2.0}/src/gcf/cli.py +0 -0
  18. {gcf_python-0.1.2 → gcf_python-0.2.0}/src/gcf/constants.py +0 -0
  19. {gcf_python-0.1.2 → gcf_python-0.2.0}/src/gcf/delta.py +0 -0
  20. {gcf_python-0.1.2 → gcf_python-0.2.0}/src/gcf/types.py +0 -0
  21. {gcf_python-0.1.2 → gcf_python-0.2.0}/tests/__init__.py +0 -0
  22. {gcf_python-0.1.2 → gcf_python-0.2.0}/tests/test_delta.py +0 -0
  23. {gcf_python-0.1.2 → gcf_python-0.2.0}/tests/test_roundtrip.py +0 -0
  24. {gcf_python-0.1.2 → gcf_python-0.2.0}/tests/test_session.py +0 -0
@@ -1,5 +1,18 @@
  # Changelog
 
+ ## v0.2.0 (2026-06-05)
+
+ - **Breaking**: `encode()` now emits `edges=N` in header line
+ - **Breaking**: `encode()` now emits `## edges [N]` section header (was `## edges`)
+ - `decode()` updated to parse `## edges [N]` format (strips bracket suffix)
+ - Session encoder updated to emit new edge count format
+
+ ## v0.1.3 (2026-06-04)
+
+ - Docs: update README for PyPI discoverability (gcformat.com, proxy, vs-toon links)
+ - Fix: decoder rejects headers missing required `tool` field (conformance)
+ - Fix: escape newlines as `\n` in quoted strings in `encode_generic`
+
  ## v0.1.2 (2026-06-04)
 
  - Fix: escape `"` inside quoted strings in `encode_generic`
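The two `encode_generic` escaping fixes listed in this changelog (`"` in v0.1.2, newlines in v0.1.3) can be illustrated with a minimal standalone sketch. The helper name `quote_value` is hypothetical and not part of the gcf-python API. It only mirrors the quoting condition and `replace` chain visible in the `src/gcf/generic.py` hunk of this diff, and it covers string input only.

```python
def quote_value(s: str) -> str:
    """Quote and escape a string following the rule shown in the diff.

    Hypothetical helper for illustration; not part of the gcf-python API.
    """
    # Strings are quoted only when they contain a pipe, a newline, or are empty.
    if "|" in s or "\n" in s or s == "":
        # Backslashes are escaped first so the later escapes are not doubled.
        escaped = s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        return f'"{escaped}"'
    return s


print(quote_value("a|b"))           # quoted because of the pipe
print(quote_value("line1\nline2"))  # the newline becomes a backslash followed by n
print(quote_value("plain"))         # returned unchanged
```

Keeping the encoded value on a single line matters because the format is line-oriented, so a raw newline inside a quoted string would split one value across two lines.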
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: gcf-python
- Version: 0.1.2
+ Version: 0.2.0
  Summary: Python implementation of GCF (Graph Compact Format): token-optimized wire format for LLM tool responses
  Project-URL: Homepage, https://github.com/blackwell-systems/gcf-python
  Project-URL: Documentation, https://blackwell-systems.github.io/gcf/
@@ -30,9 +30,11 @@ Description-Content-Type: text/markdown
 
  # gcf-python
 
- Python implementation of [GCF (Graph Compact Format)](https://github.com/blackwell-systems/gcf).
+ Python implementation of [GCF (Graph Compact Format)](https://gcformat.com/) — the most token-efficient wire format for LLMs. A drop-in alternative to JSON and TOON for any structured data.
 
- **84% fewer tokens than JSON. 32% fewer than TOON. 100% LLM comprehension accuracy at 500 symbols, where JSON fails.**
+ **79% fewer input tokens than JSON. 75% fewer output tokens. 52% smaller than TOON. 100% LLM comprehension at 500 symbols, where JSON fails at 66.7%.**
+
+ Docs: [gcformat.com](https://gcformat.com/) · [Playground](https://gcformat.com/playground.html) · [GCF vs TOON](https://gcformat.com/guide/vs-toon.html)
 
  ## Install
 
@@ -40,7 +42,7 @@ Python implementation of [GCF (Graph Compact Format)](https://github.com/blackwe
  pip install gcf-python
  ```
 
- Zero dependencies. Pure Python. Python 3.9+. Includes CLI.
+ Zero dependencies. Pure Python. Python 3.9+. Includes CLI. Don't want to change code? Use the [MCP proxy](https://github.com/blackwell-systems/gcf-proxy) for zero-code adoption.
 
  ## CLI
 
@@ -84,12 +86,12 @@ output = encode(p)
 
  Output:
  ```
- GCF tool=context_for_task budget=5000 tokens=1847 symbols=2
+ GCF tool=context_for_task budget=5000 tokens=1847 symbols=2 edges=1
  ## targets
  @0 fn pkg.AuthMiddleware 0.78 lsp_resolved
  ## related
  @1 fn pkg.NewServer 0.54 lsp_resolved
- ## edges
+ ## edges [1]
  @0<@1 calls
  ```
 
@@ -211,11 +213,16 @@ GCF wins on every dataset except deeply nested config (75 tokens on a 618-token
 
  Reproducible: [blackwell-systems/toon@gcf-comparison](https://github.com/blackwell-systems/toon/tree/gcf-comparison)
 
- ## Other Implementations
+ ## Links
 
- - **Go**: [github.com/blackwell-systems/gcf-go](https://github.com/blackwell-systems/gcf-go)
- - **TypeScript**: [github.com/blackwell-systems/gcf-typescript](https://github.com/blackwell-systems/gcf-typescript)
- - **Specification**: [github.com/blackwell-systems/gcf](https://github.com/blackwell-systems/gcf)
+ - [Documentation](https://gcformat.com/)
+ - [Playground](https://gcformat.com/playground.html)
+ - [Specification](https://github.com/blackwell-systems/gcf)
+ - [Go library](https://github.com/blackwell-systems/gcf-go)
+ - [TypeScript library](https://github.com/blackwell-systems/gcf-typescript)
+ - [MCP Proxy](https://github.com/blackwell-systems/gcf-proxy) (zero-code adoption)
+ - [GCF vs TOON](https://gcformat.com/guide/vs-toon.html)
+ - [TOON benchmark fork](https://github.com/blackwell-systems/toon/tree/gcf-comparison)
 
  ## License
 
@@ -5,9 +5,11 @@
 
  # gcf-python
 
- Python implementation of [GCF (Graph Compact Format)](https://github.com/blackwell-systems/gcf).
+ Python implementation of [GCF (Graph Compact Format)](https://gcformat.com/) — the most token-efficient wire format for LLMs. A drop-in alternative to JSON and TOON for any structured data.
 
- **84% fewer tokens than JSON. 32% fewer than TOON. 100% LLM comprehension accuracy at 500 symbols, where JSON fails.**
+ **79% fewer input tokens than JSON. 75% fewer output tokens. 52% smaller than TOON. 100% LLM comprehension at 500 symbols, where JSON fails at 66.7%.**
+
+ Docs: [gcformat.com](https://gcformat.com/) · [Playground](https://gcformat.com/playground.html) · [GCF vs TOON](https://gcformat.com/guide/vs-toon.html)
 
  ## Install
 
@@ -15,7 +17,7 @@ Python implementation of [GCF (Graph Compact Format)](https://github.com/blackwe
  pip install gcf-python
  ```
 
- Zero dependencies. Pure Python. Python 3.9+. Includes CLI.
+ Zero dependencies. Pure Python. Python 3.9+. Includes CLI. Don't want to change code? Use the [MCP proxy](https://github.com/blackwell-systems/gcf-proxy) for zero-code adoption.
 
  ## CLI
 
@@ -59,12 +61,12 @@ output = encode(p)
 
  Output:
  ```
- GCF tool=context_for_task budget=5000 tokens=1847 symbols=2
+ GCF tool=context_for_task budget=5000 tokens=1847 symbols=2 edges=1
  ## targets
  @0 fn pkg.AuthMiddleware 0.78 lsp_resolved
  ## related
  @1 fn pkg.NewServer 0.54 lsp_resolved
- ## edges
+ ## edges [1]
  @0<@1 calls
  ```
 
@@ -186,11 +188,16 @@ GCF wins on every dataset except deeply nested config (75 tokens on a 618-token
 
  Reproducible: [blackwell-systems/toon@gcf-comparison](https://github.com/blackwell-systems/toon/tree/gcf-comparison)
 
- ## Other Implementations
+ ## Links
 
- - **Go**: [github.com/blackwell-systems/gcf-go](https://github.com/blackwell-systems/gcf-go)
- - **TypeScript**: [github.com/blackwell-systems/gcf-typescript](https://github.com/blackwell-systems/gcf-typescript)
- - **Specification**: [github.com/blackwell-systems/gcf](https://github.com/blackwell-systems/gcf)
+ - [Documentation](https://gcformat.com/)
+ - [Playground](https://gcformat.com/playground.html)
+ - [Specification](https://github.com/blackwell-systems/gcf)
+ - [Go library](https://github.com/blackwell-systems/gcf-go)
+ - [TypeScript library](https://github.com/blackwell-systems/gcf-typescript)
+ - [MCP Proxy](https://github.com/blackwell-systems/gcf-proxy) (zero-code adoption)
+ - [GCF vs TOON](https://gcformat.com/guide/vs-toon.html)
+ - [TOON benchmark fork](https://github.com/blackwell-systems/toon/tree/gcf-comparison)
 
  ## License
 
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
  [project]
  name = "gcf-python"
- version = "0.1.2"
+ version = "0.2.0"
  description = "Python implementation of GCF (Graph Compact Format): token-optimized wire format for LLM tool responses"
  readme = "README.md"
  license = {text = "MIT"}
@@ -59,4 +59,4 @@ __all__ = [
  "encode_with_session",
  ]
 
- __version__ = "0.1.2"
+ __version__ = "0.1.3"
@@ -34,6 +34,9 @@ def decode(input_text: str) -> Payload:
  raise DecodeError(f"invalid header, expected 'GCF ...' got {header!r}")
  _parse_header(header[4:], p)
 
+ if not p.tool:
+ raise DecodeError("header missing required 'tool' field")
+
  # Parse body: symbols and edges.
  symbols: list[Symbol] = []
  sym_by_id: dict[int, Symbol] = {}
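The added check rejects a header that carries no `tool` field. A minimal standalone sketch of that rule follows, assuming the header is a space-separated list of `key=value` pairs after the `GCF ` prefix, as in the README example in this diff. `parse_header` and `HeaderError` are hypothetical names for illustration, not the library's `_parse_header` or `DecodeError`.

```python
class HeaderError(ValueError):
    """Hypothetical stand-in for the library's DecodeError."""


def parse_header(line: str) -> dict:
    """Parse a 'GCF key=value ...' header line into a dict of strings."""
    if not line.startswith("GCF "):
        raise HeaderError(f"invalid header, expected 'GCF ...' got {line!r}")
    fields = {}
    for token in line[4:].split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    # Conformance rule from the diff: a header without 'tool' is rejected.
    if not fields.get("tool"):
        raise HeaderError("header missing required 'tool' field")
    return fields


fields = parse_header(
    "GCF tool=context_for_task budget=5000 tokens=1847 symbols=2 edges=1"
)
print(fields["tool"], fields["edges"])
```

Values stay as strings in this sketch; a real decoder would convert `budget`, `tokens`, `symbols`, and `edges` to integers.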
@@ -48,6 +51,10 @@ def decode(input_text: str) -> Payload:
  # Group header.
  if line.startswith("## "):
  group = line[3:]
+ # Strip bracket suffix: "edges [200]" -> "edges"
+ bracket_idx = group.find(" [")
+ if bracket_idx >= 0:
+ group = group[:bracket_idx]
  in_edges = group == "edges"
  if not in_edges:
  if group == "targets":
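The bracket-stripping step can be tried in isolation. The sketch below is a standalone reimplementation rather than the library code: `section_name` is a hypothetical helper that applies the same `find(" [")` logic, so the old `## edges` header and the new `## edges [N]` header both resolve to the same group name.

```python
from typing import Optional


def section_name(line: str) -> Optional[str]:
    """Return the group name of a '## ...' header line, or None for other lines."""
    if not line.startswith("## "):
        return None
    group = line[3:]
    # Strip a count suffix such as " [200]" so "edges [200]" becomes "edges".
    bracket_idx = group.find(" [")
    if bracket_idx >= 0:
        group = group[:bracket_idx]
    return group


for header in ("## edges", "## edges [1]", "## targets", "@0<@1 calls"):
    print(repr(header), "->", section_name(header))
```

Note that the count inside the brackets is discarded here, as in the hunk above; nothing in this diff shows the decoder comparing it with the number of edge lines that follow.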
@@ -17,17 +17,23 @@ def encode(p: Payload) -> str:
  """
  parts: list[str] = []
 
- # Header line.
- header = f"GCF tool={p.tool} budget={p.token_budget} tokens={p.tokens_used} symbols={len(p.symbols)}"
- if p.pack_root:
- header += f" pack_root={p.pack_root}"
- parts.append(header)
-
  # Build symbol index for edge references.
  sym_index: dict[str, int] = {}
  for i, s in enumerate(p.symbols):
  sym_index[s.qualified_name] = i
 
+ # Count valid edges (both endpoints in symbol index).
+ valid_edges = sum(
+ 1 for e in p.edges
+ if e.source in sym_index and e.target in sym_index
+ )
+
+ # Header line.
+ header = f"GCF tool={p.tool} budget={p.token_budget} tokens={p.tokens_used} symbols={len(p.symbols)} edges={valid_edges}"
+ if p.pack_root:
+ header += f" pack_root={p.pack_root}"
+ parts.append(header)
+
  # Group symbols by distance.
  groups = _group_by_distance(p.symbols)
  group_names = ["targets", "related", "extended"]
@@ -58,7 +64,7 @@ def encode(p: Payload) -> str:
  if e.status and e.status != "unchanged":
  line += f" {e.status}"
  edge_lines.append(line)
- parts.append("## edges")
+ parts.append(f"## edges [{len(edge_lines)}]")
  parts.extend(edge_lines)
 
  return "\n".join(parts) + "\n"
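The header construction moves below the symbol index in `encode()` because the header now needs the edge count, and that count depends on which symbols are known. A minimal sketch of the counting rule follows, using simplified stand-ins: `Symbol` and `Edge` here are hypothetical dataclasses carrying only the fields these hunks reference (`qualified_name`, `source`, `target`), and `count_valid_edges` is not a library function.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    qualified_name: str


@dataclass
class Edge:
    source: str
    target: str


def count_valid_edges(symbols: list, edges: list) -> int:
    """Count edges whose source and target both name a known symbol."""
    sym_index = {s.qualified_name: i for i, s in enumerate(symbols)}
    return sum(1 for e in edges if e.source in sym_index and e.target in sym_index)


symbols = [Symbol("pkg.AuthMiddleware"), Symbol("pkg.NewServer")]
edges = [
    Edge("pkg.NewServer", "pkg.AuthMiddleware"),  # both endpoints known
    Edge("pkg.NewServer", "pkg.Missing"),         # dangling target, not counted
]
n = count_valid_edges(symbols, edges)
print(f"edges={n}")       # value for the header line
print(f"## edges [{n}]")  # matching section header
```

Counting only edges with both endpoints present keeps the header value and the section header consistent with the edge lines actually emitted, which is what the updated `test_encode_skips_edges_with_missing_symbols` test asserts with `edges=0` and `## edges [0]`.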
@@ -18,6 +18,8 @@ def encode_generic(data: Any) -> str:
  Returns:
  GCF-formatted text string.
  """
+ if data is None or not isinstance(data, (dict, list)):
+ return str(data) if data is not None else "-"
  lines: list[str] = []
  _encode_value(data, lines, depth=0)
  return "\n".join(lines) + "\n" if lines else "\n"
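The early return added to `encode_generic` changes what callers receive for non-container input: the value's `str()` form with no trailing newline, or `-` for `None`. A minimal sketch of just that branch follows. The name `encode_scalar` is hypothetical; the real function goes on to encode dicts and lists.

```python
from typing import Any, Optional


def encode_scalar(data: Any) -> Optional[str]:
    """Return the short-circuit encoding for non-container input, else None."""
    if data is None or not isinstance(data, (dict, list)):
        return str(data) if data is not None else "-"
    return None  # dicts and lists are handled by the full encoder


print(repr(encode_scalar(42)))       # '42'
print(repr(encode_scalar("hello")))  # 'hello'
print(repr(encode_scalar(None)))     # '-'
```

Callers that previously stripped a trailing newline from primitive output are unaffected, but exact string comparisons against `"42\n"` will no longer match, as the updated `test_encode_primitive_value` test in this diff shows.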
@@ -33,15 +35,16 @@ def _encode_value(value: Any, lines: list[str], depth: int) -> None:
  lines.append(_indent(depth) + _format_value(value))
 
 
- def _encode_dict(d: dict, lines: list[str], depth: int) -> None:
+ def _encode_dict(d: dict, lines: list[str], depth: int, name: str | None = None) -> None:
  """Encode a dict into key=value pairs with section headers for nested values."""
  prefix = _indent(depth)
+ if name is not None:
+ lines.append(f"{prefix}## {name}")
  for key, value in d.items():
  if isinstance(value, list):
  _encode_array(value, key, lines, depth)
  elif isinstance(value, dict):
- lines.append(f"{prefix}## {key}")
- _encode_dict(value, lines, depth + 1)
+ _encode_dict(value, lines, depth + 1, name=key)
  else:
  lines.append(f"{prefix}{key}={_format_value(value)}")
 
@@ -85,14 +88,12 @@ def _encode_tabular(items: list[dict], name: str, lines: list[str], depth: int)
 
  if nested_fields:
  lines.append(f"{prefix}@{i} {row_str}")
- inner_prefix = _indent(depth + 1)
  for nk in nested_fields:
  nv = item.get(nk)
  if isinstance(nv, list):
  _encode_array(nv, nk, lines, depth + 1)
  elif isinstance(nv, dict):
- lines.append(f"{inner_prefix}## {nk}")
- _encode_dict(nv, lines, depth + 2)
+ _encode_dict(nv, lines, depth + 1, name=nk)
  else:
  lines.append(f"{prefix}{row_str}")
 
@@ -141,7 +142,7 @@ def _format_value(value: Any) -> str:
  return str(value)
  s = str(value)
  if "|" in s or "\n" in s or s == "":
- escaped = s.replace("\\", "\\\\").replace('"', '\\"')
+ escaped = s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
  return f'"{escaped}"'
  return s
 
@@ -77,20 +77,26 @@ def encode_with_session(p: Payload, sess: Session | None = None) -> str:
 
  parts: list[str] = []
 
+ # Build local ID mapping for this response.
+ local_index: dict[str, int] = {}
+ for i, s in enumerate(p.symbols):
+ local_index[s.qualified_name] = i
+
+ # Count valid edges.
+ valid_edges = sum(
+ 1 for e in p.edges
+ if e.source in local_index and e.target in local_index
+ )
+
  # Header with session=true marker.
  header = (
  f"GCF tool={p.tool} budget={p.token_budget} tokens={p.tokens_used} "
- f"symbols={len(p.symbols)} session=true"
+ f"symbols={len(p.symbols)} edges={valid_edges} session=true"
  )
  if p.pack_root:
  header += f" pack_root={p.pack_root}"
  parts.append(header)
 
- # Build local ID mapping for this response.
- local_index: dict[str, int] = {}
- for i, s in enumerate(p.symbols):
- local_index[s.qualified_name] = i
-
  # Track which symbols are new (need full declaration).
  new_symbols: list[Symbol] = []
 
@@ -122,7 +128,7 @@ def encode_with_session(p: Payload, sess: Session | None = None) -> str:
 
  # Edges section.
  if p.edges:
- parts.append("## edges")
+ parts.append(f"## edges [{valid_edges}]")
  for e in p.edges:
  src_idx = local_index.get(e.source)
  tgt_idx = local_index.get(e.target)
@@ -13,7 +13,7 @@ def test_decode_basic_payload():
  "@0 fn pkg.AuthMiddleware 0.78 lsp_resolved\n"
  "## related\n"
  "@1 fn pkg.NewServer 0.54 lsp_resolved\n"
- "## edges\n"
+ "## edges [1]\n"
  "@0<@1 calls\n"
  )
 
@@ -93,7 +93,7 @@ def test_decode_edge_with_status():
  "## targets\n"
  "@0 fn a.A 0.90 x\n"
  "@1 fn b.B 0.80 x\n"
- "## edges\n"
+ "## edges [1]\n"
  "@0<@1 calls added\n"
  )
  p = decode(input_text)
@@ -171,7 +171,7 @@ def test_decode_edge_missing_separator():
  "## targets\n"
  "@0 fn a.A 0.90 x\n"
  "@1 fn b.B 0.80 x\n"
- "## edges\n"
+ "## edges [1]\n"
  "@0@1 calls\n"
  )
  with pytest.raises(DecodeError, match="missing '<' separator"):
@@ -184,7 +184,7 @@ def test_decode_edge_unknown_symbol():
  "GCF tool=test budget=100 tokens=50 symbols=1\n"
  "## targets\n"
  "@0 fn a.A 0.90 x\n"
- "## edges\n"
+ "## edges [1]\n"
  "@0<@99 calls\n"
  )
  with pytest.raises(DecodeError, match="unknown symbol id"):
@@ -32,12 +32,12 @@ def test_encode_basic_payload():
 
  output = encode(p)
  expected = (
- "GCF tool=context_for_task budget=5000 tokens=1847 symbols=2\n"
+ "GCF tool=context_for_task budget=5000 tokens=1847 symbols=2 edges=1\n"
  "## targets\n"
  "@0 fn pkg.AuthMiddleware 0.78 lsp_resolved\n"
  "## related\n"
  "@1 fn pkg.NewServer 0.54 lsp_resolved\n"
- "## edges\n"
+ "## edges [1]\n"
  "@0<@1 calls\n"
  )
  assert output == expected
@@ -175,8 +175,9 @@ def test_encode_skips_edges_with_missing_symbols():
  )
  output = encode(p)
  # Section header is emitted (matches Go), but no edge lines beneath it
- assert "## edges" in output
- lines_after_edges = output.split("## edges\n")[1]
+ assert "## edges [0]" in output
+ assert "edges=0" in output
+ lines_after_edges = output.split("## edges [0]\n")[1]
  assert lines_after_edges.strip() == ""
 
 
@@ -184,4 +185,4 @@ def test_encode_empty_payload():
  """Empty payload produces only header."""
  p = Payload(tool="test", token_budget=100, tokens_used=0)
  output = encode(p)
- assert output == "GCF tool=test budget=100 tokens=0 symbols=0\n"
+ assert output == "GCF tool=test budget=100 tokens=0 symbols=0 edges=0\n"
@@ -159,8 +159,8 @@ def test_encode_non_uniform_list():
 
 
  def test_encode_primitive_value():
  """A bare primitive is encoded directly."""
- assert encode_generic(42) == "42\n"
- assert encode_generic("hello") == "hello\n"
+ assert encode_generic(42) == "42"
+ assert encode_generic("hello") == "hello"
 
 
  def test_encode_string_with_pipe():
File without changes