yaml-reference 2.7.0__tar.gz → 2.8.0__tar.gz

Files changed (27)
  1. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/.github/copilot-instructions.md +30 -21
  2. yaml_reference-2.8.0/.vscode/settings.json +13 -0
  3. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/PKG-INFO +11 -2
  4. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/README.md +10 -1
  5. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/tests/unit/test_flatten.py +15 -0
  6. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/tests/unit/test_merge.py +20 -0
  7. yaml_reference-2.8.0/tests/unit/test_multidocument.py +34 -0
  8. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/tests/unit/test_reference.py +74 -0
  9. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/yaml_reference/__init__.py +168 -28
  10. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/yaml_reference/cli.py +7 -0
  11. yaml_reference-2.7.0/.vscode/settings.json +0 -8
  12. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/.github/workflows/pytests-pr.yml +0 -0
  13. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/.github/workflows/release.yml +0 -0
  14. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/.github/workflows/spectests-pr.yml +0 -0
  15. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/.gitignore +0 -0
  16. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/.pre-commit-config.yaml +0 -0
  17. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/.python-version +0 -0
  18. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/.zed/settings.json +0 -0
  19. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/GitVersion.yml +0 -0
  20. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/LICENSE +0 -0
  21. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/Makefile +0 -0
  22. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/pyproject.toml +0 -0
  23. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/scripts/spec-test.sh +0 -0
  24. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/scripts/update-readme-badge.sh +0 -0
  25. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/tests/unit/conftest.py +0 -0
  26. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/tests/unit/test_ignore.py +0 -0
  27. {yaml_reference-2.7.0 → yaml_reference-2.8.0}/uv.lock +0 -0
@@ -2,7 +2,7 @@
2
2
 
3
3
  ## Project Overview
4
4
 
5
- **yaml-reference** is a Python library that extends `ruamel.yaml` with cross-file YAML composition using custom tags (`!reference`, `!reference-all`, `!flatten`, `!merge`). It's built to be a reference implementation of the [yaml-reference-specs](https://github.com/dsillman2000/yaml-reference-specs) specification.
5
+ **yaml-reference** is a Python library that extends `ruamel.yaml` with cross-file YAML composition using custom tags (`!reference`, `!reference-all`, `!flatten`, `!merge`, `!ignore`). It's built to be a reference implementation of the [yaml-reference-specs](https://github.com/dsillman2000/yaml-reference-specs) specification.
6
6
 
7
7
  ## Build, Test, and Lint
8
8
 
@@ -62,19 +62,21 @@ uv build
62
62
  The library is structured in two key parts:
63
63
 
64
64
  ### Core Module (`yaml_reference/__init__.py`)
65
- - **Reference & ReferenceAll classes**: Represent the `!reference` and `!reference-all` YAML tags as Python objects
66
- - **parse_yaml_with_references()**: Parses YAML, returning Reference/ReferenceAll objects without resolving them (one layer only)
67
- - **load_yaml_with_references()**: Fully recursively resolves all references, returning a complete Python dict
68
- - **Flatten & Merge classes**: Represent `!flatten` and `!merge` tag logic
69
- - **YAML loader setup**: Registers custom constructors with `ruamel.yaml.YAML` for each tag
65
+ - **Reference & ReferenceAll classes**: Represent the `!reference` and `!reference-all` YAML tags as Python objects, supporting both mapping form and scalar shorthand (`!reference path/to/file.yml`, `!reference-all glob/*.yml`)
66
+ - **Ignore, Flatten, and Merge classes**: Represent `!ignore`, `!flatten`, and `!merge` tag logic
67
+ - **parse_yaml_with_references()**: Parses YAML and preserves composition tags as Python objects without resolving cross-file references
68
+ - **load_yaml_with_references()**: Fully resolves references, then prunes ignored content, flattens sequences, and merges mappings to produce the final Python data structure
69
+ - **Helper transforms**: `prune_ignores()`, `flatten_sequences()`, and `merge_mappings()` implement the post-resolution evaluation pipeline
70
+ - **YAML loader setup**: Registers custom constructors with `ruamel.yaml.YAML` for each supported tag
70
71
 
71
72
  ### CLI Module (`yaml_reference/cli.py`)
72
- - Simple entry point that calls the core loading functions
73
+ - Simple entry point that calls the core loading functions for YAML containing any supported composition tags
73
74
  - Outputs JSON to stdout (compatible with spec tests)
74
75
  - Takes optional `--allow` flag for path restrictions
75
76
 
76
77
  ### Test Structure (`tests/unit/`)
77
78
  - `test_reference.py`: Tests for `!reference` and `!reference-all` tag resolution
79
+ - `test_ignore.py`: Tests for `!ignore` parsing and pruning behavior
78
80
  - `test_flatten.py`: Tests for `!flatten` tag behavior
79
81
  - `test_merge.py`: Tests for `!merge` tag behavior
80
82
  - `conftest.py`: Pytest fixtures and test utilities
@@ -83,33 +85,40 @@ The library is structured in two key parts:
83
85
 
84
86
  ### Security-First Path Handling
85
87
  1. **Relative paths only**: All references must use relative paths (e.g., `path: "config/db.yaml"`). Absolute paths raise `ValueError`.
86
- 2. **Path restriction by default**: References can only access files in the same directory or subdirectories (no `..` to escape). Use `allow_paths` parameter to explicitly allow other directory trees.
88
+ 2. **Path restriction by default**: The referencing file's parent directory is always allowed. Use `allow_paths` to explicitly allow additional directory trees.
87
89
  3. **Security invariant**: Disallowed files are **never opened or read into memory**. Path filtering happens before file I/O.
88
- 4. **Silent omission (for `!reference-all`)**: When a glob pattern matches files outside allowed paths, those files are silently dropped from results and the function returns `rc=0` (not an error).
90
+ 4. **Silent omission (for `!reference-all`)**: When a glob pattern matches files outside allowed paths, those files are silently dropped from results. Empty or fully filtered globs resolve to `[]` rather than an error.
89
91
 
90
92
  ### YAML Tag Implementation Pattern
91
93
  Each custom tag follows this pattern:
92
94
  1. Define a class with `yaml_tag` attribute
93
- 2. Implement `@classmethod from_yaml(cls, constructor, node)` to parse from YAML
95
+ 2. Implement `@classmethod from_yaml(cls, constructor, node)` to parse from YAML, handling scalar, mapping, or sequence nodes as needed
94
96
  3. Register constructor with the YAML loader in `__init__.py`
95
97
  4. The class instance persists through `parse_yaml_with_references()`, allowing layer-by-layer resolution
96
98
 
99
+ ### Reference Tag Forms
100
+ 1. **Scalar shorthand is supported**: `!reference path/to/file.yml` and `!reference-all glob/*.yml` are valid when only `path` or `glob` is needed.
101
+ 2. **Mapping form is still required for optional fields**: Use mappings such as `{ path: "file.yml", anchor: "section" }` when specifying `anchor`.
102
+
97
103
  ### Reference Resolution Order
98
- 1. **Circular reference detection** occurs during recursive resolution by tracking a "resolution stack"
104
+ 1. **Circular reference detection** occurs during recursive resolution by tracking visited file paths
99
105
  2. **Anchors** (optional parameter): If specified, extract only the anchored section from the referenced file
100
- 3. **Recursive expansion**: `load_yaml_with_references()` recursively expands all tags, applying `!flatten` and `!merge` logic as it encounters them
106
+ 3. **Recursive expansion**: `load_yaml_with_references()` recursively resolves `!reference` and `!reference-all` first
107
+ 4. **Ignore pruning**: `!ignore` content is removed after full reference resolution so ignored values from referenced files can remove their parent keys or list items
108
+ 5. **Post-processing**: `!flatten` is evaluated after ignore pruning, and `!merge` is evaluated last
101
109
 
102
110
  ### Error Handling
103
- - **ValueError** for spec violations: absolute paths, circular references, invalid anchors
111
+ - **ValueError** for spec violations: absolute paths, circular references, invalid anchors, malformed merge contents
104
112
  - **FileNotFoundError** for missing referenced files
105
- - **Glob errors**: Return empty list `[]` if glob matches no files (silent omission)
113
+ - **PermissionError** for disallowed `!reference` targets
114
+ - **Glob behavior**: `!reference-all` returns `[]` when a glob matches no files or when all matches are filtered out by path restrictions
106
115
 
107
116
  ### Spec Compliance Testing
108
117
  The project tests against `yaml-reference-specs`, a Go-based reference implementation. The spec tests verify:
109
- - Correct expansion of all four tags
118
+ - Correct expansion of all supported tags
110
119
  - Proper error detection (bad paths, missing files, circular refs)
111
120
  - Path restriction enforcement
112
- - Edge cases like empty globs and nested composition
121
+ - Edge cases like empty globs, ignored content, shorthand reference syntax, and nested composition
113
122
 
114
123
  Run with: `make spec-test` or `scripts/spec-test.sh`
115
124
 
@@ -127,15 +136,15 @@ Install hooks with: `pre-commit install`
127
136
  ### Adding a new tag type
128
137
  1. Create a class in `yaml_reference/__init__.py` with `yaml_tag` attribute and `from_yaml()` classmethod
129
138
  2. Register the constructor after the class definition
130
- 3. Add resolution logic (handle in recursive expansion)
139
+ 3. Add resolution or post-processing logic in the appropriate stage (`_recursively_resolve_references()`, `prune_ignores()`, `flatten_sequences()`, or `merge_mappings()`)
131
140
  4. Write tests in `tests/unit/test_*.py` following existing patterns
132
141
  5. Update README.md with usage example
133
142
 
134
143
  ### Debugging a reference resolution issue
135
- 1. Use `parse_yaml_with_references()` to see raw Reference objects before resolution
136
- 2. Add print statements or use a debugger to trace the `_resolve_references()` recursive calls
137
- 3. Check the resolution stack to verify circular reference detection is working
138
- 4. Run a specific test with `-v` flag to see detailed assertion output
144
+ 1. Use `parse_yaml_with_references()` to inspect raw `Reference`, `ReferenceAll`, `Ignore`, `Flatten`, and `Merge` objects before evaluation
145
+ 2. Trace `_recursively_resolve_references()` to debug cross-file expansion and circular reference handling
146
+ 3. Check the post-processing stages in order: `prune_ignores()`, then `flatten_sequences()`, then `merge_mappings()`
147
+ 4. Run the most specific unit test with `-v` flag to see detailed assertion output
139
148
 
140
149
  ### Updating error messages
141
150
  Ensure error messages follow this pattern: include the problematic value, the path of the file where the error occurred, and the specific constraint violated. This helps spec tests verify proper error handling.
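The "Security-First Path Handling" rules in the hunk above (allowed roots are checked before any file I/O, and out-of-bounds glob matches are dropped silently) can be sketched with the standard library alone. This is a minimal illustration, not the package's `_check_file_path()`: the helper names `is_allowed` and `filter_glob_matches` are hypothetical, paths are treated as plain POSIX strings, and symlinks are not considered.

```python
import posixpath
from pathlib import PurePosixPath


def is_allowed(candidate: str, allow_roots: list[str]) -> bool:
    """Return True if the normalized candidate lies under any allowed root."""
    # Collapse ".." segments lexically so "cfg/../secrets" cannot hide an escape.
    normalized = PurePosixPath(posixpath.normpath(candidate))
    return any(
        normalized.is_relative_to(PurePosixPath(posixpath.normpath(root)))
        for root in allow_roots
    )


def filter_glob_matches(matches: list[str], allow_roots: list[str]) -> list[str]:
    """Drop disallowed matches before anything is opened; may return []."""
    return [match for match in matches if is_allowed(match, allow_roots)]
```

Because filtering happens on path strings only, a fully filtered glob simply yields `[]`, which matches the "silent omission" behavior described for `!reference-all`.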
@@ -0,0 +1,13 @@
1
+ {
2
+ "yaml.customTags": [
3
+ "!reference mapping",
4
+ "!reference scalar",
5
+ "!reference-all mapping",
6
+ "!reference-all scalar",
7
+ "!flatten sequence",
8
+ "!merge sequence",
9
+ "!ignore scalar",
10
+ "!ignore mapping",
11
+ "!ignore sequence"
12
+ ]
13
+ }
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: yaml-reference
3
- Version: 2.7.0
3
+ Version: 2.8.0
4
4
  Summary: Extension package built on top of `ruamel.yaml` to support cross-file references in YAML files using tags `!reference` and `!reference-all`.
5
5
  Project-URL: Repository, https://github.com/dsillman2000/yaml-reference.git
6
6
  Author-email: David Sillman <dsillman2000@gmail.com>
@@ -40,7 +40,7 @@ uv add yaml-reference
40
40
  ```
41
41
 
42
42
  ## Spec
43
- ![Spec Status](https://img.shields.io/badge/spec%20v0.2.8--1-passing-brightgreen?link=https%3A%2F%2Fgithub.com%2Fdsillman2000%2Fyaml-reference-specs%2Ftree%2Fv0.2.8-1)
43
+ ![Spec Status](https://img.shields.io/badge/spec%20v0.2.9--0-passing-brightgreen?link=https%3A%2F%2Fgithub.com%2Fdsillman2000%2Fyaml-reference-specs%2Ftree%2Fv0.2.9-0)
44
44
 
45
45
  This Python library implements the YAML specification for cross-file references and YAML composition in YAML files using tags `!reference`, `!reference-all`, `!flatten`, `!merge`, and `!ignore` as defined in the [yaml-reference-specs project](https://github.com/dsillman2000/yaml-reference-specs).
46
46
 
@@ -123,6 +123,15 @@ networks: !reference-all { glob: "networks/*.yaml" }
123
123
 
124
124
  Use the mapping form when you need optional arguments such as `anchor`; use the scalar shorthand when you only need `path` or `glob`.
125
125
 
126
+ ### Multi-Document YAML
127
+
128
+ yaml-reference distinguishes between a single YAML document whose root value is a sequence and a YAML file that contains multiple documents separated by `---`.
129
+
130
+ - `!reference` requires the target file to contain exactly one YAML document. If the referenced file contains multiple documents, loading fails with a `ValueError`.
131
+ - `!reference-all` expands matched files document-by-document. A single-document file contributes one list element, while a multi-document file contributes one element per document in document order.
132
+ - When `anchor` is used with `!reference-all`, the anchored value is extracted from every document in each matched file, preserving file order and then document order.
133
+ - If the root input file contains multiple documents, `load_yaml_with_references()` returns a Python list with one resolved output element per document. Root documents tagged with `!ignore` are omitted entirely.
134
+
126
135
  ### The `!ignore` Tag
127
136
 
128
137
  The `!ignore` tag marks YAML content that should be parsed but omitted from the final resolved output. The most common use case is a hidden section of reusable anchors that should remain available for aliases elsewhere in the document without being emitted in the resolved result.
@@ -14,7 +14,7 @@ uv add yaml-reference
14
14
  ```
15
15
 
16
16
  ## Spec
17
- ![Spec Status](https://img.shields.io/badge/spec%20v0.2.8--1-passing-brightgreen?link=https%3A%2F%2Fgithub.com%2Fdsillman2000%2Fyaml-reference-specs%2Ftree%2Fv0.2.8-1)
17
+ ![Spec Status](https://img.shields.io/badge/spec%20v0.2.9--0-passing-brightgreen?link=https%3A%2F%2Fgithub.com%2Fdsillman2000%2Fyaml-reference-specs%2Ftree%2Fv0.2.9-0)
18
18
 
19
19
  This Python library implements the YAML specification for cross-file references and YAML composition in YAML files using tags `!reference`, `!reference-all`, `!flatten`, `!merge`, and `!ignore` as defined in the [yaml-reference-specs project](https://github.com/dsillman2000/yaml-reference-specs).
20
20
 
@@ -97,6 +97,15 @@ networks: !reference-all { glob: "networks/*.yaml" }
97
97
 
98
98
  Use the mapping form when you need optional arguments such as `anchor`; use the scalar shorthand when you only need `path` or `glob`.
99
99
 
100
+ ### Multi-Document YAML
101
+
102
+ yaml-reference distinguishes between a single YAML document whose root value is a sequence and a YAML file that contains multiple documents separated by `---`.
103
+
104
+ - `!reference` requires the target file to contain exactly one YAML document. If the referenced file contains multiple documents, loading fails with a `ValueError`.
105
+ - `!reference-all` expands matched files document-by-document. A single-document file contributes one list element, while a multi-document file contributes one element per document in document order.
106
+ - When `anchor` is used with `!reference-all`, the anchored value is extracted from every document in each matched file, preserving file order and then document order.
107
+ - If the root input file contains multiple documents, `load_yaml_with_references()` returns a Python list with one resolved output element per document. Root documents tagged with `!ignore` are omitted entirely.
108
+
100
109
  ### The `!ignore` Tag
101
110
 
102
111
  The `!ignore` tag marks YAML content that should be parsed but omitted from the final resolved output. The most common use case is a hidden section of reusable anchors that should remain available for aliases elsewhere in the document without being emitted in the resolved result.
@@ -130,6 +130,21 @@ data: !flatten
130
130
  assert data["data"] == [1, 2, 3, 4, 5, 6]
131
131
 
132
132
 
133
+ def test_flatten_combined_with_multi_document_reference_all(stage_files):
134
+ files = {
135
+ "main.yml": """
136
+ data: !flatten
137
+ - !reference-all { glob: ./entries.yml }
138
+ """,
139
+ "entries.yml": "---\n- [1, 2]\n---\n- [3, 4]\n",
140
+ }
141
+ stg = stage_files(files)
142
+
143
+ data = load_yaml_with_references(stg / "main.yml")
144
+
145
+ assert data["data"] == [1, 2, 3, 4]
146
+
147
+
133
148
  def test_parse_flatten_tag(stage_files):
134
149
  """Test that !flatten tags are parsed correctly without resolution."""
135
150
  files = {
@@ -169,3 +169,23 @@ def test_flatten_and_merge(stage_files):
169
169
  stg = stage_files(files)
170
170
  data = load_yaml_with_references(stg / "test.yml")
171
171
  assert data["result"] == [{"a": 2}, {"b": 2, "c": 3}]
172
+
173
+
174
+ def test_merge_combined_with_multi_document_reference_all(stage_files):
175
+ files = {
176
+ "test.yml": """
177
+ result: !merge
178
+ - {base: true, version: 1}
179
+ - !reference-all { glob: ./patches.yml }
180
+ """,
181
+ "patches.yml": "---\nversion: 2\n---\nfeature: enabled\n",
182
+ }
183
+ stg = stage_files(files)
184
+
185
+ data = load_yaml_with_references(stg / "test.yml")
186
+
187
+ assert data["result"] == {
188
+ "base": True,
189
+ "version": 2,
190
+ "feature": "enabled",
191
+ }
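The two new tests (flatten and merge combined with a multi-document `!reference-all`) both rely on nested sequences being spliced before the final value is built. The sketch below models that with plain Python data. `deep_flatten` and `merge_all` are hypothetical stand-ins for `Flatten.flattened()` and `Merge.merged()`, and the shallow `dict.update` merge is an assumption, since the diff does not show how nested mappings are combined.

```python
from typing import Any


def deep_flatten(items: list[Any]) -> list[Any]:
    """Recursively splice nested lists into one flat list."""
    flat: list[Any] = []
    for item in items:
        if isinstance(item, list):
            flat.extend(deep_flatten(item))
        else:
            flat.append(item)
    return flat


def merge_all(items: list[Any]) -> dict[str, Any]:
    """Flatten nested lists of mappings, then merge left to right (later keys win)."""
    merged: dict[str, Any] = {}
    for mapping in deep_flatten(items):
        if not isinstance(mapping, dict):
            raise ValueError(f"cannot merge non-mapping value: {mapping!r}")
        merged.update(mapping)
    return merged


# Each document of entries.yml is a one-item sequence ([[1, 2]] and [[3, 4]]),
# so !reference-all yields a list with one element per document.
entries = [[[1, 2]], [[3, 4]]]
patches = [{"version": 2}, {"feature": "enabled"}]

flattened = deep_flatten([entries])
merged = merge_all([{"base": True, "version": 1}, patches])
```

Under these assumptions `flattened` and `merged` reproduce the values the two tests assert.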
@@ -0,0 +1,34 @@
1
+ from yaml_reference import load_yaml_with_references
2
+
3
+
4
+ def test_multi_document_root_file_loads_as_array(stage_files):
5
+ files = {
6
+ "root.yml": """
7
+ ---
8
+ service: !reference { path: ./service.yml }
9
+ ---
10
+ ignored_only: !ignore true
11
+ --- !ignore
12
+ drop_me: true
13
+ ---
14
+ items: !flatten
15
+ - !reference-all { glob: ./entries.yml }
16
+ ---
17
+ config: !merge
18
+ - {a: 1}
19
+ - !reference-all { glob: ./patches.yml }
20
+ """,
21
+ "service.yml": "name: api\n",
22
+ "entries.yml": "---\n- [1, 2]\n---\n- [3, 4]\n",
23
+ "patches.yml": "---\na: 2\n---\nb: 3\n",
24
+ }
25
+ stg = stage_files(files)
26
+
27
+ data = load_yaml_with_references(stg / "root.yml")
28
+
29
+ assert data == [
30
+ {"service": {"name": "api"}},
31
+ {},
32
+ {"items": [1, 2, 3, 4]},
33
+ {"config": {"a": 2, "b": 3}},
34
+ ]
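The expected list in this test has four elements for five root documents: the document tagged `--- !ignore` disappears, while the document whose only key is ignored survives as `{}`. A minimal model of that rule, assuming a hypothetical `Ignore` stand-in class and helper names that are not the package's own:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Ignore:
    """Stand-in for a value tagged !ignore."""

    value: Any = None


def prune(value: Any) -> Any:
    """Remove Ignore values from nested dicts and lists."""
    if isinstance(value, Ignore):
        return None
    if isinstance(value, dict):
        return {k: prune(v) for k, v in value.items() if not isinstance(v, Ignore)}
    if isinstance(value, list):
        return [prune(v) for v in value if not isinstance(v, Ignore)]
    return value


def prune_documents(documents: list[Any]) -> list[Any]:
    """Drop documents tagged !ignore at the root; keep every other document."""
    return [prune(doc) for doc in documents if not isinstance(doc, Ignore)]


root_documents = [
    {"service": {"name": "api"}},
    {"ignored_only": Ignore(True)},  # prunes to {} but stays in the output
    Ignore({"drop_me": True}),       # "--- !ignore": removed entirely
    {"items": [1, 2, 3, 4]},
]
result = prune_documents(root_documents)
```

Keeping the `{}` document means the output length only changes when a whole document is explicitly ignored.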
@@ -57,6 +57,20 @@ def test_reference_load_shorthand(stage_files):
57
57
  assert data["contents"]["inner"] == "inner_value"
58
58
 
59
59
 
60
+ def test_reference_rejects_multi_document_target(stage_files):
61
+ files = {
62
+ "test.yml": "contents: !reference { path: ./multi.yml }",
63
+ "multi.yml": "---\nvalue: 1\n---\nvalue: 2\n",
64
+ }
65
+ stg = stage_files(files)
66
+
67
+ with pytest.raises(
68
+ ValueError,
69
+ match="contains multiple YAML documents and cannot be used with !reference",
70
+ ):
71
+ load_yaml_with_references(stg / "test.yml")
72
+
73
+
60
74
  def test_reference_all_load(stage_files):
61
75
  files = {
62
76
  "test.yml": "hello: world\ncontents: !reference-all { glob: ./chapters/*.yml }",
@@ -103,6 +117,66 @@ def test_reference_all_load_shorthand(stage_files):
103
117
  assert {"chapter_value": 3} in data["contents"]
104
118
 
105
119
 
120
+ def test_reference_all_expands_multi_document_file(stage_files):
121
+ files = {
122
+ "test.yml": "contents: !reference-all { glob: ./multi.yml }",
123
+ "multi.yml": "---\nvalue: 1\n---\nvalue: 2\n",
124
+ }
125
+ stg = stage_files(files)
126
+
127
+ data = load_yaml_with_references(stg / "test.yml")
128
+
129
+ assert data["contents"] == [{"value": 1}, {"value": 2}]
130
+
131
+
132
+ def test_reference_all_mixed_single_and_multi_document_order(stage_files):
133
+ files = {
134
+ "test.yml": "contents: !reference-all { glob: ./parts/*.yml }",
135
+ "parts/a.yml": "value: a\n",
136
+ "parts/b.yml": "---\nvalue: b1\n---\nvalue: b2\n",
137
+ "parts/c.yml": "value: c\n",
138
+ }
139
+ stg = stage_files(files)
140
+
141
+ data = load_yaml_with_references(stg / "test.yml")
142
+
143
+ assert data["contents"] == [
144
+ {"value": "a"},
145
+ {"value": "b1"},
146
+ {"value": "b2"},
147
+ {"value": "c"},
148
+ ]
149
+
150
+
151
+ def test_reference_all_skips_ignored_root_documents_in_multi_document_file(stage_files):
152
+ files = {
153
+ "test.yml": "contents: !reference-all { glob: ./multi.yml }",
154
+ "multi.yml": "--- !ignore\nignored: true\n---\nvalue: kept\n",
155
+ }
156
+ stg = stage_files(files)
157
+
158
+ data = load_yaml_with_references(stg / "test.yml")
159
+
160
+ assert data["contents"] == [{"value": "kept"}]
161
+
162
+
163
+ def test_reference_all_anchor_extracts_from_every_document(stage_files):
164
+ files = {
165
+ "test.yml": "contents: !reference-all { glob: ./parts/*.yml, anchor: item }",
166
+ "parts/a.yml": "---\nroot: &item {value: 1}\n---\nroot: &item {value: 2}\n",
167
+ "parts/b.yml": "root: &item {value: 3}\n",
168
+ }
169
+ stg = stage_files(files)
170
+
171
+ data = load_yaml_with_references(stg / "test.yml")
172
+
173
+ assert data["contents"] == [
174
+ {"value": 1},
175
+ {"value": 2},
176
+ {"value": 3},
177
+ ]
178
+
179
+
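Taken together, the tests added above pin down two rules: `!reference-all` concatenates documents in file order and then document order, and `!reference` rejects a target that holds more than one document. The sketch below models files as a mapping from name to a list of already-parsed documents. The function names are hypothetical, and sorting file names is an assumption of this sketch rather than something the diff confirms.

```python
from typing import Any


def expand_reference_all(files: dict[str, list[Any]]) -> list[Any]:
    """Concatenate documents from every file: file order, then document order."""
    expanded: list[Any] = []
    for name in sorted(files):  # sorted order is an assumption of this sketch
        expanded.extend(files[name])
    return expanded


def resolve_reference(name: str, files: dict[str, list[Any]]) -> Any:
    """Return the single document of a file, rejecting multi-document targets."""
    documents = files[name]
    if len(documents) != 1:
        raise ValueError(
            f"Referenced file '{name}' contains multiple YAML documents "
            "and cannot be used with !reference."
        )
    return documents[0]


parts = {
    "parts/a.yml": [{"value": "a"}],
    "parts/b.yml": [{"value": "b1"}, {"value": "b2"}],
    "parts/c.yml": [{"value": "c"}],
}
contents = expand_reference_all(parts)
```

Here `resolve_reference("parts/b.yml", parts)` raises `ValueError`, while the same file contributes two elements to `contents`.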
106
180
  def test_parse_references(stage_files):
107
181
  files = {
108
182
  "test.yml": "inner: !reference { path: next/open.yml }\n",
@@ -1,6 +1,7 @@
1
1
  import io
2
2
  import os
3
3
  from collections import defaultdict
4
+ from dataclasses import dataclass
4
5
  from pathlib import Path
5
6
  from typing import IO, Any, Optional, Sequence, Union
6
7
 
@@ -252,9 +253,32 @@ class Merge:
252
253
  return cls(seq)
253
254
 
254
255
 
256
+ @dataclass
257
+ class MultiDocument:
258
+ documents: list[Any]
259
+ is_multi_document: bool
260
+
261
+ def __repr__(self):
262
+ return (
263
+ "MultiDocument("
264
+ f"documents={self.documents!r}, is_multi_document={self.is_multi_document!r}"
265
+ ")"
266
+ )
267
+
268
+
255
269
  PathLike = Union[str, Path, os.PathLike]
256
270
 
257
271
 
272
+ def _build_yaml_loader() -> YAML:
273
+ yaml = YAML(typ="safe")
274
+ yaml.register_class(Reference)
275
+ yaml.register_class(ReferenceAll)
276
+ yaml.register_class(Flatten)
277
+ yaml.register_class(Merge)
278
+ yaml.register_class(Ignore)
279
+ return yaml
280
+
281
+
258
282
  def _check_file_path(path: PathLike, allow_paths: Sequence[PathLike]) -> Path:
259
283
  if not isinstance(path, Path):
260
284
  path = Path(path)
@@ -273,11 +297,35 @@ def _check_file_path(path: PathLike, allow_paths: Sequence[PathLike]) -> Path:
273
297
  raise PermissionError(f"File '{path}' is not allowed.")
274
298
 
275
299
 
276
- def _extract_anchor_from_parser_events(yaml: YAML, stream: IO, anchor: str) -> Any:
300
+ def _collect_document_event_streams(yaml: YAML, stream: IO) -> list[list[events.Event]]:
301
+ document_streams = []
302
+ current_document = None
303
+ for event in yaml.parse(stream):
304
+ if isinstance(event, events.DocumentStartEvent):
305
+ current_document = [events.StreamStartEvent(), event]
306
+ elif isinstance(event, events.DocumentEndEvent):
307
+ if current_document is None:
308
+ current_document = [
309
+ events.StreamStartEvent(),
310
+ events.DocumentStartEvent(),
311
+ ]
312
+ current_document.append(event)
313
+ current_document.append(events.StreamEndEvent())
314
+ document_streams.append(current_document)
315
+ current_document = None
316
+ elif not isinstance(event, (events.StreamStartEvent, events.StreamEndEvent)):
317
+ if current_document is not None:
318
+ current_document.append(event)
319
+ return document_streams
320
+
321
+
322
+ def _extract_anchor_from_parser_events(
323
+ yaml: YAML, parsed_events: Sequence[events.Event], anchor: str
324
+ ) -> Any:
277
325
  anchor_lookup = dict()
278
326
  level_lookup = defaultdict(int)
279
327
  _nonzero_keys = lambda dd: [key for key, value in dd.items() if value > 0] # noqa: E731
280
- for event in yaml.parse(stream):
328
+ for event in parsed_events:
281
329
  if (
282
330
  hasattr(event, "anchor")
283
331
  and event.anchor is not None
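The grouping logic in `_collect_document_event_streams()` can be followed more easily with plain strings in place of `ruamel.yaml` event objects. This is a simplified model under that assumption: the event names are invented labels, and `split_documents` is a hypothetical function, not part of the package.

```python
from typing import Optional


def split_documents(events: list[str]) -> list[list[str]]:
    """Group a flat event list into one self-contained stream per document."""
    streams: list[list[str]] = []
    current: Optional[list[str]] = None
    for event in events:
        if event == "doc-start":
            # Every document gets its own stream-start so it can be replayed alone.
            current = ["stream-start", event]
        elif event == "doc-end":
            if current is None:
                current = ["stream-start", "doc-start"]
            current += [event, "stream-end"]
            streams.append(current)
            current = None
        elif event not in ("stream-start", "stream-end") and current is not None:
            current.append(event)
    return streams


events = [
    "stream-start",
    "doc-start", "scalar:1", "doc-end",
    "doc-start", "scalar:2", "doc-end",
    "stream-end",
]
streams = split_documents(events)
```

Each resulting stream is wrapped in its own start and end markers, which is what lets the anchor extraction run once per document.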
@@ -360,14 +408,51 @@ def _extract_anchor_from_parser_events(yaml: YAML, stream: IO, anchor: str) -> A
360
408
  )
361
409
  raise ValueError(msg)
362
410
  strio.seek(0)
363
- document = yaml.load(strio)
411
+ document = _build_yaml_loader().load(strio)
364
412
  return document
365
413
 
366
414
 
415
+ def _parse_yaml_documents(
416
+ file_path: PathLike,
417
+ anchor: Optional[str] = None,
418
+ allow_paths: Optional[Sequence[PathLike]] = None,
419
+ ) -> MultiDocument:
420
+ if not allow_paths:
421
+ allow_paths = [Path(file_path).parent.absolute()]
422
+ path: Path = _check_file_path(file_path, allow_paths=allow_paths)
423
+
424
+ if anchor is None:
425
+ yaml = _build_yaml_loader()
426
+ with path.open("r") as f:
427
+ parsed_documents = list(yaml.load_all(f))
428
+ else:
429
+ yaml = _build_yaml_loader()
430
+ with path.open("r") as f:
431
+ document_streams = _collect_document_event_streams(yaml, f)
432
+ if not document_streams:
433
+ raise ValueError(f"Anchor '{anchor}' not found in the YAML document.")
434
+ parsed_documents = [
435
+ _extract_anchor_from_parser_events(yaml, document_stream, anchor)
436
+ for document_stream in document_streams
437
+ ]
438
+
439
+ if not parsed_documents:
440
+ parsed_documents = [None]
441
+
442
+ parsed_documents = [
443
+ _recursively_attribute_location_to_references(document, path)
444
+ for document in parsed_documents
445
+ ]
446
+ return MultiDocument(
447
+ documents=parsed_documents,
448
+ is_multi_document=len(parsed_documents) > 1,
449
+ )
450
+
451
+
367
452
  def parse_yaml_with_references(
368
453
  file_path: PathLike,
369
454
  anchor: Optional[str] = None,
370
- allow_paths: Sequence[PathLike] = [],
455
+ allow_paths: Optional[Sequence[PathLike]] = None,
371
456
  ) -> Any:
372
457
  """
373
458
  Interface method for reading a YAML file into memory which contains references. References are not resolved in the
@@ -386,29 +471,25 @@ def parse_yaml_with_references(
386
471
  ValueError: If the file is not a valid YAML file.
387
472
 
388
473
  """
389
- if not allow_paths:
390
- allow_paths = [Path(file_path).parent.absolute()]
391
- path: Path = _check_file_path(file_path, allow_paths=allow_paths)
392
-
393
- yaml = YAML(typ="safe")
394
- yaml.register_class(Reference)
395
- yaml.register_class(ReferenceAll)
396
- yaml.register_class(Flatten)
397
- yaml.register_class(Merge)
398
- yaml.register_class(Ignore)
399
-
400
- if not anchor:
401
- with path.open("r") as f:
402
- parsed = yaml.load(f)
403
- else:
404
- with path.open("r") as f:
405
- parsed = _extract_anchor_from_parser_events(yaml, f, anchor)
406
-
407
- parsed = _recursively_attribute_location_to_references(parsed, path)
474
+ parsed = _parse_yaml_documents(
475
+ file_path,
476
+ anchor=anchor,
477
+ allow_paths=allow_paths,
478
+ )
479
+ if not parsed.is_multi_document and len(parsed.documents) == 1:
480
+ return parsed.documents[0]
408
481
  return parsed
409
482
 
410
483
 
411
484
  def _recursively_attribute_location_to_references(data: Any, base_path: Path):
485
+ if isinstance(data, MultiDocument):
486
+ return MultiDocument(
487
+ documents=[
488
+ _recursively_attribute_location_to_references(item, base_path)
489
+ for item in data.documents
490
+ ],
491
+ is_multi_document=data.is_multi_document,
492
+ )
412
493
  if isinstance(data, Flatten):
413
494
  return Flatten(
414
495
  sequence=[
@@ -514,6 +595,17 @@ def _recursively_resolve_references(
514
595
  if visited_paths is None:
515
596
  visited_paths = set()
516
597
 
598
+ if isinstance(data, MultiDocument):
599
+ return MultiDocument(
600
+ documents=[
601
+ _recursively_resolve_references(
602
+ item, allow_paths=allow_paths, visited_paths=visited_paths
603
+ )
604
+ for item in data.documents
605
+ ],
606
+ is_multi_document=data.is_multi_document,
607
+ )
608
+
517
609
  if isinstance(data, Flatten):
518
610
  return Flatten(
519
611
  sequence=[
@@ -547,11 +639,18 @@ def _recursively_resolve_references(
547
639
  # Check for circular reference and track path
548
640
  _check_and_track_path(abs_path, visited_paths)
549
641
 
550
- parsed = parse_yaml_with_references(
642
+ parsed = _parse_yaml_documents(
551
643
  abs_path, anchor=data.anchor, allow_paths=allow_paths
552
644
  )
645
+
646
+ if len(parsed.documents) != 1:
647
+ visited_paths.remove(abs_path)
648
+ raise ValueError(
649
+ f"Referenced file '{abs_path}' contains multiple YAML documents and cannot be used with !reference."
650
+ )
651
+
553
652
  resolved = _recursively_resolve_references(
554
- parsed, allow_paths=allow_paths, visited_paths=visited_paths
653
+ parsed.documents[0], allow_paths=allow_paths, visited_paths=visited_paths
555
654
  )
556
655
 
557
656
  # Remove current path from visited set after processing
@@ -587,13 +686,16 @@ def _recursively_resolve_references(
587
686
  # Check for circular reference and track path
588
687
  _check_and_track_path(path, visited_paths)
589
688
 
590
- parsed = parse_yaml_with_references(
689
+ parsed = _parse_yaml_documents(
591
690
  path, anchor=data.anchor, allow_paths=allow_paths
592
691
  )
593
692
  resolved = _recursively_resolve_references(
594
693
  parsed, allow_paths=allow_paths, visited_paths=visited_paths
595
694
  )
596
- resolved_items.append(resolved)
695
+ if isinstance(resolved, MultiDocument):
696
+ resolved_items.extend(resolved.documents)
697
+ else:
698
+ resolved_items.append(resolved)
597
699
 
598
700
  # Remove current path from visited set after processing
599
701
  visited_paths.remove(path)
@@ -623,6 +725,11 @@ def flatten_sequences(data: Any) -> Any:
623
725
  Given an object which may contain Flatten(...) objects which was parsed from a YAML document containing !flatten
624
726
  tags, return the object without any Flatten(...) objects, but having flattened all sequences marked with them.
625
727
  """
728
+ if isinstance(data, MultiDocument):
729
+ return MultiDocument(
730
+ documents=[flatten_sequences(item) for item in data.documents],
731
+ is_multi_document=data.is_multi_document,
732
+ )
626
733
  if isinstance(data, Flatten):
627
734
  return data.flattened()
628
735
  if isinstance(data, Merge):
@@ -641,6 +748,11 @@ def merge_mappings(data: Any) -> Any:
641
748
  Given an object which may contain Merge(...) objects which was parsed from a YAML document containing !merge
642
749
  tags, return the object without any Merge(...) objects, but having merged all mappings marked with them.
643
750
  """
751
+ if isinstance(data, MultiDocument):
752
+ return MultiDocument(
753
+ documents=[merge_mappings(item) for item in data.documents],
754
+ is_multi_document=data.is_multi_document,
755
+ )
644
756
  if isinstance(data, Merge):
645
757
  return merge_mappings(data.merged())
646
758
  if isinstance(data, list):
@@ -658,6 +770,25 @@ def prune_ignores(data: Any) -> Any:
658
770
  removed from the list. If an Ignore(...) object is found as a value in a dict, the key-value pair is removed from
659
771
  the dict. If an Ignore(...) object is found as a value which is not in a list or dict, it is replaced with None.
660
772
  """
773
+ if isinstance(data, MultiDocument):
774
+ if not data.is_multi_document:
775
+ if not data.documents:
776
+ return MultiDocument(documents=[None], is_multi_document=False)
777
+ return MultiDocument(
778
+ documents=[prune_ignores(data.documents[0])],
779
+ is_multi_document=False,
780
+ )
781
+
782
+ pruned_documents = []
783
+ for item in data.documents:
784
+ # For multi-document streams, only omit documents explicitly tagged !ignore.
785
+ # Preserve documents that prune to None (e.g., explicit null/empty documents)
786
+ # so that document count and ordering remain stable.
787
+ if isinstance(item, Ignore):
788
+ continue
789
+ pruned_item = prune_ignores(item)
790
+ pruned_documents.append(pruned_item)
791
+ return MultiDocument(documents=pruned_documents, is_multi_document=True)
661
792
  if isinstance(data, Ignore):
662
793
  return None
663
794
  if isinstance(data, Flatten):
@@ -715,7 +846,7 @@ def load_yaml_with_references(
715
846
  allow_paths = []
716
847
  allow_paths += [Path(file_path).parent.absolute()]
717
848
  path = _check_file_path(file_path, allow_paths=allow_paths)
718
- parsed = parse_yaml_with_references(path, allow_paths=allow_paths)
849
+ parsed = _parse_yaml_documents(path, allow_paths=allow_paths)
719
850
 
720
851
  # Initialize visited paths with the root file to detect self-references
721
852
  visited_paths = {path.resolve()}
@@ -732,6 +863,14 @@ def load_yaml_with_references(
732
863
  pruned = prune_ignores(resolved)
733
864
  flattened = flatten_sequences(pruned)
734
865
  merged = merge_mappings(flattened)
866
+ if isinstance(merged, MultiDocument):
867
+ if merged.is_multi_document:
868
+ return merged.documents
869
+ if not merged.documents:
870
+ return None
871
+ if len(merged.documents) == 1:
872
+ return merged.documents[0]
873
+ return None
735
874
  return merged
736
875
 
737
876
 
@@ -742,6 +881,7 @@ __all__ = [
742
881
  "Flatten",
743
882
  "merge_mappings",
744
883
  "Merge",
884
+ "MultiDocument",
745
885
  "prune_ignores",
746
886
  "Ignore",
747
887
  ]
@@ -2,6 +2,7 @@ import json
2
2
  import sys
3
3
  from pathlib import Path
4
4
 
5
+ from ruamel.yaml.error import YAMLError
5
6
  from yaml_reference import load_yaml_with_references
6
7
 
7
8
 
@@ -33,6 +34,12 @@ def compile_main(input_file: str, allow_paths: list[str] = []):
33
34
  file=sys.stderr,
34
35
  )
35
36
  sys.exit(1)
37
+ except (FileNotFoundError, ValueError, YAMLError) as err:
38
+ print(
39
+ f'Error: Failed to compile "{input_path}":\n{err}',
40
+ file=sys.stderr,
41
+ )
42
+ sys.exit(1)
36
43
 
37
44
  json.dump(data, sys.stdout, sort_keys=True, indent=2)
38
45
 
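The new `except` clause turns expected load failures into a message on stderr and exit code 1. A standalone sketch of that pattern, assuming a hypothetical `compile_or_exit` wrapper and an injected `loader` callable; `YAMLError` is left out so the example depends only on the standard library:

```python
import io
import sys
from typing import Any, Callable, TextIO


def compile_or_exit(
    loader: Callable[[str], Any],
    input_path: str,
    stderr: TextIO = sys.stderr,
) -> Any:
    """Run the loader and turn expected failures into a message plus exit code 1."""
    try:
        return loader(input_path)
    except (FileNotFoundError, ValueError) as err:
        print(f'Error: Failed to compile "{input_path}":\n{err}', file=stderr)
        raise SystemExit(1)


def failing_loader(path: str) -> Any:
    """Simulate a loader that rejects its input."""
    raise ValueError("circular reference detected")


captured = io.StringIO()
try:
    compile_or_exit(failing_loader, "main.yml", stderr=captured)
except SystemExit as exit_info:
    exit_code = exit_info.code
```

Passing the stream in as a parameter keeps the error path testable without touching the real `sys.stderr`.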
@@ -1,8 +0,0 @@
1
- {
2
- "yaml.customTags": [
3
- "!reference mapping",
4
- "!reference-all mapping",
5
- "!flatten sequence",
6
- "!merge sequence"
7
- ]
8
- }
File without changes