yaml-reference 2.2.1__tar.gz → 2.3.0__tar.gz

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: yaml-reference
- Version: 2.2.1
+ Version: 2.3.0
  Summary: Extension package built on top of `ruamel.yaml` to support cross-file references in YAML files using tags `!reference` and `!reference-all`.
  Project-URL: Repository, https://github.com/dsillman2000/yaml-reference.git
  Author-email: David Sillman <dsillman2000@gmail.com>
@@ -155,15 +155,15 @@ version: 3.1
  $ yaml-reference-cli root.yaml | yq -P > .compiled/root.yaml
  ```
 
- ## Safety note
+ ## Circular reference protection
 
- As of now, the specification does not require any explicit protection against circular references. This package does not check for circular references and will result in an infinite loop (max recursion depth exceeded) if a circular reference is encountered. Onus is on the users of this package to ensure that circular references are not present in their referential YAML files.
+ As required by the yaml-reference-specs specification, this package includes circular reference detection to prevent infinite recursion. If a circular reference is detected (e.g., A references B, B references C, C references A), a `ValueError` will be raised with a descriptive error message. This protects against self-references and circular chains in both `!reference` and `!reference-all` tags.
 
  ## Security considerations
 
- ### Path restriction with `allow_paths`
+ ### Path restriction and `allow_paths`
 
- By default, `!reference` and `!reference-all` tags can only reference files within the same directory as the source YAML file. To allow references to files in other directories, you must explicitly specify allowed paths using the `allow_paths` parameter:
+ By default, `!reference` and `!reference-all` tags can only reference files within the same directory as the source YAML file (or child subdirectories). To allow references to files in other disparate directory trees, you must explicitly specify allowed paths using the `allow_paths` parameter:
 
  ```python
  from yaml_reference import load_yaml_with_references
@@ -181,15 +181,11 @@ In the CLI, use the `--allow` flag:
  yaml-reference compile input.yml --allow /allowed/path1 --allow /allowed/path2
  ```
 
- ### Absolute path restrictions
-
- References using absolute paths (e.g., `/tmp/file.yml`) are explicitly rejected with a `ValueError`. All reference paths must be relative to the source file's directory.
-
- ### Permission errors
+ Whether or not `allow_paths` is specified, the default behavior is to allow references to files in the same directory as the source YAML file (or subdirectories). "Back-navigating" out of a the root directory is not allowed (".." local references in a root YAML file). This provides a secure baseline to prevent unsafe access which is not explicitly allowed.
 
- If a reference attempts to access a file outside the allowed paths, a `PermissionError` is raised. This prevents unauthorized file access through YAML references.
+ ### Absolute path restrictions
 
- Whether or not `allow_paths` is specified, the default behavior is to allow references to files in the same directory as the source YAML file (or subdirectories). "Back-navigating" out of a the root directory is not allowed (".." local references in a root YAML file). This provides a secure baseline to prevent unsafe access which is not explicitly allowed.
+ References using absolute paths (e.g., `/tmp/file.yml`) are explicitly rejected with a `ValueError`. All reference paths must be relative to the source file's directory. If you absolutely must reference an absolute path, relative paths to symlinks can be used. Note that their target directories must be explicitly allowed to avoid permission errors (see the above section about "Path restriction and `allow_paths`").
 
  ## Acknowledgements
 
@@ -129,15 +129,15 @@ version: 3.1
  $ yaml-reference-cli root.yaml | yq -P > .compiled/root.yaml
  ```
 
- ## Safety note
+ ## Circular reference protection
 
- As of now, the specification does not require any explicit protection against circular references. This package does not check for circular references and will result in an infinite loop (max recursion depth exceeded) if a circular reference is encountered. Onus is on the users of this package to ensure that circular references are not present in their referential YAML files.
+ As required by the yaml-reference-specs specification, this package includes circular reference detection to prevent infinite recursion. If a circular reference is detected (e.g., A references B, B references C, C references A), a `ValueError` will be raised with a descriptive error message. This protects against self-references and circular chains in both `!reference` and `!reference-all` tags.
 
  ## Security considerations
 
- ### Path restriction with `allow_paths`
+ ### Path restriction and `allow_paths`
 
- By default, `!reference` and `!reference-all` tags can only reference files within the same directory as the source YAML file. To allow references to files in other directories, you must explicitly specify allowed paths using the `allow_paths` parameter:
+ By default, `!reference` and `!reference-all` tags can only reference files within the same directory as the source YAML file (or child subdirectories). To allow references to files in other disparate directory trees, you must explicitly specify allowed paths using the `allow_paths` parameter:
 
  ```python
  from yaml_reference import load_yaml_with_references
@@ -155,15 +155,11 @@ In the CLI, use the `--allow` flag:
  yaml-reference compile input.yml --allow /allowed/path1 --allow /allowed/path2
  ```
 
- ### Absolute path restrictions
-
- References using absolute paths (e.g., `/tmp/file.yml`) are explicitly rejected with a `ValueError`. All reference paths must be relative to the source file's directory.
-
- ### Permission errors
+ Whether or not `allow_paths` is specified, the default behavior is to allow references to files in the same directory as the source YAML file (or subdirectories). "Back-navigating" out of a the root directory is not allowed (".." local references in a root YAML file). This provides a secure baseline to prevent unsafe access which is not explicitly allowed.
 
- If a reference attempts to access a file outside the allowed paths, a `PermissionError` is raised. This prevents unauthorized file access through YAML references.
+ ### Absolute path restrictions
 
- Whether or not `allow_paths` is specified, the default behavior is to allow references to files in the same directory as the source YAML file (or subdirectories). "Back-navigating" out of a the root directory is not allowed (".." local references in a root YAML file). This provides a secure baseline to prevent unsafe access which is not explicitly allowed.
+ References using absolute paths (e.g., `/tmp/file.yml`) are explicitly rejected with a `ValueError`. All reference paths must be relative to the source file's directory. If you absolutely must reference an absolute path, relative paths to symlinks can be used. Note that their target directories must be explicitly allowed to avoid permission errors (see the above section about "Path restriction and `allow_paths`").
 
  ## Acknowledgements
 
@@ -160,3 +160,50 @@ def test_allow_paths_load_yaml_with_references(stage_files):
      load_yaml_with_references(
          stg / "inner/with_all.yml", allow_paths=[stg / "some"]
      )
+
+
+ @pytest.mark.parametrize(
+     "test_name, files, entry_point",
+     [
+         (
+             "self_reference",
+             {
+                 "self_ref.yml": "data: !reference { path: ./self_ref.yml }",
+             },
+             "self_ref.yml",
+         ),
+         (
+             "triangle_reference",
+             {
+                 "file1.yml": "name: File 1\nref: !reference { path: ./file2.yml }",
+                 "file2.yml": "name: File 2\nref: !reference { path: ./file3.yml }",
+                 "file3.yml": "name: File 3\nref: !reference { path: ./file1.yml }",
+             },
+             "file1.yml",
+         ),
+         (
+             "reference_all_circular",
+             {
+                 "main.yml": "data: !reference-all { glob: ./refs/*.yml }",
+                 "refs/file1.yml": "name: File 1\nref: !reference { path: ../main.yml }",
+                 "refs/file2.yml": "name: File 2",
+             },
+             "main.yml",
+         ),
+         (
+             "reference_all_self",
+             {
+                 "main.yml": "data: !reference-all { glob: '*.yml' }",
+                 "doc-a.yml": "name: Doc A",
+                 "doc-b.yml": "name: Doc B",
+             },
+             "main.yml",
+         ),
+     ],
+ )
+ def test_circular_reference_detection(stage_files, test_name, files, entry_point):
+     """Test that circular references are detected and disallowed."""
+     stg = stage_files(files)
+
+     with pytest.raises(ValueError, match="Circular reference detected"):
+         load_yaml_with_references(stg / entry_point)
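
Every parametrized case above expects a `ValueError`. One shape they do not exercise is a "diamond", where two sibling branches reference the same file without forming a cycle. The sketch below is a toy model and not the package's code: `resolve` and the in-memory `files` mapping are hypothetical stand-ins for file parsing. It suggests why removing a path from the visited set after its subtree is processed should let a diamond pass while the triangle from the test case is still rejected.

```python
from typing import Dict, List, Optional, Set


def resolve(
    name: str, files: Dict[str, List[str]], visited: Optional[Set[str]] = None
) -> List[str]:
    """Return the names visited while expanding `name`, depth first.

    `files` maps a hypothetical file name to the names it references.
    """
    if visited is None:
        visited = set()
    if name in visited:
        raise ValueError(f"Circular reference detected: {name}")
    visited.add(name)
    order = [name]
    for child in files[name]:
        order.extend(resolve(child, files, visited))
    # Backtrack: drop the name once its subtree is done, so a sibling
    # branch may reference the same file again without being flagged.
    visited.remove(name)
    return order


# Mirrors the "triangle_reference" case: file1 -> file2 -> file3 -> file1.
triangle = {
    "file1.yml": ["file2.yml"],
    "file2.yml": ["file3.yml"],
    "file3.yml": ["file1.yml"],
}
# A diamond: both a.yml and b.yml reference shared.yml, with no cycle.
diamond = {
    "root.yml": ["a.yml", "b.yml"],
    "a.yml": ["shared.yml"],
    "b.yml": ["shared.yml"],
    "shared.yml": [],
}

print(resolve("root.yml", diamond))
try:
    resolve("file1.yml", triangle)
except ValueError as exc:
    print(exc)
```

Without the backtracking step, the second visit to `shared.yml` in the diamond would be reported as circular even though resolution terminates.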
@@ -1,6 +1,6 @@
  import os
  from pathlib import Path
- from typing import Any, Sequence, Union
+ from typing import Any, Optional, Sequence, Union
 
  from ruamel.yaml import YAML
 
@@ -148,11 +148,62 @@ def _recursively_attribute_location_to_references(data: Any, base_path: Path):
      return data
 
 
- def _recursively_resolve_references(data: Any, allow_paths: Sequence[Path]) -> Any:
+ def _check_and_track_path(path: Path, visited_paths: set[Path]) -> None:
+     """
+     Check for circular reference and add path to visited set.
+
+     Args:
+         path: The file path to check and track.
+         visited_paths: Set of visited file paths.
+
+     Raises:
+         ValueError: If a circular reference is detected.
+     """
+     if path in visited_paths:
+         raise ValueError(
+             f"Circular reference detected: {path} has already been visited. "
+             f"Visited path chain: {visited_paths}"
+         )
+     visited_paths.add(path)
+
+
+ def _recursively_resolve_references(
+     data: Any, allow_paths: Sequence[Path], visited_paths: Optional[set[Path]] = None
+ ) -> Any:
+     """
+     Recursively resolve references in YAML data.
+
+     Args:
+         data: The YAML data to resolve references in.
+         allow_paths: List of allowed paths for file access.
+         visited_paths: Set of file paths that have been visited during resolution.
+             Used to detect circular references.
+
+     Returns:
+         The resolved YAML data with all references expanded.
+
+     Raises:
+         ValueError: If a circular reference is detected.
+     """
+     if visited_paths is None:
+         visited_paths = set()
+
      if isinstance(data, Reference):
          abs_path = (Path(data.location).parent / data.path).resolve()
+
+         # Check for circular reference and track path
+         _check_and_track_path(abs_path, visited_paths)
+
          parsed = parse_yaml_with_references(abs_path, allow_paths=allow_paths)
-         return _recursively_resolve_references(parsed, allow_paths=allow_paths)
+         resolved = _recursively_resolve_references(
+             parsed, allow_paths=allow_paths, visited_paths=visited_paths
+         )
+
+         # Remove current path from visited set after processing
+         visited_paths.remove(abs_path)
+
+         return resolved
+
      elif isinstance(data, ReferenceAll):
          glob_results = Path(data.location).parent.glob(data.glob)
          abs_paths = [path.resolve() for path in glob_results]
@@ -161,22 +212,35 @@ def _recursively_resolve_references(data: Any, allow_paths: Sequence[Path]) -> A
                  f'No files found matching glob pattern "{data.glob}" in directory "{Path(data.location).parent}"'
              )
          abs_paths = sorted(abs_paths, key=lambda x: str(x))
-         parsed = [
-             parse_yaml_with_references(path, allow_paths=allow_paths)
-             for path in abs_paths
-         ]
-         return [
-             _recursively_resolve_references(item, allow_paths=allow_paths)
-             for item in parsed
-         ]
+
+         resolved_items = []
+         for path in abs_paths:
+             # Check for circular reference and track path
+             _check_and_track_path(path, visited_paths)
+
+             parsed = parse_yaml_with_references(path, allow_paths=allow_paths)
+             resolved = _recursively_resolve_references(
+                 parsed, allow_paths=allow_paths, visited_paths=visited_paths
+             )
+             resolved_items.append(resolved)
+
+             # Remove current path from visited set after processing
+             visited_paths.remove(path)
+
+         return resolved_items
+
      elif isinstance(data, list):
          return [
-             _recursively_resolve_references(item, allow_paths=allow_paths)
+             _recursively_resolve_references(
+                 item, allow_paths=allow_paths, visited_paths=visited_paths
+             )
              for item in data
          ]
      elif isinstance(data, dict):
          return {
-             key: _recursively_resolve_references(value, allow_paths=allow_paths)
+             key: _recursively_resolve_references(
+                 value, allow_paths=allow_paths, visited_paths=visited_paths
+             )
              for key, value in data.items()
          }
      else:
@@ -211,7 +275,15 @@ def load_yaml_with_references(
      allow_paths += [Path(file_path).parent.absolute()]
      path = _check_file_path(file_path, allow_paths=allow_paths)
      parsed = parse_yaml_with_references(path, allow_paths=allow_paths)
-     return _recursively_resolve_references(parsed, allow_paths=allow_paths) # type: ignore
+
+     # Initialize visited paths with the root file to detect self-references
+     visited_paths = {path.resolve()}
+
+     return _recursively_resolve_references(
+         parsed,
+         allow_paths=allow_paths, # type: ignore
+         visited_paths=visited_paths,
+     )
 
 
  __all__ = ["parse_yaml_with_references", "load_yaml_with_references"]
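
The hunk above appends the root file's parent directory to `allow_paths` before calling `_check_file_path`, whose body is not part of this diff. As a rough illustration of the kind of containment check the README text describes (same directory or subdirectories allowed, `..` escapes rejected), here is a minimal sketch. The helper name `is_within_allowed` and the example paths are hypothetical, and the package's actual check may differ, for example in how it reports a rejected path.

```python
from pathlib import Path
from typing import Sequence


def is_within_allowed(candidate: Path, allow_paths: Sequence[Path]) -> bool:
    """Return True if `candidate` resolves to a location under an allowed directory."""
    # resolve() collapses ".." segments, so back-navigation is judged by
    # where the path actually ends up rather than by how it is spelled.
    resolved = candidate.resolve()
    for base in allow_paths:
        try:
            resolved.relative_to(base.resolve())
            return True
        except ValueError:
            # relative_to raises ValueError when `resolved` is not under `base`.
            continue
    return False


# Hypothetical layout: the root YAML file lives in /project/configs.
root_file = Path("/project/configs/root.yml")
allowed = [root_file.parent]

print(is_within_allowed(root_file.parent / "inner/child.yml", allowed))
print(is_within_allowed(root_file.parent / "../secrets.yml", allowed))
```

Resolving before comparing also means a symlink is judged by its target, which is in line with the README's note that symlink target directories must themselves be allowed.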
File without changes
File without changes
File without changes