atdata 0.2.3b1__tar.gz → 0.3.1b1__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/issues.db +0 -0
- atdata-0.3.1b1/.claude/commands/adr.md +42 -0
- atdata-0.3.1b1/.claude/commands/changelog.md +61 -0
- atdata-0.3.1b1/.claude/commands/feature.md +43 -0
- atdata-0.3.1b1/.claude/commands/release.md +63 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.github/workflows/uv-test.yml +58 -3
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.gitignore +3 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/decisions/04_schema_evolution.md +1 -1
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/decisions/05_lexicon_namespace.md +1 -1
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/decisions/assessment.md +1 -1
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/decisions/record_lexicon_assessment.md +22 -22
- atdata-0.2.3b1/.planning/setup/decisions/sampleSchema_design_questions.md → atdata-0.3.1b1/.planning/phases/01-atproto-foundation/decisions/schema_design_questions.md +3 -3
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/examples/code/ndarray_roundtrip.py +1 -1
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/examples/code/validate_ndarray_shim.py +1 -1
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/examples/dataset_blob_storage.json +1 -1
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/examples/dataset_external_storage.json +1 -1
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/examples/lens_example.json +2 -2
- atdata-0.2.3b1/.planning/setup/examples/sampleSchema_example.json → atdata-0.3.1b1/.planning/phases/01-atproto-foundation/examples/schema_example.json +2 -2
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/README.md +7 -7
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/README_ARRAY_FORMATS.md +5 -5
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/README_SCHEMA_TYPES.md +10 -10
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/ac.foundation.dataset.getLatestSchema.json +1 -1
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/ac.foundation.dataset.lens.json +2 -2
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/ac.foundation.dataset.record.json +1 -1
- atdata-0.2.3b1/.planning/setup/lexicons/ac.foundation.dataset.sampleSchema.json → atdata-0.3.1b1/.planning/phases/01-atproto-foundation/lexicons/ac.foundation.dataset.schema.json +3 -3
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/ac.foundation.dataset.schemaType.json +1 -1
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/ndarray_shim_spec.md +1 -1
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.vscode/settings.json +6 -0
- atdata-0.3.1b1/CHANGELOG.md +153 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/CLAUDE.md +122 -35
- {atdata-0.2.3b1 → atdata-0.3.1b1}/PKG-INFO +5 -2
- {atdata-0.2.3b1 → atdata-0.3.1b1}/README.md +1 -1
- atdata-0.3.1b1/benchmarks/bench_atmosphere.py +220 -0
- atdata-0.3.1b1/benchmarks/bench_dataset_io.py +293 -0
- atdata-0.3.1b1/benchmarks/bench_index_providers.py +215 -0
- atdata-0.3.1b1/benchmarks/bench_query.py +278 -0
- atdata-0.3.1b1/benchmarks/conftest.py +345 -0
- atdata-0.3.1b1/benchmarks/render_report.py +462 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/AbstractDataStore.html +72 -288
- atdata-0.3.1b1/docs/api/AbstractIndex.html +1043 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/AtUri.html +61 -194
- atdata-0.3.1b1/docs/api/AtmosphereClient.html +684 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/AtmosphereIndex.html +68 -203
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/AtmosphereIndexEntry.html +59 -192
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/BlobSource.html +59 -192
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/DataSource.html +69 -271
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/Dataset.html +645 -338
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/DatasetDict.html +60 -193
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/DatasetLoader.html +61 -194
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/DatasetPublisher.html +68 -202
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/DictSample.html +66 -345
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/IndexEntry.html +60 -201
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/Lens.html +59 -192
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/LensLoader.html +61 -194
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/LensPublisher.html +71 -205
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/PDSBlobStore.html +61 -199
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/Packable-protocol.html +59 -250
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/PackableSample.html +65 -278
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/S3Source.html +59 -192
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/SampleBatch.html +71 -212
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/SchemaLoader.html +65 -199
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/SchemaPublisher.html +65 -199
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/URLSource.html +59 -192
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/index.html +65 -198
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/load_dataset.html +60 -193
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/local.Index.html +533 -250
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/local.LocalDatasetEntry.html +60 -193
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/local.S3DataStore.html +101 -193
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/packable.html +62 -252
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/api/promote_to_atmosphere.html +64 -196
- atdata-0.3.1b1/docs/benchmarks/index.html +1331 -0
- atdata-0.3.1b1/docs/examples/index-workflow.html +1186 -0
- atdata-0.3.1b1/docs/examples/index.html +928 -0
- atdata-0.3.1b1/docs/examples/lens-transforms.html +1154 -0
- atdata-0.3.1b1/docs/examples/manifest-queries.html +1148 -0
- atdata-0.3.1b1/docs/examples/multi-split.html +1147 -0
- atdata-0.3.1b1/docs/examples/typed-pipeline.html +1132 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/index.html +194 -425
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/architecture.html +126 -212
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/atmosphere.html +134 -220
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/datasets.html +125 -211
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/deployment.html +112 -198
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/lenses.html +123 -209
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/load-dataset.html +124 -210
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/local-storage.html +123 -209
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/packable-samples.html +125 -211
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/promotion.html +120 -206
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/protocols.html +124 -210
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/troubleshooting.html +113 -199
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/reference/uri-spec.html +114 -200
- atdata-0.3.1b1/docs/robots.txt +1 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/search.json +659 -387
- atdata-0.3.1b1/docs/site_libs/bootstrap/bootstrap-62ce3d63edf8507b4d15f75c6b92352a.min.css +12 -0
- atdata-0.2.3b1/docs/site_libs/quarto-html/quarto-syntax-highlighting-9582434199d49cc9e91654cdeeb4866b.css → atdata-0.3.1b1/docs/site_libs/quarto-html/quarto-syntax-highlighting-b854dd4081d6110d4acfde180236d7b2.css +2 -2
- atdata-0.3.1b1/docs/sitemap.xml +223 -0
- atdata-0.3.1b1/docs/styles.css +50 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/tutorials/atmosphere.html +203 -299
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/tutorials/local-workflow.html +283 -374
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/tutorials/promotion.html +245 -389
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/tutorials/quickstart.html +152 -243
- atdata-0.3.1b1/docs_src/.nojekyll +0 -0
- atdata-0.3.1b1/docs_src/_brand.yml +73 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/_quarto.yml +79 -13
- atdata-0.3.1b1/docs_src/api/AbstractDataStore.qmd +54 -0
- atdata-0.3.1b1/docs_src/api/AbstractIndex.qmd +153 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/AtUri.qmd +2 -2
- atdata-0.3.1b1/docs_src/api/AtmosphereClient.qmd +4 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/AtmosphereIndex.qmd +11 -9
- atdata-0.3.1b1/docs_src/api/DataSource.qmd +54 -0
- atdata-0.3.1b1/docs_src/api/Dataset.qmd +510 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/DatasetDict.qmd +1 -1
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/DatasetLoader.qmd +2 -2
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/DatasetPublisher.qmd +2 -3
- atdata-0.3.1b1/docs_src/api/DictSample.qmd +96 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/IndexEntry.qmd +1 -3
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/LensLoader.qmd +2 -2
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/LensPublisher.qmd +4 -5
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/PDSBlobStore.qmd +3 -3
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/Packable-protocol.qmd +1 -31
- atdata-0.3.1b1/docs_src/api/PackableSample.qmd +59 -0
- atdata-0.3.1b1/docs_src/api/SampleBatch.qmd +31 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/SchemaLoader.qmd +3 -4
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/SchemaPublisher.qmd +3 -4
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/index.qmd +6 -6
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/load_dataset.qmd +1 -1
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/local.Index.qmd +284 -77
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/local.LocalDatasetEntry.qmd +3 -3
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/local.S3DataStore.qmd +23 -7
- atdata-0.3.1b1/docs_src/api/packable.qmd +23 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/promote_to_atmosphere.qmd +8 -3
- atdata-0.3.1b1/docs_src/examples/index-workflow.qmd +191 -0
- atdata-0.3.1b1/docs_src/examples/index.qmd +22 -0
- atdata-0.3.1b1/docs_src/examples/lens-transforms.qmd +198 -0
- atdata-0.3.1b1/docs_src/examples/manifest-queries.qmd +174 -0
- atdata-0.3.1b1/docs_src/examples/multi-split.qmd +174 -0
- atdata-0.3.1b1/docs_src/examples/typed-pipeline.qmd +168 -0
- atdata-0.3.1b1/docs_src/index.qmd +138 -0
- atdata-0.3.1b1/docs_src/objects.json +1 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/troubleshooting.qmd +1 -1
- atdata-0.3.1b1/docs_src/styles.css +50 -0
- atdata-0.3.1b1/docs_src/theme-dark.scss +1 -0
- atdata-0.3.1b1/docs_src/theme-light.scss +15 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/tutorials/atmosphere.qmd +30 -41
- atdata-0.3.1b1/docs_src/tutorials/local-workflow.qmd +270 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/tutorials/promotion.qmd +65 -126
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/tutorials/quickstart.qmd +8 -16
- atdata-0.3.1b1/justfile +51 -0
- atdata-0.3.1b1/lexicons/ac.foundation.dataset.arrayFormat.json +1 -0
- atdata-0.3.1b1/lexicons/ac.foundation.dataset.getLatestSchema.json +1 -0
- atdata-0.3.1b1/lexicons/ac.foundation.dataset.lens.json +1 -0
- atdata-0.3.1b1/lexicons/ac.foundation.dataset.record.json +1 -0
- atdata-0.3.1b1/lexicons/ac.foundation.dataset.schema.json +1 -0
- atdata-0.3.1b1/lexicons/ac.foundation.dataset.schemaType.json +1 -0
- atdata-0.3.1b1/lexicons/ac.foundation.dataset.storageBlobs.json +1 -0
- atdata-0.3.1b1/lexicons/ac.foundation.dataset.storageExternal.json +1 -0
- atdata-0.3.1b1/lexicons/ndarray_shim.json +1 -0
- atdata-0.3.1b1/prototyping/human-review-atmosphere.ipynb +66 -0
- atdata-0.3.1b1/prototyping/human-review-local.ipynb +674 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/pyproject.toml +15 -1
- atdata-0.3.1b1/src/atdata/.gitignore +1 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/__init__.py +39 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/_cid.py +0 -21
- atdata-0.3.1b1/src/atdata/_exceptions.py +168 -0
- atdata-0.3.1b1/src/atdata/_helpers.py +86 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/_hf_api.py +95 -11
- atdata-0.3.1b1/src/atdata/_logging.py +70 -0
- atdata-0.3.1b1/src/atdata/_protocols.py +343 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/_schema_codec.py +7 -6
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/_stub_manager.py +5 -25
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/_type_utils.py +28 -2
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/atmosphere/__init__.py +31 -20
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/atmosphere/_types.py +4 -4
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/atmosphere/client.py +64 -12
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/atmosphere/lens.py +11 -12
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/atmosphere/records.py +12 -12
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/atmosphere/schema.py +16 -18
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/atmosphere/store.py +6 -7
- atdata-0.3.1b1/src/atdata/cli/__init__.py +208 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/cli/diagnose.py +2 -2
- atdata-0.2.3b1/src/atdata/cli/local.py → atdata-0.3.1b1/src/atdata/cli/infra.py +11 -11
- atdata-0.3.1b1/src/atdata/cli/inspect.py +69 -0
- atdata-0.3.1b1/src/atdata/cli/preview.py +63 -0
- atdata-0.3.1b1/src/atdata/cli/schema.py +109 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/dataset.py +583 -328
- atdata-0.3.1b1/src/atdata/index/__init__.py +54 -0
- atdata-0.3.1b1/src/atdata/index/_entry.py +157 -0
- atdata-0.3.1b1/src/atdata/index/_index.py +1198 -0
- atdata-0.3.1b1/src/atdata/index/_schema.py +380 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/lens.py +9 -2
- atdata-0.3.1b1/src/atdata/lexicons/__init__.py +121 -0
- atdata-0.3.1b1/src/atdata/lexicons/ac.foundation.dataset.arrayFormat.json +16 -0
- atdata-0.3.1b1/src/atdata/lexicons/ac.foundation.dataset.getLatestSchema.json +78 -0
- atdata-0.3.1b1/src/atdata/lexicons/ac.foundation.dataset.lens.json +99 -0
- atdata-0.3.1b1/src/atdata/lexicons/ac.foundation.dataset.record.json +96 -0
- atdata-0.3.1b1/src/atdata/lexicons/ac.foundation.dataset.schema.json +107 -0
- atdata-0.3.1b1/src/atdata/lexicons/ac.foundation.dataset.schemaType.json +16 -0
- atdata-0.3.1b1/src/atdata/lexicons/ac.foundation.dataset.storageBlobs.json +24 -0
- atdata-0.3.1b1/src/atdata/lexicons/ac.foundation.dataset.storageExternal.json +25 -0
- atdata-0.3.1b1/src/atdata/lexicons/ndarray_shim.json +16 -0
- atdata-0.3.1b1/src/atdata/local/__init__.py +70 -0
- atdata-0.3.1b1/src/atdata/local/_repo_legacy.py +218 -0
- atdata-0.3.1b1/src/atdata/manifest/__init__.py +28 -0
- atdata-0.3.1b1/src/atdata/manifest/_aggregates.py +156 -0
- atdata-0.3.1b1/src/atdata/manifest/_builder.py +163 -0
- atdata-0.3.1b1/src/atdata/manifest/_fields.py +154 -0
- atdata-0.3.1b1/src/atdata/manifest/_manifest.py +146 -0
- atdata-0.3.1b1/src/atdata/manifest/_query.py +150 -0
- atdata-0.3.1b1/src/atdata/manifest/_writer.py +74 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/promote.py +18 -14
- atdata-0.3.1b1/src/atdata/providers/__init__.py +25 -0
- atdata-0.3.1b1/src/atdata/providers/_base.py +140 -0
- atdata-0.3.1b1/src/atdata/providers/_factory.py +69 -0
- atdata-0.3.1b1/src/atdata/providers/_postgres.py +214 -0
- atdata-0.3.1b1/src/atdata/providers/_redis.py +171 -0
- atdata-0.3.1b1/src/atdata/providers/_sqlite.py +191 -0
- atdata-0.3.1b1/src/atdata/repository.py +323 -0
- atdata-0.3.1b1/src/atdata/stores/__init__.py +23 -0
- atdata-0.3.1b1/src/atdata/stores/_disk.py +123 -0
- atdata-0.3.1b1/src/atdata/stores/_s3.py +349 -0
- atdata-0.3.1b1/src/atdata/testing.py +341 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/EXPECTED_WARNINGS.md +2 -2
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_atmosphere.py +42 -46
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_cid.py +0 -44
- atdata-0.3.1b1/tests/test_cli.py +790 -0
- atdata-0.3.1b1/tests/test_coverage_gaps.py +306 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_dataset.py +44 -10
- atdata-0.3.1b1/tests/test_dev_experience.py +423 -0
- atdata-0.3.1b1/tests/test_disk_store.py +123 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_helpers.py +49 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_hf_api.py +48 -8
- atdata-0.3.1b1/tests/test_index_providers.py +477 -0
- atdata-0.3.1b1/tests/test_index_write.py +254 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration.py +1 -1
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration_atmosphere.py +25 -26
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration_atmosphere_live.py +16 -42
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration_cross_backend.py +28 -27
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration_dynamic_types.py +2 -2
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration_edge_cases.py +3 -3
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration_error_handling.py +32 -40
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration_lens.py +1 -1
- atdata-0.3.1b1/tests/test_integration_manifest.py +263 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration_promotion.py +28 -30
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_lens.py +1 -1
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_local.py +52 -200
- atdata-0.3.1b1/tests/test_logging.py +60 -0
- atdata-0.3.1b1/tests/test_manifest.py +528 -0
- atdata-0.3.1b1/tests/test_partial_failure.py +152 -0
- atdata-0.3.1b1/tests/test_postgres_provider.py +411 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_promote.py +4 -4
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_protocols.py +10 -7
- atdata-0.3.1b1/tests/test_query_coverage.py +215 -0
- atdata-0.3.1b1/tests/test_repository.py +379 -0
- atdata-0.3.1b1/tests/test_repository_coverage.py +265 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_sources.py +5 -0
- atdata-0.3.1b1/tests/test_stub_manager.py +533 -0
- atdata-0.3.1b1/tests/test_testing.py +246 -0
- atdata-0.3.1b1/tests/test_type_utils.py +182 -0
- atdata-0.3.1b1/tests/test_write_samples.py +173 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/uv.lock +114 -2
- atdata-0.2.3b1/CHANGELOG.md +0 -195
- atdata-0.2.3b1/docs/api/AbstractIndex.html +0 -1356
- atdata-0.2.3b1/docs/api/AtmosphereClient.html +0 -1891
- atdata-0.2.3b1/docs/robots.txt +0 -1
- atdata-0.2.3b1/docs/site_libs/bootstrap/bootstrap-62bce24ca844314e7bb1a34dbdfe05cc.min.css +0 -12
- atdata-0.2.3b1/docs/site_libs/bootstrap/bootstrap-dark-7964ffd8887b0991fe8d71c6c8bc75d6.min.css +0 -12
- atdata-0.2.3b1/docs/site_libs/quarto-html/quarto-syntax-highlighting-dark-8dcd8563ea6803ab7cbb3d71ca5772e1.css +0 -210
- atdata-0.2.3b1/docs/sitemap.xml +0 -199
- atdata-0.2.3b1/docs_src/api/AbstractDataStore.qmd +0 -94
- atdata-0.2.3b1/docs_src/api/AbstractIndex.qmd +0 -236
- atdata-0.2.3b1/docs_src/api/AtmosphereClient.qmd +0 -422
- atdata-0.2.3b1/docs_src/api/DataSource.qmd +0 -95
- atdata-0.2.3b1/docs_src/api/Dataset.qmd +0 -241
- atdata-0.2.3b1/docs_src/api/DictSample.qmd +0 -151
- atdata-0.2.3b1/docs_src/api/PackableSample.qmd +0 -83
- atdata-0.2.3b1/docs_src/api/SampleBatch.qmd +0 -42
- atdata-0.2.3b1/docs_src/api/packable.qmd +0 -45
- atdata-0.2.3b1/docs_src/index.qmd +0 -247
- atdata-0.2.3b1/docs_src/objects.json +0 -1
- atdata-0.2.3b1/docs_src/tutorials/local-workflow.qmd +0 -269
- atdata-0.2.3b1/justfile +0 -2
- atdata-0.2.3b1/prototyping/human-review-atmosphere.ipynb +0 -25
- atdata-0.2.3b1/prototyping/human-review-local.ipynb +0 -634
- atdata-0.2.3b1/src/atdata/_helpers.py +0 -60
- atdata-0.2.3b1/src/atdata/_protocols.py +0 -504
- atdata-0.2.3b1/src/atdata/cli/__init__.py +0 -222
- atdata-0.2.3b1/src/atdata/local.py +0 -1720
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/c.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/cpp.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/csharp.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/global.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/go.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/java.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/javascript-react.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/javascript.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/kotlin.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/odin.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/php.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/project.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/python.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/ruby.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/rust.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/scala.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/swift.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/typescript-react.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/typescript.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.chainlink/rules/zig.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.claude/hooks/post-edit-check.py +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.claude/hooks/prompt-guard.py +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.claude/hooks/session-start.py +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.claude/settings.json +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.github/workflows/uv-publish-pypi.yml +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/01_overview.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/02_lexicon_design.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/03_python_client.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/04_appview.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/05_codegen.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/README.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/atproto_integration.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/decisions/01_schema_representation_format.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/decisions/02_lens_code_storage.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/decisions/03_webdataset_storage.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/decisions/06_lexicon_validation.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/decisions/README.md +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/ac.foundation.dataset.arrayFormat.json +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/ac.foundation.dataset.storageBlobs.json +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/ac.foundation.dataset.storageExternal.json +0 -0
- {atdata-0.2.3b1/.planning/setup → atdata-0.3.1b1/.planning/phases/01-atproto-foundation}/lexicons/ndarray_shim.json +0 -0
- {atdata-0.2.3b1/.planning/roadmap/v0.2 → atdata-0.3.1b1/.planning/phases/02-v0.2-review}/03_human-review-assessment.md +0 -0
- {atdata-0.2.3b1/.planning/roadmap/v0.3 → atdata-0.3.1b1/.planning/phases/03-v0.3-roadmap}/01_codebase-review.md +0 -0
- {atdata-0.2.3b1/.planning/roadmap/v0.3 → atdata-0.3.1b1/.planning/phases/03-v0.3-roadmap}/02_synthesis-roadmap.md +0 -0
- {atdata-0.2.3b1/.planning/roadmap/v0.3 → atdata-0.3.1b1/.planning/phases/03-v0.3-roadmap}/architecture-doc.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.python-version +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.reference/atproto_lexicon_guide.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.reference/atproto_lexicon_spec.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.reference/huggingface-datasets/architecture.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.reference/huggingface-datasets/loading-guide.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.reference/huggingface-datasets/loading-methods.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.reference/huggingface-datasets/main-classes.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.reference/python_atproto_sdk.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.review/comprehensive-review.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/.review/human-review.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/LICENSE +0 -0
- /atdata-0.2.3b1/docs/.nojekyll → /atdata-0.3.1b1/benchmarks/__init__.py +0 -0
- {atdata-0.2.3b1/docs_src → atdata-0.3.1b1/docs}/.nojekyll +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/assets/styles.css +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/bootstrap/bootstrap-icons.css +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/bootstrap/bootstrap-icons.woff +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/bootstrap/bootstrap.min.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/clipboard/clipboard.min.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-html/anchor.min.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-html/popper.min.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-html/quarto.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-html/tabsets/tabsets.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-html/tippy.css +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-html/tippy.umd.min.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-nav/headroom.min.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-nav/quarto-nav.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-search/autocomplete.umd.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-search/fuse.min.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs/site_libs/quarto-search/quarto-search.js +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.backup/api-index-handwritten.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.backup/atmosphere.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.backup/datasets.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.backup/index.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.backup/lenses.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.backup/load-dataset.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.backup/local-storage.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.backup/packable-samples.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.backup/promotion.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.backup/protocols.md +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/.gitignore +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/AtmosphereIndexEntry.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/BlobSource.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/Lens.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/S3Source.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/api/URLSource.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/assets/styles.css +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/architecture.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/atmosphere.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/datasets.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/deployment.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/lenses.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/load-dataset.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/local-storage.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/packable-samples.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/promotion.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/protocols.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/docs_src/reference/uri-spec.qmd +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/examples/atmosphere_demo.py +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/examples/local_workflow.py +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/examples/promote_workflow.py +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/issues.db +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/prototyping/.credentials/.gitignore +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/prototyping/data/.gitignore +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/src/atdata/_sources.py +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/conftest.py +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/fixtures/test_samples.tar +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration_e2e.py +0 -0
- {atdata-0.2.3b1 → atdata-0.3.1b1}/tests/test_integration_local.py +0 -0
Binary file
@@ -0,0 +1,42 @@
+---
+allowed-tools: Bash(git status:*), Bash(git log:*), Bash(chainlink tree:*), Bash(chainlink comment:*), Bash(chainlink subissue:*), Bash(chainlink create:*), Bash(chainlink session:*), Bash(chainlink --help), Bash(chainlink close:*), Bash(uv run pytest:*), Bash(uv run ruff:*)
+description: Perform an adversarial review
+---
+
+## Context
+
+- Current issue tree: !`chainlink tree`
+- Current test outputs: !`uv run pytest -v`
+- Recent commits: !`git log --oneline -10`
+- Chainlink help: !`chainlink --help`
+
+## Your task
+
+1. Develop summary assessment of test suite
+- Look through all of the unit tests currently in the project, and create a plan of how well these tests are implemented to test the functionality at the core of the project, how well these tests actually fully cover desired behavior and edge cases, whether the tests are formally correct, and whether there is any redundancy in the tests or documentation for them
+- Develop a plan for how to address these concerns point by point
+2. Develop summary assessment of codebase
+- Look through all of the source files currently in the project's main modules, and create a plan of how well-implemented, efficient, and generalizable the current implementation is, as well as whether there is adequate, too sparse, or too verbose documentation
+- Develop a plan for improvements, tweaks, or refactors that could be applied to the current codebase and its documentation
+3. Create issue and subissues
+- Create a base issue in chainlink for this adversarial review
+- Create subissues for each of the plan items addressed in steps 1 and 2.
+4. Address all subissues for this adversarial review
+- Ordered by priority, address and close each of the subissues identified
+- Provide thorough documentation of each step you take in the chainlink comments
+
+## Constraints
+
+- **Adversarial**: You are engaging in this task from the perspective of a reviewer that is hyper-critical.
+- **Optimize code contraction**: You are operating as one half of a cyclical dyad, in which the other half is responsible for generating a lot of code, but has a propensity to write too much, and write implementations that are verbose, inefficient, or inaccurate at times. Your job is to be the critical eye, and to identify and implement revisions that make the code concise, efficient, and formally correct.
+- **Consider test correctness**: The tests you are presented with are not necessarily complete for covering the desired functionality. Think through ways in which you could make the test suite more accurate to the task at hand, and also of ways in which you could test the codebase's functionality that are not currently addressed. Be creative and leverage web search in this endeavor to see current best practices for the problem that could aid developing tests.
+- **Preserve documentation for API generation**: This project uses quartodoc to auto-generate API documentation from docstrings. Docstrings are a feature, not bloat. When reviewing documentation verbosity, apply these rules:
+- **KEEP**: Module-level docstrings, class-level docstrings, `Args:`, `Returns:`, `Raises:`, `Examples:` sections on all public APIs
+- **KEEP**: Docstrings that explain *why* something works a certain way, non-obvious behavior, or protocol/interface contracts
+- **KEEP**: `Examples:` sections — these render as live code samples in the docs site
+- **TRIM**: Docstrings that *only* restate the function signature with no added value (e.g. "`name: The name`" when the type hint already says `name: str`)
+- **TRIM**: Multi-paragraph explanations on private/internal helpers where a one-liner suffices
+- **NEVER REMOVE**: Docstrings from public API methods, protocol definitions, or decorated classes
+- When in doubt, leave the docstring. A slightly verbose docstring that helps a user is better than a missing one that forces them to read source.
+- **Batch mechanical fixes**: Group similar changes (e.g. all weak assertion fixes) into a single commit rather than one subissue per file. Reserve individual subissues for changes that require design thought.
+- **Close low-value issues**: If a finding would add complexity, risk regressions, or save fewer than 10 lines, close it as "not worth the churn" with a comment explaining why.
|
|
@@ -0,0 +1,61 @@
|
|
|
1
|
+
---
|
|
2
|
+
allowed-tools: Bash(git log:*), Bash(git tag:*), Bash(git diff:*), Bash(chainlink *)
|
|
3
|
+
description: Generate a clean CHANGELOG entry from recent work
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
## Context
|
|
7
|
+
|
|
8
|
+
- Current version: !`grep '^version' pyproject.toml`
|
|
9
|
+
- Recent tags: !`git tag --sort=-creatordate | head -5`
|
|
10
|
+
- CHANGELOG head: !`head -20 CHANGELOG.md`
|
|
11
|
+
- Recent chainlink issues: !`chainlink list`
|
|
12
|
+
|
|
13
|
+
## Your task
|
|
14
|
+
|
|
15
|
+
Generate a properly structured CHANGELOG entry for the current release, following [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) format.
|
|
16
|
+
|
|
17
|
+
### 1. Gather changes
|
|
18
|
+
|
|
19
|
+
Identify all changes since the last release by examining:
|
|
20
|
+
- `git log --oneline <last-release-tag-or-branch>..HEAD` for commit messages
|
|
21
|
+
- `chainlink list` for closed issues and their descriptions
|
|
22
|
+
- `git diff --stat <last-release-tag-or-branch>..HEAD` for files changed
|
|
23
|
+
|
|
24
|
+
### 2. Categorize changes
|
|
25
|
+
|
|
26
|
+
Sort changes into Keep a Changelog sections:
|
|
27
|
+
|
|
28
|
+
- **Added**: New features, new files, new public APIs, new test suites
|
|
29
|
+
- **Changed**: Modifications to existing behavior, refactors, dependency updates, CI changes
|
|
30
|
+
- **Fixed**: Bug fixes, lint fixes, CI fixes
|
|
31
|
+
- **Deprecated**: Newly deprecated APIs (with migration path)
|
|
32
|
+
- **Removed**: Removed features, deleted files, removed APIs
|
|
33
|
+
|
|
34
|
+
### 3. Write the entry
|
|
35
|
+
|
|
36
|
+
Follow these formatting rules:
|
|
37
|
+
- Each item should be a concise, user-facing description — not a chainlink issue title
|
|
38
|
+
- Group related changes under bold sub-headers (e.g. **`LocalDiskStore`**: description)
|
|
39
|
+
- Use nested bullets for sub-items that belong to a feature group
|
|
40
|
+
- Omit internal-only changes (individual subissue closes, review assessments, investigation tickets)
|
|
41
|
+
- Include GitHub issue references where relevant (e.g. `(GH#42)`)
|
|
42
|
+
- Do NOT include chainlink issue numbers — these are internal tracking
|
|
43
|
+
|
|
44
|
+
### 4. Update CHANGELOG.md
|
|
45
|
+
|
|
46
|
+
- Insert the new version section between `## [Unreleased]` and the previous release
|
|
47
|
+
- Leave `## [Unreleased]` empty at the top
|
|
48
|
+
- Do not modify any existing release sections below
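
The insertion rule above can be sketched in a few lines of Python. This is only an illustration, not part of the command file: the function name `insert_release` and its parameters are hypothetical, and it assumes that anything currently sitting under `## [Unreleased]` is superseded by the curated `body` and can be dropped.

```python
def insert_release(changelog: str, version: str, date: str, body: str) -> str:
    """Insert a release section directly below the [Unreleased] heading.

    Hypothetical helper: assumes lines under [Unreleased] are replaced by `body`.
    """
    lines = changelog.splitlines()
    try:
        start = next(
            i for i, line in enumerate(lines) if line.strip() == "## [Unreleased]"
        )
    except StopIteration:
        raise ValueError("no '## [Unreleased]' heading found") from None

    # The next level-2 heading, if any, starts the previous release section.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )

    section = ["", f"## [{version}] - {date}", "", *body.strip().splitlines(), ""]

    # Keep everything up to [Unreleased], add the new section, then append the
    # older release sections unchanged.
    return "\n".join(lines[: start + 1] + section + lines[end:]) + "\n"
```

Because the older sections are appended as-is, the "do not modify existing release sections" rule holds by construction.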
|
|
49
|
+
|
|
50
|
+
### 5. Verify
|
|
51
|
+
|
|
52
|
+
- Confirm the CHANGELOG renders as valid markdown
|
|
53
|
+
- Confirm no chainlink auto-appended entries leaked into existing release sections
|
|
54
|
+
|
|
55
|
+
## Constraints
|
|
56
|
+
|
|
57
|
+
- Follow Keep a Changelog format strictly
|
|
58
|
+
- Write for the library's users, not for internal tracking
|
|
59
|
+
- Consolidate — 5 well-written bullets are better than 30 issue titles
|
|
60
|
+
- Preserve existing release sections exactly as they are
|
|
61
|
+
- If chainlink has appended noise to existing sections, clean it up
|
|
@@ -0,0 +1,43 @@
|
|
|
1
|
+
---
|
|
2
|
+
allowed-tools: Bash(git *), Bash(chainlink *)
|
|
3
|
+
description: Create a feature branch from a human-readable description
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
## Context
|
|
7
|
+
|
|
8
|
+
- Current branch: !`git branch --show-current`
|
|
9
|
+
- Recent release branches: !`git branch --list 'release/*' | tail -5`
|
|
10
|
+
- Existing feature branches: !`git branch --list 'feature/*' | tail -10`
|
|
11
|
+
- Working tree status: !`git status --short`
|
|
12
|
+
- Remotes: !`git remote -v`
|
|
13
|
+
|
|
14
|
+
## Your task
|
|
15
|
+
|
|
16
|
+
The user will provide a human-readable description of the feature (e.g. "add batch retry logic"). Create a feature branch following the project's naming convention.
|
|
17
|
+
|
|
18
|
+
### 1. Derive the branch name
|
|
19
|
+
|
|
20
|
+
- Slugify the description: lowercase, strip non-alphanumeric characters (except hyphens), replace spaces with hyphens, collapse consecutive hyphens.
|
|
21
|
+
- The branch name is `feature/<slug>` (e.g. `feature/add-batch-retry-logic`).
|
|
22
|
+
- If the slug is empty or the branch already exists, ask the user for a different name.
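
As a rough illustration of the slug rules above, here is a minimal Python sketch. The names `slugify` and `branch_name` and the `max_words` parameter are hypothetical, not part of the command. It also assumes ASCII-only slugs and caps length by plain word count rather than by "meaningful" words.

```python
import re


def slugify(description: str, max_words: int = 8) -> str:
    """Turn a free-text description into a branch-name slug (illustrative only)."""
    text = description.lower()
    # Drop everything except ASCII letters, digits, whitespace, and hyphens.
    text = re.sub(r"[^a-z0-9\s-]", "", text)
    # Keep only the first few words so the slug stays short.
    words = text.split()[:max_words]
    slug = "-".join(words)
    # Collapse consecutive hyphens and trim any left at the ends.
    return re.sub(r"-{2,}", "-", slug).strip("-")


def branch_name(description: str) -> str:
    """Build `feature/<slug>`, refusing descriptions that slugify to nothing."""
    slug = slugify(description)
    if not slug:
        raise ValueError("description produced an empty slug; ask for another name")
    return f"feature/{slug}"
```

Checking whether the branch already exists is left to `git`, since that depends on the repository state.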
|
|
23
|
+
|
|
24
|
+
### 2. Validate preconditions
|
|
25
|
+
|
|
26
|
+
- Confirm there are no uncommitted changes (other than `.chainlink/issues.db`). If there are, warn the user and ask whether to stash or abort.
|
|
27
|
+
- Identify the base branch. Default to the current branch. If the user provides a `--from <ref>` argument, use that instead.
|
|
28
|
+
|
|
29
|
+
### 3. Create the branch
|
|
30
|
+
|
|
31
|
+
- `git checkout -b feature/<slug>` (from the resolved base)
|
|
32
|
+
- Print the created branch name so the user can confirm.
|
|
33
|
+
|
|
34
|
+
### 4. Track in chainlink
|
|
35
|
+
|
|
36
|
+
- Create a chainlink issue for the feature work with the user's original description as the title.
|
|
37
|
+
- Set priority to `medium` (unless the user specifies otherwise).
|
|
38
|
+
|
|
39
|
+
## Constraints
|
|
40
|
+
|
|
41
|
+
- Never force-push or delete branches.
|
|
42
|
+
- Do not push the branch to a remote — the user will do that when ready.
|
|
43
|
+
- Keep the slug concise. If the description is very long, truncate to the first 6-8 meaningful words.
|
|
@@ -0,0 +1,63 @@
|
|
|
1
|
+
---
|
|
2
|
+
allowed-tools: Bash(git *), Bash(gh *), Bash(uv lock*), Bash(uv run ruff*), Bash(uv run pytest*), Bash(chainlink *), Bash(uv run ruff format*)
|
|
3
|
+
description: Prepare and submit a beta release
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
## Context
|
|
7
|
+
|
|
8
|
+
- Current branch: !`git branch --show-current`
|
|
9
|
+
- Recent commits: !`git log --oneline -15`
|
|
10
|
+
- All branches: !`git branch --list 'release/*' | tail -5`
|
|
11
|
+
- Current version: !`grep '^version' pyproject.toml`
|
|
12
|
+
- Remotes: !`git remote -v`
|
|
13
|
+
|
|
14
|
+
## Your task
|
|
15
|
+
|
|
16
|
+
The user will provide a version string (e.g. `v0.3.0b2`). Perform the full release flow:
|
|
17
|
+
|
|
18
|
+
### 1. Validate preconditions
|
|
19
|
+
- Confirm all tests pass: `uv run pytest tests/ -x -q`
|
|
20
|
+
- Confirm lint is clean: `uv run ruff check src/ tests/`
|
|
21
|
+
- Confirm formatting is clean: `uv run ruff format --check src/ tests/` (fix with `uv run ruff format src/ tests/` if needed)
|
|
22
|
+
- Confirm no uncommitted changes (other than `.chainlink/issues.db`)
|
|
23
|
+
- Identify the previous release branch to branch from (e.g. `release/v0.3.0b1`)
|
|
24
|
+
- Identify the feature branch to merge (current branch or ask user)
|
|
25
|
+
|
|
26
|
+
### 2. Create release branch
|
|
27
|
+
- Stash any uncommitted changes
|
|
28
|
+
- `git checkout <previous-release-branch>`
|
|
29
|
+
- `git checkout -b release/<version>`
|
|
30
|
+
- `git merge <feature-branch> --no-ff --no-edit`
|
|
31
|
+
- `git stash pop` (if anything was stashed)
|
|
32
|
+
|
|
33
|
+
### 3. Prepare release
|
|
34
|
+
- Bump version in `pyproject.toml`
|
|
35
|
+
- Run `uv lock` to update the lockfile
|
|
36
|
+
- Run `/changelog` skill to generate a clean CHANGELOG entry (or generate one manually following Keep a Changelog format with Added/Changed/Fixed sections)
|
|
37
|
+
- Run `uv run ruff check src/ tests/` and fix any lint errors
|
|
38
|
+
- Run `uv run ruff format --check src/ tests/` and fix any format errors (run `uv run ruff format src/ tests/` to auto-fix)
|
|
39
|
+
- Run `uv run pytest tests/ -x -q` to confirm tests pass
|
|
40
|
+
|
|
41
|
+
### 4. Commit and push
|
|
42
|
+
- `git add pyproject.toml uv.lock CHANGELOG.md .chainlink/issues.db`
|
|
43
|
+
- `git commit -m "release: prepare <version>"`
|
|
44
|
+
- `git push -u origin release/<version>`
|
|
45
|
+
|
|
46
|
+
### 5. Create PR
|
|
47
|
+
- Create PR to `upstream/main` using `gh pr create`:
|
|
48
|
+
- `--repo forecast-bio/atdata`
|
|
49
|
+
- `--base main`
|
|
50
|
+
- `--head release/<version>`
|
|
51
|
+
- Title: `release: <version>`
|
|
52
|
+
- Body: summary of changes from CHANGELOG, test plan with pass counts
|
|
53
|
+
|
|
54
|
+
### 6. Track in chainlink
|
|
55
|
+
- Create a chainlink issue for the release, close when PR is submitted
|
|
56
|
+
|
|
57
|
+
## Constraints
|
|
58
|
+
|
|
59
|
+
- Always use `--no-ff` for merges to preserve branch topology
|
|
60
|
+
- Always run `uv lock` after version bumps — stale lockfiles break CI
|
|
61
|
+
- Always run both `ruff check` and `ruff format --check` before committing — either will fail CI
|
|
62
|
+
- Never force-push to release branches
|
|
63
|
+
- The CHANGELOG should follow Keep a Changelog format with proper Added/Changed/Fixed sections, not a flat list of chainlink issues
|
|
@@ -4,13 +4,11 @@ on:
|
|
|
4
4
|
push:
|
|
5
5
|
branches:
|
|
6
6
|
- main
|
|
7
|
-
- release/*
|
|
8
7
|
pull_request:
|
|
9
|
-
branches:
|
|
10
|
-
- main
|
|
11
8
|
|
|
12
9
|
permissions:
|
|
13
10
|
contents: read
|
|
11
|
+
actions: read
|
|
14
12
|
|
|
15
13
|
concurrency:
|
|
16
14
|
group: ${{ github.workflow }}-${{ github.ref }}
|
|
@@ -77,3 +75,60 @@ jobs:
|
|
|
77
75
|
with:
|
|
78
76
|
fail_ci_if_error: false
|
|
79
77
|
token: ${{ secrets.CODECOV_TOKEN }}
|
|
78
|
+
|
|
79
|
+
benchmark:
|
|
80
|
+
name: Benchmarks
|
|
81
|
+
runs-on: ubuntu-latest
|
|
82
|
+
needs: [lint]
|
|
83
|
+
permissions:
|
|
84
|
+
contents: write
|
|
85
|
+
actions: write
|
|
86
|
+
steps:
|
|
87
|
+
- uses: actions/checkout@v5
|
|
88
|
+
|
|
89
|
+
- name: Set up Python
|
|
90
|
+
uses: actions/setup-python@v5
|
|
91
|
+
with:
|
|
92
|
+
python-version: "3.14"
|
|
93
|
+
|
|
94
|
+
- name: Install uv
|
|
95
|
+
uses: astral-sh/setup-uv@v6
|
|
96
|
+
with:
|
|
97
|
+
enable-cache: true
|
|
98
|
+
|
|
99
|
+
- name: Install just
|
|
100
|
+
uses: extractions/setup-just@v2
|
|
101
|
+
|
|
102
|
+
- name: Install the project
|
|
103
|
+
run: uv sync --locked --all-extras --dev
|
|
104
|
+
|
|
105
|
+
- name: Start Redis
|
|
106
|
+
uses: supercharge/redis-github-action@1.8.1
|
|
107
|
+
with:
|
|
108
|
+
redis-version: 7
|
|
109
|
+
|
|
110
|
+
- name: Run benchmarks
|
|
111
|
+
run: just bench
|
|
112
|
+
|
|
113
|
+
- name: Copy report to docs
|
|
114
|
+
run: |
|
|
115
|
+
mkdir -p docs/benchmarks
|
|
116
|
+
cp .bench/report.html docs/benchmarks/index.html
|
|
117
|
+
|
|
118
|
+
- name: Commit updated benchmark docs
|
|
119
|
+
if: github.event_name == 'push'
|
|
120
|
+
run: |
|
|
121
|
+
git config user.name "github-actions[bot]"
|
|
122
|
+
git config user.email "github-actions[bot]@users.noreply.github.com"
|
|
123
|
+
git add docs/benchmarks/index.html
|
|
124
|
+
git diff --cached --quiet || git commit -m "docs: update benchmark report [skip ci]"
|
|
125
|
+
git push
|
|
126
|
+
|
|
127
|
+
- name: Upload benchmark report
|
|
128
|
+
uses: actions/upload-artifact@v4
|
|
129
|
+
if: always()
|
|
130
|
+
with:
|
|
131
|
+
name: benchmark-report
|
|
132
|
+
path: |
|
|
133
|
+
.bench/report.html
|
|
134
|
+
.bench/*.json
|
|
@@ -9,7 +9,7 @@
|
|
|
9
9
|
|
|
10
10
|
For this, let's take the following approach:
|
|
11
11
|
|
|
12
|
-
1. Let's make the `rkey` for the `ac.foundation.dataset.
|
|
12
|
+
1. Let's make the `rkey` for the `ac.foundation.dataset.schema` records be of type `any`.
|
|
13
13
|
2. Then, we can have our own standard for the `rkey` being of the format `{NSID}@{semver}`, where `{NSID}` gives an NSID for the permanent identifier of this sample schema type.
|
|
14
14
|
* This allows us to bookkeep on the version updates
|
|
15
15
|
* We can make a `ac.foundation.dataset.getLatestSchema` `query` Lexicon that will provide the record for the latest version of a given schema, as well
|
|
@@ -16,7 +16,7 @@ The finalized design decisions prioritize **flexibility and future-proofing** ov
|
|
|
16
16
|
2. **Lens Code (#46)**: External repos (GitHub + tangled.org), language metadata, future attestation
|
|
17
17
|
3. **Storage (#47)**: Hybrid (URLs + blobs) from start, AppView proxy for blobs
|
|
18
18
|
4. **Evolution (#48)**: rkey as {NSID}@{semver}, getLatestSchema query, optional migration Lenses
|
|
19
|
-
5. **Namespace (#49)**: `ac.foundation.dataset.*` (
|
|
19
|
+
5. **Namespace (#49)**: `ac.foundation.dataset.*` (schema, record, lens)
|
|
20
20
|
|
|
21
21
|
---
|
|
22
22
|
|
|
@@ -17,7 +17,7 @@ Comprehensive assessment of `ac.foundation.dataset.record` Lexicon design agains
|
|
|
17
17
|
The record Lexicon provides a solid foundation for dataset indexing with hybrid storage support. Key strengths include clean union-based storage design and appropriate use of ATProto primitives. However, several issues need addressing:
|
|
18
18
|
|
|
19
19
|
- ⚠️ **Critical**: schemaRef should use format validation
|
|
20
|
-
- ⚠️ **High**: Metadata structure inconsistency with
|
|
20
|
+
- ⚠️ **High**: Metadata structure inconsistency with schema pattern
|
|
21
21
|
- ⚠️ **Medium**: Missing $type discriminators in union variants
|
|
22
22
|
- ✅ **Strength**: Clean storage union design
|
|
23
23
|
- ✅ **Strength**: Appropriate use of tid keys for datasets
|
|
@@ -40,8 +40,8 @@ The record Lexicon provides a solid foundation for dataset indexing with hybrid
|
|
|
40
40
|
- Appropriate for records without natural semantic keys
|
|
41
41
|
- Consistent with ATProto patterns for user-generated content
|
|
42
42
|
|
|
43
|
-
**Comparison to
|
|
44
|
-
-
|
|
43
|
+
**Comparison to schema:**
|
|
44
|
+
- schema uses `"key": "any"` for versioned rkeys like `{NSID}@{semver}`
|
|
45
45
|
- record uses `"key": "tid"` for chronological dataset entries
|
|
46
46
|
- Both choices are appropriate for their use cases
|
|
47
47
|
|
|
@@ -59,14 +59,14 @@ The record Lexicon provides a solid foundation for dataset indexing with hybrid
|
|
|
59
59
|
}
|
|
60
60
|
```
|
|
61
61
|
|
|
62
|
-
**Problem:** Should use `"format": "at-uri"` like we did for
|
|
62
|
+
**Problem:** Should use `"format": "at-uri"` like we did for schema fields.
|
|
63
63
|
|
|
64
64
|
**Fix:**
|
|
65
65
|
```json
|
|
66
66
|
"schemaRef": {
|
|
67
67
|
"type": "string",
|
|
68
68
|
"format": "at-uri",
|
|
69
|
-
"description": "AT-URI reference to the
|
|
69
|
+
"description": "AT-URI reference to the schema record",
|
|
70
70
|
"maxLength": 500
|
|
71
71
|
}
|
|
72
72
|
```
|
|
@@ -77,7 +77,7 @@ The record Lexicon provides a solid foundation for dataset indexing with hybrid
|
|
|
77
77
|
|
|
78
78
|
#### Issue 2.2: License Field Inconsistency ⚠️ **Medium**
|
|
79
79
|
|
|
80
|
-
|
|
80
|
+
schema metadata:
|
|
81
81
|
```json
|
|
82
82
|
"license": {
|
|
83
83
|
"type": "string",
|
|
@@ -97,7 +97,7 @@ record:
|
|
|
97
97
|
|
|
98
98
|
**Problem:** Inconsistent maxLength and less detailed guidance.
|
|
99
99
|
|
|
100
|
-
**Recommendation:** Align with
|
|
100
|
+
**Recommendation:** Align with schema:
|
|
101
101
|
- maxLength: 200 (to support full URLs)
|
|
102
102
|
- Enhanced description with examples
|
|
103
103
|
- Reference Schema.org license property
|
|
@@ -106,7 +106,7 @@ record:
|
|
|
106
106
|
|
|
107
107
|
#### Issue 2.3: Tags Field Inconsistency ⚠️ **Medium**
|
|
108
108
|
|
|
109
|
-
|
|
109
|
+
schema metadata:
|
|
110
110
|
```json
|
|
111
111
|
"tags": {
|
|
112
112
|
"type": "array",
|
|
@@ -145,7 +145,7 @@ record:
|
|
|
145
145
|
"license": {...}
|
|
146
146
|
```
|
|
147
147
|
|
|
148
|
-
|
|
148
|
+
schema:
|
|
149
149
|
```json
|
|
150
150
|
"metadata": {
|
|
151
151
|
"type": "object",
|
|
@@ -164,10 +164,10 @@ sampleSchema:
|
|
|
164
164
|
- Pros: More discoverable (top-level fields, indexed/searchable)
|
|
165
165
|
- Pros: Validated by Lexicon
|
|
166
166
|
- Cons: Duplicates structure with metadata blob
|
|
167
|
-
- Cons: Inconsistent with
|
|
167
|
+
- Cons: Inconsistent with schema pattern
|
|
168
168
|
|
|
169
169
|
**Option B: Unified Metadata Object**
|
|
170
|
-
- Pros: Consistent with
|
|
170
|
+
- Pros: Consistent with schema
|
|
171
171
|
- Pros: Single source of truth
|
|
172
172
|
- Cons: Less discoverable for search
|
|
173
173
|
- Cons: Can't validate blob contents
|
|
@@ -298,7 +298,7 @@ storageExternal:
|
|
|
298
298
|
|
|
299
299
|
**Arguments for closed: false (current):**
|
|
300
300
|
- Future extensibility (e.g., IPFS-native, Filecoin, Arweave)
|
|
301
|
-
- Consistent with
|
|
301
|
+
- Consistent with schema schema union pattern
|
|
302
302
|
- Graceful degradation for unknown types
|
|
303
303
|
|
|
304
304
|
**Recommendation:** Keep open but document in description that external/blobs are the canonical types maintained by foundation.ac.
|
|
@@ -307,7 +307,7 @@ storageExternal:
|
|
|
307
307
|
|
|
308
308
|
### 8. Missing Fields from Standard Patterns
|
|
309
309
|
|
|
310
|
-
Comparing to Schema.org Dataset and
|
|
310
|
+
Comparing to Schema.org Dataset and schema patterns:
|
|
311
311
|
|
|
312
312
|
**Consider Adding:**
|
|
313
313
|
|
|
@@ -317,7 +317,7 @@ Comparing to Schema.org Dataset and sampleSchema patterns:
|
|
|
317
317
|
|
|
318
318
|
2. **Version** - Dataset versioning?
|
|
319
319
|
- Current approach: New record per version (via tid)
|
|
320
|
-
- Alternative: Add explicit `version` field like
|
|
320
|
+
- Alternative: Add explicit `version` field like schema
|
|
321
321
|
- **Recommendation:** Document that versioning is via new records, reference via AT-URI with tid
|
|
322
322
|
|
|
323
323
|
3. **Citation** - How to cite this dataset?
|
|
@@ -357,7 +357,7 @@ Comparing to Schema.org Dataset and sampleSchema patterns:
|
|
|
357
357
|
{
|
|
358
358
|
"$type": "ac.foundation.dataset.record",
|
|
359
359
|
"name": "CIFAR-10 Training Set",
|
|
360
|
-
"schemaRef": "at://did:plc:abc123/ac.foundation.dataset.
|
|
360
|
+
"schemaRef": "at://did:plc:abc123/ac.foundation.dataset.schema/imageclassification@1.0.0",
|
|
361
361
|
"storage": {"type": "external", "urls": ["..."]}
|
|
362
362
|
}
|
|
363
363
|
```
|
|
@@ -400,7 +400,7 @@ Comparing to Schema.org Dataset and sampleSchema patterns:
|
|
|
400
400
|
### High Priority (Should Fix)
|
|
401
401
|
|
|
402
402
|
4. **Align metadata pattern** - Clarify relationship between top-level fields and metadata blob
|
|
403
|
-
5. **Standardize license field** - Match
|
|
403
|
+
5. **Standardize license field** - Match schema maxLength and description
|
|
404
404
|
6. **Standardize tags field** - Use consistent limits or document rationale
|
|
405
405
|
|
|
406
406
|
### Medium Priority (Consider)
|
|
@@ -419,9 +419,9 @@ Comparing to Schema.org Dataset and sampleSchema patterns:
|
|
|
419
419
|
|
|
420
420
|
## Consistency Matrix
|
|
421
421
|
|
|
422
|
-
Comparison of patterns between
|
|
422
|
+
Comparison of patterns between schema and record Lexicons:
|
|
423
423
|
|
|
424
|
-
| Pattern |
|
|
424
|
+
| Pattern | schema | record | Status |
|
|
425
425
|
|---------|--------------|--------|--------|
|
|
426
426
|
| AT-URI format | ✅ Uses format | ❌ Missing | **Fix** |
|
|
427
427
|
| License field | 200 chars, detailed | 100 chars, basic | **Align** |
|
|
@@ -440,15 +440,15 @@ Comparison of patterns between sampleSchema and record Lexicons:
|
|
|
440
440
|
1. Add `"format": "at-uri"` to schemaRef field
|
|
441
441
|
2. Change storage union variants to use `$type` discriminator
|
|
442
442
|
3. Verify blob array item type with ATProto specification
|
|
443
|
-
4. Align license field with
|
|
444
|
-
5. Decide on tags limits (recommend matching
|
|
443
|
+
4. Align license field with schema (maxLength: 200, enhanced description)
|
|
444
|
+
5. Decide on tags limits (recommend matching schema: 150/30)
|
|
445
445
|
|
|
446
446
|
### Documentation Improvements
|
|
447
447
|
|
|
448
448
|
6. Add description clarifying metadata blob vs top-level fields relationship
|
|
449
449
|
7. Document that dataset versioning is via new records (tids)
|
|
450
450
|
8. Add note about storage union extensibility
|
|
451
|
-
9. Cross-reference with
|
|
451
|
+
9. Cross-reference with schema Lexicon
|
|
452
452
|
|
|
453
453
|
### Consider for Phase 2
|
|
454
454
|
|
|
@@ -460,7 +460,7 @@ Comparison of patterns between sampleSchema and record Lexicons:
|
|
|
460
460
|
|
|
461
461
|
## Conclusion
|
|
462
462
|
|
|
463
|
-
The record Lexicon provides a solid foundation but needs refinement for ATProto compliance and consistency with
|
|
463
|
+
The record Lexicon provides a solid foundation but needs refinement for ATProto compliance and consistency with schema patterns. The storage union design is excellent, and the use of tids is appropriate. Primary concerns are format validation, union discriminators, and metadata pattern clarity.
|
|
464
464
|
|
|
465
465
|
**Estimated effort to address critical issues:** 2-3 hours
|
|
466
466
|
**Recommended timeline:** Before Phase 1 completion
|
|
@@ -1,6 +1,6 @@
|
|
|
1
|
-
#
|
|
1
|
+
# schema Lexicon Design Questions
|
|
2
2
|
|
|
3
|
-
This document captures open design questions for the `ac.foundation.dataset.
|
|
3
|
+
This document captures open design questions for the `ac.foundation.dataset.schema` Lexicon that require user decisions before implementation.
|
|
4
4
|
|
|
5
5
|
## Q1: Key Format Validation
|
|
6
6
|
|
|
@@ -156,7 +156,7 @@ Should we add an explicit default value for `ndarrayShimUri`?
|
|
|
156
156
|
|
|
157
157
|
## Notes
|
|
158
158
|
|
|
159
|
-
These questions should be resolved before finalizing the
|
|
159
|
+
These questions should be resolved before finalizing the schema Lexicon design. Some can be deferred to Phase 2 implementation based on priority.
|
|
160
160
|
|
|
161
161
|
**Priority:**
|
|
162
162
|
- Q1: High (affects rkey strategy)
|
|
@@ -41,7 +41,7 @@ def bytes_to_array(b: bytes) -> np.ndarray:
|
|
|
41
41
|
# Step 2: Load the JSON Schema for ImageSample
|
|
42
42
|
|
|
43
43
|
# Get path to the schema example
|
|
44
|
-
schema_path = Path(__file__).parent.parent / "
|
|
44
|
+
schema_path = Path(__file__).parent.parent / "schema_example.json"
|
|
45
45
|
with open(schema_path) as f:
|
|
46
46
|
schema_record = json.load(f)
|
|
47
47
|
|
|
@@ -312,5 +312,5 @@ print("""
|
|
|
312
312
|
- Shim definition is sound and reusable
|
|
313
313
|
- Works as both inline $def and external $ref
|
|
314
314
|
- Compatible with JSON Schema tooling
|
|
315
|
-
- Ready for use in ac.foundation.dataset.
|
|
315
|
+
- Ready for use in ac.foundation.dataset.schema Lexicon
|
|
316
316
|
""")
|
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"$type": "ac.foundation.dataset.record",
|
|
3
3
|
"name": "Small Sample Dataset",
|
|
4
|
-
"schemaRef": "at://did:plc:def456/ac.foundation.dataset.
|
|
4
|
+
"schemaRef": "at://did:plc:def456/ac.foundation.dataset.schema/textsample@2.1.0",
|
|
5
5
|
"storage": {
|
|
6
6
|
"$type": "ac.foundation.dataset.storageBlobs",
|
|
7
7
|
"blobs": [
|
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"$type": "ac.foundation.dataset.record",
|
|
3
3
|
"name": "CIFAR-10 Training Set",
|
|
4
|
-
"schemaRef": "at://did:plc:abc123/ac.foundation.dataset.
|
|
4
|
+
"schemaRef": "at://did:plc:abc123/ac.foundation.dataset.schema/imageclassification@1.0.0",
|
|
5
5
|
"storage": {
|
|
6
6
|
"$type": "ac.foundation.dataset.storageExternal",
|
|
7
7
|
"urls": [
|
|
@@ -1,8 +1,8 @@
|
|
|
1
1
|
{
|
|
2
2
|
"$type": "ac.foundation.dataset.lens",
|
|
3
3
|
"name": "RGB to Grayscale Conversion",
|
|
4
|
-
"sourceSchema": "at://did:plc:abc123/ac.foundation.dataset.
|
|
5
|
-
"targetSchema": "at://did:plc:abc123/ac.foundation.dataset.
|
|
4
|
+
"sourceSchema": "at://did:plc:abc123/ac.foundation.dataset.schema/rgbimage@1.0.0",
|
|
5
|
+
"targetSchema": "at://did:plc:abc123/ac.foundation.dataset.schema/grayscaleimage@1.0.0",
|
|
6
6
|
"description": "Converts RGB images to grayscale using standard luminosity formula",
|
|
7
7
|
"getterCode": {
|
|
8
8
|
"repository": "https://github.com/alice/vision-lenses",
|
|
@@ -1,10 +1,10 @@
|
|
|
1
1
|
{
|
|
2
|
-
"$type": "ac.foundation.dataset.
|
|
2
|
+
"$type": "ac.foundation.dataset.schema",
|
|
3
3
|
"name": "ImageSample",
|
|
4
4
|
"version": "1.0.0",
|
|
5
5
|
"schemaType": "jsonSchema",
|
|
6
6
|
"schema": {
|
|
7
|
-
"$type": "ac.foundation.dataset.
|
|
7
|
+
"$type": "ac.foundation.dataset.schema#jsonSchemaFormat",
|
|
8
8
|
"$schema": "http://json-schema.org/draft-07/schema#",
|
|
9
9
|
"title": "ImageSample",
|
|
10
10
|
"type": "object",
|
|
@@ -6,16 +6,16 @@ This directory contains the ATProto Lexicon JSON definitions for the distributed
|
|
|
6
6
|
|
|
7
7
|
### Core Record Types
|
|
8
8
|
|
|
9
|
-
1. **[ac.foundation.dataset.
|
|
9
|
+
1. **[ac.foundation.dataset.schema](ac.foundation.dataset.schema.json)**
|
|
10
10
|
- Defines PackableSample-compatible sample types using JSON Schema
|
|
11
11
|
- Supports versioning via rkey format: `{NSID}@{semver}`
|
|
12
12
|
- Includes NDArray shim for ML/scientific data types
|
|
13
|
-
- Example: [
|
|
13
|
+
- Example: [schema_example.json](../examples/schema_example.json)
|
|
14
14
|
|
|
15
15
|
2. **[ac.foundation.dataset.record](ac.foundation.dataset.record.json)**
|
|
16
16
|
- Index records for WebDataset-backed datasets
|
|
17
17
|
- Hybrid storage support (external URLs + PDS blobs)
|
|
18
|
-
- References
|
|
18
|
+
- References schema for type information
|
|
19
19
|
- Examples:
|
|
20
20
|
- [External storage](../examples/dataset_external_storage.json)
|
|
21
21
|
- [Blob storage](../examples/dataset_blob_storage.json)
|
|
@@ -40,7 +40,7 @@ This directory contains the ATProto Lexicon JSON definitions for the distributed
|
|
|
40
40
|
All Lexicons use the `ac.foundation.dataset.*` namespace:
|
|
41
41
|
- `ac.foundation` - Organization namespace
|
|
42
42
|
- `dataset` - Domain (distributed datasets)
|
|
43
|
-
- Specific record types: `
|
|
43
|
+
- Specific record types: `schema`, `record`, `lens`
|
|
44
44
|
|
|
45
45
|
### 2. Schema Versioning (rkey Convention)
|
|
46
46
|
|
|
@@ -57,7 +57,7 @@ All Lexicons use the `ac.foundation.dataset.*` namespace:
|
|
|
57
57
|
- Natural query pattern via `getLatestSchema`
|
|
58
58
|
- Clear semantic versioning enforcement
|
|
59
59
|
|
|
60
|
-
**Implementation**: The
|
|
60
|
+
**Implementation**: The schema Lexicon uses `"key": "any"` to support this custom format.
|
|
61
61
|
|
|
62
62
|
### 3. JSON Schema with NDArray Shim
|
|
63
63
|
|
|
@@ -151,7 +151,7 @@ schema_uri = publisher.publish_schema(
|
|
|
151
151
|
version="1.0.0",
|
|
152
152
|
description="RGB image with label"
|
|
153
153
|
)
|
|
154
|
-
# Result: at://did:plc:abc123/ac.foundation.dataset.
|
|
154
|
+
# Result: at://did:plc:abc123/ac.foundation.dataset.schema/imagesample@1.0.0
|
|
155
155
|
```
|
|
156
156
|
|
|
157
157
|
### Publishing a Dataset
|
|
@@ -226,7 +226,7 @@ See [06_lexicon_validation.md](../decisions/06_lexicon_validation.md) for valida
|
|
|
226
226
|
|
|
227
227
|
```bash
|
|
228
228
|
# Validate Lexicon JSON (requires ATProto tooling)
|
|
229
|
-
atproto-lexicon validate ac.foundation.dataset.
|
|
229
|
+
atproto-lexicon validate ac.foundation.dataset.schema.json
|
|
230
230
|
|
|
231
231
|
# Validate example records
|
|
232
232
|
python scripts/validate_examples.py
|