classifyre-schemas 0.4.2__py3-none-any.whl

classifyre_schemas-0.4.2.dist-info/METADATA ADDED
@@ -0,0 +1,6 @@
+ Metadata-Version: 2.4
+ Name: classifyre-schemas
+ Version: 0.4.2
+ Summary: Shared JSON schemas for Classifyre monorepo
+ Requires-Python: >=3.12
+ Requires-Dist: fastjsonschema==2.21.2
classifyre_schemas-0.4.2.dist-info/RECORD ADDED
@@ -0,0 +1,15 @@
+ schemas/ADDING_NEW_SOURCE.md,sha256=LY621BjMqQ9oF_Sz1i7KFWegaeBzCmZoyWixMD-0BaM,3976
+ schemas/__init__.py,sha256=6b4P0L2Mh7aSV_ZQZPG6LF3zhxlG0x3hiGjzssGmjbo,74
+ schemas/all_detectors.json,sha256=d-mtCDG0hIpidwwkHhn4LX5II2IVUmfxKEEkPlvwt68,59332
+ schemas/all_detectors.ts,sha256=FogI5ARdKCV3Lwb8D1Yg4niZz_h4oVU944G5MLwskH0,107
+ schemas/all_detectors_examples.json,sha256=cUCqbuRv_HlZ2d7a0SFbGFg6FI-Zhe1HjhOZ6lscNls,46757
+ schemas/all_detectors_examples.ts,sha256=0EuO-OVn76kCCGnjUvFvjDaTGJlQJdlJrT2-cyRBSFo,92
+ schemas/all_input_examples.json,sha256=QDjxw2ozc0h1wMjeQWKhvqVyU6KST10r80Gdz24Yb8o,41490
+ schemas/all_input_sources.json,sha256=1eLwhiFBikusVeOlsc25dLBT8wK-jpTt1pi21b2FXFQ,123109
+ schemas/assistant_contexts.json,sha256=I_ZcJATou1QAChjkATflebFB8PGEdnMTpHrHnwufeO8,810
+ schemas/assistant_knowledge.json,sha256=sTVs0819_trXAbDTsF86K_PAEbR9g8bpMz-5QBizN-M,65488
+ schemas/detector_config.json,sha256=_4C1GodMG7X5lvJ0e8k2yH-ymIPH-g_0laChq6scmus,1583
+ schemas/single_asset_scan_results.json,sha256=y2hvfpAvRiMzwoi75n8r7fcc7Qt80u1LvrhEXx7Qabw,11945
+ classifyre_schemas-0.4.2.dist-info/METADATA,sha256=mtEk_VVVnagltX2KvucfkgfL3WL9vjETRhPveXYS3Ys,177
+ classifyre_schemas-0.4.2.dist-info/WHEEL,sha256=QccIxa26bgl1E6uMy58deGWi-0aeIkkangHcxk2kWfw,87
+ classifyre_schemas-0.4.2.dist-info/RECORD,,
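
Reviewer note: each RECORD row has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the file's SHA-256 hash with the trailing `=` padding removed. The sketch below shows how such a row could be recomputed for the WHEEL file. It assumes the file uses LF line endings and contains exactly the four lines shown in this diff; the helper name `record_entry` is illustrative, and the resulting digest has not been checked against the row above.

```python
import base64
import hashlib


def record_entry(path: str, data: bytes) -> str:
    """Build one RECORD row: path, unpadded URL-safe base64 SHA-256, byte size."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


# Assumed file content: the four WHEEL lines from this diff, LF-terminated.
WHEEL_TEXT = (
    "Wheel-Version: 1.0\n"
    "Generator: hatchling 1.29.0\n"
    "Root-Is-Purelib: true\n"
    "Tag: py3-none-any\n"
)

entry = record_entry(
    "classifyre_schemas-0.4.2.dist-info/WHEEL", WHEEL_TEXT.encode("utf-8")
)
print(entry)
```

Under those assumptions the size field comes out as 87 bytes, which agrees with the WHEEL row in RECORD. Comparing the printed digest with the recorded one is a quick way to confirm that an unpacked file has not been altered.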
classifyre_schemas-0.4.2.dist-info/WHEEL ADDED
@@ -0,0 +1,4 @@
+ Wheel-Version: 1.0
+ Generator: hatchling 1.29.0
+ Root-Is-Purelib: true
+ Tag: py3-none-any
schemas/ADDING_NEW_SOURCE.md ADDED
@@ -0,0 +1,85 @@
+ # Adding a New Source
+
+ This guide lives under `packages/schemas` because every new source starts with schema design, but it covers the end-to-end checklist across CLI, API, DB, and Web. Use WordPress as a working example for patterns and structure.
+
+ Reference files for WordPress:
+ `packages/schemas/src/schemas/all_input_sources.json`
+ `packages/schemas/src/schemas/all_input_examples.json`
+ `apps/cli/src/sources/wordpress/source.py`
+ `apps/cli/tests/integration/test_wordpress_real.py`
+
+ **Step 1: Research and Design**
+
+ 1. Confirm the source is read-only for ingestion and discovery.
+ 2. Identify the auth modes the source supports and prefer existing libraries (for example `requests`, database drivers, or official SDKs) over custom HTTP clients.
+ 3. Decide which config fields are truly needed, and how to represent them with `required`, `masked`, and `optional` sections.
+
+ **Step 2: Add the Input Schema**
+
+ 1. Update `packages/schemas/src/schemas/all_input_sources.json`.
+ 2. Add new `*Required`, `*Masked`, and `*Optional` definitions for the source.
+ 3. Add a new `*Input` definition using `allOf` with `CoreInput`.
+ 4. Add the new input to the top-level `oneOf`.
+ 5. Add the new source to `definitions.AssetType` (uppercase enum string).
+
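
Reviewer note: the five schema edits in Step 2 can be sketched as a small Python transformation over a stand-in schema. Everything here is an assumption made for illustration: the source name `ExampleWiki`, its fields, the helper `add_source`, and the way `required`, `masked`, and `optional` nest inside the `*Input` definition. The real `all_input_sources.json` is much larger and its layout may differ.

```python
import copy
import json

# Stand-in for the relevant parts of all_input_sources.json (assumed layout).
BASE_SCHEMA = {
    "oneOf": [{"$ref": "#/definitions/WordPressInput"}],
    "definitions": {
        "AssetType": {"type": "string", "enum": ["WORDPRESS"]},
        "CoreInput": {"type": "object"},
        "WordPressInput": {"allOf": [{"$ref": "#/definitions/CoreInput"}]},
    },
}


def add_source(schema: dict, name: str, required: dict, masked: dict, optional: dict) -> dict:
    """Return a copy of `schema` with the definitions for a new source added."""
    out = copy.deepcopy(schema)
    defs = out["definitions"]
    # Step 2.2: the *Required, *Masked, and *Optional definitions.
    defs[f"{name}Required"] = {
        "type": "object",
        "properties": required,
        "required": sorted(required),
    }
    defs[f"{name}Masked"] = {"type": "object", "properties": masked}
    defs[f"{name}Optional"] = {"type": "object", "properties": optional}
    # Step 2.3: the *Input definition composes CoreInput through allOf.
    defs[f"{name}Input"] = {
        "allOf": [
            {"$ref": "#/definitions/CoreInput"},
            {
                "type": "object",
                "properties": {
                    "required": {"$ref": f"#/definitions/{name}Required"},
                    "masked": {"$ref": f"#/definitions/{name}Masked"},
                    "optional": {"$ref": f"#/definitions/{name}Optional"},
                },
            },
        ]
    }
    # Step 2.4: register the input in the top-level oneOf.
    out["oneOf"].append({"$ref": f"#/definitions/{name}Input"})
    # Step 2.5: add the uppercase enum string to AssetType.
    defs["AssetType"]["enum"].append(name.upper())
    return out


schema = add_source(
    BASE_SCHEMA,
    "ExampleWiki",  # hypothetical source name
    required={"base_url": {"type": "string"}},
    masked={"api_token": {"type": "string"}},
    optional={"page_size": {"type": "integer", "minimum": 1}},
)
print(json.dumps(schema["definitions"]["ExampleWikiInput"], indent=2))
```

In practice these edits are made by hand in the JSON file; the function form only makes the four touch points (definitions, `*Input`, `oneOf`, `AssetType`) easy to see in one place.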
+ **Step 3: Add Examples**
+
+ 1. Update `packages/schemas/src/schemas/all_input_examples.json`.
+ 2. Provide multiple examples covering auth variations and common use cases.
+ 3. If the root `all_input_examples.json` is still used by your flow, keep it in sync with the schemas copy.
+
+ **Step 4: Validate Examples**
+
+ 1. Update the mapping in `packages/schemas/scripts/validate_examples.py` (`TYPE_TO_DEFINITION`) for the new source.
+ 2. Run the validator script when you add or change examples.
+
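
Reviewer note: the contents of `validate_examples.py` are not part of this package, so the following is only a guess at the role `TYPE_TO_DEFINITION` plays. It assumes each example carries a `type` field holding the uppercase `AssetType` string and that the mapping points to a definition name in the schema; the entries and the helper `find_mapping_problems` are hypothetical.

```python
# Hypothetical mapping from an example's source type to its schema definition.
TYPE_TO_DEFINITION = {
    "WORDPRESS": "WordPressInput",
    "EXAMPLEWIKI": "ExampleWikiInput",
}


def find_mapping_problems(examples: list[dict], definitions: dict) -> list[str]:
    """Report examples whose type is unmapped or maps to a missing definition."""
    problems = []
    for index, example in enumerate(examples):
        source_type = example.get("type")
        definition = TYPE_TO_DEFINITION.get(source_type)
        if definition is None:
            problems.append(f"example {index}: no mapping for type {source_type!r}")
        elif definition not in definitions:
            problems.append(f"example {index}: definition {definition!r} is missing")
    return problems


# Stand-in data; the real examples and definitions live in the JSON files.
definitions = {"WordPressInput": {}, "ExampleWikiInput": {}}
examples = [{"type": "WORDPRESS"}, {"type": "EXAMPLEWIKI"}, {"type": "UNKNOWN"}]
problems = find_mapping_problems(examples, definitions)
print(problems)
```

A full validator would presumably also check each example against its mapped definition, for instance with `fastjsonschema`, the dependency this package pins; that part is left out here to keep the sketch free of third-party imports.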
+ **Step 5: Regenerate Pydantic Models (CLI)**
+
+ 1. Run `uv run python scripts/generate_models.py` from `apps/cli`.
+ 2. Confirm `apps/cli/src/models/generated_input.py` includes the new input model.
+
+ **Step 6: Implement the CLI Source**
+
+ 1. Add a new package under `apps/cli/src/sources/<source_name>/`.
+ 2. Implement `BaseSource` methods: `test_connection`, `extract`, `generate_hash_id`, `abort`, and optionally `fetch_content`.
+ 3. Use streaming batches in `extract` and follow existing patterns for detectors and hashing.
+ 4. Ensure the module is importable so the auto-discovery in `src/sources/__init__.py` can register it.
+ 5. Prefer existing libraries already used in the repo over custom clients.
+
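
Reviewer note: a minimal sketch of the shape Step 6 describes. The `BaseSource` class below is a local stand-in, since the real base class in `apps/cli` is not included in this package and its signatures may differ. The method names come from the guide; the parameters, the `ExampleWikiSource` class, the in-memory page list, and the hashing scheme are assumptions.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Iterator


class BaseSource(ABC):
    """Stand-in for the repo's BaseSource; real signatures may differ."""

    @abstractmethod
    def test_connection(self) -> bool: ...

    @abstractmethod
    def extract(self, batch_size: int = 2) -> Iterator[list[dict]]: ...

    @abstractmethod
    def generate_hash_id(self, asset: dict) -> str: ...

    @abstractmethod
    def abort(self) -> None: ...


class ExampleWikiSource(BaseSource):
    """Hypothetical source backed by an in-memory list instead of a network client."""

    def __init__(self, pages: list[dict]) -> None:
        self._pages = pages
        self._aborted = False

    def test_connection(self) -> bool:
        # A real source would issue a cheap read-only request here.
        return True

    def extract(self, batch_size: int = 2) -> Iterator[list[dict]]:
        # Stream fixed-size batches rather than materializing every asset.
        batch: list[dict] = []
        for page in self._pages:
            if self._aborted:
                return  # stop early; any partial batch is dropped
            batch.append({**page, "hash_id": self.generate_hash_id(page)})
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:
            yield batch

    def generate_hash_id(self, asset: dict) -> str:
        # Stable identifier derived from the asset URL (assumed scheme).
        return hashlib.sha256(f"examplewiki:{asset['url']}".encode("utf-8")).hexdigest()

    def abort(self) -> None:
        self._aborted = True


source = ExampleWikiSource([{"url": f"https://wiki.example/p/{i}"} for i in range(3)])
batch_sizes = [len(batch) for batch in source.extract(batch_size=2)]
print(batch_sizes)
```

Writing `extract` as a generator keeps memory use bounded by the batch size and lets `abort` take effect between items, which is the main reason to prefer it over returning one large list.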
+ **Step 7: Tests**
+
+ 1. Ask the requester whether a test system or credentials are available.
+ 2. Add unit tests for config parsing, paging, and error handling.
+ 3. If real credentials or a test instance are available, add an integration test in `apps/cli/tests/integration`.
+ 4. Use `test_wordpress_real.py` as a model for structure and expectations.
+
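
Reviewer note: a sketch of the kind of unit test Step 7 asks for, covering paging and error handling without network access. The `iter_pages` helper and its stop-on-empty-page contract are invented for this example and do not come from the repository; a real source would be tested through its own paging code.

```python
import unittest
from typing import Callable, Iterator


def iter_pages(fetch: Callable[[int], list[str]]) -> Iterator[str]:
    """Yield items page by page until `fetch` returns an empty page."""
    page = 1
    while True:
        items = fetch(page)
        if not items:
            return
        yield from items
        page += 1


class PagingTest(unittest.TestCase):
    def test_stops_on_empty_page(self) -> None:
        pages = {1: ["a", "b"], 2: ["c"]}
        result = list(iter_pages(lambda number: pages.get(number, [])))
        self.assertEqual(result, ["a", "b", "c"])

    def test_fetch_error_propagates(self) -> None:
        def failing(_number: int) -> list[str]:
            raise ConnectionError("simulated outage")

        with self.assertRaises(ConnectionError):
            list(iter_pages(failing))
```

Saved as a test module, this runs with `python -m unittest`. Injecting the fetch function keeps the test independent of credentials, which leaves the integration test as the only place where a real instance is needed.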
+ **Step 8: API Updates**
+
+ 1. Search for `AssetType` and source lists to ensure the new type is recognized in API logic.
+ 2. If DTOs or validations change, update `apps/api` and regenerate OpenAPI as needed.
+
+ **Step 9: Prisma Schema and Migrations**
+
+ 1. Update `enum AssetType` in `apps/api/prisma/schema.prisma`.
+ 2. Run Prisma migration and client generation if the enum changes.
+
+ **Step 10: OpenAPI and API Client**
+
+ 1. Regenerate `apps/api/openapi.json` when API DTOs or enums change.
+ 2. Regenerate the API client in `packages/api-client` if the OpenAPI spec changes.
+
+ **Step 11: Web and UI**
+
+ 1. Update source lists, icons, and labels in the web and UI packages.
+ 2. Common places to check:
+ `apps/web/components/source-type-selector.tsx`
+ `apps/web/app/(dashboard)/discovery/page.tsx`
+ `packages/ui/src/components/source-icon.tsx`
+ `packages/ui/src/mocks/types.ts`
+ `packages/ui/src/mocks/sources.ts`
+
+ **Quality Bar**
+
+ 1. Keep configuration minimal, documented, and consistent with other sources.
+ 2. Keep code clean, testable, and reusable with clear separation between IO, parsing, and transformation.
+ 3. Prefer stable, well-maintained dependencies and avoid custom protocol implementations.
schemas/__init__.py ADDED
@@ -0,0 +1,3 @@
+ """Shared JSON schemas for Classifyre monorepo."""
+
+ __version__ = "0.0.1"