disclosure-alpha 1.0.0__tar.gz

Files changed (203)
  1. disclosure_alpha-1.0.0/.coverage +0 -0
  2. disclosure_alpha-1.0.0/.github/workflows/ci.yml +35 -0
  3. disclosure_alpha-1.0.0/.github/workflows/integration.yml +30 -0
  4. disclosure_alpha-1.0.0/.github/workflows/publish.yml +21 -0
  5. disclosure_alpha-1.0.0/.gitignore +12 -0
  6. disclosure_alpha-1.0.0/.readthedocs.yaml +20 -0
  7. disclosure_alpha-1.0.0/CONTRIBUTING.md +52 -0
  8. disclosure_alpha-1.0.0/LICENSE +17 -0
  9. disclosure_alpha-1.0.0/PKG-INFO +290 -0
  10. disclosure_alpha-1.0.0/README.md +246 -0
  11. disclosure_alpha-1.0.0/SECURITY.md +20 -0
  12. disclosure_alpha-1.0.0/data/universe/README.md +39 -0
  13. disclosure_alpha-1.0.0/data/universe/sp500.csv +504 -0
  14. disclosure_alpha-1.0.0/data/validation/README.md +245 -0
  15. disclosure_alpha-1.0.0/data/validation/baselines/dictionary_shift_baseline.json +10319 -0
  16. disclosure_alpha-1.0.0/data/validation/corpus/.gitkeep +1 -0
  17. disclosure_alpha-1.0.0/data/validation/corpus/sp500_item1a.manifest.json +67 -0
  18. disclosure_alpha-1.0.0/data/validation/reports/.gitkeep +1 -0
  19. disclosure_alpha-1.0.0/data/validation/reports/deterministic_validation_report.json +156 -0
  20. disclosure_alpha-1.0.0/data/validation/reports/dictionary_shift_report.json +38 -0
  21. disclosure_alpha-1.0.0/data/validation/reports/l3_outcomes_report.json +48 -0
  22. disclosure_alpha-1.0.0/data/validation/reports/l3_outcomes_report_edgar.json +48 -0
  23. disclosure_alpha-1.0.0/data/validation/reports/l3_outcomes_report_edgar_fy2024.json +48 -0
  24. disclosure_alpha-1.0.0/docs/CONTRIBUTING_DOCS.md +50 -0
  25. disclosure_alpha-1.0.0/docs/README.md +30 -0
  26. disclosure_alpha-1.0.0/docs/_includes/component-plain-english.md +13 -0
  27. disclosure_alpha-1.0.0/docs/_includes/pipeline-diagram.md +34 -0
  28. disclosure_alpha-1.0.0/docs/_includes/score-scale.md +10 -0
  29. disclosure_alpha-1.0.0/docs/_static/.gitkeep +0 -0
  30. disclosure_alpha-1.0.0/docs/_templates/page_template.md +26 -0
  31. disclosure_alpha-1.0.0/docs/appendix/changelog.md +62 -0
  32. disclosure_alpha-1.0.0/docs/appendix/glossary.md +58 -0
  33. disclosure_alpha-1.0.0/docs/appendix/index.md +10 -0
  34. disclosure_alpha-1.0.0/docs/assets/readme-hero.png +0 -0
  35. disclosure_alpha-1.0.0/docs/conf.py +117 -0
  36. disclosure_alpha-1.0.0/docs/developer/architecture.md +12 -0
  37. disclosure_alpha-1.0.0/docs/developer/index.md +17 -0
  38. disclosure_alpha-1.0.0/docs/developer/testing.md +12 -0
  39. disclosure_alpha-1.0.0/docs/examples/panel-response-snippet.json +51 -0
  40. disclosure_alpha-1.0.0/docs/examples/score-minimal-10k.json +266 -0
  41. disclosure_alpha-1.0.0/docs/examples/score-with-prior-snippet.json +28 -0
  42. disclosure_alpha-1.0.0/docs/getting-started/choose-your-surface.md +75 -0
  43. disclosure_alpha-1.0.0/docs/getting-started/concepts.md +34 -0
  44. disclosure_alpha-1.0.0/docs/getting-started/faq.md +99 -0
  45. disclosure_alpha-1.0.0/docs/getting-started/index.md +16 -0
  46. disclosure_alpha-1.0.0/docs/getting-started/installation.md +64 -0
  47. disclosure_alpha-1.0.0/docs/getting-started/quickstart-cli.md +73 -0
  48. disclosure_alpha-1.0.0/docs/getting-started/quickstart-python.md +85 -0
  49. disclosure_alpha-1.0.0/docs/getting-started/sec-edgar-setup.md +35 -0
  50. disclosure_alpha-1.0.0/docs/getting-started/understanding-scores.md +96 -0
  51. disclosure_alpha-1.0.0/docs/guides/cli/index.md +68 -0
  52. disclosure_alpha-1.0.0/docs/guides/http/endpoints/changes.md +12 -0
  53. disclosure_alpha-1.0.0/docs/guides/http/endpoints/filings.md +12 -0
  54. disclosure_alpha-1.0.0/docs/guides/http/endpoints/flags.md +12 -0
  55. disclosure_alpha-1.0.0/docs/guides/http/endpoints/health.md +12 -0
  56. disclosure_alpha-1.0.0/docs/guides/http/endpoints/matrix.md +12 -0
  57. disclosure_alpha-1.0.0/docs/guides/http/endpoints/metrics.md +12 -0
  58. disclosure_alpha-1.0.0/docs/guides/http/endpoints/panel.md +12 -0
  59. disclosure_alpha-1.0.0/docs/guides/http/endpoints/sections.md +12 -0
  60. disclosure_alpha-1.0.0/docs/guides/http/index.md +144 -0
  61. disclosure_alpha-1.0.0/docs/guides/index.md +14 -0
  62. disclosure_alpha-1.0.0/docs/guides/mcp/index.md +73 -0
  63. disclosure_alpha-1.0.0/docs/guides/python/index.md +67 -0
  64. disclosure_alpha-1.0.0/docs/guides/workflows/index.md +155 -0
  65. disclosure_alpha-1.0.0/docs/index.md +90 -0
  66. disclosure_alpha-1.0.0/docs/legal.md +32 -0
  67. disclosure_alpha-1.0.0/docs/methodology/aggregation.md +208 -0
  68. disclosure_alpha-1.0.0/docs/methodology/dictionaries/enrichment-research.md +769 -0
  69. disclosure_alpha-1.0.0/docs/methodology/diff-engine.md +137 -0
  70. disclosure_alpha-1.0.0/docs/methodology/index.md +25 -0
  71. disclosure_alpha-1.0.0/docs/methodology/metrics-engine.md +143 -0
  72. disclosure_alpha-1.0.0/docs/methodology/overview.md +94 -0
  73. disclosure_alpha-1.0.0/docs/methodology/research-foundation.md +76 -0
  74. disclosure_alpha-1.0.0/docs/methodology/roadmap/v2-improvement-plan.md +195 -0
  75. disclosure_alpha-1.0.0/docs/postman/disclosure-alpha-analytics.postman_collection.json +54 -0
  76. disclosure_alpha-1.0.0/docs/postman/disclosure-alpha-api.postman_collection.json +229 -0
  77. disclosure_alpha-1.0.0/docs/postman/disclosure-alpha-compliance.postman_collection.json +54 -0
  78. disclosure_alpha-1.0.0/docs/postman/disclosure-alpha-discovery.postman_collection.json +40 -0
  79. disclosure_alpha-1.0.0/docs/postman/disclosure-alpha-panel.postman_collection.json +30 -0
  80. disclosure_alpha-1.0.0/docs/postman/disclosure-alpha-scores.postman_collection.json +77 -0
  81. disclosure_alpha-1.0.0/docs/reference/environment-variables.md +13 -0
  82. disclosure_alpha-1.0.0/docs/reference/http/openapi.md +67 -0
  83. disclosure_alpha-1.0.0/docs/reference/http/schemas/changes.md +5 -0
  84. disclosure_alpha-1.0.0/docs/reference/http/schemas/common.md +5 -0
  85. disclosure_alpha-1.0.0/docs/reference/http/schemas/flags.md +5 -0
  86. disclosure_alpha-1.0.0/docs/reference/http/schemas/index.md +11 -0
  87. disclosure_alpha-1.0.0/docs/reference/http/schemas/matrix.md +5 -0
  88. disclosure_alpha-1.0.0/docs/reference/http/schemas/panel.md +5 -0
  89. disclosure_alpha-1.0.0/docs/reference/index.md +12 -0
  90. disclosure_alpha-1.0.0/docs/reference/python/api.md +14 -0
  91. disclosure_alpha-1.0.0/docs/reference/python/cli.md +27 -0
  92. disclosure_alpha-1.0.0/docs/reference/python/deterministic_scoring.md +30 -0
  93. disclosure_alpha-1.0.0/docs/reference/python/dictionaries.md +51 -0
  94. disclosure_alpha-1.0.0/docs/reference/python/diff_engine.md +31 -0
  95. disclosure_alpha-1.0.0/docs/reference/python/edgar.md +34 -0
  96. disclosure_alpha-1.0.0/docs/reference/python/index.md +18 -0
  97. disclosure_alpha-1.0.0/docs/reference/python/mcp.md +43 -0
  98. disclosure_alpha-1.0.0/docs/reference/python/pipeline.md +33 -0
  99. disclosure_alpha-1.0.0/docs/reference/python/section_extractor.md +32 -0
  100. disclosure_alpha-1.0.0/docs/reference/python/text_metrics.md +30 -0
  101. disclosure_alpha-1.0.0/docs/reference/python/validation.md +5 -0
  102. disclosure_alpha-1.0.0/docs/reference/section-taxonomy.md +48 -0
  103. disclosure_alpha-1.0.0/docs/requirements.txt +4 -0
  104. disclosure_alpha-1.0.0/docs/validation/evidence-and-limitations.md +45 -0
  105. disclosure_alpha-1.0.0/docs/validation/index.md +9 -0
  106. disclosure_alpha-1.0.0/pyproject.toml +65 -0
  107. disclosure_alpha-1.0.0/scripts/audit_validation_corpus.py +87 -0
  108. disclosure_alpha-1.0.0/scripts/build_validation_corpus.py +84 -0
  109. disclosure_alpha-1.0.0/scripts/build_validation_corpus_from_edgar.py +278 -0
  110. disclosure_alpha-1.0.0/scripts/diagnose_item1a.py +128 -0
  111. disclosure_alpha-1.0.0/scripts/fetch_sp500_universe.py +70 -0
  112. disclosure_alpha-1.0.0/scripts/fetch_validation_outcomes.py +137 -0
  113. disclosure_alpha-1.0.0/scripts/validate_deterministic_construct.py +77 -0
  114. disclosure_alpha-1.0.0/scripts/validate_deterministic_outcomes.py +98 -0
  115. disclosure_alpha-1.0.0/scripts/validate_dictionary_shift.py +202 -0
  116. disclosure_alpha-1.0.0/src/disclosure_alpha/__init__.py +49 -0
  117. disclosure_alpha-1.0.0/src/disclosure_alpha/api/__init__.py +3 -0
  118. disclosure_alpha-1.0.0/src/disclosure_alpha/api/app.py +22 -0
  119. disclosure_alpha-1.0.0/src/disclosure_alpha/api/app_factory.py +16 -0
  120. disclosure_alpha-1.0.0/src/disclosure_alpha/api/endpoints/__init__.py +19 -0
  121. disclosure_alpha-1.0.0/src/disclosure_alpha/api/endpoints/changes.py +61 -0
  122. disclosure_alpha-1.0.0/src/disclosure_alpha/api/endpoints/deps.py +48 -0
  123. disclosure_alpha-1.0.0/src/disclosure_alpha/api/endpoints/filings.py +43 -0
  124. disclosure_alpha-1.0.0/src/disclosure_alpha/api/endpoints/flags.py +59 -0
  125. disclosure_alpha-1.0.0/src/disclosure_alpha/api/endpoints/health.py +10 -0
  126. disclosure_alpha-1.0.0/src/disclosure_alpha/api/endpoints/matrix.py +108 -0
  127. disclosure_alpha-1.0.0/src/disclosure_alpha/api/endpoints/metrics.py +52 -0
  128. disclosure_alpha-1.0.0/src/disclosure_alpha/api/endpoints/panel.py +84 -0
  129. disclosure_alpha-1.0.0/src/disclosure_alpha/api/endpoints/sections.py +56 -0
  130. disclosure_alpha-1.0.0/src/disclosure_alpha/api/helpers.py +106 -0
  131. disclosure_alpha-1.0.0/src/disclosure_alpha/api/routes.py +5 -0
  132. disclosure_alpha-1.0.0/src/disclosure_alpha/api/schemas/__init__.py +29 -0
  133. disclosure_alpha-1.0.0/src/disclosure_alpha/api/schemas/changes.py +20 -0
  134. disclosure_alpha-1.0.0/src/disclosure_alpha/api/schemas/common.py +47 -0
  135. disclosure_alpha-1.0.0/src/disclosure_alpha/api/schemas/flags.py +20 -0
  136. disclosure_alpha-1.0.0/src/disclosure_alpha/api/schemas/matrix.py +13 -0
  137. disclosure_alpha-1.0.0/src/disclosure_alpha/api/schemas/panel.py +33 -0
  138. disclosure_alpha-1.0.0/src/disclosure_alpha/api/shapes.py +70 -0
  139. disclosure_alpha-1.0.0/src/disclosure_alpha/cli.py +97 -0
  140. disclosure_alpha-1.0.0/src/disclosure_alpha/confidence.py +18 -0
  141. disclosure_alpha-1.0.0/src/disclosure_alpha/deterministic_scoring.py +396 -0
  142. disclosure_alpha-1.0.0/src/disclosure_alpha/dictionaries.py +649 -0
  143. disclosure_alpha-1.0.0/src/disclosure_alpha/diff_engine.py +134 -0
  144. disclosure_alpha-1.0.0/src/disclosure_alpha/edgar/__init__.py +20 -0
  145. disclosure_alpha-1.0.0/src/disclosure_alpha/edgar/cache.py +54 -0
  146. disclosure_alpha-1.0.0/src/disclosure_alpha/edgar/client.py +78 -0
  147. disclosure_alpha-1.0.0/src/disclosure_alpha/edgar/resolver.py +469 -0
  148. disclosure_alpha-1.0.0/src/disclosure_alpha/edgar/types.py +34 -0
  149. disclosure_alpha-1.0.0/src/disclosure_alpha/embedding_service.py +70 -0
  150. disclosure_alpha-1.0.0/src/disclosure_alpha/filing_normalizer.py +30 -0
  151. disclosure_alpha-1.0.0/src/disclosure_alpha/mcp/__init__.py +0 -0
  152. disclosure_alpha-1.0.0/src/disclosure_alpha/mcp/analyst.py +49 -0
  153. disclosure_alpha-1.0.0/src/disclosure_alpha/mcp/builder.py +67 -0
  154. disclosure_alpha-1.0.0/src/disclosure_alpha/mcp/server.py +23 -0
  155. disclosure_alpha-1.0.0/src/disclosure_alpha/mcp/tools.py +159 -0
  156. disclosure_alpha-1.0.0/src/disclosure_alpha/pipeline.py +469 -0
  157. disclosure_alpha-1.0.0/src/disclosure_alpha/scoring_types.py +85 -0
  158. disclosure_alpha-1.0.0/src/disclosure_alpha/section_extractor.py +789 -0
  159. disclosure_alpha-1.0.0/src/disclosure_alpha/text_cleaner.py +37 -0
  160. disclosure_alpha-1.0.0/src/disclosure_alpha/text_matching.py +79 -0
  161. disclosure_alpha-1.0.0/src/disclosure_alpha/text_metrics.py +161 -0
  162. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/__init__.py +22 -0
  163. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/construct.py +291 -0
  164. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/corpus.py +159 -0
  165. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/edgar_gates.py +103 -0
  166. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/monotonicity.py +126 -0
  167. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/openbb_client.py +70 -0
  168. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/outcomes.py +244 -0
  169. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/outcomes_validation.py +250 -0
  170. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/references/__init__.py +8 -0
  171. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/references/boilerplate.py +54 -0
  172. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/references/ner.py +64 -0
  173. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/types.py +62 -0
  174. disclosure_alpha-1.0.0/src/disclosure_alpha/validation/universe.py +46 -0
  175. disclosure_alpha-1.0.0/src/disclosure_alpha/version.py +11 -0
  176. disclosure_alpha-1.0.0/tests/conftest.py +25 -0
  177. disclosure_alpha-1.0.0/tests/fixtures/dictionary_near_miss_snippets.json +54 -0
  178. disclosure_alpha-1.0.0/tests/fixtures/filings/aapl_2025_10k.html +8 -0
  179. disclosure_alpha-1.0.0/tests/fixtures/filings/amzn_2026_10k.html +8 -0
  180. disclosure_alpha-1.0.0/tests/fixtures/filings/tgt_2026_10k.html +8 -0
  181. disclosure_alpha-1.0.0/tests/fixtures/sample_10k.html +17 -0
  182. disclosure_alpha-1.0.0/tests/fixtures/validation/mini_corpus.jsonl +3 -0
  183. disclosure_alpha-1.0.0/tests/html_fixtures.py +27 -0
  184. disclosure_alpha-1.0.0/tests/test_api.py +299 -0
  185. disclosure_alpha-1.0.0/tests/test_api_changes.py +124 -0
  186. disclosure_alpha-1.0.0/tests/test_api_composite.py +99 -0
  187. disclosure_alpha-1.0.0/tests/test_api_flags.py +97 -0
  188. disclosure_alpha-1.0.0/tests/test_api_panel.py +93 -0
  189. disclosure_alpha-1.0.0/tests/test_cli.py +98 -0
  190. disclosure_alpha-1.0.0/tests/test_confidence.py +29 -0
  191. disclosure_alpha-1.0.0/tests/test_construct_validity.py +195 -0
  192. disclosure_alpha-1.0.0/tests/test_deterministic_scoring.py +320 -0
  193. disclosure_alpha-1.0.0/tests/test_dictionary_snippets.py +195 -0
  194. disclosure_alpha-1.0.0/tests/test_diff_engine.py +48 -0
  195. disclosure_alpha-1.0.0/tests/test_edgar_resolver.py +78 -0
  196. disclosure_alpha-1.0.0/tests/test_mcp.py +112 -0
  197. disclosure_alpha-1.0.0/tests/test_mcp_analyst.py +20 -0
  198. disclosure_alpha-1.0.0/tests/test_mcp_builder.py +28 -0
  199. disclosure_alpha-1.0.0/tests/test_monotonicity.py +76 -0
  200. disclosure_alpha-1.0.0/tests/test_outcomes.py +82 -0
  201. disclosure_alpha-1.0.0/tests/test_pipeline.py +221 -0
  202. disclosure_alpha-1.0.0/tests/test_section_extractor.py +238 -0
  203. disclosure_alpha-1.0.0/tests/test_text_metrics.py +215 -0
Binary file
@@ -0,0 +1,35 @@
+ name: ci
+
+ on:
+   push:
+     branches: [main]
+   pull_request:
+
+ jobs:
+   docs:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-python@v5
+         with:
+           python-version: "3.11"
+       - run: pip install -e ".[api,mcp,dev]" && pip install -r docs/requirements.txt
+       - run: sphinx-build -W -b html docs docs/_build/html
+
+   test:
+     runs-on: ubuntu-latest
+     strategy:
+       matrix:
+         python-version: ["3.11", "3.12"]
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-python@v5
+         with:
+           python-version: ${{ matrix.python-version }}
+       - run: pip install -e ".[api,mcp,dev]"
+       - run: pytest -q -m "not integration" --cov=disclosure_alpha --cov-fail-under=75
+         env:
+           EMBEDDING_BACKEND: tfidf
+       - name: Dictionary distribution shift (non-blocking)
+         run: python scripts/validate_dictionary_shift.py --allow-fail
+         continue-on-error: true
@@ -0,0 +1,30 @@
+ name: integration
+
+ on:
+   workflow_dispatch:
+   schedule:
+     - cron: "0 6 * * *"
+
+ jobs:
+   integration:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-python@v5
+         with:
+           python-version: "3.12"
+       - run: pip install -e ".[api,mcp,outcomes,dev]"
+       - run: pytest -q -m integration
+         env:
+           RUN_INTEGRATION: "1"
+           EMBEDDING_BACKEND: tfidf
+       - name: EDGAR smoke score
+         env:
+           SEC_USER_AGENT: ${{ secrets.SEC_USER_AGENT }}
+         run: |
+           if [ -z "$SEC_USER_AGENT" ]; then
+             echo "Skipping EDGAR smoke (SEC_USER_AGENT secret not set)"
+             exit 0
+           fi
+           score=$(disclosure-alpha score --ticker AAPL --fiscal-year 2025 --form 10-K | python -c "import json,sys; print(json.load(sys.stdin)['scores']['overall_disclosure_risk_score'])")
+           test -n "$score"
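Reviewer note: the smoke step above pipes the CLI's JSON into a `python -c` one-liner and only checks that something non-empty comes back. A slightly stricter check might look like the sketch below. The helper name `extract_overall_score` is hypothetical and not part of the package, and the 0-100 range check assumes the score scale described in the package README; the key path `scores.overall_disclosure_risk_score` is the one the workflow itself reads.

```python
import json


def extract_overall_score(raw_json: str) -> float:
    """Return the headline score from a `disclosure-alpha score` JSON payload.

    Hypothetical helper for illustration; not part of the package.
    """
    payload = json.loads(raw_json)
    # Same key path the workflow's one-liner reads.
    score = payload["scores"]["overall_disclosure_risk_score"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError(f"score is not numeric: {score!r}")
    # Assumption: scores live on the 0-100 scale described in the README.
    if not 0 <= score <= 100:
        raise ValueError(f"score outside assumed 0-100 range: {score}")
    return float(score)


if __name__ == "__main__":
    sample = '{"scores": {"overall_disclosure_risk_score": 17.84}}'
    print(extract_overall_score(sample))
```

Raising on a non-numeric or out-of-range value would make the smoke step fail loudly instead of passing on any non-empty string.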
@@ -0,0 +1,21 @@
+ name: Publish to PyPI
+
+ on:
+   release:
+     types: [published]
+
+ permissions:
+   id-token: write
+
+ jobs:
+   publish:
+     runs-on: ubuntu-latest
+     environment: pypi
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-python@v5
+         with:
+           python-version: "3.11"
+       - run: pip install build
+       - run: python -m build
+       - uses: pypa/gh-action-pypi-publish@release/v1
@@ -0,0 +1,12 @@
+ __pycache__/
+ *.py[cod]
+ .venv/
+ dist/
+ *.egg-info/
+ .pytest_cache/
+ .mypy_cache/
+ .ruff_cache/
+ data/cache/
+ data/validation/corpus/*.jsonl
+ data/validation/outcomes/
+ docs/_build/
@@ -0,0 +1,20 @@
+ # Read the Docs configuration — https://docs.readthedocs.io/en/stable/config-file/v2.html
+ version: 2
+
+ build:
+   os: ubuntu-22.04
+   tools:
+     python: "3.11"
+
+ sphinx:
+   configuration: docs/conf.py
+
+ python:
+   install:
+     - method: pip
+       path: .
+       extra_requirements:
+         - api
+         - mcp
+         - dev
+     - requirements: docs/requirements.txt
@@ -0,0 +1,52 @@
+ # Contributing to Disclosure Alpha
+
+ Thanks for your interest in contributing. Disclosure Alpha is Apache-2.0 licensed.
+
+ ## Install for development
+
+ Clone the repository and install in editable mode:
+
+ ```bash
+ git clone https://github.com/alwank/disclosure-alpha.git
+ cd disclosure-alpha
+ pip install -e ".[api,mcp,dev]"
+ ```
+
+ End users install from PyPI instead: `pip install "disclosure-alpha[api,mcp,dev]"`. See [Installation](https://disclosure-alpha.readthedocs.io/en/stable/getting-started/installation.html).
+
+ ## Run tests
+
+ ```bash
+ pytest -q -m "not integration" --cov=disclosure_alpha --cov-fail-under=75
+ ```
+
+ Integration tests (network / EDGAR):
+
+ ```bash
+ RUN_INTEGRATION=1 pytest -q -m integration
+ ```
+
+ Set `SEC_USER_AGENT="YourName your@email.com"` for EDGAR-backed tests.
+
+ ## Build docs locally
+
+ ```bash
+ pip install -r docs/requirements.txt
+ sphinx-build -W -b html docs docs/_build/html
+ ```
+
+ Documentation-only edits: see [docs/CONTRIBUTING_DOCS.md](docs/CONTRIBUTING_DOCS.md).
+
+ ## Claim boundaries
+
+ When writing docs or examples, match [Evidence & limitations](https://disclosure-alpha.readthedocs.io/en/stable/validation/evidence-and-limitations.html):
+
+ - **Supported:** deterministic Item 1A on ~425 S&P 500 FY2025 10-Ks; partial L2 construct validity; partial L3 volatility association
+ - **Do not claim:** full-index validation, earnings-surprise prediction, buy/sell signals, or composite LLM scoring in the open-source API
+
+ ## Pull requests
+
+ 1. Fork and branch from `main`
+ 2. Keep changes focused
+ 3. Ensure tests and docs build pass
+ 4. Open a PR with a clear description of what changed and why
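Reviewer note: CONTRIBUTING.md asks for `SEC_USER_AGENT="YourName your@email.com"` before running EDGAR-backed tests. As a rough illustration of a pre-flight check, the sketch below reads the variable and confirms it looks like a name followed by something email-like. The function name `read_sec_user_agent` and the validation rule are assumptions made for illustration; how the package itself reads or validates the variable is not shown in this part of the diff.

```python
import os
import re

# Loose pattern: some name text, whitespace, then a token that looks like an email.
_UA_PATTERN = re.compile(r"^\S.*\s\S+@\S+\.\S+$")


def read_sec_user_agent(env=None) -> str:
    """Return SEC_USER_AGENT or raise if it is missing or oddly shaped.

    Hypothetical helper; `env` defaults to os.environ and exists so the
    check can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    value = env.get("SEC_USER_AGENT", "").strip()
    if not value:
        raise RuntimeError("SEC_USER_AGENT is not set")
    if not _UA_PATTERN.match(value):
        raise RuntimeError(
            'SEC_USER_AGENT should look like "YourName your@email.com"'
        )
    return value


if __name__ == "__main__":
    print(read_sec_user_agent({"SEC_USER_AGENT": "YourName your@email.com"}))
```

Failing early with a clear message is cheaper than discovering a rejected request partway through an integration run.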
@@ -0,0 +1,17 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ Copyright 2026 Disclosure Alpha contributors
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
@@ -0,0 +1,290 @@
+ Metadata-Version: 2.4
+ Name: disclosure-alpha
+ Version: 1.0.0
+ Summary: Open-source deterministic SEC filing analytics: parse, metrics, diff, score
+ Project-URL: Homepage, https://disclosure-alpha.readthedocs.io/en/stable/
+ Project-URL: Documentation, https://disclosure-alpha.readthedocs.io/en/stable/
+ Project-URL: Repository, https://github.com/alwank/disclosure-alpha
+ Project-URL: Issues, https://github.com/alwank/disclosure-alpha/issues
+ Project-URL: Changelog, https://github.com/alwank/disclosure-alpha/blob/main/docs/appendix/changelog.md
+ Author-email: Alwan <alwan.alkautsar@gmail.com>
+ License: Apache-2.0
+ License-File: LICENSE
+ Keywords: 10-k,disclosure,edgar,mcp,nlp,sec
+ Classifier: Development Status :: 5 - Production/Stable
+ Classifier: Intended Audience :: Developers
+ Classifier: License :: OSI Approved :: Apache Software License
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Topic :: Office/Business :: Financial
+ Requires-Python: >=3.11
+ Requires-Dist: beautifulsoup4>=4.12.0
+ Requires-Dist: lxml>=5.1.0
+ Requires-Dist: numpy>=1.26.0
+ Requires-Dist: scikit-learn>=1.4.0
+ Requires-Dist: sec-parser<0.59,>=0.58.1
+ Provides-Extra: api
+ Requires-Dist: fastapi>=0.110; extra == 'api'
+ Requires-Dist: pydantic>=2.5.0; extra == 'api'
+ Requires-Dist: uvicorn[standard]>=0.27; extra == 'api'
+ Provides-Extra: dev
+ Requires-Dist: httpx2<3,>=2.4.0; extra == 'dev'
+ Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
+ Requires-Dist: pytest-cov>=4.1.0; extra == 'dev'
+ Requires-Dist: pytest>=7.4.0; extra == 'dev'
+ Provides-Extra: mcp
+ Requires-Dist: mcp>=1.0.0; extra == 'mcp'
+ Requires-Dist: pydantic>=2.5.0; extra == 'mcp'
+ Provides-Extra: outcomes
+ Requires-Dist: yfinance>=0.2.0; extra == 'outcomes'
+ Provides-Extra: semantic
+ Requires-Dist: sentence-transformers>=2.2.0; extra == 'semantic'
+ Provides-Extra: validation
+ Requires-Dist: spacy>=3.7.0; extra == 'validation'
+ Description-Content-Type: text/markdown
+
+ <p align="center">
+   <img src="docs/assets/readme-hero.png" alt="Disclosure Alpha: Turn SEC filing language into reproducible risk scores" width="720">
+ </p>
+
+ <p align="center">
+   <a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.11+-blue.svg" alt="Python 3.11+"></a>
+   <a href="https://pypi.org/project/disclosure-alpha/"><img src="https://img.shields.io/pypi/v/disclosure-alpha.svg" alt="PyPI"></a>
+   <a href="LICENSE"><img src="https://img.shields.io/badge/License-Apache--2.0-green.svg" alt="License: Apache-2.0"></a>
+   <a href="https://disclosure-alpha.readthedocs.io/en/stable/"><img src="https://img.shields.io/badge/docs-readthedocs-blue.svg" alt="Documentation"></a>
+   <a href="https://github.com/alwank/disclosure-alpha/actions/workflows/ci.yml"><img src="https://github.com/alwank/disclosure-alpha/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
+ </p>
+
+ <p align="center">
+   Extract sections, measure tone and boilerplate, detect year-over-year changes, and screen peers.<br>
+   Deterministic, versioned JSON. <strong>No LLM required</strong>.
+ </p>
+
+ <p align="center">
+   <strong><a href="https://disclosure-alpha.readthedocs.io/en/stable/getting-started/index.html">Get started</a></strong>
+ </p>
+
+ ## What it is
+
+ Open-source, deterministic SEC filing analytics for **10-K, 10-Q, and 8-K** HTML. Reproducible JSON scores from text metrics, boolean risk flags, and section diffs. Self-hosted CLI, Python SDK, HTTP API, and MCP.
+
+ ## What it is not
+
+ - Not investment advice or a trading signal
+ - Not a substitute for reading the filing
+ - Not composite LLM scoring (open-source HTTP API is deterministic only; `view=composite` returns 402)
+
+ Full scope and limits: [Evidence & limitations](https://disclosure-alpha.readthedocs.io/en/stable/validation/evidence-and-limitations.html).
+
+ ## Why Disclosure Alpha
+
+ Comparing risk-factor and MD&A language across filings, or against a company's prior year, is slow manual work. Disclosure Alpha extracts SEC sections, runs reproducible text metrics and diffs, and returns sortable JSON scores you can wire into notebooks, screeners, or agents. The same deterministic engine powers every integration surface, with version strings in every response for reproducibility.
+
+ ## What you can do
+
+ Disclosure Alpha delivers deterministic scores (nine components, 0-100), section extraction from 10-K/10-Q/8-K HTML, year-over-year change detection, and four integration surfaces ([section taxonomy](https://disclosure-alpha.readthedocs.io/en/stable/reference/section-taxonomy.html)).
+
+ | Task | How |
+ |------|-----|
+ | Score one company | `disclosure-alpha score --ticker AAPL --fiscal-year 2025 --form 10-K` |
+ | Screen up to 25 tickers | HTTP `POST /v1/panel/disclosure-matrix` |
+ | Compare year-over-year | `--prior-html prior.html` or HTTP `compare=prior` |
+ | Work offline (no EDGAR) | `disclosure-alpha score --html filing.html --form 10-K` |
+ | Inspect raw signals | `disclosure-alpha metrics …` or `GET /disclosure-metrics` |
+ | Pull boolean risk flags | `GET /disclosure-flags` |
+ | Debug section extraction | `disclosure-alpha extract …` or `GET /sections` |
+
+ ```bash
+ # Screen a peer set (start disclosure-alpha-api first)
+ curl -s -X POST "http://localhost:8000/v1/panel/disclosure-matrix" \
+   -H "Content-Type: application/json" \
+   -d '{"tickers": ["AAPL", "MSFT", "GOOGL"], "fiscal_year": 2025, "form_type": "10-K"}'
+
+ # Year-over-year change from local HTML (no network required)
+ disclosure-alpha score --html current.html --form 10-K --prior-html prior.html
+
+ # Raw metrics without headline aggregation
+ disclosure-alpha metrics --ticker AAPL --fiscal-year 2025 --form 10-K
+ ```
+
+ Copy-paste recipes: [Workflows](https://disclosure-alpha.readthedocs.io/en/stable/guides/workflows/index.html).
+
+ ## How it works
+
+ Same pipeline powers every integration surface.
+
+ ```mermaid
+ flowchart TB
+     ingest["Ingest (HTML or EDGAR)"]
+     extract["extract_sections_from_html()"]
+     metrics["compute_section_metrics()"]
+     aggregate["aggregate_deterministic_matrix()"]
+     output["ScoreResult JSON"]
+
+     ingest --> extract
+     extract --> metrics
+     metrics --> aggregate
+     aggregate --> output
+
+     subgraph deterministic ["Deterministic stage"]
+         metrics
+     end
+ ```
+
+ ## Score signals
+
+ Nine weighted components (0-100; higher = more disclosure risk) feed the headline `overall_disclosure_risk_score`:
+
+ | Signal | What it captures |
+ |--------|------------------|
+ | Risk-factor intensity | Negative and uncertainty tone in Item 1A |
+ | Disclosure change | Year-over-year language shift vs prior filing |
+ | MD&A uncertainty | Demand stress and margin pressure in MD&A |
+ | Legal / regulatory risk | Investigation and litigation language + flags |
+ | Liquidity stress | Covenant and cash-flow stress signals |
+ | Boilerplate | Vague, templated risk language |
+ | Internal controls | Weakness signals in controls disclosures |
+ | Event severity | Material changes in risk text (diff-only) |
+ | Tone negativity | Cross-section negative language |
+
+ **Scale:** 0-25 low concern · 26-50 moderate · 51-75 elevated · 76-100 high. Higher = more disclosure risk, except `specificity_quality_score` (higher = more specific).
+
+ `specificity_quality_score` is also returned but is excluded from headline weights. Full field guide: [Understanding scores](https://disclosure-alpha.readthedocs.io/en/stable/getting-started/understanding-scores.html).
+
+ ## Who it's for
+
+ | You are… | Start with… |
+ |----------|-------------|
+ | Researcher / notebook user | CLI or Python SDK |
+ | Building a screener or dashboard | HTTP API + Panel |
+ | Wiring Cursor / Claude | MCP Analyst |
+ | Custom agent pipeline | MCP Builder |
+
+ Not sure? See [Choose your surface](https://disclosure-alpha.readthedocs.io/en/stable/getting-started/choose-your-surface.html).
+
+ ## Quick start
+
+ Requires **Python 3.11+**.
+
+ **1. Install from PyPI**
+
+ ```bash
+ pip install "disclosure-alpha[dev]"
+ ```
+
+ For HTTP API and MCP: `pip install "disclosure-alpha[api,mcp,dev]"`. Full install options: [Installation](https://disclosure-alpha.readthedocs.io/en/stable/getting-started/installation.html).
+
+ **2. Set your SEC User-Agent**
+
+ ```bash
+ export SEC_USER_AGENT="YourName your@email.com"
+ ```
+
+ Required for ticker/EDGAR commands. See [SEC EDGAR setup](https://disclosure-alpha.readthedocs.io/en/stable/getting-started/sec-edgar-setup.html).
+
+ **3. Score a filing**
+
+ ```bash
+ disclosure-alpha score --ticker AAPL --fiscal-year 2025 --form 10-K \
+   | jq '.scores.overall_disclosure_risk_score'
+ ```
+
+ ```python
+ from disclosure_alpha import score_filing_ticker
+ result = score_filing_ticker("AAPL", 2025, form_type="10-K")
+ print(result.scores.overall_disclosure_risk_score)
+ ```
+
+ ## Integrate your way
+
+ | Surface | Entry | Granularity |
+ |---------|-------|-------------|
+ | CLI | `disclosure-alpha` | `extract` → `metrics` → `score` (stepwise or full pipeline) |
+ | Python | `import disclosure_alpha` | Same pipeline as CLI; compose in notebooks |
+ | HTTP API | `disclosure-alpha-api` | 8 endpoints: filings, sections, metrics, matrix, flags, changes, panel, health |
+ | MCP Analyst | `disclosure-alpha-mcp-analyst` | Ticker discovery + score (2 tools) |
+ | MCP Builder | `disclosure-alpha-mcp-builder` | Full pipeline as 5 composable tools |
+
+ HTTP matrix tiers: `tier=lite` (headline score), `tier=standard` (components + metrics), `tier=analyst` (provenance for audit).
+
+ ```bash
+ # Single-ticker dashboard headline (start disclosure-alpha-api first)
+ curl "http://localhost:8000/v1/company/AAPL/disclosure-matrix?fiscal_year=2025&form_type=10-K&tier=lite"
+
+ disclosure-alpha-api # HTTP on :8000
+ disclosure-alpha-mcp-analyst # MCP for Cursor / Claude Desktop
+ ```
+
+ Endpoint map, Postman collections (`docs/postman/`), and MCP tool reference: **[Guides](https://disclosure-alpha.readthedocs.io/en/stable/guides/index.html)**.
+
+ ## MCP in Cursor
+
+ Add to your MCP settings (Analyst bundle; requires `pip install "disclosure-alpha[mcp,dev]"`):
+
+ ```json
+ {
+   "mcpServers": {
+     "disclosure-alpha": {
+       "command": "disclosure-alpha-mcp-analyst",
+       "env": {
+         "SEC_USER_AGENT": "YourName your@email.com"
+       }
+     }
+   }
+ }
+ ```
+
+ Full MCP guide: [MCP](https://disclosure-alpha.readthedocs.io/en/stable/guides/mcp/index.html) (Builder bundle for raw HTML pipelines).
+
+ ## Research-backed
+
+ Validated on **~425 S&P 500 FY2025 10-Ks** (~84% of the index):
+
+ | Check | Result |
+ |-------|--------|
+ | Language quality | Boilerplate and specificity scores correlate with independent text measures (Spearman ρ ~0.68 / ~0.84) |
+ | Real-world signal | Higher disclosure risk scores associate with higher 90-day post-filing volatility in the same cohort |
+
+ Metrics draw on finance text-analysis literature (Loughran-McDonald tone proxies, boilerplate and specificity measures). See [Research foundation](https://disclosure-alpha.readthedocs.io/en/stable/methodology/research-foundation.html).
+
+ Research tool, not investment advice. Read the underlying filings. Full scope and limits: **[Evidence & limitations](https://disclosure-alpha.readthedocs.io/en/stable/validation/evidence-and-limitations.html)**.
+
+ ## Example output
+
+ See [Understanding scores](https://disclosure-alpha.readthedocs.io/en/stable/getting-started/understanding-scores.html) for field definitions.
+
+ **Single filing score** (synthetic 10-K):
+
+ ```json
+ {
+   "scores": {
+     "overall_disclosure_risk_score": 17.84,
+     "score_coverage_ratio": 0.7778,
+     "components": {
+       "risk_factor_intensity_score": 8.62,
+       "boilerplate_risk_score": 42.53,
+       "legal_regulatory_risk_score": 25.34
+     }
+   }
+ }
+ ```
+
+ More examples (YoY change, panel screener): [`docs/examples/`](docs/examples/) and [Workflows](https://disclosure-alpha.readthedocs.io/en/stable/guides/workflows/index.html).
+
+ ## Documentation
+
+ | I want to… | Start here |
+ |------------|------------|
+ | Copy-paste recipes | [Workflows](https://disclosure-alpha.readthedocs.io/en/stable/guides/workflows/index.html) |
+ | Interpret scores | [Understanding scores](https://disclosure-alpha.readthedocs.io/en/stable/getting-started/understanding-scores.html) |
+ | Score from terminal | [Quickstart CLI](https://disclosure-alpha.readthedocs.io/en/stable/getting-started/quickstart-cli.html) |
+ | Build a screener | [HTTP guides](https://disclosure-alpha.readthedocs.io/en/stable/guides/http/index.html) |
+ | Wire an agent | [MCP guide](https://disclosure-alpha.readthedocs.io/en/stable/guides/mcp/index.html) |
+ | See methodology | [Methodology overview](https://disclosure-alpha.readthedocs.io/en/stable/methodology/overview.html) |
+
+ ## License
+
+ Apache-2.0. See [LICENSE](LICENSE).
+
+ ## Contributors
+
+ See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, tests, and docs build.
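Reviewer note: the README states its scale in whole numbers (0-25 low concern, 26-50 moderate, 51-75 elevated, 76-100 high), while its example output shows fractional scores such as 17.84. One way to band a fractional score is sketched below. The function name `score_band` is hypothetical, and treating each listed upper bound as inclusive (so 25.5 lands in "moderate") is an assumption; the package may round or band differently.

```python
def score_band(score: float) -> str:
    """Map a 0-100 disclosure risk score to the README's four labels.

    Hypothetical helper. Assumption: each upper bound (25, 50, 75) is
    inclusive, so values between two listed ranges fall into the higher band.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be within 0-100, got {score}")
    if score <= 25:
        return "low concern"
    if score <= 50:
        return "moderate"
    if score <= 75:
        return "elevated"
    return "high"


if __name__ == "__main__":
    # Values taken from the README's synthetic example output.
    for name, value in [
        ("overall_disclosure_risk_score", 17.84),
        ("boilerplate_risk_score", 42.53),
        ("legal_regulatory_risk_score", 25.34),
    ]:
        print(f"{name}: {value} -> {score_band(value)}")
```

Per the README, this reading applies to the risk-oriented scores only; `specificity_quality_score` runs the other way (higher means more specific), so it should not be passed through the same labels.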