ossuary-risk 0.1.0__tar.gz

Files changed (34)
  1. ossuary_risk-0.1.0/.env.example +21 -0
  2. ossuary_risk-0.1.0/.gitignore +67 -0
  3. ossuary_risk-0.1.0/PKG-INFO +241 -0
  4. ossuary_risk-0.1.0/README.md +201 -0
  5. ossuary_risk-0.1.0/docs/methodology.md +353 -0
  6. ossuary_risk-0.1.0/pyproject.toml +76 -0
  7. ossuary_risk-0.1.0/scripts/t1_comparison.py +296 -0
  8. ossuary_risk-0.1.0/scripts/validate.py +1360 -0
  9. ossuary_risk-0.1.0/src/ossuary/__init__.py +7 -0
  10. ossuary_risk-0.1.0/src/ossuary/api/__init__.py +1 -0
  11. ossuary_risk-0.1.0/src/ossuary/api/main.py +173 -0
  12. ossuary_risk-0.1.0/src/ossuary/cli.py +309 -0
  13. ossuary_risk-0.1.0/src/ossuary/collectors/__init__.py +8 -0
  14. ossuary_risk-0.1.0/src/ossuary/collectors/base.py +26 -0
  15. ossuary_risk-0.1.0/src/ossuary/collectors/git.py +231 -0
  16. ossuary_risk-0.1.0/src/ossuary/collectors/github.py +495 -0
  17. ossuary_risk-0.1.0/src/ossuary/collectors/npm.py +113 -0
  18. ossuary_risk-0.1.0/src/ossuary/collectors/pypi.py +118 -0
  19. ossuary_risk-0.1.0/src/ossuary/db/__init__.py +15 -0
  20. ossuary_risk-0.1.0/src/ossuary/db/models.py +197 -0
  21. ossuary_risk-0.1.0/src/ossuary/db/session.py +49 -0
  22. ossuary_risk-0.1.0/src/ossuary/scoring/__init__.py +16 -0
  23. ossuary_risk-0.1.0/src/ossuary/scoring/engine.py +318 -0
  24. ossuary_risk-0.1.0/src/ossuary/scoring/factors.py +175 -0
  25. ossuary_risk-0.1.0/src/ossuary/scoring/reputation.py +326 -0
  26. ossuary_risk-0.1.0/src/ossuary/sentiment/__init__.py +5 -0
  27. ossuary_risk-0.1.0/src/ossuary/sentiment/analyzer.py +232 -0
  28. ossuary_risk-0.1.0/tests/__init__.py +1 -0
  29. ossuary_risk-0.1.0/tests/test_scoring.py +168 -0
  30. ossuary_risk-0.1.0/validation_results.json +250 -0
  31. ossuary_risk-0.1.0/validation_results_100.json +2438 -0
  32. ossuary_risk-0.1.0/validation_results_cleaned.json +566 -0
  33. ossuary_risk-0.1.0/validation_results_expanded.json +634 -0
  34. ossuary_risk-0.1.0/validation_results_expanded_v2.json +1334 -0
@@ -0,0 +1,21 @@
1
+ # GitHub API token (required for higher rate limits)
2
+ GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
3
+
4
+ # Database configuration
5
+ DATABASE_URL=postgresql://ossuary:ossuary@localhost:5432/ossuary
6
+
7
+ # For SQLite during development (comment out DATABASE_URL above)
8
+ # DATABASE_URL=sqlite:///./ossuary.db
9
+
10
+ # API configuration
11
+ API_HOST=0.0.0.0
12
+ API_PORT=8000
13
+
14
+ # Logging
15
+ LOG_LEVEL=INFO
16
+
17
+ # Repository storage path (where we clone repos for analysis)
18
+ REPOS_PATH=./repos
19
+
20
+ # Analysis settings
21
+ ANALYSIS_CACHE_HOURS=24
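A minimal sketch of how an application might read the variables in this example file, using only the standard library. The `Settings` class, the `load_settings` function, and the fallback defaults are assumptions for illustration; the package declares a dependency on pydantic-settings and may load its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container mirroring the keys in .env.example."""

    github_token: Optional[str]
    database_url: str
    api_host: str
    api_port: int
    log_level: str
    repos_path: str
    analysis_cache_hours: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    return Settings(
        # An empty token is treated as "no token" (unauthenticated rate limits).
        github_token=env.get("GITHUB_TOKEN") or None,
        # Assumed fallback: the SQLite URL shown commented out in .env.example.
        database_url=env.get("DATABASE_URL", "sqlite:///./ossuary.db"),
        api_host=env.get("API_HOST", "0.0.0.0"),
        api_port=int(env.get("API_PORT", "8000")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        repos_path=env.get("REPOS_PATH", "./repos"),
        analysis_cache_hours=int(env.get("ANALYSIS_CACHE_HOURS", "24")),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test without touching the real environment.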
@@ -0,0 +1,67 @@
1
+ # Python
2
+ __pycache__/
3
+ *.py[cod]
4
+ *$py.class
5
+ *.so
6
+ .Python
7
+ build/
8
+ develop-eggs/
9
+ dist/
10
+ downloads/
11
+ eggs/
12
+ .eggs/
13
+ lib/
14
+ lib64/
15
+ parts/
16
+ sdist/
17
+ var/
18
+ wheels/
19
+ *.egg-info/
20
+ .installed.cfg
21
+ *.egg
22
+
23
+ # Virtual environments
24
+ .venv/
25
+ venv/
26
+ ENV/
27
+
28
+ # IDE
29
+ .idea/
30
+ .vscode/
31
+ *.swp
32
+ *.swo
33
+ *~
34
+
35
+ # Testing
36
+ .coverage
37
+ .pytest_cache/
38
+ htmlcov/
39
+ .tox/
40
+ .nox/
41
+
42
+ # Type checking
43
+ .mypy_cache/
44
+
45
+ # Environment
46
+ .env
47
+ .env.local
48
+ .env.*.local
49
+
50
+ # Database
51
+ *.db
52
+ *.sqlite3
53
+
54
+ # Logs
55
+ *.log
56
+ logs/
57
+
58
+ # Cloned repos for analysis
59
+ repos/
60
+
61
+ # Local data
62
+ data/
63
+ *.json.local
64
+
65
+ # OS
66
+ .DS_Store
67
+ Thumbs.db
@@ -0,0 +1,241 @@
1
+ Metadata-Version: 2.4
2
+ Name: ossuary-risk
3
+ Version: 0.1.0
4
+ Summary: OSS Supply Chain Risk Scoring - Where abandoned packages come to rest
5
+ Project-URL: Homepage, https://github.com/anicka-net/ossuary
6
+ Project-URL: Repository, https://github.com/anicka-net/ossuary
7
+ Project-URL: Documentation, https://github.com/anicka-net/ossuary/blob/main/docs/methodology.md
8
+ Author: Anicka
9
+ License-Expression: MIT
10
+ Keywords: oss,risk,scoring,security,supply-chain
11
+ Classifier: Development Status :: 3 - Alpha
12
+ Classifier: Intended Audience :: Developers
13
+ Classifier: License :: OSI Approved :: MIT License
14
+ Classifier: Programming Language :: Python :: 3
15
+ Classifier: Programming Language :: Python :: 3.11
16
+ Classifier: Programming Language :: Python :: 3.12
17
+ Classifier: Topic :: Security
18
+ Requires-Python: >=3.11
19
+ Requires-Dist: alembic>=1.13.0
20
+ Requires-Dist: fastapi>=0.109.0
21
+ Requires-Dist: gitpython>=3.1.0
22
+ Requires-Dist: httpx>=0.26.0
23
+ Requires-Dist: psycopg2-binary>=2.9.0
24
+ Requires-Dist: pydantic-settings>=2.1.0
25
+ Requires-Dist: pydantic>=2.5.0
26
+ Requires-Dist: rich>=13.0.0
27
+ Requires-Dist: sqlalchemy>=2.0.0
28
+ Requires-Dist: textblob>=0.18.0
29
+ Requires-Dist: typer>=0.9.0
30
+ Requires-Dist: uvicorn[standard]>=0.27.0
31
+ Requires-Dist: vadersentiment>=3.3.0
32
+ Provides-Extra: dev
33
+ Requires-Dist: httpx>=0.26.0; extra == 'dev'
34
+ Requires-Dist: mypy>=1.8.0; extra == 'dev'
35
+ Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
36
+ Requires-Dist: pytest-cov>=4.1.0; extra == 'dev'
37
+ Requires-Dist: pytest>=7.4.0; extra == 'dev'
38
+ Requires-Dist: ruff>=0.1.0; extra == 'dev'
39
+ Description-Content-Type: text/markdown
40
+
41
+ # Ossuary
42
+
43
+ **OSS Supply Chain Risk Scoring** - Where abandoned packages come to rest.
44
+
45
+ Ossuary analyzes open source packages to identify governance-based supply chain risks before incidents occur. It calculates a risk score based on maintainer concentration, activity levels, and protective factors.
46
+
47
+ ## What It Detects
48
+
49
+ Ossuary focuses on **governance failures** - the type of vulnerability that enabled attacks like:
50
+
51
+ - **event-stream** (2018) - Abandoned package handed off to malicious maintainer
52
+ - **colors/faker** (2022) - Frustrated maintainer intentionally sabotaged packages
53
+
54
+ ### Detection Capabilities
55
+
56
+ | Can Detect | Cannot Detect |
57
+ |------------|---------------|
58
+ | Maintainer abandonment | Account compromise (like ua-parser-js) |
59
+ | High concentration risk | Dependency confusion attacks |
60
+ | Economic frustration signals | Typosquatting |
61
+ | Declining activity trends | Malicious code injection |
62
+ | Governance centralization | |
63
+
64
+ ## Quick Start
65
+
66
+ ```bash
67
+ # Install
68
+ pip install ossuary-risk
69
+
70
+ # Initialize database (optional, for caching)
71
+ ossuary init
72
+
73
+ # Score a package
74
+ ossuary score event-stream --ecosystem npm
75
+
76
+ # Score with historical cutoff (T-1 analysis)
77
+ ossuary score event-stream --ecosystem npm --cutoff 2018-09-01
78
+
79
+ # Output as JSON
80
+ ossuary score requests --ecosystem pypi --json
81
+ ```
82
+
83
+ ## Risk Levels
84
+
85
+ | Score | Level | Semaphore | Action |
86
+ |-------|-------|-----------|--------|
87
+ | 0-20 | Very Low | 🟢 | Routine monitoring |
88
+ | 21-40 | Low | 🟢 | Quarterly review |
89
+ | 41-60 | Moderate | 🟡 | Monthly review |
90
+ | 61-80 | High | 🟠 | Weekly review + contingency plan |
91
+ | 81-100 | Critical | 🔴 | Immediate action required |
92
+
93
+ ## Scoring Methodology
94
+
95
+ ```
96
+ Final Score = Base Risk + Activity Modifier + Protective Factors
97
+               (20-100)    (-30 to +20)        (-100 to +20)
98
+ ```
99
+
100
+ ### Base Risk (Maintainer Concentration)
101
+
102
+ | Concentration | Points |
103
+ |---------------|--------|
104
+ | <30% | 20 |
105
+ | 30-50% | 40 |
106
+ | 50-70% | 60 |
107
+ | 70-90% | 80 |
108
+ | >90% | 100 |
109
+
110
+ ### Activity Modifier
111
+
112
+ | Commits/Year | Points |
113
+ |--------------|--------|
114
+ | >50 | -30 |
115
+ | 12-50 | -15 |
116
+ | 4-11 | 0 |
117
+ | <4 | +20 |
118
+
119
+ ### Protective Factors
120
+
121
+ | Factor | Points |
122
+ |--------|--------|
123
+ | Tier-1 maintainer (500+ repos or 100K+ stars) | -25 |
124
+ | GitHub Sponsors enabled | -15 |
125
+ | Organization with 3+ admins | -15 |
126
+ | >50M weekly downloads | -20 |
127
+ | >10M weekly downloads | -10 |
128
+ | <40% concentration | -10 |
129
+ | >20 contributors | -10 |
130
+ | CII Best Practices badge | -10 |
131
+ | **Frustration signals detected** | **+20** |
132
+
133
+ ## API Usage
134
+
135
+ Start the API server:
136
+
137
+ ```bash
138
+ uvicorn ossuary.api.main:app --host 0.0.0.0 --port 8000
139
+ ```
140
+
141
+ Query a package:
142
+
143
+ ```bash
144
+ curl "http://localhost:8000/score/npm/event-stream"
145
+ ```
146
+
147
+ Response:
148
+
149
+ ```json
150
+ {
151
+   "package": "event-stream",
152
+   "ecosystem": "npm",
153
+   "score": 100,
154
+   "risk_level": "CRITICAL",
155
+   "semaphore": "🔴",
156
+   "explanation": "🔴 CRITICAL (100). Critical concentration (90%): single person controls nearly all commits. Project appears abandoned (<4 commits/year).",
157
+   "recommendations": [
158
+     "IMMEDIATE: Identify alternative packages or prepare to fork",
159
+     "Do not accept new versions without manual code review"
160
+   ]
161
+ }
162
+ ```
163
+
164
+ ## Development
165
+
166
+ ```bash
167
+ # Clone
168
+ git clone https://github.com/anicka-net/ossuary.git
169
+ cd ossuary
170
+
171
+ # Install with dev dependencies
172
+ pip install -e ".[dev]"
173
+
174
+ # Run tests
175
+ pytest
176
+
177
+ # Run linter
178
+ ruff check src/
179
+
180
+ # Type check
181
+ mypy src/
182
+ ```
183
+
184
+ ## Configuration
185
+
186
+ Environment variables:
187
+
188
+ ```bash
189
+ # Required for higher GitHub API rate limits
190
+ GITHUB_TOKEN=ghp_xxxxxxxxxxxxx
191
+
192
+ # Database (defaults to SQLite)
193
+ DATABASE_URL=postgresql://user:pass@localhost/ossuary
194
+
195
+ # Repository storage
196
+ REPOS_PATH=./repos
197
+ ```
198
+
199
+ ## Architecture
200
+
201
+ ```
202
+ ┌──────────────────────────────────────────────────────────────┐
203
+ │                          API / CLI                           │
204
+ └──────────────────────────────────────────────────────────────┘
205
+                                │
206
+                                ▼
207
+ ┌──────────────────────────────────────────────────────────────┐
208
+ │                        Scoring Engine                        │
209
+ │  - Base risk (concentration)                                 │
210
+ │  - Activity modifier                                         │
211
+ │  - Protective factors                                        │
212
+ │  - Sentiment analysis                                        │
213
+ └──────────────────────────────────────────────────────────────┘
214
+                                │
215
+                                ▼
216
+ ┌──────────────────────────────────────────────────────────────┐
217
+ │                       Data Collectors                        │
218
+ │ GitCollector | GitHubCollector | NpmCollector | PyPICollector│
219
+ └──────────────────────────────────────────────────────────────┘
220
+ ```
221
+
222
+ ## Validation
223
+
224
+ Validated on 93 packages (20 incidents + 73 controls):
225
+
226
+ - **Accuracy**: 91.4%
227
+ - **Precision**: 92.9%
228
+ - **Recall**: 65.0%
229
+ - **F1 Score**: 0.76
230
+
231
+ T-1 analysis confirms **100% predictive detection** of governance-detectable incidents before they occurred.
232
+
233
+ See [methodology documentation](docs/methodology.md) for details.
234
+
235
+ ## License
236
+
237
+ MIT
238
+
239
+ ## Academic Context
240
+
241
+ This project supports MBA thesis research on OSS supply chain risk. Key contribution: demonstrating that meaningful risk indicators are observable in public metadata before incidents occur.
@@ -0,0 +1,201 @@
1
+ # Ossuary
2
+
3
+ **OSS Supply Chain Risk Scoring** - Where abandoned packages come to rest.
4
+
5
+ Ossuary analyzes open source packages to identify governance-based supply chain risks before incidents occur. It calculates a risk score based on maintainer concentration, activity levels, and protective factors.
6
+
7
+ ## What It Detects
8
+
9
+ Ossuary focuses on **governance failures** - the type of vulnerability that enabled attacks like:
10
+
11
+ - **event-stream** (2018) - Abandoned package handed off to malicious maintainer
12
+ - **colors/faker** (2022) - Frustrated maintainer intentionally sabotaged packages
13
+
14
+ ### Detection Capabilities
15
+
16
+ | Can Detect | Cannot Detect |
17
+ |------------|---------------|
18
+ | Maintainer abandonment | Account compromise (like ua-parser-js) |
19
+ | High concentration risk | Dependency confusion attacks |
20
+ | Economic frustration signals | Typosquatting |
21
+ | Declining activity trends | Malicious code injection |
22
+ | Governance centralization | |
23
+
24
+ ## Quick Start
25
+
26
+ ```bash
27
+ # Install
28
+ pip install ossuary-risk
29
+
30
+ # Initialize database (optional, for caching)
31
+ ossuary init
32
+
33
+ # Score a package
34
+ ossuary score event-stream --ecosystem npm
35
+
36
+ # Score with historical cutoff (T-1 analysis)
37
+ ossuary score event-stream --ecosystem npm --cutoff 2018-09-01
38
+
39
+ # Output as JSON
40
+ ossuary score requests --ecosystem pypi --json
41
+ ```
42
+
43
+ ## Risk Levels
44
+
45
+ | Score | Level | Semaphore | Action |
46
+ |-------|-------|-----------|--------|
47
+ | 0-20 | Very Low | 🟢 | Routine monitoring |
48
+ | 21-40 | Low | 🟢 | Quarterly review |
49
+ | 41-60 | Moderate | 🟡 | Monthly review |
50
+ | 61-80 | High | 🟠 | Weekly review + contingency plan |
51
+ | 81-100 | Critical | 🔴 | Immediate action required |
52
+
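The risk-level table above can be read as a simple lookup over score bands. The following is a sketch of that lookup, not the package's own code; the names `RISK_BANDS` and `risk_level` are hypothetical.

```python
# (upper bound inclusive, level, recommended action), taken from the table above.
RISK_BANDS = [
    (20, "Very Low", "Routine monitoring"),
    (40, "Low", "Quarterly review"),
    (60, "Moderate", "Monthly review"),
    (80, "High", "Weekly review + contingency plan"),
    (100, "Critical", "Immediate action required"),
]


def risk_level(score: int) -> tuple[str, str]:
    """Return (level, action) for a score in the 0-100 range."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for upper, level, action in RISK_BANDS:
        if score <= upper:
            return level, action
    raise AssertionError("unreachable: the bands cover 0-100")
```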
53
+ ## Scoring Methodology
54
+
55
+ ```
56
+ Final Score = Base Risk + Activity Modifier + Protective Factors
57
+               (20-100)    (-30 to +20)        (-100 to +20)
58
+ ```
59
+
60
+ ### Base Risk (Maintainer Concentration)
61
+
62
+ | Concentration | Points |
63
+ |---------------|--------|
64
+ | <30% | 20 |
65
+ | 30-50% | 40 |
66
+ | 50-70% | 60 |
67
+ | 70-90% | 80 |
68
+ | >90% | 100 |
69
+
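As an illustration of the base-risk table above, the sketch below assumes that "concentration" means the share of commits made by the single most active author, which is how the sample API explanation phrases it ("single person controls nearly all commits"). The function names are hypothetical, and the handling of the exact boundary values (30, 50, 70, 90), which the table leaves ambiguous, is an assumption.

```python
from collections import Counter
from typing import Sequence


def concentration(commit_authors: Sequence[str]) -> float:
    """Percentage of commits made by the most active author (assumed definition)."""
    if not commit_authors:
        raise ValueError("need at least one commit")
    top_count = Counter(commit_authors).most_common(1)[0][1]
    return 100.0 * top_count / len(commit_authors)


def base_risk(concentration_pct: float) -> int:
    """Map a concentration percentage to base-risk points per the table above."""
    # Assumption: a boundary value falls into the higher band, except 90,
    # which stays at 80 because the top band is written as ">90%".
    if concentration_pct < 30:
        return 20
    if concentration_pct < 50:
        return 40
    if concentration_pct < 70:
        return 60
    if concentration_pct <= 90:
        return 80
    return 100
```

For example, a history with 19 commits by one author and 1 by another gives a concentration of 95%, which lands in the top band.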
70
+ ### Activity Modifier
71
+
72
+ | Commits/Year | Points |
73
+ |--------------|--------|
74
+ | >50 | -30 |
75
+ | 12-50 | -15 |
76
+ | 4-11 | 0 |
77
+ | <4 | +20 |
78
+
79
+ ### Protective Factors
80
+
81
+ | Factor | Points |
82
+ |--------|--------|
83
+ | Tier-1 maintainer (500+ repos or 100K+ stars) | -25 |
84
+ | GitHub Sponsors enabled | -15 |
85
+ | Organization with 3+ admins | -15 |
86
+ | >50M weekly downloads | -20 |
87
+ | >10M weekly downloads | -10 |
88
+ | <40% concentration | -10 |
89
+ | >20 contributors | -10 |
90
+ | CII Best Practices badge | -10 |
91
+ | **Frustration signals detected** | **+20** |
92
+
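Putting the three tables together, the formula can be sketched as below. This is an illustration under stated assumptions rather than the package's engine: the factor keys and function names are hypothetical, the result is clamped to 0-100 to match the risk-level scale, and the sketch simply sums whichever factors the caller passes in (whether the two download tiers stack is not stated in the README).

```python
# Points per factor, copied from the protective-factors table; keys are hypothetical.
PROTECTIVE_POINTS = {
    "tier1_maintainer": -25,
    "github_sponsors": -15,
    "org_with_3_admins": -15,
    "downloads_over_50m": -20,
    "downloads_over_10m": -10,
    "concentration_under_40": -10,
    "over_20_contributors": -10,
    "cii_badge": -10,
    "frustration_signals": +20,
}


def activity_modifier(commits_per_year: int) -> int:
    """Map yearly commit count to points per the activity-modifier table."""
    if commits_per_year > 50:
        return -30
    if commits_per_year >= 12:
        return -15
    if commits_per_year >= 4:
        return 0
    return 20


def final_score(base_risk: int, commits_per_year: int, factors: set[str]) -> int:
    """Base risk + activity modifier + protective factors, clamped to 0-100."""
    factor_points = sum(PROTECTIVE_POINTS[name] for name in factors)
    raw = base_risk + activity_modifier(commits_per_year) + factor_points
    return max(0, min(100, raw))  # clamping is an assumption
```

A package with base risk 100 and fewer than 4 commits per year reaches a raw total of 120, which the clamp reports as 100. A base risk of 60 with more than 50 commits per year and GitHub Sponsors enabled gives 60 - 30 - 15 = 15.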
93
+ ## API Usage
94
+
95
+ Start the API server:
96
+
97
+ ```bash
98
+ uvicorn ossuary.api.main:app --host 0.0.0.0 --port 8000
99
+ ```
100
+
101
+ Query a package:
102
+
103
+ ```bash
104
+ curl "http://localhost:8000/score/npm/event-stream"
105
+ ```
106
+
107
+ Response:
108
+
109
+ ```json
110
+ {
111
+   "package": "event-stream",
112
+   "ecosystem": "npm",
113
+   "score": 100,
114
+   "risk_level": "CRITICAL",
115
+   "semaphore": "🔴",
116
+   "explanation": "🔴 CRITICAL (100). Critical concentration (90%): single person controls nearly all commits. Project appears abandoned (<4 commits/year).",
117
+   "recommendations": [
118
+     "IMMEDIATE: Identify alternative packages or prepare to fork",
119
+     "Do not accept new versions without manual code review"
120
+   ]
121
+ }
122
+ ```
123
+
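The same query can be made from Python. The sketch below uses only the standard library and assumes the server is running locally as shown above and returns the JSON shape from the sample response; `score_url`, `fetch_score`, and `summarize` are hypothetical helper names, not part of the package.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def score_url(ecosystem: str, package: str,
              base_url: str = "http://localhost:8000") -> str:
    """Build the /score/{ecosystem}/{package} URL, percent-encoding both parts."""
    return (f"{base_url.rstrip('/')}/score/"
            f"{quote(ecosystem, safe='')}/{quote(package, safe='')}")


def fetch_score(ecosystem: str, package: str,
                base_url: str = "http://localhost:8000",
                timeout: float = 30.0) -> dict:
    """GET the score endpoint and decode the JSON body."""
    with urlopen(score_url(ecosystem, package, base_url), timeout=timeout) as resp:
        return json.load(resp)


def summarize(payload: dict) -> str:
    """One-line summary built from the fields shown in the sample response."""
    return (f"{payload['package']} ({payload['ecosystem']}): "
            f"{payload['score']} {payload['risk_level']}")
```

With the server running, `summarize(fetch_score("npm", "event-stream"))` would print a single line per package, which is convenient when checking a list of dependencies in a loop.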
124
+ ## Development
125
+
126
+ ```bash
127
+ # Clone
128
+ git clone https://github.com/anicka-net/ossuary.git
129
+ cd ossuary
130
+
131
+ # Install with dev dependencies
132
+ pip install -e ".[dev]"
133
+
134
+ # Run tests
135
+ pytest
136
+
137
+ # Run linter
138
+ ruff check src/
139
+
140
+ # Type check
141
+ mypy src/
142
+ ```
143
+
144
+ ## Configuration
145
+
146
+ Environment variables:
147
+
148
+ ```bash
149
+ # Required for higher GitHub API rate limits
150
+ GITHUB_TOKEN=ghp_xxxxxxxxxxxxx
151
+
152
+ # Database (defaults to SQLite)
153
+ DATABASE_URL=postgresql://user:pass@localhost/ossuary
154
+
155
+ # Repository storage
156
+ REPOS_PATH=./repos
157
+ ```
158
+
159
+ ## Architecture
160
+
161
+ ```
162
+ ┌──────────────────────────────────────────────────────────────┐
163
+ │                          API / CLI                           │
164
+ └──────────────────────────────────────────────────────────────┘
165
+                                │
166
+                                ▼
167
+ ┌──────────────────────────────────────────────────────────────┐
168
+ │                        Scoring Engine                        │
169
+ │  - Base risk (concentration)                                 │
170
+ │  - Activity modifier                                         │
171
+ │  - Protective factors                                        │
172
+ │  - Sentiment analysis                                        │
173
+ └──────────────────────────────────────────────────────────────┘
174
+                                │
175
+                                ▼
176
+ ┌──────────────────────────────────────────────────────────────┐
177
+ │                       Data Collectors                        │
178
+ │ GitCollector | GitHubCollector | NpmCollector | PyPICollector│
179
+ └──────────────────────────────────────────────────────────────┘
180
+ ```
181
+
182
+ ## Validation
183
+
184
+ Validated on 93 packages (20 incidents + 73 controls):
185
+
186
+ - **Accuracy**: 91.4%
187
+ - **Precision**: 92.9%
188
+ - **Recall**: 65.0%
189
+ - **F1 Score**: 0.76
190
+
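The four figures above are mutually consistent with one particular confusion matrix. The counts in the sketch below are inferred from the reported percentages and the 20/73 split, not taken from the source: 65% recall on 20 incidents implies 13 true positives and 7 misses, and 92.9% precision then implies a single false positive among the 73 controls.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard accuracy, precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Inferred counts (assumption): 13 TP + 7 FN = 20 incidents, 1 FP + 72 TN = 73 controls.
metrics = classification_metrics(tp=13, fp=1, fn=7, tn=72)
```

These counts give accuracy 85/93, precision 13/14, recall 13/20, and F1 26/34, which round to the reported 91.4%, 92.9%, 65.0%, and 0.76.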
191
+ T-1 analysis confirms **100% predictive detection** of governance-detectable incidents before they occurred.
192
+
193
+ See [methodology documentation](docs/methodology.md) for details.
194
+
195
+ ## License
196
+
197
+ MIT
198
+
199
+ ## Academic Context
200
+
201
+ This project supports MBA thesis research on OSS supply chain risk. Key contribution: demonstrating that meaningful risk indicators are observable in public metadata before incidents occur.