ai-critic 0.2.4__tar.gz → 0.2.5__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,200 @@
1
+ Metadata-Version: 2.4
2
+ Name: ai-critic
3
+ Version: 0.2.5
4
+ Summary: Fast AI evaluator for scikit-learn models
5
+ Author-email: Luiz Seabra <filipedemarco@yahoo.com>
6
+ Requires-Python: >=3.9
7
+ Description-Content-Type: text/markdown
8
+ Requires-Dist: numpy
9
+ Requires-Dist: scikit-learn
10
+
11
+ # ai-critic 🧠: The Quality Gate for Machine Learning Models
12
+
13
+ **ai-critic** is a specialized **decision-making** tool designed to audit the reliability and readiness for deployment of scikit-learn compatible Machine Learning models.
14
+
15
+ Instead of just measuring performance (accuracy, F1 score), **ai-critic** acts as a "Quality Gate," probing the model for hidden risks that can lead to production failures, such as data leakage, structural overfitting, and vulnerability to noise.
16
+
17
+ ---
18
+
19
+ ## 🚀 1. Getting Started (The Basics)
20
+
21
+ This section is ideal for beginners who need a quick verdict on the health of their model.
22
+
23
+ ### 1.1. Installation
24
+
25
+ Install the library directly from PyPI:
26
+
27
+ ```bash
28
+ pip install ai-critic
29
+ ```
30
+
31
+ ### 1.2. The Quick Verdict
32
+
33
+ With just a few lines, you can get an executive evaluation and a deployment recommendation.
34
+
35
+ ```python
36
+ from ai_critic import AICritic
37
+ from sklearn.ensemble import RandomForestClassifier
38
+ from sklearn.datasets import make_classification
39
+
40
+ # 1. Prepare your data and model
41
+ X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
42
+ model = RandomForestClassifier(max_depth=5, random_state=42)
43
+
44
+ # 2. Initialize the critic
45
+ # AICritic performs all audits internally
46
+ critic = AICritic(model, X, y)
47
+
48
+ # 3. Obtain the Executive Summary
49
+ report = critic.evaluate(view="executive")
50
+
51
+ print(f"Verdict: {report['verdict']}")
52
+ print(f"Risk: {report['risk_level']}")
53
+ print(f"Reason Main: {report['main_reason']}")
54
+
55
+ # Expected output:
56
+
57
+ # Verdict: ✅ Acceptable
58
+ # Risk: Low
59
+ # Main Reason: No critic risks detected.
60
+
61
+ ```
62
+
63
+ ---
64
+
65
+ ## 💡 2. Understanding the Critique (The Intermediate)
66
+
67
+ For the data scientist who needs to understand *why* the model received a verdict and what the next steps are.
68
+
69
+ ### 2.1. The Four Pillars of the Audit
70
+
71
+ **ai-critic** evaluates your model across four critical dimensions.
+
+ | Category | Main Risk | Report Key (`details` view) |
+ | :--- | :--- | :--- |
+ | 🔍 **Data** | Target Leakage, NaNs, Imbalance | `data` |
+ | 🧱 **Structure** | Excessive Complexity, Overfitting | `config` |
+ | 📈 **Validation** | Suspicious CV Scores | `performance` |
+ | 🧪 **Robustness** | Noise Vulnerability | `robustness` |
+
+ The validation and robustness checks live in the `ai_critic.performance` and `ai_critic.robustness` modules, respectively.
77
+
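+ To inspect each pillar programmatically, the sketch below continues the quick-start example. It assumes the `"details"` view exposes the four report keys from the table above (`data`, `config`, `performance`, `robustness`), which is how the examples later in this README access them.
+
+ ```python
+ # Minimal sketch: list what each audit pillar reports.
+ # Assumes the `details` view exposes the keys shown in the table above.
+ details = critic.evaluate(view="details")
+
+ for key in ("data", "config", "performance", "robustness"):
+     section = details.get(key, {})
+     print(f"{key}: {sorted(section) if isinstance(section, dict) else section}")
+ ```
+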
78
+ ### 2.2. Visual and Technical Analysis
79
+
80
+ The `evaluate` method allows you to view the results and access the complete technical report.
81
+
82
+ ```python
83
+ # Continuing the previous example...
84
+
85
+ # 1. Generate the full report and visualizations
86
+ # plot=True generates Correlation, Learning Curve, and Robustness graphs
87
+ full_report = critic.evaluate(view="all", plot=True)
88
+
89
+ # 2. Access the Technical Summary for Recommendations
90
+ technical_summary = full_report["technical"]
91
+
92
+ print("\n--- Technical Recommendations ---")
93
+ for i, risk in enumerate(technical_summary["key_risks"]):
+     print(f"Risk {i+1}: {risk}")
+     print(f"Recommendation: {technical_summary['recommendations'][i]}")
96
+
97
+ # Example of a risk (if one were detected):
98
+ # Risk 1: The depth of the tree may be too high for the size of the dataset.
99
+
100
+ # Recommendation: Reduce model complexity or adjust hyperparameters.
+ ```
101
+
102
+
103
+ ### 2.3. Robustness Test
104
+
105
+ A robust model should maintain its performance even with small disturbances in the data. The `ai-critic` test assesses this by injecting noise into the input data.
106
+
107
+ ```python
108
+ # Accessing the specific result of the Robustness module
109
+ robustness_result = full_report["details"]["robustness"]
110
+
111
+ print("\n--- Robustness Test ---")
112
+ print(f"Original CV Score: {robustness_result['cv_score_original']:.4f}")
113
+ print(f"CV Score with Noise: {robustness_result['cv_score_noisy']:.4f}")
114
+ print(f"Performance Drop: {robustness_result['performance_drop']:.4f}")
115
+ print(f"Robustness Verdict: {robustness_result['verdict']}")
116
+
117
+ # Possible Verdicts:
118
+ # - Stable: Acceptable drop.
119
+
120
+ # - Fragile: Significant drop (risk).
121
+
122
+ # - Misleading: Original performance inflated by leakage.
123
+
124
+ ```
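+
+ To act on the verdict directly (for example, in a training script), a minimal sketch follows. The comparison is case-insensitive, since the verdict may be reported capitalized (as above) or lowercase (as `deploy_decision()` uses internally).
+
+ ```python
+ # Minimal sketch: treat a fragile or misleading model as a hard failure.
+ verdict = str(robustness_result["verdict"]).lower()
+
+ if verdict in ("fragile", "misleading"):
+     raise RuntimeError(f"Model rejected by robustness audit: {verdict}")
+
+ print("Robustness audit passed.")
+ ```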
125
+
126
+ ---
127
+
128
+ ## ⚙️ 3. Integration and Governance (The Advanced)
129
+
130
+ This section is for MLOps engineers and architects looking to integrate **ai-critic** into automated pipelines and create custom deployment logic.
131
+
132
+ ### 3.1. The Deployment Gate (`deploy_decision`)
133
+
134
+ The `deploy_decision()` method is the final control point. It returns a structured object that classifies problems into *Hard Blockers* (prevent deployment) and *Soft Blockers* (require attention, but can be accepted with reservations).
135
+
136
+ ```python
137
+ # Example of use in a CI/CD pipeline
138
+ decision = critic.deploy_decision()
139
+
140
+ if decision["deploy"]:
141
+ print("✅ Deployment Approved. Risk Level: Low.")
142
+ other:
143
+ print(f"❌ Deployment Blocked. Risk Level: {decision['risk_level'].upper()}")
144
+ print("Blocking Issues:")
145
+ for issue in decision["blocking_issues"]:
146
+ print(f"- {problem}")
147
+
148
+ # The decision object also includes a heuristic confidence score (0.0 to 1.0)
149
+ print(f"Heuristic Confidence in Model: {decision['confidence']:.2f}")
150
+
151
+ ```
152
+
153
+ ### 3.2. Accessing Raw Details
+
+ For custom *governance* rules or logic, you can access the raw data of each module through the `"details"` view.
154
+
155
+ ```python
156
+ # Accessing Data Leakage Details
157
+ data_details = critic.evaluate(view="details")["data"]
158
+
159
+ if data_details["data_leakage"]["suspected"]:
+     print("\n--- Data Leak Alert ---")
+     for detail in data_details["data_leakage"]["details"]:
+         print(f"Feature {detail['feature_index']} with correlation of {detail['correlation']:.4f}")
166
+
167
+ # Accessing Structural Overfitting Details
168
+ config_details = critic.evaluate(view="details")["config"]
169
+
170
+ if config_details["structural_warnings"]:
+     print("\n--- Structural Alert ---")
+     for warning in config_details["structural_warnings"]:
+         print(f"Warning: {warning['message']} (Max Depth: {warning['max_depth']}, Recommended: {warning['recommended_max_depth']})")
177
+ ```
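+
+ For the **Governance** use case in the table below (logging the details view for auditing), a minimal sketch is shown here. The file name `audit_log.json` is a hypothetical placeholder, and `default=str` is a defensive choice in case the report contains values (such as NumPy numbers) that the `json` module cannot serialize directly.
+
+ ```python
+ import json
+ from datetime import datetime, timezone
+
+ # Minimal sketch: persist the raw details view for audit/compliance logging.
+ record = {
+     "timestamp": datetime.now(timezone.utc).isoformat(),
+     "details": critic.evaluate(view="details"),
+ }
+
+ with open("audit_log.json", "w", encoding="utf-8") as fh:
+     json.dump(record, fh, indent=2, default=str)
+ ```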
178
+
179
+ ### 3.3. Best Practices and Use Cases
180
+
181
+ | Use | Recommended Action |
182
+ | :--- | :--- |
183
+ | **CI/CD** | Use `deploy_decision()` as an automated quality gate (see the sketch below this table). |
184
+ | **Tuning** | Use the technical view to guide hyperparameter optimization. |
185
+ | **Governance** | Log the details view for auditing and compliance. |
186
+ | **Communication** | Use the executive view to report risks to non-technical stakeholders. |
187
+
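+ As a concrete illustration of the **CI/CD** row above, a minimal gate script might look like the sketch below. The file name `quality_gate.py` and the `load_model()` / `load_data()` helpers are hypothetical placeholders for your own project code; only the `AICritic` and `deploy_decision()` calls come from this library.
+
+ ```python
+ # quality_gate.py -- minimal CI sketch (hypothetical file and helper names)
+ import sys
+
+ from ai_critic import AICritic
+ from my_project import load_model, load_data  # hypothetical project helpers
+
+
+ def main() -> int:
+     model = load_model()
+     X, y = load_data()
+
+     decision = AICritic(model, X, y).deploy_decision()
+     if decision["deploy"]:
+         print(f"Quality gate passed (confidence {decision['confidence']:.2f}).")
+         return 0
+
+     print(f"Quality gate failed (risk level: {decision['risk_level']}).")
+     for issue in decision["blocking_issues"]:
+         print(f"- {issue}")
+     return 1
+
+
+ if __name__ == "__main__":
+     sys.exit(main())
+ ```
+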
188
+ ---
189
+
190
+ ## 📄 License
191
+
192
+ Distributed under the **MIT License**.
193
+
194
+ ---
195
+
196
+ ## 🧠 Final Note
197
+
198
+ > **ai-critic** is not a benchmarking tool. It's a decision-making tool.
199
+
200
+ If a model fails here, it doesn't mean it's "bad," but rather that it **shouldn't be trusted yet**. The goal is to inject the necessary skepticism to build truly robust AI systems.
@@ -130,3 +130,94 @@ class AICritic:
130
130
  return {k: payload[k] for k in view if k in payload}
131
131
 
132
132
  return payload.get(view)
133
+ def deploy_decision(self):
134
+ """
135
+ Final deployment gate.
136
+
137
+ Returns
138
+ -------
139
+ dict
140
+ {
141
+ "deploy": bool,
142
+ "risk_level": str,
143
+ "blocking_issues": list[str],
144
+ "confidence": float
145
+ }
146
+ """
147
+
148
+ # Reuse the entire existing evaluation pipeline
149
+ report = self.evaluate(view="all", plot=False)
150
+
151
+ data_risk = report["details"]["data"]["data_leakage"]["suspected"]
152
+ perfect_cv = report["details"]["performance"]["suspiciously_perfect"]
153
+ robustness_verdict = report["details"]["robustness"]["verdict"]
154
+ structural_warnings = report["details"]["config"]["structural_warnings"]
155
+
156
+ blocking_issues = []
157
+ risk_level = "low"
158
+
159
+ # =========================
160
+ # Hard blockers (❌)
161
+ # =========================
162
+ if data_risk and perfect_cv:
163
+ blocking_issues.append(
164
+ "Data leakage combined with suspiciously perfect CV score"
165
+ )
166
+ risk_level = "high"
167
+
168
+ if robustness_verdict == "misleading":
169
+ blocking_issues.append(
170
+ "Robustness results are misleading due to inflated baseline performance"
171
+ )
172
+ risk_level = "high"
173
+
174
+ if data_risk:
175
+ blocking_issues.append(
176
+ "Suspected target leakage in feature set"
177
+ )
178
+ risk_level = "high"
179
+
180
+ # =========================
181
+ # Soft blockers (⚠️)
182
+ # =========================
183
+ if risk_level != "high":
184
+ if robustness_verdict == "fragile":
185
+ blocking_issues.append(
186
+ "Model performance degrades significantly under noise"
187
+ )
188
+ risk_level = "medium"
189
+
190
+ if perfect_cv:
191
+ blocking_issues.append(
192
+ "Suspiciously perfect cross-validation score"
193
+ )
194
+ risk_level = "medium"
195
+
196
+ if structural_warnings:
197
+ blocking_issues.append(
198
+ "Structural complexity risks detected in model configuration"
199
+ )
200
+ risk_level = "medium"
201
+
202
+ # =========================
203
+ # Final decision
204
+ # =========================
205
+ deploy = len(blocking_issues) == 0
206
+
207
+ # =========================
208
+ # Confidence heuristic
209
+ # =========================
210
+ confidence = 1.0
211
+ confidence -= 0.35 if data_risk else 0
212
+ confidence -= 0.25 if perfect_cv else 0
213
+ confidence -= 0.25 if robustness_verdict in ("fragile", "misleading") else 0
214
+ confidence -= 0.15 if structural_warnings else 0
215
+ confidence = max(0.0, round(confidence, 2))
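+ # Worked example: suspected leakage (-0.35) plus a fragile robustness
+ # verdict (-0.25) leaves confidence = 1.0 - 0.35 - 0.25 = 0.40.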
216
+
217
+ return {
218
+ "deploy": deploy,
219
+ "risk_level": risk_level,
220
+ "blocking_issues": blocking_issues,
221
+ "confidence": confidence
222
+ }
223
+
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
4
4
 
5
5
  [project]
6
6
  name = "ai-critic"
7
- version = "0.2.4"
7
+ version = "0.2.5"
8
8
  description = "Fast AI evaluator for scikit-learn models"
9
9
  readme = "README.md"
10
10
  authors = [
ai_critic-0.2.4/PKG-INFO DELETED
@@ -1,76 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: ai-critic
3
- Version: 0.2.4
4
- Summary: Fast AI evaluator for scikit-learn models
5
- Author-email: Luiz Seabra <filipedemarco@yahoo.com>
6
- Requires-Python: >=3.9
7
- Description-Content-Type: text/markdown
8
- Requires-Dist: numpy
9
- Requires-Dist: scikit-learn
10
-
11
- Performance under noise
12
-
13
- > Visualizations are optional and do not affect the decision logic.
14
-
15
- ---
16
-
17
- ## ⚙️ Main API
18
-
19
- ### `AICritic(model, X, y)`
20
-
21
- * `model`: scikit-learn compatible estimator
22
- * `X`: feature matrix
23
- * `y`: target vector
24
-
25
- ### `evaluate(view="all", plot=False)`
26
-
27
- * `view`: `"executive"`, `"technical"`, `"details"`, `"all"` or custom list
28
- * `plot`: generates graphs when `True`
29
-
30
- ---
31
-
32
- ## 🧠 What ai-critic Detects
33
-
34
- | Category | Risks |
35
-
36
- | ------------ | ---------------------------------------- |
37
-
38
- | 🔍 Data | Target Leakage, NaNs, Imbalance |
39
-
40
- | 🧱 Structure | Excessive Complexity, Overfitting |
41
-
42
- | 📈 Validation | Perfect or Statistically Suspicious CV |
43
-
44
- | 🧪 Robustness | Stable, Fragile, or Misleading |
45
-
46
- ---
47
-
48
- ## 🛡️ Best Practices
49
-
50
- * **CI/CD:** Use executive output as a *quality gate*
51
- * **Iteration:** Use technical output during tuning
52
- * **Governance:** Log detailed output
53
- * **Skepticism:** Never blindly trust a perfect CV
54
-
55
- ---
56
-
57
- ## 🧭 Use Cases
58
-
59
- * Pre-deployment Audit
60
- * ML Governance
61
- * CI/CD Pipelines
62
- * Risk Communication for Non-Technical Users
63
-
64
- ---
65
-
66
- ## 📄 License
67
-
68
- Distributed under the **MIT License**.
69
-
70
- ---
71
-
72
- ## 🧠 Final Note
73
-
74
- **ai-critic** is not a *benchmarking* tool. It's a **decision-making tool**.
75
-
76
- If a model fails here, it doesn't mean it's bad—it means it **shouldn't be trusted yet**.
ai_critic-0.2.4/README.md DELETED
@@ -1,66 +0,0 @@
1
- Performance under noise
2
-
3
- > Visualizations are optional and do not affect the decision logic.
4
-
5
- ---
6
-
7
- ## ⚙️ Main API
8
-
9
- ### `AICritic(model, X, y)`
10
-
11
- * `model`: scikit-learn compatible estimator
12
- * `X`: feature matrix
13
- * `y`: target vector
14
-
15
- ### `evaluate(view="all", plot=False)`
16
-
17
- * `view`: `"executive"`, `"technical"`, `"details"`, `"all"` or custom list
18
- * `plot`: generates graphs when `True`
19
-
20
- ---
21
-
22
- ## 🧠 What ai-critic Detects
23
-
24
- | Category | Risks |
25
-
26
- | ------------ | ---------------------------------------- |
27
-
28
- | 🔍 Data | Target Leakage, NaNs, Imbalance |
29
-
30
- | 🧱 Structure | Excessive Complexity, Overfitting |
31
-
32
- | 📈 Validation | Perfect or Statistically Suspicious CV |
33
-
34
- | 🧪 Robustness | Stable, Fragile, or Misleading |
35
-
36
- ---
37
-
38
- ## 🛡️ Best Practices
39
-
40
- * **CI/CD:** Use executive output as a *quality gate*
41
- * **Iteration:** Use technical output during tuning
42
- * **Governance:** Log detailed output
43
- * **Skepticism:** Never blindly trust a perfect CV
44
-
45
- ---
46
-
47
- ## 🧭 Use Cases
48
-
49
- * Pre-deployment Audit
50
- * ML Governance
51
- * CI/CD Pipelines
52
- * Risk Communication for Non-Technical Users
53
-
54
- ---
55
-
56
- ## 📄 License
57
-
58
- Distributed under the **MIT License**.
59
-
60
- ---
61
-
62
- ## 🧠 Final Note
63
-
64
- **ai-critic** is not a *benchmarking* tool. It's a **decision-making tool**.
65
-
66
- If a model fails here, it doesn't mean it's bad—it means it **shouldn't be trusted yet**.
File without changes
File without changes
File without changes