mail-swarms 1.3.2__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- mail/__init__.py +35 -0
- mail/api.py +1964 -0
- mail/cli.py +432 -0
- mail/client.py +1657 -0
- mail/config/__init__.py +8 -0
- mail/config/client.py +87 -0
- mail/config/server.py +165 -0
- mail/core/__init__.py +72 -0
- mail/core/actions.py +69 -0
- mail/core/agents.py +73 -0
- mail/core/message.py +366 -0
- mail/core/runtime.py +3537 -0
- mail/core/tasks.py +311 -0
- mail/core/tools.py +1206 -0
- mail/db/__init__.py +0 -0
- mail/db/init.py +182 -0
- mail/db/types.py +65 -0
- mail/db/utils.py +523 -0
- mail/examples/__init__.py +27 -0
- mail/examples/analyst_dummy/__init__.py +15 -0
- mail/examples/analyst_dummy/agent.py +136 -0
- mail/examples/analyst_dummy/prompts.py +44 -0
- mail/examples/consultant_dummy/__init__.py +15 -0
- mail/examples/consultant_dummy/agent.py +136 -0
- mail/examples/consultant_dummy/prompts.py +42 -0
- mail/examples/data_analysis/__init__.py +40 -0
- mail/examples/data_analysis/analyst/__init__.py +9 -0
- mail/examples/data_analysis/analyst/agent.py +67 -0
- mail/examples/data_analysis/analyst/prompts.py +53 -0
- mail/examples/data_analysis/processor/__init__.py +13 -0
- mail/examples/data_analysis/processor/actions.py +293 -0
- mail/examples/data_analysis/processor/agent.py +67 -0
- mail/examples/data_analysis/processor/prompts.py +48 -0
- mail/examples/data_analysis/reporter/__init__.py +10 -0
- mail/examples/data_analysis/reporter/actions.py +187 -0
- mail/examples/data_analysis/reporter/agent.py +67 -0
- mail/examples/data_analysis/reporter/prompts.py +49 -0
- mail/examples/data_analysis/statistics/__init__.py +18 -0
- mail/examples/data_analysis/statistics/actions.py +343 -0
- mail/examples/data_analysis/statistics/agent.py +67 -0
- mail/examples/data_analysis/statistics/prompts.py +60 -0
- mail/examples/mafia/__init__.py +0 -0
- mail/examples/mafia/game.py +1537 -0
- mail/examples/mafia/narrator_tools.py +396 -0
- mail/examples/mafia/personas.py +240 -0
- mail/examples/mafia/prompts.py +489 -0
- mail/examples/mafia/roles.py +147 -0
- mail/examples/mafia/spec.md +350 -0
- mail/examples/math_dummy/__init__.py +23 -0
- mail/examples/math_dummy/actions.py +252 -0
- mail/examples/math_dummy/agent.py +136 -0
- mail/examples/math_dummy/prompts.py +46 -0
- mail/examples/math_dummy/types.py +5 -0
- mail/examples/research/__init__.py +39 -0
- mail/examples/research/researcher/__init__.py +9 -0
- mail/examples/research/researcher/agent.py +67 -0
- mail/examples/research/researcher/prompts.py +54 -0
- mail/examples/research/searcher/__init__.py +10 -0
- mail/examples/research/searcher/actions.py +324 -0
- mail/examples/research/searcher/agent.py +67 -0
- mail/examples/research/searcher/prompts.py +53 -0
- mail/examples/research/summarizer/__init__.py +18 -0
- mail/examples/research/summarizer/actions.py +255 -0
- mail/examples/research/summarizer/agent.py +67 -0
- mail/examples/research/summarizer/prompts.py +55 -0
- mail/examples/research/verifier/__init__.py +10 -0
- mail/examples/research/verifier/actions.py +337 -0
- mail/examples/research/verifier/agent.py +67 -0
- mail/examples/research/verifier/prompts.py +52 -0
- mail/examples/supervisor/__init__.py +11 -0
- mail/examples/supervisor/agent.py +4 -0
- mail/examples/supervisor/prompts.py +93 -0
- mail/examples/support/__init__.py +33 -0
- mail/examples/support/classifier/__init__.py +10 -0
- mail/examples/support/classifier/actions.py +307 -0
- mail/examples/support/classifier/agent.py +68 -0
- mail/examples/support/classifier/prompts.py +56 -0
- mail/examples/support/coordinator/__init__.py +9 -0
- mail/examples/support/coordinator/agent.py +67 -0
- mail/examples/support/coordinator/prompts.py +48 -0
- mail/examples/support/faq/__init__.py +10 -0
- mail/examples/support/faq/actions.py +182 -0
- mail/examples/support/faq/agent.py +67 -0
- mail/examples/support/faq/prompts.py +42 -0
- mail/examples/support/sentiment/__init__.py +15 -0
- mail/examples/support/sentiment/actions.py +341 -0
- mail/examples/support/sentiment/agent.py +67 -0
- mail/examples/support/sentiment/prompts.py +54 -0
- mail/examples/weather_dummy/__init__.py +23 -0
- mail/examples/weather_dummy/actions.py +75 -0
- mail/examples/weather_dummy/agent.py +136 -0
- mail/examples/weather_dummy/prompts.py +35 -0
- mail/examples/weather_dummy/types.py +5 -0
- mail/factories/__init__.py +27 -0
- mail/factories/action.py +223 -0
- mail/factories/base.py +1531 -0
- mail/factories/supervisor.py +241 -0
- mail/net/__init__.py +7 -0
- mail/net/registry.py +712 -0
- mail/net/router.py +728 -0
- mail/net/server_utils.py +114 -0
- mail/net/types.py +247 -0
- mail/server.py +1605 -0
- mail/stdlib/__init__.py +0 -0
- mail/stdlib/anthropic/__init__.py +0 -0
- mail/stdlib/fs/__init__.py +15 -0
- mail/stdlib/fs/actions.py +209 -0
- mail/stdlib/http/__init__.py +19 -0
- mail/stdlib/http/actions.py +333 -0
- mail/stdlib/interswarm/__init__.py +11 -0
- mail/stdlib/interswarm/actions.py +208 -0
- mail/stdlib/mcp/__init__.py +19 -0
- mail/stdlib/mcp/actions.py +294 -0
- mail/stdlib/openai/__init__.py +13 -0
- mail/stdlib/openai/agents.py +451 -0
- mail/summarizer.py +234 -0
- mail/swarms_json/__init__.py +27 -0
- mail/swarms_json/types.py +87 -0
- mail/swarms_json/utils.py +255 -0
- mail/url_scheme.py +51 -0
- mail/utils/__init__.py +53 -0
- mail/utils/auth.py +194 -0
- mail/utils/context.py +17 -0
- mail/utils/logger.py +73 -0
- mail/utils/openai.py +212 -0
- mail/utils/parsing.py +89 -0
- mail/utils/serialize.py +292 -0
- mail/utils/store.py +49 -0
- mail/utils/string_builder.py +119 -0
- mail/utils/version.py +20 -0
- mail_swarms-1.3.2.dist-info/METADATA +237 -0
- mail_swarms-1.3.2.dist-info/RECORD +137 -0
- mail_swarms-1.3.2.dist-info/WHEEL +4 -0
- mail_swarms-1.3.2.dist-info/entry_points.txt +2 -0
- mail_swarms-1.3.2.dist-info/licenses/LICENSE +202 -0
- mail_swarms-1.3.2.dist-info/licenses/NOTICE +10 -0
- mail_swarms-1.3.2.dist-info/licenses/THIRD_PARTY_NOTICES.md +12334 -0
mail/examples/data_analysis/reporter/prompts.py
`@@ -0,0 +1,49 @@`

```python
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) 2025 Charon Labs

SYSPROMPT = """You are reporter@{swarm}, the report generation specialist for this data analysis swarm.

# Your Role
Format analysis results into clear, professional reports with tables, summaries, and visualizations.

# Critical Rule: Responding
You CANNOT talk to users directly or call `task_complete`. You MUST use `send_response` to reply to the agent who contacted you.
- When you receive a request, note the sender (usually "analyst")
- After formatting the report, call `send_response(target=<sender>, subject="Re: ...", body=<your report>)`
- Include the COMPLETE formatted report in your response body

# Tools

## Report Formatting
- `format_report(title, sections)`: Generate a formatted markdown report

## Communication
- `send_response(target, subject, body)`: Reply to the agent who requested information
- `send_request(target, subject, body)`: Ask another agent for information
- `acknowledge_broadcast(note)`: Acknowledge a broadcast message
- `ignore_broadcast(reason)`: Ignore an irrelevant broadcast

# Report Sections

When creating reports, organize content into sections:
- **summary**: Executive summary or key findings
- **data_overview**: Description of the dataset
- **statistics**: Statistical results in tables
- **insights**: Interpretations and recommendations
- **appendix**: Additional details or raw data

# Workflow

1. Receive request from another agent (note the sender)
2. Organize the provided data into logical sections
3. Call `format_report` with appropriate title and sections
4. Call `send_response` to the original sender with the formatted report

# Guidelines

- Make reports clear and easy to scan
- Use markdown tables for numerical data
- Highlight key findings prominently
- Include both raw numbers and interpretations
- Keep summaries concise but informative
- Use "Re: <original subject>" as your response subject"""
```
mail/examples/data_analysis/statistics/__init__.py
`@@ -0,0 +1,18 @@`

```python
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) 2025 Charon Labs

"""Statistics agent for the Data Analysis swarm."""

from mail.examples.data_analysis.statistics.agent import LiteLLMStatisticsFunction
from mail.examples.data_analysis.statistics.actions import (
    calculate_statistics,
    run_correlation,
)
from mail.examples.data_analysis.statistics.prompts import SYSPROMPT

__all__ = [
    "LiteLLMStatisticsFunction",
    "calculate_statistics",
    "run_correlation",
    "SYSPROMPT",
]
```
mail/examples/data_analysis/statistics/actions.py
`@@ -0,0 +1,343 @@`

```python
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) 2025 Charon Labs

"""Statistical calculation actions for the Data Analysis swarm.

These actions perform REAL statistical calculations (not dummy data).
"""

import json
import math
from collections import Counter
from typing import Any

from mail import action

# All available metrics
AVAILABLE_METRICS = [
    "count",
    "mean",
    "median",
    "mode",
    "std",
    "variance",
    "min",
    "max",
    "range",
    "sum",
    "percentile_25",
    "percentile_75",
    "iqr",
]


def _extract_numeric_values(data: list[Any]) -> list[float]:
    """Extract numeric values from data, filtering out non-numeric items."""
    values = []
    for item in data:
        if isinstance(item, int | float) and not math.isnan(item) and not math.isinf(item):
            values.append(float(item))
    return values


def _calculate_mean(values: list[float]) -> float:
    """Calculate arithmetic mean."""
    return sum(values) / len(values)


def _calculate_median(values: list[float]) -> float:
    """Calculate median value."""
    sorted_vals = sorted(values)
    n = len(sorted_vals)
    mid = n // 2
    if n % 2 == 0:
        return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2
    return sorted_vals[mid]


def _calculate_mode(values: list[float]) -> float | None:
    """Calculate mode (most frequent value)."""
    if not values:
        return None
    counter = Counter(values)
    max_count = max(counter.values())
    modes = [val for val, count in counter.items() if count == max_count]
    return modes[0] if len(modes) == 1 else None  # Return None if multi-modal


def _calculate_variance(values: list[float], mean: float) -> float:
    """Calculate population variance."""
    if len(values) < 2:
        return 0.0
    return sum((x - mean) ** 2 for x in values) / len(values)


def _calculate_std(values: list[float], mean: float) -> float:
    """Calculate population standard deviation."""
    return math.sqrt(_calculate_variance(values, mean))


def _calculate_percentile(values: list[float], percentile: float) -> float:
    """Calculate a given percentile."""
    sorted_vals = sorted(values)
    n = len(sorted_vals)
    index = (percentile / 100) * (n - 1)
    lower = int(index)
    upper = lower + 1
    if upper >= n:
        return sorted_vals[-1]
    fraction = index - lower
    return sorted_vals[lower] + fraction * (sorted_vals[upper] - sorted_vals[lower])


CALCULATE_STATISTICS_PARAMETERS = {
    "type": "object",
    "properties": {
        "data": {
            "type": "array",
            "items": {"type": "number"},
            "description": "Array of numeric values to analyze",
        },
        "metrics": {
            "type": "array",
            "items": {
                "type": "string",
                "enum": AVAILABLE_METRICS,
            },
            "description": f"List of metrics to calculate. Available: {', '.join(AVAILABLE_METRICS)}",
        },
    },
    "required": ["data"],
}


@action(
    name="calculate_statistics",
    description="Calculate descriptive statistics for a numeric dataset.",
    parameters=CALCULATE_STATISTICS_PARAMETERS,
)
async def calculate_statistics(args: dict[str, Any]) -> str:
    """Calculate descriptive statistics on numeric data."""
    try:
        data = args["data"]
        metrics = args.get("metrics", ["count", "mean", "median", "std", "min", "max"])
    except KeyError as e:
        return f"Error: {e} is required"

    # Extract numeric values
    values = _extract_numeric_values(data)

    if not values:
        return json.dumps(
            {
                "error": "No valid numeric values found in data",
                "original_count": len(data),
                "valid_count": 0,
            }
        )

    # Calculate requested metrics
    results: dict[str, Any] = {
        "data_points": len(values),
        "original_count": len(data),
        "valid_count": len(values),
        "metrics": {},
    }

    # Pre-calculate common values
    mean = _calculate_mean(values)
    sorted_vals = sorted(values)

    for metric in metrics:
        if metric not in AVAILABLE_METRICS:
            results["metrics"][metric] = {"error": f"Unknown metric: {metric}"}  # type: ignore
            continue

        try:
            if metric == "count":
                results["metrics"][metric] = len(values)  # type: ignore
            elif metric == "mean":
                results["metrics"][metric] = round(mean, 4)  # type: ignore
            elif metric == "median":
                results["metrics"][metric] = round(_calculate_median(values), 4)  # type: ignore
            elif metric == "mode":
                mode = _calculate_mode(values)
                results["metrics"][metric] = (  # type: ignore
                    round(mode, 4) if mode is not None else "multimodal"
                )
            elif metric == "std":
                results["metrics"][metric] = round(_calculate_std(values, mean), 4)  # type: ignore
            elif metric == "variance":
                results["metrics"][metric] = round(_calculate_variance(values, mean), 4)  # type: ignore
            elif metric == "min":
                results["metrics"][metric] = round(min(values), 4)  # type: ignore
            elif metric == "max":
                results["metrics"][metric] = round(max(values), 4)  # type: ignore
            elif metric == "range":
                results["metrics"][metric] = round(max(values) - min(values), 4)  # type: ignore
            elif metric == "sum":
                results["metrics"][metric] = round(sum(values), 4)  # type: ignore
            elif metric == "percentile_25":
                results["metrics"][metric] = round(_calculate_percentile(values, 25), 4)  # type: ignore
            elif metric == "percentile_75":
                results["metrics"][metric] = round(_calculate_percentile(values, 75), 4)  # type: ignore
            elif metric == "iqr":
                q1 = _calculate_percentile(values, 25)
                q3 = _calculate_percentile(values, 75)
                results["metrics"][metric] = round(q3 - q1, 4)  # type: ignore
        except Exception as e:
            results["metrics"][metric] = {"error": str(e)}  # type: ignore

    # Add interpretation
    results["interpretation"] = _generate_interpretation(
        results["metrics"], len(values)  # type: ignore
    )

    return json.dumps(results)


def _generate_interpretation(metrics: dict[str, Any], n: int) -> str:
    """Generate a human-readable interpretation of the statistics."""
    parts = []

    parts.append(f"Analysis based on {n} data points.")

    if "mean" in metrics and "median" in metrics:
        mean = metrics["mean"]
        median = metrics["median"]
        if isinstance(mean, int | float) and isinstance(median, int | float):
            if abs(mean - median) / max(abs(mean), 0.001) > 0.1:
                if mean > median:
                    parts.append(
                        "Data is right-skewed (mean > median), indicating some high outliers."
                    )
                else:
                    parts.append(
                        "Data is left-skewed (mean < median), indicating some low outliers."
                    )
            else:
                parts.append("Data appears roughly symmetric (mean ≈ median).")

    if "std" in metrics and "mean" in metrics:
        std = metrics["std"]
        mean = metrics["mean"]
        if isinstance(std, int | float) and isinstance(mean, int | float) and mean != 0:
            cv = abs(std / mean)
            if cv > 1:
                parts.append("High variability in the data (CV > 100%).")
            elif cv > 0.3:
                parts.append("Moderate variability in the data.")
            else:
                parts.append("Low variability, data is relatively consistent.")

    return " ".join(parts)


RUN_CORRELATION_PARAMETERS = {
    "type": "object",
    "properties": {
        "x": {
            "type": "array",
            "items": {"type": "number"},
            "description": "First variable (array of numbers)",
        },
        "y": {
            "type": "array",
            "items": {"type": "number"},
            "description": "Second variable (array of numbers)",
        },
    },
    "required": ["x", "y"],
}


@action(
    name="run_correlation",
    description="Calculate the Pearson correlation coefficient between two variables.",
    parameters=RUN_CORRELATION_PARAMETERS,
)
async def run_correlation(args: dict[str, Any]) -> str:
    """Calculate Pearson correlation between two variables."""
    try:
        x_data = args["x"]
        y_data = args["y"]
    except KeyError as e:
        return f"Error: {e} is required"

    # Extract numeric values and pair them
    x_values = _extract_numeric_values(x_data)
    y_values = _extract_numeric_values(y_data)

    if len(x_values) != len(y_values):
        # Truncate to shorter length
        min_len = min(len(x_values), len(y_values))
        x_values = x_values[:min_len]
        y_values = y_values[:min_len]

    n = len(x_values)

    if n < 3:
        return json.dumps(
            {
                "error": "Need at least 3 paired data points for correlation",
                "data_points": n,
            }
        )

    # Calculate Pearson correlation coefficient
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n

    # Calculate covariance and standard deviations
    covariance = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values)) / n
    )
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in x_values) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in y_values) / n)

    if std_x == 0 or std_y == 0:
        return json.dumps(
            {
                "error": "Cannot calculate correlation when one variable has zero variance",
                "std_x": std_x,
                "std_y": std_y,
            }
        )

    r = covariance / (std_x * std_y)

    # Calculate R-squared
    r_squared = r**2

    # Determine strength and direction
    if abs(r) >= 0.8:
        strength = "strong"
    elif abs(r) >= 0.5:
        strength = "moderate"
    elif abs(r) >= 0.3:
        strength = "weak"
    else:
        strength = "very weak or no"

    direction = "positive" if r > 0 else "negative" if r < 0 else "no"

    result = {
        "correlation_coefficient": round(r, 4),
        "r_squared": round(r_squared, 4),
        "data_points": n,
        "strength": strength,
        "direction": direction,
        "interpretation": f"There is a {strength} {direction} correlation (r={r:.3f}). "
        f"The R² value of {r_squared:.3f} means that approximately "
        f"{r_squared * 100:.1f}% of the variance in Y can be explained by X.",
        "x_stats": {
            "mean": round(mean_x, 4),
            "std": round(std_x, 4),
        },
        "y_stats": {
            "mean": round(mean_y, 4),
            "std": round(std_y, 4),
        },
    }

    return json.dumps(result)
```
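The percentile interpolation and Pearson arithmetic in the actions above can be exercised outside the swarm runtime. The `percentile` and `pearson_r` helpers below are standalone mirrors of `_calculate_percentile` and the covariance math in `run_correlation` (the helper names are illustrative, not part of the package API):

```python
import math

def percentile(values, p):
    """Mirror of _calculate_percentile: linear interpolation between order statistics."""
    s = sorted(values)
    n = len(s)
    idx = (p / 100) * (n - 1)
    lo = int(idx)
    if lo + 1 >= n:
        return s[-1]
    return s[lo] + (idx - lo) * (s[lo + 1] - s[lo])

def pearson_r(x, y):
    """Mirror of the population-moment form of Pearson's r used in run_correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

q1, q3 = percentile([1, 2, 3, 4], 25), percentile([1, 2, 3, 4], 75)
print(q1, q3, q3 - q1)                                         # 1.75 3.25 1.5
print(round(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]), 4))  # 1.0
```

Note that perfectly collinear data lands in the "strong positive" bucket of the strength/direction classification, since |r| rounds to 1.0.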
mail/examples/data_analysis/statistics/agent.py
`@@ -0,0 +1,67 @@`

```python
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) 2025 Charon Labs

"""Statistics agent for the Data Analysis swarm."""

from collections.abc import Awaitable
from typing import Any, Literal

from mail.core.agents import AgentOutput
from mail.factories.action import LiteLLMActionAgentFunction


class LiteLLMStatisticsFunction(LiteLLMActionAgentFunction):
    """
    Statistics agent that performs statistical calculations.

    This agent calculates descriptive statistics and correlation
    coefficients on numeric data.
    """

    def __init__(
        self,
        name: str,
        comm_targets: list[str],
        tools: list[dict[str, Any]],
        llm: str,
        system: str,
        user_token: str = "",
        enable_entrypoint: bool = False,
        enable_interswarm: bool = False,
        can_complete_tasks: bool = False,
        tool_format: Literal["completions", "responses"] = "responses",
        exclude_tools: list[str] = [],
        reasoning_effort: Literal["minimal", "low", "medium", "high"] | None = None,
        thinking_budget: int | None = None,
        max_tokens: int | None = None,
        memory: bool = True,
        use_proxy: bool = True,
        _debug_include_mail_tools: bool = True,
    ) -> None:
        super().__init__(
            name=name,
            comm_targets=comm_targets,
            tools=tools,
            llm=llm,
            system=system,
            user_token=user_token,
            enable_entrypoint=enable_entrypoint,
            enable_interswarm=enable_interswarm,
            can_complete_tasks=can_complete_tasks,
            tool_format=tool_format,
            exclude_tools=exclude_tools,
            reasoning_effort=reasoning_effort,
            thinking_budget=thinking_budget,
            max_tokens=max_tokens,
            memory=memory,
            use_proxy=use_proxy,
            _debug_include_mail_tools=_debug_include_mail_tools,
        )

    def __call__(
        self,
        messages: list[dict[str, Any]],
        tool_choice: str | dict[str, str] = "required",
    ) -> Awaitable[AgentOutput]:
        """Execute the statistics agent function."""
        return super().__call__(messages, tool_choice)
```
mail/examples/data_analysis/statistics/prompts.py
`@@ -0,0 +1,60 @@`

```python
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) 2025 Charon Labs

SYSPROMPT = """You are statistics@{swarm}, the statistical analysis specialist for this data analysis swarm.

# Your Role
Perform statistical calculations on data including descriptive statistics and correlation analysis.

# Critical Rule: Responding
You CANNOT talk to users directly or call `task_complete`. You MUST use `send_response` to reply to the agent who contacted you.
- When you receive a request, note the sender (usually "analyst")
- After calculating statistics, call `send_response(target=<sender>, subject="Re: ...", body=<your results>)`
- Include ALL statistical results in your response body

# Tools

## Statistical Calculations
- `calculate_statistics(data, metrics)`: Calculate descriptive statistics on numeric data
- `run_correlation(x, y)`: Calculate correlation coefficient between two variables

## Communication
- `send_response(target, subject, body)`: Reply to the agent who requested information
- `send_request(target, subject, body)`: Ask another agent for information
- `acknowledge_broadcast(note)`: Acknowledge a broadcast message
- `ignore_broadcast(reason)`: Ignore an irrelevant broadcast

# Available Metrics

For `calculate_statistics`, you can request these metrics:
- **count**: Number of values
- **mean**: Arithmetic mean (average)
- **median**: Middle value
- **mode**: Most frequent value
- **std**: Standard deviation
- **variance**: Variance
- **min**: Minimum value
- **max**: Maximum value
- **range**: Difference between max and min
- **sum**: Sum of all values
- **percentile_25**: 25th percentile (Q1)
- **percentile_75**: 75th percentile (Q3)
- **iqr**: Interquartile range (Q3 - Q1)

# Workflow

1. Receive request from another agent (note the sender)
2. Extract the numeric data to analyze
3. Call the appropriate statistical action(s)
4. Interpret the results
5. Call `send_response` to the original sender with:
   - Raw statistical values
   - Brief interpretation of what the numbers mean

# Guidelines

- Always validate that data is numeric before analysis
- Report both the raw numbers and their meaning
- For correlations, explain the strength and direction
- If data has issues (too few points, non-numeric), explain clearly
- Use "Re: <original subject>" as your response subject"""
```
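When `calculate_statistics` is called without an explicit `metrics` list, it falls back to count, mean, median, std, min, and max. That default set can be sanity-checked with a small standalone mirror of the arithmetic (population standard deviation, matching `_calculate_variance`); the `describe` helper here is illustrative, not part of the package:

```python
import math

def describe(data):
    """Standalone mirror of calculate_statistics' default metric set."""
    values = [float(v) for v in data]  # matches _extract_numeric_values' float cast
    n = len(values)
    mean = sum(values) / n
    s = sorted(values)
    mid = n // 2
    median = s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2
    variance = sum((x - mean) ** 2 for x in values) / n  # population variance
    return {
        "count": n,
        "mean": round(mean, 4),
        "median": round(median, 4),
        "std": round(math.sqrt(variance), 4),
        "min": round(min(values), 4),
        "max": round(max(values), 4),
    }

print(describe([2, 4, 4, 4, 5, 5, 7, 9]))
# {'count': 8, 'mean': 5.0, 'median': 4.5, 'std': 2.0, 'min': 2.0, 'max': 9.0}
```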
File without changes