crewlyze 3.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (48)
  1. package/.dockerignore +12 -0
  2. package/.gitattributes +2 -0
  3. package/CHANGELOG.md +86 -0
  4. package/Dockerfile +21 -0
  5. package/LICENSE +21 -0
  6. package/README.md +139 -0
  7. package/USAGE.md +106 -0
  8. package/agents/__init__.py +0 -0
  9. package/agents/cleaner.py +38 -0
  10. package/agents/insights.py +44 -0
  11. package/agents/relation.py +36 -0
  12. package/agents/visualizer.py +41 -0
  13. package/assets/badge_crewai.svg +4 -0
  14. package/assets/badge_matplotlib.svg +4 -0
  15. package/assets/badge_ollama.svg +4 -0
  16. package/assets/badge_pandas.svg +4 -0
  17. package/assets/badge_seaborn.svg +4 -0
  18. package/assets/branding_image.png +0 -0
  19. package/assets/complete_workflow.svg +216 -0
  20. package/assets/favicon.png +0 -0
  21. package/assets/logo.png +0 -0
  22. package/assets/stars.svg +12 -0
  23. package/bin/crewlyze.js +79 -0
  24. package/config/README.md +129 -0
  25. package/config/__init__.py +1 -0
  26. package/config/context.py +16 -0
  27. package/config/llm_config.py +300 -0
  28. package/config/metrics_tracker.py +70 -0
  29. package/crew.py +870 -0
  30. package/crewlyze-3.1.0.tgz +0 -0
  31. package/fix_syntax.py +54 -0
  32. package/main.py +1279 -0
  33. package/package.json +22 -0
  34. package/pyproject.toml +32 -0
  35. package/requirements.txt +33 -0
  36. package/tools/__init__.py +0 -0
  37. package/tools/dataset_tools.py +803 -0
  38. package/ui/__init__.py +3 -0
  39. package/ui/copilot.py +200 -0
  40. package/ui/export.py +800 -0
  41. package/update_appjs.py +54 -0
  42. package/update_llm.py +21 -0
  43. package/update_main.py +20 -0
  44. package/web/app.js +3142 -0
  45. package/web/index.html +1105 -0
  46. package/web/style.css +2561 -0
  47. package/workflows/__init__.py +0 -0
  48. package/workflows/pipeline.py +254 -0
package/.dockerignore ADDED
@@ -0,0 +1,12 @@
1
+ __pycache__/
2
+ *.py[cod]
3
+ *$py.class
4
+ .env
5
+ data/sessions/
6
+ outputs/
7
+ .git
8
+ .vscode
9
+ .idea
10
+ .venv
11
+ venv/
12
+ env/
package/.gitattributes ADDED
@@ -0,0 +1,2 @@
1
+ *.png filter=lfs diff=lfs merge=lfs -text
2
+ *.bin filter=lfs diff=lfs merge=lfs -text
package/CHANGELOG.md ADDED
@@ -0,0 +1,86 @@
1
+ # Changelog
2
+
3
+ All notable changes to the Crewlyze project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [3.0.0] - 2026-06-27
9
+
10
+ ### Architecture Refactor
11
+ - **Modular UI Package**: Extracted all Streamlit UI logic from `app.py` into a dedicated `ui/` package:
12
+ - `ui/styles.py` — CSS injection (glassmorphism "Obsidian & Electric Violet" theme)
13
+ - `ui/components.py` — `display_text_as_bullets`, `display_relations`, `StreamlitLogger` (module-level, not inline)
14
+ - `ui/export.py` — ReportLab PDF builder, wrapped with `@st.cache_data`
15
+ - **Security (Subprocess Sandboxing)**: All LLM-generated Python code (cleaning and visualization) now runs in an isolated child process via `subprocess.run()`. `exec()` is never called in the parent process, which reduces, but does not by itself eliminate, the risk of remote code execution.
16
+ - **Per-session File Isolation**: Each browser session gets its own `data/sessions/<id>/` and `outputs/<id>/` directories. No cross-session data leakage.
17
+ - **Content-hashed Caching**: Analysis results and PDF exports are cached by MD5 of the uploaded file content, not the filename. Re-uploading the same file never triggers a redundant re-run.
18
+ - **XSS-safe Output**: All LLM-generated text is `html.escape()`'d before injection into `unsafe_allow_html` markdown blocks.
19
+
20
+ ### Improvements
21
+ - **Explicit Run Button**: Analysis no longer fires automatically on upload. Users configure the LLM provider in the sidebar and then click **▶️ Run Analysis**.
22
+ - **Numbered List Regex**: Bullet stripping now uses `re.sub(r"^[\d]+\.\s+", "", line)` — handles all numbered items (N.), not just 1–3.
23
+ - **LLM Config Isolation**: Provider/model/API key are stored in `st.session_state` and only written to `os.environ` immediately before `run_crew()` is invoked.
24
+ - **Agent Factory Pattern**: All agent factories (`make_cleaner_agent`, etc.) are called fresh on every `run_crew()` invocation, picking up the latest sidebar config without requiring `importlib.reload()`.
25
+ - **Session Cleanup**: `_cleanup_old_sessions()` automatically removes session directories older than 24 hours on every run.
26
+
27
+ ### Fixed
28
+ - **Session Isolation Bug**: `execute_visualization_code` tool no longer creates a root-level `outputs/` directory, which previously bypassed per-session isolation.
29
+ - **Stale Cached PDF**: PDF export is now wrapped with `@st.cache_data` and keyed by a content hash, so it is no longer rebuilt on every Streamlit rerender.
30
+ - **Unicode Crash**: `StreamlitLogger.write()` re-encodes through the terminal's actual encoding with `errors='replace'`, preventing `UnicodeEncodeError` on Windows cp1252 terminals.
31
+
32
+ ### Removed
33
+ - `validator.py` (merged into cleaner agent's responsibility)
34
+ - `code_gen.py` (replaced by inline visualization task in `visualizer.py`)
35
+ - `index.html` report output (replaced by the interactive Streamlit dashboard)
36
+ - `outputs/op.py` collected code output (agent code is shown in "Visualization Architecture" section)
37
+
38
+ ## [2.1.0] - 2025-11-27
39
+
40
+ ### UI Overhaul
41
+ - **Premium Design**: Introduced a new "Obsidian & Electric Violet" theme with glassmorphism effects.
42
+ - **Single-Page Layout**: Removed sidebar navigation for a seamless, scrolling experience.
43
+ - **Enhanced Components**:
44
+ - Redesigned "Column Relations" display with visual cards.
45
+ - Styled bullet points for cleaner readability.
46
+ - Modern typography using 'Outfit' and 'JetBrains Mono'.
47
+ - **Interactive Sidebar**: Redesigned configuration panel and "About" section with GitHub integration.
48
+
49
+ ### Improvements
50
+ - **Robustness**: Improved error handling for LLM API calls and visualization generation.
51
+ - **Consistency**: Unified styling for both live analysis results and cached sessions.
52
+
53
+
54
+ ## [2.0.0] - 2025-11-26
55
+
56
+ ### Major Features
57
+ - **Data Analysis as a Service**: Rebranded and restructured for premium service delivery.
58
+ - **Enhanced Validator Agent**: Now acts as a "Data Quality Assurance Specialist" providing detailed quality scores (0-100), decision logic, and specific warnings.
59
+ - **Business Intelligence Agent**: Upgraded Insights Agent to a "Business Intelligence Analyst" role, focusing on synthesizing findings from cleaning, validation, and relation tasks.
60
+ - **Token Optimization**: Significantly reduced token usage by removing dynamic data context injection and optimizing agent prompts.
61
+ - **Professional Reporting**: Updated `index.html` with a modern, dark-themed UI, visual scorecards for data quality, and structured insight presentation.
62
+
63
+ ### Changed
64
+ - **Project Branding**: Renamed to "Crewlyze".
65
+ - **Agent Roles**:
66
+ - Validator: Dataset Validator -> Data Quality Assurance Specialist
67
+ - Insights: Insights Agent -> Business Intelligence Analyst
68
+ - **Workflow**: Streamlined pipeline to use static task definitions for better efficiency.
69
+ - **Licensing**: Added MIT License and copyright headers to all source files.
70
+
71
+ ### Fixed
72
+ - **Rate Limit Issues**: Optimized prompts and removed heavy context to prevent LLM rate limit errors.
73
+ - **Task Conflicts**: Resolved overlapping task descriptions between Validator and Insights agents.
74
+
75
+ ## [1.0.0] - 2023-10-XX
76
+
77
+ ### Added
78
+ - Initial release of CrewAI Data Analyst Agent
79
+ - Modular agent system with cleaner, validator, relation, code_gen, and insights agents
80
+ - Automated CSV processing and analysis pipeline
81
+ - HTML report generation with interactive elements
82
+ - LLM integration via Ollama backend
83
+
84
+ ---
85
+
86
+ **Status**: ✅ Working | 🚀 Production Ready | 📊 Data Analysis as a Service
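Editor's sketch (not part of the package): the 3.0.0 changelog entries above mention three small text and caching rules, namely the numbered-list regex, MD5 content-hash cache keys, and `html.escape()` on LLM output. A minimal standard-library illustration of those rules might look like the following. The function names are hypothetical and are not taken from the Crewlyze source.

```python
import hashlib
import html
import re


def strip_numbered_prefix(line: str) -> str:
    """Remove a leading 'N. ' marker using the regex quoted in the changelog."""
    return re.sub(r"^[\d]+\.\s+", "", line)


def content_cache_key(data: bytes) -> str:
    """Key a cache entry by the MD5 of the file content, not by the filename."""
    return hashlib.md5(data).hexdigest()


def escape_llm_text(text: str) -> str:
    """Escape LLM-generated text before placing it inside raw HTML."""
    return html.escape(text)
```

Because the key depends only on the bytes, re-uploading identical content under a different filename maps to the same cache entry.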
package/Dockerfile ADDED
@@ -0,0 +1,21 @@
1
+ FROM python:3.10-slim
2
+
3
+ WORKDIR /app
4
+
5
+ # Install system dependencies if needed
6
+ RUN apt-get update && apt-get install -y --no-install-recommends \
7
+ build-essential \
8
+ && rm -rf /var/lib/apt/lists/*
9
+
10
+ # Install python dependencies
11
+ COPY requirements.txt .
12
+ RUN pip install --no-cache-dir -r requirements.txt
13
+
14
+ # Copy application source code
15
+ COPY . .
16
+
17
+ # Expose Hugging Face Spaces default port
18
+ EXPOSE 7860
19
+
20
+ # Run FastAPI using uvicorn on port 7860
21
+ CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Sowmiyan S
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,139 @@
1
+ ---
2
+ title: Crewlyze
3
+ emoji: 📊
4
+ colorFrom: indigo
5
+ colorTo: purple
6
+ sdk: docker
7
+ app_port: 7860
8
+ ---
9
+ # Crewlyze
10
+
11
+ <p align="center">
12
+ <img src="assets/stars.svg" alt="5-star rating" height="28" />
13
+ &nbsp;&nbsp;
14
+ <img src="assets/badge_crewai.svg" alt="crewai" height="28" />
15
+ <img src="assets/badge_pandas.svg" alt="pandas" height="28" />
16
+ <img src="assets/badge_matplotlib.svg" alt="matplotlib" height="28" />
17
+ <img src="assets/badge_seaborn.svg" alt="seaborn" height="28" />
18
+ <img src="assets/badge_ollama.svg" alt="ollama" height="28" />
19
+ </p>
20
+
21
+ ## Branding
22
+
23
+ <p align="center">
24
+ <img src="assets/branding_image.png" alt="Transform Raw Datasets Into Insights With Agentic AI Analysts" width="100%" />
25
+ </p>
26
+
27
+ ## Overview
28
+
29
+ > **Autonomous Data Intelligence as a Service** | A premium, modular data-analyst pipeline powered by LLM-driven agents. Upload a CSV to initialize a workspace, chat with your dataset in real-time, execute custom schema modifications via natural language, and run a complete multi-agent pipeline to generate structured audits, correlation maps, and executive business summaries.
30
+
31
+ <p align="center">
32
+ <img src="assets/complete_workflow.svg" alt="Crewlyze Workflow" width="100%" />
33
+ </p>
34
+
35
+ ---
36
+
37
+ ## 🚀 Key Features
38
+
39
+ Once a project is initialized, the system branches into two distinct, high-impact paths:
40
+
41
+ ### Track A: AI Data Chat (Interactive Exploration)
42
+ - **Natural Language Querying**: Query your dataset directly to get condition-based rows, statistics, or aggregations.
43
+ - **On-the-Fly Data Prep**: Ask the copilot to perform edits in-place, such as `Rename column "Q3_Sales" to "Sales_Q3"` or `Delete column "Notes"`, and watch the live data preview table update dynamically.
44
+ - **Instant Visualizations**: Command the chat bot to create custom charts (e.g. *"plot a neon-purple scatter chart of rating vs cost"*). It writes and runs the matplotlib code in a sandboxed subprocess to output charts inline.
45
+
46
+ ### Track B: Agentic Analysis (CrewAI Pipeline)
47
+ Select and run specific automated tasks through the multi-agent pipeline:
48
+ 1. **Data Cleaner (🧹)**: Audits columns, formats values, drops redundant rows, and generates a structured cleaning audit trail.
49
+ 2. **Relationship Mapper (🔗)**: Maps numeric and categorical variables, rendering zoomable, interactive **Plotly** correlation charts.
50
+ 3. **Business Insights (💡)**: Analyzes statistical summaries and generates McKinsey/BCG consulting cards (Observation ➔ Implication ➔ Strategy) alongside critical risk alerts.
51
+ 4. **Visualizer Agent (📈)**: Automatically creates, styles, and saves formatted matplotlib PNG graphs.
52
+
53
+ ---
54
+
55
+ ## 🛠️ Installation & Setup
56
+
57
+ 1. **Clone & Navigate**:
58
+ ```bash
59
+ git clone https://github.com/your-username/Multi-Agent-Data-Analysis-System-with-CrewAI.git
60
+ cd Multi-Agent-Data-Analysis-System-with-CrewAI
61
+ ```
62
+
63
+ 2. **Initialize Environment**:
64
+ ```bash
65
+ python -m venv .venv
66
+ # Windows:
67
+ .\.venv\Scripts\Activate.ps1
68
+ # macOS/Linux:
69
+ source .venv/bin/activate
70
+ ```
71
+
72
+ 3. **Install Dependencies**:
73
+ ```bash
74
+ pip install -r requirements.txt
75
+ ```
76
+
77
+ 4. **Launch Server**:
78
+ ```bash
79
+ # Start the FastAPI Web App
80
+ python -m uvicorn main:app --host 127.0.0.1 --port 8000
81
+ ```
82
+ *Alternatively, double-click `run_web.bat` (Windows) to boot the server automatically.*
83
+
84
+ 5. **Open Browser**:
85
+ Navigate to [http://127.0.0.1:8000](http://127.0.0.1:8000)
86
+
87
+ ---
88
+
89
+ ## 📂 Project Structure
90
+
91
+ ```
92
+ .
93
+ ├── agents/ # CrewAI Agent factories
94
+ │ ├── cleaner.py # 🧹 Data Cleaner Agent
95
+ │ ├── relation.py # 🔗 Relationship Analyst Agent
96
+ │ ├── insights.py # 💡 BI McKinsey Insights Agent
97
+ │ └── visualizer.py # 📈 Matplotlib Visualizer Agent
98
+ ├── config/ # Platform configuration
99
+ │ ├── llm_config.py # Multi-Provider settings and model catalog
100
+ │ └── __init__.py
101
+ ├── tools/ # Orchestration tools
102
+ │ └── dataset_tools.py # read_head, subprocess sandbox runner, plotly builder
103
+ ├── ui/ # Document export services
104
+ │ └── export.py # Formatted PDF Cover & Content builder
105
+ ├── workflows/ # Workflow pipelines
106
+ │ └── pipeline.py # Make pipeline orchestration (adaptive cooldown)
107
+ ├── web/ # Web Frontend Assets
108
+ │ ├── index.html # Glassmorphic Workspace structure
109
+ │ ├── app.js # Frontend core logic (SSE logs, Chat, API hooks)
110
+ │ └── style.css # Dark Electric-Violet Theme styles
111
+ ├── data/ # Dynamic project sessions
112
+ │ └── sessions/ # Concurrency-isolated session directories
113
+ │ └── <session_id>/
114
+ │ ├── original_upload.csv
115
+ │ ├── cleaned.csv
116
+ │ └── metadata.json
117
+ ├── outputs/ # Sandbox generated PNG charts
118
+ │ └── <session_id>/
119
+ ├── assets/ # Static icons and complete_workflow.svg
120
+ ├── requirements.txt # Python package catalog
121
+ ├── main.py # FastAPI backend routing endpoints
122
+ ├── README.md # This file
123
+ ├── USAGE.md # Detailed user guide
124
+ ├── CHANGELOG.md # Version history
125
+ └── LICENSE # MIT License
126
+ ```
127
+
128
+ ---
129
+
130
+ ## ⚙️ Provider Gateway Support
131
+ The system integrates a custom gateway supporting **13+ LLM providers** through local configuration or environment variables:
132
+ - **Cloud Gateways**: OpenAI, Anthropic, Google Gemini, NVIDIA NIM, Groq, Mistral, TogetherAI, Cohere, OpenRouter, DeepSeek, Perplexity, HuggingFace.
133
+ - **Local Sandbox**: Ollama (auto-detects local models via the Ollama catalog).
134
+
135
+ ---
136
+
137
+ *Crewlyze*
138
+ *Copyright (c) 2025 Sowmiyan S*
139
+ *Licensed under the MIT License*
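Editor's sketch (not part of the package): the README says chart code is written by the model and run "in a sandboxed subprocess". One plausible shape for such a runner, assuming only the standard library, is shown below. `run_generated_code` and its timeout default are hypothetical; the real runner in `tools/dataset_tools.py` may differ. A child process alone is not a full security sandbox.

```python
import subprocess
import sys


def run_generated_code(code: str, timeout_s: float = 30.0) -> tuple[int, str, str]:
    """Run a code string in a separate Python process and capture its output.

    Raises subprocess.TimeoutExpired if the child exceeds timeout_s.
    """
    completed = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site and env vars
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return completed.returncode, completed.stdout, completed.stderr
```

The parent never calls `exec()` on the generated code; it only inspects the return code and the captured streams.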
package/USAGE.md ADDED
@@ -0,0 +1,106 @@
1
+ # Usage Guide
2
+
3
+ ## Overview
4
+
5
+ **Crewlyze** is a premium "Data Analysis as a Service" tool. It uses a swarm of specialized AI agents to clean, validate, analyze, and visualize your datasets automatically.
6
+
7
+ ## Prerequisites
8
+
9
+ - **Python**: Version 3.10 or higher
10
+ - **API Key**: A Groq, OpenAI, Anthropic, or Hugging Face API key (or a local Ollama setup).
11
+
12
+ ## Installation
13
+
14
+ 1. Clone the repository:
15
+ ```bash
16
+ git clone https://github.com/yourusername/Multi-Agent-Data-Analysis.git
17
+ cd Multi-Agent-Data-Analysis
18
+ ```
19
+
20
+ 2. Create a virtual environment:
21
+ ```bash
22
+ python -m venv .venv
23
+ .\.venv\Scripts\Activate.ps1 # Windows
24
+ # source .venv/bin/activate # Mac/Linux
25
+ ```
26
+
27
+ 3. Install dependencies:
28
+ ```bash
29
+ pip install -r requirements.txt
30
+ ```
31
+
32
+ 4. Configure your environment:
33
+ Create a `.env` file in the root directory:
34
+ ```env
35
+ # Example for Groq
36
+ LLM_PROVIDER=groq
37
+ GROQ_API_KEY=your_groq_api_key_here
38
+
39
+ # Example for Hugging Face
40
+ # LLM_PROVIDER=huggingface
41
+ # HUGGINGFACE_API_KEY=your_huggingface_api_key_here
42
+ ```
43
+
44
+ ## Quick Start
45
+
46
+ 1. **Prepare Data**: Ensure your CSV file is ready.
47
+
48
+ 2. **Run the System**:
49
+ ```bash
50
+ python crew.py
51
+ ```
52
+
53
+ 3. **Input Path**: When prompted, paste the full path to your CSV file (or press Enter to use the default `data/TB_Burden_Country.csv`).
54
+
55
+ 4. **View Results**:
56
+ - The system will automatically open `index.html` in your default browser.
57
+ - This report contains your Data Quality Score, Cleaning Logs, Visualizations, and Business Insights.
58
+
59
+ ## Detailed Features
60
+
61
+ ### 1. Data Quality Assurance
62
+ The **Data Quality Assurance Specialist** scans your dataset for:
63
+ - Missing values and anomalies
64
+ - Sufficient volume for analysis
65
+ - Data type consistency
66
+ - **Output**: A 0-100 Quality Score and a GO/NO-GO decision.
67
+
68
+ ### 2. Automated Cleaning
69
+ The **Data Cleaner** agent:
70
+ - Removes duplicates
71
+ - Fills missing values (Mean for numeric, Mode for categorical)
72
+ - Standardizes formats
73
+
74
+ ### 3. Relationship Analysis
75
+ The **Relationship Analyst**:
76
+ - Identifies correlations between columns
77
+ - Selects the best visualization type (Scatter, Bar, Line, Heatmap, etc.)
78
+
79
+ ### 4. Visualization Generation
80
+ The **Code Generator**:
81
+ - Writes bug-free Matplotlib/Seaborn code
82
+ - Executes the code to generate charts embedded in the report
83
+
84
+ ### 5. Business Intelligence
85
+ The **Business Intelligence Analyst**:
86
+ - Synthesizes all findings into actionable strategic insights.
87
+
88
+ ## Troubleshooting
89
+
90
+ ### Rate Limit Errors
91
+ If you see `RateLimitError`:
92
+ - Switch to a smaller model in `config/llm_config.py` (e.g., `llama-3.1-8b-instant`).
93
+ - The system is optimized to minimize token usage, but heavy usage may still hit free tier limits.
94
+
95
+ ### Browser Not Opening
96
+ - Manually open `index.html` in your browser.
97
+
98
+ ## Support
99
+
100
+ For issues, please open a ticket on our GitHub repository.
101
+
102
+ ---
103
+
104
+ *Crewlyze*
105
+ *Copyright (c) 2025 Sowmiyan S*
106
+ *Licensed under the MIT License*
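Editor's sketch (not part of the package): the `.env` example in USAGE.md pairs `LLM_PROVIDER` with a provider-specific key such as `GROQ_API_KEY`. The helper below shows one way such settings could be read and validated. The `<PROVIDER>_API_KEY` naming pattern is an assumption generalized from the two examples in the guide, and `read_llm_settings` is a hypothetical name, not the API of `config/llm_config.py`.

```python
import os
from typing import Mapping, Optional


def read_llm_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the provider name and its API key from environment-style settings."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if not provider:
        raise ValueError("LLM_PROVIDER is not set")
    key_var = f"{provider.upper()}_API_KEY"  # assumed naming pattern
    api_key = env.get(key_var)
    # Assumption: a local Ollama setup needs no API key.
    if provider != "ollama" and not api_key:
        raise ValueError(f"{key_var} is not set")
    return {"provider": provider, "api_key_var": key_var, "api_key": api_key}
```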
package/agents/__init__.py ADDED
File without changes
package/agents/cleaner.py ADDED
@@ -0,0 +1,38 @@
1
+ # Crewlyze
2
+ # Copyright (c) 2025 Sowmiyan S
3
+ # Licensed under the MIT License
4
+
5
+ from crewai import Agent, LLM
6
+ from config.llm_config import get_llm_params
7
+ from tools.dataset_tools import DatasetTools
8
+
9
+
10
+ def make_cleaner_agent() -> Agent:
11
+ """Factory — creates a fresh Data Cleaner agent with the current LLM config.
12
+
13
+ max_iter=5: read profile (already in task desc) → write cleaning code →
14
+ call clean_dataset_with_python → verify → final answer. 5 steps is enough.
15
+ """
16
+ return Agent(
17
+ name="Data Cleaner",
18
+ role="Dataset cleaning expert",
19
+ backstory=(
20
+ "You are an expert data cleaning specialist. The task description already "
21
+ "contains a full dataset profile (shape, dtypes, missing %, sample rows). "
22
+ "Use that profile to immediately identify quality issues and write cleaning "
23
+ "code — DO NOT call read_dataset_head or get_dataset_info first."
24
+ ),
25
+ goal=(
26
+ "Clean the dataset at the given file path by executing a Python script using "
27
+ "'Clean Dataset with Python Code'. When done, return a concise plain-text "
28
+ "bulleted list of the cleaning actions you took."
29
+ ),
30
+ llm=LLM(**get_llm_params()),
31
+ tools=[
32
+ DatasetTools.read_dataset_head, # fallback only
33
+ DatasetTools.get_dataset_info, # fallback only
34
+ DatasetTools.clean_dataset_with_python,
35
+ ],
36
+ max_iter=5,
37
+ verbose=True,
38
+ )
package/agents/insights.py ADDED
@@ -0,0 +1,44 @@
1
+ # Crewlyze
2
+ # Copyright (c) 2025 Sowmiyan S
3
+ # Licensed under the MIT License
4
+
5
+ from crewai import Agent, LLM
6
+ from config.llm_config import get_llm_params
7
+ from tools.dataset_tools import DatasetTools
8
+
9
+
10
+ def make_insights_agent() -> Agent:
11
+ """Factory — creates a fresh BI Insights agent with the current LLM config.
12
+
13
+ Enforces high-value management consulting output instead of dummy text.
14
+ """
15
+ return Agent(
16
+ name="Business Intelligence Analyst",
17
+ role="Derive strategic business insights and ROI-focused recommendations",
18
+ goal=(
19
+ "Generate 5 high-impact, context-specific business insights from the data profile "
20
+ "and column relationships. Format each insight as a numbered list item. "
21
+ "DO NOT write generic comments or dummy filler text. Each insight MUST include:\n"
22
+ "- **Observation**: The exact pattern, trend, or correlation shown in the columns.\n"
23
+ "- **Business Implication**: What this means for operational efficiency, revenue, customer satisfaction, or risk.\n"
24
+ "- **Actionable Strategy**: A concrete, practical recommendation the company can execute immediately to drive business value."
25
+ ),
26
+ backstory=(
27
+ "You are a Senior BI Director and Management Consultant (ex-McKinsey/BCG). You possess "
28
+ "a sharp ability to look at data profiles, column distributions, and correlations and immediately "
29
+ "translate them into strategic business realities. You write clearly, professionally, and persuasively. "
30
+ "You never use vague summaries or generic fillers — every point you make is tailored, analytical, "
31
+ "and directly useful to executive management.\n\n"
32
+ "CRITICAL CORRELATION RULE: Double check all correlation coefficient values you mention. Never state a "
33
+ "correlation is strong or moderate if the coefficient is 0 or -0. If the correlation coefficient is near 0, "
34
+ "there is no linear correlation. Quote the actual coefficients from the correlation matrix tool accurately."
35
+ ),
36
+ llm=LLM(**get_llm_params()),
37
+ tools=[
38
+ DatasetTools.read_dataset_head,
39
+ DatasetTools.get_dataset_info,
40
+ DatasetTools.get_correlation_matrix,
41
+ ],
42
+ max_iter=3,
43
+ verbose=True,
44
+ )
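Editor's sketch (not part of the package): the "CRITICAL CORRELATION RULE" in the backstory above asks the agent never to call a near-zero coefficient strong or moderate. A deterministic check of that kind could be sketched as follows. The 0.1, 0.3, and 0.7 thresholds are illustrative assumptions, and `describe_correlation` is a hypothetical helper that does not appear in the diff.

```python
def describe_correlation(r: float) -> str:
    """Label a Pearson coefficient by magnitude (thresholds are illustrative)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    magnitude = abs(r)  # abs(-0.0) is 0.0, so -0 is treated like 0
    if magnitude < 0.1:
        return "no linear correlation"
    if magnitude < 0.3:
        return "weak"
    if magnitude < 0.7:
        return "moderate"
    return "strong"
```

A label computed this way could be passed to the agent alongside the raw coefficient so that the prose and the number cannot disagree.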
package/agents/relation.py ADDED
@@ -0,0 +1,36 @@
1
+ # Crewlyze
2
+ # Copyright (c) 2025 Sowmiyan S
3
+ # Licensed under the MIT License
4
+
5
+ from crewai import Agent, LLM
6
+ from config.llm_config import get_llm_params
7
+ from tools.dataset_tools import DatasetTools
8
+
9
+
10
+ def make_relation_agent() -> Agent:
11
+ """Factory — creates a fresh Relation Analyst agent with the current LLM config."""
12
+ return Agent(
13
+ name="Analyst",
14
+ role="Identify high-value business correlations and dataset relationships",
15
+ goal=(
16
+ "Identify 5 key column relationships with high business relevance (e.g. comparing "
17
+ "metrics like cost vs revenue, demographic factors vs outcome, or country vs rate, "
18
+ "rather than trivial ID columns or metadata). Output ONLY a list in this exact format:\n"
19
+ "- X: [Column1] | Y: [Column2] | Type: [ChartType]\n"
20
+ "DO NOT output any introductions, explanations, or other text."
21
+ ),
22
+ backstory=(
23
+ "You are a Senior Quantitative Analyst. You have a keen eye for finding statistical "
24
+ "relations that translate to real-world business dynamics. You strictly follow "
25
+ "formatting guidelines and never invent columns that don't exist in the provided profile.\n\n"
26
+ "CRITICAL CHART RULE: If either Column1 (X) or Column2 (Y) is categorical (e.g. contains discrete "
27
+ "values like categories, gender, status, chest pain type 'cp', or classes), do NOT recommend a "
28
+ "'Scatter Plot'. Instead, recommend a 'Bar Chart' or 'Box Plot' or 'Grouped Bar Chart'. Scatter Plots "
29
+ "must only be used for continuous numeric vs continuous numeric variables."
30
+ ),
31
+ allow_delegation=False,
32
+ llm=LLM(**get_llm_params()),
33
+ tools=[DatasetTools.read_dataset_head, DatasetTools.get_correlation_matrix],
34
+ max_iter=3,
35
+ verbose=True,
36
+ )
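Editor's sketch (not part of the package): the relation agent's goal fixes an output format of `- X: [Column1] | Y: [Column2] | Type: [ChartType]`. A downstream consumer could parse such lines roughly as below. `Relation` and `parse_relations` are hypothetical names; how the actual pipeline parses this output is not shown in this part of the diff.

```python
import re
from typing import NamedTuple


class Relation(NamedTuple):
    x: str
    y: str
    chart_type: str


# Matches "- X: <col> | Y: <col> | Type: <chart>", with or without square brackets.
_RELATION_LINE = re.compile(
    r"^\s*-\s*X:\s*(?P<x>.+?)\s*\|\s*Y:\s*(?P<y>.+?)\s*\|\s*Type:\s*(?P<type>.+?)\s*$"
)


def parse_relations(text: str) -> list[Relation]:
    """Extract relation triples, silently skipping lines that do not match."""
    relations = []
    for line in text.splitlines():
        match = _RELATION_LINE.match(line)
        if match:
            relations.append(
                Relation(
                    match["x"].strip("[]"),
                    match["y"].strip("[]"),
                    match["type"].strip("[]"),
                )
            )
    return relations
```

Skipping non-matching lines tolerates stray introductions or explanations that the model may emit despite the prompt.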
package/agents/visualizer.py ADDED
@@ -0,0 +1,41 @@
1
+ # Crewlyze
2
+ # Copyright (c) 2025 Sowmiyan S
3
+ # Licensed under the MIT License
4
+
5
+ from crewai import Agent, LLM
6
+ from config.llm_config import get_llm_params
7
+ from tools.dataset_tools import DatasetTools
8
+
9
+
10
+ def make_visualizer_agent() -> Agent:
11
+ """Factory — creates a fresh Visualizer agent with the current LLM config."""
12
+ return Agent(
13
+ name="Data Visualizer",
14
+ role="Premium Data Visualization & Plotting Expert",
15
+ backstory=(
16
+ "You are a master of data visualization design and analytics. You believe that charts must be "
17
+ "both statistically correct AND visually stunning. You use seaborn and matplotlib to design "
18
+ "corporate-grade, light-themed figures that executives love.\n\n"
19
+ "You have access to a sandbox execution tool 'Execute Visualization Code' where the pandas DataFrame "
20
+ "is already loaded as `df` and a helper function `save_chart(filename)` is pre-defined for you.\n\n"
21
+ "CRITICAL RULE: You will be given a 'RELATIONSHIPS TO VISUALIZE' section in your task. You MUST "
22
+ "generate charts for EXACTLY those specified column pairs (X and Y columns listed). Do NOT invent "
23
+ "different columns. Do NOT skip any pair. Use the chart Type hint given for each pair.\n\n"
24
+ "Apply a clean white theme: set figure facecolor to 'white', axes facecolor to '#f8fafc', "
25
+ "tick/label colors to '#334155'. Use high-contrast corporate colors like '#4f46e5', '#06b6d4', '#ec4899', '#10b981'."
26
+ ),
27
+ goal=(
28
+ "Generate premium seaborn/matplotlib charts for EACH relationship pair listed in the "
29
+ "'RELATIONSHIPS TO VISUALIZE' section. Execute Python code using 'Execute Visualization Code' "
30
+ "for every pair, saving each chart with save_chart(). Apply light-themed professional styling. "
31
+ "If a pair fails, try an alternative chart type before giving up. Must generate at least 3 charts."
32
+ ),
33
+ llm=LLM(**get_llm_params()),
34
+ tools=[
35
+ DatasetTools.read_dataset_head,
36
+ DatasetTools.get_dataset_info,
37
+ DatasetTools.execute_visualization_code,
38
+ ],
39
+ max_iter=7,
40
+ verbose=True,
41
+ )
package/assets/badge_crewai.svg ADDED
@@ -0,0 +1,4 @@
1
+ <svg xmlns="http://www.w3.org/2000/svg" width="140" height="36" viewBox="0 0 140 36">
2
+ <rect rx="6" width="140" height="36" fill="#0b1320" />
3
+ <text x="12" y="23" font-family="Segoe UI, Roboto" font-size="14" fill="#fff">crewai</text>
4
+ </svg>
package/assets/badge_matplotlib.svg ADDED
@@ -0,0 +1,4 @@
1
+ <svg xmlns="http://www.w3.org/2000/svg" width="140" height="36" viewBox="0 0 140 36">
2
+ <rect rx="6" width="140" height="36" fill="#0b3d91" />
3
+ <text x="12" y="23" font-family="Segoe UI, Roboto" font-size="14" fill="#fff">matplotlib</text>
4
+ </svg>
package/assets/badge_ollama.svg ADDED
@@ -0,0 +1,4 @@
1
+ <svg xmlns="http://www.w3.org/2000/svg" width="140" height="36" viewBox="0 0 140 36">
2
+ <rect rx="6" width="140" height="36" fill="#1f2937" />
3
+ <text x="12" y="23" font-family="Segoe UI, Roboto" font-size="14" fill="#fff">Ollama</text>
4
+ </svg>
package/assets/badge_pandas.svg ADDED
@@ -0,0 +1,4 @@
1
+ <svg xmlns="http://www.w3.org/2000/svg" width="140" height="36" viewBox="0 0 140 36">
2
+ <rect rx="6" width="140" height="36" fill="#150458" />
3
+ <text x="12" y="23" font-family="Segoe UI, Roboto" font-size="14" fill="#fff">pandas</text>
4
+ </svg>
package/assets/badge_seaborn.svg ADDED
@@ -0,0 +1,4 @@
1
+ <svg xmlns="http://www.w3.org/2000/svg" width="140" height="36" viewBox="0 0 140 36">
2
+ <rect rx="6" width="140" height="36" fill="#0f172a" />
3
+ <text x="12" y="23" font-family="Segoe UI, Roboto" font-size="14" fill="#fff">seaborn</text>
4
+ </svg>
package/assets/branding_image.png ADDED
Binary file