crushdataai 1.2.1 → 1.2.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -25,12 +25,16 @@ Before writing any code, ask the user:
  2. **Data Context**: Which tables contain the data? What time range?
  3. **Metric Definitions**: How does YOUR company define the key metrics? Any filters?

+ 4. **Script Organization**: All analysis scripts must be saved in an `analysis/` folder.
+ 5. **Python Environment**: Check for `venv` or `.venv`. If missing, run `python3 -m venv venv`. Install/Run inside venv.
+ 6. **Reports**: Save all validation/profiling outputs to `reports/` folder. Create if missing.
+
  ### 2. Secure Data Access
  - **Check Connections**: Run `npx crushdataai connections` first.
  - **Missing Data?**: If the data source is not listed (e.g. on Desktop/Database), **INSTRUCT** the user to run:
  `npx crushdataai connect`
- - **Get Code**: Use `npx crushdataai snippet <name>` to access data.
- - **Security**: **DO NOT** ask for credentials or manual file moves.
+ - **Get Code**: **ALWAYS** use `npx crushdataai snippet <name>` to get loading code.
+ - **Security**: **DO NOT** ask user to copy/move files to `data/`. Treat connected data as read-only.

  ---

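A minimal sketch of the project bootstrap these new steps describe, in Python: the `analysis/`, `etl/`, and `reports/` folders, the `venv`/`.venv` check, and the "install and run inside the venv" rule come from the skill text above, while the helper name, the dependency list, and the POSIX `venv/bin/` layout are illustrative assumptions only.

```python
# Sketch only: prepare the folders and virtual environment the skill expects.
# Folder names and the venv check come from the skill text; everything else
# (the bootstrap() name, the dependency list, the POSIX venv/bin layout) is assumed.
import subprocess
import sys
from pathlib import Path


def bootstrap(deps=("pandas",)):
    # Folders for scripts and outputs: analysis/, etl/, reports/.
    for folder in ("analysis", "etl", "reports"):
        Path(folder).mkdir(exist_ok=True)

    # Check for venv/ or .venv/; create venv/ if neither exists.
    venv_dir = next((p for p in (Path("venv"), Path(".venv")) if p.is_dir()), None)
    if venv_dir is None:
        subprocess.run([sys.executable, "-m", "venv", "venv"], check=True)
        venv_dir = Path("venv")

    # Install dependencies with the venv's own pip (venv/bin/pip on POSIX).
    subprocess.run([str(venv_dir / "bin" / "pip"), "install", *deps], check=True)


if __name__ == "__main__":
    bootstrap()
```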
@@ -67,9 +71,21 @@ print(f"Date range: {df['date'].min()} to {df['date'].max()}")
  print(f"Missing values:\n{df.isnull().sum()}")
  ```

- Ask: "I found X rows, Y users, dates from A to B. Does this match expectation?"
+ Ask: "I found X rows, Y users, dates from A to B. Does this match expectation?"
+
+ ---
+
+ ## Step 3b: Data Cleaning & Transformation (ETL)
+
+ Based on profiling findings, perform necessary cleaning BEFORE analysis:
+ - **Handle Missing Values**: Impute or drop based on logic.
+ - **Remove Duplicates**: Check primary keys.
+ - **Fix Data Types**: Ensure dates are datetime objects, numbers are numeric.
+ - **Feature Engineering**: Create calculated fields needed for analysis.
+ - **Script Location**: Save all cleaning/ETL scripts in `etl/` folder.
+
+ ---

- ---

  ## Step 4: Execute with Validation

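The new Step 3b lists its cleaning operations in prose; one way they might look as an `etl/` script is sketched below. The file paths and column names (`user_id`, `date`, `revenue`) are placeholders for illustration, not anything shipped with the package.

```python
# etl/clean_events.py — sketch of the Step 3b cleaning pass described above.
# Paths and column names (user_id, date, revenue) are placeholders.
from pathlib import Path

import pandas as pd

df = pd.read_csv("data/raw_events.csv")

# Handle missing values: impute or drop based on logic.
df["revenue"] = df["revenue"].fillna(0)
df = df.dropna(subset=["user_id", "date"])

# Remove duplicates: check primary keys.
df = df.drop_duplicates(subset=["user_id", "date"])

# Fix data types: dates as datetime objects, numbers as numeric.
df["date"] = pd.to_datetime(df["date"])
df["revenue"] = pd.to_numeric(df["revenue"], errors="coerce")

# Feature engineering: calculated fields needed for analysis.
df["order_month"] = df["date"].dt.to_period("M").astype(str)

# Write the cleaned data for the analysis scripts to pick up.
Path("data/processed").mkdir(parents=True, exist_ok=True)
df.to_csv("data/processed/events_clean.csv", index=False)
print(f"Cleaned shape: {df.shape}")
```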
@@ -29,15 +29,27 @@ When user requests data analysis work (analyze, query, dashboard, metrics, EDA,
  - How does YOUR company define the key metrics?
  - Any filters to apply? (exclude test users, internal accounts?)
  - What timezone should I use for dates?
- - What timezone should I use for dates?
+
+ 4. **Script Organization**
+ - Save all analysis scripts in an `analysis/` folder.
+ - Create this folder if it does not exist.
+
+ 5. **Python Environment**
+ - Check for `venv` or `.venv`. If missing, `python3 -m venv venv`.
+ - Install dependencies in venv (`venv/bin/pip install`).
+ - Run scripts using venv python (`venv/bin/python`).
+
+ 6. **Reports**
+ - Save all profiling, validation, and findings to `reports/` folder.
+ - Create this folder if it does not exist.
  ```

  ### 1b. Secure Data Access
  - **Check Connections**: Run `npx crushdataai connections` first.
  - **Missing Data?**: If the data source is not listed (e.g. on Desktop/Database), **INSTRUCT** the user to run:
  `npx crushdataai connect`
- - **Get Code**: Use `npx crushdataai snippet <name>` to access data.
- - **Security**: **DO NOT** ask for credentials or manual file moves.
+ - **Get Code**: **ALWAYS** use `npx crushdataai snippet <name>` to get loading code.
+ - **Security**: **DO NOT** ask user to copy/move files to `data/`. Treat connected data as read-only.

  ### Step 2: Search Relevant Domains

@@ -97,6 +109,15 @@ FROM table;
  **Report findings to user before proceeding:**
  > "I found X rows, Y unique users, date range from A to B. Does this match your expectation?"

+ ### Step 3b: Data Cleaning & Transformation (ETL)
+
+ **Address data quality issues found in Step 3:**
+ 1. **Cleaning**: Handle missing values, remove invalid duplicates, fix types.
+ 2. **Transformation**: Standardize categories, parse dates, normalize text.
+ 3. **Feature Engineering**: Create calculated columns needed for metrics.
+
+ *Tip: Save cleaning scripts to `etl/` folder. Save processed data to `data/processed/`.*
+
  ### Step 4: Execute Analysis with Validation

  **Before JOINs:**
@@ -18,13 +18,16 @@ Before coding, ask:
  - What business question should this answer?
  - Which tables contain the data?
  - How does YOUR company define key metrics?
+ - **Script Folder**: Save scripts in `analysis/`. Create folder if needed.
+ - **Python Env**: Check for `venv`. If missing, create `venv`. Always run inside venv.
+ - **Reports**: Save all outputs to `reports/` folder (profiling, validation).

  ### 2. Secure Data Access
  - **Check Connections**: Run `npx crushdataai connections` first.
  - **Missing Data?**: If the data source is not listed (e.g. on Desktop/Database), **INSTRUCT** the user to run:
  `npx crushdataai connect`
- - **Get Code**: Use `npx crushdataai snippet <name>` to access data.
- - **Security**: **DO NOT** ask for credentials or manual file moves.
+ - **Get Code**: **ALWAYS** use `npx crushdataai snippet <name>` to get loading code.
+ - **Security**: **DO NOT** ask user to copy/move files to `data/`. Treat connected data as read-only.

  ### 3. Search Knowledge Base
  ```bash
@@ -40,6 +43,12 @@ print(f"Shape: {df.shape}, Dates: {df['date'].min()} to {df['date'].max()}")
  ```
  Report and confirm before proceeding.

+ ### 3b. Data Cleaning & Transformation (ETL)
+ - Clean: Missing, duplicates, types
+ - Transform: Feature engineering
+ - Save: Scripts in `etl/`
+ - Verify: Re-check shape
+
  ### 4. Validate
  - Verify JOINs
  - Check totals
@@ -15,13 +15,16 @@ Before coding, ask:
  - What business question should this answer?
  - Which tables contain the data?
  - How does YOUR company define the key metrics?
+ - **Script Folder**: Save scripts in `analysis/`. Create folder if needed.
+ - **Python Env**: Check for `venv`. If missing, create `venv`. Always run inside venv.
+ - **Reports**: Save all outputs to `reports/` folder (profiling, validation).

  ### 2. Secure Data Access
  - **Check Connections**: Run `npx crushdataai connections` first.
  - **Missing Data?**: If the data source is not listed (e.g. on Desktop/Database), **INSTRUCT** the user to run:
  `npx crushdataai connect`
- - **Get Code**: Use `npx crushdataai snippet <name>` to access data.
- - **Security**: **DO NOT** ask for credentials or manual file moves.
+ - **Get Code**: **ALWAYS** use `npx crushdataai snippet <name>` to get loading code.
+ - **Security**: **DO NOT** ask user to copy/move files to `data/`. Treat connected data as read-only.

  ### 2. Search Knowledge Base
  ```bash
@@ -38,6 +41,12 @@ print(f"Shape: {df.shape}, Date range: {df['date'].min()} to {df['date'].max()}"
  ```
  Report findings and ask user for confirmation.

+ ### 3b. Data Cleaning & Transformation (ETL)
+ - **Clean**: Missing values, types, duplicates.
+ - **Transform**: Standardize formats, create calculated fields.
+ - **Save**: Scripts go to `etl/` folder.
+ - **Verify**: Check data again after cleaning.
+
  ### 4. Validate Before Delivery
  - Check JOINs don't multiply rows unexpectedly
  - Verify totals seem reasonable
@@ -14,13 +14,16 @@ Before writing code, gather:
  - Data tables/sources
  - Company-specific metric definitions
  - Time range and filters
+ - **Script Folder**: Save scripts in `analysis/`. Create folder if needed.
+ - **Python Env**: Check for `venv`. If missing, create `venv`. Always run inside venv.
+ - **Reports**: Save all outputs to `reports/` folder (profiling, validation).

  ### 2. Secure Data Access
  - **Check Connections**: Run `npx crushdataai connections` first.
  - **Missing Data?**: If the data source is not listed (e.g. on Desktop/Database), **INSTRUCT** the user to run:
  `npx crushdataai connect`
- - **Get Code**: Use `npx crushdataai snippet <name>` to access data.
- - **Security**: **DO NOT** ask for credentials or manual file moves.
+ - **Get Code**: **ALWAYS** use `npx crushdataai snippet <name>` to get loading code.
+ - **Security**: **DO NOT** ask user to copy/move files to `data/`. Treat connected data as read-only.

  ### 3. Search Before Implementing
  ```bash
@@ -37,6 +40,12 @@ Run and report:

  Ask: "Does this match your expectation?"

+ ### 3b. Data Cleaning & Transformation (ETL)
+ - Handle missing/duplicates
+ - Fix types & formats
+ - Create calculated fields
+ - **Save**: Scripts go to `etl/` folder
+
  ### 4. Validate Before Delivery
  - Sanity check totals
  - Compare to benchmarks
@@ -1,9 +1,10 @@
  Workflow Name,Step Number,Step Name,Description,Questions to Ask,Tools/Commands,Outputs,Common Mistakes
  Exploratory Data Analysis,1,Define Objectives,Understand what insights are needed,"What business questions should this EDA answer? Who is the primary audience for these findings?","None - conversation with stakeholder","Clear list of questions to answer","Starting analysis without clear goals"
- Exploratory Data Analysis,2,Data Profiling,Understand data structure shape and types,"How many rows do you expect? What date range should I focus on?","df.info(), df.describe(), df.isnull().sum(), df.dtypes","Data profile report with shape types and missing values","Skipping profiling and diving straight into analysis"
- Exploratory Data Analysis,3,Univariate Analysis,Analyze individual columns distributions,"Are there any columns I should focus on specifically?","df.hist(), df.value_counts(), df.describe()","Histograms and value distributions for key columns","Not checking for outliers or unexpected values"
- Exploratory Data Analysis,4,Bivariate Analysis,Relationships between variables,"Which relationships are most important to understand?","df.corr(), scatter plots, grouped statistics","Correlation matrix and scatter plots showing relationships","Missing important correlations by not testing all pairs"
- Exploratory Data Analysis,5,Document Findings,Summarize insights,"What format do you prefer for the findings summary?","Markdown report generation","Summary report with key insights and recommendations","Not prioritizing findings by business impact"
+ Exploratory Data Analysis,2,Data Profiling,Understand data structure shape and types,"How many rows do you expect? What date range should I focus on?","df.info(), df.describe(), df.isnull().sum(), df.dtypes","Save profiling attributes to reports/profiling_report.md","Skipping profiling and diving straight into analysis"
+ Exploratory Data Analysis,3,Data Cleaning,Handle missing/duplicates and fix types,"Are there known quality issues? Should we impute or drop?","df.fillna(), df.drop_duplicates(), df.astype(); Save script to etl/ folder","Cleaned dataset ready for analysis","Skipping cleaning leads to wrong distributions"
+ Exploratory Data Analysis,4,Univariate Analysis,Analyze individual columns distributions,"Are there any columns I should focus on specifically?","df.hist(), df.value_counts(), df.describe()","Histograms and value distributions for key columns","Not checking for outliers or unexpected values"
+ Exploratory Data Analysis,5,Bivariate Analysis,Relationships between variables,"Which relationships are most important to understand?","df.corr(), scatter plots, grouped statistics","Correlation matrix and scatter plots showing relationships","Missing important correlations by not testing all pairs"
+ Exploratory Data Analysis,6,Document Findings,Summarize insights,"What format do you prefer for the findings summary?","Markdown report generation","Save summary report to reports/insights.md","Not prioritizing findings by business impact"
  Dashboard Creation,1,Define Audience,Who will use the dashboard and for what purpose,"Is this for executives (high-level KPIs) or analysts (detailed breakdowns)? How often will they view it?","None - conversation","Clear audience definition and use case","Building for wrong audience (too detailed for execs or too simple for analysts)"
  Dashboard Creation,2,Identify KPIs,What metrics matter most to track,"What are your top 5-7 metrics? Do you have targets for each?","Search industry metrics database","Prioritized list of KPIs with targets","Too many metrics (7+ KPIs causes cognitive overload)"
  Dashboard Creation,3,Data Preparation,Get data into usable format,"Which tables contain this data? What granularity (daily/weekly)?","SQL queries, pandas transformations","Clean aggregated data ready for visualization","Not validating data before visualization"
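The revised EDA rows above route profiling output to `reports/profiling_report.md`; a hedged sketch of a script producing that file follows. The input CSV path is a placeholder, and `to_markdown()` relies on the optional `tabulate` package.

```python
# Sketch: write the profiling output the EDA workflow above expects at
# reports/profiling_report.md. Input path is a placeholder; to_markdown()
# needs the optional 'tabulate' dependency.
import io
from pathlib import Path

import pandas as pd

df = pd.read_csv("data/raw_events.csv")

buf = io.StringIO()
df.info(buf=buf)  # df.info() writes to a stream; capture it for the report

Path("reports").mkdir(exist_ok=True)
report = "\n\n".join([
    "# Data Profiling Report",
    f"- Shape: {df.shape}",
    "## Dtypes and non-null counts",
    buf.getvalue(),
    "## Summary statistics",
    df.describe(include="all").to_markdown(),
    "## Missing values per column",
    df.isnull().sum().to_markdown(),
])
Path("reports/profiling_report.md").write_text(report)
```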
@@ -16,19 +17,22 @@ A/B Test Analysis,4,Statistical Analysis,Calculate significance and effect size,
  A/B Test Analysis,5,Interpret Results,What does this mean for business,"Should we roll out, iterate, or abandon based on results?","Business impact calculation","Actionable recommendation with expected impact","Declaring winner without considering practical significance"
  Cohort Analysis,1,Define Cohort,How to group users for analysis,"Should I cohort by signup date, first purchase, or another event?","None - conversation","Cohort definition documented","Using wrong cohort definition for the question"
  Cohort Analysis,2,Define Metric,What to measure over time,"Should I track retention, revenue, or activity? Over what time periods?","None - conversation","Metric and time periods defined","Measuring wrong metric for the business question"
- Cohort Analysis,3,Build Cohort Table,SQL for cohort pivot table,"Is there a specific date range to analyze?","SQL with window functions, pivot tables","Cohort table with periods as columns","Off-by-one errors in period calculations"
- Cohort Analysis,4,Visualize,Create retention heatmap,"Any specific cohorts to highlight?","Heatmap visualization","Color-coded retention heatmap","Using colors that don't show progression clearly"
- Cohort Analysis,5,Insights,Identify patterns and explain why,"Which cohorts performed best/worst?","Comparative analysis","Insights report with recommended actions","Not investigating WHY cohorts differ"
+ Cohort Analysis,3,Data QC & Prep,Clean data and handle nulls before grouping,"Are user IDs unique? Any null dates?","check_duplicates(), handled nulls; Save script to etl/ folder","Clean cohort/event dataset","Including null dates in cohort groups"
+ Cohort Analysis,4,Build Cohort Table,SQL for cohort pivot table,"Is there a specific date range to analyze?","SQL with window functions, pivot tables","Cohort table with periods as columns","Off-by-one errors in period calculations"
+ Cohort Analysis,5,Visualize,Create retention heatmap,"Any specific cohorts to highlight?","Heatmap visualization","Color-coded retention heatmap","Using colors that don't show progression clearly"
+ Cohort Analysis,6,Insights,Identify patterns and explain why,"Which cohorts performed best/worst?","Comparative analysis","Insights report with recommended actions","Not investigating WHY cohorts differ"
  Funnel Analysis,1,Define Steps,What are the funnel stages in order,"What is the first step? What is the final conversion event?","None - conversation","Ordered list of funnel steps","Missing steps or having steps out of order"
- Funnel Analysis,2,Count Users,How many users at each step,"What time window should I use for the funnel?","SQL to count distinct users per step","User counts at each stage","Counting sessions instead of unique users"
- Funnel Analysis,3,Calculate Drop-off,Where are users leaving,"Are there any known issues at specific steps?","Conversion rate between steps","Drop-off rates between each step","Comparing non-sequential steps"
- Funnel Analysis,4,Visualize,Create funnel chart,"Prefer horizontal bars or funnel shape?","Funnel or horizontal bar visualization","Funnel visualization","Not labeling percentages clearly"
- Funnel Analysis,5,Recommendations,How to improve conversion,"What lever do you have to improve each step?","Analysis of biggest opportunities","Prioritized list of improvement suggestions","Focusing on small improvements instead of biggest drop-offs"
+ Funnel Analysis,2,Data QC & Prep,Ensure event data is clean,"Are event names standardized? Any missing sessions?","Mapping event names, removing bots; Save script to etl/ folder","Clean event stream","Counting duplicate events"
+ Funnel Analysis,3,Count Users,How many users at each step,"What time window should I use for the funnel?","SQL to count distinct users per step","User counts at each stage","Counting sessions instead of unique users"
+ Funnel Analysis,4,Calculate Drop-off,Where are users leaving,"Are there any known issues at specific steps?","Conversion rate between steps","Drop-off rates between each step","Comparing non-sequential steps"
+ Funnel Analysis,5,Visualize,Create funnel chart,"Prefer horizontal bars or funnel shape?","Funnel or horizontal bar visualization","Funnel visualization","Not labeling percentages clearly"
+ Funnel Analysis,6,Recommendations,How to improve conversion,"What lever do you have to improve each step?","Analysis of biggest opportunities","Prioritized list of improvement suggestions","Focusing on small improvements instead of biggest drop-offs"
  Time Series Analysis,1,Define Metric,What to analyze over time,"Daily revenue, weekly users, or monthly orders? How far back?","None - conversation","Metric and time range defined","Wrong granularity (too granular hides trends or too aggregated misses patterns)"
- Time Series Analysis,2,Aggregate,Group by time period,"Any specific date filters? Exclude weekends?","SQL with DATE_TRUNC, GROUP BY","Aggregated time series data","Timezone issues in date grouping"
- Time Series Analysis,3,Decompose,Identify trend seasonality residual,"Is there known seasonality (weekly/monthly/yearly)?","Seasonal decomposition, moving averages","Decomposed components visualization","Ignoring seasonality when comparing periods"
- Time Series Analysis,4,Compare Periods,YoY MoM WoW comparisons,"Which comparison periods matter most?","LAG functions, period-over-period calculations","Comparison table with growth rates","Comparing incomplete periods"
- Time Series Analysis,5,Forecast (optional),Predict future values,"Do you need forecasting? What horizon?","Simple forecasting models","Forecast with confidence intervals","Overfitting on historical data"
+ Time Series Analysis,2,Data QC & Prep,Handle gaps and outliers in time series,"Missing dates? Extreme outliers?","Resampling, outlier detection; Save script to etl/ folder","Clean time series","Interpolating missing data incorrectly"
+ Time Series Analysis,3,Aggregate,Group by time period,"Any specific date filters? Exclude weekends?","SQL with DATE_TRUNC, GROUP BY","Aggregated time series data","Timezone issues in date grouping"
+ Time Series Analysis,4,Decompose,Identify trend seasonality residual,"Is there known seasonality (weekly/monthly/yearly)?","Seasonal decomposition, moving averages","Decomposed components visualization","Ignoring seasonality when comparing periods"
+ Time Series Analysis,5,Compare Periods,YoY MoM WoW comparisons,"Which comparison periods matter most?","LAG functions, period-over-period calculations","Comparison table with growth rates","Comparing incomplete periods"
+ Time Series Analysis,6,Forecast (optional),Predict future values,"Do you need forecasting? What horizon?","Simple forecasting models","Forecast with confidence intervals","Overfitting on historical data"
  Customer Segmentation,1,Define Variables,What to segment on,"RFM, behavior, or demographics? What actions should differ by segment?","None - conversation","Segmentation variables defined","Choosing variables that don't drive different actions"
  Customer Segmentation,2,Feature Engineering,Calculate segment variables,"What time window for calculating features?","SQL or Python for RFM or other features","Feature table ready for segmentation","Using raw values instead of normalized scores"
  Customer Segmentation,3,Clustering,Group similar customers,"How many segments should we create?","K-means or rule-based segmentation","Cluster assignments","Too many or too few segments"
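The reworked Cohort Analysis rows above add a QC pass before the cohort pivot; a compact pandas sketch of those two steps (QC, then the cohort table with periods as columns) is below. The column names, file path, and monthly period logic are assumptions for illustration, not taken from the package.

```python
# Sketch of Cohort Analysis steps 3-4 above: QC the events, then pivot
# cohorts (rows) against periods since first activity (columns).
# Column names (user_id, event_date), the input path, and monthly periods are assumed.
import pandas as pd

events = pd.read_csv("data/processed/events_clean.csv", parse_dates=["event_date"])

# Data QC & Prep: drop null dates and duplicate user/date rows before grouping.
events = events.dropna(subset=["user_id", "event_date"])
events = events.drop_duplicates(subset=["user_id", "event_date"])

# Cohort = month of each user's first event; period = months since that cohort.
events["event_month"] = events["event_date"].dt.to_period("M")
events["cohort_month"] = events.groupby("user_id")["event_month"].transform("min")
events["period"] = (events["event_month"] - events["cohort_month"]).apply(lambda d: d.n)

# Cohort table with periods as columns, counting unique users per cell.
cohort_counts = (
    events.groupby(["cohort_month", "period"])["user_id"].nunique().unstack(fill_value=0)
)

# Retention heatmap input: each period divided by the cohort's period-0 size.
retention = cohort_counts.divide(cohort_counts[0], axis=0).round(3)
print(retention)
```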
@@ -12,13 +12,16 @@ Ask before coding:
  - Business question this analysis should answer
  - Which tables/databases contain the data
  - Company-specific metric definitions
+ - **Script Folder**: Save scripts in `analysis/`. Create folder if needed.
+ - **Python Env**: Check for `venv`. If missing, create `venv`. Always run inside venv.
+ - **Reports**: Save all outputs to `reports/` folder (profiling, validation).

  ### 2. Secure Data Access
  - **Check Connections**: Run `npx crushdataai connections` first.
  - **Missing Data?**: If the data source is not listed (e.g. on Desktop/Database), **INSTRUCT** the user to run:
  `npx crushdataai connect`
- - **Get Code**: Use `npx crushdataai snippet <name>` to access data.
- - **Security**: **DO NOT** ask for credentials or manual file moves.
+ - **Get Code**: **ALWAYS** use `npx crushdataai snippet <name>` to get loading code.
+ - **Security**: **DO NOT** ask user to copy/move files to `data/`. Treat connected data as read-only.

  ### 3. Search Knowledge
  ```bash
@@ -35,6 +38,12 @@ Run profiling before any analysis:
  print(f"Shape: {df.shape}, Dates: {df['date'].min()} to {df['date'].max()}")
  ```

+ ### 3b. Data Cleaning & Transformation (ETL)
+ - Handle missing values/duplicates
+ - Fix data types
+ - Create calculated fields
+ - **Save**: Scripts go to `etl/` folder
+
  ### 4. Validate
  - Verify JOINs don't multiply rows
  - Check totals are reasonable
@@ -4,6 +4,7 @@ export declare class ShopifyConnector implements Connector {
  type: string;
  test(connection: Connection): Promise<boolean>;
  getTables(connection: Connection): Promise<Table[]>;
+ private fetchTotalCount;
  getData(connection: Connection, tableName: string, page: number, limit: number): Promise<TableData>;
  getSnippet(connection: Connection, lang: string): string;
  }
@@ -6,23 +6,168 @@ class ShopifyConnector {
          this.type = 'shopify';
      }
      async test(connection) {
+         console.log(`[Shopify] Testing connection for ${connection.name} (Store: ${connection.store})`);
          if (!connection.store || !connection.apiKey) {
-             throw new Error('Store URL and API Key are required');
+             throw new Error('Store URL and Admin API Access Token are required');
+         }
+         // Basic validation for Store URL
+         if (connection.store.includes('_') && !connection.store.includes('.')) {
+             throw new Error(`Invalid Store URL: "${connection.store}". It looks like you might have pasted an API key/secret here. The Store URL should be something like "your-shop.myshopify.com".`);
+         }
+         const storeUrl = connection.store.replace(/\/$/, '');
+         const url = `${storeUrl.startsWith('http') ? '' : 'https://'}${storeUrl}/admin/api/2024-04/shop.json`;
+         try {
+             const response = await fetch(url, {
+                 headers: {
+                     'X-Shopify-Access-Token': connection.apiKey,
+                     'Content-Type': 'application/json'
+                 }
+             });
+             if (!response.ok) {
+                 const errorData = await response.json();
+                 console.error(`[Shopify] Test failed:`, errorData);
+                 throw new Error(errorData.errors || `Shopify API error: ${response.statusText}`);
+             }
+             console.log(`[Shopify] Connection test successful for ${connection.name}`);
+             return true;
+         }
+         catch (error) {
+             console.error(`[Shopify] Connection test error:`, error.message);
+             throw new Error(`Connection failed: ${error.message}`);
          }
-         return true;
      }
      async getTables(connection) {
-         return [];
+         console.log(`[Shopify] getTables called for ${connection.name}`);
+         const tables = ['orders', 'products', 'customers'];
+         const result = [];
+         for (const table of tables) {
+             try {
+                 const count = await this.fetchTotalCount(connection, table);
+                 result.push({ name: table, type: 'shopify', rowCount: count });
+             }
+             catch (e) {
+                 console.warn(`[Shopify] Could not fetch count for ${table}:`, e);
+                 result.push({ name: table, type: 'shopify' });
+             }
+         }
+         return result;
+     }
+     async fetchTotalCount(connection, tableName) {
+         if (!connection.store || !connection.apiKey)
+             return null;
+         const storeUrl = connection.store.replace(/\/$/, '');
+         const url = `${storeUrl.startsWith('http') ? '' : 'https://'}${storeUrl}/admin/api/2024-04/${tableName}/count.json`;
+         try {
+             const response = await fetch(url, {
+                 headers: {
+                     'X-Shopify-Access-Token': connection.apiKey,
+                     'Content-Type': 'application/json'
+                 }
+             });
+             if (!response.ok)
+                 return null;
+             const data = await response.json();
+             return data.count;
+         }
+         catch {
+             return null;
+         }
      }
      async getData(connection, tableName, page, limit) {
-         return {
-             columns: [],
-             rows: [],
-             pagination: { page, limit, totalRows: 0, totalPages: 0, startIdx: 0, endIdx: 0 }
-         };
+         if (!connection.store || !connection.apiKey) {
+             throw new Error('Store URL and Admin API Access Token are required');
+         }
+         const storeUrl = connection.store.replace(/\/$/, '');
+         let url = `${storeUrl.startsWith('http') ? '' : 'https://'}${storeUrl}/admin/api/2024-04/${tableName}.json?limit=${limit}`;
+         // Add status=any for orders to fetch more than just 'open' orders
+         if (tableName === 'orders') {
+             url += '&status=any';
+         }
+         console.log(`[Shopify] Fetching ${tableName} from: ${url}`);
+         try {
+             const response = await fetch(url, {
+                 headers: {
+                     'X-Shopify-Access-Token': connection.apiKey,
+                     'Content-Type': 'application/json'
+                 }
+             });
+             if (!response.ok) {
+                 const errorData = await response.json();
+                 console.error(`[Shopify] Error response:`, errorData);
+                 throw new Error(errorData.errors || `Shopify API error: ${response.statusText}`);
+             }
+             const data = await response.json();
+             const rawRows = data[tableName] || [];
+             console.log(`[Shopify] Successfully fetched ${rawRows.length} rows for ${tableName}`);
+             // Transform Shopify nested JSON into flat rows for the UI
+             const rows = rawRows.map((item) => {
+                 const flat = {};
+                 for (const key in item) {
+                     if (typeof item[key] === 'object' && item[key] !== null) {
+                         flat[key] = JSON.stringify(item[key]);
+                     }
+                     else {
+                         flat[key] = item[key];
+                     }
+                 }
+                 return flat;
+             });
+             const columns = rows.length > 0 ? Object.keys(rows[0]) : [];
+             const totalRows = (await this.fetchTotalCount(connection, tableName)) || rows.length;
+             const totalPages = Math.ceil(totalRows / limit) || 1;
+             return {
+                 columns,
+                 rows,
+                 pagination: {
+                     page,
+                     limit,
+                     totalRows,
+                     totalPages,
+                     startIdx: (page - 1) * limit + 1,
+                     endIdx: (page - 1) * limit + rows.length
+                 }
+             };
+         }
+         catch (error) {
+             console.error(`[Shopify] Fetch error:`, error.message);
+             throw new Error(`Failed to fetch data: ${error.message}`);
+         }
      }
      getSnippet(connection, lang) {
-         return `# Shopify snippet generation not implemented yet`;
+         const storeUrl = connection.store?.replace(/\/$/, '') || 'your-shop.myshopify.com';
+         const apiKey = connection.apiKey || 'shpat_...';
+         if (lang === 'python') {
+             return `import requests
+ import pandas as pd
+
+ # Connection: ${connection.name}
+ # Type: shopify
+ shop_url = "${storeUrl}"
+ access_token = "${apiKey}"
+ api_version = "2024-04"
+
+ def fetch_shopify_data(endpoint):
+     url = f"{'https://' if not shop_url.startswith('http') else ''}{shop_url}/admin/api/{api_version}/{endpoint}.json"
+     headers = {
+         "X-Shopify-Access-Token": access_token,
+         "Content-Type": "application/json"
+     }
+
+     response = requests.get(url, headers=headers)
+     response.raise_for_status()
+     return response.json()
+
+ try:
+     # Example: Fetching orders
+     data = fetch_shopify_data("orders")
+     df = pd.DataFrame(data["orders"])
+     print(f"Successfully loaded {len(df)} orders from ${connection.name}")
+     print(df.head())
+ except Exception as e:
+     print(f"Error loading Shopify data: {e}")
+ `;
+         }
+         return `# Language ${lang} not supported for Shopify connector yet.`;
      }
  }
  exports.ShopifyConnector = ShopifyConnector;
package/dist/server.js CHANGED
@@ -67,7 +67,7 @@ app.use(express_1.default.static(uiPath));
  app.use('/api/connections', connections_1.default);
  // API: Health check
  app.get('/api/health', (_req, res) => {
-     res.json({ status: 'ok', version: '2.0.0' });
+     res.json({ status: 'ok', version: '1.2.6' });
  });
  // Serve index.html for root
  app.get('/', (_req, res) => {
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "crushdataai",
-   "version": "1.2.1",
+   "version": "1.2.6",
    "description": "CLI to install CrushData AI data analyst skill for AI coding assistants",
    "main": "dist/index.js",
    "bin": {