codemodctl 0.1.1 → 0.1.2

package/README.md CHANGED
@@ -1,174 +1,165 @@
1
1
  # codemodctl
2
2
 
3
- A CLI tool for workflow engine operations, providing handy commands to interact with codemod APIs and process workflow data.
3
+ CLI tool and utilities for workflow engine operations, file sharding, and codeowner analysis.
4
4
 
5
5
  ## Installation
6
6
 
7
7
  ```bash
8
- pnpm install @acme/codemodctl
8
+ npm install codemodctl
9
9
  ```
10
10
 
11
11
  ## Usage
12
12
 
13
+ ### As a CLI Tool
14
+
13
15
  ```bash
14
- codemodctl <command> [options]
16
+ # Analyze CODEOWNERS and generate sharding configuration
17
+ codemodctl codeowner --shard-size 20 --state-prop shards --rule ./rule.yaml
15
18
  ```
16
19
 
17
- ## Commands
20
+ ### As a Library
18
21
 
19
- ### `pr create`
22
+ #### Deterministic File Sharding
20
23
 
21
- Create a pull request using the codemod API.
24
+ ```typescript
25
+ import { getShardForFilename, fitsInShard, distributeFilesAcrossShards } from 'codemodctl/sharding';
22
26
 
23
- ```bash
24
- codemodctl pr create --title "feat: implement new feature" [options]
25
- ```
27
+ // Get the shard index for a specific file - always deterministic!
28
+ const shardIndex = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
26
29
 
27
- #### Options
30
+ // Same file + same shard count = same result, every time
31
+ const shard1 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
32
+ const shard2 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
33
+ console.log(shard1 === shard2); // always true
28
34
 
29
- - `--title <title>` (required): The title of the pull request
30
- - `--body <body>` (optional): The body/description of the pull request
31
- - `--head <branch>` (optional): The head branch for the pull request
32
- - `--base <branch>` (optional): The base branch to merge into (defaults to 'main')
35
+ // Check if a file belongs to a specific shard
36
+ const belongsToShard = fitsInShard('src/components/Button.tsx', {
37
+ shardCount: 5,
38
+ shardIndex: 2
39
+ });
33
40
 
34
- #### Examples
41
+ // Distribute all files across shards
42
+ const files = ['file1.ts', 'file2.ts', 'file3.ts'];
43
+ const distribution = distributeFilesAcrossShards(files, 5);
44
+ ```
35
45
 
36
- ```bash
37
- # Create a simple PR with just a title
38
- codemodctl pr create --title "feat: implement new feature"
46
+ #### Codeowner Analysis
39
47
 
40
- # Create a PR with title and body
41
- codemodctl pr create --title "feat: implement new feature" --body "This PR implements a new feature that improves performance"
48
+ ```typescript
49
+ import { analyzeCodeowners, findCodeownersFile } from 'codemodctl/codeowners';
42
50
 
43
- # Create a PR with custom branches
44
- codemodctl pr create --title "feat: implement new feature" --head "feature-branch" --base "develop"
51
+ // Analyze codeowners and generate shard configuration
52
+ const result = await analyzeCodeowners({
53
+ shardSize: 20,
54
+ rulePath: './rule.yaml',
55
+ projectRoot: process.cwd()
56
+ });
45
57
 
46
- # Create a comprehensive PR
47
- codemodctl pr create \
48
- --title "feat: implement new feature" \
49
- --body "This PR implements a new feature that improves performance by 50%" \
50
- --head "feature-branch" \
51
- --base "main"
58
+ console.log(`Generated ${result.shards.length} shards for ${result.totalFiles} files`);
59
+ result.teams.forEach(team => {
60
+ console.log(`Team "${team.team}" owns ${team.fileCount} files`);
61
+ });
52
62
  ```
53
63
 
54
- ### `shard codeowner`
64
+ #### Complete API
55
65
 
56
- Analyze a GitHub CODEOWNERS file and generate sharding output for teams based on file ownership.
66
+ ```typescript
67
+ import codemodctl from 'codemodctl';
57
68
 
58
- ```bash
59
- codemodctl shard codeowner --shard-size <size> --state-prop <property-name> [--codeowners <path>]
69
+ // Access all utilities through the default export
70
+ const shardIndex = await codemodctl.sharding.getShardForFilename('file.ts', { shardCount: 5 });
71
+ const analysis = await codemodctl.codeowners.analyzeCodeowners(options);
60
72
  ```
61
73
 
62
- #### Options
63
-
64
- - `--shard-size <size>` (required): Number of files per shard
65
- - `--state-prop <property>` (required): Property name to use in the state output
66
- - `--codeowners <path>` (optional): Path to CODEOWNERS file. If not provided, searches in current directory, `.github/`, or `docs/`
74
+ ## Key Features
67
75
 
68
- #### Examples
76
+ ### Deterministic File Sharding
69
77
 
70
- ```bash
71
- # Create shards with 10 files per shard (auto-discover CODEOWNERS file)
72
- codemodctl shard codeowner --shard-size 10 --state-prop teamShards
78
+ The sharding algorithm uses deterministic hashing to ensure:
73
79
 
74
- # Create shards with custom CODEOWNERS path
75
- codemodctl shard codeowner --shard-size 25 --state-prop migrationShards --codeowners ./custom/CODEOWNERS
80
+ - **Perfect consistency**: Same file + same shard count = same result, always
81
+ - **No external dependencies**: Result depends only on filename and shard count
82
+ - **Even distribution**: SHA1 hashing provides good distribution across shards
83
+ - **Simple API**: No complex parameters or configuration needed
84
+ - **Team-aware sharding**: Works with codeowner boundaries
76
85
 
77
- # Create shards for a specific team distribution
78
- codemodctl shard codeowner --shard-size 50 --state-prop deploymentShards
79
- ```
86
+ ### Codeowner Analysis
80
87
 
81
- #### How it works
88
+ - **Automatic CODEOWNERS detection**: Searches common locations (root, .github/, docs/)
89
+ - **AST-grep integration**: Analyze files using custom rules
90
+ - **Team-based grouping**: Groups files by their assigned teams
91
+ - **Shard generation**: Creates optimal shard configuration based on team ownership
82
92
 
83
- 1. **CODEOWNERS Discovery**: Automatically finds CODEOWNERS file in common locations (root, `.github/`, `docs/`) or uses provided path
84
- 2. **File Analysis**: Uses the `codeowners` npm package to parse the GitHub CODEOWNERS file and determine file ownership
85
- 3. **File Counting**: Scans the repository and counts files owned by each team/user (excluding common ignore patterns)
86
- 4. **Shard Calculation**: Divides the total files by the shard size to determine number of shards needed per team
87
- 5. **Output Generation**: Creates a JSON array with team and shard information
88
- 6. **State Output**: Writes the result to the file specified by `$STATE_OUTPUTS` environment variable
93
+ ## API Reference
89
94
 
90
- #### CODEOWNERS File Format
95
+ ### Sharding Functions
91
96
 
92
- The tool works with standard GitHub CODEOWNERS syntax:
97
+ - `getShardForFilename(filename, { shardCount })` - Get shard index for a file
98
+ - `fitsInShard(filename, { shardCount, shardIndex })` - Check shard membership
99
+ - `distributeFilesAcrossShards(files, shardCount)` - Distribute files across shards
100
+ - `calculateOptimalShardCount(totalFiles, targetShardSize)` - Calculate optimal shard count
101
+ - `getFileHashPosition(filename)` - Get consistent hash position for a file
93
102
 
94
- ```
95
- # Global owners
96
- * @global-team
103
+ All functions are deterministic: same input always produces the same output.
97
104
 
98
- # Frontend files
99
- src/components/ @frontend-team
100
- *.tsx @frontend-team @design-team
105
+ ### Codeowner Functions
101
106
 
102
- # Backend files
103
- src/api/ @backend-team
104
- *.sql @database-team @backend-team
107
+ - `analyzeCodeowners(options)` - Complete analysis with shard generation
108
+ - `findCodeownersFile(projectRoot?, explicitPath?)` - Locate CODEOWNERS file
109
+ - `loadAstGrepRule(rulePath)` - Parse AST-grep rule from YAML
110
+ - `analyzeFilesByOwner(codeownersPath, rule, projectRoot?)` - Group files by owner
111
+ - `generateShards(filesByOwner, shardSize)` - Generate shard configuration
112
+ - `normalizeOwnerName(owner)` - Normalize owner names
105
113
 
106
- # DevOps files
107
- .github/ @devops-team
108
- Dockerfile @devops-team
109
- ```
114
+ ## Usage Examples
110
115
 
111
- #### Example Output
112
-
113
- For teams with the following file ownership:
114
- - `frontend-team`: 100 files
115
- - `backend-team`: 75 files
116
- - `devops-team`: 25 files
117
-
118
- With `--shard-size 25`:
119
-
120
- ```json
121
- [
122
- {"team": "frontend-team", "shard": "1/4"},
123
- {"team": "frontend-team", "shard": "2/4"},
124
- {"team": "frontend-team", "shard": "3/4"},
125
- {"team": "frontend-team", "shard": "4/4"},
126
- {"team": "backend-team", "shard": "1/3"},
127
- {"team": "backend-team", "shard": "2/3"},
128
- {"team": "backend-team", "shard": "3/3"},
129
- {"team": "devops-team", "shard": "1/1"}
130
- ]
131
- ```
116
+ ### Simple Deterministic Sharding
117
+ ```typescript
118
+ import { getShardForFilename, distributeFilesAcrossShards } from 'codemodctl/sharding';
132
119
 
133
- The tool will write to the state output file:
134
- ```
135
- teamShards=[{"team": "frontend-team", "shard": "1/4"}, {"team": "frontend-team", "shard": "2/4"}, ...]
136
- ```
120
+ // Get shard for a file - always deterministic
121
+ const shard = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
137
122
 
138
- ## Environment Variables
123
+ // Same input always gives same output
124
+ const shard1 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
125
+ const shard2 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
126
+ console.log(shard1 === shard2); // always true
139
127
 
140
- ### For PR Creation
141
- - `BUTTERFLOW_API_ENDPOINT`: The API endpoint for codemod services
142
- - `BUTTERFLOW_API_AUTH_TOKEN`: Authentication token for API requests
143
- - `CODEMOD_TASK_ID`: Current task ID for the workflow
128
+ // Different shard counts may give different results (that's expected)
129
+ const shard5 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
130
+ const shard10 = getShardForFilename('src/components/Button.tsx', { shardCount: 10 });
131
+ // shard5 and shard10 may be different, but each is consistent
144
132
 
145
- ### For State Output
146
- - `STATE_OUTPUTS`: Path to the file where state outputs should be written
133
+ // Distribute files deterministically
134
+ const files = ['file1.ts', 'file2.ts', 'file3.ts'];
135
+ const distribution = distributeFilesAcrossShards(files, 5);
136
+ ```
147
137
 
148
- ## Integration with Workflow Engine
138
+ ### Key Benefits
139
+ - **No complex parameters**: Just filename and shard count
140
+ - **Perfectly deterministic**: Same input = same output, always
141
+ - **Fast and simple**: Pure hash-based assignment
142
+ - **Works across runs**: File gets same shard whether filesystem changes or not
149
143
 
150
- codemodctl is designed to work seamlessly with the workflow engine:
144
+ ## CLI Commands
151
145
 
152
- 1. **PR Creation**: Automatically uses workflow context (task ID, project name) when available
153
- 2. **State Management**: Writes outputs to the workflow state file for use in subsequent steps
154
- 3. **Error Handling**: Provides clear error messages and appropriate exit codes for workflow integration
146
+ ### `codeowner`
155
147
 
156
- ## Development
148
+ Analyze CODEOWNERS file and generate sharding configuration.
157
149
 
158
150
  ```bash
159
- # Install dependencies
160
- pnpm install
161
-
162
- # Build the project
163
- pnpm build
151
+ codemodctl codeowner [options]
164
152
 
165
- # Run tests
166
- pnpm test
167
-
168
- # Run in development mode
169
- pnpm dev
153
+ Options:
154
+ -s, --shard-size <size> Number of files per shard (required)
155
+ -p, --state-prop <prop> Property name for state output (required)
156
+ -c, --codeowners <path> Path to CODEOWNERS file (optional)
157
+ -r, --rule <path> Path to AST-grep rule file (required)
170
158
  ```
171
159
 
160
+ Environment variables:
161
+ - `STATE_OUTPUTS`: Path to write state output file
162
+
172
163
  ## License
173
164
 
174
- MIT
165
+ MIT
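The `STATE_OUTPUTS` file named in the README receives one appended line of the form `<stateProp>=<JSON>` per run, as the `codeowner` command in `dist/cli.js` shows. A minimal sketch of reading such a file back follows. `parseStateOutputs` and `ShardEntry` are hypothetical names, not part of codemodctl, and the sketch assumes one entry per line with the first `=` separating the key from its JSON value.

```typescript
// Hypothetical helper (not part of codemodctl): parses `key=<JSON>` lines.
type ShardEntry = { team: string; shard: string };

function parseStateOutputs(text: string): Map<string, unknown> {
  const entries = new Map<string, unknown>();
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a key=value line
    // Only the first "=" splits the line; the JSON value may contain more.
    entries.set(line.slice(0, eq), JSON.parse(line.slice(eq + 1)));
  }
  return entries;
}

const sample =
  'teamShards=[{"team":"frontend-team","shard":"1/2"},{"team":"frontend-team","shard":"2/2"}]\n';
const parsed = parseStateOutputs(sample);
const teamShards = parsed.get("teamShards") as ShardEntry[];
console.log(teamShards.map((s) => `${s.team} ${s.shard}`));
```

Because the CLI appends rather than overwrites, a later line with the same property name replaces the earlier one in this sketch.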
package/dist/cli.js CHANGED
@@ -1,7 +1,221 @@
1
1
  #!/usr/bin/env node
2
- import { prCommand, shardCommand } from "./shard-BOcsYHKh.js";
2
+ import { analyzeCodeowners } from "./codeowner-analysis-CeRIGuLu.js";
3
3
  import { defineCommand, runMain } from "citty";
4
+ import { exec } from "node:child_process";
5
+ import crypto from "node:crypto";
6
+ import fetch from "node-fetch";
7
+ import { writeFile } from "node:fs/promises";
4
8
 
9
+ //#region src/commands/pr/create.ts
10
+ const createCommand = defineCommand({
11
+ meta: {
12
+ name: "create",
13
+ description: "Create a pull request"
14
+ },
15
+ args: {
16
+ title: {
17
+ type: "string",
18
+ description: "Title of the pull request",
19
+ required: true
20
+ },
21
+ body: {
22
+ type: "string",
23
+ description: "Body/description of the pull request",
24
+ required: false
25
+ },
26
+ head: {
27
+ type: "string",
28
+ description: "Head branch for the pull request",
29
+ required: false
30
+ },
31
+ base: {
32
+ type: "string",
33
+ description: "Base branch to merge into",
34
+ required: false,
35
+ default: "main"
36
+ },
37
+ push: {
38
+ type: "boolean",
39
+ required: false
40
+ },
41
+ commitMessage: {
42
+ type: "string",
43
+ description: "Message to commit",
44
+ required: false
45
+ },
46
+ branchName: {
47
+ type: "string",
48
+ description: "Branch to create the pull request from",
49
+ required: false
50
+ }
51
+ },
52
+ async run({ args }) {
53
+ const { title, body, head, base, push, commitMessage, branchName } = args;
54
+ if (push && !commitMessage) {
55
+ console.error("Error: commitMessage is required if commit is true");
56
+ process.exit(1);
57
+ }
58
+ const apiEndpoint = process.env.BUTTERFLOW_API_ENDPOINT;
59
+ const authToken = process.env.BUTTERFLOW_API_AUTH_TOKEN;
60
+ const taskId = process.env.CODEMOD_TASK_ID;
61
+ if (!taskId) {
62
+ console.error("Error: CODEMOD_TASK_ID environment variable is required");
63
+ process.exit(1);
64
+ }
65
+ if (!apiEndpoint) {
66
+ console.error("Error: BUTTERFLOW_API_ENDPOINT environment variable is required");
67
+ process.exit(1);
68
+ }
69
+ if (!authToken) {
70
+ console.error("Error: BUTTERFLOW_API_AUTH_TOKEN environment variable is required");
71
+ process.exit(1);
72
+ }
73
+ if (exec("git diff --quiet || git diff --cached --quiet").stderr?.read().toString().trim()) {
74
+ console.error("No changes detected, skipping pull request creation.");
75
+ process.exit(0);
76
+ }
77
+ const taskIdSignature = crypto.createHash("sha256").update(taskId).digest("hex").slice(0, 8);
78
+ const codemodBranchName = branchName ? branchName : `codemod-${taskIdSignature}`;
79
+ if (push) {
80
+ const addChanges = exec("git add .");
81
+ if (addChanges.stderr?.read().toString().trim()) {
82
+ console.error("Error: Failed to add changes");
83
+ console.error(addChanges.stderr?.read().toString().trim());
84
+ process.exit(1);
85
+ }
86
+ const commitChanges = exec(`git commit -m "${commitMessage}"`);
87
+ if (commitChanges.stderr?.read().toString().trim()) {
88
+ console.error("Error: Failed to commit changes");
89
+ console.error(commitChanges.stderr?.read().toString().trim());
90
+ process.exit(1);
91
+ }
92
+ const pushChanges = exec(`git push origin ${codemodBranchName} --force`);
93
+ if (pushChanges.stderr?.read().toString().trim()) {
94
+ console.error("Error: Failed to push changes");
95
+ console.error(pushChanges.stderr?.read().toString().trim());
96
+ process.exit(1);
97
+ }
98
+ }
99
+ const prData = { title };
100
+ if (body) prData.body = body;
101
+ if (head) prData.head = head;
102
+ if (base) prData.base = base;
103
+ try {
104
+ console.debug("Creating pull request...");
105
+ console.debug(`Title: ${title}`);
106
+ if (body) console.debug(`Body: ${body}`);
107
+ if (head) console.debug(`Head: ${head}`);
108
+ console.debug(`Base: ${base}`);
109
+ const response = await fetch(`${apiEndpoint}/api/butterflow/v1/tasks/${taskId}/pull-request`, {
110
+ method: "POST",
111
+ headers: {
112
+ Authorization: `Bearer ${authToken}`,
113
+ "Content-Type": "application/json"
114
+ },
115
+ body: JSON.stringify(prData)
116
+ });
117
+ if (!response.ok) {
118
+ const errorText = await response.text();
119
+ throw new Error(`HTTP ${response.status}: ${errorText}`);
120
+ }
121
+ await response.json();
122
+ console.log("✅ Pull request created successfully!");
123
+ } catch (error) {
124
+ console.error("❌ Failed to create pull request:");
125
+ console.error(error instanceof Error ? error.message : String(error));
126
+ process.exit(1);
127
+ }
128
+ }
129
+ });
130
+
131
+ //#endregion
132
+ //#region src/commands/pr/index.ts
133
+ const prCommand = defineCommand({
134
+ meta: {
135
+ name: "pr",
136
+ description: "Pull request operations"
137
+ },
138
+ subCommands: { create: createCommand }
139
+ });
140
+
141
+ //#endregion
142
+ //#region src/commands/shard/codeowner.ts
143
+ const codeownerCommand = defineCommand({
144
+ meta: {
145
+ name: "codeowner",
146
+ description: "Analyze GitHub CODEOWNERS file and create sharding output"
147
+ },
148
+ args: {
149
+ shardSize: {
150
+ type: "string",
151
+ alias: "s",
152
+ description: "Number of files per shard",
153
+ required: true
154
+ },
155
+ stateProp: {
156
+ type: "string",
157
+ alias: "p",
158
+ description: "Property name for state output",
159
+ required: true
160
+ },
161
+ codeowners: {
162
+ type: "string",
163
+ alias: "c",
164
+ description: "Path to CODEOWNERS file (optional)",
165
+ required: false
166
+ },
167
+ rule: {
168
+ type: "string",
169
+ alias: "r",
170
+ description: "Path to rule file",
171
+ required: true
172
+ }
173
+ },
174
+ async run({ args }) {
175
+ const { shardSize: shardSizeStr, stateProp, codeowners: codeownersPath, rule: rulePath } = args;
176
+ const shardSize = parseInt(shardSizeStr, 10);
177
+ if (isNaN(shardSize) || shardSize <= 0) {
178
+ console.error("Error: shard-size must be a positive number");
179
+ process.exit(1);
180
+ }
181
+ const stateOutputsPath = process.env.STATE_OUTPUTS;
182
+ if (!stateOutputsPath) {
183
+ console.error("Error: STATE_OUTPUTS environment variable is required");
184
+ process.exit(1);
185
+ }
186
+ try {
187
+ console.log(`State property: ${stateProp}`);
188
+ const analysisOptions = {
189
+ shardSize,
190
+ codeownersPath,
191
+ rulePath,
192
+ projectRoot: process.cwd()
193
+ };
194
+ const result = await analyzeCodeowners(analysisOptions);
195
+ const stateOutput = `${stateProp}=${JSON.stringify(result.shards)}\n`;
196
+ console.log(`Writing state output to: ${stateOutputsPath}`);
197
+ await writeFile(stateOutputsPath, stateOutput, { flag: "a" });
198
+ console.log("✅ Sharding completed successfully!");
199
+ console.log("Generated shards:", JSON.stringify(result.shards, null, 2));
200
+ } catch (error) {
201
+ console.error("❌ Failed to process codeowner file:");
202
+ console.error(error instanceof Error ? error.message : String(error));
203
+ process.exit(1);
204
+ }
205
+ }
206
+ });
207
+
208
+ //#endregion
209
+ //#region src/commands/shard/index.ts
210
+ const shardCommand = defineCommand({
211
+ meta: {
212
+ name: "shard",
213
+ description: "Sharding operations for distributing work"
214
+ },
215
+ subCommands: { codeowner: codeownerCommand }
216
+ });
217
+
218
+ //#endregion
5
219
  //#region src/cli.ts
6
220
  const main = defineCommand({
7
221
  meta: {
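In the `pr create` command above, `exec` from `node:child_process` is asynchronous and returns a `ChildProcess` immediately, so reading `stderr` right after the call is unlikely to reflect how the git command ended. One possible alternative, sketched here under stated assumptions, is to run each command to completion with `spawnSync` and inspect the exit status. `runCommand` and `hasUncommittedChanges` are hypothetical helpers, not part of codemodctl.

```typescript
import { spawnSync } from "node:child_process";

interface CommandResult {
  ok: boolean;
  status: number | null;
  stdout: string;
  stderr: string;
}

// Hypothetical helper: runs a command to completion and reports its exit status.
function runCommand(command: string, args: string[]): CommandResult {
  const result = spawnSync(command, args, { encoding: "utf8" });
  return {
    ok: result.error === undefined && result.status === 0,
    status: result.status,
    stdout: result.stdout ?? "",
    stderr: result.stderr ?? "",
  };
}

// `git diff --quiet` exits with 1 when there are differences and 0 when there are none.
function hasUncommittedChanges(): boolean {
  const unstaged = runCommand("git", ["diff", "--quiet"]);
  const staged = runCommand("git", ["diff", "--cached", "--quiet"]);
  return unstaged.status === 1 || staged.status === 1;
}

// Usage: run the current Node binary as a harmless stand-in command.
const versionCheck = runCommand(process.execPath, ["--version"]);
console.log(versionCheck.ok, versionCheck.stdout.trim());
```

Passing arguments as an array also avoids interpolating values such as a commit message into a shell string.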
package/dist/codeowner-analysis-CeRIGuLu.js ADDED
@@ -0,0 +1,212 @@
1
+ #!/usr/bin/env node
2
+ import crypto from "node:crypto";
3
+ import { readFile } from "node:fs/promises";
4
+ import { existsSync } from "node:fs";
5
+ import path, { resolve } from "node:path";
6
+ import { Lang, findInFiles } from "@ast-grep/napi";
7
+ import Codeowners from "codeowners";
8
+ import yaml from "yaml";
9
+
10
+ //#region src/utils/consistent-sharding.ts
11
+ const HASH_RING_SIZE = 1e6;
12
+ /**
13
+ * Generates a numeric hash from a filename using SHA1
14
+ */
15
+ function getNumericFileNameSha1(filename) {
16
+ return parseInt(crypto.createHash("sha1").update(filename).digest("hex"), 16);
17
+ }
18
+ /**
19
+ * Maps a filename to a consistent position on the hash ring (0 to HASH_RING_SIZE-1)
20
+ * This position remains constant regardless of shard count changes
21
+ */
22
+ function getFileHashPosition(filename) {
23
+ return getNumericFileNameSha1(filename) % HASH_RING_SIZE;
24
+ }
25
+ /**
26
+ * Gets the shard index for a filename using deterministic hashing
27
+ * Files get assigned to a consistent preferred shard regardless of total count
28
+ *
29
+ * @param filename - The file path to hash
30
+ * @param shardCount - Total number of shards
31
+ * @returns Shard index (0-based)
32
+ */
33
+ function getShardForFilename(filename, { shardCount }) {
34
+ if (shardCount <= 0) throw new Error("Shard count must be greater than 0");
35
+ return getNumericFileNameSha1(filename) % 10 % shardCount;
36
+ }
37
+ /**
38
+ * Checks if a file belongs to a specific shard
39
+ *
40
+ * @param filename - The file path to check
41
+ * @param shardCount - Total number of shards
42
+ * @param shardIndex - The shard index to check against (0-based)
43
+ * @returns True if file belongs to the specified shard
44
+ */
45
+ function fitsInShard(filename, { shardCount, shardIndex }) {
46
+ return getShardForFilename(filename, { shardCount }) === shardIndex;
47
+ }
48
+ /**
49
+ * Distributes files across shards using deterministic hashing
50
+ *
51
+ * @param filenames - Array of file paths
52
+ * @param shardCount - Total number of shards
53
+ * @returns Map of shard index to array of filenames
54
+ */
55
+ function distributeFilesAcrossShards(filenames, shardCount) {
56
+ if (shardCount <= 0) throw new Error("Shard count must be greater than 0");
57
+ const shardMap = /* @__PURE__ */ new Map();
58
+ for (let i = 0; i < shardCount; i++) shardMap.set(i, []);
59
+ for (const filename of filenames) {
60
+ const shardIndex = getShardForFilename(filename, { shardCount });
61
+ shardMap.get(shardIndex).push(filename);
62
+ }
63
+ return shardMap;
64
+ }
65
+ /**
66
+ * Calculate optimal number of shards based on target shard size
67
+ *
68
+ * @param totalFiles - Total number of files
69
+ * @param targetShardSize - Desired number of files per shard
70
+ * @returns Number of shards needed
71
+ */
72
+ function calculateOptimalShardCount(totalFiles, targetShardSize) {
73
+ return Math.ceil(totalFiles / targetShardSize);
74
+ }
75
+
76
+ //#endregion
77
+ //#region src/utils/counted-promise.ts
78
+ function countedPromise(func) {
79
+ return async (lang, t, cb) => {
80
+ let i = 0;
81
+ let fileCount = void 0;
82
+ let resolve$1 = () => {};
83
+ async function wrapped(...args) {
84
+ const ret = await cb(...args);
85
+ if (++i === fileCount) resolve$1();
86
+ return ret;
87
+ }
88
+ fileCount = await func(lang, t, wrapped);
89
+ if (fileCount > i) await new Promise((r) => resolve$1 = r);
90
+ return fileCount;
91
+ };
92
+ }
93
+ const countedFindInFiles = countedPromise(findInFiles);
94
+
95
+ //#endregion
96
+ //#region src/utils/codeowner-analysis.ts
97
+ /**
98
+ * Finds and resolves the CODEOWNERS file path
99
+ * Searches in common locations: root, .github/, docs/
100
+ */
101
+ async function findCodeownersFile(projectRoot = process.cwd(), explicitPath) {
102
+ if (explicitPath) {
103
+ const resolvedPath = resolve(explicitPath);
104
+ if (!existsSync(resolvedPath)) throw new Error(`CODEOWNERS file not found at: ${resolvedPath}`);
105
+ return resolvedPath;
106
+ }
107
+ const searchPaths = [
108
+ resolve(projectRoot, "CODEOWNERS"),
109
+ resolve(projectRoot, ".github", "CODEOWNERS"),
110
+ resolve(projectRoot, "docs", "CODEOWNERS")
111
+ ];
112
+ for (const searchPath of searchPaths) if (existsSync(searchPath)) return searchPath;
113
+ throw new Error("CODEOWNERS file not found. Please specify path explicitly or ensure CODEOWNERS file exists in project root, .github/, or docs/ folder.");
114
+ }
115
+ /**
116
+ * Loads and parses AST-grep rule from YAML file
117
+ */
118
+ async function loadAstGrepRule(rulePath) {
119
+ if (!existsSync(rulePath)) throw new Error(`Rule file not found at: ${rulePath}`);
120
+ const ruleContent = await readFile(rulePath, "utf8");
121
+ const parsedRules = yaml.parseAllDocuments(ruleContent);
122
+ if (!parsedRules[0]) throw new Error("Invalid rule file: no rules found");
123
+ return parsedRules[0].toJSON();
124
+ }
125
+ /**
126
+ * Normalizes owner name by removing @ prefix and converting to lowercase
127
+ */
128
+ function normalizeOwnerName(owner) {
129
+ return owner.replace("@", "").toLowerCase();
130
+ }
131
+ /**
132
+ * Analyzes files and groups them by codeowner team
133
+ */
134
+ async function analyzeFilesByOwner(codeownersPath, rule, projectRoot = process.cwd()) {
135
+ const codeowners = new Codeowners(codeownersPath);
136
+ const gitRootDir = codeowners.codeownersDirectory;
137
+ const filesByOwner = /* @__PURE__ */ new Map();
138
+ await countedFindInFiles(Lang.TypeScript, {
139
+ matcher: rule,
140
+ paths: [projectRoot]
141
+ }, (err, matches) => {
142
+ if (err) {
143
+ console.error("AST-grep error:", err);
144
+ return;
145
+ }
146
+ if (!matches || matches.length === 0) return;
147
+ const fileName = matches[0]?.getRoot().filename();
148
+ if (!fileName) return;
149
+ const relativePath = path.relative(gitRootDir, fileName);
150
+ const owners = codeowners.getOwner(relativePath);
151
+ let ownerKey;
152
+ if (owners && owners.length > 0) {
153
+ const owner = owners[0];
154
+ ownerKey = normalizeOwnerName(owner ?? "unknown");
155
+ } else ownerKey = "unassigned";
156
+ if (!filesByOwner.has(ownerKey)) filesByOwner.set(ownerKey, []);
157
+ filesByOwner.get(ownerKey).push(relativePath);
158
+ });
159
+ return filesByOwner;
160
+ }
161
+ /**
162
+ * Generates shard configuration from team file analysis
163
+ */
164
+ function generateShards(filesByOwner, shardSize) {
165
+ const allShards = [];
166
+ for (const [team, files] of filesByOwner.entries()) {
167
+ const fileCount = files.length;
168
+ const numShards = calculateOptimalShardCount(fileCount, shardSize);
169
+ console.log(`Team "${team}" owns ${fileCount} files, creating ${numShards} shards`);
170
+ for (let i = 1; i <= numShards; i++) allShards.push({
171
+ team,
172
+ shard: `${i}/${numShards}`
173
+ });
174
+ }
175
+ return allShards;
176
+ }
177
+ /**
178
+ * Converts file ownership map to team info array
179
+ */
180
+ function getTeamFileInfo(filesByOwner) {
181
+ return Array.from(filesByOwner.entries()).map(([team, files]) => ({
182
+ team,
183
+ fileCount: files.length,
184
+ files
185
+ }));
186
+ }
187
+ /**
188
+ * Main function to analyze codeowners and generate shard configuration
189
+ */
190
+ async function analyzeCodeowners(options) {
191
+ const { shardSize, codeownersPath, rulePath, projectRoot = process.cwd() } = options;
192
+ const resolvedCodeownersPath = await findCodeownersFile(projectRoot, codeownersPath);
193
+ const rule = await loadAstGrepRule(rulePath);
194
+ console.log(`Analyzing CODEOWNERS file: ${resolvedCodeownersPath}`);
195
+ console.log(`Using rule file: ${rulePath}`);
196
+ console.log(`Shard size: ${shardSize}`);
197
+ console.log("Analyzing files with AST-grep...");
198
+ const filesByOwner = await analyzeFilesByOwner(resolvedCodeownersPath, rule, projectRoot);
199
+ console.log("File analysis completed. Generating shards...");
200
+ const teams = getTeamFileInfo(filesByOwner);
201
+ const shards = generateShards(filesByOwner, shardSize);
202
+ const totalFiles = Array.from(filesByOwner.values()).reduce((sum, files) => sum + files.length, 0);
203
+ console.log(`Generated ${shards.length} total shards for ${totalFiles} files`);
204
+ return {
205
+ teams,
206
+ shards,
207
+ totalFiles
208
+ };
209
+ }
210
+
211
+ //#endregion
212
+ export { analyzeCodeowners, analyzeFilesByOwner, calculateOptimalShardCount, distributeFilesAcrossShards, findCodeownersFile, fitsInShard, generateShards, getFileHashPosition, getNumericFileNameSha1, getShardForFilename, getTeamFileInfo, loadAstGrepRule, normalizeOwnerName };
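`getShardForFilename` above computes `getNumericFileNameSha1(filename) % 10 % shardCount`. Because of the intermediate `% 10`, the returned index can never reach 10 or higher, whatever `shardCount` is. In addition, `parseInt` on a 40-digit hex digest returns a double, so the low-order bits of the hash are rounded away before the modulo is taken. The sketch below shows one way to take the modulo over the full digest using `BigInt`. It is an illustration under stated assumptions, not the package's algorithm: `shardForFilenameBigInt` is a hypothetical name, and its assignments would differ from those of the published function.

```typescript
import { createHash } from "node:crypto";

// Hypothetical alternative (not the package's implementation):
// maps a filename to a shard index using all 160 bits of the SHA-1 digest.
function shardForFilenameBigInt(filename: string, shardCount: number): number {
  if (!Number.isInteger(shardCount) || shardCount <= 0) {
    throw new Error("Shard count must be a positive integer");
  }
  const hex = createHash("sha1").update(filename).digest("hex");
  // BigInt arithmetic is exact, so no digest bits are lost before the modulo.
  return Number(BigInt("0x" + hex) % BigInt(shardCount));
}

// Usage: count how many distinct shards a batch of sample paths lands in.
const sampleFiles = Array.from({ length: 1000 }, (_, i) => `src/file${i}.ts`);
const usedShards = new Set(sampleFiles.map((f) => shardForFilenameBigInt(f, 16)));
console.log(`distinct shards used: ${usedShards.size}`);
```

The result still depends only on the filename and the shard count, so the determinism property described in the README is preserved.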
package/dist/index.d.ts CHANGED
@@ -1,9 +1,107 @@
1
- import * as citty1 from "citty";
2
-
3
- //#region src/commands/pr/index.d.ts
4
- declare const prCommand: citty1.CommandDef<citty1.ArgsDef>;
1
+ //#region src/utils/consistent-sharding.d.ts
2
+ /**
3
+ * Generates a numeric hash from a filename using SHA1
4
+ */
5
+ declare function getNumericFileNameSha1(filename: string): number;
6
+ /**
7
+ * Maps a filename to a consistent position on the hash ring (0 to HASH_RING_SIZE-1)
8
+ * This position remains constant regardless of shard count changes
9
+ */
10
+ declare function getFileHashPosition(filename: string): number;
11
+ /**
12
+ * Gets the shard index for a filename using deterministic hashing
13
+ * Files get assigned to a consistent preferred shard regardless of total count
14
+ *
15
+ * @param filename - The file path to hash
16
+ * @param shardCount - Total number of shards
17
+ * @returns Shard index (0-based)
18
+ */
19
+ declare function getShardForFilename(filename: string, {
20
+ shardCount
21
+ }: {
22
+ shardCount: number;
23
+ }): number;
24
+ /**
25
+ * Checks if a file belongs to a specific shard
26
+ *
27
+ * @param filename - The file path to check
28
+ * @param shardCount - Total number of shards
29
+ * @param shardIndex - The shard index to check against (0-based)
30
+ * @returns True if file belongs to the specified shard
31
+ */
32
+ declare function fitsInShard(filename: string, {
33
+ shardCount,
34
+ shardIndex
35
+ }: {
36
+ shardCount: number;
37
+ shardIndex: number;
38
+ }): boolean;
39
+ /**
40
+ * Distributes files across shards using deterministic hashing
41
+ *
42
+ * @param filenames - Array of file paths
43
+ * @param shardCount - Total number of shards
44
+ * @returns Map of shard index to array of filenames
45
+ */
46
+ declare function distributeFilesAcrossShards(filenames: string[], shardCount: number): Map<number, string[]>;
47
+ /**
48
+ * Calculate optimal number of shards based on target shard size
49
+ *
50
+ * @param totalFiles - Total number of files
51
+ * @param targetShardSize - Desired number of files per shard
52
+ * @returns Number of shards needed
53
+ */
54
+ declare function calculateOptimalShardCount(totalFiles: number, targetShardSize: number): number;
5
55
  //#endregion
6
- //#region src/commands/shard/index.d.ts
7
- declare const shardCommand: citty1.CommandDef<citty1.ArgsDef>;
56
+ //#region src/utils/codeowner-analysis.d.ts
57
+ interface ShardResult {
58
+ team: string;
59
+ shard: string;
60
+ }
61
+ interface TeamFileInfo {
62
+ team: string;
63
+ fileCount: number;
64
+ files: string[];
65
+ }
66
+ interface CodeownerAnalysisOptions {
67
+ shardSize: number;
68
+ codeownersPath?: string;
69
+ rulePath: string;
70
+ projectRoot?: string;
71
+ }
72
+ interface CodeownerAnalysisResult {
73
+ teams: TeamFileInfo[];
74
+ shards: ShardResult[];
75
+ totalFiles: number;
76
+ }
77
+ /**
78
+ * Finds and resolves the CODEOWNERS file path
79
+ * Searches in common locations: root, .github/, docs/
80
+ */
81
+ declare function findCodeownersFile(projectRoot?: string, explicitPath?: string): Promise<string>;
82
+ /**
83
+ * Loads and parses AST-grep rule from YAML file
84
+ */
85
+ declare function loadAstGrepRule(rulePath: string): Promise<any>;
86
+ /**
87
+ * Normalizes owner name by removing @ prefix and converting to lowercase
88
+ */
89
+ declare function normalizeOwnerName(owner: string): string;
90
+ /**
91
+ * Analyzes files and groups them by codeowner team
92
+ */
93
+ declare function analyzeFilesByOwner(codeownersPath: string, rule: any, projectRoot?: string): Promise<Map<string, string[]>>;
94
+ /**
95
+ * Generates shard configuration from team file analysis
96
+ */
97
+ declare function generateShards(filesByOwner: Map<string, string[]>, shardSize: number): ShardResult[];
98
+ /**
99
+ * Converts file ownership map to team info array
100
+ */
101
+ declare function getTeamFileInfo(filesByOwner: Map<string, string[]>): TeamFileInfo[];
102
+ /**
103
+ * Main function to analyze codeowners and generate shard configuration
104
+ */
105
+ declare function analyzeCodeowners(options: CodeownerAnalysisOptions): Promise<CodeownerAnalysisResult>;
8
106
  //#endregion
9
- export { prCommand, shardCommand };
107
+ export { CodeownerAnalysisOptions, CodeownerAnalysisResult, ShardResult, TeamFileInfo, analyzeCodeowners, analyzeFilesByOwner, calculateOptimalShardCount, distributeFilesAcrossShards, findCodeownersFile, fitsInShard, generateShards, getFileHashPosition, getNumericFileNameSha1, getShardForFilename, getTeamFileInfo, loadAstGrepRule, normalizeOwnerName };
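The declarations above give signatures only. As a rough sketch of what `calculateOptimalShardCount` and `generateShards` might compute, the block below mirrors the ceiling-division loop from the removed 0.1.1 `codeowner` command (`Math.ceil(fileCount / shardSize)` with `i/numShards` labels). The `Sketch`-suffixed function names are hypothetical, and the actual 0.1.2 implementation is not visible in these declarations and may differ.

```typescript
// Hypothetical re-implementation for illustration only; not the package's code.
interface ShardResult {
  team: string;
  shard: string;
}

// Assumption: plain ceiling division, as in the removed 0.1.1 loop.
function calculateOptimalShardCountSketch(totalFiles: number, targetShardSize: number): number {
  if (!Number.isInteger(targetShardSize) || targetShardSize <= 0) {
    throw new Error("targetShardSize must be a positive integer");
  }
  return Math.ceil(totalFiles / targetShardSize);
}

// Emits one { team, shard: "i/n" } entry per shard, per team.
function generateShardsSketch(filesByOwner: Map<string, string[]>, shardSize: number): ShardResult[] {
  const shards: ShardResult[] = [];
  for (const [team, files] of filesByOwner) {
    const numShards = calculateOptimalShardCountSketch(files.length, shardSize);
    for (let i = 1; i <= numShards; i++) {
      shards.push({ team, shard: `${i}/${numShards}` });
    }
  }
  return shards;
}

// Example input: "frontend" owns three files, "backend" owns one.
const exampleShards = generateShardsSketch(
  new Map<string, string[]>([
    ["frontend", ["a.ts", "b.ts", "c.ts"]],
    ["backend", ["d.ts"]],
  ]),
  2,
);
```

With a shard size of 2, the three `frontend` files need two shards and the single `backend` file needs one, so `exampleShards` holds three entries.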
package/dist/index.js CHANGED
@@ -1,4 +1,4 @@
1
1
  #!/usr/bin/env node
2
- import { prCommand, shardCommand } from "./shard-BOcsYHKh.js";
2
+ import { analyzeCodeowners, analyzeFilesByOwner, calculateOptimalShardCount, distributeFilesAcrossShards, findCodeownersFile, fitsInShard, generateShards, getFileHashPosition, getNumericFileNameSha1, getShardForFilename, getTeamFileInfo, loadAstGrepRule, normalizeOwnerName } from "./codeowner-analysis-CeRIGuLu.js";
3
3
 
4
- export { prCommand, shardCommand };
4
+ export { analyzeCodeowners, analyzeFilesByOwner, calculateOptimalShardCount, distributeFilesAcrossShards, findCodeownersFile, fitsInShard, generateShards, getFileHashPosition, getNumericFileNameSha1, getShardForFilename, getTeamFileInfo, loadAstGrepRule, normalizeOwnerName };
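Among the helpers re-exported here, `findCodeownersFile` is documented as searching the project root, `.github/`, and `docs/`. The sketch below shows that lookup order as it appeared in the removed 0.1.1 command. It is an assumption that 0.1.2 keeps the same order; the function name, the synchronous shape, and the injected `exists` predicate are hypothetical choices made so the example runs without touching the file system.

```typescript
// Hypothetical sketch; the real findCodeownersFile is async and not shown in this diff.
function findCodeownersFileSketch(
  projectRoot: string,
  exists: (path: string) => boolean,
  explicitPath?: string,
): string {
  // An explicit path wins, but it must still exist.
  if (explicitPath !== undefined) {
    if (!exists(explicitPath)) {
      throw new Error(`CODEOWNERS file not found at: ${explicitPath}`);
    }
    return explicitPath;
  }
  // Otherwise try the common locations in order: root, .github/, docs/.
  const candidates = [
    `${projectRoot}/CODEOWNERS`,
    `${projectRoot}/.github/CODEOWNERS`,
    `${projectRoot}/docs/CODEOWNERS`,
  ];
  const found = candidates.find((candidate) => exists(candidate));
  if (found === undefined) {
    throw new Error("CODEOWNERS file not found in project root, .github/, or docs/");
  }
  return found;
}

// Example: only the .github/ copy exists in this fake file system.
const fakeFiles = new Set<string>(["/repo/.github/CODEOWNERS"]);
const resolvedPath = findCodeownersFileSketch("/repo", (p) => fakeFiles.has(p));
```

Passing the existence check as a parameter keeps the lookup order testable in isolation; a real implementation would more likely call `existsSync` or `fs.access`.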
package/package.json CHANGED
@@ -1,12 +1,24 @@
1
1
  {
2
2
  "name": "codemodctl",
3
- "version": "0.1.1",
4
- "description": "CLI tool for workflow engine operations",
3
+ "version": "0.1.2",
4
+ "description": "CLI tool and utilities for workflow engine operations, file sharding, and codeowner analysis",
5
5
  "type": "module",
6
6
  "exports": {
7
7
  ".": {
8
8
  "types": "./dist/index.d.ts",
9
9
  "default": "./dist/index.js"
10
+ },
11
+ "./sharding": {
12
+ "types": "./dist/utils/consistent-sharding.d.ts",
13
+ "default": "./dist/utils/consistent-sharding.js"
14
+ },
15
+ "./codeowners": {
16
+ "types": "./dist/utils/codeowner-analysis.d.ts",
17
+ "default": "./dist/utils/codeowner-analysis.js"
18
+ },
19
+ "./utils": {
20
+ "types": "./dist/utils/index.d.ts",
21
+ "default": "./dist/utils/index.js"
10
22
  }
11
23
  },
12
24
  "bin": {
@@ -27,7 +39,8 @@
27
39
  "citty": "^0.1.6",
28
40
  "codeowners": "^5.1.1",
29
41
  "glob": "^11.0.0",
30
- "node-fetch": "^3.3.2"
42
+ "node-fetch": "^3.3.2",
43
+ "yaml": "^2.7.1"
31
44
  },
32
45
  "devDependencies": {
33
46
  "@acme/tsconfig": "workspace:*",
@@ -38,9 +51,13 @@
38
51
  },
39
52
  "keywords": [
40
53
  "cli",
54
+ "library",
41
55
  "codemod",
42
56
  "workflow",
43
- "automation"
57
+ "automation",
58
+ "sharding",
59
+ "codeowners",
60
+ "consistent-hashing"
44
61
  ],
45
62
  "author": "Codemod",
46
63
  "license": "MIT"
@@ -1,249 +0,0 @@
1
- #!/usr/bin/env node
2
- import { defineCommand } from "citty";
3
- import fetch from "node-fetch";
4
- import { execSync } from "node:child_process";
5
- import { existsSync } from "node:fs";
6
- import { writeFile } from "node:fs/promises";
7
- import { resolve } from "node:path";
8
- import Codeowners from "codeowners";
9
- import { glob } from "glob";
10
-
11
- //#region src/commands/pr/create.ts
12
- const createCommand = defineCommand({
13
- meta: {
14
- name: "create",
15
- description: "Create a pull request"
16
- },
17
- args: {
18
- title: {
19
- type: "string",
20
- description: "Title of the pull request",
21
- required: true
22
- },
23
- body: {
24
- type: "string",
25
- description: "Body/description of the pull request",
26
- required: false
27
- },
28
- head: {
29
- type: "string",
30
- description: "Head branch for the pull request",
31
- required: false
32
- },
33
- base: {
34
- type: "string",
35
- description: "Base branch to merge into",
36
- required: false,
37
- default: "main"
38
- }
39
- },
40
- async run({ args }) {
41
- const { title, body, head, base } = args;
42
- const apiEndpoint = process.env.BUTTERFLOW_API_ENDPOINT;
43
- const authToken = process.env.BUTTERFLOW_API_AUTH_TOKEN;
44
- const taskId = process.env.CODEMOD_TASK_ID;
45
- if (!apiEndpoint) {
46
- console.error("Error: BUTTERFLOW_API_ENDPOINT environment variable is required");
47
- process.exit(1);
48
- }
49
- if (!authToken) {
50
- console.error("Error: BUTTERFLOW_API_AUTH_TOKEN environment variable is required");
51
- process.exit(1);
52
- }
53
- if (!taskId) {
54
- console.error("Error: CODEMOD_TASK_ID environment variable is required");
55
- process.exit(1);
56
- }
57
- const prData = { title };
58
- if (body) prData.body = body;
59
- if (head) prData.head = head;
60
- if (base) prData.base = base;
61
- try {
62
- console.log("Creating pull request...");
63
- console.log(`Title: ${title}`);
64
- if (body) console.log(`Body: ${body}`);
65
- if (head) console.log(`Head: ${head}`);
66
- console.log(`Base: ${base}`);
67
- const response = await fetch(`${apiEndpoint}/api/butterflow/v1/tasks/${taskId}/pull-request`, {
68
- method: "POST",
69
- headers: {
70
- Authorization: `Bearer ${authToken}`,
71
- "Content-Type": "application/json"
72
- },
73
- body: JSON.stringify(prData)
74
- });
75
- if (!response.ok) {
76
- const errorText = await response.text();
77
- throw new Error(`HTTP ${response.status}: ${errorText}`);
78
- }
79
- const result = await response.json();
80
- console.log("✅ Pull request created successfully!");
81
- console.log("Response:", JSON.stringify(result, null, 2));
82
- } catch (error) {
83
- console.error("❌ Failed to create pull request:");
84
- console.error(error instanceof Error ? error.message : String(error));
85
- process.exit(1);
86
- }
87
- }
88
- });
89
-
90
- //#endregion
91
- //#region src/commands/pr/index.ts
92
- const prCommand = defineCommand({
93
- meta: {
94
- name: "pr",
95
- description: "Pull request operations"
96
- },
97
- subCommands: { create: createCommand }
98
- });
99
-
100
- //#endregion
101
- //#region src/commands/shard/codeowner.ts
102
- const codeownerCommand = defineCommand({
103
- meta: {
104
- name: "codeowner",
105
- description: "Analyze GitHub CODEOWNERS file and create sharding output"
106
- },
107
- args: {
108
- shardSize: {
109
- type: "string",
110
- alias: "s",
111
- description: "Number of files per shard",
112
- required: true
113
- },
114
- stateProp: {
115
- type: "string",
116
- alias: "p",
117
- description: "Property name for state output",
118
- required: true
119
- },
120
- codeowners: {
121
- type: "string",
122
- alias: "c",
123
- description: "Path to CODEOWNERS file (optional)",
124
- required: false
125
- }
126
- },
127
- async run({ args }) {
128
- const { shardSize: shardSizeStr, stateProp, codeowners: codeownersPath } = args;
129
- const shardSize = parseInt(shardSizeStr, 10);
130
- if (isNaN(shardSize) || shardSize <= 0) {
131
- console.error("Error: shard-size must be a positive number");
132
- process.exit(1);
133
- }
134
- const stateOutputsPath = process.env.STATE_OUTPUTS;
135
- if (!stateOutputsPath) {
136
- console.error("Error: STATE_OUTPUTS environment variable is required");
137
- process.exit(1);
138
- }
139
- try {
140
- let codeownersFilePath;
141
- if (codeownersPath) codeownersFilePath = resolve(codeownersPath);
142
- else {
143
- const defaultPath = resolve(process.cwd(), "CODEOWNERS");
144
- const githubPath = resolve(process.cwd(), ".github", "CODEOWNERS");
145
- const docsPath = resolve(process.cwd(), "docs", "CODEOWNERS");
146
- if (existsSync(defaultPath)) codeownersFilePath = defaultPath;
147
- else if (existsSync(githubPath)) codeownersFilePath = githubPath;
148
- else if (existsSync(docsPath)) codeownersFilePath = docsPath;
149
- else throw new Error("CODEOWNERS file not found. Please specify path with --codeowners flag or ensure CODEOWNERS file exists in current directory, .github/, or docs/ folder.");
150
- }
151
- if (!existsSync(codeownersFilePath)) throw new Error(`CODEOWNERS file not found at: ${codeownersFilePath}`);
152
- console.log(`Analyzing CODEOWNERS file: ${codeownersFilePath}`);
153
- console.log(`Shard size: ${shardSize}`);
154
- console.log(`State property: ${stateProp}`);
155
- const codeowners = new Codeowners(codeownersFilePath);
156
- const teamFileCounts = await countFilesPerTeam(codeowners);
157
- const allShards = [];
158
- for (const [team, fileCount] of Object.entries(teamFileCounts)) {
159
- const numShards = Math.ceil(fileCount / shardSize);
160
- console.log(`Team "${team}" owns ${fileCount} files, creating ${numShards} shards`);
161
- for (let i = 1; i <= numShards; i++) allShards.push({
162
- team,
163
- shard: `${i}/${numShards}`
164
- });
165
- }
166
- console.log(`Generated ${allShards.length} total shards`);
167
- const stateOutput = `${stateProp}=${JSON.stringify(allShards)}\n`;
168
- console.log(`Writing state output to: ${stateOutputsPath}`);
169
- await writeFile(stateOutputsPath, stateOutput, { flag: "a" });
170
- console.log("✅ Sharding completed successfully!");
171
- console.log("Generated shards:", JSON.stringify(allShards, null, 2));
172
- } catch (error) {
173
- console.error("❌ Failed to process codeowner file:");
174
- console.error(error instanceof Error ? error.message : String(error));
175
- process.exit(1);
176
- }
177
- }
178
- });
179
- /**
180
- * Count files for each team/owner based on CODEOWNERS file
181
- */
182
- async function countFilesPerTeam(codeowners) {
183
- const teamFileCounts = {};
184
- try {
185
- const files = await glob("**/*", {
186
- cwd: process.cwd(),
187
- nodir: true,
188
- ignore: [
189
- "node_modules/**",
190
- ".git/**",
191
- "dist/**",
192
- "build/**",
193
- "*.log",
194
- ".DS_Store"
195
- ]
196
- });
197
- console.log(`Found ${files.length} files to analyze`);
198
- for (const file of files) try {
199
- const owners = codeowners.getOwner(file);
200
- if (owners && owners.length > 0) for (const owner of owners) {
201
- const cleanOwner = owner.replace("@", "").trim();
202
- teamFileCounts[cleanOwner] = (teamFileCounts[cleanOwner] || 0) + 1 / owners.length;
203
- }
204
- else teamFileCounts["unassigned"] = (teamFileCounts["unassigned"] || 0) + 1;
205
- } catch (error) {
206
- teamFileCounts["unassigned"] = (teamFileCounts["unassigned"] || 0) + 1;
207
- }
208
- for (const team in teamFileCounts) teamFileCounts[team] = Math.round(teamFileCounts[team] ?? 0);
209
- } catch (error) {
210
- console.warn("Warning: Could not analyze files with codeowners, using fallback counting");
211
- console.warn(error);
212
- try {
213
- console.log("Trying fallback method with codeowners CLI...");
214
- const auditOutput = execSync("codeowners audit", {
215
- cwd: process.cwd(),
216
- encoding: "utf8",
217
- timeout: 3e4
218
- });
219
- const lines = auditOutput.split("\n").filter((line) => line.trim());
220
- for (const line of lines) {
221
- const parts = line.trim().split(/\s+/);
222
- if (parts.length >= 2) {
223
- const owners = parts.slice(1);
224
- for (const owner of owners) if (owner.startsWith("@")) {
225
- const cleanOwner = owner.replace("@", "").trim();
226
- teamFileCounts[cleanOwner] = (teamFileCounts[cleanOwner] || 0) + 1;
227
- }
228
- } else teamFileCounts["unassigned"] = (teamFileCounts["unassigned"] || 0) + 1;
229
- }
230
- } catch (cliError) {
231
- console.warn("Fallback CLI method also failed, using mock data for demonstration");
232
- teamFileCounts["DefaultTeam"] = 100;
233
- }
234
- }
235
- return teamFileCounts;
236
- }
237
-
238
- //#endregion
239
- //#region src/commands/shard/index.ts
240
- const shardCommand = defineCommand({
241
- meta: {
242
- name: "shard",
243
- description: "Sharding operations for distributing work"
244
- },
245
- subCommands: { codeowner: codeownerCommand }
246
- });
247
-
248
- //#endregion
249
- export { prCommand, shardCommand };
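The removed `countFilesPerTeam` splits each file evenly among its owners (adding `1 / owners.length` per owner) and rounds the totals at the end. A minimal sketch of that counting step follows, with the `codeowners audit` fallback and the mock `DefaultTeam` data left out. The function names and the `OwnerLookup` type are hypothetical, with the lookup standing in for `codeowners.getOwner`. Note one deliberate difference: the removed code cleaned owner names with `owner.replace("@", "")`, whereas the sketch follows the 0.1.2 `normalizeOwnerName` doc comment (strip the `@` prefix, lowercase).

```typescript
// Hypothetical names for illustration; not the package's code.
type OwnerLookup = (file: string) => string[];

// Per the 0.1.2 doc comment: remove the leading "@" and lowercase.
function normalizeOwnerNameSketch(owner: string): string {
  return owner.trim().replace(/^@/, "").toLowerCase();
}

function countFilesPerTeamSketch(files: string[], getOwner: OwnerLookup): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const file of files) {
    const owners = getOwner(file);
    if (owners.length === 0) {
      // Files without an owner are grouped under "unassigned".
      counts["unassigned"] = (counts["unassigned"] ?? 0) + 1;
      continue;
    }
    for (const owner of owners) {
      const team = normalizeOwnerNameSketch(owner);
      // Shared files contribute a fraction to each owner.
      counts[team] = (counts[team] ?? 0) + 1 / owners.length;
    }
  }
  // Round the fractional totals to whole file counts.
  for (const team of Object.keys(counts)) {
    counts[team] = Math.round(counts[team] ?? 0);
  }
  return counts;
}

// Example ownership table: two files are shared between @Web and @API.
const ownership: Record<string, string[]> = {
  "a.ts": ["@Web"],
  "b.ts": ["@Web", "@API"],
  "c.ts": ["@Web", "@API"],
  "d.ts": [],
};
const teamCounts = countFilesPerTeamSketch(Object.keys(ownership), (f) => ownership[f] ?? []);
```

Here `web` ends up with 1 + 0.5 + 0.5 = 2 files, `api` with 0.5 + 0.5 = 1, and `unassigned` with 1.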