arol-ai 0.1.1 → 0.1.3

package/README.md CHANGED
@@ -49,12 +49,14 @@ Scanned 128 files · 1 API detected
49
49
  src/agents/run.ts:88 → beta.threads
50
50
  → migrate: https://platform.openai.com/docs/assistants/migration
51
51
 
52
- These are today's deprecations. New ones land constantly — get
53
- alerted before the next one breaks you → arol.ai
52
+ ────────────────────────────────────────────────────────────
53
+ ⚠ These break on fixed dates. Get alerted before the next one hits you → arol.ai
54
54
  ```
55
55
 
56
56
  Note the citations point at the **exact source lines that use the deprecated API**, not at the manifest. Having the `openai` package installed is not enough on its own — your code has to actually call the removed surface.
57
57
 
58
+ The closing line is **severity-aware**: a high-severity finding gets the prominent warning above; findings with no high-severity items get `Get continuous deprecation alerts for your stack → arol.ai`; and a clean scan gets `✓ Clean today — but new deprecations land constantly. Stay covered → arol.ai`.
59
+
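The severity-aware closing line described above can be sketched as a small pure function. This is only an illustration: `closingTone` and its return labels are hypothetical names, and the finding shape (`{ deprecation: { severity } }`) is assumed from the `footer` function in `dist/report.js` further down in this diff.

```javascript
// Hypothetical helper: decides which of the three closing lines applies.
// It returns a label instead of the styled message to keep the sketch short.
function closingTone(findings) {
  if (findings.some((f) => f.deprecation.severity === "high")) {
    return "prominent-warning"; // the "These break on fixed dates" line
  }
  if (findings.length > 0) {
    return "continuous-alerts"; // findings, but none of them high severity
  }
  return "clean-today"; // nothing found
}

console.log(closingTone([{ deprecation: { severity: "medium" } }]));
```

The order of the checks matters: a single high-severity finding selects the prominent warning even when lower-severity findings are present.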
58
60
  When nothing is found:
59
61
 
60
62
  ```
@@ -67,13 +69,28 @@ Detection keys on **actual usage, not mere SDK presence.** Each dataset entry de
67
69
 
68
70
  ### `match: "pattern"` — the default
69
71
 
70
- Flags **only** when one of the entry's `detect.patterns` regexes matches inside a scanned **source file** — i.e. your code actually references the deprecated endpoint, method, or model string. `detect.sdk` is just a scope hint here and is **never** a trigger on its own. This is the default and covers almost everything.
72
+ Flags **only** when your code actually references the deprecated API in a scanned **source file**. `detect.sdk` is just a scope hint here and is **never** a trigger on its own. A `pattern` entry carries two kinds of usage signal:
73
+
74
+ - **`detect.patterns`** — raw regexes for code identifiers, endpoints, and params (e.g. `beta\.assistants`, `/v1/threads`, `charges\.create`, `hapikey\s*=`). Matched anywhere in the file.
75
+ - **`detect.models`** — model family names matched **only inside a string literal**. Each becomes: an opening quote (`'` `"` or `` ` ``), the family name, an optional `[A-Za-z0-9._-]*` version/suffix, then the matching closing quote. So `"gpt-4o"`, `'gpt-4o'`, `` `gpt-4o` ``, and `"gpt-4o-2024-05-13"` match — but the same name sitting in prose, JSX, or a comment does **not**.
76
+
77
+ This split is what keeps a marketing page that mentions *"GPT-4o, GPT-4.1, and o4-mini"* from being reported as deprecated usage: those names aren't quoted string literals, so `detect.models` ignores them. Only something like `model: "o4-mini"` counts.
78
+
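The quote-anchored matching described for `detect.models` can be tried in isolation. The sketch below follows the `modelRegexSource` helper shown in `dist/scanner.js` later in this diff; `modelRegex` and the sample `source` text are illustrative names and data, not part of the package.

```javascript
// Any of the three string-literal quote characters.
const QUOTE_CLASS = "['\"`]";

// Escape regex metacharacters so a name like "gpt-4.5-preview" is literal.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Opening quote, family name, optional suffix, then the SAME quote (\1).
function modelRegex(family) {
  return new RegExp(`(${QUOTE_CLASS})${escapeRegex(family)}[A-Za-z0-9._-]*\\1`, "g");
}

const source = [
  'const a = { model: "gpt-4o-2024-05-13" };',
  "const b = { model: 'gpt-4o' };",
  "<p>We support GPT-4o and gpt-4o today</p>",
].join("\n");

// Only the two quoted literals are reported; the prose mention is ignored.
const hits = source.match(modelRegex("gpt-4o")) ?? [];
console.log(hits);
```

Because of the optional suffix, the family `gpt-4o` would also match a quoted `"gpt-4o-mini"`, which is why the family name should be specific enough.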
79
+ Two more layers keep matches context-aware:
80
+
81
+ - **Language scoping (`applies_to`).** Each entry lists the file extensions its signals are valid in (e.g. `["py"]`, `["js","ts","jsx","tsx","mjs"]`, or `["*"]` for model strings). An entry is only tested against files with a matching extension, so a Python-only pattern like `openai.ChatCompletion` never fires in a `.tsx` file. Defaults to `["*"]` when omitted.
82
+ - **Comment stripping.** Before matching, comments are blanked out per language — `//`, `/* */`, JSX `{/* */}`, and `#` (Python). Stripping is string-aware: a marker inside a string literal (e.g. the `//` in `"https://…"`) is **not** treated as a comment, and offsets are preserved so reported line numbers stay exact. A commented-out `model: "gpt-4o"` is ignored; the real call on the next line is not.
83
+
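The comment-stripping layer can be illustrated with a reduced version that handles only JS-like `//` and `/* */` comments. This is a simplified sketch under that assumption, not the package's `stripComments` (which is table-driven and also covers Python); `stripJsComments` is a hypothetical name.

```javascript
// Blank out comments with spaces, keep newlines, and leave strings intact,
// so every character keeps its original offset and line number.
function stripJsComments(src) {
  const out = src.split("");
  const blank = (from, to) => {
    for (let k = from; k < to; k++) if (out[k] !== "\n") out[k] = " ";
  };
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    if (ch === "'" || ch === '"' || ch === "`") {
      // Skip the whole string literal, honoring backslash escapes.
      i++;
      while (i < src.length && src[i] !== ch) {
        if (src[i] === "\\") i++;
        i++;
      }
      i++;
    } else if (src.startsWith("//", i)) {
      let end = src.indexOf("\n", i);
      if (end === -1) end = src.length;
      blank(i, end);
      i = end;
    } else if (src.startsWith("/*", i)) {
      const close = src.indexOf("*/", i + 2);
      const end = close === -1 ? src.length : close + 2;
      blank(i, end);
      i = end;
    } else {
      i++;
    }
  }
  return out.join("");
}

const input = [
  'const url = "https://example.com"; // model: "gpt-4o"',
  '/* model: "gpt-4o" */',
  'const real = { model: "gpt-4o" };',
].join("\n");

const stripped = stripJsComments(input);
console.log(stripped.split("\n")[2]);
```

Blanking rather than deleting is the design choice that keeps reported line numbers exact: the stripped text has the same length and the same newlines as the input.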
84
+ Each hit records the **file path, line number, and matched text**, and one deprecation aggregates **all** of its matched locations into a single finding.
85
+
86
+ > Having the `openai` package in `requirements.txt` does **not** flag the Assistants API deprecation. Your code has to actually use the deprecated surface (for example `beta.assistants` or a deprecated model id) in a file of the right language, outside comments.
87
+
88
+ **Files scanned / skipped**
71
89
 
72
90
  - Extensions scanned: `.js .mjs .cjs .jsx .ts .mts .cts .tsx .py .go`
73
91
  - Skipped directories: `node_modules`, `.git`, `dist`, `build`, `.next`, `out`, `coverage`, `.venv`, `venv`, `vendor`
74
- - Each hit records the **file path, line number, and matched text**, and one deprecation aggregates **all** of its matched locations into a single finding.
75
-
76
- > Having the `openai` package in `requirements.txt` does **not** flag the Assistants API deprecation. Your code has to actually use `beta.assistants` / `beta.threads` (etc.).
92
+ - Skipped by default: `.md`, `.mdx`, `.txt` (docs/prose, where model names appear as text), plus the tool's own `deprecations.json` and `arol.config.*` / `.arolignore` files.
93
+ - Add a **`.arolignore`** file (gitignore-style globs) at the repo root, and/or pass **`--ignore <glob>`** (repeatable) to skip more paths.
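How a gitignore-style `.arolignore` line could become ignore globs is sketched below. It follows the `arolignoreLineToGlobs` function shown in `dist/scanner.js` later in this diff; `ignoreLineToGlobs` and the sample file contents are illustrative.

```javascript
// Convert one ignore line into glob patterns relative to the repo root.
function ignoreLineToGlobs(rawLine) {
  let line = rawLine.trim();
  // Blank lines, "#" comments, and unsupported "!" negations yield nothing.
  if (!line || line.startsWith("#") || line.startsWith("!")) return [];
  const anchored = line.startsWith("/");
  if (anchored) line = line.slice(1);
  const isDir = line.endsWith("/");
  if (isDir) line = line.replace(/\/+$/, "");
  if (!line) return [];
  // Unanchored entries may match at any depth.
  const base = anchored ? line : `**/${line}`;
  // A directory covers its contents; other entries cover themselves and,
  // if they turn out to be directories, everything beneath them.
  return isDir ? [`${base}/**`] : [base, `${base}/**`];
}

const arolignore = ["# generated code", "docs/", "/scripts/legacy.js", "!keep.ts", ""].join("\n");
const globs = arolignore.split(/\r?\n/).flatMap(ignoreLineToGlobs);
console.log(globs);
```

Negated lines such as `!keep.ts` are dropped silently in this scheme, so a previously ignored path cannot be re-included.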
77
94
 
78
95
  ### `match: "sdk"`
79
96
 
@@ -99,6 +116,7 @@ arol-ai scan [path] [options]
99
116
  | `--json` | Output machine-readable JSON instead of the report |
100
117
  | `--no-color` | Disable colored output (also respects `NO_COLOR`) |
101
118
  | `--data <file>` | Use a custom `deprecations.json` instead of the bundled one |
119
+ | `--ignore <glob>` | Skip files matching this glob; repeatable. Combined with `.arolignore`. e.g. `--ignore 'docs/**' --ignore '**/*.gen.ts'` |
102
120
  | `--fail-on <severity>` | Exit non-zero if findings meet a level: `high` \| `medium` \| `low` \| `any` \| `none` (default `none`) |
103
121
  | `-v, --version` | Print the version |
104
122
  | `-h, --help` | Show help |
@@ -129,20 +147,60 @@ The dataset is either a bare array of entries, or a `{ "deprecations": [ ... ] }
129
147
  "title": "Assistants API (beta)", // short headline (required)
130
148
  "severity": "high", // "high" | "medium" | "low" (required)
131
149
  "match": "pattern", // "pattern" (default) | "sdk" | "version"
150
+ "applies_to": ["py","js","ts","jsx","tsx","mjs"], // extensions to test; ["*"] = any
132
151
  "sunset_date": "2026-08-26", // ISO YYYY-MM-DD, or "" if no fixed date
133
152
  "detect": {
134
153
  "sdk": ["openai"], // scope hint for "pattern"; the trigger for "sdk"/"version"
135
- "patterns": [ // regex strings matched against source files
154
+ "patterns": [ // raw regexes: identifiers, endpoints, params
136
155
  "beta\\.assistants",
137
156
  "beta\\.threads",
138
157
  "/v1/assistants"
139
- ]
158
+ ],
159
+ "models": [] // model ids matched only inside string literals
140
160
  },
141
161
  "migration_url": "https://platform.openai.com/docs/assistants/migration",
142
162
  "summary": "One or two sentences explaining the change and what to do."
143
163
  }
144
164
  ```
145
165
 
166
+ A model-retirement entry uses `detect.models` (and `applies_to: ["*"]`) so it only fires on a quoted model id, in any language, never on prose:
167
+
168
+ ```jsonc
169
+ {
170
+ "id": "openai-gpt4-family-shutdown",
171
+ "vendor": "OpenAI",
172
+ "title": "GPT-4 family models (API shutdown)",
173
+ "severity": "high",
174
+ "match": "pattern",
175
+ "applies_to": ["*"],
176
+ "sunset_date": "2026-10-23",
177
+ "detect": {
178
+ "sdk": ["openai"],
179
+ "patterns": [],
180
+ "models": ["gpt-4o", "gpt-4-turbo", "o4-mini", "gpt-4.5-preview"]
181
+ },
182
+ "migration_url": "https://platform.openai.com/docs/deprecations",
183
+ "summary": "Migrate to the GPT-5 family."
184
+ }
185
+ ```
186
+
187
+ A Python-only entry scopes itself with `applies_to: ["py"]`, so its patterns never fire in JS/TSX files that merely mention the API in prose:
188
+
189
+ ```jsonc
190
+ {
191
+ "id": "openai-python-v0-syntax",
192
+ "vendor": "OpenAI",
193
+ "title": "Legacy openai-python v0 call syntax",
194
+ "severity": "high",
195
+ "match": "pattern",
196
+ "applies_to": ["py"],
197
+ "sunset_date": "2023-11-06",
198
+ "detect": { "sdk": ["openai"], "patterns": ["openai\\.ChatCompletion"] },
199
+ "migration_url": "https://github.com/openai/openai-python/discussions/742",
200
+ "summary": "Instantiate a client: client.chat.completions.create(...)."
201
+ }
202
+ ```
203
+
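The effect of `applies_to` on which files an entry is tested against can be sketched as follows. The normalization mirrors what `coerceDeprecation` does in `dist/data.js` in this diff; the helper names and the string-array check are assumptions made for the example.

```javascript
// Lowercase, strip a leading dot, and default to ["*"] when missing or empty.
function normalizeAppliesTo(value) {
  const valid =
    Array.isArray(value) && value.length > 0 && value.every((e) => typeof e === "string");
  return valid ? value.map((e) => e.toLowerCase().replace(/^\./, "")) : ["*"];
}

// An entry is tested against a file only if its extension is listed (or "*").
function appliesToFile(appliesTo, filePath) {
  const base = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  const ext = dot <= 0 ? "" : base.slice(dot + 1).toLowerCase();
  const allowed = new Set(normalizeAppliesTo(appliesTo));
  return allowed.has("*") || allowed.has(ext);
}

console.log(appliesToFile(["py"], "app/page.tsx")); // false
console.log(appliesToFile(["py"], "svc/main.py")); // true
```

An entry without `applies_to` falls back to `["*"]`, so it is still tested against every scanned file.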
146
204
  A `version` entry instead flags on the installed SDK version (no patterns needed):
147
205
 
148
206
  ```jsonc
@@ -169,20 +227,24 @@ A `version` entry instead flags on the installed SDK version (no patterns needed
169
227
  | `title` | string | ✓ | Short headline for the finding. |
170
228
  | `severity` | `"high"` \| `"medium"` \| `"low"` | ✓ | Drives color, sort order, and `--fail-on`. |
171
229
  | `match` | `"pattern"` \| `"sdk"` \| `"version"` | – | How the entry is triggered. **Defaults to `"pattern"`** when omitted. See [How detection works](#how-detection-works). |
230
+ | `applies_to` | string[] | – | File extensions (no dot) the entry's patterns/models are tested against, e.g. `["py"]` or `["js","ts","jsx","tsx","mjs"]`. Use `["*"]` for any file (model strings). **Defaults to `["*"]`** when omitted. |
172
231
  | `version_range` | string | – | For `match: "version"` only — e.g. `"<3.0.0"`, `">=1.2.0"`, `"=2.1.0"`. If omitted, a `version` entry behaves like `"sdk"`. |
173
232
  | `sunset_date` | string | – | ISO `YYYY-MM-DD`. Use `""` for unmaintained/no-fixed-date items; the report shows a relative hint (e.g. *"in 42 days"* / *"passed 12 days ago"*). |
174
233
  | `detect.sdk` | string[] | – | Manifest dependency/module names. For `match: "pattern"` this is only a **scope hint and never triggers** a finding; for `sdk`/`version` it is the trigger. |
175
- | `detect.patterns` | string[] | – | **JSON-escaped** regular-expression strings (so `\d` becomes `\\d`). Matched against source-file contents; invalid regexes are skipped safely. Required (non-empty) for `match: "pattern"`. |
234
+ | `detect.patterns` | string[] | – | **JSON-escaped** regex strings (so `\d` becomes `\\d`). For code identifiers, endpoints, and params. Matched anywhere in a source file; invalid regexes are skipped safely. |
235
+ | `detect.models` | string[] | – | Model family names matched **only inside string literals** (quote-anchored, with an optional version suffix). Use this for model ids so prose/JSX mentions don't false-positive. Write the raw name (e.g. `gpt-4.5-preview`) — escaping is automatic. |
176
236
  | `migration_url` | string | – | Link shown in the report. |
177
237
  | `summary` | string | – | One or two sentences of guidance. |
178
238
 
179
- > A `pattern` entry with no `patterns`, or an `sdk`/`version` entry with no `sdk`, can never fire and is dropped at load time.
239
+ > A `pattern` entry needs at least one `detect.patterns` **or** `detect.models` entry; an `sdk`/`version` entry needs at least one `detect.sdk`. Entries that can never fire are dropped at load time.
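The load-time rule quoted above can be restated as a predicate. This is a hypothetical helper (`canFire` is not a name from the package) that assumes the entry shape documented in the schema table.

```javascript
// Returns true if an entry has at least one signal its match mode can use.
function canFire(entry) {
  const detect = entry.detect ?? {};
  const count = (v) => (Array.isArray(v) ? v.length : 0);
  // Unknown or missing match modes fall back to the documented default.
  const match = ["pattern", "sdk", "version"].includes(entry.match) ? entry.match : "pattern";
  if (match === "pattern") {
    // Either a raw pattern or a model name is enough.
    return count(detect.patterns) + count(detect.models) > 0;
  }
  // "sdk" and "version" entries key on the manifest.
  return count(detect.sdk) > 0;
}

console.log(canFire({ detect: { models: ["gpt-4o"] } })); // true
console.log(canFire({ match: "pattern", detect: { sdk: ["openai"] } })); // false
```

An entry for which this returns `false` corresponds to one that would be dropped at load time.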
180
240
 
181
- ### Writing good patterns
241
+ ### Writing good patterns & models
182
242
 
243
+ - **Put model ids in `detect.models`, not `detect.patterns`.** A bare model id as a raw pattern matches prose, JSX, comments, and changelogs. `detect.models` requires a quoted string literal, which is what real usage looks like (`model: "o4-mini"`).
244
+ - For `detect.models`, write the **raw family name** (e.g. `gpt-4.5-preview`, `claude-opus-4-20250514`) — escaping and quote-anchoring are automatic. The optional suffix means `gpt-4o` also catches `"gpt-4o-2024-05-13"`, so pick a family specific enough not to swallow a non-deprecated successor.
245
+ - For `detect.patterns`, match the **deprecated surface itself** — the method/property (`beta\.assistants`), endpoint path (`/v1/threads`), or param (`hapikey\s*=`) — not the import or package name. Importing an SDK isn't usage; calling the removed API is. Keep them specific (`client\.chat` is too broad — it hits unrelated SDKs).
183
246
  - Patterns are matched **case-sensitively** with the global flag over each file's contents; the file path, line number, and matched text are reported.
184
- - Match the **deprecated surface itself**: the method/property (`beta\.assistants`), endpoint path (`/v1/threads`), or model string (`claude-opus-4-20250514`), not the import or the package name. Importing an SDK isn't usage; calling the removed API is.
185
- - Escape backslashes (and literal dots) for JSON: a regex `beta\.assistants` is written `"beta\\.assistants"`.
247
+ - Escape backslashes (and literal dots) for JSON: a regex `beta\.assistants` is written `"beta\\.assistants"`. (Model entries don't need this: write `gpt-4.5-preview` as-is.)
186
248
  - Avoid `^`/`$` line anchors — matching runs against the whole file, not line-by-line; use `\b` word boundaries instead.
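The escaping and anchoring advice in this list can be checked with a few lines of JavaScript. The dataset fragment and the sample `file` text are made up for the example; the only assumption about the scanner is what the README states, namely that patterns run case-sensitively with the global flag over the whole file.

```javascript
// The dataset is JSON, so each regex backslash is written twice in the file.
const datasetText = String.raw`{ "patterns": ["beta\\.assistants"] }`;
const { patterns } = JSON.parse(datasetText);
console.log(patterns[0]); // beta\.assistants

const file = [
  "import OpenAI from 'openai';",
  "const a = await client.beta.assistants.create({});",
  "const betaXassistants = 1;",
].join("\n");

// The escaped dot matches only a literal ".", so line 3 is not a hit.
const hits = [...file.matchAll(new RegExp(patterns[0], "g"))].map((m) => m[0]);

// Without the "m" flag, ^ matches only at the very start of the file,
// so a line-anchored pattern misses code on line 2.
const anchored = new RegExp("^const a", "g").test(file);

// A word boundary finds the same text wherever it sits in the file.
const bounded = new RegExp(String.raw`\bconst a\b`, "g").test(file);

console.log(hits, anchored, bounded);
```

The import on line 1 is never matched, which is the intended behavior: the pattern targets the deprecated surface, not the package name.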
187
249
 
188
250
  ## Development
package/dist/cli.js CHANGED
@@ -62,6 +62,10 @@ function shouldUseColor(colorFlag) {
62
62
  return Boolean(process.stdout.isTTY);
63
63
  }
64
64
  const SEVERITY_RANK = { high: 3, medium: 2, low: 1 };
65
+ /** Commander collector so --ignore can be passed multiple times. */
66
+ function collectIgnore(value, previous) {
67
+ return previous.concat([value]);
68
+ }
65
69
  function runScan(targetPath, opts) {
66
70
  const root = path.resolve(targetPath ?? ".");
67
71
  // Validate the target directory up front for a friendly error.
@@ -88,7 +92,10 @@ function runScan(targetPath, opts) {
88
92
  process.exitCode = 2;
89
93
  return;
90
94
  }
91
- const result = (0, scanner_1.scanRepo)(root, deprecations);
95
+ const result = (0, scanner_1.scanRepo)(root, deprecations, {
96
+ ignore: opts.ignore,
97
+ dataPath: opts.data,
98
+ });
92
99
  if (opts.json) {
93
100
  const counts = { high: 0, medium: 0, low: 0 };
94
101
  for (const f of result.findings)
@@ -142,6 +149,7 @@ function main(argv) {
142
149
  .option("--json", "output machine-readable JSON instead of the report")
143
150
  .option("--no-color", "disable colored output")
144
151
  .option("--data <file>", "use a custom deprecations.json dataset instead of the bundled one")
152
+ .option("--ignore <glob>", "skip files matching this glob (repeatable); also reads .arolignore", collectIgnore, [])
145
153
  .option("--fail-on <severity>", "exit non-zero if findings meet this level: high | medium | low | any | none", "none")
146
154
  .action((pathArg, options) => {
147
155
  runScan(pathArg, options);
package/dist/data.js CHANGED
@@ -81,6 +81,7 @@ function coerceDeprecation(raw) {
81
81
  const detect = r.detect;
82
82
  const sdk = detect && isStringArray(detect.sdk) ? detect.sdk : [];
83
83
  const patterns = detect && isStringArray(detect.patterns) ? detect.patterns : [];
84
+ const models = detect && isStringArray(detect.models) ? detect.models : [];
84
85
  // Default to "pattern" — detection keys on real usage, not SDK presence.
85
86
  const match = MATCH_MODES.includes(r.match)
86
87
  ? r.match
@@ -88,8 +89,14 @@ function coerceDeprecation(raw) {
88
89
  const version_range = isNonEmptyString(r.version_range)
89
90
  ? r.version_range
90
91
  : undefined;
92
+ // Language scoping: extensions this entry's patterns are valid in.
93
+ // Normalize to lowercase, dot-stripped; default to ["*"] (match any file).
94
+ const applies_to = isStringArray(r.applies_to) && r.applies_to.length > 0
95
+ ? r.applies_to.map((e) => e.toLowerCase().replace(/^\./, ""))
96
+ : ["*"];
91
97
  // Drop entries that can never fire under their match mode.
92
- if (match === "pattern" && patterns.length === 0)
98
+ // A "pattern" entry fires on either a raw pattern OR a model-string match.
99
+ if (match === "pattern" && patterns.length === 0 && models.length === 0)
93
100
  return null;
94
101
  if ((match === "sdk" || match === "version") && sdk.length === 0)
95
102
  return null;
@@ -99,8 +106,9 @@ function coerceDeprecation(raw) {
99
106
  title: r.title,
100
107
  severity: r.severity,
101
108
  match,
109
+ applies_to,
102
110
  sunset_date: typeof r.sunset_date === "string" ? r.sunset_date : "",
103
- detect: { sdk, patterns },
111
+ detect: { sdk, patterns, models },
104
112
  version_range,
105
113
  migration_url: typeof r.migration_url === "string" ? r.migration_url : "",
106
114
  summary: typeof r.summary === "string" ? r.summary : "",
package/dist/report.js CHANGED
@@ -114,7 +114,7 @@ function renderReport(result, opts) {
114
114
  if (findings.length === 0) {
115
115
  out.push(s.green(s.bold("✓ No upcoming deprecations detected in your stack.")));
116
116
  out.push("");
117
- out.push(footer(s));
117
+ out.push(footer(s, findings));
118
118
  return out.join("\n");
119
119
  }
120
120
  // Severity summary.
@@ -156,15 +156,29 @@ function renderReport(result, opts) {
156
156
  }
157
157
  out.push("");
158
158
  }
159
- out.push(footer(s));
159
+ out.push(footer(s, findings));
160
160
  return out.join("\n");
161
161
  }
162
- function footer(s) {
163
- return [
164
- s.dim("─".repeat(60)),
165
- s.dim("These are today's deprecations. New ones land constantly — get"),
166
- s.dim("alerted before the next one breaks you → ") + s.cyan(s.bold("arol.ai")),
167
- ].join("\n");
162
+ /** Severity-aware closing CTA, visually separated from the findings above. */
163
+ function footer(s, findings) {
164
+ const sep = s.dim("─".repeat(60));
165
+ const brand = "arol.ai";
166
+ let message;
167
+ if (findings.some((f) => f.deprecation.severity === "high")) {
168
+ // Prominent: high-severity items break on fixed dates.
169
+ message = s.bold(s.red(`⚠ These break on fixed dates. Get alerted before the next one hits you → ${brand}`));
170
+ }
171
+ else if (findings.length > 0) {
172
+ message =
173
+ s.dim("Get continuous deprecation alerts for your stack → ") +
174
+ s.cyan(s.bold(brand));
175
+ }
176
+ else {
177
+ message =
178
+ s.green("✓ Clean today — but new deprecations land constantly. Stay covered → ") +
179
+ s.cyan(s.bold(brand));
180
+ }
181
+ return [sep, message].join("\n");
168
182
  }
169
183
  /** Soft-wrap text to a width, indenting continuation lines. */
170
184
  function wrapText(text, width, indent) {
package/dist/scanner.js CHANGED
@@ -67,13 +67,42 @@ const IGNORED_DIRS = [
67
67
  "venv",
68
68
  "vendor",
69
69
  ];
70
+ /**
71
+ * Files always skipped by default: documentation/prose (where model names show
72
+ * up as text, not code) and the tool's own dataset / config files.
73
+ */
74
+ const DEFAULT_FILE_IGNORES = [
75
+ "**/*.md",
76
+ "**/*.mdx",
77
+ "**/*.txt",
78
+ "**/deprecations.json",
79
+ "**/.arolignore",
80
+ "**/arol.config.*",
81
+ ];
70
82
  /** Skip files larger than this (bytes) to keep the scan fast. */
71
83
  const MAX_FILE_BYTES = 2 * 1024 * 1024;
72
84
  /** Cap matches recorded per pattern per file to avoid pathological output. */
73
85
  const MAX_MATCHES_PER_PATTERN_PER_FILE = 50;
86
+ /** Any of the three string-literal quote characters. */
87
+ const QUOTE_CLASS = "['\"`]";
88
+ /** Escape regex metacharacters in a literal model family name. */
89
+ function escapeRegex(s) {
90
+ return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
91
+ }
92
+ /**
93
+ * Build a regex that matches a model family ONLY inside a string literal:
94
+ * an opening quote, the (escaped) family name, an optional version/suffix, then
95
+ * the SAME closing quote. So a quoted model id (single, double, or backtick)
96
+ * and its versioned snapshots match, but never a bare occurrence in
97
+ * prose/JSX/markdown.
98
+ */
99
+ function modelRegexSource(family) {
100
+ return `(${QUOTE_CLASS})${escapeRegex(family)}[A-Za-z0-9._-]*\\1`;
101
+ }
74
102
  function compileDeprecations(deprecations) {
75
103
  return deprecations.map((deprecation) => {
76
104
  const regexes = [];
105
+ // Raw patterns — code identifiers, endpoints, params.
77
106
  for (const pattern of deprecation.detect.patterns) {
78
107
  try {
79
108
  // Global so we can iterate every match and derive line numbers.
@@ -83,9 +112,119 @@ function compileDeprecations(deprecations) {
83
112
  // A malformed pattern in the dataset must not crash the scan.
84
113
  }
85
114
  }
86
- return { deprecation, regexes };
115
+ // Model names only matched inside string literals (quote-anchored).
116
+ for (const family of deprecation.detect.models) {
117
+ try {
118
+ regexes.push(new RegExp(modelRegexSource(family), "g"));
119
+ }
120
+ catch {
121
+ // Defensive: a pathological family name must not crash the scan.
122
+ }
123
+ }
124
+ const appliesTo = new Set((deprecation.applies_to.length > 0 ? deprecation.applies_to : ["*"]).map((e) => e.toLowerCase()));
125
+ return { deprecation, regexes, appliesTo };
87
126
  });
88
127
  }
128
+ /** True if a compiled entry should be tested against a file with this extension. */
129
+ function appliesToExt(compiled, ext) {
130
+ return compiled.appliesTo.has("*") || compiled.appliesTo.has(ext);
131
+ }
132
+ function commentConfig(ext) {
133
+ switch (ext) {
134
+ case "js":
135
+ case "mjs":
136
+ case "cjs":
137
+ case "jsx":
138
+ case "ts":
139
+ case "mts":
140
+ case "cts":
141
+ case "tsx":
142
+ case "go":
143
+ return { line: ["//"], block: [["/*", "*/"]], strings: ["'", '"', "`"], triple: [] };
144
+ case "py":
145
+ return { line: ["#"], block: [], strings: ["'", '"'], triple: ['"""', "'''"] };
146
+ default:
147
+ return null;
148
+ }
149
+ }
150
+ /**
151
+ * Replace comments with spaces so they don't match, while preserving the exact
152
+ * string length and all newlines, so line/column offsets stay correct. String literals
153
+ * (including their contents) are left intact, so a comment marker inside a string
154
+ * (e.g. "https://…") is NOT treated as a comment.
155
+ */
156
+ function stripComments(src, cfg) {
157
+ const out = src.split("");
158
+ const n = src.length;
159
+ const at = (s, i) => src.startsWith(s, i);
160
+ const blank = (from, to) => {
161
+ for (let k = from; k < to; k++)
162
+ if (out[k] !== "\n")
163
+ out[k] = " ";
164
+ };
165
+ let i = 0;
166
+ while (i < n) {
167
+ // Triple-quoted strings (Python) — checked before single quotes.
168
+ let matched = false;
169
+ for (const t of cfg.triple) {
170
+ if (at(t, i)) {
171
+ i += t.length;
172
+ while (i < n && !at(t, i))
173
+ i++;
174
+ i += t.length;
175
+ matched = true;
176
+ break;
177
+ }
178
+ }
179
+ if (matched)
180
+ continue;
181
+ // Ordinary string literals — preserve contents verbatim.
182
+ for (const q of cfg.strings) {
183
+ if (src[i] === q) {
184
+ i++;
185
+ while (i < n && src[i] !== q) {
186
+ if (src[i] === "\\")
187
+ i++; // skip escaped char
188
+ i++;
189
+ }
190
+ i++; // closing quote (or past end if unterminated)
191
+ matched = true;
192
+ break;
193
+ }
194
+ }
195
+ if (matched)
196
+ continue;
197
+ // Block comments.
198
+ for (const [open, close] of cfg.block) {
199
+ if (at(open, i)) {
200
+ const end = src.indexOf(close, i + open.length);
201
+ const stop = end === -1 ? n : end + close.length;
202
+ blank(i, stop);
203
+ i = stop;
204
+ matched = true;
205
+ break;
206
+ }
207
+ }
208
+ if (matched)
209
+ continue;
210
+ // Line comments.
211
+ for (const lc of cfg.line) {
212
+ if (at(lc, i)) {
213
+ let k = i;
214
+ while (k < n && src[k] !== "\n")
215
+ k++;
216
+ blank(i, k);
217
+ i = k;
218
+ matched = true;
219
+ break;
220
+ }
221
+ }
222
+ if (matched)
223
+ continue;
224
+ i++;
225
+ }
226
+ return out.join("");
227
+ }
89
228
  /** Precompute the byte offset at which each line starts. */
90
229
  function computeLineStarts(content) {
91
230
  const starts = [0];
@@ -133,8 +272,8 @@ function scanContent(content, relPath, compiled, sink) {
133
272
  if (seenLines.has(line))
134
273
  continue; // one record per line per pattern
135
274
  seenLines.add(line);
136
- // Cite the matched substring itself (e.g. "beta.assistants"), normalized
137
- // and length-capped, so the report points at exactly what triggered.
275
+ // Cite the matched substring itself, normalized and length-capped, so
276
+ // the report points at exactly what triggered the finding.
138
277
  const text = (m[0] ?? "").replace(/\s+/g, " ").trim().slice(0, 120);
139
278
  if (!recorded) {
140
279
  recorded = [];
@@ -222,13 +361,61 @@ function versionInRange(declared, range) {
222
361
  return cmp === 0;
223
362
  }
224
363
  }
364
+ /**
365
+ * Convert one .arolignore line (gitignore-style) into fast-glob ignore globs.
366
+ * Supports comments (#), blank lines, leading "/" anchoring, and trailing "/"
367
+ * for directories. Negations ("!") are not supported and are skipped.
368
+ */
369
+ function arolignoreLineToGlobs(rawLine) {
370
+ let line = rawLine.trim();
371
+ if (!line || line.startsWith("#") || line.startsWith("!"))
372
+ return [];
373
+ const anchored = line.startsWith("/");
374
+ if (anchored)
375
+ line = line.slice(1);
376
+ const isDir = line.endsWith("/");
377
+ if (isDir)
378
+ line = line.replace(/\/+$/, "");
379
+ if (!line)
380
+ return [];
381
+ const base = anchored ? line : `**/${line}`;
382
+ // A directory ignore covers its contents; a file/glob ignore covers both the
383
+ // entry itself and (harmlessly) anything beneath it if it is a directory.
384
+ return isDir ? [`${base}/**`] : [base, `${base}/**`];
385
+ }
386
+ /** Read and parse a repo's .arolignore file into ignore globs (empty if none). */
387
+ function loadArolignore(root) {
388
+ let content;
389
+ try {
390
+ content = fs.readFileSync(path.join(root, ".arolignore"), "utf8");
391
+ }
392
+ catch {
393
+ return [];
394
+ }
395
+ return content.split(/\r?\n/).flatMap(arolignoreLineToGlobs);
396
+ }
225
397
  /**
226
398
  * Scan a repository for deprecation usage.
227
399
  * @param root repo root to scan.
228
400
  * @param deprecations validated dataset entries.
401
+ * @param options optional ignore globs (--ignore) and custom dataset path.
229
402
  */
230
- function scanRepo(root, deprecations) {
403
+ function scanRepo(root, deprecations, options = {}) {
231
404
  const absRoot = path.resolve(root);
405
+ // Assemble the ignore list: dirs + default file skips + .arolignore + --ignore.
406
+ const ignoreGlobs = [
407
+ ...IGNORED_DIRS.map((d) => `**/${d}/**`),
408
+ ...DEFAULT_FILE_IGNORES,
409
+ ...loadArolignore(absRoot),
410
+ ...(options.ignore ?? []),
411
+ ];
412
+ // Never scan the active custom dataset file, even if it lives in the tree.
413
+ if (options.dataPath) {
414
+ const relData = path.relative(absRoot, path.resolve(options.dataPath));
415
+ if (relData && !relData.startsWith("..") && !path.isAbsolute(relData)) {
416
+ ignoreGlobs.push(relData);
417
+ }
418
+ }
232
419
  // Partition by match mode: "pattern" entries key on real source usage, while
233
420
  // "sdk"/"version" entries key on the manifest.
234
421
  const patternDeps = deprecations.filter((d) => d.match === "pattern");
@@ -246,7 +433,7 @@ function scanRepo(root, deprecations) {
246
433
  dot: false,
247
434
  followSymbolicLinks: false,
248
435
  suppressErrors: true,
249
- ignore: IGNORED_DIRS.map((d) => `**/${d}/**`),
436
+ ignore: ignoreGlobs,
250
437
  });
251
438
  let scannedFiles = 0;
252
439
  for (const rel of files) {
@@ -268,7 +455,16 @@ function scanRepo(root, deprecations) {
268
455
  continue;
269
456
  }
270
457
  scannedFiles++;
271
- scanContent(content, rel, compiled, patternSink);
458
+ // Language scoping: only test entries valid for this file's extension.
459
+ const ext = path.extname(rel).slice(1).toLowerCase();
460
+ const applicable = compiled.filter((c) => appliesToExt(c, ext));
461
+ if (applicable.length === 0)
462
+ continue;
463
+ // Comment stripping: match against de-commented source so mentions in
464
+ // comments don't count. Offsets are preserved, so line numbers stay correct.
465
+ const cfg = commentConfig(ext);
466
+ const scanText = cfg ? stripComments(content, cfg) : content;
467
+ scanContent(scanText, rel, applicable, patternSink);
272
468
  }
273
469
  // 3. Build findings — one per deprecation, evaluated per its match mode.
274
470
  const findings = [];
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "arol-ai",
3
- "version": "0.1.1",
3
+ "version": "0.1.3",
4
4
  "description": "Scan a local repo for upcoming third-party API/SDK deprecations. Fully local — no network, no telemetry, your code never leaves the machine.",
5
5
  "keywords": [
6
6
  "deprecation",
@@ -5,6 +5,7 @@
5
5
  "title": "Assistants API (beta)",
6
6
  "severity": "high",
7
7
  "match": "pattern",
8
+ "applies_to": ["py", "js", "ts", "jsx", "tsx", "mjs"],
8
9
  "sunset_date": "2026-08-26",
9
10
  "detect": {
10
11
  "sdk": ["openai"],
@@ -25,10 +26,12 @@
25
26
  "title": "Claude Sonnet 4 & Opus 4",
26
27
  "severity": "high",
27
28
  "match": "pattern",
29
+ "applies_to": ["*"],
28
30
  "sunset_date": "2026-06-15",
29
31
  "detect": {
30
32
  "sdk": ["@anthropic-ai/sdk", "anthropic"],
31
- "patterns": [
33
+ "patterns": [],
34
+ "models": [
32
35
  "claude-sonnet-4-20250514",
33
36
  "claude-opus-4-20250514",
34
37
  "claude-sonnet-4-0",
@@ -45,15 +48,17 @@
45
48
  "title": "Retired Claude models (Haiku 3, 3.5/3.7 Sonnet, Claude 2.x)",
46
49
  "severity": "high",
47
50
  "match": "pattern",
51
+ "applies_to": ["*"],
48
52
  "sunset_date": "2026-04-20",
49
53
  "detect": {
50
54
  "sdk": ["@anthropic-ai/sdk", "anthropic"],
51
- "patterns": [
55
+ "patterns": [],
56
+ "models": [
52
57
  "claude-3-haiku-20240307",
53
58
  "claude-3-5-sonnet",
54
59
  "claude-3-7-sonnet",
55
- "claude-2\\.1",
56
- "claude-2\\.0",
60
+ "claude-2.1",
61
+ "claude-2.0",
57
62
  "claude-instant"
58
63
  ]
59
64
  },
@@ -67,14 +72,12 @@
67
72
  "title": "Gemini 2.5 Pro / Flash / Flash-Lite",
68
73
  "severity": "high",
69
74
  "match": "pattern",
75
+ "applies_to": ["*"],
70
76
  "sunset_date": "2026-10-16",
71
77
  "detect": {
72
78
  "sdk": ["@google/generative-ai", "@google/genai", "google-generativeai"],
73
- "patterns": [
74
- "gemini-2\\.5-pro",
75
- "gemini-2\\.5-flash",
76
- "gemini-2\\.5-flash-lite"
77
- ]
79
+ "patterns": [],
80
+ "models": ["gemini-2.5-pro", "gemini-2.5-flash", "gemini-2.5-flash-lite"]
78
81
  },
79
82
  "migration_url": "https://ai.google.dev/gemini-api/docs/deprecations",
80
83
  "summary": "Gemini 2.5 models scheduled for shutdown Oct 16, 2026 (Google states this is the earliest possible date). Migrate to Gemini 3.x.",
@@ -86,14 +89,16 @@
86
89
  "title": "Gemini 2.0 Flash family",
87
90
  "severity": "high",
88
91
  "match": "pattern",
92
+ "applies_to": ["*"],
89
93
  "sunset_date": "2026-06-01",
90
94
  "detect": {
91
95
  "sdk": ["@google/generative-ai", "@google/genai", "google-generativeai"],
92
- "patterns": [
93
- "gemini-2\\.0-flash",
94
- "gemini-2\\.0-flash-001",
95
- "gemini-2\\.0-flash-lite",
96
- "gemini-2\\.0-flash-lite-001"
96
+ "patterns": [],
97
+ "models": [
98
+ "gemini-2.0-flash",
99
+ "gemini-2.0-flash-001",
100
+ "gemini-2.0-flash-lite",
101
+ "gemini-2.0-flash-lite-001"
97
102
  ]
98
103
  },
99
104
  "migration_url": "https://ai.google.dev/gemini-api/docs/deprecations",
@@ -106,18 +111,244 @@
106
111
  "title": "Gemini 1.0 / 1.5 models",
107
112
  "severity": "high",
108
113
  "match": "pattern",
114
+ "applies_to": ["*"],
109
115
  "sunset_date": "2026-04-29",
110
116
  "detect": {
111
117
  "sdk": ["@google/generative-ai", "@google/genai", "google-generativeai"],
112
- "patterns": [
113
- "gemini-1\\.5-pro",
114
- "gemini-1\\.5-flash",
115
- "gemini-1\\.0-pro",
118
+ "patterns": [],
119
+ "models": [
120
+ "gemini-1.5-pro",
121
+ "gemini-1.5-flash",
122
+ "gemini-1.0-pro",
116
123
  "gemini-pro"
117
124
  ]
118
125
  },
119
126
  "migration_url": "https://ai.google.dev/gemini-api/docs/deprecations",
120
127
  "summary": "All Gemini 1.0 and 1.5 models are shut down and return 404. Migrate to Gemini 2.5+ or 3.x.",
121
128
  "source": "https://firebase.google.com/docs/ai-logic/models"
129
+ },
130
+ {
131
+ "id": "openai-gpt4-family-shutdown",
132
+ "vendor": "OpenAI",
133
+ "title": "GPT-4 family models (API shutdown)",
134
+ "severity": "high",
135
+ "match": "pattern",
136
+ "applies_to": ["*"],
137
+ "sunset_date": "2026-10-23",
138
+ "date_confidence": "verify",
139
+ "detect": {
140
+ "sdk": ["openai"],
141
+ "patterns": [],
142
+ "models": [
143
+ "gpt-4o",
144
+ "gpt-4-turbo",
145
+ "gpt-4-0613",
146
+ "gpt-4-0125-preview",
147
+ "gpt-4-1106-preview",
148
+ "gpt-4-32k",
149
+ "gpt-4-vision-preview",
150
+ "o4-mini",
151
+ "gpt-4.5-preview"
152
+ ]
153
+ },
154
+ "migration_url": "https://platform.openai.com/docs/deprecations",
155
+ "summary": "The GPT-4 family (gpt-4o snapshots, gpt-4-turbo, gpt-4-0613, o4-mini, etc.) is reported on a single API shutdown date of Oct 23, 2026; calls will then fail. Migrate to the GPT-5 family. VERIFY exact date and model membership on the official deprecations page before shipping.",
156
+ "source": "https://platform.openai.com/docs/deprecations"
157
+ },
158
+ {
159
+ "id": "openai-legacy-retired-models",
160
+ "vendor": "OpenAI",
161
+ "title": "Retired legacy OpenAI models (GPT-3 era, early snapshots)",
162
+ "severity": "high",
163
+ "match": "pattern",
164
+ "applies_to": ["*"],
165
+ "sunset_date": "2024-01-04",
166
+ "detect": {
167
+ "sdk": ["openai"],
168
+ "patterns": [],
169
+ "models": [
170
+ "text-davinci-003",
171
+ "text-davinci-002",
172
+ "gpt-3.5-turbo-0301",
173
+ "gpt-3.5-turbo-0613",
174
+ "gpt-4-0314",
175
+ "code-davinci",
176
+ "text-curie",
177
+ "text-babbage"
178
+ ]
179
+ },
180
+ "migration_url": "https://platform.openai.com/docs/deprecations",
181
+ "summary": "These legacy models (GPT-3-era completions and early dated GPT-4/3.5 snapshots) are already retired and return errors. Migrate to current models.",
182
+ "source": "https://platform.openai.com/docs/deprecations"
183
+ },
184
+ {
185
+ "id": "stripe-removed-js-methods",
186
+ "vendor": "Stripe",
187
+ "title": "Removed legacy Stripe.js methods",
188
+ "severity": "high",
189
+ "match": "pattern",
190
+ "applies_to": ["js", "ts", "jsx", "tsx", "mjs"],
191
+ "sunset_date": "2026-03-25",
192
+ "detect": {
193
+ "sdk": ["@stripe/stripe-js", "stripe"],
194
+ "patterns": [
195
+ "handleCardPayment",
196
+ "confirmPaymentIntent",
197
+ "handleFpxPayment",
198
+ "handleCardSetup",
199
+ "confirmSetupIntent",
200
+ "createSource",
201
+ "retrieveSource"
202
+ ]
203
+ },
204
+ "migration_url": "https://docs.stripe.com/changelog/dahlia/2026-03-25/remove-legacy-stripejs-methods",
205
+ "summary": "These legacy Stripe.js methods were removed and now throw errors. Replace handleCardPayment/confirmPaymentIntent with confirmCardPayment, handleCardSetup/confirmSetupIntent with confirmCardSetup, and migrate createSource/retrieveSource to the PaymentMethods API.",
206
+ "source": "https://docs.stripe.com/changelog/dahlia/2026-03-25/remove-legacy-stripejs-methods"
207
+ },
208
+ {
209
+ "id": "stripe-sources-charges-legacy",
210
+ "vendor": "Stripe",
211
+ "title": "Sources API / legacy Charges API",
212
+ "severity": "medium",
213
+ "match": "pattern",
214
+ "applies_to": ["js", "ts", "jsx", "tsx", "mjs"],
215
+ "sunset_date": "2024-05-15",
216
+ "detect": {
217
+ "sdk": ["stripe"],
218
+ "patterns": ["\\.sources\\.create", "charges\\.create", "Charge\\.create"]
219
+ },
220
+ "migration_url": "https://docs.stripe.com/payments/older-apis",
221
+ "summary": "The Sources API is deprecated (local payment methods stopped being accepted May 15, 2024) and the Charges API is legacy. Migrate to the PaymentMethods + PaymentIntents APIs.",
222
+ "source": "https://docs.stripe.com/payments/older-apis"
223
+ },
224
+ {
225
+ "id": "twilio-notify-eol",
226
+ "vendor": "Twilio",
227
+ "title": "Notify API",
228
+ "severity": "high",
229
+ "match": "pattern",
230
+ "applies_to": ["py", "js", "ts", "jsx", "tsx", "mjs"],
231
+ "sunset_date": "2027-01-31",
232
+ "date_confidence": "verify",
233
+ "detect": {
234
+ "sdk": ["twilio"],
235
+ "patterns": ["notify\\.v1", "\\.notify\\.services", "client\\.notify"]
236
+ },
237
+ "migration_url": "https://www.twilio.com/en-us/changelog",
238
+ "summary": "Twilio Notify reaches end of life Jan 31, 2027; after that all Notify API requests will fail. No 1:1 replacement — rebuild with Programmable Messaging / Conversations. Verify the date against Twilio's official EOL notice.",
239
+ "source": "https://www.courier.com/blog/twilio-notify-end-of-life"
240
+ },
241
+ {
242
+ "id": "twilio-programmable-chat-retired",
243
+ "vendor": "Twilio",
244
+ "title": "Programmable Chat API",
245
+ "severity": "high",
246
+ "match": "pattern",
247
+ "applies_to": ["py", "js", "ts", "jsx", "tsx", "mjs"],
248
+ "sunset_date": "2022-07-25",
249
+ "detect": {
250
+ "sdk": ["twilio", "twilio-chat"],
251
+ "patterns": ["chat\\.v2", "IpMessaging", "twilio-chat"]
252
+ },
253
+ "migration_url": "https://www.twilio.com/docs/conversations/migrating-chat-conversations",
254
+ "summary": "The standalone Programmable Chat API was sunset July 25, 2022 (Programmable Chat in Flex ended June 1, 2026). Migrate to the Conversations API.",
255
+ "source": "https://www.twilio.com/en-us/changelog/programmable-chat-end-of-life-notice"
256
+ },
257
+ {
258
+ "id": "aws-sdk-js-v2-eol",
259
+ "vendor": "AWS",
260
+ "title": "AWS SDK for JavaScript v2",
261
+ "severity": "medium",
262
+ "match": "sdk",
263
+ "applies_to": ["js", "ts", "jsx", "tsx", "mjs"],
264
+ "sunset_date": "2025-09-08",
265
+ "detect": {
266
+ "sdk": ["aws-sdk"],
267
+ "patterns": []
268
+ },
269
+ "migration_url": "https://aws.amazon.com/blogs/developer/announcing-end-of-support-for-aws-sdk-for-javascript-v2/",
270
+ "summary": "AWS SDK for JavaScript v2 (the 'aws-sdk' package) reached end-of-support Sept 8, 2025 — no more updates or security fixes. Migrate to the modular AWS SDK v3 (@aws-sdk/* packages).",
271
+ "source": "https://aws.amazon.com/blogs/developer/announcing-end-of-support-for-aws-sdk-for-javascript-v2/"
272
+ },
273
+ {
274
+ "id": "hubspot-api-key-hapikey",
275
+ "vendor": "HubSpot",
276
+ "title": "Legacy API keys (hapikey)",
277
+ "severity": "high",
278
+ "match": "pattern",
279
+ "applies_to": ["py", "js", "ts", "jsx", "tsx", "mjs"],
280
+ "sunset_date": "2022-11-30",
281
+ "detect": {
282
+ "sdk": ["@hubspot/api-client"],
283
+ "patterns": ["hapikey\\s*=", "hapiKey\\s*="]
284
+ },
285
+ "migration_url": "https://developers.hubspot.com/changelog/upcoming-api-key-sunset",
286
+ "summary": "HubSpot deprecated legacy API keys (the hapikey query param) on Nov 30, 2022; requests using hapikey now fail. Migrate to Private App access tokens (Bearer auth). The eCommerce Bridge and Accounting Extension APIs were also sunset Dec 1, 2022.",
287
+ "source": "https://developers.hubspot.com/changelog/upcoming-api-key-sunset"
288
+ },
289
+ {
290
+ "id": "openai-python-v0-syntax",
291
+ "vendor": "OpenAI",
292
+ "title": "Legacy openai-python v0 call syntax",
293
+ "severity": "high",
294
+ "match": "pattern",
295
+ "applies_to": ["py"],
296
+ "sunset_date": "2023-11-06",
297
+ "detect": {
298
+ "sdk": ["openai"],
299
+ "patterns": [
300
+ "openai\\.ChatCompletion",
301
+ "openai\\.Completion\\.create",
302
+ "openai\\.Embedding\\.create",
303
+ "openai\\.Moderation\\.create"
304
+ ]
305
+ },
306
+ "migration_url": "https://github.com/openai/openai-python/discussions/742",
307
+ "summary": "openai-python v1.0 (Nov 2023) removed module-level calls. openai.ChatCompletion/Completion/Embedding.create now raise APIRemovedInV1. Instantiate a client (client = OpenAI(); client.chat.completions.create(...)) or run `openai migrate`.",
308
+ "source": "https://github.com/openai/openai-python/issues/2172"
309
+ },
310
+ {
311
+ "id": "vercel-ai-sdk-v5-removed",
312
+ "vendor": "Vercel (AI SDK)",
313
+ "title": "AI SDK v4→v5/v6 removed APIs",
314
+ "severity": "medium",
315
+ "match": "pattern",
316
+ "applies_to": ["js", "ts", "jsx", "tsx", "mjs"],
317
+ "sunset_date": "2025-07-31",
318
+ "detect": {
319
+ "sdk": ["ai"],
320
+ "patterns": [
321
+ "from ['\"]ai/react['\"]",
322
+ "from ['\"]ai/openai['\"]",
323
+ "experimental_streamText",
324
+ "experimental_generateText",
325
+ "StreamingTextResponse"
326
+ ]
327
+ },
328
+ "migration_url": "https://ai-sdk.dev/docs/migration-guides/migration-guide-5-0",
329
+ "summary": "AI SDK v5 (Jul 2025) removed deprecated APIs: 'ai/react' → '@ai-sdk/react', experimental_streamText → streamText, StreamingTextResponse removed, useChat maxSteps removed. v6 moved provider imports ('ai/openai' → '@ai-sdk/openai').",
330
+ "source": "https://ai-sdk.dev/docs/migration-guides/migration-guide-5-0"
331
+ },
332
+ {
333
+ "id": "langchain-legacy-imports",
334
+ "vendor": "LangChain",
335
+ "title": "Legacy LangChain imports & chains",
336
+ "severity": "medium",
337
+ "match": "pattern",
338
+ "applies_to": ["py"],
339
+ "sunset_date": "2024-05-01",
340
+ "detect": {
341
+ "sdk": ["langchain"],
342
+ "patterns": [
343
+ "from langchain\\.llms import",
344
+ "from langchain\\.chat_models import",
345
+ "from langchain\\.embeddings import",
346
+ "initialize_agent",
347
+ "LLMChain"
348
+ ]
349
+ },
350
+ "migration_url": "https://python.langchain.com/docs/versions/v0_2/",
351
+ "summary": "Pre-0.2 LangChain imports are deprecated/removed: langchain.llms/chat_models/embeddings → langchain_community or provider packages (langchain_openai). LLMChain and initialize_agent are deprecated (use LCEL / create_agent; legacy moved to langchain-classic in 1.0).",
352
+ "source": "https://github.com/langchain-ai/langchain/discussions/19083"
122
353
  }
123
354
  ]