auto-api-discovery 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +79 -0
- package/dist/crawler.js +142 -0
- package/dist/db.js +60 -0
- package/dist/index.js +125 -0
- package/dist/interceptor.js +90 -0
- package/dist/openapi-generator.js +87 -0
- package/dist/schema-engine.js +86 -0
- package/package.json +39 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Anooj Shete
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,79 @@
+# 🕸️ ApiGen
+
+> **Automated API Discovery System with HITL Authentication**
+
+
+
+
+
+
+---
+
+## 🚀 What is it?
+
+Modern web applications often rely on undocumented, "hidden" internal APIs. Reverse-engineering these endpoints manually by staring at the network tab in DevTools is tedious, error-prone, and time-consuming.
+
+**ApiGen** automates the discovery and documentation of these hidden APIs. It intercepts network traffic (XHR, Fetch, GraphQL), handles authentication barriers via a Human-in-the-Loop (HITL) workflow, recursively maps internal paths with a headless spider, and outputs a valid OpenAPI 3.0 specification.
+
+---
+
+## 🏗️ Architecture
+
+ApiGen uses a **hybrid API discovery approach** consisting of:
+
+1. **Passive Recording (Capture Mode):** Launches a headful Playwright instance. You navigate the target web application as a real user—completing CAPTCHAs, 2FA, and complex login flows yourself. Under the hood, ApiGen intercepts all network requests/responses and logs them to a local SQLite database. When you close the browser, it exports your authenticated session state to `.apigen-session.json`.
+2. **Auto-Spidering (Crawl Mode):** Loads the previously captured session state and launches a headless Playwright breadth-first-search (BFS) spider. It navigates the application as an authenticated user would, discovering protected routes, inserting randomized delays between page loads to reduce the chance of WAF blocks, and feeding newly observed API traffic into the same local SQLite store.
+3. **OpenAPI Generation (Export):** A local schema inference engine processes the captured SQLite records. It deduplicates dynamic URLs (folding numeric IDs, ObjectIds, and UUIDs into path parameters), recursively infers nested JSON payload structures, and compiles them into a standard OpenAPI 3.0 JSON specification document.
+
+---
+
+## ⚙️ Tech Stack
+
+- **[Node.js](https://nodejs.org/en/) & [TypeScript](https://www.typescriptlang.org/)** - Type-safe runtime and tooling.
+- **[Playwright](https://playwright.dev/)** - Browser automation, session persistence, and network request/response interception.
+- **[Better-SQLite3](https://github.com/WiseLibs/better-sqlite3)** - Fast, synchronous SQLite driver (WAL mode) used as the local storage layer.
+- **[Commander.js](https://github.com/tj/commander.js/)** - Command-line interface.
+- **[Chalk](https://github.com/chalk/chalk)** - Clear, colorized terminal logging.
+
+---
+
+## 🛠️ Getting Started
+
+### Installation
+
+Clone the repository and install dependencies:
+
+```bash
+git clone https://github.com/anoojshete/auto-api-discovery.git
+cd auto-api-discovery
+npm install
+npx playwright install chromium
+npm run build
+```
+
+*(Note: the project uses `npm run apigen` as the execution entrypoint, hooked to `ts-node src/index.ts`.)*
+
+### Usage
+
+#### 1. Capture Mode
+Open the target application interactively, complete any authentication steps, and capture the underlying API traffic. When you close the browser, your cookies are saved to `.apigen-session.json` in your working directory.
+
+```bash
+npm run apigen capture https://example.com
+```
+
+#### 2. Crawl Mode
+Run the authenticated headless spider to map internal pages and capture the API traffic they generate.
+
+```bash
+# Options: -d / --depth, -p / --pages
+npm run apigen crawl https://example.com --depth 3 --pages 50
+```
+
+#### 3. Export OpenAPI Schema
+Transform the captured endpoints in the local SQLite database into a grouped, schema-inferred OpenAPI 3.0 document.
+
+```bash
+# Options: -b / --base-url
+npm run apigen export ./openapi.json --base-url https://api.example.com
+```
package/dist/crawler.js
ADDED
@@ -0,0 +1,142 @@
+"use strict";
+var __createBinding = (this && this.__createBinding) || (Object.create ? (function(o, m, k, k2) {
+    if (k2 === undefined) k2 = k;
+    var desc = Object.getOwnPropertyDescriptor(m, k);
+    if (!desc || ("get" in desc ? !m.__esModule : desc.writable || desc.configurable)) {
+        desc = { enumerable: true, get: function() { return m[k]; } };
+    }
+    Object.defineProperty(o, k2, desc);
+}) : (function(o, m, k, k2) {
+    if (k2 === undefined) k2 = k;
+    o[k2] = m[k];
+}));
+var __setModuleDefault = (this && this.__setModuleDefault) || (Object.create ? (function(o, v) {
+    Object.defineProperty(o, "default", { enumerable: true, value: v });
+}) : function(o, v) {
+    o["default"] = v;
+});
+var __importStar = (this && this.__importStar) || (function () {
+    var ownKeys = function(o) {
+        ownKeys = Object.getOwnPropertyNames || function (o) {
+            var ar = [];
+            for (var k in o) if (Object.prototype.hasOwnProperty.call(o, k)) ar[ar.length] = k;
+            return ar;
+        };
+        return ownKeys(o);
+    };
+    return function (mod) {
+        if (mod && mod.__esModule) return mod;
+        var result = {};
+        if (mod != null) for (var k = ownKeys(mod), i = 0; i < k.length; i++) if (k[i] !== "default") __createBinding(result, mod, k[i]);
+        __setModuleDefault(result, mod);
+        return result;
+    };
+})();
+var __importDefault = (this && this.__importDefault) || function (mod) {
+    return (mod && mod.__esModule) ? mod : { "default": mod };
+};
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.startCrawler = startCrawler;
+const playwright_1 = require("playwright");
+const chalk_1 = __importDefault(require("chalk"));
+const fs = __importStar(require("fs"));
+const path = __importStar(require("path"));
+const interceptor_1 = require("./interceptor");
+const SESSION_FILE = path.resolve(process.cwd(), '.apigen-session.json');
+const sleep = (ms) => new Promise(r => setTimeout(r, ms));
+const randomDelay = (min, max) => sleep(Math.floor(Math.random() * (max - min + 1)) + min);
+async function startCrawler(targetUrl, maxDepth = 2, maxPages = 50) {
+    console.log(chalk_1.default.yellow(`Starting crawler for ${targetUrl} (Max Depth: ${maxDepth}, Max Pages: ${maxPages})...`));
+    let browser = null;
+    try {
+        browser = await playwright_1.chromium.launch({ headless: true });
+        const context = await browser.newContext();
+        // Load cookies if available to run fully authenticated
+        if (fs.existsSync(SESSION_FILE)) {
+            try {
+                const cookies = JSON.parse(fs.readFileSync(SESSION_FILE, 'utf-8'));
+                await context.addCookies(cookies);
+                console.log(chalk_1.default.green('Loaded session cookies successfully. Crawler is authenticated.'));
+            }
+            catch (err) {
+                console.error(chalk_1.default.red('Failed to load cookies from session file.'), err);
+            }
+        }
+        const page = await context.newPage();
+        // Attach the EXACT same interceptor from Milestone 1
+        (0, interceptor_1.attachInterceptor)(page);
+        let parsedTargetUrl;
+        try {
+            parsedTargetUrl = new URL(targetUrl);
+        }
+        catch {
+            console.error(chalk_1.default.red(`Invalid target URL: ${targetUrl}`));
+            return;
+        }
+        const domain = parsedTargetUrl.hostname;
+        // BFS queue: { url, depth }
+        const queue = [{ url: targetUrl, depth: 0 }];
+        const visited = new Set();
+        let pagesProcessed = 0;
+        console.log(chalk_1.default.blue('Beginning breadth-first spidering...'));
+        while (queue.length > 0 && pagesProcessed < maxPages) {
+            const current = queue.shift();
+            if (!current)
+                break;
+            const { url: currentUrl, depth } = current;
+            // Normalize URL (strip hash fragments to avoid duplicates)
+            let normalizedUrl = currentUrl;
+            try {
+                const pureUrl = new URL(currentUrl);
+                pureUrl.hash = '';
+                normalizedUrl = pureUrl.toString();
+            }
+            catch { }
+            if (visited.has(normalizedUrl))
+                continue;
+            visited.add(normalizedUrl);
+            console.log(chalk_1.default.cyan(`[Depth ${depth}] Crawling: ${normalizedUrl}`));
+            try {
+                await page.goto(normalizedUrl, { waitUntil: 'domcontentloaded', timeout: 15000 });
+                pagesProcessed++;
+                // Random delay to avoid aggressive WAF blocks
+                await randomDelay(500, 1500);
+                if (depth < maxDepth) {
+                    // Extract all <a> tags and map their absolute URLs
+                    const links = await page.$$eval('a', anchors => anchors.map(a => a.href));
+                    for (const link of links) {
+                        if (!link)
+                            continue;
+                        try {
+                            const parsedLink = new URL(link);
+                            // Filter out external links strictly based on target domain boundaries
+                            if (parsedLink.hostname === domain || parsedLink.hostname.endsWith(`.${domain}`)) {
+                                parsedLink.hash = '';
+                                const nextUrl = parsedLink.toString();
+                                // Add new links to queue
+                                if (!visited.has(nextUrl)) {
+                                    queue.push({ url: nextUrl, depth: depth + 1 });
+                                }
+                            }
+                        }
+                        catch (err) {
+                            // Ignore invalid or non-navigable hrefs (`javascript:`, etc.)
+                        }
+                    }
+                }
+            }
+            catch (navError) {
+                console.log(chalk_1.default.red(`Failed to crawl ${normalizedUrl} - `) + navError.message);
+            }
+        }
+        console.log(chalk_1.default.green(`Crawl completed! Processed ${pagesProcessed} pages and intercepted traffic.`));
+    }
+    catch (error) {
+        console.error(chalk_1.default.red('Crawler failed:'), error);
+    }
+    finally {
+        if (browser) {
+            await browser.close();
+        }
+    }
+}
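The crawler's queue/visited/depth logic can be exercised without a browser. Below is a minimal sketch of the same BFS walk over a hypothetical in-memory link graph (the URLs and the `graph` object are made-up stand-ins for what `page.$$eval('a', ...)` would return):

```javascript
// Hypothetical link graph standing in for real pages.
const graph = {
  'https://example.com/': ['https://example.com/a', 'https://example.com/a#top'],
  'https://example.com/a': ['https://example.com/b', 'https://other.com/x'],
  'https://example.com/b': [],
};

function crawl(startUrl, maxDepth) {
  const domain = new URL(startUrl).hostname;
  const queue = [{ url: startUrl, depth: 0 }];
  const visited = new Set();
  while (queue.length > 0) {
    const { url, depth } = queue.shift();
    // Strip hash fragments before dedup, as crawler.js does.
    const u = new URL(url);
    u.hash = '';
    const normalized = u.toString();
    if (visited.has(normalized)) continue;
    visited.add(normalized);
    if (depth >= maxDepth) continue;
    for (const link of graph[normalized] || []) {
      const parsed = new URL(link);
      // Same-domain filter (subdomains included) used by the crawler.
      if (parsed.hostname === domain || parsed.hostname.endsWith(`.${domain}`)) {
        parsed.hash = '';
        if (!visited.has(parsed.toString())) {
          queue.push({ url: parsed.toString(), depth: depth + 1 });
        }
      }
    }
  }
  return [...visited];
}

console.log(crawl('https://example.com/', 2));
// /a and /a#top dedupe to a single entry; the other.com link is filtered out.
```

Note how deduplication happens at pop time, not push time, mirroring the `visited.has(normalizedUrl)` check at the top of the real loop.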
package/dist/db.js
ADDED
@@ -0,0 +1,60 @@
+"use strict";
+var __importDefault = (this && this.__importDefault) || function (mod) {
+    return (mod && mod.__esModule) ? mod : { "default": mod };
+};
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.insertEndpoint = insertEndpoint;
+exports.getAllEndpoints = getAllEndpoints;
+const better_sqlite3_1 = __importDefault(require("better-sqlite3"));
+const path_1 = __importDefault(require("path"));
+// Initialize db
+const dbPath = path_1.default.resolve(process.cwd(), 'apigen.db');
+const db = new better_sqlite3_1.default(dbPath);
+db.pragma('journal_mode = WAL');
+// Create table
+db.exec(`
+    CREATE TABLE IF NOT EXISTS endpoints (
+        id TEXT PRIMARY KEY,
+        method TEXT NOT NULL,
+        url TEXT NOT NULL,
+        path_pattern TEXT NOT NULL,
+        request_headers TEXT,
+        request_body TEXT,
+        response_status INTEGER,
+        response_body TEXT,
+        created_at DATETIME DEFAULT CURRENT_TIMESTAMP
+    )
+`);
+const insertStmt = db.prepare(`
+    INSERT INTO endpoints (
+        id, method, url, path_pattern, request_headers, request_body, response_status, response_body
+    ) VALUES (
+        @id, @method, @url, @path_pattern, @request_headers, @request_body, @response_status, @response_body
+    )
+`);
+function insertEndpoint(data) {
+    try {
+        insertStmt.run({
+            id: data.id,
+            method: data.method,
+            url: data.url,
+            path_pattern: data.path_pattern,
+            request_headers: JSON.stringify(data.request_headers),
+            request_body: data.request_body ? JSON.stringify(data.request_body) : null,
+            response_status: data.response_status,
+            response_body: data.response_body ? JSON.stringify(data.response_body) : null,
+        });
+    }
+    catch (error) {
+        console.error('Failed to insert endpoint into database:', error);
+    }
+}
+function getAllEndpoints() {
+    try {
+        return db.prepare('SELECT * FROM endpoints').all();
+    }
+    catch (error) {
+        console.error('Failed to get endpoints from database:', error);
+        return [];
+    }
+}
package/dist/index.js
ADDED
@@ -0,0 +1,125 @@
+#!/usr/bin/env node
+"use strict";
+var __createBinding = (this && this.__createBinding) || (Object.create ? (function(o, m, k, k2) {
+    if (k2 === undefined) k2 = k;
+    var desc = Object.getOwnPropertyDescriptor(m, k);
+    if (!desc || ("get" in desc ? !m.__esModule : desc.writable || desc.configurable)) {
+        desc = { enumerable: true, get: function() { return m[k]; } };
+    }
+    Object.defineProperty(o, k2, desc);
+}) : (function(o, m, k, k2) {
+    if (k2 === undefined) k2 = k;
+    o[k2] = m[k];
+}));
+var __setModuleDefault = (this && this.__setModuleDefault) || (Object.create ? (function(o, v) {
+    Object.defineProperty(o, "default", { enumerable: true, value: v });
+}) : function(o, v) {
+    o["default"] = v;
+});
+var __importStar = (this && this.__importStar) || (function () {
+    var ownKeys = function(o) {
+        ownKeys = Object.getOwnPropertyNames || function (o) {
+            var ar = [];
+            for (var k in o) if (Object.prototype.hasOwnProperty.call(o, k)) ar[ar.length] = k;
+            return ar;
+        };
+        return ownKeys(o);
+    };
+    return function (mod) {
+        if (mod && mod.__esModule) return mod;
+        var result = {};
+        if (mod != null) for (var k = ownKeys(mod), i = 0; i < k.length; i++) if (k[i] !== "default") __createBinding(result, mod, k[i]);
+        __setModuleDefault(result, mod);
+        return result;
+    };
+})();
+var __importDefault = (this && this.__importDefault) || function (mod) {
+    return (mod && mod.__esModule) ? mod : { "default": mod };
+};
+Object.defineProperty(exports, "__esModule", { value: true });
+const commander_1 = require("commander");
+const playwright_1 = require("playwright");
+const chalk_1 = __importDefault(require("chalk"));
+const interceptor_1 = require("./interceptor");
+const fs = __importStar(require("fs"));
+const db_1 = require("./db");
+const schema_engine_1 = require("./schema-engine");
+const openapi_generator_1 = require("./openapi-generator");
+const crawler_1 = require("./crawler");
+const path = __importStar(require("path"));
+const program = new commander_1.Command();
+program
+    .name('apigen')
+    .description('API discovery automation CLI')
+    .version('1.0.0');
+program
+    .command('capture <url>')
+    .description('Launch Playwright to capture API traffic')
+    .action(async (url) => {
+    console.log(chalk_1.default.yellow(`Starting capture engine...`));
+    console.log(chalk_1.default.blue(`Navigating to: ${url}`));
+    try {
+        const browser = await playwright_1.chromium.launch({ headless: false });
+        const context = await browser.newContext();
+        const page = await context.newPage();
+        (0, interceptor_1.attachInterceptor)(page);
+        await page.goto(url, { waitUntil: 'domcontentloaded' });
+        console.log(chalk_1.default.green('Navigation complete. Intercepting API traffic...'));
+        console.log(chalk_1.default.gray('Terminal output shows real-time capture. Close the browser window to exit.'));
+        // Save cookies in a loop to guarantee they are captured before exit
+        const sessionFile = path.resolve(process.cwd(), '.apigen-session.json');
+        let isRunning = true;
+        browser.on('disconnected', () => {
+            isRunning = false;
+            console.log(chalk_1.default.yellow('\nBrowser closed. Session saved. Exiting apigen gracefully...'));
+            process.exit(0);
+        });
+        // Periodically sync the cookies to avoid missing them if the context is torn down quickly
+        while (isRunning) {
+            try {
+                const cookies = await context.cookies();
+                fs.writeFileSync(sessionFile, JSON.stringify(cookies, null, 2), 'utf-8');
+            }
+            catch (e) {
+                // May hit a "target closed" error during exit; ignore
+            }
+            await new Promise(resolve => setTimeout(resolve, 2000));
+        }
+    }
+    catch (error) {
+        console.error(chalk_1.default.red('Failed to start capture:'), error);
+        process.exit(1);
+    }
+});
+program
+    .command('export <output-file-json>')
+    .description('Export OpenAPI 3.0 specification from intercepted database traffic')
+    .option('-b, --base-url <url>', 'Base URL for OpenAPI specification', 'http://localhost')
+    .action((outputFile, options) => {
+    console.log(chalk_1.default.yellow('Reading endpoints from database...'));
+    const endpoints = (0, db_1.getAllEndpoints)();
+    if (endpoints.length === 0) {
+        console.log(chalk_1.default.red('No endpoints found in database to export.'));
+        process.exit(0);
+    }
+    console.log(chalk_1.default.blue(`Found ${endpoints.length} raw endpoints. Generating schema map...`));
+    // Fold URLs and infer schemas
+    const schemaMap = (0, schema_engine_1.generateSchemaMap)(endpoints);
+    const finalMap = Object.values(schemaMap);
+    console.log(chalk_1.default.green(`Folded into ${finalMap.length} unique routes.`));
+    console.log(chalk_1.default.blue('Converting to OpenAPI 3.0 specification...'));
+    const openapiSpec = (0, openapi_generator_1.generateOpenAPI)(finalMap, options.baseUrl);
+    fs.writeFileSync(outputFile, JSON.stringify(openapiSpec, null, 2), 'utf-8');
+    console.log(chalk_1.default.green(`Export complete: ${outputFile}`));
+});
+program
+    .command('crawl <target-url>')
+    .description('Run authenticated headless crawler on target URL')
+    .option('-d, --depth <number>', 'Maximum BFS crawl depth', '2')
+    .option('-p, --pages <number>', 'Maximum pages to crawl', '50')
+    .action(async (targetUrl, options) => {
+    const maxDepth = parseInt(options.depth, 10);
+    const maxPages = parseInt(options.pages, 10);
+    await (0, crawler_1.startCrawler)(targetUrl, maxDepth, maxPages);
+});
+program.parse(process.argv);
package/dist/interceptor.js
ADDED
@@ -0,0 +1,90 @@
+"use strict";
+var __importDefault = (this && this.__importDefault) || function (mod) {
+    return (mod && mod.__esModule) ? mod : { "default": mod };
+};
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.attachInterceptor = attachInterceptor;
+const crypto_1 = require("crypto");
+const chalk_1 = __importDefault(require("chalk"));
+const db_1 = require("./db");
+const IGNORED_RESOURCE_TYPES = new Set(['image', 'stylesheet', 'font', 'media', 'script', 'document']);
+const TARGET_RESOURCE_TYPES = new Set(['xhr', 'fetch']);
+function attachInterceptor(page) {
+    page.on('response', async (response) => {
+        const request = response.request();
+        const resourceType = request.resourceType();
+        // 1. Crucial filter: only capture XHR/Fetch traffic
+        if (!TARGET_RESOURCE_TYPES.has(resourceType)) {
+            return;
+        }
+        const url = request.url();
+        // Ignore tracking domains / static extensions
+        if (url.includes('google-analytics.com') ||
+            url.includes('googletagmanager.com') ||
+            url.match(/\.(png|jpg|jpeg|gif|css|woff2?|js|ico|svg)$/i)) {
+            return;
+        }
+        const method = request.method();
+        // Ignore preflight requests
+        if (method === 'OPTIONS')
+            return;
+        try {
+            const status = response.status();
+            const headers = request.headers();
+            let reqBodyParsed = null;
+            let resBodyParsed = null;
+            // Parse request post body
+            const postData = request.postData();
+            if (postData) {
+                try {
+                    reqBodyParsed = JSON.parse(postData);
+                }
+                catch {
+                    reqBodyParsed = postData; // Fallback to raw string
+                }
+            }
+            // Parse response body (JSON, text)
+            const contentType = response.headers()['content-type'] || '';
+            if (contentType.includes('application/json') || contentType.includes('text/')) {
+                try {
+                    const resBodyBuffer = await response.body();
+                    const resBodyString = resBodyBuffer.toString('utf-8');
+                    try {
+                        resBodyParsed = JSON.parse(resBodyString);
+                    }
+                    catch {
+                        resBodyParsed = resBodyString;
+                    }
+                }
+                catch (err) {
+                    resBodyParsed = null;
+                }
+            }
+            else {
+                resBodyParsed = "[Binary or Unsupported Content]";
+            }
+            // Extract basic path pattern
+            let pathPattern = '/';
+            try {
+                pathPattern = new URL(url).pathname;
+            }
+            catch { }
+            const data = {
+                id: (0, crypto_1.randomUUID)(),
+                method,
+                url,
+                path_pattern: pathPattern,
+                request_headers: headers,
+                request_body: reqBodyParsed,
+                response_status: status,
+                response_body: resBodyParsed
+            };
+            (0, db_1.insertEndpoint)(data);
+            const color = status >= 400 ? chalk_1.default.red : chalk_1.default.green;
+            console.log(`${chalk_1.default.cyan(`[${method}]`)} ${color(status)} - ${url}`);
+        }
+        catch (err) {
+            console.error(chalk_1.default.red('[Interceptor Error]'), err);
+        }
+    });
+}
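The interceptor's JSON-first, raw-string-fallback body handling can be isolated as a small helper. This is a sketch (not part of the package's exported API) showing the same decision tree it applies to each captured body:

```javascript
// Parse a captured body the way interceptor.js does: JSON or text content
// types are parsed as JSON with a raw-string fallback; everything else gets
// a placeholder label.
function parseBody(raw, contentType) {
  if (!contentType.includes('application/json') && !contentType.includes('text/')) {
    return '[Binary or Unsupported Content]';
  }
  try {
    return JSON.parse(raw);
  }
  catch {
    return raw; // e.g. HTML error pages or form-encoded payloads
  }
}

console.log(parseBody('{"ok":true}', 'application/json')); // { ok: true }
console.log(parseBody('<html>oops</html>', 'text/html'));  // the raw string
console.log(parseBody('\x89PNG', 'image/png'));            // placeholder label
```

The fallback matters because many "JSON" endpoints return plain-text or HTML error bodies under failure, and the raw string is still useful evidence in the database.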
package/dist/openapi-generator.js
ADDED
@@ -0,0 +1,87 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.generateOpenAPI = generateOpenAPI;
+function mapSchemaToOpenAPI(customSchema) {
+    if (customSchema === 'string')
+        return { type: 'string' };
+    if (customSchema === 'number')
+        return { type: 'number' };
+    if (customSchema === 'boolean')
+        return { type: 'boolean' };
+    if (customSchema === 'null')
+        return { nullable: true };
+    if (customSchema === 'any' || customSchema === 'unknown')
+        return {};
+    if (Array.isArray(customSchema)) {
+        let itemSchema = {};
+        if (customSchema.length > 0) {
+            itemSchema = mapSchemaToOpenAPI(customSchema[0]);
+        }
+        return { type: 'array', items: itemSchema };
+    }
+    if (typeof customSchema === 'object' && customSchema !== null) {
+        const properties = {};
+        for (const [key, value] of Object.entries(customSchema)) {
+            properties[key] = mapSchemaToOpenAPI(value);
+        }
+        return { type: 'object', properties };
+    }
+    return {}; // fallback
+}
+function extractPathParams(path) {
+    const matches = path.match(/\{([^}]+)\}/g);
+    if (!matches)
+        return [];
+    return matches.map(m => m.slice(1, -1));
+}
+function generateOpenAPI(customSchema, baseUrl) {
+    const openapi = {
+        openapi: '3.0.0',
+        info: {
+            title: 'Auto-Discovered API',
+            version: '1.0.0',
+            description: 'API documentation generated by apigen'
+        },
+        servers: [
+            { url: baseUrl }
+        ],
+        paths: {}
+    };
+    for (const entry of customSchema) {
+        const method = entry.method.toLowerCase();
+        const path = entry.foldedUrl;
+        if (!openapi.paths[path]) {
+            openapi.paths[path] = {};
+        }
+        const pathItem = openapi.paths[path];
+        const operation = {
+            responses: {}
+        };
+        const pathParams = extractPathParams(path);
+        if (pathParams.length > 0) {
+            operation.parameters = pathParams.map(param => ({
+                name: param,
+                in: 'path',
+                required: true,
+                schema: { type: 'string' }
+            }));
+        }
+        // Process responses
+        for (const [statusCode, schemaBody] of Object.entries(entry.responseSchemas)) {
+            operation.responses[statusCode] = {
+                description: `Response for status ${statusCode}`,
+                content: {
+                    'application/json': {
+                        schema: mapSchemaToOpenAPI(schemaBody)
+                    }
+                }
+            };
+        }
+        // Default response if empty
+        if (Object.keys(operation.responses).length === 0) {
+            operation.responses['200'] = { description: 'Success' };
+        }
+        pathItem[method] = operation;
+    }
+    return openapi;
+}
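The mapping from the compact inferred form (`'string'`, `['number']`, `{ key: ... }`) to OpenAPI schema objects is the core of the generator. A self-contained sketch of that conversion, run on a hypothetical inferred schema:

```javascript
// Standalone sketch mirroring mapSchemaToOpenAPI in openapi-generator.js:
// primitive markers become { type }, arrays recurse on their first element,
// plain objects recurse per property.
function toOpenAPISchema(s) {
  if (s === 'string' || s === 'number' || s === 'boolean') return { type: s };
  if (s === 'null') return { nullable: true };
  if (Array.isArray(s)) {
    return { type: 'array', items: s.length > 0 ? toOpenAPISchema(s[0]) : {} };
  }
  if (typeof s === 'object' && s !== null) {
    const properties = {};
    for (const [key, value] of Object.entries(s)) {
      properties[key] = toOpenAPISchema(value);
    }
    return { type: 'object', properties };
  }
  return {}; // 'any' / 'unknown' fallback
}

// Hypothetical inferred schema for a sample response body.
const inferred = { id: 'number', tags: ['string'] };
console.log(JSON.stringify(toOpenAPISchema(inferred)));
// {"type":"object","properties":{"id":{"type":"number"},"tags":{"type":"array","items":{"type":"string"}}}}
```

Note the array case only inspects the first element, so heterogeneous arrays collapse to their first element's schema, the same trade-off the package makes.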
package/dist/schema-engine.js
ADDED
@@ -0,0 +1,86 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.foldUrl = foldUrl;
+exports.inferSchema = inferSchema;
+exports.generateSchemaMap = generateSchemaMap;
+const UUID_REGEX = /^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;
+const OBJECT_ID_REGEX = /^[0-9a-fA-F]{24}$/;
+const INTEGER_REGEX = /^\d+$/;
+function isDynamicToken(token) {
+    if (UUID_REGEX.test(token))
+        return true;
+    if (OBJECT_ID_REGEX.test(token))
+        return true;
+    if (INTEGER_REGEX.test(token))
+        return true;
+    return false;
+}
+function foldUrl(url) {
+    try {
+        const parsed = new URL(url);
+        const parts = parsed.pathname.split('/').filter(Boolean);
+        let paramCounter = 1;
+        const foldedParts = parts.map(part => {
+            if (isDynamicToken(part)) {
+                return `{param${paramCounter++}}`;
+            }
+            return part;
+        });
+        return `/${foldedParts.join('/')}`;
+    }
+    catch {
+        return url;
+    }
+}
+function inferSchema(data) {
+    if (data === null)
+        return 'null';
+    if (typeof data === 'string')
+        return 'string';
+    if (typeof data === 'number')
+        return 'number';
+    if (typeof data === 'boolean')
+        return 'boolean';
+    if (Array.isArray(data)) {
+        if (data.length === 0)
+            return ['any'];
+        // Infer the schema of the first element in the array
+        return [inferSchema(data[0])];
+    }
+    if (typeof data === 'object') {
+        const schema = {};
+        for (const key of Object.keys(data)) {
+            schema[key] = inferSchema(data[key]);
+        }
+        return schema;
+    }
+    return 'unknown';
+}
+function generateSchemaMap(endpoints) {
+    const map = {};
+    for (const ep of endpoints) {
+        const foldedUrl = foldUrl(ep.url);
+        const key = `${ep.method} ${foldedUrl}`;
+        if (!map[key]) {
+            map[key] = {
+                method: ep.method,
+                foldedUrl,
+                responseSchemas: {},
+            };
+        }
+        if (ep.response_status === 200 && ep.response_body) {
+            let bodyData = ep.response_body;
+            if (typeof bodyData === 'string') {
+                try {
+                    bodyData = JSON.parse(bodyData);
+                }
+                catch { } // Leave as string if parsing fails
+            }
+            if (bodyData && typeof bodyData === 'object') {
+                const schema = inferSchema(bodyData);
+                map[key].responseSchemas[200] = schema;
+            }
+        }
+    }
+    return map;
+}
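The URL folding that powers route deduplication can be demonstrated on a sample URL. This is a self-contained sketch mirroring `foldUrl`/`isDynamicToken` above (the sample URL and IDs are made up):

```javascript
// Fold dynamic path tokens (UUIDs, 24-hex ObjectIds, integer IDs) into
// numbered {paramN} placeholders, as schema-engine.js does so that
// /users/1 and /users/2 collapse to one route.
const DYNAMIC_TOKEN_REGEXES = [
  /^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i, // UUID
  /^[0-9a-fA-F]{24}$/, // Mongo-style ObjectId
  /^\d+$/,             // integer ID
];

function foldUrl(url) {
  const parts = new URL(url).pathname.split('/').filter(Boolean);
  let n = 1;
  const folded = parts.map(p =>
    DYNAMIC_TOKEN_REGEXES.some(rx => rx.test(p)) ? `{param${n++}}` : p
  );
  return `/${folded.join('/')}`;
}

console.log(foldUrl('https://api.example.com/users/42/posts/5f2b6c1e9d3a4b0012345678'));
// → /users/{param1}/posts/{param2}
```

Because the counter restarts per URL, two requests differing only in their IDs fold to the same `METHOD /path` key and are merged into a single schema entry.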
package/package.json
ADDED
@@ -0,0 +1,39 @@
+{
+  "name": "auto-api-discovery",
+  "version": "1.0.0",
+  "description": "API discovery automation CLI",
+  "main": "dist/index.js",
+  "bin": {
+    "apigen": "./dist/index.js"
+  },
+  "files": [
+    "dist"
+  ],
+  "scripts": {
+    "build": "tsc",
+    "dev": "ts-node src/index.ts",
+    "prepublishOnly": "npm run build",
+    "postinstall": "playwright install"
+  },
+  "keywords": [
+    "api",
+    "openapi",
+    "cli",
+    "playwright",
+    "automation"
+  ],
+  "author": "Anooj Shete",
+  "license": "MIT",
+  "dependencies": {
+    "better-sqlite3": "^9.4.0",
+    "chalk": "^4.1.2",
+    "commander": "^12.0.0",
+    "playwright": "^1.42.1"
+  },
+  "devDependencies": {
+    "@types/better-sqlite3": "^7.6.9",
+    "@types/node": "^20.11.24",
+    "ts-node": "^10.9.2",
+    "typescript": "^5.3.3"
+  }
+}