docs-agent 1.1.0

Files changed (34)

package/CHANGELOG.md +47 -0
package/LICENSE +21 -0
package/README.md +239 -0
package/docs/DEPLOYMENT.md +142 -0
package/docs/mcp-client-prompt.md +26 -0
package/docs/reference.md +258 -0
package/env.example +47 -0
package/package.json +67 -0
package/src/CodeSearch.js +125 -0
package/src/DocsAgent.js +728 -0
package/src/FileUtility.js +130 -0
package/src/GitHubApi.js +337 -0
package/src/LLM.js +463 -0
package/src/UrlValidator.js +190 -0
package/src/api.js +107 -0
package/src/cli.js +0 -0
package/src/config/principles.diataxis.js +28 -0
package/src/config/principles.first.js +11 -0
package/src/config/prompt.docs.vs.code.js +52 -0
package/src/config/prompt.edit.disruptive.js +9 -0
package/src/config/prompt.edit.js +40 -0
package/src/config/prompt.extract.code.js +45 -0
package/src/config/prompt.formatting.js +38 -0
package/src/config/prompt.gen.referencedocs.js +181 -0
package/src/config/prompt.linking.js +14 -0
package/src/config/prompt.prioritize.js +24 -0
package/src/config/prompt.relatedfiles.js +14 -0
package/src/config/prompt.review.js +23 -0
package/src/config/prompt.scoring.js +9 -0
package/src/config/prompt.writing.js +13 -0
package/src/config/rules.linking.js +10 -0
package/src/index.js +49 -0
package/src/lib.js +4 -0
package/src/mcp.js +268 -0

package/docs/reference.md ADDED

@@ -0,0 +1,258 @@
+`docs-agent` is an AI-powered agent for continuous documentation improvement. It analyzes, reviews, and edits documentation files using advanced LLMs (Gemini, Claude, etc.), following the Diátaxis documentation framework. It exposes its functionality via an HTTP API, an MCP (Model Context Protocol) server, and a programmatic Node.js interface.
+## 1. MCP Server Tools
+The MCP server exposes the following tools for integration with Model Context Protocol clients:
+### `reviewDocs`
+- **Description:** Reviews documentation quality following the Diátaxis framework.
+- **Parameters:**
+  - `filepath` (`string`): Absolute path to the docs file.
+  - `filename` (`string`, optional): Relative path from project root.
+  - `projectStructure` (`string`, optional): Project structure in markdown list format.
+  - `customInstructions` (`string`, optional): Additional instructions.
+  - `glossaryFile` (`string`, optional): Path to glossary file.
+  - `relatedFiles` (`string[]`, optional): Absolute filepaths of related docs files.
+  - `allowDisruptiveChanges` (`boolean`, optional, default: false): Allow major restructuring.
+- **Returns:** `{ content: [{ type: "text", text: string }] }`
+### `improveDocs`
+- **Description:** Improves documentation quality following the Diátaxis framework.
+- **Parameters:**
+  - `filepath` (`string`): Absolute path to the docs file.
+  - `filename` (`string`, optional): Relative path from project root.
+  - `projectStructure` (`string`, optional): Project structure in markdown list format.
+  - `customInstructions` (`string`, optional): Additional instructions.
+  - `glossaryFile` (`string`, optional): Path to glossary file.
+  - `relatedFiles` (`string[]`, optional): Absolute filepaths of related docs files.
+  - `allowDisruptiveChanges` (`boolean`, optional, default: false): Allow major restructuring.
+- **Returns:** `{ content: [{ type: "text", text: string }] }`
+### `editDocs`
+- **Description:** Edits documentation according to the provided plan.
+- **Parameters:**
+  - `filepath` (`string`): Absolute path to the docs file.
+  - `editPlan` (`string`): Specific changes to be made to the documentation.
+  - `filename` (`string`, optional): Relative path from project root.
+  - `projectStructure` (`string`, optional): Project structure in markdown list format.
+  - `customInstructions` (`string`, optional): Additional instructions.
+  - `glossaryFile` (`string`, optional): Path to glossary file.
+  - `relatedFiles` (`string[]`, optional): Absolute filepaths of related docs files.
+- **Returns:** `{ content: [{ type: "text", text: string }] }`
+### `linkifyDocs`
+- **Description:** Improves internal linking to related files and glossary concepts.
+- **Parameters:**
+  - `filepath` (`string`): Absolute path to the docs file.
+  - `filename` (`string`, optional): Relative path from project root.
+  - `projectStructure` (`string`, optional): Project structure in markdown list format.
+  - `customInstructions` (`string`, optional): Additional instructions.
+  - `glossaryFile` (`string`, optional): Path to glossary file.
+  - `relatedFiles` (`string[]`, optional): Absolute filepaths of related docs files.
+- **Returns:** `{ content: [{ type: "text", text: string }] }`
+### `auditDocsAgainstCode`
+- **Description:** Audits documentation against the corresponding codebase for accuracy and completeness.
+- **Parameters:**
+  - `docsContentFilePath` (`string`): Absolute path to the docs file to audit.
+  - `repoUrl` (`string`): GitHub repository URL for the reference source code.
+  - `projectStructure` (`string`, optional): Project structure in markdown list format.
+  - `customInstructions` (`string`, optional): Additional instructions.
+- **Returns:** `{ content: [{ type: "text", text: string }] }`
+### `generateReferenceDocs` (Experimental)
+- **Description:** Generates reference documentation for the given codebase/content.
+- **Parameters:**
+  - `referenceSourceCodeFiles` (`string[]`): Remote URLs of the necessary source code files.
+  - `repoUrl` (`string`, optional): GitHub repository URL.
+  - `interfaceType` (`"library" | "http" | "mcp" | "cli" | "rpc" | "other"`, optional): Type of interface.
+  - `relatedFileUrls` (`string[]`, optional): Remote URLs of related code files.
+  - `projectStructure` (`string`, optional): Project structure in markdown list format.
+  - `customInstructions` (`string`, optional): Additional instructions.
+  - `knowledgeBase` (`string`, optional): Higher-level architecture context (not included in output).
+- **Returns:** `{ content: [{ type: "text", text: string }] }`
+- **Note:** This tool is experimental and disabled by default.
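The tools above are invoked through the standard MCP `tools/call` JSON-RPC method. As a rough sketch of what a client sends for `reviewDocs`, and how it might read the documented `{ content: [{ type: "text", text }] }` result, consider the following. The helpers `buildToolCall` and `firstText` and the file path are hypothetical and not part of docs-agent.

```javascript
// Build a JSON-RPC 2.0 `tools/call` request for an MCP tool.
// `buildToolCall` is a hypothetical helper, not part of docs-agent.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Read the first text item from the documented result shape.
// `firstText` is also a hypothetical helper.
function firstText(result) {
  const item = result.content.find((c) => c.type === "text");
  return item ? item.text : undefined;
}

const request = buildToolCall(1, "reviewDocs", {
  filepath: "/abs/path/to/docs/page.md", // placeholder path
  allowDisruptiveChanges: false,
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client SDK usually builds this envelope for you; the sketch only makes the parameter and result shapes concrete.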
+---
+## 2. Node.js Library API
+### Example Usage
+```js
+const docsAgent = new DocsAgent();
+const editedDocsPageContent = await docsAgent.reviewPrioritizeAndEdit(
+  contentToEdit,
+  {
+    projectStructure: "- src/\n  - DocsAgent.js\n  - api.js\n  - mcp.js\n  - LLM.js\n  - config/\n    - principles.diataxis.js"
+  },
+  "make sure the final output is a hugo markdown file" // customInstructions (third positional argument)
+);
+console.log(editedDocsPageContent);
+```
+### DocsAgent Class
+The main class for documentation improvement operations.
+#### Constructor
+```js
+new DocsAgent();
+```
+- Initializes a new DocsAgent instance.
+#### Methods
+##### `reviewPrioritizeAndEdit(content, context = {}, customInstructions, isFileWriteAllowed)`
+- **Description:** Plans and executes the documentation improvement process: review, prioritize, and edit.
+- **Parameters:**
+  - `content` (`string`): The documentation content to improve. If not provided, `context.filepath` is required.
+  - `context` (`object`):
+    - `filepath` (`string`): Absolute or relative path to the docs file.
+    - `filename` (`string`): Name of the docs file.
+    - `projectStructure` (`string`): Project structure in markdown list format.
+    - `glossaryFile` (`string`): Path to glossary file for consistent terminology.
+    - `relatedFiles` (`string[]`): Absolute filepaths to related documentation files.
+    - `allowDisruptiveChanges` (`boolean`): Allow major restructuring (default: false).
+  - `customInstructions` (`string`): Additional instructions for the improvement process.
+  - `isFileWriteAllowed` (`boolean`): If true, saves original and edited files to disk.
+- **Returns:** `Promise<string>` (the improved documentation content).
+##### `review(content, context, customInstructions)`
+- **Description:** Reviews documentation and suggests improvements.
+- **Parameters:**
+  - `content` (`string`): Documentation to review.
+  - `context` (`object`): Context for the review (see above).
+  - `customInstructions` (`string`): Additional instructions.
+- **Returns:** `Promise<string>` (review summary, rating, suggestions, strengths).
+##### `prioritize(content, review, context, customInstructions)`
+- **Description:** Breaks down the review into prioritized editing tasks.
+- **Parameters:**
+  - `content` (`string`): Documentation to review.
+  - `review` (`string`): Review output.
+  - `context` (`object`): Context for prioritization.
+  - `customInstructions` (`string`): Additional instructions.
+- **Returns:** `Promise<string>` (prioritized editing plan).
+##### `edit(content, editPlan, context, customInstructions)`
+- **Description:** Edits documentation according to the provided plan.
+- **Parameters:**
+  - `content` (`string`): Documentation to edit.
+  - `editPlan` (`string`): Editing plan.
+  - `context` (`object`): Context for editing.
+  - `customInstructions` (`string`): Additional instructions.
+- **Returns:** `Promise<string>` (edited documentation).
+##### `linkify(content, context, customInstructions)`
+- **Description:** Improves internal linking to related files and glossary concepts.
+- **Parameters:**
+  - `content` (`string`): Documentation to linkify.
+  - `context` (`object`): Context for linking.
+  - `customInstructions` (`string`): Additional instructions.
+- **Returns:** `Promise<string>` (content with improved links).
+##### `generateReferenceDocs(referenceSourceCodeFiles, context, customInstructions)`
+- **Description:** Generates reference documentation for the given content.
+- **Parameters:**
+  - `referenceSourceCodeFiles` (`string[]`): Remote URLs of the necessary source code files.
+  - `context` (`object`): Context for generation.
+    - `interfaceType` (`string`): Type of interface ("library", "http", "mcp", "cli", "rpc", or "other").
+    - `repoUrl` (`string`): URL of the repository.
+    - `projectStructure` (`string`): Project structure in markdown list format.
+    - `relatedFileUrls` (`string[]`): Remote URLs of related code files.
+    - `knowledgeBase` (`string`): Higher-level architecture context (not included in output).
+  - `customInstructions` (`string`): Additional instructions.
+- **Returns:** `Promise<string>` (reference documentation).
+##### `auditDocsAgainstCode(docsContentFilePath, context, customInstructions)`
+- **Description:** Audits documentation against the corresponding codebase for accuracy and completeness.
+- **Parameters:**
+  - `docsContentFilePath` (`string`): Absolute path to the docs file to audit.
+  - `context` (`object`): Context for the audit.
+    - `repoUrl` (`string`): GitHub repository URL for the reference source code.
+    - `projectStructure` (`string`): Project structure in markdown list format.
+  - `customInstructions` (`string`): Additional instructions.
+- **Returns:** `Promise<string>` (audit report with discrepancies and suggestions).
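The individual methods documented above can also be chained by hand, which is useful when you want to inspect the review or the edit plan before editing. The sketch below follows the documented signatures of `review`, `prioritize`, and `edit`. Whether `reviewPrioritizeAndEdit` performs exactly these steps internally is an assumption, and `improveStepByStep` and `stubAgent` are hypothetical names used only so the example runs without API keys.

```javascript
// Chain the documented methods by hand: review -> prioritize -> edit.
// `agent` is any object exposing the documented method signatures.
async function improveStepByStep(agent, content, context = {}, customInstructions) {
  const review = await agent.review(content, context, customInstructions);
  const editPlan = await agent.prioritize(content, review, context, customInstructions);
  return agent.edit(content, editPlan, context, customInstructions);
}

// Stub standing in for a real DocsAgent instance (hypothetical, for illustration only).
const stubAgent = {
  async review(content) {
    return `review of: ${content}`;
  },
  async prioritize(content, review) {
    return `plan from: ${review}`;
  },
  async edit(content, editPlan) {
    return `${content} [edited per ${editPlan}]`;
  },
};

improveStepByStep(stubAgent, "# Title").then((edited) => console.log(edited));
```

With a real instance you would pass `new DocsAgent()` in place of `stubAgent`.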
+---
+## 3. HTTP API
+The HTTP API is served via Express (default port: 3001).
+### Endpoints
+#### `POST /review`
+- **Request Body:** `{ content: string, filename?: string }`
+- **Response:** Review summary, rating, suggestions, strengths.
+#### `POST /prioritize`
+- **Request Body:** `{ content: string, review: string }`
+- **Response:** Prioritized actionable instructions.
+#### `POST /edit`
+- **Request Body:** `{ content: string, editPlan?: string, filename?: string }`
+- **Response:** Edited documentation content.
+#### `GET /health`
+- **Response:** Service status, version, uptime, memory, CPU usage.
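As an illustration of the request bodies listed above, the following sketch builds the arguments for a `fetch()` call to `POST /review`. It assumes the server runs on `localhost` with the default port 3001 and accepts JSON. `buildRequest` is a hypothetical helper, and any authentication header is left out because this section does not document one.

```javascript
// Build the URL and init object for fetch() against a documented endpoint.
// The base URL assumes the default port 3001 on localhost.
function buildRequest(endpoint, body, baseUrl = "http://localhost:3001") {
  return {
    url: new URL(endpoint, baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

const { url, init } = buildRequest("/review", {
  content: "# Page\nSome docs to review.",
  filename: "page.md",
});

// With a server running: const res = await fetch(url, init);
console.log(url, init.method);
```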
+---
+## Prompts and Principles
+The agent uses a set of configurable prompts and principles (in `src/config/`) to guide its review, prioritization, editing, and linking operations. These include:
+- Diátaxis documentation principles
+- Technical accuracy principles
+- Writing and formatting style guides
+- Linking rules
+---
+## Environment Variables
+- `GITHUB_TOKEN` - GitHub personal access token
+- `REVIEW_AI_SERVICE` (Default: `gemini`) - AI service used for the documentation analysis step, e.g. `gemini` or `anthropic`
+- `REVIEW_AI_MODEL` (Default: `gemini-2.5-pro-preview-05-06`) - AI model used for the documentation analysis step
+- `PREFERRED_AI_SERVICE` (Default: `anthropic`) - Preferred AI service
+- `PREFERRED_AI_MODEL` (Default: `claude-3-5-sonnet-20240620`) - Preferred AI model
+- `GOOGLE_GENERATIVE_AI_API_KEY` - Google AI API key
+- `ANTHROPIC_API_KEY` - Anthropic API key
+- `OPENAI_API_KEY` - OpenAI API key
+- `OLLAMA_API_KEY` - Ollama API key
+- `OLLAMA_API_URL` - Ollama API URL
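+- `API_DISABLED` - Set to "true" to disable the HTTP API; with any other value (or unset) the project is served as an HTTP API
+- `MCP_DISABLED` - Set to "true" to disable the MCP server; with any other value (or unset) the project is served as an MCP server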
+- `MAX_CHANGES` - Maximum number of changes allowed
+- `ALLOW_DISRUPTIVE_CHANGES` - Whether to allow disruptive changes
+- `MAX_TOKENS` - Maximum tokens for LLM requests
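The `*_DISABLED` flags and the documented defaults can be read as in the sketch below. This is an interpretation of the list above, not the package's actual configuration code. `readConfig` is a hypothetical helper, and treating an empty string as "use the default" is an assumption.

```javascript
// Interpret the documented environment variables.
// `readConfig` is a hypothetical helper, not part of docs-agent.
function readConfig(env) {
  return {
    // A server is enabled unless its flag is exactly the string "true".
    apiEnabled: env.API_DISABLED !== "true",
    mcpEnabled: env.MCP_DISABLED !== "true",
    // Documented defaults; `||` also falls back on empty strings (assumption).
    reviewService: env.REVIEW_AI_SERVICE || "gemini",
    reviewModel: env.REVIEW_AI_MODEL || "gemini-2.5-pro-preview-05-06",
    preferredService: env.PREFERRED_AI_SERVICE || "anthropic",
    preferredModel: env.PREFERRED_AI_MODEL || "claude-3-5-sonnet-20240620",
  };
}

const config = readConfig({ API_DISABLED: "true", MCP_DISABLED: "false" });
console.log(config.apiEnabled, config.mcpEnabled, config.reviewService);
```

In a real process you would pass `process.env` instead of a literal object.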
+---
+## Folder and File Structure
+- `src/DocsAgent.js`: Main agent logic.
+- `src/LLM.js`: LLM interface.
+- `src/api.js`: HTTP API server.
+- `src/mcp.js`: MCP server and tool definitions.
+- `src/config/`: Prompts, principles, and configuration.
+- `test/`: Tests and fixtures.
+- `docs/`: Documentation.

package/env.example ADDED

@@ -0,0 +1,47 @@
+MCP_DISABLED=false
+API_DISABLED=true
+API_KEY=docs_agent_api_key_here
+MAX_CHANGES=1
+RESULTS_WEBHOOK_KEY=api_key_to_send_back_results_via_webhook
+ALLOW_DISRUPTIVE_CHANGES=false
+PREFERRED_AI_SERVICE=anthropic
+PREFERRED_AI_MODEL=claude-3-5-sonnet-20240620
+REVIEW_AI_SERVICE=gemini
+REVIEW_AI_MODEL=gemini-2.5-pro-preview-05-06
+MAX_TOKENS=8000
+OLLAMA_API_KEY=sk-ollama-1234567890
+OLLAMA_API_URL=http://localhost:11434/api/generate
+GEMINI_API_KEY=#deprecated
+GOOGLE_GENERATIVE_AI_API_KEY=
+OPENAI_API_KEY=sk-proj-1234567890
+ANTHROPIC_API_KEY=sk-ant-api03-1234567890
+OTEL_EXPORTER_OTLP_ENDPOINT=https://api.braintrust.dev/otel
+OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer <Your API Key>, x-bt-parent=project_id:<Your Project ID>"
+# GitHub
+GITHUB_TOKEN=your_github_personal_access_token_here
+#For staging and testing servers only
+SMEE_PROXY_ENDPOINT_REVIEW=#proxy to route POST request to /review
+SMEE_PROXY_ENDPOINT_PRIORITIZE=#proxy to route POST request to /prioritize
+SMEE_PROXY_ENDPOINT_EDIT=#proxy to route POST request to /edit
+SMEE_PROXY_ENDPOINT_LINK=#proxy to route POST request to /link
+SMEE_PROXY_ENDPOINT_AUDIT=#proxy to route POST request to /audit
+# Webhook SSRF Protection - Comma-separated list of exact allowed webhook URLs
+# Only exact URL matches are allowed (scheme + host + path + query)
+# Examples:
+# ALLOWED_WEBHOOK_URLS=https://integrating.app.url/api/comment
+ALLOWED_WEBHOOK_URLS=http://localhost:3000/api/comment #All endpoints where you want to send the AI responses back
+# Remote file fetch allowlist for FileUtility (comma-separated hostnames)
+# Built-in allowed: github.com, raw.githubusercontent.com, gist.github.com, gitlab.com
+# Add custom hosts here if needed (no wildcards)
+# Example:
+# REMOTE_FILE_ALLOWED_HOSTS=raw.githubusercontent.com,gitlab.example.com
+REMOTE_FILE_ALLOWED_HOSTS=
+# Enable custom remote hosts (set to "true" to allow hosts from REMOTE_FILE_ALLOWED_HOSTS)
+# Default: false (only built-in GitHub/GitLab hosts allowed)
+REMOTE_FILE_ALLOW_CUSTOM_HOSTS=false
+# File Access Control (built-in, no configuration needed)
+# API mode: Restricts local file access to public/ subdirectory only
+# MCP mode: Allows unrestricted local file access (relies on MCP client permissions)
+# Both modes: Allow validated remote file access (GitHub/GitLab URLs only)
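The allowlist comments in `env.example` describe two checks: exact-match webhook URLs and a hostname allowlist for remote files. A minimal sketch of that logic, as read from those comments, is shown below. It is not the package's `UrlValidator` or `FileUtility` code; the function names are hypothetical, and the real implementation may normalize URLs before comparing.

```javascript
// Built-in hosts as listed in the env.example comments.
const BUILT_IN_HOSTS = ["github.com", "raw.githubusercontent.com", "gist.github.com", "gitlab.com"];

// Split a comma-separated env value into trimmed, non-empty entries.
function parseList(value) {
  return (value ?? "").split(",").map((s) => s.trim()).filter(Boolean);
}

// Webhook URLs must match an allowed entry exactly (plain string comparison here).
function isWebhookAllowed(url, env) {
  return parseList(env.ALLOWED_WEBHOOK_URLS).includes(url);
}

// Custom hosts only count when REMOTE_FILE_ALLOW_CUSTOM_HOSTS is "true"; no wildcards.
function isRemoteHostAllowed(fileUrl, env) {
  const host = new URL(fileUrl).hostname;
  const custom =
    env.REMOTE_FILE_ALLOW_CUSTOM_HOSTS === "true" ? parseList(env.REMOTE_FILE_ALLOWED_HOSTS) : [];
  return [...BUILT_IN_HOSTS, ...custom].includes(host);
}

const env = {
  ALLOWED_WEBHOOK_URLS: "http://localhost:3000/api/comment",
  REMOTE_FILE_ALLOWED_HOSTS: "gitlab.example.com",
  REMOTE_FILE_ALLOW_CUSTOM_HOSTS: "false",
};

console.log(isWebhookAllowed("http://localhost:3000/api/comment", env));
console.log(isRemoteHostAllowed("https://gitlab.example.com/a/b.md", env));
```

With `REMOTE_FILE_ALLOW_CUSTOM_HOSTS` left at `false`, the custom host in the example is ignored and only the built-in hosts pass.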

package/package.json ADDED

@@ -0,0 +1,67 @@
+{
+  "name": "docs-agent",
+  "version": "1.1.0",
+  "description": "Docs-Agent leverages AI to avoid outdated documentation and fix issues related to comprehensibility or technical accuracy.",
+  "main": "src/index.js",
+  "type": "module",
+  "license": "MIT",
+  "files": ["src", "README.md", "LICENSE", "CHANGELOG.md", "env.example", "docs"],
+  "scripts": {
+    "test": "mocha test/**/*.test.js",
+    "start": "node src/index.js",
+    "proxy": "./node_modules/.bin/smee -u $SMEE_PROXY_ENDPOINT_REVIEW -t https://$HOSTNAME-3000.csb.app/review & ./node_modules/.bin/smee -u $SMEE_PROXY_ENDPOINT_PRIORITIZE -t https://$HOSTNAME-3000.csb.app/prioritize & ./node_modules/.bin/smee -u $SMEE_PROXY_ENDPOINT_EDIT -t https://$HOSTNAME-3000.csb.app/edit & ./node_modules/.bin/smee -u $SMEE_PROXY_ENDPOINT_LINK -t https://$HOSTNAME-3000.csb.app/link & ./node_modules/.bin/smee -u $SMEE_PROXY_ENDPOINT_AUDIT -t https://$HOSTNAME-3000.csb.app/audit",
+    "staging": "npm run proxy & PORT=3000 npm start",
+    "mcp-list": "MCP_DISABLED=false API_DISABLED=true npx @modelcontextprotocol/inspector --cli node src/index.js --method tools/list",
+    "mcp-run": "MCP_DISABLED=false API_DISABLED=true npx @modelcontextprotocol/inspector --cli node src/index.js --method tools/call --tool-name improveDocs --tool-arg filepath=./test/fixtures/docs.input.js"
+  },
+  "dependencies": {
+    "@ai-sdk/anthropic": "^1.2.12",
+    "@ai-sdk/google": "^1.2.19",
+    "@ai-sdk/openai": "^1.3.22",
+    "@modelcontextprotocol/sdk": "^1.10.2",
+    "@octokit/core": "^7.0.3",
+    "@octokit/plugin-retry": "^8.0.1",
+    "@octokit/plugin-throttling": "^11.0.1",
+    "@vercel/otel": "^1.13.0",
+    "ai": "^4.0.0",
+    "dotenv": "^16.5.0",
+    "express": "^5.1.0",
+    "js-tiktoken": "^1.0.20",
+    "zod": "^3.25.76"
+  },
+  "devDependencies": {
+    "@modelcontextprotocol/inspector": "^0.15.0",
+    "chai": "^5.2.0",
+    "chai-http": "^5.1.2",
+    "mocha": "^11.1.0",
+    "sinon": "^20.0.0",
+    "smee-client": "^4.3.1"
+  },
+  "engines": {
+    "node": ">=20.0.0"
+  },
+  "author": "",
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/rudderlabs/docs-agent"
+  },
+  "homepage": "https://github.com/rudderlabs/docs-agent#readme",
+  "bugs": {
+    "url": "https://github.com/rudderlabs/docs-agent/issues"
+  },
+  "keywords": [
+    "docs",
+    "diataxis",
+    "documentation",
+    "agent",
+    "mcp",
+    "ai",
+    "technical-writing",
+    "mcp",
+    "mcp-server",
+    "mcp-agent",
+    "ai-agent",
+    "ai-assistant",
+    "technical-documentation"
+  ]
+}

package/src/CodeSearch.js ADDED

@@ -0,0 +1,125 @@
+/**
+ * Search in codebase
+ */
+import GitHubApi from './GitHubApi.js';
+import * as FileUtility from './FileUtility.js';
+class CodeSearch {
+  constructor(options={}) {
+    this.searchSpace = options.searchSpace || "github";
+    this.repository = options.repository;
+    this.fetchContent = options.fetchContent || false;
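+    // Use ?? so explicit falsy values (maxResults: 0, cache: false) are not overridden by the defaults
+    this.maxResults = options.maxResults ?? 10;
+    this.scoreThreshold = options.scoreThreshold ?? 0;
+    this.cache = options.cache ?? true;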
+    if(this.searchSpace === "github"){
+      this.githubApi = new GitHubApi();
+    }
+  }
+  /**
+   * Search for code in the codebase
+   * Query language: https://docs.github.com/en/search-github/searching-on-github/searching-code
+   * @param {string} query - The search query e.g. "keyword" | "symbol:keyword" | "repo:owner/repo"
+   * @param {object} [context={}] - Additional context
+   * @param {boolean} [context.fetchContent=false] - Whether to fetch the complete file content
+   * @param {string} [context.repository] - Repository path, supports both remote or local repository paths e.g. "https://github.com/owner/repo" or "/home/user/repo-folder"
+   * @param {string} [context.sha] - SHA of the search result (if available e.g. dsf2w32.. or main)
+   * @param {number} [context.scoreThreshold=0] - Minimum relevance score (0-1)
+   * @param {number} [context.maxResults=10] - Maximum number of results to return
+   * @param {boolean} [context.cache=true] - Whether to use caching
+   * @returns {Promise<Array<Object>>} searchResults - Array of search results, each containing a file path and metadata:
+   *   - {string} path - Relative path to the file
+   *   - {string} repository - Full repository path, supports both remote or local repository paths
+   *   - {number} score - Relevance score of the search result (0-1)
+   *   - {string} sha - SHA of the search result (if available e.g. dsf2w32.. or main)
+   *   - {string} filepath - Absolute file path to the file
+   */
+  async execute(query, context={}) {
+    if(!context.repository){
+      context.repository = this.repository;
+    }
+    if(context.fetchContent === undefined){
+      context.fetchContent = this.fetchContent;
+    }
+    if(context.cache === undefined){
+      context.cache = this.cache;
+    }
+    let searchResults = [];
+    if(this.searchSpace === "github"){
+      searchResults = await this.githubApi.searchCode(query, context);
+    } else if(this.searchSpace === "local"){
+      throw new Error("Local search is not implemented yet");
+    } else {
+      throw new Error(`Invalid search space: ${this.searchSpace}`);
+    }
+    const rankedResults = await this.rankResults(searchResults, context);
+    if(!context.fetchContent) {
+      return rankedResults;
+    }
+    const fileContents = await this.getFileContent(rankedResults);
+    const reRankedResults = await this.reRankResults(fileContents);
+    return reRankedResults;
+  }
+  /**
+   * Rank and filter search results based on relevance score
+   * @param {Array<Object>} searchResults - Array of search results
+   * @param {Object} context - Context object
+   * @param {number} context.scoreThreshold - Minimum relevance score (0-1)
+   * @param {number} context.maxResults - Maximum number of results to return
+   * @returns {Promise<Array<Object>>} rankedResults - Array of ranked search results
+   */
+  async rankResults(searchResults, context) {
+    const {
+      scoreThreshold = this.scoreThreshold,
+      maxResults = this.maxResults
+    } = context;
+    const rankedResults = [];
+    for(const result of searchResults) {
+      if(result.score < scoreThreshold) {
+        continue;
+      }
+      if(maxResults > 0 && rankedResults.length >= maxResults) {
+        break;
+      }
+      rankedResults.push(result);
+    }
+    return rankedResults;
+  }
+  /**
+   * Get file content from the search results and add it to the search results
+   * @param {Array<Object>} searchResults - Array of search results
+   * @returns {Promise<Array<Object>>} searchResults - Array of search results with file content
+   */
+  async getFileContent(searchResults) {
+    const fileContents = [];
+    for(const result of searchResults) {
+      let fileContent;
+      if(this.searchSpace === "github"){
+        fileContent = await this.githubApi.getFileContent(result.repository, result.path, result.sha);
+      } else {
+        fileContent = await FileUtility.readFile(result.path);
+      }
+      fileContents.push({
+        ...result,
+        content: fileContent
+      });
+    }
+    return fileContents;
+  }
+  /**
+   * Rerank search results based on file content
+   * @param {Array<Object>} searchResults - Array of search results
+   * @returns {Promise<Array<Object>>} reRankedResults - Array of reranked search results
+   */
+  async reRankResults(searchResults) {
+    //TODO: Implement reranking logic
+    return searchResults;
+  }
+}
+export default CodeSearch;
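To make the behavior of `rankResults` concrete, here is a standalone re-creation of its loop that runs without a GitHub token. The function name `filterByScore` and the sample data are hypothetical. Note that, like the original, it keeps the input order (no sorting), drops results below the threshold, and treats a `maxResults` of 0 as "no cap".

```javascript
// Standalone version of the filter loop in CodeSearch.rankResults.
// `filterByScore` is a hypothetical name used for this illustration.
function filterByScore(results, { scoreThreshold = 0, maxResults = 10 } = {}) {
  const kept = [];
  for (const result of results) {
    if (result.score < scoreThreshold) {
      continue; // below the relevance threshold
    }
    if (maxResults > 0 && kept.length >= maxResults) {
      break; // cap reached; maxResults <= 0 disables the cap
    }
    kept.push(result);
  }
  return kept;
}

// Made-up sample results for illustration.
const sample = [
  { path: "a.js", score: 0.9 },
  { path: "b.js", score: 0.2 },
  { path: "c.js", score: 0.7 },
];

const capped = filterByScore(sample, { scoreThreshold: 0.5, maxResults: 1 }).map((r) => r.path);
const uncapped = filterByScore(sample, { scoreThreshold: 0.5, maxResults: 0 }).map((r) => r.path);

console.log(capped);   // [ 'a.js' ]
console.log(uncapped); // [ 'a.js', 'c.js' ]
```

Because the loop stops at the first result past the cap, the quality of the output depends on the order in which the search backend returns results.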