@udx/mq 0.1.1 → 1.1.3

package/README.md ADDED
@@ -0,0 +1,242 @@
1
+ # @udx/mq - Markdown Query
2
+
3
+ A powerful tool for querying and transforming markdown documents, designed as a companion to @udx/mcurl. Think of it as "jq for markdown" - a tool that lets you treat markdown as structured data.
4
+
5
+ ## Key Capabilities
6
+
7
+ - **Clean Content Extraction**: Pull narrative content without code blocks for cleaner analysis
8
+ - **Structured Querying**: Filter and transform markdown content like jq does for JSON
9
+ - **Document Analysis**: Generate actionable insights and understand document structure
10
+ - **Format Conversion**: Transform between JSON, markdown, and other formats
11
+ - **Composability**: Combine with other tools in Unix-style pipelines
12
+
13
+ ## Why Clean Content Extraction Matters
14
+
15
+ Code blocks in technical documents serve a crucial purpose for developers but act as "noise" when analyzing the narrative flow. By separating content from code, mq helps:
16
+
17
+ - Improve focus on conceptual information
18
+ - Extract cleaner summaries without code snippets
19
+ - Identify key points and arguments more easily
20
+ - Create more approachable versions of technical content
21
+
22
+ ## Installation
23
+
24
+ ```bash
25
+ npm install -g @udx/mq
26
+ ```
27
+
28
+ ## Usage Examples
29
+
30
+ ### Extract Clean Content (No Code Blocks)
31
+
32
+ ```bash
33
+ # Extract clean content without code blocks
34
+ mq --clean-content --input test/fixtures/test-code-blocks.md
35
+
36
+ # Filter content to include only h1 and h2 headings and their content
37
+ mq --clean-content=2 --input test/fixtures/complex-test.md
38
+
39
+ # Get clean content in JSON format
40
+ mq --clean-content --format json --input test/fixtures/test-code-blocks.md
41
+ ```
42
+
43
+ ### Basic Query Operations
44
+
45
+ ```bash
46
+ # Extract headings from a document (returns JSON structure by default)
47
+ mq --input test/fixtures/basic-test.md '.headings[]'
48
+
49
+ # Analyze document structure (returns formatted Markdown report)
50
+ mq --analyze --input test/fixtures/complex-test.md
51
+
52
+ # Generate a table of contents (returns Markdown TOC)
53
+ mq --input test/fixtures/test-document.md '.toc'
54
+
55
+ # Extract code blocks by language (returns JSON structure)
56
+ mq --language javascript --input test/fixtures/test-code-blocks.md
57
+
58
+ # Extract code content only in raw format
59
+ mq --language javascript --input test/fixtures/test-code-blocks.md | jq -r '.[0].content'
60
+
61
+ # Extract all images (returns JSON structure)
62
+ mq --input test/fixtures/test-images.md '.images[]'
63
+
64
+ # Extract first sentences from sections (returns text content)
65
+ mq --first-sentences 2 --input test/fixtures/test-sentences.md
66
+ ```
67
+
68
+ ### Pipe with mcurl
69
+
70
+ ```bash
71
+ # Fetch web content and analyze it
72
+ mcurl https://udx.io | mq --analyze
73
+
74
+ # Fetch web content and extract key information
75
+ mcurl https://udx.io/work | mq --clean-content
76
+
77
+ # First analyze the overall structure of web content
78
+ mcurl https://udx.io/about | mq --analyze
79
+ ```
80
+
81
+ ### Complex Queries
82
+
83
+ ```bash
84
+ # Extract level 2 headings
85
+ mq --input test/fixtures/complex-test.md '.headings[] | select(.level == 2)'
86
+
87
+ # Extract links to specific domain
88
+ mq --input test/fixtures/test-document.md '.links[] | select(.href | contains("example"))'
89
+
90
+ # Extract code blocks and make them collapsible
91
+ mq --input test/fixtures/test-code-blocks.md --transform-code-blocks
92
+ ```
93
+
94
+ ### Integration with curl and jq
95
+
96
+ One of the most powerful aspects of mq is its ability to integrate with curl, mcurl, and jq in Unix-style pipelines:
97
+
98
+ ```bash
99
+ # Fetch a GitHub markdown file and extract headings
100
+ curl -s https://raw.githubusercontent.com/WordPress/wordpress-develop/HEAD/README.md | mq '.headings[]'
101
+
102
+ # Get content from a website and extract clean narrative content
103
+ mcurl https://udx.io/about | mq --clean-content
104
+
105
+ # Process markdown content and pipe to jq for further filtering
106
+ curl -s https://raw.githubusercontent.com/WordPress/wordpress-develop/HEAD/README.md | mq --clean-content --format json | \
107
+ jq '[.[] | select(.type=="heading" and .level == 1)]'
108
+
109
+ # Extract expertise data from UDX API using proper jq patterns
110
+ curl -s 'https://udx.io/wp-json/udx/v2/works/search?query=&page=1' | \
111
+ jq '.facets.expertise[] | select(.count > 10) | {name: .name, count: .count}'
112
+ ```
113
+
114
+ ## Advanced Features
115
+
116
+ ### Clean Content Extraction
117
+
118
+ The clean content extractor is one of mq's most powerful features for document analysis. It removes code blocks while preserving the document's narrative structure:
119
+
120
+ ```bash
121
+ # Extract clean content without code blocks
122
+ mq --clean-content --input test/fixtures/test-code-blocks.md
123
+
124
+ # Limit extraction to specific heading levels (h1 and h2 only)
125
+ mq --clean-content=2 --input test/fixtures/complex-test.md
126
+
127
+ # Get JSON output for programmatic processing
128
+ mq --clean-content --format json --input test/fixtures/test-code-blocks.md | jq length
129
+ ```
130
+
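+ The same extraction is available programmatically through the library functions added in this release. A minimal sketch follows; the deep import path is an assumption, so adjust it to however the package exposes `lib/operations` in your setup:
+
+ ```javascript
+ import { readFileSync } from 'node:fs';
+ import { fromMarkdown } from 'mdast-util-from-markdown';
+ // Assumed deep import path into the installed package
+ import { extractCleanContent, contentToMarkdown } from '@udx/mq/lib/operations/content-extractor.js';
+
+ // Parse a markdown file into an mdast tree
+ const ast = fromMarkdown(readFileSync('test/fixtures/test-code-blocks.md', 'utf8'));
+
+ // Keep headings up to h2 and drop code blocks (selector and option mirror the CLI behavior)
+ const nodes = extractCleanContent(ast, 'level=2', { maxHeadingLevel: 2 });
+
+ // Convert the structured result back to markdown
+ console.log(contentToMarkdown(nodes, { addLineBreaks: true }));
+ ```
+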
131
+ #### Benefits of Clean Content Extraction
132
+
133
+ - **Improved Analysis**: Focus on the narrative without code noise
134
+ - **Better Summarization**: Generate more coherent summaries from technical content
135
+ - **Hierarchical Understanding**: Preserve document structure while filtering code
136
+ - **Content Repurposing**: Transform code-heavy tutorials into conceptual guides
137
+ - **Incremental Content Processing**: Extract varying amounts of content for different purposes
138
+
139
+ ### Advanced UDX Website Examples
140
+
141
+ ```bash
142
+ # Extract links from HTML content using mq
143
+ mcurl https://udx.io/about | mq '.links[0:5]'
144
+
145
+ # Extract clean content from a WordPress page for easier reading
146
+ mcurl https://udx.io/guidance | mq --clean-content
147
+
148
+ # First analyze page structure, then extract specific elements
149
+ mcurl https://udx.io/work | mq --analyze
150
+ ```
151
+
152
+ ## Approach
153
+
154
+ ### Best Practices for Working with Markdown and APIs
155
+
156
+ - **Native Node.js Functions**: Prefer Node.js built-ins for fetching API data rather than third-party HTTP clients. For example:
157
+
158
+ ```javascript
159
+ // Using native Node.js rather than dedicated modules
160
+ import https from 'node:https';
161
+
162
+ function fetchContent(url) {
163
+ // Function fetches content from URL using native Node.js modules
164
+ // Input: url - String URL to fetch
165
+ // Output: Promise that resolves to response body
166
+ return new Promise((resolve, reject) => {
167
+ https.get(url, (res) => {
168
+ let data = '';
169
+ res.on('data', (chunk) => { data += chunk; });
170
+ res.on('end', () => { resolve(data); });
171
+ }).on('error', reject);
172
+ });
173
+ }
174
+ ```
175
+
176
+ - **Logging and Debugging**: Always log API request metadata and response data for troubleshooting:
177
+
178
+ ```javascript
179
+ // Proper logging for API requests
180
+ function logApiRequest(url, options, response) {
181
+ // Log API request details when verbose mode is enabled
182
+ // Input: url - request URL, options - request options, response - API response
183
+ // Output: None, logs to console
184
+ if (process.env.DEBUG || process.env.VERBOSE) {
185
+ console.log(`[API Request] ${options.method || 'GET'} ${url}`);
186
+ console.log(`[API Response] Status: ${response.statusCode}`);
187
+ if (process.env.VERBOSE) {
188
+ console.log(`[API Response Body] ${JSON.stringify(response.body).substring(0, 200)}...`);
189
+ }
190
+ }
191
+ }
192
+ ```
193
+
194
+ - **Use Lodash for Complex Operations**: Leverage Lodash for data transformations to keep pipelines readable and resilient to missing or irregular data.
195
+
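+ For example, here is a minimal, self-contained sketch; the response shape and values are illustrative, mirroring the jq examples earlier in this README:
+
+ ```javascript
+ import _ from 'lodash';
+
+ // Illustrative sample data shaped like the UDX works/search response used above
+ const response = {
+   facets: {
+     expertise: [
+       { name: 'Cloud Automation', count: 14 },
+       { name: 'DevOps', count: 8 }
+     ]
+   }
+ };
+
+ // _.get tolerates missing paths; filter/map/sortBy keep the transformation readable
+ const topExpertise = _(_.get(response, 'facets.expertise', []))
+   .filter((item) => item.count > 10)
+   .map((item) => _.pick(item, ['name', 'count']))
+   .sortBy('count')
+   .value();
+
+ console.log(topExpertise); // [ { name: 'Cloud Automation', count: 14 } ]
+ ```
+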
196
+ - **Progressive Enhancement Workflow**:
197
+ 1. Start by analyzing content structure with `mq --analyze`
198
+ 2. Extract relevant sections with targeted selectors
199
+ 3. Process and transform with clean content extraction
200
+ 4. Format output appropriately for your use case
201
+
202
+ - **Testing Strategy**: Verify your pipelines with Mocha unit tests, a REST client, or simple curl commands.
203
+
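+ For instance, here is a minimal Mocha check for the CLI; this sketch assumes it runs from the package root, where mq.js and the shipped test/fixtures files live:
+
+ ```javascript
+ import { execSync } from 'node:child_process';
+ import assert from 'node:assert';
+
+ describe('mq --clean-content', () => {
+   it('strips fenced code blocks from the output', () => {
+     // Run the CLI directly rather than relying on a global install
+     const output = execSync(
+       'node mq.js --clean-content --input test/fixtures/test-code-blocks.md',
+       { encoding: 'utf8' }
+     );
+     // Cleaned output should be non-empty and contain no fenced code blocks
+     const fence = '`'.repeat(3);
+     assert.ok(output.trim().length > 0);
+     assert.ok(!output.includes(fence));
+   });
+ });
+ ```
+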
204
+ - **Documentation**: Add comprehensive function headers that explain purpose, inputs, and outputs for all custom operations.
205
+
206
+ ### Common Pipelines
207
+
208
+ ```bash
209
+ # Fetch content → clean it → format as JSON → count nodes
210
+ mcurl https://udx.io/about | mq --clean-content --format json | jq 'length'
211
+
212
+ # Analyze content structure then target specific elements
213
+ mcurl https://udx.io/work | mq --analyze && mcurl https://udx.io/work | mq '.headings[0:5]'
214
+
215
+ # Process multiple sources with consistent transformations
216
+ for url in "udx.io/about" "udx.io/work" "udx.io/guidance"; do
217
+ echo "Processing $url"
218
+ mcurl https://$url | mq --clean-content=2 | wc -l
219
+ done
220
+ ```
221
+
222
+ ### UDX API Integration Patterns
223
+
224
+ mq can be used as part of a larger data processing pipeline, working alongside other tools like curl and jq:
225
+
226
+ ```bash
227
+ # Use mq for HTML content processing
228
+ mcurl https://udx.io/work | mq --clean-content | grep "Cloud"
229
+
230
+ # Use curl+jq for JSON API processing (not mcurl!)
231
+ curl -s 'https://udx.io/wp-json/udx/v2/works/search?query=&page=1' | \
232
+ jq '.facets.expertise[] | select(.count > 10) | {name: .name, count: .count}'
233
+
234
+ # Get industry distribution with better formatting
235
+ curl -s 'https://udx.io/wp-json/udx/v2/works/search?query=&page=1' | \
236
+ jq '.facets.industries[] | select(.count > 5) | {name: .name, count: .count}'
237
+
238
+ # Pipeline: Extract content from UDX pages, clean it, then analyze structure
239
+ for page in "about" "work" "guidance"; do
240
+ mcurl "https://udx.io/$page" | mq --clean-content | mq --analyze | grep -i "headings"
241
+ done
242
+ ```
@@ -0,0 +1,195 @@
1
+ /**
2
+ * Enhanced Content Extractor for Markdown Query
3
+ *
4
+ * Provides flexible extraction methods for markdown content with customizable filters
5
+ * Designed specifically for extracting meaningful content while omitting code blocks
6
+ */
7
+
8
+ import { toString } from 'mdast-util-to-string';
9
+ import { visit } from 'unist-util-visit';
10
+ import _ from 'lodash';
11
+
12
+ /**
13
+ * Extract clean content without code blocks
14
+ * Extracts readable content from markdown while filtering code blocks
15
+ *
16
+ * Input: Markdown AST document
17
+ * Output: Array of content nodes with metadata
18
+ *
19
+ * @param {Object} ast - Markdown AST
20
+ * @param {string} selector - Selector for filtering (e.g., 'level=2' for headings up to level 2)
21
+ * @param {Object} options - Additional options for extraction
22
+ * @returns {Array} Array of content objects
23
+ */
24
+ function extractCleanContent(ast, selector, options = {}) {
25
+ // Default options
26
+ const config = _.defaults(options, {
27
+ includeHeadings: true,
28
+ includeParagraphs: true,
29
+ includeImages: false,
30
+ includeLists: true,
31
+ maxHeadingLevel: 6,
32
+ preserveHierarchy: true,
33
+ debug: false
34
+ });
35
+
36
+ // Track which nodes to exclude (code blocks and their containers)
37
+ const excludeNodes = new Set();
38
+ const contentNodes = [];
39
+
40
+ // First pass: identify all code blocks to exclude
41
+ visit(ast, 'code', (node, index, parent) => {
42
+ excludeNodes.add(node);
43
+ if (config.debug) {
44
+ console.log(`Found code block at index ${index}, marking for exclusion`);
45
+ }
46
+ });
47
+
48
+ // Parse options from selector
49
+ if (selector && selector.includes('level=')) {
50
+ const levelMatch = selector.match(/level=(\d+)/);
51
+ if (levelMatch) {
52
+ config.maxHeadingLevel = parseInt(levelMatch[1], 10);
53
+ }
54
+ }
55
+
56
+ // Second pass: extract all desired content
57
+ visit(ast, (node) => {
58
+ // Skip excluded nodes
59
+ if (excludeNodes.has(node)) {
60
+ return;
61
+ }
62
+
63
+ let content;
64
+
65
+ // Process based on node type
66
+ switch (node.type) {
67
+ case 'heading':
68
+ if (config.includeHeadings && node.depth <= config.maxHeadingLevel) {
69
+ content = {
70
+ type: 'heading',
71
+ level: node.depth,
72
+ text: toString(node),
73
+ position: node.position
74
+ };
75
+ }
76
+ break;
77
+
78
+ case 'paragraph':
79
+ if (config.includeParagraphs) {
80
+ content = {
81
+ type: 'paragraph',
82
+ text: toString(node),
83
+ position: node.position
84
+ };
85
+ }
86
+ break;
87
+
88
+ case 'image':
89
+ if (config.includeImages) {
90
+ content = {
91
+ type: 'image',
92
+ alt: node.alt || '',
93
+ src: node.url,
94
+ title: node.title || '',
95
+ position: node.position
96
+ };
97
+ }
98
+ break;
99
+
100
+ case 'list':
101
+ if (config.includeLists) {
102
+ const items = [];
103
+ visit(node, 'listItem', (item) => {
104
+ items.push(toString(item));
105
+ });
106
+
107
+ content = {
108
+ type: 'list',
109
+ items,
110
+ ordered: node.ordered,
111
+ position: node.position
112
+ };
113
+ }
114
+ break;
115
+ }
116
+
117
+ if (content) {
118
+ contentNodes.push(content);
119
+ }
120
+ });
121
+
122
+ // Sort content by original position in document
123
+ const sortedContent = _.sortBy(contentNodes, (node) => {
124
+ return node.position ? node.position.start.line : 0;
125
+ });
126
+
127
+ // Log operation info when in verbose debug mode
128
+ if (config.debug) {
129
+ console.log(`Extracted ${sortedContent.length} content nodes, filtering out code blocks`);
130
+ }
131
+
132
+ return sortedContent;
133
+ }
134
+
135
+ /**
136
+ * Convert extracted content to markdown
137
+ * Transforms the output from extractCleanContent back to markdown
138
+ *
139
+ * Input: Array of content nodes
140
+ * Output: Markdown string
141
+ *
142
+ * @param {Array} contentNodes - Array of content objects
143
+ * @param {Object} options - Formatting options
144
+ * @returns {string} Formatted markdown content
145
+ */
146
+ function contentToMarkdown(contentNodes, options = {}) {
147
+ // Default options
148
+ const config = _.defaults(options, {
149
+ addSeparators: false,
150
+ headingStyle: '#', // or 'setext'
151
+ addLineBreaks: true
152
+ });
153
+
154
+ let markdown = '';
155
+
156
+ contentNodes.forEach((node) => {
157
+ switch (node.type) {
158
+ case 'heading':
159
+ // Add proper number of # for heading level
160
+ markdown += '#'.repeat(node.level) + ' ' + node.text;
161
+ markdown += config.addLineBreaks ? '\n\n' : '\n';
162
+ break;
163
+
164
+ case 'paragraph':
165
+ markdown += node.text;
166
+ markdown += config.addLineBreaks ? '\n\n' : '\n';
167
+ break;
168
+
169
+ case 'image':
170
+ markdown += `![${node.alt}](${node.src}${node.title ? ` "${node.title}"` : ''})`;
171
+ markdown += config.addLineBreaks ? '\n\n' : '\n';
172
+ break;
173
+
174
+ case 'list':
175
+ node.items.forEach((item, index) => {
176
+ const prefix = node.ordered ? `${index + 1}. ` : '- ';
177
+ markdown += prefix + item;
178
+ markdown += config.addLineBreaks ? '\n' : '';
179
+ });
180
+ markdown += config.addLineBreaks ? '\n' : '';
181
+ break;
182
+ }
183
+
184
+ if (config.addSeparators) {
185
+ markdown += '---\n\n';
186
+ }
187
+ });
188
+
189
+ return markdown;
190
+ }
191
+
192
+ export {
193
+ extractCleanContent,
194
+ contentToMarkdown
195
+ };
@@ -14,7 +14,9 @@ export {
14
14
  extractLinks,
15
15
  generateToc,
16
16
  extractSections,
17
- filterHeadingsByLevel
17
+ filterHeadingsByLevel,
18
+ extractImages,
19
+ extractFirstSentences
18
20
  };
19
21
 
20
22
  /**
@@ -57,11 +59,12 @@ function extractHeadings(ast, selector) {
57
59
  /**
58
60
  * Extract code blocks from AST
59
61
  *
60
- * Extracts all code blocks from a markdown document with their language and content
62
+ * Extracts all code blocks from a markdown document with their language and content.
63
+ * Supports filtering by language using the lang= selector.
61
64
  *
62
65
  * @param {Object} ast - Markdown AST
63
- * @param {string} selector - Query selector
64
- * @returns {Array|Object} Array of code blocks or single code block if index specified
66
+ * @param {string} selector - Query selector (e.g., '[0]' for first block, 'lang=php' for PHP blocks)
67
+ * @returns {Array|Object} Array of code blocks or filtered blocks based on selector
65
68
  */
66
69
  function extractCodeBlocks(ast, selector) {
67
70
  const codeBlocks = [];
@@ -73,6 +76,15 @@ function extractCodeBlocks(ast, selector) {
73
76
  });
74
77
  });
75
78
 
79
+ // Handle language selector if present
80
+ if (selector && selector.includes('lang=')) {
81
+ const langMatch = selector.match(/lang=["']?([a-zA-Z0-9]+)["']?/);
82
+ if (langMatch) {
83
+ const language = langMatch[1];
84
+ return codeBlocks.filter(block => block.language === language);
85
+ }
86
+ }
87
+
76
88
  // Handle array index selector if present
77
89
  if (selector && selector.includes('[')) {
78
90
  const indexMatch = selector.match(/\[(\d+)\]/);
@@ -245,3 +257,92 @@ function filterHeadingsByLevel(ast, level) {
245
257
 
246
258
  return output;
247
259
  }
260
+
261
+ /**
262
+ * Extract images from AST
263
+ *
264
+ * Extracts all images from a markdown document with their alt text and source URL
265
+ *
266
+ * @param {Object} ast - Markdown AST
267
+ * @param {string} selector - Query selector
268
+ * @returns {Array|Object} Array of images or single image if index specified
269
+ */
270
+ function extractImages(ast, selector) {
271
+ const images = [];
272
+
273
+ visit(ast, 'image', (node) => {
274
+ images.push({
275
+ alt: node.alt || '',
276
+ src: node.url,
277
+ title: node.title || ''
278
+ });
279
+ });
280
+
281
+ // Handle array index selector if present
282
+ if (selector && selector.includes('[')) {
283
+ const indexMatch = selector.match(/\[(\d+)\]/);
284
+ if (indexMatch) {
285
+ const index = parseInt(indexMatch[1], 10);
286
+ return images[index];
287
+ }
288
+ return images;
289
+ }
290
+
291
+ return images;
292
+ }
293
+
294
+ /**
295
+ * Extract first sentences from sections
296
+ *
297
+ * Extracts the first sentence from each section down to a specified heading level
298
+ *
299
+ * @param {Object} ast - Markdown AST
300
+ * @param {string} selector - Query selector (e.g., 'level=3' for headings up to level 3)
301
+ * @returns {Array} Array of sections with their first sentences
302
+ */
303
+ function extractFirstSentences(ast, selector) {
304
+ // Parse selector for max level if present
305
+ let maxLevel = 6;
306
+ if (selector && selector.includes('level=')) {
307
+ const levelMatch = selector.match(/level=(\d+)/);
308
+ if (levelMatch) {
309
+ maxLevel = parseInt(levelMatch[1], 10);
310
+ }
311
+ }
312
+
313
+ // Extract sections
314
+ const sections = extractSections(ast);
315
+ const result = [];
316
+
317
+ // Process each section to extract first sentence
318
+ sections.forEach(section => {
319
+ // Skip sections with heading level greater than maxLevel
320
+ if (section.depth > maxLevel) return;
321
+
322
+ // Find the first paragraph after the heading
323
+ let firstSentence = '';
324
+ for (const node of section.content) {
325
+ if (node.type === 'paragraph') {
326
+ const text = toString(node);
327
+ // Extract first sentence (ending with period, question mark, or exclamation mark)
328
+ const sentenceMatch = text.match(/^[^.!?]*[.!?]/);
329
+ if (sentenceMatch) {
330
+ firstSentence = sentenceMatch[0];
331
+ break;
332
+ } else {
333
+ // If no sentence ending found, use the full paragraph
334
+ firstSentence = text;
335
+ break;
336
+ }
337
+ }
338
+ }
339
+
340
+ result.push({
341
+ title: section.title,
342
+ level: section.depth,
343
+ firstSentence
344
+ });
345
+ });
346
+
347
+ return result;
348
+ }
package/mq.js CHANGED
@@ -45,7 +45,8 @@ const packageJson = JSON.parse(
45
45
  );
46
46
 
47
47
  // Import extract operations
48
- import { extractHeadings, extractCodeBlocks, extractLinks, generateToc, extractSections, filterHeadingsByLevel } from './lib/operations/extractors.js';
48
+ import { extractHeadings, extractCodeBlocks, extractLinks, generateToc, extractSections, filterHeadingsByLevel, extractImages, extractFirstSentences } from './lib/operations/extractors.js';
49
+ import { extractCleanContent, contentToMarkdown } from './lib/operations/content-extractor.js';
49
50
 
50
51
  // Import analysis operations
51
52
  import { showDocumentStructure, countDocumentElements, analyzeDocument } from './lib/operations/analysis.js';
@@ -60,6 +61,9 @@ const queryOperations = {
60
61
  'count': countDocumentElements,
61
62
  'sections': extractSections,
62
63
  'level': filterHeadingsByLevel,
64
+ 'images': extractImages,
65
+ 'firstSentences': extractFirstSentences,
66
+ 'cleanContent': extractCleanContent,
63
67
  'default': (ast, query) => ast
64
68
  };
65
69
 
@@ -98,11 +102,15 @@ program
98
102
  .option('-a, --analyze', 'Analyze document structure')
99
103
  .option('-s, --structure', 'Show document structure (headings hierarchy)')
100
104
  .option('-c, --count', 'Count document elements')
105
+ .option('-l, --language <lang>', 'Filter code blocks by language')
101
106
  .option('-i, --input <file>', 'Input file (defaults to stdin)')
102
107
  .option('-o, --output <file>', 'Output file (defaults to stdout)')
103
108
  .option('-f, --format <format>', 'Output format (json, yaml, markdown)', 'markdown')
104
109
  .option('-v, --verbose', 'Verbose output with operation details')
105
110
  .option('-d, --debug', 'Debug mode with detailed logs for troubleshooting')
111
+ .option('--images', 'Extract all images from markdown')
112
+ .option('--first-sentences [level]', 'Extract first sentences of sections, optionally specify max heading level')
113
+ .option('--clean-content [level]', 'Extract clean content without code blocks, optionally specify max heading level')
106
114
  .parse(process.argv);
107
115
 
108
116
  const options = program.opts();
@@ -338,6 +346,39 @@ async function main() {
338
346
  log(`Filtering headings by level: ${options.level}`, 'verbose');
339
347
  result = filterHeadingsByLevel(ast, parseInt(options.level, 10));
340
348
  log('Heading filtering completed', 'debug');
349
+ } else if (options.language) {
350
+ log(`Filtering code blocks by language: ${options.language}`, 'verbose');
351
+ result = extractCodeBlocks(ast, `lang=${options.language}`);
352
+ log('Code block filtering completed', 'debug');
353
+ } else if (options.images) {
354
+ log('Extracting images from markdown', 'verbose');
355
+ result = extractImages(ast);
356
+ log('Image extraction completed', 'debug');
357
+ } else if (options.firstSentences) {
358
+ const level = typeof options.firstSentences === 'string' ?
359
+ options.firstSentences : '6';
360
+ log(`Extracting first sentences with max level ${level}`, 'verbose');
361
+ result = extractFirstSentences(ast, `level=${level}`);
362
+ log('First sentence extraction completed', 'debug');
363
+ } else if (options.cleanContent) {
364
+ const level = typeof options.cleanContent === 'string' ?
365
+ options.cleanContent : '6';
366
+ log(`Extracting clean content without code blocks, max level ${level}`, 'verbose');
367
+
368
+ const cleanContent = extractCleanContent(ast, `level=${level}`, {
369
+ debug: options.debug || options.verbose,
370
+ maxHeadingLevel: parseInt(level, 10)
371
+ });
372
+
373
+ if (options.format === 'json' || options.format === 'yaml') {
374
+ result = cleanContent;
375
+ } else {
376
+ // For markdown output, convert the structured content back to markdown
377
+ result = contentToMarkdown(cleanContent, {
378
+ addLineBreaks: true
379
+ });
380
+ }
381
+ log('Clean content extraction completed', 'debug');
341
382
  } else if (options.transform) {
342
383
  log(`Applying transform operation: ${options.transform}`, 'verbose');
343
384
  result = transformMarkdown(ast, options.transform);
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@udx/mq",
3
- "version": "0.1.1",
3
+ "version": "1.1.3",
4
4
  "description": "Markdown Query - jq for Markdown documents",
5
5
  "main": "mq.js",
6
6
  "type": "module",
@@ -10,7 +10,7 @@
10
10
  "scripts": {
11
11
  "test": "NODE_OPTIONS=--experimental-vm-modules npx mocha",
12
12
  "test:gist": "NODE_OPTIONS=--experimental-vm-modules npx mocha test/gist-integration.test.mjs",
13
- "test:integration": "NODE_OPTIONS=--experimental-vm-modules npx mocha test/integration.test.mjs",
13
+ "test:integration": "NODE_OPTIONS=--experimental-vm-modules npx mocha test/integration.test.mjs",
14
14
  "test:all": "npm test",
15
15
  "start": "node mq.js",
16
16
  "prepublishOnly": "npm test"
@@ -46,11 +46,12 @@
46
46
  "examples"
47
47
  ],
48
48
  "engines": {
49
- "node": ">=14.16"
49
+ "node": ">=18.17"
50
50
  },
51
51
  "dependencies": {
52
52
  "commander": "^11.1.0",
53
53
  "js-yaml": "^4.1.0",
54
+ "lodash": "^4.17.21",
54
55
  "mdast-util-from-markdown": "^1.3.0",
55
56
  "mdast-util-gfm": "^3.1.0",
56
57
  "mdast-util-to-markdown": "^1.5.0",
@@ -62,6 +63,7 @@
62
63
  "@udx/mcurl": "^0.3.0"
63
64
  },
64
65
  "devDependencies": {
66
+ "express": "^4.21.2",
65
67
  "mocha": "^11.1.0"
66
68
  }
67
69
  }
package/readme.md DELETED
@@ -1,242 +0,0 @@
1
- # mq - Markdown Query
2
-
3
- A powerful tool for querying, transforming, and analyzing markdown documents, designed as an extension for [@udx/mcurl](../mcurl).
4
-
5
- ## Quick Start
6
-
7
- Installing mq
8
-
9
- ```bash
10
- npm install -g @udx/mq
11
- ```
12
-
13
- Test installation:
14
-
15
- ```bash
16
- mq --version
17
- ```
18
-
19
- ### Basic Usage
20
-
21
- ```bash
22
- # Extract all headings
23
- mq '.headings[]' -i ../../README.md
24
-
25
- # Get the table of contents
26
- mq '.toc' -i ../../README.md
27
-
28
- # Extract all code blocks with their language
29
- mq '.codeBlocks[] | {language, content}' -i ../../README.md
30
-
31
- # Make all code blocks collapsible
32
- mq -t '.codeBlocks[] |= makeCollapsible' -i ../../README.md
33
-
34
- # Analyze document structure
35
- mq --analyze -i ../../README.md
36
-
37
- # Show document structure (headings hierarchy)
38
- mq --structure -i ../../README.md
39
-
40
- # Count document elements
41
- mq --count -i ../../README.md
42
- ```
43
-
44
- ### Using Piped Input
45
-
46
- You can also use mq with piped input from the tools/mq directory:
47
-
48
- ```bash
49
- cat ../../README.md | mq '.headings[]'
50
- ```
51
-
52
- ## Concept
53
-
54
- Just as `jq` allows you to query and transform JSON data, `mq` provides a similar experience for markdown documents. It parses markdown into a structured object that you can query, transform, and analyze using a familiar syntax.
55
-
56
- ## Features
57
-
58
- - Query markdown documents with a jq-like syntax
59
- - Transform markdown content with various operations
60
- - Extract specific elements like headings, code blocks, links
61
- - Generate table of contents
62
- - Analyze document structure and content
63
- - Count document elements
64
- - Integration with mcurl
65
-
66
- ## Query Syntax
67
-
68
- mq uses a simplified jq-like syntax for querying markdown elements:
69
-
70
- ```bash
71
- # Basic element selection
72
- .headings[] # All headings
73
- .headings[0] # First heading
74
- .headings[] | select(.level == 2) # All h2 headings
75
- .links[] # All links
76
- .codeBlocks[] # All code blocks
77
- .toc # Table of contents
78
-
79
- # Filtering and transformations
80
- .headings[] | select(.text | contains("Introduction")) # Headings containing "Introduction"
81
- .codeBlocks[] | select(.language == "javascript") # JavaScript code blocks
82
- .links[] | select(.href | startswith("http")) # External links
83
- ```
84
-
85
- ## Transform Operations
86
-
87
- Transform operations modify the markdown structure:
88
-
89
- ```bash
90
- # Make code blocks collapsible
91
- mq -t '.codeBlocks[] |= makeCollapsible' -i test/fixtures/content-website.md
92
-
93
- # Add a descriptive TOC
94
- mq -t '.toc |= makeDescriptive' -i test/fixtures/content-website.md
95
-
96
- # Add cross-links section after TOC
97
- mq -t '.toc |= addCrossLinks(["worker.md", "platform.md"])' -i test/fixtures/content-website.md
98
-
99
- # Fix heading hierarchy
100
- mq -t '.headings |= fixHierarchy' -i test/fixtures/content-website.md
101
- ```
102
-
103
- ## Analysis Features
104
-
105
- mq provides powerful analysis capabilities:
106
-
107
- ```bash
108
- # Analyze document structure and content
109
- mq --analyze -i test/fixtures/content-website.md
110
-
111
- # Show document structure (headings hierarchy)
112
- mq --structure -i test/fixtures/content-website.md
113
-
114
- # Count document elements
115
- mq --count -i test/fixtures/content-website.md
116
- ```
117
-
118
- ## Piping with Other Tools
119
-
120
- mq works seamlessly with Unix pipes and other tools:
121
-
122
- ```bash
123
- # Find markdown files with inconsistent heading hierarchy
124
- find ../../content/ -name "*.md" | xargs -I{} sh -c 'cat {} | mq ".headings | validateHierarchy" || echo "Issue in {}"'
125
-
126
- # Extract all external links from documentation
127
- find ../../content/ -name "*.md" | xargs cat | mq '.links[] | select(.href | startswith("http")) | .href' | sort | uniq
128
-
129
- # Generate a combined TOC from multiple files
130
- find ../../content/architecture/ -name "*.md" | xargs cat | mq '.headings[] | select(.level <= 2) | {file: input_filename, heading: .text}'
131
- ```
132
-
133
- ## Integration with mCurl
134
-
135
- Use mq as a content handler extension for mCurl:
136
-
137
- ```javascript
138
- import { registerContentHandler } from '@udx/mcurl';
139
- import { markdownHandler } from '@udx/mq';
140
-
141
- // Register the markdown handler with mcurl
142
- registerContentHandler('text/markdown', markdownHandler);
143
-
144
- // Fetch and transform markdown content
145
- const result = await mcurl('https://udx.io', {
146
- mqQuery: '.headings[] | select(.level == 2) | .text'
147
- });
148
- ```
149
-
150
- ## Common Use Cases
151
-
152
- ### 1. Documentation Analysis
153
-
154
- ```bash
155
- # Find documents missing a proper TOC
156
- find test/fixtures/ -name "*.md" | xargs -I{} sh -c 'cat {} | mq ".toc | length" | grep -q "^0$" && echo "{} missing TOC"'
157
-
158
- # List all unique tags used in documentation
159
- find test/fixtures/ -name "*.md" | xargs cat | mq '.tags[]?' | sort | uniq -c | sort -nr
160
-
161
- # Analyze content distribution
162
- find test/fixtures/ -name "*.md" | xargs -I{} sh -c 'cat {} | mq --analyze > {}.analysis.md'
163
- ```
164
-
165
- ### 2. Batch Transformations
166
-
167
- ```bash
168
- # Add cross-links between related documents
169
- for file in test/fixtures/*.md; do
170
- cat $file | mq -t '.toc |= addCrossLinks(["worker.md", "platform.md"])' > $file.new
171
- mv $file.new $file
172
- done
173
-
174
- # Make all code blocks collapsible
175
- find test/fixtures/ -name "*.md" | xargs -I{} sh -c 'cat {} | mq -t ".codeBlocks[] |= makeCollapsible" > {}.new && mv {}.new {}'
176
- ```
177
-
178
- ### 3. Documentation Validation
179
-
180
- ```bash
181
- # Validate all links are working
182
- find test/fixtures/ -name "*.md" | xargs cat | mq '.links[] | .href' | xargs -I{} curl -s -o /dev/null -w "%{http_code} {}\n" {} | grep -v "^200"
183
-
184
- # Check for consistent heading structure
185
- find test/fixtures/ -name "*.md" | xargs -I{} sh -c 'cat {} | mq ".headings | validateHierarchy" || echo "Issue in {}"'
186
- ```
187
-
188
- ## Examples
189
-
190
- See the [examples](./examples) directory for more usage examples.
191
-
192
- ## Document Comparison Examples
193
-
194
- Use `mq` to compare and analyze differences between architecture documents. Run these commands from the tools/mq directory:
195
-
196
- ```bash
197
-
198
- # Compare document statistics between files
199
- mq --count -i test/fixtures/stateless.md | grep 'Headings\|Paragraphs\|Links\|CodeBlocks'
200
-
201
- # Find unique sections in a specific document (e.g., mobile credential-related sections)
202
- mq '.headings[] | select(.text | contains("Mobile Credential"))' -i test/fixtures/stateless.md
203
-
204
- # Compare content distribution to identify most detailed sections
205
- mq --analyze -i test/fixtures/stateless.md | grep -A20 'Content Distribution'
206
-
207
- # Extract links to see different external references
208
- mq '.links[]' -i test/fixtures/stateless.md
209
- ```
210
-
211
- For inline comparison without creating temporary files, use command substitution:
212
-
213
- ```bash
214
- # Make sure to run these from the tools/mq directory
215
-
216
- # Directly compare heading counts
217
- echo "Stateless: $(mq --count -i test/fixtures/stateless.md | grep 'Headings' | awk '{print $2}')"
218
-
219
- # Find terms that appear in one document but not the other
220
- comm -23 <(mq '.headings[].text' -i test/fixtures/stateless.md | sort) \
221
- <(mq '.headings[].text' -i test/fixtures/cloud-automation-group.md | sort)
222
- ```
223
-
224
- ### Document Comparison Script
225
-
226
- Create a simple shell script to help compare architecture documents:
227
-
228
- ```bash
229
- #!/bin/bash
230
- # Save as tools/mq/compare-architectures.sh and make executable with chmod +x
231
-
232
- echo "=== Comparing Document Statistics ==="
233
- echo "Stateless:"
234
- mq --count -i test/fixtures/stateless.md | grep 'Headings\|Paragraphs\|Links\|CodeBlocks'
235
-
236
- echo "\n=== Finding Mobile Credential Sections ==="
237
- mq '.headings[] | select(.text | contains("Mobile") or .text | contains("Credential"))' -i test/fixtures/stateless.md | grep "text"
238
- ```
239
-
240
- ## License
241
-
242
- MIT