npm - @conduction/docusaurus-preset - Versions diffs - 3.3.0 → 3.5.0 - Mend

@conduction/docusaurus-preset 3.3.0 → 3.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10) hide show

package/README.md +50 -0
package/bin/validate-ai-baseline.mjs +177 -0
package/package.json +5 -1
package/src/components/DetailHero/DetailHero.jsx +55 -0
package/src/components/FAQ/FAQ.jsx +56 -3
package/src/data/apps-registry.js +39 -16
package/src/index.js +184 -6
package/src/plugins/ai-crawling.js +112 -0
package/static/img/og-conduction.png +0 -0
package/static/robots.txt.template +86 -0

package/README.md CHANGED Viewed

@@ -21,6 +21,7 @@ A few non-negotiables encoded by the package CSS and worth knowing about:
 - **Brand-default navbar** — locale-dropdown + GitHub link. Sites override `items[]` for site-specific navigation.
 - **Brand-default footer** — three-column link grid + Conduction-tells (KvK, BTW, address). Per-property override: pass `footer: { links: [...] }` to swap columns and inherit the brand copyright unchanged. Spread `baseFooterLinks()` to keep one or two brand columns alongside site-specific ones.
 - **Sensible defaults** — `trailingSlash`, `onBrokenLinks: 'warn'`, `respectPrefersColorScheme`, dark-mode brand mapping.
+- **AI-crawler baseline** — Organization + WebSite JSON-LD on every page, `SoftwareApplication` JSON-LD from `<DetailHero>`, `FAQPage` JSON-LD from `<FAQ>`, default `og:image` + Twitter card meta, sitemap options, and a `postBuild` plugin that emits `robots.txt` when the site does not ship its own. See the AI baseline section below for the validator and content requirements.
 ## Usage
@@ -163,6 +164,55 @@ import '@conduction/docusaurus-preset/diagrams';
 This is how product sites such as `mydash.conduction.nl/docs/...` adopt the brand without copying CSS or theme code, and stay in sync as the design-system evolves.
+## AI-crawler baseline
+Every site that consumes this preset inherits a contract that AI crawlers (GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, Google AI Overviews) expect. The schemas, meta tags, and `robots.txt` ship automatically; sites only have to opt in to the content that surfaces them.
+**What the preset ships**
+| Surface | Source | How a site uses it |
+| --- | --- | --- |
+| Organization + WebSite JSON-LD | `headTags` injected by `createConfig` | Automatic on every page |
+| `og:image`, `twitter:site`, `twitter:card`, `og:type` | `themeConfig.image` + `themeConfig.metadata` defaults | Override per site by passing `themeConfig.image: 'img/og-my-app.png'` |
+| Default `robots.txt` | `conduction-ai-crawling` postBuild plugin | Drop `static/robots.txt` to override |
+| `SoftwareApplication` JSON-LD | `<DetailHero appId="my-app" .../>` | Pages that should advertise the app must render `<DetailHero>` with an `appId` that resolves in `src/data/apps-registry.js`. No DetailHero means no schema. |
+| `FAQPage` JSON-LD | `<FAQ>` with `<FAQItem question=...>` children | Drop a `<FAQ>` block onto a page; the schema is auto-emitted from the children |
+| Sitemap options | Default `sitemap` config on the classic preset | Sites that override `presets` must include their own `sitemap` block |
+**Validating a site**
+The preset ships a generic 8-check validator as a `bin`. Wire it into the site's build:
+```jsonc
+// docs/package.json
+{
+  "scripts": {
+    "build": "docusaurus build",
+    "postbuild": "validate-ai-baseline",
+    "validate:ai-baseline": "validate-ai-baseline"
+  }
+}
+```
+`npm run build` now exits non-zero if any of these regress: `robots.txt` exists with a Sitemap line and an AI-bot allow line, `sitemap.xml` has at least one URL, the homepage emits Organization + WebSite JSON-LD plus `og:image` / `og:type` / `twitter:site` / `twitter:card`, and the `og:image` URL resolves to a real file. Sites can extend the validator with extra checks (per-app SoftwareApplication, FAQPage on specific pages, etc.) by adding their own `scripts/validate-ai-baseline-site.mjs` and chaining it.
+**Per-app docs site checklist**
+For a per-app docs site to satisfy the full schema contract, the landing page must render `<DetailHero appId="my-app" .../>` with an `appId` that exists in `src/data/apps-registry.js`. That single render emits the `SoftwareApplication` JSON-LD with category mapping (Data and Processes -> BusinessApplication, Connectors -> DeveloperApplication, etc.), `operatingSystem: 'Nextcloud'`, and the EUPL-1.2 license URL. Sites that build a custom landing without `<DetailHero>` get only Organization + WebSite, not the per-app schema.
+**Opting out**
+```js
+createConfig({
+  title: '...',
+  url: '...',
+  baseUrl: '/',
+  aiCrawling: { disable: true },          // skip the whole postBuild plugin
+  // or, finer-grained:
+  aiCrawling: { disable: { robotsTxt: true } },  // ship our own static/robots.txt
+});
+```
 ## Releasing
 Releases auto-publish on push to `main`, driven by [semantic-release](https://semantic-release.gitbook.io/) reading [conventional-commit](https://www.conventionalcommits.org/) messages. The [.github/workflows/publish-packages.yml](../.github/workflows/publish-packages.yml) workflow walks every commit since the last `@conduction/docusaurus-preset-v*` tag and decides what to ship:

package/bin/validate-ai-baseline.mjs ADDED Viewed

@@ -0,0 +1,177 @@
+#!/usr/bin/env node
+/**
+ * scripts/validate-ai-baseline.mjs
+ *
+ * Generic AI-crawler baseline validator. Runs as a postbuild step on
+ * every Conduction Docusaurus site that consumes
+ * @conduction/docusaurus-preset >= 3.4.0. Asserts the SSG output
+ * carries the contract AI crawlers (GPTBot, ClaudeBot, PerplexityBot,
+ * OAI-SearchBot, Claude-SearchBot, Google AI Overviews) expect.
+ *
+ * Universal checks only - no site-specific routes. Sites that want
+ * additional gates (per-app SoftwareApplication, FAQPage on specific
+ * pages, etc.) extend this script in place. See conduction-website's
+ * version for an example of additional checks.
+ *
+ * Exit codes:
+ *   0   all checks passed
+ *   1   one or more checks failed (CI should block)
+ *   2   build directory not found (script invoked before build)
+ */
+import {readFileSync, existsSync, statSync} from 'node:fs';
+import {join, resolve} from 'node:path';
+const buildDir = resolve(process.argv[2] || 'build');
+if (!existsSync(buildDir)) {
+  console.error(`✗ build directory not found: ${buildDir}`);
+  console.error(`  Run \`npx docusaurus build\` first.`);
+  process.exit(2);
+}
+const results = [];
+function check(name, fn) {
+  try {
+    const r = fn();
+    results.push({name, ok: r.ok, msg: r.msg});
+  } catch (e) {
+    results.push({name, ok: false, msg: `threw: ${e.message}`});
+  }
+}
+function readBuild(p) {
+  return readFileSync(join(buildDir, p), 'utf8');
+}
+/* robots.txt - shipped by the preset's ai-crawling plugin (or the
+   site's own static/robots.txt). Either way, the file must exist
+   and name at least one AI search bot so a `grep` audit can confirm
+   the posture at a glance. */
+check('robots.txt exists and is non-empty', () => {
+  const path = join(buildDir, 'robots.txt');
+  if (!existsSync(path)) return {ok: false, msg: 'missing'};
+  const size = statSync(path).size;
+  if (size < 50) return {ok: false, msg: `too small (${size} bytes)`};
+  return {ok: true, msg: `${size} bytes`};
+});
+check('robots.txt names at least one AI search bot', () => {
+  const body = readBuild('robots.txt');
+  const candidates = ['OAI-SearchBot', 'Claude-SearchBot', 'PerplexityBot', 'ChatGPT-User', 'Claude-User'];
+  const found = candidates.filter(ua => body.includes(`User-agent: ${ua}`));
+  if (found.length === 0) {
+    return {ok: false, msg: `none of [${candidates.join(', ')}] referenced`};
+  }
+  return {ok: true, msg: `${found.length} bot(s): ${found.join(', ')}`};
+});
+check('robots.txt has a Sitemap line', () => {
+  const body = readBuild('robots.txt');
+  const matches = body.match(/^Sitemap:\s+https?:\/\//gm) || [];
+  if (matches.length === 0) return {ok: false, msg: 'no Sitemap: line'};
+  return {ok: true, msg: `${matches.length} sitemap line(s)`};
+});
+/* sitemap.xml - emitted by @docusaurus/plugin-sitemap (loaded via
+   the classic preset). Locale-specific sitemaps (e.g. /nl/sitemap.xml)
+   are present for i18n builds; we only check the canonical one
+   because some sites are single-locale. */
+check('sitemap.xml exists and has at least 1 URL', () => {
+  const path = join(buildDir, 'sitemap.xml');
+  if (!existsSync(path)) return {ok: false, msg: 'missing'};
+  const body = readBuild('sitemap.xml');
+  const n = (body.match(/<loc>/g) || []).length;
+  if (n < 1) return {ok: false, msg: 'no <loc> entries'};
+  return {ok: true, msg: `${n} URLs`};
+});
+/* Helper for the JSON-LD checks below. Docusaurus emits ld+json
+   tags via two paths with different attribute ordering: top-level
+   headTags renders <script type="..."> first, while Helmet (used
+   by <Head> from inside React components like <DetailHero>, <FAQ>)
+   prefixes data-rh="true". The regex matches either ordering. */
+function extractJsonLdBlocks(html) {
+  const out = [];
+  const re = /<script\b[^>]*\btype="application\/ld\+json"[^>]*>([\s\S]*?)<\/script>/g;
+  let m;
+  while ((m = re.exec(html)) !== null) {
+    out.push(m[1]);
+  }
+  return out;
+}
+check('homepage emits >= 2 JSON-LD blocks, all valid JSON', () => {
+  if (!existsSync(join(buildDir, 'index.html'))) return {ok: false, msg: 'no index.html'};
+  const html = readBuild('index.html');
+  const blocks = extractJsonLdBlocks(html);
+  if (blocks.length < 2) return {ok: false, msg: `only ${blocks.length} block(s)`};
+  for (const [i, b] of blocks.entries()) {
+    try {JSON.parse(b);} catch (e) {
+      return {ok: false, msg: `block ${i} invalid JSON: ${e.message}`};
+    }
+  }
+  return {ok: true, msg: `${blocks.length} blocks, all valid`};
+});
+check('homepage JSON-LD includes Organization and WebSite', () => {
+  const html = readBuild('index.html');
+  const types = extractJsonLdBlocks(html).map(b => {
+    try {return JSON.parse(b)['@type'];} catch {return null;}
+  });
+  const want = ['Organization', 'WebSite'];
+  const missing = want.filter(t => !types.includes(t));
+  if (missing.length) return {ok: false, msg: `missing @type: ${missing.join(', ')}`};
+  return {ok: true, msg: types.filter(Boolean).join(' + ')};
+});
+/* Social-card meta. og:image is the one that breaks LinkedIn /
+   Slack / AI previews when it 404s, so we also resolve the URL to
+   a local file in the build output. */
+function metaTag(html, key) {
+  const re = new RegExp(`<meta[^>]+(?:name|property)="${key}"[^>]+content="([^"]+)"`, 'i');
+  const m = html.match(re);
+  return m ? m[1] : null;
+}
+check('homepage has og:image, og:type, twitter:site, twitter:card', () => {
+  const html = readBuild('index.html');
+  const checks = {
+    'og:image': metaTag(html, 'og:image'),
+    'og:type': metaTag(html, 'og:type'),
+    'twitter:site': metaTag(html, 'twitter:site'),
+    'twitter:card': metaTag(html, 'twitter:card'),
+  };
+  const missing = Object.entries(checks).filter(([, v]) => !v).map(([k]) => k);
+  if (missing.length) return {ok: false, msg: `missing: ${missing.join(', ')}`};
+  return {ok: true, msg: 'all four present'};
+});
+check('og:image URL resolves to a file in the build', () => {
+  const html = readBuild('index.html');
+  const url = metaTag(html, 'og:image');
+  if (!url) return {ok: false, msg: 'no og:image meta'};
+  const path = url.replace(/^https?:\/\/[^/]+\//, '');
+  const local = join(buildDir, path);
+  if (!existsSync(local)) return {ok: false, msg: `og:image refers to ${url}, not found at ${local}`};
+  const size = statSync(local).size;
+  if (size < 1024) return {ok: false, msg: `og:image file suspiciously small (${size} bytes)`};
+  return {ok: true, msg: `${path} (${size} bytes)`};
+});
+/* Report */
+let failed = 0;
+for (const {name, ok, msg} of results) {
+  const icon = ok ? '✓' : '✗';
+  console.log(`${icon} ${name} - ${msg}`);
+  if (!ok) failed++;
+}
+console.log('');
+if (failed) {
+  console.error(`${failed} of ${results.length} checks failed.`);
+  console.error('AI-crawler baseline regressed. Fix the failures above before merging.');
+  process.exit(1);
+} else {
+  console.log(`All ${results.length} AI-baseline checks passed.`);
+}

package/package.json CHANGED Viewed

@@ -1,9 +1,12 @@
 {
   "name": "@conduction/docusaurus-preset",
-  "version": "3.3.0",
+  "version": "3.5.0",
   "scripts": {
     "prepack": "node scripts/prepack-bundle-css.js"
   },
+  "bin": {
+    "validate-ai-baseline": "./bin/validate-ai-baseline.mjs"
+  },
   "description": "Conduction brand preset for Docusaurus 3. Tokens, theme, navbar, footer, i18n config for nl/en/de/fr, and the React component library that powers conduction.nl and the Conduction product sites.",
   "main": "src/index.js",
   "exports": {
@@ -28,6 +31,7 @@
   "files": [
     "src/",
     "static/",
+    "bin/",
     "README.md",
     "MISSING_COMPONENTS.md"
   ],

package/src/components/DetailHero/DetailHero.jsx CHANGED Viewed

@@ -45,11 +45,13 @@
  */
 import React from 'react';
+import Head from '@docusaurus/Head';
 import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
 import HexBullet from '../primitives/HexBullet';
 import Button from '../primitives/Button';
 import {deriveStability} from '../../theme/brand.jsx';
 import {downloadsForApp, formatDownloads} from '../../data/app-downloads';
+import {APPS_REGISTRY, applicationCategoryFor} from '../../data/apps-registry';
 import styles from './DetailHero.module.css';
 /**
@@ -107,8 +109,61 @@ export default function DetailHero({
       }
     : undefined);
+  /* SoftwareApplication JSON-LD for AI crawlers. Emitted when appId
+     resolves to a known entry in apps-registry (so the schema only
+     fires on actual product pages, not partner/solution detail pages
+     that reuse this hero). Pulls applicationCategory from the registry
+     category, operatingSystem is hard-coded "Nextcloud" because every
+     Conduction app is a Nextcloud app. Downloads and version surface
+     as ratingCount-shaped signals on schema.org/SoftwareApplication.
+     The schema lives on every page that mounts this hero, including
+     each product page's /apps/<slug> route on conduction.nl AND each
+     per-app docs site's landing where DetailHero is the masthead. */
+  const appEntry = appId ? APPS_REGISTRY[appId] : undefined;
+  const softwareApplicationJsonLd = appEntry ? (() => {
+    const titleText = typeof title === 'string' ? title : appEntry.name;
+    const taglineText = typeof tagline === 'string' ? tagline : undefined;
+    const schema = {
+      '@context': 'https://schema.org',
+      '@type': 'SoftwareApplication',
+      '@id': `${siteConfig?.url || ''}${appEntry.productHref}#app`,
+      name: titleText,
+      applicationCategory: applicationCategoryFor(appId),
+      operatingSystem: 'Nextcloud',
+      url: `${siteConfig?.url || ''}${appEntry.productHref}`,
+      sameAs: [appEntry.docsHref].filter(Boolean),
+      publisher: {'@id': 'https://www.conduction.nl/#org'},
+      offers: {
+        '@type': 'Offer',
+        price: '0',
+        priceCurrency: 'EUR',
+        availability: 'https://schema.org/InStock',
+      },
+      license: 'https://eupl.eu/1.2/en/',
+    };
+    if (taglineText) schema.description = taglineText;
+    if (resolvedVersion) schema.softwareVersion = resolvedVersion.replace(/^v/, '');
+    if (dlCount > 0) {
+      /* Surface install count as InteractionCounter rather than
+         aggregateRating; downloads are not reviews. */
+      schema.interactionStatistic = {
+        '@type': 'InteractionCounter',
+        interactionType: {'@type': 'DownloadAction'},
+        userInteractionCount: dlCount,
+      };
+    }
+    return schema;
+  })() : null;
   return (
     <section className={[styles.head, hasIllustration && styles.withIllustration, bgClass, className].filter(Boolean).join(' ')}>
+      {softwareApplicationJsonLd && (
+        <Head>
+          <script type="application/ld+json">
+            {JSON.stringify(softwareApplicationJsonLd)}
+          </script>
+        </Head>
+      )}
       {crumb && Array.isArray(crumb) && (
         <div className={styles.crumb}>
           {crumb.map((c, i) => {

package/src/components/FAQ/FAQ.jsx CHANGED Viewed

@@ -5,19 +5,26 @@
  * Native <details>/<summary> for keyboard, screen-reader, and no-JS support.
  * One question expands at a time on click; users can have several open.
  *
+ * Emits FAQPage JSON-LD automatically from the FAQItem children, so AI
+ * crawlers (Google AI Overviews, Perplexity, ChatGPT) pick up each
+ * question + answer as a structured entity. Pass `schema={false}` to
+ * suppress the JSON-LD output (useful when the FAQ block isn't the
+ * primary content on the page).
+ *
  * Usage:
  *
  *   <FAQ title="FAQ.">
  *     <FAQItem question="If the apps are free, what does support cost?" defaultOpen>
- *       Whatever the partner you pick charges. Conduction sets no minimum…
+ *       Whatever the partner you pick charges. Conduction sets no minimum.
  *     </FAQItem>
  *     <FAQItem question="Why doesn't Conduction sell support directly?">
- *       Two reasons. First…
+ *       Two reasons. First, ...
  *     </FAQItem>
  *   </FAQ>
  */
 import React from 'react';
+import Head from '@docusaurus/Head';
 import styles from './FAQ.module.css';
 export function FAQItem({question, defaultOpen, children, className}) {
@@ -31,10 +38,56 @@ export function FAQItem({question, defaultOpen, children, className}) {
   );
 }
-export default function FAQ({title = 'FAQ.', children, className}) {
+/**
+ * Walk a React children tree and flatten to plain text. Used to
+ * convert the JSX body of each <FAQItem> into the `text` field
+ * schema.org/Question expects. Strips elements, keeps strings and
+ * numbers, joins siblings with a single space.
+ */
+function flattenToText(node) {
+  if (node == null || node === false) return '';
+  if (typeof node === 'string' || typeof node === 'number') return String(node);
+  if (Array.isArray(node)) return node.map(flattenToText).join(' ');
+  if (node.props && node.props.children) return flattenToText(node.props.children);
+  return '';
+}
+export default function FAQ({title = 'FAQ.', children, className, schema = true}) {
   const composed = [styles.faq, className].filter(Boolean).join(' ');
+  /* Build FAQPage JSON-LD from FAQItem children. Each item turns into
+     a Question entity with an Answer body. Items without a `question`
+     prop (or with empty body text) are skipped so a partial render
+     doesn't ship a malformed schema. */
+  const items = schema
+    ? React.Children.toArray(children)
+        .filter(c => c && c.props && c.props.question)
+        .map(c => {
+          const text = flattenToText(c.props.children).trim().replace(/\s+/g, ' ');
+          if (!text) return null;
+          return {
+            '@type': 'Question',
+            name: c.props.question,
+            acceptedAnswer: {'@type': 'Answer', text},
+          };
+        })
+        .filter(Boolean)
+    : [];
+  const showSchema = schema && items.length > 0;
   return (
     <section className={composed}>
+      {showSchema && (
+        <Head>
+          <script type="application/ld+json">
+            {JSON.stringify({
+              '@context': 'https://schema.org',
+              '@type': 'FAQPage',
+              mainEntity: items,
+            })}
+          </script>
+        </Head>
+      )}
       {title && <h2 className={styles.heading}>{title}</h2>}
       {children}
     </section>

package/src/data/apps-registry.js CHANGED Viewed

@@ -27,24 +27,47 @@
  */
 export const APPS_REGISTRY = {
-  opencatalogi:    {slug: 'opencatalogi',    name: 'OpenCatalogi',     productHref: '/apps/opencatalogi',    docsHref: 'https://docs.conduction.nl/opencatalogi',    academyHref: '/academy?app=opencatalogi'},
-  openregister:    {slug: 'openregister',    name: 'OpenRegister',     productHref: '/apps/openregister',    docsHref: 'https://docs.conduction.nl/openregister',    academyHref: '/academy?app=openregister'},
-  openconnector:   {slug: 'openconnector',   name: 'OpenConnector',    productHref: '/apps/openconnector',   docsHref: 'https://docs.conduction.nl/openconnector',   academyHref: '/academy?app=openconnector'},
-  docudesk:        {slug: 'docudesk',        name: 'DocuDesk',         productHref: '/apps/docudesk',        docsHref: 'https://docs.conduction.nl/docudesk',        academyHref: '/academy?app=docudesk'},
-  mydash:          {slug: 'mydash',          name: 'MyDash',           productHref: '/apps/mydash',          docsHref: 'https://docs.conduction.nl/mydash',          academyHref: '/academy?app=mydash'},
-  zaakafhandelapp: {slug: 'zaakafhandelapp', name: 'ZaakAfhandelApp',  productHref: '/apps/zaakafhandelapp', docsHref: 'https://docs.conduction.nl/zaakafhandelapp', academyHref: '/academy?app=zaakafhandelapp'},
-  pipelinq:        {slug: 'pipelinq',        name: 'PipelinQ',         productHref: '/apps/pipelinq',        docsHref: 'https://docs.conduction.nl/pipelinq',        academyHref: '/academy?app=pipelinq'},
-  procest:         {slug: 'procest',         name: 'Procest',          productHref: '/apps/procest',         docsHref: 'https://docs.conduction.nl/procest',         academyHref: '/academy?app=procest'},
-  decidesk:        {slug: 'decidesk',        name: 'DeciDesk',         productHref: '/apps/decidesk',        docsHref: 'https://docs.conduction.nl/decidesk',        academyHref: '/academy?app=decidesk'},
-  softwarecatalog: {slug: 'softwarecatalog', name: 'SoftwareCatalog',  productHref: '/apps/softwarecatalog', docsHref: 'https://docs.conduction.nl/softwarecatalog', academyHref: '/academy?app=softwarecatalog'},
-  larpingapp:      {slug: 'larpingapp',      name: 'LarpingApp',       productHref: '/apps/larpingapp',      docsHref: 'https://docs.conduction.nl/larpingapp',      academyHref: '/academy?app=larpingapp'},
-  nldesign:        {slug: 'nldesign',        name: 'NLDesign',         productHref: '/apps/nldesign',        docsHref: 'https://docs.conduction.nl/nldesign',        academyHref: '/academy?app=nldesign'},
-  shillinq:        {slug: 'shillinq',        name: 'Shillinq',         productHref: '/apps/shillinq',        docsHref: 'https://docs.conduction.nl/shillinq',        academyHref: '/academy?app=shillinq'},
-  openbuilt:       {slug: 'openbuilt',       name: 'OpenBuilt',        productHref: '/apps/openbuilt',       docsHref: 'https://docs.conduction.nl/openbuilt',       academyHref: '/academy?app=openbuilt'},
-  doriath:         {slug: 'doriath',         name: 'Doriath',          productHref: '/apps/doriath',         docsHref: 'https://docs.conduction.nl/doriath',         academyHref: '/academy?app=doriath'},
-  'app-versions':  {slug: 'app-versions',    name: 'App Versions',     productHref: '/apps/app-versions',    docsHref: 'https://docs.conduction.nl/app-versions',    academyHref: '/academy?app=app-versions'},
+  opencatalogi:    {slug: 'opencatalogi',    name: 'OpenCatalogi',     category: 'Data',        productHref: '/apps/opencatalogi',    docsHref: 'https://docs.conduction.nl/opencatalogi',    academyHref: '/academy?app=opencatalogi'},
+  openregister:    {slug: 'openregister',    name: 'OpenRegister',     category: 'Data',        productHref: '/apps/openregister',    docsHref: 'https://docs.conduction.nl/openregister',    academyHref: '/academy?app=openregister'},
+  openconnector:   {slug: 'openconnector',   name: 'OpenConnector',    category: 'Connectors',  productHref: '/apps/openconnector',   docsHref: 'https://docs.conduction.nl/openconnector',   academyHref: '/academy?app=openconnector'},
+  docudesk:        {slug: 'docudesk',        name: 'DocuDesk',         category: 'Documents',   productHref: '/apps/docudesk',        docsHref: 'https://docs.conduction.nl/docudesk',        academyHref: '/academy?app=docudesk'},
+  mydash:          {slug: 'mydash',          name: 'MyDash',           category: 'Dashboards',  productHref: '/apps/mydash',          docsHref: 'https://docs.conduction.nl/mydash',          academyHref: '/academy?app=mydash'},
+  zaakafhandelapp: {slug: 'zaakafhandelapp', name: 'ZaakAfhandelApp',  category: 'Processes',   productHref: '/apps/zaakafhandelapp', docsHref: 'https://docs.conduction.nl/zaakafhandelapp', academyHref: '/academy?app=zaakafhandelapp'},
+  pipelinq:        {slug: 'pipelinq',        name: 'PipelinQ',         category: 'Processes',   productHref: '/apps/pipelinq',        docsHref: 'https://docs.conduction.nl/pipelinq',        academyHref: '/academy?app=pipelinq'},
+  procest:         {slug: 'procest',         name: 'Procest',          category: 'Processes',   productHref: '/apps/procest',         docsHref: 'https://docs.conduction.nl/procest',         academyHref: '/academy?app=procest'},
+  decidesk:        {slug: 'decidesk',        name: 'DeciDesk',         category: 'Processes',   productHref: '/apps/decidesk',        docsHref: 'https://docs.conduction.nl/decidesk',        academyHref: '/academy?app=decidesk'},
+  softwarecatalog: {slug: 'softwarecatalog', name: 'SoftwareCatalog',  category: 'Data',        productHref: '/apps/softwarecatalog', docsHref: 'https://docs.conduction.nl/softwarecatalog', academyHref: '/academy?app=softwarecatalog'},
+  larpingapp:      {slug: 'larpingapp',      name: 'LarpingApp',       category: 'Processes',   productHref: '/apps/larpingapp',      docsHref: 'https://docs.conduction.nl/larpingapp',      academyHref: '/academy?app=larpingapp'},
+  nldesign:        {slug: 'nldesign',        name: 'NLDesign',         category: 'Documents',   productHref: '/apps/nldesign',        docsHref: 'https://docs.conduction.nl/nldesign',        academyHref: '/academy?app=nldesign'},
+  shillinq:        {slug: 'shillinq',        name: 'Shillinq',         category: 'Processes',   productHref: '/apps/shillinq',        docsHref: 'https://docs.conduction.nl/shillinq',        academyHref: '/academy?app=shillinq'},
+  openbuilt:       {slug: 'openbuilt',       name: 'OpenBuilt',        category: 'Processes',   productHref: '/apps/openbuilt',       docsHref: 'https://docs.conduction.nl/openbuilt',       academyHref: '/academy?app=openbuilt'},
+  doriath:         {slug: 'doriath',         name: 'Doriath',          category: 'Connectors',  productHref: '/apps/doriath',         docsHref: 'https://docs.conduction.nl/doriath',         academyHref: '/academy?app=doriath'},
+  'app-versions':  {slug: 'app-versions',    name: 'App Versions',     category: 'Data',        productHref: '/apps/app-versions',    docsHref: 'https://docs.conduction.nl/app-versions',    academyHref: '/academy?app=app-versions'},
 };
+/**
+ * Map an apps-catalog category to a schema.org applicationCategory.
+ * Used by <DetailHero> when emitting SoftwareApplication JSON-LD for
+ * AI crawlers. Defaults to BusinessApplication for any unknown
+ * category, since every Conduction app fits BusinessApplication in
+ * the absence of better signal.
+ */
+export const SCHEMA_APPLICATION_CATEGORY = {
+  Data:        'BusinessApplication',
+  Processes:   'BusinessApplication',
+  Connectors:  'DeveloperApplication',
+  Documents:   'BusinessApplication',
+  Dashboards:  'BusinessApplication',
+  AI:          'BusinessApplication',
+};
+/** Resolve an appId to its schema.org applicationCategory. */
+export function applicationCategoryFor(slug) {
+  const entry = APPS_REGISTRY[slug];
+  if (!entry) return 'BusinessApplication';
+  return SCHEMA_APPLICATION_CATEGORY[entry.category] || 'BusinessApplication';
+}
 export const APP_SLUGS = Object.keys(APPS_REGISTRY);
 /** Build a label map keyed by slug, suitable for <ContentTypeFilter labels=…/>. */

package/src/index.js CHANGED Viewed

@@ -88,6 +88,136 @@ function resolveAppVersion(opts) {
   return undefined;
 }
+/**
+ * Brand-default Organization JSON-LD. One canonical version of the
+ * company's legal-entity facts (address, KvK, BTW, socials), shipped on
+ * every Conduction site so AI crawlers (GPTBot, ClaudeBot, Perplexity-
+ * Bot, OAI-SearchBot, Google AI Overviews) get the same answer to
+ * "who is Conduction" regardless of which subdomain they landed on.
+ * Updates here propagate to the fleet on the next preset release.
+ *
+ * Sites that aren't conduction.nl (per-app docs sites at
+ * {slug}.conduction.nl, etc.) still reference the same Organization via
+ * @id, so cross-site citations consolidate cleanly.
+ */
+const BRAND_ORGANIZATION_JSONLD = {
+  '@context': 'https://schema.org',
+  '@type': 'Organization',
+  '@id': 'https://www.conduction.nl/#org',
+  name: 'Conduction B.V.',
+  alternateName: 'Conduction',
+  url: 'https://www.conduction.nl/',
+  logo: 'https://www.conduction.nl/img/brand/avatar-conduction-gold-on-white.svg',
+  foundingDate: '2019',
+  description:
+    'Dutch open-source software company building EUPL-1.2 apps for the Nextcloud workspace.',
+  address: {
+    '@type': 'PostalAddress',
+    streetAddress: 'Lauriergracht 14h',
+    postalCode: '1016 RR',
+    addressLocality: 'Amsterdam',
+    addressCountry: 'NL',
+  },
+  email: 'info@conduction.nl',
+  telephone: '+31-85-303-6840',
+  taxID: 'NL860784241B01',
+  vatID: 'NL860784241B01',
+  identifier: {
+    '@type': 'PropertyValue',
+    propertyID: 'KvK',
+    value: '76741850',
+  },
+  sameAs: [
+    'https://github.com/ConductionNL',
+    'https://www.linkedin.com/company/conduction/',
+  ],
+};
+/**
+ * Build the per-site WebSite JSON-LD that ties the consuming site to
+ * the shared Organization. WebSite carries the site title and URL the
+ * site was configured with; Organization stays canonical.
+ */
+function buildWebsiteJsonLd(opts) {
+  return {
+    '@context': 'https://schema.org',
+    '@type': 'WebSite',
+    '@id': `${opts.url}/#website`,
+    url: `${opts.url}/`,
+    name: opts.title,
+    publisher: {'@id': 'https://www.conduction.nl/#org'},
+    inLanguage: (opts.i18n && opts.i18n.locales) || ['nl', 'en', 'de', 'fr'],
+  };
+}
+/**
+ * Default headTags emitted on every page. Two JSON-LD blocks
+ * (Organization + WebSite) consumed by AI crawlers, Google rich
+ * results, Bing AI, LinkedIn previews. Static SSG output, so non-JS
+ * fetchers (GPTBot, ClaudeBot, PerplexityBot) see them too.
+ *
+ * Sites extend by passing `opts.headTags = [...]`; the preset merges
+ * the site's tags after its own defaults.
+ */
+function buildAiHeadTags(opts) {
+  return [
+    {
+      tagName: 'script',
+      attributes: {type: 'application/ld+json'},
+      innerHTML: JSON.stringify(BRAND_ORGANIZATION_JSONLD),
+    },
+    {
+      tagName: 'script',
+      attributes: {type: 'application/ld+json'},
+      innerHTML: JSON.stringify(buildWebsiteJsonLd(opts)),
+    },
+  ];
+}
+/**
+ * Default sitemap plugin options. Each locale outputs its own
+ * sitemap.xml. /academy/tags/** is excluded site-wide because tag
+ * pages are thin and confuse AI summarisers more than they help SEO.
+ * ignorePatterns matches the *route path* after locale prefixing, so
+ * we list every locale variant.
+ *
+ * Sites passing their own classic preset config can override by
+ * including a `sitemap` key alongside `docs`/`blog`/`theme`.
+ */
+const DEFAULT_SITEMAP_OPTIONS = {
+  changefreq: 'weekly',
+  priority: 0.5,
+  ignorePatterns: [
+    '/academy/tags/**',
+    '/nl/academy/tags/**',
+    '/en/academy/tags/**',
+    '/de/academy/tags/**',
+    '/fr/academy/tags/**',
+  ],
+  filename: 'sitemap.xml',
+};
+/**
+ * Default themeConfig.metadata. Twitter + og:type baselines so social
+ * cards render correctly. Per-page MDX frontmatter still wins (Helmet
+ * de-dupes by meta name/property). Sites override the whole array by
+ * passing `themeConfig.metadata = [...]` in opts.
+ */
+const DEFAULT_METADATA = [
+  {name: 'twitter:site', content: '@ConductionNL'},
+  {name: 'twitter:card', content: 'summary_large_image'},
+  {property: 'og:type', content: 'website'},
+];
+/**
+ * Default OG image. 1200x630 cobalt brand card shipped from the preset
+ * static/img/. Sites can override by dropping their own
+ * static/img/og-conduction.png (staticDirectories precedence puts the
+ * site's file last). For per-app product cards, set themeConfig.image
+ * explicitly in your site config.
+ */
+const DEFAULT_OG_IMAGE = 'img/og-conduction.png';
 /**
  * Brand-default i18n block. Nederlands at the URL root, others at /en/, /de/, /fr/.
  */
@@ -266,12 +396,19 @@ function createConfig(opts) {
     onBrokenLinks: 'warn',
     onBrokenMarkdownLinks: 'warn',
-    /* Two static roots, in increasing-precedence order:
-         1. preset's own ../static (lib/canal-footer, conduction-bg, hex-rain,
-            platform-diagram + brand img/favicon, logo, logo-dark, nextcloud-logo)
+    /* Two static roots. Docusaurus wires staticDirectories through
+       copy-webpack-plugin's parallel pattern processing (Promise.all),
+       so for file collisions the winner is whichever pattern finishes
+       reading first — non-deterministic in practice (preset wins on
+       most disks because it ships smaller). Don't rely on this array
+       order for overrides; ship a Docusaurus plugin like
+       ./plugins/ai-crawling.js when you need deterministic precedence.
+         1. preset's own ../static (canal-footer, conduction-bg,
+            hex-rain, platform-diagram, brand img/favicon, logo, logo-
+            dark, nextcloud-logo, default OG card)
          2. site's own static/ (CNAME, site-specific images, overrides)
-       Last wins per-file, so a site can drop its own /img/logo.svg into
-       static/img/logo.svg to override the brand default. */
+       Files unique to one directory always copy; conflicts are
+       essentially undefined behaviour. */
     staticDirectories: opts.staticDirectories || [
       path.resolve(__dirname, '..', 'static'),
       'static',
@@ -295,6 +432,11 @@ function createConfig(opts) {
           theme: {
             customCss,
           },
+          /* AI-crawler-friendly defaults for @docusaurus/plugin-sitemap.
+             Per-locale sitemap.xml emitted automatically; /academy/tags
+             excluded across every locale prefix. Sites passing their own
+             presets array must include their own sitemap config too. */
+          sitemap: DEFAULT_SITEMAP_OPTIONS,
         },
       ],
     ],
@@ -363,17 +505,53 @@ function createConfig(opts) {
                isoCertifications: true | false,
              } */
         legalLinks: opts.legalLinks || {},
+        /* AI-friendly social-card defaults. `image` ships from the
+           preset's static/img/og-conduction.png and gets served at every
+           consuming site's /img/og-conduction.png; drop your own
+           static/img/og-conduction.png to override per-site. `metadata`
+           seeds twitter:site + twitter:card + og:type baselines; per-
+           page MDX frontmatter still wins via Helmet de-dupe. */
+        image: DEFAULT_OG_IMAGE,
+        metadata: DEFAULT_METADATA,
       },
       opts.themeConfig || {}
     ),
-    plugins: opts.plugins || [],
+    /* AI-crawler discovery: Organization + WebSite JSON-LD on every
+       page, plus any site-specific tags the consumer adds. Sites
+       inherit the canonical Conduction Organization automatically;
+       per-app SoftwareApplication schemas are emitted by the
+       <DetailHero> component on the pages it renders. */
+    headTags: [
+      ...buildAiHeadTags(opts),
+      ...(opts.headTags || []),
+    ],
+    /* The AI-crawling plugin emits /robots.txt (and optionally /llms.txt)
+       in postBuild, after webpack's copy-plugin has copied static files.
+       It no-ops when the file already exists in outDir, so a site's own
+       static/robots.txt or static/llms.txt always wins. Sites disable
+       per-file or wholesale via opts.aiCrawling.disable. Hand-rolled
+       plugins in opts.plugins are appended after this default. */
+    plugins: [
+      [
+        require.resolve('./plugins/ai-crawling.js'),
+        opts.aiCrawling || {},
+      ],
+      ...(opts.plugins || []),
+    ],
   };
 }
 module.exports = {
   createConfig,
   I18N,
+  BRAND_ORGANIZATION_JSONLD,
+  buildWebsiteJsonLd,
+  buildAiHeadTags,
+  DEFAULT_SITEMAP_OPTIONS,
+  DEFAULT_METADATA,
+  DEFAULT_OG_IMAGE,
   baseNavbar,
   baseFooter,
   baseFooterLinks,

package/src/plugins/ai-crawling.js ADDED Viewed

@@ -0,0 +1,112 @@
+/**
+ * @conduction/docusaurus-preset/plugins/ai-crawling
+ *
+ * Docusaurus plugin that ships the AI-crawler-friendly fleet baseline:
+ *   - /robots.txt  default: open posture (allows search and training
+ *     bots), generated from the template at static/robots.txt.template
+ *   - /llms.txt    optional: a llmstxt.org-format site index. Sites
+ *     pass `opts.llmsTxt` to provide their own content; absent that,
+ *     no llms.txt is emitted (the spec needs hand-curated copy and a
+ *     generic placeholder would be worse than nothing).
+ *
+ * Why a postBuild plugin instead of static/ files: Docusaurus wires
+ * staticDirectories through copy-webpack-plugin's parallel pattern
+ * processing, which makes file-collision resolution non-deterministic
+ * (patterns are emitted via Promise.all and the first-emitted asset
+ * wins). Shipping defaults via this plugin moves the write to the
+ * post-build hook, which runs strictly after all static files have
+ * been copied. The plugin no-ops when the target file already exists
+ * in the build output, so a site's own static/robots.txt or
+ * static/llms.txt always wins.
+ *
+ * Options:
+ *   robotsTxt   string  override the default robots.txt body
+ *   llmsTxt     string  llms.txt body to emit (no default)
+ *   disable     boolean | object  fine-grained opt-out:
+ *                 disable: true                  skip the whole plugin
+ *                 disable: {robotsTxt: true}     skip just robots.txt
+ *                 disable: {llmsTxt: true}       skip just llms.txt
+ */
+const fs = require('fs');
+const path = require('path');
+const TEMPLATE_PATH = path.resolve(
+  __dirname,
+  '..',
+  '..',
+  'static',
+  'robots.txt.template'
+);
+function readRobotsTemplate() {
+  try {
+    return fs.readFileSync(TEMPLATE_PATH, 'utf8');
+  } catch (e) {
+    return 'User-agent: *\nAllow: /\n';
+  }
+}
+/**
+ * Walk every staticDirectory the site exposes and look for the given
+ * filename at the root. Used to decide whether the plugin should emit
+ * its default — sites that ship their own static/robots.txt (or
+ * static/llms.txt) take precedence over the plugin's fallback.
+ */
+function siteShipsFile(context, filename) {
+  const dirs = context.siteConfig?.staticDirectories || ['static'];
+  const siteDir = context.siteDir;
+  for (const dir of dirs) {
+    const resolved = path.isAbsolute(dir) ? dir : path.resolve(siteDir, dir);
+    if (fs.existsSync(path.join(resolved, filename))) return true;
+  }
+  return false;
+}
+function aiCrawlingPlugin(context, options = {}) {
+  const disable = options.disable === true
+    ? {robotsTxt: true, llmsTxt: true}
+    : (options.disable || {});
+  /* Decide once at plugin-load time which files the plugin will emit.
+     Checking now (not in postBuild) means we read source staticDir
+     contents while disk state is stable, avoiding the postBuild race
+     where webpack's CopyPlugin may not have flushed its assets yet. */
+  const shouldEmitRobots = !disable.robotsTxt
+    && !siteShipsFile(context, 'robots.txt');
+  const shouldEmitLlms = !disable.llmsTxt
+    && options.llmsTxt
+    && !siteShipsFile(context, 'llms.txt');
+  return {
+    name: 'conduction-ai-crawling',
+    async postBuild({outDir, siteConfig}) {
+      const tasks = [];
+      if (shouldEmitRobots) {
+        const target = path.join(outDir, 'robots.txt');
+        let body = options.robotsTxt || readRobotsTemplate();
+        /* If the site config has a url, append a Sitemap line so
+           crawlers without locale-suffix discovery still find the
+           per-locale sitemaps. Sites supplying their own robotsTxt
+           body are trusted to include their own Sitemap lines. */
+        if (!options.robotsTxt && siteConfig?.url) {
+          const u = siteConfig.url.replace(/\/$/, '');
+          body = body.trimEnd() + `\n\nSitemap: ${u}/sitemap.xml\n`;
+        }
+        tasks.push(fs.promises.writeFile(target, body, 'utf8'));
+      }
+      if (shouldEmitLlms) {
+        const target = path.join(outDir, 'llms.txt');
+        tasks.push(fs.promises.writeFile(target, options.llmsTxt, 'utf8'));
+      }
+      await Promise.all(tasks);
+    },
+  };
+}
+module.exports = aiCrawlingPlugin;
+module.exports.readRobotsTemplate = readRobotsTemplate;

package/static/img/og-conduction.png ADDED Viewed

Binary file

package/static/robots.txt.template ADDED Viewed

@@ -0,0 +1,86 @@
+# Conduction default robots.txt
+#
+# Generated at build time by the @conduction/docusaurus-preset
+# ai-crawling plugin. Sites that ship their own static/robots.txt
+# get that file instead (the plugin no-ops if robots.txt already
+# exists in the build output).
+#
+# Posture: open. Conduction is an open-source company, our content is
+# meant to be read, quoted, and learned from. We allow both AI-search
+# citation crawlers and AI-training crawlers.
+#
+# Override by dropping a custom static/robots.txt in your site repo.
+# To flip the whole fleet, edit the DEFAULT_ROBOTS_TXT constant in
+# design-system/docusaurus-preset/src/plugins/ai-crawling.js.
+#
+# Bots are listed explicitly even though "User-agent: *" already permits
+# them. The explicit list documents intent and makes a future audit a
+# one-line diff.
+User-agent: *
+Allow: /
+# AI search / citation crawlers
+User-agent: OAI-SearchBot
+Allow: /
+User-agent: ChatGPT-User
+Allow: /
+User-agent: Claude-SearchBot
+Allow: /
+User-agent: Claude-User
+Allow: /
+User-agent: PerplexityBot
+Allow: /
+User-agent: Perplexity-User
+Allow: /
+User-agent: DuckAssistBot
+Allow: /
+User-agent: MistralAI-User
+Allow: /
+User-agent: Amazonbot
+Allow: /
+User-agent: YouBot
+Allow: /
+# AI training crawlers
+User-agent: GPTBot
+Allow: /
+User-agent: ClaudeBot
+Allow: /
+User-agent: anthropic-ai
+Allow: /
+User-agent: Claude-Web
+Allow: /
+User-agent: CCBot
+Allow: /
+User-agent: cohere-training-data-crawler
+Allow: /
+User-agent: cohere-ai
+Allow: /
+User-agent: Meta-ExternalAgent
+Allow: /
+User-agent: FacebookBot
+Allow: /
+User-agent: Applebot-Extended
+Allow: /
+User-agent: Google-Extended
+Allow: /