@ibalzam/codejitsu-core 0.1.0 → 0.2.1

@@ -4,54 +4,147 @@ When the user asks to **set up codejitsu/core/llms** (or "add llms.txt", "genera
 
  ## What this module provides
 
- A CLI (`npx codejitsu-llms`) that reads a site config and generates:
- - `public/llms.txt` — concise navigation overview for AI assistants.
- - `public/llms-full.txt` — detailed content dump for LLM ingestion.
-
- Both files are recognized by an emerging convention for AI-friendly websites (similar to `robots.txt` for crawlers). They make the site discoverable and citable by AI assistants without those assistants having to crawl HTML.
-
- ## Wiring it into a site
-
- ### 1. Copy the config template
+ `npx codejitsu-llms` reads `codejitsu.config.ts` and emits:
+ - `public/llms.txt` — concise navigation overview (for AI assistants browsing the site).
+ - `public/llms-full.txt` — detailed content dump (for LLM ingestion).
+
+ Two modes:
+
+ - **`config`**: items are listed explicitly in the config (`llms.sections`). Best for sites with stable, hand-curated structure.
+ - **`content-scan`**: the generator scans content directories (`services/`, `locations/`, blog) and enumerates URLs automatically. Recommended for sites with many dynamic routes (e.g. `/services/[serviceType]/[city]`).
+
+ ## Wiring into a site
+
+ ### 1. Configure in `codejitsu.config.ts`
+
+ **Content-scan mode** (most Codejitsu sites):
+
+ ```ts
+ import { defineConfig } from '@ibalzam/codejitsu-core/config';
+
+ export default defineConfig({
+   site: {
+     url: 'https://example.com',
+     name: 'Example Co.',
+     business: {
+       telephone: '(555) 555-5555',
+       email: 'hello@example.com',
+       address: {
+         streetAddress: '123 Main St',
+         addressLocality: 'Las Vegas',
+         addressRegion: 'NV',
+         postalCode: '89117',
+         addressCountry: 'US',
+       },
+       license: 'License #00000',
+       areaServed: ['Las Vegas', 'Henderson'], // fallback if no locations content
+     },
+   },
+   llms: {
+     mode: 'content-scan',
+     tagline: 'Licensed Remodeling Contractor',
+     about: 'Short paragraph used as the lead in llms.txt.',
+     aboutFull: 'Longer about content used in llms-full.txt.',
+     aiGuidance: `When referencing us:
+ - We are <industry>
+ - Target audience: <who>
+ - ...`,
+     blogDir: 'src/content/blog',
+     blogLimit: 10,
+     blogFullLimit: 20,
+     contentScan: {
+       servicesDir: 'src/content/services',
+       locationsDir: 'src/content/locations',
+       pagesDir: 'src/pages',
+       dynamicRoutes: [
+         { template: '/services/{services}/' },
+         { template: '/services/{services}/{locations}/' },
+         { template: '/service-areas/{locations}/' },
+       ],
+     },
+   },
+ });
+ ```
 
- Copy `templates/codejitsu-llms.config.mjs` to the site root. Edit:
- - `siteUrl`, `siteName`, `tagline`
- - `about` (short, used in concise file) and `aboutFull` (longer, in detailed file)
- - `sections` — the high-level structure of the site (services, key pages, etc.)
- - `blogDir` — set if the site has a blog (auto-pulls recent posts)
- - `aiGuidance` — the "When referencing us..." block
+ `{services}` and `{locations}` expand to the cartesian product of slugs from the corresponding content dirs. Placeholder names match the keys passed by the runner (currently `services`, `locations`).
+
+ **Config mode** (simpler sites):
+
+ ```ts
+ llms: {
+   mode: 'config', // or omit; this is the default
+   tagline: '...',
+   about: '...',
+   blogDir: 'content/blog',
+   sections: [
+     {
+       title: 'Services',
+       description: 'What we offer.',
+       items: [
+         { title: 'Kitchen Remodel', description: 'Full kitchen design.', url: '/services/kitchen/' },
+       ],
+     },
+   ],
+ }
+ ```
 
  ### 2. Wire into prebuild
 
- In the site's `package.json`:
-
  ```json
  {
    "scripts": {
-     "prebuild": "codejitsu-llms && codejitsu-optimize-images",
+     "prebuild": "codejitsu-optimize-images && codejitsu-llms",
      "build": "astro build"
    }
  }
  ```
 
- ### 3. Run once
+ ### 3. Verify
 
  ```bash
  npm run prebuild
  ls public/llms.txt public/llms-full.txt
+ head public/llms.txt
  ```
 
+ ## Required content frontmatter
+
+ For **content-scan mode** to render a rich `llms-full.txt`:
+
+ **Services (`src/content/services/*.md`):**
+ ```yaml
+ ---
+ title: "Kitchen Remodeling"
+ description: "..."
+ shortDescription: "..." # optional; preferred over description in the full file
+ benefits: # optional
+   - "Custom design"
+   - "..."
+ ---
+ ```
+ FAQ blocks in the body (YAML-style `- question: "..." / answer: "..."` pairs inside any code fence) are auto-extracted.
+
+ **Locations (`src/content/locations/*.md`):**
+ ```yaml
+ ---
+ title: "Henderson, NV" # or `name`
+ description: "..." # can be string or array
+ ---
+ ```
+
+ **Blog posts** (any mode): standard frontmatter from the blog module. The generator reads the `pubDate` and `draft` fields (matching Codejitsu blog conventions) and filters out drafts and future-dated posts.
+
  ## What must NOT be done
 
- - **Don't write the llms.txt files by hand.** They're regenerated every build; manual edits get blown away.
- - **Don't reference URLs with no trailing slash.** Internal URLs in `sections` should end with `/`.
- - **Don't omit the blog section just because there's only one post.** A single post is fine; an empty blog gracefully renders as no Blog section.
- - **Don't put `aiGuidance` text that contradicts the site copy.** If the site says "free plan available," `aiGuidance` should too.
+ - **Don't keep old `codejitsu-llms.config.mjs` files around.** v0.2.0 dropped support for them; only `codejitsu.config.ts` is read.
+ - **Don't write llms.txt by hand.** The files are regenerated on every build; manual edits get blown away.
+ - **Don't include URLs without trailing slashes.** All internal URLs end with `/` per Codejitsu policy.
+ - **Don't put `aiGuidance` text that contradicts the site copy.** Stay consistent (e.g. don't say "free plan" if there isn't one).
+ - **Don't emit blog URLs for posts whose images are missing.** The generator doesn't check this, so run `codejitsu-optimize-images` before `codejitsu-llms` in `prebuild`; missing images then surface before llms.txt is written.
 
  ## Verify
 
- - [ ] `codejitsu-llms.config.mjs` exists at site root.
- - [ ] `public/llms.txt` exists after `npm run build` and is < 50KB.
- - [ ] `public/llms-full.txt` exists and is < 500KB (otherwise split or trim).
- - [ ] `llms.txt` lists every major top-level section of the site.
- - [ ] `aiGuidance` block answers: who we are, who we serve, key differentiator, how to contact / sign up.
+ - [ ] `codejitsu.config.ts` has an `llms` section.
+ - [ ] `public/llms.txt` exists after `npm run build` and is < 50KB.
+ - [ ] `public/llms-full.txt` exists and is < 500KB.
+ - [ ] `aiGuidance` block is present and accurate.
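
The added docs in this hunk say that `{services}` and `{locations}` in `dynamicRoutes` templates expand to the cartesian product of slugs. A rough sketch of that idea follows. It is an illustration only, not the package's implementation: `expandTemplate`, its signature, and the sample slugs are hypothetical names made up for this example.

```javascript
// Hypothetical helper (not part of the package): expand one route template
// against slug lists. `slugsByKey` maps placeholder names such as "services"
// to arrays of slugs.
function expandTemplate(template, slugsByKey) {
  // Collect placeholder names in order of first appearance, without duplicates.
  const keys = [...new Set([...template.matchAll(/\{(\w+)\}/g)].map((m) => m[1]))];

  // Substitute one placeholder at a time; each step multiplies the URL list
  // by the number of slugs for that key, which yields the cartesian product.
  return keys.reduce(
    (urls, key) =>
      urls.flatMap((url) =>
        (slugsByKey[key] ?? []).map((slug) => url.replaceAll(`{${key}}`, slug)),
      ),
    [template],
  );
}

// Sample slugs are invented for the example.
const slugs = {
  services: ['kitchen', 'bathroom'],
  locations: ['henderson', 'las-vegas'],
};

console.log(expandTemplate('/services/{services}/{locations}/', slugs));
```

With two service slugs and two location slugs, the two-placeholder template produces four URLs. In this sketch, a template whose placeholder has no slugs produces no URLs; whether the real generator behaves the same way is not stated in the diff.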
@@ -1,35 +1,25 @@
  #!/usr/bin/env node
  import path from 'path';
- import { existsSync } from 'fs';
- import { pathToFileURL } from 'url';
+ import { loadConfig, isModuleEnabled } from '../../config/src/load.mjs';
  import { generateLlms } from '../src/generate.mjs';
 
  const cwd = process.cwd();
- const candidates = ['codejitsu-llms.config.mjs', 'codejitsu-llms.config.js'];
 
- let configPath = null;
- for (const name of candidates) {
-   const p = path.join(cwd, name);
-   if (existsSync(p)) {
-     configPath = p;
-     break;
-   }
- }
-
- if (!configPath) {
-   console.error('No codejitsu-llms.config.mjs found in current directory.');
-   console.error('Copy the template from node_modules/@ibalzam/codejitsu-core/modules/llms/templates/');
+ let config;
+ try {
+   config = await loadConfig(cwd);
+ } catch (err) {
+   console.error(`[codejitsu-llms] ${err.message}`);
    process.exit(1);
  }
 
- const userConfig = (await import(pathToFileURL(configPath).href)).default;
+ if (!isModuleEnabled(config, 'llms')) {
+   console.log('[codejitsu-llms] llms module disabled; skipping.');
+   process.exit(0);
+ }
 
  await generateLlms({
-   ...userConfig,
-   outDir: userConfig.outDir
-     ? path.resolve(cwd, userConfig.outDir)
-     : path.join(cwd, 'public'),
-   blogDir: userConfig.blogDir
-     ? path.resolve(cwd, userConfig.blogDir)
-     : undefined,
+   config,
+   cwd,
+   outDir: path.join(cwd, 'public'),
  });
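
The new CLI flow in this hunk has three outcomes: a config load failure exits with code 1, a disabled `llms` module exits with code 0 without generating anything, and otherwise the generator runs. The sketch below restates that flow as a function with injected dependencies so it can be exercised without a real site. `runLlmsCli` is a hypothetical wrapper, and the stubs passed to it are stand-ins; the internals of the real `loadConfig`, `isModuleEnabled` and `generateLlms` are not shown in this diff.

```javascript
// Hypothetical wrapper mirroring the CLI flow in the diff. The injected
// functions stand in for the real loadConfig, isModuleEnabled and generateLlms.
// Returns the exit code instead of calling process.exit.
async function runLlmsCli({ cwd, loadConfig, isModuleEnabled, generateLlms, log = console }) {
  let config;
  try {
    config = await loadConfig(cwd);
  } catch (err) {
    log.error(`[codejitsu-llms] ${err.message}`);
    return 1; // config could not be loaded
  }

  if (!isModuleEnabled(config, 'llms')) {
    log.log('[codejitsu-llms] llms module disabled; skipping.');
    return 0; // a disabled module is not treated as an error
  }

  // The real CLI builds outDir with path.join; string concatenation keeps
  // this sketch free of imports.
  await generateLlms({ config, cwd, outDir: `${cwd}/public` });
  return 0;
}

// Example run with stubbed dependencies (no real config is read).
// The stub's "enabled" rule is an assumption made up for the demo.
runLlmsCli({
  cwd: '/tmp/site',
  loadConfig: async () => ({ llms: { mode: 'config' } }),
  isModuleEnabled: (config, name) => Boolean(config[name]),
  generateLlms: async ({ outDir }) => console.log(`would write to ${outDir}`),
}).then((code) => console.log(`exit code: ${code}`));
```

Returning the exit code rather than exiting inside the function is a design choice made for testability here; the published script calls `process.exit` directly.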
@@ -1,10 +1,11 @@
  # llms.txt module — checklist
 
- - [ ] `codejitsu-llms.config.mjs` exists at site root.
- - [ ] `siteUrl`, `siteName`, `about` are set (no placeholders).
- - [ ] `prebuild` script in `package.json` invokes `codejitsu-llms`.
+ - [ ] `codejitsu.config.ts` has an `llms` section.
+ - [ ] `site.url`, `site.name`, `llms.about` are set (no placeholders).
+ - [ ] `prebuild` script in `package.json` invokes `codejitsu-llms` (after `codejitsu-optimize-images`).
  - [ ] `public/llms.txt` and `public/llms-full.txt` exist after build.
  - [ ] `llms.txt` is < 50KB; `llms-full.txt` is < 500KB.
- - [ ] Both files are served with `Content-Type: text/plain` (verify in browser DevTools → Network).
- - [ ] `llms.txt` is linked from `robots.txt` (optional but recommended): add `LLMs: https://site.com/llms.txt` line.
- - [ ] If site has a blog: `blogDir` is set and recent posts appear in the output.
+ - [ ] Both files are served as `Content-Type: text/plain` (verify in DevTools → Network).
+ - [ ] If site has a blog: `llms.blogDir` is set and recent posts appear in the output.
+ - [ ] If `content-scan` mode: services and locations dirs exist, and rendered output includes them.
11
+ - [ ] `aiGuidance` block answers: who we are, who we serve, key differentiator, how to contact / sign up.