codebeacon 0.3.3__tar.gz → 0.4.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {codebeacon-0.3.3 → codebeacon-0.4.0}/PKG-INFO +33 -21
- {codebeacon-0.3.3 → codebeacon-0.4.0}/README.de.md +23 -17
- {codebeacon-0.3.3 → codebeacon-0.4.0}/README.es.md +23 -17
- {codebeacon-0.3.3 → codebeacon-0.4.0}/README.fr.md +23 -17
- {codebeacon-0.3.3 → codebeacon-0.4.0}/README.ja.md +22 -18
- {codebeacon-0.3.3 → codebeacon-0.4.0}/README.ko.md +31 -21
- {codebeacon-0.3.3 → codebeacon-0.4.0}/README.md +32 -20
- {codebeacon-0.3.3 → codebeacon-0.4.0}/README.pt-BR.md +23 -17
- {codebeacon-0.3.3 → codebeacon-0.4.0}/README.zh-CN.md +20 -18
- codebeacon-0.4.0/codebeacon/__init__.py +1 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/cli.py +16 -6
- codebeacon-0.4.0/codebeacon/semantic_pipeline.py +816 -0
- codebeacon-0.4.0/codebeacon/skill/SKILL.md +312 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/pyproject.toml +1 -1
- codebeacon-0.4.0/tests/test_semantic.py +191 -0
- codebeacon-0.3.3/codebeacon/__init__.py +0 -1
- codebeacon-0.3.3/codebeacon/semantic_pipeline.py +0 -457
- codebeacon-0.3.3/codebeacon/skill/SKILL.md +0 -190
- {codebeacon-0.3.3 → codebeacon-0.4.0}/.cursorrules +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/.github/CODEOWNERS +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/.github/dependabot.yml +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/.github/workflows/ci.yml +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/.github/workflows/release.yml +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/.gitignore +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/AGENTS.md +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/CLAUDE.md +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/LICENSE +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/__main__.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/cache.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/common/__init__.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/common/filters.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/common/safety.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/common/symbols.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/common/types.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/config.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/contextmap/__init__.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/contextmap/generator.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/discover/__init__.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/discover/detector.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/discover/ignore.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/discover/scanner.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/export/__init__.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/export/callflow_html.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/export/hooks.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/export/mcp.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/export/merge.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/export/obsidian.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/export/tree_html.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/__init__.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/base.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/components.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/dependencies.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/entities.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/README.md +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/actix.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/angular.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/aspnet.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/django.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/express.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/fastapi.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/flask.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/gin.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/ktor.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/laravel.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/nestjs.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/rails.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/react.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/spring_boot.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/svelte.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/tauri.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/vapor.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/queries/vue.scm +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/routes.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/semantic.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/extract/services.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/graph/__init__.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/graph/analyze.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/graph/build.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/graph/cluster.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/graph/enrich.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/graph/write.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/plugins/__init__.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/plugins/githooks.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/plugins/skills.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/wave.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/wiki/__init__.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/wiki/generator.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/wiki/index.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon/wiki/templates.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/codebeacon.yaml.example +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/docs/TRANSLATION_STATUS.md +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/public-plan.md +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/skill/install.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/__init__.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/conftest.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/actix/main.rs +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/angular/app.component.ts +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/aspnet/UserController.cs +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/django/views.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/express/userRouter.js +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/fastapi/main.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/flask/app.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/gin/main.go +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/ktor/UserRoutes.kt +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/laravel/UserController.php +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/nestjs/user.controller.ts +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/rails/users_controller.rb +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/react/UserPage.tsx +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/spring_boot/UserController.java +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/sveltekit/+page.svelte +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/vapor/routes.swift +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/fixtures/vue/UserList.vue +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/test_discover.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/test_entities.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/test_filters.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/test_graph.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/test_plugins.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/test_resolve.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/test_routes.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/test_safety_and_writes.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/test_services.py +0 -0
- {codebeacon-0.3.3 → codebeacon-0.4.0}/tests/test_wiki.py +0 -0
```diff
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: codebeacon
-Version: 0.3.3
+Version: 0.4.0
 Summary: Source code AST analysis tool for AI context generation — unified multi-framework knowledge graph
 Project-URL: Homepage, https://github.com/codebeacon/codebeacon
 Project-URL: Repository, https://github.com/codebeacon/codebeacon
```
```diff
@@ -97,6 +97,7 @@ Existing tools solve this partially. Route analyzers map your controllers but mi
 - **Zero configuration** — auto-detects frameworks and languages; generates `codebeacon.yaml` for repeat runs
 - **Deep-dive mode** — `--deep-dive` generates per-project `.codebeacon/` + `CLAUDE.md` for every sub-project; running `codebeacon scan . --update` from any sub-project folder automatically syncs all projects in the workspace
 - **Workspace auto-rediscovery** — on every `scan` / `sync`, codebeacon re-scans the workspace and appends any new project folders to `codebeacon.yaml` before extraction, so freshly added sub-projects are never silently skipped; pass `--no-rediscover` to opt out for hand-curated configs
+- **Graphify-style semantic enrichment** — after AST extraction, the skill dispatches one parallel subagent per chunk to emit `{nodes, edges, hyperedges}` with 8 relation types (`calls`/`implements`/`references`/`cites`/`conceptually_related_to`/`shares_data_with`/`semantically_similar_to`/`rationale_for`) and EXTRACTED/INFERRED/AMBIGUOUS confidence; on Claude Code the subagent runs one tier below the host model (Opus→Sonnet, Sonnet→Haiku) so spend stays proportional to corpus size. AST owns code nodes; the LLM only contributes `concept`/`document`/`paper` nodes. Existing 0.3.x archives replay through the new schema unchanged.
 
 ---
 
```
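A hedged sketch of what one emitted graph fragment might look like. The relation types, the EXTRACTED/INFERRED/AMBIGUOUS confidence labels, and the AST-vs-LLM node ownership split come from the feature description; the exact field names and node-id scheme below are assumptions for illustration only.

```python
# Hypothetical shape of one {nodes, edges, hyperedges} fragment.
# Code nodes come from the AST; the LLM may only contribute
# concept/document/paper nodes (ownership rule from the changelog).
fragment = {
    "nodes": [
        # LLM-contributed node (id scheme and "label" field are assumed)
        {"id": "concept:rate-limiting", "type": "concept", "label": "Rate limiting"},
    ],
    "edges": [
        {
            "source": "file:api/middleware.py",     # AST-owned code node (assumed id)
            "target": "concept:rate-limiting",
            "relation": "conceptually_related_to",  # one of the 8 documented relations
            "confidence": "INFERRED",               # EXTRACTED | INFERRED | AMBIGUOUS
        },
    ],
    "hyperedges": [],
}

# The 8 relation types listed in the feature bullet.
RELATIONS = {
    "calls", "implements", "references", "cites",
    "conceptually_related_to", "shares_data_with",
    "semantically_similar_to", "rationale_for",
}
```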
````diff
@@ -185,11 +186,14 @@ project-root/
   components/<Name>.md
   obsidian/            ← Obsidian vault (one note per graph node)
   semantic/
-
-
-
-
-
+    pending/           ← prepare writes chunk_NNN.jsonl here (≤ --chunk-size tasks each)
+      chunk_001.jsonl
+      chunk_002.jsonl
+    results/           ← agent writes a matching chunk_NNN.jsonl per pending file
+      chunk_001.jsonl
+    original/          ← apply moves done chunks here (durable archive)
+      chunk_001.jsonl
+      chunk_002.jsonl  ← (older runs accumulate; chunk numbers are monotonic)
 ```
 
 ### Deep Dive Mode
````
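The monotonic chunk numbering implied by the tree above (numbers continue past everything already in `pending/` and `original/`, so a fresh prepare run never collides with the durable archive) can be sketched as follows. `next_chunk_number` is a hypothetical helper for illustration, not codebeacon's actual code.

```python
import re
from pathlib import Path

def next_chunk_number(semantic_dir: Path) -> int:
    """Return the next free chunk_NNN number, scanning both the
    pending/ queue and the durable original/ archive so numbering
    stays monotonic across runs (assumed behaviour)."""
    pat = re.compile(r"chunk_(\d+)\.jsonl$")
    used = [
        int(m.group(1))
        for sub in ("pending", "original")
        for p in (semantic_dir / sub).glob("chunk_*.jsonl")
        if (m := pat.search(p.name))
    ]
    return max(used, default=0) + 1
```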
```diff
@@ -356,14 +360,21 @@ codebeacon sync --config <file> # use a specific config file
 codebeacon sync --no-rediscover # don't auto-append newly added projects (hand-curated yaml mode)
 
 # AI-semantic enrichment (the agent does the LLM work, codebeacon does the bookkeeping)
-codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N]
-    # rehydrate
-    #
-    #
+codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N] [--chunk-size N]
+    # rehydrate archive (.codebeacon/semantic/original/*.jsonl) onto
+    # the fresh graph, prune entries pointing at missing nodes,
+    # then emit every NEW candidate (god folders + hub files +
+    # unresolved targets) into .codebeacon/semantic/pending/
+    # chunk_NNN.jsonl (--chunk-size tasks per file, default 10).
+    # `--max-tasks` is an optional cap (0 = no cap = emit all).
+    # task_id includes a content hash, so a file whose semantic
+    # content changes between scans is automatically re-emitted.
 codebeacon semantic-apply [--dir .codebeacon]
-    #
-    #
-    #
+    # for each .codebeacon/semantic/results/chunk_NNN.jsonl the
+    # agent has written, merge edges (INFERRED references) into
+    # beacon.json and MOVE the pending chunk into
+    # .codebeacon/semantic/original/chunk_NNN.jsonl (durable
+    # archive). Regenerates wiki/obsidian/context map.
 
 # Query the knowledge graph
 codebeacon query <term> [--dir .codebeacon] [--limit N] # search nodes by label substring
```
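The content-hash behaviour mentioned in the `semantic-prepare` comments can be illustrated with a small sketch built around the documented recipe `SHA1(file_path | node_id | excerpt_hash[:8])`. The `|` delimiter and UTF-8 encoding are assumptions; the point is only that a changed excerpt yields a new id, so the file is re-emitted on the next prepare.

```python
import hashlib

def task_id(file_path: str, node_id: str, excerpt: str) -> str:
    """Content-addressed task id: SHA1 over file_path, node_id and the
    first 8 hex chars of the excerpt's SHA1 (delimiter/encoding assumed)."""
    excerpt_hash = hashlib.sha1(excerpt.encode("utf-8")).hexdigest()[:8]
    key = f"{file_path}|{node_id}|{excerpt_hash}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()
```

Same inputs are stable across runs; any change to the excerpt produces a different id, which is what makes re-emission automatic.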
````diff
@@ -398,22 +409,23 @@ The CLI itself never makes an LLM API call. The AI-semantic layer is intentional
 When you invoke `/codebeacon` in Claude Code:
 
 1. `scan` / `sync` builds `beacon.json` from the AST (no LLM).
-2. `codebeacon semantic-prepare`
-3. The skill
-4. `codebeacon semantic-apply` merges the results as `INFERRED references` edges into `beacon.json
-5. Next scan: `semantic-prepare`
+2. `codebeacon semantic-prepare` rehydrates the archive at `.codebeacon/semantic/original/*.jsonl` onto the fresh graph, **prunes** archive entries whose source node no longer exists, and writes new task chunks to `.codebeacon/semantic/pending/chunk_NNN.jsonl` (≤ `--chunk-size` tasks per file, default 10). Chunk numbers continue from where the durable archive left off, so they never collide.
+3. The skill iterates the pending chunks **one chunk at a time**. For each `pending/chunk_NNN.jsonl`, the agent (using its current model) reads each task's `excerpt` and writes a matching `semantic/results/chunk_NNN.jsonl`.
+4. `codebeacon semantic-apply` merges the results as `INFERRED references` edges into `beacon.json` and **moves** each finished `pending/chunk_NNN.jsonl` into `semantic/original/chunk_NNN.jsonl` (with the applied edges spliced in for auditability). Result files are deleted; wiki + obsidian + context map regenerated.
+5. Next scan: `semantic-prepare` reads every chunk under `original/`, applies their edges to the freshly built graph (so historical inferences don't disappear), and skips any task whose `task_id` is already on file. `task_id` is `SHA1(file_path | node_id | excerpt_hash[:8])` — a file whose semantic content changes earns a new id and gets re-analysed automatically.
 
-This gives you incremental, idempotent enrichment: the agent never re-
+This gives you incremental, idempotent enrichment: the agent never re-analyses the same `(file, content)` twice, accumulated AI signal survives every rescan, and chunked files keep the agent's working set small.
 
 ### Direct CLI usage
 
-If you're not running through the skill (e.g. CI), you can drive the same two commands manually and supply your own `
+If you're not running through the skill (e.g. CI), you can drive the same two commands manually and supply your own `results/chunk_NNN.jsonl` files:
 
 ```bash
 codebeacon scan .
-codebeacon semantic-prepare --dir .codebeacon --max-tasks 50
+codebeacon semantic-prepare --dir .codebeacon --max-tasks 50 --chunk-size 10
 
-#
+# .codebeacon/semantic/pending/chunk_001.jsonl ... now exist.
+# For each pending chunk, write a matching results/chunk_NNN.jsonl. Each line:
 # {"task_id":"...", "source_node_id":"...", "edges":[
 # {"target_name":"UserService","relation":"references","confidence_score":0.7}
 # ]}
````
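For the CI path above, a minimal driver that answers each pending chunk with a results file in the documented line format might look like this. The placeholder edge, and the assumption that every pending task line carries `task_id` and `source_node_id` fields, are illustrative only; a real pipeline would derive edges from each task's `excerpt`.

```python
import json
from pathlib import Path

def write_results_for_pending(semantic_dir: Path) -> None:
    """For every pending/chunk_NNN.jsonl, emit a matching
    results/chunk_NNN.jsonl whose lines follow the documented
    {"task_id", "source_node_id", "edges": [...]} shape."""
    results = semantic_dir / "results"
    results.mkdir(exist_ok=True)
    for pending in sorted((semantic_dir / "pending").glob("chunk_*.jsonl")):
        lines = []
        for raw in pending.read_text().splitlines():
            task = json.loads(raw)
            lines.append(json.dumps({
                "task_id": task["task_id"],
                "source_node_id": task["source_node_id"],
                "edges": [
                    # Placeholder edge; a real run infers these per excerpt.
                    {"target_name": "UserService",
                     "relation": "references",
                     "confidence_score": 0.7},
                ],
            }))
        (results / pending.name).write_text("\n".join(lines) + "\n")
```

After this runs, `codebeacon semantic-apply` would pick up the results files as described above.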
````diff
@@ -56,6 +56,7 @@ Existing tools solve this problem only partially. Route analyzers capture your
 - **Zero configuration** — detects frameworks and languages automatically; generates `codebeacon.yaml` for follow-up runs
 - **Deep-dive mode** — `--deep-dive` creates a dedicated `.codebeacon/` + `CLAUDE.md` for every sub-project; an update call from **any** sub-project folder automatically syncs all projects in the workspace
 - **Automatic workspace rediscovery** — on every `scan`/`sync`, codebeacon re-scans the workspace and automatically appends new projects to `codebeacon.yaml` before extraction, so freshly added sub-projects are not skipped unnoticed; `--no-rediscover` disables this for hand-curated configurations
+- **Graphify-style semantic enrichment** — after AST extraction, the skill dispatches one parallel subagent per chunk that produces complete knowledge-graph fragments `{nodes, edges, hyperedges}` with 8 relation types (`calls`/`implements`/`references`/`cites`/`conceptually_related_to`/`shares_data_with`/`semantically_similar_to`/`rationale_for`) and EXTRACTED/INFERRED/AMBIGUOUS confidence; on Claude Code the subagent runs one tier below the host model (Opus→Sonnet, Sonnet→Haiku) so that costs stay proportional to corpus size. Code nodes belong to the AST; the LLM may only contribute `concept`/`document`/`paper` nodes. Existing 0.3.x archives are replayed unchanged under the new schema
 
 ---
 
@@ -378,16 +379,19 @@ codebeacon hook install [path] # merge driver + post-commit increment
 codebeacon merge-driver <base> <cur> <other> # invoked by git after `hook install`; union-merge of beacon.json
 
 # AI-semantic enrichment (the agent does the LLM work, codebeacon only the bookkeeping)
-codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N]
-    #
-    # beacon.json
-    #
+codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N] [--chunk-size N]
+    # rehydrates .codebeacon/semantic/original/*.jsonl onto the
+    # fresh beacon.json + removes entries with vanished
+    # nodes, then writes new tasks to
+    # .codebeacon/semantic/pending/chunk_NNN.jsonl
+    # (--chunk-size per chunk, default 10). task_id contains a
+    # content hash – changed files are re-emitted.
 codebeacon semantic-apply [--dir .codebeacon]
-    #
-    # INFERRED references edges into
-    #
-    #
-    #
+    # for each .codebeacon/semantic/
+    # results/chunk_NNN.jsonl written by the agent: merge
+    # INFERRED references edges into beacon.json + MOVE the
+    # pending chunk to .codebeacon/semantic/original/chunk_NNN.jsonl
+    # (durable archive). Delete results, regenerate everything.
 
 codebeacon serve [--dir .codebeacon] # start MCP server (stdio)
 codebeacon install # install the Claude Code skill
@@ -413,22 +417,24 @@ The CLI itself **never calls an LLM provider**. The AI-semantic layer is
 When you invoke `/codebeacon` in Claude Code:
 
 1. `scan` / `sync` builds `beacon.json` from the AST (no LLM call).
-2. `codebeacon semantic-prepare`
-3. The skill
-4. `codebeacon semantic-apply` merges the results as `INFERRED references` edges into `beacon.json
-5. On the next scan: `semantic-prepare`
+2. `codebeacon semantic-prepare` rehydrates the archive under `.codebeacon/semantic/original/*.jsonl` onto the fresh graph and **removes** entries whose source node no longer exists. It then writes new tasks to `.codebeacon/semantic/pending/chunk_NNN.jsonl` (≤ `--chunk-size` per file, default 10). Chunk numbers pick up exactly where the durable archive left off — no collisions possible.
+3. The skill processes pending chunks **one at a time**. For each `pending/chunk_NNN.jsonl`, the agent (with the current session's model) reads each task's `excerpt` and writes a `semantic/results/chunk_NNN.jsonl` of the same name.
+4. `codebeacon semantic-apply` merges the results as `INFERRED references` edges into `beacon.json` and **moves** each finished `pending/chunk_NNN.jsonl` to **`semantic/original/chunk_NNN.jsonl`** (with the applied edges included for auditability). The result files are deleted; wiki + obsidian + context map are regenerated.
+5. On the next scan: `semantic-prepare` reads every chunk under `original/`, applies its edges to the freshly built graph (historical inferences are preserved) and skips every task whose `task_id` is already archived. `task_id` = `SHA1(file_path | node_id | excerpt_hash[:8])` — if a file's semantic content changes, it gets a new id and is re-analysed.
 
-
+Incremental, idempotent enrichment: the agent never analyses the same (file, content) combination twice, the accumulated AI signal survives every rescan, and the chunking keeps the agent's working set small.
 
 ### Direct CLI usage
 
-If you are not going through the skill (e.g. CI), you can run the same two commands manually and
+If you are not going through the skill (e.g. CI), you can run the same two commands manually and supply your own `results/chunk_NNN.jsonl`:
 
 ```bash
 codebeacon scan .
-codebeacon semantic-prepare --dir .codebeacon --max-tasks 50
+codebeacon semantic-prepare --dir .codebeacon --max-tasks 50 --chunk-size 10
 
-#
+# .codebeacon/semantic/pending/chunk_001.jsonl ... now exist.
+# For each pending chunk, write a results/chunk_NNN.jsonl of the same name.
+# Each line:
 # {"task_id":"...", "source_node_id":"...", "edges":[
 # {"target_name":"UserService","relation":"references","confidence_score":0.7}
 # ]}
````
````diff
@@ -56,6 +56,7 @@ Existing tools solve this partially. Route analyzers
 - **Zero configuration** — detects frameworks and languages automatically; generates `codebeacon.yaml` for later runs
 - **Deep Dive mode** — `--deep-dive` generates a dedicated `.codebeacon/` + `CLAUDE.md` for each sub-project; running the update command from **any** sub-project automatically syncs all projects in the workspace
 - **Workspace auto-rediscovery** — on every `scan`/`sync`, codebeacon re-scans the workspace and automatically adds new projects to `codebeacon.yaml` before extraction, so freshly added sub-projects are never silently skipped; use `--no-rediscover` to opt into hand-curated configuration mode
+- **Graphify-style semantic enrichment** — after AST extraction, the skill dispatches one parallel subagent per chunk to emit complete graph fragments `{nodes, edges, hyperedges}` with 8 relation types (`calls`/`implements`/`references`/`cites`/`conceptually_related_to`/`shares_data_with`/`semantically_similar_to`/`rationale_for`) and EXTRACTED/INFERRED/AMBIGUOUS confidence; on Claude Code the subagent runs one level below the host model (Opus→Sonnet, Sonnet→Haiku) to keep spend proportional to corpus size. The AST owns the code nodes; the LLM may only contribute `concept`/`document`/`paper` nodes. Existing 0.3.x archives are replayed under the new schema unchanged
 
 ---
 
@@ -376,16 +377,19 @@ codebeacon hook install [path] # installs merge driver + post-com
 codebeacon merge-driver <base> <cur> <other> # invoked by git after `hook install`; union-merge of beacon.json
 
 # AI-semantic enrichment (the agent runs the LLM, codebeacon keeps the books)
-codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N]
-    # rehydrates
-    #
-    #
+codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N] [--chunk-size N]
+    # rehydrates .codebeacon/semantic/original/*.jsonl onto the
+    # new beacon.json + prunes entries pointing at vanished
+    # nodes, then writes tasks to
+    # .codebeacon/semantic/pending/chunk_NNN.jsonl
+    # (--chunk-size per chunk, default 10). The task_id includes
+    # a content hash: if the file changes, it is re-emitted.
 codebeacon semantic-apply [--dir .codebeacon]
-    #
-    #
-    #
-    #
-    #
+    # for each .codebeacon/semantic/results/chunk_NNN.jsonl the
+    # agent has written, merges the INFERRED
+    # references edges into beacon.json and MOVES the pending chunk to
+    # .codebeacon/semantic/original/chunk_NNN.jsonl (durable
+    # archive). Deletes the results and regenerates everything.
 
 codebeacon serve [--dir .codebeacon] # MCP server (stdio)
 codebeacon install # install the Claude Code skill
@@ -411,22 +415,24 @@ The CLI itself **never calls an LLM**. The AI-semantic layer is
 When you invoke `/codebeacon` in Claude Code:
 
 1. `scan` / `sync` builds `beacon.json` from the AST (no LLM).
-2. `codebeacon semantic-prepare`
-3. The skill iterates
-4. `codebeacon semantic-apply` merges the results as `INFERRED references` edges into `beacon.json
-5. On the
+2. `codebeacon semantic-prepare` rehydrates the archive at `.codebeacon/semantic/original/*.jsonl` onto the new graph and **prunes** entries pointing at nodes that no longer exist. It then writes the new task chunks to `.codebeacon/semantic/pending/chunk_NNN.jsonl` (each chunk ≤ `--chunk-size`, default 10). Chunk numbering continues where the durable archive left off, so it never collides.
+3. The skill iterates the pending chunks **one by one**. For each `pending/chunk_NNN.jsonl`, the agent (with its current session's model) reads each task's `excerpt` and writes a `semantic/results/chunk_NNN.jsonl` of the same name.
+4. `codebeacon semantic-apply` merges the results as `INFERRED references` edges into `beacon.json` and **moves** each finished `pending/chunk_NNN.jsonl` to **`semantic/original/chunk_NNN.jsonl`** (with the applied edges included for auditing). The result files are removed; wiki + obsidian + context map are regenerated.
+5. On the next run: `semantic-prepare` reads every chunk under `original/`, applies its edges to the freshly built graph (historical inferences are not lost) and skips any task whose `task_id` is already archived. `task_id` = `SHA1(file_path | node_id | excerpt_hash[:8])`: if the file's content changes, it receives a new id and is re-analysed.
 
-
+Incremental, idempotent enrichment: the agent never re-analyses the same (file, content) combination twice, the accumulated AI signal survives every re-scan, and the chunks keep the agent's working set small.
 
 ### Direct CLI usage
 
-If you don't use the skill (e.g. in CI), you can run the same two commands manually and
+If you don't use the skill (e.g. in CI), you can run the same two commands manually and provide your own `results/chunk_NNN.jsonl`:
 
 ```bash
 codebeacon scan .
-codebeacon semantic-prepare --dir .codebeacon --max-tasks 50
+codebeacon semantic-prepare --dir .codebeacon --max-tasks 50 --chunk-size 10
 
-#
+# .codebeacon/semantic/pending/chunk_001.jsonl ... now exist.
+# For each pending chunk, write a results/chunk_NNN.jsonl with the same
+# name. Each line:
 # {"task_id":"...", "source_node_id":"...", "edges":[
 # {"target_name":"UserService","relation":"references","confidence_score":0.7}
 # ]}
````
@@ -56,6 +56,7 @@ Les outils existants ne résolvent ce problème qu'en partie. Les analyseurs de
|
|
|
56
56
|
- **Zéro configuration** — détecte automatiquement les frameworks et langages ; génère `codebeacon.yaml` pour les exécutions suivantes
|
|
57
57
|
- **Mode Deep Dive** — `--deep-dive` génère un `.codebeacon/` + `CLAUDE.md` propre à chaque sous-projet ; une commande de mise à jour depuis **n'importe quel** sous-projet synchronise automatiquement tous les projets du workspace
|
|
58
58
|
- **Redécouverte automatique du workspace** — à chaque `scan`/`sync`, codebeacon réanalyse le workspace et ajoute automatiquement les nouveaux projets au `codebeacon.yaml` avant l'extraction, de sorte que les sous-projets fraîchement ajoutés ne soient jamais oubliés en silence ; utilisez `--no-rediscover` pour conserver une configuration yaml gérée manuellement
|
|
59
|
+
- **Enrichissement sémantique façon Graphify** — après l'extraction AST, le skill dispatche un sous-agent parallèle par chunk pour émettre des fragments complets de knowledge graph `{nodes, edges, hyperedges}` avec 8 types de relations (`calls`/`implements`/`references`/`cites`/`conceptually_related_to`/`shares_data_with`/`semantically_similar_to`/`rationale_for`) et confiance EXTRACTED/INFERRED/AMBIGUOUS ; sur Claude Code, le sous-agent s'exécute un cran sous le modèle hôte (Opus→Sonnet, Sonnet→Haiku) pour garder le coût proportionnel à la taille du corpus. L'AST possède les nœuds de code ; le LLM ne peut contribuer que des nœuds `concept`/`document`/`paper`. Les archives 0.3.x existantes sont rejouées sous le nouveau schéma sans modification
|
|
59
60
|
|
|
60
61
|
---
|
|
61
62
|
|
|
@@ -377,16 +378,19 @@ codebeacon hook install [path] # install merge driver + post-commit hook
 codebeacon merge-driver <base> <cur> <other> # invoked by git after `hook install`; union-merges beacon.json

 # AI-semantic enrichment (the LLM is run by the agent, codebeacon keeps the books)
-codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N]
-                 #
-                 #
-                 #
+codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N] [--chunk-size N]
+                 # rehydrates .codebeacon/semantic/original/*.jsonl onto the
+                 # fresh beacon.json and prunes entries pointing at
+                 # vanished nodes, then writes the new tasks to
+                 # .codebeacon/semantic/pending/chunk_NNN.jsonl
+                 # (--chunk-size per chunk, default 10). task_id embeds a
+                 # content hash: a modified file is re-emitted.
 codebeacon semantic-apply [--dir .codebeacon]
-                 #
-                 # INFERRED references edges
-                 #
-                 #
-                 #
+                 # for each .codebeacon/semantic/results/chunk_NNN.jsonl
+                 # written by the agent, merges the INFERRED references edges
+                 # into beacon.json and MOVES the pending chunk to
+                 # .codebeacon/semantic/original/chunk_NNN.jsonl (durable
+                 # archive). Deletes the results, regenerates everything.

 codebeacon serve [--dir .codebeacon] # MCP server (stdio)
 codebeacon install # install the Claude Code skill
@@ -412,22 +416,24 @@ The CLI itself **never calls an LLM**. The AI-semantic layer is
 When you invoke `/codebeacon` in Claude Code:

 1. `scan` / `sync` builds `beacon.json` from the AST (no LLM calls).
-2. `codebeacon semantic-prepare`
-3. The skill iterates
-4. `codebeacon semantic-apply` merges the results as `INFERRED references` edges into `beacon.json`
-5. On the next scan: `semantic-prepare`
+2. `codebeacon semantic-prepare` rehydrates the archive under `.codebeacon/semantic/original/*.jsonl` onto the fresh graph and **prunes** entries whose source node has disappeared. It then writes the new tasks to `.codebeacon/semantic/pending/chunk_NNN.jsonl` (at most `--chunk-size` per file, default 10). Chunk numbering resumes where the durable archive left off, so collisions are impossible.
+3. The skill iterates over the pending chunks **one at a time**. For each `pending/chunk_NNN.jsonl`, the agent (with its current session model) reads each task's `excerpt` and writes a `semantic/results/chunk_NNN.jsonl` of the same name.
+4. `codebeacon semantic-apply` merges the results as `INFERRED references` edges into `beacon.json` and **moves** each finished `pending/chunk_NNN.jsonl` to **`semantic/original/chunk_NNN.jsonl`** (the applied edges are included for auditability). Result files are deleted; wiki + obsidian + context map are regenerated.
+5. On the next scan: `semantic-prepare` reads each chunk under `original/`, applies its edges to the freshly built graph (historical inferences are preserved) and skips any task whose `task_id` is already archived. `task_id` = `SHA1(file_path | node_id | excerpt_hash[:8])`: if a file's semantic content changes, it gets a new id and is re-analyzed.

-
+Incremental, idempotent enrichment: the agent never re-analyzes the same (file, content) pair twice, the accumulated AI signal survives every re-scan, and chunks keep the agent's working set small.

 ### Direct CLI usage

-If you are not using the skill (e.g. in CI), you can drive the same two commands manually and supply
+If you are not using the skill (e.g. in CI), you can drive the same two commands manually and supply your own `results/chunk_NNN.jsonl`:

 ```bash
 codebeacon scan .
-codebeacon semantic-prepare --dir .codebeacon --max-tasks 50
+codebeacon semantic-prepare --dir .codebeacon --max-tasks 50 --chunk-size 10

-#
+# .codebeacon/semantic/pending/chunk_001.jsonl ... now exist.
+# For each pending chunk, write a results/chunk_NNN.jsonl of the same name.
+# Each line:
 # {"task_id":"...", "source_node_id":"...", "edges":[
 # {"target_name":"UserService","relation":"references","confidence_score":0.7}
 # ]}
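The `task_id` recipe and the results-line shape described in the hunk above can be sketched in Python. This is a hedged illustration only: the exact separator and encoding are assumptions, and the authoritative logic lives in `codebeacon/semantic_pipeline.py`.

```python
import hashlib
import json

def task_id(file_path: str, node_id: str, excerpt: str) -> str:
    """Sketch of task_id = SHA1(file_path | node_id | excerpt_hash[:8])."""
    # Hash the excerpt first and keep 8 hex chars: a file whose semantic
    # content changes produces a different excerpt_hash, hence a new id.
    excerpt_hash = hashlib.sha1(excerpt.encode("utf-8")).hexdigest()[:8]
    raw = "|".join((file_path, node_id, excerpt_hash))
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()

# One line an agent would append to results/chunk_001.jsonl
# (the node id format here is hypothetical):
result_line = json.dumps({
    "task_id": task_id("src/services/user.py", "node:UserService",
                       "class UserService:\n    def get(self, uid): ..."),
    "source_node_id": "node:UserService",
    "edges": [
        {"target_name": "UserService", "relation": "references",
         "confidence_score": 0.7},
    ],
})
```

Because the id is content-addressed, re-running prepare on an unchanged file reproduces the same `task_id`, which is exactly what lets the archive skip it.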
@@ -56,6 +56,7 @@ Every time you open a new AI coding session, the assistant
 - **Zero config** — auto-detects frameworks and languages; auto-generates `codebeacon.yaml` for repeat runs
 - **Deep-dive mode** — `--deep-dive` generates a dedicated `.codebeacon/` + `CLAUDE.md` per sub-project; run the update command **from any sub-project** and the whole workspace syncs automatically
 - **Automatic workspace rediscovery** — re-scans the workspace on every `scan`/`sync` run and automatically adds projects missing from `codebeacon.yaml` before extraction starts, so newly added sub-projects are never overlooked; opt out with `--no-rediscover` if you curate the yaml by hand
+- **Graphify-style semantic enrichment** — after AST extraction, the skill dispatches one subagent per chunk in parallel to extract full knowledge-graph fragments `{nodes, edges, hyperedges}`. Supports 8 relation types (`calls`/`implements`/`references`/`cites`/`conceptually_related_to`/`shares_data_with`/`semantically_similar_to`/`rationale_for`) plus 3 confidence levels (EXTRACTED/INFERRED/AMBIGUOUS). On Claude Code the subagent is automatically downgraded one tier below the host model (Opus→Sonnet, Sonnet→Haiku), keeping cost proportional to corpus size. The AST is responsible for code nodes; the LLM may only contribute `concept`/`document`/`paper` nodes. Existing 0.3.x archives are transparently replayed under the new schema

 ---
@@ -285,17 +286,19 @@ codebeacon hook install [path] # install merge driver + post-commit incr
 codebeacon merge-driver <base> <cur> <other> # called by git after `hook install`; union-merges beacon.json

 # AI-semantic enrichment (the agent runs the LLM, codebeacon manages consistency)
-codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N]
-                 #
-                 #
-                 #
-                 # written out
+codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N] [--chunk-size N]
+                 # re-applies the .codebeacon/semantic/original/*.jsonl archive
+                 # to the fresh beacon.json + prunes stale entries pointing at
+                 # lost nodes, then writes only the new candidates to
+                 # .codebeacon/semantic/pending/chunk_NNN.jsonl
+                 # (--chunk-size tasks per chunk, default 10). task_id embeds a
+                 # content hash, so a changed file is re-emitted automatically.
 codebeacon semantic-apply [--dir .codebeacon]
-                 # .codebeacon/semantic
-                 #
-                 # .
-                 #
-                 # wiki/obsidian
+                 # merges each agent-written .codebeacon/semantic/results/
+                 # chunk_NNN.jsonl into beacon.json as INFERRED references
+                 # edges, and moves pending/chunk_NNN.jsonl to
+                 # original/chunk_NNN.jsonl (durable archive).
+                 # Deletes results, regenerates wiki/obsidian/context map.

 # Integrations
 codebeacon serve [--dir .codebeacon] # start MCP server (stdio)
@@ -322,22 +325,23 @@ The CLI itself makes **no** LLM API calls. The AI-semantic
 When you invoke `/codebeacon` in Claude Code:

 1. `scan` / `sync` builds `beacon.json` from the AST (no LLM calls).
-2. `codebeacon semantic-prepare`
-3.
-4. `codebeacon semantic-apply` merges the results as `INFERRED references` edges into `beacon.json`
-5. On the next scan: `semantic-prepare`
+2. `codebeacon semantic-prepare` re-applies the `.codebeacon/semantic/original/*.jsonl` archive to the new graph and **prunes** stale entries pointing at nodes that vanished from the graph. It then writes new tasks to `.codebeacon/semantic/pending/chunk_NNN.jsonl` (`--chunk-size` per chunk, default 10). Chunk numbers continue from the durable archive, so they never collide.
+3. The skill processes pending chunks **one at a time**. For each `pending/chunk_NNN.jsonl`, the agent (the current session's model) reads each task's `excerpt` and writes a `semantic/results/chunk_NNN.jsonl` of the same name.
+4. `codebeacon semantic-apply` merges the results into `beacon.json` as `INFERRED references` edges and **moves** each completed `pending/chunk_NNN.jsonl` to **`semantic/original/chunk_NNN.jsonl`** (recording the applied edges alongside). Results are deleted; wiki + obsidian + context map are regenerated.
+5. On the next scan: `semantic-prepare` re-applies the edges of every chunk under `original/` to the new graph (preserving past inferences) and skips any already-processed `task_id`. `task_id` = `SHA1(file_path | node_id | excerpt_hash[:8])`: if a file's semantic content changes it automatically gets a new id and is re-analyzed.

-→
+→ Incremental and idempotent enrichment. The same (file, content) pair is never analyzed twice, accumulated AI signal survives every re-scan, and chunking also keeps the agent's working set small.

 ### Direct CLI usage

-You can drive the same two commands manually without the skill (e.g. in CI), writing
+You can drive the same two commands manually without the skill (e.g. in CI), writing the `results/chunk_NNN.jsonl` yourself:

 ```bash
 codebeacon scan .
-codebeacon semantic-prepare --dir .codebeacon --max-tasks 50
+codebeacon semantic-prepare --dir .codebeacon --max-tasks 50 --chunk-size 10

-#
+# .codebeacon/semantic/pending/chunk_001.jsonl ... are generated.
+# For each pending chunk, write a results/chunk_NNN.jsonl of the same name. Each line:
 # {"task_id":"...", "source_node_id":"...", "edges":[
 # {"target_name":"UserService","relation":"references","confidence_score":0.7}
 # ]}
@@ -56,6 +56,7 @@ Every time you open a new AI coding session, the assistant
 - **Zero config** — auto-detects frameworks and languages; auto-generates `codebeacon.yaml` for repeat runs
 - **Deep-dive mode** — `--deep-dive` creates a separate `.codebeacon/` + `CLAUDE.md` per sub-project; run `codebeacon scan . --update` from any sub-project folder and every project in the workspace updates automatically
 - **Automatic workspace rediscovery** — every `scan`/`sync` run re-sweeps the workspace and auto-adds projects missing from `codebeacon.yaml` before extraction starts, so newly added sub-projects are never silently dropped; opt out with `--no-rediscover` if you curate the yaml manually
+- **Graphify-style semantic enrichment** — after AST extraction, the skill launches one subagent per chunk in parallel to extract full graph fragments `{nodes, edges, hyperedges}`. Supports 8 relation types (`calls`/`implements`/`references`/`cites`/`conceptually_related_to`/`shares_data_with`/`semantically_similar_to`/`rationale_for`) + 3 confidence levels (EXTRACTED/INFERRED/AMBIGUOUS). On Claude Code the subagent is automatically demoted one tier below the host model (Opus→Sonnet, Sonnet→Haiku), keeping cost proportional to corpus size. Code nodes are owned by the AST; the LLM may only contribute `concept`/`document`/`paper` nodes. Existing 0.3.x archives are replayed as-is under the new schema

 ---
@@ -143,11 +144,14 @@ project-root/
       components/<Name>.md
     obsidian/ ← Obsidian vault (one note per graph node)
     semantic/
-
-
-
-
-
+      pending/ ← prepare writes chunk_NNN.jsonl (--chunk-size tasks per chunk)
+        chunk_001.jsonl
+        chunk_002.jsonl
+      results/ ← the agent writes chunk_NNN.jsonl of the same name
+        chunk_001.jsonl
+      original/ ← apply moves completed chunks here (durable archive)
+        chunk_001.jsonl
+        chunk_002.jsonl ← (accumulated from past runs; chunk numbers are monotonic)
 ```

 ### Deep-dive mode
@@ -314,15 +318,20 @@ codebeacon sync --config <file> # use a specific config file
 codebeacon sync --no-rediscover # disable auto-adding new projects (manual curation mode)

 # AI-semantic enrichment (the agent does the LLM work, codebeacon does the bookkeeping)
-codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N]
-                 #
-                 #
-                 #
+codebeacon semantic-prepare [--dir .codebeacon] [--max-tasks N] [--chunk-size N]
+                 # re-applies the .codebeacon/semantic/original/*.jsonl archive
+                 # to the fresh beacon.json + prunes stale entries pointing at
+                 # vanished nodes, then writes **all** NEW candidates (god folders +
+                 # hub files + unresolved targets) to .codebeacon/semantic/pending/
+                 # chunk_NNN.jsonl (--chunk-size tasks per chunk, default 10).
+                 # --max-tasks is an optional cap (0 = no cap, the default: emit all).
+                 # task_id embeds a content hash, so a changed file is re-issued.
 codebeacon semantic-apply [--dir .codebeacon]
-                 # .codebeacon/semantic
-                 # as INFERRED references edges
-                 # .
-                 #
+                 # merges each agent-written .codebeacon/semantic/results/
+                 # chunk_NNN.jsonl into beacon.json as INFERRED references
+                 # edges + moves pending/chunk_NNN.jsonl to
+                 # original/chunk_NNN.jsonl (durable archive).
+                 # Deletes results files, regenerates wiki/obsidian/context map.

 # Knowledge graph queries
 codebeacon query <term> [--dir .codebeacon] [--limit N] # search nodes by label substring
@@ -357,22 +366,23 @@ The CLI itself makes **no** LLM API calls. The AI-semantic layer is
 When you invoke `/codebeacon` in Claude Code:

 1. `scan` / `sync` builds `beacon.json` from the AST (no LLM calls).
-2. `codebeacon semantic-prepare`
-3. The skill
-4. `codebeacon semantic-apply` merges the results as `INFERRED references` edges into `beacon.json`, and
-5. Next scan: `semantic-prepare`
+2. `codebeacon semantic-prepare` re-applies the `.codebeacon/semantic/original/*.jsonl` archive to the new graph, **prunes** stale entries pointing at nodes that vanished from the graph, then writes the new tasks to `.codebeacon/semantic/pending/chunk_NNN.jsonl` (one chunk per `--chunk-size` tasks, default 10). Chunk numbers start from the next number after the durable archive, so they never collide.
+3. The skill processes pending chunks **one at a time**. For each `pending/chunk_NNN.jsonl`, the agent (the current session's model) reads each task's `excerpt` and writes a `semantic/results/chunk_NNN.jsonl` of the same name.
+4. `codebeacon semantic-apply` merges the results into `beacon.json` as `INFERRED references` edges and **moves** each completed `pending/chunk_NNN.jsonl` to **`semantic/original/chunk_NNN.jsonl`** (stored together with the applied edges, for auditability). Results files are deleted; wiki + obsidian + context map are regenerated.
+5. Next scan: `semantic-prepare` re-applies every chunk's edges from `original/` to the new graph (preserving past inferences) and skips already-processed task_ids. `task_id` = `SHA1(file_path | node_id | excerpt_hash[:8])`: if a file's semantic content changes it automatically becomes a new id and is re-analyzed.

-→ Incremental + idempotent enrichment. The same
+→ Incremental + idempotent enrichment. The same (file, content) combination is never analyzed twice, accumulated AI signal survives every re-scan, and chunking keeps the agent's working set small.

 ### Direct CLI usage

-Without the skill (e.g. CI) you can drive the same two
+Without the skill (e.g. CI) you can drive the same two commands directly; just fill in the `results/chunk_NNN.jsonl` files yourself:

 ```bash
 codebeacon scan .
-codebeacon semantic-prepare --dir .codebeacon --max-tasks 50
+codebeacon semantic-prepare --dir .codebeacon --max-tasks 50 --chunk-size 10

-#
+# .codebeacon/semantic/pending/chunk_001.jsonl ... are generated.
+# For each pending chunk, write a results/chunk_NNN.jsonl of the same name. Each line:
 # {"task_id":"...", "source_node_id":"...", "edges":[
 # {"target_name":"UserService","relation":"references","confidence_score":0.7}
 # ]}
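A minimal sketch of the prepare-side bookkeeping the steps above describe, i.e. skipping already-archived tasks and pruning archive entries whose source node is gone (hypothetical data shapes, not codebeacon's internals):

```python
def plan_tasks(candidates: list[dict], archived: dict, live_node_ids: set):
    """Return (new tasks to emit, archived task_ids to keep).

    candidates: [{"task_id": ..., "node": ...}, ...] fresh from the graph scan
    archived:   {task_id: source_node_id} replayed from original/*.jsonl
    """
    # Skip any candidate whose content-addressed task_id is already archived:
    # unchanged (file, content) pairs are never re-emitted.
    new_tasks = [t for t in candidates if t["task_id"] not in archived]
    # Prune archive entries whose source node vanished from the fresh graph.
    kept = {tid for tid, node in archived.items() if node in live_node_ids}
    return new_tasks, kept
```

Deletion of a file thus retires its archived inferences, while editing a file re-emits it under a new content-addressed id.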