wikifier 4.1.1__tar.gz → 4.1.2__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {wikifier-4.1.1/wikifier.egg-info → wikifier-4.1.2}/PKG-INFO +10 -1
- {wikifier-4.1.1 → wikifier-4.1.2}/README.md +9 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/pyproject.toml +1 -1
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/__init__.py +1 -1
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/cli.py +67 -10
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/import_cache.py +37 -8
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/parsers/python.py +11 -4
- {wikifier-4.1.1 → wikifier-4.1.2/wikifier.egg-info}/PKG-INFO +10 -1
- {wikifier-4.1.1 → wikifier-4.1.2}/CONTRIBUTING.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/LICENSE +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/MANIFEST.in +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/diagnostics.html +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/docs/Basis-v0.3.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/docs/RELEASE_NOTES.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/docs/TRADEOFFS.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/docs/spec.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/docs/v0.4-Execution-Plan.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/docs/v0.4-execution-plan.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/index.html +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/setup.cfg +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/skills/run.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/__main__.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/contracts.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/daemon.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/diagnostics.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/gap1_validation_harness.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/health.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/locking.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/mcp/__init__.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/mcp/server.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/parsers/__init__.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/parsers/bree.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/parsers/cdia.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/parsers/javascript.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/resolution.py +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/scripts/exclude_patterns.txt +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/scripts/file_health.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/scripts/library.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/scripts/monitored_paths.txt +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/scripts/pending_updates.md +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/scripts/wikifier.bat +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/scripts/wikifier.ps1 +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier/scripts/wikifier.sh +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier.egg-info/SOURCES.txt +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier.egg-info/dependency_links.txt +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier.egg-info/entry_points.txt +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier.egg-info/requires.txt +0 -0
- {wikifier-4.1.1 → wikifier-4.1.2}/wikifier.egg-info/top_level.txt +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: wikifier
|
|
3
|
-
Version: 4.1.
|
|
3
|
+
Version: 4.1.2
|
|
4
4
|
Summary: Agent-first, zero-dependency, self-maintaining codebase documentation & change tracking system
|
|
5
5
|
Author-email: Aron Amos <aron@example.com>
|
|
6
6
|
Maintainer: Aron Amos
|
|
@@ -97,6 +97,15 @@ See the final synth in `Findings/M5-Dogfood-Progress.md` and full Assessment-Rep
|
|
|
97
97
|
- Synced descriptions in README.md, `skills/run.md`, `wikifier/mcp/README.md`.
|
|
98
98
|
- All under protocol (FRESH, record-change + mark-green with subid, main clean). Version to 4.1.1. See the separation-fix commit.
|
|
99
99
|
|
|
100
|
+
**v4.1.2 (2026-06)**: Very minor patch for mapping / update speed hygiene (no new behaviour or scope; pure internal improvements for large projects).
|
|
101
|
+
|
|
102
|
+
- Faster candidate collection everywhere that drives check-changes, monitor, and update-maps (Python-primary paths): switched to `os.scandir`-based recursive scan (std lib, avoids walk overhead) + git `ls-files --cached --others --exclude-standard` fast-path when in a git repo (dramatically faster on real monorepos; falls back cleanly).
|
|
103
|
+
- Consistent early pruning: `exclude_patterns.txt` (user-editable, populated by init) now applied more broadly during mapping walks in both sh and Python collectors (venvs, caches, site-packages, etc. stop descent sooner).
|
|
104
|
+
- Small parser micro-opt: hoisted common regex compiles (docstring strip, dynamic import detectors) to module level in the Python parser (hot path on dirty files during maps).
|
|
105
|
+
- Minor sh-side note + skeleton for git fast collection in traditional update-maps path (real wins already flow through the Python collectors used by check-changes/streaming/lib/MCP).
|
|
106
|
+
- All changes FRESH-checked + recorded+marked under subid=mapping-speed-hygiene. Complements existing levers (monitored_paths.txt narrowing, --dir / directory= scoping, --stream / --max-files / --max-time streaming+budgets, python-primary, incremental dirty + BRC reverse index).
|
|
107
|
+
- Version to 4.1.2. See the speed-hygiene edits + this commit.
|
|
108
|
+
|
|
100
109
|
### Historical (pre-v4.0)
|
|
101
110
|
|
|
102
111
|
#### What's New in v0.3.3 (Gap #1)
|
|
@@ -65,6 +65,15 @@ See the final synth in `Findings/M5-Dogfood-Progress.md` and full Assessment-Rep
|
|
|
65
65
|
- Synced descriptions in README.md, `skills/run.md`, `wikifier/mcp/README.md`.
|
|
66
66
|
- All under protocol (FRESH, record-change + mark-green with subid, main clean). Version to 4.1.1. See the separation-fix commit.
|
|
67
67
|
|
|
68
|
+
**v4.1.2 (2026-06)**: Very minor patch for mapping / update speed hygiene (no new behaviour or scope; pure internal improvements for large projects).
|
|
69
|
+
|
|
70
|
+
- Faster candidate collection everywhere that drives check-changes, monitor, and update-maps (Python-primary paths): switched to `os.scandir`-based recursive scan (std lib, avoids walk overhead) + git `ls-files --cached --others --exclude-standard` fast-path when in a git repo (dramatically faster on real monorepos; falls back cleanly).
|
|
71
|
+
- Consistent early pruning: `exclude_patterns.txt` (user-editable, populated by init) now applied more broadly during mapping walks in both sh and Python collectors (venvs, caches, site-packages, etc. stop descent sooner).
|
|
72
|
+
- Small parser micro-opt: hoisted common regex compiles (docstring strip, dynamic import detectors) to module level in the Python parser (hot path on dirty files during maps).
|
|
73
|
+
- Minor sh-side note + skeleton for git fast collection in traditional update-maps path (real wins already flow through the Python collectors used by check-changes/streaming/lib/MCP).
|
|
74
|
+
- All changes FRESH-checked + recorded+marked under subid=mapping-speed-hygiene. Complements existing levers (monitored_paths.txt narrowing, --dir / directory= scoping, --stream / --max-files / --max-time streaming+budgets, python-primary, incremental dirty + BRC reverse index).
|
|
75
|
+
- Version to 4.1.2. See the speed-hygiene edits + this commit.
|
|
76
|
+
|
|
68
77
|
### Historical (pre-v4.0)
|
|
69
78
|
|
|
70
79
|
#### What's New in v0.3.3 (Gap #1)
|
|
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
|
|
4
4
|
|
|
5
5
|
[project]
|
|
6
6
|
name = "wikifier"
|
|
7
|
-
version = "4.1.
|
|
7
|
+
version = "4.1.2"
|
|
8
8
|
description = "Agent-first, zero-dependency, self-maintaining codebase documentation & change tracking system"
|
|
9
9
|
readme = "README.md"
|
|
10
10
|
license = {text = "MIT"}
|
|
@@ -172,18 +172,75 @@ def _collect_candidate_source_files(root: Path) -> List[Path]:
|
|
|
172
172
|
root = Path(root).resolve()
|
|
173
173
|
except Exception:
|
|
174
174
|
root = Path(root)
|
|
175
|
+
# Also respect the project's exclude_patterns.txt (if present) for parity with sh
|
|
176
|
+
# mapping paths and check-changes. Simple dir-name globs only for pruning speed.
|
|
177
|
+
# This makes python-primary update-maps benefit from user custom excludes (venvs etc)
|
|
178
|
+
# without any behavior change.
|
|
179
|
+
# Look relative to explicit WIKIFIER_PROJECT_ROOT (if set for the target) or the
|
|
180
|
+
# passed root; excludes live at the logical project root, not arbitrary monitored subdirs.
|
|
181
|
+
ep_root = Path(os.environ.get("WIKIFIER_PROJECT_ROOT", root))
|
|
182
|
+
ep = ep_root / "exclude_patterns.txt"
|
|
183
|
+
if ep.exists():
|
|
184
|
+
try:
|
|
185
|
+
for line in ep.read_text(errors="ignore").splitlines():
|
|
186
|
+
p = line.strip()
|
|
187
|
+
if p and not p.startswith("#"):
|
|
188
|
+
p = p.split()[0] # first token
|
|
189
|
+
if p:
|
|
190
|
+
EXCLUDES.add(p)
|
|
191
|
+
# also common glob forms as exact for dirname match
|
|
192
|
+
if p.endswith("/*") or p.endswith("*"):
|
|
193
|
+
EXCLUDES.add(p.rstrip("/*"))
|
|
194
|
+
except Exception:
|
|
195
|
+
pass
|
|
175
196
|
|
|
176
|
-
|
|
177
|
-
|
|
178
|
-
|
|
179
|
-
|
|
180
|
-
|
|
181
|
-
|
|
182
|
-
|
|
183
|
-
|
|
184
|
-
|
|
185
|
-
|
|
197
|
+
# Fast path: if inside a git repo, use `git ls-files` + untracked (respects .gitignore, dramatically faster
|
|
198
|
+
# on large checkouts than any walk; falls back to scandir scan). This is a pure speed opt for "updates"
|
|
199
|
+
# (check-changes, update-maps) with near-identical or better candidate set for real codebases.
|
|
200
|
+
git_dir = root / ".git"
|
|
201
|
+
if git_dir.exists() or (root / ".git" / "HEAD").exists(): # works for worktrees too
|
|
202
|
+
try:
|
|
203
|
+
import subprocess
|
|
204
|
+
# cached + others (untracked but not ignored), exclude standard ignores
|
|
205
|
+
out = subprocess.check_output(
|
|
206
|
+
["git", "ls-files", "--cached", "--others", "--exclude-standard", "-z"],
|
|
207
|
+
cwd=root, stderr=subprocess.DEVNULL
|
|
208
|
+
)
|
|
209
|
+
for entry in out.split(b"\0"):
|
|
210
|
+
if not entry:
|
|
186
211
|
continue
|
|
212
|
+
p = (root / entry.decode("utf-8", "ignore")).resolve()
|
|
213
|
+
if p.suffix.lower() in exts: # reuse the set from above (adjusted)
|
|
214
|
+
# quick filter for excludes we still want even if git surfaces them
|
|
215
|
+
parts = p.parts
|
|
216
|
+
if not any(part in EXCLUDES or any(part.startswith(e) for e in (".",)) for part in parts):
|
|
217
|
+
candidates.append(p)
|
|
218
|
+
if candidates:
|
|
219
|
+
return candidates # success, use git list
|
|
220
|
+
except Exception:
|
|
221
|
+
pass # fall through to scandir
|
|
222
|
+
|
|
223
|
+
# Use os.scandir for faster directory traversal (std lib only; avoids full listdir + separate is_dir stats on large trees).
|
|
224
|
+
# Pruning is applied on the fly. Behavior identical to prior walk.
|
|
225
|
+
exts_lower = tuple(e.lower() for e in exts)
|
|
226
|
+
def _scan_dir(d: Path) -> None:
|
|
227
|
+
try:
|
|
228
|
+
with os.scandir(d) as it:
|
|
229
|
+
for entry in it:
|
|
230
|
+
try:
|
|
231
|
+
name = entry.name
|
|
232
|
+
if entry.is_dir(follow_symlinks=False):
|
|
233
|
+
if name not in EXCLUDES and not name.startswith('.'):
|
|
234
|
+
_scan_dir(Path(entry.path))
|
|
235
|
+
elif entry.is_file(follow_symlinks=False):
|
|
236
|
+
lname = name.lower()
|
|
237
|
+
if lname.endswith(exts_lower):
|
|
238
|
+
candidates.append(Path(entry.path))
|
|
239
|
+
except Exception:
|
|
240
|
+
continue
|
|
241
|
+
except Exception:
|
|
242
|
+
pass
|
|
243
|
+
_scan_dir(root)
|
|
187
244
|
return candidates
|
|
188
245
|
|
|
189
246
|
|
|
@@ -1852,15 +1852,44 @@ def generate_update_events(
|
|
|
1852
1852
|
# Real early scope projection (proportional for 50k+) - Micro-step 1
|
|
1853
1853
|
projector_stats: Dict[str, Any] = {"degraded": True}
|
|
1854
1854
|
proj: Dict[str, Any] = {}
|
|
1855
|
-
#
|
|
1856
|
-
|
|
1855
|
+
# Faster candidate collection using os.scandir (avoids repeated listdir overhead).
|
|
1856
|
+
# Respects exclude_patterns.txt when present (for consistency with check-changes + mapping speed).
|
|
1857
|
+
# Same semantics as before.
|
|
1858
|
+
candidates: List[Path] = []
|
|
1857
1859
|
exts = {'.py', '.js', '.ts', '.jsx', '.tsx'}
|
|
1858
|
-
exclude_dirs = {'__pycache__', '.git', 'node_modules', '.venv', 'venv', 'build', 'dist', '.next', '.cache'
|
|
1859
|
-
|
|
1860
|
-
|
|
1861
|
-
|
|
1862
|
-
|
|
1863
|
-
|
|
1860
|
+
exclude_dirs = {'__pycache__', '.git', 'node_modules', '.venv', 'venv', 'build', 'dist', '.next', '.cache',
|
|
1861
|
+
'.pnpm', '.yarn', '.store', 'tmp', 'temp', '.turbo', '.mypy_cache', '.ruff_cache'}
|
|
1862
|
+
# Load project excludes if available (project root level)
|
|
1863
|
+
ep = root / "exclude_patterns.txt"
|
|
1864
|
+
if ep.exists():
|
|
1865
|
+
try:
|
|
1866
|
+
for line in ep.read_text(errors="ignore").splitlines():
|
|
1867
|
+
p = line.strip()
|
|
1868
|
+
if p and not p.startswith("#"):
|
|
1869
|
+
p = p.split()[0]
|
|
1870
|
+
if p:
|
|
1871
|
+
exclude_dirs.add(p)
|
|
1872
|
+
if p.endswith("/*") or p.endswith("*"):
|
|
1873
|
+
exclude_dirs.add(p.rstrip("/*"))
|
|
1874
|
+
except Exception:
|
|
1875
|
+
pass
|
|
1876
|
+
def _scan(d: Path) -> None:
|
|
1877
|
+
try:
|
|
1878
|
+
with os.scandir(d) as it:
|
|
1879
|
+
for entry in it:
|
|
1880
|
+
try:
|
|
1881
|
+
name = entry.name
|
|
1882
|
+
if entry.is_dir(follow_symlinks=False):
|
|
1883
|
+
if name not in exclude_dirs and not name.startswith('.'):
|
|
1884
|
+
_scan(Path(entry.path))
|
|
1885
|
+
elif entry.is_file(follow_symlinks=False):
|
|
1886
|
+
if os.path.splitext(name)[1].lower() in exts:
|
|
1887
|
+
candidates.append(Path(entry.path))
|
|
1888
|
+
except Exception:
|
|
1889
|
+
continue
|
|
1890
|
+
except Exception:
|
|
1891
|
+
pass
|
|
1892
|
+
_scan(root)
|
|
1864
1893
|
candidates_rel: List[str] = []
|
|
1865
1894
|
for p in candidates:
|
|
1866
1895
|
try:
|
|
@@ -43,6 +43,13 @@ import re
|
|
|
43
43
|
from pathlib import Path
|
|
44
44
|
from typing import List, Dict, Optional, Any
|
|
45
45
|
|
|
46
|
+
# Module-level compiled regexes for small repeated speed win on every parse (docstring strip + dynamic import detectors).
|
|
47
|
+
# Zero behavior change.
|
|
48
|
+
_DOCSTRING_RE1 = re.compile(r'"""[\s\S]*?"""')
|
|
49
|
+
_DOCSTRING_RE2 = re.compile(r"'''[\s\S]*?'''")
|
|
50
|
+
_DYN_IMPORT_RE = re.compile(r'(?P<call>(?:importlib\.)?import_module)\s*\(', re.MULTILINE)
|
|
51
|
+
_DYN_DUNDER_RE = re.compile(r'(?P<call>__import__)\s*\(', re.MULTILINE)
|
|
52
|
+
|
|
46
53
|
# Diagnostics & Failure Transparency (Limitation #5) - same robust import pattern as JS parser
|
|
47
54
|
try:
|
|
48
55
|
from . import diagnostics
|
|
@@ -249,8 +256,8 @@ def _strip_docstrings(content: str) -> str:
|
|
|
249
256
|
strings well), but it is good enough for v0.4 and keeps us zero-dependency.
|
|
250
257
|
"""
|
|
251
258
|
# Remove """...""" and '''...''' (non-greedy, handles both single and double)
|
|
252
|
-
content =
|
|
253
|
-
content =
|
|
259
|
+
content = _DOCSTRING_RE1.sub('', content)
|
|
260
|
+
content = _DOCSTRING_RE2.sub('', content)
|
|
254
261
|
return content
|
|
255
262
|
|
|
256
263
|
|
|
@@ -304,8 +311,8 @@ def parse_python_imports(filepath: str) -> List[Dict[str, Any]]:
|
|
|
304
311
|
dynamic_imports: List[Dict[str, Any]] = []
|
|
305
312
|
if _extract_candidate_literals is not None and _apply_dynamic_registry is not None:
|
|
306
313
|
dyn_patterns = [
|
|
307
|
-
(
|
|
308
|
-
(
|
|
314
|
+
(_DYN_IMPORT_RE, "import_module"),
|
|
315
|
+
(_DYN_DUNDER_RE, "dunder_import"),
|
|
309
316
|
]
|
|
310
317
|
for pat, ptype in dyn_patterns:
|
|
311
318
|
for match in pat.finditer(content):
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: wikifier
|
|
3
|
-
Version: 4.1.
|
|
3
|
+
Version: 4.1.2
|
|
4
4
|
Summary: Agent-first, zero-dependency, self-maintaining codebase documentation & change tracking system
|
|
5
5
|
Author-email: Aron Amos <aron@example.com>
|
|
6
6
|
Maintainer: Aron Amos
|
|
@@ -97,6 +97,15 @@ See the final synth in `Findings/M5-Dogfood-Progress.md` and full Assessment-Rep
|
|
|
97
97
|
- Synced descriptions in README.md, `skills/run.md`, `wikifier/mcp/README.md`.
|
|
98
98
|
- All under protocol (FRESH, record-change + mark-green with subid, main clean). Version to 4.1.1. See the separation-fix commit.
|
|
99
99
|
|
|
100
|
+
**v4.1.2 (2026-06)**: Very minor patch for mapping / update speed hygiene (no new behaviour or scope; pure internal improvements for large projects).
|
|
101
|
+
|
|
102
|
+
- Faster candidate collection everywhere that drives check-changes, monitor, and update-maps (Python-primary paths): switched to `os.scandir`-based recursive scan (std lib, avoids walk overhead) + git `ls-files --cached --others --exclude-standard` fast-path when in a git repo (dramatically faster on real monorepos; falls back cleanly).
|
|
103
|
+
- Consistent early pruning: `exclude_patterns.txt` (user-editable, populated by init) now applied more broadly during mapping walks in both sh and Python collectors (venvs, caches, site-packages, etc. stop descent sooner).
|
|
104
|
+
- Small parser micro-opt: hoisted common regex compiles (docstring strip, dynamic import detectors) to module level in the Python parser (hot path on dirty files during maps).
|
|
105
|
+
- Minor sh-side note + skeleton for git fast collection in traditional update-maps path (real wins already flow through the Python collectors used by check-changes/streaming/lib/MCP).
|
|
106
|
+
- All changes FRESH-checked + recorded+marked under subid=mapping-speed-hygiene. Complements existing levers (monitored_paths.txt narrowing, --dir / directory= scoping, --stream / --max-files / --max-time streaming+budgets, python-primary, incremental dirty + BRC reverse index).
|
|
107
|
+
- Version to 4.1.2. See the speed-hygiene edits + this commit.
|
|
108
|
+
|
|
100
109
|
### Historical (pre-v4.0)
|
|
101
110
|
|
|
102
111
|
#### What's New in v0.3.3 (Gap #1)
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|