specmem-hardwicksoftware 3.7.17 → 3.7.20
- package/README.md +87 -8
- package/dist/codebase/codeAnalyzer.js +1155 -0
- package/dist/codebase/codebaseIndexer.js +1 -1
- package/dist/database.js +12 -1
- package/dist/mcp/toolRegistry.js +4 -2
- package/dist/tools/goofy/exportProjectMemories.js +243 -0
- package/dist/tools/goofy/findWhatISaid.js +1 -1
- package/dist/tools/goofy/importProjectMemories.js +9 -9
- package/embedding-sandbox/frankenstein-embeddings.py +32 -16
- package/embedding-sandbox/server.mjs +40 -7
- package/mcp-proxy.cjs +92 -35
- package/package.json +14 -3
- package/scripts/specmem-init.cjs +1 -1
package/README.md
CHANGED
|
@@ -201,16 +201,20 @@ Memory Stats: 156 memories, 89 code files indexed
|
|
|
201
201
|
|
|
202
202
|
## Why Root?
|
|
203
203
|
|
|
204
|
-
|
|
204
|
+
Yeah, we know — nobody likes running stuff as root. We're actively working on removing this requirement (migrating to Bun, reworking permissions). But here's why it's there right now, and why we aren't apologizing for it yet.
|
|
205
205
|
|
|
206
|
-
|
|
207
|
-
- **Docker container management** - SpecMem spins up PostgreSQL with the pgvector extension in Docker. It creates containers, manages volumes, and handles networking. Docker's socket (`/var/run/docker.sock`) typically requires root or docker group membership.
|
|
208
|
-
- **Global npm directories** - installing to `/usr/local/lib/node_modules/` and linking binaries to `/usr/local/bin/` means writing to system paths. That's just how global npm packages work on Linux.
|
|
209
|
-
- **Screen sessions for background services** - the embedding server, file watcher, and MCP server all run in detached screen sessions. Managing those system-wide requires the right permissions.
|
|
210
|
-
- **PostgreSQL database setup** - first-run creates the database, enables pgvector, runs migrations, and sets up per-project schemas. It's touching system-level database configs.
|
|
211
|
-
- **File watching across your entire codebase** - chokidar watches thousands of files and needs access to whatever directories you're working in.
|
|
206
|
+
SpecMem isn't some little npm package you toss into a project folder and forget about. It's a full system-level tool that does a lot under the hood, and most of that stuff can't happen without elevated permissions. Here's the real breakdown:
|
|
212
207
|
|
|
213
|
-
|
|
208
|
+
- **System-wide hooks** get dropped into `~/.claude/hooks/`. These intercept and augment every tool call in real time. They aren't project-scoped, they're user-scoped, and writing to those directories from a global install needs elevated permissions on most Linux setups.
|
|
209
|
+
- **Docker container management** — SpecMem spins up PostgreSQL with pgvector in Docker. It creates containers, manages volumes, handles networking. Docker's socket (`/var/run/docker.sock`) typically needs root or docker group membership. We don't assume you've set up the docker group.
|
|
210
|
+
- **Global npm directories** — installing to `/usr/local/lib/node_modules/` and linking binaries to `/usr/local/bin/` means writing to system paths. That's just how global npm packages work on Linux. Nothing we can do about that one.
|
|
211
|
+
- **Screen sessions for background services** — the embedding server, file watcher, and MCP server all run in detached screen sessions. Managing those system-wide needs the right permissions.
|
|
212
|
+
- **PostgreSQL database setup** — first-run creates the database, enables pgvector, runs migrations, sets up per-project schemas. It's touching system-level database configs that a regular user can't write to.
|
|
213
|
+
- **File watching across your entire codebase** — chokidar watches thousands of files and needs access to whatever directories you're working in. Some of those directories have restrictive permissions.
|
|
214
|
+
|
|
215
|
+
There's also a practical reason we haven't rushed to remove it: **the root requirement means you own your machine.** This isn't software for corporate laptops where some IT department controls what you can install. If you can't run `sudo npm install -g`, you probably don't own the box, and we've had issues with companies using SpecMem without proper licensing. The root requirement acts as a natural filter: it keeps usage to power users and developers who actually control their own systems. Companies that want to deploy this on managed infrastructure need to talk to us about commercial licensing first.
|
|
216
|
+
|
|
217
|
+
That said, we're not keeping this forever. The Bun migration is in progress and one of the goals is dropping the root requirement entirely. When that lands, you'll be able to run SpecMem as a regular user with Docker handled separately. But for now, if you want semantic search with vector embeddings, code memorization across sessions, 60% token compression, multi-agent team coordination, and a local embedding server that doesn't phone home — all of that runs as system services, and system services need root. There's no way around it with the current architecture.
|
|
214
218
|
|
|
215
219
|
---
|
|
216
220
|
|
|
@@ -428,6 +432,81 @@ psql -U specmem -d specmem -c "SELECT 1"
|
|
|
428
432
|
|
|
429
433
|
---
|
|
430
434
|
|
|
435
|
+
## What SpecMem Actually Does (The Full Picture)
|
|
436
|
+
|
|
437
|
+
Most people look at SpecMem and think it's just a memory plugin. It's not. It's a full persistent intelligence layer for your Claude Code sessions, and honestly there's nothing else like it on npm right now. Let's break down what you're getting.
|
|
438
|
+
|
|
439
|
+
### Semantic Code Memory — Not Just Chat History
|
|
440
|
+
|
|
441
|
+
Every time you run `specmem init` on a project, it doesn't just save your conversations. It crawls your entire codebase and builds a real semantic graph of everything in it. We're talking functions, classes, methods, fields, constants, variables, enums, structs, interfaces, traits, macros, type aliases, constructors, destructors, operator overloads — the works. And it doesn't stop at definitions. It maps out every import, every dependency, every `#include`, every `use` statement, every `<script src>`. The whole dependency graph gets stored in PostgreSQL with pgvector embeddings so you can search it by meaning, not just by name.
|
|
442
|
+
|
|
443
|
+
When you ask Claude "where's that function that handles rate limiting?" — SpecMem doesn't do a dumb string match. It runs a semantic search across your entire codebase graph and finds `rateLimiter()`, `handleThrottle()`, `apiQuotaManager()`, plus all the conversations you've had about rate limiting. That's why it works.
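
As a rough illustration of what searching "by meaning" involves, here is a minimal sketch that ranks stored items by cosine similarity to a query embedding. The three-dimensional vectors, the `index` array, and the `rankByMeaning` helper are hypothetical stand-ins. According to the README, the real search runs against PostgreSQL with pgvector, not in application code like this.

```javascript
// Hypothetical toy index: names from the README paired with made-up 3-d embeddings.
// Real embeddings come from the embedding server and have many more dimensions.
const index = [
  { name: "rateLimiter", vector: [0.9, 0.1, 0.0] },
  { name: "handleThrottle", vector: [0.8, 0.2, 0.1] },
  { name: "parseConfig", vector: [0.0, 0.1, 0.9] },
];

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the names of the `limit` items most similar to the query vector.
function rankByMeaning(queryVector, items, limit) {
  return items
    .map((item) => ({ name: item.name, score: cosineSimilarity(queryVector, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((entry) => entry.name);
}

// Made-up embedding standing in for a query such as "rate limiting".
console.log(rankByMeaning([1, 0, 0], index, 2)); // [ 'rateLimiter', 'handleThrottle' ]
```

The point of the sketch is that `handleThrottle` ranks highly even though it shares no substring with "rate limiting"; only the vectors are compared.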
|
|
444
|
+
|
|
445
|
+
### Language Support — We Don't Play Favorites
|
|
446
|
+
|
|
447
|
+
Here's every language that gets full dedicated analysis with proper extraction of all definitions and dependencies:
|
|
448
|
+
|
|
449
|
+
| Language | What Gets Indexed |
|
|
450
|
+
|----------|------------------|
|
|
451
|
+
| **TypeScript / JavaScript / TSX / JSX** | Functions, arrow functions, classes, interfaces, types, enums, methods, constants, variables, nested definitions with parent tracking. Imports (named, default, namespace, dynamic, re-export), require() calls. |
|
|
452
|
+
| **Python** | Functions, async functions, classes, methods (with `self`/`cls` detection), module-level constants. `import` and `from...import` statements. Indentation-based scope tracking. |
|
|
453
|
+
| **Java** | Classes, abstract classes, interfaces, enums, records (Java 14+), annotations (@interface), constructors, methods, fields (private/protected/public/static/final), static initializer blocks. Package declarations, imports, static imports, wildcard imports. |
|
|
454
|
+
| **Kotlin** | Everything Java gets plus `fun`, `val`/`var`, `data class`, `object`/`companion object`, `suspend` functions, `internal` visibility. Same import handling. |
|
|
455
|
+
| **Scala** | Shares the Java/Kotlin extractor — picks up classes, traits, objects, methods, vals. |
|
|
456
|
+
| **Go** | Functions, methods (with receivers), structs, interfaces, types, constants, variables. Single and block imports. Exported detection via capitalization. |
|
|
457
|
+
| **Rust** | Functions, async functions, structs, enums, traits, impl blocks, constants, statics. `use` statements with nested paths, `extern crate`. Pub detection. |
|
|
458
|
+
| **C / C++** | Functions, methods, classes, structs, unions, enums (including `enum class`), namespaces, typedefs, `using` aliases, constructors, destructors, operator overloads, macros (#define with and without params), global/static/extern/constexpr/thread_local variables. `#include` (angle vs quote, STL builtin detection), `using namespace`, `using` declarations. Template support. Virtual/inline/const method detection. |
|
|
459
|
+
| **HTML** | Elements with IDs, CSS classes, `<script>` and `<style>` blocks, forms, templates, web components (`<slot>`, `<component>`), `data-*` attributes, semantic sections. Script src, stylesheet links, image/iframe/source assets, inline ES module imports. Structural chunking by HTML blocks. |
|
|
460
|
+
| **Ruby, PHP, Swift** | Analyzable with generic extraction (function/class detection). Dedicated extractors coming. |
|
|
461
|
+
|
|
462
|
+
That's not a marketing list — every one of those has real regex-based extraction that's been tested against actual codebases. The Java extractor alone handles annotations, records, static initializers, field visibility, and constructor detection. The C++ extractor picks up operator overloads and destructor naming. We didn't cut corners on this.
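
To make "regex-based extraction" concrete, here is a minimal sketch that pulls function definitions, class definitions, and import sources out of JavaScript source with regular expressions. The patterns and the `extractDefinitions` helper are illustrative assumptions, not the contents of `codeAnalyzer.js`, and they ignore cases a real extractor has to handle (comments, string literals, nested scopes, multi-line declarations).

```javascript
// Hypothetical patterns for illustration only; a real extractor needs far more cases.
const FUNCTION_PATTERN = /^\s*(?:export\s+)?(?:async\s+)?function\s+([A-Za-z_$][\w$]*)/;
const CLASS_PATTERN = /^\s*(?:export\s+)?class\s+([A-Za-z_$][\w$]*)/;
const IMPORT_PATTERN = /^\s*import\s+.*?from\s+["']([^"']+)["']/;

// Scan the source line by line and record what each pattern finds,
// together with the 1-based line number.
function extractDefinitions(source) {
  const definitions = [];
  const imports = [];
  source.split("\n").forEach((line, i) => {
    let match;
    if ((match = FUNCTION_PATTERN.exec(line))) {
      definitions.push({ kind: "function", name: match[1], line: i + 1 });
    } else if ((match = CLASS_PATTERN.exec(line))) {
      definitions.push({ kind: "class", name: match[1], line: i + 1 });
    } else if ((match = IMPORT_PATTERN.exec(line))) {
      imports.push({ source: match[1], line: i + 1 });
    }
  });
  return { definitions, imports };
}

const sample = [
  'import { createClient } from "./db";',
  "export async function rateLimiter(req) {",
  "  return req;",
  "}",
  "class QuotaManager {}",
].join("\n");

console.log(extractDefinitions(sample));
```

For the sample input this finds one import (`./db`), one function (`rateLimiter` on line 2), and one class (`QuotaManager` on line 5).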
|
|
463
|
+
|
|
464
|
+
### Chat Session Memory — Conversations That Stick Around
|
|
465
|
+
|
|
466
|
+
Every conversation you have with Claude gets stored as a memory with full semantic embeddings. Next session, Claude can search through your past discussions by meaning. You talked about a JWT refresh token edge case three weeks ago? SpecMem finds it. You discussed why you chose PostgreSQL over MongoDB for the user service? It's there. Your conversations don't vanish when you close the terminal anymore.
|
|
467
|
+
|
|
468
|
+
Memories get tagged by type (conversation, decision, architecture, bug, etc.), importance level, and project. They're searchable with `find_memory`, drillable with `drill_down`, and you can link related ones together with `link_the_vibes`. It's your project's institutional knowledge, but it actually works.
|
|
469
|
+
|
|
470
|
+
### Token Compression — 60% Smaller Context
|
|
471
|
+
|
|
472
|
+
SpecMem uses Traditional Chinese-based compression that squeezes your context down by about 60%. That's not a typo. The same information that would eat 1000 tokens takes about 400 tokens in compressed form. Claude reads it with 99%+ accuracy. This means your context window goes further, you hit fewer limits, and long sessions don't fall apart as fast.
|
|
473
|
+
|
|
474
|
+
### Multi-Agent Team Coordination
|
|
475
|
+
|
|
476
|
+
Got multiple Claude instances working on the same project? SpecMem handles that. Team messaging with `send_team_message` and `read_team_messages`. Task claiming with `claim_task` and `release_task` so two agents don't step on each other. A coordination server that runs on port 8596 by default. Dashboard on port 8585 so you can see what's happening. This isn't theoretical — we use it ourselves with up to 4 agents running in parallel on the same codebase.
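
The task-claiming idea can be sketched as a registry that lets only one agent hold a task at a time. The `TaskBoard` class, its in-memory `Map`, and the task and agent IDs are hypothetical. Only the tool names `claim_task` and `release_task` come from the README, and the real coordination server may behave differently.

```javascript
// Hypothetical in-memory model of claim/release semantics (not SpecMem's implementation).
class TaskBoard {
  constructor() {
    this.claims = new Map(); // taskId -> agentId currently holding the task
  }

  // Succeeds only if nobody holds the task, or the same agent already holds it.
  claimTask(taskId, agentId) {
    const owner = this.claims.get(taskId);
    if (owner !== undefined && owner !== agentId) {
      return { ok: false, owner };
    }
    this.claims.set(taskId, agentId);
    return { ok: true, owner: agentId };
  }

  // Only the current owner may release a task.
  releaseTask(taskId, agentId) {
    if (this.claims.get(taskId) !== agentId) {
      return false;
    }
    this.claims.delete(taskId);
    return true;
  }
}

const board = new TaskBoard();
console.log(board.claimTask("fix-login", "agent-1"));   // claim succeeds
console.log(board.claimTask("fix-login", "agent-2"));   // rejected, agent-1 holds it
console.log(board.releaseTask("fix-login", "agent-1")); // owner releases
console.log(board.claimTask("fix-login", "agent-2"));   // now succeeds
```

With several processes involved, the check-and-set would have to be atomic in whatever store backs it; the single-threaded `Map` here sidesteps that question.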
|
|
477
|
+
|
|
478
|
+
### 74+ MCP Tools
|
|
479
|
+
|
|
480
|
+
SpecMem ships with over 74 MCP tools out of the box. Memory search, code pointers, memory storage, team comms, file watching, stats, drilldown, sync checking — it's all there. Every tool is available as a slash command too. `/specmem-find`, `/specmem-code`, `/specmem-pointers`, `/specmem-stats`, `/specmem-remember` — whatever you need.
|
|
481
|
+
|
|
482
|
+
### The Embedding Server
|
|
483
|
+
|
|
484
|
+
We run our own embedding server locally in Docker. Your code never leaves your machine. No API calls to OpenAI or anyone else. The embeddings get stored in PostgreSQL with pgvector and they're used for all semantic search operations. It's fast, it's private, and it doesn't cost you anything per query.
|
|
485
|
+
|
|
486
|
+
---
|
|
487
|
+
|
|
488
|
+
## What's New in v3.7
|
|
489
|
+
|
|
490
|
+
### New Language Support
|
|
491
|
+
|
|
492
|
+
We just shipped dedicated code analyzers for **Java**, **Kotlin**, **C**, **C++**, and **HTML**. These aren't half-baked generic matchers — they're full extractors that understand each language's syntax and pull out everything: constructors, destructors, operator overloads, macros, typedefs, annotations, records, data classes, companion objects, web components, structural HTML chunking, the lot. Previously these languages fell through to a generic extractor that could barely find functions and classes. Now they get the same treatment that TypeScript and Python have had since day one.
|
|
493
|
+
|
|
494
|
+
### Embedding Server Stability Fix
|
|
495
|
+
|
|
496
|
+
The MCP proxy timeout handling got a complete overhaul. Previously the embedding server would go stale after long sessions — the process would still be running but the socket connection was dead. We've fixed the SIGTERM handling, added proper health checks that detect stale connections, and the MCP proxy now handles reconnection properly. If you were seeing "embedding server not responding" errors after a few hours of work, that's fixed.
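
The "process still running but socket dead" case is why checking that the process exists is not enough. Here is a minimal sketch of a staleness check, assuming the proxy records when the server last answered a health probe. `HealthTracker`, the 30-second threshold, and the injected clock are hypothetical and not taken from `mcp-proxy.cjs`.

```javascript
// Hypothetical staleness tracker: a connection counts as stale when the server
// has not answered a health probe within `maxSilenceMs`.
class HealthTracker {
  constructor(maxSilenceMs, now = () => Date.now()) {
    this.maxSilenceMs = maxSilenceMs;
    this.now = now; // injectable clock, so the logic can be exercised without waiting
    this.lastResponseAt = now();
  }

  // Call whenever the server answers a probe or a request.
  recordResponse() {
    this.lastResponseAt = this.now();
  }

  // True when the silence has lasted longer than the allowed window.
  isStale() {
    return this.now() - this.lastResponseAt > this.maxSilenceMs;
  }
}

// Simulated clock for the demonstration.
let fakeTime = 0;
const tracker = new HealthTracker(30000, () => fakeTime);

fakeTime = 10000;
console.log(tracker.isStale()); // false: 10 s of silence is within the window

fakeTime = 45000;
console.log(tracker.isStale()); // true: 45 s without a response

tracker.recordResponse();
console.log(tracker.isStale()); // false: a fresh response resets the window
```

A caller would typically reconnect or restart the server when `isStale()` returns true, rather than trusting that the process is still listed as running.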
|
|
497
|
+
|
|
498
|
+
### Coming Soon
|
|
499
|
+
|
|
500
|
+
We've got two big features in the pipeline that we're pretty excited about:
|
|
501
|
+
|
|
502
|
+
- **OCR for PDFs** — SpecMem will be able to index PDF documentation in your project. Technical specs, API docs, architecture diagrams with text — all searchable by meaning. This is gonna be huge for projects that have a `/docs` folder full of PDFs that Claude currently can't touch.
|
|
503
|
+
|
|
504
|
+
- **YOLO-based Image Analysis** (optional) — For summarizing screenshots, diagrams, SVGs, PNGs, JPEGs, and other visual assets in your codebase. This won't be required — it's an optional add-on for teams that work with a lot of visual content. Think UI mockups, architecture diagrams, flowcharts. YOLO picks out the key elements and SpecMem stores searchable summaries.
|
|
505
|
+
|
|
506
|
+
Both of these are in active development. They'll ship as optional features so they don't bloat the base install.
|
|
507
|
+
|
|
508
|
+
---
|
|
509
|
+
|
|
431
510
|
## Documentation
|
|
432
511
|
|
|
433
512
|
- [Quick Start Guide](./QUICKSTART.md)
|