docforge-cli 0.2.0.tar.gz → 0.2.1.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45)
  1. {docforge_cli-0.2.0 → docforge_cli-0.2.1}/PKG-INFO +13 -11
  2. {docforge_cli-0.2.0 → docforge_cli-0.2.1}/README.md +7 -10
  3. {docforge_cli-0.2.0 → docforge_cli-0.2.1}/pyproject.toml +10 -2
  4. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge_cli.egg-info/PKG-INFO +13 -11
  5. docforge_cli-0.2.1/src/docforge_cli.egg-info/SOURCES.txt +42 -0
  6. docforge_cli-0.2.0/docforge_cli.egg-info/SOURCES.txt +0 -42
  7. {docforge_cli-0.2.0 → docforge_cli-0.2.1}/LICENSE +0 -0
  8. {docforge_cli-0.2.0 → docforge_cli-0.2.1}/setup.cfg +0 -0
  9. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/__init__.py +0 -0
  10. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/__main__.py +0 -0
  11. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/api.py +0 -0
  12. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/cli.py +0 -0
  13. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/config.py +0 -0
  14. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/crawlers/__init__.py +0 -0
  15. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/crawlers/confluence.py +0 -0
  16. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/crawlers/git.py +0 -0
  17. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/db.py +0 -0
  18. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/ingest.py +0 -0
  19. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/lint.py +0 -0
  20. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/mcp_server.py +0 -0
  21. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/processors/__init__.py +0 -0
  22. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/processors/chunker.py +0 -0
  23. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/processors/embedder.py +0 -0
  24. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/processors/parser.py +0 -0
  25. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/query_log.py +0 -0
  26. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/ranking.py +0 -0
  27. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/scripts/__init__.py +0 -0
  28. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/scripts/eval_search.py +0 -0
  29. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/scripts/latency_report.py +0 -0
  30. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/sources.py +0 -0
  31. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/sql/migrations/001_add_source_identifier.sql +0 -0
  32. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/sql/migrations/002_add_status_index.sql +0 -0
  33. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/sql/migrations/003_add_source_tags.sql +0 -0
  34. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/sql/migrations/004_add_query_log.sql +0 -0
  35. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/sql/migrations/005_add_query_log_user_oid.sql +0 -0
  36. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/sql/migrations/006_add_query_log_request_ms.sql +0 -0
  37. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/sql/schema.sql +0 -0
  38. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/templates/docforge.yml +0 -0
  39. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/templates/docker-compose.yml +0 -0
  40. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/templates/mcp_client.py +0 -0
  41. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge/templates/sources.yml +0 -0
  42. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge_cli.egg-info/dependency_links.txt +0 -0
  43. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge_cli.egg-info/entry_points.txt +0 -0
  44. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge_cli.egg-info/requires.txt +0 -0
  45. {docforge_cli-0.2.0 → docforge_cli-0.2.1/src}/docforge_cli.egg-info/top_level.txt +0 -0
--- docforge_cli-0.2.0/PKG-INFO
+++ docforge_cli-0.2.1/PKG-INFO
@@ -1,8 +1,13 @@
  Metadata-Version: 2.4
  Name: docforge-cli
- Version: 0.2.0
+ Version: 0.2.1
  Summary: Forge searchable context from Confluence and git repos for AI coding assistants
  License: MIT
+ Project-URL: Homepage, https://GranatenUdo.github.io/docforge/
+ Project-URL: Source, https://github.com/GranatenUdo/docforge
+ Project-URL: Issues, https://github.com/GranatenUdo/docforge/issues
+ Project-URL: Changelog, https://github.com/GranatenUdo/docforge/blob/master/CHANGELOG.md
+ Project-URL: Documentation, https://GranatenUdo.github.io/docforge/
  Requires-Python: >=3.12
  Description-Content-Type: text/markdown
  License-File: LICENSE
@@ -140,29 +145,26 @@ Contributions welcome. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for development
 
  ## Evaluation & retrieval quality
 
- docforge ships with a retrieval-quality eval harness at [`docforge/scripts/eval_search.py`](docforge/scripts/eval_search.py). It measures recall@1, recall@k, and MRR against a ground-truth query set you maintain. The harness is designed for **drift detection** — run it after `sources.yml` changes, embedding-model updates, or ranking tweaks, and compare against your baseline. There is no absolute quality threshold; the metric magnitude depends on how closely your ground-truth queries match source titles. See [`docforge/scripts/README.md`](docforge/scripts/README.md) for details.
+ docforge ships with a retrieval-quality eval harness at [`src/docforge/scripts/eval_search.py`](src/docforge/scripts/eval_search.py). It measures recall@1, recall@k, and MRR against a ground-truth query set you maintain. The harness is designed for **drift detection** — run it after `sources.yml` changes, embedding-model updates, or ranking tweaks, and compare against your baseline. There is no absolute quality threshold; the metric magnitude depends on how closely your ground-truth queries match source titles. See [`src/docforge/scripts/README.md`](src/docforge/scripts/README.md) for details.
 
  ## FAQ
 
- ### "Cannot connect to PostgreSQL"
-
- Check that the database is running: `docker compose up -d db`. Verify `DATABASE_URL` in `.env` points to `postgresql://docforge:localdev@localhost:5432/docforge` (or your custom value).
+ The three install-time issues new users hit most often are inline below. The
+ full FAQ (including "no results found", "ingest skipped everything", removing
+ sources, swapping embedding models, and where to file issues) lives on the
+ [microsite FAQ](https://GranatenUdo.github.io/docforge/faq/).
 
  ### "HF_TOKEN required" or model download fails
 
  The embedding model `google/embeddinggemma-300m` requires a Hugging Face token with access to the gated model. Create one at https://huggingface.co/settings/tokens, accept the model license at https://huggingface.co/google/embeddinggemma-300m, and set `HF_TOKEN=hf_...` in `.env`.
 
- ### "No results found" after ingest
-
- Run `docforge status` to confirm sources and chunks exist. If counts are zero, check the ingest logs for per-source failures — the summary at the end lists sources that failed.
-
  ### First ingest / first container start is very slow
 
  The first run downloads the 300M embedding model (~1.2 GB) from Hugging Face. Locally, the model is cached at `~/.cache/huggingface/`. In the Docker image, it is cached at `/app/.cache/huggingface/` — **mount this as a volume** so container restarts do not re-download: `docker run -v docforge-hf-cache:/app/.cache/huggingface ...`.
 
- ### "Ingest skipped everything"
+ ### "Cannot connect to PostgreSQL"
 
- docforge skips sources whose `content_hash` matches the stored hash (no changes detected). To force re-ingest, clear the hash: `UPDATE sources SET content_hash = NULL;` then run `docforge ingest`.
+ Check that the database is running: `docker compose up -d db`. Verify `DATABASE_URL` in `.env` points to `postgresql://docforge:localdev@localhost:5432/docforge` (or your custom value).
 
  ## License
 
--- docforge_cli-0.2.0/README.md
+++ docforge_cli-0.2.1/README.md
@@ -107,29 +107,26 @@ Contributions welcome. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for development
 
  ## Evaluation & retrieval quality
 
- docforge ships with a retrieval-quality eval harness at [`docforge/scripts/eval_search.py`](docforge/scripts/eval_search.py). It measures recall@1, recall@k, and MRR against a ground-truth query set you maintain. The harness is designed for **drift detection** — run it after `sources.yml` changes, embedding-model updates, or ranking tweaks, and compare against your baseline. There is no absolute quality threshold; the metric magnitude depends on how closely your ground-truth queries match source titles. See [`docforge/scripts/README.md`](docforge/scripts/README.md) for details.
+ docforge ships with a retrieval-quality eval harness at [`src/docforge/scripts/eval_search.py`](src/docforge/scripts/eval_search.py). It measures recall@1, recall@k, and MRR against a ground-truth query set you maintain. The harness is designed for **drift detection** — run it after `sources.yml` changes, embedding-model updates, or ranking tweaks, and compare against your baseline. There is no absolute quality threshold; the metric magnitude depends on how closely your ground-truth queries match source titles. See [`src/docforge/scripts/README.md`](src/docforge/scripts/README.md) for details.
 
  ## FAQ
 
- ### "Cannot connect to PostgreSQL"
-
- Check that the database is running: `docker compose up -d db`. Verify `DATABASE_URL` in `.env` points to `postgresql://docforge:localdev@localhost:5432/docforge` (or your custom value).
+ The three install-time issues new users hit most often are inline below. The
+ full FAQ (including "no results found", "ingest skipped everything", removing
+ sources, swapping embedding models, and where to file issues) lives on the
+ [microsite FAQ](https://GranatenUdo.github.io/docforge/faq/).
 
  ### "HF_TOKEN required" or model download fails
 
  The embedding model `google/embeddinggemma-300m` requires a Hugging Face token with access to the gated model. Create one at https://huggingface.co/settings/tokens, accept the model license at https://huggingface.co/google/embeddinggemma-300m, and set `HF_TOKEN=hf_...` in `.env`.
 
- ### "No results found" after ingest
-
- Run `docforge status` to confirm sources and chunks exist. If counts are zero, check the ingest logs for per-source failures — the summary at the end lists sources that failed.
-
  ### First ingest / first container start is very slow
 
  The first run downloads the 300M embedding model (~1.2 GB) from Hugging Face. Locally, the model is cached at `~/.cache/huggingface/`. In the Docker image, it is cached at `/app/.cache/huggingface/` — **mount this as a volume** so container restarts do not re-download: `docker run -v docforge-hf-cache:/app/.cache/huggingface ...`.
 
- ### "Ingest skipped everything"
+ ### "Cannot connect to PostgreSQL"
 
- docforge skips sources whose `content_hash` matches the stored hash (no changes detected). To force re-ingest, clear the hash: `UPDATE sources SET content_hash = NULL;` then run `docforge ingest`.
+ Check that the database is running: `docker compose up -d db`. Verify `DATABASE_URL` in `.env` points to `postgresql://docforge:localdev@localhost:5432/docforge` (or your custom value).
 
  ## License
 
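The README text in this diff names recall@1, recall@k, and MRR without defining them. The sketch below shows one common way to compute these metrics. It is not the code in `eval_search.py`, which this diff does not show. The function names and the sample data are hypothetical, and it assumes each query has exactly one expected source.

```python
from typing import Sequence


def recall_at_k(ranked: Sequence[Sequence[str]], expected: Sequence[str], k: int) -> float:
    """Fraction of queries whose expected source appears in the top-k results."""
    if not expected:
        raise ValueError("need at least one ground-truth query")
    hits = sum(1 for results, want in zip(ranked, expected) if want in results[:k])
    return hits / len(expected)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]], expected: Sequence[str]) -> float:
    """Average of 1/rank of the expected source; a miss contributes 0."""
    if not expected:
        raise ValueError("need at least one ground-truth query")
    total = 0.0
    for results, want in zip(ranked, expected):
        if want in results:
            total += 1.0 / (list(results).index(want) + 1)
    return total / len(expected)


# Hypothetical ground truth: three queries, one expected source each.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"]]
expected = ["a", "z", "missing"]

print(recall_at_k(ranked, expected, 1))        # one hit out of three queries
print(recall_at_k(ranked, expected, 3))        # two hits out of three queries
print(mean_reciprocal_rank(ranked, expected))  # (1 + 1/3 + 0) / 3
```

For drift detection, as the README describes it, the useful signal is the change in these numbers against a stored baseline, not their absolute size.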
--- docforge_cli-0.2.0/pyproject.toml
+++ docforge_cli-0.2.1/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
  [project]
  name = "docforge-cli"
- version = "0.2.0"
+ version = "0.2.1"
  description = "Forge searchable context from Confluence and git repos for AI coding assistants"
  readme = "README.md"
  license = {text = "MIT"}
@@ -25,6 +25,13 @@ dependencies = [
  "numpy>=1.26",
  ]
 
+ [project.urls]
+ Homepage = "https://GranatenUdo.github.io/docforge/"
+ Source = "https://github.com/GranatenUdo/docforge"
+ Issues = "https://github.com/GranatenUdo/docforge/issues"
+ Changelog = "https://github.com/GranatenUdo/docforge/blob/master/CHANGELOG.md"
+ Documentation = "https://GranatenUdo.github.io/docforge/"
+
  [project.scripts]
  docforge = "docforge.cli:app"
 
@@ -44,6 +51,7 @@ entra = [
  ]
 
  [tool.setuptools.packages.find]
+ where = ["src"]
  include = ["docforge*"]
 
  [tool.setuptools.package-data]
@@ -62,7 +70,7 @@ testpaths = ["tests"]
  markers = [
  "integration: requires Docker (pgvector container)",
  ]
- addopts = "--cov=docforge"
+ addopts = "--cov=src/docforge"
 
  [tool.coverage.report]
  fail_under = 60
--- docforge_cli-0.2.0/docforge_cli.egg-info/PKG-INFO
+++ docforge_cli-0.2.1/src/docforge_cli.egg-info/PKG-INFO
@@ -1,8 +1,13 @@
  Metadata-Version: 2.4
  Name: docforge-cli
- Version: 0.2.0
+ Version: 0.2.1
  Summary: Forge searchable context from Confluence and git repos for AI coding assistants
  License: MIT
+ Project-URL: Homepage, https://GranatenUdo.github.io/docforge/
+ Project-URL: Source, https://github.com/GranatenUdo/docforge
+ Project-URL: Issues, https://github.com/GranatenUdo/docforge/issues
+ Project-URL: Changelog, https://github.com/GranatenUdo/docforge/blob/master/CHANGELOG.md
+ Project-URL: Documentation, https://GranatenUdo.github.io/docforge/
  Requires-Python: >=3.12
  Description-Content-Type: text/markdown
  License-File: LICENSE
@@ -140,29 +145,26 @@ Contributions welcome. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for development
 
  ## Evaluation & retrieval quality
 
- docforge ships with a retrieval-quality eval harness at [`docforge/scripts/eval_search.py`](docforge/scripts/eval_search.py). It measures recall@1, recall@k, and MRR against a ground-truth query set you maintain. The harness is designed for **drift detection** — run it after `sources.yml` changes, embedding-model updates, or ranking tweaks, and compare against your baseline. There is no absolute quality threshold; the metric magnitude depends on how closely your ground-truth queries match source titles. See [`docforge/scripts/README.md`](docforge/scripts/README.md) for details.
+ docforge ships with a retrieval-quality eval harness at [`src/docforge/scripts/eval_search.py`](src/docforge/scripts/eval_search.py). It measures recall@1, recall@k, and MRR against a ground-truth query set you maintain. The harness is designed for **drift detection** — run it after `sources.yml` changes, embedding-model updates, or ranking tweaks, and compare against your baseline. There is no absolute quality threshold; the metric magnitude depends on how closely your ground-truth queries match source titles. See [`src/docforge/scripts/README.md`](src/docforge/scripts/README.md) for details.
 
  ## FAQ
 
- ### "Cannot connect to PostgreSQL"
-
- Check that the database is running: `docker compose up -d db`. Verify `DATABASE_URL` in `.env` points to `postgresql://docforge:localdev@localhost:5432/docforge` (or your custom value).
+ The three install-time issues new users hit most often are inline below. The
+ full FAQ (including "no results found", "ingest skipped everything", removing
+ sources, swapping embedding models, and where to file issues) lives on the
+ [microsite FAQ](https://GranatenUdo.github.io/docforge/faq/).
 
  ### "HF_TOKEN required" or model download fails
 
  The embedding model `google/embeddinggemma-300m` requires a Hugging Face token with access to the gated model. Create one at https://huggingface.co/settings/tokens, accept the model license at https://huggingface.co/google/embeddinggemma-300m, and set `HF_TOKEN=hf_...` in `.env`.
 
- ### "No results found" after ingest
-
- Run `docforge status` to confirm sources and chunks exist. If counts are zero, check the ingest logs for per-source failures — the summary at the end lists sources that failed.
-
  ### First ingest / first container start is very slow
 
  The first run downloads the 300M embedding model (~1.2 GB) from Hugging Face. Locally, the model is cached at `~/.cache/huggingface/`. In the Docker image, it is cached at `/app/.cache/huggingface/` — **mount this as a volume** so container restarts do not re-download: `docker run -v docforge-hf-cache:/app/.cache/huggingface ...`.
 
- ### "Ingest skipped everything"
+ ### "Cannot connect to PostgreSQL"
 
- docforge skips sources whose `content_hash` matches the stored hash (no changes detected). To force re-ingest, clear the hash: `UPDATE sources SET content_hash = NULL;` then run `docforge ingest`.
+ Check that the database is running: `docker compose up -d db`. Verify `DATABASE_URL` in `.env` points to `postgresql://docforge:localdev@localhost:5432/docforge` (or your custom value).
 
  ## License
 
--- /dev/null
+++ docforge_cli-0.2.1/src/docforge_cli.egg-info/SOURCES.txt
@@ -0,0 +1,42 @@
+ LICENSE
+ README.md
+ pyproject.toml
+ src/docforge/__init__.py
+ src/docforge/__main__.py
+ src/docforge/api.py
+ src/docforge/cli.py
+ src/docforge/config.py
+ src/docforge/db.py
+ src/docforge/ingest.py
+ src/docforge/lint.py
+ src/docforge/mcp_server.py
+ src/docforge/query_log.py
+ src/docforge/ranking.py
+ src/docforge/sources.py
+ src/docforge/crawlers/__init__.py
+ src/docforge/crawlers/confluence.py
+ src/docforge/crawlers/git.py
+ src/docforge/processors/__init__.py
+ src/docforge/processors/chunker.py
+ src/docforge/processors/embedder.py
+ src/docforge/processors/parser.py
+ src/docforge/scripts/__init__.py
+ src/docforge/scripts/eval_search.py
+ src/docforge/scripts/latency_report.py
+ src/docforge/sql/schema.sql
+ src/docforge/sql/migrations/001_add_source_identifier.sql
+ src/docforge/sql/migrations/002_add_status_index.sql
+ src/docforge/sql/migrations/003_add_source_tags.sql
+ src/docforge/sql/migrations/004_add_query_log.sql
+ src/docforge/sql/migrations/005_add_query_log_user_oid.sql
+ src/docforge/sql/migrations/006_add_query_log_request_ms.sql
+ src/docforge/templates/docforge.yml
+ src/docforge/templates/docker-compose.yml
+ src/docforge/templates/mcp_client.py
+ src/docforge/templates/sources.yml
+ src/docforge_cli.egg-info/PKG-INFO
+ src/docforge_cli.egg-info/SOURCES.txt
+ src/docforge_cli.egg-info/dependency_links.txt
+ src/docforge_cli.egg-info/entry_points.txt
+ src/docforge_cli.egg-info/requires.txt
+ src/docforge_cli.egg-info/top_level.txt
--- docforge_cli-0.2.0/docforge_cli.egg-info/SOURCES.txt
+++ /dev/null
@@ -1,42 +0,0 @@
- LICENSE
- README.md
- pyproject.toml
- docforge/__init__.py
- docforge/__main__.py
- docforge/api.py
- docforge/cli.py
- docforge/config.py
- docforge/db.py
- docforge/ingest.py
- docforge/lint.py
- docforge/mcp_server.py
- docforge/query_log.py
- docforge/ranking.py
- docforge/sources.py
- docforge/crawlers/__init__.py
- docforge/crawlers/confluence.py
- docforge/crawlers/git.py
- docforge/processors/__init__.py
- docforge/processors/chunker.py
- docforge/processors/embedder.py
- docforge/processors/parser.py
- docforge/scripts/__init__.py
- docforge/scripts/eval_search.py
- docforge/scripts/latency_report.py
- docforge/sql/schema.sql
- docforge/sql/migrations/001_add_source_identifier.sql
- docforge/sql/migrations/002_add_status_index.sql
- docforge/sql/migrations/003_add_source_tags.sql
- docforge/sql/migrations/004_add_query_log.sql
- docforge/sql/migrations/005_add_query_log_user_oid.sql
- docforge/sql/migrations/006_add_query_log_request_ms.sql
- docforge/templates/docforge.yml
- docforge/templates/docker-compose.yml
- docforge/templates/mcp_client.py
- docforge/templates/sources.yml
- docforge_cli.egg-info/PKG-INFO
- docforge_cli.egg-info/SOURCES.txt
- docforge_cli.egg-info/dependency_links.txt
- docforge_cli.egg-info/entry_points.txt
- docforge_cli.egg-info/requires.txt
- docforge_cli.egg-info/top_level.txt
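Taken together, the two SOURCES.txt hunks look like a pure move into a `src/` layout: every path except the three top-level files gains a `src/` prefix, which matches the `where = ["src"]` setting added in `pyproject.toml`. The sketch below is one way to check that reading mechanically. The helper names are hypothetical, and the sample lists are a small subset copied from the hunks, not the full 42 entries.

```python
from typing import Iterable, List

# Files that stay at the sdist root in both versions (per the hunks above).
TOP_LEVEL = {"LICENSE", "README.md", "pyproject.toml"}


def expected_new_path(old_path: str) -> str:
    """Map a 0.2.0 SOURCES.txt entry to the path expected under the src layout."""
    return old_path if old_path in TOP_LEVEL else f"src/{old_path}"


def is_pure_src_move(old_paths: Iterable[str], new_paths: Iterable[str]) -> bool:
    """True when the new listing is the old one with only the src/ prefix added."""
    mapped: List[str] = [expected_new_path(p) for p in old_paths]
    return mapped == list(new_paths)


# Subset of entries from the removed and added SOURCES.txt listings.
old_paths = ["LICENSE", "docforge/cli.py", "docforge_cli.egg-info/PKG-INFO"]
new_paths = ["LICENSE", "src/docforge/cli.py", "src/docforge_cli.egg-info/PKG-INFO"]

print(is_pure_src_move(old_paths, new_paths))
```

The check compares the listings in order, so a file that is added, dropped, or reordered makes it fail. That is the intended behaviour when confirming a release only moved files.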