@dev-pi2pie/word-counter 0.1.3-canary.2 → 0.1.4-canary.1
- package/README.md +27 -21
- package/dist/esm/bin.mjs +2033 -1656
- package/dist/esm/bin.mjs.map +1 -1
- package/dist/esm/worker/count-worker.mjs +1370 -0
- package/dist/esm/worker/count-worker.mjs.map +1 -0
- package/dist/esm/worker-pool.mjs +187 -0
- package/dist/esm/worker-pool.mjs.map +1 -0
- package/package.json +1 -1
package/README.md
CHANGED
@@ -139,6 +139,29 @@ word-counter --path ./examples/test-case-multi-files-support --keep-progress
 
 Progress is transient by default, auto-disabled for single-input runs, and suppressed in `--format raw` and `--format json`.
 
+### Batch Concurrency (`--jobs`)
+
+Use `--jobs` to control batch concurrency:
+
+```bash
+word-counter --path ./examples/test-case-multi-files-support --jobs 1
+word-counter --path ./examples/test-case-multi-files-support --jobs 4
+```
+
+Quick policy:
+
+- no `--jobs` and `--jobs 1` are equivalent baseline behavior.
+- `--jobs > 1` enables concurrent `load+count`.
+- if requested `--jobs` exceeds host `suggestedMaxJobs` (from `--print-jobs-limit`), the CLI warns and runs with the suggested limit as a safety cap.
+
+Inspect host jobs diagnostics:
+
+```bash
+word-counter --print-jobs-limit
+```
+
+For full policy details, JSON parity expectations (`--misc`, `--total-of whitespace,words`), and benchmark standards, see [`docs/batch-jobs-usage-guide.md`](docs/batch-jobs-usage-guide.md).
+
 ### Stable Path Resolution Contract
 
 - Repeated `--path` values are accepted as mixed inputs (file + directory).
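Reviewer note: the quick policy added in this hunk can be read as a small clamping rule. The sketch below is only an illustration of that reading, not the package's implementation. `resolveJobs`, `JobsLimit`, and `ResolvedJobs` are hypothetical names; only `suggestedMaxJobs` and the `--jobs` flag come from the README text.

```typescript
// Hypothetical shape of the host diagnostics; only `suggestedMaxJobs` is named in the README.
interface JobsLimit {
  suggestedMaxJobs: number;
}

// Hypothetical result type for this sketch.
interface ResolvedJobs {
  jobs: number;        // effective concurrency after applying the cap
  concurrent: boolean; // true when load+count would run concurrently
  warning?: string;    // set only when the request was capped
}

// Sketch of the quick policy: a missing --jobs behaves like --jobs 1,
// values above 1 enable concurrency, and requests above the host's
// suggested limit are capped with a warning.
function resolveJobs(requested: number | undefined, limit: JobsLimit): ResolvedJobs {
  const wanted = requested ?? 1;
  if (wanted > limit.suggestedMaxJobs) {
    const capped = limit.suggestedMaxJobs;
    return {
      jobs: capped,
      concurrent: capped > 1,
      warning: `--jobs ${wanted} exceeds suggestedMaxJobs ${capped}; running with ${capped}`,
    };
  }
  return { jobs: wanted, concurrent: wanted > 1 };
}
```

Under this reading, a host reporting `suggestedMaxJobs: 4` would run `--jobs 16` with four jobs and emit a warning, while `--jobs 1` and no flag at all would take the same non-concurrent path.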
@@ -591,27 +614,10 @@ Example JSON (trimmed):
 
 ## Locale Tag Detection Notes
 
-- Detection is regex/script based
-- Ambiguous Latin
--
--
-- `de`: `äöüÄÖÜß`
-- `es`: `ñÑ¿¡`
-- `pt`: `ãõÃÕ`
-- `fr`: `œŒæÆ`
-- `pl`: `ąćęłńśźżĄĆĘŁŃŚŹŻ`
-- `tr`: `ıİğĞşŞ`
-- `ro`: `ăĂâÂîÎșȘțȚ`
-- `hu`: `őŐűŰ`
-- `is`: `ðÐþÞ`
-- Latin text with other European diacritics may still remain in `und-Latn` unless a hint is provided.
-- Use `--mode chunk`/`--mode segments` or `--format json` to see the exact locale tag assigned to each chunk.
-- Regex/script-only detection cannot reliably identify English vs. other Latin-script languages; 100% certainty requires explicit metadata (document language tags, user-provided locale, headers) or a language-ID model.
-- Use `--latin-language <tag>` or `--latin-tag <tag>` for ambiguous Latin text.
-- Use `--latin-hint <tag>=<pattern>` (repeatable) and `--latin-hints-file <path>` to add custom Latin rules.
-- Use `--no-default-latin-hints` to disable built-in Latin diacritic rules.
-- Use `--han-language <tag>` or `--han-tag <tag>` for Han-script fallback.
-- `--latin-locale` remains supported as a legacy alias for now and is planned for future deprecation.
+- Detection is regex/script based, not statistical language-ID.
+- Ambiguous Latin defaults to `und-Latn`; Han fallback defaults to `und-Hani`.
+- Use explicit tag and hint flags when you need deterministic tagging.
+- Full notes (built-in heuristics, limitations, and override guidance) are tracked in `docs/locale-tag-detection-notes.md`.
 
 ## Breaking Changes Notes
 
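Reviewer note: the condensed locale notes describe regex/script based tagging with `und-Latn` and `und-Hani` fallbacks. The sketch below shows one way such a tagger could look; it is not the package's code. `tagChunk`, `LATIN_HINTS`, the simplified character ranges, the `"und"` result for text that is neither Han nor Latin, and the order in which hints and explicit tags are applied are all assumptions. The hint character sets are a subset of those listed in the removed README lines.

```typescript
// Hypothetical hint table; character sets copied from the removed README lines.
const LATIN_HINTS: Array<[string, RegExp]> = [
  ["de", /[äöüÄÖÜß]/],
  ["es", /[ñÑ¿¡]/],
  ["pt", /[ãõÃÕ]/],
  ["fr", /[œŒæÆ]/],
];

// Simplified script checks (assumption): the basic CJK Unified Ideographs
// block for Han, and ASCII letters plus Latin-1/Extended-A/B for Latin.
const HAN = /[\u4E00-\u9FFF]/;
const LATIN = /[A-Za-z\u00C0-\u024F]/;

// Tag a chunk of text. `latinTag` and `hanTag` stand in for user-supplied
// overrides such as --latin-tag and --han-tag.
function tagChunk(text: string, latinTag?: string, hanTag?: string): string {
  if (HAN.test(text)) {
    return hanTag ?? "und-Hani"; // Han fallback
  }
  if (LATIN.test(text)) {
    // Diacritic hints first (assumed order), then the explicit tag,
    // then the ambiguous-Latin default.
    for (const [tag, pattern] of LATIN_HINTS) {
      if (pattern.test(text)) return tag;
    }
    return latinTag ?? "und-Latn";
  }
  return "und"; // assumption: undetermined for any other script
}
```

In this sketch plain ASCII text stays ambiguous unless a tag is supplied, which mirrors the README's point that regex-only detection cannot distinguish English from other Latin-script languages.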