fr-spell 1.0.1 (npm package version diff)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (20)

package/LICENSE +21 -0
package/README.cn.md +165 -0
package/README.fr.md +165 -0
package/README.md +165 -0
package/benchmark/checklist_adje_100.json +702 -0
package/benchmark/checklist_lemma_verb_100.json +402 -0
package/benchmark/checklist_noun_100.json +702 -0
package/benchmark/checklist_verb_100.json +702 -0
package/benchmark/generate-checklists.js +192 -0
package/benchmark/run-benchmark.js +123 -0
package/models/small/derive_form_model.int8.onnx +0 -0
package/models/small/derive_form_vocab.json +74 -0
package/models/small/lemma_type_labels.json +47 -0
package/models/small/lemma_type_model.int8.onnx +0 -0
package/models/small/lemma_type_vocab.json +244 -0
package/package.json +50 -0
package/scripts/help.js +54 -0
package/src/frspell.js +9 -0
package/src/module/Predictor.js +416 -0
package/test/test.js +21 -0

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Davy Chen
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.cn.md ADDED

@@ -0,0 +1,165 @@
+# FR-SPELL
+[English](README.md) | [中文](README.cn.md) | [Français](README.fr.md)
+FR-SPELL 是一个用于法语词形还原与派生形态生成的 npm 包。
+支持能力如下：
+- 将动词变位形式预测为词元（lemma）
+- 名词词形生成
+- 形容词词形生成
+- 动词词形生成
+该包基于 ONNX Runtime 与 INT8 量化模型，兼顾高速度与小体积。
+## 安装
+```bash
+npm install fr-spell
+```
+## 集成到你的项目
+```js
+import { FrSpell } from 'fr-spell';
+const predictor = await FrSpell();
+const lemma = await predictor.lemma('mangeons');
+const noun = await predictor.nounDerive('chat', 'THD_PLF');
+const adje = await predictor.adjeDerive('beau', 'THD_F');
+const verb = await predictor.verbDerive('manger', 'FST_PL', 'INDI', 'PRES');
+console.log(lemma);
+console.log(noun);
+console.log(adje);
+console.log(verb);
+```
+示例运行输出：
+```txt
+{ input: 'mangeons', lemma: 'manger', wordType: 'VERB', confidence: 0.9965604285, timeMs: 3.89 }
+{ lemma: 'chat', wordType: 'NOUN', person: 'THD_PLF', mode: 'ALL', tense: 'ALL', output: 'chattes', confidence: 0.9997230679, timeMs: 5.06 }
+{ lemma: 'beau', wordType: 'ADJE', person: 'THD_F', mode: 'ALL', tense: 'ALL', output: 'belle', confidence: 0.9999751771, timeMs: 3.08 }
+{ lemma: 'manger', wordType: 'VERB', person: 'FST_PL', mode: 'INDI', tense: 'PRES', output: 'mangeons', confidence: 0.9999864523, timeMs: 4.79 }
+```
+## 预测参数说明
+词元预测（lemma）：
+- API：`predictor.lemma(input)`
+- `input`：字符串，输入变位/屈折后的词形，例如 `mangeons`
+派生预测（derive）：
+- 名词 API：`predictor.nounDerive(lemma, person)`
+- 形容词 API：`predictor.adjeDerive(lemma, person)`
+- 动词 API：`predictor.verbDerive(lemma, person, mode, tense)`
+- 通用 API：`predictor.derive(lemma, wordType, person, mode, tense)`
+可用 `wordType`：
+- `NOUN`（名词）
+- `ADJE`（形容词）
+- `VERB`（动词）
+可用 `person`：
+- `FST`（第一人称单数）
+- `SND`（第二人称单数）
+- `THD_M`（第三人称阳性单数）
+- `THD_F`（第三人称阴性单数）
+- `FST_PL`（第一人称复数）
+- `SND_PL`（第二人称复数）
+- `THD_PLM`（第三人称阳性复数）
+- `THD_PLF`（第三人称阴性复数）
+可用 `mode`：
+- `INDI`（陈述式）
+- `SUBJ`（虚拟式）
+- `COND`（条件式）
+- `PART`（分词式）
+- `IMPE`（命令式）
+- `INFI`（不定式）
+当前实现支持的 `tense` 仅有：
+- `PRES`（现在时）
+- `IMPA`（未完成过去时）
+- `FUTU`（将来时）
+- `PASS`（过去时）
+说明：
+- 参考定义文件中还有更多时态名称，但本包实现目前只支持 `PRES`、`IMPA`、`FUTU`、`PASS`。
+- 对于名词/形容词派生，用户输入时不需要 `mode` 与 `tense`。
+## 运行测试
+```bash
+npm test
+```
+该命令会执行 test/test.js，并输出示例预测结果。
+## 查看帮助
+```bash
+npm run help
+```
+该命令会输出参数速查说明，覆盖 `lemma`、`nounDerive`、`adjeDerive`、`verbDerive`、`derive` 及 person/mode/tense 可用值。
+## 运行基准测试
+1) 先生成检查清单 JSON（每类 100 条）：
+```bash
+npm run benchmark:prepare
+```
+2) 运行全部基准测试：
+```bash
+npm run benchmark
+```
+3) 可选：按单项运行：
+```bash
+npm run benchmark:lemma
+npm run benchmark:noun
+npm run benchmark:verb
+npm run benchmark:adje
+```
+## 基准结果（最近一次本地运行）
+基准命令：
+```bash
+npm run benchmark
+```
+结果：
+- 由变位预测词元：99/100，准确率 99.00%，平均 16.46 ms
+- 名词派生：99/100，准确率 99.00%，平均 17.33 ms
+- 动词派生：100/100，准确率 100.00%，平均 17.12 ms
+- 形容词派生：100/100，准确率 100.00%，平均 17.34 ms
+## 模型体积
+- 词元 ONNX 模型：models/small/lemma_type_model.int8.onnx = 0.96 MB
+- 派生 ONNX 模型：models/small/derive_form_model.int8.onnx = 0.91 MB
+- ONNX 总体积：约 1.87 MB
+## 为什么它非常适合 Web 前端产品
+- 法语关键形态任务具有高准确率
+- 单次请求延迟低（本地基准平均约 16 到 17 ms）
+- ONNX 体积小（总计约 1.87 MB）
+- 非常适合为 Web 前端功能提供后端推理能力，例如实时写作辅助、语法提示与词元感知检索

package/README.fr.md ADDED

@@ -0,0 +1,165 @@
+# FR-SPELL
+[English](README.md) | [中文](README.cn.md) | [Français](README.fr.md)
+FR-SPELL est un package npm pour la prédiction de lemme en français et la génération de formes dérivées.
+Fonctionnalités prises en charge :
+- prédiction du lemme à partir d'une forme conjuguée
+- génération de formes nominales
+- génération de formes adjectivales
+- génération de formes verbales
+Le package s'appuie sur ONNX Runtime et des modèles INT8 quantifiés pour offrir une grande vitesse avec une empreinte mémoire réduite.
+## Installation
+```bash
+npm install fr-spell
+```
+## Intégration dans votre projet
+```js
+import { FrSpell } from 'fr-spell';
+const predictor = await FrSpell();
+const lemma = await predictor.lemma('mangeons');
+const noun = await predictor.nounDerive('chat', 'THD_PLF');
+const adje = await predictor.adjeDerive('beau', 'THD_F');
+const verb = await predictor.verbDerive('manger', 'FST_PL', 'INDI', 'PRES');
+console.log(lemma);
+console.log(noun);
+console.log(adje);
+console.log(verb);
+```
+Exemple de sortie à l'exécution :
+```txt
+{ input: 'mangeons', lemma: 'manger', wordType: 'VERB', confidence: 0.9965604285, timeMs: 3.89 }
+{ lemma: 'chat', wordType: 'NOUN', person: 'THD_PLF', mode: 'ALL', tense: 'ALL', output: 'chattes', confidence: 0.9997230679, timeMs: 5.06 }
+{ lemma: 'beau', wordType: 'ADJE', person: 'THD_F', mode: 'ALL', tense: 'ALL', output: 'belle', confidence: 0.9999751771, timeMs: 3.08 }
+{ lemma: 'manger', wordType: 'VERB', person: 'FST_PL', mode: 'INDI', tense: 'PRES', output: 'mangeons', confidence: 0.9999864523, timeMs: 4.79 }
+```
+## Paramètres de prédiction
+Prédiction de lemme :
+- API : `predictor.lemma(input)`
+- `input` : chaîne de caractères, forme fléchie/conjuguée, par exemple `mangeons`
+Prédiction de dérivation :
+- API nom : `predictor.nounDerive(lemma, person)`
+- API adjectif : `predictor.adjeDerive(lemma, person)`
+- API verbe : `predictor.verbDerive(lemma, person, mode, tense)`
+- API générique : `predictor.derive(lemma, wordType, person, mode, tense)`
+Valeurs `wordType` autorisées :
+- `NOUN` (nom)
+- `ADJE` (adjectif)
+- `VERB` (verbe)
+Valeurs `person` autorisées :
+- `FST` (1re personne du singulier)
+- `SND` (2e personne du singulier)
+- `THD_M` (3e personne masculine du singulier)
+- `THD_F` (3e personne féminine du singulier)
+- `FST_PL` (1re personne du pluriel)
+- `SND_PL` (2e personne du pluriel)
+- `THD_PLM` (3e personne masculine du pluriel)
+- `THD_PLF` (3e personne féminine du pluriel)
+Valeurs `mode` autorisées :
+- `INDI` (indicatif)
+- `SUBJ` (subjonctif)
+- `COND` (conditionnel)
+- `PART` (participe)
+- `IMPE` (impératif)
+- `INFI` (infinitif)
+Valeurs `tense` prises en charge dans l'implémentation actuelle :
+- `PRES` (présent)
+- `IMPA` (imparfait)
+- `FUTU` (futur)
+- `PASS` (passé)
+Note :
+- Le fichier de définitions d'origine contient plus de noms de temps, mais ce package prend actuellement en charge uniquement `PRES`, `IMPA`, `FUTU`, `PASS`.
+- Pour les appels nom/adjectif, `mode` et `tense` ne sont pas requis dans l'entrée utilisateur.
+## Exécuter les tests
+```bash
+npm test
+```
+Cette commande exécute test/test.js et affiche des exemples de prédiction.
+## Afficher l'aide
+```bash
+npm run help
+```
+Cette commande affiche un guide rapide des paramètres pour `lemma`, `nounDerive`, `adjeDerive`, `verbDerive` et `derive`, avec les valeurs autorisées de person/mode/tense.
+## Exécuter les benchmarks
+1) Générer les fichiers JSON de checklist (100 éléments chacun) :
+```bash
+npm run benchmark:prepare
+```
+2) Exécuter toutes les suites de benchmark :
+```bash
+npm run benchmark
+```
+3) Optionnel : exécuter une suite spécifique :
+```bash
+npm run benchmark:lemma
+npm run benchmark:noun
+npm run benchmark:verb
+npm run benchmark:adje
+```
+## Résultats de benchmark (dernier run local)
+Commande de benchmark :
+```bash
+npm run benchmark
+```
+Résultats :
+- lemme depuis une conjugaison : 99/100, précision 99.00 %, moyenne 16.46 ms
+- dérivation nominale : 99/100, précision 99.00 %, moyenne 17.33 ms
+- dérivation verbale : 100/100, précision 100.00 %, moyenne 17.12 ms
+- dérivation adjectivale : 100/100, précision 100.00 %, moyenne 17.34 ms
+## Taille des modèles
+- modèle ONNX lemme : models/small/lemma_type_model.int8.onnx = 0.96 MB
+- modèle ONNX dérivation : models/small/derive_form_model.int8.onnx = 0.91 MB
+- taille totale ONNX : environ 1.87 MB
+## Pourquoi c'est idéal pour des produits web frontend
+- excellente précision sur les tâches clés de morphologie française
+- faible latence par requête (environ 16 à 17 ms en moyenne sur benchmark local)
+- empreinte ONNX très compacte (environ 1.87 MB au total)
+- parfait pour alimenter des fonctionnalités frontend via une inférence backend : assistance à l'écriture en temps réel, suggestions grammaticales et recherche basée sur le lemme

package/README.md ADDED Viewed

@@ -0,0 +1,165 @@
+# FR-SPELL
+[English](README.md) | [中文](README.cn.md) | [Français](README.fr.md)
+FR-SPELL is an npm package for French lemma prediction and derivative form generation.
+It supports:
+- lemma prediction from conjugated forms
+- noun form generation
+- adjective form generation
+- verb form generation
+The package runs on ONNX Runtime with quantized INT8 models for high speed and a small model footprint.
+## Install
+```bash
+npm install fr-spell
+```
+## Integrate Into Your Project
+```js
+import { FrSpell } from 'fr-spell';
+const predictor = await FrSpell();
+const lemma = await predictor.lemma('mangeons');
+const noun = await predictor.nounDerive('chat', 'THD_PLF');
+const adje = await predictor.adjeDerive('beau', 'THD_F');
+const verb = await predictor.verbDerive('manger', 'FST_PL', 'INDI', 'PRES');
+console.log(lemma);
+console.log(noun);
+console.log(adje);
+console.log(verb);
+```
+Sample runtime output:
+```txt
+{ input: 'mangeons', lemma: 'manger', wordType: 'VERB', confidence: 0.9965604285, timeMs: 3.89 }
+{ lemma: 'chat', wordType: 'NOUN', person: 'THD_PLF', mode: 'ALL', tense: 'ALL', output: 'chattes', confidence: 0.9997230679, timeMs: 5.06 }
+{ lemma: 'beau', wordType: 'ADJE', person: 'THD_F', mode: 'ALL', tense: 'ALL', output: 'belle', confidence: 0.9999751771, timeMs: 3.08 }
+{ lemma: 'manger', wordType: 'VERB', person: 'FST_PL', mode: 'INDI', tense: 'PRES', output: 'mangeons', confidence: 0.9999864523, timeMs: 4.79 }
+```
+## Prediction Parameters
+Lemma prediction:
+- API: `predictor.lemma(input)`
+- `input`: string, inflected/conjugated word form, for example `mangeons`
+Derive prediction:
+- Noun API: `predictor.nounDerive(lemma, person)`
+- Adjective API: `predictor.adjeDerive(lemma, person)`
+- Verb API: `predictor.verbDerive(lemma, person, mode, tense)`
+- Generic API: `predictor.derive(lemma, wordType, person, mode, tense)`
+Allowed `wordType` values:
+- `NOUN` (noun)
+- `ADJE` (adjective)
+- `VERB` (verb)
+Allowed `person` values:
+- `FST` (1st person singular)
+- `SND` (2nd person singular)
+- `THD_M` (3rd person masculine singular)
+- `THD_F` (3rd person feminine singular)
+- `FST_PL` (1st person plural)
+- `SND_PL` (2nd person plural)
+- `THD_PLM` (3rd person masculine plural)
+- `THD_PLF` (3rd person feminine plural)
+Allowed `mode` values:
+- `INDI` (indicative)
+- `SUBJ` (subjunctive)
+- `COND` (conditional)
+- `PART` (participle)
+- `IMPE` (imperative)
+- `INFI` (infinitive)
+Allowed `tense` values in the current implementation:
+- `PRES` (present)
+- `IMPA` (imperfect)
+- `FUTU` (future)
+- `PASS` (past)
+Note:
+- The original grammar definition file includes more tense names, but this package implementation currently supports only `PRES`, `IMPA`, `FUTU`, `PASS`.
+- For noun/adjective derive calls, `mode` and `tense` are not required in user input.
+## Run Test
+```bash
+npm test
+```
+This executes test/test.js and prints sample prediction outputs.
+## Run Help
+```bash
+npm run help
+```
+This prints a quick parameter reference for `lemma`, `nounDerive`, `adjeDerive`, `verbDerive`, and `derive`, including allowed person/mode/tense values.
+## Run Benchmark
+1) Prepare checklist JSON files (100 items each):
+```bash
+npm run benchmark:prepare
+```
+2) Run all benchmark suites:
+```bash
+npm run benchmark
+```
+3) Optional single-suite runs:
+```bash
+npm run benchmark:lemma
+npm run benchmark:noun
+npm run benchmark:verb
+npm run benchmark:adje
+```
+## Benchmark Result (Latest Local Run)
+Benchmark command:
+```bash
+npm run benchmark
+```
+Results:
+- lemma from conjugation: 99/100, accuracy 99.00%, average 16.46 ms
+- noun derive: 99/100, accuracy 99.00%, average 17.33 ms
+- verb derive: 100/100, accuracy 100.00%, average 17.12 ms
+- adjective derive: 100/100, accuracy 100.00%, average 17.34 ms
+## Model Size
+- lemma ONNX model: models/small/lemma_type_model.int8.onnx = 0.96 MB
+- derive ONNX model: models/small/derive_form_model.int8.onnx = 0.91 MB
+- total ONNX model size: about 1.87 MB
+## Why It Is Great For Web Frontend Products
+- high accuracy for key French morphology tasks
+- low per-request latency (about 16 to 17 ms average in local benchmark)
+- very small ONNX footprint (about 1.87 MB total)
+- ideal for backend inference powering web frontend features such as live writing assistance, grammar hints, and lemma-aware search
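
The "Prediction Parameters" section of the README lists the allowed `wordType`, `person`, `mode`, and `tense` values, but the diff shown here does not reveal whether the package rejects other values itself. The sketch below is one way a caller might check arguments before invoking the predictor. `planDeriveCall` is a hypothetical helper, not part of fr-spell; it only uses the value lists and method names documented in the README, and it does not check whether a given mode/tense combination is meaningful in French.

```javascript
// Allowed values copied from the README's "Prediction Parameters" section.
const PERSONS = new Set([
  'FST', 'SND', 'THD_M', 'THD_F',
  'FST_PL', 'SND_PL', 'THD_PLM', 'THD_PLF',
]);
const MODES = new Set(['INDI', 'SUBJ', 'COND', 'PART', 'IMPE', 'INFI']);
const TENSES = new Set(['PRES', 'IMPA', 'FUTU', 'PASS']);

// Hypothetical helper (not provided by fr-spell): validates the arguments
// and returns which documented predictor method to call, with its arguments.
function planDeriveCall(lemma, wordType, person, mode, tense) {
  if (typeof lemma !== 'string' || lemma.trim() === '') {
    throw new TypeError('lemma must be a non-empty string');
  }
  if (!PERSONS.has(person)) {
    throw new RangeError(`unknown person: ${person}`);
  }
  switch (wordType) {
    case 'NOUN':
      // Noun and adjective calls take no mode or tense, per the README.
      return { method: 'nounDerive', args: [lemma, person] };
    case 'ADJE':
      return { method: 'adjeDerive', args: [lemma, person] };
    case 'VERB':
      if (!MODES.has(mode)) throw new RangeError(`unknown mode: ${mode}`);
      if (!TENSES.has(tense)) throw new RangeError(`unsupported tense: ${tense}`);
      return { method: 'verbDerive', args: [lemma, person, mode, tense] };
    default:
      throw new RangeError(`unknown wordType: ${wordType}`);
  }
}

// Usage with a predictor created as in the README (requires fr-spell):
//   const { method, args } = planDeriveCall('manger', 'VERB', 'FST_PL', 'INDI', 'PRES');
//   const result = await predictor[method](...args);
```

Returning a method name and argument list, rather than calling the predictor directly, keeps the check testable without loading the ONNX models.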