daidai 0.1.0 → 0.1.1.dev.20260627.f7f9ee5
- checksums.yaml +4 -4
- data/CHANGELOG.md +10 -1
- data/README.md +5 -4
- data/lib/daidai/deinflector.rb +2 -3
- data/lib/daidai/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: a97f1319b774e539184acc0b5a8e45dc85e26d8a944d4e3d52d8d19abcd98729
+data.tar.gz: b71b9b7f3a9aa79e54c2918a5c6ba47b3e27c56c634612dab0fab58cde7786f3
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 4c84f1e48d01875f02e7e29c0cc1a001d28a84e85ee67d8769776fa5f4f4576e0cb801d907fcd4f223763bf835d860fffe614498c182eb29013cedfa284bbff8
+data.tar.gz: 950541a236e0416cbad536057f29f7c64fd450c0dfc1b094bed66750eed5636da21907630f45b103b83f3bd8535eca175b03da4e649ffa17ee69c6d994dec8ad
data/CHANGELOG.md
CHANGED
@@ -6,6 +6,14 @@ follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ## [Unreleased]
 
+## [0.1.1] - 2026-06-27
+
+### Changed
+
+- `Daidai::Deinflection` no longer exposes the internal `conditions` bitmask;
+use `#dictionary_form?` (the raw flags remain on `Daidai::Deinflector.transform`).
+- README: corrected the "Data & tables" scope, added Yomitan attribution.
+
 ## [0.1.0] - 2026-06-27
 
 ### Added
@@ -27,5 +35,6 @@ follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 inverse of `#conjugate`). Ported from Yomitan's Japanese language transforms;
 also covers colloquial contractions (てる, ちゃう, …). See `Daidai::Deinflector`.
 
-[Unreleased]: https://github.com/davafons/daidai/compare/v0.1.
+[Unreleased]: https://github.com/davafons/daidai/compare/v0.1.1...HEAD
+[0.1.1]: https://github.com/davafons/daidai/compare/v0.1.0...v0.1.1
 [0.1.0]: https://github.com/davafons/daidai/releases/tag/v0.1.0
data/README.md
CHANGED
@@ -194,14 +194,14 @@ d.to_s # => "食べる [-いる, -て]"
 ```
 
 Deinflection is rule-based and **dictionary-free**, so it returns *every* base
-form the rules can reach
+form the rules can reach, many of which are not real words (食べてる also yields
 食べつ as a hypothetical potential). It is meant to feed a dictionary lookup: keep
 the candidates whose `term` is a real entry. If you have no dictionary, filtering
 to `dictionary_form?` candidates keeps the plausible lemmas.
 
 This pairs naturally with a dictionary like JMdict: deinflect the query, look up
 each candidate `term`, and you have the lemma, its part of speech, and the named
-inflection
+inflection, without a morphological analyzer. (For a single authoritative lemma
 + reading from arbitrary text, including full sentences, the kabosu path above is
 still the tool; the two are complementary.)
 
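The lookup step described in this README hunk can be sketched without the gem installed. This is a minimal illustration, not Daidai's implementation: `Candidate` is a stand-in struct that mirrors the three fields of `Daidai::Deinflection`, the candidate list is hand-written rather than real `Daidai.deinflect` output, and `headwords` is a hypothetical stand-in for a JMdict headword index.

```ruby
require "set"

# Stand-in for Daidai::Deinflection (assumption: same three fields as in the diff).
Candidate = Struct.new(:term, :inflections, :dictionary_form, keyword_init: true) do
  def dictionary_form?
    dictionary_form
  end
end

# Hand-written candidates for illustration only; the real output of
# Daidai.deinflect("食べてる") may differ in length, inflection names, and flags.
candidates = [
  Candidate.new(term: "食べる", inflections: ["-いる", "-て"], dictionary_form: true),
  Candidate.new(term: "食べつ", inflections: ["potential"], dictionary_form: true),
  Candidate.new(term: "食べてる", inflections: [], dictionary_form: false)
]

# Hypothetical headword index standing in for a dictionary such as JMdict.
headwords = Set["食べる", "飲む"]

# With a dictionary: keep candidates whose term is a real entry.
# Without one: fall back to the dictionary_form? flag.
def plausible_lemmas(candidates, headwords: nil)
  return candidates.select(&:dictionary_form?) if headwords.nil?

  candidates.select { |candidate| headwords.include?(candidate.term) }
end

with_dictionary = plausible_lemmas(candidates, headwords: headwords).map(&:term)
without_dictionary = plausible_lemmas(candidates).map(&:term)
```

In this made-up data the dictionary filter is the stricter of the two, which matches the README's advice to prefer a real lookup when one is available.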
@@ -211,7 +211,7 @@ Japanese language transforms and is vendored as JSON under
 
 ## Data & tables
 
-All of the
+All of the *conjugation* knowledge lives in four tab-separated tables vendored under `lib/daidai/resources/`, taken from **JMdictDB** (the maintained home of these tables; jconj is the standalone reference implementation Daidai ports). (The deinflection rules live beside them as `japanese-transforms.json`; see [Deinflection](#deinflection-inflected-form-to-dictionary-form).)
 
 | File | Contents |
 |------|----------|
@@ -232,10 +232,11 @@ This downloads the latest `conj.csv`, `conjo.csv`, `conotes.csv`, and `kwpos.csv
 
 ## Data & attribution
 
-The conjugation
+The conjugation and deinflection logic are not original to Daidai. They come from:
 
 - **JMdictDB** (<https://gitlab.com/yamagoya/jmdictdb>), by Stuart McGraw: the actively-maintained home of the conjugation tables and part-of-speech taxonomy, under Jim Breen's **Electronic Dictionary Research and Development Group (EDRDG)**, <https://www.edrdg.org/>.
 - **jconj** (<https://gitlab.com/yamagoya/jconj>): the standalone, table-based conjugator whose algorithm Daidai ports to Ruby.
+- **Yomitan** (<https://github.com/yomidevs/yomitan>), by the Yomitan Authors: the source of the deinflection rule set and transformer that `Daidai.deinflect` ports to Ruby.
 
 Because the upstream work is GPL-licensed, Daidai inherits that lineage and is distributed under the **GPL-3.0** license. The JMdict/JMdictDB data is used under the EDRDG licence; please retain the attribution above and the `NOTICE` file in any redistribution.
 
data/lib/daidai/deinflector.rb
CHANGED
@@ -6,12 +6,12 @@ module Daidai
 # A single deinflection candidate: a base-form `term` reached from the input by
 # applying `inflections` (transform names, ordered from the surface form inward
 # to the dictionary form). `dictionary_form?` is true when the rule chain lands
-# on a recognised dictionary form (a likely real lemma)
+# on a recognised dictionary form (a likely real lemma), useful for callers
 # without their own dictionary to look the term up in.
 #
 # Daidai.deinflect("食べてる") # candidate base forms, each with named inflections;
 # # one is #<Daidai::Deinflection 食べる [-いる, -て]>
-Deinflection = Struct.new(:term, :inflections, :
+Deinflection = Struct.new(:term, :inflections, :dictionary_form, keyword_init: true) do
 def dictionary_form? = dictionary_form
 
 def to_s = inflections.empty? ? term : "#{term} [#{inflections.join(", ")}]"
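To make the caller-visible effect of the new struct definition concrete, here is a minimal standalone sketch. It re-declares the struct outside the `Daidai` module with the two methods shown in the hunk (written with conventional `def ... end` rather than endless methods) and assumes nothing else about the gem. The sample values are illustrative.

```ruby
# Standalone copy of the struct from the + side of the hunk.
Deinflection = Struct.new(:term, :inflections, :dictionary_form, keyword_init: true) do
  def dictionary_form?
    dictionary_form
  end

  def to_s
    inflections.empty? ? term : "#{term} [#{inflections.join(", ")}]"
  end
end

candidate = Deinflection.new(term: "食べる", inflections: ["-いる", "-て"], dictionary_form: true)

members = Deinflection.members # :conditions is not among them
rendered = candidate.to_s
exposes_conditions = candidate.respond_to?(:conditions)

# keyword_init structs reject keywords that are not members, so a call that
# still passes conditions: raises instead of being silently ignored.
rejected = begin
  Deinflection.new(term: "食べる", inflections: [], dictionary_form: true, conditions: 0b1)
  false
rescue ArgumentError
  true
end
```

Code written against 0.1.0 that read `conditions` from a candidate would need to switch to `dictionary_form?`, as the changelog entry says.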
@@ -94,7 +94,6 @@ module Daidai
 # trace is newest-first (innermost rule first); reverse so the names read
 # from the surface form inward to the dictionary form.
 inflections: transformed.trace.reverse.map { |frame| transforms_by_id[frame[:transform]].name },
-conditions: transformed.conditions,
 dictionary_form: transformed.conditions.anybits?(dictionary_mask)
 )
 end
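The line this hunk keeps collapses the internal bitmask into the single boolean that callers now see. A minimal sketch of that idea follows. The flag names and bit values are invented for illustration and are not Daidai's or Yomitan's actual condition flags; only `Integer#anybits?` is real Ruby.

```ruby
# Hypothetical condition flags; the real values come from the vendored rule set.
V1        = 0b0001 # ichidan verb, treated here as a dictionary form
V5        = 0b0010 # godan verb, treated here as a dictionary form
TE_FORM   = 0b0100 # intermediate form, not a dictionary form
MASU_STEM = 0b1000 # intermediate form, not a dictionary form

# Union of every flag that counts as a dictionary form.
DICTIONARY_MASK = V1 | V5

# True when the conditions share at least one bit with the mask.
def dictionary_form?(conditions)
  conditions.anybits?(DICTIONARY_MASK)
end

results = {
  ichidan: dictionary_form?(V1),
  mixed: dictionary_form?(V5 | TE_FORM),
  te_only: dictionary_form?(TE_FORM | MASU_STEM),
  none: dictionary_form?(0)
}
```

Exposing only the boolean keeps the bit layout private to the deinflector, which is consistent with the 0.1.1 changelog entry.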
data/lib/daidai/version.rb
CHANGED