eyeling 1.15.2 → 1.15.3
- package/HANDBOOK.md +345 -0
- package/package.json +1 -1
- package/test/check.test.js +12 -12
package/HANDBOOK.md
CHANGED
@@ -29,6 +29,7 @@
 - [Appendix B — Notation3: when facts can carry their own logic](#app-b)
 - [Appendix C — N3 beyond Prolog: logic that survives the open web](#app-c)
 - [Appendix D — LLM + Eyeling: A Repeatable Logic Toolchain](#app-d)
+- [Appendix E — How Eyeling reaches 100% on `notation3tests`](#app-e)
 
 ---
 
@@ -2160,3 +2161,347 @@ A simple structure that keeps the LLM honest:
 - “If something is unknown, emit a placeholder fact (`:needsFact`) rather than guessing.”
 
 The point isn’t that the LLM is “right”; it’s that **Eyeling makes the result checkable**, and the artifact becomes a maintainable program rather than a one-off generation.
+
+---
+
+<a id="app-e"></a>
+
+## Appendix E — How Eyeling reaches 100% on `notation3tests`
+
+### E.1 The goal
+
+Eyeling does not treat [notation3tests](https://codeberg.org/phochste/notation3tests/) as a side check.
+
+It treats the suite as an **external semantic contract**.
+
+That means:
+
+- the target is public
+- the target is reproducible
+- the target is outside the local codebase
+- success means interoperability, not self-consistency
+
+---
+
+### E.2 The test loop
+
+The workflow is simple and strict:
+
+- clone the external [notation3tests](https://codeberg.org/phochste/notation3tests/) suite
+- package the current Eyeling tree
+- install that package into the suite
+- run the suite’s Eyeling target
+- fix semantics, not cosmetics
+
+This keeps the suite honest and keeps Eyeling honest.
+
+---
+
+### E.3 The prompt packet
+
+A typical conformance-fix prompt is not open-ended.
+
+It usually includes a small, repeatable packet:
+
+- the Eyeling source as an attached zip `https://github.com/eyereasoner/eyeling/archive/refs/heads/main.zip`
+- pointers to the failing tests
+- the exact failing output, or the exact command needed to reproduce it
+- a pointer to the N3 spec `https://w3c.github.io/N3/spec/`
+- a pointer to the builtin definitions `https://w3c.github.io/N3/spec/builtins.html`
+- a direct request to fix the issue in the engine
+- a direct request to update `HANDBOOK.md`
+
+The request is usually phrased in a narrow way:
+
+- fix this specific failing conformance case
+- preserve existing passing behavior
+- make the smallest coherent patch
+- add or update a regression test if needed
+- update the handbook so the semantic rule is documented, not just implemented
+- do not stop at making the test green; align the implementation with the spec and explain the semantic reason in `HANDBOOK.md`
+
+The model is not asked to “improve the reasoner” in general.
+
+It is asked to repair one semantic gap against: the code, the failing test, the spec, and the handbook.
+
+---
+
+### E.4 The core idea
+
+Eyeling reaches 100% by making the engine match the semantics that the suite exercises.
+
+That means getting these right:
+
+- N3 syntax
+- rule forms
+- quoted formulas
+- variable and blank-node behavior
+- builtin relations
+- closure and duplicate control
+
+The result is not “test gaming.”
+
+The result is semantic alignment.
+
+---
+
+### E.5 One rule core, many surfaces
+
+The suite uses different surface forms for the same logical ideas.
+
+Eyeling accepts and normalizes them into one internal rule model:
+
+- `{ P } => { C } .`
+- `{ H } <= { B } .`
+- top-level `log:implies`
+- top-level `log:impliedBy`
+
+That matters because conformance depends on recognizing equivalence across syntax, not just parsing one preferred style.
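
As a rough illustration of folding the four surface forms above into one shape, the sketch below maps each operator or predicate to a direction and then emits a single `{ body, head }` record. The statement shape `{ s, p, o }`, the `type: 'formula'` tag, and the names `normalizeRule` and `DIRECTION` are assumptions made for this example; they are not taken from Eyeling's source.

```javascript
// Illustrative term shapes only; these are not Eyeling's internal types.
const LOG = 'http://www.w3.org/2000/10/swap/log#';

// Surface operator or predicate IRI -> direction in which the statement reads.
const DIRECTION = new Map([
  ['=>', 'forward'],
  [LOG + 'implies', 'forward'],
  ['<=', 'backward'],
  [LOG + 'impliedBy', 'backward'],
]);

// Turn a parsed top-level statement { s, p, o } into one rule shape,
// or return null when the statement is an ordinary fact.
function normalizeRule(stmt) {
  const direction = DIRECTION.get(stmt.p);
  if (!direction) return null;
  if (stmt.s.type !== 'formula' || stmt.o.type !== 'formula') return null;
  const [body, head] = direction === 'forward' ? [stmt.s, stmt.o] : [stmt.o, stmt.s];
  return { body: body.triples, head: head.triples, direction };
}

const P = { type: 'formula', triples: [['?x', 'a', ':Human']] };
const C = { type: 'formula', triples: [['?x', 'a', ':Mortal']] };

const forward = normalizeRule({ s: P, p: '=>', o: C });
const backward = normalizeRule({ s: C, p: '<=', o: P });
console.log(forward.body === backward.body, forward.head === backward.head); // true true
```

Keeping `direction` on the record is a choice of this sketch, so that later stages could still tell how the rule was written; special subjects or objects such as `true` and `false` are left out.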
+
+---
+
+### E.6 Normalize first, reason second
+
+A large share of conformance work happens **before** execution.
+
+Eyeling normalizes the tricky parts early:
+
+- body blanks become variables
+- head blanks stay existential
+- RDF collection encodings become list terms
+- rule syntax variants become one rule representation
+
+This removes ambiguity before the engine starts proving anything.
+
+---
+
+### E.7 Body blanks vs. head blanks
+
+This is one of the decisive details.
+
+In Eyeling:
+
+- blanks in rule bodies act like placeholders
+- blanks in rule heads act like fresh existentials
+
+That split is essential.
+
+Without it:
+
+- rule matching goes wrong
+- proofs become unstable
+- existential output becomes noisy
+- conformance drops
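
A minimal sketch of the body/head split described above, assuming terms are plain strings where `_:x` is a blank node and `?x` a variable. The function name `liftBodyBlanks` and the `?_body` prefix are hypothetical, and the sketch assumes that prefix does not clash with variables already present in the rule.

```javascript
// Illustrative encoding: '_:x' is a blank node, '?x' is a variable.
const isBlank = (term) => typeof term === 'string' && term.startsWith('_:');

// Rewrite blanks in the rule body to variables; leave head blanks untouched
// so that they can later be treated as existentials.
function liftBodyBlanks(rule) {
  const renamed = new Map();
  const lift = (term) => {
    if (!isBlank(term)) return term;
    // The same body blank always maps to the same variable.
    if (!renamed.has(term)) renamed.set(term, '?_body' + renamed.size);
    return renamed.get(term);
  };
  return {
    body: rule.body.map((triple) => triple.map(lift)),
    head: rule.head.map((triple) => [...triple]),
  };
}

const rule = {
  body: [['_:p', ':knows', '?y'], ['_:p', ':age', '_:a']],
  head: [['?y', ':knownBy', '_:w']],
};
console.log(liftBodyBlanks(rule));
```

Here both occurrences of `_:p` become one shared variable, so the two body patterns must match the same node, while `_:w` in the head is still a blank.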
+
+---
+
+### E.8 Builtins must behave like relations
+
+Eyeling does not treat builtins as one-way helper functions.
+
+It treats them as **relations inside proof search**.
+
+That means a builtin can:
+
+- succeed
+- fail
+- bind variables
+- stay satisfiable without yet binding anything
+
+This is critical for the suite, because many builtin cases are really tests of search behavior, not just value computation.
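
The four outcomes listed above can be pictured with a toy relation. This sketch is an assumption-laden illustration, not Eyeling's builtin API and not the full semantics of any N3 builtin: `sumRelation` relates `a + b = total`, returns a list of substitutions (an empty list means failure), and returns `null` when too little is bound to decide anything yet.

```javascript
// Illustrative encoding: variables are strings starting with '?'.
const isVar = (term) => typeof term === 'string' && term.startsWith('?');

// Follow variable bindings until a value or an unbound variable is reached.
const walk = (term, subst) => (isVar(term) && term in subst ? walk(subst[term], subst) : term);

// Toy relation meaning a + b = total.
function sumRelation(a, b, total, subst) {
  a = walk(a, subst);
  b = walk(b, subst);
  total = walk(total, subst);
  if (isVar(a) || isVar(b)) return null; // satisfiable in principle, not informative yet
  const value = a + b;
  if (isVar(total)) return [{ ...subst, [total]: value }]; // bind the result
  return total === value ? [subst] : []; // succeed or fail
}

console.log(sumRelation(1, 2, '?z', {})); // one solution binding ?z to 3
console.log(sumRelation(1, 2, 4, {}));    // no solutions
console.log(sumRelation('?x', 2, 3, {})); // null: not decidable yet
```

Distinguishing `[]` from `null` is the point of the sketch: a caller can treat the first as a real failure and the second as a reason to try again later.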
+
+---
+
+### E.9 Delay builtins when needed
+
+Some builtins only become useful after neighboring goals bind enough variables.
+
+Eyeling handles that by deferring non-informative builtins inside conjunctions.
+
+So instead of failing too early, the engine:
+
+- rotates the builtin later
+- keeps proving the remaining goals
+- retries once more information exists
+
+This preserves logical behavior while staying operationally efficient.
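
The rotate-and-retry behavior above can be sketched with a small conjunction solver. Everything here is hypothetical scaffolding: goals are modeled as functions from a substitution to either a list of substitutions or `null` ("not informative yet"), and the choice to give up when every remaining goal has been deferred in a row is an assumption of this sketch, not a statement about Eyeling.

```javascript
// Each goal: subst => array of substitutions, or null for "not informative yet".
function* solveConjunction(goals, subst, deferrals = 0) {
  if (goals.length === 0) {
    yield subst;
    return;
  }
  const [goal, ...rest] = goals;
  const result = goal(subst);
  if (result === null) {
    // Every remaining goal was deferred in a row: nothing can make progress.
    if (deferrals + 1 >= goals.length) return;
    // Rotate the goal to the back and keep proving the others first.
    yield* solveConjunction([...rest, goal], subst, deferrals + 1);
    return;
  }
  for (const next of result) {
    yield* solveConjunction(rest, next, 0); // progress was made, reset the counter
  }
}

// A comparison that needs its variable bound, and a goal that binds it.
const greaterThan = (name, limit) => (s) => (name in s ? (s[name] > limit ? [s] : []) : null);
const oneOf = (name, values) => (s) => values.map((v) => ({ ...s, [name]: v }));

const answers = [...solveConjunction([greaterThan('?x', 2), oneOf('?x', [1, 3, 5])], {})];
console.log(answers); // [ { '?x': 3 }, { '?x': 5 } ]
```

The comparison is written first but evaluated second, which is the effect the section describes: the order in the rule body does not decide whether the conjunction succeeds.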
+
+---
+
+### E.10 Formulas are first-class terms
+
+Quoted formulas are not treated as strings.
+
+They are treated as structured logical objects.
+
+That gives Eyeling the machinery it needs for:
+
+- formula matching
+- nested reasoning
+- `log:includes`
+- `log:conclusion`
+- formula comparison by alpha-equivalence
+
+This is a major reason the higher-level N3 tests pass cleanly.
+
+---
+
+### E.11 Alpha-equivalence matters
+
+Two formulas that differ only in internal names must still count as the same formula when their structure matches.
+
+Eyeling therefore compares formulas by structure, not by accidental naming.
+
+That removes a common source of false mismatches in:
+
+- quoted formulas
+- nested graphs
+- rule introspection
+- scoped reasoning
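
One way to picture comparison by structure is a search for a consistent one-to-one renaming of local names. The sketch below assumes formulas are flat arrays of string triples without duplicates, treats `_:x` and `?x` as local names, and uses simple backtracking; the helper names are hypothetical and nested formulas are not handled.

```javascript
// Illustrative encoding: blanks ('_:x') and variables ('?x') are local names.
const isLocal = (term) => term.startsWith('_:') || term.startsWith('?');

// Try to extend the renaming so that `a` corresponds to `b`.
// Returns the (possibly extended) pair of maps, or null on conflict.
function pairTerms(a, b, fwd, bwd) {
  if (!isLocal(a) || !isLocal(b)) return a === b ? [fwd, bwd] : null;
  if (a[0] !== b[0]) return null; // a blank never pairs with a variable
  if (fwd.has(a) || bwd.has(b)) {
    return fwd.get(a) === b && bwd.get(b) === a ? [fwd, bwd] : null;
  }
  return [new Map(fwd).set(a, b), new Map(bwd).set(b, a)];
}

// True when `left` and `right` are equal up to a one-to-one renaming of local names.
function alphaEquivalent(left, right, fwd = new Map(), bwd = new Map()) {
  if (left.length !== right.length) return false;
  if (left.length === 0) return true;
  const [first, ...rest] = left;
  return right.some((candidate, i) => {
    let maps = [fwd, bwd];
    for (let k = 0; k < 3 && maps; k++) {
      maps = pairTerms(first[k], candidate[k], maps[0], maps[1]);
    }
    if (!maps) return false;
    const remaining = right.filter((_, j) => j !== i);
    return alphaEquivalent(rest, remaining, maps[0], maps[1]);
  });
}

const f1 = [['_:a', ':knows', '_:b'], ['_:b', ':name', '"Bob"']];
const f2 = [['_:y', ':name', '"Bob"'], ['_:x', ':knows', '_:y']];
const f3 = [['_:x', ':knows', '_:y'], ['_:x', ':name', '"Bob"']];
console.log(alphaEquivalent(f1, f2), alphaEquivalent(f1, f3)); // true false
```

`f2` only reorders and renames `f1`, so it matches; `f3` attaches the name to the other node, so no consistent renaming exists.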
+
+---
+
+### E.12 Lists must have one meaning
+
+The suite exercises list behavior in more than one spelling.
+
+Eyeling unifies them:
+
+- concrete N3 lists
+- RDF `first/rest` collection encodings
+
+By materializing anonymous RDF collections into list terms, Eyeling gives both forms one semantic path through the engine.
+
+That keeps list reasoning consistent across the whole suite.
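
To make the materialization step concrete, the sketch below walks an `rdf:first` / `rdf:rest` chain and returns its items as an array. Triples are assumed to be `[s, p, o]` string arrays and `collectList` is a hypothetical helper; nested collections and the rewriting of the surrounding triples are left out.

```javascript
const RDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';

// Follow rdf:first / rdf:rest links from `head` and return the items,
// or null if the chain is not a well-formed, terminated collection.
function collectList(head, triples) {
  const items = [];
  const seen = new Set();
  let node = head;
  while (node !== RDF + 'nil') {
    if (seen.has(node)) return null; // cyclic chain
    seen.add(node);
    const firsts = triples.filter(([s, p]) => s === node && p === RDF + 'first');
    const rests = triples.filter(([s, p]) => s === node && p === RDF + 'rest');
    if (firsts.length !== 1 || rests.length !== 1) return null; // malformed cell
    items.push(firsts[0][2]);
    node = rests[0][2];
  }
  return items;
}

const cells = [
  ['_:l1', RDF + 'first', ':a'],
  ['_:l1', RDF + 'rest', '_:l2'],
  ['_:l2', RDF + 'first', ':b'],
  ['_:l2', RDF + 'rest', RDF + 'nil'],
];
console.log(collectList('_:l1', cells)); // [ ':a', ':b' ]
```

Returning `null` for malformed chains lets a caller leave such triples as ordinary facts instead of guessing at a list.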
+
+---
+
+### E.13 Existentials must be stable
+
+A rule head with blanks must not generate endless fresh variants of the same logical result.
+
+Eyeling stabilizes this by skolemizing head blanks per firing instance.
+
+So one logical firing yields:
+
+- one stable witness
+- one stable derived shape
+- one meaningful duplicate check
+
+This is what lets closure reach a real fixpoint.
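
A small sketch of "one firing, one witness": the skolem name is looked up by a key built from the rule, the head blank, and the bindings of that firing. Keying on exactly these three things, and the names `skolemFor` and `instantiateHead`, are assumptions of this illustration; how Eyeling identifies a firing instance internally may differ.

```javascript
// Firing key -> skolem name, kept for the lifetime of one reasoning run.
const skolemTable = new Map();

function skolemFor(ruleId, blankLabel, bindings) {
  // Sort the bindings so the key does not depend on property order.
  const sorted = Object.keys(bindings).sort().map((name) => [name, bindings[name]]);
  const key = JSON.stringify([ruleId, blankLabel, sorted]);
  if (!skolemTable.has(key)) skolemTable.set(key, '_:sk_' + skolemTable.size);
  return skolemTable.get(key);
}

// Apply the bindings to the head and replace head blanks by their witnesses.
function instantiateHead(rule, bindings) {
  const resolve = (term) => {
    if (term.startsWith('?')) return bindings[term] ?? term;
    if (term.startsWith('_:')) return skolemFor(rule.id, term, bindings);
    return term;
  };
  return rule.head.map((triple) => triple.map(resolve));
}

const parentRule = { id: 'r1', head: [['?x', ':hasParent', '_:p']] };
const firstFiring = instantiateHead(parentRule, { '?x': ':alice' });
const sameFiring = instantiateHead(parentRule, { '?x': ':alice' });
const otherFiring = instantiateHead(parentRule, { '?x': ':bob' });
console.log(firstFiring[0][2] === sameFiring[0][2], firstFiring[0][2] === otherFiring[0][2]); // true false
```

Because the repeated firing yields an identical triple, an ordinary duplicate check is enough to stop it from being added again.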
+
+---
+
+### E.14 Duplicate suppression is semantic, not cosmetic
+
+The engine does not merely try to avoid repeated printing.
+
+It tries to avoid repeated derivation of the same fact.
+
+That requires:
+
+- stable term ids
+- indexed fact storage
+- reliable duplicate keys
+- stable existential handling
+
+Without that, a reasoner can look busy forever and still fail conformance.
+
+---
+
+### E.15 Closure must really close
+
+Full conformance depends on real saturation behavior.
+
+Eyeling therefore treats closure as:
+
+- repeated rule firing
+- repeated proof over indexed facts
+- duplicate-aware insertion
+- termination at fixpoint
+
+This is what turns the engine from a parser plus demos into a conformance-grade reasoner.
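
The loop described above can be sketched as naive forward chaining with a duplicate key per fact. This is a deliberately simple stand-in, not Eyeling's algorithm: facts are unindexed string triples, rules are assumed to use only variables in the head that also occur in the body, and the key is just the JSON text of the triple.

```javascript
const isVar = (term) => term.startsWith('?');

// Match one pattern against one fact, extending the bindings or returning null.
function matchTriple(pattern, fact, bindings) {
  const out = { ...bindings };
  for (let i = 0; i < 3; i++) {
    const p = pattern[i];
    if (isVar(p)) {
      if (p in out && out[p] !== fact[i]) return null;
      out[p] = fact[i];
    } else if (p !== fact[i]) {
      return null;
    }
  }
  return out;
}

// All bindings that satisfy every pattern of `body` against `facts`.
function matchBody(body, facts, bindings = {}) {
  if (body.length === 0) return [bindings];
  const [first, ...rest] = body;
  return facts.flatMap((fact) => {
    const next = matchTriple(first, fact, bindings);
    return next ? matchBody(rest, facts, next) : [];
  });
}

// Fire rules until a full pass adds nothing new.
function closure(initialFacts, rules) {
  const facts = [];
  const known = new Set();
  const add = (triple) => {
    const key = JSON.stringify(triple); // duplicate key
    if (known.has(key)) return false;
    known.add(key);
    facts.push(triple);
    return true;
  };
  initialFacts.forEach(add);
  let changed = true;
  while (changed) {
    changed = false;
    for (const rule of rules) {
      for (const bindings of matchBody(rule.body, facts)) {
        for (const pattern of rule.head) {
          const derived = pattern.map((t) => (isVar(t) ? bindings[t] : t));
          if (add(derived)) changed = true;
        }
      }
    }
  }
  return facts;
}

const transitive = {
  body: [['?x', ':sub', '?y'], ['?y', ':sub', '?z']],
  head: [['?x', ':sub', '?z']],
};
const chain = [[':a', ':sub', ':b'], [':b', ':sub', ':c'], [':c', ':sub', ':d']];
console.log(closure(chain, [transitive]).length); // 6
```

The loop ends because `add` rejects anything already known, so a pass that derives only old facts leaves `changed` false.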
+
+---
+
+### E.16 Performance choices support correctness
+
+Several implementation choices are operational, but they directly protect conformance:
+
+- predicate-based indexing
+- subject/object refinement
+- smallest-bucket candidate selection
+- fast duplicate keys
+- skipping already-known ground heads
+
+These choices reduce accidental nontermination and prevent operational noise from becoming semantic failure.
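
The first three items above can be pictured with a small index. The class below is a hypothetical sketch, assuming string triples and a ground predicate in every lookup pattern; it keeps one bucket per predicate plus refined buckets per predicate and subject and per predicate and object, and hands back whichever applicable bucket is smallest.

```javascript
const isVar = (term) => term.startsWith('?');

// Append `fact` to the bucket stored under `key`, creating the bucket on demand.
function pushInto(map, key, fact) {
  if (!map.has(key)) map.set(key, []);
  map.get(key).push(fact);
}

class FactIndex {
  constructor() {
    this.byP = new Map();  // predicate -> facts
    this.byPS = new Map(); // predicate + subject -> facts
    this.byPO = new Map(); // predicate + object -> facts
  }

  add([s, p, o]) {
    const fact = [s, p, o];
    pushInto(this.byP, p, fact);
    pushInto(this.byPS, p + '\t' + s, fact);
    pushInto(this.byPO, p + '\t' + o, fact);
  }

  // Candidates for a pattern with a ground predicate. The result is a superset
  // of the real matches, so each candidate still has to be unified.
  candidates([s, p, o]) {
    const buckets = [this.byP.get(p) ?? []];
    if (!isVar(s)) buckets.push(this.byPS.get(p + '\t' + s) ?? []);
    if (!isVar(o)) buckets.push(this.byPO.get(p + '\t' + o) ?? []);
    return buckets.reduce((best, bucket) => (bucket.length < best.length ? bucket : best));
  }
}

const index = new FactIndex();
[
  [':tom', ':type', ':Cat'],
  [':rex', ':type', ':Dog'],
  [':felix', ':type', ':Cat'],
  [':tom', ':likes', ':rex'],
].forEach((fact) => index.add(fact));

console.log(index.candidates(['?x', ':type', ':Cat']).length); // 2
console.log(index.candidates([':tom', ':type', ':Cat']).length); // 1
```

Scanning fewer candidates does not change which facts match; it only limits how much work each proof step does.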
+
+---
+
+### E.17 The suite stays external
+
+This is a key discipline.
+
+Eyeling does not define success by a private in-repo imitation of [notation3tests](https://codeberg.org/phochste/notation3tests/).
+
+It runs against the external suite.
+
+That means:
+
+- the benchmark is shared
+- the contract is public
+- the result is independently meaningful
+
+A green run says something real.
+
+---
+
+### E.18 Every failure becomes an invariant
+
+Eyeling reaches 100% because failures are not patched superficially.
+
+Each failure is turned into an engine rule.
+
+Examples:
+
+- parser failure → broader syntax support
+- list failure → one unified list model
+- formula failure → alpha-equivalence discipline
+- builtin failure → relational evaluation
+- closure failure → stable existential handling
+
+That is how the suite shapes the engine.
+
+---
+
+### E.19 Why 100% happens
+
+Eyeling gets to 100% because all the key layers line up:
+
+- the parser accepts the full rule surface
+- normalization removes semantic ambiguity
+- formulas are real terms
+- builtins participate in proof search
+- existential output is stable
+- closure reaches a true fixpoint
+- the public suite remains the judge
+
+Once those pieces are in place, 100% is the visible result of a coherent design.
+
+---
+
+### E.20 Final takeaway
+
+Eyeling reaches full [notation3tests](https://codeberg.org/phochste/notation3tests/) conformance by making “pass the suite” and “implement N3 correctly enough to interoperate” the same task.
+
+That is the method:
+
+- external suite
+- one semantic core
+- early normalization
+- relational builtins
+- formula-aware reasoning
+- stable existential output
+- duplicate-safe fixpoint closure
+
+That is why the result is 100%.
package/package.json
CHANGED
package/test/check.test.js
CHANGED
@@ -8,19 +8,19 @@ const cp = require('node:child_process');
 
 const C = process.stdout.isTTY
 ? {
-
-
-
-
-
-
+red: '\x1b[31m',
+green: '\x1b[32m',
+yellow: '\x1b[33m',
+dim: '\x1b[2m',
+reset: '\x1b[0m',
+}
 : {
-
-
-
-
-
-
+red: '',
+green: '',
+yellow: '',
+dim: '',
+reset: '',
+};
 
 function ok(msg) {
 console.log(`${C.green}OK${C.reset} ${msg}`);