@jhlagado/azm 0.2.8 → 0.2.9

Files changed (29)
  1. package/README.md +68 -6
  2. package/dist/src/api-compile.js +27 -0
  3. package/dist/src/assembly/assemble-program.js +5 -0
  4. package/dist/src/assembly/import-visibility.d.ts +3 -0
  5. package/dist/src/assembly/import-visibility.js +204 -0
  6. package/dist/src/node/source-host.js +40 -13
  7. package/dist/src/outputs/write-asm80.js +4 -0
  8. package/dist/src/register-contracts/programModel-routines.js +33 -17
  9. package/dist/src/source/logical-lines.d.ts +3 -0
  10. package/dist/src/source/source-span.d.ts +2 -0
  11. package/dist/src/syntax/parse-directive-statement.d.ts +1 -6
  12. package/dist/src/syntax/parse-directive-statement.js +3 -1
  13. package/dist/src/syntax/parse-layout-declarations.js +11 -2
  14. package/dist/src/syntax/parse-line.js +18 -2
  15. package/dist/src/tooling/api.js +1 -1
  16. package/docs/codebase/01-orientation-and-repository-layout.md +192 -0
  17. package/docs/codebase/02-source-loading-and-parsing.md +263 -0
  18. package/docs/codebase/03-assembly-and-z80-emission.md +251 -0
  19. package/docs/codebase/04-ops-and-register-contracts.md +237 -0
  20. package/docs/codebase/05-interfaces-and-output-artifacts.md +253 -0
  21. package/docs/codebase/06-verification-and-maintenance.md +202 -0
  22. package/docs/codebase/appendices/a-directory-file-reference.md +253 -0
  23. package/docs/codebase/appendices/b-compile-flow-reference.md +103 -0
  24. package/docs/codebase/appendices/c-public-surface-reference.md +106 -0
  25. package/docs/codebase/appendices/index.md +16 -0
  26. package/docs/codebase/index.md +46 -0
  27. package/package.json +2 -3
  28. package/docs/reference/cli.md +0 -158
  29. package/docs/reference/tooling-api.md +0 -320
@@ -0,0 +1,251 @@
1
+ ---
2
+ layout: default
3
+ title: 'Chapter 3 - Assembly and Z80 Emission'
4
+ parent: 'AZM Engineering Manual'
5
+ nav_order: 3
6
+ ---
7
+
8
+ [<- Source Loading and Parsing](02-source-loading-and-parsing.md) | [Ops and Register Contracts ->](04-ops-and-register-contracts.md)
9
+
10
+ # Chapter 3 - Assembly and Z80 Emission
11
+
12
+ Assembly turns source items into facts, bytes, fixups and source segments. This
13
+ chapter combines the assembler-time fact model with the Z80 emission path
14
+ because those two stages are tightly coupled: planning decides addresses and
15
+ sizes, then emission writes bytes using those facts.
16
+
17
+ The central files are:
18
+
19
+ - `src/assembly/address-planning.ts`
20
+ - `src/assembly/address-symbols.ts`
21
+ - `src/assembly/placement.ts`
22
+ - `src/assembly/program-emission.ts`
23
+ - `src/assembly/fixup-emission.ts`
24
+ - `src/assembly/assemble-program.ts`
25
+ - `src/semantics/expression-evaluation.ts`
26
+ - `src/semantics/constant-operators.ts`
27
+ - `src/semantics/layout-evaluation.ts`
28
+ - `src/z80/parse-instruction.ts`
29
+ - `src/z80/instruction.ts`
30
+ - `src/z80/encode.ts`
31
+ - `src/z80/effects.ts`
32
+
33
+ ## Assembly Orchestration
34
+
35
+ `src/assembly/assemble-program.ts` is the entry point:
36
+
37
+ ```ts
38
+ export function assembleProgram(items: readonly SourceItem[]): AssembleProgramResult;
39
+ ```
40
+
41
+ It builds the address state, emits the program image and returns diagnostics,
42
+ symbols, byte maps and source segments. The function is intentionally small.
43
+ Detailed decisions live in the modules below it.
44
+
45
+ ## Address Planning
46
+
47
+ `src/assembly/address-planning.ts` walks source items and builds the facts
48
+ needed to place and evaluate the program:
49
+
50
+ - labels and their addresses
51
+ - `.equ` constants and enum members
52
+ - record and union layouts
53
+ - type aliases
54
+ - `.org` placement state
55
+ - sizes for data, storage and instructions
56
+
57
+ The returned `AddressState` contains symbol facts, equate records, layout
58
+ records, enum names, diagnostics and placement information. This is where parsed
59
+ syntax becomes assembler knowledge. A label item becomes an address. A type
60
+ declaration becomes a layout record. A `.ds` directive becomes a byte count.
61
+
62
+ ## Placement
63
+
64
+ `src/assembly/placement.ts` owns placement state. `.org` sets the active
65
+ address. Instructions advance by encoded size. `.db` advances by emitted byte
66
+ count. `.dw` advances by two bytes per expression. `.ds` advances by calculated
67
+ storage size. `.align` advances to the next aligned address.
68
+
69
+ Placement is separate from byte emission so the assembler can resolve symbols
70
+ before every byte has been written. A branch to a later label can be encoded
71
+ after address planning has seen the label definition.
72
+
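The placement rules above can be sketched as a single pass over simplified items. This is only an illustration: the `PlacementItem` shape and `planLabelAddresses` are hypothetical names, not the types in `placement.ts`, and the sketch assumes `.align` leaves an already aligned address unchanged.

```typescript
// Hypothetical item shapes for illustration; AZM's real SourceItem types differ.
type PlacementItem =
  | { kind: "org"; address: number }
  | { kind: "label"; name: string }
  | { kind: "instruction"; encodedSize: number }
  | { kind: "db"; byteCount: number }
  | { kind: "dw"; expressionCount: number }
  | { kind: "ds"; storageSize: number }
  | { kind: "align"; boundary: number };

// Walk the items once, advancing the active address and recording labels.
function planLabelAddresses(items: readonly PlacementItem[]): Map<string, number> {
  const labels = new Map<string, number>();
  let address = 0;
  for (const item of items) {
    switch (item.kind) {
      case "org":
        address = item.address; // .org sets the active address
        break;
      case "label":
        labels.set(item.name, address); // a label becomes an address
        break;
      case "instruction":
        address += item.encodedSize;
        break;
      case "db":
        address += item.byteCount;
        break;
      case "dw":
        address += 2 * item.expressionCount; // two bytes per expression
        break;
      case "ds":
        address += item.storageSize;
        break;
      case "align":
        // Assumption: an already aligned address stays where it is.
        address = Math.ceil(address / item.boundary) * item.boundary;
        break;
    }
  }
  return labels;
}

const plannedLabels = planLabelAddresses([
  { kind: "org", address: 0x0100 },
  { kind: "label", name: "Start" },
  { kind: "instruction", encodedSize: 2 }, // ld a,42
  { kind: "instruction", encodedSize: 3 }, // jp Later (forward reference)
  { kind: "db", byteCount: 1 },
  { kind: "align", boundary: 16 },
  { kind: "label", name: "Later" },
]);
```

Because every label address is known once the walk finishes, the forward reference to `Later` can be resolved when bytes are written afterwards.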
73
+ ## Symbols and Layouts
74
+
75
+ Labels define addresses. `.equ` declarations define assembler-time values.
76
+ Enums define qualified constants. `src/assembly/address-symbols.ts` contains
77
+ the symbol helpers that define labels, equates and enum members, enforce
78
+ duplicate rules and record spans for diagnostics and output metadata.
79
+
80
+ Record and union declarations become layout records. A record field advances the
81
+ offset by its byte size. A union field starts at offset zero and the union size
82
+ is the largest field size.
83
+
84
+ For a record:
85
+
86
+ ```asm
87
+ Sprite .type
88
+ x .field byte
89
+ y .field byte
90
+ tile .field byte
91
+ flags .field byte
92
+ .endtype
93
+ ```
94
+
95
+ the layout record stores `x = 0`, `y = 1`, `tile = 2`, `flags = 3` and total
96
+ size `4`.
97
+
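The record and union rules can be illustrated with a small sketch. The `LayoutField` and `Layout` shapes and both function names are assumptions made for this example, not the layout records AZM actually stores.

```typescript
// Hypothetical shapes for illustration only.
interface LayoutField {
  readonly name: string;
  readonly size: number; // byte size of the field type
}

interface Layout {
  readonly offsets: ReadonlyMap<string, number>;
  readonly size: number;
}

// Record: each field starts where the previous one ended.
function recordLayout(fields: readonly LayoutField[]): Layout {
  const offsets = new Map<string, number>();
  let offset = 0;
  for (const field of fields) {
    offsets.set(field.name, offset);
    offset += field.size;
  }
  return { offsets, size: offset };
}

// Union: every field starts at zero; the size is the largest field size.
function unionLayout(fields: readonly LayoutField[]): Layout {
  const offsets = new Map<string, number>();
  let size = 0;
  for (const field of fields) {
    offsets.set(field.name, 0);
    size = Math.max(size, field.size);
  }
  return { offsets, size };
}

const spriteLayout = recordLayout([
  { name: "x", size: 1 },
  { name: "y", size: 1 },
  { name: "tile", size: 1 },
  { name: "flags", size: 1 },
]);

// A made-up union with a one-byte and a two-byte member.
const exampleUnion = unionLayout([
  { name: "asByte", size: 1 },
  { name: "asWord", size: 2 },
]);
```

For the `Sprite` record this yields the offsets and total size listed above; the union example is invented purely to show the largest-field rule.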
98
+ Type aliases bind a name to another type expression:
99
+
100
+ ```asm
101
+ SpriteArray .typealias Sprite[16]
102
+ ```
103
+
104
+ `sizeof(SpriteArray)` resolves through the alias to the size of `Sprite[16]`,
105
+ with the same field paths as the underlying array expression.
106
+
107
+ ## Expression Evaluation
108
+
109
+ `src/semantics/expression-evaluation.ts` evaluates expression trees against the
110
+ assembler-time environment. It coordinates literal arithmetic, symbol lookup,
111
+ constant operators, byte functions, layout functions and layout casts that fold
112
+ to constant addresses.
113
+
114
+ The operator and layout rules live in focused modules:
115
+
116
+ | File | Role |
117
+ | ----------------------- | ------------------------------------------------------------------- |
118
+ | `constant-operators.ts` | Binary and unary constant-operator dispatch. |
119
+ | `binary-operators.ts` | Binary arithmetic, bitwise, comparison and logical operators. |
120
+ | `unary-operators.ts` | Unary numeric operators. |
121
+ | `byte-functions.ts` | `LSB(...)`, `MSB(...)` and related byte extraction helpers. |
122
+ | `layout-evaluation.ts` | `sizeof(...)`, `offset(...)` and layout type expression evaluation. |
123
+ | `layout-path.ts` | Field and array path resolution for layout casts. |
124
+ | `layout-format.ts` | Human-readable layout names for diagnostics. |
125
+ | `diagnostics.ts` | Shared semantic diagnostic construction. |
126
+
127
+ Expression evaluation is context-sensitive. A symbol in an instruction operand
128
+ may be a label or constant. A type expression in `.ds Sprite[4]` resolves to a
129
+ byte count. A layout cast such as `<SpriteArray>Sprites[3].flags` resolves to an
130
+ address when the base address, type alias, index and field path are all known at
131
+ assembly time.
132
+
133
+ ## Data and Storage Size
134
+
135
+ Address planning needs the byte length of each directive. It contains size
136
+ helpers for `.db`, `.dw`, `.ds`, strings and alignment. The same string
137
+ directive byte rules are reused during emission so planned sizes and emitted
138
+ bytes stay in agreement.
139
+
140
+ Initialized data writes bytes through `.db`, `.dw`, `.cstr`, `.pstr` and
141
+ `.istr`. `.ds` reserves space calculated from numbers or layout type
142
+ expressions. In storage and field-size positions, a layout type expression
143
+ evaluates to its byte size.
144
+
145
+ ## Program Emission
146
+
147
+ `src/assembly/program-emission.ts` owns byte writing. It walks source items in
148
+ order and writes emitted bytes to a map keyed by absolute address. It also
149
+ records source segments for the D8 map writer.
150
+
151
+ The emitted program shape is:
152
+
153
+ ```ts
154
+ export interface EmittedProgram {
155
+ readonly image: ReadonlyMap<number, number>;
156
+ readonly sourceSegments: readonly EmittedSourceSegment[];
157
+ readonly initializedAddresses: readonly number[];
158
+ }
159
+ ```
160
+
161
+ The byte map stores final addresses. If source emits one byte at `$0100` and
162
+ another at `$8000`, the map has two entries. A sparse map fits Z80 programs that
163
+ place code, data and ROM vectors at separate origins.
164
+
165
+ ## Emission Walkthrough
166
+
167
+ For this source:
168
+
169
+ ```asm
170
+ .org $0100
171
+ Start:
172
+ ld a,42
173
+ jp Start
174
+ ```
175
+
176
+ address planning records `Start = $0100`. The instruction parser represents
177
+ `ld a,42` as an immediate load and `jp Start` as an absolute branch. The encoder
178
+ returns literal opcode bytes plus a 16-bit expression fragment for `Start`.
179
+ Fixup emission evaluates `Start` to `$0100` and writes the little-endian operand
180
+ bytes.
181
+
182
+ The final byte map contains:
183
+
184
+ ```text
185
+ $0100: 3E
186
+ $0101: 2A
187
+ $0102: C3
188
+ $0103: 00
189
+ $0104: 01
190
+ ```
191
+
192
+ The D8 source segment records that those bytes came from the source lines that
193
+ emitted them.
194
+
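The walkthrough can be reproduced with a minimal sketch of fragment emission into a sparse map. The `Fragment` type and `emitImage` are hypothetical simplifications: the real encoder attaches expressions rather than bare symbol names, and it has more fragment kinds.

```typescript
// Hypothetical fragment model: literal bytes or a 16-bit symbol reference.
type Fragment =
  | { kind: "literal"; bytes: readonly number[] }
  | { kind: "imm16"; symbol: string };

// Write each instruction's fragments into a map keyed by absolute address.
function emitImage(
  origin: number,
  instructions: readonly (readonly Fragment[])[],
  symbols: ReadonlyMap<string, number>,
): Map<number, number> {
  const image = new Map<number, number>();
  let address = origin;
  for (const fragments of instructions) {
    for (const fragment of fragments) {
      if (fragment.kind === "literal") {
        for (const byte of fragment.bytes) image.set(address++, byte);
      } else {
        const value = symbols.get(fragment.symbol);
        if (value === undefined) throw new Error(`undefined symbol ${fragment.symbol}`);
        image.set(address++, value & 0xff); // low byte first (little-endian)
        image.set(address++, (value >> 8) & 0xff); // then high byte
      }
    }
  }
  return image;
}

const walkthroughImage = emitImage(
  0x0100,
  [
    [{ kind: "literal", bytes: [0x3e, 0x2a] }], // ld a,42
    [{ kind: "literal", bytes: [0xc3] }, { kind: "imm16", symbol: "Start" }], // jp Start
  ],
  new Map([["Start", 0x0100]]),
);
```

The resulting map holds the same five address and byte pairs as the listing above.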
195
+ ## Z80 Instruction Model
196
+
197
+ `src/z80/instruction.ts` defines the instruction and operand types.
198
+ `src/z80/parse-instruction.ts` dispatches instruction text into parser families.
199
+ `src/z80/encode.ts` dispatches typed instructions into encoder families.
200
+ `src/z80/effects.ts` describes register and flag effects for register contracts.
201
+
202
+ The parser and encoder work as a pair:
203
+
204
+ | File | Question |
205
+ | ---------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
206
+ | `parse-instruction.ts` | Which parser family should handle this instruction head? |
207
+ | `parse-basic.ts`, `parse-branch.ts`, `parse-exchange.ts`, `parse-io-control.ts`, `parse-ld.ts` | Parse specific instruction families. |
208
+ | `parse-operands.ts`, `parse-conditions.ts`, `operand-split.ts` | Split operands and classify conditions, registers, indexed operands and expression operands. |
209
+ | `instruction.ts` | How is that form represented as TypeScript data? |
210
+ | `encode.ts`, `encode-core.ts`, `encode-ld.ts`, `encode-ld-helpers.ts` | Which bytes and fixup fragments represent that form? |
211
+ | `effects.ts` | Which registers and flags does that form read or write? |
212
+
213
+ The instruction model keeps overloaded Z80 mnemonics manageable. `ld` has many
214
+ forms, but the parser classifies operands before encoding. The encoder then
215
+ selects a specific opcode family from typed operands.
216
+
217
+ ## Fixups and Relative Branches
218
+
219
+ The Z80 encoder returns fragments: literal bytes, 8-bit immediates, 16-bit
220
+ immediates and relative branch targets. `src/assembly/fixup-emission.ts`
221
+ evaluates the expression attached to each fragment and writes the final byte or
222
+ word.
223
+
224
+ Relative branches such as `jr` and `djnz` emit a signed 8-bit displacement
225
+ measured from the address of the next instruction. The parser recognises
226
+ `jr nz,Loop`. The encoder knows the opcode and that the operand is relative.
227
+ Fixup emission knows the current address and the final target address.
228
+
229
+ The same principle applies to absolute branches and immediate operands. The
230
+ encoder chooses the fragment width. Fixup emission evaluates the expression and
231
+ checks that the value fits that width.
232
+
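As a rough sketch of the relative-branch fixup, the displacement byte can be computed as shown here. The function name is hypothetical and is not taken from `fixup-emission.ts`; the arithmetic follows the standard Z80 rule that `jr` and `djnz` are two-byte instructions whose displacement is relative to the following instruction.

```typescript
// Hypothetical helper: compute the displacement byte for jr/djnz.
function relativeDisplacementByte(instructionAddress: number, targetAddress: number): number {
  const nextInstruction = instructionAddress + 2; // opcode byte + displacement byte
  const displacement = targetAddress - nextInstruction;
  if (displacement < -128 || displacement > 127) {
    throw new RangeError(`relative branch out of range: ${displacement}`);
  }
  return displacement & 0xff; // two's-complement encoding in one byte
}

// A djnz at $0105 that loops back to $0100 is seven bytes behind the next
// instruction at $0107, which encodes as $F9.
const backwardBranch = relativeDisplacementByte(0x0105, 0x0100);
```

The range check corresponds to the width check described above: a target that does not fit in a signed byte has to be reported instead of being silently truncated.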
233
+ ## Source Segments
234
+
235
+ Every emitted byte range can carry source provenance. `program-emission.ts`
236
+ adds `EmittedSourceSegment` records with start address, end address, source
237
+ file, line and segment kind. `writeD8m()` later groups these segments by file
238
+ for Debug80.
239
+
240
+ Source segments classify emitted ranges as code, data, directive output, label
241
+ context or unknown output. Debug80 uses this metadata to connect source lines to
242
+ addresses. A byte-perfect assembler can still give a poor debugging experience
243
+ when source segments are too broad, absent or attached to a different file.
244
+
245
+ ## Changing Assembly or Encoding
246
+
247
+ Instruction changes usually touch `parse-instruction.ts`, `instruction.ts`,
248
+ `encode.ts` and `effects.ts`. Assembly changes usually touch
249
+ `address-planning.ts`, `program-emission.ts`, `fixup-emission.ts` or
250
+ `placement.ts`, depending on whether the change affects facts, emitted bytes,
251
+ symbolic references or address movement.
@@ -0,0 +1,237 @@
1
+ ---
2
+ layout: default
3
+ title: 'Chapter 4 - Ops and Register Contracts'
4
+ parent: 'AZM Engineering Manual'
5
+ nav_order: 4
6
+ ---
7
+
8
+ [<- Assembly and Z80 Emission](03-assembly-and-z80-emission.md) | [Interfaces and Output Artifacts ->](05-interfaces-and-output-artifacts.md)
9
+
10
+ # Chapter 4 - Ops and Register Contracts
11
+
12
+ Ops and register contracts are the two AZM-specific subsystems that sit above plain
13
+ Z80 instruction assembly. Ops expand source into visible inline assembly.
14
+ Register contract analysis checks the resulting routines and calls.
15
+
16
+ These features belong together in the codebase tour because they meet at the
17
+ same boundary: parsed source items. Ops produce source items. Register contract analysis
18
+ reads source items.
19
+
20
+ ## Ops as Visible Expansion
21
+
22
+ Ops are named inline instruction idioms. They let source define a small
23
+ operation once and expand it visibly at each use site. The implementation lives
24
+ in `src/expansion/`. `op-expansion.ts` coordinates the subsystem. Operand
25
+ splitting, overload selection, selected expansion, instruction instantiation
26
+ and local-label rewriting live in focused helper modules.
27
+
28
+ An op is closer to a typed inline template than to a text macro. The op parser
29
+ understands operands, chooses an overload and parses the expanded body back
30
+ through the normal AZM parser. The result is visible assembly with the same
31
+ diagnostic and register contract behaviour as handwritten source.
32
+
33
+ For example:
34
+
35
+ ```asm
36
+ op clear(reg8 r)
37
+ xor r
38
+ end
39
+
40
+ clear a
41
+ ```
42
+
43
+ The expansion stage matches `a` as a register operand, substitutes it into the
44
+ template and emits the source item for `xor a`. Address planning and emission
45
+ then treat that instruction exactly like a line written directly in the source.
46
+
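The match-then-substitute step can be sketched in a heavily simplified form. Everything here is an assumption for illustration: `OpDefinition`, `expandOp` and the single `reg8` matcher are invented, and the real stage also handles overloads, other operand kinds and local labels, then reparses the result through the normal parser.

```typescript
// Hypothetical 8-bit register matcher set.
const REG8 = new Set(["a", "b", "c", "d", "e", "h", "l"]);

interface OpParam {
  readonly name: string;
  readonly matcher: "reg8";
}

interface OpDefinition {
  readonly name: string;
  readonly params: readonly OpParam[];
  readonly body: readonly string[];
}

// Check each operand against its matcher, then substitute it into the body.
function expandOp(op: OpDefinition, operands: readonly string[]): string[] {
  if (operands.length !== op.params.length) {
    throw new Error(`${op.name}: expected ${op.params.length} operand(s)`);
  }
  const bindings = new Map<string, string>();
  op.params.forEach((param, index) => {
    const operand = (operands[index] ?? "").toLowerCase();
    if (param.matcher === "reg8" && !REG8.has(operand)) {
      throw new Error(`${op.name}: ${operand} is not an 8-bit register`);
    }
    bindings.set(param.name, operand);
  });
  // Replace whole identifiers only, so the "r" inside "xor" is left alone.
  return op.body.map((line) =>
    line.replace(/[A-Za-z_][A-Za-z0-9_]*/g, (word) => bindings.get(word) ?? word),
  );
}

const clearOp: OpDefinition = {
  name: "clear",
  params: [{ name: "r", matcher: "reg8" }],
  body: ["xor r"],
};

const expandedLines = expandOp(clearOp, ["a"]);
```

The operand check before substitution is what separates this from a plain text macro: `clear hl` is rejected at the call site rather than producing an invalid `xor hl`.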
47
+ ## Op Collection and Invocation
48
+
49
+ `collectOps()` scans logical lines before normal parsing. It finds top-level
50
+ `op` blocks, parses their parameter lists, records the body template and marks
51
+ the source lines that belong to the definition body.
52
+
53
+ An op declaration has:
54
+
55
+ - a name
56
+ - a parameter list
57
+ - matcher information for overload selection
58
+ - a body template
59
+ - source location metadata for diagnostics
60
+
61
+ The registry is complete before invocation parsing starts. `parseOpInvocation()`
62
+ checks whether a source line could be an op call. If the name matches a
63
+ collected op, `expandOpInvocation()` selects an overload and instantiates the
64
+ body.
65
+
66
+ The parser handles op invocations before `parseLogicalLine()`. An op head can
67
+ look like an instruction head at the source level. The expansion stage resolves
68
+ it before ordinary line parsing.
69
+
70
+ ## Overloads and Templates
71
+
72
+ Ops support overloads. `op-selection.ts` compares invocation operands against
73
+ each candidate signature. It prefers the most specific matching overload and
74
+ emits diagnostics for arity errors, unsupported operands, ambiguous matches and
75
+ invalid expansions.
76
+
77
+ The matcher vocabulary recognises fixed tokens, registers, register pairs,
78
+ immediates, conditions, ports and indexed operands. It stays close to the Z80
79
+ operand model, so op dispatch and instruction parsing describe operands in the
80
+ same terms.
81
+
82
+ An op body template is parsed into template items. During expansion, operands
83
+ from the call site are substituted into the template by
84
+ `op-instruction-instantiation.ts`. The result is formatted as ordinary source
85
+ text and parsed through the same line parser used for top-level source.
86
+
87
+ Local label rewriting lives in `op-local-labels.ts`. A local label in an op
88
+ expansion becomes unique at the use site so each expansion receives its own
89
+ generated label. Once the rewritten labels become source items, address planning
90
+ defines and resolves them through the ordinary symbol path.
91
+
92
+ ## Op Diagnostics and Register Contracts
93
+
94
+ Op diagnostics point at the call site while explaining the definition that
95
+ matched or failed. Invalid expanded instructions are reported as op expansion
96
+ failures with the underlying Z80 parser diagnostic included.
97
+
98
+ Ops expand before register contract analysis builds routines. AZM sees the
99
+ expanded instructions. An op is visible inline assembly, so its register effects
100
+ belong to the caller.
101
+
102
+ ## Register Contract Analysis
103
+
104
+ Register contract analysis checks how routines use Z80 registers. It reads routine
105
+ boundaries, instruction effects and AZMDoc contract comments, then reports
106
+ conflicts where a caller still needs a register value that a callee may change.
107
+
108
+ The implementation lives in `src/register-contracts/`. The public analysis entry
109
+ point is `analyzeRegisterContracts()` in `src/register-contracts/analyze.ts`.
110
+
111
+ Register contract analysis is a data-flow analysis over assembled source
112
+ structure. It works with routines, calls, instruction effects and contracts,
113
+ looking for values that are live across a call and for callee summaries that
114
+ can change those values.
115
+
116
+ ## A Register Contract Conflict
117
+
118
+ This source shape captures the problem:
119
+
120
+ ```asm
121
+ @Caller:
122
+ ld b,8
123
+ Loop:
124
+ call Worker
125
+ djnz Loop
126
+ ret
127
+
128
+ ;! clobbers B
129
+ @Worker:
130
+ ld b,0
131
+ ret
132
+ ```
133
+
134
+ `Caller` uses `B` as the `djnz` counter. `Worker` declares that it clobbers
135
+ `B`. Liveness sees that `B` is still needed after the call because `djnz Loop`
136
+ reads it. The register contract conflict is at the call site: `Caller` crosses
136
+ a routine boundary that may change a live unit.
138
+
139
+ The programmer can preserve `B`, choose a different counter register, change
140
+ `Worker` so it leaves `B` unchanged or update the calling sequence. Register
141
+ contract analysis identifies the conflict and the source location where the
142
+ caller crosses the boundary.
143
+
144
+ ## Routine Model and Contracts
145
+
146
+ `src/register-contracts/programModel.ts` builds the program model from parsed source
147
+ items. Routine-specific extraction is split into
148
+ `programModel-boundaries.ts` and `programModel-routines.ts`. Together they find
149
+ routine boundaries, direct calls, labels and instructions. Routine entry labels
150
+ use `@` in source and become callable public routine names after the marker is
151
+ removed.
152
+
153
+ `src/register-contracts/smartComments.ts` reads AZMDoc comments from the comment maps
154
+ captured during loading. Comment-block splitting and token parsing live in
155
+ `smartCommentBlocks.ts` and `smartCommentParsing.ts`. External `.asmi`
156
+ contracts are parsed in `interfaceContracts.ts`.
157
+
158
+ Contracts can describe:
159
+
160
+ - inputs
161
+ - outputs
162
+ - clobbered registers
163
+ - preserved registers
164
+ - expected outputs at call sites
165
+
166
+ Source comments and external interfaces describe the same kind of fact: a
167
+ routine contract. Source comments attach to routines in the current program.
168
+ `.asmi` entries attach to routines whose source is assembled elsewhere.
169
+
170
+ ## Effects, Summaries and Liveness
171
+
172
+ Register contract analysis depends on `src/z80/effects.ts`. Effects describe which registers
173
+ and flags an instruction reads, writes or preserves. `instruction-head.ts`,
174
+ `instruction-operands.ts`, `instruction-predicates.ts` and
175
+ `operand-register-name.ts` translate between Z80 instruction shapes and
176
+ register contract units such as `A`, `HL`, `carry` and register pairs.
177
+
178
+ `src/register-contracts/summary.ts` infers a summary for a single routine. Boundary,
179
+ contract, result, state and token-transfer logic now lives in
180
+ `summary-boundary.ts`, `summary-contract.ts`, `summary-result.ts`,
181
+ `summary-state.ts` and `summary-token-transfer.ts`. `routine-summaries.ts` and
182
+ `summaries.ts` combine routine summaries, external contracts and profile
183
+ summaries into lookup tables. A summary records the observable contract of a
184
+ routine: the units it reads, writes, preserves, clobbers and returns as outputs.
185
+
186
+ `src/register-contracts/liveness.ts` performs the caller-side analysis. It works
187
+ backwards through each routine. At a call, it compares the live-after set with
188
+ the callee summary. A live unit that the callee clobbers becomes a conflict. A
189
+ unit produced by the callee and read by the caller becomes an output candidate.
190
+
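The caller-side check can be sketched as a backward walk over one routine. This is a deliberately reduced model: the `Step` and `CalleeSummary` shapes are hypothetical, and the walk treats the routine as straight-line code, ignoring loop back edges and output candidates that the real `liveness.ts` would need to handle.

```typescript
// Hypothetical instruction model: the units each step reads and writes.
interface Step {
  readonly text: string;
  readonly reads: readonly string[];
  readonly writes: readonly string[];
  readonly callee?: string;
}

interface CalleeSummary {
  readonly clobbers: readonly string[];
}

// Walk backwards, tracking live units and reporting clobbered live units at calls.
function findCallConflicts(
  steps: readonly Step[],
  summaries: ReadonlyMap<string, CalleeSummary>,
): string[] {
  const conflicts: string[] = [];
  const live = new Set<string>();
  for (const step of [...steps].reverse()) {
    if (step.callee !== undefined) {
      // At this point `live` is the live-after set of the call.
      const clobbers = summaries.get(step.callee)?.clobbers ?? [];
      for (const unit of clobbers) {
        if (live.has(unit)) {
          conflicts.push(`${step.text}: ${unit} is live after the call but clobbered by ${step.callee}`);
        }
      }
    }
    for (const unit of step.writes) live.delete(unit); // a write ends liveness above it
    for (const unit of step.reads) live.add(unit); // a read makes the unit live above it
  }
  return conflicts;
}

// The Caller routine from the conflict example earlier in this chapter.
const callerSteps: Step[] = [
  { text: "ld b,8", reads: [], writes: ["B"] },
  { text: "call Worker", reads: [], writes: [], callee: "Worker" },
  { text: "djnz Loop", reads: ["B"], writes: ["B"] },
  { text: "ret", reads: [], writes: [] },
];

const callConflicts = findCallConflicts(
  callerSteps,
  new Map([["Worker", { clobbers: ["B"] }]]),
);
```

Walking backwards means that when the call is reached, the live set already reflects the `djnz` read of `B`, so the conflict is attributed to the call site.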
191
+ Stack behaviour is part of routine summaries. `summary.ts` tracks push, pop,
192
+ exchange-top and unknown stack effects. `routine-summaries.ts` infers summaries
193
+ to a fixed point so internal routine calls can see optimistic boundary
194
+ summaries before the final pass. Strict mode uses `stackBalanced` and
195
+ `hasUnknownStackEffect` to distinguish balanced stack use from a routine whose
196
+ boundary may leave the stack in an unknown state.
197
+
198
+ ## Reports, Interfaces and Tooling
199
+
200
+ `report.ts` renders human-readable `.regcontracts.txt` reports and `.asmi` interface
201
+ metadata. `annotate.ts`, `annotations.ts`, `fix.ts` and `sourceText.ts` support
202
+ source updates for generated AZMDoc comments and conservative fixes.
203
+
204
+ The CLI can request these behaviours through:
205
+
206
+ - `--reg-report`
207
+ - `--reg-interface`
208
+ - `--contracts`
209
+ - `--fix`
210
+ - `--accept-out`
211
+
212
+ `src/register-contracts/tooling.ts` exposes editor-friendly diagnostics and code
213
+ actions through `analyzeRegisterContractsForTools()`. Tooling diagnostics carry
214
+ file, line, column, message, fixability and optional text edits. An editor can
215
+ show the same register contract information that the CLI reports while using
216
+ normal editor actions for accepted fixes.
217
+
218
+ ## Changing Ops or Register Contracts
219
+
220
+ Op changes belong in `src/expansion/`, with tests under
221
+ `test/unit/expansion/` and integration tests for source-level behaviour.
222
+ Register contract changes usually begin in one of these files:
223
+
224
+ - Routine boundaries and calls: `programModel.ts`
225
+ - AZMDoc parsing: `smartComments.ts`, `smartCommentBlocks.ts`,
226
+ `smartCommentParsing.ts`, `interfaceContracts.ts`
227
+ - Instruction effects: `z80/effects.ts`, `instruction-head.ts`,
228
+ `instruction-operands.ts`, `instruction-predicates.ts`
229
+ - Summary inference: `summary.ts`
230
+ - Caller liveness: `liveness.ts`
231
+ - Output text: `report.ts`
232
+ - Source edits: `annotate.ts`, `fix.ts`, `annotations.ts`
233
+ - Tooling surface: `tooling.ts`
234
+
235
+ Run unit tests under `test/unit/register-contracts/`, integration tests under
236
+ `test/integration/register-contracts/` and CLI tests in
237
+ `test/cli/register_contracts_cli.test.ts`.