@jhlagado/azm 0.2.8 → 0.2.10
- package/README.md +75 -15
- package/dist/src/api-compile.js +27 -0
- package/dist/src/assembly/assemble-program.js +5 -0
- package/dist/src/assembly/import-visibility.d.ts +3 -0
- package/dist/src/assembly/import-visibility.js +204 -0
- package/dist/src/node/source-host.js +40 -13
- package/dist/src/outputs/write-asm80.js +4 -0
- package/dist/src/register-contracts/programModel-routines.js +33 -17
- package/dist/src/register-contracts/report.js +15 -3
- package/dist/src/register-contracts/smartCommentParsing.d.ts +1 -0
- package/dist/src/register-contracts/smartCommentParsing.js +42 -7
- package/dist/src/register-contracts/smartComments.d.ts +2 -2
- package/dist/src/register-contracts/smartComments.js +3 -4
- package/dist/src/source/logical-lines.d.ts +3 -0
- package/dist/src/source/source-span.d.ts +2 -0
- package/dist/src/syntax/parse-directive-statement.d.ts +1 -6
- package/dist/src/syntax/parse-directive-statement.js +3 -1
- package/dist/src/syntax/parse-layout-declarations.js +11 -2
- package/dist/src/syntax/parse-line.js +18 -2
- package/dist/src/tooling/api.js +1 -1
- package/docs/codebase/01-orientation-and-repository-layout.md +192 -0
- package/docs/codebase/02-source-loading-and-parsing.md +263 -0
- package/docs/codebase/03-assembly-and-z80-emission.md +251 -0
- package/docs/codebase/04-ops-and-register-contracts.md +237 -0
- package/docs/codebase/05-interfaces-and-output-artifacts.md +253 -0
- package/docs/codebase/06-verification-and-maintenance.md +202 -0
- package/docs/codebase/appendices/a-directory-file-reference.md +253 -0
- package/docs/codebase/appendices/b-compile-flow-reference.md +103 -0
- package/docs/codebase/appendices/c-public-surface-reference.md +106 -0
- package/docs/codebase/appendices/index.md +16 -0
- package/docs/codebase/index.md +46 -0
- package/package.json +2 -3
- package/docs/reference/cli.md +0 -158
- package/docs/reference/tooling-api.md +0 -320
@@ -0,0 +1,263 @@
---
layout: default
title: 'Chapter 2 - Source Loading and Parsing'
parent: 'AZM Engineering Manual'
nav_order: 2
---

[<- Orientation and Repository Layout](01-orientation-and-repository-layout.md) | [Assembly and Z80 Emission ->](03-assembly-and-z80-emission.md)

# Chapter 2 - Source Loading and Parsing

Source loading and parsing turn entry files into typed source items. This
chapter follows the path from a filename to the structured data that assembly,
tooling and register contracts consume.

The loading boundary lives in `src/node/source-host.ts`. The parser is
orchestrated by `parseNextSourceItems()` in `src/core/compile.ts`, with
single-line parsing in `src/syntax/parse-line.ts`. Expression and declaration
parsing is split across tokenizer, token-expression, directive and layout
modules in `src/syntax/`.

## Entry Files and Source Text

The public tooling and compile APIs enter loading through `loadProgramNext()` in
`src/tooling/api.ts`. That function calls `expandSourceForTooling()` and then
passes the expanded logical lines to `parseNextSourceItems()`.

`expandSourceForTooling()` accepts:

```ts
export interface LoadProgramNextOptions {
  readonly entryFile: string;
  readonly includeDirs?: readonly string[];
  readonly directiveAliasFiles?: readonly string[];
  readonly preloadedText?: string;
  readonly signal?: AbortSignal;
}
```

The entry file is normalised and checked for a source extension. AZM source
entries use `.asm` or `.z80`. `preloadedText` lets editor integrations parse an
unsaved buffer for the entry file while included files still come from disk.
`signal` lets an editor cancel stale work when a newer buffer arrives.
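
The extension check can be pictured with a small helper. This is a minimal sketch, not the loader's code: `hasSourceExtension` and `SOURCE_EXTENSIONS` are hypothetical names, and the case-insensitive comparison is an assumption the chapter does not confirm.

```typescript
// Hypothetical helper for illustration; not part of the AZM API.
const SOURCE_EXTENSIONS: readonly string[] = ['.asm', '.z80'];

function hasSourceExtension(entryFile: string): boolean {
  // Normalise Windows separators so the basename lookup works for both styles.
  const normalised = entryFile.replace(/\\/g, '/');
  const baseName = normalised.slice(normalised.lastIndexOf('/') + 1);
  const dot = baseName.lastIndexOf('.');
  if (dot <= 0) return false; // no extension, or a dotfile such as ".asm"
  // Assumption: extensions are compared case-insensitively.
  const extension = baseName.slice(dot).toLowerCase();
  return SOURCE_EXTENSIONS.indexOf(extension) >= 0;
}
```

Keeping the accepted extensions in one list means a tooling front end and a command-line front end would reject the same entry files.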

The loader keeps the full text of every loaded source file in `sourceTexts`.
Later stages use parsed source items for compiler logic, but several features
need original text:

- register contract annotation rewrites exact source lines
- tooling reads source text for diagnostics and code actions
- D8 map generation needs file names and line provenance
- case-style linting inspects original token case

Logical lines drive parsing. Source texts support tools that need to point back
into the user's files.

## Source Loading Directives

The tooling loader recognises two source-loading directives before parsing:
`.include` and `.import`.

`.include` is textual inclusion. The loader reads the entry file, scans it into
logical lines and recursively expands include directives. Include paths resolve
relative to the including source file first, then through configured include
directories.
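
The resolution order can be sketched as a pure function. Everything here is illustrative: `resolveSourcePath`, `dirName` and the injected `fileExists` predicate are hypothetical, paths are assumed to use forward slashes, and include directories are assumed to be searched in the order given.

```typescript
// Hypothetical sketch of the lookup order; not the AZM loader API.
type FileExists = (path: string) => boolean;

function dirName(path: string): string {
  const slash = path.lastIndexOf('/');
  return slash < 0 ? '.' : path.slice(0, slash);
}

function resolveSourcePath(
  target: string,
  includingFile: string,
  includeDirs: readonly string[],
  fileExists: FileExists,
): string | undefined {
  // 1. The directory of the file that contains the directive.
  // 2. Each configured include directory, in order (assumed).
  const searchDirs = [dirName(includingFile)].concat(includeDirs);
  for (const dir of searchDirs) {
    const candidate = `${dir}/${target}`;
    if (fileExists(candidate)) return candidate;
  }
  return undefined; // the caller would report an unresolved path
}
```

Injecting the existence check keeps the sketch testable without touching the file system.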

`.import` uses the same path resolution rule, but it starts a new source
ownership unit for tooling. Parsed items from the imported file still join the
same flattened logical line stream, though their spans record the imported file
as the owning unit. This lets tools distinguish entry-owned source, text pulled
in by `.include` and routines introduced by imported modules.

Repeated imports of the same resolved file are idempotent. The first import
loads and emits the module at the import point; later imports of that same
resolved file are skipped. Repeated includes remain textual and repeatable.
Recursive include or import stacks are diagnosed before parsing, with the
diagnostic naming the recursive source relation.
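
The difference between repeatable includes, idempotent imports and recursive stacks can be sketched as one decision function. The names `LoadState` and `decideLoad` are hypothetical, and checking for recursion before checking for an earlier import is an assumption about ordering, not a description of the loader.

```typescript
// Hypothetical sketch; the real loader's data structures are not shown in this chapter.
type SourceRelation = 'include' | 'import';
type LoadDecision = 'load' | 'skip' | 'recursive';

interface LoadState {
  readonly imported: Set<string>; // resolved files already emitted by an import
  readonly stack: string[]; // resolved files currently being expanded
}

function decideLoad(
  state: LoadState,
  relation: SourceRelation,
  resolvedFile: string,
): LoadDecision {
  // A file that is still being expanded cannot be loaded again (assumed to be checked first).
  if (state.stack.indexOf(resolvedFile) >= 0) return 'recursive';
  if (relation === 'import') {
    // Imports are idempotent per resolved file.
    if (state.imported.has(resolvedFile)) return 'skip';
    state.imported.add(resolvedFile);
  }
  // Includes are textual, so every include loads again.
  return 'load';
}
```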

Labels in imported source keep their physical source locations. Imported
`@Name:` labels are public exports visible to outside source as `Name`. Plain
labels in an imported file are private to that import unit. Text included from
inside an imported file remains part of that imported unit, so its plain labels
are private and only its `@` labels are exported.

Resolving paths relative to the including file first keeps library files portable. A library can include a sibling file and
still assemble when the entry file is run from another directory. Include
directories then act as project-level search paths for shared headers, vendor
source and imported modules.

The loader returns:

```ts
export interface ExpandedNextSource {
  readonly entryFile: string;
  readonly lines: readonly LogicalLine[];
  readonly sourceTexts: ReadonlyMap<string, string>;
  readonly sourceLineComments: ReadonlyMap<string, ReadonlyMap<number, string>>;
}
```

`lines` is the flattened source stream for parsing. `sourceTexts` keeps the
original file text. `sourceLineComments` keeps comments indexed by file and line
so register contract analysis can reconstruct AZMDoc contract blocks after routines have
been identified.

## Logical Lines and Comments

`src/source/logical-lines.ts` scans a `SourceFile` into `LogicalLine` objects. A
logical line records the source name, line number and original text. Tooling
loads can also attach `sourceUnit` and `sourceRelation`:

- `sourceUnit` is the owning file for the current tooling unit
- `sourceRelation` is `entry`, `include` or `import`

This thin structure gives every later diagnostic a stable location and enough
provenance for tooling features that need to reason about module ownership.

The source helpers are small and important:

| File                      | Role                                                   |
| ------------------------- | ------------------------------------------------------ |
| `source-file.ts`          | Wraps source text with a source name.                  |
| `logical-lines.ts`        | Splits text into line records.                         |
| `source-span.ts`          | Defines the common span shape.                         |
| `line-comment-scanner.ts` | Finds line comments while respecting quoted text.      |
| `strip-line-comment.ts`   | Removes semicolon comments through the shared scanner. |

`strip-line-comment.ts` is used by source-loading directive recognition, layout parsing,
conditional assembly and single-line parsing. Shared comment handling prevents
each stage from inventing a slightly different rule for semicolons inside
strings and character literals.
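
A scanner of this kind might look like the following sketch. It is not the contents of `line-comment-scanner.ts`: the function names are hypothetical, and it assumes double-quoted strings without escape sequences and character literals of the exact form `'c'`. That narrow literal rule keeps the apostrophe in the Z80 register name `af'` from opening a literal.

```typescript
// Hypothetical sketch of quote-aware comment detection; assumptions are noted inline.
// Returns the index of the semicolon that starts a line comment, or -1 if there is none.
function findLineCommentStart(line: string): number {
  let inString = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inString) {
      if (ch === '"') inString = false; // assumption: no escape sequences
      continue;
    }
    if (ch === '"') {
      inString = true;
      continue;
    }
    // Assumption: a character literal is exactly a quote, one character, a quote.
    if (ch === "'" && line.charAt(i + 2) === "'") {
      i += 2; // skip the literal, including a possible ';' inside it
      continue;
    }
    if (ch === ';') return i;
  }
  return -1;
}

function stripLineComment(line: string): string {
  const start = findLineCommentStart(line);
  return start < 0 ? line : line.slice(0, start);
}
```

Because every stage calls the same two functions, a semicolon inside `"a;b"` or `';'` is treated identically during directive recognition and during line parsing.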

## Directive Aliases

Directive aliases are loaded during `loadProgramNext()`:

```ts
const directiveAliasProfiles = await Promise.all(
  (options.directiveAliasFiles ?? []).map((path) => readDirectiveAliasProfile(path)),
);
const directiveAliasPolicy = buildDirectiveAliasPolicy(directiveAliasProfiles);
```

`src/syntax/directive-aliases.ts` owns the alias policy. Built-in aliases and
project alias files are normalised before line parsing. The parser then
receives canonical directive forms and emits canonical source items.

Aliases are a syntax boundary. They affect directive recognition before parsing.
The assembler-time model receives canonical source items.

## Source Items

The parser is the first place where AZM source becomes compiler data. Before
this point, a line is text with a file name and line number. After this point, a
line is a label, instruction, directive, layout declaration or comment item.

`src/model/source-item.ts` defines the parser output. The model includes:

- labels
- `.org`, `.equ`, `.db`, `.dw`, `.ds`, `.align`, string directives and `.end`
- instructions
- record and union layout declarations
- type aliases
- enums
- op-expanded items
- comments

Each item carries a source span where appropriate. Tooling spans now preserve
optional `sourceUnit` and `sourceRelation` fields when the loader attached them.
Assembly uses item kind to decide size and emission. Register contract analysis
uses instruction, label and comment items to build routines. D8 map output uses
spans to connect emitted bytes back to files and lines.

## Top-Level Parse Order

`parseNextSourceItems()` handles structural forms before ordinary line parsing:

1. `applyConditionalAssembly()` in `src/core/conditional-assembly.ts` filters
   the logical line stream.
2. `collectOps()` records top-level `op` definitions and marks their body lines.
3. Name-left `.typealias` declarations are parsed.
4. Record and union headers collect `.field` declarations until `.endtype` or
   `.endunion`.
5. Visible op invocations expand into ordinary source items.
6. `parseLogicalLine()` handles single-line labels, directives, data and
   instructions.

This order matters. Ops must be collected before invocation expansion. Layout
declarations must collect their body lines as one source item. Ordinary
instruction parsing should see the lines that remain after those structural
forms have been handled.

## Layout and Declaration Parsing

Name-left layout syntax is parsed in `parseNextSourceItems()` because a record
or union body spans multiple lines:

```asm
Sprite .type
x .field byte
y .field byte
tile .field byte
flags .field byte
.endtype
```

Fields are parsed as `LayoutField` values. Each field has a name and a type
expression. The parser checks declaration shape. `address-planning.ts` later
checks duplicate field names, layout size and type references.

Type aliases are parsed as named bindings:

```asm
SpriteArray .typealias Sprite[16]
```

The parser stores the alias target as a type expression. Assembly resolves the
target against scalar layout names, record names, union names and other type
aliases.

The parser also distinguishes address labels from declarations. An address
label uses a colon and becomes a label item. Name-left declarations become
equate, enum, type, union or type-alias items.

```asm
Start:
ret

COUNT .equ 8
```

A label contributes an address based on placement. An equate contributes an
assembler-time value based on expression evaluation.
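
The colon rule can be illustrated with a small classifier. This is a sketch under stated assumptions, not the logic in `parse-line.ts`: `classifyLine` and `LineShape` are hypothetical, and the identifier pattern (letters, digits and underscores) is a guess at what AZM accepts.

```typescript
// Hypothetical sketch; the identifier syntax below is an assumption.
type LineShape =
  | { readonly kind: 'label'; readonly labelName: string; readonly isPublic: boolean }
  | { readonly kind: 'declaration'; readonly declaredName: string; readonly directive: string }
  | { readonly kind: 'other' };

const IDENTIFIER = '[A-Za-z_][A-Za-z0-9_]*';

function classifyLine(text: string): LineShape {
  const trimmed = text.trim();
  // "Name:" or "@Name:" is an address label; "@" marks a public label.
  const label = new RegExp(`^(@?)(${IDENTIFIER}):`).exec(trimmed);
  if (label !== null) {
    return { kind: 'label', labelName: label[2] ?? '', isPublic: label[1] === '@' };
  }
  // "Name .directive ..." is a name-left declaration such as .equ or .type.
  const declaration = new RegExp(`^(${IDENTIFIER})\\s+(\\.[A-Za-z]+)\\b`).exec(trimmed);
  if (declaration !== null) {
    return {
      kind: 'declaration',
      declaredName: declaration[1] ?? '',
      directive: (declaration[2] ?? '').toLowerCase(),
    };
  }
  return { kind: 'other' };
}
```

With this shape, `Start:` yields a label and `COUNT .equ 8` yields a declaration, while an instruction such as `ret` falls through to ordinary line parsing.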

## Expressions and Conditionals

`src/syntax/expression-tokenizer.ts` tokenizes expression text.
`parse-token-expression.ts` builds expression trees from tokens.
`parse-expression.ts` is the public syntax wrapper used by line parsing.
`parse-layout-expression.ts` parses layout type expressions used by `.ds`,
`.field`, `.typealias`, `sizeof(...)`, `offset(...)` and layout casts.
`parse-directive-statement.ts` parses directive statements that need more than
single-token recognition.

The parser produces expression trees from `src/model/expression.ts`.
`src/semantics/expression-evaluation.ts` evaluates those trees when the
assembler-time environment is available.

Conditional assembly is handled before final line parsing. The conditional pass
keeps the active lines and removes inactive branches from the stream seen by
later stages. Ordinary parsing then receives one effective source program.

## Parse Diagnostics

`src/syntax/parse-diagnostics.ts` contains shared helpers for syntax errors.
Diagnostic IDs come from `src/model/diagnostic.ts`. Use those helpers when
adding parse failures so source positions, severity and code shape stay
consistent.

Parser recovery matters for editor tooling. A user may have a half-written line
while typing. Tooling still needs symbols, diagnostics and register contract hints
for surrounding source, so parse errors should usually report a diagnostic and
let parsing continue.
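
The recovery pattern can be sketched as a loop that records a diagnostic and moves on. The types and the `parseWithRecovery` name are hypothetical and much simpler than the shapes in `src/model/diagnostic.ts`; the sketch only shows the control flow.

```typescript
// Hypothetical sketch of report-and-continue parsing; not the AZM diagnostic model.
interface SourceLine {
  readonly file: string;
  readonly line: number;
  readonly text: string;
}

interface ParseDiagnostic {
  readonly file: string;
  readonly line: number;
  readonly message: string;
}

type LineResult<T> =
  | { readonly kind: 'item'; readonly item: T }
  | { readonly kind: 'error'; readonly message: string };

function parseWithRecovery<T>(
  lines: readonly SourceLine[],
  parseOne: (line: SourceLine) => LineResult<T>,
): { items: T[]; diagnostics: ParseDiagnostic[] } {
  const items: T[] = [];
  const diagnostics: ParseDiagnostic[] = [];
  for (const line of lines) {
    const result = parseOne(line);
    if (result.kind === 'item') {
      items.push(result.item);
    } else {
      // Record the failure against its source position and keep going.
      diagnostics.push({ file: line.file, line: line.line, message: result.message });
    }
  }
  return { items, diagnostics };
}
```

A half-written line then costs one diagnostic, and the items before and after it remain available for symbols and hints.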
@@ -0,0 +1,251 @@
---
layout: default
title: 'Chapter 3 - Assembly and Z80 Emission'
parent: 'AZM Engineering Manual'
nav_order: 3
---

[<- Source Loading and Parsing](02-source-loading-and-parsing.md) | [Ops and Register Contracts ->](04-ops-and-register-contracts.md)

# Chapter 3 - Assembly and Z80 Emission

Assembly turns source items into facts, bytes, fixups and source segments. This
chapter combines the assembler-time fact model with the Z80 emission path
because those two stages are tightly coupled: planning decides addresses and
sizes, then emission writes bytes using those facts.

The central files are:

- `src/assembly/address-planning.ts`
- `src/assembly/address-symbols.ts`
- `src/assembly/placement.ts`
- `src/assembly/program-emission.ts`
- `src/assembly/fixup-emission.ts`
- `src/assembly/assemble-program.ts`
- `src/semantics/expression-evaluation.ts`
- `src/semantics/constant-operators.ts`
- `src/semantics/layout-evaluation.ts`
- `src/z80/parse-instruction.ts`
- `src/z80/instruction.ts`
- `src/z80/encode.ts`
- `src/z80/effects.ts`

## Assembly Orchestration

`src/assembly/assemble-program.ts` is the entry point:

```ts
export function assembleProgram(items: readonly SourceItem[]): AssembleProgramResult;
```

It builds the address state, emits the program image and returns diagnostics,
symbols, byte maps and source segments. The function is intentionally small.
Detailed decisions live in the modules below it.

## Address Planning

`src/assembly/address-planning.ts` walks source items and builds the facts
needed to place and evaluate the program:

- labels and their addresses
- `.equ` constants and enum members
- record and union layouts
- type aliases
- `.org` placement state
- sizes for data, storage and instructions

The returned `AddressState` contains symbol facts, equate records, layout
records, enum names, diagnostics and placement information. This is where parsed
syntax becomes assembler knowledge. A label item becomes an address. A type
declaration becomes a layout record. A `.ds` directive becomes a byte count.

## Placement

`src/assembly/placement.ts` owns placement state. `.org` sets the active
address. Instructions advance by encoded size. `.db` advances by emitted byte
count. `.dw` advances by two bytes per expression. `.ds` advances by calculated
storage size. `.align` advances to the next aligned address.
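
These rules can be written as one address-advance function. The sketch is illustrative only: `PlacementStep` and `advanceAddress` are hypothetical names, and it assumes that `.align` leaves an already aligned address unchanged.

```typescript
// Hypothetical sketch of the placement rules; not the API of placement.ts.
type PlacementStep =
  | { readonly kind: 'org'; readonly address: number }
  | { readonly kind: 'instruction'; readonly encodedSize: number }
  | { readonly kind: 'db'; readonly byteCount: number }
  | { readonly kind: 'dw'; readonly expressionCount: number }
  | { readonly kind: 'ds'; readonly storageSize: number }
  | { readonly kind: 'align'; readonly boundary: number };

function advanceAddress(address: number, step: PlacementStep): number {
  switch (step.kind) {
    case 'org':
      return step.address; // sets the active address outright
    case 'instruction':
      return address + step.encodedSize;
    case 'db':
      return address + step.byteCount;
    case 'dw':
      return address + 2 * step.expressionCount; // two bytes per expression
    case 'ds':
      return address + step.storageSize;
    case 'align':
      // Assumption: an address that is already aligned does not move.
      return Math.ceil(address / step.boundary) * step.boundary;
  }
}
```

Folding a list of steps through `advanceAddress` gives the address of every item without emitting a single byte, which is what lets labels resolve before emission.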

Placement is separate from byte emission so the assembler can resolve symbols
before every byte has been written. A branch to a later label can be encoded
after address planning has seen the label definition.

## Symbols and Layouts

Labels define addresses. `.equ` declarations define assembler-time values.
Enums define qualified constants. `src/assembly/address-symbols.ts` contains
the symbol helpers that define labels, equates and enum members, enforce
duplicate rules and record spans for diagnostics and output metadata.

Record and union declarations become layout records. A record field advances the
offset by its byte size. A union field starts at offset zero and the union size
is the largest field size.

For a record:

```asm
Sprite .type
x .field byte
y .field byte
tile .field byte
flags .field byte
.endtype
```

the layout record stores `x = 0`, `y = 1`, `tile = 2`, `flags = 3` and total
size `4`.
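
The record and union rules reduce to two short loops. This is a sketch with hypothetical names (`FieldSize`, `LayoutSketch`, `layoutRecord`, `layoutUnion`); field sizes are passed in already resolved, whereas the assembler derives them from type expressions.

```typescript
// Hypothetical sketch of offset assignment; not the layout records used by AZM.
interface FieldSize {
  readonly fieldName: string;
  readonly size: number; // byte size, assumed to be resolved already
}

interface LayoutSketch {
  readonly offsets: Map<string, number>;
  readonly size: number;
}

function layoutRecord(fields: readonly FieldSize[]): LayoutSketch {
  const offsets = new Map<string, number>();
  let offset = 0;
  for (const field of fields) {
    offsets.set(field.fieldName, offset);
    offset += field.size; // each record field advances the offset
  }
  return { offsets, size: offset };
}

function layoutUnion(fields: readonly FieldSize[]): LayoutSketch {
  const offsets = new Map<string, number>();
  let size = 0;
  for (const field of fields) {
    offsets.set(field.fieldName, 0); // every union field starts at zero
    size = Math.max(size, field.size); // the union is as large as its largest field
  }
  return { offsets, size };
}
```

For the `Sprite` record above, four one-byte fields give offsets 0 to 3 and a size of 4, so an array of 16 sprites occupies 64 bytes.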

Type aliases bind a name to another type expression:

```asm
SpriteArray .typealias Sprite[16]
```

`sizeof(SpriteArray)` resolves through the alias to the size of `Sprite[16]`,
with the same field paths as the underlying array expression.

## Expression Evaluation

`src/semantics/expression-evaluation.ts` evaluates expression trees against the
assembler-time environment. It coordinates literal arithmetic, symbol lookup,
constant operators, byte functions, layout functions and layout casts that fold
to constant addresses.

The operator and layout rules live in focused modules:

| File                    | Role                                                                |
| ----------------------- | ------------------------------------------------------------------- |
| `constant-operators.ts` | Binary and unary constant-operator dispatch.                        |
| `binary-operators.ts`   | Binary arithmetic, bitwise, comparison and logical operators.       |
| `unary-operators.ts`    | Unary numeric operators.                                            |
| `byte-functions.ts`     | `LSB(...)`, `MSB(...)` and related byte extraction helpers.         |
| `layout-evaluation.ts`  | `sizeof(...)`, `offset(...)` and layout type expression evaluation. |
| `layout-path.ts`        | Field and array path resolution for layout casts.                   |
| `layout-format.ts`      | Human-readable layout names for diagnostics.                        |
| `diagnostics.ts`        | Shared semantic diagnostic construction.                            |

Expression evaluation is context-sensitive. A symbol in an instruction operand
may be a label or constant. A type expression in `.ds Sprite[4]` resolves to a
byte count. A layout cast such as `<SpriteArray>Sprites[3].flags` resolves to an
address when the base address, type alias, index and field path are all known at
assembly time.

## Data and Storage Size

Address planning needs the byte length of each directive. It contains size
helpers for `.db`, `.dw`, `.ds`, strings and alignment. The same string
directive byte rules are reused during emission so planning and output stay
aligned.

Initialized data writes bytes through `.db`, `.dw`, `.cstr`, `.pstr` and
`.istr`. `.ds` reserves space calculated from numbers or layout type
expressions. Layout type expressions are byte-size expressions in storage and
field-size positions.

## Program Emission

`src/assembly/program-emission.ts` owns byte writing. It walks source items in
order and writes emitted bytes to a map keyed by absolute address. It also
records source segments for the D8 map writer.

The emitted program shape is:

```ts
export interface EmittedProgram {
  readonly image: ReadonlyMap<number, number>;
  readonly sourceSegments: readonly EmittedSourceSegment[];
  readonly initializedAddresses: readonly number[];
}
```

The byte map stores final addresses. If source emits one byte at `$0100` and
another at `$8000`, the map has two entries. A sparse map fits Z80 programs that
place code, data and ROM vectors at separate origins.

## Emission Walkthrough

For this source:

```asm
.org $0100
Start:
ld a,42
jp Start
```

address planning records `Start = $0100`. The instruction parser represents
`ld a,42` as an immediate load and `jp Start` as an absolute branch. The encoder
returns literal opcode bytes plus a 16-bit expression fragment for `Start`.
Fixup emission evaluates `Start` to `$0100` and writes the little-endian operand
bytes.

The final byte map contains:

```text
$0100: 3E
$0101: 2A
$0102: C3
$0103: 00
$0104: 01
```

The D8 source segment records that those bytes came from the source lines that
emitted them.
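
The walkthrough can be reproduced with a toy fragment writer. The `EmitFragment` type and `emitFragments` function are hypothetical and far simpler than the encoder's real fragments: a fragment here is either literal bytes or a 16-bit reference to a named symbol.

```typescript
// Hypothetical sketch of fragment emission into a sparse byte map.
type EmitFragment =
  | { readonly kind: 'bytes'; readonly values: readonly number[] }
  | { readonly kind: 'imm16'; readonly symbol: string };

function emitFragments(
  image: Map<number, number>,
  start: number,
  fragments: readonly EmitFragment[],
  symbols: Map<string, number>,
): number {
  let address = start;
  for (const fragment of fragments) {
    if (fragment.kind === 'bytes') {
      for (const value of fragment.values) image.set(address++, value);
    } else {
      const value = symbols.get(fragment.symbol);
      if (value === undefined) throw new Error(`undefined symbol ${fragment.symbol}`);
      image.set(address++, value & 0xff); // low byte first (little-endian)
      image.set(address++, (value >> 8) & 0xff); // then the high byte
    }
  }
  return address; // next free address
}

// Reproduce the walkthrough: ld a,42 followed by jp Start at $0100.
const walkthroughSymbols = new Map<string, number>([['Start', 0x0100]]);
const walkthroughImage = new Map<number, number>();
const afterLoad = emitFragments(
  walkthroughImage,
  0x0100,
  [{ kind: 'bytes', values: [0x3e, 42] }],
  walkthroughSymbols,
);
const afterJump = emitFragments(
  walkthroughImage,
  afterLoad,
  [{ kind: 'bytes', values: [0xc3] }, { kind: 'imm16', symbol: 'Start' }],
  walkthroughSymbols,
);
```

After both calls `walkthroughImage` holds the five entries listed above and `afterJump` is `$0105`.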

## Z80 Instruction Model

`src/z80/instruction.ts` defines the instruction and operand types.
`src/z80/parse-instruction.ts` dispatches instruction text into parser families.
`src/z80/encode.ts` dispatches typed instructions into encoder families.
`src/z80/effects.ts` describes register and flag effects for register contracts.

The parser and encoder work as a pair:

| File                                                                                           | Question                                                                                     |
| ---------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
| `parse-instruction.ts`                                                                         | Which parser family should handle this instruction head?                                     |
| `parse-basic.ts`, `parse-branch.ts`, `parse-exchange.ts`, `parse-io-control.ts`, `parse-ld.ts` | Parse specific instruction families.                                                         |
| `parse-operands.ts`, `parse-conditions.ts`, `operand-split.ts`                                 | Split operands and classify conditions, registers, indexed operands and expression operands. |
| `instruction.ts`                                                                               | How is that form represented as TypeScript data?                                             |
| `encode.ts`, `encode-core.ts`, `encode-ld.ts`, `encode-ld-helpers.ts`                          | Which bytes and fixup fragments represent that form?                                         |
| `effects.ts`                                                                                   | Which registers and flags does that form read or write?                                      |

The instruction model keeps overloaded Z80 mnemonics manageable. `ld` has many
forms, but the parser classifies operands before encoding. The encoder then
selects a specific opcode family from typed operands.

## Fixups and Relative Branches

The Z80 encoder returns fragments: literal bytes, 8-bit immediates, 16-bit
immediates and relative branch targets. `src/assembly/fixup-emission.ts`
evaluates the expression attached to each fragment and writes the final byte or
word.

Relative branches such as `jr` and `djnz` emit an 8-bit displacement from the
next instruction. The parser recognises `jr nz,Loop`. The encoder knows the
opcode and that the operand is relative. Fixup emission knows the current
address and final target address.
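
The displacement arithmetic is small enough to show directly. `relativeDisplacement` is a hypothetical helper, not a function from `fixup-emission.ts`; it relies on the fact that `jr` and `djnz` are two-byte instructions, so the next instruction sits two bytes after the branch.

```typescript
// Hypothetical sketch of the relative-branch fixup.
function relativeDisplacement(instructionAddress: number, targetAddress: number): number {
  // The displacement is measured from the instruction that follows the branch.
  const displacement = targetAddress - (instructionAddress + 2);
  if (displacement < -128 || displacement > 127) {
    throw new RangeError(`relative branch out of range: ${displacement}`);
  }
  return displacement & 0xff; // signed value stored as a two's-complement byte
}
```

A `jr` at `$0105` that targets `$0100` has a displacement of -7, which is written as the byte `$F9`. A target more than 127 bytes ahead or 128 bytes behind the next instruction cannot be encoded and has to be reported instead.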

The same principle applies to absolute branches and immediate operands. The
encoder chooses the fragment width. Fixup emission evaluates the expression and
checks that the value fits that width.

## Source Segments

Every emitted byte range can carry source provenance. `program-emission.ts`
adds `EmittedSourceSegment` records with start address, end address, source
file, line and segment kind. `writeD8m()` later groups these segments by file
for Debug80.

Source segments classify emitted ranges as code, data, directive output, label
context or unknown output. Debug80 uses this metadata to connect source lines to
addresses. A byte-perfect assembler can still give a poor debugging experience
when source segments are too broad, absent or attached to a different file.

## Changing Assembly or Encoding

Instruction changes usually touch `parse-instruction.ts`, `instruction.ts`,
`encode.ts` and `effects.ts`. Assembly changes usually touch
`address-planning.ts`, `program-emission.ts`, `fixup-emission.ts` or
`placement.ts`, depending on whether the change affects facts, emitted bytes,
symbolic references or address movement.