marc-ts 0.1.0 → 0.2.0

package/README.md CHANGED
@@ -7,12 +7,11 @@
7
7
 
8
8
  ## Features
9
9
 
10
- - **Immutable API** - All operations return new objects, never mutate existing records
11
- - **Zero runtime dependencies** - Works in browsers and Node.js (≥14) without any dependencies
12
- - **Type-safe** - Full TypeScript type definitions with strict typing
13
- - **Well-tested** - >90% code coverage with comprehensive test suite
14
- - **Universal** - Runs in Node.js and modern browsers (Chrome, Firefox, Safari, Edge)
15
- - **Functional design** - Pure functions for composability and predictability
10
+ - **Four formats** - ISO2709 binary, MARCXML, MARC-in-JSON, MARCBreaker/marctxt
11
+ - **Consistent API** - every format uses `parse*(input) → MarcRecord[]` and `serialize*(records[]) → <native>`
12
+ - **Immutable** - all operations return new objects, never mutate
13
+ - **Zero dependencies** - works in Node.js and modern browsers
14
+ - **Fully typed** - strict TypeScript throughout
16
15
 
17
16
  ## Installation
18
17
 
@@ -23,620 +22,318 @@ npm install marc-ts
23
22
  ## Quick Start
24
23
 
25
24
  ```typescript
26
- import { parseMarcRecord, serializeMarcRecord, title, author, isbn, subjects } from 'marc-ts';
25
+ import { parseMarcBinary, serializeMarcBinary, title, author } from 'marc-ts';
27
26
  import { parseMarcXml, serializeMarcXml } from 'marc-ts/xml';
28
27
  import { parseMarcJson, serializeMarcJsonString } from 'marc-ts/json';
29
28
  import { parseMarcTxt, serializeMarcTxt } from 'marc-ts/txt';
30
29
 
31
- // --- ISO 2709 binary (MARC21) ---
32
- const buffer = new Uint8Array([...]); // Your MARC21 binary data
33
- const result = parseMarcRecord(buffer);
30
+ // Binary (ISO2709): splits on 0x1D, returns all records
31
+ const records = parseMarcBinary(buffer);
32
+ console.log(title(records[0]));
33
+ console.log(author(records[0]));
34
+ const binary = serializeMarcBinary(records);
34
35
 
35
- if (result.record) {
36
- console.log('Title:', title(result.record));
37
- console.log('Author:', author(result.record));
38
- console.log('ISBNs:', isbn(result.record));
39
- console.log('Subjects:', subjects(result.record));
40
- }
41
- if (result.warnings.length > 0) {
42
- console.warn('Parsing warnings:', result.warnings);
43
- }
44
-
45
- // Serialize back to binary (UTF-8 by default; pass { encoding: 'marc8' } for MARC-8 output)
46
- const binary = serializeMarcRecord(result.record!, { encoding: 'utf8' });
47
-
48
- // --- MARCXML ---
49
- const xmlString = `<?xml version="1.0"?>
50
- <collection xmlns="http://www.loc.gov/MARC21/slim">
51
- <record>
52
- <leader>00000nam a2200000 4500</leader>
53
- <datafield tag="245" ind1="1" ind2="0">
54
- <subfield code="a">The Hobbit</subfield>
55
- </datafield>
56
- </record>
57
- </collection>`;
58
-
59
- const [xmlRecord] = parseMarcXml(xmlString);
60
- console.log('Title from XML:', title(xmlRecord));
61
- const roundtripXml = serializeMarcXml([xmlRecord]); // back to a <collection> document
62
-
63
- // --- MARC-in-JSON ---
64
- const jsonString = JSON.stringify({
65
- leader: '00000nam a2200000 4500',
66
- fields: [
67
- { '245': { subfields: [{ a: 'The Hobbit' }], ind1: '1', ind2: '0' } },
68
- ],
69
- });
70
-
71
- const jsonRecord = parseMarcJson(jsonString);
72
- console.log('Title from JSON:', title(jsonRecord));
73
- const roundtripJson = serializeMarcJsonString(jsonRecord); // back to a JSON string
74
-
75
- // --- MARCBreaker (marctxt) ---
76
- const txtString = `=LDR 00000nam a2200000 4500
77
- =001 5490
78
- =245 10$aThe Hobbit /$cJ.R.R. Tolkien.
79
- `;
80
-
81
- const [txtRecord] = parseMarcTxt(txtString);
82
- console.log('Title from marctxt:', title(txtRecord));
83
- const roundtripTxt = serializeMarcTxt([txtRecord]); // back to marctxt string
84
-
85
- // --- MARC-8 binary ---
86
- // parseMarcRecord detects MARC-8 automatically from leader byte 9 (' ');
87
- // records are decoded to Unicode transparently — no special handling needed.
88
- const marc8Buffer = new Uint8Array([...]); // MARC-8 encoded binary
89
- const marc8Result = parseMarcRecord(marc8Buffer); // decoded to Unicode automatically
90
- ```
91
-
92
- ## Why marc-ts?
36
+ // MARCXML — parses <collection> or bare <record> elements
37
+ const xmlRecords = parseMarcXml(xmlString);
38
+ const xml = serializeMarcXml(xmlRecords);
93
39
 
94
- Existing JavaScript/TypeScript MARC libraries are often:
95
- - Node.js-only (using streams, fs, Buffer APIs)
96
- - Class-based OOP patterns that don't leverage TypeScript's strengths
97
- - Mutable APIs that can lead to unexpected bugs
98
- - Lacking comprehensive type definitions
40
+ // MARC-in-JSON: accepts array, single object, or JSON string
41
+ const jsonRecords = parseMarcJson(jsonString);
42
+ const json = serializeMarcJsonString(jsonRecords);
99
43
 
100
- **marc-ts** addresses these limitations:
101
- - Universal browser and Node.js compatibility
102
- - TypeScript-native with full type safety and functional patterns
103
- - Immutable operations for safer, more predictable code
104
- - Zero runtime dependencies for minimal bundle size
105
-
106
- ## Core Concepts
107
-
108
- ### Immutability
109
-
110
- Mutation-style operations in **marc-ts** return new records or fields rather than modifying existing ones:
111
-
112
- ```typescript
113
- const updated = appendField(record, field); // record remains unchanged
44
+ // MARCBreaker: records separated by blank lines
45
+ const txtRecords = parseMarcTxt(txtString);
46
+ const txt = serializeMarcTxt(txtRecords);
114
47
  ```
115
48
 
116
- This approach prevents accidental mutations and makes code easier to reason about, especially in reactive frameworks like React or Vue.
49
+ ## Formats
117
50
 
118
- ### Functional API
119
-
120
- **marc-ts** uses pure functions for maximum composability:
51
+ ### ISO2709 Binary (`marc-ts`)
121
52
 
122
53
  ```typescript
123
- // Extract metadata using pure functions
124
- const bookTitle = title(record);
125
- const bookAuthor = author(record);
126
-
127
- // Access fields functionally
128
- const titleField = getField(record, '245');
54
+ import { parseMarcBinary, serializeMarcBinary } from 'marc-ts';
55
+ import type { ParseOptions, SerializeOptions } from 'marc-ts';
129
56
  ```
130
57
 
131
- ### Type Safety
58
+ #### `parseMarcBinary(buffer, options?): MarcRecord[]`
132
59
 
133
- Full TypeScript types ensure compile-time correctness:
60
+ Parse a concatenated ISO2709 binary stream. Splits on `0x1D` (RECORD_TERMINATOR) and parses each record. Records that fail to parse are silently skipped in lenient mode; with `strict: true` the first error throws.
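
The splitting step described above can be pictured with a small standalone sketch. It is not taken from the marc-ts source: `splitRecords` and `RECORD_TERMINATOR` as written here are hypothetical names used only for illustration, and the real parser may handle trailing bytes differently.

```typescript
// Hypothetical helper, not part of marc-ts: cut a concatenated ISO2709
// stream into per-record byte slices, keeping each 0x1D terminator.
const RECORD_TERMINATOR = 0x1d;

function splitRecords(stream: Uint8Array): Uint8Array[] {
  const chunks: Uint8Array[] = [];
  let start = 0;
  for (let i = 0; i < stream.length; i++) {
    if (stream[i] === RECORD_TERMINATOR) {
      // subarray returns a view over the same memory; nothing is copied
      chunks.push(stream.subarray(start, i + 1));
      start = i + 1;
    }
  }
  // Assumption: trailing bytes without a terminator are kept as a final
  // chunk so that a caller can decide whether to skip or report them.
  if (start < stream.length) {
    chunks.push(stream.subarray(start));
  }
  return chunks;
}
```

Each returned slice would then be handed to a single-record parser; in lenient mode a failing slice is dropped, in strict mode the first failure is rethrown.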
134
61
 
135
62
  ```typescript
136
- import type { MarcRecord, DataField } from 'marc-ts';
137
- import { isDataField } from 'marc-ts';
138
-
139
- const field = getField(record, '245');
140
- if (field && isDataField(field)) {
141
- // TypeScript knows field is a DataField
142
- const titleValue = getSubfield(field, 'a');
143
- }
63
+ const records = parseMarcBinary(buffer);
64
+ const strict = parseMarcBinary(buffer, { strict: true });
144
65
  ```
145
66
 
146
- ## API Reference
147
-
148
- ### Parsing and Serialization
67
+ **`ParseOptions`**
149
68
 
150
- #### `parseMarcRecord(buffer, options?): ParseResult`
151
-
152
- Parse ISO2709 binary data into a MARC record.
153
-
154
- ```typescript
155
- const result = parseMarcRecord(buffer, {
156
- strict: false, // If true, throw on fatal parse errors
157
- maxWarnings: 100, // Maximum warnings to collect
158
- });
159
-
160
- if (result.record) {
161
- // Successfully parsed
162
- } else {
163
- // Parsing failed, check result.warnings
164
- }
165
- ```
69
+ | Option | Type | Default | Description |
70
+ |--------|------|---------|-------------|
71
+ | `strict` | `boolean` | `false` | Throw on fatal parse errors instead of skipping |
72
+ | `maxWarnings` | `number` | `100` | Stop collecting warnings after this many |
166
73
 
167
- Recoverable issues may still be returned in `warnings`, such as MARC leader compatibility warnings.
74
+ **Character encoding.** Leader byte 9 controls decoding: `'a'` = UTF-8, `' '` (space) = MARC-8. MARC-8 decoding handles ANSEL Latin, Greek, Hebrew, Cyrillic, Arabic, and subscript/superscript scripts via escape-designated sequences. EACC/CJK coverage is limited (~33 of ~16,000 official triples); records with substantial CJK content will mostly decode to U+FFFD — prefer UTF-8 sources for CJK catalogs.
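
As a rough illustration of the leader check described above, the following sketch reads byte 9 of a raw record and maps it to an encoding label. `detectEncoding` and `MarcEncoding` are hypothetical names, not marc-ts exports, and the `'unknown'` fallback is an assumption about how other values might be treated.

```typescript
// Hypothetical helper, not part of marc-ts.
type MarcEncoding = 'utf8' | 'marc8' | 'unknown';

function detectEncoding(record: Uint8Array): MarcEncoding {
  // The MARC leader occupies the first 24 bytes of a record.
  if (record.length < 24) return 'unknown';
  const flag = record[9];
  if (flag === 0x61) return 'utf8'; // 'a'
  if (flag === 0x20) return 'marc8'; // ' ' (space)
  // Assumption: any other value is reported rather than guessed at.
  return 'unknown';
}
```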
168
75
 
169
- #### `parseMarcRecordStrict(buffer): MarcRecord`
76
+ #### `serializeMarcBinary(records, options?): Uint8Array`
170
77
 
171
- Convenience wrapper for strict parsing (throws on fatal parse errors).
78
+ Serialize an array of records to a concatenated ISO2709 binary stream. Each record is individually serialized with its own `0x1D` terminator.
172
79
 
173
80
  ```typescript
174
- try {
175
- const record = parseMarcRecordStrict(buffer);
176
- } catch (error) {
177
- console.error('Parsing failed:', error);
178
- }
81
+ const buffer = serializeMarcBinary(records);
82
+ const marc8Buffer = serializeMarcBinary(records, { encoding: 'marc8' });
179
83
  ```
180
84
 
181
- #### `serializeMarcRecord(record): Uint8Array`
85
+ **`SerializeOptions`**
182
86
 
183
- Serialize a MARC record to ISO2709 binary format.
87
+ | Option | Type | Default | Description |
88
+ |--------|------|---------|-------------|
89
+ | `encoding` | `'utf8' \| 'marc8'` | `'utf8'` | Character encoding; `'marc8'` replaces unsupported Unicode with `?` |
184
90
 
185
- ```typescript
186
- const buffer = serializeMarcRecord(record);
187
- // Can be written to file or transmitted over network
188
- ```
91
+ ---
189
92
 
190
- `parseMarcRecord` decodes UTF-8 records and MARC-8 records signaled by leader
191
- byte 9. MARC-8 decoding handles escape-designated scripts such as ANSEL Latin,
192
- Greek, Hebrew, Cyrillic, Arabic, subscript/superscript, and mapped EACC/CJK
193
- triples. MARC-8 serialization is intentionally conservative: `encoding:
194
- 'marc8'` writes ASCII plus ANSEL Latin/combining characters and replaces
195
- unsupported Unicode characters with `?`.
196
-
197
- **EACC coverage caveat:** the bundled EACC table maps only ~33 of the ~16,000
198
- official triples. Records with substantial Chinese/Japanese/Korean content
199
- will mostly decode to U+FFFD. For CJK catalogs, prefer UTF-8 sources
200
- (`leader[9] === 'a'`).
201
-
202
- **Surfacing lossy MARC-8 encoding:** because `serializeMarcRecord` returns a
203
- plain `Uint8Array`, lossy substitutions are invisible to callers. Use
204
- `serializeMarcRecordWithWarnings(record, { encoding: 'marc8' })` to get
205
- `{ bytes, warnings }` — any character that could not be encoded surfaces as
206
- an `encoding_error` warning. For just-the-encoder visibility, use
207
- `unicodeToMarc8WithStats(text)` to get `{ bytes, lossyCount }`.
208
-
209
- ### Convenience Accessors
210
-
211
- Extract common bibliographic metadata:
212
-
213
- | Function | Field | Description | Example |
214
- |----------|-------|-------------|---------|
215
- | `title(record)` | 245 $a$b | Full title with subtitle | `"The Catcher in the Rye"` |
216
- | `titleProper(record)` | 245 $a | Main title only | `"The Catcher in the Rye"` |
217
- | `author(record)` | 100/110 $a | Main author/creator | `"Salinger, J. D."` |
218
- | `edition(record)` | 250 $a | Edition statement | `"1st ed."` |
219
- | `publisher(record)` | 260/264 $b | Publisher name | `"Little, Brown,"` |
220
- | `publicationDate(record)` | 260/264 $c | Publication date | `"1951."` |
221
- | `isbn(record)` | 020 $a | ISBN(s) - array | `["978-0-316-76948-0"]` |
222
- | `issn(record)` | 022 $a | ISSN | `"0028-0836"` |
223
- | `lccn(record)` | 010 $a | Library of Congress Control Number | `"50011915"` |
224
- | `subjects(record)` | 6XX $a | All subject headings - array | `["Fiction", "History"]` |
225
- | `seriesStatement(record)` | 490 $a | Series statement | `"Penguin classics"` |
226
-
227
- ### Field Access
228
-
229
- #### `getField(record, tag): ControlField | DataField | undefined`
230
-
231
- Get the first field with a specific tag.
93
+ ### MARCXML (`marc-ts/xml`)
232
94
 
233
95
  ```typescript
234
- const titleField = getField(record, '245');
96
+ import { parseMarcXml, serializeMarcXml } from 'marc-ts/xml';
235
97
  ```
236
98
 
237
- #### `getFields(record, tag): (ControlField | DataField)[]`
99
+ #### `parseMarcXml(xml): MarcRecord[]`
238
100
 
239
- Get all fields with a specific tag.
101
+ Parse a MARCXML string. Accepts a `<collection>` document, bare `<record>` elements, or namespace-prefixed variants (e.g. `marc:record`). Returns all records found; returns `[]` for empty or record-free input.
240
102
 
241
103
  ```typescript
242
- const subjectFields = getFields(record, '650');
104
+ const records = parseMarcXml(xmlString);
243
105
  ```
244
106
 
245
- #### `getSubfield(field, code): string | undefined`
107
+ #### `serializeMarcXml(records): string`
246
108
 
247
- Get the first subfield value from a data field.
109
+ Serialize records to a full MARCXML `<collection>` document with XML declaration and MARC21 namespace.
248
110
 
249
111
  ```typescript
250
- const field = getField(record, '245');
251
- if (field && isDataField(field)) {
252
- const titleValue = getSubfield(field, 'a');
253
- }
112
+ const xml = serializeMarcXml(records);
254
113
  ```
255
114
 
256
- #### `getSubfields(field, code): string[]`
115
+ ---
257
116
 
258
- Get all subfield values with a specific code (for repeatable subfields).
117
+ ### MARC-in-JSON (`marc-ts/json`)
259
118
 
260
119
  ```typescript
261
- const field = getField(record, '650');
262
- if (field && isDataField(field)) {
263
- const subdivisions = getSubfields(field, 'x');
264
- }
120
+ import { parseMarcJson, serializeMarcJson, serializeMarcJsonString } from 'marc-ts/json';
121
+ import type { MarcJsonObject } from 'marc-ts/json';
265
122
  ```
266
123
 
267
- #### `getAllSubfields(field): Array<{ code: string; value: string }>`
124
+ The [MARC-in-JSON](https://wiki.code4lib.org/MARCJSONification) format represents each field as a single-key object:
268
125
 
269
- Get all subfields from a data field.
270
-
271
- ```typescript
272
- const field = getField(record, '245');
273
- if (field && isDataField(field)) {
274
- const allSubfields = getAllSubfields(field);
126
+ ```json
127
+ {
128
+ "leader": "01142cam a2200301 a 4500",
129
+ "fields": [
130
+ { "001": "5490" },
131
+ { "245": { "subfields": [{ "a": "The Hobbit" }], "ind1": "1", "ind2": "0" } }
132
+ ]
275
133
  }
276
134
  ```
277
135
 
278
- ### Wildcard Querying
279
-
280
- #### `getFieldsByPattern(record, pattern): (ControlField | DataField)[]`
281
-
282
- Match fields using wildcard patterns (`.` or `X` = any digit).
283
-
284
- ```typescript
285
- // Get all 6XX subject fields
286
- const subjects = getFieldsByPattern(record, '6..');
287
-
288
- // Get all 7XX added entry fields
289
- const addedEntries = getFieldsByPattern(record, '7XX');
290
-
291
- // Get all X00 fields (100, 200, ..., 900)
292
- const x00Fields = getFieldsByPattern(record, 'X00');
293
- ```
294
-
295
- #### `getFirstFieldByPattern(record, pattern): ControlField | DataField | undefined`
296
-
297
- Get the first field matching a wildcard pattern.
298
-
299
- ```typescript
300
- const firstSubject = getFirstFieldByPattern(record, '6..');
301
- ```
136
+ #### `parseMarcJson(json): MarcRecord[]`
302
137
 
303
- ### Field Operations (Immutable)
138
+ Parse MARC-in-JSON into records. Accepts:
139
+ - A JSON string whose top-level value is an array or a single object
140
+ - A `MarcJsonObject[]` array
141
+ - A single `MarcJsonObject`
304
142
 
305
- All operations return new records/fields without mutating the original.
306
-
307
- #### `appendField(record, field): MarcRecord`
308
-
309
- Append a field to the end of a record.
143
+ Always returns `MarcRecord[]`. Throws on structural errors.
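
One way to picture the three accepted input shapes collapsing into a single array is the sketch below. It is a standalone illustration under stated assumptions: `JsonRecordLike`, `isRecordLike`, and `toRecordArray` are hypothetical names, and the structural check is far looser than whatever validation the library actually performs.

```typescript
// Hypothetical minimal shape, for illustration only.
interface JsonRecordLike {
  leader: string;
  fields: unknown[];
}

function isRecordLike(value: unknown): value is JsonRecordLike {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as { leader?: unknown; fields?: unknown };
  return typeof candidate.leader === 'string' && Array.isArray(candidate.fields);
}

function toRecordArray(
  input: string | JsonRecordLike | JsonRecordLike[],
): JsonRecordLike[] {
  // Strings are parsed first; objects and arrays pass through unchanged.
  const value: unknown = typeof input === 'string' ? JSON.parse(input) : input;
  // A single object becomes a one-element array.
  const items: unknown[] = Array.isArray(value) ? value : [value];
  for (const item of items) {
    if (!isRecordLike(item)) {
      throw new TypeError('Expected an object with a leader string and a fields array');
    }
  }
  return items as JsonRecordLike[];
}
```

Normalizing up front means the rest of a parser only ever deals with one shape, which is presumably why the 0.2.0 signature always returns an array.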
310
144
 
311
145
  ```typescript
312
- const newField: DataField = {
313
- tag: '650',
314
- indicator1: ' ',
315
- indicator2: '0',
316
- subfields: [{ code: 'a', value: 'New subject' }],
317
- };
318
-
319
- const updated = appendField(record, newField);
320
- // record is unchanged, updated has the new field
146
+ const records = parseMarcJson(jsonString); // JSON string (array or single)
147
+ const fromArray = parseMarcJson([obj1, obj2]); // plain object array
148
+ const fromObject = parseMarcJson(singleObj); // single object → one-element array
321
149
  ```
322
150
 
323
- #### `insertFieldBefore(record, tag, field): MarcRecord`
151
+ #### `serializeMarcJson(records): MarcJsonObject[]`
324
152
 
325
- Insert a field before the first occurrence of a tag.
153
+ Serialize records to an array of MARC-in-JSON plain objects.
326
154
 
327
155
  ```typescript
328
- const updated = insertFieldBefore(record, '700', newField);
156
+ const objs = serializeMarcJson(records);
329
157
  ```
330
158
 
331
- #### `insertFieldAfter(record, tag, field): MarcRecord`
159
+ #### `serializeMarcJsonString(records): string`
332
160
 
333
- Insert a field after the first occurrence of a tag.
161
+ Serialize records directly to a JSON string (a JSON array).
334
162
 
335
163
  ```typescript
336
- const updated = insertFieldAfter(record, '245', newField);
164
+ const json = serializeMarcJsonString(records);
337
165
  ```
338
166
 
339
- #### `insertGroupedField(record, field): MarcRecord`
340
-
341
- Insert a field maintaining MARC block order (00X → 0XX → 1XX → ... → 9XX).
342
-
343
- ```typescript
344
- const updated = insertGroupedField(record, field);
345
- // Field is inserted in proper MARC order
346
- ```
347
-
348
- #### `removeFields(record, tag): MarcRecord`
167
+ ---
349
168
 
350
- Remove all fields with a specific tag.
169
+ ### MARCBreaker / marctxt (`marc-ts/txt`)
351
170
 
352
171
  ```typescript
353
- const updated = removeFields(record, '650');
172
+ import { parseMarcTxt, serializeMarcTxt } from 'marc-ts/txt';
354
173
  ```
355
174
 
356
- #### `removeField(record, field): MarcRecord`
357
-
358
- Remove a specific field instance using reference equality.
175
+ MARCBreaker is a human-readable line-oriented format. Each field occupies one line; blank indicators are written as `\`; subfields use `$` followed by a single-character code. Records are separated by blank lines:
359
176
 
360
- ```typescript
361
- const field = getField(record, '650');
362
- const updated = field ? removeField(record, field) : record;
363
177
  ```
364
-
365
- #### Subfield Operations
366
-
367
- ```typescript
368
- // Add subfield to a field
369
- const updated = addSubfield(field, 'b', 'Subtitle');
370
-
371
- // Remove all subfields with code
372
- const updated = removeSubfield(field, 'x');
373
-
374
- // Replace first subfield with code
375
- const updated = replaceSubfield(field, 'a', 'New value');
178
+ =LDR 00706cam a2200217 a 4500
179
+ =001 5490
180
+ =003 OCoLC
181
+ =245 14$aThe Hobbit /$cJ.R.R. Tolkien.
182
+ =650 \1$aHobbits (Fictitious characters)$vFiction.
376
183
  ```
377
184
 
378
- ### Clone and Equality
379
-
380
- #### `cloneRecord(record): MarcRecord`
185
+ **Value escaping.** Reserved characters in field values are escaped as follows:
381
186
 
382
- Create a deep copy of a record.
187
+ | Character | Escaped form |
188
+ |-----------|-------------|
189
+ | `$` | `{dollar}` |
190
+ | `{` | `{lcub}` |
191
+ | `}` | `{rcub}` |
192
+ | `\` | `{bsol}` |
383
193
 
384
- ```typescript
385
- const copy = cloneRecord(record);
386
- // Modifying copy will not affect record
387
- ```
194
+ Embedded newlines in values are replaced with a space on serialize.
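
The escaping rules in the table can be sketched as a single-pass replacement. This is an illustration rather than the library's code: `escapeTxtValue` and `ESCAPES` are hypothetical names. The single pass matters because the escape sequences themselves contain braces, so escaping `$` first and braces second would corrupt `{dollar}`.

```typescript
// Hypothetical helper, not part of marc-ts.
const ESCAPES: Record<string, string> = {
  '$': '{dollar}',
  '{': '{lcub}',
  '}': '{rcub}',
  '\\': '{bsol}',
};

function escapeTxtValue(value: string): string {
  return value
    // One pass over the input, so braces introduced by an escape
    // sequence are never escaped a second time.
    .replace(/[${}\\]/g, (ch) => ESCAPES[ch] ?? ch)
    // Embedded newlines become a single space.
    .replace(/\n/g, ' ');
}
```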
388
195
 
389
- #### `recordsEqual(a, b, ignoreFieldOrder?): boolean`
196
+ #### `parseMarcTxt(text): MarcRecord[]`
390
197
 
391
- Check if two records are equal.
198
+ Parse a marctxt string. Accepts `\n` and `\r\n` line endings. Records are separated by blank lines. Returns all records found.
392
199
 
393
200
  ```typescript
394
- if (recordsEqual(record1, record2)) {
395
- console.log('Records are identical');
396
- }
397
-
398
- // Ignore field order
399
- if (recordsEqual(record1, record2, true)) {
400
- console.log('Records have same content');
401
- }
201
+ const records = parseMarcTxt(txtString);
402
202
  ```
403
203
 
404
- #### `fieldsEqual(a, b): boolean`
204
+ #### `serializeMarcTxt(records): string`
405
205
 
406
- Check if two fields are equal.
206
+ Serialize records to a marctxt string, with records separated by blank lines.
407
207
 
408
208
  ```typescript
409
- if (fieldsEqual(field1, field2)) {
410
- console.log('Fields are identical');
411
- }
209
+ const txt = serializeMarcTxt(records);
412
210
  ```
413
211
 
414
- ### Warnings
212
+ ---
415
213
 
416
- #### `createWarning(type, message, position?, tag?): MarcWarning`
214
+ ## Convenience Accessors
417
215
 
418
- Create a parsing warning object.
216
+ Extract common bibliographic metadata from any `MarcRecord`:
419
217
 
420
218
  ```typescript
421
- const warning = createWarning('invalid_field', 'Field is out of bounds', 42, '245');
219
+ import { title, titleProper, author, edition, publisher, publicationDate,
220
+ isbn, issn, lccn, subjects, seriesStatement } from 'marc-ts';
422
221
  ```
423
222
 
424
- ## Additional Formats
425
-
426
- ### MARCXML (`marc-ts/xml`)
427
-
428
- Import from the `marc-ts/xml` subpath for MARCXML support (Library of Congress schema).
223
+ | Function | Source field | Returns |
224
+ |----------|-------------|---------|
225
+ | `title(record)` | 245 $a$b | Full title with subtitle |
226
+ | `titleProper(record)` | 245 $a | Main title only |
227
+ | `author(record)` | 100/110 $a | Main author/creator |
228
+ | `edition(record)` | 250 $a | Edition statement |
229
+ | `publisher(record)` | 260/264 $b | Publisher name |
230
+ | `publicationDate(record)` | 260/264 $c | Publication date |
231
+ | `isbn(record)` | 020 $a | `string[]` of ISBNs |
232
+ | `issn(record)` | 022 $a | ISSN |
233
+ | `lccn(record)` | 010 $a | Library of Congress Control Number |
234
+ | `subjects(record)` | 6XX $a | `string[]` of subject headings |
235
+ | `seriesStatement(record)` | 490 $a | Series statement |
429
236
 
430
- ```typescript
431
- import {
432
- parseMarcXml,
433
- parseMarcXmlRecord,
434
- serializeMarcXml,
435
- serializeMarcXmlRecord,
436
- } from 'marc-ts/xml';
437
- ```
438
-
439
- #### `parseMarcXml(xml): MarcRecord[]`
237
+ ---
440
238
 
441
- Parse a MARCXML string containing a `<collection>` or one or more bare `<record>` elements.
239
+ ## Field Access
442
240
 
443
241
  ```typescript
444
- const records = parseMarcXml(xmlString);
445
- // Returns all records found in the document
242
+ import { getField, getFields, getSubfield, getSubfields, getAllSubfields } from 'marc-ts';
243
+ import { isControlField, isDataField } from 'marc-ts';
446
244
  ```
447
245
 
448
- #### `parseMarcXmlRecord(xml): MarcRecord`
449
-
450
- Parse a MARCXML string expected to contain exactly one `<record>`. Throws if none is found.
246
+ #### `getField(record, tag)` / `getFields(record, tag)`
451
247
 
452
248
  ```typescript
453
- const record = parseMarcXmlRecord(xmlString);
249
+ const field = getField(record, '245'); // first match or undefined
250
+ const fields = getFields(record, '650'); // all matches
454
251
  ```
455
252
 
456
- #### `serializeMarcXml(records): string`
457
-
458
- Serialize one or more records into a full MARCXML `<collection>` document (with XML declaration).
253
+ #### `getSubfield(field, code)` / `getSubfields(field, code)`
459
254
 
460
255
  ```typescript
461
- const xml = serializeMarcXml([record1, record2]);
256
+ if (field && isDataField(field)) {
257
+ const a = getSubfield(field, 'a'); // first $a or undefined
258
+ const xs = getSubfields(field, 'x'); // all $x values
259
+ }
462
260
  ```
463
261
 
464
- #### `serializeMarcXmlRecord(record): string`
465
-
466
- Serialize a single record to a `<record>` XML element string (no collection wrapper or XML declaration).
262
+ #### `getAllSubfields(field)`
467
263
 
468
264
  ```typescript
469
- const recordXml = serializeMarcXmlRecord(record);
265
+ const all = getAllSubfields(field); // [{ code, value }, ...]
470
266
  ```
471
267
 
472
268
  ---
473
269
 
474
- ### MARC-in-JSON (`marc-ts/json`)
475
-
476
- Import from the `marc-ts/json` subpath for [MARC-in-JSON](https://wiki.code4lib.org/MARCJSONification) support (used by Open Library and many REST APIs).
270
+ ## Wildcard Querying
477
271
 
478
272
  ```typescript
479
- import {
480
- parseMarcJson,
481
- serializeMarcJson,
482
- serializeMarcJsonString,
483
- } from 'marc-ts/json';
484
- import type { MarcJsonObject } from 'marc-ts/json';
485
- ```
486
-
487
- The format represents each field as a single-key object in an array:
273
+ import { getFieldsByPattern, getFirstFieldByPattern } from 'marc-ts';
488
274
 
489
- ```json
490
- {
491
- "leader": "01142cam a2200301 a 4500",
492
- "fields": [
493
- { "001": "5490" },
494
- { "245": { "subfields": [{ "a": "The Hobbit" }], "ind1": "1", "ind2": "0" } }
495
- ]
496
- }
497
- ```
498
-
499
- #### `parseMarcJson(json): MarcRecord`
500
-
501
- Parse a MARC-in-JSON object or JSON string into a `MarcRecord`. Throws on structural errors.
502
-
503
- ```typescript
504
- const record = parseMarcJson(jsonString); // from a JSON string
505
- const record = parseMarcJson(jsonObject); // from a plain object
275
+ const subjects = getFieldsByPattern(record, '6..'); // all 6XX fields
276
+ const first7xx = getFirstFieldByPattern(record, '7XX');
506
277
  ```
507
278
 
508
- #### `serializeMarcJson(record): MarcJsonObject`
509
-
510
- Serialize a `MarcRecord` to a MARC-in-JSON plain object.
511
-
512
- ```typescript
513
- const obj = serializeMarcJson(record);
514
- // obj.leader, obj.fields — ready for JSON.stringify or further processing
515
- ```
516
-
517
- #### `serializeMarcJsonString(record): string`
518
-
519
- Serialize a `MarcRecord` directly to a JSON string.
520
-
521
- ```typescript
522
- const json = serializeMarcJsonString(record);
523
- ```
279
+ `.` and `X` each match any single digit.
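
A minimal sketch of how such a pattern could be matched against a three-character tag follows. `tagMatches` is a hypothetical helper, not a marc-ts export, and it assumes that only an uppercase `X` acts as a wildcard and that wildcards match digits only.

```typescript
// Hypothetical helper, not part of marc-ts.
function tagMatches(tag: string, pattern: string): boolean {
  if (tag.length !== 3 || pattern.length !== 3) return false;
  for (let i = 0; i < 3; i++) {
    const p = pattern.charAt(i);
    const t = tag.charAt(i);
    if (p === '.' || p === 'X') {
      // Wildcard position: accept any single digit.
      if (t < '0' || t > '9') return false;
    } else if (p !== t) {
      // Literal position: characters must be identical.
      return false;
    }
  }
  return true;
}
```

Under these assumptions, `'6..'` and `'6XX'` select the same fields, and a non-numeric tag such as `LDR` never matches a wildcard position.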
524
280
 
525
281
  ---
526
282
 
527
- ### MARCBreaker / marctxt (`marc-ts/txt`)
283
+ ## Field Operations (Immutable)
528
284
 
529
- Import from the `marc-ts/txt` subpath for MARCBreaker support. This format (also called MARCMaker or marctxt) is a human-readable line-oriented representation originated by the Library of Congress MARCMaker/MARCBreaker tools and widely used for editing MARC data in plain text.
285
+ All operations return a new `MarcRecord` without modifying the original.
530
286
 
531
287
  ```typescript
532
288
  import {
533
- parseMarcTxt,
534
- parseMarcTxtRecord,
535
- serializeMarcTxt,
536
- serializeMarcTxtRecord,
537
- } from 'marc-ts/txt';
538
- ```
539
-
540
- Each field occupies one line. Blank indicators are written as `\`. Subfields use `$` followed by a single-character code. Records are separated by blank lines:
541
-
542
- ```
543
- =LDR 00706cam a2200217 a 4500
544
- =001 5490
545
- =003 OCoLC
546
- =245 14$aThe Hobbit /$cJ.R.R. Tolkien.
547
- =650 \1$aHobbits (Fictitious characters)$vFiction.
548
- ```
549
-
550
- **Value escaping.** Standard MARCBreaker has no way to represent a literal `$`
551
- in a value, and `marc-ts` follows the same convention as other tools for the
552
- remaining reserved characters:
289
+ appendField, insertFieldBefore, insertFieldAfter, insertGroupedField,
290
+ removeFields, removeField,
291
+ addSubfield, removeSubfield, replaceSubfield,
292
+ } from 'marc-ts';
553
293
 
554
- - `$` → `{dollar}`
555
- - `{` → `{lcub}`, `}` → `{rcub}`
556
- - `\` → `{bsol}`
294
+ const r1 = appendField(record, newField);
295
+ const r2 = insertFieldBefore(record, '700', newField);
296
+ const r3 = insertFieldAfter(record, '245', newField);
297
+ const r4 = insertGroupedField(record, newField); // maintains MARC block order
298
+ const r5 = removeFields(record, '650');
299
+ const r6 = removeField(record, specificField); // reference equality
557
300
 
558
- Embedded newlines (`\n`) in field values are replaced with a space on
559
- serialize, matching the behavior of other MARCBreaker tools. Source values that
560
- do not contain any of these characters are emitted verbatim.
561
-
562
- #### `parseMarcTxt(text): MarcRecord[]`
563
-
564
- Parse a marctxt string containing one or more records separated by blank lines. Accepts both `\n` and `\r\n` line endings.
565
-
566
- ```typescript
567
- const records = parseMarcTxt(txtString);
568
- // Returns all records found
301
+ // Subfield operations return a new DataField
302
+ const f1 = addSubfield(field, 'b', 'Subtitle');
303
+ const f2 = removeSubfield(field, 'x');
304
+ const f3 = replaceSubfield(field, 'a', 'New value');
569
305
  ```
570
306
 
571
- #### `parseMarcTxtRecord(text): MarcRecord`
307
+ ---
572
308
 
573
- Parse a marctxt string expected to contain exactly one record. Throws if none is found.
309
+ ## Clone and Equality
574
310
 
575
311
  ```typescript
576
- const record = parseMarcTxtRecord(txtString);
577
- ```
578
-
579
- #### `serializeMarcTxt(records): string`
580
-
581
- Serialize one or more records into a marctxt string, with records separated by blank lines.
312
+ import { cloneRecord, recordsEqual, fieldsEqual } from 'marc-ts';
582
313
 
583
- ```typescript
584
- const txt = serializeMarcTxt([record1, record2]);
314
+ const copy = cloneRecord(record);
315
+ recordsEqual(a, b); // strict field order
316
+ recordsEqual(a, b, true); // ignore field order
317
+ fieldsEqual(field1, field2);
585
318
  ```
586
319
 
587
- #### `serializeMarcTxtRecord(record): string`
320
+ ---
588
321
 
589
- Serialize a single record to marctxt (no surrounding blank line).
322
+ ## Types
590
323
 
591
324
  ```typescript
592
- const txt = serializeMarcTxtRecord(record);
325
+ import type { MarcRecord, ControlField, DataField, Subfield,
326
+ ParseOptions, SerializeOptions, MarcWarning, MarcWarningType } from 'marc-ts';
593
327
  ```
594
328
 
595
329
  ---
596
330
 
597
- ## Browser Usage
598
-
599
- **marc-ts** works in modern browsers without any bundler configuration:
600
-
601
- ```html
602
- <!DOCTYPE html>
603
- <html>
604
- <head>
605
- <title>marc-ts Browser Example</title>
606
- </head>
607
- <body>
608
- <input type="file" id="fileInput" accept=".mrc" />
609
- <pre id="output"></pre>
610
-
611
- <script type="module">
612
- import { parseMarcRecord, title, author } from 'https://cdn.skypack.dev/marc-ts';
613
-
614
- document.getElementById('fileInput').addEventListener('change', async (e) => {
615
- const file = e.target.files[0];
616
- const arrayBuffer = await file.arrayBuffer();
617
- const buffer = new Uint8Array(arrayBuffer);
618
-
619
- const result = parseMarcRecord(buffer);
620
- if (result.record) {
621
- document.getElementById('output').textContent = `
622
- Title: ${title(result.record) || 'N/A'}
623
- Author: ${author(result.record) || 'N/A'}
624
- `;
625
- }
626
- });
627
- </script>
628
- </body>
629
- </html>
630
- ```
631
-
632
- ## Examples
633
-
634
- See the [examples/](./examples/) directory for more examples:
635
- - [basic-usage.ts](./examples/basic-usage.ts) - Common usage patterns
636
- - [browser.html](./examples/browser.html) - Browser integration
637
-
638
331
  ## Development
639
332
 
640
- Requires Node.js **20.19** or **22.12+** (driven by Vite 8). Older Node versions
641
- are EOL and will fail to install the dev toolchain. The compiled output is
642
- compatible with modern browsers and any actively-supported Node release.
333
+ Requires Node.js **20.19** or **22.12+** (driven by Vite 8). The compiled output targets modern browsers and any actively-supported Node release.
334
+
335
+ ```bash
336
+ npm test # run tests
337
+ npm run build # compile to dist/
338
+ npm run type-check # TypeScript check without emit
339
+ ```