outparse 1.0.0__tar.gz
- outparse-1.0.0/LICENSE +28 -0
- outparse-1.0.0/PKG-INFO +367 -0
- outparse-1.0.0/README.md +320 -0
- outparse-1.0.0/outparse/__init__.py +7 -0
- outparse-1.0.0/outparse/parser.py +839 -0
- outparse-1.0.0/outparse.egg-info/PKG-INFO +367 -0
- outparse-1.0.0/outparse.egg-info/SOURCES.txt +9 -0
- outparse-1.0.0/outparse.egg-info/dependency_links.txt +1 -0
- outparse-1.0.0/outparse.egg-info/top_level.txt +1 -0
- outparse-1.0.0/pyproject.toml +31 -0
- outparse-1.0.0/setup.cfg +4 -0
outparse-1.0.0/LICENSE
ADDED
@@ -0,0 +1,28 @@
+BSD 3-Clause License
+
+Copyright (c) 2026, edmynay
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+1. Redistributions of source code must retain the above copyright notice, this
+list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright notice,
+this list of conditions and the following disclaimer in the documentation
+and/or other materials provided with the distribution.
+
+3. Neither the name of the copyright holder nor the names of its
+contributors may be used to endorse or promote products derived from
+this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
outparse-1.0.0/PKG-INFO
ADDED
|
@@ -0,0 +1,367 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: outparse
|
|
3
|
+
Version: 1.0.0
|
|
4
|
+
Summary: Outparse — configurable fast printout parser
|
|
5
|
+
Author: edmynay
|
|
6
|
+
License: BSD 3-Clause License
|
|
7
|
+
|
|
8
|
+
Copyright (c) 2026, edmynay
|
|
9
|
+
|
|
10
|
+
Redistribution and use in source and binary forms, with or without
|
|
11
|
+
modification, are permitted provided that the following conditions are met:
|
|
12
|
+
|
|
13
|
+
1. Redistributions of source code must retain the above copyright notice, this
|
|
14
|
+
list of conditions and the following disclaimer.
|
|
15
|
+
|
|
16
|
+
2. Redistributions in binary form must reproduce the above copyright notice,
|
|
17
|
+
this list of conditions and the following disclaimer in the documentation
|
|
18
|
+
and/or other materials provided with the distribution.
|
|
19
|
+
|
|
20
|
+
3. Neither the name of the copyright holder nor the names of its
|
|
21
|
+
contributors may be used to endorse or promote products derived from
|
|
22
|
+
this software without specific prior written permission.
|
|
23
|
+
|
|
24
|
+
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
|
|
25
|
+
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
26
|
+
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
|
|
27
|
+
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
|
|
28
|
+
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
29
|
+
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
|
|
30
|
+
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
|
|
31
|
+
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
|
|
32
|
+
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
|
|
33
|
+
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
|
34
|
+
|
|
35
|
+
Project-URL: Homepage, https://github.com/edmynay/outparse
|
|
36
|
+
Project-URL: Repository, https://github.com/edmynay/outparse
|
|
37
|
+
Project-URL: Issues, https://github.com/edmynay/outparse/issues
|
|
38
|
+
Project-URL: Documentation, https://github.com/edmynay/outparse#readme
|
|
39
|
+
Keywords: cli output,printout,text table,parser
|
|
40
|
+
Classifier: Programming Language :: Python :: 3
|
|
41
|
+
Classifier: License :: OSI Approved :: BSD License
|
|
42
|
+
Classifier: Operating System :: OS Independent
|
|
43
|
+
Requires-Python: >=3.8
|
|
44
|
+
Description-Content-Type: text/markdown
|
|
45
|
+
License-File: LICENSE
|
|
46
|
+
Dynamic: license-file
|
|
47
|
+
|
|
48
|
+
OutParse — configurable fast printout (text table) parser
|
|
49
|
+
=========================================================
|
|
50
|
+
|
|
51
|
+
Overview
|
|
52
|
+
--------
|
|
53
|
+
OutParse parses human-readable, line-wrapped printouts (text tables)
|
|
54
|
+
and converts them into structured Python data.
|
|
55
|
+
|
|
56
|
+
It has no external dependencies and is suitable for embedded environments
|
|
57
|
+
where pip usage is limited.
|
|
58
|
+
|
|
59
|
+
A printout represents logically tabular data (rows and columns) that may be:
|
|
60
|
+
- wrapped across multiple lines
|
|
61
|
+
- split into logical sections
|
|
62
|
+
- mixed with horizontal key–value parameters
|
|
63
|
+
- nested (parent/child objects)
|
|
64
|
+
|
|
65
|
+
The parser output is always a list of dictionaries.
|
|
66
|
+
|
|
67
|
+
|
|
68
|
+
Quick Start
|
|
69
|
+
-----------
|
|
70
|
+
Example:
|
|
71
|
+
|
|
72
|
+
```python
|
|
73
|
+
from outparse import PrintoutParser
|
|
74
|
+
|
|
75
|
+
text = '''
|
|
76
|
+
POINTS
|
|
77
|
+
|
|
78
|
+
NAME LOCATION TYPE
|
|
79
|
+
DotA 100, 88 p
|
|
80
|
+
|
|
81
|
+
STATUS ACTIVE
|
|
82
|
+
|
|
83
|
+
NAME LOCATION TYPE
|
|
84
|
+
PointB 155, 25 p
|
|
85
|
+
|
|
86
|
+
STATUS PASSIVE
|
|
87
|
+
|
|
88
|
+
USERS
|
|
89
|
+
|
|
90
|
+
Username Email
|
|
91
|
+
John Doe john_doe@www.org
|
|
92
|
+
'''
|
|
93
|
+
|
|
94
|
+
parser = PrintoutParser(hor_param_names=["STATUS"])
|
|
95
|
+
result = parser.parse(text)
|
|
96
|
+
```
|
|
97
|
+
|
|
98
|
+
Result:
|
|
99
|
+
|
|
100
|
+
```python
|
|
101
|
+
[
|
|
102
|
+
{
|
|
103
|
+
'NAME': ['DotA'],
|
|
104
|
+
'LOCATION': ['100', '88'],
|
|
105
|
+
'TYPE': ['p'],
|
|
106
|
+
'STATUS': ['ACTIVE'],
|
|
107
|
+
'object_id_param_name': 'NAME'
|
|
108
|
+
},
|
|
109
|
+
{
|
|
110
|
+
'NAME': ['PointB'],
|
|
111
|
+
'LOCATION': ['155', '25'],
|
|
112
|
+
'TYPE': ['p'],
|
|
113
|
+
'STATUS': ['PASSIVE'],
|
|
114
|
+
'object_id_param_name': 'NAME'
|
|
115
|
+
},
|
|
116
|
+
{
|
|
117
|
+
'Username': ['John', 'Doe'],
|
|
118
|
+
'Email': ['john_doe@www.org'],
|
|
119
|
+
'object_id_param_name': 'Username'
|
|
120
|
+
}
|
|
121
|
+
]
|
|
122
|
+
```
|
|
123
|
+
|
|
124
|
+
|
|
125
|
+
What is a printout/text table?
|
|
126
|
+
------------------------------
|
|
127
|
+
A printout (aka text table) is a human-readable representation of tabular data
|
|
128
|
+
where rows may span multiple lines, but column semantics remain consistent.
|
|
129
|
+
|
|
130
|
+
Even when visually wrapped, such a printout can always be normalized
|
|
131
|
+
into a flat table structure without losing information.
|
|
132
|
+
|
|
133
|
+
Wrapped form (printout):
|
|
134
|
+
|
|
135
|
+
```
|
|
136
|
+
NAME LOCATION TYPE
|
|
137
|
+
DotA 100, 88 p
|
|
138
|
+
|
|
139
|
+
STATUS ACTIVE
|
|
140
|
+
```
|
|
141
|
+
|
|
142
|
+
Logical flat form (text table):
|
|
143
|
+
|
|
144
|
+
```
|
|
145
|
+
NAME LOCATION TYPE STATUS
|
|
146
|
+
DotA 100, 88 p ACTIVE
|
|
147
|
+
```
|
|
148
|
+
|
|
149
|
+
|
|
150
|
+
Parameters
|
|
151
|
+
----------
|
|
152
|
+
A parameter is a named field with one or more values.
|
|
153
|
+
|
|
154
|
+
- Values are always stored as lists.
|
|
155
|
+
- Multiple values are separated by delimiters (spaces or commas by default).
|
|
156
|
+
- Splitting behavior is configurable via value_delimiters.
|
|
157
|
+
- Set value_delimiters=None or '' to disable splitting.
|
|
158
|
+
|
|
159
|
+
|
|
160
|
+
Vertical and Horizontal Parameters
|
|
161
|
+
----------------------------------
|
|
162
|
+
Vertical parameters:
|
|
163
|
+
Values aligned under a header row.
|
|
164
|
+
|
|
165
|
+
Example:
|
|
166
|
+
|
|
167
|
+
X Y
|
|
168
|
+
10 15
|
|
169
|
+
|
|
170
|
+
Horizontal parameters:
|
|
171
|
+
Parameters whose name and value appear on the same line.
|
|
172
|
+
|
|
173
|
+
Example:
|
|
174
|
+
|
|
175
|
+
NAME John Doe
|
|
176
|
+
|
|
177
|
+
Horizontal parameters are NOT auto-detected and must be explicitly
|
|
178
|
+
declared via hor_param_names.
|
|
179
|
+
|
|
180
|
+
|
|
181
|
+
Objects and Identifiers
|
|
182
|
+
-----------------------
|
|
183
|
+
Each parsed object corresponds to one logical row of data.
|
|
184
|
+
|
|
185
|
+
An object is identified by an identifier parameter (e.g. NAME, ID).
|
|
186
|
+
|
|
187
|
+
Default behavior:
|
|
188
|
+
- The first detected parameter becomes the identifier.
|
|
189
|
+
- When the same identifier parameter appears again with
|
|
190
|
+
a non-empty value, a new object is started.
|
|
191
|
+
|
|
192
|
+
If object_id_param_names is provided:
|
|
193
|
+
- Only listed parameters are treated as identifiers.
|
|
194
|
+
- Section changes do not reset identifier detection automatically.
|
|
195
|
+
|
|
196
|
+
|
|
197
|
+
Printout Logical Sections
|
|
198
|
+
-------------------------
|
|
199
|
+
A single non-empty line surrounded by empty lines starts a new section.
|
|
200
|
+
Sections may contain a different object type.
|
|
201
|
+
|
|
202
|
+
When a new section starts:
|
|
203
|
+
- the current object is finalized
|
|
204
|
+
- identifier detection restarts (unless custom identifiers are specified)
|
|
205
|
+
|
|
206
|
+
Example:
|
|
207
|
+
|
|
208
|
+
```
|
|
209
|
+
POINTS
|
|
210
|
+
|
|
211
|
+
NAME LOCATION TYPE
|
|
212
|
+
pointA 155, 25 n
|
|
213
|
+
|
|
214
|
+
USER DATA
|
|
215
|
+
|
|
216
|
+
USERNAME EMAIL
|
|
217
|
+
John john@mail.com
|
|
218
|
+
```
|
|
219
|
+
|
|
220
|
+
|
|
221
|
+
Child Objects (Advanced)
|
|
222
|
+
------------------------
|
|
223
|
+
OutParse supports hierarchical parent–child relationships.
|
|
224
|
+
|
|
225
|
+
It is used when one object (the parent) contains one or more nested objects (children).
|
|
226
|
+
|
|
227
|
+
Example:
|
|
228
|
+
|
|
229
|
+
```
|
|
230
|
+
DEPARTMENTS
|
|
231
|
+
|
|
232
|
+
Department Manager
|
|
233
|
+
Macrodata Refinement Mark.S
|
|
234
|
+
|
|
235
|
+
Employee Role
|
|
236
|
+
Mark.S Refiner
|
|
237
|
+
Dylan.G Refiner
|
|
238
|
+
Irving.B Refiner
|
|
239
|
+
Helly.R Refiner
|
|
240
|
+
|
|
241
|
+
Department Manager
|
|
242
|
+
Optics & Design Burt.G
|
|
243
|
+
|
|
244
|
+
Employee Role
|
|
245
|
+
Burt.G Designer
|
|
246
|
+
Felicia Technician
|
|
247
|
+
```
|
|
248
|
+
|
|
249
|
+
Here we have two object types: Department (parent) and Employee (child). To parse this printout properly, declare the relationship via object_relations:
|
|
250
|
+
|
|
251
|
+
```python
|
|
252
|
+
parser = PrintoutParser(object_relations={'Department': ['Employee']})
|
|
253
|
+
result = parser.parse(text)
|
|
254
|
+
print(result)
|
|
255
|
+
```
|
|
256
|
+
|
|
257
|
+
This results in:
|
|
258
|
+
|
|
259
|
+
```python
|
|
260
|
+
[
|
|
261
|
+
{
|
|
262
|
+
'Department': ['Macrodata', 'Refinement'],
|
|
263
|
+
'Manager': ['Mark.S'],
|
|
264
|
+
'Employee': ['Mark.S', 'Dylan.G', 'Irving.B', 'Helly.R'],
|
|
265
|
+
'Role': [
|
|
266
|
+
['Refiner'],
|
|
267
|
+
['Refiner'],
|
|
268
|
+
['Refiner'],
|
|
269
|
+
['Refiner']
|
|
270
|
+
],
|
|
271
|
+
'object_id_param_name': 'Department'
|
|
272
|
+
},
|
|
273
|
+
{
|
|
274
|
+
'Department': ['Optics', '&', 'Design'],
|
|
275
|
+
'Manager': ['Burt.G'],
|
|
276
|
+
'Employee': ['Burt.G', 'Felicia'],
|
|
277
|
+
'Role': [
|
|
278
|
+
['Designer'],
|
|
279
|
+
['Technician']
|
|
280
|
+
],
|
|
281
|
+
'object_id_param_name': 'Department'
|
|
282
|
+
}
|
|
283
|
+
]
|
|
284
|
+
```
|
|
285
|
+
|
|
286
|
+
Child parameters are stored as lists of lists,
|
|
287
|
+
aligned by child identifier index.
|
|
288
|
+
|
|
289
|
+
Hierarchy is configured via object_relations:
|
|
290
|
+
|
|
291
|
+
```python
|
|
292
|
+
{
|
|
293
|
+
"PARENT_ID": ["CHILD_ID_1", "CHILD_ID_2"]
|
|
294
|
+
}
|
|
295
|
+
```
|
|
296
|
+
|
|
297
|
+
where "PARENT_ID" is the identifier parameter name of the parent object type,
|
|
298
|
+
and ["CHILD_ID_1", "CHILD_ID_2"] is a list of identifier parameter names for all child object types
|
|
299
|
+
that belong to this parent — including indirect descendants (children, grandchildren, etc.).
|
|
300
|
+
Nesting level does not matter: any identifier listed here will be treated as a child of "PARENT_ID".
|
|
301
|
+
|
|
302
|
+
|
|
303
|
+
Basic Output Format
|
|
304
|
+
-------------------
|
|
305
|
+
The parser returns:
|
|
306
|
+
|
|
307
|
+
`List[Dict[str, List[str]]]`
|
|
308
|
+
|
|
309
|
+
Each dictionary represents one parsed object and contains:
|
|
310
|
+
- parameter names as keys
|
|
311
|
+
- lists of values as values
|
|
312
|
+
- "object_id_param_name", which stores the name of the object's identifier parameter as a plain string (not a list)
|
|
313
|
+
|
|
314
|
+
|
|
315
|
+
Common Mistakes / Requirements
|
|
316
|
+
------------------------------
|
|
317
|
+
|
|
318
|
+
1. Header line must be separated from previous content (if any) by an empty line
|
|
319
|
+
|
|
320
|
+
Incorrect:
|
|
321
|
+
```
|
|
322
|
+
<previous data>
|
|
323
|
+
NAME LOCATION TYPE
|
|
324
|
+
```
|
|
325
|
+
|
|
326
|
+
Correct:
|
|
327
|
+
```
|
|
328
|
+
<previous data>
|
|
329
|
+
|
|
330
|
+
NAME LOCATION TYPE
|
|
331
|
+
```
|
|
332
|
+
|
|
333
|
+
|
|
334
|
+
2. Section title must be separated from previous content (if any) by an empty line
|
|
335
|
+
and must always be followed by an empty line
|
|
336
|
+
|
|
337
|
+
Incorrect:
|
|
338
|
+
|
|
339
|
+
```
|
|
340
|
+
<previous data>
|
|
341
|
+
POINTS
|
|
342
|
+
NAME LOCATION TYPE
|
|
343
|
+
```
|
|
344
|
+
|
|
345
|
+
Correct:
|
|
346
|
+
|
|
347
|
+
```
|
|
348
|
+
<previous data>
|
|
349
|
+
|
|
350
|
+
POINTS
|
|
351
|
+
|
|
352
|
+
NAME LOCATION TYPE
|
|
353
|
+
```
|
|
354
|
+
|
|
355
|
+
3. Text must be space-formatted
|
|
356
|
+
|
|
357
|
+
Parsing relies on column positioning.
|
|
358
|
+
If text is tab-formatted, replace tabs with spaces before parsing:
|
|
359
|
+
|
|
360
|
+
```
|
|
361
|
+
text_for_parsing = text.replace('\t', 4 * ' ')
|
|
362
|
+
```
|
|
363
|
+
|
|
364
|
+
## License
|
|
365
|
+
|
|
366
|
+
This project is licensed under the BSD 3-Clause License.
|
|
367
|
+
See the [LICENSE](LICENSE) file for details.
|
outparse-1.0.0/README.md
ADDED
|
@@ -0,0 +1,320 @@
|
|
|
1
|
+
OutParse — configurable fast printout (text table) parser
|
|
2
|
+
=========================================================
|
|
3
|
+
|
|
4
|
+
Overview
|
|
5
|
+
--------
|
|
6
|
+
OutParse parses human-readable, line-wrapped printouts (text tables)
|
|
7
|
+
and converts them into structured Python data.
|
|
8
|
+
|
|
9
|
+
It has no external dependencies and is suitable for embedded environments
|
|
10
|
+
where pip usage is limited.
|
|
11
|
+
|
|
12
|
+
A printout represents logically tabular data (rows and columns) that may be:
|
|
13
|
+
- wrapped across multiple lines
|
|
14
|
+
- split into logical sections
|
|
15
|
+
- mixed with horizontal key–value parameters
|
|
16
|
+
- nested (parent/child objects)
|
|
17
|
+
|
|
18
|
+
The parser output is always a list of dictionaries.
|
|
19
|
+
|
|
20
|
+
|
|
21
|
+
Quick Start
|
|
22
|
+
-----------
|
|
23
|
+
Example:
|
|
24
|
+
|
|
25
|
+
```python
|
|
26
|
+
from outparse import PrintoutParser
|
|
27
|
+
|
|
28
|
+
text = '''
|
|
29
|
+
POINTS
|
|
30
|
+
|
|
31
|
+
NAME LOCATION TYPE
|
|
32
|
+
DotA 100, 88 p
|
|
33
|
+
|
|
34
|
+
STATUS ACTIVE
|
|
35
|
+
|
|
36
|
+
NAME LOCATION TYPE
|
|
37
|
+
PointB 155, 25 p
|
|
38
|
+
|
|
39
|
+
STATUS PASSIVE
|
|
40
|
+
|
|
41
|
+
USERS
|
|
42
|
+
|
|
43
|
+
Username Email
|
|
44
|
+
John Doe john_doe@www.org
|
|
45
|
+
'''
|
|
46
|
+
|
|
47
|
+
parser = PrintoutParser(hor_param_names=["STATUS"])
|
|
48
|
+
result = parser.parse(text)
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
Result:
|
|
52
|
+
|
|
53
|
+
```python
|
|
54
|
+
[
|
|
55
|
+
{
|
|
56
|
+
'NAME': ['DotA'],
|
|
57
|
+
'LOCATION': ['100', '88'],
|
|
58
|
+
'TYPE': ['p'],
|
|
59
|
+
'STATUS': ['ACTIVE'],
|
|
60
|
+
'object_id_param_name': 'NAME'
|
|
61
|
+
},
|
|
62
|
+
{
|
|
63
|
+
'NAME': ['PointB'],
|
|
64
|
+
'LOCATION': ['155', '25'],
|
|
65
|
+
'TYPE': ['p'],
|
|
66
|
+
'STATUS': ['PASSIVE'],
|
|
67
|
+
'object_id_param_name': 'NAME'
|
|
68
|
+
},
|
|
69
|
+
{
|
|
70
|
+
'Username': ['John', 'Doe'],
|
|
71
|
+
'Email': ['john_doe@www.org'],
|
|
72
|
+
'object_id_param_name': 'Username'
|
|
73
|
+
}
|
|
74
|
+
]
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
|
|
78
|
+
What is a printout/text table?
|
|
79
|
+
------------------------------
|
|
80
|
+
A printout (aka text table) is a human-readable representation of tabular data
|
|
81
|
+
where rows may span multiple lines, but column semantics remain consistent.
|
|
82
|
+
|
|
83
|
+
Even when visually wrapped, such a printout can always be normalized
|
|
84
|
+
into a flat table structure without losing information.
|
|
85
|
+
|
|
86
|
+
Wrapped form (printout):
|
|
87
|
+
|
|
88
|
+
```
|
|
89
|
+
NAME LOCATION TYPE
|
|
90
|
+
DotA 100, 88 p
|
|
91
|
+
|
|
92
|
+
STATUS ACTIVE
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
Logical flat form (text table):
|
|
96
|
+
|
|
97
|
+
```
|
|
98
|
+
NAME LOCATION TYPE STATUS
|
|
99
|
+
DotA 100, 88 p ACTIVE
|
|
100
|
+
```
|
|
101
|
+
|
|
102
|
+
|
|
103
|
+
Parameters
|
|
104
|
+
----------
|
|
105
|
+
A parameter is a named field with one or more values.
|
|
106
|
+
|
|
107
|
+
- Values are always stored as lists.
|
|
108
|
+
- Multiple values are separated by delimiters (spaces or commas by default).
|
|
109
|
+
- Splitting behavior is configurable via value_delimiters.
|
|
110
|
+
- Set value_delimiters=None or '' to disable splitting.
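The splitting rule described in this list can be pictured with a small standalone sketch. It is not OutParse's implementation: `split_values` and its `delimiters` argument are hypothetical names used only for illustration, and the sketch assumes the delimiters are given as a string of single characters.

```python
import re


def split_values(raw, delimiters=" ,"):
    """Split a raw cell into values (illustrative helper, not part of outparse)."""
    if not delimiters:
        # None or '' disables splitting: keep the cell as a single value.
        return [raw.strip()]
    # Treat any run of delimiter characters as one separator.
    pattern = "[" + re.escape(delimiters) + "]+"
    return [value for value in re.split(pattern, raw) if value]


print(split_values("100, 88"))                  # ['100', '88']
print(split_values("John Doe"))                 # ['John', 'Doe']
print(split_values("John Doe", delimiters=""))  # ['John Doe']
```

Disabling splitting is the natural choice when values such as full names or department titles should stay in one piece.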
|
|
111
|
+
|
|
112
|
+
|
|
113
|
+
Vertical and Horizontal Parameters
|
|
114
|
+
----------------------------------
|
|
115
|
+
Vertical parameters:
|
|
116
|
+
Values aligned under a header row.
|
|
117
|
+
|
|
118
|
+
Example:
|
|
119
|
+
|
|
120
|
+
X Y
|
|
121
|
+
10 15
|
|
122
|
+
|
|
123
|
+
Horizontal parameters:
|
|
124
|
+
Parameters whose name and value appear on the same line.
|
|
125
|
+
|
|
126
|
+
Example:
|
|
127
|
+
|
|
128
|
+
NAME John Doe
|
|
129
|
+
|
|
130
|
+
Horizontal parameters are NOT auto-detected and must be explicitly
|
|
131
|
+
declared via hor_param_names.
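To make "values aligned under a header row" concrete, here is a minimal sketch that cuts a data row at the columns where the header names start. It is an illustration under simple assumptions (single-word header names, each value starting at its header's column), not the algorithm OutParse uses; `slice_by_header` is a hypothetical helper.

```python
import re


def slice_by_header(header, row):
    """Cut a data row at the columns where the header names start."""
    starts = [match.start() for match in re.finditer(r"\S+", header)]
    ends = starts[1:] + [None]  # the last column runs to the end of the line
    names = header.split()
    return {
        name: row[start:end].strip()
        for name, start, end in zip(names, starts, ends)
    }


# Build aligned lines with ljust so the column widths are explicit.
header = "NAME".ljust(9) + "LOCATION".ljust(12) + "TYPE"
row = "DotA".ljust(9) + "100, 88".ljust(12) + "p"

print(slice_by_header(header, row))
# {'NAME': 'DotA', 'LOCATION': '100, 88', 'TYPE': 'p'}
```

A horizontal parameter line has no header row to slice against, which is consistent with the requirement to declare such names explicitly.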
|
|
132
|
+
|
|
133
|
+
|
|
134
|
+
Objects and Identifiers
|
|
135
|
+
-----------------------
|
|
136
|
+
Each parsed object corresponds to one logical row of data.
|
|
137
|
+
|
|
138
|
+
An object is identified by an identifier parameter (e.g. NAME, ID).
|
|
139
|
+
|
|
140
|
+
Default behavior:
|
|
141
|
+
- The first detected parameter becomes the identifier.
|
|
142
|
+
- When the same identifier parameter appears again with
|
|
143
|
+
a non-empty value, a new object is started.
|
|
144
|
+
|
|
145
|
+
If object_id_param_names is provided:
|
|
146
|
+
- Only listed parameters are treated as identifiers.
|
|
147
|
+
- Section changes do not reset identifier detection automatically.
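The default identifier rule can be sketched as a small grouping loop. This is a simplified illustration, not OutParse's code: it assumes parameters arrive as `(name, values)` pairs in reading order and ignores sections, `object_id_param_names`, and child objects. `group_objects` is a hypothetical helper.

```python
def group_objects(pairs):
    """Group (name, values) pairs into objects using the default identifier rule."""
    objects = []
    current = None
    id_name = None
    for name, values in pairs:
        if id_name is None:
            # The first detected parameter becomes the identifier.
            id_name = name
        if name == id_name and values:
            # The identifier reappearing with a non-empty value starts a new object.
            current = {"object_id_param_name": id_name}
            objects.append(current)
        if current is not None:
            current.setdefault(name, []).extend(values)
    return objects


pairs = [
    ("NAME", ["DotA"]), ("TYPE", ["p"]), ("STATUS", ["ACTIVE"]),
    ("NAME", ["PointB"]), ("TYPE", ["p"]), ("STATUS", ["PASSIVE"]),
]
objects = group_objects(pairs)
print([obj["NAME"] for obj in objects])    # [['DotA'], ['PointB']]
print([obj["STATUS"] for obj in objects])  # [['ACTIVE'], ['PASSIVE']]
```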
|
|
148
|
+
|
|
149
|
+
|
|
150
|
+
Printout Logical Sections
|
|
151
|
+
-------------------------
|
|
152
|
+
A single non-empty line surrounded by empty lines starts a new section.
|
|
153
|
+
Sections may contain a different object type.
|
|
154
|
+
|
|
155
|
+
When a new section starts:
|
|
156
|
+
- the current object is finalized
|
|
157
|
+
- identifier detection restarts (unless custom identifiers are specified)
|
|
158
|
+
|
|
159
|
+
Example:
|
|
160
|
+
|
|
161
|
+
```
|
|
162
|
+
POINTS
|
|
163
|
+
|
|
164
|
+
NAME LOCATION TYPE
|
|
165
|
+
pointA 155, 25 n
|
|
166
|
+
|
|
167
|
+
USER DATA
|
|
168
|
+
|
|
169
|
+
USERNAME EMAIL
|
|
170
|
+
John john@mail.com
|
|
171
|
+
```
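The section rule (a single non-empty line with empty lines on both sides) can be checked with a few lines of plain Python. This is a sketch of the rule as stated here, not the parser's own logic; `find_section_titles` is a hypothetical helper, it treats the start and end of the text as empty lines, and it does not account for horizontal parameters, which can also appear alone on a line.

```python
def find_section_titles(text):
    """Return lines that are non-empty and have empty lines on both sides."""
    lines = [""] + text.splitlines() + [""]  # pad so the edges count as empty
    titles = []
    for previous, current, following in zip(lines, lines[1:], lines[2:]):
        if current.strip() and not previous.strip() and not following.strip():
            titles.append(current.strip())
    return titles


text = "\n".join([
    "POINTS",
    "",
    "NAME     LOCATION    TYPE",
    "pointA   155, 25     n",
    "",
    "USER DATA",
    "",
    "USERNAME   EMAIL",
    "John       john@mail.com",
])

print(find_section_titles(text))  # ['POINTS', 'USER DATA']
```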
|
|
172
|
+
|
|
173
|
+
|
|
174
|
+
Child Objects (Advanced)
|
|
175
|
+
------------------------
|
|
176
|
+
OutParse supports hierarchical parent–child relationships.
|
|
177
|
+
|
|
178
|
+
It is used when one object (the parent) contains one or more nested objects (children).
|
|
179
|
+
|
|
180
|
+
Example:
|
|
181
|
+
|
|
182
|
+
```
|
|
183
|
+
DEPARTMENTS
|
|
184
|
+
|
|
185
|
+
Department Manager
|
|
186
|
+
Macrodata Refinement Mark.S
|
|
187
|
+
|
|
188
|
+
Employee Role
|
|
189
|
+
Mark.S Refiner
|
|
190
|
+
Dylan.G Refiner
|
|
191
|
+
Irving.B Refiner
|
|
192
|
+
Helly.R Refiner
|
|
193
|
+
|
|
194
|
+
Department Manager
|
|
195
|
+
Optics & Design Burt.G
|
|
196
|
+
|
|
197
|
+
Employee Role
|
|
198
|
+
Burt.G Designer
|
|
199
|
+
Felicia Technician
|
|
200
|
+
```
|
|
201
|
+
|
|
202
|
+
Here we have two object types: Department (parent) and Employee (child). To parse this printout properly, declare the relationship via object_relations:
|
|
203
|
+
|
|
204
|
+
```python
|
|
205
|
+
parser = PrintoutParser(object_relations={'Department': ['Employee']})
|
|
206
|
+
result = parser.parse(text)
|
|
207
|
+
print(result)
|
|
208
|
+
```
|
|
209
|
+
|
|
210
|
+
This results in:
|
|
211
|
+
|
|
212
|
+
```python
|
|
213
|
+
[
|
|
214
|
+
{
|
|
215
|
+
'Department': ['Macrodata', 'Refinement'],
|
|
216
|
+
'Manager': ['Mark.S'],
|
|
217
|
+
'Employee': ['Mark.S', 'Dylan.G', 'Irving.B', 'Helly.R'],
|
|
218
|
+
'Role': [
|
|
219
|
+
['Refiner'],
|
|
220
|
+
['Refiner'],
|
|
221
|
+
['Refiner'],
|
|
222
|
+
['Refiner']
|
|
223
|
+
],
|
|
224
|
+
'object_id_param_name': 'Department'
|
|
225
|
+
},
|
|
226
|
+
{
|
|
227
|
+
'Department': ['Optics', '&', 'Design'],
|
|
228
|
+
'Manager': ['Burt.G'],
|
|
229
|
+
'Employee': ['Burt.G', 'Felicia'],
|
|
230
|
+
'Role': [
|
|
231
|
+
['Designer'],
|
|
232
|
+
['Technician']
|
|
233
|
+
],
|
|
234
|
+
'object_id_param_name': 'Department'
|
|
235
|
+
}
|
|
236
|
+
]
|
|
237
|
+
```
|
|
238
|
+
|
|
239
|
+
Child parameters are stored as lists of lists,
|
|
240
|
+
aligned by child identifier index.
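Because child parameters are aligned by index with the child identifier list, each child can be rebuilt by pairing the two lists. The sketch below works on a dictionary copied from the result shown above, so it runs without the library; joining the role values with a space is a presentation choice made here, not something OutParse does.

```python
# One parent object, copied from the parsed result shown above.
department = {
    "Department": ["Optics", "&", "Design"],
    "Manager": ["Burt.G"],
    "Employee": ["Burt.G", "Felicia"],
    "Role": [["Designer"], ["Technician"]],
    "object_id_param_name": "Department",
}

# Role[i] holds the values that belong to Employee[i].
employees = [
    {"Employee": name, "Role": " ".join(role)}
    for name, role in zip(department["Employee"], department["Role"])
]

print(employees)
# [{'Employee': 'Burt.G', 'Role': 'Designer'}, {'Employee': 'Felicia', 'Role': 'Technician'}]
```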
|
|
241
|
+
|
|
242
|
+
Hierarchy is configured via object_relations:
|
|
243
|
+
|
|
244
|
+
```python
|
|
245
|
+
{
|
|
246
|
+
"PARENT_ID": ["CHILD_ID_1", "CHILD_ID_2"]
|
|
247
|
+
}
|
|
248
|
+
```
|
|
249
|
+
|
|
250
|
+
where "PARENT_ID" is the identifier parameter name of the parent object type,
|
|
251
|
+
and ["CHILD_ID_1", "CHILD_ID_2"] is a list of identifier parameter names for all child object types
|
|
252
|
+
that belong to this parent — including indirect descendants (children, grandchildren, etc.).
|
|
253
|
+
Nesting level does not matter: any identifier listed here will be treated as a child of "PARENT_ID".
|
|
254
|
+
|
|
255
|
+
|
|
256
|
+
Basic Output Format
|
|
257
|
+
-------------------
|
|
258
|
+
The parser returns:
|
|
259
|
+
|
|
260
|
+
`List[Dict[str, List[str]]]`
|
|
261
|
+
|
|
262
|
+
Each dictionary represents one parsed object and contains:
|
|
263
|
+
- parameter names as keys
|
|
264
|
+
- lists of values as values
|
|
265
|
+
- "object_id_param_name", which stores the name of the object's identifier parameter as a plain string (not a list)
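As a short illustration of consuming this format, the sketch below looks up each object's identifier values through "object_id_param_name". The `result` list is copied from the Quick Start output so the example runs without the library, and `object_label` is a hypothetical helper.

```python
# Parsed objects, copied from the Quick Start result.
result = [
    {"NAME": ["DotA"], "LOCATION": ["100", "88"], "TYPE": ["p"],
     "STATUS": ["ACTIVE"], "object_id_param_name": "NAME"},
    {"NAME": ["PointB"], "LOCATION": ["155", "25"], "TYPE": ["p"],
     "STATUS": ["PASSIVE"], "object_id_param_name": "NAME"},
    {"Username": ["John", "Doe"], "Email": ["john_doe@www.org"],
     "object_id_param_name": "Username"},
]


def object_label(obj):
    """Join the identifier values of one parsed object into a single string."""
    id_name = obj["object_id_param_name"]  # a plain string, e.g. 'NAME'
    return " ".join(obj[id_name])          # the values themselves are a list


labels = [object_label(obj) for obj in result]
print(labels)  # ['DotA', 'PointB', 'John Doe']
```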
|
|
266
|
+
|
|
267
|
+
|
|
268
|
+
Common Mistakes / Requirements
|
|
269
|
+
------------------------------
|
|
270
|
+
|
|
271
|
+
1. Header line must be separated from previous content (if any) by an empty line
|
|
272
|
+
|
|
273
|
+
Incorrect:
|
|
274
|
+
```
|
|
275
|
+
<previous data>
|
|
276
|
+
NAME LOCATION TYPE
|
|
277
|
+
```
|
|
278
|
+
|
|
279
|
+
Correct:
|
|
280
|
+
```
|
|
281
|
+
<previous data>
|
|
282
|
+
|
|
283
|
+
NAME LOCATION TYPE
|
|
284
|
+
```
|
|
285
|
+
|
|
286
|
+
|
|
287
|
+
2. Section title must be separated from previous content (if any) by an empty line
|
|
288
|
+
and must always be followed by an empty line
|
|
289
|
+
|
|
290
|
+
Incorrect:
|
|
291
|
+
|
|
292
|
+
```
|
|
293
|
+
<previous data>
|
|
294
|
+
POINTS
|
|
295
|
+
NAME LOCATION TYPE
|
|
296
|
+
```
|
|
297
|
+
|
|
298
|
+
Correct:
|
|
299
|
+
|
|
300
|
+
```
|
|
301
|
+
<previous data>
|
|
302
|
+
|
|
303
|
+
POINTS
|
|
304
|
+
|
|
305
|
+
NAME LOCATION TYPE
|
|
306
|
+
```
|
|
307
|
+
|
|
308
|
+
3. Text must be space-formatted
|
|
309
|
+
|
|
310
|
+
Parsing relies on column positioning.
|
|
311
|
+
If text is tab-formatted, replace tabs with spaces before parsing:
|
|
312
|
+
|
|
313
|
+
```
|
|
314
|
+
text_for_parsing = text.replace('\t', 4 * ' ')
|
|
315
|
+
```
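A fixed-width replacement keeps columns aligned only if every tab happened to stand for the same number of spaces. When the printout was rendered with tab stops, Python's built-in `str.expandtabs` may preserve the columns better. The comparison below is a sketch; the tab size of 8 is an assumption about how the text was produced.

```python
header = "NAME\tLOCATION\tTYPE"
row = "DotA\t100, 88\t\tp"

# Fixed-width replacement: every tab becomes 4 spaces regardless of position.
naive_header = header.replace("\t", 4 * " ")
naive_row = row.replace("\t", 4 * " ")
print(naive_header.index("TYPE"), naive_row.rindex("p"))  # 20 23

# Tab-stop expansion: each tab pads to the next multiple of 8 columns.
expanded_header = header.expandtabs(8)
expanded_row = row.expandtabs(8)
print(expanded_header.index("TYPE"), expanded_row.rindex("p"))  # 24 24
```

In the first case the TYPE value drifts away from its header; in the second it starts in the same column.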
|
|
316
|
+
|
|
317
|
+
## License
|
|
318
|
+
|
|
319
|
+
This project is licensed under the BSD 3-Clause License.
|
|
320
|
+
See the [LICENSE](LICENSE) file for details.
|