edifact_rails 1.2.1 → 2.0.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +11 -1
- data/README.md +72 -10
- data/lib/edifact_rails/exceptions.rb +9 -0
- data/lib/edifact_rails/formats.rb +9 -0
- data/lib/edifact_rails/parser.rb +116 -62
- data/lib/edifact_rails/version.rb +1 -1
- data/lib/edifact_rails.rb +5 -3
- metadata +4 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: ecc255d266d797eed361491b7643a6ca1a4e03205d214da5c447d4966d0cfb10
|
4
|
+
data.tar.gz: e7aa5f7730944a6b1f6b5adf9809b5d097ae2daf0bef00cd4f84c7433515b122
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: f0976e015521c0145fd8f4fcca85be2acd5206cf88a31a1bae6cc2d1b88068315930fadf37e1c189c04adab02720fe80b743b14e0818de33d9bbdba3b2d06d00
|
7
|
+
data.tar.gz: 8f234c0556584570683113037fd3289cad26615fe03e933de3c3618957d0b6142c0a787cefcafa4f913b89564daf7c87bec56d24b61706320c54ea6f3f0800b6
|
data/CHANGELOG.md
CHANGED
@@ -20,4 +20,14 @@
|
|
20
20
|
## 1.2.1 (4/06/2024)
|
21
21
|
|
22
22
|
* `#una_special_characters` method now also returns decimal notation character, default `.`.
|
23
|
-
* `#una_special_characters` method can now take no arguments, and will return the default special characters if so.
|
23
|
+
* `#una_special_characters` method can now take no arguments, and will return the default special characters if so.
|
24
|
+
|
25
|
+
## 2.0.0 (18/06/2024)
|
26
|
+
|
27
|
+
* Added support for ANSIX12 format.
|
28
|
+
|
29
|
+
##### Breaking changes:
|
30
|
+
* `#una_special_characters` renamed to `#special_characters` (since it can now accept input of any supported format)
|
31
|
+
* A new `UnrecognizedFormat` error is now raised if the format of the input cannot be detected.
|
32
|
+
* In essence, input must now begin with `UNA` or `UNB` (EDIFACT), `STX` (TRADACOMS), or `ISA` (ANSIX12)
|
33
|
+
|
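The breaking change above means that detection depends only on the first three characters of the input. A minimal standalone sketch of that rule follows. It does not call the gem: the `detect_format` helper, the symbol return values, and the local `UnrecognizedFormat` class are illustrative stand-ins, not the gem's own definitions.

```ruby
# Hypothetical stand-in for the gem's error class (the real one lives in
# lib/edifact_rails/exceptions.rb, whose contents are not shown in this diff).
class UnrecognizedFormat < StandardError; end

# Illustrative helper: maps the leading three characters to a format symbol.
def detect_format(input)
  case input.strip[0..2]
  when "UNA", "UNB" then :edifact
  when "STX"        then :tradacoms
  when "ISA"        then :ansix12
  else raise UnrecognizedFormat, "input must begin with UNA, UNB, STX, or ISA"
  end
end

puts detect_format("UNB+UNOA:3+TESTPLACE:1+DEP1:1+20051107:1159+6002'")

# Input that starts with any other tag is rejected instead of being parsed.
begin
  detect_format("LIN+1++12345:EN'")
rescue UnrecognizedFormat => e
  puts "rejected: #{e.message}"
end
```

Callers upgrading from 1.x that previously parsed fragments (for example a lone `LIN` segment) would need to rescue this error or supply a full interchange.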
data/README.md
CHANGED
@@ -1,10 +1,10 @@
|
|
1
1
|
# EdifactRails
|
2
2
|
|
3
|
-
This gem parses EDIFACT or
|
3
|
+
This gem parses EDIFACT, TRADACOMS, or ANSIX12 input, and converts it into a ruby array structure for whatever further processing or validation you desire.
|
4
4
|
|
5
5
|
It does not handle validation itself.
|
6
6
|
|
7
|
-
This gem is heavily inspired by
|
7
|
+
This gem is heavily inspired by [edifact_parser](https://github.com/pvdvreede/edifact_parser)
|
8
8
|
|
9
9
|
## Requirements
|
10
10
|
|
@@ -20,7 +20,7 @@ This gem has been tested on the following ruby versions:
|
|
20
20
|
In your `Gemfile`:
|
21
21
|
|
22
22
|
```ruby
|
23
|
-
gem 'edifact_rails', '~>
|
23
|
+
gem 'edifact_rails', '~> 2.0.0'
|
24
24
|
```
|
25
25
|
|
26
26
|
Otherwise:
|
@@ -37,20 +37,20 @@ If you don't have the gem in your `Gemfile`, you will need to:
|
|
37
37
|
require 'edifact_rails'
|
38
38
|
```
|
39
39
|
|
40
|
-
You can
|
40
|
+
You can parse a string input with `#parse`, or a file with `#parse_file`
|
41
41
|
|
42
42
|
```ruby
|
43
|
-
ruby_array = EdifactRails.
|
43
|
+
ruby_array = EdifactRails.parse("UNB+UNOA:3+TESTPLACE:1+DEP1:1+20051107:1159+6002'")
|
44
44
|
```
|
45
45
|
|
46
46
|
```ruby
|
47
|
-
ruby_array = EdifactRails.
|
47
|
+
ruby_array = EdifactRails.parse_file("your/file/path")
|
48
48
|
```
|
49
49
|
|
50
|
-
You can
|
50
|
+
You can return the special characters of your input with `#special_characters`.
|
51
51
|
```ruby
|
52
|
-
|
53
|
-
#
|
52
|
+
special_characters = EdifactRails.special_characters(example_edifact_input)
|
53
|
+
# special_characters =>
|
54
54
|
{
|
55
55
|
component_data_element_seperator: ":",
|
56
56
|
data_element_seperator: "+",
|
@@ -179,4 +179,66 @@ Will be returned as:
|
|
179
179
|
['MTR', [3]],
|
180
180
|
['END', [5]]
|
181
181
|
]
|
182
|
-
```
|
182
|
+
```
|
183
|
+
|
184
|
+
### ANSIX12
|
185
|
+
|
186
|
+
This ANSIX12 file:
|
187
|
+
|
188
|
+
```
|
189
|
+
ISA*00* *00* *01*SENDER *01*RECEIVER *231014*1200*U*00401*000000001*1*P*>~
|
190
|
+
GS*SS*APP SENDER*APP RECEIVER*20231014*1200*0001*X*004010~
|
191
|
+
ST*862*0001~
|
192
|
+
BSS*05*12345*20230414*DL*20231014*20231203****ORDER1*A~
|
193
|
+
N1*MI*SEEBURGER AG*ZZ*00000085~
|
194
|
+
N3*EDISONSTRASSE 1~
|
195
|
+
N4*BRETTEN**75015*DE~
|
196
|
+
N1*SU*SUPLIER NAME*ZZ*11222333~
|
197
|
+
N3*203 STREET NAME~
|
198
|
+
N4*ATLANTA*GA*30309*US~
|
199
|
+
LIN**BP*MATERIAL1*EC*ENGINEERING1*DR*001~
|
200
|
+
UIT*EA~
|
201
|
+
PER*SC*SEEBURGER INFO*TE*+49(7525)0~
|
202
|
+
FST*13*C*D*20231029****DO*12345-1~
|
203
|
+
FST*77*C*D*20231119****DO*12345-2~
|
204
|
+
FST*68*C*D*20231203****DO*12345-3~
|
205
|
+
SHP*01*927*011*20231014~
|
206
|
+
REF*SI*Q5880~
|
207
|
+
SHP*02*8557*011*20231014**20231203~
|
208
|
+
CTT*1*5~
|
209
|
+
SE*19*0001~
|
210
|
+
GE*1*0001~
|
211
|
+
IEA*1*000000001~
|
212
|
+
```
|
213
|
+
|
214
|
+
Will be returned as:
|
215
|
+
|
216
|
+
```ruby
|
217
|
+
[
|
218
|
+
["ISA", ["00"], [nil], ["00"], [nil], ["01"], ["SENDER"], ["01"], ["RECEIVER"], [231014], [1200], ["U"], ["00401"], ["000000001"], [1], ["P"], []],
|
219
|
+
["GS", ["SS"], ["APP SENDER"], ["APP RECEIVER"], [20231014], [1200], ["0001"], ["X"], ["004010"]],
|
220
|
+
["ST", [862], ["0001"]],
|
221
|
+
["BSS", ["05"], [12345], [20230414], ["DL"], [20231014], [20231203], [], [], [], ["ORDER1"], ["A"]],
|
222
|
+
["N1", ["MI"], ["SEEBURGER AG"], ["ZZ"], ["00000085"]],
|
223
|
+
["N3", ["EDISONSTRASSE 1"]],
|
224
|
+
["N4", ["BRETTEN"], [], [75015], ["DE"]],
|
225
|
+
["N1", ["SU"], ["SUPLIER NAME"], ["ZZ"], [11222333]],
|
226
|
+
["N3", ["203 STREET NAME"]],
|
227
|
+
["N4", ["ATLANTA"], ["GA"], [30309], ["US"]],
|
228
|
+
["LIN", [], ["BP"], ["MATERIAL1"], ["EC"], ["ENGINEERING1"], ["DR"], ["001"]],
|
229
|
+
["UIT", ["EA"]],
|
230
|
+
["PER", ["SC"], ["SEEBURGER INFO"], ["TE"], ["+49(7525)0"]],
|
231
|
+
["FST", [13], ["C"], ["D"], [20231029], [], [], [], ["DO"], ["12345-1"]],
|
232
|
+
["FST", [77], ["C"], ["D"], [20231119], [], [], [], ["DO"], ["12345-2"]],
|
233
|
+
["FST", [68], ["C"], ["D"], [20231203], [], [], [], ["DO"], ["12345-3"]],
|
234
|
+
["SHP", ["01"], [927], ["011"], [20231014]],
|
235
|
+
["REF", ["SI"], ["Q5880"]],
|
236
|
+
["SHP", ["02"], [8557], ["011"], [20231014], [], [20231203]],
|
237
|
+
["CTT", [1], [5]],
|
238
|
+
["SE", [19], ["0001"]],
|
239
|
+
["GE", [1], ["0001"]],
|
240
|
+
["IEA", [1], ["000000001"]]
|
241
|
+
]
|
242
|
+
```
|
243
|
+
|
244
|
+
|
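Since the parser returns plain nested arrays (segment tag first, then one array per data element), ordinary `Enumerable` methods are enough to query the result. The sketch below hard-codes a few rows copied from the output above instead of calling the gem, and `segments_with_tag` is a hypothetical helper name.

```ruby
# A few segments copied from the parsed ANSIX12 output above.
parsed = [
  ["ST", [862], ["0001"]],
  ["N1", ["MI"], ["SEEBURGER AG"], ["ZZ"], ["00000085"]],
  ["N3", ["EDISONSTRASSE 1"]],
  ["N1", ["SU"], ["SUPLIER NAME"], ["ZZ"], [11222333]],
  ["CTT", [1], [5]]
]

# Hypothetical helper: select every segment whose tag (first entry) matches.
def segments_with_tag(segments, tag)
  segments.select { |segment| segment.first == tag }
end

# Each data element is itself an array of components, so take the first component.
party_names = segments_with_tag(parsed, "N1").map { |segment| segment[2].first }
puts party_names.inspect
```

Note that numeric-looking values such as `11222333` come back as integers while zero-padded ones such as `"00000085"` stay strings, so downstream code should not assume a single type per position.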
data/lib/edifact_rails/parser.rb
CHANGED
@@ -10,20 +10,26 @@ module EdifactRails
|
|
10
10
|
|
11
11
|
# Treat the input, split the input string into segments, parse those segments
|
12
12
|
def parse(string)
|
13
|
-
#
|
14
|
-
string = string.gsub(
|
13
|
+
# Remove all carriage returns, and leading and trailing whitespace
|
14
|
+
string = string.delete("\r").gsub(/^\s*(.*)\s*$/, '\1')
|
15
15
|
|
16
|
-
|
16
|
+
@edi_format = detect_edi_format(string)
|
17
|
+
|
18
|
+
# Detects special characters in the UNA segment (edifact) or ISA segment (ansix12),
|
19
|
+
# updates special characters if so
|
17
20
|
detect_special_characters(string)
|
18
21
|
|
19
22
|
# Does some funky regex manipulation to handle escaped special characters
|
20
|
-
|
23
|
+
# Ansix12 does not have escape characters, so we can skip
|
24
|
+
string = handle_duplicate_escape_characters(string) unless @edi_format == EdifactRails::Formats::ANSIX12
|
21
25
|
|
22
26
|
# Split the input string into segments
|
23
|
-
segments =
|
24
|
-
|
25
|
-
|
26
|
-
|
27
|
+
segments =
|
28
|
+
if @edi_format == EdifactRails::Formats::ANSIX12
|
29
|
+
string.split(@special_characters[:segment_seperator])
|
30
|
+
else
|
31
|
+
string.split(/(?<!#{Regexp.quote(@special_characters[:escape_character])})#{Regexp.quote(@special_characters[:segment_seperator])}/)
|
32
|
+
end
|
27
33
|
|
28
34
|
# Drop the UNA segment, if present (we have already dealt with it in #detect_special_characters)
|
29
35
|
segments.reject! { |s| s[0..2] == "UNA" }
|
@@ -33,62 +39,92 @@ module EdifactRails
|
|
33
39
|
end
|
34
40
|
|
35
41
|
# Given an input string, return the special characters as defined by the UNA segment
|
36
|
-
|
37
|
-
|
42
|
+
def special_characters(string = "")
|
43
|
+
# If no string is passed, return default edifact characters
|
44
|
+
return EdifactRails::DEFAULT_SPECIAL_CHARACTERS if string.empty?
|
45
|
+
|
46
|
+
string = string.delete("\r").gsub(/^\s*(.*)\s*$/, '\1')
|
47
|
+
@edi_format = detect_edi_format(string)
|
38
48
|
detect_special_characters(string)
|
39
49
|
|
40
|
-
|
41
|
-
component_data_element_seperator: @component_data_element_seperator,
|
42
|
-
data_element_seperator: @data_element_seperator,
|
43
|
-
decimal_notation: @decimal_notation,
|
44
|
-
escape_character: @escape_character,
|
45
|
-
segment_seperator: @segment_seperator
|
46
|
-
}
|
50
|
+
@special_characters
|
47
51
|
end
|
48
52
|
|
49
53
|
private
|
50
54
|
|
51
|
-
def
|
52
|
-
|
53
|
-
|
54
|
-
|
55
|
-
|
56
|
-
|
57
|
-
|
58
|
-
|
59
|
-
|
60
|
-
|
61
|
-
|
62
|
-
@decimal_notation = decimal_notation
|
63
|
-
@escape_character = escape_character
|
64
|
-
@segment_seperator = segment_seperator
|
55
|
+
def detect_edi_format(string)
|
56
|
+
case string[0..2]
|
57
|
+
when "UNA", "UNB"
|
58
|
+
EdifactRails::Formats::EDIFACT
|
59
|
+
when "STX"
|
60
|
+
EdifactRails::Formats::TRADACOMS
|
61
|
+
when "ISA"
|
62
|
+
EdifactRails::Formats::ANSIX12
|
63
|
+
else
|
64
|
+
raise EdifactRails::UnrecognizedFormat
|
65
|
+
end
|
65
66
|
end
|
66
67
|
|
67
68
|
def detect_special_characters(string)
|
68
|
-
#
|
69
|
-
|
69
|
+
# Format must be EDIFACT or ANSI X12 to set custom characters
|
70
|
+
# Tradacoms uses the defaults
|
71
|
+
return unless [EdifactRails::Formats::EDIFACT, EdifactRails::Formats::ANSIX12].include?(@edi_format)
|
72
|
+
|
73
|
+
# If EDIFACT, UNA tags are optional, so return if it's not present
|
74
|
+
return if @edi_format == EdifactRails::Formats::EDIFACT && string[0..2] != "UNA"
|
75
|
+
|
76
|
+
case @edi_format
|
77
|
+
when EdifactRails::Formats::EDIFACT
|
78
|
+
# UNA segments look like this:
|
79
|
+
#
|
80
|
+
# UNA:+.? '
|
81
|
+
#
|
82
|
+
# UNA followed by 6 special characters which are, in order:
|
83
|
+
# 1. Component data element separator
|
84
|
+
# 2. Data element separator
|
85
|
+
# 3. Decimal notation (must be . or ,)
|
86
|
+
# 4. Release character (aka escape character)
|
87
|
+
# 5. Reserved for future use, so always a space for now
|
88
|
+
# 6. Segment terminator
|
89
|
+
set_special_characters(
|
90
|
+
component_data_element_seperator: string[3],
|
91
|
+
data_element_seperator: string[4],
|
92
|
+
decimal_notation: string[5],
|
93
|
+
escape_character: string[6],
|
94
|
+
segment_seperator: string[8]
|
95
|
+
)
|
96
|
+
when EdifactRails::Formats::ANSIX12
|
97
|
+
# ISA segments look like this:
|
98
|
+
# ISA*00* *00* *01*SENDER *01*RECEIVER *231014*1200*U*00401*000000001*1*P*>~
|
99
|
+
# These are designed to always be the same number of characters, so we can use the hardcoded positions
|
100
|
+
# The special characters are the 4th (default *, data_element_seperator),
|
101
|
+
# 105th, 106th, 103rd, and 3rd characters
|
102
|
+
set_special_characters(
|
103
|
+
data_element_seperator: string[3],
|
104
|
+
component_data_element_seperator: string[104],
|
105
|
+
segment_seperator: string[105]
|
106
|
+
)
|
107
|
+
end
|
108
|
+
end
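The ANSIX12 branch relies on the ISA header being fixed width. A rough sketch of that idea follows, assuming the standard ISA field widths (10 characters for the two information fields, 15 for the sender and receiver IDs). The header is rebuilt with explicit padding so the offsets line up; `ISA_FIELDS`, `ISA_HEADER`, and `isa_special_characters` are hypothetical names, not part of the gem.

```ruby
# Rebuild a fixed-width ISA header; ljust pads the blank and short fields
# to the widths assumed in the lead-in above.
ISA_FIELDS = [
  "00", "".ljust(10), "00", "".ljust(10),
  "01", "SENDER".ljust(15), "01", "RECEIVER".ljust(15),
  "231014", "1200", "U", "00401", "000000001", "1", "P", ">"
].freeze
ISA_HEADER = "ISA*#{ISA_FIELDS.join('*')}~"

# Hypothetical helper using the same zero-based indices as the ANSIX12 branch.
def isa_special_characters(string)
  {
    data_element_seperator: string[3],
    component_data_element_seperator: string[104],
    segment_seperator: string[105]
  }
end

puts ISA_HEADER.length
puts isa_special_characters(ISA_HEADER).inspect
```

If a header has had its padding trimmed or collapsed, the indices 104 and 105 no longer point at the separators, so the input should be passed through unmodified.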
|
70
109
|
|
71
|
-
|
72
|
-
#
|
73
|
-
|
74
|
-
|
75
|
-
#
|
76
|
-
|
77
|
-
|
78
|
-
|
79
|
-
|
80
|
-
# 5. Reserved for future use, so always a space for now
|
81
|
-
# 6. Segment terminator
|
82
|
-
set_special_characters(string[3], string[4], string[5], string[6], string[8])
|
110
|
+
def set_special_characters(args = {})
|
111
|
+
# arg keys will overwrite the defaults when present
|
112
|
+
@special_characters = EdifactRails::DEFAULT_SPECIAL_CHARACTERS.merge(args)
|
113
|
+
|
114
|
+
# ANSIX12 files have no escape character or decimal notation character
|
115
|
+
return unless @edi_format == EdifactRails::Formats::ANSIX12
|
116
|
+
|
117
|
+
@special_characters.delete(:escape_character)
|
118
|
+
@special_characters.delete(:decimal_notation)
|
83
119
|
end
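The merge-then-prune behaviour of `set_special_characters` can be tried in isolation. The sketch below is a standalone approximation: `DEFAULTS` is an assumed copy of the default characters (taken from the `UNA:+.? '` example in the comments above), and `build_special_characters` with its symbol `format` argument is a hypothetical helper.

```ruby
# Assumed defaults, taken from the UNA example in the comments above.
DEFAULTS = {
  component_data_element_seperator: ":",
  data_element_seperator: "+",
  decimal_notation: ".",
  escape_character: "?",
  segment_seperator: "'"
}.freeze

# Hypothetical standalone version of the merge-then-prune step.
def build_special_characters(format, overrides = {})
  # Hash#merge returns a new, unfrozen hash, so the frozen defaults are untouched.
  characters = DEFAULTS.merge(overrides)
  if format == :ansix12
    # ANSIX12 has no escape character or decimal notation character.
    characters.delete(:escape_character)
    characters.delete(:decimal_notation)
  end
  characters
end

puts build_special_characters(:edifact, segment_seperator: "~").inspect
puts build_special_characters(:ansix12, data_element_seperator: "*").inspect
```

Because the two keys are removed entirely for ANSIX12, callers reading `special_characters[:escape_character]` should expect `nil` for that format.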
|
84
120
|
|
85
|
-
def
|
121
|
+
def handle_duplicate_escape_characters(string)
|
86
122
|
# Prepare regex
|
87
|
-
|
123
|
+
other_specials_regex = Regexp.quote(
|
88
124
|
[
|
89
|
-
@segment_seperator,
|
90
|
-
@data_element_seperator,
|
91
|
-
@component_data_element_seperator
|
125
|
+
@special_characters[:segment_seperator],
|
126
|
+
@special_characters[:data_element_seperator],
|
127
|
+
@special_characters[:component_data_element_seperator]
|
92
128
|
].join
|
93
129
|
)
|
94
130
|
|
@@ -96,7 +132,7 @@ module EdifactRails
|
|
96
132
|
# the special character is therefore unescaped.
|
97
133
|
# Add a space between these even number of escapes, and the special character
|
98
134
|
#
|
99
|
-
# This means the regex logic for
|
135
|
+
# This means the regex logic for splitting on special characters is now consistent, since there will only ever
|
100
136
|
# be either 0 or 1 escape characters before every special character.
|
101
137
|
#
|
102
138
|
# We have to do this because we can't negative lookbehind for 'an even number of escape characters' since
|
@@ -110,20 +146,30 @@ module EdifactRails
|
|
110
146
|
# "LIN+even????+123" => '+' is not escaped, gsub'ed => "even???? +123" => parsed => ['LIN', ['even??'], [123]]
|
111
147
|
# "LIN+odd???+123" => '+' is escaped, not gsub'ed => "odd???+123" => parsed => ['LIN', ['odd?+123']]
|
112
148
|
string.gsub(
|
113
|
-
/(?<!#{Regexp.quote(@escape_character)})((#{Regexp.quote(@escape_character)}{2})+)([#{
|
149
|
+
/(?<!#{Regexp.quote(@special_characters[:escape_character])})((#{Regexp.quote(@special_characters[:escape_character])}{2})+)([#{other_specials_regex}])/,
|
114
150
|
'\1 \3'
|
115
151
|
)
|
116
152
|
end
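The two examples in the comments above can be reproduced with a small standalone script. This is a sketch of the same regex technique using the default EDIFACT characters; `ESCAPE_CHAR`, `OTHER_SPECIALS`, `pad_even_escapes`, and `split_data_elements` are hypothetical names, and the real parser goes on to strip and unescape each component afterwards.

```ruby
ESCAPE_CHAR = "?"
OTHER_SPECIALS = "'+:" # segment, data element, and component separators

# Insert a space between an even run of escape characters and the special
# character that follows, so that character is clearly unescaped.
def pad_even_escapes(string)
  escape = Regexp.quote(ESCAPE_CHAR)
  specials = Regexp.quote(OTHER_SPECIALS)
  string.gsub(/(?<!#{escape})((#{escape}{2})+)([#{specials}])/, '\1 \3')
end

# Split on data element separators that are not directly preceded by an escape.
def split_data_elements(string)
  string.split(/(?<!#{Regexp.quote(ESCAPE_CHAR)})#{Regexp.quote("+")}/)
end

p split_data_elements(pad_even_escapes("LIN+even????+123"))
p split_data_elements(pad_even_escapes("LIN+odd???+123"))
```

After padding, a single negative lookbehind is sufficient for every later split, which is why the padding pass runs once over the whole input before segmentation.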
|
117
153
|
|
118
154
|
# Split the segment into data elements, take the first as the tag, then parse the rest
|
119
155
|
def parse_segment(segment)
|
156
|
+
segment.chomp("")
|
157
|
+
segment.gsub!(/^\s*(.*)\s*/, '\1')
|
158
|
+
|
120
159
|
# If the input is a tradacoms file, the segment tag will be followed by '=' instead of '+'
|
121
160
|
# 'QTY=1+A:B' instead of 'QTY+1+A:B'
|
122
161
|
# Fortunately, this is easily handled by simply changing these "="s into "+"s before the split
|
123
|
-
|
162
|
+
if @edi_format == EdifactRails::Formats::TRADACOMS && segment.length >= 4
|
163
|
+
segment[3] = @special_characters[:data_element_seperator]
|
164
|
+
end
|
124
165
|
|
125
166
|
# Segments are made up of data elements
|
126
|
-
data_elements =
|
167
|
+
data_elements =
|
168
|
+
if @edi_format == EdifactRails::Formats::ANSIX12
|
169
|
+
segment.split(@special_characters[:data_element_seperator])
|
170
|
+
else
|
171
|
+
segment.split(/(?<!#{Regexp.quote(@special_characters[:escape_character])})#{Regexp.quote(@special_characters[:data_element_seperator])}/)
|
172
|
+
end
|
127
173
|
|
128
174
|
# The first element is the tag, pop it off
|
129
175
|
parsed_segment = []
|
@@ -137,7 +183,11 @@ module EdifactRails
|
|
137
183
|
def parse_data_element(element)
|
138
184
|
# Split data element into components
|
139
185
|
components =
|
140
|
-
|
186
|
+
if @edi_format == EdifactRails::Formats::ANSIX12
|
187
|
+
element.split(@special_characters[:component_data_element_seperator])
|
188
|
+
else
|
189
|
+
element.split(/(?<!#{Regexp.quote(@special_characters[:escape_character])})#{Regexp.quote(@special_characters[:component_data_element_seperator])}/)
|
190
|
+
end
|
141
191
|
|
142
192
|
components.map { |component| treat_component(component) }
|
143
193
|
end
|
@@ -149,15 +199,19 @@ module EdifactRails
|
|
149
199
|
|
150
200
|
# Prepare regex
|
151
201
|
all_special_characters_string = [
|
152
|
-
@segment_seperator,
|
153
|
-
@data_element_seperator,
|
154
|
-
@component_data_element_seperator,
|
155
|
-
@escape_character
|
202
|
+
@special_characters[:segment_seperator],
|
203
|
+
@special_characters[:data_element_seperator],
|
204
|
+
@special_characters[:component_data_element_seperator],
|
205
|
+
@special_characters[:escape_character]
|
156
206
|
].join
|
157
207
|
|
158
|
-
|
159
|
-
|
160
|
-
|
208
|
+
unless @edi_format == EdifactRails::Formats::ANSIX12
|
209
|
+
# If the component has escaped characters in it, remove the escape character and return the character as is
|
210
|
+
# "?+" -> "+", "??" -> "?"
|
211
|
+
component.gsub!(
|
212
|
+
/#{Regexp.quote(@special_characters[:escape_character])}([#{Regexp.quote(all_special_characters_string)}])/, '\1'
|
213
|
+
)
|
214
|
+
end
|
161
215
|
|
162
216
|
# Convert empty strings to nils
|
163
217
|
component = nil if component.empty?
|
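The unescaping step in the hunk above can be checked on its own. The following sketch assumes the default EDIFACT special characters and uses a non-mutating `gsub` rather than the `gsub!` in the diff; `unescape_component` is a hypothetical helper name.

```ruby
# Remove the escape character in front of any special character, keeping the
# character itself: "?+" becomes "+", and "??" becomes "?".
def unescape_component(component)
  escape = Regexp.quote("?")
  # Assumed defaults: segment, data element, component separators, and escape.
  specials = Regexp.quote("'+:?")
  component.gsub(/#{escape}([#{specials}])/, '\1')
end

p unescape_component("odd?+123")
p unescape_component("even??")
```

Matches are consumed left to right, so a doubled escape collapses to one literal escape character before the next pair is considered.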
data/lib/edifact_rails.rb
CHANGED
@@ -1,6 +1,8 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
2
|
|
3
3
|
require "edifact_rails/parser"
|
4
|
+
require "edifact_rails/formats"
|
5
|
+
require "edifact_rails/exceptions"
|
4
6
|
|
5
7
|
module EdifactRails
|
6
8
|
DEFAULT_SPECIAL_CHARACTERS = {
|
@@ -17,11 +19,11 @@ module EdifactRails
|
|
17
19
|
end
|
18
20
|
|
19
21
|
def self.parse_file(file_path)
|
20
|
-
parse(File.read(file_path)
|
22
|
+
parse(File.read(file_path))
|
21
23
|
end
|
22
24
|
|
23
|
-
def self.
|
25
|
+
def self.special_characters(string = "")
|
24
26
|
parser = EdifactRails::Parser.new
|
25
|
-
parser.
|
27
|
+
parser.special_characters(string)
|
26
28
|
end
|
27
29
|
end
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: edifact_rails
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version:
|
4
|
+
version: 2.0.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- David Blackwood
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2024-06-
|
11
|
+
date: 2024-06-18 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: byebug
|
@@ -124,6 +124,8 @@ files:
|
|
124
124
|
- LICENSE
|
125
125
|
- README.md
|
126
126
|
- lib/edifact_rails.rb
|
127
|
+
- lib/edifact_rails/exceptions.rb
|
128
|
+
- lib/edifact_rails/formats.rb
|
127
129
|
- lib/edifact_rails/parser.rb
|
128
130
|
- lib/edifact_rails/version.rb
|
129
131
|
homepage: https://github.com/david-blackwood/edifact_rails
|