edifact_rails 1.2.1 → 2.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 3c51652b41747f2b0c07ecd834c5c79b3bebbb1537464a450f37ee58ea68150e
4
- data.tar.gz: 2669313b30c7565c60f4dc568c528127577770946f7d4a8ffee7a1fc87f4c07e
3
+ metadata.gz: ecc255d266d797eed361491b7643a6ca1a4e03205d214da5c447d4966d0cfb10
4
+ data.tar.gz: e7aa5f7730944a6b1f6b5adf9809b5d097ae2daf0bef00cd4f84c7433515b122
5
5
  SHA512:
6
- metadata.gz: ded47109a99423254023e4f7316e1bdc3d1dc6602e73225a900c815baa2b2c4d9d14fe16f131aa5e1239b6b49ecab6394d7051ec9fc66923c6ffb5eb0e674ba5
7
- data.tar.gz: f3c9d7ae8f793651c62eff11b3356321f6bcc531b5ce61020f23d530a08b0112d46e338edc8ff1711bc514fb53875508b5792795c3cbea8056e5ee97a60e52f1
6
+ metadata.gz: f0976e015521c0145fd8f4fcca85be2acd5206cf88a31a1bae6cc2d1b88068315930fadf37e1c189c04adab02720fe80b743b14e0818de33d9bbdba3b2d06d00
7
+ data.tar.gz: 8f234c0556584570683113037fd3289cad26615fe03e933de3c3618957d0b6142c0a787cefcafa4f913b89564daf7c87bec56d24b61706320c54ea6f3f0800b6
data/CHANGELOG.md CHANGED
@@ -20,4 +20,14 @@
20
20
  ## 1.2.1 (4/06/2024)
21
21
 
22
22
  * `#una_special_characters` method now also returns decimal notation character, default `.`.
23
- * `#una_special_characters` method can now take no arguments, and will return the default special characters if so.
23
+ * `#una_special_characters` method can now take no arguments, and will return the default special characters if so.
24
+
25
+ ## 2.0.0 (18/06/2024)
26
+
27
+ * Added support for ANSIX12 format.
28
+
29
+ ##### Breaking changes:
30
+ * `#una_special_characters` renamed to `#special_characters` (since it can now accept input of any supported format)
31
+ * New `UnrecognizedFormat` error is now raised if the format of the input cannot be detected.
32
+ * In essence, input must now begin with `UNA` or `UNB` (EDIFACT), `STX` (TRADACOMS), or `ISA` (ANSIX12)
33
+
data/README.md CHANGED
@@ -1,10 +1,10 @@
1
1
  # EdifactRails
2
2
 
3
- This gem parses EDIFACT or TRADACOMS input, and converts it into a ruby array structure for whatever further processing or validation you desire.
3
+ This gem parses EDIFACT, TRADACOMS, or ANSIX12 input, and converts it into a ruby array structure for whatever further processing or validation you desire.
4
4
 
5
5
  It does not handle validation itself.
6
6
 
7
- This gem is heavily inspired by and attempts to output similar results as [edifact_parser](https://github.com/pvdvreede/edifact_parser)
7
+ This gem is heavily inspired by [edifact_parser](https://github.com/pvdvreede/edifact_parser)
8
8
 
9
9
  ## Requirements
10
10
 
@@ -20,7 +20,7 @@ This gem has been tested on the following ruby versions:
20
20
  In your `Gemfile`:
21
21
 
22
22
  ```ruby
23
- gem 'edifact_rails', '~> 1.2'
23
+ gem 'edifact_rails', '~> 2.0.0'
24
24
  ```
25
25
 
26
26
  Otherwise:
@@ -37,20 +37,20 @@ If you don't have the gem in your `Gemfile`, you will need to:
37
37
  require 'edifact_rails'
38
38
  ```
39
39
 
40
- You can pass either the path to your EDIFACT (or TRADACOMS) file, or a document (or snippet) as a string:
40
+ You can parse a string input with `#parse`, or a file with `#parse_file`:
41
41
 
42
42
  ```ruby
43
- ruby_array = EdifactRails.parse_file("your/file/path")
43
+ ruby_array = EdifactRails.parse("UNB+UNOA:3+TESTPLACE:1+DEP1:1+20051107:1159+6002'")
44
44
  ```
45
45
 
46
46
  ```ruby
47
- ruby_array = EdifactRails.parse("LIN+1+1+0764569104:IB'QTY+1:25'")
47
+ ruby_array = EdifactRails.parse_file("your/file/path")
48
48
  ```
49
49
 
50
- You can pull just the special characters from the UNA segment (or the defaults if no UNA segment is present):
50
+ You can return the special characters of your input with `#special_characters`.
51
51
  ```ruby
52
- una_special_characters = EdifactRails.una_special_characters(your_string_input)
53
- # una_special_characters =>
52
+ special_characters = EdifactRails.special_characters(example_edifact_input)
53
+ # special_characters =>
54
54
  {
55
55
  component_data_element_seperator: ":",
56
56
  data_element_seperator: "+",
@@ -179,4 +179,66 @@ Will be returned as:
179
179
  ['MTR', [3]],
180
180
  ['END', [5]]
181
181
  ]
182
- ```
182
+ ```
183
+
184
+ ### ANSIX12
185
+
186
+ This ANSIX12 file:
187
+
188
+ ```
189
+ ISA*00* *00* *01*SENDER *01*RECEIVER *231014*1200*U*00401*000000001*1*P*>~
190
+ GS*SS*APP SENDER*APP RECEIVER*20231014*1200*0001*X*004010~
191
+ ST*862*0001~
192
+ BSS*05*12345*20230414*DL*20231014*20231203****ORDER1*A~
193
+ N1*MI*SEEBURGER AG*ZZ*00000085~
194
+ N3*EDISONSTRASSE 1~
195
+ N4*BRETTEN**75015*DE~
196
+ N1*SU*SUPLIER NAME*ZZ*11222333~
197
+ N3*203 STREET NAME~
198
+ N4*ATLANTA*GA*30309*US~
199
+ LIN**BP*MATERIAL1*EC*ENGINEERING1*DR*001~
200
+ UIT*EA~
201
+ PER*SC*SEEBURGER INFO*TE*+49(7525)0~
202
+ FST*13*C*D*20231029****DO*12345-1~
203
+ FST*77*C*D*20231119****DO*12345-2~
204
+ FST*68*C*D*20231203****DO*12345-3~
205
+ SHP*01*927*011*20231014~
206
+ REF*SI*Q5880~
207
+ SHP*02*8557*011*20231014**20231203~
208
+ CTT*1*5~
209
+ SE*19*0001~
210
+ GE*1*0001~
211
+ IEA*1*000000001~
212
+ ```
213
+
214
+ Will be returned as:
215
+
216
+ ```ruby
217
+ [
218
+ ["ISA", ["00"], [nil], ["00"], [nil], ["01"], ["SENDER"], ["01"], ["RECEIVER"], [231014], [1200], ["U"], ["00401"], ["000000001"], [1], ["P"], []],
219
+ ["GS", ["SS"], ["APP SENDER"], ["APP RECEIVER"], [20231014], [1200], ["0001"], ["X"], ["004010"]],
220
+ ["ST", [862], ["0001"]],
221
+ ["BSS", ["05"], [12345], [20230414], ["DL"], [20231014], [20231203], [], [], [], ["ORDER1"], ["A"]],
222
+ ["N1", ["MI"], ["SEEBURGER AG"], ["ZZ"], ["00000085"]],
223
+ ["N3", ["EDISONSTRASSE 1"]],
224
+ ["N4", ["BRETTEN"], [], [75015], ["DE"]],
225
+ ["N1", ["SU"], ["SUPLIER NAME"], ["ZZ"], [11222333]],
226
+ ["N3", ["203 STREET NAME"]],
227
+ ["N4", ["ATLANTA"], ["GA"], [30309], ["US"]],
228
+ ["LIN", [], ["BP"], ["MATERIAL1"], ["EC"], ["ENGINEERING1"], ["DR"], ["001"]],
229
+ ["UIT", ["EA"]],
230
+ ["PER", ["SC"], ["SEEBURGER INFO"], ["TE"], ["+49(7525)0"]],
231
+ ["FST", [13], ["C"], ["D"], [20231029], [], [], [], ["DO"], ["12345-1"]],
232
+ ["FST", [77], ["C"], ["D"], [20231119], [], [], [], ["DO"], ["12345-2"]],
233
+ ["FST", [68], ["C"], ["D"], [20231203], [], [], [], ["DO"], ["12345-3"]],
234
+ ["SHP", ["01"], [927], ["011"], [20231014]],
235
+ ["REF", ["SI"], ["Q5880"]],
236
+ ["SHP", ["02"], [8557], ["011"], [20231014], [], [20231203]],
237
+ ["CTT", [1], [5]],
238
+ ["SE", [19], ["0001"]],
239
+ ["GE", [1], ["0001"]],
240
+ ["IEA", [1], ["000000001"]]
241
+ ]
242
+ ```
243
+
244
+
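The returned structure is a plain nested array, so it can be walked with ordinary `Array` methods. The following is a minimal sketch that uses a few segments copied from the sample output above; `segments_with_tag`, `party_names`, and `party_ids` are hypothetical helpers written for illustration and are not part of the gem.

```ruby
# A few segments copied from the parsed ANSIX12 sample output.
PARSED_SAMPLE = [
  ["ST", [862], ["0001"]],
  ["N1", ["MI"], ["SEEBURGER AG"], ["ZZ"], ["00000085"]],
  ["N1", ["SU"], ["SUPLIER NAME"], ["ZZ"], [11222333]],
  ["SE", [19], ["0001"]]
].freeze

# Hypothetical helper: select every segment whose tag (first entry) matches.
def segments_with_tag(parsed, tag)
  parsed.select { |segment| segment.first == tag }
end

# Each data element is itself an array of components, so the party name is
# the first component of the second data element of an N1 segment.
def party_names(parsed)
  segments_with_tag(parsed, "N1").map { |segment| segment[2].first }
end

# The identifier is the first component of the fourth data element.
def party_ids(parsed)
  segments_with_tag(parsed, "N1").map { |segment| segment[4].first }
end
```

In the sample output, `00000085` is returned as a String while `11222333` is returned as an Integer, so code that compares identifiers may want to normalise both sides (for example with `to_s`) before comparing.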
data/lib/edifact_rails/exceptions.rb ADDED
@@ -0,0 +1,9 @@
1
+ # frozen_string_literal: true
2
+
3
+ module EdifactRails
4
+ class UnrecognizedFormat < StandardError
5
+ def initialize
6
+ super("Unrecognized EDI format. Accepted formats: Edifact, Tradacoms, ANSIX12. File must begin with UNA, UNB, STX, or ISA.")
7
+ end
8
+ end
9
+ end
data/lib/edifact_rails/formats.rb ADDED
@@ -0,0 +1,9 @@
1
+ # frozen_string_literal: true
2
+
3
+ module EdifactRails
4
+ class Formats
5
+ EDIFACT = "EDIFACT"
6
+ TRADACOMS = "TRADACOMS"
7
+ ANSIX12 = "ANSIX12"
8
+ end
9
+ end
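The two new files work together: the parser maps the first three characters of the input to one of the `Formats` constants and raises `UnrecognizedFormat` otherwise. A minimal sketch of that behaviour follows. It uses a stand-in module, `EdiFormatSketch`, so that it runs without the gem installed; the lookup table and the `detect_or_nil` wrapper are assumptions made for illustration, not the gem's API.

```ruby
# Stand-in module (hypothetical) so the sketch runs without the gem installed.
module EdiFormatSketch
  class UnrecognizedFormat < StandardError; end

  # Leading segment tag => format name, following the 2.0.0 changelog.
  TAG_FORMATS = {
    "UNA" => "EDIFACT",
    "UNB" => "EDIFACT",
    "STX" => "TRADACOMS",
    "ISA" => "ANSIX12"
  }.freeze

  # Look up the format from the first three characters, raising when unknown.
  def self.detect(input)
    TAG_FORMATS.fetch(input[0..2]) { raise UnrecognizedFormat }
  end

  # Hypothetical convenience wrapper: return nil instead of raising.
  def self.detect_or_nil(input)
    detect(input)
  rescue UnrecognizedFormat
    nil
  end
end
```

One consequence worth noting: a bare snippet such as `LIN+1+1+0764569104:IB'QTY+1:25'`, which the 1.x README used as a `#parse` example, does not start with a recognised tag and would therefore raise under the 2.0.0 rule.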
data/lib/edifact_rails/parser.rb CHANGED
@@ -10,20 +10,26 @@ module EdifactRails
10
10
 
11
11
  # Treat the input, split the input string into segments, parse those segments
12
12
  def parse(string)
13
- # Trim newlines and excess spaces around those newlines
14
- string = string.gsub(/\s*\n\s*/, "")
13
+ # Remove all carriage returns, and leading and trailing whitespace
14
+ string = string.delete("\r").gsub(/^\s*(.*)\s*$/, '\1')
15
15
 
16
- # Check for UNA segment, update special characters if so
16
+ @edi_format = detect_edi_format(string)
17
+
18
+ # Detects special characters in the UNA segment (edifact) or ISA segment (ansix12),
19
+ # updates special characters if so
17
20
  detect_special_characters(string)
18
21
 
19
22
  # Does some funky regex manipulation to handle escaped special characters
20
- string = treat_input(string)
23
+ # Ansix12 does not have escape characters, so we can skip
24
+ string = handle_duplicate_escape_characters(string) unless @edi_format == EdifactRails::Formats::ANSIX12
21
25
 
22
26
  # Split the input string into segments
23
- segments = string.split(/(?<!#{Regexp.quote(@escape_character)})#{Regexp.quote(@segment_seperator)}/)
24
-
25
- # Detect if the input is a tradacoms file
26
- @is_tradacoms = segments.map { |s| s[3] }.uniq == ["="]
27
+ segments =
28
+ if @edi_format == EdifactRails::Formats::ANSIX12
29
+ string.split(@special_characters[:segment_seperator])
30
+ else
31
+ string.split(/(?<!#{Regexp.quote(@special_characters[:escape_character])})#{Regexp.quote(@special_characters[:segment_seperator])}/)
32
+ end
27
33
 
28
34
  # Drop the UNA segment, if present (we have already dealt with it in #detect_special_characters)
29
35
  segments.reject! { |s| s[0..2] == "UNA" }
@@ -33,62 +39,92 @@ module EdifactRails
33
39
  end
34
40
 
35
41
  # Given an input string, return the special characters as defined by the UNA segment
36
- # If no UNA segment is present, returns the default special characters
37
- def una_special_characters(string)
42
+ def special_characters(string = "")
43
+ # If no string is passed, return default edifact characters
44
+ return EdifactRails::DEFAULT_SPECIAL_CHARACTERS if string.empty?
45
+
46
+ string = string.delete("\r").gsub(/^\s*(.*)\s*$/, '\1')
47
+ @edi_format = detect_edi_format(string)
38
48
  detect_special_characters(string)
39
49
 
40
- {
41
- component_data_element_seperator: @component_data_element_seperator,
42
- data_element_seperator: @data_element_seperator,
43
- decimal_notation: @decimal_notation,
44
- escape_character: @escape_character,
45
- segment_seperator: @segment_seperator
46
- }
50
+ @special_characters
47
51
  end
48
52
 
49
53
  private
50
54
 
51
- def set_special_characters(
52
- component_data_element_seperator =
53
- EdifactRails::DEFAULT_SPECIAL_CHARACTERS[:component_data_element_seperator],
54
- data_element_seperator = EdifactRails::DEFAULT_SPECIAL_CHARACTERS[:data_element_seperator],
55
- decimal_notation = EdifactRails::DEFAULT_SPECIAL_CHARACTERS[:decimal_notation],
56
- escape_character = EdifactRails::DEFAULT_SPECIAL_CHARACTERS[:escape_character],
57
- segment_seperator = EdifactRails::DEFAULT_SPECIAL_CHARACTERS[:segment_seperator]
58
- )
59
- # Set the special characters
60
- @component_data_element_seperator = component_data_element_seperator
61
- @data_element_seperator = data_element_seperator
62
- @decimal_notation = decimal_notation
63
- @escape_character = escape_character
64
- @segment_seperator = segment_seperator
55
+ def detect_edi_format(string)
56
+ case string[0..2]
57
+ when "UNA", "UNB"
58
+ EdifactRails::Formats::EDIFACT
59
+ when "STX"
60
+ EdifactRails::Formats::TRADACOMS
61
+ when "ISA"
62
+ EdifactRails::Formats::ANSIX12
63
+ else
64
+ raise EdifactRails::UnrecognizedFormat
65
+ end
65
66
  end
66
67
 
67
68
  def detect_special_characters(string)
68
- # UNA tags must be at the start of the input otherwise they are ignored
69
- return unless string[0..2] == "UNA"
69
+ # Format must be EDIFACT or ANSI X12 to set custom characters
70
+ # Tradacoms uses the defaults
71
+ return unless [EdifactRails::Formats::EDIFACT, EdifactRails::Formats::ANSIX12].include?(@edi_format)
72
+
73
+ # If EDIFACT, UNA tags are optional, so return if it's not present
74
+ return if @edi_format == EdifactRails::Formats::EDIFACT && string[0..2] != "UNA"
75
+
76
+ case @edi_format
77
+ when EdifactRails::Formats::EDIFACT
78
+ # UNA segments look like this:
79
+ #
80
+ # UNA:+.? '
81
+ #
82
+ # UNA followed by 6 special characters which are, in order:
83
+ # 1. Component data element separator
84
+ # 2. Data element separator
85
+ # 3. Decimal notation (must be . or ,)
86
+ # 4. Release character (aka escape character)
87
+ # 5. Reserved for future use, so always a space for now
88
+ # 6. Segment terminator
89
+ set_special_characters(
90
+ component_data_element_seperator: string[3],
91
+ data_element_seperator: string[4],
92
+ decimal_notation: string[5],
93
+ escape_character: string[6],
94
+ segment_seperator: string[8]
95
+ )
96
+ when EdifactRails::Formats::ANSIX12
97
+ # ISA segments look like this:
98
+ # ISA*00* *00* *01*SENDER *01*RECEIVER *231014*1200*U*00401*000000001*1*P*>~
99
+ # These are designed to always be the same number of characters, so we can use the hardcoded positions
100
+ # The special characters are the 4th (data_element_seperator, default *),
101
+ # 105th (component_data_element_seperator), and 106th (segment_seperator) characters
102
+ set_special_characters(
103
+ data_element_seperator: string[3],
104
+ component_data_element_seperator: string[104],
105
+ segment_seperator: string[105]
106
+ )
107
+ end
108
+ end
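The hardcoded offsets in the ANSIX12 branch depend on the ISA segment being fixed width. The sketch below shows why indices 3, 104, and 105 line up with the three separators. The element widths are an assumption taken from the standard X12 interchange header layout rather than from this diff, and `build_isa` and `x12_separators` are hypothetical helpers.

```ruby
# Standard fixed widths of ISA01..ISA16 (assumed from the X12 layout, not from this diff).
ISA_ELEMENT_WIDTHS = [2, 10, 2, 10, 2, 15, 2, 15, 6, 4, 1, 5, 9, 1, 1, 1].freeze

# Element values from the README sample, before padding.
SAMPLE_ISA_VALUES = ["00", "", "00", "", "01", "SENDER", "01", "RECEIVER",
                     "231014", "1200", "U", "00401", "000000001", "1", "P", ">"].freeze

# Pad every element to its fixed width and join: 3 (tag) + 16 separators
# + 86 (sum of widths) + 1 (terminator) = 106 characters.
def build_isa(values, element_separator: "*", segment_terminator: "~")
  padded = values.zip(ISA_ELEMENT_WIDTHS).map { |value, width| value.ljust(width) }
  (["ISA"] + padded).join(element_separator) + segment_terminator
end

# Read the separators from the same zero-based offsets the parser uses.
def x12_separators(isa_segment)
  {
    data_element_seperator: isa_segment[3],
    component_data_element_seperator: isa_segment[104],
    segment_seperator: isa_segment[105]
  }
end
```

The offsets only hold when the padding is intact. The ISA line as rendered in this diff appears to have its runs of spaces collapsed, so it is shorter than 106 characters and should not be used to check the positions.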
70
109
 
71
- # UNA segments look like this:
72
- #
73
- # UNA:+.? '
74
- #
75
- # UNA followed by 6 special characters which are, in order:
76
- # 1. Component data element separator
77
- # 2. Data element separator
78
- # 3. Decimal notation (must be . or ,)
79
- # 4. Release character (aka escape character)
80
- # 5. Reserved for future use, so always a space for now
81
- # 6. Segment terminator
82
- set_special_characters(string[3], string[4], string[5], string[6], string[8])
110
+ def set_special_characters(args = {})
111
+ # arg keys will overwrite the defaults when present
112
+ @special_characters = EdifactRails::DEFAULT_SPECIAL_CHARACTERS.merge(args)
113
+
114
+ # ANSIX12 files have no escape character or decimal notation character
115
+ return unless @edi_format == EdifactRails::Formats::ANSIX12
116
+
117
+ @special_characters.delete(:escape_character)
118
+ @special_characters.delete(:decimal_notation)
83
119
  end
84
120
 
85
- def treat_input(string)
121
+ def handle_duplicate_escape_characters(string)
86
122
  # Prepare regex
87
- other_specials_rx = Regexp.quote(
123
+ other_specials_regex = Regexp.quote(
88
124
  [
89
- @segment_seperator,
90
- @data_element_seperator,
91
- @component_data_element_seperator
125
+ @special_characters[:segment_seperator],
126
+ @special_characters[:data_element_seperator],
127
+ @special_characters[:component_data_element_seperator]
92
128
  ].join
93
129
  )
94
130
 
@@ -96,7 +132,7 @@ module EdifactRails
96
132
  # the special character is therefore unescaped.
97
133
  # Add a space between these even number of escapes, and the special character
98
134
  #
99
- # This means the regex logic for #splitting on special characters is now consistent, since there will only ever
135
+ # This means the regex logic for splitting on special characters is now consistent, since there will only ever
100
136
  # be either 0 or 1 escape characters before every special character.
101
137
  #
102
138
  # We have to do this because we can't negative lookbehind for 'an even number of escape characters' since
@@ -110,20 +146,30 @@ module EdifactRails
110
146
  # "LIN+even????+123" => '+' is not escaped, gsub'ed => "even???? +123" => parsed => ['LIN', ['even??'], [123]]
111
147
  # "LIN+odd???+123" => '+' is escaped, not gsub'ed => "odd???+123" => parsed => ['LIN', ['odd?+123']]
112
148
  string.gsub(
113
- /(?<!#{Regexp.quote(@escape_character)})((#{Regexp.quote(@escape_character)}{2})+)([#{other_specials_rx}])/,
149
+ /(?<!#{Regexp.quote(@special_characters[:escape_character])})((#{Regexp.quote(@special_characters[:escape_character])}{2})+)([#{other_specials_regex}])/,
114
150
  '\1 \3'
115
151
  )
116
152
  end
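The even/odd escape handling can be tried in isolation. The sketch below reproduces the substitution and the escape-aware split using the EDIFACT default characters; `pad_even_escapes` and `split_unescaped` are hypothetical names, and the constants are hardcoded here instead of being read from `@special_characters`.

```ruby
ESCAPE = "?"
OTHER_SPECIALS = "'+:" # segment, data element, and component separators (EDIFACT defaults)

# Insert a space between an even run of escape characters and the special
# character that follows, so the special character is no longer directly
# preceded by an escape character.
def pad_even_escapes(string)
  escape = Regexp.quote(ESCAPE)
  string.gsub(/(?<!#{escape})((#{escape}{2})+)([#{Regexp.quote(OTHER_SPECIALS)}])/, '\1 \3')
end

# Split on a separator unless it is directly preceded by the escape character.
def split_unescaped(string, separator)
  string.split(/(?<!#{Regexp.quote(ESCAPE)})#{Regexp.quote(separator)}/)
end
```

With the two inputs from the comments above, `pad_even_escapes("LIN+even????+123")` gives `"LIN+even???? +123"`, which then splits into three data elements, while `"LIN+odd???+123"` is left unchanged and splits into two. Removing the padding space and the remaining escape characters is left to later parsing steps that this sketch does not cover.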
117
153
 
118
154
  # Split the segment into data elements, take the first as the tag, then parse the rest
119
155
  def parse_segment(segment)
156
+ segment.chomp("")
157
+ segment.gsub!(/^\s*(.*)\s*/, '\1')
158
+
120
159
  # If the input is a tradacoms file, the segment tag will be followed by '=' instead of '+'
121
160
  # 'QTY=1+A:B' instead of 'QTY+1+A:B'
122
161
  # Fortunately, this is easily handled by simply changing these "="s into "+"s before the split
123
- segment[3] = @data_element_seperator if @is_tradacoms && segment.length >= 4
162
+ if @edi_format == EdifactRails::Formats::TRADACOMS && segment.length >= 4
163
+ segment[3] = @special_characters[:data_element_seperator]
164
+ end
124
165
 
125
166
  # Segments are made up of data elements
126
- data_elements = segment.split(/(?<!#{Regexp.quote(@escape_character)})#{Regexp.quote(@data_element_seperator)}/)
167
+ data_elements =
168
+ if @edi_format == EdifactRails::Formats::ANSIX12
169
+ segment.split(@special_characters[:data_element_seperator])
170
+ else
171
+ segment.split(/(?<!#{Regexp.quote(@special_characters[:escape_character])})#{Regexp.quote(@special_characters[:data_element_seperator])}/)
172
+ end
127
173
 
128
174
  # The first element is the tag, pop it off
129
175
  parsed_segment = []
@@ -137,7 +183,11 @@ module EdifactRails
137
183
  def parse_data_element(element)
138
184
  # Split data element into components
139
185
  components =
140
- element.split(/(?<!#{Regexp.quote(@escape_character)})#{Regexp.quote(@component_data_element_seperator)}/)
186
+ if @edi_format == EdifactRails::Formats::ANSIX12
187
+ element.split(@special_characters[:component_data_element_seperator])
188
+ else
189
+ element.split(/(?<!#{Regexp.quote(@special_characters[:escape_character])})#{Regexp.quote(@special_characters[:component_data_element_seperator])}/)
190
+ end
141
191
 
142
192
  components.map { |component| treat_component(component) }
143
193
  end
@@ -149,15 +199,19 @@ module EdifactRails
149
199
 
150
200
  # Prepare regex
151
201
  all_special_characters_string = [
152
- @segment_seperator,
153
- @data_element_seperator,
154
- @component_data_element_seperator,
155
- @escape_character
202
+ @special_characters[:segment_seperator],
203
+ @special_characters[:data_element_seperator],
204
+ @special_characters[:component_data_element_seperator],
205
+ @special_characters[:escape_character]
156
206
  ].join
157
207
 
158
- # If the component has escaped characters in it, remove the escape character and return the character as is
159
- # "?+" -> "+", "??" -> "?"
160
- component.gsub!(/#{Regexp.quote(@escape_character)}([#{Regexp.quote(all_special_characters_string)}])/, '\1')
208
+ unless @edi_format == EdifactRails::Formats::ANSIX12
209
+ # If the component has escaped characters in it, remove the escape character and return the character as is
210
+ # "?+" -> "+", "??" -> "?"
211
+ component.gsub!(
212
+ /#{Regexp.quote(@special_characters[:escape_character])}([#{Regexp.quote(all_special_characters_string)}])/, '\1'
213
+ )
214
+ end
161
215
 
162
216
  # Convert empty strings to nils
163
217
  component = nil if component.empty?
data/lib/edifact_rails/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module EdifactRails
4
- VERSION = "1.2.1"
4
+ VERSION = "2.0.0"
5
5
  end
data/lib/edifact_rails.rb CHANGED
@@ -1,6 +1,8 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  require "edifact_rails/parser"
4
+ require "edifact_rails/formats"
5
+ require "edifact_rails/exceptions"
4
6
 
5
7
  module EdifactRails
6
8
  DEFAULT_SPECIAL_CHARACTERS = {
@@ -17,11 +19,11 @@ module EdifactRails
17
19
  end
18
20
 
19
21
  def self.parse_file(file_path)
20
- parse(File.read(file_path).split("\n").join)
22
+ parse(File.read(file_path))
21
23
  end
22
24
 
23
- def self.una_special_characters(string = '')
25
+ def self.special_characters(string = "")
24
26
  parser = EdifactRails::Parser.new
25
- parser.una_special_characters(string)
27
+ parser.special_characters(string)
26
28
  end
27
29
  end
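Because `#special_characters` now returns the parser's merged hash, the keys present in the result depend on the detected format. A minimal sketch of that merge is shown here. The default values are assumed from the `UNA:+.? '` example in the parser comments, and `resolve_special_characters` is a hypothetical stand-in for the private `#set_special_characters` method.

```ruby
# Assumed EDIFACT defaults, read off the "UNA:+.? '" example.
SKETCH_DEFAULTS = {
  component_data_element_seperator: ":",
  data_element_seperator: "+",
  decimal_notation: ".",
  escape_character: "?",
  segment_seperator: "'"
}.freeze

# Overrides win over defaults; ANSIX12 results drop the two keys that the
# format does not define.
def resolve_special_characters(format, overrides = {})
  characters = SKETCH_DEFAULTS.merge(overrides)
  return characters unless format == "ANSIX12"

  characters.reject { |key, _value| %i[escape_character decimal_notation].include?(key) }
end
```

Callers migrating from `#una_special_characters` may want to avoid assuming that `:escape_character` and `:decimal_notation` are always present, since an ANSIX12 input yields only the three separator keys.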
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: edifact_rails
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.2.1
4
+ version: 2.0.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - David Blackwood
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2024-06-04 00:00:00.000000000 Z
11
+ date: 2024-06-18 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: byebug
@@ -124,6 +124,8 @@ files:
124
124
  - LICENSE
125
125
  - README.md
126
126
  - lib/edifact_rails.rb
127
+ - lib/edifact_rails/exceptions.rb
128
+ - lib/edifact_rails/formats.rb
127
129
  - lib/edifact_rails/parser.rb
128
130
  - lib/edifact_rails/version.rb
129
131
  homepage: https://github.com/david-blackwood/edifact_rails