RubyGems - edifact_rails - Versions diffs - 1.2.1 → 2.0.0 - Mend

edifact_rails 1.2.1 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

checksums.yaml +4 -4
data/CHANGELOG.md +11 -1
data/README.md +72 -10
data/lib/edifact_rails/exceptions.rb +9 -0
data/lib/edifact_rails/formats.rb +9 -0
data/lib/edifact_rails/parser.rb +116 -62
data/lib/edifact_rails/version.rb +1 -1
data/lib/edifact_rails.rb +5 -3
metadata +4 -2

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 3c51652b41747f2b0c07ecd834c5c79b3bebbb1537464a450f37ee58ea68150e
-  data.tar.gz: 2669313b30c7565c60f4dc568c528127577770946f7d4a8ffee7a1fc87f4c07e
+  metadata.gz: ecc255d266d797eed361491b7643a6ca1a4e03205d214da5c447d4966d0cfb10
+  data.tar.gz: e7aa5f7730944a6b1f6b5adf9809b5d097ae2daf0bef00cd4f84c7433515b122
 SHA512:
-  metadata.gz: ded47109a99423254023e4f7316e1bdc3d1dc6602e73225a900c815baa2b2c4d9d14fe16f131aa5e1239b6b49ecab6394d7051ec9fc66923c6ffb5eb0e674ba5
-  data.tar.gz: f3c9d7ae8f793651c62eff11b3356321f6bcc531b5ce61020f23d530a08b0112d46e338edc8ff1711bc514fb53875508b5792795c3cbea8056e5ee97a60e52f1
+  metadata.gz: f0976e015521c0145fd8f4fcca85be2acd5206cf88a31a1bae6cc2d1b88068315930fadf37e1c189c04adab02720fe80b743b14e0818de33d9bbdba3b2d06d00
+  data.tar.gz: 8f234c0556584570683113037fd3289cad26615fe03e933de3c3618957d0b6142c0a787cefcafa4f913b89564daf7c87bec56d24b61706320c54ea6f3f0800b6

data/CHANGELOG.md CHANGED

@@ -20,4 +20,14 @@
 ## 1.2.1 (4/06/2024)
 * `#una_special_characters` method now also returns decimal notation character, default `.`.
-* `#una_special_characters` method can now take no arguments, and will return the default special characters if so.
+* `#una_special_characters` method can now take no arguments, and will return the default special characters if so.
+## 2.0.0 (18/06/2024)
+* Added support for ANSIX12 format.
+##### Breaking changes:
+* `#una_special_characters` renamed to `#special_characters` (since it can now accept input of any supported format)
+* A new `UnrecognizedFormat` error is now raised if the format of the input cannot be detected.
+    * In essence, input must now begin with `UNA` or `UNB` (EDIFACT), `STX` (TRADACOMS), or `ISA` (ANSIX12)

data/README.md CHANGED

@@ -1,10 +1,10 @@
 # EdifactRails
-This gem parses EDIFACT or TRADACOMS input, and converts it into a ruby array structure for whatever further processing or validation you desire.
+This gem parses EDIFACT, TRADACOMS, or ANSIX12 input, and converts it into a ruby array structure for whatever further processing or validation you desire.
 It does not handle validation itself.
-This gem is heavily inspired by and attempts to output similar results as [edifact_parser](https://github.com/pvdvreede/edifact_parser)
+This gem is heavily inspired by [edifact_parser](https://github.com/pvdvreede/edifact_parser)
 ## Requirements
@@ -20,7 +20,7 @@ This gem has been tested on the following ruby versions:
 In your `Gemfile`:
 ```ruby
-gem 'edifact_rails', '~> 1.2'
+gem 'edifact_rails', '~> 2.0.0'
 ```
 Otherwise:
@@ -37,20 +37,20 @@ If you don't have the gem in your `Gemfile`, you will need to:
 require 'edifact_rails'
 ```
-You can pass either the path to your EDIFACT (or TRADACOMS) file, or a document (or snippet) as a string:
+You can parse a string input with `#parse`, or a file with `#parse_file`.
 ```ruby
-ruby_array = EdifactRails.parse_file("your/file/path")
+ruby_array = EdifactRails.parse("UNB+UNOA:3+TESTPLACE:1+DEP1:1+20051107:1159+6002'")
 ```
 ```ruby
-ruby_array = EdifactRails.parse("LIN+1+1+0764569104:IB'QTY+1:25'")
+ruby_array = EdifactRails.parse_file("your/file/path")
 ```
-You can pull just the special characters from the UNA segment (or the defaults if no UNA segment is present):
+You can return the special characters of your input with `#special_characters`.
 ```ruby
-una_special_characters = EdifactRails.una_special_characters(your_string_input)
-# una_special_characters =>
+special_characters = EdifactRails.special_characters(example_edifact_input)
+# special_characters =>
 {
   component_data_element_seperator: ":",
   data_element_seperator: "+",
@@ -179,4 +179,66 @@ Will be returned as:
   ['MTR', [3]],
   ['END', [5]]
 ]
-```
+```
+### ANSIX12
+This ANSIX12 file:
+```
+ISA*00*          *00*          *01*SENDER         *01*RECEIVER       *231014*1200*U*00401*000000001*1*P*>~
+GS*SS*APP SENDER*APP RECEIVER*20231014*1200*0001*X*004010~
+ST*862*0001~
+BSS*05*12345*20230414*DL*20231014*20231203****ORDER1*A~
+N1*MI*SEEBURGER AG*ZZ*00000085~
+N3*EDISONSTRASSE 1~
+N4*BRETTEN**75015*DE~
+N1*SU*SUPLIER NAME*ZZ*11222333~
+N3*203 STREET NAME~
+N4*ATLANTA*GA*30309*US~
+LIN**BP*MATERIAL1*EC*ENGINEERING1*DR*001~
+UIT*EA~
+PER*SC*SEEBURGER INFO*TE*+49(7525)0~
+FST*13*C*D*20231029****DO*12345-1~
+FST*77*C*D*20231119****DO*12345-2~
+FST*68*C*D*20231203****DO*12345-3~
+SHP*01*927*011*20231014~
+REF*SI*Q5880~
+SHP*02*8557*011*20231014**20231203~
+CTT*1*5~
+SE*19*0001~
+GE*1*0001~
+IEA*1*000000001~
+```
+Will be returned as:
+```ruby
+[
+  ["ISA", ["00"], [nil], ["00"], [nil], ["01"], ["SENDER"], ["01"], ["RECEIVER"], [231014], [1200], ["U"], ["00401"], ["000000001"], [1], ["P"], []],
+  ["GS", ["SS"], ["APP SENDER"], ["APP RECEIVER"], [20231014], [1200], ["0001"], ["X"], ["004010"]],
+  ["ST", [862], ["0001"]],
+  ["BSS", ["05"], [12345], [20230414], ["DL"], [20231014], [20231203], [], [], [], ["ORDER1"], ["A"]],
+  ["N1", ["MI"], ["SEEBURGER AG"], ["ZZ"], ["00000085"]],
+  ["N3", ["EDISONSTRASSE 1"]],
+  ["N4", ["BRETTEN"], [], [75015], ["DE"]],
+  ["N1", ["SU"], ["SUPLIER NAME"], ["ZZ"], [11222333]],
+  ["N3", ["203 STREET NAME"]],
+  ["N4", ["ATLANTA"], ["GA"], [30309], ["US"]],
+  ["LIN", [], ["BP"], ["MATERIAL1"], ["EC"], ["ENGINEERING1"], ["DR"], ["001"]],
+  ["UIT", ["EA"]],
+  ["PER", ["SC"], ["SEEBURGER INFO"], ["TE"], ["+49(7525)0"]],
+  ["FST", [13], ["C"], ["D"], [20231029], [], [], [], ["DO"], ["12345-1"]],
+  ["FST", [77], ["C"], ["D"], [20231119], [], [], [], ["DO"], ["12345-2"]],
+  ["FST", [68], ["C"], ["D"], [20231203], [], [], [], ["DO"], ["12345-3"]],
+  ["SHP", ["01"], [927], ["011"], [20231014]],
+  ["REF", ["SI"], ["Q5880"]],
+  ["SHP", ["02"], [8557], ["011"], [20231014], [], [20231203]],
+  ["CTT", [1], [5]],
+  ["SE", [19], ["0001"]],
+  ["GE", [1], ["0001"]],
+  ["IEA", [1], ["000000001"]]
+]
+```
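
The ISA line in the README example above is fixed-width, which is what lets the parser read the ANSIX12 delimiters from hardcoded positions. The following is a minimal sketch of that idea, not the gem's code: `x12_delimiters` is a hypothetical helper, and the ISA segment is rebuilt with explicit padding so the character offsets are exact. The hash keys keep the gem's own spelling (`seperator`).

```ruby
# Rebuild the README's ISA segment from its 16 elements so that the
# fixed-width padding is guaranteed, whatever the page rendering did to spaces.
isa_elements = [
  "00", " " * 10,             # authorization qualifier and (blank) information
  "00", " " * 10,             # security qualifier and (blank) information
  "01", "SENDER".ljust(15),   # sender qualifier and ID, padded to 15 characters
  "01", "RECEIVER".ljust(15), # receiver qualifier and ID, padded to 15 characters
  "231014", "1200",           # interchange date and time
  "U", "00401", "000000001", "1", "P",
  ">"                         # last element: the component separator itself
]
isa = (["ISA"] + isa_elements).join("*") + "~"

# Hypothetical helper: read the three delimiters from fixed offsets,
# using the same indexes as the diff (3, 104 and 105).
def x12_delimiters(isa_segment)
  {
    data_element_seperator: isa_segment[3],
    component_data_element_seperator: isa_segment[104],
    segment_seperator: isa_segment[105]
  }
end

delimiters = x12_delimiters(isa)
puts isa.length          # 106
puts delimiters.inspect
```

Because the last ISA element is the component separator itself, splitting it on that separator leaves nothing behind, which is consistent with the trailing `[]` in the README's parsed ISA row.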

data/lib/edifact_rails/exceptions.rb ADDED

@@ -0,0 +1,9 @@
+# frozen_string_literal: true
+module EdifactRails
+  class UnrecognizedFormat < StandardError
+    def initialize
+      super("Unrecognized EDI format. Accepted formats: Edifact, Tradacoms, ANSIX12. File must begin with UNA, UNB, STX, or ISA.")
+    end
+  end
+end

data/lib/edifact_rails/formats.rb ADDED

@@ -0,0 +1,9 @@
+# frozen_string_literal: true
+module EdifactRails
+  class Formats
+    EDIFACT = "EDIFACT"
+    TRADACOMS = "TRADACOMS"
+    ANSIX12 = "ANSIX12"
+  end
+end
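
One practical consequence of the two new files: input that does not start with a recognized tag now raises instead of being parsed, so a bare snippet such as `LIN+1+1+0764569104:IB'` (used in the 1.x README) is rejected in 2.0.0. The sketch below illustrates the detection rule and one way a caller might handle the error. It uses stand-in classes under a made-up `EdiSketch` namespace so it runs without the gem; `detect_format_or_nil` is a hypothetical helper, not part of the gem's API.

```ruby
# Stand-ins mirroring the classes added in this release. With the gem loaded,
# the real names are EdifactRails::UnrecognizedFormat and EdifactRails::Formats.
module EdiSketch
  class UnrecognizedFormat < StandardError
    def initialize
      super("Unrecognized EDI format. File must begin with UNA, UNB, STX, or ISA.")
    end
  end

  FORMATS_BY_TAG = {
    "UNA" => "EDIFACT",
    "UNB" => "EDIFACT",
    "STX" => "TRADACOMS",
    "ISA" => "ANSIX12"
  }.freeze

  # Only the first three characters decide the format.
  def self.detect_format(string)
    FORMATS_BY_TAG.fetch(string[0..2]) { raise UnrecognizedFormat }
  end

  # Hypothetical caller-side wrapper: returns nil instead of raising.
  def self.detect_format_or_nil(string)
    detect_format(string)
  rescue UnrecognizedFormat
    nil
  end
end

puts EdiSketch.detect_format("UNB+UNOA:3+TESTPLACE:1'")           # EDIFACT
puts EdiSketch.detect_format_or_nil("LIN+1+1+0764569104:IB'").inspect # nil
```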

data/lib/edifact_rails/parser.rb CHANGED

@@ -10,20 +10,26 @@ module EdifactRails
     # Treat the input, split the input string into segments, parse those segments
     def parse(string)
-      # Trim newlines and excess spaces around those newlines
-      string = string.gsub(/\s*\n\s*/, "")
+      # Remove all carriage returns, and leading and trailing whitespace
+      string = string.delete("\r").gsub(/^\s*(.*)\s*$/, '\1')
-      # Check for UNA segment, update special characters if so
+      @edi_format = detect_edi_format(string)
+      # Detects special characters in the UNA segment (edifact) or ISA segment (ansix12),
+      # updates special characters if so
       detect_special_characters(string)
       # Does some funky regex manipulation to handle escaped special characters
-      string = treat_input(string)
+      # Ansix12 does not have escape characters, so we can skip
+      string = handle_duplicate_escape_characters(string) unless @edi_format == EdifactRails::Formats::ANSIX12
       # Split the input string into segments
-      segments = string.split(/(?<!#{Regexp.quote(@escape_character)})#{Regexp.quote(@segment_seperator)}/)
-      # Detect if the input is a tradacoms file
-      @is_tradacoms = segments.map { |s| s[3] }.uniq == ["="]
+      segments =
+        if @edi_format == EdifactRails::Formats::ANSIX12
+          string.split(@special_characters[:segment_seperator])
+        else
+          string.split(/(?<!#{Regexp.quote(@special_characters[:escape_character])})#{Regexp.quote(@special_characters[:segment_seperator])}/)
+        end
       # Drop the UNA segment, if present (we have already dealt with it in #detect_special_characters)
       segments.reject! { |s| s[0..2] == "UNA" }
@@ -33,62 +39,92 @@ module EdifactRails
     end
     # Given an input string, return the special characters as defined by the UNA segment
-    # If no UNA segment is present, returns the default special characters
-    def una_special_characters(string)
+    def special_characters(string = "")
+      # If no string is passed, return default edifact characters
+      return EdifactRails::DEFAULT_SPECIAL_CHARACTERS if string.empty?
+      string = string.delete("\r").gsub(/^\s*(.*)\s*$/, '\1')
+      @edi_format = detect_edi_format(string)
       detect_special_characters(string)
-      {
-        component_data_element_seperator: @component_data_element_seperator,
-        data_element_seperator: @data_element_seperator,
-        decimal_notation: @decimal_notation,
-        escape_character: @escape_character,
-        segment_seperator: @segment_seperator
-      }
+      @special_characters
     end
     private
-    def set_special_characters(
-      component_data_element_seperator =
-        EdifactRails::DEFAULT_SPECIAL_CHARACTERS[:component_data_element_seperator],
-      data_element_seperator = EdifactRails::DEFAULT_SPECIAL_CHARACTERS[:data_element_seperator],
-      decimal_notation = EdifactRails::DEFAULT_SPECIAL_CHARACTERS[:decimal_notation],
-      escape_character = EdifactRails::DEFAULT_SPECIAL_CHARACTERS[:escape_character],
-      segment_seperator = EdifactRails::DEFAULT_SPECIAL_CHARACTERS[:segment_seperator]
-    )
-      # Set the special characters
-      @component_data_element_seperator = component_data_element_seperator
-      @data_element_seperator = data_element_seperator
-      @decimal_notation = decimal_notation
-      @escape_character = escape_character
-      @segment_seperator = segment_seperator
+    def detect_edi_format(string)
+      case string[0..2]
+      when "UNA", "UNB"
+        EdifactRails::Formats::EDIFACT
+      when "STX"
+        EdifactRails::Formats::TRADACOMS
+      when "ISA"
+        EdifactRails::Formats::ANSIX12
+      else
+        raise EdifactRails::UnrecognizedFormat
+      end
     end
     def detect_special_characters(string)
-      # UNA tags must be at the start of the input otherwise they are ignored
-      return unless string[0..2] == "UNA"
+      # Format must be EDIFACT or ANSI X12 to set custom characters
+      # Tradacoms uses the defaults
+      return unless [EdifactRails::Formats::EDIFACT, EdifactRails::Formats::ANSIX12].include?(@edi_format)
+      # If EDIFACT, UNA tags are optional, so return if it's not present
+      return if @edi_format == EdifactRails::Formats::EDIFACT && string[0..2] != "UNA"
+      case @edi_format
+      when EdifactRails::Formats::EDIFACT
+        # UNA segments look like this:
+        #
+        # UNA:+.? '
+        #
+        # UNA followed by 6 special characters which are, in order:
+        # 1. Component data element separator
+        # 2. Data element separator
+        # 3. Decimal notation (must be . or ,)
+        # 4. Release character (aka escape character)
+        # 5. Reserved for future use, so always a space for now
+        # 6. Segment terminator
+        set_special_characters(
+          component_data_element_seperator: string[3],
+          data_element_seperator: string[4],
+          decimal_notation: string[5],
+          escape_character: string[6],
+          segment_seperator: string[8]
+        )
+      when EdifactRails::Formats::ANSIX12
+        # ISA segments look like this:
+        # ISA*00*          *00*          *01*SENDER         *01*RECEIVER       *231014*1200*U*00401*000000001*1*P*>~
+        # These are designed to always be the same number of characters, so we can use the hardcoded positions
+        # The special characters are the 4th (data_element_seperator, default *),
+        # 105th (component_data_element_seperator), and 106th (segment_seperator) characters
+        set_special_characters(
+          data_element_seperator: string[3],
+          component_data_element_seperator: string[104],
+          segment_seperator: string[105]
+        )
+      end
+    end
-      # UNA segments look like this:
-      #
-      # UNA:+.? '
-      #
-      # UNA followed by 6 special characters which are, in order:
-      # 1. Component data element separator
-      # 2. Data element separator
-      # 3. Decimal notation (must be . or ,)
-      # 4. Release character (aka escape character)
-      # 5. Reserved for future use, so always a space for now
-      # 6. Segment terminator
-      set_special_characters(string[3], string[4], string[5], string[6], string[8])
+    def set_special_characters(args = {})
+      # arg keys will overwrite the defaults when present
+      @special_characters = EdifactRails::DEFAULT_SPECIAL_CHARACTERS.merge(args)
+      # ANSIX12 files have no escape character or decimal notation character
+      return unless @edi_format == EdifactRails::Formats::ANSIX12
+      @special_characters.delete(:escape_character)
+      @special_characters.delete(:decimal_notation)
     end
-    def treat_input(string)
+    def handle_duplicate_escape_characters(string)
       # Prepare regex
-      other_specials_rx = Regexp.quote(
+      other_specials_regex = Regexp.quote(
         [
-          @segment_seperator,
-          @data_element_seperator,
-          @component_data_element_seperator
+          @special_characters[:segment_seperator],
+          @special_characters[:data_element_seperator],
+          @special_characters[:component_data_element_seperator]
         ].join
       )
@@ -96,7 +132,7 @@ module EdifactRails
       # the special character is therefore unescaped.
       # Add a space between these even number of escapes, and the special character
       #
-      # This means the regex logic for #splitting on special characters is now consistent, since there will only ever
+      # This means the regex logic for splitting on special characters is now consistent, since there will only ever
       # be either 0 or 1 escape characters before every special character.
       #
       # We have to do this because we can't negative lookbehind for 'an even number of escape characters' since
@@ -110,20 +146,30 @@ module EdifactRails
       # "LIN+even????+123" => '+' is not escaped, gsub'ed => "even???? +123" => parsed => ['LIN', ['even??'], [123]]
       # "LIN+odd???+123" => '+' is escaped, not gsub'ed => "odd???+123" => parsed => ['LIN', ['odd?+123']]
       string.gsub(
-        /(?<!#{Regexp.quote(@escape_character)})((#{Regexp.quote(@escape_character)}{2})+)([#{other_specials_rx}])/,
+        /(?<!#{Regexp.quote(@special_characters[:escape_character])})((#{Regexp.quote(@special_characters[:escape_character])}{2})+)([#{other_specials_regex}])/,
         '\1 \3'
       )
     end
     # Split the segment into data elements, take the first as the tag, then parse the rest
     def parse_segment(segment)
+      segment.chomp("")
+      segment.gsub!(/^\s*(.*)\s*/, '\1')
       # If the input is a tradacoms file, the segment tag will be followed by '=' instead of '+'
       # 'QTY=1+A:B' instead of 'QTY+1+A:B'
       # Fortunately, this is easily handled by simply changing these "="s into "+"s before the split
-      segment[3] = @data_element_seperator if @is_tradacoms && segment.length >= 4
+      if @edi_format == EdifactRails::Formats::TRADACOMS && segment.length >= 4
+        segment[3] = @special_characters[:data_element_seperator]
+      end
       # Segments are made up of data elements
-      data_elements = segment.split(/(?<!#{Regexp.quote(@escape_character)})#{Regexp.quote(@data_element_seperator)}/)
+      data_elements =
+        if @edi_format == EdifactRails::Formats::ANSIX12
+          segment.split(@special_characters[:data_element_seperator])
+        else
+          segment.split(/(?<!#{Regexp.quote(@special_characters[:escape_character])})#{Regexp.quote(@special_characters[:data_element_seperator])}/)
+        end
       # The first element is the tag, pop it off
       parsed_segment = []
@@ -137,7 +183,11 @@ module EdifactRails
     def parse_data_element(element)
       # Split data element into components
       components =
-        element.split(/(?<!#{Regexp.quote(@escape_character)})#{Regexp.quote(@component_data_element_seperator)}/)
+        if @edi_format == EdifactRails::Formats::ANSIX12
+          element.split(@special_characters[:component_data_element_seperator])
+        else
+          element.split(/(?<!#{Regexp.quote(@special_characters[:escape_character])})#{Regexp.quote(@special_characters[:component_data_element_seperator])}/)
+        end
       components.map { |component| treat_component(component) }
     end
@@ -149,15 +199,19 @@ module EdifactRails
       # Prepare regex
       all_special_characters_string = [
-        @segment_seperator,
-        @data_element_seperator,
-        @component_data_element_seperator,
-        @escape_character
+        @special_characters[:segment_seperator],
+        @special_characters[:data_element_seperator],
+        @special_characters[:component_data_element_seperator],
+        @special_characters[:escape_character]
       ].join
-      # If the component has escaped characters in it, remove the escape character and return the character as is
-      # "?+" -> "+", "??" -> "?"
-      component.gsub!(/#{Regexp.quote(@escape_character)}([#{Regexp.quote(all_special_characters_string)}])/, '\1')
+      unless @edi_format == EdifactRails::Formats::ANSIX12
+        # If the component has escaped characters in it, remove the escape character and return the character as is
+        # "?+" -> "+", "??" -> "?"
+        component.gsub!(
+          /#{Regexp.quote(@special_characters[:escape_character])}([#{Regexp.quote(all_special_characters_string)}])/, '\1'
+        )
+      end
       # Convert empty strings to nils
       component = nil if component.empty?
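
The two-step escape handling used for EDIFACT and TRADACOMS (pad even runs of escape characters, then split with a negative lookbehind) can be tried in isolation. This is a simplified sketch rather than the gem's implementation: `pad_even_escapes` and `split_elements` are hypothetical names, only the data element separator is handled, and the later steps (stripping the padding space and unescaping) are left out.

```ruby
# Step 1: an even run of escape characters before a separator means the
# separator itself is NOT escaped. Insert a space between the run and the
# separator so that at most one escape can sit directly before a separator.
def pad_even_escapes(string, escape: "?", separator: "+")
  esc = Regexp.quote(escape)
  sep = Regexp.quote(separator)
  string.gsub(/(?<!#{esc})((#{esc}{2})+)([#{sep}])/, '\1 \3')
end

# Step 2: split on separators that are not directly preceded by an escape.
def split_elements(string, escape: "?", separator: "+")
  string.split(/(?<!#{Regexp.quote(escape)})#{Regexp.quote(separator)}/)
end

even = split_elements(pad_even_escapes("LIN+even????+123"))
odd  = split_elements(pad_even_escapes("LIN+odd???+123"))

puts even.inspect # ["LIN", "even???? ", "123"]
puts odd.inspect  # ["LIN", "odd???+123"]
```

The even case still carries the padding space and the doubled escapes at this point. In the parser those are cleaned up later, when each component is treated.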

data/lib/edifact_rails/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module EdifactRails
-  VERSION = "1.2.1"
+  VERSION = "2.0.0"
 end

data/lib/edifact_rails.rb CHANGED

@@ -1,6 +1,8 @@
 # frozen_string_literal: true
 require "edifact_rails/parser"
+require "edifact_rails/formats"
+require "edifact_rails/exceptions"
 module EdifactRails
   DEFAULT_SPECIAL_CHARACTERS = {
@@ -17,11 +19,11 @@ module EdifactRails
   end
   def self.parse_file(file_path)
-    parse(File.read(file_path).split("\n").join)
+    parse(File.read(file_path))
   end
-  def self.una_special_characters(string = '')
+  def self.special_characters(string = "")
     parser = EdifactRails::Parser.new
-    parser.una_special_characters(string)
+    parser.special_characters(string)
   end
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: edifact_rails
 version: !ruby/object:Gem::Version
-  version: 1.2.1
+  version: 2.0.0
 platform: ruby
 authors:
 - David Blackwood
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2024-06-04 00:00:00.000000000 Z
+date: 2024-06-18 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: byebug
@@ -124,6 +124,8 @@ files:
 - LICENSE
 - README.md
 - lib/edifact_rails.rb
+- lib/edifact_rails/exceptions.rb
+- lib/edifact_rails/formats.rb
 - lib/edifact_rails/parser.rb
 - lib/edifact_rails/version.rb
 homepage: https://github.com/david-blackwood/edifact_rails