srl_ruby 0.1.1 → 0.2.0

Files changed (36)

checksums.yaml +4 -4
data/CHANGELOG.md +12 -0
data/LICENSE.txt +6 -1
data/README.md +4 -1
data/lib/srl_ruby.rb +5 -4
data/lib/srl_ruby/ast_builder.rb +2 -1
data/lib/srl_ruby/grammar.rb +1 -3
data/lib/srl_ruby/tokenizer.rb +1 -2
data/lib/srl_ruby/version.rb +1 -1
data/spec/acceptance/srl_test_suite_spec.rb +57 -0
data/spec/acceptance/support/rule_file_ast_builder.rb +99 -0
data/spec/acceptance/support/rule_file_grammar.rb +41 -0
data/spec/acceptance/support/rule_file_nodes.rb +49 -0
data/spec/acceptance/support/rule_file_parser.rb +46 -0
data/spec/acceptance/support/rule_file_token.rb +22 -0
data/spec/acceptance/support/rule_file_tokenizer.rb +154 -0
data/spec/integration_spec.rb +1 -1
data/srl_ruby.gemspec +3 -2
data/srl_test/README.md +12 -0
data/srl_test/Test-Rules/README.md +56 -0
data/srl_test/Test-Rules/backslash.rule +5 -0
data/srl_test/Test-Rules/basename_capture_group.rule +7 -0
data/srl_test/Test-Rules/issue_17_uppercase_letter.rule +6 -0
data/srl_test/Test-Rules/literally_spaces.rule +4 -0
data/srl_test/Test-Rules/no_word.rule +4 -0
data/srl_test/Test-Rules/nondigit.rule +8 -0
data/srl_test/Test-Rules/none_of.rule +6 -0
data/srl_test/Test-Rules/sample_capture.rule +10 -0
data/srl_test/Test-Rules/tab.rule +3 -0
data/srl_test/Test-Rules/website_example_email.rule +9 -0
data/srl_test/Test-Rules/website_example_email_capture.rule +11 -0
data/srl_test/Test-Rules/website_example_lookahead.rule +6 -0
data/srl_test/Test-Rules/website_example_password.rule +11 -0
data/srl_test/Test-Rules/website_example_url.rule +38 -0
data/srl_test/Test-Rules/word.rule +3 -0
metadata +29 -4

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: c6ca8e70f473d53992bd449ed2945cc55ebedcff
-  data.tar.gz: 065406740aded7b357712fa67daa4f622627a79c
+  metadata.gz: 6aad01259ac0746c8e49a856822c5e8c53aaed52
+  data.tar.gz: b814be25539c6304eab9471843c2eaa92abbbfc7
 SHA512:
-  metadata.gz: 174e237267f85e4d0e4d9e714ded189c9a1d46635b8668af469d23c0b5a3709f53e48087486cd5fb8f60861e02de19d44c694224581a9d1071718272574898c6
-  data.tar.gz: 92118db7249a6f66dc88ae66e2ca229a6800b91ba7ac53e692bc452ef87f311a3bb4bbc8fe401f0b2c015d960ecd29015465d9908d633b1425ef08aa1575a77b
+  metadata.gz: 3f93e5d7277bafd3c4ca6449855d6a423eae6b6912d4e5543f7f2042026d6f1b56a970c3241a5152ae184b3e78a72e68ab437cdeb1e37cfc694f96c68941d4ad
+  data.tar.gz: f83cc31f5c5ccaeae1c14a0997d75da1eb4eed6ec3f306f822477de72c074383636d060530eb08f8176276a37870a4767a0635edf22035ed281987c88a9ff085

data/CHANGELOG.md CHANGED

@@ -6,6 +6,18 @@
 ### Fixed
 ### Security
+## [0.2.0] - 2018-03-14
+### Added
+- Added `spec/acceptance/support` directory. It contains a test harness that uses the .rule files from the standard SRL test suite.
+- Added `acceptance/srl_test_suite_spec.rb` file. Spec file designed to run the standard SRL test suite. At this date, SrlRuby passes 3 tests out of 15 tests in total.
+### Changed
+- API Change. Method SrlRuby#parse returns a Regexp instance (previously it was a String)
+- API Change. Method SrlRuby#load_file returns a Regexp instance (previously it was a String)
+### Fixed
+- SRL 'backslash' now produces 4 consecutive backslashes (required by the conversion into Regexp)
 ## [0.1.1] - 2018-03-10
 ### Changed
 - Parse error location is now given in line number, column number position.
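
Note on the API change listed above: code written against 0.1.x that wrapped the returned String in `Regexp.new` can use the 0.2.0 result directly, and the pattern text stays reachable through `Regexp#source`. A minimal sketch, assuming a hard-coded stand-in pattern instead of the gem itself (the helpers `parse_v01` and `parse_v02` are hypothetical and only mimic the two return types):

```ruby
# Hypothetical stand-ins for SrlRuby.parse before and after 0.2.0.
# The pattern is hard-coded; the real method derives it from SRL input.
def parse_v01(_srl)
  '[a-z]{2}' # 0.1.x behaviour: a String holding the regex text
end

def parse_v02(srl)
  Regexp.new(parse_v01(srl)) # 0.2.0 behaviour: a Regexp instance
end

old_result = parse_v01('letter twice')
new_result = parse_v02('letter twice')

# Callers of 0.1.x had to build the Regexp themselves.
legacy_regex = Regexp.new(old_result)

# With 0.2.0 the result can be matched against directly...
matched = new_result.match('ab')

# ...and the pattern text remains available through Regexp#source.
pattern_text = new_result.source
```

Callers that still need a String can therefore switch to `.source` rather than pinning the older version.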

data/LICENSE.txt CHANGED

@@ -1,6 +1,11 @@
+This license applies to all of srl_ruby except for the portions found under
+the 'srl_test' directory, which is subject to its own license.
+-----
 The MIT License (MIT)
-Copyright (c) 2018 TODO: Write your name
+Copyright (c) 2018 Dimitri Geshef
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

data/README.md CHANGED

@@ -69,7 +69,10 @@ And there is the equivalent regex found by `srl_ruby`:
 ## Usage
-The following snippet...
+The method `SrlRuby#parse` accepts a Simple Regex Language string as input, and returns the corresponding regular expression.
+For instance, the following snippet...
 ```ruby
 require 'srl_ruby' # Load srl_ruby library

data/lib/srl_ruby.rb CHANGED

@@ -5,9 +5,9 @@ require_relative './srl_ruby/ast_builder'
 module SrlRuby # This module is used as a namespace
   # Load the SRL expression contained in filename.
-  # Returns the literal regular expression representation
-  # as a Ruby String.
+  # Returns an equivalent Regexp object.
   # @param filename [String] file name to parse.
+  # @return [Regexp]
   def self.load_file(filename)
     source = nil
     File.open(filename, 'r') { |f| source = f.read }
@@ -16,8 +16,9 @@ module SrlRuby # This module is used as a namespace
     return parse(source)
   end
-  # Parse the SRL expression into its literal regexp and return it.
+  # Parse the SRL expression into its Regexp equivalent.
   # @param source [String] the SRL source to parse and convert.
+  # @return [Regexp]
   def self.parse(source)
     # Create a Rley facade object
     engine = Rley::Engine.new
@@ -41,6 +42,6 @@ module SrlRuby # This module is used as a namespace
     # Now output the regexp literal
     root = ast_ptree.root
-    return root.to_str
+    return Regexp.new(root.to_str)
   end
 end # module

data/lib/srl_ruby/ast_builder.rb CHANGED

@@ -262,7 +262,8 @@ module SrlRuby
     # rule('special_char' => 'BACKSLASH').as 'backslash'
     def reduce_backslash(_production, _range, _tokens, _children)
-      Regex::Character.new('\\')
+      # Double the backslash (because of escaping)
+      string_literal("\\", true)
     end
     # rule('special_char' => %w[NEW LINE]).as 'new_line'
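
Why the doubling matters once the output is fed to `Regexp.new`: a regex that matches one literal backslash needs a two-character source, and a lone backslash is rejected as an incomplete escape. The sketch below uses only core Ruby, not the gem's `string_literal` helper; the variable names are illustrative.

```ruby
# The regex source for "one literal backslash" is two characters long.
two_backslashes = '\\\\' # single-quoted literal: 4 typed, 2 stored
source_length = two_backslashes.length

# Anchor the pattern so that only a string made of one backslash matches.
regex = Regexp.new('\A' + two_backslashes + '\z')
single = '\\' # one backslash character
matches_single = !regex.match(single).nil?
matches_double = !regex.match(single * 2).nil?

# A lone backslash is not a valid regex source.
lone_is_invalid = begin
  Regexp.new(single)
  false
rescue RegexpError
  true
end
```

This is also why the updated integration spec compares against `'\\\\'`: in a single-quoted Ruby literal that spells the two-character source.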

data/lib/srl_ruby/grammar.rb CHANGED

@@ -2,9 +2,7 @@
 require 'rley' # Load the gem
 module SrlRuby
   ########################################
-  # Work in progress.
-  # This is a very partial grammar of SRL.
-  # It will be expanded with the coming versions of Rley
+  # SRL grammar
   builder = Rley::Syntax::GrammarBuilder.new do
     add_terminals('LPAREN', 'RPAREN', 'COMMA')
     add_terminals('DIGIT_LIT', 'INTEGER', 'LETTER_LIT')

data/lib/srl_ruby/tokenizer.rb CHANGED

@@ -16,7 +16,7 @@ module SrlRuby
     attr_reader(:scanner)
     attr_reader(:lineno)
     attr_reader(:line_start)
-    attr_reader(:column)
+    # attr_reader(:column)
     @@lexeme2name = {
       '(' => 'LPAREN',
@@ -174,6 +174,5 @@ module SrlRuby
     def tab_size()
       2
     end
   end # class
 end # module

data/lib/srl_ruby/version.rb CHANGED

@@ -1,3 +1,3 @@
 module SrlRuby
-  VERSION = '0.1.1'.freeze
+  VERSION = '0.2.0'.freeze
 end

data/spec/acceptance/srl_test_suite_spec.rb ADDED

@@ -0,0 +1,57 @@
+require_relative '../spec_helper'
+require_relative './support/rule_file_parser'
+require_relative '../../lib/srl_ruby'
+##############################
+# Understand how parser fails when first rule begins with %[...] instead of %w[...]
+##############################
+RSpec.describe Acceptance do
+  def rule_path
+    __FILE__.sub(/spec\/.+$/, 'srl_test/Test-Rules/')
+  end
+  def load_file(aFilename)
+    return Acceptance::RuleFileParser.load_file(rule_path + aFilename)
+  end
+  def test_rule_file(aRuleFileRepr)
+    regex = SrlRuby::parse(aRuleFileRepr.srl.value)
+    expect(regex).not_to be_nil
+    aRuleFileRepr.match_tests.each do |test|
+      expect(regex.match(test.test_string.value)).not_to be_nil
+    end
+    aRuleFileRepr.no_match_tests.each do |test|
+      expect(regex.match(test.test_string.value)).to be_nil
+    end
+    aRuleFileRepr.capture_tests.each do |test|
+      matching = regex.match(test.test_string.value)
+      expect(matching).not_to be_nil
+      test.expectations.each do |exp|
+        var = exp.var_name.value.to_s
+        captured = exp.captured_text.value
+        name_index = matching.names.index(var)
+        expect(name_index).not_to be_nil
+        expect(matching.captures[name_index]).to eq(captured)
+      end
+    end
+  end
+  it 'should match a backslash' do
+    puts __FILE__
+    rule_file_repr = load_file('backslash.rule')
+    test_rule_file(rule_file_repr)
+  end
+  it 'should not trim literal strings' do
+    rule_file_repr = load_file('literally_spaces.rule')
+    test_rule_file(rule_file_repr)
+  end
+  it 'should support lookahead' do
+    rule_file_repr = load_file('website_example_lookahead.rule')
+    test_rule_file(rule_file_repr)
+  end
+end
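
The capture check in `test_rule_file` looks up each expected variable through `MatchData#names` and then reads the value at the same index in `MatchData#captures`. A minimal sketch of that lookup in isolation, assuming a made-up pattern and expectation table (neither comes from the test suite):

```ruby
# Check named-capture expectations against a MatchData object.
# The pattern and the expectations below are made up for illustration.
regex = /(?<local>[a-z]+)@(?<domain>[a-z.]+)/
expectations = { 'local' => 'you', 'domain' => 'example.com' }

matching = regex.match('mail you@example.com today')

failures = []
# An explicit `each` is needed; a block given to a plain accessor is ignored.
expectations.each do |var_name, expected_text|
  name_index = matching.names.index(var_name)
  if name_index.nil?
    failures << "no capture group named #{var_name}"
  elsif matching.captures[name_index] != expected_text
    failures << "#{var_name}: got #{matching.captures[name_index]}"
  end
end
```

The index lookup works here because a Ruby regex with named groups exposes only those named groups in `captures`, in the same order as `names`.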

data/spec/acceptance/support/rule_file_ast_builder.rb ADDED

@@ -0,0 +1,99 @@
+require_relative 'rule_file_nodes'
+module Acceptance
+  # The purpose of an ASTBuilder is to build piece by piece an AST
+  # (Abstract Syntax Tree) from a sequence of input tokens and
+  # visit events produced by walking over a GFGParsing object.
+  # Uses the Builder GoF pattern.
+  # The Builder pattern creates a complex object
+  # (say, a parse tree) from simpler objects (terminal and non-terminal
+  # nodes) and using a step by step approach.
+  class RuleFileASTBuilder < Rley::ParseRep::ASTBaseBuilder
+    Terminal2NodeClass = {
+      # Lexical ambiguity: integer literal represents two very different concepts:
+      # An index or a capture variable name
+      'INTEGER' => IntegerNode,
+      'STRING_LIT' => StringLitNode,
+      'IDENTIFIER' => VarnameNode,
+      'SRL_SOURCE' => SRLSourceNode
+    }.freeze
+    attr_reader :options
+    protected
+    def terminal2node()
+      Terminal2NodeClass
+    end
+    # rule('rule_file' => %w[srl_heading srl_tests]).as 'start_rule'
+    def reduce_start_rule(_production, _range, _tokens, theChildren)
+      rule_file = RuleFileTests.new(theChildren[0])
+      tests = theChildren.last.flatten
+      tests.each do |t|
+        case t
+        when MatchTest then rule_file.match_tests << t
+        when NoMatchTest then rule_file.no_match_tests << t
+        when CaptureTest then rule_file.capture_tests << t
+        else
+          raise StandardError, 'Internal error'
+        end
+      end
+      return rule_file
+    end
+    # rule('srl_heading' => %w[SRL: SRL_SOURCE]).as 'srl_source'
+    def reduce_srl_source(_production, _range, _tokens, theChildren)
+      return theChildren.last
+    end
+    # rule('srl_tests' => %w[srl_tests single_test]).as 'test_list'
+    def reduce_test_list(_production, _range, _tokens, theChildren)
+      return theChildren[0] << theChildren[1]
+    end
+    # rule('srl_tests' => 'single_test').as 'one_test'
+    def reduce_one_test(_production, _range, _tokens, theChildren)
+      return [theChildren.last]
+    end
+    # rule('match_test' => %w[MATCH: STRING_LIT]).as 'match_string'
+    def reduce_match_string(_production, _range, _tokens, theChildren)
+      MatchTest.new(theChildren.last)
+    end
+    # rule('no_match_test' => %w[NO MATCH: STRING_LIT]).as 'no_match_string'
+    def reduce_no_match_string(_production, _range, _tokens, theChildren)
+      NoMatchTest.new(theChildren.last)
+    end
+    # rule('capture_test' => %w[capture_heading capture_expectations])
+    #  .as 'capture_test'
+    def reduce_capture_test(_production, _range, _tokens, theChildren)
+      CaptureTest.new(theChildren[0], theChildren.last)
+    end
+    # rule('capture_heading' => %w[CAPTURE FOR STRING_LIT COLON]).as 'capture_string'
+    def reduce_capture_string(_production, _range, _tokens, theChildren)
+      return theChildren[2]
+    end
+    # rule('capture_expectations' => %w[capture_expectations
+    #   single_expectation]).as 'assertion_list'
+    def reduce_assertion_list(_production, _range, _tokens, theChildren)
+      return theChildren[0] << theChildren[1]
+    end
+    # rule('capture_expectations' => 'single_expectation').as 'one_expectation'
+    def reduce_one_expectation(_production, _range, _tokens, theChildren)
+      return [theChildren.last]
+    end
+    # rule('single_expectation' => %w[DASH INTEGER COLON capture_variable
+    #   COLON STRING_LIT]).as 'capture_expectation'
+    def reduce_capture_expectation(_production, _range, _tokens, theChildren)
+      CaptureExpectation.new(theChildren[1], theChildren[3], theChildren[5])
+    end
+  end # class
+end # module

data/spec/acceptance/support/rule_file_grammar.rb ADDED

@@ -0,0 +1,41 @@
+# File: rule_file_grammar.rb
+require 'rley' # Load the Rley gem
+# Grammar for Test-Rule files
+# [File format](https://github.com/SimpleRegex/Test-Rules/blob/master/README.md)
+########################################
+# Define the grammar of a .rule file
+builder = Rley::Syntax::GrammarBuilder.new do
+  # Punctuation
+  add_terminals('COLON', 'DASH')
+  # Keywords
+  add_terminals('CAPTURE', 'FOR')
+  add_terminals('MATCH:', 'NO', 'SRL:')
+  # Literals
+  add_terminals('INTEGER', 'STRING_LIT')
+  add_terminals('IDENTIFIER', 'SRL_SOURCE')
+  rule('rule_file' => %w[srl_heading srl_tests]).as 'start_rule'
+  rule('srl_heading' => %w[SRL: SRL_SOURCE]).as 'srl_source'
+  rule('srl_tests' => %w[srl_tests single_test]).as 'test_list'
+  rule('srl_tests' => 'single_test').as 'one_test'
+  rule('single_test' => 'atomic_test').as 'single_atomic_test'
+  rule('single_test' => 'compound_test').as 'single_compound_test'
+  rule('atomic_test' => 'match_test').as 'atomic_match'
+  rule('atomic_test' => 'no_match_test').as 'atomic_no_match'
+  rule('compound_test' => 'capture_test').as 'compound_capture'
+  rule('match_test' => %w[MATCH: STRING_LIT]).as 'match_string'
+  rule('no_match_test' => %w[NO MATCH: STRING_LIT]).as 'no_match_string'
+  rule('capture_test' => %w[capture_heading capture_expectations]).as 'capture_test'
+  rule('capture_heading' => %w[CAPTURE FOR STRING_LIT COLON]).as 'capture_string'
+  rule('capture_expectations' => %w[capture_expectations single_expectation]).as 'assertion_list'
+  rule('capture_expectations' => 'single_expectation').as 'one_expectation'
+  rule('single_expectation' => %w[DASH INTEGER COLON capture_variable COLON STRING_LIT]).as 'capture_expectation'
+  rule('capture_variable' => 'INTEGER').as 'var_integer'
+  rule('capture_variable' => 'IDENTIFIER').as 'var_identifier'
+end
+# And now build the grammar...
+RuleFileGrammar = builder.grammar

data/spec/acceptance/support/rule_file_nodes.rb ADDED

@@ -0,0 +1,49 @@
+# Classes that implement nodes of Abstract Syntax Trees (AST) representing
+# rule file contents.
+module Acceptance
+  RuleFileTerminalNode = Struct.new(:value) do
+    def initialize(aToken, _position)
+      init_value(aToken.lexeme)
+    end
+  end
+  class IntegerNode < RuleFileTerminalNode
+    def init_value(aLiteral)
+      self.value = aLiteral.to_i
+    end
+  end
+  class StringLitNode < RuleFileTerminalNode
+    def init_value(aLiteral)
+      self.value = aLiteral.dup
+    end
+  end
+  class SRLSourceNode < RuleFileTerminalNode
+    def init_value(aLiteral)
+      self.value = aLiteral.dup
+    end
+  end
+  class VarnameNode < RuleFileTerminalNode
+    def init_value(aLiteral)
+      self.value = aLiteral.dup
+    end
+  end
+  RuleFileTests = Struct.new(:srl, :match_tests, :no_match_tests, :capture_tests) do
+    def initialize(aSRLExpression)
+      self.srl = aSRLExpression.dup
+      self.match_tests = []
+      self.no_match_tests = []
+      self.capture_tests = []
+    end
+  end
+  MatchTest = Struct.new(:test_string)
+  NoMatchTest = Struct.new(:test_string)
+  CaptureExpectation = Struct.new(:result_index, :var_name, :captured_text)
+  CaptureTest = Struct.new(:test_string, :expectations)
+end # module

data/spec/acceptance/support/rule_file_parser.rb ADDED

@@ -0,0 +1,46 @@
+require_relative 'rule_file_tokenizer'
+require_relative 'rule_file_grammar'
+require_relative 'rule_file_ast_builder'
+module Acceptance # This module is used as a namespace
+  module RuleFileParser
+    # Load the rule file
+    # Returns the test rule representation
+    # @param filename [String] file name to parse.
+    def self.load_file(filename)
+      source = nil
+      File.open(filename, 'r') { |f| source = f.read }
+      return source if source.nil? || source.empty?
+      return parse(source)
+    end
+    # Parse the rule file
+    # @param source [String] the rule file text to parse.
+    def self.parse(source)
+      # Create a Rley facade object
+      engine = Rley::Engine.new
+      # Step 1. Load the rule file grammar
+      engine.use_grammar(RuleFileGrammar)
+      lexer = RuleFileTokenizer.new(source)
+      result = engine.parse(lexer.tokens)
+      unless result.success?
+        # Stop if the parse failed...
+        line1 = "Parsing failed\n"
+        line2 = "Reason: #{result.failure_reason.message}"
+        raise StandardError, line1 + line2
+      end
+      # Generate an abstract syntax tree (AST) from the parse result
+      engine.configuration.repr_builder = RuleFileASTBuilder
+      ast_ptree = engine.convert(result)
+      # Now return the root node of the AST
+      root = ast_ptree.root
+      return root
+    end
+  end
+end # module

data/spec/acceptance/support/rule_file_token.rb ADDED

@@ -0,0 +1,22 @@
+require 'rley' # Load the Rley gem
+module Acceptance
+  Position = Struct.new(:line, :column) do
+    def to_s()
+      "line #{line}, column #{column}"
+    end
+  end
+  # Specialization of Token class.
+  # It stores the position in (line, column) of the token
+  class RuleFileToken < Rley::Lexical::Token
+    attr_reader(:position)
+    def initialize(theLexeme, aTerminal, aPosition)
+      super(theLexeme, aTerminal)
+      @position = aPosition
+    end
+  end # class
+end # module
+# End of file

data/spec/acceptance/support/rule_file_tokenizer.rb ADDED

@@ -0,0 +1,154 @@
+# File: rule_file_tokenizer.rb
+# Tokenizer for SimpleRegex Test-Rule files
+# [File format](https://github.com/SimpleRegex/Test-Rules/blob/master/README.md)
+require 'strscan'
+require 'pp'
+require_relative 'rule_file_token'
+module Acceptance
+  # The tokenizer should recognize:
+  # Keywords: capture, for, match:, no, srl:
+  # Integer literals including single digit
+  # String literals (double quote delimited)
+  # Identifiers (names of capture variables)
+  # SRL source text: the rest of the line after 'srl:'
+  # Punctuation: colon ':' and dash '-'
+  class RuleFileTokenizer
+    attr_reader(:scanner)
+    attr_reader(:lineno)
+    attr_reader(:line_start)
+    # Can be :default, :expecting_srl
+    attr_reader(:state)
+    @@lexeme2name = {
+      ':' => 'COLON',
+      '-' => 'DASH'
+    }.freeze
+    # Here are all the Rule file keywords
+    @@keywords = %w[
+      capture
+      for
+      match:
+      no
+      srl:
+    ].map { |x| [x, x.upcase] }.to_h
+    class ScanError < StandardError; end
+    def initialize(source)
+      @scanner = StringScanner.new(source)
+      @lineno = 1
+      @line_start = 0
+      @state = :default
+    end
+    def tokens()
+      tok_sequence = []
+      until @scanner.eos?
+        token = _next_token
+        tok_sequence << token unless token.nil?
+      end
+      return tok_sequence
+    end
+    private
+    def _next_token()
+      skip_noise
+      curr_ch = scanner.peek(1)
+      return nil if curr_ch.nil? || curr_ch.empty?
+      token = if state == :default
+        default_mode
+      else
+        expecting_srl
+      end
+      return token
+    end
+    def default_mode()
+      curr_ch = scanner.peek(1)
+      token = nil
+      if '-:'.include? curr_ch
+        # Delimiters, separators => single character token
+        token = build_token(@@lexeme2name[curr_ch], scanner.getch)
+      elsif (lexeme = scanner.scan(/[0-9]+/))
+        token = build_token('INTEGER', lexeme)
+      elsif (lexeme = scanner.scan(/srl:|match:/))
+        token = build_token(@@keywords[lexeme], lexeme)
+        @state = :expecting_srl if lexeme == 'srl:'
+      elsif (lexeme = scanner.scan(/[a-zA-Z_][a-zA-Z0-9_]*/))
+        keyw = @@keywords[lexeme]
+        token_type = keyw ? keyw : 'IDENTIFIER'
+        token = build_token(token_type, lexeme)
+      elsif (lexeme = scanner.scan(/"([^"]|\\")*"/)) # Double quotes literal?
+        unquoted = lexeme.gsub(/(^")|("$)/, '')
+        token = build_token('STRING_LIT', unquoted)
+      else # Unknown token
+        erroneous = curr_ch.nil? ? '' : curr_ch
+        sequel = scanner.scan(/.{1,20}/)
+        erroneous += sequel unless sequel.nil?
+        raise ScanError.new("Unknown token #{erroneous}")
+      end
+      return token
+    end
+    def expecting_srl()
+      scanner.skip(/^:/)
+      lexeme = scanner.scan(/[^\r\n]*/)
+      @state = :default
+      build_token('SRL_SOURCE', lexeme)
+    end
+    def build_token(aSymbolName, aLexeme)
+      begin
+        col = scanner.pos - aLexeme.size - @line_start + 1
+        pos = Position.new(@lineno, col)
+        token = RuleFileToken.new(aLexeme, aSymbolName, pos)
+      rescue StandardError => exc
+        puts "Failing with '#{aSymbolName}' and '#{aLexeme}'"
+        raise exc
+      end
+      return token
+    end
+    def skip_noise()
+      begin
+        noise_found = false
+        noise_found = true if skip_whitespaces
+        noise_found = true if skip_comment
+      end while noise_found
+    end
+    def skip_whitespaces()
+      pre_pos = scanner.pos
+      begin
+        ws_found = false
+        found = scanner.skip(/[ \t\f]+/)
+        ws_found = true if found
+        found = scanner.skip(/(?:\r\n)|\r|\n/)
+        if found
+          ws_found = true
+          @lineno += 1
+          @line_start = scanner.pos
+        end
+      end while ws_found
+      curr_pos = scanner.pos
+      return !(curr_pos == pre_pos)
+    end
+    def skip_comment()
+      scanner.skip(/#[^\n\r]+/)
+    end
+  end # class
+end # module
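
The tokenizer above is a two-state `StringScanner` loop: in the default state it emits keywords, literals and punctuation, and right after `srl:` it takes the rest of the line as one `SRL_SOURCE` token. A simplified sketch of that idea follows; `scan_rule_text` is a hypothetical helper, it keeps the token names from the diff but drops position tracking, integers, dashes and colons.

```ruby
require 'strscan'

# Minimal two-state scanner for lines such as `srl: ...` and `match: "..."`.
# Token names mirror the diff; the method itself is a simplified sketch.
def scan_rule_text(source)
  scanner = StringScanner.new(source)
  tokens = []
  state = :default
  until scanner.eos?
    if state == :expecting_srl
      # Everything up to the end of the line is the SRL expression.
      tokens << ['SRL_SOURCE', scanner.scan(/[^\r\n]*/).strip]
      state = :default
    elsif scanner.skip(/\s+/) || scanner.skip(/#[^\r\n]*/)
      next # whitespace and comments carry no tokens
    elsif (lexeme = scanner.scan(/srl:|match:/))
      tokens << [lexeme.upcase, lexeme]
      state = :expecting_srl if lexeme == 'srl:'
    elsif (lexeme = scanner.scan(/"[^"]*"/))
      tokens << ['STRING_LIT', lexeme[1..-2]]
    elsif (lexeme = scanner.scan(/[a-z]+/))
      tokens << [lexeme.upcase, lexeme]
    else
      raise ArgumentError, "Unknown token at offset #{scanner.pos}"
    end
  end
  tokens
end

text = %(# demo\nsrl: literally " foo "\nmatch: " foo "\nno match: "foo"\n)
tokens = scan_rule_text(text)
```

Switching state on `srl:` keeps the SRL expression opaque to the rule file tokenizer, so quotes and keywords inside the query are never split into rule file tokens.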

data/spec/integration_spec.rb CHANGED

@@ -173,7 +173,7 @@ module SrlRuby
         expect(result).to be_success
         regexp = regexp_repr(result)
-        expect(regexp.to_str).to eq('\\')
+        expect(regexp.to_str).to eq('\\\\')
       end
       it "should parse 'new line' syntax" do

data/srl_ruby.gemspec CHANGED

@@ -18,7 +18,8 @@ module PkgExtending
       'srl_ruby.gemspec',
       'lib/*.*',
       'lib/**/*.rb',
-      'spec/**/*.rb'
+      'spec/**/*.rb',
+      'srl_test/**/*.*'
     ]
     aPackage.files = file_list
     aPackage.test_files = Dir['spec/**/*_spec.rb']
@@ -54,7 +55,7 @@ END_DESCR
   spec.required_ruby_version = '>= 2.1.0'
   # Runtime dependencies
-  spec.add_dependency 'rley', '~> 0.6.03'
+  spec.add_dependency 'rley', '~> 0.6.04'
   # Development dependencies
   spec.add_development_dependency 'bundler', '~> 1.16'

data/srl_test/README.md ADDED

@@ -0,0 +1,12 @@
+The files found under this directory are extracted directly from the
+[Test-Rules](https://github.com/SimpleRegex/Test-Rules) test harness
+used for verifying the SRL implementation, the most recent version
+of which is available at https://github.com/SimpleRegex/Test-Rules.
+With the exception of this README.md, all of the files are Copyright (c)
+2016-2018 Karim Geigier and released under the MIT license. Please see
+Test-Rules/License text file for details.
+Directory contents:
+README.md   -- this file
+Test-Rules/ -- files extracted directly from the SimpleRegex project.

data/srl_test/Test-Rules/README.md ADDED

@@ -0,0 +1,56 @@
+# Test Rules
+Test rules are made to verify that your implementation of SRL is valid.
+These files contain simple tests to validate the SRL and the
+corresponding results. The structure is easy to understand and implement.
+## Structure of a .rule File
+These rules are required to build valid test rules:
+* All files used for testing must end with the extension `.rule` and at
+least contain one valid assertion along with the SRL query.
+* The query is defined through `srl: ` on the beginning of a line.
+* All strings that should match are defined through `match: ` on the
+beginning of a line.
+  * There can be unlimited `match: ` lines per rule.
+  * Each match must be surrounded by `"`.
+* All strings that should **not** match are defined through `no match: `
+on the beginning of a line.
+  * There can be unlimited `no match: ` lines per rule.
+  * Each match must be surrounded by `"`.
+* If a capture group is defined, its result can be defined as follows:
+  * The line must begin with `capture for `.
+  * Surrounded by `"`, the test string to match must be provided, followed by a `: `.
+  * If a named group is desired, use the following syntax: `name: "result"`
+  * If a anonymous group is desired, just supply `"result"`.
+  * Separate multiple captures using `, `.
+  * If one expression returns multiple matches, supply the same test string in the second line.
+* The query as well as the expectations must not exceed one line.
+If required, new lines can be forced using `\n`. Tabs using `\t`.
+* Comments must be on a separate line and start with a `#`.
+## Example .rule Files
+```
+# This is a sample rule with a named capture group
+srl: capture (letter twice) as "foo"
+capture for "aa1":
+- 0: foo: "aa"
+match: "example"
+match: "aa2"
+no match: "a"
+```
+```
+# This is a sample rule with an anonymous capture group and multiple results
+srl: capture (digit)
+capture for "123":
+- 0: 0: "1"
+- 1: 0: "2"
+- 2: 0: "3"
+capture for "01":
+- 0: 0: "0"
+- 1: 0: "1"
+```
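
Because the format is line-oriented, the query and the plain `match:` / `no match:` assertions can be read with a few anchored patterns. The sketch below is a simplified alternative to the grammar-based parser added in this release, not a copy of it; `read_rule` is a hypothetical helper and it ignores `capture for` blocks.

```ruby
# Split the text of a .rule file into its query and its match assertions.
# Capture blocks are ignored in this sketch.
def read_rule(text)
  rule = { srl: nil, match: [], no_match: [] }
  text.each_line do |raw|
    line = raw.chomp
    next if line.empty? || line.start_with?('#')

    case line
    when /\Asrl: (.*)\z/ then rule[:srl] = Regexp.last_match(1)
    when /\Amatch: "(.*)"\z/ then rule[:match] << Regexp.last_match(1)
    when /\Ano match: "(.*)"\z/ then rule[:no_match] << Regexp.last_match(1)
    end
  end
  rule
end

sample = <<~RULE
  # Make sure literal strings are not trimmed.
  srl: literally " foo "
  match: " foo "
  no match: "foo"
RULE

rule = read_rule(sample)
```

Anchoring each pattern at both ends keeps the surrounding quotes out of the captured test string while preserving any spaces inside them.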

data/srl_test/Test-Rules/backslash.rule ADDED

@@ -0,0 +1,5 @@
+# Match a backslash.
+srl: begin with backslash must end
+match: "\"
+no match: "\a"
+no match: "a\"

data/srl_test/Test-Rules/basename_capture_group.rule ADDED

@@ -0,0 +1,7 @@
+# Make sure php internal functions aren't executed while using capture groups
+srl: begin with capture (letter twice) as "basename", must end
+capture for "aa":
+- 0: basename: "aa"
+no match: "a1"
+no match: "aaa"
+match: "bb"

data/srl_test/Test-Rules/issue_17_uppercase_letter.rule ADDED

@@ -0,0 +1,6 @@
+srl: begin with (digit once), any of (letter, digit, uppercase letter) once or more, must end
+match: "1a"
+match: "56"
+match: "8B"
+no match: "abc"
+no match: "1"

data/srl_test/Test-Rules/literally_spaces.rule ADDED

@@ -0,0 +1,4 @@
+# Make sure literal strings are not trimmed.
+srl: literally " foo "
+match: " foo "
+no match: "foo"

data/srl_test/Test-Rules/no_word.rule ADDED

@@ -0,0 +1,4 @@
+srl: begin with (no word), anything, must end
+match: "#"
+no match: "19"
+no match: "#123   "

data/srl_test/Test-Rules/nondigit.rule ADDED

@@ -0,0 +1,8 @@
+srl: begin with (no digit), any of (letter, digit, uppercase letter) once or more, must end
+match: "#22a"
+match: "abc"
+match: "_A2"
+no match: "1"
+no match: "1a"
+no match: "56"
+no match: "8B"

data/srl_test/Test-Rules/none_of.rule ADDED

@@ -0,0 +1,6 @@
+srl: begin with none of abcd, any of (letter, digit, uppercase letter) once or more, must end
+match: "Ad22"
+match: ">el"
+no match: "bring"
+no match: "cat"
+no match: "d20"

data/srl_test/Test-Rules/sample_capture.rule ADDED

@@ -0,0 +1,10 @@
+# This is a sample rule with an anonymous capture group and multiple results
+srl: capture (digit)
+capture for "123":
+- 0: 0: "1"
+- 1: 0: "2"
+- 2: 0: "3"
+capture for "01":
+- 0: 0: "0"
+- 1: 0: "1"

data/srl_test/Test-Rules/tab.rule ADDED

@@ -0,0 +1,3 @@
+srl: begin with letter exactly 3 times, literally ':', tab, digit at least 1 times, must end
+match: "abc:	90"
+no match: "xyz:  5"

data/srl_test/Test-Rules/website_example_email.rule ADDED

@@ -0,0 +1,9 @@
+srl: begin with any of (digit, letter, one of "._%+-") once or more, literally "@", any of (digit, letter, one of ".-") once or more, literally ".", letter at least 2 times, must end, case insensitive
+match: "you@example.com"
+match: "you@example.email"
+match: "me@foo.bar.email"
+no match: "you@example.c"
+no match: "you@example"
+no match: "you@.com"
+no match: "@example.com"
+no match: "example.com"

data/srl_test/Test-Rules/website_example_email_capture.rule ADDED

@@ -0,0 +1,11 @@
+srl: capture (any of (digit, letter, one of "._%+-") once or more) as "local", literally "@", capture (any of (digit, letter, one of ".-") once or more, literally ".", letter at least 2 times ) as "domain", case insensitive
+match: "you@example.email, me@you.com"
+no match: "you@example.c"
+no match: "just some text"
+no match: "example.com"
+capture for "Message me at you@example.com. Business email: business@awesome.email":
+- 0: local: "you"
+- 0: domain: "example.com"
+- 1: local: "business"
+- 1: domain: "awesome.email"

data/srl_test/Test-Rules/website_example_lookahead.rule ADDED

@@ -0,0 +1,6 @@
+srl: capture (digit) if not followed by (anything once or more, digit)
+match: "This example contains 3 numbers. 2 should not match. Only 1 should."
+no match: "some string without numbers"
+capture for "This example contains 3 numbers. 2 should not match. Only 1 should.":
+- 0: 0: "1"

data/srl_test/Test-Rules/website_example_password.rule ADDED

@@ -0,0 +1,11 @@
+srl: if followed by (anything never or more, letter), if followed by (anything never or more, uppercase letter), if followed by (anything never or more, digit), if followed by (anything never or more, one of "!@#$%^&*[]\"';:_-<>., =+/\\"), anything at least 8 time
+match: "P@sSword1"
+match: "Pass-w0rd"
+match: "Th1s is Secure"
+no match: "Password"
+no match: "P@sS1"
+no match: "justalongpassword"
+no match: "m1ss1ng upper"
+no match: "missing Number"
+no match: "M1SS1NG LOWER"
+no match: "m1ss1ngSpec1al"

data/srl_test/Test-Rules/website_example_url.rule ADDED

@@ -0,0 +1,38 @@
+srl: begin with capture (letter once or more) as "protocol", literally "://", capture ( letter once or more, any of (letter, literally ".") once or more, letter at least 2 times ) as "domain", literally ":" optional, capture (digit once or more) as "port" optional, capture (anything never or more) as "path" until (any of (literally "?", must end)), literally "?" optional, capture (anything never or more) as "parameters" optional, must end, case insensitive
+match: "https://example.domain.com:1234/a/path?query=param"
+match: "http://domain.com?query=param"
+match: "http://domain.com/"
+match: "http://domain.com"
+match: "http://domain/foo/?bar=baz"
+no match: "you@example.com"
+no match: "domain.com"
+no match: "://domain.com"
+no match: "http://"
+capture for "https://example.domain.com:1234/a/path?query=param":
+- 0: protocol: "https"
+- 0: domain: "example.domain.com"
+- 0: port: "1234"
+- 0: path: "/a/path"
+- 0: parameters: "query=param"
+capture for "https://example.domain.com:1234/a/path":
+- 0: protocol: "https"
+- 0: domain: "example.domain.com"
+- 0: port: "1234"
+- 0: path: "/a/path"
+- 0: parameters: ""
+capture for "protocol://domain/a/path":
+- 0: protocol: "protocol"
+- 0: domain: "domain"
+- 0: port: ""
+- 0: path: "/a/path"
+- 0: parameters: ""
+capture for "http://domain.com":
+- 0: protocol: "http"
+- 0: domain: "domain.com"
+- 0: port: ""
+- 0: path: ""
+- 0: parameters: ""

data/srl_test/Test-Rules/word.rule ADDED

@@ -0,0 +1,3 @@
+srl: begin with (word), letter, letter, letter, must end
+match: "abc"
+no match: "   "

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: srl_ruby
 version: !ruby/object:Gem::Version
-  version: 0.1.1
+  version: 0.2.0
 platform: ruby
 authors:
 - Dimitri Geshef
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2018-03-10 00:00:00.000000000 Z
+date: 2018-03-14 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rley
@@ -16,14 +16,14 @@ dependencies:
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.6.03
+        version: 0.6.04
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.6.03
+        version: 0.6.04
 - !ruby/object:Gem::Dependency
   name: bundler
   requirement: !ruby/object:Gem::Requirement
@@ -113,6 +113,13 @@ files:
 - lib/srl_ruby/srl_token.rb
 - lib/srl_ruby/tokenizer.rb
 - lib/srl_ruby/version.rb
+- spec/acceptance/srl_test_suite_spec.rb
+- spec/acceptance/support/rule_file_ast_builder.rb
+- spec/acceptance/support/rule_file_grammar.rb
+- spec/acceptance/support/rule_file_nodes.rb
+- spec/acceptance/support/rule_file_parser.rb
+- spec/acceptance/support/rule_file_token.rb
+- spec/acceptance/support/rule_file_tokenizer.rb
 - spec/integration_spec.rb
 - spec/regex/character_spec.rb
 - spec/regex/multiplicity_spec.rb
@@ -120,6 +127,23 @@ files:
 - spec/srl_ruby/srl_ruby_spec.rb
 - spec/srl_ruby/tokenizer_spec.rb
 - srl_ruby.gemspec
+- srl_test/README.md
+- srl_test/Test-Rules/README.md
+- srl_test/Test-Rules/backslash.rule
+- srl_test/Test-Rules/basename_capture_group.rule
+- srl_test/Test-Rules/issue_17_uppercase_letter.rule
+- srl_test/Test-Rules/literally_spaces.rule
+- srl_test/Test-Rules/no_word.rule
+- srl_test/Test-Rules/nondigit.rule
+- srl_test/Test-Rules/none_of.rule
+- srl_test/Test-Rules/sample_capture.rule
+- srl_test/Test-Rules/tab.rule
+- srl_test/Test-Rules/website_example_email.rule
+- srl_test/Test-Rules/website_example_email_capture.rule
+- srl_test/Test-Rules/website_example_lookahead.rule
+- srl_test/Test-Rules/website_example_password.rule
+- srl_test/Test-Rules/website_example_url.rule
+- srl_test/Test-Rules/word.rule
 homepage: https://github.com/famished-tiger/SRL-Ruby
 licenses:
 - MIT
@@ -148,6 +172,7 @@ summary: srl_ruby is a gem implementing a parser for Simple Regex Language (SRL)
   It translates patterns expressed in SRL into plain Ruby Regexp objects  or regex
   literals.
 test_files:
+- spec/acceptance/srl_test_suite_spec.rb
 - spec/integration_spec.rb
 - spec/regex/character_spec.rb
 - spec/regex/multiplicity_spec.rb