liquid 5.6.0 → 5.7.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d1a39a98605d86cc2cf586a59d1a402ed708ad3722061083514eeeea602249d9
- data.tar.gz: d3de02a25cada366198f08959e8d6565b120b961978810b98dda3170acee7b7e
+ metadata.gz: 72a0f18697a90c81db846fc86027cbec36198105fb458d40989f88b1acd55687
+ data.tar.gz: ba8f6ecc9612f737f954006109b91759ea7cea41074de2ab38e73221d4e18be2
  SHA512:
- metadata.gz: fa084464c4927940f8edc1c0206858bdf4c39ca5bd1b632d2786ab6b11b235f5a5696dad9c9e9bdf79640d897539baaa5c56044230ed6901d1c34f930c0e3f0a
- data.tar.gz: dbe05e17ceb7c73461ed0b4f838a03af7853ec47a454ada51c632763adb31e28cf8e4fd3d89c2ed0c5c2f8d91eea2e3b661c8257ab266d7895c398ef1ac047f5
+ metadata.gz: ad608314f023c78123d3cf57a3c9b54229ec7dca712f1b3897218c27880e5f78d7188cdce775d15759960803a31c5e7d1adc81c9282ab784a6f6fa50efeb051e
+ data.tar.gz: 6988a403789ca9a3a3d3d48e5cba0639f81226b3cd8af477dcce1d8eeb9db3693e90b373ea391f5d458a723ec40d7c0c6d28f7e445d3d0562b219e9cbcfff1ee
data/History.md CHANGED
@@ -1,11 +1,53 @@
  # Liquid Change Log

- ## 5.6.0 (unreleased)
+ ## 5.8.0 (unreleased)
+
+ ## 5.7.0 2025-01-16
+
+ ### Features
+ * Add `find`, `find_index`, `has`, and `reject` filters to arrays
+ * Compatibility with Ruby 3.4
+
+ ## 5.6.4 2025-01-14

  ### Fixes
+ * Add a default `string_scanner` to avoid errors with `Liquid::VariableLookup.parse("foo.bar")` [Ian Ker-Seymer]

- * Fix Tokenizer to handle null source value (#1873) [Bahar Pourazar]
+ ## 5.6.3 2025-01-13
+ * Remove `lru_redux` dependency [Michael Go]
+
+ ## 5.6.2 2025-01-13
+
+ ### Fixes
+ * Preserve the old behavior of requiring floats to start with a digit [Michael Go]

+ ## 5.6.1 2025-01-13
+
+ ### Performance improvements
+ * Faster Expression parser / Tokenizer with StringScanner [Michael Go]
+
+ ## 5.6.0 2024-12-19
+
+ ### Architectural changes
+ * Added new `Environment` class to manage configuration and state that was previously stored in `Template` [Ian Ker-Seymer]
+ * Moved tag registration from `Template` to `Environment` [Ian Ker-Seymer]
+ * Removed `StrainerFactory` in favor of `Environment`-based strainer creation [Ian Ker-Seymer]
+ * Consolidated standard tags into a new `Tags` module with `STANDARD_TAGS` constant [Ian Ker-Seymer]
+
+ ### Performance improvements
+ * Optimized `Lexer` with a new `Lexer2` implementation using jump tables for faster tokenization, requires Ruby 3.4 [Ian Ker-Seymer]
+ * Improved variable rendering with specialized handling for different types [Michael Go]
+ * Reduced array allocations by using frozen empty constants [Michael Go]
+
+ ### API changes
+ * Deprecated several `Template` class methods in favor of `Environment` methods [Ian Ker-Seymer]
+ * Added deprecation warnings system [Ian Ker-Seymer]
+ * Changed how filters and tags are registered to use Environment [Ian Ker-Seymer]
+
+ ### Fixes
+ * Fixed table row handling of break interrupts [Alex Coco]
+ * Improved variable output handling for arrays [Ian Ker-Seymer]
+ * Fix Tokenizer to handle null source value (#1873) [Bahar Pourazar]

  ## 5.5.0 2024-03-21

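The 5.7.0 entry above adds `find`, `find_index`, `has`, and `reject` array filters. A minimal usage sketch, assuming their argument style mirrors the existing `where` filter (a property name plus an optional value); the product data and property names here are illustrative only:

```ruby
require "liquid"

products = [
  { "title" => "Tee", "type" => "shirt", "available" => true  },
  { "title" => "Mug", "type" => "cup",   "available" => false },
]

template = Liquid::Template.parse(<<~LIQUID)
  {% assign shirt = products | find: "type", "shirt" %}
  First shirt: {{ shirt.title }}
  Unavailable: {{ products | reject: "available" | map: "title" | join: ", " }}
  Any shirts?  {{ products | has: "type", "shirt" }}
LIQUID

puts template.render("products" => products)
```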
data/README.md CHANGED
@@ -91,7 +91,7 @@ Liquid::Template.parse(<<~LIQUID, environment: email_environment)
  LIQUID
  ```

- By using Environments, you ensure that custom tags and filters are only available in the contexts where they are needed, making your Liquid templates more robust and easier to manage.
+ By using Environments, you ensure that custom tags and filters are only available in the contexts where they are needed, making your Liquid templates more robust and easier to manage. For smaller projects, a global environment is available via `Liquid::Environment.default`.

  ### Error Modes

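The README addition above points to `Liquid::Environment.default` as the fallback for templates parsed without an explicit environment. A small sketch of environment-scoped registration in the spirit of the README's `email_environment` example; the filter module below is hypothetical:

```ruby
require "liquid"

# Hypothetical filter module, scoped to one environment instead of registered globally.
module EmailFilters
  def obfuscate(address)
    address.sub("@", " at ")
  end
end

email_environment = Liquid::Environment.build do |env|
  env.register_filter(EmailFilters)
end

Liquid::Template.parse(
  "{{ contact | obfuscate }}",
  environment: email_environment,
).render("contact" => "team@example.com")
# => "team at example.com"

# Templates parsed without an explicit environment fall back to Liquid::Environment.default.
```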
data/lib/liquid/context.rb CHANGED
@@ -40,6 +40,10 @@ module Liquid
  @global_filter = nil
  @disabled_tags = {}

+ # Instead of constructing new StringScanner objects for each Expression parse,
+ # we recycle the same one.
+ @string_scanner = StringScanner.new("")
+
  @registers.static[:cached_partials] ||= {}
  @registers.static[:file_system] ||= environment.file_system
  @registers.static[:template_factory] ||= Liquid::TemplateFactory.new
@@ -176,7 +180,7 @@ module Liquid
  # Example:
  # products == empty #=> products.empty?
  def [](expression)
- evaluate(Expression.parse(expression))
+ evaluate(Expression.parse(expression, @string_scanner))
  end

  def key?(key)
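The hunk above allocates one `StringScanner` per context and threads it into `Expression.parse`, instead of building a new scanner for every lookup. A standalone illustration of that recycling pattern (not Liquid's internal code): `StringScanner#string=` re-points the scanner at a new input and resets its position, so a single instance can serve many parses.

```ruby
require "strscan"

# One scanner, re-pointed at each input via StringScanner#string=,
# instead of a fresh allocation per parse.
def leading_number(ss, input)
  ss.string = input # replaces the scanned string and resets the position to 0
  ss.scan(/\d+/)
end

scanner = StringScanner.new("")
leading_number(scanner, "42 items") # => "42"
leading_number(scanner, "7 days")   # => "7"
```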
data/lib/liquid/expression.rb CHANGED
@@ -10,37 +10,113 @@ module Liquid
  'true' => true,
  'false' => false,
  'blank' => '',
- 'empty' => ''
+ 'empty' => '',
+ # in lax mode, minus sign can be a VariableLookup
+ # For simplicity and performance, we treat it like a literal
+ '-' => VariableLookup.parse("-", nil).freeze,
  }.freeze

- INTEGERS_REGEX = /\A(-?\d+)\z/
- FLOATS_REGEX = /\A(-?\d[\d\.]+)\z/
+ DOT = ".".ord
+ ZERO = "0".ord
+ NINE = "9".ord
+ DASH = "-".ord

  # Use an atomic group (?>...) to avoid pathological backtracing from
  # malicious input as described in https://github.com/Shopify/liquid/issues/1357
- RANGES_REGEX = /\A\(\s*(?>(\S+)\s*\.\.)\s*(\S+)\s*\)\z/
+ RANGES_REGEX = /\A\(\s*(?>(\S+)\s*\.\.)\s*(\S+)\s*\)\z/
+ INTEGER_REGEX = /\A(-?\d+)\z/
+ FLOAT_REGEX = /\A(-?\d+)\.\d+\z/

- def self.parse(markup)
- return nil unless markup
+ class << self
+ def parse(markup, ss = StringScanner.new(""), cache = nil)
+ return unless markup

- markup = markup.strip
- if (markup.start_with?('"') && markup.end_with?('"')) ||
- (markup.start_with?("'") && markup.end_with?("'"))
- return markup[1..-2]
+ markup = markup.strip # markup can be a frozen string
+
+ if (markup.start_with?('"') && markup.end_with?('"')) ||
+ (markup.start_with?("'") && markup.end_with?("'"))
+ return markup[1..-2]
+ elsif LITERALS.key?(markup)
+ return LITERALS[markup]
+ end
+
+ # Cache only exists during parsing
+ if cache
+ return cache[markup] if cache.key?(markup)
+
+ cache[markup] = inner_parse(markup, ss, cache).freeze
+ else
+ inner_parse(markup, ss, nil).freeze
+ end
  end

- case markup
- when INTEGERS_REGEX
- Regexp.last_match(1).to_i
- when RANGES_REGEX
- RangeLookup.parse(Regexp.last_match(1), Regexp.last_match(2))
- when FLOATS_REGEX
- Regexp.last_match(1).to_f
- else
- if LITERALS.key?(markup)
- LITERALS[markup]
+ def inner_parse(markup, ss, cache)
+ if (markup.start_with?("(") && markup.end_with?(")")) && markup =~ RANGES_REGEX
+ return RangeLookup.parse(
+ Regexp.last_match(1),
+ Regexp.last_match(2),
+ ss,
+ cache,
+ )
+ end
+
+ if (num = parse_number(markup, ss))
+ num
+ else
+ VariableLookup.parse(markup, ss, cache)
+ end
+ end
+
+ def parse_number(markup, ss)
+ # check if the markup is simple integer or float
+ case markup
+ when INTEGER_REGEX
+ return Integer(markup, 10)
+ when FLOAT_REGEX
+ return markup.to_f
+ end
+
+ ss.string = markup
+ # the first byte must be a digit or a dash
+ byte = ss.scan_byte
+
+ return false if byte != DASH && (byte < ZERO || byte > NINE)
+
+ if byte == DASH
+ peek_byte = ss.peek_byte
+
+ # if it starts with a dash, the next byte must be a digit
+ return false if peek_byte.nil? || !(peek_byte >= ZERO && peek_byte <= NINE)
+ end
+
+ # The markup could be a float with multiple dots
+ first_dot_pos = nil
+ num_end_pos = nil
+
+ while (byte = ss.scan_byte)
+ return false if byte != DOT && (byte < ZERO || byte > NINE)
+
+ # we found our number and now we are just scanning the rest of the string
+ next if num_end_pos
+
+ if byte == DOT
+ if first_dot_pos.nil?
+ first_dot_pos = ss.pos
+ else
+ # we found another dot, so we know that the number ends here
+ num_end_pos = ss.pos - 1
+ end
+ end
+ end
+
+ num_end_pos = markup.length if ss.eos?
+
+ if num_end_pos
+ # number ends with a number "123.123"
+ markup.byteslice(0, num_end_pos).to_f
  else
- VariableLookup.parse(markup)
+ # number ends with a dot "123."
+ markup.byteslice(0, first_dot_pos).to_f
  end
  end
  end
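For reference, here is what the `parse`/`inner_parse` flow shown above returns for typical markup; the results are inferred from the diff itself (the range case goes through `RangeLookup.parse`, anything non-numeric and non-literal through `VariableLookup.parse`):

```ruby
require "liquid"

Liquid::Expression.parse('"hello"')  # => "hello"  (quoted string literal)
Liquid::Expression.parse("true")     # => true     (LITERALS table)
Liquid::Expression.parse("42")       # => 42       (INTEGER_REGEX fast path)
Liquid::Expression.parse("3.14")     # => 3.14     (FLOAT_REGEX fast path)
Liquid::Expression.parse("12.5.3")   # => 12.5     (byte-wise parse_number keeps the leading float)
Liquid::Expression.parse("(1..5)")   # => 1..5     (via RangeLookup.parse; a plain Range when both ends are literal)
Liquid::Expression.parse("foo.bar")  # => a Liquid::VariableLookup for "foo.bar"
```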
data/lib/liquid/lexer.rb CHANGED
@@ -1,66 +1,7 @@
  # frozen_string_literal: true

- require "strscan"
-
  module Liquid
- class Lexer1
- SPECIALS = {
- '|' => :pipe,
- '.' => :dot,
- ':' => :colon,
- ',' => :comma,
- '[' => :open_square,
- ']' => :close_square,
- '(' => :open_round,
- ')' => :close_round,
- '?' => :question,
- '-' => :dash,
- }.freeze
- IDENTIFIER = /[a-zA-Z_][\w-]*\??/
- SINGLE_STRING_LITERAL = /'[^\']*'/
- DOUBLE_STRING_LITERAL = /"[^\"]*"/
- STRING_LITERAL = Regexp.union(SINGLE_STRING_LITERAL, DOUBLE_STRING_LITERAL)
- NUMBER_LITERAL = /-?\d+(\.\d+)?/
- DOTDOT = /\.\./
- COMPARISON_OPERATOR = /==|!=|<>|<=?|>=?|contains(?=\s)/
- WHITESPACE_OR_NOTHING = /\s*/
-
- def initialize(input)
- @ss = StringScanner.new(input)
- end
-
- def tokenize
- @output = []
-
- until @ss.eos?
- @ss.skip(WHITESPACE_OR_NOTHING)
- break if @ss.eos?
- tok = if (t = @ss.scan(COMPARISON_OPERATOR))
- [:comparison, t]
- elsif (t = @ss.scan(STRING_LITERAL))
- [:string, t]
- elsif (t = @ss.scan(NUMBER_LITERAL))
- [:number, t]
- elsif (t = @ss.scan(IDENTIFIER))
- [:id, t]
- elsif (t = @ss.scan(DOTDOT))
- [:dotdot, t]
- else
- c = @ss.getch
- if (s = SPECIALS[c])
- [s, c]
- else
- raise SyntaxError, "Unexpected character #{c}"
- end
- end
- @output << tok
- end
-
- @output << [:end_of_string]
- end
- end
-
- class Lexer2
+ class Lexer
  CLOSE_ROUND = [:close_round, ")"].freeze
  CLOSE_SQUARE = [:close_square, "]"].freeze
  COLON = [:colon, ":"].freeze
@@ -92,6 +33,7 @@ module Liquid
  SINGLE_COMPARISON_TOKENS = [].tap do |table|
  table["<".ord] = COMPARISON_LESS_THAN
  table[">".ord] = COMPARISON_GREATER_THAN
+ table.freeze
  end

  TWO_CHARS_COMPARISON_JUMP_TABLE = [].tap do |table|
@@ -103,18 +45,17 @@ module Liquid
  sub_table["=".ord] = COMPARISION_NOT_EQUAL
  sub_table.freeze
  end
+ table.freeze
  end

  COMPARISON_JUMP_TABLE = [].tap do |table|
  table["<".ord] = [].tap do |sub_table|
  sub_table["=".ord] = COMPARISON_LESS_THAN_OR_EQUAL
  sub_table[">".ord] = COMPARISON_NOT_EQUAL_ALT
- RUBY_WHITESPACE.each { |c| sub_table[c.ord] = COMPARISON_LESS_THAN }
  sub_table.freeze
  end
  table[">".ord] = [].tap do |sub_table|
  sub_table["=".ord] = COMPARISON_GREATER_THAN_OR_EQUAL
- RUBY_WHITESPACE.each { |c| sub_table[c.ord] = COMPARISON_GREATER_THAN }
  sub_table.freeze
  end
  table.freeze
@@ -157,81 +98,76 @@ module Liquid
  table.freeze
  end

- def initialize(input)
- @ss = StringScanner.new(input)
- end
-
  # rubocop:disable Metrics/BlockNesting
- def tokenize
- @output = []
-
- until @ss.eos?
- @ss.skip(WHITESPACE_OR_NOTHING)
-
- break if @ss.eos?
-
- start_pos = @ss.pos
- peeked = @ss.peek_byte
-
- if (special = SPECIAL_TABLE[peeked])
- @ss.scan_byte
- # Special case for ".."
- if special == DOT && @ss.peek_byte == DOT_ORD
- @ss.scan_byte
- @output << DOTDOT
- elsif special == DASH
- # Special case for negative numbers
- if (peeked_byte = @ss.peek_byte) && NUMBER_TABLE[peeked_byte]
- @ss.pos -= 1
- @output << [:number, @ss.scan(NUMBER_LITERAL)]
+ class << self
+ def tokenize(ss)
+ output = []
+
+ until ss.eos?
+ ss.skip(WHITESPACE_OR_NOTHING)
+
+ break if ss.eos?
+
+ start_pos = ss.pos
+ peeked = ss.peek_byte
+
+ if (special = SPECIAL_TABLE[peeked])
+ ss.scan_byte
+ # Special case for ".."
+ if special == DOT && ss.peek_byte == DOT_ORD
+ ss.scan_byte
+ output << DOTDOT
+ elsif special == DASH
+ # Special case for negative numbers
+ if (peeked_byte = ss.peek_byte) && NUMBER_TABLE[peeked_byte]
+ ss.pos -= 1
+ output << [:number, ss.scan(NUMBER_LITERAL)]
+ else
+ output << special
+ end
  else
- @output << special
+ output << special
  end
- else
- @output << special
- end
- elsif (sub_table = TWO_CHARS_COMPARISON_JUMP_TABLE[peeked])
- @ss.scan_byte
- if (peeked_byte = @ss.peek_byte) && (found = sub_table[peeked_byte])
- @output << found
- @ss.scan_byte
- else
- raise_syntax_error(start_pos)
- end
- elsif (sub_table = COMPARISON_JUMP_TABLE[peeked])
- @ss.scan_byte
- if (peeked_byte = @ss.peek_byte) && (found = sub_table[peeked_byte])
- @output << found
- @ss.scan_byte
- else
- @output << SINGLE_COMPARISON_TOKENS[peeked]
- end
- else
- type, pattern = NEXT_MATCHER_JUMP_TABLE[peeked]
-
- if type && (t = @ss.scan(pattern))
- # Special case for "contains"
- @output << if type == :id && t == "contains" && @output.last&.first != :dot
- COMPARISON_CONTAINS
+ elsif (sub_table = TWO_CHARS_COMPARISON_JUMP_TABLE[peeked])
+ ss.scan_byte
+ if (peeked_byte = ss.peek_byte) && (found = sub_table[peeked_byte])
+ output << found
+ ss.scan_byte
+ else
+ raise_syntax_error(start_pos, ss)
+ end
+ elsif (sub_table = COMPARISON_JUMP_TABLE[peeked])
+ ss.scan_byte
+ if (peeked_byte = ss.peek_byte) && (found = sub_table[peeked_byte])
+ output << found
+ ss.scan_byte
  else
- [type, t]
+ output << SINGLE_COMPARISON_TOKENS[peeked]
  end
  else
- raise_syntax_error(start_pos)
+ type, pattern = NEXT_MATCHER_JUMP_TABLE[peeked]
+
+ if type && (t = ss.scan(pattern))
+ # Special case for "contains"
+ output << if type == :id && t == "contains" && output.last&.first != :dot
+ COMPARISON_CONTAINS
+ else
+ [type, t]
+ end
+ else
+ raise_syntax_error(start_pos, ss)
+ end
  end
  end
+ # rubocop:enable Metrics/BlockNesting
+ output << EOS
  end
- # rubocop:enable Metrics/BlockNesting

- @output << EOS
- end
-
- def raise_syntax_error(start_pos)
- @ss.pos = start_pos
- # the character could be a UTF-8 character, use getch to get all the bytes
- raise SyntaxError, "Unexpected character #{@ss.getch}"
+ def raise_syntax_error(start_pos, ss)
+ ss.pos = start_pos
+ # the character could be a UTF-8 character, use getch to get all the bytes
+ raise SyntaxError, "Unexpected character #{ss.getch}"
+ end
  end
  end
-
- Lexer = StringScanner.instance_methods.include?(:scan_byte) ? Lexer2 : Lexer1
  end
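The consolidated `Lexer` above dispatches on the first byte of each token through arrays indexed by byte ordinal ("jump tables") instead of trying a sequence of regexes, which is why the changelog notes it needs Ruby 3.4 (`StringScanner#peek_byte` / `#scan_byte`). A simplified, self-contained sketch of that idea; the table contents and token set below are illustrative, not Liquid's actual tables:

```ruby
require "strscan"

# A tiny three-symbol "jump table": an Array indexed by byte ordinal, so
# classifying the next token starts with one array lookup rather than a
# cascade of regex attempts.
SPECIALS = [].tap do |table|
  table["|".ord] = :pipe
  table[":".ord] = :colon
  table[",".ord] = :comma
  table.freeze
end

NUMBER = /-?\d+(\.\d+)?/
IDENT  = /[a-zA-Z_][\w-]*/

def next_token(ss)
  ss.skip(/\s*/)
  return [:end_of_string] if ss.eos?

  byte = ss.peek_byte
  if (sym = SPECIALS[byte])
    ss.scan_byte
    [sym, byte.chr]
  elsif (byte >= "0".ord && byte <= "9".ord) || byte == "-".ord
    [:number, ss.scan(NUMBER)]
  else
    word = ss.scan(IDENT) or raise SyntaxError, "Unexpected character #{ss.getch}"
    [:id, word]
  end
end

ss = StringScanner.new("price | plus: 2")
tokens = []
tokens << next_token(ss) until tokens.last == [:end_of_string]
tokens
# => [[:id, "price"], [:pipe, "|"], [:id, "plus"], [:colon, ":"], [:number, "2"], [:end_of_string]]
```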
data/lib/liquid/parse_context.rb CHANGED
@@ -12,6 +12,18 @@ module Liquid
  @locale = @template_options[:locale] ||= I18n.new
  @warnings = []

+ # constructing new StringScanner in Lexer, Tokenizer, etc is expensive
+ # This StringScanner will be shared by all of them
+ @string_scanner = StringScanner.new("")
+
+ @expression_cache = if options[:expression_cache].nil?
+ {}
+ elsif options[:expression_cache].respond_to?(:[]) && options[:expression_cache].respond_to?(:[]=)
+ options[:expression_cache]
+ elsif options[:expression_cache]
+ {}
+ end
+
  self.depth = 0
  self.partial = false
  end
@@ -24,12 +36,22 @@
  Liquid::BlockBody.new
  end

- def new_tokenizer(markup, start_line_number: nil, for_liquid_tag: false)
- Tokenizer.new(markup, line_number: start_line_number, for_liquid_tag: for_liquid_tag)
+ def new_parser(input)
+ @string_scanner.string = input
+ Parser.new(@string_scanner)
+ end
+
+ def new_tokenizer(source, start_line_number: nil, for_liquid_tag: false)
+ Tokenizer.new(
+ source: source,
+ string_scanner: @string_scanner,
+ line_number: start_line_number,
+ for_liquid_tag: for_liquid_tag,
+ )
  end

  def parse_expression(markup)
- Expression.parse(markup)
+ Expression.parse(markup, @string_scanner, @expression_cache)
  end

  def partial=(value)
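Per the `ParseContext` hunk above, the `:expression_cache` option accepts `nil` or `true` (a fresh Hash per parse context) or any object responding to `[]` and `[]=`, which can then be shared across templates. A sketch assuming parse options are forwarded to the parse context, as they are for options like `:error_mode`:

```ruby
require "liquid"

# A plain Hash shared across parses; anything responding to [] and []=
# (for example an LRU-style cache) would behave the same way.
shared_cache = {}

header = Liquid::Template.parse("{{ product.title | upcase }}", expression_cache: shared_cache)
footer = Liquid::Template.parse("{{ product.title }} and more", expression_cache: shared_cache)

# Both templates resolve the "product.title" markup through the same cache,
# so its VariableLookup is built once and reused.
header.render("product" => { "title" => "shirt" }) # => "SHIRT"
```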
data/lib/liquid/parser.rb CHANGED
@@ -3,8 +3,8 @@
  module Liquid
  class Parser
  def initialize(input)
- l = Lexer.new(input)
- @tokens = l.tokenize
+ ss = input.is_a?(StringScanner) ? input : StringScanner.new(input)
+ @tokens = Lexer.tokenize(ss)
  @p = 0 # pointer to current location
  end

data/lib/liquid/range_lookup.rb CHANGED
@@ -2,9 +2,9 @@

  module Liquid
  class RangeLookup
- def self.parse(start_markup, end_markup)
- start_obj = Expression.parse(start_markup)
- end_obj = Expression.parse(end_markup)
+ def self.parse(start_markup, end_markup, string_scanner, cache = nil)
+ start_obj = Expression.parse(start_markup, string_scanner, cache)
+ end_obj = Expression.parse(end_markup, string_scanner, cache)
  if start_obj.respond_to?(:evaluate) || end_obj.respond_to?(:evaluate)
  new(start_obj, end_obj)
  else