RubyGems - css_parser - Versions diffs - 0.9.0 - Mend

css_parser 0.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23)

data/CHANGELOG +33 -0
data/LICENSE +42 -0
data/README +60 -0
data/lib/css_parser/parser.rb +345 -0
data/lib/css_parser/regexps.rb +46 -0
data/lib/css_parser/rule_set.rb +381 -0
data/lib/css_parser.rb +149 -0
data/test/fixtures/import-circular-reference.css +4 -0
data/test/fixtures/import-with-media-types.css +3 -0
data/test/fixtures/import1.css +3 -0
data/test/fixtures/simple.css +6 -0
data/test/fixtures/subdir/import2.css +3 -0
data/test/test_css_parser_basic.rb +56 -0
data/test/test_css_parser_downloading.rb +81 -0
data/test/test_css_parser_media_types.rb +71 -0
data/test/test_css_parser_misc.rb +143 -0
data/test/test_css_parser_regexps.rb +68 -0
data/test/test_helper.rb +8 -0
data/test/test_merging.rb +88 -0
data/test/test_rule_set.rb +74 -0
data/test/test_rule_set_creating_shorthand.rb +90 -0
data/test/test_rule_set_expanding_shorthand.rb +178 -0
metadata +82 -0

data/CHANGELOG ADDED

@@ -0,0 +1,33 @@
+= Premailer CHANGELOG
+== Version 0.9
+ * initial proof-of-concept
+ * PHP web version
+== Version 1.0
+ * ported web interface to eRuby
+ * incremental parsing improvements
+== Version 1.1
+ * proper calculation of selector specificity per CSS 2.1 spec
+ * support for <tt>@import</tt>
+ * preliminary support for shorthand CSS properties (<tt>margin</tt>, <tt>padding</tt>)
+ * preliminary separation of CSS parser
+== Version 1.2
+ * respect <tt>LINK</tt> media types
+ * better style folding
+ * incremental parsing improvements
+== Version 1.3
+ * separate CSS parser into its own library
+ * handle <tt>background: red url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR42mP4%2F58BAAT%2FAf9jgNErAAAAAElFTkSuQmCC);</tt>
+ * preserve <tt>:hover</tt> etc... in head styles
+== TODO: Future
+ * respect <tt>@media</tt> rule (http://www.w3.org/TR/CSS21/media.html#at-media-rule)
+ * complete shorthand properties support (<tt>border-width</tt>, <tt>font</tt>, <tt>background</tt>)
+ * better quote escaping
+ * UTF-8 and other charsets (test page: http://kianga.kcore.de/2004/09/21/utf8_test)
+ * make warnings for <tt>border</tt> match <tt>border-left</tt>, etc...
+ * correctly parse http://www.webstandards.org/files/acid2/test.html

data/LICENSE ADDED

@@ -0,0 +1,42 @@
+= CSS Parser License
+Copyright (c) 2007 Alex Dunae
+Premailer is copyrighted free software by Alex Dunae (http://dunae.ca/).
+You can redistribute it and/or modify it under the conditions below:
+  1. You may make and give away verbatim copies of the source form of the
+     software without restriction, provided that you duplicate all of the
+     original copyright notices and associated disclaimers.
+  2. You may modify your copy of the software in any way, provided that
+     you do at least ONE of the following:
+       a) place your modifications in the Public Domain or otherwise
+          make them Freely Available, such as by posting said
+	  modifications to the internet or an equivalent medium, or by
+	  allowing the author to include your modifications in the software.
+       b) use the modified software only within your corporation or
+          organization.
+       c) rename any non-standard executables so the names do not conflict
+	  with standard executables, which must also be provided.
+       d) make other distribution arrangements with the author.
+  3. You may modify and include the part of the software into any other
+     software (possibly commercial) as long as clear acknowledgement and
+     a link back to the original software (http://code.dunae.ca/premailer.web/)
+     is provided.
+  5. The scripts and library files supplied as input to or produced as
+     output from the software do not automatically fall under the
+     copyright of the software, but belong to whomever generated them,
+     and may be sold commercially, and may be aggregated with this
+     software.
+  6. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
+     IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+     WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+     PURPOSE.

data/README ADDED

@@ -0,0 +1,60 @@
+= Ruby CSS Parser
+Load, parse and cascade CSS rule sets in Ruby.
+=== Setup
+Install the gem from RubyGems.
+   gem install css_parser
+Done.
+=== An example
+  require 'css_parser'
+  include CssParser
+  parser = CssParser::Parser.new
+  parser.load_uri!('http://example.com/styles/style.css')
+  # lookup a rule by a selector
+  parser.find_by_selector('#content')
+  #=> ['font-size: 13px; line-height: 1.2;']
+  # lookup a rule by a selector and media type
+  parser.find_by_selector('#content', [:screen, :handheld])
+  # iterate through selectors by media type
+  parser.each_selector(:screen) do |selector, declarations, specificity|
+    ...
+  end
+  # add a block of CSS
+  css = <<-EOT
+    body { margin: 0 1em; }
+  EOT
+  parser.add_block!(css)
+  # output all CSS rules in a single stylesheet
+  parser.to_s
+  => #content { font-size: 13px; line-height: 1.2; }
+     body { margin: 0 1em; }
+=== Testing
+You can run the suite of unit tests using <tt>rake test</tt>.
+The download/import tests require that WEBrick is installed.  The tests set up
+a temporary server on port 12000 and pull down files from the <tt>test/fixtures/</tt>
+directory.
+=== Credits and code
+* Project page: http://code.dunae.ca/css_parser/
+* Source: http://code.dunae.ca/svn/css_parser/
+* Docs: http://code.dunae.ca/css_parser/doc/
+By Alex Dunae (dunae.ca, e-mail 'code' at the same domain), 2007.
+Made with love on Vancouver Island.
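
The lookup by selector and media type shown in the README example can be sketched without the gem. This is a minimal illustration, not the library's implementation: `Rule`, `RULES` and `find_declarations` are hypothetical names, and the matching condition is modelled on `Parser#each_rule_set` and `Parser#find_by_selector` from this diff.

```ruby
# Hypothetical stand-in for the parser's internal rule store; not the gem's API.
Rule = Struct.new(:selector, :declarations, :media_types)

RULES = [
  Rule.new('#content', 'font-size: 13px; line-height: 1.2;', [:screen, :handheld]),
  Rule.new('#content', 'font-size: 11pt; line-height: 1.2;', [:print]),
  Rule.new('body', 'margin: 0 1em;', [:all])
]

# Return the declarations for +selector+ whose media types overlap +media_types+.
# Asking for :all (the default) matches every rule, as in Parser#each_rule_set.
def find_declarations(rules, selector, media_types = :all)
  wanted = Array(media_types)
  matching = rules.select do |rule|
    rule.selector.strip == selector.strip &&
      (wanted.include?(:all) || rule.media_types.any? { |mt| wanted.include?(mt) })
  end
  matching.map(&:declarations)
end

puts find_declarations(RULES, '#content', :print).inspect
puts find_declarations(RULES, '#content', [:screen, :handheld]).inspect
puts find_declarations(RULES, '#content').length
```

The sketch returns an array because one selector may appear in several rule sets; a caller that wants a single string would have to join or pick from the result.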

data/lib/css_parser/parser.rb ADDED

@@ -0,0 +1,345 @@
+module CssParser
+  # Exception class used for any errors encountered while downloading remote files.
+  class RemoteFileError < IOError; end
+  # Exception class used if a request is made to load a CSS file more than once.
+  class CircularReferenceError < StandardError; end
+  # == Parser class
+  #
+  # All CSS is converted to UTF-8.
+  #
+  # When calling Parser#new there are some configuration options:
+  # [<tt>absolute_paths</tt>] Convert relative paths to absolute paths (<tt>href</tt>, <tt>src</tt> and <tt>url('')</tt>). Boolean, default is <tt>false</tt>.
+  # [<tt>import</tt>] Follow <tt>@import</tt> rules. Boolean, default is <tt>true</tt>.
+  # [<tt>io_exceptions</tt>] Raise an exception if a link cannot be found. Boolean, default is <tt>true</tt>.
+  class Parser
+    USER_AGENT   = "Ruby CSS Parser/#{VERSION} (http://code.dunae.ca/css_parser/)"
+    STRIP_CSS_COMMENTS_RX = /\/\*.*?\*\//m
+    STRIP_HTML_COMMENTS_RX = /\<\!\-\-|\-\-\>/m
+    # Initial parsing
+    RE_AT_IMPORT_RULE = /\@import[\s]+(url\()?["']+(.[^'"]*)["']\)?([\w\s\,]*);?/i
+    #--
+    # RE_AT_IMPORT_RULE = Regexp.new('@import[\s]*(' + RE_STRING.to_s + ')([\w\s\,]*)[;]?', Regexp::IGNORECASE) -- should handle url() even though it is not allowed
+    #++
+    # Array of CSS files that have been loaded.
+    attr_reader   :loaded_uris
+    #attr_reader   :rules
+    #--
+    # Class variable? see http://www.oreillynet.com/ruby/blog/2007/01/nubygems_dont_use_class_variab_1.html
+    #++
+    @folded_declaration_cache = {}
+    class << self; attr_reader :folded_declaration_cache; end
+    def initialize(options = {})
+      @options = {:absolute_paths => false,
+                  :import => true,
+                  :io_exceptions => true}.merge(options)
+      # array of RuleSets
+      @rules = []
+      @loaded_uris = []
+      # unprocessed blocks of CSS
+      @blocks = []
+      reset!
+    end
+    # Get declarations by selector.
+    #
+    # +media_types+ are optional, and can be a symbol or an array of symbols.
+    # The default value is <tt>:all</tt>.
+    #
+    # ==== Examples
+    #  find_by_selector('#content')
+    #  => ['font-size: 13px; line-height: 1.2;']
+    #
+    #  find_by_selector('#content', [:screen, :handheld])
+    #  => ['font-size: 13px; line-height: 1.2;']
+    #
+    #  find_by_selector('#content', :print)
+    #  => ['font-size: 11pt; line-height: 1.2;']
+    #
+    # Returns an array of declarations.
+    def find_by_selector(selector, media_types = :all)
+      out = []
+      each_selector(media_types) do |sel, dec, spec|
+        out << dec if sel.strip == selector.strip
+      end
+      out
+    end
+    alias_method :[], :find_by_selector
+    # Add a raw block of CSS.
+    #
+    # ==== Example
+    #   css = <<-EOT
+    #     body { font-size: 10pt }
+    #     p { margin: 0px; }
+    #     @media screen, print {
+    #       body { line-height: 1.2 }
+    #     }
+    #   EOT
+    #
+    #   parser = CssParser::Parser.new
+    #   parser.add_block!(css)
+    #--
+    # TODO: add media_type
+    #++
+    def add_block!(block, options = {})
+      options = {:base_uri => nil, :charset => nil, :media_types => :all}.merge(options)
+      block = cleanup_block(block)
+      if options[:base_uri] and @options[:absolute_paths]
+        block = CssParser.convert_uris(block, options[:base_uri])
+      end
+      parse_block_into_rule_sets!(block, options)
+    end
+    # Add a CSS rule by setting the +selectors+, +declarations+ and +media_types+.
+    #
+    # +media_types+ can be a symbol or an array of symbols.
+    def add_rule!(selectors, declarations, media_types = :all)
+      rule_set = RuleSet.new(selectors, declarations)
+      add_rule_set!(rule_set, media_types)
+    end
+    # Add a CssParser RuleSet object.
+    #
+    # +media_types+ can be a symbol or an array of symbols.
+    def add_rule_set!(ruleset, media_types = :all)
+      raise ArgumentError unless ruleset.kind_of?(CssParser::RuleSet)
+      media_types = [media_types] if media_types.kind_of?(Symbol)
+      @rules << {:media_types => media_types, :rules => ruleset}
+    end
+    # Iterate through RuleSet objects.
+    #
+    # +media_types+ can be a symbol or an array of symbols.
+    def each_rule_set(media_types = :all) # :yields: rule_set
+      media_types = [:all] if media_types.nil?
+      media_types = [media_types] if media_types.kind_of?(Symbol)
+      @rules.each do |block|
+        if media_types.include?(:all) or block[:media_types].any? { |mt| media_types.include?(mt) }
+          yield block[:rules]
+        end
+      end
+    end
+    # Iterate through CSS selectors.
+    #
+    # +media_types+ can be a symbol or an array of symbols.
+    # See RuleSet#each_selector for +options+.
+    def each_selector(media_types = :all, options = {}) # :yields: selectors, declarations, specificity
+      each_rule_set(media_types) do |rule_set|
+        #puts rule_set
+        rule_set.each_selector(options) do |selectors, declarations, specificity|
+          yield selectors, declarations, specificity
+        end
+      end
+    end
+    # Output all CSS rules as a single stylesheet.
+    def to_s(media_types = :all)
+      out = ''
+      each_selector(media_types) do |selectors, declarations, specificity|
+        out << "#{selectors} {\n#{declarations}\n}\n"
+      end
+      out
+    end
+    # Merge declarations with the same selector (not yet implemented; currently returns an empty array).
+    def compact! # :nodoc:
+      compacted = []
+      compacted
+    end
+    def parse_block_into_rule_sets!(block, options = {}) # :nodoc:
+      options = {:media_types => :all}.merge(options)
+      media_types = options[:media_types]
+      in_declarations = false
+      block_depth = 0
+      # @charset is ignored for now
+      in_charset = false
+      in_string = false
+      in_at_media_rule = false
+      current_selectors = ''
+      current_declarations = ''
+      block.scan(/([\\]?[{}\s"]|(.[^\s"{}\\]*))/).each do |matches|
+      #block.scan(/((.[^{}"\n\r\f\s]*)[\s]|(.[^{}"\n\r\f]*)\{|(.[^{}"\n\r\f]*)\}|(.[^{}"\n\r\f]*)\"|(.*)[\s]+)/).each do |matches|
+        token = matches[0]
+        #puts "TOKEN: #{token}" unless token =~ /^[\s]*$/
+        if token =~ /\A"/ # found un-escaped double quote
+          in_string = !in_string
+        end
+        if in_declarations
+          current_declarations += token
+          if token =~ /\}/ and not in_string
+            current_declarations.gsub!(/\}[\s]*$/, '')
+            in_declarations = false
+            unless current_declarations.strip.empty?
+              #puts "SAVING #{current_selectors} -> #{current_declarations}"
+              add_rule!(current_selectors, current_declarations, media_types)
+            end
+            current_selectors = ''
+            current_declarations = ''
+          end
+        elsif token =~ /@media/i
+          # found '@media', reset current media_types
+          in_at_media_rule = true
+          media_types = []
+        elsif in_at_media_rule
+          if token =~ /\{/
+            block_depth = block_depth + 1
+            in_at_media_rule = false
+          else
+            token.gsub!(/[,\s]*/, '')
+            media_types << token.strip.downcase.to_sym unless token.empty?
+          end
+        elsif in_charset or token =~ /@charset/i
+          # iterate until we are out of the charset declaration
+          in_charset = (token =~ /;/ ? false : true)
+        else
+          if token =~ /\}/ and not in_string
+            block_depth = block_depth - 1
+          else
+            if token =~ /\{/ and not in_string
+              current_selectors.gsub!(/^[\s]*/, '')
+              current_selectors.gsub!(/[\s]*$/, '')
+              in_declarations = true
+            else
+              current_selectors += token
+            end
+          end
+        end
+      end
+    end
+    # Load a remote CSS file.
+    def load_uri!(uri, base_uri = nil, media_types = :all)
+      base_uri = uri if base_uri.nil?
+      src, charset = read_remote_file(uri)
+      # Load @imported CSS
+      src.scan(RE_AT_IMPORT_RULE).each do |import_rule|
+        import_path = import_rule[1].to_s.gsub(/['"]*/, '').strip
+        import_uri = URI.parse(base_uri.to_s).merge(import_path)
+        #puts import_uri.to_s
+        import_media_types = []
+        if media_string = import_rule[import_rule.length-1]
+          media_string.split(/\s|\,/).each do |t|
+            import_media_types << t.to_sym unless t.empty?
+          end
+        end
+        # Recurse, keeping the import's media types separate from this file's own
+        load_uri!(import_uri, nil, import_media_types)
+      end
+      # Remove @import declarations
+      src.gsub!(RE_AT_IMPORT_RULE, '')
+      # Relative paths need to be converted here
+      src = CssParser.convert_uris(src, base_uri) if base_uri and @options[:absolute_paths]
+      add_block!(src, {:media_types => media_types})
+    end
+  protected
+    # Strip comments and clean up blank lines from a block of CSS.
+    #
+    # Returns a string.
+    def cleanup_block(block) # :nodoc:
+      # Strip CSS comments
+      block.gsub!(STRIP_CSS_COMMENTS_RX, '')
+      # Strip HTML comments - they shouldn't really be in here but
+      # some people are just crazy...
+      block.gsub!(STRIP_HTML_COMMENTS_RX, '')
+      # Strip lines containing just whitespace
+      block.gsub!(/^\s+$/, "")
+      block
+    end
+    # Download a file into a string.
+    #
+    # Returns the file's data and character set in an array.
+    #--
+    # TODO: add option to fail silently or throw an exception on a 404
+    #++
+    def read_remote_file(uri) # :nodoc:
+      raise CircularReferenceError, "can't load #{uri.to_s} more than once" if @loaded_uris.include?(uri.to_s)
+      @loaded_uris << uri.to_s
+      begin
+      #fh = open(uri, 'rb')
+        fh = open(uri, 'rb', 'User-Agent' => USER_AGENT, 'Accept-Encoding' => 'gzip')
+        if fh.content_encoding.include?('gzip')
+          remote_src = Zlib::GzipReader.new(fh).read
+        else
+          remote_src = fh.read
+        end
+        #puts "reading #{uri} (#{fh.charset})"
+        ic = Iconv.new('UTF-8//IGNORE', fh.charset)
+        src = ic.iconv(remote_src)
+        fh.close
+        return src, fh.charset
+      rescue
+        raise RemoteFileError if @options[:io_exceptions]
+        return '', nil
+      end
+    end
+  private
+    # Save a folded declaration block to the internal cache.
+    def save_folded_declaration(block_hash, folded_declaration) # :nodoc:
+      @folded_declaration_cache[block_hash] = folded_declaration
+    end
+    # Retrieve a folded declaration block from the internal cache.
+    def get_folded_declaration(block_hash) # :nodoc:
+      return @folded_declaration_cache[block_hash] ||= nil
+    end
+    def reset! # :nodoc:
+      @folded_declaration_cache = {}
+      @css_source = ''
+      @css_rules = []
+      @css_warnings = []
+    end
+  end
+end
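
The `@import` handling in `load_uri!` can be tried in isolation. The sketch below copies `RE_AT_IMPORT_RULE` from the diff and resolves each import path against a base URI with `URI#merge`, as the method does. The helper name `extract_imports` is hypothetical, and treating an import with no media list as `[:all]` is an assumption of this sketch rather than something the diff specifies.

```ruby
require 'uri'

# Pattern copied from parser.rb in this diff.
RE_AT_IMPORT_RULE = /\@import[\s]+(url\()?["']+(.[^'"]*)["']\)?([\w\s\,]*);?/i

# Return [absolute_uri, media_types] pairs for each @import found in +css+.
# Hypothetical helper; the [:all] fallback is an assumption of this sketch.
def extract_imports(css, base_uri)
  css.scan(RE_AT_IMPORT_RULE).map do |_url_prefix, path, media_string|
    import_uri = URI.parse(base_uri).merge(path.strip)
    media_types = media_string.to_s.split(/[\s,]+/).reject(&:empty?).map(&:to_sym)
    media_types = [:all] if media_types.empty?
    [import_uri.to_s, media_types]
  end
end

css = <<-EOT
  @import "subdir/import2.css";
  @import url('print.css') print, handheld;
  body { margin: 0 }
EOT

extract_imports(css, 'http://example.com/styles/import1.css').each do |uri, media|
  puts "#{uri} #{media.inspect}"
end
```

Each import's media list is collected in its own local variable, so it cannot leak into the media types of the importing file.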

data/lib/css_parser/regexps.rb ADDED

@@ -0,0 +1,46 @@
+module CssParser
+  # :stopdoc:
+  # Base types
+  RE_NL = Regexp.new('(\n|\r\n|\r|\f)')
+  RE_NON_ASCII = Regexp.new('([\x00-\xFF])', Regexp::IGNORECASE)  #[^\0-\177]
+  RE_UNICODE = Regexp.new('(\\\\[0-9a-f]{1,6}(\r\n|[ \n\r\t\f])*)', Regexp::IGNORECASE | Regexp::EXTENDED | Regexp::MULTILINE)
+  RE_ESCAPE = Regexp.union(RE_UNICODE, '|(\\\\[^\n\r\f0-9a-f])')
+  RE_IDENT = Regexp.new("[\-]?([_a-z]|#{RE_NON_ASCII}|#{RE_ESCAPE})([_a-z0-9\-]|#{RE_NON_ASCII}|#{RE_ESCAPE})*", Regexp::IGNORECASE)
+  # General strings
+  RE_STRING1 = Regexp.new('(\"(.[^\n\r\f\\"]*|\\\\' + RE_NL.to_s + '|' + RE_ESCAPE.to_s + ')*\")')
+  RE_STRING2 = Regexp.new('(\'(.[^\n\r\f\\\']*|\\\\' + RE_NL.to_s + '|' + RE_ESCAPE.to_s + ')*\')')
+  RE_STRING = Regexp.union(RE_STRING1, RE_STRING2)
+  RE_URI = Regexp.new('(url\([\s]*([\s]*' + RE_STRING.to_s + '[\s]*)[\s]*\))|(url\([\s]*([!#$%&*\-~]|' + RE_NON_ASCII.to_s + '|' + RE_ESCAPE.to_s + ')*[\s]*)\)', Regexp::IGNORECASE | Regexp::EXTENDED  | Regexp::MULTILINE)
+  URI_RX = /url\(("([^"]*)"|'([^']*)'|([^)]*))\)/im
+  # Initial parsing
+  RE_AT_IMPORT_RULE = /\@import[\s]+(url\()?["']+(.[^'"]*)["']\)?([\w\s\,]*);?/i
+  #--
+  #RE_AT_MEDIA_RULE = Regexp.new('(\"(.[^\n\r\f\\"]*|\\\\' + RE_NL.to_s + '|' + RE_ESCAPE.to_s + ')*\")')
+  #RE_AT_IMPORT_RULE = Regexp.new('@import[\s]*(' + RE_STRING.to_s + ')([\w\s\,]*)[;]?', Regexp::IGNORECASE) -- should handle url() even though it is not allowed
+  #++
+  IMPORTANT_IN_PROPERTY_RX = /[\s]*\!important[\s]*/i
+  STRIP_CSS_COMMENTS_RX = /\/\*.*?\*\//m
+  STRIP_HTML_COMMENTS_RX = /\<\!\-\-|\-\-\>/m
+  # Special units
+  BOX_MODEL_UNITS_RX = /(auto|inherit|0|([\-]*([0-9]+|[0-9]*\.[0-9]+)(e[mx]+|px|[cm]+m|p[tc+]|in|\%)))([\s;]|\Z)/imx
+  RE_LENGTH_OR_PERCENTAGE = Regexp.new('([\-]*(([0-9]*\.[0-9]+)|[0-9]+)(e[mx]+|px|[cm]+m|p[tc+]|in|\%))', Regexp::IGNORECASE)
+  RE_BACKGROUND_POSITION = Regexp.new("((#{RE_LENGTH_OR_PERCENTAGE})|left|center|right|top|bottom)", Regexp::IGNORECASE | Regexp::EXTENDED)
+  FONT_UNITS_RX = /(([x]+\-)*small|medium|large[r]*|auto|inherit|([0-9]+|[0-9]*\.[0-9]+)(e[mx]+|px|[cm]+m|p[tc+]|in|\%)*)/i
+  # Patterns for specificity calculations
+  ELEMENTS_AND_PSEUDO_ELEMENTS_RX = /((^|[\s\+\>]+)[\w]+|\:(first\-line|first\-letter|before|after))/i
+  NON_ID_ATTRIBUTES_AND_PSEUDO_CLASSES_RX = /(\.[\w]+)|(\[[\w]+)|(\:(link|first\-child|lang))/i
+  # Colours
+  RE_COLOUR_RGB = Regexp.new('(rgb[\s]*\([\s-]*[\d]+(\.[\d]+)?[%\s]*,[\s-]*[\d]+(\.[\d]+)?[%\s]*,[\s-]*[\d]+(\.[\d]+)?[%\s]*\))', Regexp::IGNORECASE)
+  RE_COLOUR_HEX = /(#([0-9a-f]{6}|[0-9a-f]{3})([\s;]|$))/i
+  RE_COLOUR_NAMED = /([\s]*^)?(aqua|black|blue|fuchsia|gray|green|lime|maroon|navy|olive|orange|purple|red|silver|teal|white|yellow|transparent)([\s]*$)?/i
+  RE_COLOUR = Regexp.union(RE_COLOUR_RGB, RE_COLOUR_HEX, RE_COLOUR_NAMED)
+  # :startdoc:
+end
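
The two "patterns for specificity calculations" above can be exercised on their own. The sketch copies both patterns from the diff and combines their match counts with a count of `#` characters. The method name `specificity_sketch` and the base-10 weighting (IDs x 100, classes/attributes/pseudo-classes x 10, elements/pseudo-elements x 1) are assumptions for illustration; the gem's own calculation lives in `lib/css_parser.rb`, which is not shown in this section.

```ruby
# Patterns copied from regexps.rb in this diff.
ELEMENTS_AND_PSEUDO_ELEMENTS_RX = /((^|[\s\+\>]+)[\w]+|\:(first\-line|first\-letter|before|after))/i
NON_ID_ATTRIBUTES_AND_PSEUDO_CLASSES_RX = /(\.[\w]+)|(\[[\w]+)|(\:(link|first\-child|lang))/i

# Hypothetical helper: approximate CSS 2.1 specificity as a single integer.
# Assumes each component count stays below 10 so the digits do not overflow.
def specificity_sketch(selector)
  ids      = selector.scan(/#/).length
  classes  = selector.scan(NON_ID_ATTRIBUTES_AND_PSEUDO_CLASSES_RX).length
  elements = selector.scan(ELEMENTS_AND_PSEUDO_ELEMENTS_RX).length
  ids * 100 + classes * 10 + elements
end

['ul li', 'a:link', '#content p.intro'].each do |selector|
  puts "#{selector} => #{specificity_sketch(selector)}"
end
```

A higher number wins when two rules set the same property, which is what a cascade needs when merging rule sets.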