crass 0.2.1 → 1.0.0

Files changed (41)
  1. checksums.yaml +4 -4
  2. data/.travis.yml +4 -0
  3. data/HISTORY.md +22 -1
  4. data/LICENSE +1 -1
  5. data/README.md +64 -72
  6. data/Rakefile +4 -0
  7. data/crass.gemspec +2 -2
  8. data/lib/crass.rb +1 -1
  9. data/lib/crass/parser.rb +231 -96
  10. data/lib/crass/scanner.rb +21 -21
  11. data/lib/crass/token-scanner.rb +8 -1
  12. data/lib/crass/tokenizer.rb +133 -131
  13. data/lib/crass/version.rb +1 -1
  14. data/test/css-parsing-tests/An+B.json +156 -0
  15. data/test/css-parsing-tests/LICENSE +8 -0
  16. data/test/css-parsing-tests/README.rst +301 -0
  17. data/test/css-parsing-tests/color3.json +142 -0
  18. data/test/css-parsing-tests/color3_hsl.json +3890 -0
  19. data/test/css-parsing-tests/color3_keywords.json +803 -0
  20. data/test/css-parsing-tests/component_value_list.json +432 -0
  21. data/test/css-parsing-tests/declaration_list.json +44 -0
  22. data/test/css-parsing-tests/make_color3_hsl.py +17 -0
  23. data/test/css-parsing-tests/make_color3_keywords.py +191 -0
  24. data/test/css-parsing-tests/one_component_value.json +27 -0
  25. data/test/css-parsing-tests/one_declaration.json +46 -0
  26. data/test/css-parsing-tests/one_rule.json +36 -0
  27. data/test/css-parsing-tests/rule_list.json +48 -0
  28. data/test/css-parsing-tests/stylesheet.json +44 -0
  29. data/test/css-parsing-tests/stylesheet_bytes.json +146 -0
  30. data/test/shared/parse_rules.rb +377 -434
  31. data/test/support/common.rb +124 -0
  32. data/test/support/serialization/animate.css +3158 -0
  33. data/test/support/serialization/html5-boilerplate.css +268 -0
  34. data/test/support/serialization/misc.css +9 -0
  35. data/test/test_css_parsing_tests.rb +150 -0
  36. data/test/test_parse_properties.rb +136 -211
  37. data/test/test_parse_rules.rb +0 -52
  38. data/test/test_parse_stylesheet.rb +0 -39
  39. data/test/test_serialization.rb +13 -4
  40. metadata +44 -7
  41. data/test/test_tokenizer.rb +0 -1562
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: a113964aef62593fb97ac9cb48a4e5a0c9f20428
4
- data.tar.gz: 01c82afdd4fa11e034f8b4a89751d7e9712d2893
3
+ metadata.gz: 519ff6d013ecd7ec215befe0c10a19b571146ccd
4
+ data.tar.gz: 0ec0589dec0731c6dcff90172dc0dd6196093cde
5
5
  SHA512:
6
- metadata.gz: e534077f4b8c368a8a469451bc9d713df4d69edc68554d461276d35075e1c77c17c2dec95c2a2b1e9d8833fbeeb912683206305991d015bfd8e369842b4fb14d
7
- data.tar.gz: 1e78567ce0c1f91f232cd52e676bb4ef6826d207eb5e6c52b6ff0af924d122c4289d6c4bfedbe57f6ffa8616b46e51537dac3e1415dc03d64339efe1501f9592
6
+ metadata.gz: 54d9075e9e17ef59cee418d81a46c654bc9aadffe944f9d15cd487f182de84c345fb8b8f7e03f73be8aa5d6796caf8d2ac1cdb372f7c9cb891ae26e9ee8363a8
7
+ data.tar.gz: 965e9c63add2a8c2e34a50a9424ec9cba73f67e3a7aed615e8a494e8e0724a3762381fbae022452c19a69203ffd4e0f0471e16514dc84454d4d8f22acad7c2ec
data/.travis.yml CHANGED
@@ -3,3 +3,7 @@ rvm:
3
3
  - 1.9.2
4
4
  - 1.9.3
5
5
  - 2.0.0
6
+ - 2.1.3
7
+ - 2.1.4
8
+ - 2.1.5
9
+ - ruby-head
data/HISTORY.md CHANGED
@@ -1,6 +1,27 @@
1
1
  Crass Change History
2
2
  ====================
3
3
 
4
+ 1.0.0 (2014-11-16)
5
+ ------------------
6
+
7
+ * Many parsing and tokenization tweaks to bring us into full compliance with the
8
+ [14 November 2014 editor's draft][css-syntax-draft] of the CSS syntax spec.
9
+ The most significant outwardly visible change is that quoted URLs like
10
+ `url("foo")` are now returned as `:function` tokens and not `:url` tokens due
11
+ to a change in the tokenization spec.
12
+
13
+ * Teensy tiny speed and memory usage improvements that you almost certainly
14
+ won't notice.
15
+
16
+ * Fixed: A semicolon following a `@charset` rule would be omitted during
17
+ serialization.
18
+
19
+ * Fixed: A multibyte char at the beginning of an id token could trigger an
20
+ encoding error because `StringScanner#peek` is a jerkface.
21
+
22
+ [css-syntax-draft]:http://dev.w3.org/csswg/css-syntax-3/
23
+
24
+
4
25
  0.2.1 (2014-07-22)
5
26
  ------------------
6
27
 
@@ -20,7 +41,7 @@ Crass Change History
20
41
  functions.
21
42
 
22
43
  * Fixed: When parsing the value of an at-rule's block as a list of rules, a
23
- selector containing a function (such as "#foo:not(.bar)") would cause that
44
+ selector containing a function (such as `#foo:not(.bar)`) would cause that
24
45
  property and the rest of the token stream to be discarded.
25
46
 
26
47
 
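The multibyte fix noted in the 1.0.0 entry above comes down to `StringScanner#peek` counting bytes, not characters. The following is a minimal sketch of that behavior using only Ruby's standard `strscan` library. It does not reproduce Crass's actual fix; the sample string and the `check(/./m)` alternative are assumptions chosen for illustration.

```ruby
require 'strscan'

# "\u00e9" (e with an acute accent) occupies two bytes in UTF-8.
scanner = StringScanner.new("\u00e9x")

# peek(1) returns the next single byte, which here is only half a character,
# so the resulting string is not valid UTF-8.
first_byte = scanner.peek(1)
puts first_byte.bytesize
puts first_byte.valid_encoding?

# Matching a one-character regex returns a whole character and, like peek,
# does not advance the scan pointer.
first_char = scanner.check(/./m)
puts first_char == "\u00e9"
```

Comparing or pattern-matching the half-character string against other UTF-8 text is the kind of operation that can raise an encoding error.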
data/LICENSE CHANGED
@@ -1,4 +1,4 @@
1
- Copyright (c) 2013 Ryan Grove (ryan@wonko.com)
1
+ Copyright (c) 2014 Ryan Grove (ryan@wonko.com)
2
2
 
3
3
  Permission is hereby granted, free of charge, to any person obtaining a copy of
4
4
  this software and associated documentation files (the ‘Software’), to deal in
data/README.md CHANGED
@@ -1,22 +1,22 @@
1
1
  Crass
2
2
  =====
3
3
 
4
- Crass is a Ruby CSS parser based on the [CSS Syntax Level 3][css] draft
5
- specification.
4
+ Crass is a Ruby CSS parser that's fully compliant with the
5
+ [CSS Syntax Level 3][css] specification.
6
6
 
7
7
  * [Home](https://github.com/rgrove/crass/)
8
8
  * [API Docs](http://rubydoc.info/github/rgrove/crass/master)
9
9
 
10
- [![Build Status](https://travis-ci.org/rgrove/crass.png?branch=master)](https://travis-ci.org/rgrove/crass?branch=master)
11
- [![Gem Version](https://badge.fury.io/rb/crass.png)](http://badge.fury.io/rb/crass)
10
+ [![Build Status](https://travis-ci.org/rgrove/crass.svg?branch=master)](https://travis-ci.org/rgrove/crass)
11
+ [![Gem Version](https://badge.fury.io/rb/crass.svg)](http://badge.fury.io/rb/crass)
12
12
 
13
13
  Features
14
14
  --------
15
15
 
16
16
  * Pure Ruby, with no runtime dependencies other than Ruby 1.9.x or higher.
17
17
 
18
- * Tokenizes and parses CSS according to the rules defined in the 2013 draft of
19
- the [CSS Syntax Level 3][css] specification.
18
+ * Tokenizes and parses CSS according to the rules defined in the 14 November
19
+ 2014 editor's draft of the [CSS Syntax Level 3][css] specification.
20
20
 
21
21
  * Extremely tolerant of broken or invalid CSS. If a browser can handle it, Crass
22
22
  should be able to handle it too.
@@ -29,7 +29,7 @@ Features
29
29
  * Capable of serializing the parse tree back to CSS while maintaining all
30
30
  original whitespace, comments, and indentation.
31
31
 
32
- [css]: http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/
32
+ [css]: http://dev.w3.org/csswg/css-syntax/
33
33
 
34
34
  Problems
35
35
  --------
@@ -47,10 +47,8 @@ Problems
47
47
  (except for wholesale removal of nodes) are not reflected in the serialized
48
48
  output.
49
49
 
50
- * At the moment, Crass only supports UTF-8 input and doesn't respect `@charset`
51
- rules. Input in any other encoding will be converted to UTF-8.
52
-
53
- * Probably other things. Did I mention Crass is pretty new?
50
+ * Crass only supports UTF-8 input and doesn't respect `@charset` rules. Input in
51
+ any other encoding will be converted to UTF-8.
54
52
 
55
53
  Installing
56
54
  ----------
@@ -62,7 +60,7 @@ gem install crass
62
60
  Examples
63
61
  --------
64
62
 
65
- Say you have a string containing the following simple CSS:
63
+ Say you have a string containing some CSS:
66
64
 
67
65
  ```css
68
66
  /* Comment! */
@@ -78,62 +76,61 @@ Parsing it is simple:
78
76
  tree = Crass.parse(css, :preserve_comments => true)
79
77
  ```
80
78
 
81
- This returns a big fat ugly parse tree, which looks like this:
79
+ This returns a big fat beautiful parse tree, which looks like this:
82
80
 
83
81
  ```ruby
84
82
  [{:node=>:comment, :pos=>0, :raw=>"/* Comment! */", :value=>" Comment! "},
85
83
  {:node=>:whitespace, :pos=>14, :raw=>"\n"},
86
- {:node=>:property,
87
- :name=>"a",
88
- :value=>"hover {\n color: #0d8bfa",
89
- :children=>
90
- [{:node=>:ident, :pos=>17, :raw=>"hover", :value=>"hover"},
91
- {:node=>:whitespace, :pos=>22, :raw=>" "},
92
- {:node=>:"{", :pos=>23, :raw=>"{"},
93
- {:node=>:whitespace, :pos=>24, :raw=>"\n "},
94
- {:node=>:ident, :pos=>27, :raw=>"color", :value=>"color"},
95
- {:node=>:colon, :pos=>32, :raw=>":"},
96
- {:node=>:whitespace, :pos=>33, :raw=>" "},
97
- {:node=>:hash,
98
- :pos=>34,
99
- :raw=>"#0d8bfa",
100
- :type=>:unrestricted,
101
- :value=>"0d8bfa"}],
102
- :important=>false,
103
- :tokens=>
104
- [{:node=>:ident, :pos=>15, :raw=>"a", :value=>"a"},
105
- {:node=>:colon, :pos=>16, :raw=>":"},
106
- {:node=>:ident, :pos=>17, :raw=>"hover", :value=>"hover"},
107
- {:node=>:whitespace, :pos=>22, :raw=>" "},
108
- {:node=>:"{", :pos=>23, :raw=>"{"},
109
- {:node=>:whitespace, :pos=>24, :raw=>"\n "},
110
- {:node=>:ident, :pos=>27, :raw=>"color", :value=>"color"},
111
- {:node=>:colon, :pos=>32, :raw=>":"},
112
- {:node=>:whitespace, :pos=>33, :raw=>" "},
113
- {:node=>:hash,
114
- :pos=>34,
115
- :raw=>"#0d8bfa",
116
- :type=>:unrestricted,
117
- :value=>"0d8bfa"},
118
- {:node=>:semicolon, :pos=>41, :raw=>";"}]},
119
- {:node=>:whitespace, :pos=>42, :raw=>"\n "},
120
- {:node=>:property,
121
- :name=>"text-decoration",
122
- :value=>"underline",
84
+ {:node=>:style_rule,
85
+ :selector=>
86
+ {:node=>:selector,
87
+ :value=>"a:hover",
88
+ :tokens=>
89
+ [{:node=>:ident, :pos=>15, :raw=>"a", :value=>"a"},
90
+ {:node=>:colon, :pos=>16, :raw=>":"},
91
+ {:node=>:ident, :pos=>17, :raw=>"hover", :value=>"hover"},
92
+ {:node=>:whitespace, :pos=>22, :raw=>" "}]},
123
93
  :children=>
124
- [{:node=>:whitespace, :pos=>61, :raw=>" "},
125
- {:node=>:ident, :pos=>62, :raw=>"underline", :value=>"underline"}],
126
- :important=>false,
127
- :tokens=>
128
- [{:node=>:ident,
129
- :pos=>45,
130
- :raw=>"text-decoration",
131
- :value=>"text-decoration"},
132
- {:node=>:colon, :pos=>60, :raw=>":"},
133
- {:node=>:whitespace, :pos=>61, :raw=>" "},
134
- {:node=>:ident, :pos=>62, :raw=>"underline", :value=>"underline"},
135
- {:node=>:semicolon, :pos=>71, :raw=>";"}]},
136
- {:node=>:whitespace, :pos=>72, :raw=>"\n"}]
94
+ [{:node=>:whitespace, :pos=>24, :raw=>"\n "},
95
+ {:node=>:property,
96
+ :name=>"color",
97
+ :value=>"#0d8bfa",
98
+ :children=>
99
+ [{:node=>:whitespace, :pos=>33, :raw=>" "},
100
+ {:node=>:hash,
101
+ :pos=>34,
102
+ :raw=>"#0d8bfa",
103
+ :type=>:unrestricted,
104
+ :value=>"0d8bfa"}],
105
+ :important=>false,
106
+ :tokens=>
107
+ [{:node=>:ident, :pos=>27, :raw=>"color", :value=>"color"},
108
+ {:node=>:colon, :pos=>32, :raw=>":"},
109
+ {:node=>:whitespace, :pos=>33, :raw=>" "},
110
+ {:node=>:hash,
111
+ :pos=>34,
112
+ :raw=>"#0d8bfa",
113
+ :type=>:unrestricted,
114
+ :value=>"0d8bfa"}]},
115
+ {:node=>:semicolon, :pos=>41, :raw=>";"},
116
+ {:node=>:whitespace, :pos=>42, :raw=>"\n "},
117
+ {:node=>:property,
118
+ :name=>"text-decoration",
119
+ :value=>"underline",
120
+ :children=>
121
+ [{:node=>:whitespace, :pos=>61, :raw=>" "},
122
+ {:node=>:ident, :pos=>62, :raw=>"underline", :value=>"underline"}],
123
+ :important=>false,
124
+ :tokens=>
125
+ [{:node=>:ident,
126
+ :pos=>45,
127
+ :raw=>"text-decoration",
128
+ :value=>"text-decoration"},
129
+ {:node=>:colon, :pos=>60, :raw=>":"},
130
+ {:node=>:whitespace, :pos=>61, :raw=>" "},
131
+ {:node=>:ident, :pos=>62, :raw=>"underline", :value=>"underline"}]},
132
+ {:node=>:semicolon, :pos=>71, :raw=>";"},
133
+ {:node=>:whitespace, :pos=>72, :raw=>"\n"}]}]
137
134
  ```
138
135
 
139
136
  If you want, you can stringify the parse tree:
@@ -157,20 +154,15 @@ Wasn't that exciting?
157
154
  A Note on Versioning
158
155
  --------------------
159
156
 
160
- Crass's version number currently has a "0.x" prefix, indicating that it's a new
161
- project under heavy development. **As long as the version number starts with
162
- "0.x", minor revisions may introduce breaking changes.** You've been warned!
163
-
164
- Once Crass reaches version 1.0.0, it will adhere strictly to
165
- [SemVer 2.0][semver].
157
+ As of version 1.0.0, Crass adheres strictly to [SemVer 2.0][semver].
166
158
 
167
159
  [semver]:http://semver.org/spec/v2.0.0.html
168
160
 
169
161
  Contributing
170
162
  ------------
171
163
 
172
- The best way to contribute right now is to use Crass and [create issues][issue]
173
- when you run into problems.
164
+ The best way to contribute is to use Crass and [create issues][issue] when you
165
+ run into problems.
174
166
 
175
167
  Pull requests that fix bugs are more than welcome as long as they include tests.
176
168
  Please adhere to the style and format of the surrounding code, or I might ask
@@ -202,7 +194,7 @@ tokenizing and parsing rules that Crass implements.
202
194
  License
203
195
  -------
204
196
 
205
- Copyright (c) 2013 Ryan Grove (ryan@wonko.com)
197
+ Copyright (c) 2014 Ryan Grove (ryan@wonko.com)
206
198
 
207
199
  Permission is hereby granted, free of charge, to any person obtaining a copy of
208
200
  this software and associated documentation files (the ‘Software’), to deal in
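The README change above shows the new tree shape: a `:style_rule` node holds a `:selector` and a list of `:children`, and each `:property` child carries `:name` and `:value`. As a rough sketch of how such a tree might be walked, the snippet below uses a hand-written tree trimmed to those keys. It does not call Crass, and `declarations_by_selector` is a hypothetical helper, not part of the gem.

```ruby
# A hand-written tree in the shape shown in the README output
# (trimmed to the keys used here); Crass is not invoked in this sketch.
tree = [
  {:node => :comment, :value => " Comment! "},
  {:node => :whitespace, :raw => "\n"},
  {:node => :style_rule,
   :selector => {:node => :selector, :value => "a:hover"},
   :children => [
     {:node => :whitespace, :raw => "\n  "},
     {:node => :property, :name => "color", :value => "#0d8bfa", :important => false},
     {:node => :semicolon, :raw => ";"},
     {:node => :property, :name => "text-decoration", :value => "underline", :important => false},
     {:node => :semicolon, :raw => ";"}
   ]}
]

# Hypothetical helper: map each selector to a hash of its declared properties,
# skipping whitespace, comment, and semicolon nodes.
def declarations_by_selector(tree)
  tree.each_with_object({}) do |node, result|
    next unless node[:node] == :style_rule

    props = node[:children].select { |child| child[:node] == :property }
    result[node[:selector][:value]] = props.map { |prop| [prop[:name], prop[:value]] }.to_h
  end
end

summary = declarations_by_selector(tree)
p summary
```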
data/Rakefile CHANGED
@@ -5,3 +5,7 @@ Bundler::GemHelper.install_tasks
5
5
 
6
6
  Rake::TestTask.new
7
7
  task :default => [:test]
8
+
9
+ task :'pull-css-tests' do
10
+ sh 'git subtree pull -P test/css-parsing-tests https://github.com/SimonSapin/css-parsing-tests.git master --squash'
11
+ end
data/crass.gemspec CHANGED
@@ -3,8 +3,8 @@ require './lib/crass/version'
3
3
 
4
4
  Gem::Specification.new do |s|
5
5
  s.name = 'crass'
6
- s.summary = 'CSS parser based on the CSS Syntax Level 3 draft.'
7
- s.description = 'Crass is a pure Ruby CSS parser based on the CSS Syntax Level 3 draft.'
6
+ s.summary = 'CSS parser based on the CSS Syntax Level 3 spec.'
7
+ s.description = 'Crass is a pure Ruby CSS parser based on the CSS Syntax Level 3 spec.'
8
8
  s.version = Crass::VERSION
9
9
  s.authors = ['Ryan Grove']
10
10
  s.email = ['ryan@wonko.com']
data/lib/crass.rb CHANGED
@@ -1,7 +1,7 @@
1
1
  # encoding: utf-8
2
2
  require_relative 'crass/parser'
3
3
 
4
- # A CSS parser based on the CSS Syntax Module Level 3 draft.
4
+ # A CSS parser based on the CSS Syntax Module Level 3 spec.
5
5
  module Crass
6
6
 
7
7
  # Parses _input_ as a CSS stylesheet and returns a parse tree.
data/lib/crass/parser.rb CHANGED
@@ -6,7 +6,7 @@ module Crass
6
6
 
7
7
  # Parses a CSS string or list of tokens.
8
8
  #
9
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parsing
9
+ # 5. http://dev.w3.org/csswg/css-syntax/#parsing
10
10
  class Parser
11
11
  BLOCK_END_TOKENS = {
12
12
  :'{' => :'}',
@@ -21,18 +21,18 @@ module Crass
21
21
  #
22
22
  # See {Tokenizer#initialize} for _options_.
23
23
  #
24
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-list-of-declarations
24
+ # 5.3.6. http://dev.w3.org/csswg/css-syntax/#parse-a-list-of-declarations
25
25
  def self.parse_properties(input, options = {})
26
26
  Parser.new(input, options).parse_properties
27
27
  end
28
28
 
29
29
  # Parses CSS rules (such as the content of a `@media` block) and returns a
30
- # parse tree. The only difference from {#parse_stylesheet} is that CDO/CDC
30
+ # parse tree. The only difference from {parse_stylesheet} is that CDO/CDC
31
31
  # nodes (`<!--` and `-->`) aren't ignored.
32
32
  #
33
33
  # See {Tokenizer#initialize} for _options_.
34
34
  #
35
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-list-of-rules
35
+ # 5.3.3. http://dev.w3.org/csswg/css-syntax/#parse-a-list-of-rules
36
36
  def self.parse_rules(input, options = {})
37
37
  parser = Parser.new(input, options)
38
38
  rules = parser.consume_rules
@@ -50,7 +50,7 @@ module Crass
50
50
  #
51
51
  # See {Tokenizer#initialize} for _options_.
52
52
  #
53
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-stylesheet
53
+ # 5.3.2. http://dev.w3.org/csswg/css-syntax/#parse-a-stylesheet
54
54
  def self.parse_stylesheet(input, options = {})
55
55
  parser = Parser.new(input, options)
56
56
  rules = parser.consume_rules(:top_level => true)
@@ -79,20 +79,9 @@ module Crass
79
79
  next if node.nil?
80
80
 
81
81
  case node[:node]
82
- when :at_rule
83
- string << node[:tokens].first[:raw]
84
- string << self.stringify(node[:prelude], options)
85
-
86
- if node[:block]
87
- string << self.stringify(node[:block], options)
88
- end
89
-
90
82
  when :comment
91
83
  string << node[:raw] unless options[:exclude_comments]
92
84
 
93
- when :property
94
- string << self.stringify(node[:tokens], options)
95
-
96
85
  when :simple_block
97
86
  string << node[:start]
98
87
  string << self.stringify(node[:value], options)
@@ -100,7 +89,7 @@ module Crass
100
89
 
101
90
  when :style_rule
102
91
  string << self.stringify(node[:selector][:tokens], options)
103
- string << "{#{self.stringify(node[:children], options)}}"
92
+ string << '{' << self.stringify(node[:children], options) << '}'
104
93
 
105
94
  else
106
95
  if node.key?(:raw)
@@ -133,7 +122,7 @@ module Crass
133
122
 
134
123
  # Consumes an at-rule and returns it.
135
124
  #
136
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-an-at-rule
125
+ # 5.4.2. http://dev.w3.org/csswg/css-syntax-3/#consume-an-at-rule
137
126
  def consume_at_rule(input = @tokens)
138
127
  rule = {}
139
128
 
@@ -143,6 +132,7 @@ module Crass
143
132
 
144
133
  while token = input.consume
145
134
  case token[:node]
135
+ # Non-standard.
146
136
  when :comment
147
137
  next
148
138
 
@@ -150,17 +140,21 @@ module Crass
150
140
  break
151
141
 
152
142
  when :'{'
153
- rule[:block] = consume_simple_block(input)
143
+ # Note: The spec says the block should _be_ the consumed simple
144
+ # block, but Simon Sapin's CSS parsing tests and tinycss2 expect
145
+ # only the _value_ of the consumed simple block here. I assume I'm
146
+ # interpreting the spec too literally, so I'm going with the
147
+ # tinycss2 behavior.
148
+ rule[:block] = consume_simple_block(input)[:value]
154
149
  break
155
150
 
156
- when :simple_block
157
- if token[:start] == '{'
158
- rule[:block] = token
159
- break
160
- else
161
- input.reconsume
162
- rule[:prelude] << consume_component_value(input)
163
- end
151
+ when :simple_block && token[:start] == '{'
152
+ # Note: The spec says the block should _be_ the simple block, but
153
+ # Simon Sapin's CSS parsing tests and tinycss2 expect only the
154
+ # _value_ of the simple block here. I assume I'm interpreting the
155
+ # spec too literally, so I'm going with the tinycss2 behavior.
156
+ rule[:block] = token[:value]
157
+ break
164
158
 
165
159
  else
166
160
  input.reconsume
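A note on the `when :simple_block && token[:start] == '{'` line in the hunk above: Ruby evaluates a `when` expression first and then compares the result with `===`, so the combined expression becomes a boolean before it is compared to the node type. The sketch below uses a hypothetical token hash to show that behavior. It relies only on Ruby's `case` semantics and was not verified by running Crass itself.

```ruby
# Hypothetical token, shaped like the hashes in the diff.
token = {:node => :simple_block, :start => '{'}

# `:simple_block && token[:start] == '{'` evaluates to `true`, and
# `true === :simple_block` is false, so this branch is not taken.
combined = case token[:node]
           when :simple_block && token[:start] == '{' then :matched
           else :fell_through
           end

# Matching the symbol in `when` and testing the extra condition inside the
# branch expresses the check the combined form seems to intend.
nested = case token[:node]
         when :simple_block
           token[:start] == '{' ? :matched : :fell_through
         else
           :fell_through
         end

puts combined
puts nested
```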
@@ -172,9 +166,10 @@ module Crass
172
166
  create_node(:at_rule, rule)
173
167
  end
174
168
 
175
- # Consumes a component value and returns it.
169
+ # Consumes a component value and returns it, or `nil` if there are no more
170
+ # tokens.
176
171
  #
177
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-component-value
172
+ # 5.4.6. http://dev.w3.org/csswg/css-syntax-3/#consume-a-component-value
178
173
  def consume_component_value(input = @tokens)
179
174
  return nil unless token = input.consume
180
175
 
@@ -184,7 +179,9 @@ module Crass
184
179
 
185
180
  when :function
186
181
  if token.key?(:name)
187
- # This is a parsed function, not a function token.
182
+ # This is a parsed function, not a function token. This step isn't
183
+ # mentioned in the spec, but it's necessary to avoid re-parsing
184
+ # functions that have already been parsed.
188
185
  token
189
186
  else
190
187
  consume_function(input)
@@ -197,7 +194,7 @@ module Crass
197
194
 
198
195
  # Consumes a declaration and returns it, or `nil` on parse error.
199
196
  #
200
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-declaration
197
+ # 5.4.5. http://dev.w3.org/csswg/css-syntax-3/#consume-a-declaration
201
198
  def consume_declaration(input = @tokens)
202
199
  declaration = {}
203
200
  value = []
@@ -205,78 +202,88 @@ module Crass
205
202
  declaration[:tokens] = input.collect do
206
203
  declaration[:name] = input.consume[:value]
207
204
 
208
- token = input.consume
209
- token = input.consume while token && token[:node] == :whitespace
210
-
211
- return nil if !token || token[:node] != :colon # TODO: parse error
212
- value << token while token = input.consume
213
- end
214
-
215
- # Look for !important.
216
- pos = -1
217
- while token = value[pos]
218
- type = token[:node]
205
+ next_token = input.peek
219
206
 
220
- if type == :whitespace || type == :comment || type == :semicolon
221
- pos -= 1
222
- next
207
+ while next_token && next_token[:node] == :whitespace
208
+ input.consume
209
+ next_token = input.peek
223
210
  end
224
211
 
225
- if type == :ident && token[:value].downcase == 'important'
226
- prev_token = value[pos - 1]
212
+ unless next_token && next_token[:node] == :colon
213
+ # Parse error.
214
+ #
215
+ # Note: The spec explicitly says to return nothing here, but Simon
216
+ # Sapin's CSS parsing tests expect an error node.
217
+ return create_node(:error, :value => 'invalid')
218
+ end
227
219
 
228
- if prev_token && prev_token[:node] == :delim &&
229
- prev_token[:value] == '!'
220
+ input.consume
230
221
 
231
- declaration[:important] = true
232
- value.slice!(pos - 1, 2)
233
- else
234
- break
235
- end
236
- else
237
- break
222
+ until input.peek.nil?
223
+ value << consume_component_value(input)
238
224
  end
239
225
  end
240
226
 
227
+ # Look for !important.
228
+ important_tokens = value.reject {|token|
229
+ node = token[:node]
230
+ node == :whitespace || node == :comment || node == :semicolon
231
+ }.last(2)
232
+
233
+ if important_tokens.size == 2 &&
234
+ important_tokens[0][:node] == :delim &&
235
+ important_tokens[0][:value] == '!' &&
236
+ important_tokens[1][:node] == :ident &&
237
+ important_tokens[1][:value].downcase == 'important'
238
+
239
+ declaration[:important] = true
240
+ excl_index = value.index(important_tokens[0])
241
+
242
+ # Technically the spec doesn't require us to trim trailing tokens after
243
+ # the !important, but Simon Sapin's CSS parsing tests expect it and
244
+ # tinycss2 does it, so we'll go along with the cool kids.
245
+ value.slice!(excl_index, value.size - excl_index)
246
+ else
247
+ declaration[:important] = false
248
+ end
249
+
241
250
  declaration[:value] = value
242
251
  create_node(:declaration, declaration)
243
252
  end
244
253
 
245
254
  # Consumes a list of declarations and returns them.
246
255
  #
247
- # NOTE: The returned list may include `:comment`, `:semicolon`, and
256
+ # By default, the returned list may include `:comment`, `:semicolon`, and
248
257
  # `:whitespace` nodes, which is non-standard.
249
258
  #
250
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations
251
- def consume_declarations(input = @tokens)
259
+ # Options:
260
+ #
261
+ # * **:strict** - Set to `true` to exclude non-standard `:comment`,
262
+ # `:semicolon`, and `:whitespace` nodes.
263
+ #
264
+ # 5.4.4. http://dev.w3.org/csswg/css-syntax/#consume-a-list-of-declarations
265
+ def consume_declarations(input = @tokens, options = {})
252
266
  declarations = []
253
267
 
254
268
  while token = input.consume
255
269
  case token[:node]
270
+
271
+ # Non-standard: Preserve comments, semicolons, and whitespace.
256
272
  when :comment, :semicolon, :whitespace
257
- declarations << token
273
+ declarations << token unless options[:strict]
258
274
 
259
275
  when :at_keyword
260
- # TODO: this is technically a parse error when parsing a style rule,
261
- # but not necessarily at other times.
262
-
263
- # Note: The spec doesn't say we should reconsume here, but it's
264
- # necessary since `consume_at_rule` must consume the `:at_keyword` as
265
- # the rule's name or it'll end up in the prelude. The spec *does* say
266
- # we should reconsume when an `:at_keyword` is encountered in
267
- # `consume_rules`, so we either have to reconsume in both places or in
268
- # neither place. I've chosen to reconsume in both places.
276
+ # When parsing a style rule, this is a parse error. Otherwise it's
277
+ # not.
269
278
  input.reconsume
270
279
  declarations << consume_at_rule(input)
271
280
 
272
281
  when :ident
273
282
  decl_tokens = [token]
274
- input.consume
275
283
 
276
- while input.current
277
- decl_tokens << input.current
278
- break if input.current[:node] == :semicolon
279
- input.consume
284
+ while next_token = input.peek
285
+ break if next_token[:node] == :semicolon
286
+ decl_tokens << consume_component_value(input)
280
287
  end
281
288
 
282
289
  if decl = consume_declaration(TokenScanner.new(decl_tokens))
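The new `!important` handling in `consume_declaration` looks at the last two significant tokens of the value and, when they are `!` followed by `important`, trims everything from the `!` onward. Below is a standalone sketch of that approach on plain token hashes. `split_important` is a hypothetical helper name, and the code does not depend on Crass's `TokenScanner`.

```ruby
# Hypothetical standalone helper mirroring the approach in the diff.
# Returns [important?, trimmed_value].
def split_important(value)
  # Ignore trailing whitespace, comments, and semicolons when looking for
  # the "!" and "important" pair.
  significant = value.reject { |token|
    [:whitespace, :comment, :semicolon].include?(token[:node])
  }.last(2)

  bang, ident = significant

  if significant.size == 2 &&
     bang[:node] == :delim && bang[:value] == '!' &&
     ident[:node] == :ident && ident[:value].downcase == 'important'
    # Drop the "!" token and everything after it.
    [true, value.take(value.index(bang))]
  else
    [false, value]
  end
end

tokens = [
  {:node => :whitespace, :raw => ' '},
  {:node => :ident, :value => 'red'},
  {:node => :whitespace, :raw => ' '},
  {:node => :delim, :value => '!'},
  {:node => :ident, :value => 'IMPORTANT'},
  {:node => :whitespace, :raw => ' '}
]

important, trimmed = split_important(tokens)
puts important
puts trimmed.length
```

One simplification to keep in mind: `Array#index` compares hashes by value, which is safe here because real tokens also carry a distinct `:pos`.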
@@ -284,9 +291,17 @@ module Crass
284
291
  end
285
292
 
286
293
  else
287
- # TODO: parse error (invalid property name, etc.)
288
- while token && token[:node] != :semicolon
289
- token = consume_component_value(input)
294
+ # Parse error (invalid property name, etc.).
295
+ #
296
+ # Note: The spec doesn't say we should append anything to the list of
297
+ # declarations here, but Simon Sapin's CSS parsing tests expect an
298
+ # error node.
299
+ declarations << create_node(:error, :value => 'invalid')
300
+ input.reconsume
301
+
302
+ while next_token = input.peek
303
+ break if next_token[:node] == :semicolon
304
+ consume_component_value(input)
290
305
  end
291
306
  end
292
307
  end
@@ -296,12 +311,12 @@ module Crass
296
311
 
297
312
  # Consumes a function and returns it.
298
313
  #
299
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-function
314
+ # 5.4.8. http://dev.w3.org/csswg/css-syntax-3/#consume-a-function
300
315
  def consume_function(input = @tokens)
301
316
  function = {
302
317
  :name => input.current[:value],
303
318
  :value => [],
304
- :tokens => [input.current]
319
+ :tokens => [input.current] # Non-standard, used for serialization.
305
320
  }
306
321
 
307
322
  function[:tokens].concat(input.collect {
@@ -310,6 +325,7 @@ module Crass
310
325
  when :')'
311
326
  break
312
327
 
328
+ # Non-standard.
313
329
  when :comment
314
330
  next
315
331
 
@@ -326,19 +342,34 @@ module Crass
326
342
  # Consumes a qualified rule and returns it, or `nil` if a parse error
327
343
  # occurs.
328
344
  #
329
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-qualified-rule
345
+ # 5.4.3. http://dev.w3.org/csswg/css-syntax-3/#consume-a-qualified-rule
330
346
  def consume_qualified_rule(input = @tokens)
331
347
  rule = {:prelude => []}
332
348
 
333
349
  rule[:tokens] = input.collect do
334
350
  while true
335
- return nil unless token = input.consume
351
+ unless token = input.consume
352
+ # Parse error.
353
+ #
354
+ # Note: The spec explicitly says to return nothing here, but Simon
355
+ # Sapin's CSS parsing tests expect an error node.
356
+ return create_node(:error, :value => 'invalid')
357
+ end
336
358
 
337
359
  if token[:node] == :'{'
338
- rule[:block] = consume_simple_block(input)
360
+ # Note: The spec says the block should _be_ the consumed simple
361
+ # block, but Simon Sapin's CSS parsing tests and tinycss2 expect
362
+ # only the _value_ of the consumed simple block here. I assume I'm
363
+ # interpreting the spec too literally, so I'm going with the
364
+ # tinycss2 behavior.
365
+ rule[:block] = consume_simple_block(input)[:value]
339
366
  break
340
- elsif token[:node] == :simple_block
341
- rule[:block] = token
367
+ elsif token[:node] == :simple_block && token[:start] == '{'
368
+ # Note: The spec says the block should _be_ the simple block, but
369
+ # Simon Sapin's CSS parsing tests and tinycss2 expect only the
370
+ # _value_ of the simple block here. I assume I'm interpreting the
371
+ # spec too literally, so I'm going with the tinycss2 behavior.
372
+ rule[:block] = token[:value]
342
373
  break
343
374
  else
344
375
  input.reconsume
@@ -352,12 +383,14 @@ module Crass
352
383
 
353
384
  # Consumes a list of rules and returns them.
354
385
  #
355
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-rules
386
+ # 5.4.1. http://dev.w3.org/csswg/css-syntax/#consume-a-list-of-rules
356
387
  def consume_rules(flags = {})
357
388
  rules = []
358
389
 
359
390
  while token = @tokens.consume
360
391
  case token[:node]
392
+ # Non-standard. Spec says to discard comments and whitespace, but we
393
+ # keep them so we can serialize faithfully.
361
394
  when :comment, :whitespace
362
395
  rules << token
363
396
 
@@ -386,7 +419,7 @@ module Crass
386
419
  # Consumes and returns a simple block associated with the current input
387
420
  # token.
388
421
  #
389
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-simple-block0
422
+ # 5.4.7. http://dev.w3.org/csswg/css-syntax/#consume-a-simple-block
390
423
  def consume_simple_block(input = @tokens)
391
424
  start_token = input.current[:node]
392
425
  end_token = BLOCK_END_TOKENS[start_token]
@@ -395,7 +428,7 @@ module Crass
395
428
  :start => start_token.to_s,
396
429
  :end => end_token.to_s,
397
430
  :value => [],
398
- :tokens => [input.current]
431
+ :tokens => [input.current] # Non-standard. Used for serialization.
399
432
  }
400
433
 
401
434
  block[:tokens].concat(input.collect do
@@ -427,25 +460,96 @@ module Crass
427
460
 
428
461
  # Creates a `:style_rule` node from the given qualified _rule_, and returns
429
462
  # it.
430
- #
431
- # * http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#style-rules
432
- # * http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations0
433
463
  def create_style_rule(rule)
434
464
  create_node(:style_rule,
435
465
  :selector => create_selector(rule[:prelude]),
436
- :children => parse_properties(rule[:block][:value]))
466
+ :children => parse_properties(rule[:block]))
467
+ end
468
+
469
+ # Parses a single component value and returns it.
470
+ #
471
+ # 5.3.7. http://dev.w3.org/csswg/css-syntax-3/#parse-a-component-value
472
+ def parse_component_value(input = @tokens)
473
+ input = TokenScanner.new(input) unless input.is_a?(TokenScanner)
474
+
475
+ while input.peek && input.peek[:node] == :whitespace
476
+ input.consume
477
+ end
478
+
479
+ if input.peek.nil?
480
+ return create_node(:error, :value => 'empty')
481
+ end
482
+
483
+ value = consume_component_value(input)
484
+
485
+ while input.peek && input.peek[:node] == :whitespace
486
+ input.consume
487
+ end
488
+
489
+ if input.peek.nil?
490
+ value
491
+ else
492
+ create_node(:error, :value => 'extra-input')
493
+ end
494
+ end
495
+
496
+ # Parses a list of component values and returns an array of parsed tokens.
497
+ #
498
+ # 5.3.8. http://dev.w3.org/csswg/css-syntax/#parse-a-list-of-component-values
499
+ def parse_component_values(input = @tokens)
500
+ input = TokenScanner.new(input) unless input.is_a?(TokenScanner)
501
+ tokens = []
502
+
503
+ while token = consume_component_value(input)
504
+ tokens << token
505
+ end
506
+
507
+ tokens
508
+ end
509
+
510
+ # Parses a single declaration and returns it.
511
+ #
512
+ # 5.3.5. http://dev.w3.org/csswg/css-syntax/#parse-a-declaration
513
+ def parse_declaration(input = @tokens)
514
+ input = TokenScanner.new(input) unless input.is_a?(TokenScanner)
515
+
516
+ while input.peek && input.peek[:node] == :whitespace
517
+ input.consume
518
+ end
519
+
520
+ if input.peek.nil?
521
+ # Syntax error.
522
+ return create_node(:error, :value => 'empty')
523
+ elsif input.peek[:node] != :ident
524
+ # Syntax error.
525
+ return create_node(:error, :value => 'invalid')
526
+ end
527
+
528
+ if decl = consume_declaration(input)
529
+ return decl
530
+ end
531
+
532
+ # Syntax error.
533
+ create_node(:error, :value => 'invalid')
534
+ end
535
+
536
+ # Parses a list of declarations and returns them.
537
+ #
538
+ # See {#consume_declarations} for _options_.
539
+ #
540
+ # 5.3.6. http://dev.w3.org/csswg/css-syntax/#parse-a-list-of-declarations
541
+ def parse_declarations(input = @tokens, options = {})
542
+ input = TokenScanner.new(input) unless input.is_a?(TokenScanner)
543
+ consume_declarations(input, options)
437
544
  end
438
545
 
439
546
  # Parses a list of declarations and returns an array of `:property` nodes
440
547
  # (and any non-declaration nodes that were in the input). This is useful for
441
548
  # parsing the contents of an HTML element's `style` attribute.
442
- #
443
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-list-of-declarations
444
549
  def parse_properties(input = @tokens)
445
- input = TokenScanner.new(input) unless input.is_a?(TokenScanner)
446
550
  properties = []
447
551
 
448
- consume_declarations(input).each do |decl|
552
+ parse_declarations(input).each do |decl|
449
553
  unless decl[:node] == :declaration
450
554
  properties << decl
451
555
  next
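`parse_component_value`, `parse_declaration`, and `parse_rule` in the hunk above share one pattern: skip leading whitespace, report `empty` if nothing is left, consume one item, skip trailing whitespace, and report `extra-input` if anything remains. Here is a simplified sketch of that pattern on a plain array of token hashes. `parse_single` is a hypothetical helper, and it treats each token as one complete value, which the real parser does not (blocks and functions span several tokens).

```ruby
# Hypothetical helper: parse exactly one value from an array of token hashes.
def parse_single(tokens)
  # Skip leading whitespace.
  rest = tokens.drop_while { |token| token[:node] == :whitespace }
  return {:node => :error, :value => 'empty'} if rest.empty?

  # Simplification: one token stands in for one component value.
  value = rest.first

  # Skip trailing whitespace; anything left over is an error.
  trailing = rest.drop(1).drop_while { |token| token[:node] == :whitespace }
  trailing.empty? ? value : {:node => :error, :value => 'extra-input'}
end

space = {:node => :whitespace, :raw => ' '}
ident = {:node => :ident, :value => 'foo'}

p parse_single([space, ident, space])
p parse_single([space])
p parse_single([ident, space, ident])
```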
@@ -458,13 +562,44 @@ module Crass
458
562
  :name => decl[:name],
459
563
  :value => parse_value(decl[:value]),
460
564
  :children => children,
461
- :important => decl[:important] == true,
565
+ :important => decl[:important],
462
566
  :tokens => decl[:tokens])
463
567
  end
464
568
 
465
569
  properties
466
570
  end
467
571
 
572
+ # Parses a single rule and returns it.
573
+ #
574
+ # 5.3.4. http://dev.w3.org/csswg/css-syntax-3/#parse-a-rule
575
+ def parse_rule(input = @tokens)
576
+ input = TokenScanner.new(input) unless input.is_a?(TokenScanner)
577
+
578
+ while input.peek && input.peek[:node] == :whitespace
579
+ input.consume
580
+ end
581
+
582
+ if input.peek.nil?
583
+ # Syntax error.
584
+ return create_node(:error, :value => 'empty')
585
+ elsif input.peek[:node] == :at_keyword
586
+ rule = consume_at_rule(input)
587
+ else
588
+ rule = consume_qualified_rule(input)
589
+ end
590
+
591
+ while input.peek && input.peek[:node] == :whitespace
592
+ input.consume
593
+ end
594
+
595
+ if input.peek.nil?
596
+ rule
597
+ else
598
+ # Syntax error.
599
+ create_node(:error, :value => 'extra-input')
600
+ end
601
+ end
602
+
468
603
  # Returns the unescaped value of a selector name or property declaration.
469
604
  def parse_value(nodes)
470
605
  nodes = [nodes] unless nodes.is_a?(Array)