crass 0.0.2 → 0.1.0

checksums.yaml CHANGED
@@ -1,15 +1,15 @@
  ---
  !binary "U0hBMQ==":
    metadata.gz: !binary |-
-     Y2NmZWM1MTg3MjVmNjcxN2ZjN2JhZmE4YzU0NjUxYWFjMDI0MjQwMA==
+     OGNiZTc1MDI4ZWMwYmQxZGI4ODE4Yzc1YzNkNTU2MDJjODc1NTI0Mg==
    data.tar.gz: !binary |-
-     NmZmMGJmM2JkYmRiY2JhYjI2N2E1NzVhNDFjNzc3ZjJiZDViNzg3ZQ==
+     NTdlYmJkNTZlYjI3ZjE3Njc2ZDRiMzkxYWZjOTQ0OWE3ZTcxYWRiOQ==
  SHA512:
    metadata.gz: !binary |-
-     ZjRkYzI5Zjk1NGZlOTEzOGI2YTRmMThhMTdiYzNjNDYyNGU4NzhiMjExYTY4
-     NDQ4ZDJlYjlhYTBiNTE0OTcwNTk4ZTgxM2VjNGNkMmVkNjgwYzcwNmZiZGRh
-     MDM5OWE4YzM3NWIwZjhkN2VhODVlZmJkMDYzMGJjMjgyM2NkNjY=
+     ZjA0NTZkOTAwYTYwZmNjMDgyMzE1N2NkZDQ5ZTlkNDg0NzZmYTk4OGNiMjli
+     MDdlNzQwNTAwZWZmMjkzY2U3N2NkMjEyOWU1ZDMyZWZlYjU1NGRjNzg1Y2Ez
+     ZWZiYjg5ZTk2YmFhNWQ4NDk1ZDEzNmUwYjg2NDE0NjkyNzFmMjY=
    data.tar.gz: !binary |-
-     ZmMyOWZiNWNkYmQ5OGE3MGFkOWEyMWJiY2JmODYwM2Y0MTRhNzhhZWM1NGE0
-     NTU2NTI1MGYyNzVlZDE3ZTZmYWQ4NjkyMmY2NWU4NDA4NGNlNDAyOWIxYWY4
-     NmEzNmVkYjYyNTNhMjkwODFkMDU2MmMzOGJkNGZlM2RlYmMwY2E=
+     NTE5NzI2MDViOWU5YzE4NzFmOWNlZmU0MWM5NTJhNGIyZTMyMTBkOGVjNWQ5
+     MmQ5NDZhYWY5YjIzMTgyNzNjNzZmY2I1NDE3YzFmMzM2Njk1NGQzMzQ3OTkw
+     MjFmODllYmM5NDQ5MTk3YjAwN2I3M2NmNzU4NDZkYzQwYzE1NWE=
data/.gitignore CHANGED
@@ -1,4 +1,5 @@
  .yardoc/
  doc/
+ pkg/
  .DS_Store
  Gemfile.lock
data/HISTORY.md CHANGED
@@ -1,6 +1,34 @@
1
1
  Crass Change History
2
2
  ====================
3
3
 
4
+ 0.1.0 (2013-10-04)
5
+ ------------------
6
+
7
+ * Tokenization is a little over 50% faster.
8
+
9
+ * Added tons of unit tests.
10
+
11
+ * Added `Crass.parse_properties` and `Crass::Parser.parse_properties`, which can
12
+ be used to parse the contents of an HTML element's `style` attribute.
13
+
14
+ * Added `Crass::Parser.parse_rules`, which can be used to parse the contents of
15
+ an `:at_rule` block like `@media` that may contain style rules.
16
+
17
+ * Fixed: `Crass::Parser#consume_at_rule` and `#consume_qualified_rule` didn't
18
+ properly handle already-parsed `:simple_block` nodes in the input, which
19
+ occurs when parsing rules in the value of an `:at_rule` block.
20
+
21
+ * Fixed: On `:property` nodes, `:important` is now set to `true` when the
22
+ property is followed by an "!important" declaration.
23
+
24
+ * Fixed: "!important" is no longer included in the value of a `:property` node.
25
+
26
+ * Fixed: A variety of tokenization bugs uncovered by tests.
27
+
28
+ * Fixed: Added a workaround for a possible spec bug when an `:at_keyword` is
29
+ encountered while consuming declarations.
30
+
31
+
4
32
  0.0.2 (2013-09-30)
5
33
  ------------------
6
34
 
@@ -11,4 +39,3 @@ Crass Change History
11
39
  ------------------
12
40
 
13
41
  * Initial release.
14
-
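Note on the two "!important" entries in the 0.1.0 changelog above: the behavior they describe can be illustrated with a minimal sketch. The helper `strip_important` is hypothetical and is not part of Crass's API. It assumes value tokens are plain hashes with `:node` and `:value` keys, and it handles only a single trailing "!important".

```ruby
# Hypothetical helper, not part of Crass: scans a declaration's value tokens
# from the end, looking for a trailing "!important".
# Returns [important_flag, value_without_important].
def strip_important(value)
  value = value.dup
  important = false
  pos = -1

  while (token = value[pos])
    type = token[:node]

    # Trailing whitespace, comments, and semicolons are skipped over.
    if [:whitespace, :comment, :semicolon].include?(type)
      pos -= 1
      next
    end

    if type == :ident && token[:value].downcase == 'important'
      prev = value[pos - 1]

      if prev && prev[:node] == :delim && prev[:value] == '!'
        important = true
        value.slice!(pos - 1, 2) # drop the "!" and "important" tokens
      end
    end

    break # stop at the first token that is not skippable
  end

  [important, value]
end

# Assumed token shapes for `red !IMPORTANT;`
tokens = [
  { node: :ident, value: 'red' },
  { node: :whitespace },
  { node: :delim, value: '!' },
  { node: :ident, value: 'IMPORTANT' },
  { node: :semicolon }
]

important, rest = strip_important(tokens)
```

Here `important` ends up `true` and `rest` no longer contains the `!` and `IMPORTANT` tokens, which mirrors the two fixes listed in the changelog.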
data/README.md CHANGED
@@ -1,20 +1,22 @@
1
1
  Crass
2
2
  =====
3
3
 
4
- Crass is a Ruby CSS parser based on the [CSS Syntax Module Level 3][css] draft.
4
+ Crass is a Ruby CSS parser based on the [CSS Syntax Level 3][css] draft
5
+ specification.
5
6
 
6
7
  * [Home](https://github.com/rgrove/crass/)
7
8
  * [API Docs](http://rubydoc.info/github/rgrove/crass/master)
8
9
 
9
10
  [![Build Status](https://travis-ci.org/rgrove/crass.png?branch=master)](https://travis-ci.org/rgrove/crass?branch=master)
11
+ [![Gem Version](https://badge.fury.io/rb/crass.png)](http://badge.fury.io/rb/crass)
10
12
 
11
13
  Features
12
14
  --------
13
15
 
14
16
  * Pure Ruby, with no runtime dependencies other than Ruby 1.9.x or higher.
15
17
 
16
- * Tokenizes and parses CSS according to the rules defined in the
17
- [CSS Syntax Module Level 3][css] draft.
18
+ * Tokenizes and parses CSS according to the rules defined in the 2013 draft of
19
+ the [CSS Syntax Level 3][css] specification.
18
20
 
19
21
  * Extremely tolerant of broken or invalid CSS. If a browser can handle it, Crass
20
22
  should be able to handle it too.
@@ -32,7 +34,9 @@ Features
32
34
  Problems
33
35
  --------
34
36
 
35
- * It's pretty slow.
37
+ * Crass isn't terribly fast. I mean, it's Ruby, and it's not really slow by Ruby
38
+ standards. But compared to the CSS parser in your average browser? Yeah, it's
39
+ slow.
36
40
 
37
41
  * Crass only parses the CSS syntax; it doesn't understand what any of it means,
38
42
  doesn't coalesce selectors, etc. You can do this yourself by consuming the
@@ -43,9 +47,10 @@ Problems
43
47
  (except for wholesale removal of nodes) are not reflected in the serialized
44
48
  output.
45
49
 
46
- * Unit tests aren't complete yet.
50
+ * At the moment, Crass only supports UTF-8 input and doesn't respect `@charset`
51
+ rules. Input in any other encoding will be converted to UTF-8.
47
52
 
48
- * Probably tons of other things. Did I mention it's very new and experimental?
53
+ * Probably other things. Did I mention Crass is pretty new?
49
54
 
50
55
  Installing
51
56
  ----------
@@ -54,9 +59,6 @@ Installing
54
59
  gem install crass
55
60
  ```
56
61
 
57
- ...but only if you're brave. Seriously, this thing will almost certainly kill
58
- your family and poop on your pets.
59
-
60
62
  Examples
61
63
  --------
62
64
 
@@ -95,6 +97,7 @@ This returns a big fat ugly parse tree, which looks like this:
95
97
  {:node=>:property,
96
98
  :name=>"color",
97
99
  :value=>"#0d8bfa",
100
+ :important=>false,
98
101
  :tokens=>
99
102
  [{:node=>:ident, :pos=>27, :raw=>"color", :value=>"color"},
100
103
  {:node=>:colon, :pos=>32, :raw=>":"},
@@ -109,6 +112,7 @@ This returns a big fat ugly parse tree, which looks like this:
109
112
  {:node=>:property,
110
113
  :name=>"text-decoration",
111
114
  :value=>"underline",
115
+ :important=>false,
112
116
  :tokens=>
113
117
  [{:node=>:ident,
114
118
  :pos=>45,
@@ -168,6 +172,23 @@ hate to have to turn down a pull request you spent a lot of time on.
168
172
 
169
173
  [issue]: https://github.com/rgrove/crass/issues/new
170
174
 
175
+ Acknowledgments
176
+ ---------------
177
+
178
+ I'm deeply, deeply grateful to [Simon Sapin][simon] for his wonderfully
179
+ comprehensive [CSS parsing tests][css-tests], which I adapted to create many of
180
+ Crass's tests. They've been invaluable in helping me fix bugs and handle weird
181
+ edge cases, and Crass would be much crappier without them.
182
+
183
+ I'm also grateful to [Tab Atkins Jr.][tab] and Simon Sapin (again!) for their
184
+ work on the [CSS Syntax Level 3][spec] specification, which defines the
185
+ tokenizing and parsing rules that Crass implements.
186
+
187
+ [css-tests]:https://github.com/SimonSapin/css-parsing-tests/
188
+ [simon]:http://exyr.org/about/
189
+ [spec]:http://www.w3.org/TR/css-syntax-3/
190
+ [tab]:http://www.xanthir.com/contact/
191
+
171
192
  License
172
193
  -------
173
194
 
data/crass.gemspec CHANGED
@@ -3,8 +3,8 @@ require './lib/crass/version'

  Gem::Specification.new do |s|
    s.name = 'crass'
-   s.summary = 'CSS parser based on the CSS Syntax Module Level 3 draft.'
-   s.description = 'Crass is a pure Ruby CSS parser based on the CSS Syntax Module Level 3 draft.'
+   s.summary = 'CSS parser based on the CSS Syntax Level 3 draft.'
+   s.description = 'Crass is a pure Ruby CSS parser based on the CSS Syntax Level 3 draft.'
    s.version = Crass::VERSION
    s.authors = ['Ryan Grove']
    s.email = ['ryan@wonko.com']
data/lib/crass.rb CHANGED
@@ -11,4 +11,12 @@ module Crass
      Parser.parse_stylesheet(input, options)
    end

+   # Parses _input_ as a string of CSS properties (such as the contents of an
+   # HTML element's `style` attribute) and returns a parse tree.
+   #
+   # See {Tokenizer#initialize} for _options_.
+   def self.parse_properties(input, options = {})
+     Parser.parse_properties(input, options)
+   end
+
  end
data/lib/crass/parser.rb CHANGED
@@ -16,6 +16,36 @@ module Crass
16
16
 
17
17
  # -- Class Methods ---------------------------------------------------------
18
18
 
19
+ # Parses CSS properties (such as the contents of an HTML element's `style`
20
+ # attribute) and returns a parse tree.
21
+ #
22
+ # See {Tokenizer#initialize} for _options_.
23
+ #
24
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-list-of-declarations
25
+ def self.parse_properties(input, options = {})
26
+ Parser.new(input, options).parse_properties
27
+ end
28
+
29
+ # Parses CSS rules (such as the contents of a `@media` block) and returns a
30
+ # parse tree. The only difference from {#parse_stylesheet} is that CDO/CDC
31
+ # nodes (`<!--` and `-->`) aren't ignored.
32
+ #
33
+ # See {Tokenizer#initialize} for _options_.
34
+ #
35
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-list-of-rules
36
+ def self.parse_rules(input, options = {})
37
+ parser = Parser.new(input, options)
38
+ rules = parser.consume_rules
39
+
40
+ rules.map do |rule|
41
+ if rule[:node] == :qualified_rule
42
+ parser.create_style_rule(rule)
43
+ else
44
+ rule
45
+ end
46
+ end
47
+ end
48
+
19
49
  # Parses a CSS stylesheet and returns a parse tree.
20
50
  #
21
51
  # See {Tokenizer#initialize} for _options_.
@@ -26,10 +56,10 @@ module Crass
26
56
  rules = parser.consume_rules(:top_level => true)
27
57
 
28
58
  rules.map do |rule|
29
- case rule[:node]
30
- # TODO: handle at-rules
31
- when :qualified_rule then parser.create_style_rule(rule)
32
- else rule
59
+ if rule[:node] == :qualified_rule
60
+ parser.create_style_rule(rule)
61
+ else
62
+ rule
33
63
  end
34
64
  end
35
65
  end
@@ -46,20 +76,32 @@ module Crass
46
76
  string = ''
47
77
 
48
78
  nodes.each do |node|
79
+ next if node.nil?
80
+
49
81
  case node[:node]
82
+ when :at_rule
83
+ string << node[:tokens].first[:raw]
84
+ string << self.stringify(node[:prelude], options)
85
+
86
+ if node[:block]
87
+ string << self.stringify(node[:block], options)
88
+ end
89
+
50
90
  when :comment
51
91
  string << node[:raw] unless options[:exclude_comments]
52
92
 
53
- when :style_rule
54
- string << self.stringify(node[:selector][:tokens], options)
55
- string << "{"
56
- string << self.stringify(node[:children], options)
57
- string << "}"
58
-
59
93
  when :property
60
- string << options[:indent] if options[:indent]
61
94
  string << self.stringify(node[:tokens], options)
62
95
 
96
+ when :simple_block
97
+ string << node[:start]
98
+ string << self.stringify(node[:value], options)
99
+ string << node[:end]
100
+
101
+ when :style_rule
102
+ string << self.stringify(node[:selector][:tokens], options)
103
+ string << "{#{self.stringify(node[:children], options)}}"
104
+
63
105
  else
64
106
  if node.key?(:raw)
65
107
  string << node[:raw]
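Note on the hunk above: the new `:at_rule` and `:simple_block` branches serialize nested nodes recursively. A simplified, self-contained sketch of that idea follows. The standalone `stringify` function here is a hypothetical stand-in, not Crass's `Parser.stringify`. It takes no options, skips the `:comment`, `:property`, and `:style_rule` branches, and assumes the node shapes shown in the diff.

```ruby
# Hypothetical, simplified serializer. Node shapes (:tokens, :prelude, :block,
# :start, :value, :end, :raw) are assumed to match those used in the hunk above.
def stringify(nodes)
  nodes = [nodes] unless nodes.is_a?(Array)

  nodes.compact.map do |node| # nil nodes are skipped
    case node[:node]
    when :at_rule
      # The at-keyword's raw text, then the prelude, then the block if present.
      out = node[:tokens].first[:raw] + stringify(node[:prelude])
      out += stringify(node[:block]) if node[:block]
      out
    when :simple_block
      node[:start] + stringify(node[:value]) + node[:end]
    else
      node[:raw].to_s
    end
  end.join
end

# Assumed parse tree for `@media screen{a}`
at_rule = {
  node: :at_rule,
  tokens: [{ node: :at_keyword, raw: '@media' }],
  prelude: [{ node: :whitespace, raw: ' ' }, { node: :ident, raw: 'screen' }],
  block: {
    node: :simple_block, start: '{', end: '}',
    value: [{ node: :ident, raw: 'a' }]
  }
}

serialized = stringify(at_rule)
```

With the assumed tree above, `serialized` reproduces the original text `@media screen{a}`.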
@@ -74,7 +116,7 @@ module Crass
74
116
 
75
117
  # -- Instance Methods ------------------------------------------------------
76
118
 
77
- # Array of tokens generated from this parser's input.
119
+ # {TokenScanner} wrapping the tokens generated from this parser's input.
78
120
  attr_reader :tokens
79
121
 
80
122
  # Initializes a parser based on the given _input_, which may be a CSS string
@@ -96,7 +138,7 @@ module Crass
96
138
  rule = {}
97
139
 
98
140
  rule[:tokens] = input.collect do
99
- rule[:name] = parse_value(input.consume)
141
+ rule[:name] = input.consume[:value]
100
142
  rule[:prelude] = []
101
143
 
102
144
  while token = input.consume
@@ -108,12 +150,14 @@ module Crass
108
150
  rule[:block] = consume_simple_block(input)
109
151
  break
110
152
 
111
- # TODO: At this point, the spec says we should check for a "simple
112
- # block with an associated token of <<{-token>>", but isn't that
113
- # exactly what we just did above? And the tokenizer only ever produces
114
- # standalone <<{-token>>s, so how could the token stream ever contain
115
- # one that's already associated with a simple block? What am I
116
- # missing?
153
+ when :simple_block
154
+ if token[:start] == '{'
155
+ rule[:block] = token
156
+ break
157
+ else
158
+ input.reconsume
159
+ rule[:prelude] << consume_component_value(input)
160
+ end
117
161
 
118
162
  else
119
163
  input.reconsume
@@ -140,34 +184,48 @@ module Crass
140
184
 
141
185
  # Consumes a declaration and returns it, or `nil` on parse error.
142
186
  #
143
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-declaration0
187
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-declaration
144
188
  def consume_declaration(input = @tokens)
145
189
  declaration = {}
190
+ value = []
146
191
 
147
192
  declaration[:tokens] = input.collect do
148
193
  declaration[:name] = input.consume[:value]
149
194
 
150
- value = []
151
195
  token = input.consume
152
- token = input.consume while token[:node] == :whitespace
153
-
154
- return nil if token[:node] != :colon # TODO: parse error
196
+ token = input.consume while token && token[:node] == :whitespace
155
197
 
198
+ return nil if !token || token[:node] != :colon # TODO: parse error
156
199
  value << token while token = input.consume
157
- declaration[:value] = value
200
+ end
158
201
 
159
- maybe_important = value.reject {|v| v[:node] == :whitespace }[-2, 2]
202
+ # Look for !important.
203
+ pos = -1
204
+ while token = value[pos]
205
+ type = token[:node]
206
+
207
+ if type == :whitespace || type == :comment || type == :semicolon
208
+ pos -= 1
209
+ next
210
+ end
160
211
 
161
- if maybe_important &&
162
- maybe_important[0][:node] == :delim &&
163
- maybe_important[0][:value] == '!' &&
164
- maybe_important[1][:node] == :ident &&
165
- maybe_important[1][:value].downcase == 'important'
212
+ if type == :ident && token[:value].downcase == 'important'
213
+ prev_token = value[pos - 1]
166
214
 
167
- declaration[:important] = true
215
+ if prev_token && prev_token[:node] == :delim &&
216
+ prev_token[:value] == '!'
217
+
218
+ declaration[:important] = true
219
+ value.slice!(pos - 1, 2)
220
+ else
221
+ break
222
+ end
223
+ else
224
+ break
168
225
  end
169
226
  end
170
227
 
228
+ declaration[:value] = value
171
229
  create_node(:declaration, declaration)
172
230
  end
173
231
 
@@ -176,7 +234,7 @@ module Crass
176
234
  # NOTE: The returned list may include `:comment`, `:semicolon`, and
177
235
  # `:whitespace` nodes, which is non-standard.
178
236
  #
179
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations0
237
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations
180
238
  def consume_declarations(input = @tokens)
181
239
  declarations = []
182
240
 
@@ -189,8 +247,13 @@ module Crass
189
247
  # TODO: this is technically a parse error when parsing a style rule,
190
248
  # but not necessarily at other times.
191
249
 
192
- # TODO: It seems like we should reconsume the current token here,
193
- # since that's what happens when consuming a list of rules.
250
+ # Note: The spec doesn't say we should reconsume here, but it's
251
+ # necessary since `consume_at_rule` must consume the `:at_keyword` as
252
+ # the rule's name or it'll end up in the prelude. The spec *does* say
253
+ # we should reconsume when an `:at_keyword` is encountered in
254
+ # `consume_rules`, so we either have to reconsume in both places or in
255
+ # neither place. I've chosen to reconsume in both places.
256
+ input.reconsume
194
257
  declarations << consume_at_rule(input)
195
258
 
196
259
  when :ident
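Note on the `input.reconsume` change above: the comment says `consume_at_rule` has to see the `:at_keyword` itself, otherwise the rule's name is lost. A small sketch of why follows. `MiniScanner` and this cut-down `consume_at_rule` are hypothetical illustrations and are not Crass's `TokenScanner` or parser methods.

```ruby
# Hypothetical stand-in for a token scanner; not Crass's TokenScanner.
class MiniScanner
  def initialize(tokens)
    @tokens = tokens
    @pos = 0
  end

  # Returns the next token and advances, or nil at the end of input.
  def consume
    token = @tokens[@pos]
    @pos += 1 if token
    token
  end

  # Steps back one position so the last consumed token is returned again.
  def reconsume
    @pos -= 1 if @pos > 0
  end
end

# Hypothetical, cut-down consumer: it expects the at-keyword to be the first
# token it reads and collects everything up to a semicolon as the prelude.
def consume_at_rule(scanner)
  rule = { name: scanner.consume[:value], prelude: [] }

  while (token = scanner.consume)
    break if token[:node] == :semicolon
    rule[:prelude] << token
  end

  rule
end

# Assumed token shapes for `@foo bar;`
tokens = [
  { node: :at_keyword, value: 'foo' },
  { node: :ident, value: 'bar' },
  { node: :semicolon }
]

scanner = MiniScanner.new(tokens)
scanner.consume   # the dispatching loop has already read the at-keyword
scanner.reconsume # hand it back before delegating
rule = consume_at_rule(scanner)
```

Without the `reconsume` call, this sketch would take `bar` as the rule name and lose the at-keyword entirely.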
@@ -247,7 +310,7 @@ module Crass
247
310
  # Consumes a qualified rule and returns it, or `nil` if a parse error
248
311
  # occurs.
249
312
  #
250
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-qualified-rule0
313
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-qualified-rule
251
314
  def consume_qualified_rule(input = @tokens)
252
315
  rule = {:prelude => []}
253
316
 
@@ -258,15 +321,9 @@ module Crass
258
321
  if token[:node] == :'{'
259
322
  rule[:block] = consume_simple_block(input)
260
323
  break
261
-
262
- # elsif [simple block with an associated <<{-token>>??]
263
-
264
- # TODO: At this point, the spec says we should check for a "simple block
265
- # with an associated token of <<{-token>>", but isn't that exactly what
266
- # we just did above? And the tokenizer only ever produces standalone
267
- # <<{-token>>s, so how could the token stream ever contain one that's
268
- # already associated with a simple block? What am I missing?
269
-
324
+ elsif token[:node] == :simple_block
325
+ rule[:block] = token
326
+ break
270
327
  else
271
328
  input.reconsume
272
329
  rule[:prelude] << consume_component_value(input)
@@ -279,7 +336,7 @@ module Crass
279
336
 
280
337
  # Consumes a list of rules and returns them.
281
338
  #
282
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-rules0
339
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-rules
283
340
  def consume_rules(flags = {})
284
341
  rules = []
285
342
 
@@ -358,33 +415,41 @@ module Crass
358
415
  # * http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#style-rules
359
416
  # * http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations0
360
417
  def create_style_rule(rule)
361
- children = []
362
- tokens = TokenScanner.new(rule[:block][:value])
418
+ create_node(:style_rule,
419
+ :selector => create_selector(rule[:prelude]),
420
+ :children => parse_properties(rule[:block][:value]))
421
+ end
363
422
 
364
- consume_declarations(tokens).each do |decl|
423
+ # Parses a list of declarations and returns an array of `:property` nodes
424
+ # (and any non-declaration nodes that were in the input). This is useful for
425
+ # parsing the contents of an HTML element's `style` attribute.
426
+ #
427
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-list-of-declarations
428
+ def parse_properties(input = @tokens)
429
+ input = TokenScanner.new(input) unless input.is_a?(TokenScanner)
430
+ properties = []
431
+
432
+ consume_declarations(input).each do |decl|
365
433
  unless decl[:node] == :declaration
366
- children << decl
434
+ properties << decl
367
435
  next
368
436
  end
369
437
 
370
- children << create_node(:property,
371
- :name => decl[:name],
372
- :value => parse_value(decl[:value]),
373
- :tokens => decl[:tokens])
438
+ properties << create_node(:property,
439
+ :name => decl[:name],
440
+ :value => parse_value(decl[:value]),
441
+ :important => decl[:important] == true,
442
+ :tokens => decl[:tokens])
374
443
  end
375
444
 
376
- create_node(:style_rule,
377
- :selector => create_selector(rule[:prelude]),
378
- :children => children
379
- )
445
+ properties
380
446
  end
381
447
 
382
448
  # Returns the unescaped value of a selector name or property declaration.
383
449
  def parse_value(nodes)
450
+ nodes = [nodes] unless nodes.is_a?(Array)
384
451
  string = ''
385
452
 
386
- nodes = [nodes] unless nodes.is_a?(Array)
387
-
388
453
  nodes.each do |node|
389
454
  case node[:node]
390
455
  when :comment, :semicolon then next