crass 0.0.2 → 0.1.0

checksums.yaml CHANGED
@@ -1,15 +1,15 @@
  ---
  !binary "U0hBMQ==":
  metadata.gz: !binary |-
- Y2NmZWM1MTg3MjVmNjcxN2ZjN2JhZmE4YzU0NjUxYWFjMDI0MjQwMA==
+ OGNiZTc1MDI4ZWMwYmQxZGI4ODE4Yzc1YzNkNTU2MDJjODc1NTI0Mg==
  data.tar.gz: !binary |-
- NmZmMGJmM2JkYmRiY2JhYjI2N2E1NzVhNDFjNzc3ZjJiZDViNzg3ZQ==
+ NTdlYmJkNTZlYjI3ZjE3Njc2ZDRiMzkxYWZjOTQ0OWE3ZTcxYWRiOQ==
  SHA512:
  metadata.gz: !binary |-
- ZjRkYzI5Zjk1NGZlOTEzOGI2YTRmMThhMTdiYzNjNDYyNGU4NzhiMjExYTY4
- NDQ4ZDJlYjlhYTBiNTE0OTcwNTk4ZTgxM2VjNGNkMmVkNjgwYzcwNmZiZGRh
- MDM5OWE4YzM3NWIwZjhkN2VhODVlZmJkMDYzMGJjMjgyM2NkNjY=
+ ZjA0NTZkOTAwYTYwZmNjMDgyMzE1N2NkZDQ5ZTlkNDg0NzZmYTk4OGNiMjli
+ MDdlNzQwNTAwZWZmMjkzY2U3N2NkMjEyOWU1ZDMyZWZlYjU1NGRjNzg1Y2Ez
+ ZWZiYjg5ZTk2YmFhNWQ4NDk1ZDEzNmUwYjg2NDE0NjkyNzFmMjY=
  data.tar.gz: !binary |-
- ZmMyOWZiNWNkYmQ5OGE3MGFkOWEyMWJiY2JmODYwM2Y0MTRhNzhhZWM1NGE0
- NTU2NTI1MGYyNzVlZDE3ZTZmYWQ4NjkyMmY2NWU4NDA4NGNlNDAyOWIxYWY4
- NmEzNmVkYjYyNTNhMjkwODFkMDU2MmMzOGJkNGZlM2RlYmMwY2E=
+ NTE5NzI2MDViOWU5YzE4NzFmOWNlZmU0MWM5NTJhNGIyZTMyMTBkOGVjNWQ5
+ MmQ5NDZhYWY5YjIzMTgyNzNjNzZmY2I1NDE3YzFmMzM2Njk1NGQzMzQ3OTkw
+ MjFmODllYmM5NDQ5MTk3YjAwN2I3M2NmNzU4NDZkYzQwYzE1NWE=
data/.gitignore CHANGED
@@ -1,4 +1,5 @@
  .yardoc/
  doc/
+ pkg/
  .DS_Store
  Gemfile.lock
data/HISTORY.md CHANGED
@@ -1,6 +1,34 @@
  Crass Change History
  ====================
 
+ 0.1.0 (2013-10-04)
+ ------------------
+
+ * Tokenization is a little over 50% faster.
+
+ * Added tons of unit tests.
+
+ * Added `Crass.parse_properties` and `Crass::Parser.parse_properties`, which can
+ be used to parse the contents of an HTML element's `style` attribute.
+
+ * Added `Crass::Parser.parse_rules`, which can be used to parse the contents of
+ an `:at_rule` block like `@media` that may contain style rules.
+
+ * Fixed: `Crass::Parser#consume_at_rule` and `#consume_qualified_rule` didn't
+ properly handle already-parsed `:simple_block` nodes in the input, which
+ occurs when parsing rules in the value of an `:at_rule` block.
+
+ * Fixed: On `:property` nodes, `:important` is now set to `true` when the
+ property is followed by an "!important" declaration.
+
+ * Fixed: "!important" is no longer included in the value of a `:property` node.
+
+ * Fixed: A variety of tokenization bugs uncovered by tests.
+
+ * Fixed: Added a workaround for a possible spec bug when an `:at_keyword` is
+ encountered while consuming declarations.
+
+
  0.0.2 (2013-09-30)
  ------------------
 
@@ -11,4 +39,3 @@ Crass Change History
  ------------------
 
  * Initial release.
-
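The two `!important` fixes in the 0.1.0 changelog can be illustrated in isolation. The snippet below is a simplified, standalone approximation of the backward scan this release adds to `consume_declaration` in `lib/crass/parser.rb`. `strip_important` is a hypothetical helper name, not part of Crass, and the tokens are plain hashes shaped like the `:ident` and `:delim` nodes shown in the README. Unlike the gem's loop, this sketch stops after removing one `!important` pair.

```ruby
# Hypothetical helper (not part of Crass): scans a declaration's value tokens
# from the end, skips trailing whitespace/comment/semicolon tokens, and strips
# a trailing "!" + "important" pair if one is present.
#
# Returns [remaining_value_tokens, important_flag].
def strip_important(value)
  value     = value.dup # leave the caller's array untouched
  important = false
  pos       = -1

  while (token = value[pos])
    type = token[:node]

    # Trailing trivia never affects importance; keep walking backward.
    if type == :whitespace || type == :comment || type == :semicolon
      pos -= 1
      next
    end

    if type == :ident && token[:value].downcase == 'important'
      prev_token = value[pos - 1]

      if prev_token && prev_token[:node] == :delim && prev_token[:value] == '!'
        important = true
        value.slice!(pos - 1, 2) # drop the "!" and "important" tokens
      end
    end

    break # simplification: inspect only the last significant token
  end

  [value, important]
end

# Tokens roughly corresponding to the value in `color: red !IMPORTANT;`
tokens = [
  {:node => :whitespace, :raw => ' '},
  {:node => :ident,      :value => 'red'},
  {:node => :whitespace, :raw => ' '},
  {:node => :delim,      :value => '!'},
  {:node => :ident,      :value => 'IMPORTANT'},
  {:node => :semicolon,  :raw => ';'}
]

value, important = strip_important(tokens)
puts important
puts value.map { |t| t[:node] }.inspect
```

Because the flag is tracked separately and the pair is sliced out of the value, the serialized value of a `:property` node no longer carries the "!important" text, which matches the behavior the changelog describes.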
data/README.md CHANGED
@@ -1,20 +1,22 @@
  Crass
  =====
 
- Crass is a Ruby CSS parser based on the [CSS Syntax Module Level 3][css] draft.
+ Crass is a Ruby CSS parser based on the [CSS Syntax Level 3][css] draft
+ specification.
 
  * [Home](https://github.com/rgrove/crass/)
  * [API Docs](http://rubydoc.info/github/rgrove/crass/master)
 
  [![Build Status](https://travis-ci.org/rgrove/crass.png?branch=master)](https://travis-ci.org/rgrove/crass?branch=master)
+ [![Gem Version](https://badge.fury.io/rb/crass.png)](http://badge.fury.io/rb/crass)
 
  Features
  --------
 
  * Pure Ruby, with no runtime dependencies other than Ruby 1.9.x or higher.
 
- * Tokenizes and parses CSS according to the rules defined in the
- [CSS Syntax Module Level 3][css] draft.
+ * Tokenizes and parses CSS according to the rules defined in the 2013 draft of
+ the [CSS Syntax Level 3][css] specification.
 
  * Extremely tolerant of broken or invalid CSS. If a browser can handle it, Crass
  should be able to handle it too.
@@ -32,7 +34,9 @@ Features
  Problems
  --------
 
- * It's pretty slow.
+ * Crass isn't terribly fast. I mean, it's Ruby, and it's not really slow by Ruby
+ standards. But compared to the CSS parser in your average browser? Yeah, it's
+ slow.
 
  * Crass only parses the CSS syntax; it doesn't understand what any of it means,
  doesn't coalesce selectors, etc. You can do this yourself by consuming the
@@ -43,9 +47,10 @@ Problems
  (except for wholesale removal of nodes) are not reflected in the serialized
  output.
 
- * Unit tests aren't complete yet.
+ * At the moment, Crass only supports UTF-8 input and doesn't respect `@charset`
+ rules. Input in any other encoding will be converted to UTF-8.
 
- * Probably tons of other things. Did I mention it's very new and experimental?
+ * Probably other things. Did I mention Crass is pretty new?
 
  Installing
  ----------
@@ -54,9 +59,6 @@ Installing
  gem install crass
  ```
 
- ...but only if you're brave. Seriously, this thing will almost certainly kill
- your family and poop on your pets.
-
  Examples
  --------
 
@@ -95,6 +97,7 @@ This returns a big fat ugly parse tree, which looks like this:
  {:node=>:property,
  :name=>"color",
  :value=>"#0d8bfa",
+ :important=>false,
  :tokens=>
  [{:node=>:ident, :pos=>27, :raw=>"color", :value=>"color"},
  {:node=>:colon, :pos=>32, :raw=>":"},
@@ -109,6 +112,7 @@ This returns a big fat ugly parse tree, which looks like this:
  {:node=>:property,
  :name=>"text-decoration",
  :value=>"underline",
+ :important=>false,
  :tokens=>
  [{:node=>:ident,
  :pos=>45,
@@ -168,6 +172,23 @@ hate to have to turn down a pull request you spent a lot of time on.
 
  [issue]: https://github.com/rgrove/crass/issues/new
 
+ Acknowledgments
+ ---------------
+
+ I'm deeply, deeply grateful to [Simon Sapin][simon] for his wonderfully
+ comprehensive [CSS parsing tests][css-tests], which I adapted to create many of
+ Crass's tests. They've been invaluable in helping me fix bugs and handle weird
+ edge cases, and Crass would be much crappier without them.
+
+ I'm also grateful to [Tab Atkins Jr.][tab] and Simon Sapin (again!) for their
+ work on the [CSS Syntax Level 3][spec] specification, which defines the
+ tokenizing and parsing rules that Crass implements.
+
+ [css-tests]:https://github.com/SimonSapin/css-parsing-tests/
+ [simon]:http://exyr.org/about/
+ [spec]:http://www.w3.org/TR/css-syntax-3/
+ [tab]:http://www.xanthir.com/contact/
+
  License
  -------
 
data/crass.gemspec CHANGED
@@ -3,8 +3,8 @@ require './lib/crass/version'
 
  Gem::Specification.new do |s|
  s.name = 'crass'
- s.summary = 'CSS parser based on the CSS Syntax Module Level 3 draft.'
- s.description = 'Crass is a pure Ruby CSS parser based on the CSS Syntax Module Level 3 draft.'
+ s.summary = 'CSS parser based on the CSS Syntax Level 3 draft.'
+ s.description = 'Crass is a pure Ruby CSS parser based on the CSS Syntax Level 3 draft.'
  s.version = Crass::VERSION
  s.authors = ['Ryan Grove']
  s.email = ['ryan@wonko.com']
data/lib/crass.rb CHANGED
@@ -11,4 +11,12 @@ module Crass
  Parser.parse_stylesheet(input, options)
  end
 
+ # Parses _input_ as a string of CSS properties (such as the contents of an
+ # HTML element's `style` attribute) and returns a parse tree.
+ #
+ # See {Tokenizer#initialize} for _options_.
+ def self.parse_properties(input, options = {})
+ Parser.parse_properties(input, options)
+ end
+
  end
data/lib/crass/parser.rb CHANGED
@@ -16,6 +16,36 @@ module Crass
 
  # -- Class Methods ---------------------------------------------------------
 
+ # Parses CSS properties (such as the contents of an HTML element's `style`
+ # attribute) and returns a parse tree.
+ #
+ # See {Tokenizer#initialize} for _options_.
+ #
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-list-of-declarations
+ def self.parse_properties(input, options = {})
+ Parser.new(input, options).parse_properties
+ end
+
+ # Parses a CSS rules (such as the content of a `@media` block) and returns a
+ # parse tree. The only difference from {#parse_stylesheet} is that CDO/CDC
+ # nodes (`<!--` and `-->`) aren't ignored.
+ #
+ # See {Tokenizer#initialize} for _options_.
+ #
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-list-of-rules
+ def self.parse_rules(input, options = {})
+ parser = Parser.new(input, options)
+ rules = parser.consume_rules
+
+ rules.map do |rule|
+ if rule[:node] == :qualified_rule
+ parser.create_style_rule(rule)
+ else
+ rule
+ end
+ end
+ end
+
  # Parses a CSS stylesheet and returns a parse tree.
  #
  # See {Tokenizer#initialize} for _options_.
@@ -26,10 +56,10 @@ module Crass
  rules = parser.consume_rules(:top_level => true)
 
  rules.map do |rule|
- case rule[:node]
- # TODO: handle at-rules
- when :qualified_rule then parser.create_style_rule(rule)
- else rule
+ if rule[:node] == :qualified_rule
+ parser.create_style_rule(rule)
+ else
+ rule
  end
  end
  end
@@ -46,20 +76,32 @@ module Crass
  string = ''
 
  nodes.each do |node|
+ next if node.nil?
+
  case node[:node]
+ when :at_rule
+ string << node[:tokens].first[:raw]
+ string << self.stringify(node[:prelude], options)
+
+ if node[:block]
+ string << self.stringify(node[:block], options)
+ end
+
  when :comment
  string << node[:raw] unless options[:exclude_comments]
 
- when :style_rule
- string << self.stringify(node[:selector][:tokens], options)
- string << "{"
- string << self.stringify(node[:children], options)
- string << "}"
-
  when :property
- string << options[:indent] if options[:indent]
  string << self.stringify(node[:tokens], options)
 
+ when :simple_block
+ string << node[:start]
+ string << self.stringify(node[:value], options)
+ string << node[:end]
+
+ when :style_rule
+ string << self.stringify(node[:selector][:tokens], options)
+ string << "{#{self.stringify(node[:children], options)}}"
+
  else
  if node.key?(:raw)
  string << node[:raw]
@@ -74,7 +116,7 @@ module Crass
 
  # -- Instance Methods ------------------------------------------------------
 
- # Array of tokens generated from this parser's input.
+ # {TokenScanner} wrapping the tokens generated from this parser's input.
  attr_reader :tokens
 
  # Initializes a parser based on the given _input_, which may be a CSS string
@@ -96,7 +138,7 @@ module Crass
  rule = {}
 
  rule[:tokens] = input.collect do
- rule[:name] = parse_value(input.consume)
+ rule[:name] = input.consume[:value]
  rule[:prelude] = []
 
  while token = input.consume
@@ -108,12 +150,14 @@ module Crass
  rule[:block] = consume_simple_block(input)
  break
 
- # TODO: At this point, the spec says we should check for a "simple
- # block with an associated token of <<{-token>>", but isn't that
- # exactly what we just did above? And the tokenizer only ever produces
- # standalone <<{-token>>s, so how could the token stream ever contain
- # one that's already associated with a simple block? What am I
- # missing?
+ when :simple_block
+ if token[:start] == '{'
+ rule[:block] = token
+ break
+ else
+ input.reconsume
+ rule[:prelude] << consume_component_value(input)
+ end
 
  else
  input.reconsume
@@ -140,34 +184,48 @@ module Crass
 
  # Consumes a declaration and returns it, or `nil` on parse error.
  #
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-declaration0
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-declaration
  def consume_declaration(input = @tokens)
  declaration = {}
+ value = []
 
  declaration[:tokens] = input.collect do
  declaration[:name] = input.consume[:value]
 
- value = []
  token = input.consume
- token = input.consume while token[:node] == :whitespace
-
- return nil if token[:node] != :colon # TODO: parse error
+ token = input.consume while token && token[:node] == :whitespace
 
+ return nil if !token || token[:node] != :colon # TODO: parse error
  value << token while token = input.consume
- declaration[:value] = value
+ end
 
- maybe_important = value.reject {|v| v[:node] == :whitespace }[-2, 2]
+ # Look for !important.
+ pos = -1
+ while token = value[pos]
+ type = token[:node]
+
+ if type == :whitespace || type == :comment || type == :semicolon
+ pos -= 1
+ next
+ end
 
- if maybe_important &&
- maybe_important[0][:node] == :delim &&
- maybe_important[0][:value] == '!' &&
- maybe_important[1][:node] == :ident &&
- maybe_important[1][:value].downcase == 'important'
+ if type == :ident && token[:value].downcase == 'important'
+ prev_token = value[pos - 1]
 
- declaration[:important] = true
+ if prev_token && prev_token[:node] == :delim &&
+ prev_token[:value] == '!'
+
+ declaration[:important] = true
+ value.slice!(pos - 1, 2)
+ else
+ break
+ end
+ else
+ break
  end
  end
 
+ declaration[:value] = value
  create_node(:declaration, declaration)
  end
 
@@ -176,7 +234,7 @@ module Crass
  # NOTE: The returned list may include `:comment`, `:semicolon`, and
  # `:whitespace` nodes, which is non-standard.
  #
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations0
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations
  def consume_declarations(input = @tokens)
  declarations = []
 
 
@@ -189,8 +247,13 @@ module Crass
189
247
  # TODO: this is technically a parse error when parsing a style rule,
190
248
  # but not necessarily at other times.
191
249
 
192
- # TODO: It seems like we should reconsume the current token here,
193
- # since that's what happens when consuming a list of rules.
250
+ # Note: The spec doesn't say we should reconsume here, but it's
251
+ # necessary since `consume_at_rule` must consume the `:at_keyword` as
252
+ # the rule's name or it'll end up in the prelude. The spec *does* say
253
+ # we should reconsume when an `:at_keyword` is encountered in
254
+ # `consume_rules`, so we either have to reconsume in both places or in
255
+ # neither place. I've chosen to reconsume in both places.
256
+ input.reconsume
194
257
  declarations << consume_at_rule(input)
195
258
 
196
259
  when :ident
@@ -247,7 +310,7 @@ module Crass
  # Consumes a qualified rule and returns it, or `nil` if a parse error
  # occurs.
  #
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-qualified-rule0
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-qualified-rule
  def consume_qualified_rule(input = @tokens)
  rule = {:prelude => []}
 
 
@@ -258,15 +321,9 @@ module Crass
258
321
  if token[:node] == :'{'
259
322
  rule[:block] = consume_simple_block(input)
260
323
  break
261
-
262
- # elsif [simple block with an associated <<{-token>>??]
263
-
264
- # TODO: At this point, the spec says we should check for a "simple block
265
- # with an associated token of <<{-token>>", but isn't that exactly what
266
- # we just did above? And the tokenizer only ever produces standalone
267
- # <<{-token>>s, so how could the token stream ever contain one that's
268
- # already associated with a simple block? What am I missing?
269
-
324
+ elsif token[:node] == :simple_block
325
+ rule[:block] = token
326
+ break
270
327
  else
271
328
  input.reconsume
272
329
  rule[:prelude] << consume_component_value(input)
@@ -279,7 +336,7 @@ module Crass
 
  # Consumes a list of rules and returns them.
  #
- # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-rules0
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-rules
  def consume_rules(flags = {})
  rules = []
 
@@ -358,33 +415,41 @@ module Crass
  # * http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#style-rules
  # * http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations0
  def create_style_rule(rule)
- children = []
- tokens = TokenScanner.new(rule[:block][:value])
+ create_node(:style_rule,
+ :selector => create_selector(rule[:prelude]),
+ :children => parse_properties(rule[:block][:value]))
+ end
 
- consume_declarations(tokens).each do |decl|
+ # Parses a list of declarations and returns an array of `:property` nodes
+ # (and any non-declaration nodes that were in the input). This is useful for
+ # parsing the contents of an HTML element's `style` attribute.
+ #
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-list-of-declarations
+ def parse_properties(input = @tokens)
+ input = TokenScanner.new(input) unless input.is_a?(TokenScanner)
+ properties = []
+
+ consume_declarations(input).each do |decl|
  unless decl[:node] == :declaration
- children << decl
+ properties << decl
  next
  end
 
- children << create_node(:property,
- :name => decl[:name],
- :value => parse_value(decl[:value]),
- :tokens => decl[:tokens])
+ properties << create_node(:property,
+ :name => decl[:name],
+ :value => parse_value(decl[:value]),
+ :important => decl[:important] == true,
+ :tokens => decl[:tokens])
  end
 
- create_node(:style_rule,
- :selector => create_selector(rule[:prelude]),
- :children => children
- )
+ properties
  end
 
  # Returns the unescaped value of a selector name or property declaration.
  def parse_value(nodes)
+ nodes = [nodes] unless nodes.is_a?(Array)
  string = ''
 
- nodes = [nodes] unless nodes.is_a?(Array)
-
  nodes.each do |node|
  case node[:node]
  when :comment, :semicolon then next