crass 0.0.1

data/HISTORY.md ADDED
@@ -0,0 +1,4 @@
1
+ Crass Change History
2
+ ====================
3
+
4
+ ?????
data/LICENSE ADDED
@@ -0,0 +1,18 @@
1
+ Copyright (c) 2013 Ryan Grove (ryan@wonko.com)
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy of
4
+ this software and associated documentation files (the ‘Software’), to deal in
5
+ the Software without restriction, including without limitation the rights to
6
+ use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
7
+ the Software, and to permit persons to whom the Software is furnished to do so,
8
+ subject to the following conditions:
9
+
10
+ The above copyright notice and this permission notice shall be included in all
11
+ copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
15
+ FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
16
+ COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
17
+ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
18
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,157 @@
1
+ Crass
2
+ =====
3
+
4
+ Crass is a Ruby CSS parser based on the [CSS Syntax Module Level 3][css] draft.
5
+
6
+ * [Home](https://github.com/rgrove/crass/)
7
+ * [API Docs](http://rubydoc.info/github/rgrove/crass/master)
8
+
9
+ Features
10
+ --------
11
+
12
+ * Pure Ruby, with no runtime dependencies other than Ruby 1.9.x or higher.
13
+
14
+ * Tokenizes and parses CSS according to the rules defined in the
15
+ [CSS Syntax Module Level 3][css] draft.
16
+
17
+ * Extremely tolerant of broken or invalid CSS. If a browser can handle it, Crass
18
+ should be able to handle it too.
19
+
20
+ * Optionally includes comments in the token stream.
21
+
22
+ * Optionally preserves certain CSS hacks, such as the IE "*" hack, which would
23
+ otherwise be discarded according to CSS3 tokenizing rules.
24
+
25
+ * Capable of serializing the parse tree back to CSS while maintaining all
26
+ original whitespace, comments, and indentation.
27
+
28
+ [css]: http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/
29
+
30
+ Problems
31
+ --------
32
+
33
+ * It's pretty slow.
34
+
35
+ * Crass only parses the CSS syntax; it doesn't understand what any of it means,
36
+ doesn't coalesce selectors, etc. You can do this yourself by consuming the
37
+ parse tree, though.
38
+
39
+ * While any node in the parse tree (or the parse tree as a whole) can be
40
+ serialized back to CSS with perfect fidelity, changes made to those nodes
41
+ (except for wholesale removal of nodes) are not reflected in the serialized
42
+ output.
43
+
44
+ * It doesn't have any unit tests yet, because it's very, very new and I'm
45
+ still experimenting with its architecture.
46
+
47
+ * Probably plenty of other things. Did I mention it's very new?
48
+
49
+ Installing
50
+ ----------
51
+
52
+ Don't install it yet. It's not finished.
53
+
54
+ Examples
55
+ --------
56
+
57
+ Say you have a string containing the following simple CSS:
58
+
59
+ ```css
60
+ /* Comment! */
61
+ a:hover {
62
+   color: #0d8bfa;
63
+   text-decoration: underline;
64
+ }
65
+ ```
66
+
67
+ Parsing it is simple:
68
+
69
+ ```ruby
70
+ tree = Crass.parse(css, :preserve_comments => true)
71
+ ```
72
+
73
+ This returns a big fat ugly parse tree, which looks like this:
74
+
75
+ ```ruby
76
+ [{:node=>:comment, :pos=>0, :raw=>"/* Comment! */", :value=>" Comment! "},
77
+ {:node=>:whitespace, :pos=>14, :raw=>"\n"},
78
+ {:node=>:style_rule,
79
+ :selector=>
80
+ {:node=>:selector,
81
+ :value=>"a:hover",
82
+ :tokens=>
83
+ [{:node=>:ident, :pos=>15, :raw=>"a", :value=>"a"},
84
+ {:node=>:colon, :pos=>16, :raw=>":"},
85
+ {:node=>:ident, :pos=>17, :raw=>"hover", :value=>"hover"},
86
+ {:node=>:whitespace, :pos=>22, :raw=>" "}]},
87
+ :children=>
88
+ [{:node=>:whitespace, :pos=>24, :raw=>"\n  "},
89
+ {:node=>:property,
90
+ :name=>"color",
91
+ :value=>"#0d8bfa",
92
+ :tokens=>
93
+ [{:node=>:ident, :pos=>27, :raw=>"color", :value=>"color"},
94
+ {:node=>:colon, :pos=>32, :raw=>":"},
95
+ {:node=>:whitespace, :pos=>33, :raw=>" "},
96
+ {:node=>:hash,
97
+ :pos=>34,
98
+ :raw=>"#0d8bfa",
99
+ :type=>:unrestricted,
100
+ :value=>"0d8bfa"},
101
+ {:node=>:semicolon, :pos=>41, :raw=>";"}]},
102
+ {:node=>:whitespace, :pos=>42, :raw=>"\n  "},
103
+ {:node=>:property,
104
+ :name=>"text-decoration",
105
+ :value=>"underline",
106
+ :tokens=>
107
+ [{:node=>:ident,
108
+ :pos=>45,
109
+ :raw=>"text-decoration",
110
+ :value=>"text-decoration"},
111
+ {:node=>:colon, :pos=>60, :raw=>":"},
112
+ {:node=>:whitespace, :pos=>61, :raw=>" "},
113
+ {:node=>:ident, :pos=>62, :raw=>"underline", :value=>"underline"},
114
+ {:node=>:semicolon, :pos=>71, :raw=>";"}]},
115
+ {:node=>:whitespace, :pos=>72, :raw=>"\n"}]},
116
+ {:node=>:whitespace, :pos=>74, :raw=>"\n"}]
117
+ ```
118
+
119
+ If you want, you can stringify the parse tree:
120
+
121
+ ```ruby
122
+ css = Crass::Parser.stringify(tree)
123
+ ```
124
+
125
+ ...which gives you back exactly what you put in!
126
+
127
+ ```css
128
+ /* Comment! */
129
+ a:hover {
130
+   color: #0d8bfa;
131
+   text-decoration: underline;
132
+ }
133
+ ```
134
+
135
+ Wasn't that exciting?
136
+
137
+ License
138
+ -------
139
+
140
+ Copyright (c) 2013 Ryan Grove (ryan@wonko.com)
141
+
142
+ Permission is hereby granted, free of charge, to any person obtaining a copy of
143
+ this software and associated documentation files (the ‘Software’), to deal in
144
+ the Software without restriction, including without limitation the rights to
145
+ use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
146
+ the Software, and to permit persons to whom the Software is furnished to do so,
147
+ subject to the following conditions:
148
+
149
+ The above copyright notice and this permission notice shall be included in all
150
+ copies or substantial portions of the Software.
151
+
152
+ THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
153
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
154
+ FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
155
+ COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
156
+ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
157
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
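The README says Crass only parses syntax and leaves interpretation to whoever consumes the parse tree. As a rough illustration of what that consumption might look like, the sketch below walks a hand-written tree fragment shaped like the README's example output and collects each rule's properties by selector. The `collect_declarations` helper is hypothetical and not part of Crass, and the `tree` literal is typed in by hand rather than produced by running the gem.

```ruby
# Hand-written fragment in the shape of the README's parse tree. It is not
# produced by running Crass, and the :tokens arrays are left empty for brevity.
tree = [
  {:node => :comment, :pos => 0, :raw => "/* Comment! */", :value => " Comment! "},
  {:node => :whitespace, :pos => 14, :raw => "\n"},
  {:node => :style_rule,
   :selector => {:node => :selector, :value => "a:hover", :tokens => []},
   :children => [
     {:node => :whitespace, :pos => 24, :raw => "\n  "},
     {:node => :property, :name => "color", :value => "#0d8bfa", :tokens => []},
     {:node => :whitespace, :pos => 42, :raw => "\n  "},
     {:node => :property, :name => "text-decoration", :value => "underline", :tokens => []},
     {:node => :whitespace, :pos => 72, :raw => "\n"}
   ]}
]

# Hypothetical helper (not part of Crass): maps each selector string to a
# hash of its property names and values. Rules that repeat a selector are
# merged, with later declarations overwriting earlier ones.
def collect_declarations(tree)
  tree.each_with_object({}) do |node, rules|
    # Skip comments, whitespace, and anything else that is not a style rule.
    next unless node[:node] == :style_rule

    properties = (rules[node[:selector][:value]] ||= {})

    node[:children].each do |child|
      next unless child[:node] == :property
      properties[child[:name]] = child[:value]
    end
  end
end

declarations = collect_declarations(tree)
```

Here `declarations["a:hover"]["color"]` is the string `"#0d8bfa"`. Merging by the raw selector string is only one possible policy; splitting selector lists or applying cascade rules would need more work on top of this.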
data/lib/crass.rb ADDED
@@ -0,0 +1,13 @@
1
+ # encoding: utf-8
2
+ require_relative 'crass/parser'
3
+
4
+ # A CSS parser based on the CSS Syntax Module Level 3 draft.
5
+ module Crass
6
+
7
+ # Parses _input_ as a CSS stylesheet and returns a parse tree.
8
+ #
9
+ # See {Tokenizer#initialize} for _options_.
10
+ def self.parse(input, options = {})
11
+ Parser.parse_stylesheet(input, options)
12
+ end
13
+ end
@@ -0,0 +1,403 @@
1
+ # encoding: utf-8
2
+ require_relative 'token-scanner'
3
+ require_relative 'tokenizer'
4
+
5
+ module Crass
6
+
7
+ # Parses a CSS string or list of tokens.
8
+ #
9
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parsing
10
+ class Parser
11
+ BLOCK_END_TOKENS = {
12
+ :'{' => :'}',
13
+ :'[' => :']',
14
+ :'(' => :')'
15
+ }
16
+
17
+ # -- Class Methods ---------------------------------------------------------
18
+
19
+ # Parses a CSS stylesheet and returns a parse tree.
20
+ #
21
+ # See {Tokenizer#initialize} for _options_.
22
+ #
23
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-stylesheet
24
+ def self.parse_stylesheet(input, options = {})
25
+ parser = Parser.new(input, options)
26
+ rules = parser.consume_rules(:top_level => true)
27
+
28
+ rules.map do |rule|
29
+ case rule[:node]
30
+ # TODO: handle at-rules
31
+ when :qualified_rule then parser.parse_style_rule(rule)
32
+ else rule
33
+ end
34
+ end
35
+ end
36
+
37
+ # Converts a node or array of nodes into a CSS string based on their
38
+ # original tokenized input.
39
+ #
40
+ # Options:
41
+ #
42
+ # * **:exclude_comments** - When `true`, comments will be excluded.
43
+ #
44
+ def self.stringify(nodes, options = {})
45
+ nodes = [nodes] unless nodes.is_a?(Array)
46
+ string = ''
47
+
48
+ nodes.each do |node|
49
+ case node[:node]
50
+ when :comment
51
+ string << node[:raw] unless options[:exclude_comments]
52
+
53
+ when :style_rule
54
+ string << self.stringify(node[:selector][:tokens], options)
55
+ string << "{"
56
+ string << self.stringify(node[:children], options)
57
+ string << "}"
58
+
59
+ when :property
60
+ string << options[:indent] if options[:indent]
61
+ string << self.stringify(node[:tokens], options)
62
+
63
+ else
64
+ if node.key?(:raw)
65
+ string << node[:raw]
66
+ elsif node.key?(:tokens)
67
+ string << self.stringify(node[:tokens], options)
68
+ end
69
+ end
70
+ end
71
+
72
+ string
73
+ end
74
+
75
+ # -- Instance Methods ------------------------------------------------------
76
+
77
+ # Array of tokens generated from this parser's input.
78
+ attr_reader :tokens
79
+
80
+ # Initializes a parser based on the given _input_, which may be a CSS string
81
+ # or an array of tokens.
82
+ #
83
+ # See {Tokenizer#initialize} for _options_.
84
+ def initialize(input, options = {})
85
+ unless input.kind_of?(Enumerable)
86
+ input = Tokenizer.tokenize(input, options)
87
+ end
88
+
89
+ @tokens = TokenScanner.new(input)
90
+ end
91
+
92
+ # Consumes an at-rule and returns it.
93
+ #
94
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-an-at-rule0
95
+ def consume_at_rule(tokens = @tokens)
96
+ rule = {:prelude => []}
97
+
98
+ rule[:tokens] = tokens.collect do
99
+ while token = tokens.consume
100
+ case token[:node]
101
+ when :comment then next
102
+ when :semicolon, :eof then break
103
+
104
+ when :'{' then
105
+ rule[:block] = consume_simple_block(tokens)
106
+ break
107
+
108
+ # TODO: At this point, the spec says we should check for a "simple block
109
+ # with an associated token of <<{-token>>", but isn't that exactly what
110
+ # we just did above? And the tokenizer only ever produces standalone
111
+ # <<{-token>>s, so how could the token stream ever contain one that's
112
+ # already associated with a simple block? What am I missing?
113
+
114
+ else
115
+ tokens.reconsume
116
+ rule[:prelude] << consume_component_value(tokens)
117
+ end
118
+ end
119
+ end
120
+
121
+ create_node(:at_rule, rule)
122
+ end
123
+
124
+ # Consumes a component value and returns it.
125
+ #
126
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-component-value0
127
+ def consume_component_value(tokens = @tokens)
128
+ return nil unless token = tokens.consume
129
+
130
+ case token[:node]
131
+ when :'{', :'[', :'(' then consume_simple_block(tokens)
132
+ when :function then consume_function(tokens)
133
+ else token
134
+ end
135
+ end
136
+
137
+ # Consumes a declaration and returns it, or `nil` on parse error.
138
+ #
139
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-declaration0
140
+ def consume_declaration(tokens = @tokens)
141
+ declaration = {}
142
+
143
+ declaration[:tokens] = tokens.collect do
144
+ declaration[:name] = tokens.consume[:value]
145
+
146
+ value = []
147
+ token = tokens.consume
148
+ token = tokens.consume while token[:node] == :whitespace
149
+
150
+ return nil if token[:node] != :colon # TODO: parse error
151
+
152
+ value << token while token = tokens.consume
153
+ declaration[:value] = value
154
+
155
+ maybe_important = value.reject {|v| v[:node] == :whitespace }[-2, 2]
156
+
157
+ if maybe_important &&
158
+ maybe_important[0][:node] == :delim &&
159
+ maybe_important[0][:value] == '!' &&
160
+ maybe_important[1][:node] == :ident &&
161
+ maybe_important[1][:value].downcase == 'important'
162
+
163
+ declaration[:important] = true
164
+ end
165
+ end
166
+
167
+ create_node(:declaration, declaration)
168
+ end
169
+
170
+ # Consumes a list of declarations and returns them.
171
+ #
172
+ # NOTE: The returned list may include `:comment`, `:semicolon`, and
173
+ # `:whitespace` nodes, which is non-standard.
174
+ #
175
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations0
176
+ def consume_declarations(tokens = @tokens)
177
+ declarations = []
178
+
179
+ while token = tokens.consume
180
+ case token[:node]
181
+ when :comment, :semicolon, :whitespace
182
+ declarations << token
183
+
184
+ when :at_keyword
185
+ # TODO: this is technically a parse error when parsing a style rule,
186
+ # but not necessarily at other times.
187
+ declarations << consume_at_rule(tokens)
188
+
189
+ when :ident
190
+ decl_tokens = [token]
191
+ tokens.consume
192
+
193
+ while tokens.current
194
+ decl_tokens << tokens.current
195
+ break if tokens.current[:node] == :semicolon
196
+ tokens.consume
197
+ end
198
+
199
+ if decl = consume_declaration(TokenScanner.new(decl_tokens))
200
+ declarations << decl
201
+ end
202
+
203
+ else
204
+ # TODO: parse error (invalid property name, etc.)
205
+ while token && token[:node] != :semicolon
206
+ token = consume_component_value(tokens)
207
+ end
208
+ end
209
+ end
210
+
211
+ declarations
212
+ end
213
+
214
+ # Consumes a function and returns it.
215
+ #
216
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-function
217
+ def consume_function(tokens = @tokens)
218
+ function = {
219
+ :name => tokens.current[:value],
220
+ :value => [],
221
+ :tokens => [tokens.current]
222
+ }
223
+
224
+ function[:tokens].concat(tokens.collect do
225
+ while token = tokens.consume
226
+ case token[:node]
227
+ when :')', :eof then break
228
+ when :comment then next
229
+
230
+ else
231
+ tokens.reconsume
232
+ function[:value] << consume_component_value(tokens)
233
+ end
234
+ end
235
+ end)
236
+
237
+ create_node(:function, function)
238
+ end
239
+
240
+ # Consumes a qualified rule and returns it, or `nil` if a parse error
241
+ # occurs.
242
+ #
243
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-qualified-rule0
244
+ def consume_qualified_rule(tokens = @tokens)
245
+ rule = {:prelude => []}
246
+
247
+ rule[:tokens] = tokens.collect do
248
+ while true
249
+ return nil unless token = tokens.consume
250
+
251
+ if token[:node] == :'{'
252
+ rule[:block] = consume_simple_block(tokens)
253
+ break
254
+
255
+ # elsif [simple block with an associated <<{-token>>??]
256
+
257
+ # TODO: At this point, the spec says we should check for a "simple block
258
+ # with an associated token of <<{-token>>", but isn't that exactly what
259
+ # we just did above? And the tokenizer only ever produces standalone
260
+ # <<{-token>>s, so how could the token stream ever contain one that's
261
+ # already associated with a simple block? What am I missing?
262
+
263
+ else
264
+ tokens.reconsume
265
+ rule[:prelude] << consume_component_value(tokens)
266
+ end
267
+ end
268
+ end
269
+
270
+ create_node(:qualified_rule, rule)
271
+ end
272
+
273
+ # Consumes a list of rules and returns them.
274
+ #
275
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-rules0
276
+ def consume_rules(flags = {})
277
+ rules = []
278
+
279
+ while true
280
+ return rules unless token = @tokens.consume
281
+
282
+ case token[:node]
283
+ when :comment, :whitespace then rules << token
284
+ when :eof then return rules
285
+
286
+ when :cdc, :cdo
287
+ unless flags[:top_level]
288
+ @tokens.reconsume
289
+ rule = consume_qualified_rule
290
+ rules << rule if rule
291
+ end
292
+
293
+ when :at_keyword
294
+ @tokens.reconsume
295
+ rule = consume_at_rule
296
+ rules << rule if rule
297
+
298
+ else
299
+ @tokens.reconsume
300
+ rule = consume_qualified_rule
301
+ rules << rule if rule
302
+ end
303
+ end
304
+ end
305
+
306
+ # Consumes and returns a simple block associated with the current input
307
+ # token.
308
+ #
309
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-simple-block0
310
+ def consume_simple_block(tokens = @tokens)
311
+ start_token = tokens.current[:node]
312
+ end_token = BLOCK_END_TOKENS[start_token]
313
+
314
+ block = {
315
+ :start => start_token.to_s,
316
+ :end => end_token.to_s,
317
+ :value => [],
318
+ :tokens => [tokens.current]
319
+ }
320
+
321
+ block[:tokens].concat(tokens.collect do
322
+ while token = tokens.consume
323
+ break if token[:node] == end_token || token[:node] == :eof
324
+
325
+ tokens.reconsume
326
+ block[:value] << consume_component_value(tokens)
327
+ end
328
+ end)
329
+
330
+ create_node(:simple_block, block)
331
+ end
332
+
333
+ # Creates and returns a new parse node with the given _properties_.
334
+ def create_node(type, properties = {})
335
+ {:node => type}.merge!(properties)
336
+ end
337
+
338
+ # Parses the given _tokens_ into a selector node and returns it.
339
+ #
340
+ # Doesn't bother splitting the selector list into individual selectors or
341
+ # validating them. Feel free to do that yourself! It'll be fun!
342
+ def parse_selector(tokens)
343
+ create_node(:selector,
344
+ :value => parse_value(tokens),
345
+ :tokens => tokens)
346
+ end
347
+
348
+ # Parses a style rule and returns the result.
349
+ #
350
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#style-rules
351
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations0
352
+ def parse_style_rule(rule)
353
+ children = []
354
+ tokens = TokenScanner.new(rule[:block][:value])
355
+
356
+ consume_declarations(tokens).each do |decl|
357
+ unless decl[:node] == :declaration
358
+ children << decl
359
+ next
360
+ end
361
+
362
+ children << create_node(:property,
363
+ :name => decl[:name],
364
+ :value => parse_value(decl[:value]),
365
+ :tokens => decl[:tokens])
366
+ end
367
+
368
+ create_node(:style_rule,
369
+ :selector => parse_selector(rule[:prelude]),
370
+ :children => children
371
+ )
372
+ end
373
+
374
+ # Returns the unescaped value of a selector name or property declaration.
375
+ def parse_value(nodes)
376
+ string = ''
377
+
378
+ nodes.each do |node|
379
+ case node[:node]
380
+ when :comment, :semicolon then next
381
+ when :ident then string << node[:value]
382
+
383
+ when :function
384
+ if node[:value].is_a?(String)
385
+ string << node[:value]
386
+ else
387
+ string << parse_value(node[:value])
388
+ end
389
+
390
+ else
391
+ if node.key?(:raw)
392
+ string << node[:raw]
393
+ elsif node.key?(:tokens)
394
+ string << parse_value(node[:tokens])
395
+ end
396
+ end
397
+ end
398
+
399
+ string.strip
400
+ end
401
+ end
402
+
403
+ end
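The `!important` check in `consume_declaration` looks at the last two non-whitespace value tokens. The standalone sketch below isolates that idea so it can be tried without the gem. The `important?` helper is hypothetical, and the token hashes are hand-written approximations of the shapes shown in the README (a `:delim` token with a `:value` of `'!'` is assumed from the comparison in the parser code). The sketch also skips a trailing `:semicolon` token, which the original `reject` block does not do; whether that difference matters depends on `TokenScanner`, which is not shown in this diff.

```ruby
# Hypothetical helper (not part of Crass): returns true when a declaration's
# value tokens end in `!important`, compared case-insensitively.
def important?(value_tokens)
  # Assumption beyond the original code: a trailing :semicolon token is
  # ignored here in addition to whitespace.
  significant = value_tokens.reject do |token|
    token[:node] == :whitespace || token[:node] == :semicolon
  end

  # Array#[] with a negative start returns nil when fewer than two tokens
  # remain, so both variables end up nil in that case.
  bang, ident = significant[-2, 2]
  return false unless bang && ident

  bang[:node] == :delim && bang[:value] == '!' &&
    ident[:node] == :ident && ident[:value].downcase == 'important'
end

# Hand-written tokens approximating `red !IMPORTANT;` and `red`.
red_important = [
  {:node => :ident, :raw => "red", :value => "red"},
  {:node => :whitespace, :raw => " "},
  {:node => :delim, :raw => "!", :value => "!"},
  {:node => :ident, :raw => "IMPORTANT", :value => "IMPORTANT"},
  {:node => :semicolon, :raw => ";"}
]
plain_red = [{:node => :ident, :raw => "red", :value => "red"}]

flagged   = important?(red_important)
unflagged = important?(plain_red)
```

With these inputs `flagged` is `true` and `unflagged` is `false`. Returning a plain boolean keeps the sketch small; the parser instead sets `declaration[:important]` on the node it is building.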