crass 0.0.1

data/HISTORY.md ADDED
@@ -0,0 +1,4 @@
1
+ Crass Change History
2
+ ====================
3
+
4
+ ?????
data/LICENSE ADDED
@@ -0,0 +1,18 @@
1
+ Copyright (c) 2013 Ryan Grove (ryan@wonko.com)
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy of
4
+ this software and associated documentation files (the ‘Software’), to deal in
5
+ the Software without restriction, including without limitation the rights to
6
+ use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
7
+ the Software, and to permit persons to whom the Software is furnished to do so,
8
+ subject to the following conditions:
9
+
10
+ The above copyright notice and this permission notice shall be included in all
11
+ copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
15
+ FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
16
+ COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
17
+ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
18
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,157 @@
1
+ Crass
2
+ =====
3
+
4
+ Crass is a Ruby CSS parser based on the [CSS Syntax Module Level 3][css] draft.
5
+
6
+ * [Home](https://github.com/rgrove/crass/)
7
+ * [API Docs](http://rubydoc.info/github/rgrove/crass/master)
8
+
9
+ Features
10
+ --------
11
+
12
+ * Pure Ruby, with no runtime dependencies other than Ruby 1.9.x or higher.
13
+
14
+ * Tokenizes and parses CSS according to the rules defined in the
15
+ [CSS Syntax Module Level 3][css] draft.
16
+
17
+ * Extremely tolerant of broken or invalid CSS. If a browser can handle it, Crass
18
+ should be able to handle it too.
19
+
20
+ * Optionally includes comments in the token stream.
21
+
22
+ * Optionally preserves certain CSS hacks, such as the IE "*" hack, which would
23
+ otherwise be discarded according to CSS3 tokenizing rules.
24
+
25
+ * Capable of serializing the parse tree back to CSS while maintaining all
26
+ original whitespace, comments, and indentation.
27
+
28
+ [css]: http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/
29
+
30
+ Problems
31
+ --------
32
+
33
+ * It's pretty slow.
34
+
35
+ * Crass only parses the CSS syntax; it doesn't understand what any of it means,
36
+ doesn't coalesce selectors, etc. You can do this yourself by consuming the
37
+ parse tree, though.
38
+
39
+ * While any node in the parse tree (or the parse tree as a whole) can be
40
+ serialized back to CSS with perfect fidelity, changes made to those nodes
41
+ (except for wholesale removal of nodes) are not reflected in the serialized
42
+ output.
43
+
44
+ * It doesn't have any unit tests yet, because it's very, very new and I'm
45
+ still experimenting with its architecture.
46
+
47
+ * Probably plenty of other things. Did I mention it's very new?
48
+
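The point above about consuming the parse tree yourself can be sketched roughly as follows. This is only an illustration: the tree literal is hand-written in the shape shown in the README's example output (token lists omitted, and a second `a:hover` rule added for the demonstration), and `properties_by_selector` is a hypothetical helper, not part of Crass.

```ruby
# Hand-written parse tree in the shape Crass.parse is shown to return
# (token lists omitted). The second a:hover rule is added for illustration.
tree = [
  {:node => :comment, :pos => 0, :raw => "/* Comment! */", :value => " Comment! "},
  {:node => :style_rule,
   :selector => {:node => :selector, :value => "a:hover"},
   :children => [
     {:node => :property, :name => "color", :value => "#0d8bfa"},
     {:node => :whitespace, :raw => "\n  "},
     {:node => :property, :name => "text-decoration", :value => "underline"}
   ]},
  {:node => :style_rule,
   :selector => {:node => :selector, :value => "a:hover"},
   :children => [
     {:node => :property, :name => "color", :value => "red"}
   ]}
]

# Hypothetical helper (not part of Crass): merges the properties of all
# style rules that share the same selector string. Later declarations
# overwrite earlier ones.
def properties_by_selector(tree)
  tree.each_with_object({}) do |node, result|
    next unless node[:node] == :style_rule

    props = (result[node[:selector][:value]] ||= {})

    node[:children].each do |child|
      props[child[:name]] = child[:value] if child[:node] == :property
    end
  end
end

merged = properties_by_selector(tree)
p merged
```

The merge deliberately ignores specificity, `!important`, and selector lists such as `a, b`; a real coalescing pass would need to handle those.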
49
+ Installing
50
+ ----------
51
+
52
+ Don't install it yet. It's not finished.
53
+
54
+ Examples
55
+ --------
56
+
57
+ Say you have a string containing the following simple CSS:
58
+
59
+ ```css
60
+ /* Comment! */
61
+ a:hover {
62
+   color: #0d8bfa;
63
+   text-decoration: underline;
64
+ }
65
+ ```
66
+
67
+ Parsing it is simple:
68
+
69
+ ```ruby
70
+ tree = Crass.parse(css, :preserve_comments => true)
71
+ ```
72
+
73
+ This returns a big fat ugly parse tree, which looks like this:
74
+
75
+ ```ruby
76
+ [{:node=>:comment, :pos=>0, :raw=>"/* Comment! */", :value=>" Comment! "},
77
+ {:node=>:whitespace, :pos=>14, :raw=>"\n"},
78
+ {:node=>:style_rule,
79
+ :selector=>
80
+ {:node=>:selector,
81
+ :value=>"a:hover",
82
+ :tokens=>
83
+ [{:node=>:ident, :pos=>15, :raw=>"a", :value=>"a"},
84
+ {:node=>:colon, :pos=>16, :raw=>":"},
85
+ {:node=>:ident, :pos=>17, :raw=>"hover", :value=>"hover"},
86
+ {:node=>:whitespace, :pos=>22, :raw=>" "}]},
87
+ :children=>
88
+ [{:node=>:whitespace, :pos=>24, :raw=>"\n  "},
89
+ {:node=>:property,
90
+ :name=>"color",
91
+ :value=>"#0d8bfa",
92
+ :tokens=>
93
+ [{:node=>:ident, :pos=>27, :raw=>"color", :value=>"color"},
94
+ {:node=>:colon, :pos=>32, :raw=>":"},
95
+ {:node=>:whitespace, :pos=>33, :raw=>" "},
96
+ {:node=>:hash,
97
+ :pos=>34,
98
+ :raw=>"#0d8bfa",
99
+ :type=>:unrestricted,
100
+ :value=>"0d8bfa"},
101
+ {:node=>:semicolon, :pos=>41, :raw=>";"}]},
102
+ {:node=>:whitespace, :pos=>42, :raw=>"\n  "},
103
+ {:node=>:property,
104
+ :name=>"text-decoration",
105
+ :value=>"underline",
106
+ :tokens=>
107
+ [{:node=>:ident,
108
+ :pos=>45,
109
+ :raw=>"text-decoration",
110
+ :value=>"text-decoration"},
111
+ {:node=>:colon, :pos=>60, :raw=>":"},
112
+ {:node=>:whitespace, :pos=>61, :raw=>" "},
113
+ {:node=>:ident, :pos=>62, :raw=>"underline", :value=>"underline"},
114
+ {:node=>:semicolon, :pos=>71, :raw=>";"}]},
115
+ {:node=>:whitespace, :pos=>72, :raw=>"\n"}]},
116
+ {:node=>:whitespace, :pos=>74, :raw=>"\n"}]
117
+ ```
118
+
119
+ If you want, you can stringify the parse tree:
120
+
121
+ ```ruby
122
+ css = Crass::Parser.stringify(tree)
123
+ ```
124
+
125
+ ...which gives you back exactly what you put in!
126
+
127
+ ```css
128
+ /* Comment! */
129
+ a:hover {
130
+   color: #0d8bfa;
131
+   text-decoration: underline;
132
+ }
133
+ ```
134
+
135
+ Wasn't that exciting?
136
+
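Why removing a node shows up in the output while editing one does not can be illustrated with a small stand-in. The sketch below assumes that serialization concatenates each token's `:raw` text, as the `:raw` and `:tokens` branches of `Parser.stringify` in this diff appear to do. `raw_css` is a hypothetical simplification, not the library's method, and the nodes are hand-written.

```ruby
# Simplified stand-in for the :raw / :tokens branches of Parser.stringify.
# Not the library's implementation; it only concatenates raw token text.
def raw_css(nodes)
  nodes = [nodes] unless nodes.is_a?(Array)

  nodes.map do |node|
    if node.key?(:raw)
      node[:raw]
    elsif node.key?(:tokens)
      raw_css(node[:tokens])
    else
      ''
    end
  end.join
end

# Hand-written nodes in the shape of the README example.
property = {
  :node => :property, :name => "color", :value => "red",
  :tokens => [
    {:node => :ident, :raw => "color", :value => "color"},
    {:node => :colon, :raw => ":"},
    {:node => :whitespace, :raw => " "},
    {:node => :ident, :raw => "red", :value => "red"},
    {:node => :semicolon, :raw => ";"}
  ]
}
comment = {:node => :comment, :raw => "/* note */", :value => " note "}
nodes   = [comment, property]

original = raw_css(nodes)

# Editing the parsed value changes nothing, because output is built
# from the :raw text of the underlying tokens.
property[:value] = "blue"
after_edit = raw_css(nodes)

# Removing a node wholesale is reflected in the output.
after_removal = raw_css(nodes.reject { |n| n[:node] == :comment })

puts original
puts after_edit
puts after_removal
```

To make an edit visible under this scheme, the `:raw` text of the relevant tokens would have to be changed as well.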
137
+ License
138
+ -------
139
+
140
+ Copyright (c) 2013 Ryan Grove (ryan@wonko.com)
141
+
142
+ Permission is hereby granted, free of charge, to any person obtaining a copy of
143
+ this software and associated documentation files (the ‘Software’), to deal in
144
+ the Software without restriction, including without limitation the rights to
145
+ use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
146
+ the Software, and to permit persons to whom the Software is furnished to do so,
147
+ subject to the following conditions:
148
+
149
+ The above copyright notice and this permission notice shall be included in all
150
+ copies or substantial portions of the Software.
151
+
152
+ THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
153
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
154
+ FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
155
+ COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
156
+ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
157
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/lib/crass.rb ADDED
@@ -0,0 +1,13 @@
1
+ # encoding: utf-8
2
+ require_relative 'crass/parser'
3
+
4
+ # A CSS parser based on the CSS Syntax Module Level 3 draft.
5
+ module Crass
6
+
7
+ # Parses _input_ as a CSS stylesheet and returns a parse tree.
8
+ #
9
+ # See {Tokenizer#initialize} for _options_.
10
+ def self.parse(input, options = {})
11
+ Parser.parse_stylesheet(input, options)
12
+ end
13
+ end
@@ -0,0 +1,403 @@
1
+ # encoding: utf-8
2
+ require_relative 'token-scanner'
3
+ require_relative 'tokenizer'
4
+
5
+ module Crass
6
+
7
+ # Parses a CSS string or list of tokens.
8
+ #
9
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parsing
10
+ class Parser
11
+ BLOCK_END_TOKENS = {
12
+ :'{' => :'}',
13
+ :'[' => :']',
14
+ :'(' => :')'
15
+ }
16
+
17
+ # -- Class Methods ---------------------------------------------------------
18
+
19
+ # Parses a CSS stylesheet and returns a parse tree.
20
+ #
21
+ # See {Tokenizer#initialize} for _options_.
22
+ #
23
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#parse-a-stylesheet
24
+ def self.parse_stylesheet(input, options = {})
25
+ parser = Parser.new(input, options)
26
+ rules = parser.consume_rules(:top_level => true)
27
+
28
+ rules.map do |rule|
29
+ case rule[:node]
30
+ # TODO: handle at-rules
31
+ when :qualified_rule then parser.parse_style_rule(rule)
32
+ else rule
33
+ end
34
+ end
35
+ end
36
+
37
+ # Converts a node or array of nodes into a CSS string based on their
38
+ # original tokenized input.
39
+ #
40
+ # Options:
41
+ #
42
+ # * **:exclude_comments** - When `true`, comments will be excluded.
43
+ # * **:indent** - When given, this string is prepended to each property.
44
+ def self.stringify(nodes, options = {})
45
+ nodes = [nodes] unless nodes.is_a?(Array)
46
+ string = ''
47
+
48
+ nodes.each do |node|
49
+ case node[:node]
50
+ when :comment
51
+ string << node[:raw] unless options[:exclude_comments]
52
+
53
+ when :style_rule
54
+ string << self.stringify(node[:selector][:tokens], options)
55
+ string << "{"
56
+ string << self.stringify(node[:children], options)
57
+ string << "}"
58
+
59
+ when :property
60
+ string << options[:indent] if options[:indent]
61
+ string << self.stringify(node[:tokens], options)
62
+
63
+ else
64
+ if node.key?(:raw)
65
+ string << node[:raw]
66
+ elsif node.key?(:tokens)
67
+ string << self.stringify(node[:tokens], options)
68
+ end
69
+ end
70
+ end
71
+
72
+ string
73
+ end
74
+
75
+ # -- Instance Methods ------------------------------------------------------
76
+
77
+ # Array of tokens generated from this parser's input.
78
+ attr_reader :tokens
79
+
80
+ # Initializes a parser based on the given _input_, which may be a CSS string
81
+ # or an array of tokens.
82
+ #
83
+ # See {Tokenizer#initialize} for _options_.
84
+ def initialize(input, options = {})
85
+ unless input.kind_of?(Enumerable)
86
+ input = Tokenizer.tokenize(input, options)
87
+ end
88
+
89
+ @tokens = TokenScanner.new(input)
90
+ end
91
+
92
+ # Consumes an at-rule and returns it.
93
+ #
94
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-an-at-rule0
95
+ def consume_at_rule(tokens = @tokens)
96
+ rule = {:prelude => []}
97
+
98
+ rule[:tokens] = tokens.collect do
99
+ while token = tokens.consume
100
+ case token[:node]
101
+ when :comment then next
102
+ when :semicolon, :eof then break
103
+
104
+ when :'{' then
105
+ rule[:block] = consume_simple_block(tokens)
106
+ break
107
+
108
+ # TODO: At this point, the spec says we should check for a "simple block
109
+ # with an associated token of <<{-token>>", but isn't that exactly what
110
+ # we just did above? And the tokenizer only ever produces standalone
111
+ # <<{-token>>s, so how could the token stream ever contain one that's
112
+ # already associated with a simple block? What am I missing?
113
+
114
+ else
115
+ tokens.reconsume
116
+ rule[:prelude] << consume_component_value(tokens)
117
+ end
118
+ end
119
+ end
120
+
121
+ create_node(:at_rule, rule)
122
+ end
123
+
124
+ # Consumes a component value and returns it.
125
+ #
126
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-component-value0
127
+ def consume_component_value(tokens = @tokens)
128
+ return nil unless token = tokens.consume
129
+
130
+ case token[:node]
131
+ when :'{', :'[', :'(' then consume_simple_block(tokens)
132
+ when :function then consume_function(tokens)
133
+ else token
134
+ end
135
+ end
136
+
137
+ # Consumes a declaration and returns it, or `nil` on parse error.
138
+ #
139
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-declaration0
140
+ def consume_declaration(tokens = @tokens)
141
+ declaration = {}
142
+
143
+ declaration[:tokens] = tokens.collect do
144
+ declaration[:name] = tokens.consume[:value]
145
+
146
+ value = []
147
+ token = tokens.consume
148
+ token = tokens.consume while token && token[:node] == :whitespace
149
+
150
+ return nil if token.nil? || token[:node] != :colon # TODO: parse error
151
+
152
+ value << token while token = tokens.consume
153
+ declaration[:value] = value
154
+
155
+ maybe_important = value.reject {|v| [:whitespace, :semicolon].include?(v[:node]) }[-2, 2]
156
+
157
+ if maybe_important &&
158
+ maybe_important[0][:node] == :delim &&
159
+ maybe_important[0][:value] == '!' &&
160
+ maybe_important[1][:node] == :ident &&
161
+ maybe_important[1][:value].downcase == 'important'
162
+
163
+ declaration[:important] = true
164
+ end
165
+ end
166
+
167
+ create_node(:declaration, declaration)
168
+ end
169
+
170
+ # Consumes a list of declarations and returns them.
171
+ #
172
+ # NOTE: The returned list may include `:comment`, `:semicolon`, and
173
+ # `:whitespace` nodes, which is non-standard.
174
+ #
175
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations0
176
+ def consume_declarations(tokens = @tokens)
177
+ declarations = []
178
+
179
+ while token = tokens.consume
180
+ case token[:node]
181
+ when :comment, :semicolon, :whitespace
182
+ declarations << token
183
+
184
+ when :at_keyword
185
+ # TODO: this is technically a parse error when parsing a style rule,
186
+ # but not necessarily at other times.
187
+ declarations << consume_at_rule(tokens)
188
+
189
+ when :ident
190
+ decl_tokens = [token]
191
+ tokens.consume
192
+
193
+ while tokens.current
194
+ decl_tokens << tokens.current
195
+ break if tokens.current[:node] == :semicolon
196
+ tokens.consume
197
+ end
198
+
199
+ if decl = consume_declaration(TokenScanner.new(decl_tokens))
200
+ declarations << decl
201
+ end
202
+
203
+ else
204
+ # TODO: parse error (invalid property name, etc.)
205
+ while token && token[:node] != :semicolon
206
+ token = consume_component_value(tokens)
207
+ end
208
+ end
209
+ end
210
+
211
+ declarations
212
+ end
213
+
214
+ # Consumes a function and returns it.
215
+ #
216
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-function
217
+ def consume_function(tokens = @tokens)
218
+ function = {
219
+ :name => tokens.current[:value],
220
+ :value => [],
221
+ :tokens => [tokens.current]
222
+ }
223
+
224
+ function[:tokens].concat(tokens.collect do
225
+ while token = tokens.consume
226
+ case token[:node]
227
+ when :')', :eof then break
228
+ when :comment then next
229
+
230
+ else
231
+ tokens.reconsume
232
+ function[:value] << consume_component_value(tokens)
233
+ end
234
+ end
235
+ end)
236
+
237
+ create_node(:function, function)
238
+ end
239
+
240
+ # Consumes a qualified rule and returns it, or `nil` if a parse error
241
+ # occurs.
242
+ #
243
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-qualified-rule0
244
+ def consume_qualified_rule(tokens = @tokens)
245
+ rule = {:prelude => []}
246
+
247
+ rule[:tokens] = tokens.collect do
248
+ while true
249
+ return nil unless token = tokens.consume
250
+
251
+ if token[:node] == :'{'
252
+ rule[:block] = consume_simple_block(tokens)
253
+ break
254
+
255
+ # elsif [simple block with an associated <<{-token>>??]
256
+
257
+ # TODO: At this point, the spec says we should check for a "simple block
258
+ # with an associated token of <<{-token>>", but isn't that exactly what
259
+ # we just did above? And the tokenizer only ever produces standalone
260
+ # <<{-token>>s, so how could the token stream ever contain one that's
261
+ # already associated with a simple block? What am I missing?
262
+
263
+ else
264
+ tokens.reconsume
265
+ rule[:prelude] << consume_component_value(tokens)
266
+ end
267
+ end
268
+ end
269
+
270
+ create_node(:qualified_rule, rule)
271
+ end
272
+
273
+ # Consumes a list of rules and returns them.
274
+ #
275
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-rules0
276
+ def consume_rules(flags = {})
277
+ rules = []
278
+
279
+ while true
280
+ return rules unless token = @tokens.consume
281
+
282
+ case token[:node]
283
+ when :comment, :whitespace then rules << token
284
+ when :eof then return rules
285
+
286
+ when :cdc, :cdo
287
+ unless flags[:top_level]
288
+ @tokens.reconsume
289
+ rule = consume_qualified_rule
290
+ rules << rule if rule
291
+ end
292
+
293
+ when :at_keyword
294
+ @tokens.reconsume
295
+ rule = consume_at_rule
296
+ rules << rule if rule
297
+
298
+ else
299
+ @tokens.reconsume
300
+ rule = consume_qualified_rule
301
+ rules << rule if rule
302
+ end
303
+ end
304
+ end
305
+
306
+ # Consumes and returns a simple block associated with the current input
307
+ # token.
308
+ #
309
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-simple-block0
310
+ def consume_simple_block(tokens = @tokens)
311
+ start_token = tokens.current[:node]
312
+ end_token = BLOCK_END_TOKENS[start_token]
313
+
314
+ block = {
315
+ :start => start_token.to_s,
316
+ :end => end_token.to_s,
317
+ :value => [],
318
+ :tokens => [tokens.current]
319
+ }
320
+
321
+ block[:tokens].concat(tokens.collect do
322
+ while token = tokens.consume
323
+ break if token[:node] == end_token || token[:node] == :eof
324
+
325
+ tokens.reconsume
326
+ block[:value] << consume_component_value(tokens)
327
+ end
328
+ end)
329
+
330
+ create_node(:simple_block, block)
331
+ end
332
+
333
+ # Creates and returns a new parse node with the given _properties_.
334
+ def create_node(type, properties = {})
335
+ {:node => type}.merge!(properties)
336
+ end
337
+
338
+ # Parses the given _tokens_ into a selector node and returns it.
339
+ #
340
+ # Doesn't bother splitting the selector list into individual selectors or
341
+ # validating them. Feel free to do that yourself! It'll be fun!
342
+ def parse_selector(tokens)
343
+ create_node(:selector,
344
+ :value => parse_value(tokens),
345
+ :tokens => tokens)
346
+ end
347
+
348
+ # Parses a style rule and returns the result.
349
+ #
350
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#style-rules
351
+ # http://www.w3.org/TR/2013/WD-css-syntax-3-20130919/#consume-a-list-of-declarations0
352
+ def parse_style_rule(rule)
353
+ children = []
354
+ tokens = TokenScanner.new(rule[:block][:value])
355
+
356
+ consume_declarations(tokens).each do |decl|
357
+ unless decl[:node] == :declaration
358
+ children << decl
359
+ next
360
+ end
361
+
362
+ children << create_node(:property,
363
+ :name => decl[:name],
364
+ :value => parse_value(decl[:value]),
365
+ :tokens => decl[:tokens])
366
+ end
367
+
368
+ create_node(:style_rule,
369
+ :selector => parse_selector(rule[:prelude]),
370
+ :children => children
371
+ )
372
+ end
373
+
374
+ # Returns the unescaped value of a selector name or property declaration.
375
+ def parse_value(nodes)
376
+ string = ''
377
+
378
+ nodes.each do |node|
379
+ case node[:node]
380
+ when :comment, :semicolon then next
381
+ when :ident then string << node[:value]
382
+
383
+ when :function
384
+ if node[:value].is_a?(String)
385
+ string << node[:value]
386
+ else
387
+ string << parse_value(node[:value])
388
+ end
389
+
390
+ else
391
+ if node.key?(:raw)
392
+ string << node[:raw]
393
+ elsif node.key?(:tokens)
394
+ string << parse_value(node[:tokens])
395
+ end
396
+ end
397
+ end
398
+
399
+ string.strip
400
+ end
401
+ end
402
+
403
+ end
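The `!important` check in `consume_declaration` can be studied in isolation with a small sketch. `important?` is a hypothetical helper, and the token hashes are hand-written in the shape the parser code reads (`:node`, `:value`, `:raw`). This variant also skips a trailing `:semicolon` node; that is an assumption about what the value list may contain, based on `parse_value` skipping `:semicolon` nodes.

```ruby
# Hypothetical helper mirroring the check in consume_declaration:
# a declaration is important when its last two significant tokens
# are a "!" delimiter followed by the identifier "important".
def important?(value_tokens)
  significant = value_tokens.reject do |token|
    [:whitespace, :semicolon].include?(token[:node])
  end

  # Array#[] with (start, length) returns nil when fewer than two
  # elements are available, so both variables become nil in that case.
  bang, ident = significant[-2, 2]
  return false unless bang && ident

  bang[:node] == :delim && bang[:value] == '!' &&
    ident[:node] == :ident && ident[:value].downcase == 'important'
end

# Hand-written tokens for the value of `color: red !IMPORTANT;`
important_value = [
  {:node => :whitespace, :raw => " "},
  {:node => :ident, :raw => "red", :value => "red"},
  {:node => :whitespace, :raw => " "},
  {:node => :delim, :raw => "!", :value => "!"},
  {:node => :ident, :raw => "IMPORTANT", :value => "IMPORTANT"},
  {:node => :semicolon, :raw => ";"}
]

# Hand-written tokens for the value of `color: red;`
plain_value = [
  {:node => :whitespace, :raw => " "},
  {:node => :ident, :raw => "red", :value => "red"},
  {:node => :semicolon, :raw => ";"}
]

puts important?(important_value)
puts important?(plain_value)
```

Comparing the identifier case-insensitively matches the `downcase` call in the parser.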