oj 0.9.0 → 1.0.0
- data/README.md +95 -8
- data/ext/oj/fast.c +1540 -0
- data/ext/oj/load.c +1 -1
- data/ext/oj/oj.c +28 -21
- data/ext/oj/oj.h +6 -0
- data/lib/oj.rb +2 -1
- data/lib/oj/version.rb +1 -1
- data/test/perf_fast.rb +119 -0
- data/test/perf_strict.rb +13 -8
- data/test/test_fast.rb +331 -0
- data/test/where.rb +54 -0
- metadata +6 -2
data/README.md
CHANGED
@@ -16,15 +16,17 @@ A fast JSON parser and Object marshaller as a Ruby gem.
 
 ## <a name="links">Links of Interest</a>
 
+[Need for Speed](http://www.ohler.com/software/thoughts/Blog/Entries/2012/3/13_Need_for_Speed.html) for an overview of how Oj::Doc was designed.
+
 *Fast XML parser and marshaller on RubyGems*: https://rubygems.org/gems/ox
 
 *Fast XML parser and marshaller on GitHub*: https://rubygems.org/gems/ox
 
 ## <a name="release">Release Notes</a>
 
-### Release 0.
+### Release 1.0.0
 
--
+- The screaming fast Oj::Doc parser added.
 
 ## <a name="description">Description</a>
 
@@ -57,15 +59,100 @@ Oj is compatible with Ruby 1.8.7, 1.9.2, 1.9.3, JRuby, and RBX.
 
 ## <a name="plans">Planned Releases</a>
 
-- Release 1.0:
+- Release 1.0.1: Optimize the Oj::Doc dump() method to be native.
+
+- Release 1.1: A JSON stream parser. Pushed out for the Oj::Doc parser.
 
 ## <a name="compare">Comparisons</a>
 
+### Fast Oj::Doc parser comparisons
+
+The fast Oj::Doc parser is compared to the Yajl and JSON::Pure parsers with
+strict JSON documents. No object conversions are included, just simple JSON.
+
+Since Oj::Doc deviates from the conventional parsers, comparisons of not
+only parsing but also data access are included. These tests use the
+perf_fast.rb test file. The first benchmark is for just parsing. The second is
+for doing a get on every leaf value in the JSON data structure. The third
+fetches a value from a specific spot in the document. With Yajl and JSON this
+is done with a set of calls to fetch() for each level in the document. For
+Oj::Doc a single fetch with a path is used.
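As a rough illustration of the two access styles contrasted above, the sketch below uses only Ruby's standard `json` library. `fetch_path` is a hypothetical helper written for this example; it is not part of Oj, Yajl, or JSON. It walks a slash-separated path over plain Hashes and Arrays and, as an assumption mirroring the one-based `index` field in `fast.c`, counts array positions from 1.

```ruby
require 'json'

# Hypothetical helper (not an Oj API): resolve a slash-separated path
# against nested Hashes and Arrays. Array steps are 1-based here.
def fetch_path(data, path)
  path.split('/').reject(&:empty?).reduce(data) do |node, step|
    case node
    when Hash  then node.fetch(step)
    when Array then node.fetch(Integer(step) - 1)
    else raise KeyError, "cannot descend into #{node.class} at #{step.inspect}"
    end
  end
end

json = '{"a":{"b":[10,20,{"c":"deep"}]}}'
data = JSON.parse(json)

# Conventional style: one fetch() call per level of the document.
by_levels = data.fetch('a').fetch('b').fetch(2).fetch('c')

# Path style: a single call that names the whole route.
by_path = fetch_path(data, '/a/b/3/c')

puts by_levels
puts by_path
```

Both calls reach the same value. The helper still materializes the whole document as Ruby objects first, whereas the README describes Oj::Doc as answering a path fetch against its own parsed tree.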
+
+The benchmark results are:
+
+    > perf_fast.rb -g 1 -f
+    --------------------------------------------------------------------------------
+    Parse Performance
+    Oj::Doc.parse 100000 times in 0.164 seconds or 609893.696 parse/sec.
+    Yajl.parse 100000 times in 3.168 seconds or 31569.902 parse/sec.
+    JSON::Ext.parse 100000 times in 3.282 seconds or 30464.826 parse/sec.
+
+    Summary:
+       System  time (secs)  rate (ops/sec)
+    ---------  -----------  --------------
+      Oj::Doc        0.164      609893.696
+         Yajl        3.168       31569.902
+    JSON::Ext        3.282       30464.826
+
+    Comparison Matrix
+    (performance factor, 2.0 row is means twice as fast as column)
+                 Oj::Doc       Yajl  JSON::Ext
+    ---------  ---------  ---------  ---------
+      Oj::Doc       1.00      19.32      20.02
+         Yajl       0.05       1.00       1.04
+    JSON::Ext       0.05       0.96       1.00
+
+    --------------------------------------------------------------------------------
+    Parse and get all values Performance
+    Oj::Doc.parse 100000 times in 0.417 seconds or 240054.540 parse/sec.
+    Yajl.parse 100000 times in 5.159 seconds or 19384.191 parse/sec.
+    JSON::Ext.parse 100000 times in 5.269 seconds or 18978.638 parse/sec.
+
+    Summary:
+       System  time (secs)  rate (ops/sec)
+    ---------  -----------  --------------
+      Oj::Doc        0.417      240054.540
+         Yajl        5.159       19384.191
+    JSON::Ext        5.269       18978.638
+
+    Comparison Matrix
+    (performance factor, 2.0 row is means twice as fast as column)
+                 Oj::Doc       Yajl  JSON::Ext
+    ---------  ---------  ---------  ---------
+      Oj::Doc       1.00      12.38      12.65
+         Yajl       0.08       1.00       1.02
+    JSON::Ext       0.08       0.98       1.00
+
+    --------------------------------------------------------------------------------
+    fetch nested Performance
+    Oj::Doc.fetch 100000 times in 0.094 seconds or 1059995.760 fetch/sec.
+    Ruby.fetch 100000 times in 0.503 seconds or 198851.434 fetch/sec.
+
+    Summary:
+     System  time (secs)  rate (ops/sec)
+    -------  -----------  --------------
+    Oj::Doc        0.094     1059995.760
+       Ruby        0.503      198851.434
+
+    Comparison Matrix
+    (performance factor, 2.0 row is means twice as fast as column)
+             Oj::Doc     Ruby
+    -------  -------  -------
+    Oj::Doc     1.00     5.33
+       Ruby     0.19     1.00
+
+The results mean that for getting just a few values from a JSON
+document Oj::Doc is 20 times faster than any other parser and for accessing
+all values it is still over 12 times faster than any other Ruby JSON parser.
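The comparison matrices can be reproduced from the reported rates: each cell is the row system's rate divided by the column system's rate. The sketch below is an illustration only. The rates are copied from the "Parse Performance" summary, and the `factor` helper name is made up for this example.

```ruby
# Rates (ops/sec) copied from the "Parse Performance" summary.
RATES = {
  'Oj::Doc'   => 609_893.696,
  'Yajl'      => 31_569.902,
  'JSON::Ext' => 30_464.826
}.freeze

# Hypothetical helper: how many times faster `row` is than `col`.
def factor(row, col)
  (RATES.fetch(row) / RATES.fetch(col)).round(2)
end

# Print one matrix row per system, formatted like the README tables.
RATES.each_key do |row|
  cells = RATES.keys.map { |col| format('%9.2f', factor(row, col)) }
  puts format('%9s  %s', row, cells.join('  '))
end
```

The "20 times faster" figure in the summary corresponds to the 20.02 cell (Oj::Doc against JSON::Ext). Against Yajl the parse-only factor is 19.32, and the "over 12 times" figure comes from the 12.38 and 12.65 cells of the second matrix.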
+
+### Conventional Oj parser comparisons
+
 The following table shows the difference in speeds between several
-serialization packages
-some of the gems. I finally gave up
-without errors with Ruby 1.9.3. It had
-a simple JSON structure. The errors
+serialization packages compared to the more conventional Oj parser. The tests
+had to be scaled back due to limitations of some of the gems. I finally gave up
+trying to get JSON Pure to serialize without errors with Ruby 1.9.3. It had
+internal errors on anything other than a simple JSON structure. The errors
+encountered were:
 
 - MessagePack fails to convert Bignum to JSON
 
@@ -84,7 +171,7 @@ It is also worth noting that although Oj is slightly behind MessagePack for
 parsing, Oj serialization is much faster than MessagePack even though Oj uses
 human readable JSON vs the binary MessagePack format.
 
-
+Oj supports circular references when in :object mode and when the :circular
 flag is true. None of the other gems tested supported circular
 references. They failed in the following manners when the input included
 circular references.
data/ext/oj/fast.c
ADDED
@@ -0,0 +1,1540 @@
+/* fast.c
+ * Copyright (c) 2012, Peter Ohler
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * - Redistributions of source code must retain the above copyright notice, this
+ *   list of conditions and the following disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above copyright notice,
+ *   this list of conditions and the following disclaimer in the documentation
+ *   and/or other materials provided with the distribution.
+ *
+ * - Neither the name of Peter Ohler nor the names of its contributors may be
+ *   used to endorse or promote products derived from this software without
+ *   specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <math.h>
+#include <errno.h>
+
+#include "ruby.h"
+#include "oj.h"
+
+#define MAX_STACK 100
+
+enum {
+    STR_VAL = 0x00,
+    COL_VAL = 0x01,
+    RUBY_VAL = 0x02
+};
+
+typedef struct _Leaf {
+    struct _Leaf *next;
+    union {
+        const char *key; // hash key
+        size_t index; // array index, 0 is not set
+    };
+    union {
+        char *str; // pointer to location in json string
+        struct _Leaf *elements; // array and hash elements
+        VALUE value;
+    };
+    uint8_t type;
+    uint8_t parent_type;
+    uint8_t value_type;
+} *Leaf;
+
+//#define BATCH_SIZE (4096 / sizeof(struct _Leaf) - 1)
+#define BATCH_SIZE 100
+
+typedef struct _Batch {
+    struct _Batch *next;
+    int next_avail;
+    struct _Leaf leaves[BATCH_SIZE];
+} *Batch;
+
+typedef struct _Doc {
+    Leaf data;
+    Leaf *where; // points to current location
+    Leaf where_path[MAX_STACK]; // points to head of path
+#ifdef HAVE_RUBY_ENCODING_H
+    rb_encoding *encoding;
+#else
+    void *encoding;
+#endif
+    unsigned long size; // number of leaves/branches in the doc
+    VALUE self;
+    Batch batches;
+    //Leaf where_array[MAX_STACK];
+    //size_t where_len; // length of allocated if longer than where_array
+    struct _Batch batch0;
+} *Doc;
+
+typedef struct _ParseInfo {
+    char *str; /* buffer being read from */
+    char *s; /* current position in buffer */
+    Doc doc;
+} *ParseInfo;
+
+static void leaf_init(Leaf leaf, int type);
+static Leaf leaf_new(Doc doc, int type);
+static void leaf_append_element(Leaf parent, Leaf element);
+static VALUE leaf_value(Doc doc, Leaf leaf);
+static void leaf_fixnum_value(Leaf leaf);
+static void leaf_float_value(Leaf leaf);
+static VALUE leaf_array_value(Doc doc, Leaf leaf);
+static VALUE leaf_hash_value(Doc doc, Leaf leaf);
+
+static Leaf read_next(ParseInfo pi);
+static Leaf read_obj(ParseInfo pi);
+static Leaf read_array(ParseInfo pi);
+static Leaf read_str(ParseInfo pi);
+static Leaf read_num(ParseInfo pi);
+static Leaf read_true(ParseInfo pi);
+static Leaf read_false(ParseInfo pi);
+static Leaf read_nil(ParseInfo pi);
+static void next_non_white(ParseInfo pi);
+static char* read_quoted_value(ParseInfo pi);
+
+static VALUE protect_open_proc(VALUE x);
+static VALUE parse_json(VALUE clas, char *json);
+static void each_leaf(Doc doc, VALUE self);
+static int move_step(Doc doc, const char *path, int loc);
+static Leaf get_doc_leaf(Doc doc, const char *path);
+static Leaf get_leaf(Leaf *stack, Leaf *lp, const char *path);
+static void each_value(Doc doc, Leaf leaf);
+
+static void doc_init(Doc doc);
+static void doc_free(Doc doc);
+static VALUE doc_open(VALUE clas, VALUE str);
+static VALUE doc_open_file(VALUE clas, VALUE filename);
+static VALUE doc_where(VALUE self);
+static VALUE doc_local_key(VALUE self);
+static VALUE doc_home(VALUE self);
+static VALUE doc_type(int argc, VALUE *argv, VALUE self);
+static VALUE doc_fetch(int argc, VALUE *argv, VALUE self);
+static VALUE doc_each_leaf(int argc, VALUE *argv, VALUE self);
+static VALUE doc_move(VALUE self, VALUE str);
+static VALUE doc_each_child(int argc, VALUE *argv, VALUE self);
+static VALUE doc_each_value(int argc, VALUE *argv, VALUE self);
+static VALUE doc_dump(int argc, VALUE *argv, VALUE self);
+static VALUE doc_size(VALUE self);
+
+VALUE oj_doc_class = 0;
+
+inline static void
+next_non_white(ParseInfo pi) {
+    for (; 1; pi->s++) {
+        switch(*pi->s) {
+        case ' ':
+        case '\t':
+        case '\f':
+        case '\n':
+        case '\r':
+            break;
+        default:
+            return;
+        }
+    }
+}
+
+inline static void
+next_white(ParseInfo pi) {
+    for (; 1; pi->s++) {
+        switch(*pi->s) {
+        case ' ':
+        case '\t':
+        case '\f':
+        case '\n':
+        case '\r':
+        case '\0':
+            return;
+        default:
+            break;
+        }
+    }
+}
+
+inline static char*
+ulong_fill(char *s, size_t num) {
+    char buf[32];
+    char *b = buf + sizeof(buf) - 1;
+
+    *b-- = '\0';
+    for (; 0 < num; num /= 10, b--) {
+        *b = (num % 10) + '0';
+    }
+    b++;
+    if ('\0' == *b) {
+        b--;
+        *b = '0';
+    }
+    for (; '\0' != *b; b++, s++) {
+        *s = *b;
+    }
+    return s;
+}
+
+inline static void
+leaf_init(Leaf leaf, int type) {
+    leaf->next = 0;
+    leaf->type = type;
+    leaf->parent_type = T_NONE;
+    switch (type) {
+    case T_ARRAY:
+    case T_HASH:
+        leaf->elements = 0;
+        leaf->value_type = COL_VAL;
+        break;
+    case T_NIL:
+        leaf->value = Qnil;
+        leaf->value_type = RUBY_VAL;
+        break;
+    case T_TRUE:
+        leaf->value = Qtrue;
+        leaf->value_type = RUBY_VAL;
+        break;
+    case T_FALSE:
+        leaf->value = Qfalse;
+        leaf->value_type = RUBY_VAL;
+        break;
+    case T_FIXNUM:
+    case T_FLOAT:
+    case T_STRING:
+    default:
+        leaf->value_type = STR_VAL;
+        break;
+    }
+}
+
+inline static Leaf
+leaf_new(Doc doc, int type) {
+    Leaf leaf;
+
+    if (0 == doc->batches || BATCH_SIZE == doc->batches->next_avail) {
+        Batch b = ALLOC(struct _Batch);
+
+        b->next = doc->batches;
+        doc->batches = b;
+        b->next_avail = 0;
+    }
+    leaf = &doc->batches->leaves[doc->batches->next_avail];
+    doc->batches->next_avail++;
+    leaf_init(leaf, type);
+
+    return leaf;
+}
+
+inline static void
+leaf_append_element(Leaf parent, Leaf element) {
+    if (0 == parent->elements) {
+        parent->elements = element;
+        element->next = element;
+    } else {
+        element->next = parent->elements->next;
+        parent->elements->next = element;
+        parent->elements = element;
+    }
+}
+
+static VALUE
+leaf_value(Doc doc, Leaf leaf) {
+    if (RUBY_VAL != leaf->value_type) {
+        switch (leaf->type) {
+        case T_NIL:
+            leaf->value = Qnil;
+            break;
+        case T_TRUE:
+            leaf->value = Qtrue;
+            break;
+        case T_FALSE:
+            leaf->value = Qfalse;
+            break;
+        case T_FIXNUM:
+            leaf_fixnum_value(leaf);
+            break;
+        case T_FLOAT:
+            leaf_float_value(leaf);
+            break;
+        case T_STRING:
+            leaf->value = rb_str_new2(leaf->str);
+#ifdef HAVE_RUBY_ENCODING_H
+            if (0 != doc->encoding) {
+                rb_enc_associate(leaf->value, doc->encoding);
+            }
+#endif
+            leaf->value_type = RUBY_VAL;
+            break;
+        case T_ARRAY:
+            return leaf_array_value(doc, leaf);
+            break;
+        case T_HASH:
+            return leaf_hash_value(doc, leaf);
+            break;
+        default:
+            rb_raise(rb_eTypeError, "Unexpected type %02x.", leaf->type);
+            break;
+        }
+    }
+    return leaf->value;
+}
+
+#ifdef RUBINIUS
+#define NUM_MAX 0x07FFFFFF
+#else
+#define NUM_MAX (FIXNUM_MAX >> 8)
+#endif
+
+
+static void
+leaf_fixnum_value(Leaf leaf) {
+    char *s = leaf->str;
+    int64_t n = 0;
+    int neg = 0;
+    int big = 0;
+
+    if ('-' == *s) {
+        s++;
+        neg = 1;
+    } else if ('+' == *s) {
+        s++;
+    }
+    for (; '0' <= *s && *s <= '9'; s++) {
+        n = n * 10 + (*s - '0');
+        if (NUM_MAX <= n) {
+            big = 1;
+        }
+    }
+    if (big) {
+        char c = *s;
+
+        *s = '\0';
+        leaf->value = rb_cstr_to_inum(leaf->str, 10, 0);
+        *s = c;
+    } else {
+        if (neg) {
+            n = -n;
+        }
+        leaf->value = LONG2NUM(n);
+    }
+    leaf->value_type = RUBY_VAL;
+}
+
+#if 1
+static void
+leaf_float_value(Leaf leaf) {
+    leaf->value = DBL2NUM(rb_cstr_to_dbl(leaf->str, 1));
+    leaf->value_type = RUBY_VAL;
+}
+#else
+static void
+leaf_float_value(Leaf leaf) {
+    char *s = leaf->str;
+    int64_t n = 0;
+    long a = 0;
+    long div = 1;
+    long e = 0;
+    int neg = 0;
+    int eneg = 0;
+    int big = 0;
+
+    if ('-' == *s) {
+        s++;
+        neg = 1;
+    } else if ('+' == *s) {
+        s++;
+    }
+    for (; '0' <= *s && *s <= '9'; s++) {
+        n = n * 10 + (*s - '0');
+        if (NUM_MAX <= n) {
+            big = 1;
+        }
+    }
+    if (big) {
+        char c = *s;
+
+        *s = '\0';
+        leaf->value = rb_cstr_to_inum(leaf->str, 10, 0);
+        *s = c;
+    } else {
+        double d;
+
+        if ('.' == *s) {
+            s++;
+            for (; '0' <= *s && *s <= '9'; s++) {
+                a = a * 10 + (*s - '0');
+                div *= 10;
+            }
+        }
+        if ('e' == *s || 'E' == *s) {
+            s++;
+            if ('-' == *s) {
+                s++;
+                eneg = 1;
+            } else if ('+' == *s) {
+                s++;
+            }
+            for (; '0' <= *s && *s <= '9'; s++) {
+                e = e * 10 + (*s - '0');
+            }
+        }
+        d = (double)n + (double)a / (double)div;
+        if (neg) {
+            d = -d;
+        }
+        if (0 != e) {
+            if (eneg) {
+                e = -e;
+            }
+            d *= pow(10.0, e);
+        }
+        leaf->value = DBL2NUM(d);
+    }
+    leaf->value_type = RUBY_VAL;
+}
+#endif
+
+static VALUE
+leaf_array_value(Doc doc, Leaf leaf) {
+    VALUE a = rb_ary_new();
+
+    if (0 != leaf->elements) {
+        Leaf first = leaf->elements->next;
+        Leaf e = first;
+
+        do {
+            rb_ary_push(a, leaf_value(doc, e));
+            e = e->next;
+        } while (e != first);
+    }
+    return a;
+}
+
+static VALUE
+leaf_hash_value(Doc doc, Leaf leaf) {
+    VALUE h = rb_hash_new();
+
+    if (0 != leaf->elements) {
+        Leaf first = leaf->elements->next;
+        Leaf e = first;
+        VALUE key;
+
+        do {
+            key = rb_str_new2(e->key);
+#ifdef HAVE_RUBY_ENCODING_H
+            if (0 != doc->encoding) {
+                rb_enc_associate(key, doc->encoding);
+            }
+#endif
+            rb_hash_aset(h, key, leaf_value(doc, e));
+            e = e->next;
+        } while (e != first);
+    }
+    return h;
+}
+
+static Leaf
+read_next(ParseInfo pi) {
+    Leaf leaf = 0;
+
+    next_non_white(pi); // skip white space
+    switch (*pi->s) {
+    case '{':
+        leaf = read_obj(pi);
+        break;
+    case '[':
+        leaf = read_array(pi);
+        break;
+    case '"':
+        leaf = read_str(pi);
+        break;
+    case '+':
+    case '-':
+    case '0':
+    case '1':
+    case '2':
+    case '3':
+    case '4':
+    case '5':
+    case '6':
+    case '7':
+    case '8':
+    case '9':
+        leaf = read_num(pi);
+        break;
+    case 't':
+        leaf = read_true(pi);
+        break;
+    case 'f':
+        leaf = read_false(pi);
+        break;
+    case 'n':
+        leaf = read_nil(pi);
+        break;
+    case '\0':
+    default:
+        break; // returns 0
+    }
+    pi->doc->size++;
+
+    return leaf;
+}
+
+static Leaf
+read_obj(ParseInfo pi) {
+    Leaf h = leaf_new(pi->doc, T_HASH);
+    char *end;
+    const char *key = 0;
+    Leaf val = 0;
+
+    pi->s++;
+    next_non_white(pi);
+    if ('}' == *pi->s) {
+        pi->s++;
+        return h;
+    }
+    while (1) {
+        next_non_white(pi);
+        key = 0;
+        val = 0;
+        if ('"' != *pi->s || 0 == (key = read_quoted_value(pi))) {
+            raise_error("unexpected character", pi->str, pi->s);
+        }
+        next_non_white(pi);
+        if (':' == *pi->s) {
+            pi->s++;
+        } else {
+            raise_error("invalid format, expected :", pi->str, pi->s);
+        }
+        if (0 == (val = read_next(pi))) {
+            //printf("*** '%s'\n", pi->s);
+            raise_error("unexpected character", pi->str, pi->s);
+        }
+        end = pi->s;
+        val->key = key;
+        val->parent_type = T_HASH;
+        leaf_append_element(h, val);
+        next_non_white(pi);
+        if ('}' == *pi->s) {
+            pi->s++;
+            *end = '\0';
+            break;
+        } else if (',' == *pi->s) {
+            pi->s++;
+        } else {
+            printf("*** '%s'\n", pi->s);
+            raise_error("invalid format, expected , or } while in an object", pi->str, pi->s);
+        }
+        *end = '\0';
+    }
+    return h;
+}
+
+static Leaf
+read_array(ParseInfo pi) {
+    Leaf a = leaf_new(pi->doc, T_ARRAY);
+    Leaf e;
+    char *end;
+    int cnt = 0;
+
+    pi->s++;
+    next_non_white(pi);
+    if (']' == *pi->s) {
+        pi->s++;
+        return a;
+    }
+    while (1) {
+        next_non_white(pi);
+        if (0 == (e = read_next(pi))) {
+            raise_error("unexpected character", pi->str, pi->s);
+        }
+        cnt++;
+        e->index = cnt;
+        e->parent_type = T_ARRAY;
+        leaf_append_element(a, e);
+        end = pi->s;
+        next_non_white(pi);
+        if (',' == *pi->s) {
+            pi->s++;
+        } else if (']' == *pi->s) {
+            pi->s++;
+            *end = '\0';
+            break;
+        } else {
+            raise_error("invalid format, expected , or ] while in an array", pi->str, pi->s);
+        }
+        *end = '\0';
+    }
+    return a;
+}
+
+static Leaf
+read_str(ParseInfo pi) {
+    Leaf leaf = leaf_new(pi->doc, T_STRING);
+
+    leaf->str = read_quoted_value(pi);
+
+    return leaf;
+}
+
+static Leaf
+read_num(ParseInfo pi) {
+    char *start = pi->s;
+    int type = T_FIXNUM;
+    Leaf leaf = leaf_new(pi->doc, type);
+
+    if ('-' == *pi->s) {
+        pi->s++;
+    }
+    // digits
+    for (; '0' <= *pi->s && *pi->s <= '9'; pi->s++) {
+    }
+    if ('.' == *pi->s) {
+        type = T_FLOAT;
+        pi->s++;
+        for (; '0' <= *pi->s && *pi->s <= '9'; pi->s++) {
+        }
+    }
+    if ('e' == *pi->s || 'E' == *pi->s) {
+        pi->s++;
+        if ('-' == *pi->s || '+' == *pi->s) {
+            pi->s++;
+        }
+        for (; '0' <= *pi->s && *pi->s <= '9'; pi->s++) {
+        }
+    }
+    leaf = leaf_new(pi->doc, type);
+    leaf->str = start;
+
+    return leaf;
+}
+
+static Leaf
+read_true(ParseInfo pi) {
+    Leaf leaf = leaf_new(pi->doc, T_TRUE);
+
+    pi->s++;
+    if ('r' != *pi->s || 'u' != *(pi->s + 1) || 'e' != *(pi->s + 2)) {
+        raise_error("invalid format, expected 'true'", pi->str, pi->s);
+    }
+    pi->s += 3;
+
+    return leaf;
+}
+
+static Leaf
+read_false(ParseInfo pi) {
+    Leaf leaf = leaf_new(pi->doc, T_FALSE);
+
+    pi->s++;
+    if ('a' != *pi->s || 'l' != *(pi->s + 1) || 's' != *(pi->s + 2) || 'e' != *(pi->s + 3)) {
+        raise_error("invalid format, expected 'false'", pi->str, pi->s);
+    }
+    pi->s += 4;
+
+    return leaf;
+}
+
+static Leaf
+read_nil(ParseInfo pi) {
+    Leaf leaf = leaf_new(pi->doc, T_NIL);
+
+    pi->s++;
+    if ('u' != *pi->s || 'l' != *(pi->s + 1) || 'l' != *(pi->s + 2)) {
+        raise_error("invalid format, expected 'nil'", pi->str, pi->s);
+    }
+    pi->s += 3;
+
+    return leaf;
+}
+
+static char
+read_hex(ParseInfo pi, char *h) {
+    uint8_t b = 0;
+
+    if ('0' <= *h && *h <= '9') {
+        b = *h - '0';
+    } else if ('A' <= *h && *h <= 'F') {
+        b = *h - 'A' + 10;
+    } else if ('a' <= *h && *h <= 'f') {
+        b = *h - 'a' + 10;
+    } else {
+        pi->s = h;
+        raise_error("invalid hex character", pi->str, pi->s);
+    }
+    h++;
+    b = b << 4;
+    if ('0' <= *h && *h <= '9') {
+        b += *h - '0';
+    } else if ('A' <= *h && *h <= 'F') {
+        b += *h - 'A' + 10;
+    } else if ('a' <= *h && *h <= 'f') {
+        b += *h - 'a' + 10;
+    } else {
+        pi->s = h;
+        raise_error("invalid hex character", pi->str, pi->s);
+    }
+    return (char)b;
+}
+
+/* Assume the value starts immediately and goes until the quote character is
+ * reached again. Do not read the character after the terminating quote.
+ */
+static char*
+read_quoted_value(ParseInfo pi) {
+    char *value = 0;
+    char *h = pi->s; // head
+    char *t = h; // tail
+
+    h++; // skip quote character
+    t++;
+    value = h;
+    for (; '"' != *h; h++, t++) {
+        if ('\0' == *h) {
+            pi->s = h;
+            raise_error("quoted string not terminated", pi->str, pi->s);
+        } else if ('\\' == *h) {
+            h++;
+            switch (*h) {
+            case 'n': *t = '\n'; break;
+            case 'r': *t = '\r'; break;
+            case 't': *t = '\t'; break;
+            case 'f': *t = '\f'; break;
+            case 'b': *t = '\b'; break;
+            case '"': *t = '"'; break;
+            case '/': *t = '/'; break;
+            case '\\': *t = '\\'; break;
+            case 'u':
+                h++;
+                *t = read_hex(pi, h);
+                h += 2;
+                if ('\0' != *t) {
+                    t++;
+                }
+                *t = read_hex(pi, h);
+                h++;
+                break;
+            default:
+                pi->s = h;
+                raise_error("invalid escaped character", pi->str, pi->s);
+                break;
+            }
+        } else if (t != h) {
+            *t = *h;
+        }
+    }
+    *t = '\0'; // terminate value
+    pi->s = h + 1;
+
+    return value;
+}
|
748
|
+
|
749
|
+
// doc support functions
|
750
|
+
inline static void
|
751
|
+
doc_init(Doc doc) {
|
752
|
+
//doc->where_path = doc->where_array;
|
753
|
+
//doc->where_len = 0;
|
754
|
+
doc->where = doc->where_path;
|
755
|
+
*doc->where = 0;
|
756
|
+
doc->data = 0;
|
757
|
+
doc->self = Qundef;
|
758
|
+
#ifdef HAVE_RUBY_ENCODING_H
|
759
|
+
doc->encoding = ('\0' == *oj_default_options.encoding) ? 0 : rb_enc_find(oj_default_options.encoding);
|
760
|
+
#else
|
761
|
+
doc->encoding = 0;
|
762
|
+
#endif
|
763
|
+
doc->size = 0;
|
764
|
+
doc->batches = &doc->batch0;
|
765
|
+
doc->batch0.next = 0;
|
766
|
+
doc->batch0.next_avail = 0;
|
767
|
+
}
|
768
|
+
|
769
|
+
static void
|
770
|
+
doc_free(Doc doc) {
|
771
|
+
if (0 != doc) {
|
772
|
+
Batch b;
|
773
|
+
|
774
|
+
while (0 != (b = doc->batches)) {
|
775
|
+
doc->batches = doc->batches->next;
|
776
|
+
if (&doc->batch0 != b) {
|
777
|
+
xfree(b);
|
778
|
+
}
|
779
|
+
}
|
780
|
+
/*
|
781
|
+
if (doc->where_array != doc->where_path) {
|
782
|
+
free(doc->where_path);
|
783
|
+
}
|
784
|
+
*/
|
785
|
+
//xfree(f);
|
786
|
+
}
|
787
|
+
}
|
788
|
+
|
789
|
+
static VALUE
|
790
|
+
protect_open_proc(VALUE x) {
|
791
|
+
ParseInfo pi = (ParseInfo)x;
|
792
|
+
|
793
|
+
pi->doc->data = read_next(pi); // parse
|
794
|
+
*pi->doc->where = pi->doc->data;
|
795
|
+
pi->doc->where = pi->doc->where_path;
|
796
|
+
return rb_yield(pi->doc->self); // caller processing
|
797
|
+
}
|
798
|
+
|
799
|
+
static VALUE
|
800
|
+
parse_json(VALUE clas, char *json) {
|
801
|
+
struct _ParseInfo pi;
|
802
|
+
VALUE result = Qnil;
|
803
|
+
struct _Doc doc;
|
804
|
+
int ex = 0;
|
805
|
+
|
806
|
+
if (!rb_block_given_p()) {
|
807
|
+
rb_raise(rb_eArgError, "Block or Proc is required.");
|
808
|
+
}
|
809
|
+
pi.str = json;
|
810
|
+
pi.s = pi.str;
|
811
|
+
doc_init(&doc);
|
812
|
+
pi.doc = &doc;
|
813
|
+
doc.self = rb_obj_alloc(clas);
|
814
|
+
DATA_PTR(doc.self) = pi.doc;
|
815
|
+
result = rb_protect(protect_open_proc, (VALUE)&pi, &ex);
|
816
|
+
DATA_PTR(doc.self) = 0;
|
817
|
+
doc_free(pi.doc);
|
818
|
+
//xfree(pi.str);
|
819
|
+
if (0 != ex) {
|
820
|
+
rb_jump_tag(ex);
|
821
|
+
}
|
822
|
+
return result;
|
823
|
+
}
|
824
|
+
|
825
|
+
static Leaf
|
826
|
+
get_doc_leaf(Doc doc, const char *path) {
|
827
|
+
Leaf leaf = *doc->where;
|
828
|
+
|
829
|
+
if (0 != doc->data && 0 != path) {
|
830
|
+
Leaf stack[MAX_STACK];
|
831
|
+
Leaf *lp;
|
832
|
+
|
833
|
+
if ('/' == *path) {
|
834
|
+
path++;
|
835
|
+
*stack = doc->data;
|
836
|
+
lp = stack;
|
837
|
+
} else {
|
838
|
+
size_t cnt = doc->where - doc->where_path;
|
839
|
+
|
840
|
+
memcpy(stack, doc->where_path, sizeof(Leaf) * cnt);
|
841
|
+
lp = stack + cnt;
|
842
|
+
}
|
843
|
+
return get_leaf(stack, lp, path);
|
844
|
+
}
|
845
|
+
return leaf;
|
846
|
+
}
|
847
|
+
|
848
|
+
static Leaf
|
849
|
+
get_leaf(Leaf *stack, Leaf *lp, const char *path) {
|
850
|
+
Leaf leaf = *lp;
|
851
|
+
|
852
|
+
if ('\0' != *path) {
|
853
|
+
if ('.' == *path && '.' == *(path + 1)) {
|
854
|
+
path += 2;
|
855
|
+
if ('/' == *path) {
|
856
|
+
path++;
|
857
|
+
}
|
858
|
+
if (stack < lp) {
|
859
|
+
leaf = get_leaf(stack, lp - 1, path);
|
860
|
+
} else {
|
861
|
+
return 0;
|
862
|
+
}
|
863
|
+
} else if (COL_VAL == leaf->value_type && 0 != leaf->elements) {
|
864
|
+
Leaf first = leaf->elements->next;
|
865
|
+
Leaf e = first;
|
866
|
+
int type = leaf->type;
|
867
|
+
|
868
|
+
// TBD fail if stack too deep
|
869
|
+
leaf = 0;
|
870
|
+
if (T_ARRAY == type) {
|
871
|
+
int cnt = 0;
|
872
|
+
|
873
|
+
for (; '0' <= *path && *path <= '9'; path++) {
|
874
|
+
cnt = cnt * 10 + (*path - '0');
|
875
|
+
}
|
876
|
+
if ('/' == *path) {
|
877
|
+
path++;
|
878
|
+
}
|
879
|
+
do {
|
880
|
+
if (1 >= cnt) {
|
881
|
+
lp++;
|
882
|
+
*lp = e;
|
883
|
+
leaf = get_leaf(stack, lp, path);
|
884
|
+
break;
|
885
|
+
}
|
886
|
+
cnt--;
|
887
|
+
e = e->next;
|
888
|
+
} while (e != first);
|
889
|
+
} else if (T_HASH == type) {
|
890
|
+
const char *key = path;
|
891
|
+
const char *slash = strchr(path, '/');
|
892
|
+
int klen;
|
893
|
+
|
894
|
+
if (0 == slash) {
|
895
|
+
klen = (int)strlen(key);
|
896
|
+
path += klen;
|
897
|
+
} else {
|
898
|
+
klen = (int)(slash - key);
|
899
|
+
path += klen + 1;
|
900
|
+
}
|
901
|
+
do {
|
902
|
+
if (0 == strncmp(key, e->key, klen) && '\0' == e->key[klen]) {
|
903
|
+
lp++;
|
904
|
+
*lp = e;
|
905
|
+
leaf = get_leaf(stack, lp, path);
|
906
|
+
break;
|
907
|
+
}
|
908
|
+
e = e->next;
|
909
|
+
} while (e != first);
|
910
|
+
}
|
911
|
+
}
|
912
|
+
}
|
913
|
+
return leaf;
|
914
|
+
}
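The array branch above parses a decimal index from the path and then steps around the circular element list, where the parent holds a pointer to the last element and `next` of the last element is the first. A minimal standalone sketch of that idea follows. The `Node` type and the `parse_index` and `nth_element` helpers are hypothetical stand-ins for illustration, not part of fast.c, and unlike the code above the sketch rejects index 0 instead of treating it as the first element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Leaf; only the fields the sketch needs. */
typedef struct node {
    struct node *next;  /* next sibling; the last sibling points back to the first */
    int          value;
} Node;

/* Reads the leading decimal digits of *pathp, skips one optional '/' after
 * them, and advances *pathp to the start of the next path step. */
static int
parse_index(const char **pathp) {
    const char *p;
    int         cnt = 0;

    assert(NULL != pathp && NULL != *pathp);
    for (p = *pathp; '0' <= *p && *p <= '9'; p++) {
        cnt = cnt * 10 + (*p - '0');  /* no overflow guard in this sketch */
    }
    if ('/' == *p) {
        p++;
    }
    *pathp = p;
    return cnt;
}

/* tail is the last element of the ring, so tail->next is the first element.
 * Returns the element at 1-based position idx, or NULL when out of range. */
static Node *
nth_element(Node *tail, int idx) {
    Node *first;
    Node *e;

    if (NULL == tail || idx < 1) {
        return NULL;
    }
    first = tail->next;
    e = first;
    do {
        if (1 == idx) {
            return e;
        }
        idx--;
        e = e->next;
    } while (e != first);
    return NULL;  /* walked the whole ring without reaching idx */
}
```

Returning NULL once the walk arrives back at the first element is what keeps an out-of-range index from looping forever on the ring.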
|
915
|
+
|
916
|
+
static void
|
917
|
+
each_leaf(Doc doc, VALUE self) {
|
918
|
+
if (COL_VAL == (*doc->where)->value_type) {
|
919
|
+
if (0 != (*doc->where)->elements) {
|
920
|
+
Leaf first = (*doc->where)->elements->next;
|
921
|
+
Leaf e = first;
|
922
|
+
|
923
|
+
doc->where++;
|
924
|
+
do {
|
925
|
+
*doc->where = e;
|
926
|
+
each_leaf(doc, self);
|
927
|
+
e = e->next;
|
928
|
+
} while (e != first);
|
929
|
+
}
|
930
|
+
} else {
|
931
|
+
rb_yield(self);
|
932
|
+
}
|
933
|
+
}
|
934
|
+
|
935
|
+
static int
|
936
|
+
move_step(Doc doc, const char *path, int loc) {
|
937
|
+
// TBD raise if too deep
|
938
|
+
if ('\0' == *path) {
|
939
|
+
loc = 0;
|
940
|
+
} else {
|
941
|
+
Leaf leaf;
|
942
|
+
|
943
|
+
if (0 == doc->where || 0 == (leaf = *doc->where)) {
|
944
|
+
printf("*** Internal error at %s\n", path);
|
945
|
+
return loc;
|
946
|
+
}
|
947
|
+
if ('.' == *path && '.' == *(path + 1)) {
|
948
|
+
Leaf init = *doc->where;
|
949
|
+
|
950
|
+
path += 2;
|
951
|
+
if (doc->where == doc->where_path) {
|
952
|
+
return loc;
|
953
|
+
}
|
954
|
+
if ('/' == *path) {
|
955
|
+
path++;
|
956
|
+
}
|
957
|
+
*doc->where = 0;
|
958
|
+
doc->where--;
|
959
|
+
loc = move_step(doc, path, loc + 1);
|
960
|
+
if (0 != loc) {
|
961
|
+
*doc->where = init;
|
962
|
+
doc->where++;
|
963
|
+
}
|
964
|
+
} else if (COL_VAL == leaf->value_type && 0 != leaf->elements) {
|
965
|
+
Leaf first = leaf->elements->next;
|
966
|
+
Leaf e = first;
|
967
|
+
|
968
|
+
if (T_ARRAY == leaf->type) {
|
969
|
+
int cnt = 0;
|
970
|
+
|
971
|
+
for (; '0' <= *path && *path <= '9'; path++) {
|
972
|
+
cnt = cnt * 10 + (*path - '0');
|
973
|
+
}
|
974
|
+
if ('/' == *path) {
|
975
|
+
path++;
|
976
|
+
} else if ('\0' != *path) {
|
977
|
+
return loc;
|
978
|
+
}
|
979
|
+
do {
|
980
|
+
if (1 >= cnt) {
|
981
|
+
doc->where++;
|
982
|
+
*doc->where = e;
|
983
|
+
loc = move_step(doc, path, loc + 1);
|
984
|
+
if (0 != loc) {
|
985
|
+
*doc->where = 0;
|
986
|
+
doc->where--;
|
987
|
+
}
|
988
|
+
break;
|
989
|
+
}
|
990
|
+
cnt--;
|
991
|
+
e = e->next;
|
992
|
+
} while (e != first);
|
993
|
+
} else if (T_HASH == leaf->type) {
|
994
|
+
const char *key = path;
|
995
|
+
const char *slash = strchr(path, '/');
|
996
|
+
int klen;
|
997
|
+
|
998
|
+
if (0 == slash) {
|
999
|
+
klen = (int)strlen(key);
|
1000
|
+
path += klen;
|
1001
|
+
} else {
|
1002
|
+
klen = (int)(slash - key);
|
1003
|
+
path += klen + 1;
|
1004
|
+
}
|
1005
|
+
do {
|
1006
|
+
if (0 == strncmp(key, e->key, klen) && '\0' == e->key[klen]) {
|
1007
|
+
doc->where++;
|
1008
|
+
*doc->where = e;
|
1009
|
+
loc = move_step(doc, path, loc + 1);
|
1010
|
+
if (0 != loc) {
|
1011
|
+
*doc->where = 0;
|
1012
|
+
doc->where--;
|
1013
|
+
}
|
1014
|
+
break;
|
1015
|
+
}
|
1016
|
+
e = e->next;
|
1017
|
+
} while (e != first);
|
1018
|
+
}
|
1019
|
+
}
|
1020
|
+
}
|
1021
|
+
return loc;
|
1022
|
+
}
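Both `get_leaf` and `move_step` match a hash key by cutting the next path step at the first '/' and comparing it with `strncmp` plus a check that the stored key ends at the same length. The following standalone sketch isolates that comparison. The helper name `segment_matches` is an assumption made for illustration and does not exist in fast.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the first step of path (up to the
 * first '/' or the end of the string) equals key exactly. On a match,
 * *restp is set to the remainder of the path after that step. */
static int
segment_matches(const char *path, const char *key, const char **restp) {
    const char *slash;
    size_t      klen;

    assert(NULL != path && NULL != key);
    slash = strchr(path, '/');
    klen = (NULL == slash) ? strlen(path) : (size_t)(slash - path);
    /* strncmp alone would accept a key that merely starts with the step,
     * so also require the key to end exactly at klen. */
    if (0 != strncmp(path, key, klen) || '\0' != key[klen]) {
        return 0;
    }
    if (NULL != restp) {
        *restp = (NULL == slash) ? path + klen : slash + 1;
    }
    return 1;
}
```

The `'\0' != key[klen]` test is the part that rejects a key such as "one_more" when the path step is "one".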
|
1023
|
+
|
1024
|
+
static void
|
1025
|
+
each_value(Doc doc, Leaf leaf) {
|
1026
|
+
if (COL_VAL == leaf->value_type) {
|
1027
|
+
if (0 != leaf->elements) {
|
1028
|
+
Leaf first = leaf->elements->next;
|
1029
|
+
Leaf e = first;
|
1030
|
+
|
1031
|
+
do {
|
1032
|
+
each_value(doc, e);
|
1033
|
+
e = e->next;
|
1034
|
+
} while (e != first);
|
1035
|
+
}
|
1036
|
+
} else {
|
1037
|
+
VALUE args[1];
|
1038
|
+
|
1039
|
+
*args = leaf_value(doc, leaf);
|
1040
|
+
rb_yield_values2(1, args);
|
1041
|
+
}
|
1042
|
+
}
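`each_value` is a depth-first walk: a collection recurses into each element of its ring, and anything else is treated as a leaf and yielded. A minimal sketch of the same traversal outside of Ruby is shown here, with a plain callback in place of `rb_yield_values2`. The `Node`, `Collected`, `collect`, and `each_leaf_value` names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Leaf. */
typedef struct node {
    struct node *next;           /* next sibling in the ring */
    struct node *elements;       /* last child of a collection, or NULL when empty */
    int          is_collection;  /* plays the role of COL_VAL == value_type */
    int          value;          /* leaf payload */
} Node;

/* Simple sink used to record the leaf values in visiting order. */
typedef struct collected {
    int    vals[16];
    size_t cnt;
} Collected;

static void
collect(int value, void *ctx) {
    Collected *c = (Collected *)ctx;

    if (c->cnt < sizeof(c->vals) / sizeof(*c->vals)) {
        c->vals[c->cnt++] = value;
    }
}

/* Calls fn once for every leaf at or below n, in document order. */
static void
each_leaf_value(const Node *n, void (*fn)(int, void *), void *ctx) {
    assert(NULL != n && NULL != fn);
    if (n->is_collection) {
        if (NULL != n->elements) {
            const Node *first = n->elements->next;
            const Node *e = first;

            do {
                each_leaf_value(e, fn, ctx);
                e = e->next;
            } while (e != first);
        }
    } else {
        fn(n->value, ctx);
    }
}
```

For a tree shaped like `[3,[2,1]]` the callback would be invoked with 3, 2, and 1 in that order, and an empty collection produces no calls at all.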
|
1043
|
+
|
1044
|
+
// doc functions
|
1045
|
+
|
1046
|
+
/* call-seq: open(json) { |doc| ... } => Object
|
1047
|
+
*
|
1048
|
+
* Parses a JSON document String and then yields to the provided block with an
|
1049
|
+
* instance of the Oj::Doc as the single yield parameter.
|
1050
|
+
*
|
1051
|
+
* @param [String] json JSON document string
|
1052
|
+
* @yieldparam [Oj::Doc] doc parsed JSON document
|
1053
|
+
* @yieldreturn [Object] returns the result of the yield as the result of the method call
|
1054
|
+
* @example
|
1055
|
+
* Oj::Doc.open('[1,2,3]') { |doc| doc.size() } #=> 4
|
1056
|
+
*/
|
1057
|
+
static VALUE
|
1058
|
+
doc_open(VALUE clas, VALUE str) {
|
1059
|
+
char *json;
|
1060
|
+
size_t len;
|
1061
|
+
|
1062
|
+
Check_Type(str, T_STRING);
|
1063
|
+
len = RSTRING_LEN(str) + 1;
|
1064
|
+
json = ALLOCA_N(char, len);
|
1065
|
+
memcpy(json, StringValuePtr(str), len);
|
1066
|
+
|
1067
|
+
return parse_json(clas, json);
|
1068
|
+
}
|
1069
|
+
|
1070
|
+
/* call-seq: open_file(filename) { |doc| ... } => Object
|
1071
|
+
*
|
1072
|
+
* Parses a JSON document from a file and then yields to the provided block
|
1073
|
+
* with an instance of the Oj::Doc as the single yield parameter.
|
1074
|
+
*
|
1075
|
+
* @param [String] filename name of file that contains a JSON document
|
1076
|
+
* @yieldparam [Oj::Doc] doc parsed JSON document
|
1077
|
+
* @yieldreturn [Object] returns the result of the yield as the result of the method call
|
1078
|
+
* @example
|
1079
|
+
* File.open('array.json', 'w') { |f| f.write('[1,2,3]') }
|
1080
|
+
* Oj::Doc.open_file('array.json') { |doc| doc.size() } #=> 4
|
1081
|
+
*/
|
1082
|
+
static VALUE
|
1083
|
+
doc_open_file(VALUE clas, VALUE filename) {
|
1084
|
+
char *path;
|
1085
|
+
char *json;
|
1086
|
+
FILE *f;
|
1087
|
+
size_t len;
|
1088
|
+
|
1089
|
+
Check_Type(filename, T_STRING);
|
1090
|
+
path = StringValuePtr(filename);
|
1091
|
+
if (0 == (f = fopen(path, "r"))) {
|
1092
|
+
rb_raise(rb_eIOError, "%s\n", strerror(errno));
|
1093
|
+
}
|
1094
|
+
fseek(f, 0, SEEK_END);
|
1095
|
+
len = ftell(f);
|
1096
|
+
json = ALLOCA_N(char, len + 1);
|
1097
|
+
fseek(f, 0, SEEK_SET);
|
1098
|
+
if (len != fread(json, 1, len, f)) {
|
1099
|
+
fclose(f);
|
1100
|
+
rb_raise(rb_eLoadError, "Failed to read %lu bytes from %s.\n", (unsigned long)len, path);
|
1101
|
+
}
|
1102
|
+
fclose(f);
|
1103
|
+
json[len] = '\0';
|
1104
|
+
|
1105
|
+
return parse_json(clas, json);
|
1106
|
+
}
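`doc_open_file` sizes the buffer with `fseek`/`ftell` and reads the file onto the stack with `ALLOCA_N`. As a hedged alternative sketch, the same read can be done with a heap buffer and a check on every call that can fail, which avoids exhausting the stack on large files. The `read_file` helper below is hypothetical and not part of fast.c. The caller is assumed to `free()` the result.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at path into a NUL terminated,
 * malloc'd buffer. Returns NULL on any failure. When lenp is not NULL it
 * receives the number of bytes read, not counting the terminator. */
static char *
read_file(const char *path, size_t *lenp) {
    FILE *f;
    long  len;
    char *buf;

    assert(NULL != path);
    if (NULL == (f = fopen(path, "rb"))) {
        return NULL;
    }
    /* ftell() reports failure as -1, so check before using it as a size. */
    if (0 != fseek(f, 0, SEEK_END) || (len = ftell(f)) < 0 || 0 != fseek(f, 0, SEEK_SET)) {
        fclose(f);
        return NULL;
    }
    if (NULL == (buf = malloc((size_t)len + 1))) {
        fclose(f);
        return NULL;
    }
    if ((size_t)len != fread(buf, 1, (size_t)len, f)) {
        free(buf);
        fclose(f);
        return NULL;
    }
    fclose(f);
    buf[len] = '\0';
    if (NULL != lenp) {
        *lenp = (size_t)len;
    }
    return buf;
}
```

In an extension the buffer would also need to be released when the block raises, which is the role `rb_protect` plays in `parse_json`.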
|
1107
|
+
|
1108
|
+
/* Document-method: parse
|
1109
|
+
* @see Oj::Doc.open
|
1110
|
+
*/
|
1111
|
+
|
1112
|
+
/* call-seq: where?() => String
|
1113
|
+
*
|
1114
|
+
* Returns a String that describes the absolute path to the current location
|
1115
|
+
* in the JSON document.
|
1116
|
+
*/
|
1117
|
+
static VALUE
|
1118
|
+
doc_where(VALUE self) {
|
1119
|
+
Doc doc = DATA_PTR(self);
|
1120
|
+
|
1121
|
+
if (0 == *doc->where_path || doc->where == doc->where_path) {
|
1122
|
+
return oj_slash_string;
|
1123
|
+
} else {
|
1124
|
+
Leaf *lp;
|
1125
|
+
Leaf leaf;
|
1126
|
+
size_t size = 3; // leading / and terminating \0
|
1127
|
+
char *path;
|
1128
|
+
char *p;
|
1129
|
+
|
1130
|
+
for (lp = doc->where_path; lp <= doc->where; lp++) {
|
1131
|
+
leaf = *lp;
|
1132
|
+
if (T_HASH == leaf->parent_type) {
|
1133
|
+
size += strlen((*lp)->key) + 1;
|
1134
|
+
} else if (T_ARRAY == leaf->parent_type) {
|
1135
|
+
size += ((*lp)->index < 100) ? 3 : 11;
|
1136
|
+
}
|
1137
|
+
}
|
1138
|
+
path = ALLOCA_N(char, size);
|
1139
|
+
p = path;
|
1140
|
+
for (lp = doc->where_path; lp <= doc->where; lp++) {
|
1141
|
+
leaf = *lp;
|
1142
|
+
if (T_HASH == leaf->parent_type) {
|
1143
|
+
p = stpcpy(p, (*lp)->key);
|
1144
|
+
} else if (T_ARRAY == leaf->parent_type) {
|
1145
|
+
p = ulong_fill(p, (*lp)->index);
|
1146
|
+
}
|
1147
|
+
*p++ = '/';
|
1148
|
+
}
|
1149
|
+
*--p = '\0';
|
1150
|
+
return rb_str_new2(path);
|
1151
|
+
}
|
1152
|
+
}
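`doc_where` makes two passes over the location stack: one to size the buffer and one to fill it with keys and indices separated by slashes. The sketch below shows the same two-pass pattern in standalone C, using `snprintf(NULL, 0, ...)` to measure each index exactly instead of estimating its width. The `Step` type and `build_path` function are assumptions made for this illustration, and the sketch allocates with `malloc` where the original uses `ALLOCA_N`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical path step: an object key, or an array index when key is NULL. */
typedef struct step {
    const char    *key;
    unsigned long  index;
} Step;

/* Builds "/step/step/..." from n steps, or "/" when n is 0.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
static char *
build_path(const Step *steps, size_t n) {
    size_t size = 2;  /* room for a lone "/" plus the terminating NUL */
    size_t i;
    char  *path;
    char  *p;

    /* Pass 1: measure. Each step needs its text plus one '/'. */
    for (i = 0; i < n; i++) {
        if (NULL != steps[i].key) {
            size += strlen(steps[i].key) + 1;
        } else {
            size += (size_t)snprintf(NULL, 0, "%lu", steps[i].index) + 1;
        }
    }
    if (NULL == (path = malloc(size))) {
        return NULL;
    }
    /* Pass 2: fill. */
    p = path;
    if (0 == n) {
        *p++ = '/';
    }
    for (i = 0; i < n; i++) {
        *p++ = '/';
        if (NULL != steps[i].key) {
            size_t klen = strlen(steps[i].key);

            memcpy(p, steps[i].key, klen);
            p += klen;
        } else {
            p += snprintf(p, size - (size_t)(p - path), "%lu", steps[i].index);
        }
    }
    *p = '\0';
    assert((size_t)(p - path) < size);
    return path;
}
```

Measuring with `snprintf` keeps the size pass and the fill pass in agreement for any index width.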
|
1153
|
+
|
1154
|
+
/* call-seq: local_key() => String, Fixnum, nil
|
1155
|
+
*
|
1156
|
+
* Returns the final key to the current location.
|
1157
|
+
* @example
|
1158
|
+
* Oj::Doc.open('[1,2,3]') { |doc| doc.move('/2'); doc.local_key() } #=> 2
|
1159
|
+
* Oj::Doc.open('{"one":3}') { |doc| doc.move('/one'); doc.local_key() } #=> "one"
|
1160
|
+
* Oj::Doc.open('[1,2,3]') { |doc| doc.local_key() } #=> nil
|
1161
|
+
*/
|
1162
|
+
static VALUE
|
1163
|
+
doc_local_key(VALUE self) {
|
1164
|
+
Doc doc = DATA_PTR(self);
|
1165
|
+
Leaf leaf = *doc->where;
|
1166
|
+
VALUE key = Qnil;
|
1167
|
+
|
1168
|
+
if (T_HASH == leaf->parent_type) {
|
1169
|
+
key = rb_str_new2(leaf->key);
|
1170
|
+
#ifdef HAVE_RUBY_ENCODING_H
|
1171
|
+
if (0 != doc->encoding) {
|
1172
|
+
rb_enc_associate(key, doc->encoding);
|
1173
|
+
}
|
1174
|
+
#endif
|
1175
|
+
} else if (T_ARRAY == leaf->parent_type) {
|
1176
|
+
key = LONG2NUM(leaf->index);
|
1177
|
+
}
|
1178
|
+
return key;
|
1179
|
+
}
|
1180
|
+
|
1181
|
+
/* call-seq: home() => String
|
1182
|
+
*
|
1183
|
+
* Moves the document marker or location to the root or home position. The
|
1184
|
+
* same operation can be performed with doc.move('/').
|
1185
|
+
* @example
|
1186
|
+
* Oj::Doc.open('[1,2,3]') { |doc| doc.move('/2'); doc.home(); doc.where? } #=> '/'
|
1187
|
+
*/
|
1188
|
+
static VALUE
|
1189
|
+
doc_home(VALUE self) {
|
1190
|
+
Doc doc = DATA_PTR(self);
|
1191
|
+
|
1192
|
+
*doc->where_path = doc->data;
|
1193
|
+
doc->where = doc->where_path;
|
1194
|
+
|
1195
|
+
return oj_slash_string;
|
1196
|
+
}
|
1197
|
+
|
1198
|
+
/* call-seq: type(path=nil) => Class
|
1199
|
+
*
|
1200
|
+
* Returns the Class of the data value at the location identified by the path
|
1201
|
+
* or the current location if the path is nil or not provided. This method
|
1202
|
+
* does not create the Ruby Object at the location specified so the overhead
|
1203
|
+
* is low.
|
1204
|
+
* @param [String] path path to the location to get the type of if provided
|
1205
|
+
* @example
|
1206
|
+
* Oj::Doc.open('[1,2]') { |doc| doc.type() } #=> Array
|
1207
|
+
* Oj::Doc.open('[1,2]') { |doc| doc.type('/1') } #=> Fixnum
|
1208
|
+
*/
|
1209
|
+
static VALUE
|
1210
|
+
doc_type(int argc, VALUE *argv, VALUE self) {
|
1211
|
+
Doc doc = DATA_PTR(self);
|
1212
|
+
Leaf leaf;
|
1213
|
+
const char *path = 0;
|
1214
|
+
VALUE type = Qnil;
|
1215
|
+
|
1216
|
+
if (1 <= argc) {
|
1217
|
+
Check_Type(*argv, T_STRING);
|
1218
|
+
path = StringValuePtr(*argv);
|
1219
|
+
}
|
1220
|
+
if (0 != (leaf = get_doc_leaf(doc, path))) {
|
1221
|
+
switch (leaf->type) {
|
1222
|
+
case T_NIL: type = rb_cNilClass; break;
|
1223
|
+
case T_TRUE: type = rb_cTrueClass; break;
|
1224
|
+
case T_FALSE: type = rb_cFalseClass; break;
|
1225
|
+
case T_STRING: type = rb_cString; break;
|
1226
|
+
case T_FIXNUM: type = rb_cFixnum; break;
|
1227
|
+
case T_FLOAT: type = rb_cFloat; break;
|
1228
|
+
case T_ARRAY: type = rb_cArray; break;
|
1229
|
+
case T_HASH: type = rb_cHash; break;
|
1230
|
+
default: break;
|
1231
|
+
}
|
1232
|
+
}
|
1233
|
+
return type;
|
1234
|
+
}
|
1235
|
+
|
1236
|
+
/* call-seq: fetch(path=nil) => nil, true, false, Fixnum, Float, String, Array, Hash
|
1237
|
+
*
|
1238
|
+
* Returns the value at the location identified by the path or the current
|
1239
|
+
* location if the path is nil or not provided. This method will create and
|
1240
|
+
* return an Array or Hash if that is the type of Object at the location
|
1241
|
+
* specified. This is more expensive than navigating to the leaves of the JSON
|
1242
|
+
* document.
|
1243
|
+
* @param [String] path path to the location to get the value of if provided
|
1244
|
+
* @example
|
1245
|
+
* Oj::Doc.open('[1,2]') { |doc| doc.fetch() } #=> [1, 2]
|
1246
|
+
* Oj::Doc.open('[1,2]') { |doc| doc.fetch('/1') } #=> 1
|
1247
|
+
*/
|
1248
|
+
static VALUE
|
1249
|
+
doc_fetch(int argc, VALUE *argv, VALUE self) {
|
1250
|
+
Doc doc = DATA_PTR(self);
|
1251
|
+
Leaf leaf;
|
1252
|
+
VALUE val = Qnil;
|
1253
|
+
const char *path = 0;
|
1254
|
+
|
1255
|
+
if (1 <= argc) {
|
1256
|
+
Check_Type(*argv, T_STRING);
|
1257
|
+
path = StringValuePtr(*argv);
|
1258
|
+
if (2 == argc) {
|
1259
|
+
val = argv[1];
|
1260
|
+
}
|
1261
|
+
}
|
1262
|
+
if (0 != (leaf = get_doc_leaf(doc, path))) {
|
1263
|
+
val = leaf_value(doc, leaf);
|
1264
|
+
}
|
1265
|
+
return val;
|
1266
|
+
}
|
1267
|
+
|
1268
|
+
/* call-seq: each_leaf(path=nil) { |doc| ... } => nil
|
1269
|
+
*
|
1270
|
+
* Yields to the provided block for each leaf node with the identified
|
1271
|
+
* location of the JSON document as the root. The parameter passed to the
|
1272
|
+
* block on yield is the Doc instance after moving to the child location.
|
1273
|
+
* @param [String] path if provided, identifies the top of the branch whose leaves are processed
|
1274
|
+
* @yieldparam [Doc] Doc at the child location
|
1275
|
+
* @example
|
1276
|
+
* Oj::Doc.open('[3,[2,1]]') { |doc|
|
1277
|
+
* result = {}
|
1278
|
+
* doc.each_leaf() { |d| result[d.where?] = d.fetch() }
|
1279
|
+
* result
|
1280
|
+
* }
|
1281
|
+
* #=> {"/1" => 3, "/2/1" => 2, "/2/2" => 1}
|
1282
|
+
*/
|
1283
|
+
static VALUE
|
1284
|
+
doc_each_leaf(int argc, VALUE *argv, VALUE self) {
|
1285
|
+
if (rb_block_given_p()) {
|
1286
|
+
Leaf save_path[MAX_STACK];
|
1287
|
+
Doc doc = DATA_PTR(self);
|
1288
|
+
const char *path = 0;
|
1289
|
+
size_t wlen;
|
1290
|
+
|
1291
|
+
wlen = doc->where - doc->where_path;
|
1292
|
+
memcpy(save_path, doc->where_path, sizeof(Leaf) * wlen);
|
1293
|
+
if (1 <= argc) {
|
1294
|
+
Check_Type(*argv, T_STRING);
|
1295
|
+
path = StringValuePtr(*argv);
|
1296
|
+
if ('/' == *path) {
|
1297
|
+
doc->where = doc->where_path;
|
1298
|
+
path++;
|
1299
|
+
}
|
1300
|
+
if (0 != move_step(doc, path, 1)) {
|
1301
|
+
memcpy(doc->where_path, save_path, sizeof(Leaf) * wlen);
|
1302
|
+
return Qnil;
|
1303
|
+
}
|
1304
|
+
}
|
1305
|
+
each_leaf(doc, self);
|
1306
|
+
memcpy(doc->where_path, save_path, sizeof(Leaf) * wlen);
|
1307
|
+
}
|
1308
|
+
return Qnil;
|
1309
|
+
}
|
1310
|
+
|
1311
|
+
/* call-seq: move(path) => nil
|
1312
|
+
*
|
1313
|
+
* Moves the document marker to the path specified. The path can be an absolute
|
1314
|
+
* path or a relative path.
|
1315
|
+
* @param [String] path path to the location to move to
|
1316
|
+
* @example
|
1317
|
+
* Oj::Doc.open('{"one":[1,2]}') { |doc| doc.move('/one/2'); doc.where? } #=> "/one/2"
|
1318
|
+
*/
|
1319
|
+
static VALUE
|
1320
|
+
doc_move(VALUE self, VALUE str) {
|
1321
|
+
Doc doc = DATA_PTR(self);
|
1322
|
+
const char *path;
|
1323
|
+
int loc;
|
1324
|
+
|
1325
|
+
Check_Type(str, T_STRING);
|
1326
|
+
path = StringValuePtr(str);
|
1327
|
+
if ('/' == *path) {
|
1328
|
+
doc->where = doc->where_path;
|
1329
|
+
path++;
|
1330
|
+
}
|
1331
|
+
if (0 != (loc = move_step(doc, path, 1))) {
|
1332
|
+
rb_raise(rb_eArgError, "Failed to locate element %d of the path %s.", loc, path);
|
1333
|
+
}
|
1334
|
+
return Qnil;
|
1335
|
+
}
|
1336
|
+
|
1337
|
+
/* call-seq: each_child(path=nil) { |doc| ... } => nil
|
1338
|
+
*
|
1339
|
+
* Yields to the provided block for each immediate child node with the
|
1340
|
+
* identified location of the JSON document as the root. The parameter passed
|
1341
|
+
* to the block on yield is the Doc instance after moving to the child
|
1342
|
+
* location.
|
1343
|
+
* @param [String] path if provided, identifies the top of the branch whose children are processed
|
1344
|
+
* @yieldparam [Doc] Doc at the child location
|
1345
|
+
* @example
|
1346
|
+
* Oj::Doc.open('[3,[2,1]]') { |doc|
|
1347
|
+
* result = []
|
1348
|
+
* doc.each_child('/2') { |d| result << d.where? }
|
1349
|
+
* result
|
1350
|
+
* }
|
1351
|
+
* #=> ["/2/1", "/2/2"]
|
1352
|
+
*/
|
1353
|
+
static VALUE
|
1354
|
+
doc_each_child(int argc, VALUE *argv, VALUE self) {
|
1355
|
+
if (rb_block_given_p()) {
|
1356
|
+
Leaf save_path[MAX_STACK];
|
1357
|
+
Doc doc = DATA_PTR(self);
|
1358
|
+
const char *path = 0;
|
1359
|
+
size_t wlen;
|
1360
|
+
|
1361
|
+
wlen = doc->where - doc->where_path;
|
1362
|
+
memcpy(save_path, doc->where_path, sizeof(Leaf) * wlen);
|
1363
|
+
if (1 <= argc) {
|
1364
|
+
Check_Type(*argv, T_STRING);
|
1365
|
+
path = StringValuePtr(*argv);
|
1366
|
+
if ('/' == *path) {
|
1367
|
+
doc->where = doc->where_path;
|
1368
|
+
path++;
|
1369
|
+
}
|
1370
|
+
if (0 != move_step(doc, path, 1)) {
|
1371
|
+
memcpy(doc->where_path, save_path, sizeof(Leaf) * wlen);
|
1372
|
+
return Qnil;
|
1373
|
+
}
|
1374
|
+
}
|
1375
|
+
if (COL_VAL == (*doc->where)->value_type && 0 != (*doc->where)->elements) {
|
1376
|
+
Leaf first = (*doc->where)->elements->next;
|
1377
|
+
Leaf e = first;
|
1378
|
+
VALUE args[1];
|
1379
|
+
|
1380
|
+
*args = self;
|
1381
|
+
doc->where++;
|
1382
|
+
do {
|
1383
|
+
*doc->where = e;
|
1384
|
+
rb_yield_values2(1, args);
|
1385
|
+
e = e->next;
|
1386
|
+
} while (e != first);
|
1387
|
+
}
|
1388
|
+
memcpy(doc->where_path, save_path, sizeof(Leaf) * wlen);
|
1389
|
+
}
|
1390
|
+
return Qnil;
|
1391
|
+
}
|
1392
|
+
|
1393
|
+
/* call-seq: each_value(path=nil) { |val| ... } => nil
|
1394
|
+
*
|
1395
|
+
* Yields to the provided block for each leaf value in the identified location
|
1396
|
+
* of the JSON document. The parameter passed to the block on yield is the
|
1397
|
+
* value of the leaf. Only those leaves below the element specified by the
|
1398
|
+
* path parameter are processed.
|
1399
|
+
* @param [String] path if provided, identifies the top of the branch whose leaf values are processed
|
1400
|
+
* @yieldparam [Object] val each leaf value
|
1401
|
+
* @example
|
1402
|
+
* Oj::Doc.open('[3,[2,1]]') { |doc|
|
1403
|
+
* result = []
|
1404
|
+
* doc.each_value() { |v| result << v }
|
1405
|
+
* result
|
1406
|
+
* }
|
1407
|
+
* #=> [3, 2, 1]
|
1408
|
+
*
|
1409
|
+
* Oj::Doc.open('[3,[2,1]]') { |doc|
|
1410
|
+
* result = []
|
1411
|
+
* doc.each_value('/2') { |v| result << v }
|
1412
|
+
* result
|
1413
|
+
* }
|
1414
|
+
* #=> [2, 1]
|
1415
|
+
*/
|
1416
|
+
static VALUE
|
1417
|
+
doc_each_value(int argc, VALUE *argv, VALUE self) {
|
1418
|
+
if (rb_block_given_p()) {
|
1419
|
+
Doc doc = DATA_PTR(self);
|
1420
|
+
const char *path = 0;
|
1421
|
+
Leaf leaf;
|
1422
|
+
|
1423
|
+
if (1 <= argc) {
|
1424
|
+
Check_Type(*argv, T_STRING);
|
1425
|
+
path = StringValuePtr(*argv);
|
1426
|
+
}
|
1427
|
+
if (0 != (leaf = get_doc_leaf(doc, path))) {
|
1428
|
+
each_value(doc, leaf);
|
1429
|
+
}
|
1430
|
+
}
|
1431
|
+
return Qnil;
|
1432
|
+
}
|
1433
|
+
|
1434
|
+
// TBD improve to be more direct for higher performance
|
1435
|
+
|
1436
|
+
/* call-seq: dump(path=nil) => String
|
1437
|
+
*
|
1438
|
+
* Dumps the document or nodes to a new JSON document. It uses the default
|
1439
|
+
* options for generating the JSON.
|
1440
|
+
* @param [String] path if provided, identifies the top of the branch to dump to JSON
|
1441
|
+
* @example
|
1442
|
+
* Oj::Doc.open('[3,[2,1]]') { |doc|
|
1443
|
+
* doc.dump('/2')
|
1444
|
+
* }
|
1445
|
+
* #=> "[2,1]"
|
1446
|
+
*/
|
1447
|
+
static VALUE
|
1448
|
+
doc_dump(int argc, VALUE *argv, VALUE self) {
|
1449
|
+
Doc doc = DATA_PTR(self);
|
1450
|
+
Leaf leaf;
|
1451
|
+
const char *path = 0;
|
1452
|
+
const char *json;
|
1453
|
+
|
1454
|
+
if (1 <= argc) {
|
1455
|
+
Check_Type(*argv, T_STRING);
|
1456
|
+
path = StringValuePtr(*argv);
|
1457
|
+
}
|
1458
|
+
if (0 != (leaf = get_doc_leaf(doc, path))) {
|
1459
|
+
json = oj_write_obj_to_str(leaf_value(doc, leaf), &oj_default_options);
|
1460
|
+
|
1461
|
+
return rb_str_new2(json);
|
1462
|
+
}
|
1463
|
+
return Qnil;
|
1464
|
+
}
|
1465
|
+
|
1466
|
+
/* call-seq: size() => Fixnum
|
1467
|
+
*
|
1468
|
+
* Returns the number of nodes in the JSON document where a node is any one of
|
1469
|
+
* the basic JSON components.
|
1470
|
+
* @return Returns the size of the JSON document.
|
1471
|
+
* @example
|
1472
|
+
* Oj::Doc.open('[1,2,3]') { |doc| doc.size() } #=> 4
|
1473
|
+
*/
|
1474
|
+
static VALUE
|
1475
|
+
doc_size(VALUE self) {
|
1476
|
+
return ULONG2NUM(((Doc)DATA_PTR(self))->size);
|
1477
|
+
}
|
1478
|
+
|
1479
|
+
/* Document-class: Oj::Doc
|
1480
|
+
*
|
1481
|
+
* The Doc class is used to parse and navigate a JSON document. The model it
|
1482
|
+
* employs is that of a document that while open can be navigated and values
|
1483
|
+
* extracted. Once the document is closed the document can no longer be
|
1484
|
+
* accessed. This allows the parsing and data extraction to be extremely fast
|
1485
|
+
* compared to other JSON parsers.
|
1486
|
+
*
|
1487
|
+
* An Oj::Doc instance is not created directly. Instead the _open()_ class method is
|
1488
|
+
* used to open a document and the yield parameter to the block of the #open()
|
1489
|
+
* call is the Doc instance. The Doc instance can be moved across, up, and
|
1490
|
+
* down the JSON document. At each element the data associated with the
|
1491
|
+
* element can be extracted. It is also possible to just provide a path to the
|
1492
|
+
* data to be extracted and retrieve the data in that manner.
|
1493
|
+
*
|
1494
|
+
* For many of the methods a path is used to describe the location of an
|
1495
|
+
* element. Paths follow a subset of the XPath syntax. The slash ('/')
|
1496
|
+
* character is the separator. Each step in the path identifies the next
|
1497
|
+
* branch to take through the document. A JSON object will expect a key string
|
1498
|
+
* while an array will expect a positive, 1-based index. A .. step indicates a move up
|
1499
|
+
* the JSON document.
|
1500
|
+
*
|
1501
|
+
* @example
|
1502
|
+
* json = %{[
|
1503
|
+
* {
|
1504
|
+
* "one" : 1,
|
1505
|
+
* "two" : 2
|
1506
|
+
* },
|
1507
|
+
* {
|
1508
|
+
* "three" : 3,
|
1509
|
+
* "four" : 4
|
1510
|
+
* }
|
1511
|
+
* ]}
|
1512
|
+
* # move and get value
|
1513
|
+
* Oj::Doc.open(json) do |doc|
|
1514
|
+
* doc.move('/1/two')
|
1515
|
+
* # doc location is now at the 'two' element of the hash that is the first element of the array.
|
1516
|
+
* doc.fetch()
|
1517
|
+
* end
|
1518
|
+
* #=> 2
|
1519
|
+
*
|
1520
|
+
* # Now try again using a path to Oj::Doc.fetch() directly.
|
1521
|
+
* Oj::Doc.open(json) { |doc| doc.fetch('/2/three') } #=> 3
|
1522
|
+
*/
|
1523
|
+
void
|
1524
|
+
oj_init_doc() {
|
1525
|
+
oj_doc_class = rb_define_class_under(Oj, "Doc", rb_cObject);
|
1526
|
+
rb_define_singleton_method(oj_doc_class, "open", doc_open, 1);
|
1527
|
+
rb_define_singleton_method(oj_doc_class, "open_file", doc_open_file, 1);
|
1528
|
+
rb_define_singleton_method(oj_doc_class, "parse", doc_open, 1);
|
1529
|
+
rb_define_method(oj_doc_class, "where?", doc_where, 0);
|
1530
|
+
rb_define_method(oj_doc_class, "local_key", doc_local_key, 0);
|
1531
|
+
rb_define_method(oj_doc_class, "home", doc_home, 0);
|
1532
|
+
rb_define_method(oj_doc_class, "type", doc_type, -1);
|
1533
|
+
rb_define_method(oj_doc_class, "fetch", doc_fetch, -1);
|
1534
|
+
rb_define_method(oj_doc_class, "each_leaf", doc_each_leaf, -1);
|
1535
|
+
rb_define_method(oj_doc_class, "move", doc_move, 1);
|
1536
|
+
rb_define_method(oj_doc_class, "each_child", doc_each_child, -1);
|
1537
|
+
rb_define_method(oj_doc_class, "each_value", doc_each_value, -1);
|
1538
|
+
rb_define_method(oj_doc_class, "dump", doc_dump, -1);
|
1539
|
+
rb_define_method(oj_doc_class, "size", doc_size, 0);
|
1540
|
+
}
|