oj 0.5.2 → 0.6.0

Potentially problematic release.

data/README.md CHANGED
@@ -18,13 +18,17 @@ A fast JSON parser and Object marshaller as a Ruby gem.
18
18
 
19
19
  ## <a name="release">Release Notes</a>
20
20
 
21
- ### Release 0.5.2
21
+ ### Release 0.6.0
22
22
 
23
- - Release 0.5.2 fixes encoding and float encoding.
23
+ - supports arbitrary Object dumping/serialization
24
24
 
25
- This is the first release with a version of 0.5 indicating it is only half
26
- done. Basic load() and dump() is supported for Hash, Array, NilClass,
27
- TrueClass, FalseClass, Fixnum, Float, Symbol, and String Objects.
25
+ - to_hash() method called if the Object responds to to_hash and the result is converted to JSON
26
+
27
+ - to_json() method called if the Object responds to to_json
28
+
29
+ - almost any Object can be dumped, including Exceptions (not including Thread, Mutex and Objects that only make sense within a process)
30
+
31
+ - default options have been added
28
32
 
29
33
  ## <a name="description">Description</a>
30
34
 
@@ -33,12 +37,117 @@ optimized JSON handling. It was designed as a faster alternative to Yajl and
33
37
  the other common Ruby JSON parsers. So far it has achieved that at about 2
34
38
  times faster than Yajl for parsing and 3 or more times faster writing JSON.
35
39
 
40
+ Oj has several dump or serialization modes which control how Objects are
41
+ converted to JSON. These modes are set with the :effort option in either the
42
+ default options or as one of the options to the dump() method. The :strict
43
+ mode will only allow the 7 basic JSON types to be serialized. Any other Object
44
+ will raise an Exception. In the :lazy mode any Object that is not one of the
45
+ JSON types is replaced by a JSON null. In :internal mode any Object will be
46
+ dumped as a JSON Object with keys that match the Ruby Object's variable names
47
+ without the '@' character. This is the highest performance mode. The :internal
48
+ mode is not found in other JSON gems. The last mode, the :tolerant mode is the
49
+ most tolerant. It will serialize any Object but will check to see if the
50
+ Object implements a to_hash() or to_json() method. If either exists that
51
+ method is used for serializing the Object. The to_hash() is more flexible and
52
+ produces more consistent output so it has a preference over the to_json()
53
+ method. If neither the to_json() nor the to_hash() method exists then the Oj
54
+ internal Object variable encoding is used.
55
+
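The mode rules in the paragraph above can be sketched in plain Ruby. This is a hypothetical illustration built on the standard `json` library, not Oj's implementation or API: the helper names (`sketch_dump`, `ivars_to_hash`, `own_to_json?`) are assumptions, the mode names are taken from the README text, and only the top-level value is inspected.

```ruby
require 'json'

# Hypothetical sketch, not Oj's implementation or API. Mode names follow the
# README text; only the top-level value is inspected.
NATIVE_TYPES = [Hash, Array, String, Numeric, NilClass, TrueClass, FalseClass].freeze

# Instance variables as a Hash keyed by name without the leading '@'.
def ivars_to_hash(obj)
  obj.instance_variables.each_with_object({}) do |name, h|
    h[name.to_s.sub(/\A@/, '')] = obj.instance_variable_get(name)
  end
end

# The stdlib json library gives every Object a generic to_json, so only a
# to_json defined directly on the object's class counts here.
def own_to_json?(obj)
  obj.class.instance_methods(false).include?(:to_json)
end

def sketch_dump(obj, mode)
  return obj.to_json if NATIVE_TYPES.any? { |t| obj.is_a?(t) }

  case mode
  when :strict
    raise TypeError, "cannot dump #{obj.class} in strict mode"
  when :lazy
    'null'
  when :internal
    ivars_to_hash(obj).to_json
  when :tolerant
    # to_hash is preferred over to_json, as the README describes.
    if obj.respond_to?(:to_hash)
      obj.to_hash.to_json
    elsif own_to_json?(obj)
      obj.to_json
    else
      ivars_to_hash(obj).to_json
    end
  else
    raise ArgumentError, "unknown mode #{mode.inspect}"
  end
end

# Example classes, also hypothetical.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end
end

class Tagged
  def initialize(tag)
    @tag = tag
  end

  def to_hash
    { 'tag' => @tag }
  end
end

puts sketch_dump(Point.new(1, 2), :lazy)
puts sketch_dump(Point.new(1, 2), :internal)
puts sketch_dump(Tagged.new('a'), :tolerant)
```

The `own_to_json?` check exists only because the sketch itself loads `json`; Oj's C code simply asks whether the Object responds to `to_json`.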
36
56
  Coming soon: As an Object marshaller with support for circular references.
37
57
 
38
- Coming soon: A SAX like JSON stream parser.
58
+ Coming soon: A JSON stream parser.
39
59
 
40
60
  Oj is compatible with Ruby 1.8.7, 1.9.2, 1.9.3, JRuby, and RBX.
41
61
 
62
+ ## <a name="compare">Comparisons</a>
63
+
64
+ The following table shows the difference in speeds between several
65
+ serialization packages. The tests had to be scaled back due to limitations of
66
+ some of the gems. I finally gave up trying to get JSON to serialize without
67
+ errors with Ruby 1.9.3. It had internal errors on anything other than a simple
68
+ JSON structure. The errors encountered were:
69
+
70
+ - MessagePack fails to convert Bignum to JSON
71
+
72
+ - JSON Pure and Ext fail to serialize any numbers or Objects with the to_json() method
73
+
74
+ Options were added to the test/perf_simple.rb test to run the test without
75
+ Object encoding and without Bignums. There is also an option for shallow JSON
76
+ so that JSON Pure and Ext can be compared.
77
+
78
+ None of the packages except Oj were able to serialize Ruby Objects that
79
+ neither had a to_json() method nor were one of the 7 native JSON types.
80
+
81
+ It is also worth noting that although Oj is slightly behind MessagePack for
82
+ parsing, Oj serialization is much faster than MessagePack even though Oj uses
83
+ human readable JSON vs the binary MessagePack format.
84
+
85
+ The results:
86
+
87
+ with Object and Bignum encoding:
88
+
89
+ 100000 Oj.load()s in 1.456 seconds or 68.7 loads/msec
90
+ 100000 Yajl::Parser.parse()s in 2.681 seconds or 37.3 parses/msec
91
+ 100000 JSON::Ext::Parser parse()s in 2.804 seconds or 35.7 parses/msec
92
+ 100000 JSON::Pure::Parser parse()s in 27.494 seconds or 3.6 parses/msec
93
+ MessagePack failed: RangeError: bignum too big to convert into `unsigned long long'
94
+ 100000 Ox.load()s in 3.165 seconds or 31.6 loads/msec
95
+ Parser results:
96
+ gem seconds parses/msec X faster than JSON::Pure (higher is better)
97
+ oj 1.456 68.7 18.9
98
+ yajl 2.681 37.3 10.3
99
+ msgpack failed to generate JSON
100
+ pure 27.494 3.6 1.0
101
+ ext 2.804 35.7 9.8
102
+ ox 3.165 31.6 8.7
103
+
104
+ 100000 Oj.dump()s in 0.484 seconds or 206.7 dumps/msec
105
+ 100000 Yajl::Encoder.encode()s in 2.167 seconds or 46.2 encodes/msec
106
+ JSON::Ext failed: TypeError: wrong argument type JSON::Pure::Generator::State (expected Data)
107
+ JSON::Pure failed: TypeError: wrong argument type JSON::Pure::Generator::State (expected Data)
108
+ MessagePack failed: RangeError: bignum too big to convert into `unsigned long long'
109
+ 100000 Ox.dump()s in 0.554 seconds or 180.4 dumps/msec
110
+ Parser results:
111
+ gem seconds dumps/msec X faster than Yajl (higher is better)
112
+ oj 0.484 206.7 4.5
113
+ yajl 2.167 46.2 1.0
114
+ msgpack failed to generate JSON
115
+ pure failed to generate JSON
116
+ ext failed to generate JSON
117
+ ox 0.554 180.4 3.9
118
+
119
+ without Objects or numbers (for JSON Pure) JSON:
120
+
121
+ 100000 Oj.load()s in 0.739 seconds or 135.3 loads/msec
122
+ 100000 Yajl::Parser.parse()s in 1.421 seconds or 70.4 parses/msec
123
+ 100000 JSON::Ext::Parser parse()s in 1.512 seconds or 66.2 parses/msec
124
+ 100000 JSON::Pure::Parser parse()s in 16.953 seconds or 5.9 parses/msec
125
+ 100000 MessagePack.unpack()s in 0.635 seconds or 157.6 packs/msec
126
+ 100000 Ox.load()s in 0.971 seconds or 103.0 loads/msec
127
+ Parser results:
128
+ gem seconds parses/msec X faster than JSON::Pure (higher is better)
129
+ oj 0.739 135.3 22.9
130
+ yajl 1.421 70.4 11.9
131
+ msgpack 0.635 157.6 26.7
132
+ pure 16.953 5.9 1.0
133
+ ext 1.512 66.2 11.2
134
+ ox 0.971 103.0 17.5
135
+
136
+ 100000 Oj.dump()s in 0.174 seconds or 575.1 dumps/msec
137
+ 100000 Yajl::Encoder.encode()s in 0.729 seconds or 137.2 encodes/msec
138
+ 100000 JSON::Ext generate()s in 7.171 seconds or 13.9 generates/msec
139
+ 100000 JSON::Pure generate()s in 7.219 seconds or 13.9 generates/msec
140
+ 100000 Msgpack()s in 0.299 seconds or 334.8 unpacks/msec
141
+ 100000 Ox.dump()s in 0.210 seconds or 475.8 dumps/msec
142
+ Parser results:
143
+ gem seconds dumps/msec X faster than JSON::Pure (higher is better)
144
+ oj 0.174 575.1 41.5
145
+ yajl 0.729 137.2 9.9
146
+ msgpack 0.299 334.8 24.2
147
+ pure 7.219 13.9 1.0
148
+ ext 7.171 13.9 1.0
149
+ ox 0.210 475.8 34.3
150
+
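The derived columns in these tables appear to follow from the timings alone: operations per millisecond is the iteration count divided by elapsed milliseconds, and "X faster" is the baseline's seconds divided by the gem's seconds. A small sketch of that arithmetic, using figures copied from the first parser table (the helper names are hypothetical and this is not the perf_simple.rb code):

```ruby
# Seconds for 100000 loads, copied from the "with Object and Bignum encoding"
# parser table. JSON::Pure is the baseline for that table.
ITERATIONS = 100_000
SECONDS = {
  'oj' => 1.456,
  'yajl' => 2.681,
  'ext' => 2.804,
  'ox' => 3.165,
  'pure' => 27.494
}.freeze

# Operations per millisecond, rounded to one decimal like the table.
def per_msec(iterations, secs)
  (iterations / (secs * 1000.0)).round(1)
end

# How many times faster than the baseline, rounded to one decimal.
def times_faster(baseline_secs, secs)
  (baseline_secs / secs).round(1)
end

SECONDS.each do |gem_name, secs|
  puts format('%-5s %7.3f %6.1f %5.1f',
              gem_name, secs, per_msec(ITERATIONS, secs),
              times_faster(SECONDS['pure'], secs))
end
```

For oj this gives 100000 / 1456 ms = 68.7 loads/msec and 27.494 / 1.456 = 18.9, matching the table.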
42
151
  ### Simple JSON Writing and Parsing:
43
152
 
44
153
  require 'oj'
@@ -34,13 +34,12 @@
34
34
  #include <stdio.h>
35
35
  #include <string.h>
36
36
 
37
- #include "ruby.h"
38
- #ifdef HAVE_RUBY_ENCODING_H
37
+ #include "oj.h"
38
+ #if IVAR_HELPERS
39
39
  #include "ruby/st.h"
40
40
  #else
41
41
  #include "st.h"
42
42
  #endif
43
- #include "oj.h"
44
43
 
45
44
  typedef unsigned long ulong;
46
45
 
@@ -79,13 +78,19 @@ static void dump_nil(Out out);
79
78
  static void dump_true(Out out);
80
79
  static void dump_false(Out out);
81
80
  static void dump_fixnum(VALUE obj, Out out);
81
+ static void dump_bignum(VALUE obj, Out out);
82
82
  static void dump_float(VALUE obj, Out out);
83
83
  static void dump_cstr(const char *str, int cnt, Out out);
84
84
  static void dump_hex(u_char c, Out out);
85
85
  static void dump_str(VALUE obj, Out out);
86
86
  static void dump_sym(VALUE obj, Out out);
87
+ static void dump_class(VALUE obj, Out out);
87
88
  static void dump_array(VALUE obj, int depth, Out out);
88
89
  static void dump_hash(VALUE obj, int depth, Out out);
90
+ static void dump_data(VALUE obj, Out out);
91
+ static void dump_object(VALUE obj, int depth, Out out);
92
+ static int dump_attr_cb(ID key, VALUE value, Out out);
93
+ static void dump_obj_attrs(VALUE obj, int with_class, int depth, Out out);
89
94
 
90
95
  static void grow(Out out, size_t len);
91
96
  static int is_json_friendly(const u_char *str, int len);
@@ -249,6 +254,19 @@ dump_fixnum(VALUE obj, Out out) {
249
254
  *out->cur = '\0';
250
255
  }
251
256
 
257
+ static void
258
+ dump_bignum(VALUE obj, Out out) {
259
+ VALUE rs = rb_big2str(obj, 10);
260
+ int cnt = (int)RSTRING_LEN(rs);
261
+
262
+ if (out->end - out->cur <= (long)cnt) {
263
+ grow(out, cnt);
264
+ }
265
+ memcpy(out->cur, StringValuePtr(rs), cnt);
266
+ out->cur += cnt;
267
+ *out->cur = '\0';
268
+ }
269
+
252
270
  static void
253
271
  dump_float(VALUE obj, Out out) {
254
272
  char buf[64];
@@ -335,6 +353,38 @@ dump_sym(VALUE obj, Out out) {
335
353
  dump_cstr(sym, (int)strlen(sym), out);
336
354
  }
337
355
 
356
+ static void
357
+ dump_class(VALUE obj, Out out) {
358
+ switch (out->opts->mode) {
359
+ case StrictMode:
360
+ rb_raise(rb_eTypeError, "Failed to dump class %s to JSON.\n", rb_class2name(obj));
361
+ break;
362
+ case NullMode:
363
+ dump_nil(out);
364
+ break;
365
+ case ObjectMode:
366
+ case CompatMode:
367
+ default:
368
+ {
369
+ const char *s = rb_class2name(obj);
370
+ size_t len = strlen(s);
371
+ size_t size = len + 20;
372
+
373
+ if (out->end - out->cur <= (long)size) {
374
+ grow(out, size);
375
+ }
376
+ memcpy(out->cur, "{\"*\":\"Class\",\"-\":\"", 18);
377
+ out->cur += 18;
378
+ memcpy(out->cur, s, len);
379
+ out->cur += len;
380
+ *out->cur++ = '"';
381
+ *out->cur++ = '}';
382
+ *out->cur = '\0';
383
+ break;
384
+ }
385
+ }
386
+ }
387
+
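For readers following the byte counts in dump_class(), here is a Ruby sketch of the output its default branch builds. The helper name `sketch_dump_class` is hypothetical. The point is where the constants come from: the fixed prefix is 18 bytes, and the closing quote and brace add 2 more, which accounts for the `len + 20` reservation.

```ruby
require 'json'

# Fixed text written before the class name in the default branch.
CLASS_PREFIX = '{"*":"Class","-":"'.freeze

# Hypothetical helper: the JSON text the default branch of dump_class()
# produces for a Class. Like the C code, the name is copied without escaping.
def sketch_dump_class(klass)
  "#{CLASS_PREFIX}#{klass.name}\"}"
end

json = sketch_dump_class(String)
puts json
puts CLASS_PREFIX.bytesize                 # the 18 used in the memcpy
puts json.bytesize - 'String'.bytesize     # prefix plus closing quote and brace
```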
338
388
  static void
339
389
  dump_array(VALUE a, int depth, Out out) {
340
390
  VALUE *np = RARRAY_PTR(a);
@@ -393,6 +443,9 @@ static void
393
443
  dump_hash(VALUE obj, int depth, Out out) {
394
444
  int cnt = (int)RHASH_SIZE(obj);
395
445
 
446
+ if (out->end - out->cur <= 2) {
447
+ grow(out, 2);
448
+ }
396
449
  *out->cur++ = '{';
397
450
  if (0 == cnt) {
398
451
  *out->cur++ = '}';
@@ -411,6 +464,196 @@ dump_hash(VALUE obj, int depth, Out out) {
411
464
  *out->cur = '\0';
412
465
  }
413
466
 
467
+ static void
468
+ dump_data(VALUE obj, Out out) {
469
+ VALUE clas = rb_obj_class(obj);
470
+
471
+ switch (out->opts->mode) {
472
+ case StrictMode:
473
+ rb_raise(rb_eTypeError, "Failed to dump %s Object to JSON in strict mode.\n", rb_class2name(clas));
474
+ break;
475
+ case NullMode:
476
+ dump_nil(out);
477
+ break;
478
+ case ObjectMode:
479
+ case CompatMode:
480
+ default:
481
+ if (rb_cTime == clas) {
482
+ char buf[64];
483
+ char *b = buf + sizeof(buf) - 1;
484
+ time_t sec = NUM2LONG(rb_funcall2(obj, oj_tv_sec_id, 0, 0));
485
+ long usec = NUM2LONG(rb_funcall2(obj, oj_tv_usec_id, 0, 0));
486
+ char *dot = b - 7;
487
+ long size;
488
+
489
+ *b-- = '\0';
490
+ for (; dot < b; b--, usec /= 10) {
491
+ *b = '0' + (usec % 10);
492
+ }
493
+ *b-- = '.';
494
+ for (; 0 < sec; b--, sec /= 10) {
495
+ *b = '0' + (sec % 10);
496
+ }
497
+ b++;
498
+ size = sizeof(buf) - (b - buf) - 1;
499
+ if (out->end - out->cur <= size + 20) {
500
+ grow(out, size + 20);
501
+ }
502
+ memcpy(out->cur, "{\"*\":\"Time\",\"-\":", 16);
503
+ out->cur += 16;
504
+ memcpy(out->cur, b, size);
505
+ out->cur += size;
506
+ *out->cur++ = '}';
507
+ *out->cur = '\0';
508
+ } else {
509
+ dump_nil(out);
510
+ }
511
+ }
512
+ }
513
+
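The Time branch of dump_data() fills a buffer from the right: six microsecond digits, a dot, then the seconds digits. A Ruby sketch of the resulting text, assuming non-negative times (the helper name `sketch_dump_time` is hypothetical):

```ruby
# Fixed text written before the number; 16 bytes, matching the memcpy length.
TIME_PREFIX = '{"*":"Time","-":'.freeze

# Hypothetical helper: seconds since the epoch, a dot, and exactly six
# zero-padded microsecond digits, wrapped as {"*":"Time","-":<number>}.
def sketch_dump_time(time)
  format('%s%d.%06d}', TIME_PREFIX, time.tv_sec, time.tv_usec)
end

puts sketch_dump_time(Time.at(1_325_376_000, 123))
puts TIME_PREFIX.bytesize
```

One difference worth noting: the C loop `for (; 0 < sec; ...)` writes no integer digit when `sec` is 0, whereas `format` in this sketch prints a leading `0`.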
514
+ static void
515
+ dump_object(VALUE obj, int depth, Out out) {
516
+ if (ObjectMode == out->opts->mode) {
517
+ dump_obj_attrs(obj, 1, depth, out);
518
+ } else {
519
+ switch (out->opts->mode) {
520
+ case StrictMode:
521
+ rb_raise(rb_eTypeError, "Failed to dump %s Object to JSON in strict mode.\n", rb_class2name(rb_obj_class(obj)));
522
+ break;
523
+ case NullMode:
524
+ dump_nil(out);
525
+ break;
526
+ case ObjectMode:
527
+ dump_obj_attrs(obj, 0, depth, out);
528
+ break;
529
+ case CompatMode:
530
+ default:
531
+ if (rb_respond_to(obj, oj_to_hash_id)) {
532
+ VALUE h = rb_funcall(obj, oj_to_hash_id, 0);
533
+
534
+ if (T_HASH != rb_type(h)) {
535
+ rb_raise(rb_eTypeError, "%s.to_hash() did not return a Hash.\n", rb_class2name(rb_obj_class(obj)));
536
+ }
537
+ dump_hash(h, depth, out);
538
+ } else if (rb_respond_to(obj, oj_to_json_id)) {
539
+ VALUE rs = rb_funcall(obj, oj_to_json_id, 0);
540
+ const char *s = StringValuePtr(rs);
541
+ int len = (int)RSTRING_LEN(rs);
542
+
543
+ if (out->end - out->cur <= len) {
544
+ grow(out, len);
545
+ }
546
+ memcpy(out->cur, s, len);
547
+ out->cur += len;
548
+ } else {
549
+ dump_obj_attrs(obj, 0, depth, out);
550
+ }
551
+ break;
552
+ }
553
+ }
554
+ *out->cur = '\0';
555
+ }
556
+
557
+ static int
558
+ dump_attr_cb(ID key, VALUE value, Out out) {
559
+ int depth = out->depth;
560
+ size_t size = depth * out->indent + 1;
561
+ const char *attr = rb_id2name(key);
562
+
563
+ if (out->end - out->cur <= (long)size) {
564
+ grow(out, size);
565
+ }
566
+ if ('@' == *attr) {
567
+ attr++;
568
+ } else {
569
+ // TBD handle unusual exception mesg data
570
+ }
571
+ fill_indent(out, depth);
572
+ dump_cstr(attr, (int)strlen(attr) - 1, out);
573
+ *out->cur++ = ':';
574
+ dump_val(value, depth, out);
575
+ out->depth = depth;
576
+ *out->cur++ = ',';
577
+
578
+ return ST_CONTINUE;
579
+ }
580
+
581
+ static void
582
+ dump_obj_attrs(VALUE obj, int with_class, int depth, Out out) {
583
+ size_t size;
584
+ int d2 = depth + 1;
585
+
586
+ if (out->end - out->cur <= 2) {
587
+ grow(out, 2);
588
+ }
589
+ *out->cur++ = '{';
590
+ if (with_class) {
591
+ const char *class_name = rb_class2name(rb_obj_class(obj));
592
+ int clen = (int)strlen(class_name);
593
+
594
+ size = d2 * out->indent + clen + 9;
595
+ if (out->end - out->cur <= (long)size) {
596
+ grow(out, size);
597
+ }
598
+ fill_indent(out, d2);
599
+ *out->cur++ = '"';
600
+ *out->cur++ = '*';
601
+ *out->cur++ = '"';
602
+ *out->cur++ = ':';
603
+ dump_cstr(class_name, clen, out);
604
+ }
605
+ {
606
+ int cnt;
607
+ // use encoding as the indicator for Ruby 1.8.7 or 1.9.x
608
+ #if IVAR_HELPERS
609
+ cnt = (int)rb_ivar_count(obj);
610
+ #else
611
+ VALUE vars = rb_funcall2(obj, oj_instance_variables_id, 0, 0);
612
+ VALUE *np = RARRAY_PTR(vars);
613
+ ID vid;
614
+ const char *attr;
615
+ int i;
616
+
617
+ cnt = (int)RARRAY_LEN(vars);
618
+ #endif
619
+ if (with_class && 0 < cnt) {
620
+ *out->cur++ = ',';
621
+ }
622
+ out->depth = depth + 1;
623
+ #if IVAR_HELPERS
624
+ rb_ivar_foreach(obj, dump_attr_cb, (VALUE)out);
625
+ out->cur--; // backup to overwrite last comma
626
+ #else
627
+ size = d2 * out->indent + 1;
628
+ for (i = cnt; 0 < i; i--, np++) {
629
+ if (out->end - out->cur <= (long)size) {
630
+ grow(out, size);
631
+ }
632
+ vid = rb_to_id(*np);
633
+ fill_indent(out, d2);
634
+ attr = rb_id2name(vid);
635
+ if ('@' == *attr) {
636
+ attr++;
637
+ } else {
638
+ // TBD handle unusual exception mesg data
639
+ }
640
+ dump_cstr(attr, (int)strlen(attr) - 1, out);
641
+ *out->cur++ = ':';
642
+ dump_val(rb_ivar_get(obj, vid), d2, out);
643
+ if (out->end - out->cur <= 2) {
644
+ grow(out, 2);
645
+ }
646
+ if (1 < i) {
647
+ *out->cur++ = ',';
648
+ }
649
+ }
650
+ #endif
651
+ out->depth = depth;
652
+ }
653
+ *out->cur++ = '}';
654
+ *out->cur = '\0';
655
+ }
656
+
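What dump_obj_attrs() emits can be sketched in Ruby as follows. This is a hypothetical illustration, not a port: `sketch_obj_attrs` is an assumed name, indentation is ignored, and values are encoded with the standard `json` library rather than Oj's dump_val().

```ruby
require 'json'

# Hypothetical helper: a JSON object whose keys are the instance variable
# names without the leading '@'. When with_class is true, a "*" key holding
# the class name comes first, as in the with_class branch of the C code.
def sketch_obj_attrs(obj, with_class)
  pairs = []
  pairs << "\"*\":#{obj.class.name.to_json}" if with_class
  obj.instance_variables.each do |name|
    key = name.to_s.sub(/\A@/, '')
    pairs << "#{key.to_json}:#{obj.instance_variable_get(name).to_json}"
  end
  "{#{pairs.join(',')}}"
end

# Example class, also hypothetical.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end
end

puts sketch_obj_attrs(Point.new(1, 2), true)
puts sketch_obj_attrs(Point.new(1, 2), false)
puts sketch_obj_attrs(Object.new, false)
```

Collecting the pairs and joining them with commas avoids the write-a-comma-then-back-up step the C code uses, which is convenient in Ruby but would cost an extra allocation in the C dumper.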
414
657
  static void
415
658
  dump_val(VALUE obj, int depth, Out out) {
416
659
  switch (rb_type(obj)) {
@@ -419,15 +662,15 @@ dump_val(VALUE obj, int depth, Out out) {
419
662
  case T_FALSE: dump_false(out); break;
420
663
  case T_FIXNUM: dump_fixnum(obj, out); break;
421
664
  case T_FLOAT: dump_float(obj, out); break;
422
- case T_BIGNUM: break; // TBD
665
+ case T_BIGNUM: dump_bignum(obj, out); break;
423
666
  case T_STRING: dump_str(obj, out); break;
424
667
  case T_SYMBOL: dump_sym(obj, out); break;
425
668
  case T_ARRAY: dump_array(obj, depth, out); break;
426
669
  case T_HASH: dump_hash(obj, depth, out); break;
427
- case T_OBJECT:
670
+ case T_CLASS: dump_class(obj, out); break;
671
+ case T_OBJECT: dump_object(obj, depth, out); break;
672
+ case T_DATA: dump_data(obj, out); break;
428
673
  case T_REGEXP:
429
- case T_CLASS:
430
- case T_DATA: // for Time
431
674
  // TBD
432
675
  rb_raise(rb_eNotImpError, "Failed to dump '%s' Object (%02x)\n",
433
676
  rb_class2name(rb_obj_class(obj)), rb_type(obj));
@@ -460,7 +703,7 @@ dump_obj_to_json(VALUE obj, Options copts, Out out) {
460
703
  }
461
704
 
462
705
  char*
463
- write_obj_to_str(VALUE obj, Options copts) {
706
+ oj_write_obj_to_str(VALUE obj, Options copts) {
464
707
  struct _Out out;
465
708
 
466
709
  dump_obj_to_json(obj, copts, &out);
@@ -469,7 +712,7 @@ write_obj_to_str(VALUE obj, Options copts) {
469
712
  }
470
713
 
471
714
  void
472
- write_obj_to_file(VALUE obj, const char *path, Options copts) {
715
+ oj_write_obj_to_file(VALUE obj, const char *path, Options copts) {
473
716
  struct _Out out;
474
717
  size_t size;
475
718
  FILE *f;