pickle-interpreter 0.1.0

Files changed (4)
  1. checksums.yaml +15 -0
  2. data/README.md +17 -0
  3. data/lib/pickle_interpreter.rb +502 -0
  4. metadata +44 -0
checksums.yaml ADDED
@@ -0,0 +1,15 @@
+ ---
+ !binary "U0hBMQ==":
+   metadata.gz: !binary |-
+     MDAwZDlmNmU1MDg5Yjc2Y2I5MTAwYzIxYjQ4NGMyNGExOWQxODMxMg==
+   data.tar.gz: !binary |-
+     MzBhOTI2MGVlMjM3ZDMwYmM1MGZhMjg0MDYzNmEzYTZjZWIwMzg2Mw==
+ SHA512:
+   metadata.gz: !binary |-
+     NzA4MGI5ZjVjMDQ1MzBhNTY3NWExMzA0MTIzMGQ2YWNmNGQ2ZmUzYTRhMDgx
+     Yzc3ZTdhMzkwNzk0Y2I2MzY4M2E5NmMyNzM1MWI4ZDgwNjFhOGY0N2Y5YjU1
+     NGM3MDAyMDdkYzQwZDljODFkNDQzOWVmMzU2NmQ5NGI1ZDYzZDM=
+   data.tar.gz: !binary |-
+     ODM4MGVhODc3MDg1ZmNjNGYyNTM3OWU1ZGQyMzhjMjg2YzBjNzYyOGRlMTRi
+     OWU4MGY4YmNhMmQyMzA2MjYzMjdhOTM4YzMwMGU0MzIyYWQ1M2NjNDc1M2Fi
+     ZWY4MmNmOTZiNmU5ODZmNWMxM2IzYzRmYjhlMTEyMDEzYjAwNzc=
data/README.md ADDED
@@ -0,0 +1,17 @@
+ This is a library for reading Python pickle objects.
+
+ If you have a Base64-encoded string, you can just call PickleInterpreter.unpickle_base64(my_string).
+
+ If you have the actual binary string, make sure it uses the ASCII-8BIT character encoding. Then you can call PickleInterpreter.unpickle(my_string).
+
+ If you are reading from the django_session table (which is why I bothered with this in the first place), you can call PickleInterpreter.unpickle_base64_signed(my_string). Note that this function does not check the signature; it just ignores it.
+
+ I *think* I have at least a rudimentary implementation of all of the pickle instructions. However, I don't use a lot of Python, so I don't really have anything to test on. If you use this and something breaks, send me the pickle file and what it should decode to, and I will get it fixed: jonathan@newmedio.com
+
+ For objects, however, I pretty much just create a hash with the initialization parameters baked in somewhere (e.g. under a key like "__init_arg", or whatever is appropriate to how the object was created). This is certainly an area that can be improved. However, you can also just walk the tree after the fact and look for these hashes.
+
+ Let me know how it works, and if there is anything else I need to implement. This is based on the 2014-06-10 version of this file:
+
+ http://svn.python.org/projects/python/trunk/Lib/pickletools.py
+
+
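The README's three entry points differ only in how the input is prepared before the pickle bytes are interpreted. The following is a minimal sketch of that preparation, not the gem's own code: `decode_to_binary` and `strip_signature` are hypothetical helper names, and the signature string is made up for illustration.

```ruby
# Hypothetical helpers for illustration; they are not part of the gem.

# Decode a Base64 string into raw bytes tagged as ASCII-8BIT.
# String#unpack1("m") performs the same decoding as Base64.decode64.
def decode_to_binary(base64_str)
  base64_str.unpack1("m").force_encoding(Encoding::ASCII_8BIT)
end

# Split a decoded "<signature>:<pickle>" payload and keep only the pickle.
# The signature is dropped without being verified, as the README describes.
def strip_signature(binary_str)
  binary_str.split(":".b, 2)[1]
end

# Bytes of a protocol 2 pickle holding the integer 1: PROTO 2, BININT1 1, STOP.
pickle_bytes = "\x80\x02K\x01.".b

# Build a stand-in for a stored session value (the signature is invented).
stored = ["fakesignature:".b + pickle_bytes].pack("m0")

payload = strip_signature(decode_to_binary(stored))
puts payload.bytes.inspect # => [128, 2, 75, 1, 46]
```

With the gem loaded, `payload` is the kind of binary string that could be handed to `PickleInterpreter.unpickle`.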
data/lib/pickle_interpreter.rb ADDED
@@ -0,0 +1,502 @@
+ #!/usr/bin/env ruby
+
+ require "base64"
+ class PickleInterpreter
+   @@instructions = {}
+
+   def self.register_instruction(code, funcsym)
+     @@instructions[code] = funcsym
+   end
+
+   def initialize
+     @stack = []
+     @memo = []
+     @exts = []
+   end
+
+   def self.unpickle(str)
+     p = PickleInterpreter.new
+     p.interpret_string(str)
+   end
+
+   def self.unpickle_base64(str)
+     p = PickleInterpreter.new
+     p.interpret_base64(str)
+   end
+
+   def self.unpickle_base64_signed(str)
+     p = PickleInterpreter.new
+     p.interpret_base64_signed(str)
+   end
+
+   def interpret(instruction_stack)
+     while(!instruction_stack.empty?) do
+       instr = instruction_stack.shift
+
+       sym = @@instructions[instr]
+       if sym.nil?
+         raise "Error finding instruction: #{instr}"
+       end
+       if(sym == :STOP)
+         return @stack.pop
+       else
+         self.send("instr_#{sym}", instruction_stack)
+       end
+     end
+   end
+
+   def interpret_string(str)
+     # Work on raw byte values so the result does not depend on the string's encoding
+     interpret(str.bytes)
+   end
+
+   def interpret_base64(str)
+     interpret_string(Base64.decode64(str))
+   end
+
+   def interpret_base64_signed(str)
+     # The decoded payload is "<signature>:<pickle>"; the signature is ignored
+     val = Base64.decode64(str)
+     interpret_string(val.split(":", 2)[1])
+   end
+
+   def instr_PROTO(instruction_stack)
+     # Consume the protocol version byte; it is not used
+     instruction_stack.shift
+   end
+   register_instruction(128, :PROTO)
+
+   def instr_INT(instruction_stack)
+     @stack.push(read_nl_number(instruction_stack))
+   end
+   register_instruction("I".ord, :INT)
+
+   # FIXME - BININT is a signed four-byte integer, but read_nbyte_long does not handle twos complement yet
+   def instr_BININT(instruction_stack)
+     @stack.push(read_nbyte_long(instruction_stack, 4))
+   end
+   register_instruction("J".ord, :BININT)
+
+   # NOTE - positive-only!
+   def instr_BININT1(instruction_stack)
+     @stack.push(instruction_stack.shift)
+   end
+   register_instruction("K".ord, :BININT1)
+
+   # NOTE - positive-only! (BININT2 is an unsigned two-byte integer)
+   def instr_BININT2(instruction_stack)
+     @stack.push(read_nbyte_long(instruction_stack, 2))
+   end
+   register_instruction("M".ord, :BININT2)
+
+   def instr_EMPTY_DICT(instruction_stack)
+     @stack.push({})
+   end
+   register_instruction("}".ord, :EMPTY_DICT)
+
+   def instr_DICT(instruction_stack)
+     tmphsh = {}
+     lst = slice_to_mark()
+     while(!lst.empty?) do
+       val = lst.pop
+       key = lst.pop
+       tmphsh[key] = val
+     end
+     @stack.push(tmphsh)
+   end
+   register_instruction("d".ord, :DICT)
+
+   def instr_EXT1(instruction_stack)
+     ereg_idx = instruction_stack.shift
+     @stack.push(@exts[ereg_idx])
+   end
+   register_instruction(130, :EXT1)
+
+   def instr_EXT2(instruction_stack)
+     ereg_idx = read_nbyte_long(instruction_stack, 2)
+     @stack.push(@exts[ereg_idx])
+   end
+   register_instruction(131, :EXT2)
+
+   def instr_EXT4(instruction_stack)
+     ereg_idx = read_nbyte_long(instruction_stack, 4)
+     @stack.push(@exts[ereg_idx])
+   end
+   register_instruction(132, :EXT4)
+
+   def instr_GLOBAL(instruction_stack)
+     modname = read_nl_string(instruction_stack)
+     clsname = read_nl_string(instruction_stack)
+     # FIXME - should I actually try to locate the classes? Currently just returning a hash with special keys
+     #       - should have a set of defined modules/classes that we can pull from
+     @stack.push({"__module" => modname, "__class" => clsname})
+   end
+   register_instruction("c".ord, :GLOBAL)
+
+   # FIXME - not really implemented. Should have a set of defined callables
+   def instr_REDUCE(instruction_stack)
+     arg = @stack.pop
+     callable = @stack.pop
+     @stack.push({"__callable" => callable, "__argument" => arg})
+   end
+   register_instruction("R".ord, :REDUCE)
+
+   # FIXME - not really implemented. Just adds another parameter to a hash at the moment
+   def instr_BUILD(instruction_stack)
+     arg = @stack.pop
+     @stack.last["__state"] = arg
+   end
+   register_instruction("b".ord, :BUILD)
+
+   # FIXME - not really implemented.
+   def instr_INST(instruction_stack)
+     modname = read_nl_string(instruction_stack)
+     clsname = read_nl_string(instruction_stack)
+     args = slice_to_mark()
+     @stack.push({"__init_arg" => args, "__module" => modname, "__class" => clsname})
+   end
+   register_instruction("i".ord, :INST)
+
+   def instr_OBJ(instruction_stack)
+     lst = slice_to_mark()
+     class_obj = lst.shift
+     @stack.push({"__init_arg" => lst, "__classobj" => class_obj})
+   end
+   register_instruction("o".ord, :OBJ)
+
+   def instr_NEWOBJ(instruction_stack)
+     arg = @stack.pop
+     cls = @stack.pop
+     @stack.push({"__classobj" => cls, "__init_arg" => arg})
+   end
+   register_instruction(129, :NEWOBJ)
+
+   # FIXME - need to register persistent loaders
+   def instr_PERSID(instruction_stack)
+     persid = read_nl_string(instruction_stack)
+     @stack.push({"__persid" => persid})
+   end
+   register_instruction("P".ord, :PERSID)
+
+   # FIXME - need to register persistent loaders
+   def instr_BINPERSID(instruction_stack)
+     persid = @stack.pop
+     @stack.push({"__persid" => persid})
+   end
+   register_instruction("Q".ord, :BINPERSID)
+
+   def instr_SETITEM(instruction_stack)
+     value = @stack.pop
+     key = @stack.pop
+     @stack.last[key] = value
+   end
+   register_instruction("s".ord, :SETITEM)
+
+   def instr_PUT(instruction_stack)
+     location = read_nl_number(instruction_stack)
+     @memo[location] = @stack.last
+   end
+   register_instruction("p".ord, :PUT)
+
+   def instr_BINPUT(instruction_stack)
+     location = instruction_stack.shift
+     @memo[location] = @stack.last
+   end
+   register_instruction("q".ord, :BINPUT)
+
+   # FIXME - need to change encoding to UTF8
+   def instr_BINUNICODE(instruction_stack)
+     sz = read_nbyte_long(instruction_stack, 4)
+     str = read_string(instruction_stack, sz)
+     @stack.push(str)
+   end
+   register_instruction("X".ord, :BINUNICODE)
+
+   # FIXME - not sure if encoding right. Also need to get escape sequences
+   def instr_UNICODE(instruction_stack)
+     str = read_nl_string(instruction_stack)
+     @stack.push(str)
+   end
+   register_instruction("V".ord, :UNICODE)
+
+   def instr_NEWTRUE(instruction_stack)
+     @stack.push(true)
+   end
+   register_instruction(136, :NEWTRUE)
+
+   def instr_NEWFALSE(instruction_stack)
+     @stack.push(false)
+   end
+   register_instruction(137, :NEWFALSE)
+
+   def instr_NONE(instruction_stack)
+     @stack.push(nil)
+   end
+   register_instruction("N".ord, :NONE)
+
+
+   def instr_LONG_BINPUT(instruction_stack)
+     location = read_nbyte_long(instruction_stack, 4)
+     @memo[location] = @stack.last
+   end
+   register_instruction("r".ord, :LONG_BINPUT)
+
+   def read_nl_number(instruction_stack)
+     decimal = false
+     decimal_instr = ".".ord
+     terminator = "\n".ord
+     val = 0
+     offset = "0".ord
+     decimal_level = 0
+     negative = false
+
+     nextval = instruction_stack.shift
+     while(nextval != terminator) do
+       # nextval is a byte value, so compare against character codes, not strings
+       if nextval == decimal_instr
+         decimal = true
+       elsif nextval == "L".ord
+         # skip the trailing "L" of a long literal
+       elsif nextval == "-".ord
+         negative = true
+       else
+         val = val * 10
+         val = val + (nextval - offset)
+         if decimal
+           decimal_level = decimal_level + 1
+         end
+       end
+
+       nextval = instruction_stack.shift
+     end
+
+     if negative
+       val = 0 - val
+     end
+
+     if decimal
+       divisor = 10 ** decimal_level
+       val = val.to_f / divisor.to_f
+     end
+
+     return val
+   end
+
+   def instr_GET(instruction_stack)
+     location = read_nl_number(instruction_stack)
+     @stack.push(@memo[location])
+   end
+   register_instruction("g".ord, :GET)
+
+   def instr_BINGET(instruction_stack)
+     location = instruction_stack.shift
+     @stack.push(@memo[location])
+   end
+   register_instruction("h".ord, :BINGET)
+
+   def instr_LONG_BINGET(instruction_stack)
+     location = read_nbyte_long(instruction_stack, 4)
+     @stack.push(@memo[location])
+   end
+   register_instruction("j".ord, :LONG_BINGET)
+
+   def instr_MARK(instruction_stack)
+     @stack.push(:mark)
+   end
+   register_instruction("(".ord, :MARK)
+
+   def slice_to_mark()
+     tmplist = []
+     val = @stack.pop
+     while(val != :mark) do
+       tmplist.unshift(val)
+       val = @stack.pop
+     end
+
+     return tmplist
+   end
+
+   def instr_POP_MARK(instruction_stack)
+     while(true) do
+       break if @stack.empty?
+       nextval = @stack.pop
+       break if nextval == :mark
+     end
+   end
+   register_instruction("1".ord, :POP_MARK)
+
+   # FIXME - twos complement
+   def read_nbyte_long(instruction_stack, n)
+     val = 0
+     multiplier = 1
+     while(n > 0) do
+       nextval = instruction_stack.shift
+       nextval = nextval * multiplier
+       val = val + nextval
+
+       multiplier = multiplier * 256
+       n = n - 1
+     end
+
+     return val
+   end
+
+   def read_string(instruction_stack, str_size)
+     str = ""
+     while(str_size > 0) do
+       nextchar = instruction_stack.shift
+       str = str + nextchar.chr
+       str_size = str_size - 1
+     end
+
+     return str
+   end
+
+   def instr_DUP(instruction_stack)
+     @stack.push(@stack.last)
+   end
+   register_instruction("2".ord, :DUP)
+
+   def instr_POP(instruction_stack)
+     @stack.pop
+   end
+   register_instruction("0".ord, :POP)
+
+   def instr_FLOAT(instruction_stack)
+     @stack.push(read_nl_number(instruction_stack))
+   end
+   register_instruction("F".ord, :FLOAT)
+
+   def instr_BINFLOAT(instruction_stack)
+     str = ""
+     ctr = 0
+     while(ctr < 8) do
+       str = "#{str}#{instruction_stack.shift.chr}"
+       ctr = ctr + 1
+     end
+     val = str.unpack("G")[0]
+     @stack.push(val)
+   end
+   register_instruction("G".ord, :BINFLOAT)
+
+   def instr_EMPTY_LIST(instruction_stack)
+     @stack.push([])
+   end
+   register_instruction("]".ord, :EMPTY_LIST)
+
+   def instr_APPEND(instruction_stack)
+     val = @stack.pop
+     @stack.last.push(val)
+   end
+   register_instruction('a'.ord, :APPEND)
+
+   def instr_APPENDS(instruction_stack)
+     tmplist = slice_to_mark()
+     tmplist.each do |tmp|
+       @stack.last.push(tmp)
+     end
+   end
+   register_instruction('e'.ord, :APPENDS)
+
+   def instr_LIST(instruction_stack)
+     @stack.push(slice_to_mark())
+   end
+   register_instruction("l".ord, :LIST)
+
+   def instr_EMPTY_TUPLE(instruction_stack)
+     @stack.push([])
+   end
+   register_instruction(')'.ord, :EMPTY_TUPLE)
+
+   def instr_TUPLE(instruction_stack)
+     @stack.push(slice_to_mark())
+   end
+   register_instruction('t'.ord, :TUPLE)
+
+   def instr_TUPLE1(instruction_stack)
+     @stack.push([@stack.pop])
+   end
+   register_instruction(133, :TUPLE1)
+
+   def instr_TUPLE2(instruction_stack)
+     val2 = @stack.pop
+     val1 = @stack.pop
+     @stack.push([val1, val2])
+   end
+   register_instruction(134, :TUPLE2)
+
+   def instr_TUPLE3(instruction_stack)
+     val3 = @stack.pop
+     val2 = @stack.pop
+     val1 = @stack.pop
+     @stack.push([val1, val2, val3])
+   end
+   register_instruction(135, :TUPLE3)
+
+   def instr_BINSTRING(instruction_stack)
+     sz = read_nbyte_long(instruction_stack, 4)
+     str = read_string(instruction_stack, sz)
+     @stack.push(str)
+   end
+   register_instruction("T".ord, :BINSTRING)
+
+   def instr_SHORT_BINSTRING(instruction_stack)
+     str_size = instruction_stack.shift
+     str = read_string(instruction_stack, str_size)
+     @stack.push(str)
+   end
+   register_instruction("U".ord, :SHORT_BINSTRING)
+
+   def instr_LONG(instruction_stack)
+     @stack.push(read_nl_number(instruction_stack))
+   end
+   register_instruction("L".ord, :LONG)
+
+   def instr_LONG1(instruction_stack)
+     sz = instruction_stack.shift
+     val = read_nbyte_long(instruction_stack, sz)
+     @stack.push(val)
+   end
+   register_instruction(138, :LONG1)
+
+   def instr_LONG4(instruction_stack)
+     sz = read_nbyte_long(instruction_stack, 4)
+     val = read_nbyte_long(instruction_stack, sz)
+     @stack.push(val)
+   end
+   register_instruction(139, :LONG4)
+
+   def read_nl_string(instruction_stack)
+     str = ""
+     terminator = "\n".ord
+     nextval = instruction_stack.shift
+     while(nextval != terminator) do
+       str = str + nextval.chr
+       nextval = instruction_stack.shift
+     end
+
+     # Return the accumulated string, not the terminator byte
+     return str
+   end
+
+   # FIXME - need to interpret string (repr-style according to docs)
+   def instr_STRING(instruction_stack)
+     str = read_nl_string(instruction_stack)
+     @stack.push(str)
+   end
+   register_instruction("S".ord, :STRING)
+
+   def instr_SETITEMS(instruction_stack)
+     lst = slice_to_mark()
+     while(!lst.empty?) do
+       val = lst.pop
+       key = lst.pop
+       @stack.last[key] = val
+     end
+   end
+   register_instruction("u".ord, :SETITEMS)
+
+   def instr_STOP(instruction_stack)
+     # Never dispatched: interpret returns the top of the stack as soon as it sees STOP
+   end
+   register_instruction(".".ord, :STOP)
+ end
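The `FIXME - twos complement` note on `read_nbyte_long` matters for the signed opcodes (BININT, LONG1 and LONG4 carry signed little-endian values, while BININT1 and BININT2 are unsigned). One possible way to close that gap is sketched below; `read_signed_le` is a hypothetical helper, not part of the gem, and it assumes the same array-of-byte-values input that the interpreter uses.

```ruby
# Hypothetical helper: read n little-endian bytes from the front of `bytes`
# and interpret them as a signed twos-complement integer.
def read_signed_le(bytes, n)
  val = 0
  # Accumulate the unsigned value, least significant byte first.
  n.times { |i| val |= bytes.shift << (8 * i) }
  # If the highest bit is set, the value is negative: subtract 2**(8n).
  val -= 1 << (8 * n) if n > 0 && val >= 1 << (8 * n - 1)
  val
end

puts read_signed_le([0xff, 0xff, 0xff, 0xff], 4) # => -1
puts read_signed_le([0x01, 0x00, 0x00, 0x00], 4) # => 1
puts read_signed_le([0x00, 0x80], 2)             # => -32768
```

A zero-length read returns 0, which matches how an empty LONG1 payload encodes the integer zero.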
metadata ADDED
@@ -0,0 +1,44 @@
+ --- !ruby/object:Gem::Specification
+ name: pickle-interpreter
+ version: !ruby/object:Gem::Version
+   version: 0.1.0
+ platform: ruby
+ authors:
+ - Jonathan Bartlett
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2014-06-10 00:00:00.000000000 Z
+ dependencies: []
+ description: A library to read pickled objects from Python in Ruby
+ email: jonathan@newmedio.com
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - README.md
+ - lib/pickle_interpreter.rb
+ homepage: http://github.com/newmedio/pickle-interpreter
+ licenses: []
+ metadata: {}
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ! '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ! '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 2.3.0
+ signing_key:
+ specification_version: 4
+ summary: A Ruby Pickle interpreter to unpickle Python objects
+ test_files: []
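The README suggests walking the unpickled tree afterwards to find the hashes that stand in for Python objects. A minimal sketch of such a walk follows, assuming the special keys seen in `lib/pickle_interpreter.rb` ("__module", "__class", "__classobj", "__callable", "__persid"). `PLACEHOLDER_KEYS` and `find_placeholders` are hypothetical names, and the sample tree is hand-written rather than real interpreter output.

```ruby
# Keys the interpreter uses to mark hashes that stand in for Python objects.
PLACEHOLDER_KEYS = %w[__module __class __classobj __callable __persid].freeze

# Hypothetical helper: collect every placeholder hash reachable from `node`.
# `seen` tracks containers by identity so shared or cyclic structures
# (possible through the pickle memo) are visited only once.
def find_placeholders(node, found = [], seen = {}.compare_by_identity)
  return found if seen.key?(node)

  case node
  when Hash
    seen[node] = true
    found << node if PLACEHOLDER_KEYS.any? { |key| node.key?(key) }
    node.each do |key, value|
      find_placeholders(key, found, seen)
      find_placeholders(value, found, seen)
    end
  when Array
    seen[node] = true
    node.each { |value| find_placeholders(value, found, seen) }
  end
  found
end

# Hand-written example in the shape that REDUCE and GLOBAL would produce.
tree = {
  "user" => {
    "__callable" => { "__module" => "models", "__class" => "User" },
    "__argument" => ["alice"]
  },
  "tags" => ["a", "b"]
}

find_placeholders(tree).each { |hash| puts hash.keys.inspect }
```

Each returned hash is the same object that sits in the tree, so a caller could replace or annotate it in place once it knows how to map the module and class names to Ruby classes.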