float-formats 0.1.0 → 0.1.1

@@ -1,3 +1,35 @@
1
+ == 0.1.1 2007-12-15
2
+
3
+ * HP-71B formats defined
4
+
5
+ * Add half precision IEEE format (binary16)
6
+
7
+ * New names for IEEE formats
8
+
9
+ * Add some IEEE 754r interchange formats
10
+
11
+ * new methods hex_to_float, hex_from_float in float-formats/native
12
+
13
+ * Allow non-bcd values in fields of BCD formats by passing
14
+ hex values as Strings; allow such values to be used for
15
+ nan/infinity exponents.
16
+
17
+ * Nio 0.2.1 is now required
18
+
19
+ * Handle special values (Infinities and NaN) in #from_fmt, #from_number
20
+
21
+ * Add ulp methods to Value and FP classes and to Float
22
+
23
+ * Bug fixes
24
+ - Fix the encoding-decoding of nan and infinity in Decimal format.
25
+ - Fix the decoding of NaN in Binary & Hexadecimal
26
+ - The definition of IEEE_binary128 was not correct
27
+ - In formats such as XS256 where the minimum exponent is not used only for zero
28
+ and there is a hidden bit, then minimum nonzero significand is radix*(prec-1)+1
29
+ rather than radix*(prec-1); the latter value could be computed in ratio_float
30
+ and then packed in the representation, being replaced by zero. This would
31
+ result in an incorrect encoding of the minimum nonzero value.
32
+
1
33
  == 0.1.0 2007-11-04
2
34
 
3
35
  * Initial release
@@ -11,7 +11,6 @@ lib/float-formats/bytes.rb
11
11
  lib/float-formats/classes.rb
12
12
  lib/float-formats/formats.rb
13
13
  lib/float-formats/native.rb
14
- log/debug.log
15
14
  script/destroy
16
15
  script/destroy.cmd
17
16
  script/generate
data/README.txt CHANGED
@@ -33,14 +33,22 @@ The latest version of Float-Formats and its source code can be downloaded from
33
33
 
34
34
  A number of common formats are defined as constants in the FltPnt module:
35
35
 
36
- ==IEEE
37
- <b>IEEE 754 binary</b> floating point representations in little endian order:
38
- IEEE_SINGLE, IEEE_DOUBLE, IEEE_EXTENDED, IEEE_128 and
39
- as little endian: IEEE_S_BE, IEEE_D_BE, IEEE_X_BE, IEEE_128_BE.
40
- Note that the standard defines extended formats with either 64 bits or precision
41
- (IEEE_EXTENDED, IEEE_X_BE) or 112 (IEEE_128, IEEE_128_BE).
36
+ ==IEEE 754r
37
+ <b>binary</b> floating point representations in little endian order:
38
+ IEEE_binary16 (half precision),
39
+ IEEE_binary32 (single precision),
40
+ IEEE_binary64 (double precision),
41
+ IEEE_binary80 (extended), IEEE_binary128 (quadruple precision) and
42
+ as big endian: IEEE_binary16_BE, etc.
43
+
44
+ <b>decimal</b> formats (using DPD):
45
+ IEEE_decimal32, IEEE_decimal64 and IEEE_decimal128.
46
+
47
+ <b>interchange binary & decimal</b> formats:
48
+ IEEE_binary256, IEEE_binary512, IEEE_binary1024, IEEE_decimal192, IEEE_decimal256.
49
+ Others can be defined with IEEE.interchange_binary and IEEE.interchange_decimal
50
+ (see the IEEE module).
42
51
 
43
- <b>IEEE 754r decimal</b> formats (using DPD): IEEE_DEC32, IEEE_DEC64 and IEEE_DEC128.
44
52
 
45
53
  ==Legacy
46
54
  Formats of historical interest, some of which are found
@@ -68,13 +76,14 @@ Formats used in the Intel 8051 by the C51 compiler:
68
76
 
69
77
 
70
78
  ==Calculators
71
- Formats used in HP SATURN based calculators (RPL): (SATURN, SATURN_X),
72
- Classic HP 10 digit calculators: (HP_CLASSIC).
73
-
79
+ Formats used in HP RPL calculators: (RPL, RPL_X),
80
+ HP-71B formats (HP71B, HP71B_X)
81
+ and classic HP 10 digit calculators: (HP_CLASSIC).
74
82
 
75
83
 
76
84
  =Using the pre-defined formats
77
85
 
86
+ require 'rubygems'
78
87
  require 'float-formats'
79
88
  include FltPnt
80
89
 
@@ -209,3 +218,110 @@ Nio has been developed by Javier Goizueta (mailto:javier@goizueta.info).
209
218
 
210
219
  You can contact me through Rubyforge:http://rubyforge.org/sendmessage.php?touser=25432
211
220
 
221
+ =References
222
+
223
+
224
+ [<i>Floating Point Representations.</i> C.B. Silio.]
225
+ http://www.ece.umd.edu/class/enpm607.S2000/fltngpt.pdf
226
+ Description of formats used in UNIVAC 1100, CDC 6600/7600, PDP-11, IEEE754, IBM360/370
227
+
228
+ [<i>Floating-Point Formats.</i> John Savard.]
229
+ http://www.quadibloc.com/comp/cp0201.htm
230
+ Description of formats used in VAX and PDP-11
231
+
232
+
233
+ ===IEEE754 binary formats
234
+ [<i>IEEE-754 References.</i> Christopher Vickery.]
235
+ http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html
236
+
237
+ [<i>What Every Computer Scientist Should Know About Floating-Point Arithmetic.</i> David Goldberg.]
238
+ http://docs.sun.com/source/806-3568/ncg_goldberg.html
239
+
240
+
241
+ ===DPD/IEEE754r decimal formats
242
+ [<i>Decimal Arithmetic Encoding. Strawman 4d.</i> Mike Cowlishaw.]
243
+ http://www2.hursley.ibm.com/decimal/decbits.pdf
244
+
245
+ [<i>A Summary of Densely Packed Decimal encoding.</i> Mike Cowlishaw.]
246
+ http://www2.hursley.ibm.com/decimal/DPDecimal.html
247
+
248
+ [<i>Packed Decimal Encoding IEEE-754-r.</i> J.H.M. Bonten.]
249
+ http://home.hetnet.nl/mr_1/81/jhm.bonten/computers/bitsandbytes/wordsizes/ibmpde.htm
250
+
251
+ [<i>DRAFT Standard for Floating-Point Arithmetic P754.</i> IEEE.]
252
+ http://www.validlab.com/754R/drafts/archive/2007-10-05.pdf
253
+
254
+
255
+
256
+ ===HP 10 digits calculators
257
+
258
+ [<i>HP CPU and Programming</i>. David G.Hicks.]
259
+ http://www.hpmuseum.org/techcpu.htm Description of calculator CPUs from the Museum of HP Calculators.
260
+ [<i>HP 35 ROM step by step.</i> Jacques Laporte]
261
+ http://www.jacques-laporte.org/HP35%20ROM.htm
262
+ Description of HP35 registers.
263
+ [<i>Scientific Pocket Calculator Extends Range of Built-In Functions.</i> Eric A. Evett, Paul J. McClellan, Joseph P. Tanzini.]
264
+ Hewlett Packard Journal 1983-05 pgs 27-28. Describes format used in HP-15C.
265
+
266
+
267
+ ===HP 12 digits calculators
268
+ [<i>Software Internal Design Specification Volume I For the HP-71</i>. Hewlett Packard.]
269
+ Available from http://www.hpmuseum.org/cd/cddesc.htm
270
+ [<i>RPL PROGRAMMING GUIDE</i>]
271
+ Excerpted from <i>RPL: A Mathematical Control Language</i>. by W. C. Wickes.
272
+ Available at http://www.hpcalc.org/details.php?id=1743
273
+
274
+ ===HP-3000
275
+ [<i>A Pocket Calculator for Computer Science Professionals.</i> Eric A. Evett.]
276
+ Hewlett Packard Journal 1983-05 pg 37. Describes format used in HP-3000
277
+
278
+ ===IBM
279
+ [<i>IBM Floating Point Architecture.</i> Wikipedia.]
280
+ http://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture
281
+ [<i>The IBM eServer z990 floating-point unit</i>. G. Gerwig, H. Wetter, E. M. Schwarz, J. Haess, C. A. Krygowski, B. M. Fleischer and M. Kroener.]
282
+ http://www.research.ibm.com/journal/rd/483/gerwig.html
283
+
284
+ ===MBF
285
+ [<i>Microsoft Knowledge Base Article 35826</i>]
286
+ http://support.microsoft.com/?scid=kb%3Ben-us%3B35826&x=17&y=12
287
+ [<i>Microsoft MBF2IEEE library</i>]
288
+ http://download.microsoft.com/download/vb30/install/1/win98/en-us/mbf2ieee.exe
289
+
290
+ ===Borland
291
+ [<i>An Overview of Floating Point Numbers.</i> Borland Developer Support Staff]
292
+
293
+ [<i>Pascal Floating-Point Page.</i> J R Stockton.]
294
+ http://www.merlyn.demon.co.uk/pas-real.htm
295
+
296
+ ===8-bit micros
297
+ This is the MS Basic format (BASIC09 for TRS-80 Color Computer, Dragon),
298
+ also used in the Sinclair Spectrum.
299
+
300
+ [<i>Numbers are followed by information not in listings</i>]
301
+ Sinclair User October 1983 http://www.sincuser.f9.co.uk/019/helplne.htm
302
+
303
+ [<i>Sinclair ZX Spectrum / Basic Programming.</i> Steven Vickers.]
304
+ Chapter 24. http://www.worldofspectrum.org/ZXBasicManual/zxmanchap24.html
305
+
306
+
307
+
308
+ ===Apple II
309
+ [<i>Floating Point Routines for the 6502</i> Roy Rankin and Steve Wozniak.]
310
+ Dr. Dobb's Journal, August 1976, pages 17-19.
311
+
312
+ ===C51
313
+ [<i>Advanced Development System</i> Franklin Software, Inc.]
314
+ http://www.fsinc.com/reference/html/com9anm.htm
315
+
316
+ ===CDC6600
317
+ [<i>CONTROL DATA 6400/6500/6600 COMPUTER SYSTEMS Reference Manual</i>]
318
+ Manuals available at http://bitsavers.org/
319
+
320
+
321
+ ===Cray
322
+ [<i>CRAY-1 COMPUTER SYSTEM Hardware Reference Manual</i>]
323
+ See pg 3-20 from 2240004 or pg 4-30 from HR-0808 or pg 4-21 from HP-0032.
324
+ Manuals available at http://bitsavers.org/
325
+
326
+ ===Wang 2200
327
+ [<i>Internal Floating Point Representation</i>] http://www.wang2200.org/fp_format.html
@@ -60,7 +60,7 @@ hoe = Hoe.new(GEM_NAME, VERS) do |p|
60
60
  # == Optional
61
61
  p.changes = p.paragraphs_of("History.txt", 0..1).join("\\n\\n")
62
62
  p.extra_deps = [
63
- ['nio', '>=0.2.0']
63
+ ['nio', '>=0.2.1']
64
64
  ]
65
65
 
66
66
  #p.spec_extras = {} # A hash of extra values to set in the gemspec.
@@ -7,5 +7,4 @@ require 'float-formats/formats'
7
7
 
8
8
  # FltPnt contains constants for common floating point formats.
9
9
  module FltPnt
10
-
11
- end
10
+ end
@@ -97,12 +97,12 @@ class FormatBase
97
97
  @max_encoded_exp = params[:max_encoded_exp] || @exponent_radix**@fields[:exponent]-1 # maximum regular exponent, encoded
98
98
  if @infinity
99
99
  @infinite_encoded_exp = @nan_encoded_exp || @max_encoded_exp if !@infinite_encoded_exp
100
- @max_encoded_exp = @infinite_encoded_exp - 1 if @infinite_encoded_exp<=@max_encoded_exp
100
+ @max_encoded_exp = @infinite_encoded_exp - 1 if @infinite_encoded_exp.kind_of?(Integer) && @infinite_encoded_exp<=@max_encoded_exp
101
101
  end
102
102
  @nan = params[:nan] || (@nan_encoded_exp ? true : false)
103
103
  if @nan
104
104
  @nan_encoded_exp = @infinite_encoded_exp || @max_encoded_exp if !@nan_encoded_exp
105
- @max_encoded_exp = @nan_encoded_exp - 1 if @nan_encoded_exp<=@max_encoded_exp
105
+ @max_encoded_exp = @nan_encoded_exp - 1 if @nan_encoded_exp.kind_of?(Integer) && @nan_encoded_exp<=@max_encoded_exp
106
106
  end
107
107
 
108
108
  @exponent_mode = params[:exponent_mode]
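The `kind_of?(Integer)` guard in this hunk matters because encoded exponents for infinity and NaN may now be given as Strings such as "F00" (see the HP-71B formats), and comparing a String with an Integer using `<=` raises an error in Ruby. A minimal sketch of the same idea follows; `cap_max_encoded_exp` is a hypothetical helper name and is not part of float-formats.

```ruby
# Hypothetical helper (not part of float-formats): lower the maximum regular
# encoded exponent only when the special exponent is an Integer that falls
# inside the regular range. String-valued exponents are left alone.
def cap_max_encoded_exp(max_encoded_exp, special_encoded_exp)
  if special_encoded_exp.kind_of?(Integer) && special_encoded_exp <= max_encoded_exp
    special_encoded_exp - 1
  else
    max_encoded_exp
  end
end
```

With this shape, an Integer special exponent still reserves the top of the range, while a value such as "F00" leaves the regular range unchanged.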
@@ -411,7 +411,15 @@ class FormatBase
411
411
  # Produce an encoded floating point value using a number defined by a
412
412
  # formatted text string (using Nio formats). Returns a Value.
413
413
  def from_fmt(txt,fmt=Nio::Fmt.default)
414
- neutral = fmt.nio_read_formatted(txt)
414
+ neutral = fmt.nio_read_formatted(txt)
415
+ if neutral.special?
416
+ case neutral.special
417
+ when :nan
418
+ return nan
419
+ when :inf
420
+ return infinity(neutral.sign=='-' ? minus_sign_value : 0)
421
+ end
422
+ end
415
423
  if neutral.rep_pos<neutral.digits.length
416
424
  nd = fmt.get_base==10 ? decimal_digits_necessary : (significand_digits*Math.log(radix)/Math.log(fmt.get_base)).ceil+1
417
425
  fmt = fmt.mode(:sig,nd)
@@ -531,7 +539,7 @@ class FormatBase
531
539
  # Computes the next adjacent floating point value.
532
540
  # Accepts either a Value or a byte String.
533
541
  # Returns a Value.
534
- def next_float(v)
542
+ def next_float(v)
535
543
  s,f,e = to_integral_sign_significand_exponent(v)
536
544
  return neg(prev_float(neg(v))) if s!=0 && e!=:zero
537
545
  s = switch_sign_value(s) if e==:zero && s!=0
@@ -584,6 +592,27 @@ class FormatBase
584
592
  from_integral_sign_significand_exponent(s,f,e)
585
593
  end
586
594
  end
595
+
596
+ # ulp (unit in the last place) according to the definition proposed by J.M. Muller in
597
+ # "On the definition of ulp(x)" INRIA No. 5504
598
+ def ulp(v)
599
+ sign,sig,exp = to_integral_sign_significand_exponent(v)
600
+
601
+ mnexp = radix_min_exp(:integral_significand)
602
+ mxexp = radix_max_exp(:integral_significand)
603
+ prec = significand_digits
604
+
605
+ if exp==:nan
606
+ return_bytes v
607
+ elsif exp==:infinity
608
+ from_integral_sign_significand_exponent(1,1,mxexp) # from_integral_sign_significand_exponent(1,fmt.radix_power(prec-1),mxexp-prec+1)
609
+ elsif exp==:zero || exp <= mnexp
610
+ min_value
611
+ else
612
+ exp -= 1 if sig==radix_power(prec-1) # minimum normalized significand
613
+ from_integral_sign_significand_exponent(1,1,exp)
614
+ end
615
+ end
587
616
 
588
617
  # Produce an encoded floating point value from the integral value
589
618
  # of the sign, significand and exponent.
@@ -607,6 +636,7 @@ class FormatBase
607
636
  when :normalized_significand
608
637
  m = Rational(m,radix_power(@significand_digits-1))
609
638
  end
639
+ [s,m,e]
610
640
  end
611
641
 
612
642
  # Returns the encoded value of a floating-point number as an integer
@@ -802,14 +832,17 @@ class FormatBase
802
832
  v_r = v-r
803
833
  z = from_integral_sign_significand_exponent(0,q,k)
804
834
  if r<v_r
805
- z
806
835
  elsif r>v_r
807
- z = next_float(z)
836
+ q += 1
808
837
  elsif (round_mode==:even && q.even?) || (round_mode==:zero)
809
- z
810
838
  else
811
- z = next_float(z)
839
+ q += 1
840
+ end
841
+ if q==radix_power(significand_digits)
842
+ q = radix_power(significand_digits-1)
843
+ k += 1
812
844
  end
845
+ from_integral_sign_significand_exponent(0,q,k)
813
846
  end
814
847
 
815
848
  def algM(f,e,round_mode,eb=10)
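The rounding change in the hunk above stops calling `next_float` on the encoded value and instead increments the integral significand, renormalizing when the increment overflows the precision. The sketch below isolates that step under stated assumptions: `round_and_renormalize` is a hypothetical name, and `r` and `v_r` are taken to be the remainder and its complement, as the variable names in the hunk suggest.

```ruby
# Hypothetical stand-alone version of the rounding step.
# q: truncated integral significand, k: exponent,
# r: remainder, v_r: divisor minus remainder.
def round_and_renormalize(q, k, r, v_r, radix:, prec:, round_mode: :even)
  round_up =
    if r < v_r
      false                                   # below the halfway point
    elsif r > v_r
      true                                    # above the halfway point
    elsif (round_mode == :even && q.even?) || round_mode == :zero
      false                                   # tie resolved downwards
    else
      true                                    # tie resolved upwards
    end
  q += 1 if round_up
  if q == radix**prec                         # significand overflowed the precision
    q = radix**(prec - 1)                     # back to the minimum normalized significand
    k += 1
  end
  [q, k]
end
```

For example, with three decimal digits a significand of 999 that rounds up becomes 100 with the exponent raised by one, which represents the same value as 1000 at the original exponent.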
@@ -935,7 +968,18 @@ class BCDFormat < DecimalFormatBase
935
968
  end
936
969
  # now we convert the nibble strings to numbers
937
970
  i = -1
938
- nibble_fields.collect{|ns| i+=1;bcd_field?(i) ? ns.reverse.to_i : ns.reverse.to_i(16)}
971
+ nibble_fields.collect do |ns|
972
+ i+=1
973
+ if bcd_field?(i)
974
+ if /\A\d+\Z/.match(ns)
975
+ ns.reverse.to_i
976
+ else
977
+ ns.reverse
978
+ end
979
+ else
980
+ ns.reverse.to_i(16)
981
+ end
982
+ end
939
983
  end
940
984
  def from_fields(*fields)
941
985
  fields = fields[0] if fields.size==1 and fields[0].kind_of?(Array)
@@ -943,8 +987,12 @@ class BCDFormat < DecimalFormatBase
943
987
  i = 0
944
988
  nibbles = ""
945
989
  for l in @field_lengths
946
- fmt = bcd_field?(i) ? 'd' : 'X'
947
- nibbles << ("%0#{l}#{fmt}" % fields[i]).reverse
990
+ f = fields[i]
991
+ unless f.kind_of?(String)
992
+ fmt = bcd_field?(i) ? 'd' : 'X'
993
+ f = "%0#{l}#{fmt}" % fields[i]
994
+ end
995
+ nibbles << f.reverse
948
996
  i += 1
949
997
  end
950
998
  v = hex_to_bytes(nibbles)
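The two BCDFormat hunks above let a BCD field carry non-decimal nibbles by passing it as a String. The round trip can be sketched as below. The helper names are hypothetical, and the nibble reversal the library applies for its internal ordering is deliberately left out.

```ruby
# Hypothetical helpers (not part of float-formats). Nibble reversal is omitted.

# Integer fields are zero-padded to the field length; String fields are
# passed through verbatim so they can hold hex digits such as "F01".
def field_to_nibbles(value, length, bcd = true)
  return value if value.kind_of?(String)
  format("%0#{length}#{bcd ? 'd' : 'X'}", value)
end

# BCD fields made only of decimal digits become Integers; any other BCD
# content is kept as a String. Non-BCD fields are read as hexadecimal.
def nibbles_to_field(nibbles, bcd = true)
  return nibbles.to_i(16) unless bcd
  nibbles =~ /\A\d+\z/ ? nibbles.to_i : nibbles
end
```

Keeping the non-decimal case as a String is what allows values like "F00" to be compared directly against `:infinite_encoded_exp` and `:nan_encoded_exp` settings given as Strings.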
@@ -964,15 +1012,15 @@ class BCDFormat < DecimalFormatBase
964
1012
  e = f[:exponent]
965
1013
  s = f[:sign]
966
1014
  m,e = neg_significand_exponent(s,m,e) if s%2==1
967
- if m==0
968
- # +-zero
969
- e = :zero
970
- elsif @infinite_encoded_exp && e==@infinite_encoded_exp && m==0
971
- # +-inifinity
1015
+ if @infinite_encoded_exp && e==@infinite_encoded_exp
1016
+ # +-infinity
972
1017
  e = :infinity
973
- elsif @nan_encoded_exp && e==@nan_encoded_exp && m!=0
1018
+ elsif @nan_encoded_exp && e==@nan_encoded_exp
974
1019
  # NaN
975
1020
  e = :nan
1021
+ elsif m==0
1022
+ # +-zero
1023
+ e = :zero
976
1024
  else
977
1025
  # normalized number
978
1026
  e = decode_exponent(e, :integral_significand)
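The reordering in this hunk checks the special exponents before the zero significand, so an infinity encoded with a zero significand is no longer decoded as zero. A minimal sketch of the new order, with a hypothetical `classify` helper that is not part of float-formats:

```ruby
# Hypothetical classifier mirroring the new branch order:
# special exponents are tested first, the zero significand afterwards.
def classify(e, m, infinite_exp = nil, nan_exp = nil)
  if infinite_exp && e == infinite_exp
    :infinity
  elsif nan_exp && e == nan_exp
    :nan
  elsif m == 0
    :zero
  else
    :finite
  end
end
```

With the previous order, the `m == 0` test came first and would have captured the infinity case.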
@@ -989,11 +1037,11 @@ class BCDFormat < DecimalFormatBase
989
1037
  e = @infinite_encoded_exp || radix_power(@fields[:exponent])-1
990
1038
  m = 0
991
1039
  elsif e==:nan
992
- e = @infinite_encoded_exp || radix_power(@fields[:exponent])-1
993
- s = minus_sign_value # ?
994
- m = radix_power(@significand_digits-2) if m==0
1040
+ e = @nan_encoded_exp || radix_power(@fields[:exponent])-1
1041
+ #s = minus_sign_value # ?
1042
+ #m = radix_power(@significand_digits-2) if m==0
995
1043
  elsif e==:denormal
996
- e = @denormal_encoded_exp
1044
+ e = @denormal_encoded_exp
997
1045
  else
998
1046
  # to do: try to adjust m to keep e in range if out of valid range
999
1047
  # to do: reduce m and adjust e if m too big
@@ -1173,7 +1221,6 @@ class DPDFormat < DecimalFormatBase
1173
1221
 
1174
1222
  def from_integral_sign_significand_exponent(s,m,e)
1175
1223
  msb = radix_power(@significand_digits-1)
1176
- #puts "DEC FROM #{s} #{m} #{e}"
1177
1224
  t = nil
1178
1225
  if e==:zero
1179
1226
  e = @zero_encoded_exp
@@ -1307,7 +1354,7 @@ class BinaryFormat < FieldsInBitsFormatBase
1307
1354
  e = @infinite_encoded_exp || radix_power(@fields[:exponent])-1
1308
1355
  m = 0
1309
1356
  elsif e==:nan
1310
- e = @infinite_encoded_exp || radix_power(@fields[:exponent])-1
1357
+ e = @nan_encoded_exp || radix_power(@fields[:exponent])-1
1311
1358
  s = minus_sign_value # ?
1312
1359
  m = radix_power(@significand_digits-2) if m==0
1313
1360
  elsif e==:denormal
@@ -1399,7 +1446,7 @@ class HexadecimalFormat < FieldsInBitsFormatBase
1399
1446
  e = @infinite_encoded_exp || radix_power(@fields[:exponent])-1
1400
1447
  m = 0
1401
1448
  elsif e==:nan
1402
- e = @infinite_encoded_exp || radix_power(@fields[:exponent])-1
1449
+ e = @nan_encoded_exp || radix_power(@fields[:exponent])-1
1403
1450
  s = minus_sign_value # ?
1404
1451
  m = radix_power(@significand_digits-2) if m==0
1405
1452
  elsif e==:denormal
@@ -1524,6 +1571,14 @@ class Value
1524
1571
  self.class.new(@fptype, @fptype.prev_float(@value))
1525
1572
  end
1526
1573
 
1574
+ def neg
1575
+ @fptype.neg(@value)
1576
+ end
1577
+
1578
+ def ulp
1579
+ @fptype.ulp(@value)
1580
+ end
1581
+
1527
1582
  def fp_format
1528
1583
  @fptype
1529
1584
  end
@@ -14,84 +14,114 @@ module FltPnt
14
14
 
15
15
  # Floating Point Format Definitions ==========================================
16
16
 
17
+ # Helper methods to define IEEE 754r formats
18
+ module IEEE
19
+ # Define an IEEE binary format by passing parameters in a hash;
20
+ # :significand and :exponent are used to define the fields,
21
+ # optional parameters may follow.
22
+ def self.binary(parameters)
23
+ significand_bits = parameters[:significand]
24
+ exponent_bits = parameters[:exponent]
25
+ BinaryFormat.new({
26
+ :fields=>[:significand,significand_bits,:exponent,exponent_bits,:sign,1],
27
+ :bias=>2**(exponent_bits-1)-1, :bias_mode=>:normalized_significand,
28
+ :hidden_bit=>true,
29
+ :endianness=>:little_endian, :round=>:even,
30
+ :gradual_underflow=>true, :infinity=>true, :nan=>true
31
+ }.merge(parameters))
32
+ end
33
+
34
+ # Define an IEEE binary interchange format given its width in bits
35
+ def self.interchange_binary(width_in_bits, options={})
36
+ raise "Invalid IEEE binary interchange format definition: size (#{width_in_bits}) is not valid" unless (width_in_bits%32)==0 && (width_in_bits/32)>=4
37
+ p = width_in_bits - (4*Math.log(width_in_bits)/Math.log(2)).round.to_i + 13
38
+ binary({:significand=>p-1, :exponent=>width_in_bits-p}.merge(options))
39
+ end
40
+
41
+ # Define an IEEE decimal format by passing parameters in a hash;
42
+ # :significand and :exponent are used to define the fields,
43
+ # optional parameters may follow.
44
+ def self.decimal(parameters)
45
+ significand_continuation_bits = parameters[:significand]
46
+ exponent_continuation_bits = parameters[:exponent]
47
+ DPDFormat.new({
48
+ :fields=>[:significand_continuation,significand_continuation_bits,:exponent_continuation,exponent_continuation_bits,:combination,5,:sign,1],
49
+ :endianness=>:big_endian,
50
+ :gradual_underflow=>true, :infinity=>true, :nan=>true
51
+ }.merge(parameters))
52
+ end
53
+
54
+ # Define an IEEE decimal interchange format given its width in bits
55
+ def self.interchange_decimal(width_in_bits, options={})
56
+ raise "Invalid IEEE decimal interchange format definition: size (#{width_in_bits}) is not valid" unless (width_in_bits%32)==0
57
+ p = width_in_bits*9/32 - 2
58
+ t = (p-1)*10/3
59
+ w = width_in_bits - t - 6
60
+ decimal({:significand=>t, :exponent=>w}.merge(options))
61
+ end
62
+
63
+ end
64
+
17
65
  # IEEE 754 binary types, as stored in little endian architectures such as Intel, Alpha
18
66
 
19
- IEEE_SINGLE = BinaryFormat.new(
20
- :fields=>[:significand,23,:exponent,8,:sign,1],
21
- :bias=>127, :bias_mode=>:normalized_significand,
22
- :hidden_bit=>true,
23
- :endianness=>:little_endian, :round=>:even,
24
- :gradual_underflow=>true, :infinity=>true, :nan=>true
25
- )
26
- IEEE_DOUBLE = BinaryFormat.new(
27
- :fields=>[:significand,52,:exponent,11,:sign,1],
28
- :bias=>1023, :bias_mode=>:normalized_significand,
29
- :hidden_bit=>true,
30
- :endianness=>:little_endian, :round=>:even,
31
- :gradual_underflow=>true, :infinity=>true, :nan=>true
32
- )
33
- IEEE_EXTENDED = BinaryFormat.new(
34
- :fields=>[:significand,64,:exponent,15,:sign,1],
35
- :bias=>16383, :bias_mode=>:normalized_significand,
36
- :hidden_bit=>false, :min_encoded_exp=>1, :round=>:even,
37
- :endianness=>:little_endian,
38
- :gradual_underflow=>true, :infinity=>true, :nan=>true
39
- )
40
- IEEE_128 = BinaryFormat.new(
41
- :fields=>[:significand,112,:exponent,15,:sign,1],
42
- :bias=>16383, :bias_mode=>:normalized_significand,
43
- :hidden_bit=>false, :min_encoded_exp=>1, :round=>:even,
44
- :endianness=>:little_endian,
45
- :gradual_underflow=>true, :infinity=>true, :nan=>true
46
- )
67
+ IEEE_binary16 = IEEE.binary(:significand=>10, :exponent=>5)
68
+ IEEE_binary32 = IEEE.binary(:significand=>23,:exponent=>8)
69
+ IEEE_binary64 = IEEE.binary(:significand=>52,:exponent=>11)
70
+ IEEE_binary80 = IEEE.binary(:significand=>64,:exponent=>15, :hidden_bit=>false, :min_encoded_exp=>1)
71
+ IEEE_binary128 = IEEE.binary(:significand=>112,:exponent=>15)
72
+
47
73
 
48
74
  # IEEE 754 in big endian order (SPARC, Motorola 68k, PowerPC)
49
75
 
50
- IEEE_S_BE = BinaryFormat.new(
51
- :fields=>[:significand,23,:exponent,8,:sign,1],
52
- :bias=>127, :bias_mode=>:normalized_significand,
53
- :hidden_bit=>true,
54
- :endianness=>:big_endian,
55
- :gradual_underflow=>true, :infinity=>true, :nan=>true)
56
- IEEE_D_BE = BinaryFormat.new(
57
- :fields=>[:significand,52,:exponent,11,:sign,1],
58
- :bias=>1023, :bias_mode=>:normalized_significand,
59
- :hidden_bit=>true, :round=>:even,
60
- :endianness=>:big_endian,
61
- :gradual_underflow=>true, :infinity=>true, :nan=>true
62
- )
63
- IEEE_X_BE = BinaryFormat.new(
64
- :fields=>[:significand,64,:exponent,15,:sign,1],
65
- :bias=>16383, :bias_mode=>:normalized_significand,
66
- :hidden_bit=>false, :round=>:even,
67
- :endianness=>:big_endian,
68
- :gradual_underflow=>true, :infinity=>true, :nan=>true
69
- )
70
- IEEE_128_BE = BinaryFormat.new(
71
- :fields=>[:significand,112,:exponent,15,:sign,1],
72
- :bias=>16383, :bias_mode=>:normalized_significand,
73
- :hidden_bit=>false, :min_encoded_exp=>1, :round=>:even,
74
- :endianness=>:big_endian,
75
- :gradual_underflow=>true, :infinity=>true, :nan=>true
76
- )
76
+ IEEE_binary16_BE = IEEE.binary(:significand=>10, :exponent=>5, :endianness=>:big_endian)
77
+ IEEE_binary32_BE = IEEE.binary(:significand=>23,:exponent=>8, :endianness=>:big_endian)
78
+ IEEE_binary64_BE = IEEE.binary(:significand=>52,:exponent=>11, :endianness=>:big_endian)
79
+ IEEE_binary80_BE = IEEE.binary(:significand=>64,:exponent=>15, :endianness=>:big_endian, :hidden_bit=>false, :min_encoded_exp=>1)
80
+ IEEE_binary128_BE = IEEE.binary(:significand=>112,:exponent=>15, :endianness=>:big_endian)
81
+
82
+
83
+ # some IEEE 754r interchange binary formats
84
+
85
+ IEEE_binary256 = IEEE.interchange_binary(256)
86
+ IEEE_binary512 = IEEE.interchange_binary(512)
87
+ IEEE_binary1024 = IEEE.interchange_binary(1024)
88
+ IEEE_binary256_BE = IEEE.interchange_binary(256, :endianness=>:big_endian)
89
+ IEEE_binary512_BE = IEEE.interchange_binary(512, :endianness=>:big_endian)
90
+ IEEE_binary1024_BE = IEEE.interchange_binary(1024, :endianness=>:big_endian)
91
+
92
+
93
+ # old names
94
+ IEEE_binaryx = IEEE_binary80
95
+ IEEE_HALF = IEEE_binary16
96
+ IEEE_SINGLE = IEEE_binary32
97
+ IEEE_DOUBLE = IEEE_binary64
98
+ IEEE_EXTENDED = IEEE_binary80
99
+ IEEE_QUAD = IEEE_binary128
100
+ IEEE_128 = IEEE_binary128
+ IEEE_H_BE = IEEE_binary16_BE
101
+ IEEE_S_BE = IEEE_binary32_BE
102
+ IEEE_D_BE = IEEE_binary64_BE
103
+ IEEE_X_BE = IEEE_binary80_BE
104
+ IEEE_128_BE = IEEE_binary128_BE
105
+ IEEE_Q_BE = IEEE_binary128_BE
106
+
107
+
77
108
  # Decimal IEEE 754r formats
78
109
 
79
- IEEE_DEC32 = DPDFormat.new(
80
- :fields=>[:significand_continuation,20,:exponent_continuation,6,:combination,5,:sign,1],
81
- :endianness=>:big_endian,
82
- :gradual_underflow=>true, :infinity=>true, :nan=>true
83
- )
84
- IEEE_DEC64 = DPDFormat.new(
85
- :fields=>[:significand_continuation,50,:exponent_continuation,8,:combination,5,:sign,1],
86
- :endianness=>:big_endian,
87
- :gradual_underflow=>true, :infinity=>true, :nan=>true
88
- )
89
- IEEE_DEC128 = DPDFormat.new(
90
- :fields=>[:significand_continuation,110,:exponent_continuation,12,:combination,5,:sign,1],
91
- :endianness=>:big_endian,
92
- :gradual_underflow=>true, :infinity=>true, :nan=>true
93
- )
110
+ IEEE_decimal32 = IEEE.decimal(:significand=>20, :exponent=>6)
111
+ IEEE_decimal64 = IEEE.decimal(:significand=>50, :exponent=>8)
112
+ IEEE_decimal128 = IEEE.decimal(:significand=>110, :exponent=>12)
113
+
114
+ # some IEEE 754r interchange decimal formats
115
+
116
+ IEEE_decimal96 = IEEE.interchange_decimal(96)
117
+ IEEE_decimal192 = IEEE.interchange_decimal(192)
118
+ IEEE_decimal256 = IEEE.interchange_decimal(256)
94
119
 
120
+ # old names
121
+
122
+ IEEE_DEC32 = IEEE_decimal32
123
+ IEEE_DEC64 = IEEE_decimal64
124
+ IEEE_DEC128 = IEEE_decimal128
95
125
 
96
126
  # Excess 128 used by Microsoft Basic in 8-bit micros, Spectrum, ...
97
127
 
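The width arithmetic in `IEEE.interchange_binary` and `IEEE.interchange_decimal` above can be checked against the fixed formats defined in the same hunk. The sketch below repeats only that arithmetic in stand-alone form; the helper names are hypothetical and the returned hashes are an illustration, not the library's format objects.

```ruby
# Hypothetical helpers repeating the field-width arithmetic of the hunk above.

def interchange_binary_fields(width_in_bits)
  unless width_in_bits % 32 == 0 && width_in_bits >= 128
    raise ArgumentError, "invalid binary interchange width: #{width_in_bits}"
  end
  # precision in bits, including the hidden bit
  p = width_in_bits - (4 * Math.log2(width_in_bits)).round + 13
  exponent_bits = width_in_bits - p
  { :significand => p - 1,
    :exponent    => exponent_bits,
    :bias        => 2**(exponent_bits - 1) - 1 }
end

def interchange_decimal_fields(width_in_bits)
  unless width_in_bits % 32 == 0
    raise ArgumentError, "invalid decimal interchange width: #{width_in_bits}"
  end
  p = width_in_bits * 9 / 32 - 2   # precision in decimal digits
  t = (p - 1) * 10 / 3             # continuation bits: 10 bits per 3 digits (DPD)
  { :significand => t, :exponent => width_in_bits - t - 6 }
end
```

For a width of 128 bits the binary helper yields 112 significand bits and 15 exponent bits, matching the IEEE_binary128 definition; for 64 bits the decimal helper yields 50 and 8, matching IEEE_decimal64.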
@@ -202,6 +232,7 @@ PDP11_D = BinaryFormat.new(
202
232
 
203
233
 
204
234
  # Format used in HP Saturn-based RPL calculators (HP48,HP49,HP50, also HP32s, HP42s --which use RPL internally)
235
+ # (these formats are not used in the HP-71B which is a Saturn, non-RPL machine)
205
236
 
206
237
  SATURN = BCDFormat.new(
207
238
  :fields=>[:prolog,5,:exponent,3,:significand,12,:sign,1],
@@ -217,9 +248,40 @@ SATURN_X = BCDFormat.new(
217
248
  :endianness=>:little_endian, :round=>:even,
218
249
  :gradual_underflow=>false, :infinity=>false, :nan=>false
219
250
  )
220
-
221
- # Format used in classic HP calculators (HP-35, ... HP-15C) (endianess is unknown)
222
-
251
+
252
+
253
+ RPL = SATURN
254
+ RPL_X = SATURN_X
255
+
256
+ # SATURN HP-71B (IEEE, NON-RPL) formats
257
+
258
+ # HP-71B REAL format (12-form) which is stored in a single register
259
+ HP71B = BCDFormat.new(
260
+ :fields=>[:exponent,3,:significand,12,:sign,1],
261
+ :exponent_mode=>:radix_complement,
262
+ :endianness=>:little_endian, :round=>:even,
263
+ :gradual_underflow=>true, :infinity=>true, :nan=>true,
264
+ :denormal_encoded_exp=>501,
265
+ :nan_encoded_exp=>"F01", # signaling NaN is F02
266
+ :infinite_encoded_exp=>"F00"
267
+ )
268
+
269
+ # HP-71B internal 15-digit format (15-form), stored in a pair of registers
270
+ # we use here a little-endian order for the registers, otherwise the
271
+ # definition would be [:significand,15,:unused1,1,:exponent,5,:unused2,10,:sign,1]
272
+ HP71B_X = BCDFormat.new(
273
+ :fields=>[:exponent,5,:unused2,10,:sign,1, :significand,15,:unused1,1],
274
+ :exponent_mode=>:radix_complement,
275
+ :endianness=>:little_endian, :round=>:even,
276
+ :gradual_underflow=>false, :infinity=>true, :nan=>true,
277
+ :nan_encoded_exp=>"00F01",
278
+ :infinite_encoded_exp=>"00F00"
279
+ )
280
+
281
+ # Format used in classic HP calculators (HP-35, ... HP-15C)
282
+ # Endianness is indeterminate, since these machines have named registers that
283
+ # hold a floating-point value in a single 56-bit word.
284
+ # (But intra-word field/nibble addressing is little-endian)
223
285
  HP_CLASSIC = BCDFormat.new(
224
286
  :fields=>[:exponent,3,:significand,10,:sign,1],
225
287
  :exponent_mode=>:radix_complement,
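The calculator formats above store the exponent with `:exponent_mode=>:radix_complement`. Assuming the usual ten's-complement convention for a decimal field, the mapping can be sketched as follows. The helper names are hypothetical, and the valid exponent range of each format is not modelled.

```ruby
# Hypothetical helpers illustrating a ten's-complement exponent field.

# Negative exponents wrap around the modulus 10**digits.
def tens_complement_encode(exponent, digits)
  exponent % (10**digits)
end

# Field values in the upper half of the range stand for negative exponents.
def tens_complement_decode(field, digits)
  modulus = 10**digits
  field >= modulus / 2 ? field - modulus : field
end
```

Under this convention a three-digit field of 999 stands for -1, and 501 (the value given as `:denormal_encoded_exp` for HP71B) decodes to -499.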
@@ -306,15 +368,16 @@ CDC_SINGLE = CDCFLoatingPoint.new(
306
368
  :gradual_underflow=>false, :infinity=>false, :nan=>false
307
369
  )
308
370
 
309
- # the CDC_DOUBLE can be splitted in two CDC_SINGLE values:
371
+ # The CDC_DOUBLE can be split into two CDC_SINGLE values:
310
372
  # get_bitfields(v,[CDC_SINGLE.total_bits]*2,CDC_DOUBLE.endianness).collect{|x| int_to_bytes(x,0,CDC_SINGLE.endianness)}
311
373
  # and the value of the double is the sum of the values of the singles.
312
- # unlike the single, we must use :fractional_significand mode because with :integral_significand
374
+ # Unlike the single, we must use :fractional_significand mode because with :integral_significand
313
375
  # the exponent would refer to the whole significand, but it must refer only to the most significant half.
314
376
  # we subtract the number of bits in the single from the bias and exponent because of this change,
315
377
  # and add 48 to the min_exponent to avoid the exponent of the low order single to be out of range
316
378
  # because the exponent of the low order single is adjusted to
317
379
  # the position of its digits by substracting 48 from the high order exponent
380
+ # when its exponent would be out of range
318
381
  # Note that when computing the low order exponent with the fields handler we must take into account the sign
319
382
  # because for negative numbers all the fields are one-complemented.
320
383
  CDC_DOUBLE= CDCFLoatingPoint.new(
@@ -353,17 +416,15 @@ UNIVAC_DOUBLE = BinaryFormat.new(
353
416
  :gradual_underflow=>false, :infinity=>false, :nan=>false
354
417
  )
355
418
 
356
- # Sofware floating point implementatin for the Apple II (6502)
357
- # the significand & sign are a single field in two's commplement
358
419
 
359
- APPLE = BinaryFormat.new(
420
+ # :stopdoc: # the next definition is not handled correctly by RDoc
421
+ APPLE_INSANE = BinaryFormat.new(
360
422
  :fields=>[:significand,23,:sign,1,:exponent,8],
361
423
  :bias=>128, :bias_mode=>:normalized_significand,
362
424
  :hidden_bit=>false, :min_encoded_exp=>0,
363
425
  :neg_mode=>:radix_complement_significand,
364
426
  :endianness=>:big_endian,
365
427
  :gradual_underflow=>true, :infinity=>false, :nan=>false) { |fp|
366
-
367
428
  # This needs a peculiar treatment for the negative values, which not simply use two's complement
368
429
  # but also avoid having the sign and msb of the significand equal.
369
430
  # Note that here we have a separate sign bit, but it can also be considered as the msb of the significand
@@ -387,10 +448,14 @@ APPLE = BinaryFormat.new(
387
448
  #puts ""
388
449
  [f,e]
389
450
  end
390
-
391
451
  }
452
+ # :startdoc:
392
453
 
393
454
 
455
+ # Software floating point implementation for the Apple II (6502)
456
+ # the significand & sign are a single field in two's complement
457
+ APPLE = APPLE_INSANE
458
+
394
459
  # Wang 2200 Basic Decimal floating point
395
460
  WANG2200 = BCDFormat.new(
396
461
  :fields=>[:significand,13,:exponent,2,:signs,1],
@@ -572,7 +637,31 @@ C51_BCD_LONG_DOUBLE = C51BCDFloatingPoint.new(
572
637
  :zero_encoded_exp=>0, :min_encoded_exp=>0,:max_encoded_exp=>127
573
638
  )
574
639
 
575
-
640
+ =begin
641
+ # Note:
642
+ # One could be tempted to define a double-double type as:
643
+ IEEE_DOUBLE_DOUBLE = BinaryFormat.new(
644
+ :fields=>[:significand,52,:lo_exponent,11,:lo_sign,1,:significand,52,:exponent,11,:sign,1],
645
+ :fields_handler=>lambda{|fields|
646
+ fields[2] = fields[5];
647
+ bits,max_exp = 53,2047
648
+ if fields[4]>bits && fields[4]<max_exp
649
+ fields[1] = fields[4] - bits
650
+ else # 0, denormals, small numbers, NaN, Infinities
651
+ fields[0] = fields[1] = 0
652
+ end
653
+ },
654
+ :bias=>1023, :bias_mode=>:normalized_significand,
655
+ :hidden_bit=>true,
656
+ :endianness=>:little_endian, :round=>:even,
657
+ :gradual_underflow=>true, :infinity=>true, :nan=>true
658
+ )
659
+ # But this is incorrect since there's a hidden bit in the low double too and it must be normalized.
660
+ # In general the halves of the significand need not be adjacent, they
661
+ # can have exponents with a separation higher than 53; (in fact the minimum separation seems to be 54)
662
+ # and they can have different signs, too;
663
+ # double-double is too tricky to be supported by this package.
664
+ =end
576
665
 
577
666
 
578
667
 
@@ -99,6 +99,27 @@ class Float
99
99
 
100
100
  # Maximum significand == Math.ldexp(Math.ldexp(1,Float::MANT_DIG)-1,-Float::MANT_DIG)
101
101
  MAX_F = Math.frexp(Float::MAX)[0] == Math.ldexp(Math.ldexp(1,Float::MANT_DIG)-1,-Float::MANT_DIG)
102
+
103
+ # ulp (unit in the last place) according to the definition proposed by J.M. Muller in
104
+ # "On the definition of ulp(x)" INRIA No. 5504
105
+ def ulp
106
+ return self if nan?
107
+ x = abs
108
+ if x < Math.ldexp(1,MIN_EXP) # x < RADIX*MIN_N
109
+ res = Math.ldexp(1,MIN_EXP-MANT_DIG) # res = MIN_D
110
+ elsif x > Math.ldexp(1-Math.ldexp(1,-MANT_DIG),MAX_EXP) # x > MAX
111
+ res = Math.ldexp(1,MAX_EXP-MANT_DIG) # res = MAX - MAX.prev
112
+ else
113
+ f,e = Math.frexp(x)
114
+ if f==Math.ldexp(1,-1)
115
+ res = Math.ldexp(1,e-MANT_DIG-1)
116
+ else
117
+ res = Math.ldexp(1,e-MANT_DIG)
118
+ end
119
+ end
120
+ res
121
+ end
122
+
102
123
 
103
124
  end
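The `ulp` method added in this hunk can be exercised outside the gem with a small standalone sketch. The function name `ulp_sketch` is hypothetical and is not part of float-formats; the sketch assumes the native Float is an IEEE binary64 value (radix 2, 53-digit significand) and a Ruby version that provides `Float#next_float` and `Float#prev_float` for the comparisons in the usage lines.

```ruby
# Standalone sketch of the ulp computation shown in the hunk above.
# ulp_sketch is a hypothetical name, not part of float-formats.
def ulp_sketch(x)
  return x if x.nan?
  x = x.abs
  if x < Math.ldexp(1, Float::MIN_EXP)
    # Denormals and the first normal binade share the smallest spacing.
    Math.ldexp(1, Float::MIN_EXP - Float::MANT_DIG)
  elsif x > Float::MAX
    # Infinity: use the spacing just below Float::MAX.
    Math.ldexp(1, Float::MAX_EXP - Float::MANT_DIG)
  else
    f, e = Math.frexp(x)
    if f == 0.5
      # Exact power of the radix: use the finer spacing of the binade below.
      Math.ldexp(1, e - Float::MANT_DIG - 1)
    else
      Math.ldexp(1, e - Float::MANT_DIG)
    end
  end
end

# Usage: compare with the distance to the neighbouring floats.
puts ulp_sketch(1.0) == 1.0 - 1.0.prev_float
puts ulp_sketch(1.5) == 1.0.next_float - 1.0
```

At an exact power of two the result is the gap to the previous float, not to the next one, which is what `check_ulp_around` in the test suite of this diff asserts.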
104
125
 
@@ -127,6 +148,89 @@ def float_bin(x)
127
148
  x.nio_write(Nio::Fmt.mode(:sci,:exact).base(2))
128
149
  end
129
150
 
151
+ # decompose a float into a signed integer significand and exponent (base Float::RADIX)
152
+ def float_to_integral_significand_exponent(x)
153
+ s,e = Math.frexp(x)
154
+ [Math.ldexp(s,Float::MANT_DIG).to_i,e-Float::MANT_DIG]
155
+ end
156
+
157
+ # compose float from significand and exponent
158
+ def float_from_integral_significand_exponent(s,e)
159
+ Math.ldexp(s,e)
160
+ end
161
+
162
+ def float_to_integral_sign_significand_exponent(x)
163
+ if x==0.0
164
+ sign = (1/x<0) ? -1 : +1
165
+ else
166
+ sign = x<0 ? -1 : +1
167
+ end
168
+ x = -x if sign<0
169
+ s,e = Math.frexp(x)
170
+ [sign,Math.ldexp(s,Float::MANT_DIG).to_i,e-Float::MANT_DIG]
171
+ end
172
+
173
+ def float_from_integral_sign_significand_exponent(sgn,s,e)
174
+ f = Math.ldexp(s,e)
175
+ f = -f if sgn<0
176
+ f
177
+ end
178
+
179
+ # convert a float to C99's hexadecimal notation
180
+ def hex_from_float(v)
181
+ if Float::RADIX==2
182
+ sgn,s,e = float_to_integral_sign_significand_exponent(v)
183
+ else
184
+ txt = v.nio_write(Fmt.base(2).sep('.')).upcase
185
+ p = txt.index('E')
186
+ exp = 0
187
+ if p
188
+ exp = txt[p+1..-1].to_i
189
+ txt = txt[0...p]
190
+ end
191
+ p = txt.index('.')
192
+ if p
193
+ exp -= (txt.size-p-1)
194
+ txt.tr!('.','')
195
+ end
196
+ s = txt.to_i(2)
197
+ e = exp
198
+ end
199
+ "0x#{sgn<0 ? '-' : ''}#{s.to_s(16)}p#{e}"
200
+ end
201
+
202
+ # convert a string formatted in C99's hexadecimal notation to a float
203
+ def hex_to_float(txt)
204
+ txt = txt.strip.upcase
205
+ txt = txt[2..-1] if txt[0,2]=='0X'
206
+ p = txt.index('P')
207
+ if p
208
+ exp = txt[p+1..-1].to_i
209
+ txt = txt[0...p]
210
+ else
211
+ exp = 0
212
+ end
213
+ p = txt.index('.')
214
+ if p
215
+ exp -= (txt.size-p-1)*4
216
+ txt.tr!('.','')
217
+ end
218
+ if Float::RADIX==2
219
+ v = txt.to_i(16)
220
+ if v==0 && txt.include?('-')
221
+ sign = -1
222
+ elsif v<0
223
+ sign = -1
224
+ v = -v
225
+ else
226
+ sign = +1
227
+ end
228
+ float_from_integral_sign_significand_exponent(sign,v,exp)
229
+ else
230
+ (txt.to_i(16)*(2**exp)).to_f
231
+ end
232
+ end
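The radix-2 path of `hex_from_float` and `hex_to_float` amounts to splitting a float into sign, integral significand and binary exponent, then rebuilding it with `Math.ldexp`. The following is a minimal standalone sketch of that round trip. All three function names are hypothetical, it handles finite values only, and it writes the sign before the `0x` prefix as C99 does (the method in this hunk emits it after the prefix).

```ruby
# Hypothetical helpers, not part of float-formats; finite values only.
def sketch_decompose(x)
  # 1.0 / -0.0 is -Infinity, which lets us detect negative zero.
  sign = (x < 0 || (x == 0 && 1.0 / x < 0)) ? -1 : +1
  frac, exp = Math.frexp(x.abs)
  [sign, Math.ldexp(frac, Float::MANT_DIG).to_i, exp - Float::MANT_DIG]
end

def sketch_to_hex(x)
  sign, sig, exp = sketch_decompose(x)
  "#{sign < 0 ? '-' : ''}0x#{sig.to_s(16)}p#{exp}"
end

def sketch_from_hex(text)
  # Accepts only the integral-significand form produced by sketch_to_hex.
  m = /\A([+-]?)0x([0-9a-f]+)p([+-]?\d+)\z/i.match(text.strip)
  raise ArgumentError, "unsupported literal: #{text}" unless m
  value = Math.ldexp(m[2].to_i(16), m[3].to_i)
  m[1] == '-' ? -value : value
end

puts sketch_to_hex(1.0 + Float::EPSILON)
puts sketch_from_hex(sketch_to_hex(-1.0e-5)) == -1.0e-5
```

Because the significand is kept as an integer, the conversion is exact in both directions; no decimal rounding is involved.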
233
+
130
234
  # ===== IEEE types =====================================================================================
131
235
 
132
236
  # generate a SGL value stored in a byte string given a decimal value formatted as text
@@ -2,7 +2,7 @@ module FltPnt
2
2
  module VERSION #:nodoc:
3
3
  MAJOR = 0
4
4
  MINOR = 1
5
- TINY = 0
5
+ TINY = 1
6
6
 
7
7
  STRING = [MAJOR, MINOR, TINY].join('.')
8
8
  end
@@ -484,60 +484,11 @@ XS256_DOUBLE:
484
484
  - "65536.0": 44 00 00 00 00 00 00 00
485
485
  - "-65536.0": C4 00 00 00 00 00 00 00
486
486
  - "-7.50": C0 B8 00 00 00 00 00 00
487
- - "8.6361685550944451E-78": 00 00 00 00 00 00 00 00
487
+ - "8.6361685550944451E-78": 00 00 00 00 00 00 00 01
488
488
  - 1.15792089237316192E77: 7F FF FF FF FF FF FF FF
489
489
  - "5.5511151231257827E-17": 32 80 00 00 00 00 00 00
490
490
  - "2.77555756156289135E-17": 32 40 00 00 00 00 00 00
491
491
  base: :bytes
492
- IEEE_128:
493
- parameters:
494
- - total_bits: 128
495
- - radix: 2
496
- - significand_digits: 112
497
- - radix_min_exp: -16382
498
- - radix_max_exp: 16383
499
- - decimal_digits_stored: 33
500
- - decimal_digits_necessary: 35
501
- - decimal_min_exp: -4931
502
- - decimal_max_exp: 4932
503
- values:
504
- - Rational(1, 3): AB AA AA AA AA AA AA AA AA AA AA AA AA AA FD 3F
505
- - Rational(1, 10): CD CC CC CC CC CC CC CC CC CC CC CC CC CC FB 3F
506
- - Rational(2, 3): AB AA AA AA AA AA AA AA AA AA AA AA AA AA FE 3F
507
- - Rational(1, 1024): 00 00 00 00 00 00 00 00 00 00 00 00 00 80 F5 3F
508
- - Rational(1, 1000): 31 08 AC 1C 5A 64 3B DF 4F 8D 97 6E 12 83 F5 3F
509
- - Rational(1024, 1): 00 00 00 00 00 00 00 00 00 00 00 00 00 80 09 40
510
- - Rational(1024, 1): 00 00 00 00 00 00 00 00 00 00 00 00 00 80 09 40
511
- special:
512
- - min_value: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
513
- - min_normalized_value: 00 00 00 00 00 00 00 00 00 00 00 00 00 80 01 00
514
- - max_value: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FE 7F
515
- - epsilon: 00 00 00 00 00 00 00 00 00 00 00 00 00 80 90 3F
516
- - strict_epsilon: 01 00 00 00 00 00 00 00 00 00 00 00 00 80 8F 3F
517
- numerals:
518
- - "+0": 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
519
- - "-0": 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80
520
- - "+1": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FF 3F
521
- - "-1": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FF BF
522
- - "+0.1": CD CC CC CC CC CC CC CC CC CC CC CC CC CC FB 3F
523
- - "-0.1": CD CC CC CC CC CC CC CC CC CC CC CC CC CC FB BF
524
- - "0.5": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FE 3F
525
- - "-0.5": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FE BF
526
- - "29.2": 9A 99 99 99 99 99 99 99 99 99 99 99 99 E9 03 40
527
- - "-29.2": 9A 99 99 99 99 99 99 99 99 99 99 99 99 E9 03 C0
528
- - "0.03125": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FA 3F
529
- - "-0.03125": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FA BF
530
- - "-0.3125": 00 00 00 00 00 00 00 00 00 00 00 00 00 A0 FD BF
531
- - 1.234E2: CD CC CC CC CC CC CC CC CC CC CC CC CC F6 05 40
532
- - "-1.234E-6": 90 0F B2 E0 09 B3 8C B1 6C 16 CA EA 9F A5 EB BF
533
- - "65536.0": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 0F 40
534
- - "-65536.0": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 0F C0
535
- - "-7.50": 00 00 00 00 00 00 00 00 00 00 00 00 00 F0 01 C0
536
- - "3.3621031431120935062626778173217526E-4932": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 01 00
537
- - 1.1897314953572317650857593266280069E4932: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FE 7F
538
- - "3.8518598887744717061119558851698546E-34": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 90 3F
539
- - "1.9259299443872358530559779425849281E-34": 01 00 00 00 00 00 00 00 00 00 00 00 00 80 8F 3F
540
- base: :bytes
541
492
  IEEE_DOUBLE:
542
493
  parameters:
543
494
  - total_bits: 64
@@ -680,7 +631,7 @@ PDP11_F:
680
631
  - "65536.0": 80 48 00 00
681
632
  - "-65536.0": 80 C8 00 00
682
633
  - "-7.50": F0 C1 00 00
683
- - "1.46936811E-39": 00 00 00 00
634
+ - "1.46936811E-39": 00 00 01 00
684
635
  - 1.70141173E38: FF 7F FF FF
685
636
  - "1.1920929E-7": 00 35 00 00
686
637
  - "1.1920929E-7": 00 35 00 00
@@ -1121,7 +1072,7 @@ XS256:
1121
1072
  - "65536.0": 44 00 00 00
1122
1073
  - "-65536.0": C4 00 00 00
1123
1074
  - "-7.50": C0 B8 00 00
1124
- - "8.6361706E-78": 00 00 00 00
1075
+ - "8.6361706E-78": 00 00 00 01
1125
1076
  - 1.1579208E77: 7F FF FF FF
1126
1077
  - "2.3841858E-7": 3A 80 00 00
1127
1078
  - "1.1920929E-7": 3A 40 00 00
@@ -1660,7 +1611,7 @@ BORLAND48:
1660
1611
  - "65536.0": 91 00 00 00 00 00
1661
1612
  - "-65536.0": 91 00 00 00 00 80
1662
1613
  - "-7.50": 83 00 00 00 00 F0
1663
- - "2.9387358770557E-39": 02 00 00 00 00 00
1614
+ - "2.9387358770557E-39": 01 00 00 00 00 00
1664
1615
  - 1.7014118346031E38: FF FF FF FF FF 7F
1665
1616
  - "1.8189894035459E-12": 5A 00 00 00 00 00
1666
1617
  - "1.8189894035459E-12": 5A 00 00 00 00 00
@@ -109,4 +109,61 @@ class TestFloatFormats < Test::Unit::TestCase
109
109
  end
110
110
 
111
111
  end
112
+
113
+ def test_hp71b
114
+ assert_equal(-499, HP71B.radix_min_exp)
115
+ assert_equal(499, HP71B.radix_max_exp)
116
+
117
+ fmt = Nio::Fmt.prec(12)
118
+ assert_equal '9.99999999999E499', HP71B.max_value.to_fmt(fmt)
119
+ assert_equal '0000000000001501', HP71B.min_value.to_bits_text(16)
120
+ assert_equal '1E-510', HP71B.min_value.to_fmt(fmt)
121
+ assert_equal '1E-499', HP71B.min_normalized_value.to_fmt(fmt)
122
+
123
+ assert_equal '9210000000000999',HP71B.from_fmt('-0.21').to_bits_text(16)
124
+ assert_equal '0100000000000001',HP71B.from_fmt('10').to_bits_text(16)
125
+ assert_equal '9000000000000000',HP71B.from_fmt('-0').to_bits_text(16)
126
+ assert_equal '0000510000000501', HP71B.from_fmt('0.0051E-499').to_bits_text(16)
127
+
128
+ assert_equal '0000000000000F01',HP71B.nan.to_bits_text(16).upcase
129
+ assert_equal 'NAN', HP71B.nan.to_fmt.upcase
130
+ assert_equal '0000000000000F00', HP71B.infinity.to_bits_text(16).upcase
131
+ assert_equal '+INFINITY', HP71B.infinity.to_fmt.upcase
132
+ assert_equal '9000000000000F00', HP71B.infinity.neg.to_bits_text(16).upcase
133
+ end
134
+ def test_quad
135
+ assert_equal "3fff 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.from_fmt('1').to_hex.downcase
136
+ assert_equal "7ffe ffff ffff ffff ffff ffff ffff ffff".tr(' ',''), IEEE_binary128_BE.max_value.to_hex.downcase
137
+ assert_equal '1.19E4932', IEEE_binary128.max_value.to_fmt(Nio::Fmt.prec(4))
138
+ assert_equal "c000 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.from_fmt('-2').to_hex.downcase
139
+ assert_equal "0000 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.from_fmt('0').to_hex.downcase
140
+ assert_equal "8000 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.from_fmt('-0').to_hex.downcase
141
+ assert_equal "7fff 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.infinity.to_hex.downcase
142
+ assert_equal "ffff 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.infinity(1).to_hex.downcase
143
+ assert_equal "3ffd 5555 5555 5555 5555 5555 5555 5555".tr(' ',''), IEEE_binary128_BE.from_number(Rational(1,3)).to_hex.downcase
144
+ assert_equal "3fff 0000 0000 0000 0000 0000 0000 0001".tr(' ',''), IEEE_binary128_BE.from_fmt('1').next.to_hex.downcase
145
+ end
146
+ def test_half
147
+ assert_equal "3c00", IEEE_binary16_BE.from_fmt('1').to_hex.downcase
148
+ assert_equal "7bff", IEEE_binary16_BE.max_value.to_hex.downcase
149
+ assert_equal '65504', IEEE_binary16_BE.max_value.to_fmt
150
+ assert_equal "0400", IEEE_binary16_BE.min_normalized_value.to_hex.downcase
151
+ assert_equal "6.103515625E-5", IEEE_binary16_BE.min_normalized_value.to_fmt
152
+ assert_equal "0001", IEEE_binary16_BE.min_value.to_hex.downcase
153
+ assert_equal "5.9604644775390625E-8", IEEE_binary16_BE.min_value.to_fmt
154
+ assert_equal "0000", IEEE_binary16_BE.from_fmt('0').to_hex.downcase
155
+ assert_equal "8000", IEEE_binary16_BE.from_fmt('-0').to_hex.downcase
156
+ assert_equal "7c00".tr(' ',''), IEEE_binary16_BE.infinity.to_hex.downcase
157
+ assert_equal "fc00".tr(' ',''), IEEE_binary16_BE.infinity(1).to_hex.downcase
158
+ end
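The bit patterns asserted in `test_half` follow from the binary16 layout: 1 sign bit, 5 exponent bits with a bias of 15, and 10 stored fraction bits plus a hidden bit. A minimal decoder sketch in plain Ruby makes the expected values easy to check by hand. The function name `decode_binary16` is hypothetical and does not use the float-formats API.

```ruby
# Hypothetical helper, not part of float-formats: decode a 16-bit
# IEEE binary16 pattern (given as an Integer) into a native Float.
def decode_binary16(bits)
  sign     = (bits >> 15) & 0x1
  exponent = (bits >> 10) & 0x1f
  fraction = bits & 0x3ff
  magnitude =
    if exponent == 0x1f
      # All-ones exponent: infinity (zero fraction) or NaN.
      fraction.zero? ? Float::INFINITY : Float::NAN
    elsif exponent.zero?
      # Zero or denormal: no hidden bit, fixed scale 2**-24.
      Math.ldexp(fraction, -24)
    else
      # Normal: restore the hidden bit, then scale by 2**(exponent-15-10).
      Math.ldexp(fraction | 0x400, exponent - 25)
    end
  sign == 1 ? -magnitude : magnitude
end

puts decode_binary16(0x3c00)
puts decode_binary16(0x7bff)
puts decode_binary16(0x0001) == 2.0**-24
```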
159
+ def test_special
160
+ assert_equal '+Infinity', IEEE_binary32.from_number(1.0/0.0).to_fmt
161
+ assert_equal '-Infinity', IEEE_binary32.from_number(-1.0/0.0).to_fmt
162
+ assert_equal '+Infinity', IEEE_binary32.from_fmt('+Infinity').to_fmt
163
+ assert_equal '-Infinity', IEEE_binary32.from_fmt('-Infinity').to_fmt
164
+ assert_equal 'NAN', IEEE_binary32.from_number(0.0/0.0).to_fmt.upcase
165
+ assert_equal 'NAN', IEEE_binary32.from_fmt('NaN').to_fmt.upcase
166
+ end
167
+
168
+
112
169
  end
@@ -1,2 +1,3 @@
1
+ require 'rubygems'
1
2
  require 'test/unit'
2
3
  require File.dirname(__FILE__) + '/../lib/float-formats'
@@ -22,4 +22,56 @@ class TestNativeFloat < Test::Unit::TestCase
22
22
  assert(-(1.0.next) == (-1.0).prev)
23
23
  assert((-1.0).next == -(1.0.prev))
24
24
  end
25
+
26
+ def test_hex
27
+ if Float::RADIX==2 && Float::MANT_DIG==53
28
+ assert_equal((1.0+Float::EPSILON),hex_to_float('0x1.0000000000001p0'))
29
+ assert_equal '0x10000000000001p-52', hex_from_float(1.0+Float::EPSILON)
30
+ end
31
+ assert_equal 1.0, hex_to_float(hex_from_float(1.0))
32
+ assert_equal(-1.0, hex_to_float(hex_from_float(-1.0)))
33
+ assert_equal 1.0e-5, hex_to_float(hex_from_float(1.0e-5))
34
+ assert_equal(-1.0e-5, hex_to_float(hex_from_float(-1.0e-5)))
35
+
36
+ assert_equal(+1,float_to_integral_sign_significand_exponent(+0.0).first)
37
+ assert_equal(-1,float_to_integral_sign_significand_exponent(-0.0).first)
38
+ assert_not_equal hex_from_float(-0.0), hex_from_float(+0.0)
39
+ assert_equal hex_to_float(hex_from_float(-0.0)), hex_to_float(hex_from_float(+0.0))
40
+
41
+ end
42
+
43
+
44
+ def check_ulp_around(x)
45
+ assert_equal x-x.prev, x.prev.ulp
46
+ assert_equal x-x.prev, x.ulp
47
+ assert_equal x.next-x, x.next.ulp
48
+ assert_equal x.next.next-x.next, x.next.next.ulp
49
+ end
50
+
51
+ def test_ulp
52
+ r = Float::RADIX
53
+ assert_equal Float::MIN_D, 0.0.ulp
54
+ assert_equal Float::MIN_D, Float::MIN_D.ulp
55
+ assert_equal Float::MIN_D, Float::MIN_D.next.ulp
56
+ assert_equal Float::MIN_D, (0.5*(Float::MIN_D+Float::MAX_D)).ulp
57
+ assert_equal Float::MIN_D, Float::MAX_D.prev.ulp
58
+ assert_equal Float::MIN_D, Float::MAX_D.ulp
59
+ assert_equal Float::MIN_D, Float::MIN_N.ulp
60
+ assert_equal Float::MIN_D, Float::MIN_N.next.ulp
61
+ assert_equal Float::MIN_D, (r*Float::MIN_N).prev.ulp
62
+ assert_equal Float::MIN_D, (r*Float::MIN_N).ulp
63
+ assert_equal r*Float::MIN_D, (r*Float::MIN_N).next.ulp
64
+ check_ulp_around 1.0
65
+ assert_equal 1.0.next-1.0, 1.5.ulp
66
+ check_ulp_around r.to_f
67
+ check_ulp_around Math.ldexp(1,10)
68
+ check_ulp_around Math.ldexp(1,-10)
69
+ check_ulp_around Float::MAX/r
70
+ assert_equal Float::MAX-Float::MAX.prev, Float::MAX.prev.ulp
71
+ assert_equal Float::MAX-Float::MAX.prev, Float::MAX.ulp
72
+ assert_equal Float::MAX-Float::MAX.prev, (1.0/0.0).ulp
73
+ assert((0.0/0.0).ulp.nan?)
74
+ assert_equal Math.ldexp(1,10)-Math.ldexp(1,10).prev, Math.ldexp(1,10).prev.ulp
75
+ end
76
+
25
77
  end
metadata CHANGED
@@ -3,8 +3,8 @@ rubygems_version: 0.9.2
3
3
  specification_version: 1
4
4
  name: float-formats
5
5
  version: !ruby/object:Gem::Version
6
- version: 0.1.0
7
- date: 2007-11-04 00:00:00 +01:00
6
+ version: 0.1.1
7
+ date: 2007-12-15 00:00:00 +01:00
8
8
  summary: Floating-Point Formats
9
9
  require_paths:
10
10
  - lib
@@ -42,7 +42,6 @@ files:
42
42
  - lib/float-formats/classes.rb
43
43
  - lib/float-formats/formats.rb
44
44
  - lib/float-formats/native.rb
45
- - log/debug.log
46
45
  - script/destroy
47
46
  - script/destroy.cmd
48
47
  - script/generate
@@ -84,5 +83,5 @@ dependencies:
84
83
  requirements:
85
84
  - - ">="
86
85
  - !ruby/object:Gem::Version
87
- version: 0.2.0
86
+ version: 0.2.1
88
87
  version:
File without changes