float-formats 0.1.0 → 0.1.1
- data/History.txt +32 -0
- data/Manifest.txt +0 -1
- data/README.txt +126 -10
- data/config/hoe.rb +1 -1
- data/lib/float-formats.rb +1 -2
- data/lib/float-formats/classes.rb +79 -24
- data/lib/float-formats/formats.rb +170 -81
- data/lib/float-formats/native.rb +104 -0
- data/lib/float-formats/version.rb +1 -1
- data/test/test_data.yaml +4 -53
- data/test/test_float-formats.rb +57 -0
- data/test/test_helper.rb +1 -0
- data/test/test_native-float.rb +52 -0
- metadata +3 -4
- data/log/debug.log +0 -0
data/History.txt
CHANGED
@@ -1,3 +1,35 @@
|
|
1
|
+
== 0.1.1 2007-12-15
|
2
|
+
|
3
|
+
* HP-71B formats defined
|
4
|
+
|
5
|
+
* Add half precision IEEE format (binary16)
|
6
|
+
|
7
|
+
* New names for IEEE formats
|
8
|
+
|
9
|
+
* Add some IEEE 754r interchange formats
|
10
|
+
|
11
|
+
* new methods hex_to_float, hex_from_float in float-formats/native
|
12
|
+
|
13
|
+
* Allow non-bcd values in fields of BCD formats by passing
|
14
|
+
hex values as Strings; allow such values to be used for
|
15
|
+
nan/infinity exponents.
|
16
|
+
|
17
|
+
* Nio 0.2.1 is now required
|
18
|
+
|
19
|
+
* Handle special values (Infinities and NaN) in #from_fmt, #from_number
|
20
|
+
|
21
|
+
* Add ulp methods to Value and FP classes and to Float
|
22
|
+
|
23
|
+
* Bug fixes
|
24
|
+
- Fix the encoding-decoding of nan and infinity in Decimal format.
|
25
|
+
- Fix the decoding of NaN in Binary & Hexadecimal
|
26
|
+
- The definition of IEEE_binary128 was not correct
|
27
|
+
- In formats such as XS256 where the minimum exponent is not used only for zero
|
28
|
+
and there is a hidden bit, the minimum nonzero significand is radix**(prec-1)+1
|
29
|
+
rather than radix**(prec-1); the latter value could be computed in ratio_float
|
30
|
+
and then packed in the representation, being replaced by zero. This would
|
31
|
+
result in an incorrect encoding of the minimum nonzero value.
|
32
|
+
|
1
33
|
== 0.1.0 2007-11-04
|
2
34
|
|
3
35
|
* Initial release
|
data/Manifest.txt
CHANGED
data/README.txt
CHANGED
@@ -33,14 +33,22 @@ The latest version of Float-Formats and its source code can be downloaded from
|
|
33
33
|
|
34
34
|
A number of common formats are defined as constants in the FltPnt module:
|
35
35
|
|
36
|
-
==IEEE
|
37
|
-
<b>
|
38
|
-
|
39
|
-
|
40
|
-
|
41
|
-
(
|
36
|
+
==IEEE 754r
|
37
|
+
<b>binary</b> floating point representations in little endian order:
|
38
|
+
IEEE_binary16 (half precision),
|
39
|
+
IEEE_binary32 (single precision),
|
40
|
+
IEEE_binary64 (double precision),
|
41
|
+
IEEE_binary80 (extended), IEEE_binary128 (quadruple precision) and
|
42
|
+
as big endian: IEEE_binary16_BE, etc.
|
43
|
+
|
44
|
+
<b>decimal</b> formats (using DPD):
|
45
|
+
IEEE_decimal32, IEEE_decimal64 and IEEE_decimal128.
|
46
|
+
|
47
|
+
<b>interchange binary & decimal</b> formats:
|
48
|
+
IEEE_binary256, IEEE_binary512, IEEE_binary1024, IEEE_decimal192, IEEE_decimal256.
|
49
|
+
Others can be defined with IEEE.interchange_binary and IEEE.interchange_decimal
|
50
|
+
(see the IEEE module).
|
42
51
|
|
43
|
-
<b>IEEE 754r decimal</b> formats (using DPD): IEEE_DEC32, IEEE_DEC64 and IEEE_DEC128.
|
44
52
|
|
45
53
|
==Legacy
|
46
54
|
Formats of historical interest, some of which are found
|
@@ -68,13 +76,14 @@ Formats used in the Intel 8051 by the C51 compiler:
|
|
68
76
|
|
69
77
|
|
70
78
|
==Calculators
|
71
|
-
Formats used in HP
|
72
|
-
|
73
|
-
|
79
|
+
Formats used in HP RPL calculators: (RPL, RPL_X),
|
80
|
+
HP-71B formats (HP71B, HP71B_X)
|
81
|
+
and classic HP 10 digit calculators: (HP_CLASSIC).
|
74
82
|
|
75
83
|
|
76
84
|
=Using the pre-defined formats
|
77
85
|
|
86
|
+
require 'rubygems'
|
78
87
|
require 'float-formats'
|
79
88
|
include FltPnt
|
80
89
|
|
@@ -209,3 +218,110 @@ Nio has been developed by Javier Goizueta (mailto:javier@goizueta.info).
|
|
209
218
|
|
210
219
|
You can contact me through Rubyforge:http://rubyforge.org/sendmessage.php?touser=25432
|
211
220
|
|
221
|
+
=References
|
222
|
+
|
223
|
+
|
224
|
+
[<i>Floating Point Representations.</i> C.B. Silio.]
|
225
|
+
http://www.ece.umd.edu/class/enpm607.S2000/fltngpt.pdf
|
226
|
+
Description of formats used in UNIVAC 1100, CDC 6600/7600, PDP-11, IEEE754, IBM360/370
|
227
|
+
|
228
|
+
[<i>Floating-Point Formats.</i> John Savard.]
|
229
|
+
http://www.quadibloc.com/comp/cp0201.htm
|
230
|
+
Description of formats used in VAX and PDP-11
|
231
|
+
|
232
|
+
|
233
|
+
===IEEE754 binary formats
|
234
|
+
[<i>IEEE-754 References.</i> Christopher Vickery.]
|
235
|
+
http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html
|
236
|
+
|
237
|
+
[<i>What Every Computer Scientist Should Know About Floating-Point Arithmetic.</i> David Goldberg.]
|
238
|
+
http://docs.sun.com/source/806-3568/ncg_goldberg.html
|
239
|
+
|
240
|
+
|
241
|
+
===DPD/IEEE754r decimal formats
|
242
|
+
[<i>Decimal Arithmetic Encoding. Strawman 4d.</i> Mike Cowlishaw.]
|
243
|
+
http://www2.hursley.ibm.com/decimal/decbits.pdf
|
244
|
+
|
245
|
+
[<i>A Summary of Densely Packed Decimal encoding.</i> Mike Cowlishaw.]
|
246
|
+
http://www2.hursley.ibm.com/decimal/DPDecimal.html
|
247
|
+
|
248
|
+
[<i>Packed Decimal Encoding IEEE-754-r.</i> J.H.M. Bonten.]
|
249
|
+
http://home.hetnet.nl/mr_1/81/jhm.bonten/computers/bitsandbytes/wordsizes/ibmpde.htm
|
250
|
+
|
251
|
+
[<i>DRAFT Standard for Floating-Point Arithmetic P754.</i> IEEE.]
|
252
|
+
http://www.validlab.com/754R/drafts/archive/2007-10-05.pdf
|
253
|
+
|
254
|
+
|
255
|
+
|
256
|
+
===HP 10 digits calculators
|
257
|
+
|
258
|
+
[<i>HP CPU and Programming</i>. David G.Hicks.]
|
259
|
+
http://www.hpmuseum.org/techcpu.htm Description of calculator CPUs from the Museum of HP Calculators.
|
260
|
+
[<i>HP 35 ROM step by step.</i> Jacques Laporte]
|
261
|
+
http://www.jacques-laporte.org/HP35%20ROM.htm
|
262
|
+
Description of HP35 registers.
|
263
|
+
[<i>Scientific Pocket Calculator Extends Range of Built-In Functions.</i> Eric A. Evett, Paul J. McClellan, Joseph P. Tanzini.]
|
264
|
+
Hewlett Packard Journal 1983-05 pgs 27-28. Describes format used in HP-15C.
|
265
|
+
|
266
|
+
|
267
|
+
===HP 12 digits calculators
|
268
|
+
[<i>Software Internal Design Specification Volume I For the HP-71</i>. Hewlett Packard.]
|
269
|
+
Available from http://www.hpmuseum.org/cd/cddesc.htm
|
270
|
+
[<i>RPL PROGRAMMING GUIDE</i>]
|
271
|
+
Excerpted from <i>RPL: A Mathematical Control Language</i>. by W. C. Wickes.
|
272
|
+
Available at http://www.hpcalc.org/details.php?id=1743
|
273
|
+
|
274
|
+
===HP-3000
|
275
|
+
[<i>A Pocket Calculator for Computer Science Professionals.</i> Eric A. Evett.]
|
276
|
+
Hewlett Packard Journal 1983-05 pg 37. Describes format used in HP-3000
|
277
|
+
|
278
|
+
===IBM
|
279
|
+
[<i>IBM Floating Point Architecture.</i> Wikipedia.]
|
280
|
+
http://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture
|
281
|
+
[<i>The IBM eServer z990 floating-point unit</i>. G. Gerwig, H. Wetter, E. M. Schwarz, J. Haess, C. A. Krygowski, B. M. Fleischer and M. Kroener.]
|
282
|
+
http://www.research.ibm.com/journal/rd/483/gerwig.html
|
283
|
+
|
284
|
+
===MBF
|
285
|
+
[<i>Microsoft Knowledge Base Article 35826</i>]
|
286
|
+
http://support.microsoft.com/?scid=kb%3Ben-us%3B35826&x=17&y=12
|
287
|
+
[<i>Microsoft MBF2IEEE library</i>]
|
288
|
+
http://download.microsoft.com/download/vb30/install/1/win98/en-us/mbf2ieee.exe
|
289
|
+
|
290
|
+
===Borland
|
291
|
+
[<i>An Overview of Floating Point Numbers.</i> Borland Developer Support Staff]
|
292
|
+
|
293
|
+
[<i>Pascal Floating-Point Page.</i> J R Stockton.]
|
294
|
+
http://www.merlyn.demon.co.uk/pas-real.htm
|
295
|
+
|
296
|
+
===8-bit micros
|
297
|
+
This is the MS Basic format (BASIC09 for TRS-80 Color Computer, Dragon),
|
298
|
+
also used in the Sinclair Spectrum.
|
299
|
+
|
300
|
+
[<i>Numbers are followed by information not in listings</i>]
|
301
|
+
Sinclair User October 1983 http://www.sincuser.f9.co.uk/019/helplne.htm
|
302
|
+
|
303
|
+
[<i>Sinclair ZX Spectrum / Basic Programming.</i> Steven Vickers.]
|
304
|
+
Chapter 24. http://www.worldofspectrum.org/ZXBasicManual/zxmanchap24.html
|
305
|
+
|
306
|
+
|
307
|
+
|
308
|
+
===Apple II
|
309
|
+
[<i>Floating Point Routines for the 6502</i> Roy Rankin and Steve Wozniak.]
|
310
|
+
Dr. Dobb's Journal, August 1976, pages 17-19.
|
311
|
+
|
312
|
+
===C51
|
313
|
+
[<i>Advanced Development System</i> Franklin Software, Inc.]
|
314
|
+
http://www.fsinc.com/reference/html/com9anm.htm
|
315
|
+
|
316
|
+
===CDC6600
|
317
|
+
[<i>CONTROL DATA 6400/6500/6600 COMPUTER SYSTEMS Reference Manual</i>]
|
318
|
+
Manuals available at http://bitsavers.org/
|
319
|
+
|
320
|
+
|
321
|
+
===Cray
|
322
|
+
[<i>CRAY-1 COMPUTER SYSTEM Hardware Reference Manual</i>]
|
323
|
+
See pg 3-20 from 2240004 or pg 4-30 from HR-0808 or pg 4-21 from HP-0032.
|
324
|
+
Manuals available at http://bitsavers.org/
|
325
|
+
|
326
|
+
===Wang 2200
|
327
|
+
[<i>Internal Floating Point Representation</i>] http://www.wang2200.org/fp_format.html
|
data/config/hoe.rb
CHANGED
@@ -60,7 +60,7 @@ hoe = Hoe.new(GEM_NAME, VERS) do |p|
|
|
60
60
|
# == Optional
|
61
61
|
p.changes = p.paragraphs_of("History.txt", 0..1).join("\\n\\n")
|
62
62
|
p.extra_deps = [
|
63
|
-
['nio', '>=0.2.
|
63
|
+
['nio', '>=0.2.1']
|
64
64
|
]
|
65
65
|
|
66
66
|
#p.spec_extras = {} # A hash of extra values to set in the gemspec.
|
data/lib/float-formats.rb
CHANGED
data/lib/float-formats/classes.rb
CHANGED
@@ -97,12 +97,12 @@ class FormatBase
|
|
97
97
|
@max_encoded_exp = params[:max_encoded_exp] || @exponent_radix**@fields[:exponent]-1 # maximum regular exponent, encoded
|
98
98
|
if @infinity
|
99
99
|
@infinite_encoded_exp = @nan_encoded_exp || @max_encoded_exp if !@infinite_encoded_exp
|
100
|
-
@max_encoded_exp = @infinite_encoded_exp - 1 if @infinite_encoded_exp<=@max_encoded_exp
|
100
|
+
@max_encoded_exp = @infinite_encoded_exp - 1 if @infinite_encoded_exp.kind_of?(Integer) && @infinite_encoded_exp<=@max_encoded_exp
|
101
101
|
end
|
102
102
|
@nan = params[:nan] || (@nan_encoded_exp ? true : false)
|
103
103
|
if @nan
|
104
104
|
@nan_encoded_exp = @infinite_encoded_exp || @max_encoded_exp if !@nan_encoded_exp
|
105
|
-
@max_encoded_exp = @nan_encoded_exp - 1 if @nan_encoded_exp<=@max_encoded_exp
|
105
|
+
@max_encoded_exp = @nan_encoded_exp - 1 if @nan_encoded_exp.kind_of?(Integer) && @nan_encoded_exp<=@max_encoded_exp
|
106
106
|
end
|
107
107
|
|
108
108
|
@exponent_mode = params[:exponent_mode]
|
@@ -411,7 +411,15 @@ class FormatBase
|
|
411
411
|
# Produce an encoded floating point value using a number defined by a
|
412
412
|
# formatted text string (using Nio formats). Returns a Value.
|
413
413
|
def from_fmt(txt,fmt=Nio::Fmt.default)
|
414
|
-
neutral = fmt.nio_read_formatted(txt)
|
414
|
+
neutral = fmt.nio_read_formatted(txt)
|
415
|
+
if neutral.special?
|
416
|
+
case neutral.special
|
417
|
+
when :nan
|
418
|
+
return nan
|
419
|
+
when :inf
|
420
|
+
return infinity(neutral.sign=='-' ? minus_sign_value : 0)
|
421
|
+
end
|
422
|
+
end
|
415
423
|
if neutral.rep_pos<neutral.digits.length
|
416
424
|
nd = fmt.get_base==10 ? decimal_digits_necessary : (significand_digits*Math.log(radix)/Math.log(fmt.get_base)).ceil+1
|
417
425
|
fmt = fmt.mode(:sig,nd)
|
@@ -531,7 +539,7 @@ class FormatBase
|
|
531
539
|
# Computes the next adjacent floating point value.
|
532
540
|
# Accepts either a Value or a byte String.
|
533
541
|
# Returns a Value.
|
534
|
-
def next_float(v)
|
542
|
+
def next_float(v)
|
535
543
|
s,f,e = to_integral_sign_significand_exponent(v)
|
536
544
|
return neg(prev_float(neg(v))) if s!=0 && e!=:zero
|
537
545
|
s = switch_sign_value(s) if e==:zero && s!=0
|
@@ -584,6 +592,27 @@ class FormatBase
|
|
584
592
|
from_integral_sign_significand_exponent(s,f,e)
|
585
593
|
end
|
586
594
|
end
|
595
|
+
|
596
|
+
# ulp (unit in the last place) according to the definition proposed by J.M. Muller in
|
597
|
+
# "On the definition of ulp(x)" INRIA No. 5504
|
598
|
+
def ulp(v)
|
599
|
+
sign,sig,exp = to_integral_sign_significand_exponent(v)
|
600
|
+
|
601
|
+
mnexp = radix_min_exp(:integral_significand)
|
602
|
+
mxexp = radix_max_exp(:integral_significand)
|
603
|
+
prec = significand_digits
|
604
|
+
|
605
|
+
if exp==:nan
|
606
|
+
return_bytes v
|
607
|
+
elsif exp==:infinity
|
608
|
+
from_integral_sign_significand_exponent(1,1,mxexp) # from_integral_sign_significand_exponent(1,fmt.radix_power(prec-1),mxexp-prec+1)
|
609
|
+
elsif exp==:zero || exp <= mnexp
|
610
|
+
min_value
|
611
|
+
else
|
612
|
+
exp -= 1 if sig==radix_power(prec-1) # minimum normalized significand
|
613
|
+
from_integral_sign_significand_exponent(1,1,exp)
|
614
|
+
end
|
615
|
+
end
|
587
616
|
|
588
617
|
# Produce an encoded floating point value from the integral value
|
589
618
|
# of the sign, significand and exponent.
|
@@ -607,6 +636,7 @@ class FormatBase
|
|
607
636
|
when :normalized_significand
|
608
637
|
m = Rational(m,radix_power(@significand_digits-1))
|
609
638
|
end
|
639
|
+
[s,m,e]
|
610
640
|
end
|
611
641
|
|
612
642
|
# Returns the encoded value of a floating-point number as an integer
|
@@ -802,14 +832,17 @@ class FormatBase
|
|
802
832
|
v_r = v-r
|
803
833
|
z = from_integral_sign_significand_exponent(0,q,k)
|
804
834
|
if r<v_r
|
805
|
-
z
|
806
835
|
elsif r>v_r
|
807
|
-
|
836
|
+
q += 1
|
808
837
|
elsif (round_mode==:even && q.even?) || (round_mode==:zero)
|
809
|
-
z
|
810
838
|
else
|
811
|
-
|
839
|
+
q += 1
|
840
|
+
end
|
841
|
+
if q==radix_power(significand_digits)
|
842
|
+
q = radix_power(significand_digits-1)
|
843
|
+
k += 1
|
812
844
|
end
|
845
|
+
from_integral_sign_significand_exponent(0,q,k)
|
813
846
|
end
|
814
847
|
|
815
848
|
def algM(f,e,round_mode,eb=10)
|
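The hunk above changes the rounding step so that a carry out of the significand is renormalized instead of being packed as an over-wide value. A minimal standalone sketch of that idea follows. It assumes an exact ratio u/v to be rounded, and `round_significand` is a hypothetical helper name used only for illustration, not a method of the package.

```ruby
# Hypothetical helper, not part of float-formats: round the exact ratio u/v
# to an integer significand q (scaled by radix**k) and renormalize on carry.
def round_significand(u, v, k, radix, prec, round_mode = :even)
  q, r = u.divmod(v)      # q = floor(u/v), r = remainder
  v_r = v - r             # remaining distance up to q+1, in units of 1/v
  if r < v_r
    # nearer to q: keep it
  elsif r > v_r
    q += 1                # nearer to q+1
  elsif (round_mode == :even && q.even?) || round_mode == :zero
    # tie resolved towards q
  else
    q += 1                # tie resolved towards q+1
  end
  if q == radix**prec     # rounding overflowed the significand width
    q = radix**(prec - 1) # smallest normalized significand
    k += 1                # compensate with the exponent
  end
  [q, k]
end
```

With three decimal digits, rounding 999.5 to even carries into a fourth digit, so the sketch returns the significand 100 with the exponent raised by one.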
@@ -935,7 +968,18 @@ class BCDFormat < DecimalFormatBase
|
|
935
968
|
end
|
936
969
|
# now we convert the nibble strings to numbers
|
937
970
|
i = -1
|
938
|
-
nibble_fields.collect
|
971
|
+
nibble_fields.collect do |ns|
|
972
|
+
i+=1
|
973
|
+
if bcd_field?(i)
|
974
|
+
if /\A\d+\Z/.match(ns)
|
975
|
+
ns.reverse.to_i
|
976
|
+
else
|
977
|
+
ns.reverse
|
978
|
+
end
|
979
|
+
else
|
980
|
+
ns.reverse.to_i(16)
|
981
|
+
end
|
982
|
+
end
|
939
983
|
end
|
940
984
|
def from_fields(*fields)
|
941
985
|
fields = fields[0] if fields.size==1 and fields[0].kind_of?(Array)
|
@@ -943,8 +987,12 @@ class BCDFormat < DecimalFormatBase
|
|
943
987
|
i = 0
|
944
988
|
nibbles = ""
|
945
989
|
for l in @field_lengths
|
946
|
-
|
947
|
-
|
990
|
+
f = fields[i]
|
991
|
+
unless f.kind_of?(String)
|
992
|
+
fmt = bcd_field?(i) ? 'd' : 'X'
|
993
|
+
f = "%0#{l}#{fmt}" % fields[i]
|
994
|
+
end
|
995
|
+
nibbles << f.reverse
|
948
996
|
i += 1
|
949
997
|
end
|
950
998
|
v = hex_to_bytes(nibbles)
|
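The from_fields change above lets a field be passed either as a number or as a String of hex digits, which is how the HP-71B NaN and infinity exponents ("F01", "F00") are expressed. The following is a rough standalone sketch of that packing step. `pack_nibbles` and its three arguments are illustrative assumptions and do not mirror the package's internal API.

```ruby
# Hypothetical helper, not part of float-formats.
# fields:    numbers, or Strings already holding the nibble digits
# lengths:   width of each field in nibbles
# bcd_flags: true where a numeric field is formatted as decimal digits,
#            false where it is formatted as hexadecimal
def pack_nibbles(fields, lengths, bcd_flags)
  nibbles = String.new
  fields.each_with_index do |f, i|
    unless f.is_a?(String)
      fmt = bcd_flags[i] ? "d" : "X"
      f = format("%0#{lengths[i]}#{fmt}", f) # zero-padded to the field width
    end
    nibbles << f.reverse                     # least significant nibble first
  end
  nibbles
end
```

Passing a String skips the numeric formatting entirely, so a non-BCD nibble such as F can appear inside a field that is otherwise decimal.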
@@ -964,15 +1012,15 @@ class BCDFormat < DecimalFormatBase
|
|
964
1012
|
e = f[:exponent]
|
965
1013
|
s = f[:sign]
|
966
1014
|
m,e = neg_significand_exponent(s,m,e) if s%2==1
|
967
|
-
if
|
968
|
-
# +-
|
969
|
-
e = :zero
|
970
|
-
elsif @infinite_encoded_exp && e==@infinite_encoded_exp && m==0
|
971
|
-
# +-inifinity
|
1015
|
+
if @infinite_encoded_exp && e==@infinite_encoded_exp
|
1016
|
+
# +-infinity
|
972
1017
|
e = :infinity
|
973
|
-
elsif @nan_encoded_exp && e==@nan_encoded_exp
|
1018
|
+
elsif @nan_encoded_exp && e==@nan_encoded_exp
|
974
1019
|
# NaN
|
975
1020
|
e = :nan
|
1021
|
+
elsif m==0
|
1022
|
+
# +-zero
|
1023
|
+
e = :zero
|
976
1024
|
else
|
977
1025
|
# normalized number
|
978
1026
|
e = decode_exponent(e, :integral_significand)
|
@@ -989,11 +1037,11 @@ class BCDFormat < DecimalFormatBase
|
|
989
1037
|
e = @infinite_encoded_exp || radix_power(@fields[:exponent])-1
|
990
1038
|
m = 0
|
991
1039
|
elsif e==:nan
|
992
|
-
e = @
|
993
|
-
s = minus_sign_value # ?
|
994
|
-
m = radix_power(@significand_digits-2) if m==0
|
1040
|
+
e = @nan_encoded_exp || radix_power(@fields[:exponent])-1
|
1041
|
+
#s = minus_sign_value # ?
|
1042
|
+
#m = radix_power(@significand_digits-2) if m==0
|
995
1043
|
elsif e==:denormal
|
996
|
-
e = @denormal_encoded_exp
|
1044
|
+
e = @denormal_encoded_exp
|
997
1045
|
else
|
998
1046
|
# to do: try to adjust m to keep e in range if out of valid range
|
999
1047
|
# to do: reduce m and adjust e if m too big
|
@@ -1173,7 +1221,6 @@ class DPDFormat < DecimalFormatBase
|
|
1173
1221
|
|
1174
1222
|
def from_integral_sign_significand_exponent(s,m,e)
|
1175
1223
|
msb = radix_power(@significand_digits-1)
|
1176
|
-
#puts "DEC FROM #{s} #{m} #{e}"
|
1177
1224
|
t = nil
|
1178
1225
|
if e==:zero
|
1179
1226
|
e = @zero_encoded_exp
|
@@ -1307,7 +1354,7 @@ class BinaryFormat < FieldsInBitsFormatBase
|
|
1307
1354
|
e = @infinite_encoded_exp || radix_power(@fields[:exponent])-1
|
1308
1355
|
m = 0
|
1309
1356
|
elsif e==:nan
|
1310
|
-
e = @
|
1357
|
+
e = @nan_encoded_exp || radix_power(@fields[:exponent])-1
|
1311
1358
|
s = minus_sign_value # ?
|
1312
1359
|
m = radix_power(@significand_digits-2) if m==0
|
1313
1360
|
elsif e==:denormal
|
@@ -1399,7 +1446,7 @@ class HexadecimalFormat < FieldsInBitsFormatBase
|
|
1399
1446
|
e = @infinite_encoded_exp || radix_power(@fields[:exponent])-1
|
1400
1447
|
m = 0
|
1401
1448
|
elsif e==:nan
|
1402
|
-
e = @
|
1449
|
+
e = @nan_encoded_exp || radix_power(@fields[:exponent])-1
|
1403
1450
|
s = minus_sign_value # ?
|
1404
1451
|
m = radix_power(@significand_digits-2) if m==0
|
1405
1452
|
elsif e==:denormal
|
@@ -1524,6 +1571,14 @@ class Value
|
|
1524
1571
|
self.class.new(@fptype, @fptype.prev_float(@value))
|
1525
1572
|
end
|
1526
1573
|
|
1574
|
+
def neg
|
1575
|
+
@fptype.neg(@value)
|
1576
|
+
end
|
1577
|
+
|
1578
|
+
def ulp
|
1579
|
+
@fptype.ulp(@value)
|
1580
|
+
end
|
1581
|
+
|
1527
1582
|
def fp_format
|
1528
1583
|
@fptype
|
1529
1584
|
end
|
data/lib/float-formats/formats.rb
CHANGED
@@ -14,84 +14,114 @@ module FltPnt
|
|
14
14
|
|
15
15
|
# Floating Point Format Definitions ==========================================
|
16
16
|
|
17
|
+
# Helper methods to define IEEE 754r formats
|
18
|
+
module IEEE
|
19
|
+
# Define an IEEE binary format by passing parameters in a hash;
|
20
|
+
# :significand and :exponent are used to define the fields,
|
21
|
+
# optional parameters may follow.
|
22
|
+
def self.binary(parameters)
|
23
|
+
significand_bits = parameters[:significand]
|
24
|
+
exponent_bits = parameters[:exponent]
|
25
|
+
BinaryFormat.new({
|
26
|
+
:fields=>[:significand,significand_bits,:exponent,exponent_bits,:sign,1],
|
27
|
+
:bias=>2**(exponent_bits-1)-1, :bias_mode=>:normalized_significand,
|
28
|
+
:hidden_bit=>true,
|
29
|
+
:endianness=>:little_endian, :round=>:even,
|
30
|
+
:gradual_underflow=>true, :infinity=>true, :nan=>true
|
31
|
+
}.merge(parameters))
|
32
|
+
end
|
33
|
+
|
34
|
+
# Define an IEEE binary interchange format given its width in bits
|
35
|
+
def self.interchange_binary(width_in_bits, options={})
|
36
|
+
raise "Invalid IEEE binary interchange format definition: size (#{width_in_bits}) is not valid" unless (width_in_bits%32)==0 && (width_in_bits/32)>=4
|
37
|
+
p = width_in_bits - (4*Math.log(width_in_bits)/Math.log(2)).round.to_i + 13
|
38
|
+
binary({:significand=>p-1, :exponent=>width_in_bits-p}.merge(options))
|
39
|
+
end
|
40
|
+
|
41
|
+
# Define an IEEE decimal format by passing parameters in a hash;
|
42
|
+
# :significand and :exponent are used to define the fields,
|
43
|
+
# optional parameters may follow.
|
44
|
+
def self.decimal(parameters)
|
45
|
+
significand_continuation_bits = parameters[:significand]
|
46
|
+
exponent_continuation_bits = parameters[:exponent]
|
47
|
+
DPDFormat.new({
|
48
|
+
:fields=>[:significand_continuation,significand_continuation_bits,:exponent_continuation,exponent_continuation_bits,:combination,5,:sign,1],
|
49
|
+
:endianness=>:big_endian,
|
50
|
+
:gradual_underflow=>true, :infinity=>true, :nan=>true
|
51
|
+
}.merge(parameters))
|
52
|
+
end
|
53
|
+
|
54
|
+
# Define an IEEE decimal interchange format given its width in bits
|
55
|
+
def self.interchange_decimal(width_in_bits, options={})
|
56
|
+
raise "Invalid IEEE decimal interchange format definition: size (#{width_in_bits}) is not valid" unless (width_in_bits%32)==0
|
57
|
+
p = width_in_bits*9/32 - 2
|
58
|
+
t = (p-1)*10/3
|
59
|
+
w = width_in_bits - t - 6
|
60
|
+
decimal({:significand=>t, :exponent=>w}.merge(options))
|
61
|
+
end
|
62
|
+
|
63
|
+
end
|
64
|
+
|
17
65
|
# IEEE 754 binary types, as stored in little endian architectures such as Intel, Alpha
|
18
66
|
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
24
|
-
|
25
|
-
)
|
26
|
-
IEEE_DOUBLE = BinaryFormat.new(
|
27
|
-
:fields=>[:significand,52,:exponent,11,:sign,1],
|
28
|
-
:bias=>1023, :bias_mode=>:normalized_significand,
|
29
|
-
:hidden_bit=>true,
|
30
|
-
:endianness=>:little_endian, :round=>:even,
|
31
|
-
:gradual_underflow=>true, :infinity=>true, :nan=>true
|
32
|
-
)
|
33
|
-
IEEE_EXTENDED = BinaryFormat.new(
|
34
|
-
:fields=>[:significand,64,:exponent,15,:sign,1],
|
35
|
-
:bias=>16383, :bias_mode=>:normalized_significand,
|
36
|
-
:hidden_bit=>false, :min_encoded_exp=>1, :round=>:even,
|
37
|
-
:endianness=>:little_endian,
|
38
|
-
:gradual_underflow=>true, :infinity=>true, :nan=>true
|
39
|
-
)
|
40
|
-
IEEE_128 = BinaryFormat.new(
|
41
|
-
:fields=>[:significand,112,:exponent,15,:sign,1],
|
42
|
-
:bias=>16383, :bias_mode=>:normalized_significand,
|
43
|
-
:hidden_bit=>false, :min_encoded_exp=>1, :round=>:even,
|
44
|
-
:endianness=>:little_endian,
|
45
|
-
:gradual_underflow=>true, :infinity=>true, :nan=>true
|
46
|
-
)
|
67
|
+
IEEE_binary16 = IEEE.binary(:significand=>10, :exponent=>5)
|
68
|
+
IEEE_binary32 = IEEE.binary(:significand=>23,:exponent=>8)
|
69
|
+
IEEE_binary64 = IEEE.binary(:significand=>52,:exponent=>11)
|
70
|
+
IEEE_binary80 = IEEE.binary(:significand=>64,:exponent=>15, :hidden_bit=>false, :min_encoded_exp=>1)
|
71
|
+
IEEE_binary128 = IEEE.binary(:significand=>112,:exponent=>15)
|
72
|
+
|
47
73
|
|
48
74
|
# IEEE 754 in big endian order (SPARC, Motorola 68k, PowerPC)
|
49
75
|
|
50
|
-
|
51
|
-
|
52
|
-
|
53
|
-
|
54
|
-
|
55
|
-
|
56
|
-
|
57
|
-
|
58
|
-
|
59
|
-
|
60
|
-
|
61
|
-
|
62
|
-
)
|
63
|
-
|
64
|
-
|
65
|
-
|
66
|
-
|
67
|
-
|
68
|
-
|
69
|
-
|
70
|
-
|
71
|
-
|
72
|
-
|
73
|
-
|
74
|
-
|
75
|
-
|
76
|
-
|
76
|
+
IEEE_binary16_BE = IEEE.binary(:significand=>10, :exponent=>5, :endianness=>:big_endian)
|
77
|
+
IEEE_binary32_BE = IEEE.binary(:significand=>23,:exponent=>8, :endianness=>:big_endian)
|
78
|
+
IEEE_binary64_BE = IEEE.binary(:significand=>52,:exponent=>11, :endianness=>:big_endian)
|
79
|
+
IEEE_binary80_BE = IEEE.binary(:significand=>64,:exponent=>15, :endianness=>:big_endian, :hidden_bit=>false, :min_encoded_exp=>1)
|
80
|
+
IEEE_binary128_BE = IEEE.binary(:significand=>112,:exponent=>15, :endianness=>:big_endian)
|
81
|
+
|
82
|
+
|
83
|
+
# some IEEE 754r interchange binary formats
|
84
|
+
|
85
|
+
IEEE_binary256 = IEEE.interchange_binary(256)
|
86
|
+
IEEE_binary512 = IEEE.interchange_binary(512)
|
87
|
+
IEEE_binary1024 = IEEE.interchange_binary(1024)
|
88
|
+
IEEE_binary256_BE = IEEE.interchange_binary(256, :endianness=>:big_endian)
|
89
|
+
IEEE_binary512_BE = IEEE.interchange_binary(512, :endianness=>:big_endian)
|
90
|
+
IEEE_binary1024_BE = IEEE.interchange_binary(1024, :endianness=>:big_endian)
|
91
|
+
|
92
|
+
|
93
|
+
# old names
|
94
|
+
IEEE_binaryx = IEEE_binary80
|
95
|
+
IEEE_HALF = IEEE_binary16
|
96
|
+
IEEE_SINGLE = IEEE_binary32
|
97
|
+
IEEE_DOUBLE = IEEE_binary64
|
98
|
+
IEEE_EXTENDED = IEEE_binary80
|
99
|
+
IEEE_QUAD = IEEE_binary128
|
100
|
+
IEEE_128 = IEEE_binary128
IEEE_H_BE = IEEE_binary16_BE
|
101
|
+
IEEE_S_BE = IEEE_binary32_BE
|
102
|
+
IEEE_D_BE = IEEE_binary64_BE
|
103
|
+
IEEE_X_BE = IEEE_binary80_BE
|
104
|
+
IEEE_128_BE = IEEE_binary128_BE
|
105
|
+
IEEE_Q_BE = IEEE_binary128_BE
|
106
|
+
|
107
|
+
|
77
108
|
# Decimal IEEE 754r formats
|
78
109
|
|
79
|
-
|
80
|
-
|
81
|
-
|
82
|
-
|
83
|
-
|
84
|
-
|
85
|
-
|
86
|
-
|
87
|
-
|
88
|
-
)
|
89
|
-
IEEE_DEC128 = DPDFormat.new(
|
90
|
-
:fields=>[:significand_continuation,110,:exponent_continuation,12,:combination,5,:sign,1],
|
91
|
-
:endianness=>:big_endian,
|
92
|
-
:gradual_underflow=>true, :infinity=>true, :nan=>true
|
93
|
-
)
|
110
|
+
IEEE_decimal32 = IEEE.decimal(:significand=>20, :exponent=>6)
|
111
|
+
IEEE_decimal64 = IEEE.decimal(:significand=>50, :exponent=>8)
|
112
|
+
IEEE_decimal128 = IEEE.decimal(:significand=>110, :exponent=>12)
|
113
|
+
|
114
|
+
# some IEEE 754r interchange decimal formats
|
115
|
+
|
116
|
+
IEEE_decimal96 = IEEE.interchange_decimal(96)
|
117
|
+
IEEE_decimal192 = IEEE.interchange_decimal(192)
|
118
|
+
IEEE_decimal256 = IEEE.interchange_decimal(256)
|
94
119
|
|
120
|
+
# old names
|
121
|
+
|
122
|
+
IEEE_DEC32 = IEEE_decimal32
|
123
|
+
IEEE_DEC64 = IEEE_decimal64
|
124
|
+
IEEE_DEC128 = IEEE_decimal128
|
95
125
|
|
96
126
|
# Excess 128 used by Microsoft Basic in 8-bit micros, Spectrum, ...
|
97
127
|
|
@@ -202,6 +232,7 @@ PDP11_D = BinaryFormat.new(
|
|
202
232
|
|
203
233
|
|
204
234
|
# Format used in HP Saturn-based RPL calculators (HP48,HP49,HP50, also HP32s, HP42s --which use RPL internally)
|
235
|
+
# (these formats are not used in the HP-71B which is a Saturn, non-RPL machine)
|
205
236
|
|
206
237
|
SATURN = BCDFormat.new(
|
207
238
|
:fields=>[:prolog,5,:exponent,3,:significand,12,:sign,1],
|
@@ -217,9 +248,40 @@ SATURN_X = BCDFormat.new(
|
|
217
248
|
:endianness=>:little_endian, :round=>:even,
|
218
249
|
:gradual_underflow=>false, :infinity=>false, :nan=>false
|
219
250
|
)
|
220
|
-
|
221
|
-
|
222
|
-
|
251
|
+
|
252
|
+
|
253
|
+
RPL = SATURN
|
254
|
+
RPL_X = SATURN_X
|
255
|
+
|
256
|
+
# SATURN HP-71B (IEEE, NON-RPL) formats
|
257
|
+
|
258
|
+
# HP-71B REAL format (12-form) which is stored in a single register
|
259
|
+
HP71B = BCDFormat.new(
|
260
|
+
:fields=>[:exponent,3,:significand,12,:sign,1],
|
261
|
+
:exponent_mode=>:radix_complement,
|
262
|
+
:endianness=>:little_endian, :round=>:even,
|
263
|
+
:gradual_underflow=>true, :infinity=>true, :nan=>true,
|
264
|
+
:denormal_encoded_exp=>501,
|
265
|
+
:nan_encoded_exp=>"F01", # signaling NaN is F02
|
266
|
+
:infinite_encoded_exp=>"F00"
|
267
|
+
)
|
268
|
+
|
269
|
+
# HP-71B internal 15-digit format (15-form), stored in a pair of registers
|
270
|
+
# we use here a little-endian order for the registers, otherwise the
|
271
|
+
# definition would be [:significand,15,:unused1,1,:exponent,5,:unused2,10,:sign,1]
|
272
|
+
HP71B_X = BCDFormat.new(
|
273
|
+
:fields=>[:exponent,5,:unused2,10,:sign,1, :significand,15,:unused1,1],
|
274
|
+
:exponent_mode=>:radix_complement,
|
275
|
+
:endianness=>:little_endian, :round=>:even,
|
276
|
+
:gradual_underflow=>false, :infinity=>true, :nan=>true,
|
277
|
+
:nan_encoded_exp=>"00F01",
|
278
|
+
:infinite_encoded_exp=>"00F00"
|
279
|
+
)
|
280
|
+
|
281
|
+
# Format used in classic HP calculators (HP-35, ... HP-15C)
|
282
|
+
# Endianness is indeterminate, since these machines have named registers that
|
283
|
+
# hold a floating-point value in a single 56-bit word.
|
284
|
+
# (But intra-word field/nibble addressing is little-endian)
|
223
285
|
HP_CLASSIC = BCDFormat.new(
|
224
286
|
:fields=>[:exponent,3,:significand,10,:sign,1],
|
225
287
|
:exponent_mode=>:radix_complement,
|
@@ -306,15 +368,16 @@ CDC_SINGLE = CDCFLoatingPoint.new(
|
|
306
368
|
:gradual_underflow=>false, :infinity=>false, :nan=>false
|
307
369
|
)
|
308
370
|
|
309
|
-
#
|
371
|
+
# The CDC_DOUBLE can be split into two CDC_SINGLE values:
|
310
372
|
# get_bitfields(v,[CDC_SINGLE.total_bits]*2,CDC_DOUBLE.endianness).collect{|x| int_to_bytes(x,0,CDC_SINGLE.endianness)}
|
311
373
|
# and the value of the double is the sum of the values of the singles.
|
312
|
-
#
|
374
|
+
# Unlike the single, we must use :fractional_significand mode because with :integral_significand
|
313
375
|
# the exponent would refer to the whole significand, but it must refer only to the most significant half.
|
314
376
|
# we subtract the number of bits in the single from the bias and exponent because of this change,
|
315
377
|
# and add 48 to the min_exponent to keep the exponent of the low order single in range
|
316
378
|
# because the exponent of the low order single is adjusted to
|
317
379
|
# the position of its digits by subtracting 48 from the high order exponent
|
380
|
+
# when its exponent would be out of range
|
318
381
|
# Note that when computing the low order exponent with the fields handler we must take into account the sign
|
319
382
|
# because for negative numbers all the fields are one-complemented.
|
320
383
|
CDC_DOUBLE= CDCFLoatingPoint.new(
|
@@ -353,17 +416,15 @@ UNIVAC_DOUBLE = BinaryFormat.new(
|
|
353
416
|
:gradual_underflow=>false, :infinity=>false, :nan=>false
|
354
417
|
)
|
355
418
|
|
356
|
-
# Software floating point implementation for the Apple II (6502)
|
357
|
-
# the significand & sign are a single field in two's complement
|
358
419
|
|
359
|
-
|
420
|
+
# :stopdoc: # the next definition is not handled correctly by RDoc
|
421
|
+
APPLE_INSANE = BinaryFormat.new(
|
360
422
|
:fields=>[:significand,23,:sign,1,:exponent,8],
|
361
423
|
:bias=>128, :bias_mode=>:normalized_significand,
|
362
424
|
:hidden_bit=>false, :min_encoded_exp=>0,
|
363
425
|
:neg_mode=>:radix_complement_significand,
|
364
426
|
:endianness=>:big_endian,
|
365
427
|
:gradual_underflow=>true, :infinity=>false, :nan=>false) { |fp|
|
366
|
-
|
367
428
|
# This needs a peculiar treatment for the negative values, which do not simply use two's complement
|
368
429
|
# but also avoid having the sign and msb of the significand equal.
|
369
430
|
# Note that here we have a separate sign bit, but it can also be considered as the msb of the significand
|
@@ -387,10 +448,14 @@ APPLE = BinaryFormat.new(
|
|
387
448
|
#puts ""
|
388
449
|
[f,e]
|
389
450
|
end
|
390
|
-
|
391
451
|
}
|
452
|
+
# :startdoc:
|
392
453
|
|
393
454
|
|
455
|
+
# Software floating point implementation for the Apple II (6502)
|
456
|
+
# the significand & sign are a single field in two's complement
|
457
|
+
APPLE = APPLE_INSANE
|
458
|
+
|
394
459
|
# Wang 2200 Basic Decimal floating point
|
395
460
|
WANG2200 = BCDFormat.new(
|
396
461
|
:fields=>[:significand,13,:exponent,2,:signs,1],
|
@@ -572,7 +637,31 @@ C51_BCD_LONG_DOUBLE = C51BCDFloatingPoint.new(
|
|
572
637
|
:zero_encoded_exp=>0, :min_encoded_exp=>0,:max_encoded_exp=>127
|
573
638
|
)
|
574
639
|
|
575
|
-
|
640
|
+
=begin
|
641
|
+
# Note:
|
642
|
+
# One could be tempted to define a double-double type as:
|
643
|
+
IEEE_DOUBLE_DOUBLE = BinaryFormat.new(
|
644
|
+
:fields=>[:significand,52,:lo_exponent,11,:lo_sign,1,:significand,52,:exponent,11,:sign,1],
|
645
|
+
:fields_handler=>lambda{|fields|
|
646
|
+
fields[2] = fields[5];
|
647
|
+
bits,max_exp = 53,2047
|
648
|
+
if fields[4]>bits && fields[4]<max_exp
|
649
|
+
fields[1] = fields[4] - bits
|
650
|
+
else # 0, denormals, small numbers, NaN, Infinities
|
651
|
+
fields[0] = fields[1] = 0
|
652
|
+
end
|
653
|
+
},
|
654
|
+
:bias=>1023, :bias_mode=>:normalized_significand,
|
655
|
+
:hidden_bit=>true,
|
656
|
+
:endianness=>:little_endian, :round=>:even,
|
657
|
+
:gradual_underflow=>true, :infinity=>true, :nan=>true
|
658
|
+
)
|
659
|
+
# But this is incorrect since there's a hidden bit in the low double too and it must be normalized.
|
660
|
+
# In general the halves of the significand need not be adjacent, they
|
661
|
+
# can have exponents with a separation higher than 53; (in fact the minimum separation seems to be 54)
|
662
|
+
# and they can have different signs, too;
|
663
|
+
# double-double is too tricky to be supported by this package.
|
664
|
+
=end
|
576
665
|
|
577
666
|
|
578
667
|
|
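The IEEE.interchange_binary and IEEE.interchange_decimal helpers added in formats.rb derive the field widths from the total width alone. The arithmetic can be checked in isolation with the sketch below. The two function names and the returned hash keys are assumptions made for illustration; only the formulas are taken from the diff.

```ruby
# Hypothetical helpers, not part of float-formats; they reproduce only the
# width arithmetic used by the IEEE module in this diff.

# Binary interchange format of k bits: precision p = k - round(4*log2(k)) + 13.
# One bit is the sign and the leading significand bit is hidden.
def binary_interchange_fields(k)
  raise ArgumentError, "invalid width #{k}" unless (k % 32).zero? && k / 32 >= 4
  p = k - (4 * Math.log2(k)).round + 13
  { significand: p - 1, exponent: k - p }
end

# Decimal interchange format of k bits: p decimal digits, of which p-1 are
# stored in the DPD continuation field (10 bits per 3 digits); 5 combination
# bits and 1 sign bit account for the constant 6.
def decimal_interchange_fields(k)
  raise ArgumentError, "invalid width #{k}" unless (k % 32).zero?
  p = k * 9 / 32 - 2
  t = (p - 1) * 10 / 3
  { significand_continuation: t, exponent_continuation: k - t - 6 }
end
```

For widths 128 (binary) and 32, 64, 128 (decimal) the results agree with the explicit field sizes given for IEEE_binary128, IEEE_decimal32, IEEE_decimal64 and IEEE_decimal128 in the same file.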
data/lib/float-formats/native.rb
CHANGED
@@ -99,6 +99,27 @@ class Float
|
|
99
99
|
|
100
100
|
# Maximum significand == Math.ldexp(Math.ldexp(1,Float::MANT_DIG)-1,-Float::MANT_DIG)
|
101
101
|
MAX_F = Math.frexp(Float::MAX)[0] == Math.ldexp(Math.ldexp(1,Float::MANT_DIG)-1,-Float::MANT_DIG)
|
102
|
+
|
103
|
+
# ulp (unit in the last place) according to the definition proposed by J.M. Muller in
|
104
|
+
# "On the definition of ulp(x)" INRIA No. 5504
|
105
|
+
def ulp
|
106
|
+
return self if nan?
|
107
|
+
x = abs
|
108
|
+
if x < Math.ldexp(1,MIN_EXP) # x < RADIX*MIN_N
|
109
|
+
res = Math.ldexp(1,MIN_EXP-MANT_DIG) # res = MIN_D
|
110
|
+
elsif x > Math.ldexp(1-Math.ldexp(1,-MANT_DIG),MAX_EXP) # x > MAX
|
111
|
+
res = Math.ldexp(1,MAX_EXP-MANT_DIG) # res = MAX - MAX.prev
|
112
|
+
else
|
113
|
+
f,e = Math.frexp(x)
|
114
|
+
if f==Math.ldexp(1,-1)
|
115
|
+
res = Math.ldexp(1,e-MANT_DIG-1)
|
116
|
+
else
|
117
|
+
res = Math.ldexp(1,e-MANT_DIG)
|
118
|
+
end
|
119
|
+
end
|
120
|
+
res
|
121
|
+
end
|
122
|
+
|
102
123
|
|
103
124
|
end
|
104
125
|
|
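As a reading aid for the `ulp` method added in this hunk, here is a standalone sketch of the same case analysis. `ulp_of` is a hypothetical name, and the sketch assumes a radix-2 native Float; it uses only `Math.frexp`, `Math.ldexp` and the `Float` constants from core Ruby, not the gem's own `MIN_D`/`MIN_N` helpers.

```ruby
# Hypothetical standalone version of the ulp computation (radix 2 assumed).
def ulp_of(x)
  return x if x.nan?
  x = x.abs
  if x < Math.ldexp(1.0, Float::MIN_EXP)
    # zero, subnormals and the lowest binade share the smallest spacing
    Math.ldexp(1.0, Float::MIN_EXP - Float::MANT_DIG)
  elsif x > Float::MAX
    # infinity: use the spacing just below Float::MAX
    Math.ldexp(1.0, Float::MAX_EXP - Float::MANT_DIG)
  else
    f, e = Math.frexp(x)
    # at an exact power of the radix the spacing below x is the one used
    e -= 1 if f == 0.5
    Math.ldexp(1.0, e - Float::MANT_DIG)
  end
end

puts ulp_of(1.0)   # spacing just below 1.0
puts ulp_of(1.5)   # spacing inside [1, 2)
```

The power-of-radix branch is what makes `ulp_of(1.0)` equal `1.0 - 1.0.prev_float` rather than `Float::EPSILON`.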
@@ -127,6 +148,89 @@ def float_bin(x)
|
|
127
148
|
x.nio_write(Nio::Fmt.mode(:sci,:exact).base(2))
|
128
149
|
end
|
129
150
|
|
151
|
+
# decompose a float into a signed integer significand and exponent (base Float::RADIX)
|
152
|
+
def float_to_integral_significand_exponent(x)
|
153
|
+
s,e = Math.frexp(x)
|
154
|
+
[Math.ldexp(s,Float::MANT_DIG).to_i,e-Float::MANT_DIG]
|
155
|
+
end
|
156
|
+
|
157
|
+
# compose float from significand and exponent
|
158
|
+
def float_from_integral_significand_exponent(s,e)
|
159
|
+
Math.ldexp(s,e)
|
160
|
+
end
|
161
|
+
|
162
|
+
def float_to_integral_sign_significand_exponent(x)
|
163
|
+
if x==0.0
|
164
|
+
sign = (1/x<0) ? -1 : +1
|
165
|
+
else
|
166
|
+
sign = x<0 ? -1 : +1
|
167
|
+
end
|
168
|
+
x = -x if sign<0
|
169
|
+
s,e = Math.frexp(x)
|
170
|
+
[sign,Math.ldexp(s,Float::MANT_DIG).to_i,e-Float::MANT_DIG]
|
171
|
+
end
|
172
|
+
|
173
|
+
def float_from_integral_sign_significand_exponent(sgn,s,e)
|
174
|
+
f = Math.ldexp(s,e)
|
175
|
+
f = -f if sgn<0
|
176
|
+
f
|
177
|
+
end
|
178
|
+
|
179
|
+
# convert a float to C99's hexadecimal notation
|
180
|
+
def hex_from_float(v)
|
181
|
+
if Float::RADIX==2
|
182
|
+
sgn,s,e = float_to_integral_sign_significand_exponent(v)
|
183
|
+
else
|
184
|
+
txt = v.nio_write(Fmt.base(2).sep('.')).upcase
|
185
|
+
p = txt.index('E')
|
186
|
+
exp = 0
|
187
|
+
if p
|
188
|
+
exp = txt[p+1..-1].to_i
|
189
|
+
txt = txt[0...p]
|
190
|
+
end
|
191
|
+
p = txt.index('.')
|
192
|
+
if p
|
193
|
+
exp -= (txt.size-p-1)
|
194
|
+
txt.tr!('.','')
|
195
|
+
end
|
196
|
+
s = txt.to_i(2)
|
197
|
+
e = exp
|
198
|
+
end
|
199
|
+
"0x#{sgn<0 ? '-' : ''}#{s.to_s(16)}p#{e}"
|
200
|
+
end
|
201
|
+
|
202
|
+
# convert a string formatted in C99's hexadecimal notation to a float
|
203
|
+
def hex_to_float(txt)
|
204
|
+
txt = txt.strip.upcase
|
205
|
+
txt = txt[2..-1] if txt[0,2]=='0X'
|
206
|
+
p = txt.index('P')
|
207
|
+
if p
|
208
|
+
exp = txt[p+1..-1].to_i
|
209
|
+
txt = txt[0...p]
|
210
|
+
else
|
211
|
+
exp = 0
|
212
|
+
end
|
213
|
+
p = txt.index('.')
|
214
|
+
if p
|
215
|
+
exp -= (txt.size-p-1)*4
|
216
|
+
txt.tr!('.','')
|
217
|
+
end
|
218
|
+
if Float::RADIX==2
|
219
|
+
v = txt.to_i(16)
|
220
|
+
if v==0 && txt.include?('-')
|
221
|
+
sign = -1
|
222
|
+
elsif v<0
|
223
|
+
sign = -1
|
224
|
+
v = -v
|
225
|
+
else
|
226
|
+
sign = +1
|
227
|
+
end
|
228
|
+
float_from_integral_sign_significand_exponent(sign,v,exp)
|
229
|
+
else
|
230
|
+
(txt.to_i(16)*(2**exp)).to_f
|
231
|
+
end
|
232
|
+
end
|
233
|
+
|
130
234
|
# ===== IEEE types =====================================================================================
|
131
235
|
|
132
236
|
# generate a SGL value stored in a byte string given a decimal value formatted as text
|
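The hexadecimal conversion added above can be sketched independently of the gem. The names `to_hex_float` and `from_hex_float` are hypothetical, the code assumes a radix-2 Float and handles finite values only, and unlike `hex_from_float` it writes the sign before the `0x` prefix, as C99's `%a` does.

```ruby
# Hypothetical sketch: finite Float -> "[-]0x<integer significand>p<exponent>".
def to_hex_float(x)
  negative = x < 0 || (x == 0.0 && 1 / x < 0)   # 1/-0.0 is -Infinity
  f, e = Math.frexp(x.abs)
  s = Math.ldexp(f, Float::MANT_DIG).to_i       # integral significand
  "#{negative ? '-' : ''}0x#{s.to_s(16)}p#{e - Float::MANT_DIG}"
end

# Hypothetical sketch: parse "[+-]0xh[.hhh][p[+-]d]" back into a Float.
def from_hex_float(text)
  m = /\A([+-]?)0x([0-9a-f]+)(?:\.([0-9a-f]*))?(?:p([+-]?\d+))?\z/i.match(text.strip)
  raise ArgumentError, "not a hex float: #{text}" unless m
  int_digits  = m[2]
  frac_digits = m[3].to_s
  # each fractional hex digit shifts the binary exponent by 4
  exp = m[4].to_i - 4 * frac_digits.size
  v = Math.ldexp((int_digits + frac_digits).to_i(16), exp)
  m[1] == '-' ? -v : v
end

puts to_hex_float(1.0 + Float::EPSILON)
puts from_hex_float('0x1.0000000000001p0')
```

Because the significand is written as an integer, no fractional hex digits are needed on output, and the round trip is exact for finite values.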
data/test/test_data.yaml
CHANGED
@@ -484,60 +484,11 @@ XS256_DOUBLE:
|
|
484
484
|
- "65536.0": 44 00 00 00 00 00 00 00
|
485
485
|
- "-65536.0": C4 00 00 00 00 00 00 00
|
486
486
|
- "-7.50": C0 B8 00 00 00 00 00 00
|
487
|
-
- "8.6361685550944451E-78": 00 00 00 00 00 00 00
|
487
|
+
- "8.6361685550944451E-78": 00 00 00 00 00 00 00 01
|
488
488
|
- 1.15792089237316192E77: 7F FF FF FF FF FF FF FF
|
489
489
|
- "5.5511151231257827E-17": 32 80 00 00 00 00 00 00
|
490
490
|
- "2.77555756156289135E-17": 32 40 00 00 00 00 00 00
|
491
491
|
base: :bytes
|
492
|
-
IEEE_128:
|
493
|
-
parameters:
|
494
|
-
- total_bits: 128
|
495
|
-
- radix: 2
|
496
|
-
- significand_digits: 112
|
497
|
-
- radix_min_exp: -16382
|
498
|
-
- radix_max_exp: 16383
|
499
|
-
- decimal_digits_stored: 33
|
500
|
-
- decimal_digits_necessary: 35
|
501
|
-
- decimal_min_exp: -4931
|
502
|
-
- decimal_max_exp: 4932
|
503
|
-
values:
|
504
|
-
- Rational(1, 3): AB AA AA AA AA AA AA AA AA AA AA AA AA AA FD 3F
|
505
|
-
- Rational(1, 10): CD CC CC CC CC CC CC CC CC CC CC CC CC CC FB 3F
|
506
|
-
- Rational(2, 3): AB AA AA AA AA AA AA AA AA AA AA AA AA AA FE 3F
|
507
|
-
- Rational(1, 1024): 00 00 00 00 00 00 00 00 00 00 00 00 00 80 F5 3F
|
508
|
-
- Rational(1, 1000): 31 08 AC 1C 5A 64 3B DF 4F 8D 97 6E 12 83 F5 3F
|
509
|
-
- Rational(1024, 1): 00 00 00 00 00 00 00 00 00 00 00 00 00 80 09 40
|
510
|
-
- Rational(1024, 1): 00 00 00 00 00 00 00 00 00 00 00 00 00 80 09 40
|
511
|
-
special:
|
512
|
-
- min_value: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|
513
|
-
- min_normalized_value: 00 00 00 00 00 00 00 00 00 00 00 00 00 80 01 00
|
514
|
-
- max_value: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FE 7F
|
515
|
-
- epsilon: 00 00 00 00 00 00 00 00 00 00 00 00 00 80 90 3F
|
516
|
-
- strict_epsilon: 01 00 00 00 00 00 00 00 00 00 00 00 00 80 8F 3F
|
517
|
-
numerals:
|
518
|
-
- "+0": 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|
519
|
-
- "-0": 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80
|
520
|
-
- "+1": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FF 3F
|
521
|
-
- "-1": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FF BF
|
522
|
-
- "+0.1": CD CC CC CC CC CC CC CC CC CC CC CC CC CC FB 3F
|
523
|
-
- "-0.1": CD CC CC CC CC CC CC CC CC CC CC CC CC CC FB BF
|
524
|
-
- "0.5": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FE 3F
|
525
|
-
- "-0.5": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FE BF
|
526
|
-
- "29.2": 9A 99 99 99 99 99 99 99 99 99 99 99 99 E9 03 40
|
527
|
-
- "-29.2": 9A 99 99 99 99 99 99 99 99 99 99 99 99 E9 03 C0
|
528
|
-
- "0.03125": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FA 3F
|
529
|
-
- "-0.03125": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 FA BF
|
530
|
-
- "-0.3125": 00 00 00 00 00 00 00 00 00 00 00 00 00 A0 FD BF
|
531
|
-
- 1.234E2: CD CC CC CC CC CC CC CC CC CC CC CC CC F6 05 40
|
532
|
-
- "-1.234E-6": 90 0F B2 E0 09 B3 8C B1 6C 16 CA EA 9F A5 EB BF
|
533
|
-
- "65536.0": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 0F 40
|
534
|
-
- "-65536.0": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 0F C0
|
535
|
-
- "-7.50": 00 00 00 00 00 00 00 00 00 00 00 00 00 F0 01 C0
|
536
|
-
- "3.3621031431120935062626778173217526E-4932": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 01 00
|
537
|
-
- 1.1897314953572317650857593266280069E4932: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FE 7F
|
538
|
-
- "3.8518598887744717061119558851698546E-34": 00 00 00 00 00 00 00 00 00 00 00 00 00 80 90 3F
|
539
|
-
- "1.9259299443872358530559779425849281E-34": 01 00 00 00 00 00 00 00 00 00 00 00 00 80 8F 3F
|
540
|
-
base: :bytes
|
541
492
|
IEEE_DOUBLE:
|
542
493
|
parameters:
|
543
494
|
- total_bits: 64
|
@@ -680,7 +631,7 @@ PDP11_F:
|
|
680
631
|
- "65536.0": 80 48 00 00
|
681
632
|
- "-65536.0": 80 C8 00 00
|
682
633
|
- "-7.50": F0 C1 00 00
|
683
|
-
- "1.46936811E-39": 00 00
|
634
|
+
- "1.46936811E-39": 00 00 01 00
|
684
635
|
- 1.70141173E38: FF 7F FF FF
|
685
636
|
- "1.1920929E-7": 00 35 00 00
|
686
637
|
- "1.1920929E-7": 00 35 00 00
|
@@ -1121,7 +1072,7 @@ XS256:
|
|
1121
1072
|
- "65536.0": 44 00 00 00
|
1122
1073
|
- "-65536.0": C4 00 00 00
|
1123
1074
|
- "-7.50": C0 B8 00 00
|
1124
|
-
- "8.6361706E-78": 00 00 00
|
1075
|
+
- "8.6361706E-78": 00 00 00 01
|
1125
1076
|
- 1.1579208E77: 7F FF FF FF
|
1126
1077
|
- "2.3841858E-7": 3A 80 00 00
|
1127
1078
|
- "1.1920929E-7": 3A 40 00 00
|
@@ -1660,7 +1611,7 @@ BORLAND48:
|
|
1660
1611
|
- "65536.0": 91 00 00 00 00 00
|
1661
1612
|
- "-65536.0": 91 00 00 00 00 80
|
1662
1613
|
- "-7.50": 83 00 00 00 00 F0
|
1663
|
-
- "2.9387358770557E-39":
|
1614
|
+
- "2.9387358770557E-39": 01 00 00 00 00 00
|
1664
1615
|
- 1.7014118346031E38: FF FF FF FF FF 7F
|
1665
1616
|
- "1.8189894035459E-12": 5A 00 00 00 00 00
|
1666
1617
|
- "1.8189894035459E-12": 5A 00 00 00 00 00
|
data/test/test_float-formats.rb
CHANGED
@@ -109,4 +109,61 @@ class TestFloatFormats < Test::Unit::TestCase
|
|
109
109
|
end
|
110
110
|
|
111
111
|
end
|
112
|
+
|
113
|
+
def test_hp71b
|
114
|
+
assert_equal(-499, HP71B.radix_min_exp)
|
115
|
+
assert_equal(499, HP71B.radix_max_exp)
|
116
|
+
|
117
|
+
fmt = Nio::Fmt.prec(12)
|
118
|
+
assert_equal '9.99999999999E499', HP71B.max_value.to_fmt(fmt)
|
119
|
+
assert_equal '0000000000001501', HP71B.min_value.to_bits_text(16)
|
120
|
+
assert_equal '1E-510', HP71B.min_value.to_fmt(fmt)
|
121
|
+
assert_equal '1E-499', HP71B.min_normalized_value.to_fmt(fmt)
|
122
|
+
|
123
|
+
assert_equal '9210000000000999',HP71B.from_fmt('-0.21').to_bits_text(16)
|
124
|
+
assert_equal '0100000000000001',HP71B.from_fmt('10').to_bits_text(16)
|
125
|
+
assert_equal '9000000000000000',HP71B.from_fmt('-0').to_bits_text(16)
|
126
|
+
assert_equal '0000510000000501', HP71B.from_fmt('0.0051E-499').to_bits_text(16)
|
127
|
+
|
128
|
+
assert_equal '0000000000000F01',HP71B.nan.to_bits_text(16).upcase
|
129
|
+
assert_equal 'NAN', HP71B.nan.to_fmt.upcase
|
130
|
+
assert_equal '0000000000000F00', HP71B.infinity.to_bits_text(16).upcase
|
131
|
+
assert_equal '+INFINITY', HP71B.infinity.to_fmt.upcase
|
132
|
+
assert_equal '9000000000000F00', HP71B.infinity.neg.to_bits_text(16).upcase
|
133
|
+
end
|
134
|
+
def test_quad
|
135
|
+
assert_equal "3fff 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.from_fmt('1').to_hex.downcase
|
136
|
+
assert_equal "7ffe ffff ffff ffff ffff ffff ffff ffff".tr(' ',''), IEEE_binary128_BE.max_value.to_hex.downcase
|
137
|
+
assert_equal '1.19E4932', IEEE_binary128.max_value.to_fmt(Nio::Fmt.prec(4))
|
138
|
+
assert_equal "c000 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.from_fmt('-2').to_hex.downcase
|
139
|
+
assert_equal "0000 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.from_fmt('0').to_hex.downcase
|
140
|
+
assert_equal "8000 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.from_fmt('-0').to_hex.downcase
|
141
|
+
assert_equal "7fff 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.infinity.to_hex.downcase
|
142
|
+
assert_equal "ffff 0000 0000 0000 0000 0000 0000 0000".tr(' ',''), IEEE_binary128_BE.infinity(1).to_hex.downcase
|
143
|
+
assert_equal "3ffd 5555 5555 5555 5555 5555 5555 5555".tr(' ',''), IEEE_binary128_BE.from_number(Rational(1,3)).to_hex.downcase
|
144
|
+
assert_equal "3fff 0000 0000 0000 0000 0000 0000 0001".tr(' ',''), IEEE_binary128_BE.from_fmt('1').next.to_hex.downcase
|
145
|
+
end
|
146
|
+
def test_half
|
147
|
+
assert_equal "3c00", IEEE_binary16_BE.from_fmt('1').to_hex.downcase
|
148
|
+
assert_equal "7bff", IEEE_binary16_BE.max_value.to_hex.downcase
|
149
|
+
assert_equal '65504', IEEE_binary16_BE.max_value.to_fmt
|
150
|
+
assert_equal "0400", IEEE_binary16_BE.min_normalized_value.to_hex.downcase
|
151
|
+
assert_equal "6.103515625E-5", IEEE_binary16_BE.min_normalized_value.to_fmt
|
152
|
+
assert_equal "0001", IEEE_binary16_BE.min_value.to_hex.downcase
|
153
|
+
assert_equal "5.9604644775390625E-8", IEEE_binary16_BE.min_value.to_fmt
|
154
|
+
assert_equal "0000", IEEE_binary16_BE.from_fmt('0').to_hex.downcase
|
155
|
+
assert_equal "8000", IEEE_binary16_BE.from_fmt('-0').to_hex.downcase
|
156
|
+
assert_equal "7c00".tr(' ',''), IEEE_binary16_BE.infinity.to_hex.downcase
|
157
|
+
assert_equal "fc00".tr(' ',''), IEEE_binary16_BE.infinity(1).to_hex.downcase
|
158
|
+
end
|
159
|
+
def test_special
|
160
|
+
assert_equal '+Infinity', IEEE_binary32.from_number(1.0/0.0).to_fmt
|
161
|
+
assert_equal '-Infinity', IEEE_binary32.from_number(-1.0/0.0).to_fmt
|
162
|
+
assert_equal '+Infinity', IEEE_binary32.from_fmt('+Infinity').to_fmt
|
163
|
+
assert_equal '-Infinity', IEEE_binary32.from_fmt('-Infinity').to_fmt
|
164
|
+
assert_equal 'NAN', IEEE_binary32.from_number(0.0/0.0).to_fmt.upcase
|
165
|
+
assert_equal 'NAN', IEEE_binary32.from_fmt('NaN').to_fmt.upcase
|
166
|
+
end
|
167
|
+
|
168
|
+
|
112
169
|
end
|
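The bit patterns asserted in `test_half` can be checked by hand with a small decoder. This is a sketch under the usual IEEE binary16 layout (1 sign bit, 5 exponent bits with bias 15, 10 fraction bits); `decode_binary16` is a hypothetical helper and does not use the gem's `IEEE_binary16_BE` class.

```ruby
# Hypothetical helper: decode a 16-bit integer as an IEEE binary16 value.
def decode_binary16(bits)
  sign = bits[15] == 1 ? -1.0 : 1.0
  exp  = (bits >> 10) & 0x1f
  frac = bits & 0x3ff
  value =
    if exp == 0x1f
      frac.zero? ? Float::INFINITY : Float::NAN
    elsif exp.zero?
      Math.ldexp(frac, -24)               # zero or subnormal: frac * 2**(-14 - 10)
    else
      Math.ldexp(0x400 | frac, exp - 25)  # hidden bit restored: (1024 + frac) * 2**(exp - 15 - 10)
    end
  sign * value
end

%w[3c00 7bff 0400 0001 7c00 fc00].each do |hex|
  puts "#{hex} -> #{decode_binary16(hex.to_i(16))}"
end
```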
data/test/test_helper.rb
CHANGED
data/test/test_native-float.rb
CHANGED
@@ -22,4 +22,56 @@ class TestNativeFloat < Test::Unit::TestCase
|
|
22
22
|
assert(-(1.0.next) == (-1.0).prev)
|
23
23
|
assert((-1.0).next == -(1.0.prev))
|
24
24
|
end
|
25
|
+
|
26
|
+
def test_hex
|
27
|
+
if Float::RADIX==2 && Float::MANT_DIG==53
|
28
|
+
assert_equal((1.0+Float::EPSILON),hex_to_float('0x1.0000000000001p0'))
|
29
|
+
assert_equal '0x10000000000001p-52', hex_from_float(1.0+Float::EPSILON)
|
30
|
+
end
|
31
|
+
assert_equal 1.0, hex_to_float(hex_from_float(1.0))
|
32
|
+
assert_equal(-1.0, hex_to_float(hex_from_float(-1.0)))
|
33
|
+
assert_equal 1.0e-5, hex_to_float(hex_from_float(1.0e-5))
|
34
|
+
assert_equal(-1.0e-5, hex_to_float(hex_from_float(-1.0e-5)))
|
35
|
+
|
36
|
+
assert_equal(+1,float_to_integral_sign_significand_exponent(+0.0).first)
|
37
|
+
assert_equal(-1,float_to_integral_sign_significand_exponent(-0.0).first)
|
38
|
+
assert_not_equal hex_from_float(-0.0), hex_from_float(+0.0)
|
39
|
+
assert_equal hex_to_float(hex_from_float(-0.0)), hex_to_float(hex_from_float(+0.0))
|
40
|
+
|
41
|
+
end
|
42
|
+
|
43
|
+
|
44
|
+
def check_ulp_around(x)
|
45
|
+
assert_equal x-x.prev, x.prev.ulp
|
46
|
+
assert_equal x-x.prev, x.ulp
|
47
|
+
assert_equal x.next-x, x.next.ulp
|
48
|
+
assert_equal x.next.next-x.next, x.next.next.ulp
|
49
|
+
end
|
50
|
+
|
51
|
+
def test_ulp
|
52
|
+
r = Float::RADIX
|
53
|
+
assert_equal Float::MIN_D, 0.0.ulp
|
54
|
+
assert_equal Float::MIN_D, Float::MIN_D.ulp
|
55
|
+
assert_equal Float::MIN_D, Float::MIN_D.next.ulp
|
56
|
+
assert_equal Float::MIN_D, (0.5*(Float::MIN_D+Float::MAX_D)).ulp
|
57
|
+
assert_equal Float::MIN_D, Float::MAX_D.prev.ulp
|
58
|
+
assert_equal Float::MIN_D, Float::MAX_D.ulp
|
59
|
+
assert_equal Float::MIN_D, Float::MIN_N.ulp
|
60
|
+
assert_equal Float::MIN_D, Float::MIN_N.next.ulp
|
61
|
+
assert_equal Float::MIN_D, (r*Float::MIN_N).prev.ulp
|
62
|
+
assert_equal Float::MIN_D, (r*Float::MIN_N).ulp
|
63
|
+
assert_equal r*Float::MIN_D, (r*Float::MIN_N).next.ulp
|
64
|
+
check_ulp_around 1.0
|
65
|
+
assert_equal 1.0.next-1.0, 1.5.ulp
|
66
|
+
check_ulp_around r.to_f
|
67
|
+
check_ulp_around Math.ldexp(1,10)
|
68
|
+
check_ulp_around Math.ldexp(1,-10)
|
69
|
+
check_ulp_around Float::MAX/r
|
70
|
+
assert_equal Float::MAX-Float::MAX.prev, Float::MAX.prev.ulp
|
71
|
+
assert_equal Float::MAX-Float::MAX.prev, Float::MAX.ulp
|
72
|
+
assert_equal Float::MAX-Float::MAX.prev, (1.0/0.0).ulp
|
73
|
+
assert((0.0/0.0).ulp.nan?)
|
74
|
+
assert_equal Math.ldexp(1,10)-Math.ldexp(1,10).prev, Math.ldexp(1,10).prev.ulp
|
75
|
+
end
|
76
|
+
|
25
77
|
end
|
metadata
CHANGED
@@ -3,8 +3,8 @@ rubygems_version: 0.9.2
|
|
3
3
|
specification_version: 1
|
4
4
|
name: float-formats
|
5
5
|
version: !ruby/object:Gem::Version
|
6
|
-
version: 0.1.
|
7
|
-
date: 2007-
|
6
|
+
version: 0.1.1
|
7
|
+
date: 2007-12-15 00:00:00 +01:00
|
8
8
|
summary: Floating-Point Formats
|
9
9
|
require_paths:
|
10
10
|
- lib
|
@@ -42,7 +42,6 @@ files:
|
|
42
42
|
- lib/float-formats/classes.rb
|
43
43
|
- lib/float-formats/formats.rb
|
44
44
|
- lib/float-formats/native.rb
|
45
|
-
- log/debug.log
|
46
45
|
- script/destroy
|
47
46
|
- script/destroy.cmd
|
48
47
|
- script/generate
|
@@ -84,5 +83,5 @@ dependencies:
|
|
84
83
|
requirements:
|
85
84
|
- - ">="
|
86
85
|
- !ruby/object:Gem::Version
|
87
|
-
version: 0.2.
|
86
|
+
version: 0.2.1
|
88
87
|
version:
|
data/log/debug.log
DELETED
File without changes
|