cumo 0.1.0

Sign up to get free protection for your applications and to get access to all the features.
Files changed (266) hide show
  1. checksums.yaml +7 -0
  2. data/.gitignore +27 -0
  3. data/.travis.yml +5 -0
  4. data/3rd_party/mkmf-cu/.gitignore +36 -0
  5. data/3rd_party/mkmf-cu/Gemfile +3 -0
  6. data/3rd_party/mkmf-cu/LICENSE +21 -0
  7. data/3rd_party/mkmf-cu/README.md +36 -0
  8. data/3rd_party/mkmf-cu/Rakefile +11 -0
  9. data/3rd_party/mkmf-cu/bin/mkmf-cu-nvcc +4 -0
  10. data/3rd_party/mkmf-cu/lib/mkmf-cu.rb +32 -0
  11. data/3rd_party/mkmf-cu/lib/mkmf-cu/cli.rb +80 -0
  12. data/3rd_party/mkmf-cu/lib/mkmf-cu/nvcc.rb +157 -0
  13. data/3rd_party/mkmf-cu/mkmf-cu.gemspec +16 -0
  14. data/3rd_party/mkmf-cu/test/test_mkmf-cu.rb +67 -0
  15. data/CODE_OF_CONDUCT.md +46 -0
  16. data/Gemfile +8 -0
  17. data/LICENSE.txt +82 -0
  18. data/README.md +252 -0
  19. data/Rakefile +43 -0
  20. data/bench/broadcast_fp32.rb +138 -0
  21. data/bench/cumo_bench.rb +193 -0
  22. data/bench/numo_bench.rb +138 -0
  23. data/bench/reduction_fp32.rb +117 -0
  24. data/bin/console +14 -0
  25. data/bin/setup +8 -0
  26. data/cumo.gemspec +32 -0
  27. data/ext/cumo/cuda/cublas.c +278 -0
  28. data/ext/cumo/cuda/driver.c +421 -0
  29. data/ext/cumo/cuda/memory_pool.cpp +185 -0
  30. data/ext/cumo/cuda/memory_pool_impl.cpp +308 -0
  31. data/ext/cumo/cuda/memory_pool_impl.hpp +370 -0
  32. data/ext/cumo/cuda/memory_pool_impl_test.cpp +554 -0
  33. data/ext/cumo/cuda/nvrtc.c +207 -0
  34. data/ext/cumo/cuda/runtime.c +167 -0
  35. data/ext/cumo/cumo.c +148 -0
  36. data/ext/cumo/depend.erb +58 -0
  37. data/ext/cumo/extconf.rb +179 -0
  38. data/ext/cumo/include/cumo.h +25 -0
  39. data/ext/cumo/include/cumo/compat.h +23 -0
  40. data/ext/cumo/include/cumo/cuda/cublas.h +153 -0
  41. data/ext/cumo/include/cumo/cuda/cumo_thrust.hpp +187 -0
  42. data/ext/cumo/include/cumo/cuda/cumo_thrust_complex.hpp +79 -0
  43. data/ext/cumo/include/cumo/cuda/driver.h +22 -0
  44. data/ext/cumo/include/cumo/cuda/memory_pool.h +28 -0
  45. data/ext/cumo/include/cumo/cuda/nvrtc.h +22 -0
  46. data/ext/cumo/include/cumo/cuda/runtime.h +40 -0
  47. data/ext/cumo/include/cumo/indexer.h +238 -0
  48. data/ext/cumo/include/cumo/intern.h +142 -0
  49. data/ext/cumo/include/cumo/intern_fwd.h +38 -0
  50. data/ext/cumo/include/cumo/intern_kernel.h +6 -0
  51. data/ext/cumo/include/cumo/narray.h +429 -0
  52. data/ext/cumo/include/cumo/narray_kernel.h +149 -0
  53. data/ext/cumo/include/cumo/ndloop.h +95 -0
  54. data/ext/cumo/include/cumo/reduce_kernel.h +126 -0
  55. data/ext/cumo/include/cumo/template.h +158 -0
  56. data/ext/cumo/include/cumo/template_kernel.h +77 -0
  57. data/ext/cumo/include/cumo/types/bit.h +40 -0
  58. data/ext/cumo/include/cumo/types/bit_kernel.h +34 -0
  59. data/ext/cumo/include/cumo/types/complex.h +402 -0
  60. data/ext/cumo/include/cumo/types/complex_kernel.h +414 -0
  61. data/ext/cumo/include/cumo/types/complex_macro.h +382 -0
  62. data/ext/cumo/include/cumo/types/complex_macro_kernel.h +186 -0
  63. data/ext/cumo/include/cumo/types/dcomplex.h +46 -0
  64. data/ext/cumo/include/cumo/types/dcomplex_kernel.h +13 -0
  65. data/ext/cumo/include/cumo/types/dfloat.h +47 -0
  66. data/ext/cumo/include/cumo/types/dfloat_kernel.h +14 -0
  67. data/ext/cumo/include/cumo/types/float_def.h +34 -0
  68. data/ext/cumo/include/cumo/types/float_def_kernel.h +39 -0
  69. data/ext/cumo/include/cumo/types/float_macro.h +191 -0
  70. data/ext/cumo/include/cumo/types/float_macro_kernel.h +158 -0
  71. data/ext/cumo/include/cumo/types/int16.h +24 -0
  72. data/ext/cumo/include/cumo/types/int16_kernel.h +23 -0
  73. data/ext/cumo/include/cumo/types/int32.h +24 -0
  74. data/ext/cumo/include/cumo/types/int32_kernel.h +19 -0
  75. data/ext/cumo/include/cumo/types/int64.h +24 -0
  76. data/ext/cumo/include/cumo/types/int64_kernel.h +19 -0
  77. data/ext/cumo/include/cumo/types/int8.h +24 -0
  78. data/ext/cumo/include/cumo/types/int8_kernel.h +19 -0
  79. data/ext/cumo/include/cumo/types/int_macro.h +67 -0
  80. data/ext/cumo/include/cumo/types/int_macro_kernel.h +48 -0
  81. data/ext/cumo/include/cumo/types/real_accum.h +486 -0
  82. data/ext/cumo/include/cumo/types/real_accum_kernel.h +101 -0
  83. data/ext/cumo/include/cumo/types/robj_macro.h +80 -0
  84. data/ext/cumo/include/cumo/types/robj_macro_kernel.h +0 -0
  85. data/ext/cumo/include/cumo/types/robject.h +27 -0
  86. data/ext/cumo/include/cumo/types/robject_kernel.h +7 -0
  87. data/ext/cumo/include/cumo/types/scomplex.h +46 -0
  88. data/ext/cumo/include/cumo/types/scomplex_kernel.h +13 -0
  89. data/ext/cumo/include/cumo/types/sfloat.h +48 -0
  90. data/ext/cumo/include/cumo/types/sfloat_kernel.h +14 -0
  91. data/ext/cumo/include/cumo/types/uint16.h +25 -0
  92. data/ext/cumo/include/cumo/types/uint16_kernel.h +20 -0
  93. data/ext/cumo/include/cumo/types/uint32.h +25 -0
  94. data/ext/cumo/include/cumo/types/uint32_kernel.h +20 -0
  95. data/ext/cumo/include/cumo/types/uint64.h +25 -0
  96. data/ext/cumo/include/cumo/types/uint64_kernel.h +20 -0
  97. data/ext/cumo/include/cumo/types/uint8.h +25 -0
  98. data/ext/cumo/include/cumo/types/uint8_kernel.h +20 -0
  99. data/ext/cumo/include/cumo/types/uint_macro.h +58 -0
  100. data/ext/cumo/include/cumo/types/uint_macro_kernel.h +38 -0
  101. data/ext/cumo/include/cumo/types/xint_macro.h +169 -0
  102. data/ext/cumo/include/cumo/types/xint_macro_kernel.h +88 -0
  103. data/ext/cumo/narray/SFMT-params.h +97 -0
  104. data/ext/cumo/narray/SFMT-params19937.h +46 -0
  105. data/ext/cumo/narray/SFMT.c +620 -0
  106. data/ext/cumo/narray/SFMT.h +167 -0
  107. data/ext/cumo/narray/array.c +638 -0
  108. data/ext/cumo/narray/data.c +961 -0
  109. data/ext/cumo/narray/gen/cogen.rb +56 -0
  110. data/ext/cumo/narray/gen/cogen_kernel.rb +58 -0
  111. data/ext/cumo/narray/gen/def/bit.rb +37 -0
  112. data/ext/cumo/narray/gen/def/dcomplex.rb +39 -0
  113. data/ext/cumo/narray/gen/def/dfloat.rb +37 -0
  114. data/ext/cumo/narray/gen/def/int16.rb +36 -0
  115. data/ext/cumo/narray/gen/def/int32.rb +36 -0
  116. data/ext/cumo/narray/gen/def/int64.rb +36 -0
  117. data/ext/cumo/narray/gen/def/int8.rb +36 -0
  118. data/ext/cumo/narray/gen/def/robject.rb +37 -0
  119. data/ext/cumo/narray/gen/def/scomplex.rb +39 -0
  120. data/ext/cumo/narray/gen/def/sfloat.rb +37 -0
  121. data/ext/cumo/narray/gen/def/uint16.rb +36 -0
  122. data/ext/cumo/narray/gen/def/uint32.rb +36 -0
  123. data/ext/cumo/narray/gen/def/uint64.rb +36 -0
  124. data/ext/cumo/narray/gen/def/uint8.rb +36 -0
  125. data/ext/cumo/narray/gen/erbpp2.rb +346 -0
  126. data/ext/cumo/narray/gen/narray_def.rb +268 -0
  127. data/ext/cumo/narray/gen/spec.rb +425 -0
  128. data/ext/cumo/narray/gen/tmpl/accum.c +86 -0
  129. data/ext/cumo/narray/gen/tmpl/accum_binary.c +121 -0
  130. data/ext/cumo/narray/gen/tmpl/accum_binary_kernel.cu +61 -0
  131. data/ext/cumo/narray/gen/tmpl/accum_index.c +119 -0
  132. data/ext/cumo/narray/gen/tmpl/accum_index_kernel.cu +66 -0
  133. data/ext/cumo/narray/gen/tmpl/accum_kernel.cu +12 -0
  134. data/ext/cumo/narray/gen/tmpl/alloc_func.c +107 -0
  135. data/ext/cumo/narray/gen/tmpl/allocate.c +37 -0
  136. data/ext/cumo/narray/gen/tmpl/aref.c +66 -0
  137. data/ext/cumo/narray/gen/tmpl/aref_cpu.c +50 -0
  138. data/ext/cumo/narray/gen/tmpl/aset.c +56 -0
  139. data/ext/cumo/narray/gen/tmpl/binary.c +162 -0
  140. data/ext/cumo/narray/gen/tmpl/binary2.c +70 -0
  141. data/ext/cumo/narray/gen/tmpl/binary2_kernel.cu +15 -0
  142. data/ext/cumo/narray/gen/tmpl/binary_kernel.cu +31 -0
  143. data/ext/cumo/narray/gen/tmpl/binary_s.c +45 -0
  144. data/ext/cumo/narray/gen/tmpl/binary_s_kernel.cu +15 -0
  145. data/ext/cumo/narray/gen/tmpl/bincount.c +181 -0
  146. data/ext/cumo/narray/gen/tmpl/cast.c +44 -0
  147. data/ext/cumo/narray/gen/tmpl/cast_array.c +13 -0
  148. data/ext/cumo/narray/gen/tmpl/class.c +9 -0
  149. data/ext/cumo/narray/gen/tmpl/class_kernel.cu +6 -0
  150. data/ext/cumo/narray/gen/tmpl/clip.c +121 -0
  151. data/ext/cumo/narray/gen/tmpl/coerce_cast.c +10 -0
  152. data/ext/cumo/narray/gen/tmpl/complex_accum_kernel.cu +129 -0
  153. data/ext/cumo/narray/gen/tmpl/cond_binary.c +68 -0
  154. data/ext/cumo/narray/gen/tmpl/cond_binary_kernel.cu +18 -0
  155. data/ext/cumo/narray/gen/tmpl/cond_unary.c +46 -0
  156. data/ext/cumo/narray/gen/tmpl/cum.c +50 -0
  157. data/ext/cumo/narray/gen/tmpl/each.c +47 -0
  158. data/ext/cumo/narray/gen/tmpl/each_with_index.c +70 -0
  159. data/ext/cumo/narray/gen/tmpl/ewcomp.c +79 -0
  160. data/ext/cumo/narray/gen/tmpl/ewcomp_kernel.cu +19 -0
  161. data/ext/cumo/narray/gen/tmpl/extract.c +22 -0
  162. data/ext/cumo/narray/gen/tmpl/extract_cpu.c +26 -0
  163. data/ext/cumo/narray/gen/tmpl/extract_data.c +53 -0
  164. data/ext/cumo/narray/gen/tmpl/eye.c +105 -0
  165. data/ext/cumo/narray/gen/tmpl/eye_kernel.cu +19 -0
  166. data/ext/cumo/narray/gen/tmpl/fill.c +52 -0
  167. data/ext/cumo/narray/gen/tmpl/fill_kernel.cu +29 -0
  168. data/ext/cumo/narray/gen/tmpl/float_accum_kernel.cu +106 -0
  169. data/ext/cumo/narray/gen/tmpl/format.c +62 -0
  170. data/ext/cumo/narray/gen/tmpl/format_to_a.c +49 -0
  171. data/ext/cumo/narray/gen/tmpl/frexp.c +38 -0
  172. data/ext/cumo/narray/gen/tmpl/gemm.c +203 -0
  173. data/ext/cumo/narray/gen/tmpl/init_class.c +20 -0
  174. data/ext/cumo/narray/gen/tmpl/init_module.c +12 -0
  175. data/ext/cumo/narray/gen/tmpl/inspect.c +21 -0
  176. data/ext/cumo/narray/gen/tmpl/lib.c +50 -0
  177. data/ext/cumo/narray/gen/tmpl/lib_kernel.cu +24 -0
  178. data/ext/cumo/narray/gen/tmpl/logseq.c +102 -0
  179. data/ext/cumo/narray/gen/tmpl/logseq_kernel.cu +31 -0
  180. data/ext/cumo/narray/gen/tmpl/map_with_index.c +98 -0
  181. data/ext/cumo/narray/gen/tmpl/median.c +66 -0
  182. data/ext/cumo/narray/gen/tmpl/minmax.c +47 -0
  183. data/ext/cumo/narray/gen/tmpl/module.c +9 -0
  184. data/ext/cumo/narray/gen/tmpl/module_kernel.cu +1 -0
  185. data/ext/cumo/narray/gen/tmpl/new_dim0.c +15 -0
  186. data/ext/cumo/narray/gen/tmpl/new_dim0_kernel.cu +8 -0
  187. data/ext/cumo/narray/gen/tmpl/poly.c +50 -0
  188. data/ext/cumo/narray/gen/tmpl/pow.c +97 -0
  189. data/ext/cumo/narray/gen/tmpl/pow_kernel.cu +29 -0
  190. data/ext/cumo/narray/gen/tmpl/powint.c +17 -0
  191. data/ext/cumo/narray/gen/tmpl/qsort.c +212 -0
  192. data/ext/cumo/narray/gen/tmpl/rand.c +168 -0
  193. data/ext/cumo/narray/gen/tmpl/rand_norm.c +121 -0
  194. data/ext/cumo/narray/gen/tmpl/real_accum_kernel.cu +75 -0
  195. data/ext/cumo/narray/gen/tmpl/seq.c +112 -0
  196. data/ext/cumo/narray/gen/tmpl/seq_kernel.cu +43 -0
  197. data/ext/cumo/narray/gen/tmpl/set2.c +57 -0
  198. data/ext/cumo/narray/gen/tmpl/sort.c +48 -0
  199. data/ext/cumo/narray/gen/tmpl/sort_index.c +111 -0
  200. data/ext/cumo/narray/gen/tmpl/store.c +41 -0
  201. data/ext/cumo/narray/gen/tmpl/store_array.c +187 -0
  202. data/ext/cumo/narray/gen/tmpl/store_array_kernel.cu +58 -0
  203. data/ext/cumo/narray/gen/tmpl/store_bit.c +86 -0
  204. data/ext/cumo/narray/gen/tmpl/store_bit_kernel.cu +66 -0
  205. data/ext/cumo/narray/gen/tmpl/store_from.c +81 -0
  206. data/ext/cumo/narray/gen/tmpl/store_from_kernel.cu +58 -0
  207. data/ext/cumo/narray/gen/tmpl/store_kernel.cu +3 -0
  208. data/ext/cumo/narray/gen/tmpl/store_numeric.c +9 -0
  209. data/ext/cumo/narray/gen/tmpl/to_a.c +43 -0
  210. data/ext/cumo/narray/gen/tmpl/unary.c +132 -0
  211. data/ext/cumo/narray/gen/tmpl/unary2.c +60 -0
  212. data/ext/cumo/narray/gen/tmpl/unary_kernel.cu +72 -0
  213. data/ext/cumo/narray/gen/tmpl/unary_ret2.c +34 -0
  214. data/ext/cumo/narray/gen/tmpl/unary_s.c +86 -0
  215. data/ext/cumo/narray/gen/tmpl/unary_s_kernel.cu +58 -0
  216. data/ext/cumo/narray/gen/tmpl_bit/allocate.c +24 -0
  217. data/ext/cumo/narray/gen/tmpl_bit/aref.c +54 -0
  218. data/ext/cumo/narray/gen/tmpl_bit/aref_cpu.c +57 -0
  219. data/ext/cumo/narray/gen/tmpl_bit/aset.c +56 -0
  220. data/ext/cumo/narray/gen/tmpl_bit/binary.c +98 -0
  221. data/ext/cumo/narray/gen/tmpl_bit/bit_count.c +64 -0
  222. data/ext/cumo/narray/gen/tmpl_bit/bit_count_cpu.c +88 -0
  223. data/ext/cumo/narray/gen/tmpl_bit/bit_count_kernel.cu +76 -0
  224. data/ext/cumo/narray/gen/tmpl_bit/bit_reduce.c +133 -0
  225. data/ext/cumo/narray/gen/tmpl_bit/each.c +48 -0
  226. data/ext/cumo/narray/gen/tmpl_bit/each_with_index.c +70 -0
  227. data/ext/cumo/narray/gen/tmpl_bit/extract.c +30 -0
  228. data/ext/cumo/narray/gen/tmpl_bit/extract_cpu.c +29 -0
  229. data/ext/cumo/narray/gen/tmpl_bit/fill.c +69 -0
  230. data/ext/cumo/narray/gen/tmpl_bit/format.c +64 -0
  231. data/ext/cumo/narray/gen/tmpl_bit/format_to_a.c +51 -0
  232. data/ext/cumo/narray/gen/tmpl_bit/inspect.c +21 -0
  233. data/ext/cumo/narray/gen/tmpl_bit/mask.c +136 -0
  234. data/ext/cumo/narray/gen/tmpl_bit/none_p.c +14 -0
  235. data/ext/cumo/narray/gen/tmpl_bit/store_array.c +108 -0
  236. data/ext/cumo/narray/gen/tmpl_bit/store_bit.c +70 -0
  237. data/ext/cumo/narray/gen/tmpl_bit/store_from.c +60 -0
  238. data/ext/cumo/narray/gen/tmpl_bit/to_a.c +47 -0
  239. data/ext/cumo/narray/gen/tmpl_bit/unary.c +81 -0
  240. data/ext/cumo/narray/gen/tmpl_bit/where.c +90 -0
  241. data/ext/cumo/narray/gen/tmpl_bit/where2.c +95 -0
  242. data/ext/cumo/narray/index.c +880 -0
  243. data/ext/cumo/narray/kwargs.c +153 -0
  244. data/ext/cumo/narray/math.c +142 -0
  245. data/ext/cumo/narray/narray.c +1948 -0
  246. data/ext/cumo/narray/ndloop.c +2105 -0
  247. data/ext/cumo/narray/rand.c +45 -0
  248. data/ext/cumo/narray/step.c +474 -0
  249. data/ext/cumo/narray/struct.c +886 -0
  250. data/lib/cumo.rb +3 -0
  251. data/lib/cumo/cuda.rb +11 -0
  252. data/lib/cumo/cuda/compile_error.rb +36 -0
  253. data/lib/cumo/cuda/compiler.rb +161 -0
  254. data/lib/cumo/cuda/device.rb +47 -0
  255. data/lib/cumo/cuda/link_state.rb +31 -0
  256. data/lib/cumo/cuda/module.rb +40 -0
  257. data/lib/cumo/cuda/nvrtc_program.rb +27 -0
  258. data/lib/cumo/linalg.rb +12 -0
  259. data/lib/cumo/narray.rb +2 -0
  260. data/lib/cumo/narray/extra.rb +1278 -0
  261. data/lib/erbpp.rb +294 -0
  262. data/lib/erbpp/line_number.rb +137 -0
  263. data/lib/erbpp/narray_def.rb +381 -0
  264. data/numo-narray-version +1 -0
  265. data/run.gdb +7 -0
  266. metadata +353 -0
@@ -0,0 +1,67 @@
1
+ require "test/unit"
2
+ require "mkmf-cu/opt"
3
+ require "mkmf-cu"
4
+
5
+ class TestMkmfCuOpt < Test::Unit::TestCase
6
+
7
+ def setup
8
+ @opt, @opt_h = build_optparser()
9
+ end
10
+
11
+ def test_opt
12
+ argv = ["-pipe", "-Os", "-x", "cu", "-O2", "-Wall", "-arch", "x86_64"]
13
+ parse_ill_short(argv, @opt_h)
14
+ assert_equal(["-Os", "-x", "cu", "-O2", "-Wall", "--arch", "x86_64"],
15
+ argv)
16
+ end
17
+
18
+ def test_Gg
19
+ argv = ["-g", "-G"]
20
+ parse_ill_short(argv, @opt_h)
21
+ @opt.parse(argv)
22
+ assert_equal({"-g" => [""], "-G" => [""]}, @opt_h)
23
+ end
24
+
25
+ def test_compiler_option
26
+ @opt_h.merge!({"-shared"=>[""], "-pipe"=>[""]})
27
+ assert_equal(" --compiler-options -pipe", compiler_option(@opt_h))
28
+ end
29
+
30
+ def test_Wl
31
+ argv = ["-Wl,-headerpad_max_install_names", "-fstack-protector", "-L/opt/local/lib",
32
+ "-Wl,-undefined,dynamic_lookup", "-Wl,-multiply_defined,suppress"]
33
+ parse_ill_short_with_arg(argv, @opt_h)
34
+ assert_equal(["--Wl=-headerpad_max_install_names", "-fstack-protector", "-L/opt/local/lib",
35
+ "--Wl=-undefined,dynamic_lookup", "--Wl=-multiply_defined,suppress"],
36
+ argv)
37
+ end
38
+
39
+ def test_linker_option
40
+ @opt_h.merge!({"-Wl"=>["-a", "-b"]})
41
+ assert_equal(" --linker-options -a --linker-options -b",
42
+ linker_option(@opt_h))
43
+ end
44
+
45
+ def test_std
46
+ argv = ["-std=c++11"]
47
+ parse_ill_short_with_arg(argv, @opt_h)
48
+ @opt.parse(argv)
49
+ assert_equal({"-std" => ["c++11"]}, @opt_h)
50
+ end
51
+
52
+ def test_compiler_bin
53
+ h = Hash.new{|h, k| h[k] = [] }.merge({"-shared"=>[""], "-pipe"=>[""], "--mkmf-cu-ext"=>["c"]})
54
+ assert_equal(" --compiler-bindir " + RbConfig::CONFIG["CC"],
55
+ compiler_bin(h))
56
+ end
57
+
58
+ def test_mkmf_cu
59
+ assert MakeMakefile::C_EXT.include?("cu")
60
+ assert MakeMakefile::SRC_EXT.include?("cu")
61
+
62
+ treat_cu_as_cxx()
63
+ assert !MakeMakefile::C_EXT.include?("cu")
64
+ assert MakeMakefile::CXX_EXT.include?("cu")
65
+ end
66
+
67
+ end
@@ -0,0 +1,46 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.
6
+
7
+ ## Our Standards
8
+
9
+ Examples of behavior that contributes to creating a positive environment include:
10
+
11
+ * Using welcoming and inclusive language
12
+ * Being respectful of differing viewpoints and experiences
13
+ * Gracefully accepting constructive criticism
14
+ * Focusing on what is best for the community
15
+ * Showing empathy towards other community members
16
+
17
+ Examples of unacceptable behavior by participants include:
18
+
19
+ * The use of sexualized language or imagery and unwelcome sexual attention or advances
20
+ * Trolling, insulting/derogatory comments, and personal or political attacks
21
+ * Public or private harassment
22
+ * Publishing others' private information, such as a physical or electronic address, without explicit permission
23
+ * Other conduct which could reasonably be considered inappropriate in a professional setting
24
+
25
+ ## Our Responsibilities
26
+
27
+ Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
28
+
29
+ Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
30
+
31
+ ## Scope
32
+
33
+ This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
34
+
35
+ ## Enforcement
36
+
37
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at sonots@gmail.com. The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
38
+
39
+ Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.
40
+
41
+ ## Attribution
42
+
43
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version]
44
+
45
+ [homepage]: http://contributor-covenant.org
46
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,8 @@
1
+ source "https://rubygems.org"
2
+
3
+ gemspec
4
+
5
+ gem 'test-unit'
6
+ gem 'yard'
7
+ gem 'pry-byebug'
8
+ gem 'power_assert', git: 'https://github.com/k-tsj/power_assert'
@@ -0,0 +1,82 @@
1
+ BSD 3-Clause License
2
+
3
+ Copyright (c) 2017 Naotoshi Seo
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
22
+
23
+ ######################################################################
24
+ # The Cumo is a fork of Numo NArray v0.9.0.9.
25
+ # Cumo's source code and documents contain the original Ruby/Numo ones.
26
+ ######################################################################
27
+ BSD 3-Clause License
28
+
29
+ Copyright (c) 1999-2017, Masahiro TANAKA
30
+ All rights reserved.
31
+
32
+ Redistribution and use in source and binary forms, with or without
33
+ modification, are permitted provided that the following conditions are met:
34
+
35
+ * Redistributions of source code must retain the above copyright notice, this
36
+ list of conditions and the following disclaimer.
37
+
38
+ * Redistributions in binary form must reproduce the above copyright notice,
39
+ this list of conditions and the following disclaimer in the documentation
40
+ and/or other materials provided with the distribution.
41
+
42
+ * Neither the name of the copyright holder nor the names of its
43
+ contributors may be used to endorse or promote products derived from
44
+ this software without specific prior written permission.
45
+
46
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
47
+ AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
48
+ IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
49
+ DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
50
+ FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
51
+ DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
52
+ SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
53
+ CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
54
+ OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
55
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
56
+
57
+ ######################################################################
58
+ # The Cumo refers some implementations of CuPy such as memory pool.
59
+ # Cumo's source code and documents contain the original CuPy ones.
60
+ ######################################################################
61
+ MIT License
62
+
63
+ Copyright (c) 2015 Preferred Infrastructure, Inc.
64
+ Copyright (c) 2015 Preferred Networks, Inc.
65
+
66
+ Permission is hereby granted, free of charge, to any person obtaining a copy
67
+ of this software and associated documentation files (the "Software"), to deal
68
+ in the Software without restriction, including without limitation the rights
69
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
70
+ copies of the Software, and to permit persons to whom the Software is
71
+ furnished to do so, subject to the following conditions:
72
+
73
+ The above copyright notice and this permission notice shall be included in
74
+ all copies or substantial portions of the Software.
75
+
76
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
77
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
78
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
79
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
80
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
81
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
82
+ THE SOFTWARE.
@@ -0,0 +1,252 @@
1
+ **This project is still under development. This project received grant from [Ruby Association Grant 2017](http://www.ruby.or.jp/ja/news/20171206).**
2
+
3
+ # Cumo
4
+
5
+ Cumo (pronounced like "koomo") is CUDA-aware numerical library whose interface is highly compatible with [Ruby Numo](https://github.com/ruby-numo).
6
+ This library provides the benefit of speedup using GPU by replacing Numo with only a small piece of codes.
7
+
8
+
9
+ ## Requirements
10
+
11
+ * Ruby 2.0 or later
12
+ * NVIDIA GPU Compute Capability 6.0 (Pascal) or later
13
+ * CUDA 9.0 or later
14
+
15
+ ## Preparation
16
+
17
+ Install CUDA and setup environment variables as follows:
18
+
19
+ ```bash
20
+ export CUDA_PATH="/usr/local/cuda"
21
+ export CPATH="$CUDA_PATH/include:$CPATH"
22
+ export LD_LIBRARY_PATH="$CUDA_PATH/lib64:$CUDA_PATH/lib:$LD_LIBRARY_PATH"
23
+ export PATH="$CUDA_PATH/bin:$PATH"
24
+ export LIBRARY_PATH="$CUDA_PATH/lib64:$CUDA_PATH/lib:$LIBRARY_PATH"
25
+ ```
26
+
27
+ ## Installation
28
+
29
+ Add a following line to your Gemfile:
30
+
31
+ ```ruby
32
+ gem 'cumo'
33
+ ```
34
+
35
+ And then execute:
36
+
37
+ $ bundle
38
+
39
+ Or install it yourself as:
40
+
41
+ $ gem install cumo
42
+
43
+ ## How To Use
44
+
45
+ ### Quick start
46
+
47
+ An example:
48
+
49
+ ```ruby
50
+ [1] pry(main)> require "cumo/narray"
51
+ => true
52
+ [2] pry(main)> a = Cumo::DFloat.new(3,5).seq
53
+ => Cumo::DFloat#shape=[3,5]
54
+ [[0, 1, 2, 3, 4],
55
+ [5, 6, 7, 8, 9],
56
+ [10, 11, 12, 13, 14]]
57
+ [3] pry(main)> a.shape
58
+ => [3, 5]
59
+ [4] pry(main)> a.ndim
60
+ => 2
61
+ [5] pry(main)> a.class
62
+ => Cumo::DFloat
63
+ [6] pry(main)> a.size
64
+ => 15
65
+ ```
66
+
67
+ ### How to switch from Numo to Cumo
68
+
69
+ Basically, following command should make it work with Cumo.
70
+
71
+ ```
72
+ find . -type f | xargs sed -i -e 's/Numo/Cumo/g' -e 's/numo/cumo/g'
73
+ ```
74
+
75
+ If you want to switch Numo and Cumo dynamically, following snippets should work:
76
+
77
+ ```ruby
78
+ if gpu
79
+ require 'cumo/narray'
80
+ Xumo = Cumo
81
+ else
82
+ require 'numo/narray'
83
+ Xumo = Numo
84
+ end
85
+
86
+ a = Xumo::DFloat.new(3,5).seq
87
+ ```
88
+
89
+ ### Incompatibility With Numo
90
+
91
+ Following methods behave incompatibly with Numo as default for performance.
92
+
93
+ * `extract`
94
+ * `[]`
95
+ * `count_true`
96
+ * `count_false`
97
+
98
+ Numo returns a Ruby numeric object for 0-dimensional NArray, but Cumo returns the 0-dimensional NArray instead of a Ruby numeric object.
99
+ This is to avoid synchnoziation between CPU and GPU for performance.
100
+
101
+ You may set `CUMO_COMPATIBLE_MODE=ON` environment variable to force Cumo NArray behave compatibly with Numo NArray.
102
+
103
+ You may enable or disable `compatible_mode` as:
104
+
105
+ ```
106
+ require 'cumo'
107
+ Cumo.enable_compatible_mode # enable
108
+ Cumo.compattible_mode_enabled? #=> true
109
+ Cumo.disable_compatible_mode # disable
110
+ Cumo.compattible_mode_enabled? #=> false
111
+ ```
112
+
113
+ You can also use following methods which behaves as Numo NArray's methods. Behaviors of these methods do not depend on `compatible_mode`.
114
+
115
+ * `extract_cpu`
116
+ * `aref_cpu(*idx)`
117
+ * `count_true_cpu`
118
+ * `count_false_cpu`
119
+
120
+ ### Select a GPU device ID
121
+
122
+ Set `CUDA_VISIBLE_DEVICES=id` environment variable, or
123
+
124
+ ```
125
+ require 'cumo'
126
+ Cumo::CUDA::Runtime.cudaSetDevice(id)
127
+ ```
128
+
129
+ where `id` is an integer.
130
+
131
+ ### Disable GPU Memory Pool
132
+
133
+ Set `CUMO_MEMORY_POOL=OFF` environment variable , or
134
+
135
+ ```
136
+ require 'cumo'
137
+ Cumo::CUDA::MemoryPool.disable
138
+ ```
139
+
140
+ if you found GPU memory pool implementation has a bug.
141
+
142
+ ## Documentation
143
+
144
+ See https://github.com/ruby-numo/numo-narray#documentation and replace Numo to Cumo.
145
+
146
+ ## Development
147
+
148
+ Install ruby dependencies:
149
+
150
+ ```
151
+ bundle install --path vendor/bundle
152
+ ```
153
+
154
+ Compile:
155
+
156
+ ```
157
+ bundle exec rake compile
158
+ ```
159
+
160
+ Run tests:
161
+
162
+ ```
163
+ bundle exec rake test
164
+ ```
165
+
166
+ Generate docs:
167
+
168
+ ```
169
+ bundle exec rake docs
170
+ ```
171
+
172
+ ## Source code organizations
173
+
174
+ * `*_kernel.{h,cuh,cu}` files are for device (CUDA kernels).
175
+ * .cu files are compiled via nvcc.
176
+ * .cu files define C wrapper functions to launch CUDA kernels to enable to be called from .c files.
177
+ * Technically, it is not possible to use CRuby API such as `VALUE` in .cu files.
178
+ * CRuby API is not callable from CUDA kernel because they do not have `__device__` modifier.
179
+ * nvcc does not support `#include RUBY_EXTCONF_H`, so can not include `ruby.h`.
180
+ * (RULE) It is allowed to use C++14 codes in .cu files.
181
+ * Rest of `*.{h,c}` files are for host (CPU).
182
+ * Call C wrapper functions defined in .cu files.
183
+ * It can use CRuby API.
184
+ * (RULE) It is not allowed to use C++ codes in host files.
185
+
186
+ Ruby's `mkmf` (or `extconf.rb`) does not support to specify 3rd compiler such as NVCC for another files of extensions `.cu`.
187
+ Therefore, cumo specify a wrapper command `bin/mkmf-cu-nvcc` as a compiler and changes its behavor depending on extensions of files to compile.
188
+
189
+ ## Advanced Tips on Development
190
+
191
+ ### ccache
192
+
193
+ [ccache](https://ccache.samba.org/) would be useful to speedup compilation time.
194
+ Install ccache and setup as:
195
+
196
+
197
+ ```bash
198
+ export PATH="$HOME/opt/ccache/bin:$PATH"
199
+ ln -sf "$HOME/opt/ccache/bin/ccache" "$HOME/opt/ccache/bin/gcc"
200
+ ln -sf "$HOME/opt/ccache/bin/ccache" "$HOME/opt/ccache/bin/g++"
201
+ ln -sf "$HOME/opt/ccache/bin/ccache" "$HOME/opt/ccache/bin/nvcc"
202
+ ```
203
+
204
+ ## Run tests only a specific line
205
+
206
+ `--location` option is available as:
207
+
208
+ ```
209
+ bundle exec ruby test/narray_test.rb --location 121
210
+ ```
211
+
212
+ ### Compile and run tests only a specific type
213
+
214
+ `DTYPE` environment variable is available as:
215
+
216
+ ```
217
+ bundle exec DTYPE=dfloat rake compile
218
+ ```
219
+
220
+ ```
221
+ bundle exec DTYPE=dfloat ruby test/narray_test.rb
222
+ ```
223
+
224
+ ### Run tests with gdb
225
+
226
+ Compile with debug option:
227
+
228
+ ```
229
+ bundle exec DEBUG=1 rake compile
230
+ ```
231
+
232
+ Run tests with gdb:
233
+
234
+ ```
235
+ bundle exec gdb -x run.gdb --args ruby test/narray_test.rb
236
+ ```
237
+
238
+ You may put a breakpoint by calling `cumo_debug_breakpoint()` at C source codes.
239
+
240
+ ### Run program always synchronizing CPU and GPU
241
+
242
+ ```
243
+ bundle exec CUDA_LAUNCH_BLOCKING=1
244
+ ```
245
+
246
+ ## Contributing
247
+
248
+ Bug reports and pull requests are welcome on GitHub at https://github.com/sonots/cumo.
249
+
250
+ ## License
251
+
252
+ [LICENSE.txt](./LICENSE.txt)