ruby-dnn 0.6.10 → 0.7.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/API-Reference.ja.md +79 -43
- data/lib/dnn.rb +10 -10
- data/lib/dnn/core/activations.rb +29 -14
- data/lib/dnn/core/cnn_layers.rb +24 -41
- data/lib/dnn/core/layers.rb +77 -27
- data/lib/dnn/core/model.rb +18 -3
- data/lib/dnn/core/rnn_layers.rb +164 -95
- data/lib/dnn/core/util.rb +0 -5
- data/lib/dnn/lib/cifar10.rb +1 -1
- data/lib/dnn/lib/image_io.rb +1 -1
- data/lib/dnn/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 16d5b6027014914e3e3356599dbf5ba9735e04b6a9824da1615a3e0b5c3e0a75
+data.tar.gz: ad81eb7df3f442d0ab54b6df97f6dadfd085c8571335593c84eb6599fa4b3ea8
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: f8f53054e425bd8cba7a13ee99ca8c4a0ec356fd8750bca8f8ba84542b14d3f413ae385bf1aa710e289a991618cdb73d71f38ffe4f8cb61fa8d5662166360e9d
+data.tar.gz: 92425a72d2cc0e9072d36f10187571eeb710869e6d7246f5a611e7f992b763aaafe541a853f1133f50c45cdda6cda2e52ee11f80b6a876f6afe1993332bb40a3
data/API-Reference.ja.md
CHANGED
@@ -2,7 +2,7 @@
 This is the API reference for ruby-dnn. It covers only the classes and methods needed to use the API.
 For further details of the implementation, please refer to the source code.
 
-Last updated for version: 0.
+Last updated for version: 0.7.0
 
 # module DNN
 The module that forms the ruby-dnn namespace.
@@ -187,6 +187,35 @@ Unlike predict, it returns a single output for a single input.
 Numo::SFloat
 Returns the inference result.
 
+## def copy
+Creates a new model that is a copy of the current model.
+### arguments
+None.
+### return
+Model
+The copied model.
+
+## def get_layer(index)
+Gets the layer at the specified index.
+### arguments
+* Integer index
+The index of the layer to get.
+### return
+Layer
+The instance of the target layer.
+
+## def get_layer(layer_class, index)
+Gets the layer of the class specified by layer_class, selected by index.
+### arguments
+* Layer layer_class
+The class of the layer to get.
+* Integer index
+The index of the layer. For example, when layers is [InputLayer, Dense, Dense, SoftmaxWithLoss],
+the index used to get the first Dense is 0.
+### return
+Layer
+The instance of the target layer.
+
 
 # module Layers
 The module that forms the namespace for layers.
@@ -274,7 +303,29 @@ Hash
 Specifies the dimension or shape of the input layer. An Integer argument is treated as a dimension, and an Array as a shape.
 
 
-# class
+# class Connection < HasParamLayer
+The superclass of all layers that connect neurons.
+
+## 【Properties】
+
+## attr_reader :weight_initializer
+Initializer
+Gets the initializer used to initialize the weights.
+
+## attr_reader :bias_initializer
+Initializer
+Gets the initializer used to initialize the bias.
+
+## attr_reader :l1_lambda
+Float
+Gets the L1 regularization coefficient for the weights.
+
+## attr_reader :l2_lambda
+Float
+Gets the L2 regularization coefficient for the weights.
+
+
+# class Dense < Connection
 The class for fully connected layers.
 
 ## 【Properties】
@@ -283,28 +334,26 @@ Hash
 Integer
 Gets the number of nodes in the layer.
 
-## attr_reader :weight_decay
-Float
-Gets the weight decay coefficient.
-
 ## 【Instance methods】
 
-## def initialize(num_nodes, weight_initializer: nil, bias_initializer: nil,
+## def initialize(num_nodes, weight_initializer: nil, bias_initializer: nil, l1_lambda: 0, l2_lambda: 0)
 Constructor.
 ### arguments
 * Integer num_nodes
 Sets the number of nodes in the layer.
 * Initializer weight_initializer: nil
-
+Sets the initializer used to initialize the weights.
 If nil is specified, the RandomNormal initializer is used.
 * Initializer bias_initializer: nil
 Sets the initializer used to initialize the bias.
 If nil is specified, the Zeros initializer is used.
-* Float
-
+* Float l1_lambda: 0
+Sets the L1 regularization coefficient for the weights.
+* Float l2_lambda: 0
+Sets the L2 regularization coefficient for the weights.
 
 
-# class Conv2D <
+# class Conv2D < Connection
 The class for convolution layers.
 
 ## 【Properties】
@@ -323,13 +372,9 @@ Array
 The stride unit used when performing the convolution.
 Retrieved in the form [Integer height, Integer width].
 
-## attr_reader :weight_decay
-Float
-Gets the strength of the L2 regularization term used for weight decay.
-
 ## 【Instance methods】
 
-## def initialize(num_filters, filter_size, weight_initializer: nil, bias_initializer: nil, strides: 1, padding false,
+## def initialize(num_filters, filter_size, weight_initializer: nil, bias_initializer: nil, strides: 1, padding false, l1_lambda: 0, l2_lambda: 0)
 Constructor.
 ### arguments
 * Integer num_filters
@@ -348,8 +393,10 @@ When specified as an Array, use the form [Integer height, Integer width].
 * bool padding: true
 Sets whether to apply zero padding to the image. If true, zero padding is applied so that the size of the output image
 matches that of the input image.
-* Float
-
+* Float l1_lambda: 0
+Sets the L1 regularization coefficient for the weights.
+* Float l2_lambda: 0
+Sets the L2 regularization coefficient for the weights.
 
 
 # class Pool2D < Layer
@@ -412,16 +459,9 @@ Array
 When specified as an Array, use the form [Integer height, Integer width].
 
 
-# class RNN <
+# class RNN < Connection
 The superclass of all recurrent neural network layers.
 
-## 【Properties】
-
-## attr_accessor :h
-Numo::SFloat
-Gets the current state of the hidden layer.
-The state of the hidden layer can be reset by setting nil.
-
 ## attr_reader :num_nodes
 Integer
 Gets the number of nodes in the layer.
@@ -430,13 +470,9 @@ Integer
 bool
 Returns whether the layer is stateful.
 
-## attr_reader :weight_decay
-Float
-Gets the weight decay coefficient.
-
 ## 【Instance methods】
 
-## def initialize(num_nodes, stateful: false, return_sequences: true, weight_initializer: nil, bias_initializer: nil,
+## def initialize(num_nodes, stateful: false, return_sequences: true, weight_initializer: nil, bias_initializer: nil, l1_lamda: 0, l2_lambda: 0)
 Constructor.
 ### arguments
 * Integer num_nodes
@@ -452,8 +488,13 @@ If nil is specified, the RandomNormal initializer is used.
 * Initializer bias_initializer: nil
 Sets the initializer used to initialize the bias.
 If nil is specified, the Zeros initializer is used.
-* Float
-
+* Float l1_lambda: 0
+Sets the L1 regularization coefficient for the weights.
+* Float l2_lambda: 0
+Sets the L2 regularization coefficient for the weights.
+
+## def reset_state
+Resets the state of the hidden layer.
 
 
 # class SimpleRNN < RNN
@@ -461,7 +502,7 @@ If nil is specified, the Zeros initializer is used.
 
 ## 【Instance methods】
 
-## def initialize(num_nodes, stateful: false, return_sequences: true, activation: nil, weight_initializer: nil, bias_initializer: nil,
+## def initialize(num_nodes, stateful: false, return_sequences: true, activation: nil, weight_initializer: nil, bias_initializer: nil, l1_lamda: 0, l2_lambda: 0)
 Constructor.
 ### arguments
 * Integer num_nodes
@@ -480,20 +521,15 @@ If nil is specified, the RandomNormal initializer is used.
 * Initializer bias_initializer: nil
 Sets the initializer used to initialize the bias.
 If nil is specified, the Zeros initializer is used.
-* Float
-
+* Float l1_lambda: 0
+Sets the L1 regularization coefficient for the weights.
+* Float l2_lambda: 0
+Sets the L2 regularization coefficient for the weights.
 
 
 # class LSTM < RNN
 The class for LSTM layers.
 
-## 【Properties】
-
-## attr_accessor :cell
-Numo::SFloat
-Gets the current cell state of the hidden layer.
-The cell state of the hidden layer can be reset by setting nil.
-
 
 # class GRU < RNN
 The class for GRU layers.
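The documentation changes above replace the single weight_decay option with separate l1_lambda and l2_lambda coefficients on every Connection subclass and add Model#copy and Model#get_layer. The sketch below shows how the documented signatures fit together; the model topology, optimizer choice, and the << / compile calls follow common ruby-dnn usage and are assumptions for illustration, not content of this diff.

```ruby
require "dnn"
include DNN::Layers
include DNN::Activations

model = DNN::Model.new
model << InputLayer.new(784)
# New keyword arguments: per-layer L1/L2 regularization coefficients.
model << Dense.new(256, l1_lambda: 0.001)
model << Sigmoid.new
model << Dense.new(10, l2_lambda: 0.01)
model << SoftmaxWithLoss.new
model.compile(DNN::Optimizers::SGD.new)

# New Model API documented above.
first_dense = model.get_layer(Dense, 0)  # first Dense layer
third_layer = model.get_layer(2)         # layer at index 2
backup      = model.copy                 # deep copy of the whole model
```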
data/lib/dnn.rb
CHANGED
@@ -9,13 +9,13 @@ Xumo::SFloat.srand(rand(2**64))
 
 module DNN; end
 
-
-
-
-
-
-
-
-
-
-
+require_relative "dnn/version"
+require_relative "dnn/core/error"
+require_relative "dnn/core/model"
+require_relative "dnn/core/initializers"
+require_relative "dnn/core/layers"
+require_relative "dnn/core/activations"
+require_relative "dnn/core/cnn_layers"
+require_relative "dnn/core/rnn_layers"
+require_relative "dnn/core/optimizers"
+require_relative "dnn/core/util"
data/lib/dnn/core/activations.rb
CHANGED
@@ -2,8 +2,10 @@ module DNN
 module Activations
 
 class Sigmoid < Layers::Layer
+NMath = Xumo::NMath
+
 def forward(x)
-@out = 1 / (1 +
+@out = 1 / (1 + NMath.exp(-x))
 end
 
 def backward(dout)
@@ -13,8 +15,10 @@ module DNN
 
 
 class Tanh < Layers::Layer
+NMath = Xumo::NMath
+
 def forward(x)
-@out =
+@out = NMath.tanh(x)
 end
 
 def backward(dout)
@@ -36,25 +40,29 @@ module DNN
 
 
 class Softplus < Layers::Layer
+NMath = Xumo::NMath
+
 def forward(x)
 @x = x
-
+NMath.log(1 + NMath.exp(x))
 end
 
 def backward(dout)
-dout * (1 / (1 +
+dout * (1 / (1 + NMath.exp(-@x)))
 end
 end
 
 
 class Swish < Layers::Layer
+NMath = Xumo::NMath
+
 def forward(x)
 @x = x
-@out = x * (1 / (1 +
+@out = x * (1 / (1 + NMath.exp(-x)))
 end
 
 def backward(dout)
-dout * (@out + (1 / (1 +
+dout * (@out + (1 / (1 + NMath.exp(-@x))) * (1 - @out))
 end
 end
 
@@ -105,6 +113,8 @@ module DNN
 
 
 class ELU < Layers::Layer
+NMath = Xumo::NMath
+
 attr_reader :alpha
 
 def self.load_hash(hash)
@@ -122,7 +132,7 @@ module DNN
 x1 *= x
 x2 = Xumo::SFloat.zeros(x.shape)
 x2[x < 0] = 1
-x2 *= @alpha *
+x2 *= @alpha * NMath.exp(x) - @alpha
 x1 + x2
 end
 
@@ -131,7 +141,7 @@ module DNN
 dx[@x < 0] = 0
 dx2 = Xumo::SFloat.zeros(@x.shape)
 dx2[@x < 0] = 1
-dx2 *= @alpha *
+dx2 *= @alpha * NMath.exp(@x)
 dout * (dx + dx2)
 end
 
@@ -152,7 +162,7 @@ module DNN
 
 def loss(y)
 batch_size = y.shape[0]
-0.5 * ((@out - y)**2).sum / batch_size + ridge
+0.5 * ((@out - y)**2).sum / batch_size + lasso + ridge
 end
 end
 
@@ -171,7 +181,7 @@ module DNN
 
 def loss(y)
 batch_size = y.shape[0]
-(@out - y).abs.sum / batch_size + ridge
+(@out - y).abs.sum / batch_size + lasso + ridge
 end
 end
 
@@ -183,7 +193,8 @@ module DNN
 
 def loss(y)
 loss = loss_l1(y)
-
+loss = loss > 1 ? loss : loss_l2(y)
+@loss = loss + lasso + ridge
 end
 
 def backward(y)
@@ -210,8 +221,10 @@ module DNN
 
 
 class SoftmaxWithLoss < Layers::OutputLayer
+NMath = Xumo::NMath
+
 def forward(x)
-@out =
+@out = NMath.exp(x) / NMath.exp(x).sum(1).reshape(x.shape[0], 1)
 end
 
 def backward(y)
@@ -220,12 +233,14 @@ module DNN
 
 def loss(y)
 batch_size = y.shape[0]
--(y *
+-(y * NMath.log(@out + 1e-7)).sum / batch_size + lasso + ridge
 end
 end
 
 
 class SigmoidWithLoss < Layers::OutputLayer
+NMath = Xumo::NMath
+
 def initialize
 @sigmoid = Sigmoid.new
 end
@@ -240,7 +255,7 @@ module DNN
 
 def loss(y)
 batch_size = y.shape[0]
--(y *
+-(y * NMath.log(@out + 1e-7) + (1 - y) * NMath.log(1 - @out + 1e-7)).sum / batch_size + lasso + ridge
 end
 end
 
data/lib/dnn/core/cnn_layers.rb
CHANGED
@@ -60,29 +60,26 @@ module DNN
 end
 
 
-class Conv2D <
-include Initializers
+class Conv2D < Connection
 include Conv2DModule
 
 attr_reader :num_filters
 attr_reader :filter_size
 attr_reader :strides
-attr_reader :weight_decay
 
 def initialize(num_filters, filter_size,
 weight_initializer: nil,
 bias_initializer: nil,
 strides: 1,
 padding: false,
-
-
+l1_lambda: 0,
+l2_lambda: 0)
+super(weight_initializer: weight_initializer, bias_initializer: bias_initializer,
+l1_lambda: l1_lambda, l2_lambda: l1_lambda)
 @num_filters = num_filters
 @filter_size = filter_size.is_a?(Integer) ? [filter_size, filter_size] : filter_size
-@weight_initializer = (weight_initializer || RandomNormal.new)
-@bias_initializer = (bias_initializer || Zeros.new)
 @strides = strides.is_a?(Integer) ? [strides, strides] : strides
 @padding = padding
-@weight_decay = weight_decay
 end
 
 def self.load_hash(hash)
@@ -91,7 +88,8 @@ module DNN
 bias_initializer: Util.load_hash(hash[:bias_initializer]),
 strides: hash[:strides],
 padding: hash[:padding],
-
+l1_lambda: hash[:l1_lambda],
+l2_lambda: hash[:l2_lambda])
 end
 
 def build(model)
@@ -116,8 +114,9 @@ module DNN
 def backward(dout)
 dout = dout.reshape(dout.shape[0..2].reduce(:*), dout.shape[3])
 @grads[:weight] = @col.transpose.dot(dout)
-if @
-
+if @l1_lambda > 0
+@grads[:weight] += dlasso
+elsif @l2_lambda > 0
 @grads[:weight] += dridge
 end
 @grads[:bias] = dout.sum(0)
@@ -130,22 +129,11 @@ module DNN
 [*@out_size, @num_filters]
 end
 
-def ridge
-if @weight_decay > 0
-0.5 * @weight_decay * (@params[:weight]**2).sum
-else
-0
-end
-end
-
 def to_hash
 super({num_filters: @num_filters,
 filter_size: @filter_size,
-weight_initializer: @weight_initializer.to_hash,
-bias_initializer: @bias_initializer.to_hash,
 strides: @strides,
-padding: @padding
-weight_decay: @weight_decay})
+padding: @padding})
 end
 
 private
@@ -154,8 +142,7 @@ module DNN
 num_prev_filter = prev_layer.shape[2]
 @params[:weight] = Xumo::SFloat.new(num_prev_filter * @filter_size.reduce(:*), @num_filters)
 @params[:bias] = Xumo::SFloat.new(@num_filters)
-
-@bias_initializer.init_param(self, :bias)
+super()
 end
 end
 
@@ -193,18 +180,6 @@ module DNN
 end
 end
 
-def forward(x)
-x = padding(x, @pad) if @padding
-@x_shape = x.shape
-col = im2col(x, *@out_size, *@pool_size, @strides)
-col.reshape(x.shape[0] * @out_size.reduce(:*) * x.shape[3], @pool_size.reduce(:*))
-end
-
-def backward(dcol)
-dx = col2im(dcol, @x_shape, *@out_size, *@pool_size, @strides)
-@padding ? back_padding(dx, @pad) : dx
-end
-
 def shape
 [*@out_size, @num_channel]
 end
@@ -224,7 +199,10 @@ module DNN
 end
 
 def forward(x)
-
+x = padding(x, @pad) if @padding
+@x_shape = x.shape
+col = im2col(x, *@out_size, *@pool_size, @strides)
+col = col.reshape(x.shape[0] * @out_size.reduce(:*) * x.shape[3], @pool_size.reduce(:*))
 @max_index = col.max_index(1)
 col.max(1).reshape(x.shape[0], *@out_size, x.shape[3])
 end
@@ -233,7 +211,8 @@ module DNN
 dmax = Xumo::SFloat.zeros(dout.size * @pool_size.reduce(:*))
 dmax[@max_index] = dout.flatten
 dcol = dmax.reshape(dout.shape[0..2].reduce(:*), dout.shape[3] * @pool_size.reduce(:*))
-
+dx = col2im(dcol, @x_shape, *@out_size, *@pool_size, @strides)
+@padding ? back_padding(dx, @pad) : dx
 end
 end
 
@@ -244,7 +223,10 @@ module DNN
 end
 
 def forward(x)
-
+x = padding(x, @pad) if @padding
+@x_shape = x.shape
+col = im2col(x, *@out_size, *@pool_size, @strides)
+col = col.reshape(x.shape[0] * @out_size.reduce(:*) * x.shape[3], @pool_size.reduce(:*))
 col.mean(1).reshape(x.shape[0], *@out_size, x.shape[3])
 end
 
@@ -256,7 +238,8 @@ module DNN
 davg[true, i] = dout.flatten
 end
 dcol = davg.reshape(dout.shape[0..2].reduce(:*), dout.shape[3] * @pool_size.reduce(:*))
-
+dx = col2im(dcol, @x_shape, *@out_size, *@pool_size, @strides)
+@padding ? back_padding(dx, @pad) : dx
 end
 end
 
data/lib/dnn/core/layers.rb
CHANGED
@@ -100,30 +100,86 @@ module DNN
 super({shape: @shape})
 end
 end
-
-
-class
+
+
+class Connection < HasParamLayer
 include Initializers
 
+attr_reader :l1_lambda
+attr_reader :l2_lambda
+
+def initialize(weight_initializer: nil,
+bias_initializer: nil,
+l1_lambda: 0,
+l2_lambda: 0)
+super()
+@weight_initializer = (weight_initializer || RandomNormal.new)
+@bias_initializer = (bias_initializer || Zeros.new)
+@l1_lambda = l1_lambda
+@l2_lambda = l2_lambda
+end
+
+def lasso
+if @l1_lambda > 0
+@l1_lambda * @params[:weight].abs.sum
+else
+0
+end
+end
+
+def ridge
+if @l2_lambda > 0
+0.5 * @l2_lambda * (@params[:weight]**2).sum
+else
+0
+end
+end
+
+def dlasso
+dlasso = Xumo::SFloat.ones(*@params[:weight].shape)
+dlasso[@params[:weight] < 0] = -1
+@l1_lambda * dlasso
+end
+
+def dridge
+@l2_lambda * @params[:weight]
+end
+
+def to_hash(merge_hash)
+super({weight_initializer: @weight_initializer.to_hash,
+bias_initializer: @bias_initializer.to_hash,
+l1_lambda: @l1_lambda,
+l2_lambda: @l2_lambda}.merge(merge_hash))
+end
+
+private
+
+def init_params
+@weight_initializer.init_param(self, :weight)
+@bias_initializer.init_param(self, :bias)
+end
+end
+
+
+class Dense < Connection
 attr_reader :num_nodes
-attr_reader :weight_decay
 
 def self.load_hash(hash)
 self.new(hash[:num_nodes],
 weight_initializer: Util.load_hash(hash[:weight_initializer]),
 bias_initializer: Util.load_hash(hash[:bias_initializer]),
-
+l1_lambda: hash[:l1_lambda],
+l2_lambda: hash[:l2_lambda])
 end
 
 def initialize(num_nodes,
 weight_initializer: nil,
 bias_initializer: nil,
-
-
+l1_lambda: 0,
+l2_lambda: 0)
+super(weight_initializer: weight_initializer, bias_initializer: bias_initializer,
+l1_lambda: l1_lambda, l2_lambda: l2_lambda)
 @num_nodes = num_nodes
-@weight_initializer = (weight_initializer || RandomNormal.new)
-@bias_initializer = (bias_initializer || Zeros.new)
-@weight_decay = weight_decay
 end
 
 def forward(x)
@@ -133,8 +189,9 @@ module DNN
 
 def backward(dout)
 @grads[:weight] = @x.transpose.dot(dout)
-if @
-
+if @l1_lambda > 0
+@grads[:weight] += dlasso
+elsif @l2_lambda > 0
 @grads[:weight] += dridge
 end
 @grads[:bias] = dout.sum(0)
@@ -145,19 +202,8 @@ module DNN
 [@num_nodes]
 end
 
-def ridge
-if @weight_decay > 0
-0.5 * @weight_decay * (@params[:weight]**2).sum
-else
-0
-end
-end
-
 def to_hash
-super({num_nodes: @num_nodes
-weight_initializer: @weight_initializer.to_hash,
-bias_initializer: @bias_initializer.to_hash,
-weight_decay: @weight_decay})
+super({num_nodes: @num_nodes})
 end
 
 private
@@ -166,8 +212,7 @@ module DNN
 num_prev_nodes = prev_layer.shape[0]
 @params[:weight] = Xumo::SFloat.new(num_prev_nodes, @num_nodes)
 @params[:bias] = Xumo::SFloat.new(@num_nodes)
-
-@bias_initializer.init_param(self, :bias)
+super()
 end
 end
 
@@ -218,9 +263,14 @@ module DNN
 
 class OutputLayer < Layer
 private
+
+def lasso
+@model.layers.select { |layer| layer.is_a?(Connection) }
+.reduce(0) { |sum, layer| sum + layer.lasso }
+end
 
 def ridge
-@model.layers.select { |layer| layer.
+@model.layers.select { |layer| layer.is_a?(Connection) }
 .reduce(0) { |sum, layer| sum + layer.ridge }
 end
 end
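The Connection class introduced above centralizes the regularization math: lasso and ridge are the penalty terms added to the loss, dlasso and dridge the corresponding terms added to the weight gradient. A standalone sketch of the same arithmetic using Numo directly (variable names and sizes are illustrative, not taken from the package):

```ruby
require "numo/narray"

l1_lambda = 0.01
l2_lambda = 0.01
weight = Numo::SFloat.new(3, 3).rand_norm

# Penalty terms added to the loss (as in Connection#lasso / #ridge):
#   L1 = lambda1 * sum(|w|),  L2 = 0.5 * lambda2 * sum(w^2)
lasso = l1_lambda * weight.abs.sum
ridge = 0.5 * l2_lambda * (weight**2).sum

# Gradient terms added to the weight gradient (as in Connection#dlasso / #dridge):
#   dL1/dw = lambda1 * sign(w),  dL2/dw = lambda2 * w
dlasso = Numo::SFloat.ones(*weight.shape)
dlasso[weight < 0] = -1
dlasso *= l1_lambda
dridge = l2_lambda * weight
```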
data/lib/dnn/core/model.rb
CHANGED
@@ -1,4 +1,5 @@
 require "json"
+require "base64"
 
 module DNN
 # This class deals with the model of the network.
@@ -32,8 +33,9 @@ module DNN
 @layers.each do |layer|
 next unless layer.is_a?(HasParamLayer)
 hash_params = has_param_layers_params[has_param_layers_index]
-hash_params.each do |key,
-
+hash_params.each do |key, (shape, base64_param)|
+bin = Base64.decode64(base64_param)
+layer.params[key] = Xumo::SFloat.from_binary(bin).reshape(*shape)
 end
 has_param_layers_index += 1
 end
@@ -59,7 +61,10 @@ module DNN
 def params_to_json
 has_param_layers = @layers.select { |layer| layer.is_a?(HasParamLayer) }
 has_param_layers_params = has_param_layers.map do |layer|
-layer.params.map { |key, param|
+layer.params.map { |key, param|
+base64_param = Base64.encode64(param.to_binary)
+[key, [param.shape, base64_param]]
+}.to_h
 end
 JSON.dump(has_param_layers_params)
 end
@@ -190,6 +195,16 @@ module DNN
 def copy
 Marshal.load(Marshal.dump(self))
 end
+
+def get_layer(*args)
+if args.length == 1
+index = args[0]
+@layers[index]
+else
+layer_class, index = args
+@layers.select { |layer| layer.is_a?(layer_class) }[index]
+end
+end
 
 def forward(x, training)
 @training = training
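The Model changes above switch parameter serialization to a [shape, Base64-encoded binary] pair per parameter instead of a plain nested array. A round-trip sketch of that encoding using Numo directly (assuming Xumo resolves to Numo on the CPU backend; the weight key and array contents are illustrative):

```ruby
require "json"
require "base64"
require "numo/narray"

param = Numo::SFloat.new(2, 3).seq

# Encode: shape plus Base64-encoded raw binary, as in Model#params_to_json.
encoded = [param.shape, Base64.encode64(param.to_binary)]
json = JSON.dump(weight: encoded)

# Decode: rebuild the array from the binary blob, as in Model#load.
shape, base64_param = JSON.parse(json)["weight"]
restored = Numo::SFloat.from_binary(Base64.decode64(base64_param)).reshape(*shape)
```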
data/lib/dnn/core/rnn_layers.rb
CHANGED
@@ -2,28 +2,25 @@ module DNN
 module Layers
 
 # Super class of all RNN classes.
-class RNN <
-include Initializers
+class RNN < Connection
 include Activations
 
 attr_accessor :h
 attr_reader :num_nodes
 attr_reader :stateful
-attr_reader :weight_decay
 
 def initialize(num_nodes,
 stateful: false,
 return_sequences: true,
 weight_initializer: nil,
 bias_initializer: nil,
-
-
+l1_lambda: 0,
+l2_lambda: 0)
+super(weight_initializer: weight_initializer, bias_initializer: bias_initializer,
+l1_lambda: l1_lambda, l2_lambda: l2_lambda)
 @num_nodes = num_nodes
 @stateful = stateful
 @return_sequences = return_sequences
-@weight_initializer = (weight_initializer || RandomNormal.new)
-@bias_initializer = (bias_initializer || Zeros.new)
-@weight_decay = weight_decay
 @layers = []
 @h = nil
 end
@@ -62,30 +59,61 @@ module DNN
 
 def to_hash(merge_hash = nil)
 hash = {
-class: self.class.name,
 num_nodes: @num_nodes,
 stateful: @stateful,
 return_sequences: @return_sequences,
-
-bias_initializer: @bias_initializer.to_hash,
-weight_decay: @weight_decay,
+h: @h.to_a
 }
 hash.merge!(merge_hash) if merge_hash
-hash
+super(hash)
 end
 
 def shape
 @return_sequences ? [@time_length, @num_nodes] : [@num_nodes]
 end
 
+def reset_state
+@h = @h.fill(0) if @h
+end
+
+def lasso
+if @l1_lambda > 0
+@l1_lambda * (@params[:weight].abs.sum + @params[:weight2].abs.sum)
+else
+0
+end
+end
+
 def ridge
-if @
-0.5 * (@
+if @l2_lambda > 0
+0.5 * (@l2_lambda * ((@params[:weight]**2).sum + (@params[:weight2]**2).sum))
 else
 0
 end
 end
 
+def dlasso
+dlasso = Xumo::SFloat.ones(*@params[:weight].shape)
+dlasso[@params[:weight] < 0] = -1
+@l1_lambda * dlasso
+end
+
+def dridge
+@l2_lambda * @params[:weight]
+end
+
+def dlasso2
+dlasso = Xumo::SFloat.ones(*@params[:weight2].shape)
+dlasso[@params[:weight2] < 0] = -1
+@l1_lambda * dlasso
+end
+
+def dridge2
+@l2_lambda * @params[:weight2]
+end
+
+private
+
 def init_params
 @time_length = prev_layer.shape[0]
 end
@@ -93,26 +121,32 @@ module DNN
 
 
 class SimpleRNN_Dense
-def initialize(
-@
-@
-@activation = activation
+def initialize(rnn)
+@rnn = rnn
+@activation = rnn.activation.clone
 end
 
 def forward(x, h)
 @x = x
 @h = h
-h2 = x.dot(@params[:weight]) + h.dot(@params[:weight2]) + @params[:bias]
+h2 = x.dot(@rnn.params[:weight]) + h.dot(@rnn.params[:weight2]) + @rnn.params[:bias]
 @activation.forward(h2)
 end
 
 def backward(dh2)
 dh2 = @activation.backward(dh2)
-@grads[:weight] += @x.transpose.dot(dh2)
-@grads[:weight2] += @h.transpose.dot(dh2)
-@
-
-
+@rnn.grads[:weight] += @x.transpose.dot(dh2)
+@rnn.grads[:weight2] += @h.transpose.dot(dh2)
+if @rnn.l1_lambda > 0
+@rnn.grads[:weight] += dlasso
+@rnn.grads[:weight2] += dlasso2
+elsif @rnn.l2_lambda > 0
+@rnn.grads[:weight] += dridge
+@grads[:weight2] += dridge2
+end
+@rnn.grads[:bias] += dh2.sum(0)
+dx = dh2.dot(@rnn.params[:weight].transpose)
+dh = dh2.dot(@rnn.params[:weight2].transpose)
 [dx, dh]
 end
 end
@@ -120,13 +154,16 @@ module DNN
 
 class SimpleRNN < RNN
 def self.load_hash(hash)
-self.new(hash[:num_nodes],
-
-
-
-
-
-
+simple_rnn = self.new(hash[:num_nodes],
+stateful: hash[:stateful],
+return_sequences: hash[:return_sequences],
+activation: Util.load_hash(hash[:activation]),
+weight_initializer: Util.load_hash(hash[:weight_initializer]),
+bias_initializer: Util.load_hash(hash[:bias_initializer]),
+l1_lambda: hash[:l1_lambda],
+l2_lambda: hash[:l2_lambda])
+simple_rnn.h = Xumo::SFloat.cast(hash[:h])
+simple_rnn
 end
 
 def initialize(num_nodes,
@@ -135,13 +172,15 @@ module DNN
 activation: nil,
 weight_initializer: nil,
 bias_initializer: nil,
-
+l1_lambda: 0,
+l2_lambda: 0)
 super(num_nodes,
 stateful: stateful,
 return_sequences: return_sequences,
 weight_initializer: weight_initializer,
 bias_initializer: bias_initializer,
-
+l1_lambda: 0,
+l2_lambda: 0)
 @activation = (activation || Tanh.new)
 end
 
@@ -161,16 +200,15 @@ module DNN
 @weight_initializer.init_param(self, :weight2)
 @bias_initializer.init_param(self, :bias)
 @time_length.times do |t|
-@layers << SimpleRNN_Dense.new(
+@layers << SimpleRNN_Dense.new(self)
 end
 end
 end
 
 
 class LSTM_Dense
-def initialize(
-@
-@grads = grads
+def initialize(rnn)
+@rnn = rnn
 @tanh = Tanh.new
 @g_tanh = Tanh.new
 @forget_sigmoid = Sigmoid.new
@@ -178,56 +216,67 @@ module DNN
 @out_sigmoid = Sigmoid.new
 end
 
-def forward(x, h,
+def forward(x, h, c)
 @x = x
 @h = h
-@
+@c = c
 num_nodes = h.shape[1]
-a = x.dot(@params[:weight]) + h.dot(@params[:weight2]) + @params[:bias]
+a = x.dot(@rnn.params[:weight]) + h.dot(@rnn.params[:weight2]) + @rnn.params[:bias]
 
 @forget = @forget_sigmoid.forward(a[true, 0...num_nodes])
 @g = @g_tanh.forward(a[true, num_nodes...(num_nodes * 2)])
 @in = @in_sigmoid.forward(a[true, (num_nodes * 2)...(num_nodes * 3)])
 @out = @out_sigmoid.forward(a[true, (num_nodes * 3)..-1])
 
-
-@
-h2 = @out * @
-[h2,
+c2 = @forget * c + @g * @in
+@tanh_c2 = @tanh.forward(c2)
+h2 = @out * @tanh_c2
+[h2, c2]
 end
 
-def backward(dh2,
-dh2_tmp = @
-
+def backward(dh2, dc2)
+dh2_tmp = @tanh_c2 * dh2
+dc2_tmp = @tanh.backward(@out * dh2) + dc2
 
 dout = @out_sigmoid.backward(dh2_tmp)
-din = @in_sigmoid.backward(
-dg = @g_tanh.backward(
-dforget = @forget_sigmoid.backward(
+din = @in_sigmoid.backward(dc2_tmp * @g)
+dg = @g_tanh.backward(dc2_tmp * @in)
+dforget = @forget_sigmoid.backward(dc2_tmp * @c)
 
 da = Xumo::SFloat.hstack([dforget, dg, din, dout])
 
-@grads[:weight] += @x.transpose.dot(da)
-@grads[:weight2] += @h.transpose.dot(da)
-@
-
-
-
-
+@rnn.grads[:weight] += @x.transpose.dot(da)
+@rnn.grads[:weight2] += @h.transpose.dot(da)
+if @rnn.l1_lambda > 0
+@rnn.grads[:weight] += dlasso
+@rnn.grads[:weight2] += dlasso2
+elsif @rnn.l2_lambda > 0
+@rnn.grads[:weight] += dridge
+@rnn.grads[:weight2] += dridge2
+end
+@rnn.grads[:bias] += da.sum(0)
+dx = da.dot(@rnn.params[:weight].transpose)
+dh = da.dot(@rnn.params[:weight2].transpose)
+dc = dc2_tmp * @forget
+[dx, dh, dc]
 end
 end
 
 
 class LSTM < RNN
-attr_accessor :
+attr_accessor :c
 
 def self.load_hash(hash)
-self.new(hash[:num_nodes],
-
-
-
-
-
+lstm = self.new(hash[:num_nodes],
+stateful: hash[:stateful],
+return_sequences: hash[:return_sequences],
+weight_initializer: Util.load_hash(hash[:weight_initializer]),
+bias_initializer: Util.load_hash(hash[:bias_initializer]),
+l1_lambda: hash[:l1_lambda],
+l2_lambda: hash[:l2_lambda])
+lstm.h = Xumo::SFloat.cast(hash[:h])
+lstm.c = Xumo::SFloat.cast(hash[:c])
+lstm
 end
 
 def initialize(num_nodes,
@@ -235,29 +284,30 @@ module DNN
 return_sequences: true,
 weight_initializer: nil,
 bias_initializer: nil,
-
+l1_lambda: 0,
+l2_lambda: 0)
 super
-@
+@c = nil
 end
 
 def forward(xs)
 @xs_shape = xs.shape
 hs = Xumo::SFloat.zeros(xs.shape[0], @time_length, @num_nodes)
 h = nil
-
+c = nil
 if @stateful
 h = @h if @h
-
+c = @c if @c
 end
 h ||= Xumo::SFloat.zeros(xs.shape[0], @num_nodes)
-
+c ||= Xumo::SFloat.zeros(xs.shape[0], @num_nodes)
 xs.shape[1].times do |t|
 x = xs[true, t, false]
-h,
+h, c = @layers[t].forward(x, h, c)
 hs[true, t, false] = h
 end
 @h = h
-@
+@c = c
 @return_sequences ? hs : h
 end
 
@@ -272,15 +322,24 @@ module DNN
 end
 dxs = Xumo::SFloat.zeros(@xs_shape)
 dh = 0
-
+dc = 0
 (0...dh2s.shape[1]).to_a.reverse.each do |t|
 dh2 = dh2s[true, t, false]
-dx, dh,
+dx, dh, dc = @layers[t].backward(dh2 + dh, dc)
 dxs[true, t, false] = dx
 end
 dxs
 end
 
+def reset_state
+super()
+@c = @c.fill(0) if @c
+end
+
+def to_hash
+super({c: @c.to_a})
+end
+
 private
 
 def init_params
@@ -293,16 +352,15 @@ module DNN
 @weight_initializer.init_param(self, :weight2)
 @bias_initializer.init_param(self, :bias)
 @time_length.times do |t|
-@layers << LSTM_Dense.new(
+@layers << LSTM_Dense.new(self)
 end
 end
 end
 
 
 class GRU_Dense
-def initialize(
-@
-@grads = grads
+def initialize(rnn)
+@rnn = rnn
 @update_sigmoid = Sigmoid.new
 @reset_sigmoid = Sigmoid.new
 @tanh = Tanh.new
@@ -312,16 +370,16 @@ module DNN
 @x = x
 @h = h
 num_nodes = h.shape[1]
-@weight_a = @params[:weight][true, 0...(num_nodes * 2)]
-@weight2_a = @params[:weight2][true, 0...(num_nodes * 2)]
-bias_a = @params[:bias][0...(num_nodes * 2)]
+@weight_a = @rnn.params[:weight][true, 0...(num_nodes * 2)]
+@weight2_a = @rnn.params[:weight2][true, 0...(num_nodes * 2)]
+bias_a = @rnn.params[:bias][0...(num_nodes * 2)]
 a = x.dot(@weight_a) + h.dot(@weight2_a) + bias_a
 @update = @update_sigmoid.forward(a[true, 0...num_nodes])
 @reset = @reset_sigmoid.forward(a[true, num_nodes..-1])
 
-@weight_h = @params[:weight][true, (num_nodes * 2)..-1]
-@weight2_h = @params[:weight2][true, (num_nodes * 2)..-1]
-bias_h = @params[:bias][(num_nodes * 2)..-1]
+@weight_h = @rnn.params[:weight][true, (num_nodes * 2)..-1]
+@weight2_h = @rnn.params[:weight2][true, (num_nodes * 2)..-1]
+bias_h = @rnn.params[:bias][(num_nodes * 2)..-1]
 @tanh_h = @tanh.forward(x.dot(@weight_h) + (h * @reset).dot(@weight2_h) + bias_h)
 h2 = (1 - @update) * h + @update * @tanh_h
 h2
@@ -346,9 +404,16 @@ module DNN
 dh += da.dot(@weight2_a.transpose)
 dbias_a = da.sum(0)
 
-@grads[:weight] += Xumo::SFloat.hstack([dweight_a, dweight_h])
-@grads[:weight2] += Xumo::SFloat.hstack([dweight2_a, dweight2_h])
-@
+@rnn.grads[:weight] += Xumo::SFloat.hstack([dweight_a, dweight_h])
+@rnn.grads[:weight2] += Xumo::SFloat.hstack([dweight2_a, dweight2_h])
+if @rnn.l1_lambda > 0
+@rnn.grads[:weight] += dlasso
+@rnn.grads[:weight2] += dlasso2
+elsif @rnn.l2_lambda > 0
+@rnn.grads[:weight] += dridge
+@rnn.grads[:weight2] += dridge2
+end
+@rnn.grads[:bias] += Xumo::SFloat.hstack([dbias_a, dbias_h])
 [dx, dh]
 end
 end
@@ -356,12 +421,15 @@ module DNN
 
 class GRU < RNN
 def self.load_hash(hash)
-self.new(hash[:num_nodes],
-
-
-
-
-
+gru = self.new(hash[:num_nodes],
+stateful: hash[:stateful],
+return_sequences: hash[:return_sequences],
+weight_initializer: Util.load_hash(hash[:weight_initializer]),
+bias_initializer: Util.load_hash(hash[:bias_initializer]),
+l1_lambda: hash[:l1_lambda],
+l2_lambda: hash[:l2_lambda])
+gru.h = Xumo::SFloat.cast(hash[:h])
+gru
 end
 
 def initialize(num_nodes,
@@ -369,7 +437,8 @@ module DNN
 return_sequences: true,
 weight_initializer: nil,
 bias_initializer: nil,
-
+l1_lambda: 0,
+l2_lambda: 0)
 super
 end
 
@@ -385,7 +454,7 @@ module DNN
 @weight_initializer.init_param(self, :weight2)
 @bias_initializer.init_param(self, :bias)
 @time_length.times do |t|
-@layers << GRU_Dense.new(
+@layers << GRU_Dense.new(self)
 end
 end
 end
data/lib/dnn/core/util.rb
CHANGED
data/lib/dnn/lib/cifar10.rb
CHANGED
@@ -1,8 +1,8 @@
 require "dnn"
-require "dnn/ext/cifar10_loader/cifar10_loader"
 require "open-uri"
 require "zlib"
 require "archive/tar/minitar"
+require_relative "dnn/ext/cifar10_loader/cifar10_loader"
 
 URL_CIFAR10 = "https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz"
 CIFAR10_DIR = "cifar-10-batches-bin"
data/lib/dnn/lib/image_io.rb
CHANGED
data/lib/dnn/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-dnn
 version: !ruby/object:Gem::Version
-version: 0.
+version: 0.7.0
 platform: ruby
 authors:
 - unagiootoro
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2018-09-
+date: 2018-09-19 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: numo-narray
|