froog 0.3.1__tar.gz → 0.4.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,7 +1,7 @@
  Metadata-Version: 2.1
  Name: froog
- Version: 0.3.1
- Summary: a beautifully simplistic tensor library
+ Version: 0.4.0
+ Summary: a toy tensor library with opencl support
  Author: Kevin Buhler
  License: MIT
  Classifier: Programming Language :: Python :: 3
@@ -9,10 +9,6 @@ Classifier: License :: OSI Approved :: MIT License
  Requires-Python: >=3.8
  Description-Content-Type: text/markdown
  License-File: LICENSE
- Requires-Dist: numpy
- Requires-Dist: requests
- Requires-Dist: matplotlib
- Requires-Dist: urllib

  # froog <img src="https://github.com/kevbuh/froog/actions/workflows/test.yml/badge.svg" alt="unit test badge" > <img src="https://static.pepy.tech/badge/froog" alt="num downloads badge">
  <div align="center" >
@@ -27,7 +23,7 @@ Requires-Dist: urllib
  <br/>
  </div>

- ```froog``` is an easy-to-read tensor library (<a href="https://www.pepy.tech/projects/froog">16k pip installs!</a>) meant for those looking to get into machine learning and who want to understand how the underlying machine learning framework's code works before they are ultra-optimized (which all modern ml libraries are).
+ ```froog``` is an easy-to-read tensor library (<a href="https://www.pepy.tech/projects/froog">25k pip installs!</a>) meant for those looking to get into machine learning and who want to understand how the underlying machine learning framework's code works before they are ultra-optimized (which all modern ml libraries are).

  ```froog``` encapsulates everything from <a href="https://github.com/kevbuh/froog/blob/main/models/linear_regression.py">linear regression</a> to <a href="https://github.com/kevbuh/froog/blob/main/models/efficientnet.py">convolutional neural networks </a> in under 1000 lines.

@@ -85,7 +81,7 @@ from froog.tensor import Tensor
  my_tensor = Tensor([1,2,3])
  ```

- Notice how we had to import numpy. If you want to create a Tensor manually, make sure that it is a Numpy array!
+ Notice how we had to import NumPy. If you want to create a Tensor manually, make sure that it is a NumPy array!

  <!-- Learn more about ```froog``` Tensors <a href="https://github.com/kevbuh/froog/blob/main/docs/tensors.md">here</a>. -->

@@ -95,13 +91,10 @@ Tensors are the fundamental datatype in froog, and one of the two main classes.

  - ```def __init__(self, data)```:

- - Tensor takes in one param, which is the data. Since froog has a numpy backend, the input data into tensors has to be a numpy array.
-
+ - Tensor takes in one param, which is the data. Since ```froog``` has a NumPy backend, the input data into tensors has to be a NumPy array.
  - Tensor has a ```self.data``` state that it holds. this contains the data inside of the tensor.
-
  - In addition, it has ```self.grad```. this is to hold what the gradients of the tensor is.
-
- - Lastly, it has ```self._ctx```. theser are the internal vairables used for autograd graph construction. put more simply, this is where the backward gradient computations are saved.
+ - Lastly, it has ```self._ctx```. These are the internal variables used for autograd graph construction. This is where the backward gradient computations are saved.

  *Properties*

@@ -109,38 +102,34 @@ Tensors are the fundamental datatype in froog, and one of the two main classes.

  *Methods*
  - ```def zeros(*shape)```: this returns a tensor full of zeros with any shape that you pass in. Defaults to np.float32
-
  - ```def ones(*shape)```: this returns a tensor full of ones with any shape that you pass in. Defaults to np.float32
-
  - ```def randn(*shape):```: this returns a randomly initialized Tensor of *shape

  *Gradient calculations*

- - ```froog``` computes gradients automatically through a process called automatic differentiation. it has a variable ```_ctx```, which stores the chain of operations. it will take the current operation, lets say a dot product, and go to the dot product definition in ```froog/ops.py```, which contains a backward pass specfically for dot products. all methods, from add to 2x2 maxpools, have this backward pass implemented.
+ - ```froog``` computes gradients automatically through a process called automatic differentiation. it has a variable ```_ctx```, which stores the chain of operations. It will take the current operation, let's say a dot product, and go to the dot product definition in ```froog/ops.py```, which contains a backward pass specifically for dot products. all methods, from add to 2x2 maxpools, have this backward pass implemented.

  *Functions*

  The other base class in froog is the class ```Function```. It keeps track of input tensors and tensors that need to be saved for backward passes

  - ```def __init__(self, *tensors)```: takes in an argument of tensors, which are then saved.
-
  - ```def save_for_backward(self, *x)```: saves Tensors that are necessary to compute for the computation of gradients in the backward pass.
-
- - ```def apply(self, arg, *x)```: This is what makes everything work. The apply() method takes care of the forward pass, applying the operation to the inputs.
+ - ```def apply(self, arg, *x)```: takes care of the forward pass, applying the operation to the inputs.

  *Register*

- ```def register(name, fxn)```: this function allows you to add a method to a Tensor. This allows you to chain any operations, e.g. x.dot(w).relu(), where w is a tensor
+ - ```def register(name, fxn)```: allows you to add a method to a Tensor. This allows you to chain any operations, e.g. x.dot(w).relu(), where w is a tensor

  # Creating a model

  Okay cool, so now you know that ```froog```'s main datatype is a Tensor and uses NumPy in the background. How do I actually build a model?

- Here's an example of how to create an MNIST multi-layer perceptron (MLP). We wanted to make it as simple as possible for you to do so so it resembles very basic python concepts like classes. There's really only two methods you need to define:
+ Here's an example of how to create an MNIST multi-layer perceptron (MLP). We wanted to make it as simple as possible for you to do so it resembles very basic Python concepts like classes. There are really only two methods you need to define:
  1. ```__init__``` that defines layers of the model (here we use ```Linear```)
  2. ```forward``` which defines how the input should flow through your model. We use a simple dot product with a ```Linear``` layer with a <a href="https://en.wikipedia.org/wiki/Rectifier_(neural_networks)">```ReLU```</a> activation.

- In order to create an instance of the ```mnistMLP``` model, do the same as you would in python: ```model = mnistMLP()``` .
+ To create an instance of the ```mnistMLP``` model, do the same as you would in Python: ```model = mnistMLP()```.

  We support a few different optimizers, <a href="https://github.com/kevbuh/froog/blob/main/froog/optim.py">here</a> which include:
  - <a href="https://en.wikipedia.org/wiki/Stochastic_gradient_descent">Stochastic Gradient Descent (SGD)</a>
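
As a reference for the README text above (the two methods a model defines, and the ```x.dot(w).relu()``` chaining that ```register``` enables), here is a minimal sketch of such an MLP. It assumes ```Linear``` lives in ```froog.nn``` and ```SGD``` in ```froog.optim```, matching the helpers visible elsewhere in this diff; treat it as illustrative rather than as code shipped in this release.

```python
from froog.tensor import Tensor
from froog.nn import Linear      # assumed module path for the Linear() helper
import froog.optim as optim      # assumed module path for the SGD optimizer

class mnistMLP:
  def __init__(self):
    # layer weights are plain Tensors wrapping the arrays Linear() returns
    self.l1 = Tensor(Linear(784, 128))
    self.l2 = Tensor(Linear(128, 10))

  def forward(self, x):
    # registered ops chain directly on Tensor: dot -> relu -> dot -> logsoftmax
    return x.dot(self.l1).relu().dot(self.l2).logsoftmax()

model = mnistMLP()
sgd = optim.SGD([model.l1, model.l2], lr=0.001)
```
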
@@ -201,7 +190,7 @@ So there are two quick examples to get you up and running. You might have notice

  ## GPU Support

- Have a GPU and need a speedup? You're in good luck because we have GPU support from for our operations defined in <a href="https://github.com/kevbuh/froog/blob/main/froog/ops_gpu.py">```ops_gpu.py```</a>. In order to do this we have a backend built on <a href="https://en.wikipedia.org/wiki/OpenGL">OpenGL</a> that invokes kernel functions that work on the GPU.
+ Have a GPU and need a speedup? You're in good luck because we have GPU support from for our operations defined in <a href="https://github.com/kevbuh/froog/blob/main/froog/ops_gpu.py">```ops_gpu.py```</a>. In order to do this we have a backend built on <a href="https://en.wikipedia.org/wiki/OpenCL">OpenCL</a> that invokes kernel functions that work on the GPU.

  Here's how you can send data to the GPU during a forward pass and bring it back to the CPU.

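
A hedged usage sketch for the GPU section above, reusing the ```mnistMLP``` sketch earlier. The ```.gpu()``` and ```.cpu()``` helpers are inferred from the ```gpu=``` flag and ```gpu_()``` method shown in the Tensor hunk later in this diff, so treat the exact method names as assumptions.

```python
import numpy as np
from froog.tensor import Tensor

x = Tensor(np.random.randn(4, 784).astype(np.float32))

x_gpu = x.gpu()              # assumed helper: copy the buffer to the OpenCL device
out = model.forward(x_gpu)   # ops dispatch to the kernels in ops_gpu.py
print(out.cpu().data)        # assumed helper: bring the result back as a NumPy array
```
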
@@ -11,7 +11,7 @@
  <br/>
  </div>

- ```froog``` is an easy-to-read tensor library (<a href="https://www.pepy.tech/projects/froog">16k pip installs!</a>) meant for those looking to get into machine learning and who want to understand how the underlying machine learning framework's code works before they are ultra-optimized (which all modern ml libraries are).
+ ```froog``` is an easy-to-read tensor library (<a href="https://www.pepy.tech/projects/froog">25k pip installs!</a>) meant for those looking to get into machine learning and who want to understand how the underlying machine learning framework's code works before they are ultra-optimized (which all modern ml libraries are).

  ```froog``` encapsulates everything from <a href="https://github.com/kevbuh/froog/blob/main/models/linear_regression.py">linear regression</a> to <a href="https://github.com/kevbuh/froog/blob/main/models/efficientnet.py">convolutional neural networks </a> in under 1000 lines.

@@ -69,7 +69,7 @@ from froog.tensor import Tensor
  my_tensor = Tensor([1,2,3])
  ```

- Notice how we had to import numpy. If you want to create a Tensor manually, make sure that it is a Numpy array!
+ Notice how we had to import NumPy. If you want to create a Tensor manually, make sure that it is a NumPy array!

  <!-- Learn more about ```froog``` Tensors <a href="https://github.com/kevbuh/froog/blob/main/docs/tensors.md">here</a>. -->

@@ -79,13 +79,10 @@ Tensors are the fundamental datatype in froog, and one of the two main classes.

  - ```def __init__(self, data)```:

- - Tensor takes in one param, which is the data. Since froog has a numpy backend, the input data into tensors has to be a numpy array.
-
+ - Tensor takes in one param, which is the data. Since ```froog``` has a NumPy backend, the input data into tensors has to be a NumPy array.
  - Tensor has a ```self.data``` state that it holds. this contains the data inside of the tensor.
-
  - In addition, it has ```self.grad```. this is to hold what the gradients of the tensor is.
-
- - Lastly, it has ```self._ctx```. theser are the internal vairables used for autograd graph construction. put more simply, this is where the backward gradient computations are saved.
+ - Lastly, it has ```self._ctx```. These are the internal variables used for autograd graph construction. This is where the backward gradient computations are saved.

  *Properties*

@@ -93,38 +90,34 @@ Tensors are the fundamental datatype in froog, and one of the two main classes.

  *Methods*
  - ```def zeros(*shape)```: this returns a tensor full of zeros with any shape that you pass in. Defaults to np.float32
-
  - ```def ones(*shape)```: this returns a tensor full of ones with any shape that you pass in. Defaults to np.float32
-
  - ```def randn(*shape):```: this returns a randomly initialized Tensor of *shape

  *Gradient calculations*

- - ```froog``` computes gradients automatically through a process called automatic differentiation. it has a variable ```_ctx```, which stores the chain of operations. it will take the current operation, lets say a dot product, and go to the dot product definition in ```froog/ops.py```, which contains a backward pass specfically for dot products. all methods, from add to 2x2 maxpools, have this backward pass implemented.
+ - ```froog``` computes gradients automatically through a process called automatic differentiation. it has a variable ```_ctx```, which stores the chain of operations. It will take the current operation, let's say a dot product, and go to the dot product definition in ```froog/ops.py```, which contains a backward pass specifically for dot products. all methods, from add to 2x2 maxpools, have this backward pass implemented.

  *Functions*

  The other base class in froog is the class ```Function```. It keeps track of input tensors and tensors that need to be saved for backward passes

  - ```def __init__(self, *tensors)```: takes in an argument of tensors, which are then saved.
-
  - ```def save_for_backward(self, *x)```: saves Tensors that are necessary to compute for the computation of gradients in the backward pass.
-
- - ```def apply(self, arg, *x)```: This is what makes everything work. The apply() method takes care of the forward pass, applying the operation to the inputs.
+ - ```def apply(self, arg, *x)```: takes care of the forward pass, applying the operation to the inputs.

  *Register*

- ```def register(name, fxn)```: this function allows you to add a method to a Tensor. This allows you to chain any operations, e.g. x.dot(w).relu(), where w is a tensor
+ - ```def register(name, fxn)```: allows you to add a method to a Tensor. This allows you to chain any operations, e.g. x.dot(w).relu(), where w is a tensor

  # Creating a model

  Okay cool, so now you know that ```froog```'s main datatype is a Tensor and uses NumPy in the background. How do I actually build a model?

- Here's an example of how to create an MNIST multi-layer perceptron (MLP). We wanted to make it as simple as possible for you to do so so it resembles very basic python concepts like classes. There's really only two methods you need to define:
+ Here's an example of how to create an MNIST multi-layer perceptron (MLP). We wanted to make it as simple as possible for you to do so it resembles very basic Python concepts like classes. There are really only two methods you need to define:
  1. ```__init__``` that defines layers of the model (here we use ```Linear```)
  2. ```forward``` which defines how the input should flow through your model. We use a simple dot product with a ```Linear``` layer with a <a href="https://en.wikipedia.org/wiki/Rectifier_(neural_networks)">```ReLU```</a> activation.

- In order to create an instance of the ```mnistMLP``` model, do the same as you would in python: ```model = mnistMLP()``` .
+ To create an instance of the ```mnistMLP``` model, do the same as you would in Python: ```model = mnistMLP()```.

  We support a few different optimizers, <a href="https://github.com/kevbuh/froog/blob/main/froog/optim.py">here</a> which include:
  - <a href="https://en.wikipedia.org/wiki/Stochastic_gradient_descent">Stochastic Gradient Descent (SGD)</a>
@@ -185,7 +178,7 @@ So there are two quick examples to get you up and running. You might have notice

  ## GPU Support

- Have a GPU and need a speedup? You're in good luck because we have GPU support from for our operations defined in <a href="https://github.com/kevbuh/froog/blob/main/froog/ops_gpu.py">```ops_gpu.py```</a>. In order to do this we have a backend built on <a href="https://en.wikipedia.org/wiki/OpenGL">OpenGL</a> that invokes kernel functions that work on the GPU.
+ Have a GPU and need a speedup? You're in good luck because we have GPU support from for our operations defined in <a href="https://github.com/kevbuh/froog/blob/main/froog/ops_gpu.py">```ops_gpu.py```</a>. In order to do this we have a backend built on <a href="https://en.wikipedia.org/wiki/OpenCL">OpenCL</a> that invokes kernel functions that work on the GPU.

  Here's how you can send data to the GPU during a forward pass and bring it back to the CPU.

@@ -10,8 +10,8 @@ from froog.tensor import Tensor
  import numpy as np

  def Linear(*x):
- # TODO: why dividing by sqrt?
- ret = np.random.uniform(-1., 1., size=x)/np.sqrt(np.prod(x)) # random init weights
+ # random Glorot initialization
+ ret = np.random.uniform(-1., 1., size=x)/np.sqrt(np.prod(x))
  return ret.astype(np.float32)

  def swish(x):
@@ -55,6 +55,6 @@ class BatchNorm2D:
  def __call__(self, x):
  x = x.sub(self.running_mean.reshape(shape=[1, -1, 1, 1]))
  x = x.mul(self.weight.reshape(shape=[1, -1, 1, 1]))
- x = x.div(self.running_var.add(Tensor([self.eps], gpu=x.gpu)).reshape(shape=[1, -1, 1, 1]).sqrt()) # TODO: shouldn't div go first?
+ x = x.div(self.running_var.add(Tensor([self.eps], gpu=x.gpu)).reshape(shape=[1, -1, 1, 1]).sqrt())
  x = x.add(self.bias.reshape(shape=[1, -1, 1, 1]))
  return x
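
The TODO removed from ```BatchNorm2D.__call__``` asked whether the divide should come before the multiply; since both are elementwise, the order does not change the result. For reference, the inference-time computation above is equivalent to this NumPy sketch (reference only, not froog code):

```python
import numpy as np

def batchnorm2d_infer(x, weight, bias, running_mean, running_var, eps):
  # y = weight * (x - running_mean) / sqrt(running_var + eps) + bias,
  # broadcast per channel over an (N, C, H, W) input
  rm = running_mean.reshape(1, -1, 1, 1)
  rv = running_var.reshape(1, -1, 1, 1)
  w = weight.reshape(1, -1, 1, 1)
  b = bias.reshape(1, -1, 1, 1)
  return (x - rm) * w / np.sqrt(rv + eps) + b
```
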
@@ -340,7 +340,6 @@ class MaxPool2D(Function):
  *ctx.kernel_size)
  register('max_pool2d', MaxPool2D)

-
  class AvgPool2D(Function):
  @staticmethod
  def forward(ctx, x, kernel_size=(2,2)):
@@ -351,7 +350,7 @@ class AvgPool2D(Function):
  @staticmethod
  def backward(ctx, grad_output):
  s, = ctx.saved_tensors
- py, px = ctx.kernel_size # TODO: where does kernel_size come from?
+ py, px = ctx.kernel_size # kernel_size passed from forward context
  my, mx = (s[2]//py)*py, (s[3]//px)*px
  ret = np.zeros(s, dtype=grad_output.dtype)
  for Y in range(py):
@@ -5,6 +5,8 @@
  # | ___|| __ || |_| || |_| || || |
  # | | | | | || || || |_| |
  # |___| |___| |_||_______||_______||_______|
+ #
+ # OpenCL kernels

  import numpy as np
  from .tensor import Function, register
@@ -71,7 +73,6 @@ def unary_op(ctx, code, x):
  prg.unop(ctx.cl_queue, [np.prod(ret.shape)], None, x, ret)
  return ret

- # ???
  @functools.lru_cache
  def cl_pooling_krnl_build(cl_ctx, iter_op, result_op, init_val=0):
  prg = """
@@ -302,23 +303,42 @@ register('relu', ReLU, gpu=True)
  class LogSoftmax(Function):
  @staticmethod
  def forward(ctx, input):
+ # first find max values for numerical stability
+ max_vals = buffer_new(ctx, (input.shape[0],))
+ prg = clbuild(ctx.cl_ctx, """
+ __kernel void max_vals(
+ __global const float *a_g, int sz, __global float *res_g)
+ {
+ int gid = get_global_id(0);
+ int gidsz = gid*sz;
+ float max_val = -INFINITY;
+ for (int x = 0; x < sz; x++) {
+ max_val = max(max_val, a_g[gidsz+x]);
+ }
+ res_g[gid] = max_val;
+ }
+ """)
+ prg.max_vals(ctx.cl_queue, [input.shape[0]], None, input, np.int32(input.shape[1]), max_vals)
+
+ # compute exp(x - max) and sum
  lsum = buffer_new(ctx, (input.shape[0],))
  prg = clbuild(ctx.cl_ctx, """
  __kernel void logsoftmax(
- __global const float *a_g, int sz, __global float *res_g)
+ __global const float *a_g, __global const float *max_vals, int sz, __global float *res_g)
  {
  int gid = get_global_id(0);
  int gidsz = gid*sz;
- // TODO: stability with max
+ float max_val = max_vals[gid];
  float out = 0.0;
  for (int x = 0; x < sz; x++) {
- out += exp(a_g[gidsz+x]);
+ out += exp(a_g[gidsz+x] - max_val);
  }
- res_g[gid] = log(out);
+ res_g[gid] = log(out) + max_val;
  }
  """)
- prg.logsoftmax(ctx.cl_queue, [input.shape[0]], None, input, np.int32(input.shape[1]), lsum)
+ prg.logsoftmax(ctx.cl_queue, [input.shape[0]], None, input, max_vals, np.int32(input.shape[1]), lsum)

+ # compute final output
  output = buffer_like(ctx, input)
  prg = clbuild(ctx.cl_ctx, """
  __kernel void lsmsub(
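
The added ```max_vals``` kernel above is the standard log-sum-exp stabilization: subtract each row's maximum before exponentiating so ```exp``` cannot overflow, then add it back after the log. A small NumPy sketch of the same computation (reference only, not froog code):

```python
import numpy as np

def log_softmax(x):
  # x: (batch, classes). Subtracting the row max leaves the result unchanged
  # mathematically but keeps exp() from overflowing on large logits.
  max_vals = x.max(axis=1, keepdims=True)
  lse = np.log(np.exp(x - max_vals).sum(axis=1, keepdims=True)) + max_vals
  return x - lse
```
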
@@ -474,8 +494,38 @@ class AvgPool2D(Function):

  @staticmethod
  def backward(ctx, grad_output):
- # TODO Finish this
- pass
+ # for average pooling, we need to distribute the gradient evenly across all elements in the pooling window
+ input_shape = ctx.data.shape
+ N, C, Y, X = input_shape
+ py, px = ctx.kernel_size
+ ret = buffer_zeros(ctx, input_shape)
+
+ prg = clbuild(ctx.cl_ctx, """
+ __kernel void avgpool_backward(
+ __global float *grad_input, __global const float *grad_output,
+ uint2 osize, uint2 isize, uint2 kernel_size, int nelem
+ ) {
+ int3 gid = (int3)(get_global_id(2), get_global_id(1), get_global_id(0));
+ int oid = gid.x + osize.x*(gid.y + osize.y*gid.z);
+ float grad = grad_output[oid] / (kernel_size.x * kernel_size.y);
+
+ for (uint j=0; j<kernel_size.y; ++j) {
+ for (uint i=0; i<kernel_size.x; ++i) {
+ int iid = (gid.x*kernel_size.x+i) + isize.x*((gid.y*kernel_size.y+j) + isize.y*gid.z);
+ if (iid < nelem)
+ grad_input[iid] += grad;
+ }
+ }
+ }
+ """)
+
+ osize = np.array((X//px, Y//py), dtype=cl.cltypes.uint2)
+ isize = np.array((X, Y), dtype=cl.cltypes.uint2)
+ ksize = np.array((px,py), dtype=cl.cltypes.uint2)
+
+ prg.avgpool_backward(ctx.cl_queue, (N*C, Y//py, X//px), None, ret, grad_output, osize, isize, ksize, np.int32(input_shape.size))
+
+ return ret
  register('avg_pool2d', AvgPool2D, gpu=True)

  class MaxPool2D(Function):
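
The new ```avgpool_backward``` kernel spreads each output gradient evenly over its pooling window, as the added comment says. The same rule in NumPy (reference only, not froog code):

```python
import numpy as np

def avgpool2d_backward(grad_output, input_shape, kernel_size=(2, 2)):
  # every input element in a py*px window receives an equal 1/(py*px) share
  # of that window's output gradient
  N, C, Y, X = input_shape
  py, px = kernel_size
  grad_input = np.zeros(input_shape, dtype=grad_output.dtype)
  for j in range(py):
    for i in range(px):
      grad_input[:, :, j:(Y//py)*py:py, i:(X//px)*px:px] += grad_output / (py * px)
  return grad_input
```
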
@@ -484,10 +534,65 @@ class MaxPool2D(Function):
  init_val = "FLT_MIN"
  iter_op = "group_res = max(group_res, input[iid])"
  result_op = "group_res"
- return pooling_op(ctx, input, kernel_size, iter_op, result_op, init_val=init_val)
+ ret = pooling_op(ctx, input, kernel_size, iter_op, result_op, init_val=init_val)
+
+ # save indices of max elements for backward pass
+ indices = buffer_new(ctx, ret.shape)
+ prg = clbuild(ctx.cl_ctx, """
+ __kernel void maxpool_indices(
+ __global const float *input, __global float *output, __global int *indices,
+ uint2 osize, uint2 isize, uint2 kernel_size, int nelem
+ ) {
+ int3 gid = (int3)(get_global_id(2), get_global_id(1), get_global_id(0));
+ int oid = gid.x + osize.x*(gid.y + osize.y*gid.z);
+ float max_val = -INFINITY;
+ int max_idx = 0;
+
+ for (uint j=0; j<kernel_size.y; ++j) {
+ for (uint i=0; i<kernel_size.x; ++i) {
+ int iid = (gid.x*kernel_size.x+i) + isize.x*((gid.y*kernel_size.y+j) + isize.y*gid.z);
+ if (iid < nelem) {
+ float val = input[iid];
+ if (val > max_val) {
+ max_val = val;
+ max_idx = iid;
+ }
+ }
+ }
+ }
+ indices[oid] = max_idx;
+ }
+ """)
+
+ N, C, Y, X = input.shape
+ py, px = kernel_size
+ osize = np.array((X//px, Y//py), dtype=cl.cltypes.uint2)
+ isize = np.array((X, Y), dtype=cl.cltypes.uint2)
+ ksize = np.array((px,py), dtype=cl.cltypes.uint2)
+
+ prg.maxpool_indices(ctx.cl_queue, (N*C, Y//py, X//px), None, input, ret, indices, osize, isize, ksize, np.int32(input.size))
+
+ ctx.save_for_backward(indices)
+ return ret

  @staticmethod
  def backward(ctx, grad_output):
- # TODO Finish this
- pass
+ indices, = ctx.saved_tensors
+ input_shape = ctx.data.shape
+ ret = buffer_zeros(ctx, input_shape)
+ prg = clbuild(ctx.cl_ctx, """
+ __kernel void maxpool_backward(
+ __global float *grad_input, __global const float *grad_output,
+ __global const int *indices, int nelem
+ ) {
+ int gid = get_global_id(0);
+ if (gid < nelem) {
+ int idx = indices[gid];
+ grad_input[idx] += grad_output[gid];
+ }
+ }
+ """)
+
+ prg.maxpool_backward(ctx.cl_queue, [np.prod(grad_output.shape)], None, ret, grad_output, indices, np.int32(grad_output.size))
+ return ret
  register('max_pool2d', MaxPool2D, gpu=True)
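
The MaxPool2D change records, for every output element, the flat index of the window maximum during the forward pass, and the backward kernel routes the gradient only to that element. An equivalent NumPy sketch (reference only, not froog code):

```python
import numpy as np

def maxpool2d_forward(x, kernel_size=(2, 2)):
  # pooled output plus, per window, the position of the max encoded as
  # j*px + i -- exactly what the backward pass needs
  N, C, Y, X = x.shape
  py, px = kernel_size
  oy, ox = Y // py, X // px
  win = x[:, :, :oy*py, :ox*px].reshape(N, C, oy, py, ox, px)
  win = win.transpose(0, 1, 2, 4, 3, 5).reshape(N, C, oy, ox, py*px)
  return win.max(axis=-1), win.argmax(axis=-1)

def maxpool2d_backward(grad_output, argmax, input_shape, kernel_size=(2, 2)):
  # scatter each output gradient back only to the element that was the max
  N, C, Y, X = input_shape
  py, px = kernel_size
  oy, ox = Y // py, X // px
  grad_input = np.zeros(input_shape, dtype=grad_output.dtype)
  for j in range(py):
    for i in range(px):
      mask = (argmax == j * px + i)
      grad_input[:, :, j:oy*py:py, i:ox*px:px] += grad_output * mask
  return grad_input
```
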
@@ -62,7 +62,7 @@ class Tensor:
  self.gpu = False

  self.data = data
- self.grad = None # TODO: why self.grad.data instead of self.grad?
+ self.grad = None

  if gpu:
  self.gpu_()
@@ -67,7 +67,8 @@ def im2col(x, H, W):
  tx = x.reshape(bs, -1)[:, idx]

  # all the time is spent here
- tx = tx.ravel() # TODO: whats the purpose of ravel ???
+ # np.ravel() flattens the array into a 1-dimensional shape
+ tx = tx.ravel()
  return tx.reshape(-1, cin*W*H)

  def col2im(tx, H, W, OY, OX):
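
A one-line illustration of the comment added above about ```np.ravel()```:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
print(a.ravel())   # [0 1 2 3 4 5] -- the same data flattened to one dimension
```
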
@@ -1,4 +1,3 @@
  numpy
  requests
  matplotlib
- urllib
@@ -1,4 +1,3 @@
-
  #!/usr/bin/env python3
  # this file specifies how the froog package is installed, including any necessary dependencies required to run

@@ -10,8 +9,8 @@ with open(os.path.join(directory, 'README.md'), encoding='utf-8') as f:
  long_description = f.read()

  setup(name='froog',
- version='0.3.1',
- description='a beautifully simplistic tensor library',
+ version='0.4.0',
+ description='a toy tensor library with opencl support',
  author='Kevin Buhler',
  license='MIT',
  long_description=long_description,
@@ -21,6 +20,6 @@ setup(name='froog',
  "Programming Language :: Python :: 3",
  "License :: OSI Approved :: MIT License"
  ],
- install_requires=['numpy', 'requests', 'matplotlib', 'urllib'],
+ install_requires=['numpy', 'requests', 'matplotlib'],
  python_requires='>=3.8',
  include_package_data=True)
@@ -16,7 +16,6 @@ X_train, Y_train, X_test, Y_test = fetch_mnist()
  class SimpleMLP:
  def __init__(self):
  # 784 pixel inputs -> 128 -> 10 output
- # TODO: why down to 128?
  self.l1 = Tensor(Linear(784, 128))
  self.l2 = Tensor(Linear(128, 10))

@@ -73,7 +72,7 @@ def train(model, optimizer, steps, BS=128, gpu=False):
  model_outputs = model.forward(x)

  # ********* backward pass *********
- loss = model_outputs.mul(y).mean() # TODO: what exactly is NLL loss function?
+ loss = model_outputs.mul(y).mean()
  loss.backward()
  optimizer.step()

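
The removed TODO asked what loss ```model_outputs.mul(y).mean()``` corresponds to. If the model outputs log-probabilities (log-softmax) and ```y``` encodes the labels as a negatively signed one-hot matrix, which is an assumption about the surrounding training script that this diff does not show in full, then the product-then-mean is the negative log-likelihood of the correct classes up to a constant factor:

```python
import numpy as np

log_probs = np.log(np.array([[0.7, 0.2, 0.1],
                             [0.1, 0.8, 0.1]]))  # stand-in log-softmax outputs
labels = np.array([0, 1])
y = -np.eye(3)[labels]                           # assumed encoding: -1 at the true class
loss = (log_probs * y).mean()                    # mean over batch *and* classes
print(np.isclose(loss, -log_probs[np.arange(2), labels].mean() / 3))  # True
```
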

10 files without changes