polars-df 0.13.0-x64-mingw-ucrt

Files changed (80)
  1. checksums.yaml +7 -0
  2. data/.yardopts +3 -0
  3. data/CHANGELOG.md +208 -0
  4. data/Cargo.lock +2556 -0
  5. data/Cargo.toml +6 -0
  6. data/LICENSE-THIRD-PARTY.txt +39278 -0
  7. data/LICENSE.txt +20 -0
  8. data/README.md +437 -0
  9. data/lib/polars/3.1/polars.so +0 -0
  10. data/lib/polars/3.2/polars.so +0 -0
  11. data/lib/polars/3.3/polars.so +0 -0
  12. data/lib/polars/array_expr.rb +537 -0
  13. data/lib/polars/array_name_space.rb +423 -0
  14. data/lib/polars/batched_csv_reader.rb +104 -0
  15. data/lib/polars/binary_expr.rb +77 -0
  16. data/lib/polars/binary_name_space.rb +66 -0
  17. data/lib/polars/cat_expr.rb +36 -0
  18. data/lib/polars/cat_name_space.rb +88 -0
  19. data/lib/polars/config.rb +530 -0
  20. data/lib/polars/convert.rb +98 -0
  21. data/lib/polars/data_frame.rb +5191 -0
  22. data/lib/polars/data_types.rb +466 -0
  23. data/lib/polars/date_time_expr.rb +1397 -0
  24. data/lib/polars/date_time_name_space.rb +1287 -0
  25. data/lib/polars/dynamic_group_by.rb +52 -0
  26. data/lib/polars/exceptions.rb +38 -0
  27. data/lib/polars/expr.rb +7256 -0
  28. data/lib/polars/expr_dispatch.rb +22 -0
  29. data/lib/polars/functions/aggregation/horizontal.rb +246 -0
  30. data/lib/polars/functions/aggregation/vertical.rb +282 -0
  31. data/lib/polars/functions/as_datatype.rb +271 -0
  32. data/lib/polars/functions/col.rb +47 -0
  33. data/lib/polars/functions/eager.rb +182 -0
  34. data/lib/polars/functions/lazy.rb +1329 -0
  35. data/lib/polars/functions/len.rb +49 -0
  36. data/lib/polars/functions/lit.rb +35 -0
  37. data/lib/polars/functions/random.rb +16 -0
  38. data/lib/polars/functions/range/date_range.rb +136 -0
  39. data/lib/polars/functions/range/datetime_range.rb +149 -0
  40. data/lib/polars/functions/range/int_range.rb +51 -0
  41. data/lib/polars/functions/range/time_range.rb +141 -0
  42. data/lib/polars/functions/repeat.rb +144 -0
  43. data/lib/polars/functions/whenthen.rb +96 -0
  44. data/lib/polars/functions.rb +57 -0
  45. data/lib/polars/group_by.rb +613 -0
  46. data/lib/polars/io/avro.rb +24 -0
  47. data/lib/polars/io/csv.rb +696 -0
  48. data/lib/polars/io/database.rb +73 -0
  49. data/lib/polars/io/ipc.rb +275 -0
  50. data/lib/polars/io/json.rb +29 -0
  51. data/lib/polars/io/ndjson.rb +80 -0
  52. data/lib/polars/io/parquet.rb +233 -0
  53. data/lib/polars/lazy_frame.rb +2708 -0
  54. data/lib/polars/lazy_group_by.rb +181 -0
  55. data/lib/polars/list_expr.rb +791 -0
  56. data/lib/polars/list_name_space.rb +449 -0
  57. data/lib/polars/meta_expr.rb +222 -0
  58. data/lib/polars/name_expr.rb +198 -0
  59. data/lib/polars/plot.rb +109 -0
  60. data/lib/polars/rolling_group_by.rb +35 -0
  61. data/lib/polars/series.rb +4444 -0
  62. data/lib/polars/slice.rb +104 -0
  63. data/lib/polars/sql_context.rb +194 -0
  64. data/lib/polars/string_cache.rb +75 -0
  65. data/lib/polars/string_expr.rb +1495 -0
  66. data/lib/polars/string_name_space.rb +811 -0
  67. data/lib/polars/struct_expr.rb +98 -0
  68. data/lib/polars/struct_name_space.rb +96 -0
  69. data/lib/polars/testing.rb +507 -0
  70. data/lib/polars/utils/constants.rb +9 -0
  71. data/lib/polars/utils/convert.rb +97 -0
  72. data/lib/polars/utils/parse.rb +89 -0
  73. data/lib/polars/utils/various.rb +76 -0
  74. data/lib/polars/utils/wrap.rb +19 -0
  75. data/lib/polars/utils.rb +130 -0
  76. data/lib/polars/version.rb +4 -0
  77. data/lib/polars/whenthen.rb +83 -0
  78. data/lib/polars-df.rb +1 -0
  79. data/lib/polars.rb +91 -0
  80. metadata +138 -0
data/LICENSE.txt ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2020 Ritchie Vink
2
+ Copyright (c) 2022-2024 Andrew Kane
3
+
4
+ Permission is hereby granted, free of charge, to any person obtaining a copy
5
+ of this software and associated documentation files (the "Software"), to deal
6
+ in the Software without restriction, including without limitation the rights
7
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
8
+ copies of the Software, and to permit persons to whom the Software is
9
+ furnished to do so, subject to the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be included in all
12
+ copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
15
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
16
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
17
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
18
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
19
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
20
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,437 @@
1
+ # Ruby Polars
2
+
3
+ :fire: Blazingly fast DataFrames for Ruby, powered by [Polars](https://github.com/pola-rs/polars)
4
+
5
+ [![Build Status](https://github.com/ankane/ruby-polars/actions/workflows/build.yml/badge.svg)](https://github.com/ankane/ruby-polars/actions)
6
+
7
+ ## Installation
8
+
9
+ Add this line to your application’s Gemfile:
10
+
11
+ ```ruby
12
+ gem "polars-df"
13
+ ```
14
+
15
+ ## Getting Started
16
+
17
+ This library follows the [Polars Python API](https://pola-rs.github.io/polars/py-polars/html/reference/index.html).
18
+
19
+ ```ruby
20
+ Polars.read_csv("iris.csv")
21
+ .lazy
22
+ .filter(Polars.col("sepal_length") > 5)
23
+ .group_by("species")
24
+ .agg(Polars.all.sum)
25
+ .collect
26
+ ```
27
+
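To make the query above concrete, here is a minimal plain-Ruby sketch of what the filter, group, and sum steps compute. It does not use the gem. The sample rows are made up and stand in for `iris.csv`, and only the numeric columns are summed.

```ruby
# Hypothetical sample rows standing in for iris.csv (values are made up).
rows = [
  {species: "setosa",     sepal_length: 5.5, sepal_width: 3.5},
  {species: "setosa",     sepal_length: 4.5, sepal_width: 3.0},
  {species: "versicolor", sepal_length: 7.0, sepal_width: 3.0},
  {species: "versicolor", sepal_length: 6.5, sepal_width: 3.5},
  {species: "virginica",  sepal_length: 4.5, sepal_width: 2.5}
]

# Mirrors filter(Polars.col("sepal_length") > 5): keep rows above the threshold.
kept = rows.select { |row| row[:sepal_length] > 5 }

# Mirrors group_by("species").agg(...sum): one output row per species,
# restricted here to the two numeric columns.
summed = kept.group_by { |row| row[:species] }.map do |species, group|
  {
    species: species,
    sepal_length: group.sum { |row| row[:sepal_length] },
    sepal_width: group.sum { |row| row[:sepal_width] }
  }
end

summed.each { |row| puts row.inspect }
```

The lazy version in the README describes the same steps but defers execution until `collect` is called.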
28
+ You can follow [Polars tutorials](https://pola-rs.github.io/polars-book/user-guide/) and convert the code to Ruby in many cases. Feel free to open an issue if you run into problems.
29
+
30
+ ## Reference
31
+
32
+ - [Series](https://www.rubydoc.info/gems/polars-df/Polars/Series)
33
+ - [DataFrame](https://www.rubydoc.info/gems/polars-df/Polars/DataFrame)
34
+ - [LazyFrame](https://www.rubydoc.info/gems/polars-df/Polars/LazyFrame)
35
+
36
+ ## Examples
37
+
38
+ ### Creating DataFrames
39
+
40
+ From a CSV
41
+
42
+ ```ruby
43
+ Polars.read_csv("file.csv")
44
+
45
+ # or lazily with
46
+ Polars.scan_csv("file.csv")
47
+ ```
48
+
49
+ From Parquet
50
+
51
+ ```ruby
52
+ Polars.read_parquet("file.parquet")
53
+
54
+ # or lazily with
55
+ Polars.scan_parquet("file.parquet")
56
+ ```
57
+
58
+ From Active Record
59
+
60
+ ```ruby
61
+ Polars.read_database(User.all)
62
+ # or
63
+ Polars.read_database("SELECT * FROM users")
64
+ ```
65
+
66
+ From JSON
67
+
68
+ ```ruby
69
+ Polars.read_json("file.json")
70
+ # or
71
+ Polars.read_ndjson("file.ndjson")
72
+
73
+ # or lazily with
74
+ Polars.scan_ndjson("file.ndjson")
75
+ ```
76
+
77
+ From Feather / Arrow IPC
78
+
79
+ ```ruby
80
+ Polars.read_ipc("file.arrow")
81
+
82
+ # or lazily with
83
+ Polars.scan_ipc("file.arrow")
84
+ ```
85
+
86
+ From Avro
87
+
88
+ ```ruby
89
+ Polars.read_avro("file.avro")
90
+ ```
91
+
92
+ From a hash
93
+
94
+ ```ruby
95
+ Polars::DataFrame.new({
96
+ a: [1, 2, 3],
97
+ b: ["one", "two", "three"]
98
+ })
99
+ ```
100
+
101
+ From an array of hashes
102
+
103
+ ```ruby
104
+ Polars::DataFrame.new([
105
+ {a: 1, b: "one"},
106
+ {a: 2, b: "two"},
107
+ {a: 3, b: "three"}
108
+ ])
109
+ ```
110
+
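The hash form and the array-of-hashes form above describe the same table, one by column and one by row. As a rough illustration in plain Ruby (no gem calls, variable names are illustrative only), the two layouts convert into each other like this:

```ruby
# Column-oriented layout: one array per column.
columns = {
  a: [1, 2, 3],
  b: ["one", "two", "three"]
}

# Column-oriented -> row-oriented: transpose the value arrays, then
# pair each row's values with the column names.
rows = columns.values.transpose.map { |values| columns.keys.zip(values).to_h }

# Row-oriented -> column-oriented: collect each key's values across rows.
round_trip = rows.first.keys.to_h { |key| [key, rows.map { |row| row[key] }] }

puts rows.inspect
puts round_trip.inspect
```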
111
+ From an array of series
112
+
113
+ ```ruby
114
+ Polars::DataFrame.new([
115
+ Polars::Series.new("a", [1, 2, 3]),
116
+ Polars::Series.new("b", ["one", "two", "three"])
117
+ ])
118
+ ```
119
+
120
+ ## Attributes
121
+
122
+ Get number of rows
123
+
124
+ ```ruby
125
+ df.height
126
+ ```
127
+
128
+ Get column names
129
+
130
+ ```ruby
131
+ df.columns
132
+ ```
133
+
134
+ Check if a column exists
135
+
136
+ ```ruby
137
+ df.include?(name)
138
+ ```
139
+
140
+ ## Selecting Data
141
+
142
+ Select a column
143
+
144
+ ```ruby
145
+ df["a"]
146
+ ```
147
+
148
+ Select multiple columns
149
+
150
+ ```ruby
151
+ df[["a", "b"]]
152
+ ```
153
+
154
+ Select first rows
155
+
156
+ ```ruby
157
+ df.head
158
+ ```
159
+
160
+ Select last rows
161
+
162
+ ```ruby
163
+ df.tail
164
+ ```
165
+
166
+ ## Filtering
167
+
168
+ Filter on a condition
169
+
170
+ ```ruby
171
+ df[Polars.col("a") == 2]
172
+ df[Polars.col("a") != 2]
173
+ df[Polars.col("a") > 2]
174
+ df[Polars.col("a") >= 2]
175
+ df[Polars.col("a") < 2]
176
+ df[Polars.col("a") <= 2]
177
+ ```
178
+
179
+ And, or, and exclusive or
180
+
181
+ ```ruby
182
+ df[(Polars.col("a") > 1) & (Polars.col("b") == "two")] # and
183
+ df[(Polars.col("a") > 1) | (Polars.col("b") == "two")] # or
184
+ df[(Polars.col("a") > 1) ^ (Polars.col("b") == "two")] # xor
185
+ ```
186
+
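The parentheses in the combined filters matter: in Ruby, `&`, `|`, and `^` bind more tightly than `>` and `==`, so without them the expression would group differently. The sketch below uses plain Ruby booleans on made-up rows (it does not call the gem) to show which rows each combination selects.

```ruby
# Made-up rows matching the README's example columns.
rows = [
  {a: 1, b: "one"},
  {a: 2, b: "two"},
  {a: 3, b: "three"}
]

# One boolean per row for each condition.
gt_one = rows.map { |row| row[:a] > 1 }
is_two = rows.map { |row| row[:b] == "two" }

# Combine the masks element-wise with the same operators as above.
and_mask = gt_one.zip(is_two).map { |x, y| x & y }
or_mask  = gt_one.zip(is_two).map { |x, y| x | y }
xor_mask = gt_one.zip(is_two).map { |x, y| x ^ y }

# Keep the rows whose mask entry is true.
and_rows = rows.select.with_index { |_, i| and_mask[i] }
or_rows  = rows.select.with_index { |_, i| or_mask[i] }
xor_rows = rows.select.with_index { |_, i| xor_mask[i] }

puts and_rows.inspect
puts or_rows.inspect
puts xor_rows.inspect
```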
187
+ ## Operations
188
+
189
+ Basic operations
190
+
191
+ ```ruby
192
+ df["a"] + 5
193
+ df["a"] - 5
194
+ df["a"] * 5
195
+ df["a"] / 5
196
+ df["a"] % 5
197
+ df["a"] ** 2
198
+ df["a"].sqrt
199
+ df["a"].abs
200
+ ```
201
+
202
+ Rounding
203
+
204
+ ```ruby
205
+ df["a"].round(2)
206
+ df["a"].ceil
207
+ df["a"].floor
208
+ ```
209
+
210
+ Logarithm
211
+
212
+ ```ruby
213
+ df["a"].log # natural log
214
+ df["a"].log(10)
215
+ ```
216
+
217
+ Exponentiation
218
+
219
+ ```ruby
220
+ df["a"].exp
221
+ ```
222
+
223
+ Trigonometric functions
224
+
225
+ ```ruby
226
+ df["a"].sin
227
+ df["a"].cos
228
+ df["a"].tan
229
+ df["a"].asin
230
+ df["a"].acos
231
+ df["a"].atan
232
+ ```
233
+
234
+ Hyperbolic functions
235
+
236
+ ```ruby
237
+ df["a"].sinh
238
+ df["a"].cosh
239
+ df["a"].tanh
240
+ df["a"].asinh
241
+ df["a"].acosh
242
+ df["a"].atanh
243
+ ```
244
+
245
+ Summary statistics
246
+
247
+ ```ruby
248
+ df["a"].sum
249
+ df["a"].mean
250
+ df["a"].median
251
+ df["a"].quantile(0.90)
252
+ df["a"].min
253
+ df["a"].max
254
+ df["a"].std
255
+ df["a"].var
256
+ ```
257
+
258
+ ## Grouping
259
+
260
+ Group
261
+
262
+ ```ruby
263
+ df.group_by("a").count
264
+ ```
265
+
266
+ Works with all summary statistics
267
+
268
+ ```ruby
269
+ df.group_by("a").max
270
+ ```
271
+
272
+ Multiple groups
273
+
274
+ ```ruby
275
+ df.group_by(["a", "b"]).count
276
+ ```
277
+
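As a rough picture of what the grouped counts above return, this plain-Ruby sketch counts made-up rows by one key and by a pair of keys. It uses `Enumerable#tally` (Ruby 2.7+) rather than the gem, and returns a hash where the gem returns a data frame.

```ruby
# Made-up rows with two grouping columns.
rows = [
  {a: 1, b: "x"},
  {a: 1, b: "y"},
  {a: 2, b: "x"},
  {a: 1, b: "x"}
]

# Count rows per value of "a".
single = rows.map { |row| row[:a] }.tally

# Count rows per distinct ("a", "b") pair.
multi = rows.map { |row| [row[:a], row[:b]] }.tally

puts single.inspect
puts multi.inspect
```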
278
+ ## Combining Data Frames
279
+
280
+ Add rows
281
+
282
+ ```ruby
283
+ df.vstack(other_df)
284
+ ```
285
+
286
+ Add columns
287
+
288
+ ```ruby
289
+ df.hstack(other_df)
290
+ ```
291
+
292
+ Inner join
293
+
294
+ ```ruby
295
+ df.join(other_df, on: "a")
296
+ ```
297
+
298
+ Left join
299
+
300
+ ```ruby
301
+ df.join(other_df, on: "a", how: "left")
302
+ ```
303
+
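The difference between the two joins above is what happens to left rows without a match: an inner join drops them, a left join keeps them with missing values. A minimal plain-Ruby sketch on made-up rows, using `nil` for the missing value and not calling the gem:

```ruby
# Made-up tables sharing the key column "a".
left  = [{a: 1, x: "l1"}, {a: 2, x: "l2"}, {a: 3, x: "l3"}]
right = [{a: 1, y: "r1"}, {a: 3, y: "r3"}, {a: 4, y: "r4"}]

# Index the right table by key for lookup.
right_by_key = right.group_by { |row| row[:a] }

# Inner join: emit one merged row per match, nothing for unmatched keys.
inner = left.flat_map do |l|
  right_by_key.fetch(l[:a], []).map { |r| l.merge(r) }
end

# Left join: same, but unmatched left rows are kept with y set to nil.
left_join = left.flat_map do |l|
  matches = right_by_key.fetch(l[:a], [])
  matches.empty? ? [l.merge(y: nil)] : matches.map { |r| l.merge(r) }
end

puts inner.inspect
puts left_join.inspect
```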
304
+ ## Encoding
305
+
306
+ One-hot encoding
307
+
308
+ ```ruby
309
+ df.to_dummies
310
+ ```
311
+
312
+ ## Conversion
313
+
314
+ Array of hashes
315
+
316
+ ```ruby
317
+ df.rows(named: true)
318
+ ```
319
+
320
+ Hash of series
321
+
322
+ ```ruby
323
+ df.to_h
324
+ ```
325
+
326
+ CSV
327
+
328
+ ```ruby
329
+ df.to_csv
330
+ # or
331
+ df.write_csv("file.csv")
332
+ ```
333
+
334
+ Parquet
335
+
336
+ ```ruby
337
+ df.write_parquet("file.parquet")
338
+ ```
339
+
340
+ Numo array
341
+
342
+ ```ruby
343
+ df.to_numo
344
+ ```
345
+
346
+ ## Types
347
+
348
+ You can specify column types when creating a data frame
349
+
350
+ ```ruby
351
+ Polars::DataFrame.new(data, schema: {"a" => Polars::Int32, "b" => Polars::Float32})
352
+ ```
353
+
354
+ Supported types are:
355
+
356
+ - boolean - `Boolean`
357
+ - float - `Float64`, `Float32`
358
+ - integer - `Int64`, `Int32`, `Int16`, `Int8`
359
+ - unsigned integer - `UInt64`, `UInt32`, `UInt16`, `UInt8`
360
+ - string - `String`, `Binary`, `Categorical`
361
+ - temporal - `Date`, `Datetime`, `Time`, `Duration`
362
+ - nested - `List`, `Struct`, `Array`
363
+ - other - `Object`, `Null`
364
+
365
+ Get column types
366
+
367
+ ```ruby
368
+ df.schema
369
+ ```
370
+
371
+ For a specific column
372
+
373
+ ```ruby
374
+ df["a"].dtype
375
+ ```
376
+
377
+ Cast a column
378
+
379
+ ```ruby
380
+ df["a"].cast(Polars::Int32)
381
+ ```
382
+
383
+ ## Visualization
384
+
385
+ Add [Vega](https://github.com/ankane/vega-ruby) to your application’s Gemfile:
386
+
387
+ ```ruby
388
+ gem "vega"
389
+ ```
390
+
391
+ And use:
392
+
393
+ ```ruby
394
+ df.plot("a", "b")
395
+ ```
396
+
397
+ Specify the chart type (`line`, `pie`, `column`, `bar`, `area`, or `scatter`)
398
+
399
+ ```ruby
400
+ df.plot("a", "b", type: "pie")
401
+ ```
402
+
403
+ Group data
404
+
405
+ ```ruby
406
+ df.group_by("c").plot("a", "b")
407
+ ```
408
+
409
+ Stacked columns or bars
410
+
411
+ ```ruby
412
+ df.group_by("c").plot("a", "b", stacked: true)
413
+ ```
414
+
415
+ ## History
416
+
417
+ View the [changelog](CHANGELOG.md)
418
+
419
+ ## Contributing
420
+
421
+ Everyone is encouraged to help improve this project. Here are a few ways you can help:
422
+
423
+ - [Report bugs](https://github.com/ankane/ruby-polars/issues)
424
+ - Fix bugs and [submit pull requests](https://github.com/ankane/ruby-polars/pulls)
425
+ - Write, clarify, or fix documentation
426
+ - Suggest or add new features
427
+
428
+ To get started with development:
429
+
430
+ ```sh
431
+ git clone https://github.com/ankane/ruby-polars.git
432
+ cd ruby-polars
433
+ bundle install
434
+ bundle exec rake compile
435
+ bundle exec rake test
436
+ bundle exec rake test:docs
437
+ ```
Binary file
Binary file
Binary file