polars-df 0.13.0-aarch64-linux-musl

Files changed (80)
  1. checksums.yaml +7 -0
  2. data/.yardopts +3 -0
  3. data/CHANGELOG.md +208 -0
  4. data/Cargo.lock +2556 -0
  5. data/Cargo.toml +6 -0
  6. data/LICENSE-THIRD-PARTY.txt +39059 -0
  7. data/LICENSE.txt +20 -0
  8. data/README.md +437 -0
  9. data/lib/polars/3.1/polars.so +0 -0
  10. data/lib/polars/3.2/polars.so +0 -0
  11. data/lib/polars/3.3/polars.so +0 -0
  12. data/lib/polars/array_expr.rb +537 -0
  13. data/lib/polars/array_name_space.rb +423 -0
  14. data/lib/polars/batched_csv_reader.rb +104 -0
  15. data/lib/polars/binary_expr.rb +77 -0
  16. data/lib/polars/binary_name_space.rb +66 -0
  17. data/lib/polars/cat_expr.rb +36 -0
  18. data/lib/polars/cat_name_space.rb +88 -0
  19. data/lib/polars/config.rb +530 -0
  20. data/lib/polars/convert.rb +98 -0
  21. data/lib/polars/data_frame.rb +5191 -0
  22. data/lib/polars/data_types.rb +466 -0
  23. data/lib/polars/date_time_expr.rb +1397 -0
  24. data/lib/polars/date_time_name_space.rb +1287 -0
  25. data/lib/polars/dynamic_group_by.rb +52 -0
  26. data/lib/polars/exceptions.rb +38 -0
  27. data/lib/polars/expr.rb +7256 -0
  28. data/lib/polars/expr_dispatch.rb +22 -0
  29. data/lib/polars/functions/aggregation/horizontal.rb +246 -0
  30. data/lib/polars/functions/aggregation/vertical.rb +282 -0
  31. data/lib/polars/functions/as_datatype.rb +271 -0
  32. data/lib/polars/functions/col.rb +47 -0
  33. data/lib/polars/functions/eager.rb +182 -0
  34. data/lib/polars/functions/lazy.rb +1329 -0
  35. data/lib/polars/functions/len.rb +49 -0
  36. data/lib/polars/functions/lit.rb +35 -0
  37. data/lib/polars/functions/random.rb +16 -0
  38. data/lib/polars/functions/range/date_range.rb +136 -0
  39. data/lib/polars/functions/range/datetime_range.rb +149 -0
  40. data/lib/polars/functions/range/int_range.rb +51 -0
  41. data/lib/polars/functions/range/time_range.rb +141 -0
  42. data/lib/polars/functions/repeat.rb +144 -0
  43. data/lib/polars/functions/whenthen.rb +96 -0
  44. data/lib/polars/functions.rb +57 -0
  45. data/lib/polars/group_by.rb +613 -0
  46. data/lib/polars/io/avro.rb +24 -0
  47. data/lib/polars/io/csv.rb +696 -0
  48. data/lib/polars/io/database.rb +73 -0
  49. data/lib/polars/io/ipc.rb +275 -0
  50. data/lib/polars/io/json.rb +29 -0
  51. data/lib/polars/io/ndjson.rb +80 -0
  52. data/lib/polars/io/parquet.rb +233 -0
  53. data/lib/polars/lazy_frame.rb +2708 -0
  54. data/lib/polars/lazy_group_by.rb +181 -0
  55. data/lib/polars/list_expr.rb +791 -0
  56. data/lib/polars/list_name_space.rb +449 -0
  57. data/lib/polars/meta_expr.rb +222 -0
  58. data/lib/polars/name_expr.rb +198 -0
  59. data/lib/polars/plot.rb +109 -0
  60. data/lib/polars/rolling_group_by.rb +35 -0
  61. data/lib/polars/series.rb +4444 -0
  62. data/lib/polars/slice.rb +104 -0
  63. data/lib/polars/sql_context.rb +194 -0
  64. data/lib/polars/string_cache.rb +75 -0
  65. data/lib/polars/string_expr.rb +1495 -0
  66. data/lib/polars/string_name_space.rb +811 -0
  67. data/lib/polars/struct_expr.rb +98 -0
  68. data/lib/polars/struct_name_space.rb +96 -0
  69. data/lib/polars/testing.rb +507 -0
  70. data/lib/polars/utils/constants.rb +9 -0
  71. data/lib/polars/utils/convert.rb +97 -0
  72. data/lib/polars/utils/parse.rb +89 -0
  73. data/lib/polars/utils/various.rb +76 -0
  74. data/lib/polars/utils/wrap.rb +19 -0
  75. data/lib/polars/utils.rb +130 -0
  76. data/lib/polars/version.rb +4 -0
  77. data/lib/polars/whenthen.rb +83 -0
  78. data/lib/polars-df.rb +1 -0
  79. data/lib/polars.rb +91 -0
  80. metadata +138 -0
data/LICENSE.txt ADDED
@@ -0,0 +1,20 @@
+ Copyright (c) 2020 Ritchie Vink
+ Copyright (c) 2022-2024 Andrew Kane
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,437 @@
+ # Ruby Polars
+
+ :fire: Blazingly fast DataFrames for Ruby, powered by [Polars](https://github.com/pola-rs/polars)
+
+ [![Build Status](https://github.com/ankane/ruby-polars/actions/workflows/build.yml/badge.svg)](https://github.com/ankane/ruby-polars/actions)
+
+ ## Installation
+
+ Add this line to your application’s Gemfile:
+
+ ```ruby
+ gem "polars-df"
+ ```
+
+ ## Getting Started
+
+ This library follows the [Polars Python API](https://pola-rs.github.io/polars/py-polars/html/reference/index.html).
+
+ ```ruby
+ Polars.read_csv("iris.csv")
+   .lazy
+   .filter(Polars.col("sepal_length") > 5)
+   .group_by("species")
+   .agg(Polars.all.sum)
+   .collect
+ ```
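
To make the Getting Started pipeline concrete, here is a plain-Ruby sketch of what the filter, group, and sum steps compute. It does not use the gem; the `rows` data is hypothetical and stands in for `iris.csv`, and the code is an illustration of the semantics, not the library's implementation.

```ruby
# Hypothetical in-memory rows standing in for iris.csv (values are made up).
rows = [
  {species: "setosa", sepal_length: 5.1, sepal_width: 3.5},
  {species: "setosa", sepal_length: 4.9, sepal_width: 3.0},
  {species: "versicolor", sepal_length: 7.0, sepal_width: 3.2},
  {species: "versicolor", sepal_length: 6.4, sepal_width: 3.2}
]

# filter: keep only rows where sepal_length > 5
kept = rows.select { |row| row[:sepal_length] > 5 }

# group_by + agg(sum): sum every non-key column within each species
summed = kept.group_by { |row| row[:species] }.map do |species, group|
  totals = {species: species}
  (group.first.keys - [:species]).each do |key|
    totals[key] = group.sum { |row| row[key] }
  end
  totals
end

summed.each { |row| puts row.inspect }
```

In the README's version the same steps are declared on a lazy frame, so no work happens until `collect` is called.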
+
+ You can follow [Polars tutorials](https://pola-rs.github.io/polars-book/user-guide/) and convert the code to Ruby in many cases. Feel free to open an issue if you run into problems.
+
+ ## Reference
+
+ - [Series](https://www.rubydoc.info/gems/polars-df/Polars/Series)
+ - [DataFrame](https://www.rubydoc.info/gems/polars-df/Polars/DataFrame)
+ - [LazyFrame](https://www.rubydoc.info/gems/polars-df/Polars/LazyFrame)
+
+ ## Examples
+
+ ### Creating DataFrames
+
+ From a CSV
+
+ ```ruby
+ Polars.read_csv("file.csv")
+
+ # or lazily with
+ Polars.scan_csv("file.csv")
+ ```
+
+ From Parquet
+
+ ```ruby
+ Polars.read_parquet("file.parquet")
+
+ # or lazily with
+ Polars.scan_parquet("file.parquet")
+ ```
+
+ From Active Record
+
+ ```ruby
+ Polars.read_database(User.all)
+ # or
+ Polars.read_database("SELECT * FROM users")
+ ```
+
+ From JSON
+
+ ```ruby
+ Polars.read_json("file.json")
+ # or
+ Polars.read_ndjson("file.ndjson")
+
+ # or lazily with
+ Polars.scan_ndjson("file.ndjson")
+ ```
+
+ From Feather / Arrow IPC
+
+ ```ruby
+ Polars.read_ipc("file.arrow")
+
+ # or lazily with
+ Polars.scan_ipc("file.arrow")
+ ```
+
+ From Avro
+
+ ```ruby
+ Polars.read_avro("file.avro")
+ ```
+
+ From a hash
+
+ ```ruby
+ Polars::DataFrame.new({
+   a: [1, 2, 3],
+   b: ["one", "two", "three"]
+ })
+ ```
+
+ From an array of hashes
+
+ ```ruby
+ Polars::DataFrame.new([
+   {a: 1, b: "one"},
+   {a: 2, b: "two"},
+   {a: 3, b: "three"}
+ ])
+ ```
+
+ From an array of series
+
+ ```ruby
+ Polars::DataFrame.new([
+   Polars::Series.new("a", [1, 2, 3]),
+   Polars::Series.new("b", ["one", "two", "three"])
+ ])
+ ```
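
The hash and array-of-hashes constructors above describe the same table in column-oriented and row-oriented form. As a rough plain-Ruby illustration (no gem required, and not how the library converts data internally), the two shapes can be converted into each other like this:

```ruby
# Row-oriented input: one hash per row.
rows = [
  {a: 1, b: "one"},
  {a: 2, b: "two"},
  {a: 3, b: "three"}
]

# Pivot to column-oriented form: one array per column name.
columns = rows.first.keys.to_h { |key| [key, rows.map { |row| row[key] }] }

# And back again: zip the column arrays into per-row hashes.
round_trip = columns.values.transpose.map { |values| columns.keys.zip(values).to_h }

puts columns.inspect
puts round_trip.inspect
```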
+
+ ## Attributes
+
+ Get number of rows
+
+ ```ruby
+ df.height
+ ```
+
+ Get column names
+
+ ```ruby
+ df.columns
+ ```
+
+ Check if a column exists
+
+ ```ruby
+ df.include?(name)
+ ```
+
+ ## Selecting Data
+
+ Select a column
+
+ ```ruby
+ df["a"]
+ ```
+
+ Select multiple columns
+
+ ```ruby
+ df[["a", "b"]]
+ ```
+
+ Select first rows
+
+ ```ruby
+ df.head
+ ```
+
+ Select last rows
+
+ ```ruby
+ df.tail
+ ```
+
+ ## Filtering
+
+ Filter on a condition
+
+ ```ruby
+ df[Polars.col("a") == 2]
+ df[Polars.col("a") != 2]
+ df[Polars.col("a") > 2]
+ df[Polars.col("a") >= 2]
+ df[Polars.col("a") < 2]
+ df[Polars.col("a") <= 2]
+ ```
+
+ And, or, and exclusive or
+
+ ```ruby
+ df[(Polars.col("a") > 1) & (Polars.col("b") == "two")] # and
+ df[(Polars.col("a") > 1) | (Polars.col("b") == "two")] # or
+ df[(Polars.col("a") > 1) ^ (Polars.col("b") == "two")] # xor
+ ```
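
The combined filters above can be read as element-wise boolean masks. The following plain-Ruby sketch (hypothetical arrays `a` and `b`, no gem required) shows what each operator selects; it illustrates the semantics only and is not the library's implementation.

```ruby
# Two hypothetical columns of equal length.
a = [1, 2, 3]
b = ["one", "two", "three"]

# One boolean per row for each condition.
left = a.map { |v| v > 1 }
right = b.map { |v| v == "two" }

# Combine the masks element-wise with Ruby's boolean operators.
and_mask = left.zip(right).map { |l, r| l & r }
or_mask = left.zip(right).map { |l, r| l | r }
xor_mask = left.zip(right).map { |l, r| l ^ r }

# Apply a mask: keep the rows whose mask entry is true.
rows = a.zip(b)
xor_rows = rows.zip(xor_mask).select { |_, keep| keep }.map(&:first)

puts and_mask.inspect
puts or_mask.inspect
puts xor_mask.inspect
puts xor_rows.inspect
```

The parentheses in the README's expressions matter: in Ruby, `&`, `|`, and `^` bind more tightly than the comparison operators, so `Polars.col("a") > 1 & Polars.col("b") == "two"` would not group the way it reads.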
+
+ ## Operations
+
+ Basic operations
+
+ ```ruby
+ df["a"] + 5
+ df["a"] - 5
+ df["a"] * 5
+ df["a"] / 5
+ df["a"] % 5
+ df["a"] ** 2
+ df["a"].sqrt
+ df["a"].abs
+ ```
+
+ Rounding
+
+ ```ruby
+ df["a"].round(2)
+ df["a"].ceil
+ df["a"].floor
+ ```
+
+ Logarithm
+
+ ```ruby
+ df["a"].log # natural log
+ df["a"].log(10)
+ ```
+
+ Exponentiation
+
+ ```ruby
+ df["a"].exp
+ ```
+
+ Trigonometric functions
+
+ ```ruby
+ df["a"].sin
+ df["a"].cos
+ df["a"].tan
+ df["a"].asin
+ df["a"].acos
+ df["a"].atan
+ ```
+
+ Hyperbolic functions
+
+ ```ruby
+ df["a"].sinh
+ df["a"].cosh
+ df["a"].tanh
+ df["a"].asinh
+ df["a"].acosh
+ df["a"].atanh
+ ```
+
+ Summary statistics
+
+ ```ruby
+ df["a"].sum
+ df["a"].mean
+ df["a"].median
+ df["a"].quantile(0.90)
+ df["a"].min
+ df["a"].max
+ df["a"].std
+ df["a"].var
+ ```
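
For reference, several of the summary statistics above can be written out in plain Ruby. This is a sketch on a hypothetical array, assuming the sample (n - 1) convention for variance and standard deviation; check the Series reference for the convention the library actually uses.

```ruby
# Hypothetical column values.
values = [1.0, 2.0, 3.0, 4.0]

sum = values.sum
mean = sum / values.length

# Median: middle value, or the average of the two middle values.
sorted = values.sort
mid = sorted.length / 2
median = sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0

# Sample variance and standard deviation (divides by n - 1; an assumption).
var = values.sum { |v| (v - mean)**2 } / (values.length - 1)
std = Math.sqrt(var)

puts [sum, mean, median, var, std].inspect
```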
+
+ ## Grouping
+
+ Group
+
+ ```ruby
+ df.group_by("a").count
+ ```
+
+ Works with all summary statistics
+
+ ```ruby
+ df.group_by("a").max
+ ```
+
+ Multiple groups
+
+ ```ruby
+ df.group_by(["a", "b"]).count
+ ```
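
As a plain-Ruby illustration of what the grouping calls above compute (hypothetical rows, no gem required, not the library's implementation), counting per group and taking a per-group maximum look like this:

```ruby
# Hypothetical rows with two grouping candidates, a and b.
rows = [
  {a: 1, b: "x"},
  {a: 1, b: "y"},
  {a: 2, b: "x"},
  {a: 1, b: "x"}
]

# Count rows per value of a.
count_by_a = rows.map { |row| row[:a] }.tally

# Count rows per (a, b) pair.
count_by_a_b = rows.map { |row| [row[:a], row[:b]] }.tally

# Per-group maximum of b, grouped by a.
max_b_by_a = rows.group_by { |row| row[:a] }
                 .transform_values { |group| group.map { |row| row[:b] }.max }

puts count_by_a.inspect
puts count_by_a_b.inspect
puts max_b_by_a.inspect
```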
+
+ ## Combining Data Frames
+
+ Add rows
+
+ ```ruby
+ df.vstack(other_df)
+ ```
+
+ Add columns
+
+ ```ruby
+ df.hstack(other_df)
+ ```
+
+ Inner join
+
+ ```ruby
+ df.join(other_df, on: "a")
+ ```
+
+ Left join
+
+ ```ruby
+ df.join(other_df, on: "a", how: "left")
+ ```
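
The difference between the inner and left joins above can be sketched in plain Ruby on hypothetical row hashes (no gem required). This shows the general join semantics, not the library's implementation or its exact output format.

```ruby
# Hypothetical left and right tables sharing the key column a.
left = [{a: 1, x: "l1"}, {a: 2, x: "l2"}, {a: 3, x: "l3"}]
right = [{a: 1, y: "r1"}, {a: 3, y: "r3"}, {a: 4, y: "r4"}]

# Index the right table by key for lookup.
right_by_key = right.group_by { |row| row[:a] }

# Inner join: keep only left rows that have at least one match.
inner = left.flat_map do |row|
  (right_by_key[row[:a]] || []).map { |match| row.merge(match) }
end

# Left join: keep every left row; unmatched rows get nil for right columns.
left_join = left.flat_map do |row|
  matches = right_by_key[row[:a]]
  matches ? matches.map { |match| row.merge(match) } : [row.merge(y: nil)]
end

puts inner.inspect
puts left_join.inspect
```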
+
+ ## Encoding
+
+ One-hot encoding
+
+ ```ruby
+ df.to_dummies
+ ```
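
One-hot encoding replaces a column with one 0/1 indicator column per distinct value. A plain-Ruby sketch on a hypothetical `species` column follows; the `species_<value>` column naming and the sorted column order are assumptions made for illustration and may differ from what `to_dummies` produces.

```ruby
# Hypothetical categorical column.
species = ["cat", "dog", "cat", "bird"]

# One indicator column per distinct value (sorted here for a stable order).
categories = species.uniq.sort

dummies = categories.to_h do |category|
  ["species_#{category}", species.map { |value| value == category ? 1 : 0 }]
end

puts dummies.inspect
```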
+
+ ## Conversion
+
+ Array of hashes
+
+ ```ruby
+ df.rows(named: true)
+ ```
+
+ Hash of series
+
+ ```ruby
+ df.to_h
+ ```
+
+ CSV
+
+ ```ruby
+ df.to_csv
+ # or
+ df.write_csv("file.csv")
+ ```
+
+ Parquet
+
+ ```ruby
+ df.write_parquet("file.parquet")
+ ```
+
+ Numo array
+
+ ```ruby
+ df.to_numo
+ ```
+
+ ## Types
+
+ You can specify column types when creating a data frame
+
+ ```ruby
+ Polars::DataFrame.new(data, schema: {"a" => Polars::Int32, "b" => Polars::Float32})
+ ```
+
+ Supported types are:
+
+ - boolean - `Boolean`
+ - float - `Float64`, `Float32`
+ - integer - `Int64`, `Int32`, `Int16`, `Int8`
+ - unsigned integer - `UInt64`, `UInt32`, `UInt16`, `UInt8`
+ - string - `String`, `Binary`, `Categorical`
+ - temporal - `Date`, `Datetime`, `Time`, `Duration`
+ - nested - `List`, `Struct`, `Array`
+ - other - `Object`, `Null`
+
+ Get column types
+
+ ```ruby
+ df.schema
+ ```
+
+ For a specific column
+
+ ```ruby
+ df["a"].dtype
+ ```
+
+ Cast a column
+
+ ```ruby
+ df["a"].cast(Polars::Int32)
+ ```
+
+ ## Visualization
+
+ Add [Vega](https://github.com/ankane/vega-ruby) to your application’s Gemfile:
+
+ ```ruby
+ gem "vega"
+ ```
+
+ And use:
+
+ ```ruby
+ df.plot("a", "b")
+ ```
+
+ Specify the chart type (`line`, `pie`, `column`, `bar`, `area`, or `scatter`)
+
+ ```ruby
+ df.plot("a", "b", type: "pie")
+ ```
+
+ Group data
+
+ ```ruby
+ df.group_by("c").plot("a", "b")
+ ```
+
+ Stacked columns or bars
+
+ ```ruby
+ df.group_by("c").plot("a", "b", stacked: true)
+ ```
+
+ ## History
+
+ View the [changelog](CHANGELOG.md)
+
+ ## Contributing
+
+ Everyone is encouraged to help improve this project. Here are a few ways you can help:
+
+ - [Report bugs](https://github.com/ankane/ruby-polars/issues)
+ - Fix bugs and [submit pull requests](https://github.com/ankane/ruby-polars/pulls)
+ - Write, clarify, or fix documentation
+ - Suggest or add new features
+
+ To get started with development:
+
+ ```sh
+ git clone https://github.com/ankane/ruby-polars.git
+ cd ruby-polars
+ bundle install
+ bundle exec rake compile
+ bundle exec rake test
+ bundle exec rake test:docs
+ ```
Binary files (not shown): data/lib/polars/3.1/polars.so, data/lib/polars/3.2/polars.so, data/lib/polars/3.3/polars.so