polars-df 0.13.0-aarch64-linux-musl
- checksums.yaml +7 -0
- data/.yardopts +3 -0
- data/CHANGELOG.md +208 -0
- data/Cargo.lock +2556 -0
- data/Cargo.toml +6 -0
- data/LICENSE-THIRD-PARTY.txt +39059 -0
- data/LICENSE.txt +20 -0
- data/README.md +437 -0
- data/lib/polars/3.1/polars.so +0 -0
- data/lib/polars/3.2/polars.so +0 -0
- data/lib/polars/3.3/polars.so +0 -0
- data/lib/polars/array_expr.rb +537 -0
- data/lib/polars/array_name_space.rb +423 -0
- data/lib/polars/batched_csv_reader.rb +104 -0
- data/lib/polars/binary_expr.rb +77 -0
- data/lib/polars/binary_name_space.rb +66 -0
- data/lib/polars/cat_expr.rb +36 -0
- data/lib/polars/cat_name_space.rb +88 -0
- data/lib/polars/config.rb +530 -0
- data/lib/polars/convert.rb +98 -0
- data/lib/polars/data_frame.rb +5191 -0
- data/lib/polars/data_types.rb +466 -0
- data/lib/polars/date_time_expr.rb +1397 -0
- data/lib/polars/date_time_name_space.rb +1287 -0
- data/lib/polars/dynamic_group_by.rb +52 -0
- data/lib/polars/exceptions.rb +38 -0
- data/lib/polars/expr.rb +7256 -0
- data/lib/polars/expr_dispatch.rb +22 -0
- data/lib/polars/functions/aggregation/horizontal.rb +246 -0
- data/lib/polars/functions/aggregation/vertical.rb +282 -0
- data/lib/polars/functions/as_datatype.rb +271 -0
- data/lib/polars/functions/col.rb +47 -0
- data/lib/polars/functions/eager.rb +182 -0
- data/lib/polars/functions/lazy.rb +1329 -0
- data/lib/polars/functions/len.rb +49 -0
- data/lib/polars/functions/lit.rb +35 -0
- data/lib/polars/functions/random.rb +16 -0
- data/lib/polars/functions/range/date_range.rb +136 -0
- data/lib/polars/functions/range/datetime_range.rb +149 -0
- data/lib/polars/functions/range/int_range.rb +51 -0
- data/lib/polars/functions/range/time_range.rb +141 -0
- data/lib/polars/functions/repeat.rb +144 -0
- data/lib/polars/functions/whenthen.rb +96 -0
- data/lib/polars/functions.rb +57 -0
- data/lib/polars/group_by.rb +613 -0
- data/lib/polars/io/avro.rb +24 -0
- data/lib/polars/io/csv.rb +696 -0
- data/lib/polars/io/database.rb +73 -0
- data/lib/polars/io/ipc.rb +275 -0
- data/lib/polars/io/json.rb +29 -0
- data/lib/polars/io/ndjson.rb +80 -0
- data/lib/polars/io/parquet.rb +233 -0
- data/lib/polars/lazy_frame.rb +2708 -0
- data/lib/polars/lazy_group_by.rb +181 -0
- data/lib/polars/list_expr.rb +791 -0
- data/lib/polars/list_name_space.rb +449 -0
- data/lib/polars/meta_expr.rb +222 -0
- data/lib/polars/name_expr.rb +198 -0
- data/lib/polars/plot.rb +109 -0
- data/lib/polars/rolling_group_by.rb +35 -0
- data/lib/polars/series.rb +4444 -0
- data/lib/polars/slice.rb +104 -0
- data/lib/polars/sql_context.rb +194 -0
- data/lib/polars/string_cache.rb +75 -0
- data/lib/polars/string_expr.rb +1495 -0
- data/lib/polars/string_name_space.rb +811 -0
- data/lib/polars/struct_expr.rb +98 -0
- data/lib/polars/struct_name_space.rb +96 -0
- data/lib/polars/testing.rb +507 -0
- data/lib/polars/utils/constants.rb +9 -0
- data/lib/polars/utils/convert.rb +97 -0
- data/lib/polars/utils/parse.rb +89 -0
- data/lib/polars/utils/various.rb +76 -0
- data/lib/polars/utils/wrap.rb +19 -0
- data/lib/polars/utils.rb +130 -0
- data/lib/polars/version.rb +4 -0
- data/lib/polars/whenthen.rb +83 -0
- data/lib/polars-df.rb +1 -0
- data/lib/polars.rb +91 -0
- metadata +138 -0
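The three `polars.so` entries in the list above suggest one precompiled extension per Ruby minor version (3.1, 3.2, 3.3). How `lib/polars.rb` actually picks one is not visible in this diff; the following is only a sketch of a common convention for precompiled gems, and both helper names are hypothetical.

```ruby
# Ruby minor versions that have a polars.so in the file list above.
BUNDLED_RUBY_VERSIONS = %w[3.1 3.2 3.3].freeze

# Hypothetical helper: maps a Ruby version string to the require path of
# the matching precompiled extension, e.g. "3.2.4" -> "polars/3.2/polars".
def extension_require_path(ruby_version = RUBY_VERSION)
  minor = ruby_version[/\A\d+\.\d+/] # keep only "major.minor"
  "polars/#{minor}/polars"
end

# Hypothetical helper: true when this package ships a binary for the version.
def bundled?(ruby_version = RUBY_VERSION)
  BUNDLED_RUBY_VERSIONS.include?(ruby_version[/\A\d+\.\d+/])
end
```

Under this assumption, a Ruby outside the 3.1 to 3.3 range would find no matching binary in this platform gem.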
data/LICENSE.txt
ADDED
@@ -0,0 +1,20 @@
|
|
1
|
+
Copyright (c) 2020 Ritchie Vink
|
2
|
+
Copyright (c) 2022-2024 Andrew Kane
|
3
|
+
|
4
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
5
|
+
of this software and associated documentation files (the "Software"), to deal
|
6
|
+
in the Software without restriction, including without limitation the rights
|
7
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
8
|
+
copies of the Software, and to permit persons to whom the Software is
|
9
|
+
furnished to do so, subject to the following conditions:
|
10
|
+
|
11
|
+
The above copyright notice and this permission notice shall be included in all
|
12
|
+
copies or substantial portions of the Software.
|
13
|
+
|
14
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
15
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
16
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
17
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
18
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
19
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
20
|
+
SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,437 @@
|
|
1
|
+
# Ruby Polars
|
2
|
+
|
3
|
+
:fire: Blazingly fast DataFrames for Ruby, powered by [Polars](https://github.com/pola-rs/polars)
|
4
|
+
|
5
|
+
[Build Status](https://github.com/ankane/ruby-polars/actions)
|
6
|
+
|
7
|
+
## Installation
|
8
|
+
|
9
|
+
Add this line to your application’s Gemfile:
|
10
|
+
|
11
|
+
```ruby
|
12
|
+
gem "polars-df"
|
13
|
+
```
|
14
|
+
|
15
|
+
## Getting Started
|
16
|
+
|
17
|
+
This library follows the [Polars Python API](https://pola-rs.github.io/polars/py-polars/html/reference/index.html).
|
18
|
+
|
19
|
+
```ruby
|
20
|
+
Polars.read_csv("iris.csv")
|
21
|
+
.lazy
|
22
|
+
.filter(Polars.col("sepal_length") > 5)
|
23
|
+
.group_by("species")
|
24
|
+
.agg(Polars.all.sum)
|
25
|
+
.collect
|
26
|
+
```
|
27
|
+
|
28
|
+
You can follow [Polars tutorials](https://pola-rs.github.io/polars-book/user-guide/) and convert the code to Ruby in many cases. Feel free to open an issue if you run into problems.
|
29
|
+
|
30
|
+
## Reference
|
31
|
+
|
32
|
+
- [Series](https://www.rubydoc.info/gems/polars-df/Polars/Series)
|
33
|
+
- [DataFrame](https://www.rubydoc.info/gems/polars-df/Polars/DataFrame)
|
34
|
+
- [LazyFrame](https://www.rubydoc.info/gems/polars-df/Polars/LazyFrame)
|
35
|
+
|
36
|
+
## Examples
|
37
|
+
|
38
|
+
### Creating DataFrames
|
39
|
+
|
40
|
+
From a CSV
|
41
|
+
|
42
|
+
```ruby
|
43
|
+
Polars.read_csv("file.csv")
|
44
|
+
|
45
|
+
# or lazily with
|
46
|
+
Polars.scan_csv("file.csv")
|
47
|
+
```
|
48
|
+
|
49
|
+
From Parquet
|
50
|
+
|
51
|
+
```ruby
|
52
|
+
Polars.read_parquet("file.parquet")
|
53
|
+
|
54
|
+
# or lazily with
|
55
|
+
Polars.scan_parquet("file.parquet")
|
56
|
+
```
|
57
|
+
|
58
|
+
From Active Record
|
59
|
+
|
60
|
+
```ruby
|
61
|
+
Polars.read_database(User.all)
|
62
|
+
# or
|
63
|
+
Polars.read_database("SELECT * FROM users")
|
64
|
+
```
|
65
|
+
|
66
|
+
From JSON
|
67
|
+
|
68
|
+
```ruby
|
69
|
+
Polars.read_json("file.json")
|
70
|
+
# or
|
71
|
+
Polars.read_ndjson("file.ndjson")
|
72
|
+
|
73
|
+
# or lazily with
|
74
|
+
Polars.scan_ndjson("file.ndjson")
|
75
|
+
```
|
76
|
+
|
77
|
+
From Feather / Arrow IPC
|
78
|
+
|
79
|
+
```ruby
|
80
|
+
Polars.read_ipc("file.arrow")
|
81
|
+
|
82
|
+
# or lazily with
|
83
|
+
Polars.scan_ipc("file.arrow")
|
84
|
+
```
|
85
|
+
|
86
|
+
From Avro
|
87
|
+
|
88
|
+
```ruby
|
89
|
+
Polars.read_avro("file.avro")
|
90
|
+
```
|
91
|
+
|
92
|
+
From a hash
|
93
|
+
|
94
|
+
```ruby
|
95
|
+
Polars::DataFrame.new({
|
96
|
+
a: [1, 2, 3],
|
97
|
+
b: ["one", "two", "three"]
|
98
|
+
})
|
99
|
+
```
|
100
|
+
|
101
|
+
From an array of hashes
|
102
|
+
|
103
|
+
```ruby
|
104
|
+
Polars::DataFrame.new([
|
105
|
+
{a: 1, b: "one"},
|
106
|
+
{a: 2, b: "two"},
|
107
|
+
{a: 3, b: "three"}
|
108
|
+
])
|
109
|
+
```
|
110
|
+
|
111
|
+
From an array of series
|
112
|
+
|
113
|
+
```ruby
|
114
|
+
Polars::DataFrame.new([
|
115
|
+
Polars::Series.new("a", [1, 2, 3]),
|
116
|
+
Polars::Series.new("b", ["one", "two", "three"])
|
117
|
+
])
|
118
|
+
```
|
119
|
+
|
120
|
+
## Attributes
|
121
|
+
|
122
|
+
Get number of rows
|
123
|
+
|
124
|
+
```ruby
|
125
|
+
df.height
|
126
|
+
```
|
127
|
+
|
128
|
+
Get column names
|
129
|
+
|
130
|
+
```ruby
|
131
|
+
df.columns
|
132
|
+
```
|
133
|
+
|
134
|
+
Check if a column exists
|
135
|
+
|
136
|
+
```ruby
|
137
|
+
df.include?(name)
|
138
|
+
```
|
139
|
+
|
140
|
+
## Selecting Data
|
141
|
+
|
142
|
+
Select a column
|
143
|
+
|
144
|
+
```ruby
|
145
|
+
df["a"]
|
146
|
+
```
|
147
|
+
|
148
|
+
Select multiple columns
|
149
|
+
|
150
|
+
```ruby
|
151
|
+
df[["a", "b"]]
|
152
|
+
```
|
153
|
+
|
154
|
+
Select first rows
|
155
|
+
|
156
|
+
```ruby
|
157
|
+
df.head
|
158
|
+
```
|
159
|
+
|
160
|
+
Select last rows
|
161
|
+
|
162
|
+
```ruby
|
163
|
+
df.tail
|
164
|
+
```
|
165
|
+
|
166
|
+
## Filtering
|
167
|
+
|
168
|
+
Filter on a condition
|
169
|
+
|
170
|
+
```ruby
|
171
|
+
df[Polars.col("a") == 2]
|
172
|
+
df[Polars.col("a") != 2]
|
173
|
+
df[Polars.col("a") > 2]
|
174
|
+
df[Polars.col("a") >= 2]
|
175
|
+
df[Polars.col("a") < 2]
|
176
|
+
df[Polars.col("a") <= 2]
|
177
|
+
```
|
178
|
+
|
179
|
+
And, or, and exclusive or
|
180
|
+
|
181
|
+
```ruby
|
182
|
+
df[(Polars.col("a") > 1) & (Polars.col("b") == "two")] # and
|
183
|
+
df[(Polars.col("a") > 1) | (Polars.col("b") == "two")] # or
|
184
|
+
df[(Polars.col("a") > 1) ^ (Polars.col("b") == "two")] # xor
|
185
|
+
```
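
The row-wise meaning of the three combinators can be checked without the library. This is a plain-Ruby sketch over the same sample data used in the hash example earlier in this README; `SAMPLE_ROWS` and `select_rows` are hypothetical names, not part of polars-df.

```ruby
SAMPLE_ROWS = [
  {a: 1, b: "one"},
  {a: 2, b: "two"},
  {a: 3, b: "three"}
].freeze

# Hypothetical helper: keeps rows where (a > 1) OP (b == "two").
def select_rows(rows, op)
  rows.select do |row|
    left = row[:a] > 1
    right = row[:b] == "two"
    case op
    when :and then left & right # both conditions hold
    when :or  then left | right # at least one holds
    when :xor then left ^ right # exactly one holds
    end
  end
end

select_rows(SAMPLE_ROWS, :and).map { |r| r[:a] } # => [2]
select_rows(SAMPLE_ROWS, :or).map { |r| r[:a] }  # => [2, 3]
select_rows(SAMPLE_ROWS, :xor).map { |r| r[:a] } # => [3]
```

The parentheses in the Polars expressions are needed because Ruby's `&`, `|`, and `^` bind more tightly than `>` and `==`.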
|
186
|
+
|
187
|
+
## Operations
|
188
|
+
|
189
|
+
Basic operations
|
190
|
+
|
191
|
+
```ruby
|
192
|
+
df["a"] + 5
|
193
|
+
df["a"] - 5
|
194
|
+
df["a"] * 5
|
195
|
+
df["a"] / 5
|
196
|
+
df["a"] % 5
|
197
|
+
df["a"] ** 2
|
198
|
+
df["a"].sqrt
|
199
|
+
df["a"].abs
|
200
|
+
```
|
201
|
+
|
202
|
+
Rounding
|
203
|
+
|
204
|
+
```ruby
|
205
|
+
df["a"].round(2)
|
206
|
+
df["a"].ceil
|
207
|
+
df["a"].floor
|
208
|
+
```
|
209
|
+
|
210
|
+
Logarithm
|
211
|
+
|
212
|
+
```ruby
|
213
|
+
df["a"].log # natural log
|
214
|
+
df["a"].log(10)
|
215
|
+
```
|
216
|
+
|
217
|
+
Exponentiation
|
218
|
+
|
219
|
+
```ruby
|
220
|
+
df["a"].exp
|
221
|
+
```
|
222
|
+
|
223
|
+
Trigonometric functions
|
224
|
+
|
225
|
+
```ruby
|
226
|
+
df["a"].sin
|
227
|
+
df["a"].cos
|
228
|
+
df["a"].tan
|
229
|
+
df["a"].asin
|
230
|
+
df["a"].acos
|
231
|
+
df["a"].atan
|
232
|
+
```
|
233
|
+
|
234
|
+
Hyperbolic functions
|
235
|
+
|
236
|
+
```ruby
|
237
|
+
df["a"].sinh
|
238
|
+
df["a"].cosh
|
239
|
+
df["a"].tanh
|
240
|
+
df["a"].asinh
|
241
|
+
df["a"].acosh
|
242
|
+
df["a"].atanh
|
243
|
+
```
|
244
|
+
|
245
|
+
Summary statistics
|
246
|
+
|
247
|
+
```ruby
|
248
|
+
df["a"].sum
|
249
|
+
df["a"].mean
|
250
|
+
df["a"].median
|
251
|
+
df["a"].quantile(0.90)
|
252
|
+
df["a"].min
|
253
|
+
df["a"].max
|
254
|
+
df["a"].std
|
255
|
+
df["a"].var
|
256
|
+
```
|
257
|
+
|
258
|
+
## Grouping
|
259
|
+
|
260
|
+
Group
|
261
|
+
|
262
|
+
```ruby
|
263
|
+
df.group_by("a").count
|
264
|
+
```
|
265
|
+
|
266
|
+
Works with all summary statistics
|
267
|
+
|
268
|
+
```ruby
|
269
|
+
df.group_by("a").max
|
270
|
+
```
|
271
|
+
|
272
|
+
Multiple groups
|
273
|
+
|
274
|
+
```ruby
|
275
|
+
df.group_by(["a", "b"]).count
|
276
|
+
```
|
277
|
+
|
278
|
+
## Combining Data Frames
|
279
|
+
|
280
|
+
Add rows
|
281
|
+
|
282
|
+
```ruby
|
283
|
+
df.vstack(other_df)
|
284
|
+
```
|
285
|
+
|
286
|
+
Add columns
|
287
|
+
|
288
|
+
```ruby
|
289
|
+
df.hstack(other_df)
|
290
|
+
```
|
291
|
+
|
292
|
+
Inner join
|
293
|
+
|
294
|
+
```ruby
|
295
|
+
df.join(other_df, on: "a")
|
296
|
+
```
|
297
|
+
|
298
|
+
Left join
|
299
|
+
|
300
|
+
```ruby
|
301
|
+
df.join(other_df, on: "a", how: "left")
|
302
|
+
```
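
To make the difference between the two join types concrete, here is a plain-Ruby sketch over arrays of hashes. `join_rows` is a hypothetical helper, not part of polars-df, and it ignores details such as how the library renames clashing column names.

```ruby
LEFT_ROWS = [{a: 1, b: "one"}, {a: 2, b: "two"}, {a: 3, b: "three"}].freeze
RIGHT_ROWS = [{a: 2, c: "x"}, {a: 3, c: "y"}, {a: 4, c: "z"}].freeze

# Hypothetical helper: joins two arrays of hashes on `key`.
def join_rows(left, right, key, how: "inner")
  right_by_key = right.group_by { |row| row[key] }
  right_columns = right.flat_map(&:keys).uniq - [key]
  left.flat_map do |lrow|
    matches = right_by_key[lrow[key]]
    if matches
      matches.map { |rrow| lrow.merge(rrow) }
    elsif how == "left"
      # No match: keep the left row and fill the right-hand columns with nil.
      [lrow.merge(right_columns.to_h { |col| [col, nil] })]
    else
      [] # inner join drops unmatched left rows
    end
  end
end

join_rows(LEFT_ROWS, RIGHT_ROWS, :a).map { |r| r[:a] }              # => [2, 3]
join_rows(LEFT_ROWS, RIGHT_ROWS, :a, how: "left").map { |r| r[:a] } # => [1, 2, 3]
```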
|
303
|
+
|
304
|
+
## Encoding
|
305
|
+
|
306
|
+
One-hot encoding
|
307
|
+
|
308
|
+
```ruby
|
309
|
+
df.to_dummies
|
310
|
+
```
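
As a rough illustration of what one-hot encoding produces, the sketch below encodes a single column in plain Ruby. `one_hot` is a hypothetical helper, and the `<column>_<value>` naming is an assumption modelled on the Polars default; the library's actual column order and dtypes may differ.

```ruby
# Hypothetical helper: one-hot encodes one column given as an array,
# returning a hash of indicator columns named "<column>_<value>".
def one_hot(name, values)
  values.uniq.to_h do |value|
    ["#{name}_#{value}", values.map { |v| v == value ? 1 : 0 }]
  end
end

one_hot("b", ["one", "two", "one"])
# => {"b_one" => [1, 0, 1], "b_two" => [0, 1, 0]}
```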
|
311
|
+
|
312
|
+
## Conversion
|
313
|
+
|
314
|
+
Array of hashes
|
315
|
+
|
316
|
+
```ruby
|
317
|
+
df.rows(named: true)
|
318
|
+
```
|
319
|
+
|
320
|
+
Hash of series
|
321
|
+
|
322
|
+
```ruby
|
323
|
+
df.to_h
|
324
|
+
```
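
The two shapes above, an array of row hashes and a hash of columns, carry the same data. A plain-Ruby sketch of moving between them follows; both helper names are hypothetical, plain arrays stand in for series, and the key types the library returns may differ.

```ruby
# Hypothetical helper: column-oriented hash -> array of row hashes.
def columns_to_rows(columns)
  names = columns.keys
  columns.values.transpose.map { |values| names.zip(values).to_h }
end

# Hypothetical helper: array of row hashes -> column-oriented hash.
# Assumes a non-empty array whose hashes all share the same keys.
def rows_to_columns(rows)
  rows.first.keys.to_h { |name| [name, rows.map { |row| row[name] }] }
end

columns = {a: [1, 2, 3], b: ["one", "two", "three"]}
rows = columns_to_rows(columns)
# rows.first => {a: 1, b: "one"}
rows_to_columns(rows) == columns # => true
```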
|
325
|
+
|
326
|
+
CSV
|
327
|
+
|
328
|
+
```ruby
|
329
|
+
df.to_csv
|
330
|
+
# or
|
331
|
+
df.write_csv("file.csv")
|
332
|
+
```
|
333
|
+
|
334
|
+
Parquet
|
335
|
+
|
336
|
+
```ruby
|
337
|
+
df.write_parquet("file.parquet")
|
338
|
+
```
|
339
|
+
|
340
|
+
Numo array
|
341
|
+
|
342
|
+
```ruby
|
343
|
+
df.to_numo
|
344
|
+
```
|
345
|
+
|
346
|
+
## Types
|
347
|
+
|
348
|
+
You can specify column types when creating a data frame
|
349
|
+
|
350
|
+
```ruby
|
351
|
+
Polars::DataFrame.new(data, schema: {"a" => Polars::Int32, "b" => Polars::Float32})
|
352
|
+
```
|
353
|
+
|
354
|
+
Supported types are:
|
355
|
+
|
356
|
+
- boolean - `Boolean`
|
357
|
+
- float - `Float64`, `Float32`
|
358
|
+
- integer - `Int64`, `Int32`, `Int16`, `Int8`
|
359
|
+
- unsigned integer - `UInt64`, `UInt32`, `UInt16`, `UInt8`
|
360
|
+
- string - `String`, `Binary`, `Categorical`
|
361
|
+
- temporal - `Date`, `Datetime`, `Time`, `Duration`
|
362
|
+
- nested - `List`, `Struct`, `Array`
|
363
|
+
- other - `Object`, `Null`
|
364
|
+
|
365
|
+
Get column types
|
366
|
+
|
367
|
+
```ruby
|
368
|
+
df.schema
|
369
|
+
```
|
370
|
+
|
371
|
+
For a specific column
|
372
|
+
|
373
|
+
```ruby
|
374
|
+
df["a"].dtype
|
375
|
+
```
|
376
|
+
|
377
|
+
Cast a column
|
378
|
+
|
379
|
+
```ruby
|
380
|
+
df["a"].cast(Polars::Int32)
|
381
|
+
```
|
382
|
+
|
383
|
+
## Visualization
|
384
|
+
|
385
|
+
Add [Vega](https://github.com/ankane/vega-ruby) to your application’s Gemfile:
|
386
|
+
|
387
|
+
```ruby
|
388
|
+
gem "vega"
|
389
|
+
```
|
390
|
+
|
391
|
+
And use:
|
392
|
+
|
393
|
+
```ruby
|
394
|
+
df.plot("a", "b")
|
395
|
+
```
|
396
|
+
|
397
|
+
Specify the chart type (`line`, `pie`, `column`, `bar`, `area`, or `scatter`)
|
398
|
+
|
399
|
+
```ruby
|
400
|
+
df.plot("a", "b", type: "pie")
|
401
|
+
```
|
402
|
+
|
403
|
+
Group data
|
404
|
+
|
405
|
+
```ruby
|
406
|
+
df.group_by("c").plot("a", "b")
|
407
|
+
```
|
408
|
+
|
409
|
+
Stacked columns or bars
|
410
|
+
|
411
|
+
```ruby
|
412
|
+
df.group_by("c").plot("a", "b", stacked: true)
|
413
|
+
```
|
414
|
+
|
415
|
+
## History
|
416
|
+
|
417
|
+
View the [changelog](CHANGELOG.md)
|
418
|
+
|
419
|
+
## Contributing
|
420
|
+
|
421
|
+
Everyone is encouraged to help improve this project. Here are a few ways you can help:
|
422
|
+
|
423
|
+
- [Report bugs](https://github.com/ankane/ruby-polars/issues)
|
424
|
+
- Fix bugs and [submit pull requests](https://github.com/ankane/ruby-polars/pulls)
|
425
|
+
- Write, clarify, or fix documentation
|
426
|
+
- Suggest or add new features
|
427
|
+
|
428
|
+
To get started with development:
|
429
|
+
|
430
|
+
```sh
|
431
|
+
git clone https://github.com/ankane/ruby-polars.git
|
432
|
+
cd ruby-polars
|
433
|
+
bundle install
|
434
|
+
bundle exec rake compile
|
435
|
+
bundle exec rake test
|
436
|
+
bundle exec rake test:docs
|
437
|
+
```
|
Binary file
|
Binary file
|
Binary file
|