rgwfuncs 0.0.23__tar.gz → 0.0.25__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {rgwfuncs-0.0.23/src/rgwfuncs.egg-info → rgwfuncs-0.0.25}/PKG-INFO +178 -76
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/README.md +177 -75
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/pyproject.toml +1 -1
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/setup.cfg +1 -1
- rgwfuncs-0.0.25/src/rgwfuncs/__init__.py +7 -0
- rgwfuncs-0.0.25/src/rgwfuncs/algebra_lib.py +186 -0
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/src/rgwfuncs/df_lib.py +0 -37
- rgwfuncs-0.0.25/src/rgwfuncs/docs_lib.py +49 -0
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/src/rgwfuncs/str_lib.py +2 -38
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25/src/rgwfuncs.egg-info}/PKG-INFO +178 -76
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/src/rgwfuncs.egg-info/SOURCES.txt +4 -1
- rgwfuncs-0.0.25/tests/test_algebra_lib.py +59 -0
- rgwfuncs-0.0.23/src/rgwfuncs/__init__.py +0 -5
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/LICENSE +0 -0
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/src/rgwfuncs.egg-info/dependency_links.txt +0 -0
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/src/rgwfuncs.egg-info/entry_points.txt +0 -0
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/src/rgwfuncs.egg-info/requires.txt +0 -0
- {rgwfuncs-0.0.23 → rgwfuncs-0.0.25}/src/rgwfuncs.egg-info/top_level.txt +0 -0
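The `+N -M` figures in the file list above are per-file counts of added and removed lines. As a rough illustration, similar counts can be derived with Python's standard `difflib`. The helper name `count_changes` is hypothetical, and this is only a sketch under that assumption, not necessarily how the registry produced its numbers:

```python
import difflib
from typing import Tuple


def count_changes(old_text: str, new_text: str) -> Tuple[int, int]:
    """Return (added, removed) line counts between two versions of a text file."""
    diff = list(
        difflib.unified_diff(old_text.splitlines(), new_text.splitlines(), lineterm="")
    )
    added = removed = 0
    # The first two lines are the '---' / '+++' file headers; skip them by position
    # so that removed content made of dashes is not mistaken for a header.
    for line in diff[2:]:
        if line.startswith("@@"):
            continue  # hunk header, carries no content
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added, removed


old = "Name: rgwfuncs\nVersion: 0.0.23\n"
new = "Name: rgwfuncs\nVersion: 0.0.25\n"
print(count_changes(old, new))
```

A one-line version bump like the one in `pyproject.toml` shows up as one added and one removed line, which matches the `+1 -1` entries in the list.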
@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: rgwfuncs
-Version: 0.0.
+Version: 0.0.25
 Summary: A functional programming paradigm for mathematical modelling and data science
 Home-page: https://github.com/ryangerardwilson/rgwfunc
 Author: Ryan Gerard Wilson
@@ -135,22 +135,126 @@ To display all docstrings, use:

 --------------------------------------------------------------------------------

-##
+## Documentation Access Functions

-### 1.
-Print a list of available function names in alphabetical order. If a filter is provided, print the
+### 1. docs
+Print a list of available function names in alphabetical order. If a filter is provided, print the docstrings of functions containing the term.

 • Parameters:
 - `method_type_filter` (str): Optional, comma-separated to select docstring types, or '*' for all.

 • Example:

-import
-
+from rgwfuncs import docs
+docs(method_type_filter='numeric_clean,limit_dataframe')

 --------------------------------------------------------------------------------

-
+## Algebra Based Functions
+
+This section provides comprehensive functions for handling algebraic expressions, performing tasks such as computation, simplification, solving equations, and prime factorization, all outputted in LaTeX format.
+
+### 1. `compute_algebraic_expression`
+
+Evaluates complex algebraic expressions and provides numerical results.
+
+- **Parameters:**
+- `expression` (str): A string representing an arithmetic operation.
+
+- **Returns:**
+- `float`: The computed numerical result.
+
+- **Example:**
+
+from rgwfuncs import compute_algebraic_expression
+result1 = compute_algebraic_expression("2 + 2")
+print(result1) # Output: 4.0
+
+result2 = compute_algebraic_expression("10 % 3")
+print(result2) # Output: 1.0
+
+result3 = compute_algebraic_expression("math.gcd(36, 60) * math.sin(math.radians(45)) * 10000")
+print(result3) # Output: 84852.8137423857
+
+These examples illustrate the ability to handle basic arithmetic, the modulo operator, and functions utilizing the Python math module.
+
+--------------------------------------------------------------------------------
+
+### 2. `simplify_algebraic_expression`
+
+Simplifies expressions and returns them in LaTeX format.
+
+- **Parameters:**
+- `expression` (str): A string of the expression to simplify.
+
+- **Returns:**
+- `str`: Simplified expression in LaTeX.
+
+- **Example:**
+
+from rgwfuncs import simplify_algebraic_expression
+simplified_expr1 = simplify_algebraic_expression("2*x + 3*x")
+print(simplified_expr1) # Output: "5 x"
+
+simplified_expr2 = simplify_algebraic_expression("(np.diff(3*x**8)) / (np.diff(8*x**30) * 11*y**3)")
+print(simplified_expr2) # Output: "\frac{1}{110 x^{22} y^{3}}"
+
+These examples demonstrate simplification of polynomial expressions and more complex ratios involving derivatives.
+
+--------------------------------------------------------------------------------
+
+### 3. `solve_algebraic_expression`
+
+Solves equations for specified variables, with optional substitutions, returning LaTeX-formatted solutions.
+
+- **Parameters:**
+- `expression` (str): A string of the equation to solve.
+- `variable` (str): The variable to solve for.
+- `subs` (Optional[Dict[str, float]]): Substitutions for variables.
+
+- **Returns:**
+- `str`: Solutions formatted in LaTeX.
+
+- **Example:**
+
+from rgwfuncs import solve_algebraic_expression
+solutions1 = solve_algebraic_expression("a*x**2 + b*x + c", "x", {"a": 3, "b": 7, "c": 5})
+print(solutions1) # Output: "\left[-7/6 - sqrt(11)*I/6, -7/6 + sqrt(11)*I/6\right]"
+
+solutions2 = solve_algebraic_expression("x**2 - 4", "x")
+print(solutions2) # Output: "\left[-2, 2\right]"
+
+Here, we solve both a quadratic equation with complex solutions and a simpler polynomial equation.
+
+--------------------------------------------------------------------------------
+
+### 4. `get_prime_factors_latex`
+
+Computes prime factors of a number and presents them in LaTeX format.
+
+- **Parameters:**
+- `n` (int): The integer to factorize.
+
+- **Returns:**
+- `str`: Prime factorization in LaTeX.
+
+- **Example:**
+
+from rgwfuncs import get_prime_factors_latex
+factors1 = get_prime_factors_latex(100)
+print(factors1) # Output: "2^{2} \cdot 5^{2}"
+
+factors2 = get_prime_factors_latex(60)
+print(factors2) # Output: "2^{2} \cdot 3 \cdot 5"
+
+factors3 = get_prime_factors_latex(17)
+print(factors3) # Output: "17"
+
+--------------------------------------------------------------------------------
+
+## String Based Functions
+
+### 1. send_telegram_message

 Send a message to a Telegram chat using a specified preset from your configuration file.

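The hunk above adds README entries for `compute_algebraic_expression` and `get_prime_factors_latex`, but the diff page does not show how they are implemented. The following standard-library sketch only mirrors the documented inputs and outputs. The names `evaluate_expression` and `prime_factors_latex` are hypothetical stand-ins, and the use of `eval` with a restricted namespace is an assumption rather than the package's actual approach:

```python
import math


def evaluate_expression(expression: str) -> float:
    """Evaluate an arithmetic string with only the `math` module in scope."""
    # Emptying __builtins__ narrows what the expression can reach, but this is
    # NOT a safe sandbox for untrusted input.
    return float(eval(expression, {"__builtins__": {}}, {"math": math}))


def prime_factors_latex(n: int) -> str:
    """Return the prime factorization of n (n >= 2) as a LaTeX string."""
    if n < 2:
        raise ValueError("n must be an integer >= 2")
    factors = {}  # prime -> exponent
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors[divisor] = factors.get(divisor, 0) + 1
            n //= divisor
        divisor += 1
    if n > 1:  # whatever remains is itself prime
        factors[n] = factors.get(n, 0) + 1
    terms = [f"{p}^{{{e}}}" if e > 1 else str(p) for p, e in sorted(factors.items())]
    return r" \cdot ".join(terms)


print(evaluate_expression("10 % 3"))   # 1.0
print(prime_factors_latex(60))         # 2^{2} \cdot 3 \cdot 5
```

Trial division is enough for small inputs like the README's `100`, `60` and `17`. A library such as SymPy would be the more usual choice for large integers or symbolic work.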
@@ -176,20 +280,7 @@ Send a message to a Telegram chat using a specified preset from your configurati

 Below is a quick reference of available functions, their purpose, and basic usage examples.

-### 1.
-Print a list of available function names in alphabetical order. If a filter is provided, print the matching docstrings.
-
-• Parameters:
-- `method_type_filter` (str): Optional, comma-separated to select docstring types, or '*' for all.
-
-• Example:
-
-import rgwfuncs
-rgwfuncs.df_docs(method_type_filter='numeric_clean,limit_dataframe')
-
---------------------------------------------------------------------------------
-
-### 2. `numeric_clean`
+### 1. `numeric_clean`
 Cleans the numeric columns in a DataFrame according to specified treatments.

 • Parameters:
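This hunk removes the `rgwfuncs.df_docs(...)` entry from the DataFrame section, and the README now documents a top-level `docs(method_type_filter=...)` instead. The diff page does not show the body of `docs_lib.py`, so the sketch below only illustrates the documented behaviour with `inspect`. The name `list_docs`, the `module` parameter, the `demo` module and the exact-name matching are all assumptions made for illustration:

```python
import inspect
from types import ModuleType
from typing import List, Optional


def list_docs(module: ModuleType, method_type_filter: Optional[str] = None) -> List[str]:
    """Print function names alphabetically; print docstrings for the filtered names."""
    functions = dict(inspect.getmembers(module, inspect.isfunction))
    names = sorted(functions)
    print("Available functions:", ", ".join(names))
    if not method_type_filter:
        return names
    if method_type_filter.strip() == "*":
        wanted = names  # '*' selects every function
    else:
        requested = {part.strip() for part in method_type_filter.split(",")}
        wanted = [name for name in names if name in requested]
    for name in wanted:
        print(f"\n{name}:\n{inspect.getdoc(functions[name]) or '(no docstring)'}")
    return wanted


# Hypothetical stand-in module with two documented functions.
demo = ModuleType("demo")


def numeric_clean(df):
    """Clean numeric columns."""


def limit_dataframe(df, num_rows):
    """Limit the DataFrame to a number of rows."""


demo.numeric_clean = numeric_clean
demo.limit_dataframe = limit_dataframe

list_docs(demo, "numeric_clean,limit_dataframe")
```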
@@ -218,7 +309,7 @@ Cleans the numeric columns in a DataFrame according to specified treatments.

 --------------------------------------------------------------------------------

-###
+### 2. `limit_dataframe`
 Limit the DataFrame to a specified number of rows.

 • Parameters:
@@ -239,7 +330,7 @@ Limit the DataFrame to a specified number of rows.

 --------------------------------------------------------------------------------

-###
+### 3. `from_raw_data`
 Create a DataFrame from raw data.

 • Parameters:
@@ -265,7 +356,7 @@ Create a DataFrame from raw data.

 --------------------------------------------------------------------------------

-###
+### 4. `append_rows`
 Append rows to the DataFrame.

 • Parameters:
@@ -290,7 +381,7 @@ Append rows to the DataFrame.

 --------------------------------------------------------------------------------

-###
+### 5. `append_columns`
 Append new columns to the DataFrame with None values.

 • Parameters:
@@ -311,7 +402,7 @@ Append new columns to the DataFrame with None values.

 --------------------------------------------------------------------------------

-###
+### 6. `update_rows`
 Update specific rows in the DataFrame based on a condition.

 • Parameters:
@@ -333,7 +424,7 @@ Update specific rows in the DataFrame based on a condition.

 --------------------------------------------------------------------------------

-###
+### 7. `delete_rows`
 Delete rows from the DataFrame based on a condition.

 • Parameters:
@@ -354,7 +445,7 @@ Delete rows from the DataFrame based on a condition.

 --------------------------------------------------------------------------------

-###
+### 8. `drop_duplicates`
 Drop duplicate rows in the DataFrame, retaining the first occurrence.

 • Parameters:
@@ -374,7 +465,7 @@ Drop duplicate rows in the DataFrame, retaining the first occurrence.

 --------------------------------------------------------------------------------

-###
+### 9. `drop_duplicates_retain_first`
 Drop duplicate rows based on specified columns, retaining the first occurrence.

 • Parameters:
@@ -395,7 +486,7 @@ Drop duplicate rows based on specified columns, retaining the first occurrence.

 --------------------------------------------------------------------------------

-###
+### 10. `drop_duplicates_retain_last`
 Drop duplicate rows based on specified columns, retaining the last occurrence.

 • Parameters:
@@ -417,7 +508,7 @@ Drop duplicate rows based on specified columns, retaining the last occurrence.

 --------------------------------------------------------------------------------

-###
+### 11. `load_data_from_query`

 Load data from a database query into a DataFrame based on a configuration preset.

@@ -444,7 +535,7 @@ Load data from a database query into a DataFrame based on a configuration preset

 --------------------------------------------------------------------------------

-###
+### 12. `load_data_from_path`
 Load data from a file into a DataFrame based on the file extension.

 • Parameters:
@@ -463,7 +554,7 @@ Load data from a file into a DataFrame based on the file extension.

 --------------------------------------------------------------------------------

-###
+### 13. `load_data_from_sqlite_path`
 Execute a query on a SQLite database file and return the results as a DataFrame.

 • Parameters:
@@ -483,7 +574,7 @@ Execute a query on a SQLite database file and return the results as a DataFrame.

 --------------------------------------------------------------------------------

-###
+### 14. `first_n_rows`
 Display the first n rows of the DataFrame (prints out in dictionary format).

 • Parameters:
@@ -501,7 +592,7 @@ Display the first n rows of the DataFrame (prints out in dictionary format).

 --------------------------------------------------------------------------------

-###
+### 15. `last_n_rows`
 Display the last n rows of the DataFrame (prints out in dictionary format).

 • Parameters:
@@ -519,7 +610,7 @@ Display the last n rows of the DataFrame (prints out in dictionary format).

 --------------------------------------------------------------------------------

-###
+### 16. `top_n_unique_values`
 Print the top n unique values for specified columns in the DataFrame.

 • Parameters:
@@ -538,7 +629,7 @@ Print the top n unique values for specified columns in the DataFrame.

 --------------------------------------------------------------------------------

-###
+### 17. `bottom_n_unique_values`
 Print the bottom n unique values for specified columns in the DataFrame.

 • Parameters:
@@ -557,7 +648,7 @@ Print the bottom n unique values for specified columns in the DataFrame.

 --------------------------------------------------------------------------------

-###
+### 18. `print_correlation`
 Print correlation for multiple pairs of columns in the DataFrame.

 • Parameters:
@@ -582,7 +673,7 @@ Print correlation for multiple pairs of columns in the DataFrame.

 --------------------------------------------------------------------------------

-###
+### 19. `print_memory_usage`
 Print the memory usage of the DataFrame in megabytes.

 • Parameters:
@@ -599,7 +690,7 @@ Print the memory usage of the DataFrame in megabytes.

 --------------------------------------------------------------------------------

-###
+### 20. `filter_dataframe`
 Return a new DataFrame filtered by a given query expression.

 • Parameters:
@@ -625,7 +716,7 @@ Return a new DataFrame filtered by a given query expression.

 --------------------------------------------------------------------------------

-###
+### 21. `filter_indian_mobiles`
 Filter and return rows containing valid Indian mobile numbers in the specified column.

 • Parameters:
@@ -647,7 +738,7 @@ Filter and return rows containing valid Indian mobile numbers in the specified c

 --------------------------------------------------------------------------------

-###
+### 22. `print_dataframe`
 Print the entire DataFrame and its column types. Optionally print a source path.

 • Parameters:
@@ -665,7 +756,7 @@ Print the entire DataFrame and its column types. Optionally print a source path.

 --------------------------------------------------------------------------------

-###
+### 23. `send_dataframe_via_telegram`
 Send a DataFrame via Telegram using a specified bot configuration.

 • Parameters:
@@ -692,7 +783,7 @@ Send a DataFrame via Telegram using a specified bot configuration.

 --------------------------------------------------------------------------------

-###
+### 24. `send_data_to_email`
 Send an email with an optional DataFrame attachment using the Gmail API via a specified preset.

 • Parameters:
@@ -722,7 +813,7 @@ Send an email with an optional DataFrame attachment using the Gmail API via a sp

 --------------------------------------------------------------------------------

-###
+### 25. `send_data_to_slack`
 Send a DataFrame or message to Slack using a specified bot configuration.

 • Parameters:
@@ -748,7 +839,7 @@ Send a DataFrame or message to Slack using a specified bot configuration.

 --------------------------------------------------------------------------------

-###
+### 26. `order_columns`
 Reorder the columns of a DataFrame based on a string input.

 • Parameters:
@@ -770,7 +861,7 @@ Reorder the columns of a DataFrame based on a string input.

 --------------------------------------------------------------------------------

-###
+### 27. `append_ranged_classification_column`
 Append a ranged classification column to the DataFrame.

 • Parameters:
@@ -794,7 +885,7 @@ Append a ranged classification column to the DataFrame.

 --------------------------------------------------------------------------------

-###
+### 28. `append_percentile_classification_column`
 Append a percentile classification column to the DataFrame.

 • Parameters:
@@ -818,7 +909,7 @@ Append a percentile classification column to the DataFrame.

 --------------------------------------------------------------------------------

-###
+### 29. `append_ranged_date_classification_column`
 Append a ranged date classification column to the DataFrame.

 • Parameters:
@@ -847,7 +938,7 @@ Append a ranged date classification column to the DataFrame.

 --------------------------------------------------------------------------------

-###
+### 30. `rename_columns`
 Rename columns in the DataFrame.

 • Parameters:
@@ -869,7 +960,7 @@ Rename columns in the DataFrame.

 --------------------------------------------------------------------------------

-###
+### 31. `cascade_sort`
 Cascade sort the DataFrame by specified columns and order.

 • Parameters:
@@ -895,7 +986,7 @@ Cascade sort the DataFrame by specified columns and order.

 --------------------------------------------------------------------------------

-###
+### 32. `append_xgb_labels`
 Append XGB training labels (TRAIN, VALIDATE, TEST) based on a ratio string.

 • Parameters:
@@ -917,7 +1008,7 @@ Append XGB training labels (TRAIN, VALIDATE, TEST) based on a ratio string.

 --------------------------------------------------------------------------------

-###
+### 33. `append_xgb_regression_predictions`
 Append XGB regression predictions to the DataFrame. Requires an `XGB_TYPE` column for TRAIN/TEST splits.

 • Parameters:
@@ -949,7 +1040,7 @@ Append XGB regression predictions to the DataFrame. Requires an `XGB_TYPE` colum

 --------------------------------------------------------------------------------

-###
+### 34. `append_xgb_logistic_regression_predictions`
 Append XGB logistic regression predictions to the DataFrame. Requires an `XGB_TYPE` column for TRAIN/TEST splits.

 • Parameters:
@@ -981,7 +1072,7 @@ Append XGB logistic regression predictions to the DataFrame. Requires an `XGB_TY

 --------------------------------------------------------------------------------

-###
+### 35. `print_n_frequency_cascading`
 Print the cascading frequency of top n values for specified columns.

 • Parameters:
@@ -1001,27 +1092,36 @@ Print the cascading frequency of top n values for specified columns.

 --------------------------------------------------------------------------------

-###
-Print the linear frequency of top n values for specified columns.
+### 36. `print_n_frequency_linear`

-
-
-
-
-
+Prints the linear frequency of the top `n` values for specified columns.
+
+#### Parameters:
+- **df** (`pd.DataFrame`): The DataFrame to analyze.
+- **n** (`int`): The number of top values to print for each column.
+- **columns** (`list`): A list of column names to be analyzed.
+- **order_by** (`str`): The order of frequency. The available options are:
+- `"ASC"`: Sort keys in ascending lexicographical order.
+- `"DESC"`: Sort keys in descending lexicographical order.
+- `"FREQ_ASC"`: Sort the frequencies in ascending order (least frequent first).
+- `"FREQ_DESC"`: Sort the frequencies in descending order (most frequent first).
+- `"BY_KEYS_ASC"`: Sort keys in ascending order, numerically if possible, handling special strings like 'NaN' as typical entries.
+- `"BY_KEYS_DESC"`: Sort keys in descending order, numerically if possible, handling special strings like 'NaN' as typical entries.
+
+#### Example:

-• Example:
-
 from rgwfuncs import print_n_frequency_linear
 import pandas as pd

-df = pd.DataFrame({'City': ['NY','LA','NY','SF','LA','LA']})
-print_n_frequency_linear(df, 2, 'City', 'FREQ_DESC')
-
+df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'SF', 'LA', 'LA']})
+print_n_frequency_linear(df, 2, ['City'], 'FREQ_DESC')
+
+This example analyzes the `City` column, printing the top 2 most frequent values in descending order of frequency.
+

 --------------------------------------------------------------------------------

-###
+### 37. `retain_columns`
 Retain specified columns in the DataFrame and drop the others.

 • Parameters:
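The rewritten `print_n_frequency_linear` entry above documents several `order_by` modes, and its example now passes the columns as a list (`['City']`) rather than a bare string. As a rough, standard-library illustration of what four of those modes mean, the sketch below uses `collections.Counter`. The helper `top_n_frequencies` is hypothetical, the `BY_KEYS_*` modes are omitted, and applying the ordering before truncating to `n` is an assumption rather than confirmed library behaviour:

```python
from collections import Counter


def top_n_frequencies(values, n, order_by="FREQ_DESC"):
    """Return up to n (value, count) pairs ordered according to order_by."""
    counts = Counter(values)
    if order_by == "FREQ_DESC":    # most frequent first
        items = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    elif order_by == "FREQ_ASC":   # least frequent first
        items = sorted(counts.items(), key=lambda kv: kv[1])
    elif order_by == "ASC":        # keys in ascending lexicographical order
        items = sorted(counts.items(), key=lambda kv: str(kv[0]))
    elif order_by == "DESC":       # keys in descending lexicographical order
        items = sorted(counts.items(), key=lambda kv: str(kv[0]), reverse=True)
    else:
        raise ValueError(f"unsupported order_by: {order_by!r}")
    return items[:n]


cities = ['NY', 'LA', 'NY', 'SF', 'LA', 'LA']
print(top_n_frequencies(cities, 2, "FREQ_DESC"))  # [('LA', 3), ('NY', 2)]
```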
@@ -1043,7 +1143,7 @@ Retain specified columns in the DataFrame and drop the others.

 --------------------------------------------------------------------------------

-###
+### 38. `mask_against_dataframe`
 Retain only rows with common column values between two DataFrames.

 • Parameters:
@@ -1068,7 +1168,7 @@ Retain only rows with common column values between two DataFrames.

 --------------------------------------------------------------------------------

-###
+### 39. `mask_against_dataframe_converse`
 Retain only rows with uncommon column values between two DataFrames.

 • Parameters:
@@ -1093,7 +1193,7 @@ Retain only rows with uncommon column values between two DataFrames.

 --------------------------------------------------------------------------------

-###
+### 40. `union_join`
 Perform a union join, concatenating two DataFrames and dropping duplicates.

 • Parameters:
@@ -1116,7 +1216,7 @@ Perform a union join, concatenating two DataFrames and dropping duplicates.

 --------------------------------------------------------------------------------

-###
+### 41. `bag_union_join`
 Perform a bag union join, concatenating two DataFrames without dropping duplicates.

 • Parameters:
@@ -1139,7 +1239,7 @@ Perform a bag union join, concatenating two DataFrames without dropping duplicat

 --------------------------------------------------------------------------------

-###
+### 42. `left_join`
 Perform a left join on two DataFrames.

 • Parameters:
@@ -1164,7 +1264,7 @@ Perform a left join on two DataFrames.

 --------------------------------------------------------------------------------

-###
+### 43. `right_join`
 Perform a right join on two DataFrames.

 • Parameters:
@@ -1189,7 +1289,7 @@ Perform a right join on two DataFrames.

 --------------------------------------------------------------------------------

-###
+### 44. `insert_dataframe_in_sqlite_database`

 Inserts a Pandas DataFrame into a SQLite database table. If the specified table does not exist, it will be created with column types automatically inferred from the DataFrame's data types.

@@ -1227,7 +1327,7 @@ Inserts a Pandas DataFrame into a SQLite database table. If the specified table

 --------------------------------------------------------------------------------

-###
+### 45. `sync_dataframe_to_sqlite_database`
 Processes and saves a DataFrame to an SQLite database, adding a timestamp column and replacing the existing table if needed. Creates the table if it does not exist.

 • Parameters:
@@ -1251,6 +1351,8 @@ Processes and saves a DataFrame to an SQLite database, adding a timestamp column

 --------------------------------------------------------------------------------

+
+
 ## Additional Info

 For more information, refer to each function’s docstring by calling: