rgwfuncs 0.0.21__py3-none-any.whl → 0.0.54__py3-none-any.whl
- rgwfuncs/__init__.py +5 -2
- rgwfuncs/algebra_lib.py +901 -0
- rgwfuncs/df_lib.py +111 -61
- rgwfuncs/docs_lib.py +51 -0
- rgwfuncs/interactive_shell_lib.py +32 -0
- rgwfuncs/str_lib.py +8 -44
- {rgwfuncs-0.0.21.dist-info → rgwfuncs-0.0.54.dist-info}/METADATA +517 -92
- rgwfuncs-0.0.54.dist-info/RECORD +12 -0
- rgwfuncs-0.0.21.dist-info/RECORD +0 -9
- {rgwfuncs-0.0.21.dist-info → rgwfuncs-0.0.54.dist-info}/LICENSE +0 -0
- {rgwfuncs-0.0.21.dist-info → rgwfuncs-0.0.54.dist-info}/WHEEL +0 -0
- {rgwfuncs-0.0.21.dist-info → rgwfuncs-0.0.54.dist-info}/entry_points.txt +0 -0
- {rgwfuncs-0.0.21.dist-info → rgwfuncs-0.0.54.dist-info}/top_level.txt +0 -0
@@ -1,6 +1,6 @@
|
|
1
1
|
Metadata-Version: 2.2
|
2
2
|
Name: rgwfuncs
|
3
|
-
Version: 0.0.
|
3
|
+
Version: 0.0.54
|
4
4
|
Summary: A functional programming paradigm for mathematical modelling and data science
|
5
5
|
Home-page: https://github.com/ryangerardwilson/rgwfunc
|
6
6
|
Author: Ryan Gerard Wilson
|
@@ -23,12 +23,16 @@ Requires-Dist: xgboost
|
|
23
23
|
Requires-Dist: requests
|
24
24
|
Requires-Dist: slack-sdk
|
25
25
|
Requires-Dist: google-api-python-client
|
26
|
+
Requires-Dist: boto3
|
26
27
|
|
27
28
|
# RGWFUNCS
|
28
29
|
|
29
30
|
***By Ryan Gerard Wilson (https://ryangerardwilson.com)***
|
30
31
|
|
31
|
-
|
32
|
+
|
33
|
+
This library is meant to protect your eyes (and brain) from OOP syntax. It is unbelievably sad that some of the best work done in creating math and data science libraries in Python has been corrupted by the OOP mind-virus.
|
34
|
+
|
35
|
+
By creating a functional-programming wrapper around these libraries, we aim to soothe. This library assumes a Linux environment and the existence of a `.rgwfuncsrc` file for certain features (like database querying, sending data to Slack, etc.).
|
32
36
|
|
33
37
|
--------------------------------------------------------------------------------
|
34
38
|
|
@@ -75,6 +79,15 @@ A `.rgwfuncsrc` file (located at `vi ~/.rgwfuncsrc) is required for MSSQL, CLICK
|
|
75
79
|
"db_type": "google_big_query",
|
76
80
|
"json_file_path": "",
|
77
81
|
"project_id": ""
|
82
|
+
},
|
83
|
+
{
|
84
|
+
"name": "athena_db1",
|
85
|
+
"db_type": "aws_athena",
|
86
|
+
"aws_access_key": "",
|
87
|
+
"aws_secret_key": "",
|
88
|
+
"aws_region": "",
|
89
|
+
"database": "logs",
|
90
|
+
"output_bucket": "s3://bucket-name"
|
78
91
|
}
|
79
92
|
],
|
80
93
|
"vm_presets": [
|
@@ -135,22 +148,415 @@ To display all docstrings, use:
|
|
135
148
|
|
136
149
|
--------------------------------------------------------------------------------
|
137
150
|
|
138
|
-
##
|
151
|
+
## Documentation Access
|
139
152
|
|
140
|
-
### 1.
|
141
|
-
Print a list of available function names in alphabetical order. If a filter is provided, print the
|
153
|
+
### 1. docs
|
154
|
+
Print a list of available function names in alphabetical order. If a filter is provided, print the docstrings of functions containing the term.
|
142
155
|
|
143
156
|
• Parameters:
|
144
157
|
- `method_type_filter` (str): Optional, comma-separated to select docstring types, or '*' for all.
|
145
158
|
|
146
159
|
• Example:
|
147
160
|
|
148
|
-
import
|
149
|
-
|
161
|
+
from rgwfuncs import docs
|
162
|
+
docs(method_type_filter='numeric_clean,limit_dataframe')
|
150
163
|
|
151
164
|
--------------------------------------------------------------------------------
|
152
165
|
|
153
|
-
|
166
|
+
## Interactive Shell
|
167
|
+
|
168
|
+
This section includes functions that facilitate launching an interactive Python shell to inspect and modify local variables within the user's environment.
|
169
|
+
|
170
|
+
### 1. `interactive_shell`
|
171
|
+
|
172
|
+
Launches an interactive prompt for inspecting and modifying local variables, making all methods in the rgwfuncs library available by default. This REPL (Read-Eval-Print Loop) environment supports command history and autocompletion, making it easier to interact with your Python code. This function is particularly useful for debugging purposes when you want real-time interaction with your program's execution environment.
|
173
|
+
|
174
|
+
• Parameters:
|
175
|
+
- `local_vars` (dict, optional): A dictionary of local variables to be accessible within the interactive shell. If not provided, defaults to an empty dictionary.
|
176
|
+
|
177
|
+
• Usage:
|
178
|
+
- You can call this function to enter an interactive shell where you can view and modify the variables in the given local scope.
|
179
|
+
|
180
|
+
• Example:
|
181
|
+
|
182
|
+
from rgwfuncs import interactive_shell
|
183
|
+
import pandas as pd
|
184
|
+
import numpy as np
|
185
|
+
|
186
|
+
# Example DataFrame
|
187
|
+
df = pd.DataFrame({
|
188
|
+
'id': [1, 2, 3, 4, 5],
|
189
|
+
'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
|
190
|
+
'age': [30, 25, 35, 28, 22],
|
191
|
+
'city': ['New York', 'Los Angeles', 'Chicago', 'San Francisco', 'Boston']
|
192
|
+
})
|
193
|
+
|
194
|
+
# Launch the interactive shell with local variables
|
195
|
+
interactive_shell(locals())
|
196
|
+
|
197
|
+
In the interactive shell you can then use any library imported in your Python file, as well as all rgwfuncs methods (even if they were not imported). Note that while pandas and numpy are available in the shell because the script above imports them, the rgwfuncs method `first_n_rows` was never imported, yet it is available for use.
|
198
|
+
|
199
|
+
Welcome to the rgwfuncs interactive shell.
|
200
|
+
>>> pirst_n_rows(df, 2)
|
201
|
+
Traceback (most recent call last):
|
202
|
+
File "<console>", line 1, in <module>
|
203
|
+
NameError: name 'pirst_n_rows' is not defined. Did you mean: 'first_n_rows'?
|
204
|
+
>>> first_n_rows(df, 2)
|
205
|
+
{'age': '30', 'city': 'New York', 'id': '1', 'name': 'Alice'}
|
206
|
+
{'age': '25', 'city': 'Los Angeles', 'id': '2', 'name': 'Bob'}
|
207
|
+
>>> print(df)
|
208
|
+
id name age city
|
209
|
+
0 1 Alice 30 New York
|
210
|
+
1 2 Bob 25 Los Angeles
|
211
|
+
2 3 Charlie 35 Chicago
|
212
|
+
3 4 David 28 San Francisco
|
213
|
+
4 5 Eva 22 Boston
|
214
|
+
>>> arr = np.array([1, 2, 3, 4, 5])
|
215
|
+
>>> arr
|
216
|
+
array([1, 2, 3, 4, 5])
|
217
|
+
|
218
|
+
--------------------------------------------------------------------------------
|
219
|
+
|
220
|
+
## Algebra Based Functions
|
221
|
+
|
222
|
+
This section provides functions for handling algebraic expressions: computation, simplification, equation solving, and prime factorization. Most of them return their results as LaTeX strings.
|
223
|
+
|
224
|
+
### 1. `compute_prime_factors`
|
225
|
+
|
226
|
+
Computes prime factors of a number and presents them in LaTeX format.
|
227
|
+
|
228
|
+
• Parameters:
|
229
|
+
- `n` (int): The integer to factorize.
|
230
|
+
|
231
|
+
• Returns:
|
232
|
+
- `str`: Prime factorization in LaTeX.
|
233
|
+
|
234
|
+
• Example:
|
235
|
+
|
236
|
+
from rgwfuncs import compute_prime_factors
|
237
|
+
factors_1 = compute_prime_factors(100)
|
238
|
+
print(factors_1) # Output: "2^{2} \cdot 5^{2}"
|
239
|
+
|
240
|
+
factors_2 = compute_prime_factors(60)
|
241
|
+
print(factors_2) # Output: "2^{2} \cdot 3 \cdot 5"
|
242
|
+
|
243
|
+
factors_3 = compute_prime_factors(17)
|
244
|
+
print(factors_3) # Output: "17"
|
245
|
+
|
246
|
+
--------------------------------------------------------------------------------
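The factorization format shown in the `compute_prime_factors` example can be reproduced with a few lines of standard-library Python. This is only an illustrative sketch: `prime_factors_latex` is a hypothetical name, and trial division is an assumption made here, not necessarily how rgwfuncs computes the result.

```python
from collections import Counter


def prime_factors_latex(n: int) -> str:
    """Return the prime factorization of n (n >= 2) as a LaTeX string."""
    if n < 2:
        raise ValueError("n must be an integer >= 2")
    counts = Counter()
    divisor = 2
    # Trial division: strip each divisor out completely before moving on.
    while divisor * divisor <= n:
        while n % divisor == 0:
            counts[divisor] += 1
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever remains is itself a prime factor.
        counts[n] += 1
    # Exponents of 1 are omitted, matching the documented output style.
    terms = [
        f"{p}^{{{e}}}" if e > 1 else str(p)
        for p, e in sorted(counts.items())
    ]
    return r" \cdot ".join(terms)


print(prime_factors_latex(100))  # 2^{2} \cdot 5^{2}
print(prime_factors_latex(60))   # 2^{2} \cdot 3 \cdot 5
print(prime_factors_latex(17))   # 17
```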
|
247
|
+
|
248
|
+
### 2. `compute_constant_expression`
|
249
|
+
|
250
|
+
Evaluates a constant expression provided as a string and returns the result as a float. Supports arithmetic operations, including addition, subtraction, multiplication, division, and modulo, as well as functions from Python's `math` module.
|
251
|
+
|
252
|
+
• Parameters:
|
253
|
+
- `expression` (str): The constant expression to compute. This should be a string consisting of arithmetic operations and Python's math module functions.
|
254
|
+
|
255
|
+
• Returns:
|
256
|
+
- `float`: The computed numerical result.
|
257
|
+
|
258
|
+
• Example:
|
259
|
+
|
260
|
+
from rgwfuncs import compute_constant_expression
|
261
|
+
result1 = compute_constant_expression("2 + 2")
|
262
|
+
print(result1) # Output: 4.0
|
263
|
+
|
264
|
+
result2 = compute_constant_expression("10 % 3")
|
265
|
+
print(result2) # Output: 1.0
|
266
|
+
|
267
|
+
result3 = compute_constant_expression("math.gcd(36, 60) * math.sin(math.radians(45)) * 10000")
|
268
|
+
print(result3) # Output: 84852.8137423857
|
269
|
+
|
270
|
+
--------------------------------------------------------------------------------
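One way to picture how a string such as `"math.gcd(36, 60) * ..."` can become a float is to evaluate it with only the `math` module in scope. The sketch below is an assumption about the general technique, not the library's implementation; `evaluate_constant_expression` is a hypothetical name.

```python
import math


def evaluate_constant_expression(expression: str) -> float:
    """Evaluate an arithmetic string with only the math module in scope."""
    # Emptying __builtins__ narrows what the expression can reach. It is a
    # convenience for this sketch, not a security boundary: do not pass
    # untrusted input to eval.
    result = eval(expression, {"__builtins__": {}}, {"math": math})
    return float(result)


print(evaluate_constant_expression("2 + 2"))   # 4.0
print(evaluate_constant_expression("10 % 3"))  # 1.0
```

Converting the result with `float` is what turns the integer `4` into `4.0`, which is consistent with the outputs documented above.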
|
271
|
+
|
272
|
+
### 3. `compute_constant_expression_involving_matrices`
|
273
|
+
|
274
|
+
Computes the result of a constant expression involving matrices and returns it as a LaTeX string.
|
275
|
+
|
276
|
+
• Parameters:
|
277
|
+
- `expression` (str): The constant expression involving matrices. Example format includes operations such as "+", "-", "*", "/".
|
278
|
+
|
279
|
+
• Returns:
|
280
|
+
- `str`: The LaTeX-formatted string representation of the computed matrix, or an error message if the operations cannot be performed due to dimensional mismatches.
|
281
|
+
|
282
|
+
• Example:
|
283
|
+
|
284
|
+
from rgwfuncs import compute_constant_expression_involving_matrices
|
285
|
+
|
286
|
+
# Example with addition of 2D matrices
|
287
|
+
result = compute_constant_expression_involving_matrices("[[2, 6, 9], [1, 3, 5]] + [[1, 2, 3], [4, 5, 6]]")
|
288
|
+
print(result) # Output: \begin{bmatrix}3 & 8 & 12\\5 & 8 & 11\end{bmatrix}
|
289
|
+
|
290
|
+
# Example of mixed operations with 1D matrices treated as 2D
|
291
|
+
result = compute_constant_expression_involving_matrices("[3, 6, 9] + [1, 2, 3] - [2, 2, 2]")
|
292
|
+
print(result) # Output: \begin{bmatrix}2 & 6 & 10\end{bmatrix}
|
293
|
+
|
294
|
+
# Example with dimension mismatch
|
295
|
+
result = compute_constant_expression_involving_matrices("[[4, 3, 51]] + [[1, 1]]")
|
296
|
+
print(result) # Output: Operations between matrices must involve matrices of the same dimension
|
297
|
+
|
298
|
+
--------------------------------------------------------------------------------
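The `bmatrix` output format and the dimension check described for matrix expressions can be illustrated without any third-party package. The helper names `add_matrices` and `to_bmatrix` are hypothetical, and the sketch covers only addition of already-parsed matrices, not the string parsing that the library function performs.

```python
from typing import List

Matrix = List[List[float]]


def add_matrices(a: Matrix, b: Matrix) -> Matrix:
    """Add two matrices elementwise, rejecting mismatched dimensions."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same dimensions")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def to_bmatrix(m: Matrix) -> str:
    """Format a matrix as a LaTeX bmatrix: '&' between cells, '\\\\' between rows."""
    rows = [" & ".join(str(v) for v in row) for row in m]
    return r"\begin{bmatrix}" + r"\\".join(rows) + r"\end{bmatrix}"


total = add_matrices([[2, 6, 9], [1, 3, 5]], [[1, 2, 3], [4, 5, 6]])
print(to_bmatrix(total))  # \begin{bmatrix}3 & 8 & 12\\5 & 8 & 11\end{bmatrix}
```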
|
299
|
+
|
300
|
+
### 4. `compute_constant_expression_involving_ordered_series`
|
301
|
+
|
302
|
+
Computes the result of a constant expression involving ordered series, and returns it as a LaTeX string.
|
303
|
+
|
304
|
+
|
305
|
+
• Parameters:
|
306
|
+
- `expression` (str): A series operation expression. Supports operations such as "+", "-", "*", "/", and `dd()` for discrete differences.
|
307
|
+
|
308
|
+
• Returns:
|
309
|
+
- `str`: The string representation of the resultant series after performing operations, or an error message if series lengths do not match.
|
310
|
+
|
311
|
+
• Example:
|
312
|
+
|
313
|
+
from rgwfuncs import compute_constant_expression_involving_ordered_series
|
314
|
+
|
315
|
+
# Example with addition and discrete differences
|
316
|
+
result = compute_constant_expression_involving_ordered_series("dd([2, 6, 9, 60]) + dd([78, 79, 80])")
|
317
|
+
print(result) # Output: [4, 3, 51] + [1, 1]
|
318
|
+
|
319
|
+
# Example with elementwise subtraction
|
320
|
+
result = compute_constant_expression_involving_ordered_series("[10, 15, 21] - [5, 5, 5]")
|
321
|
+
print(result) # Output: [5, 10, 16]
|
322
|
+
|
323
|
+
# Example with length mismatch
|
324
|
+
result = compute_constant_expression_involving_ordered_series("[4, 3, 51] + [1, 1]")
|
325
|
+
print(result) # Output: Operations between ordered series must involve series of equal length
|
326
|
+
|
327
|
+
--------------------------------------------------------------------------------
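The `dd()` operator and the equal-length rule for ordered series can be sketched in plain Python. Here `dd` is taken to mean the first discrete difference (each element minus its predecessor), which is an assumption inferred from the example values above; `subtract_series` is a hypothetical helper.

```python
from typing import List


def dd(series: List[float]) -> List[float]:
    """First discrete difference: series[i + 1] - series[i]."""
    return [b - a for a, b in zip(series, series[1:])]


def subtract_series(a: List[float], b: List[float]) -> List[float]:
    """Elementwise subtraction; both series must have equal length."""
    if len(a) != len(b):
        raise ValueError("series must be of equal length")
    return [x - y for x, y in zip(a, b)]


print(dd([2, 6, 9, 60]))                          # [4, 3, 51]
print(dd([78, 79, 80]))                           # [1, 1]
print(subtract_series([10, 15, 21], [5, 5, 5]))   # [5, 10, 16]
```

Each difference is one element shorter than its input, so the two `dd` results here have lengths 3 and 2 and cannot be added elementwise.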
|
328
|
+
|
329
|
+
### 5. `python_polynomial_expression_to_latex`
|
330
|
+
|
331
|
+
Converts a polynomial expression written in Python syntax to a LaTeX formatted string. This function parses algebraic expressions provided as strings using Python’s syntax and translates them into equivalent LaTeX representations, making them suitable for academic or professional documentation. The function supports inclusion of named variables, with an option to substitute specific values into the expression.
|
332
|
+
|
333
|
+
• Parameters:
|
334
|
+
- `expression` (str): The algebraic expression to convert to LaTeX. This should be a string formatted with Python syntax acceptable by SymPy.
|
335
|
+
- `subs` (Optional[Dict[str, float]]): An optional dictionary of substitutions where the keys are variable names in the expression, and the values are the numbers with which to substitute those variables.
|
336
|
+
|
337
|
+
• Returns:
|
338
|
+
- `str`: The LaTeX formatted string equivalent to the provided expression.
|
339
|
+
|
340
|
+
• Raises:
|
341
|
+
- `ValueError`: If the expression cannot be parsed due to syntax errors.
|
342
|
+
|
343
|
+
• Example:
|
344
|
+
|
345
|
+
from rgwfuncs import python_polynomial_expression_to_latex
|
346
|
+
|
347
|
+
# Convert a simple polynomial expression to LaTeX format
|
348
|
+
latex_result1 = python_polynomial_expression_to_latex("x**2 + y**2")
|
349
|
+
print(latex_result1) # Output: "x^{2} + y^{2}"
|
350
|
+
|
351
|
+
# Convert polynomial expression with substituted values
|
352
|
+
latex_result2 = python_polynomial_expression_to_latex("x**2 + y**2", {"x": 3, "y": 4})
|
353
|
+
print(latex_result2) # Output: "25"
|
354
|
+
|
355
|
+
# Another example with partial substitution
|
356
|
+
latex_result3 = python_polynomial_expression_to_latex("x**2 + y**2", {"x": 3})
|
357
|
+
print(latex_result3) # Output: "y^{2} + 9"
|
358
|
+
|
359
|
+
# Trigonometric functions included with symbolic variables
|
360
|
+
latex_result4 = python_polynomial_expression_to_latex("sin(x+z**2) + cos(y)", {"x": 55})
|
361
|
+
print(latex_result4) # Output: "cos y + sin \\left(z^{2} + 55\\right)"
|
362
|
+
|
363
|
+
# Simplified trigonometric functions example with substitution
|
364
|
+
latex_result5 = python_polynomial_expression_to_latex("sin(x) + cos(y)", {"x": 0})
|
365
|
+
print(latex_result5) # Output: "cos y"
|
366
|
+
|
367
|
+
--------------------------------------------------------------------------------
|
368
|
+
|
369
|
+
### 6. `expand_polynomial_expression`
|
370
|
+
|
371
|
+
Expands a polynomial expression written in Python syntax and converts it into a LaTeX formatted string. This function takes algebraic expressions provided as strings using Python's syntax, applies polynomial expansion through SymPy, and translates them into LaTeX representations, suitable for academic or professional documentation. It supports expressions with named variables and provides an option to substitute specific values into the expression before expansion.
|
372
|
+
|
373
|
+
• Parameters:
|
374
|
+
- `expression` (str): The algebraic expression to expand and convert to LaTeX. This string should be formatted using Python syntax acceptable by SymPy.
|
375
|
+
- `subs` (Optional[Dict[str, float]]): An optional dictionary of substitutions where the keys are variable names in the expression, and the values are the numbers with which to substitute those variables before expanding.
|
376
|
+
|
377
|
+
• Returns:
|
378
|
+
- `str`: The LaTeX formatted string of the expanded expression.
|
379
|
+
|
380
|
+
• Raises:
|
381
|
+
- `ValueError`: If the expression cannot be parsed due to syntax errors.
|
382
|
+
|
383
|
+
• Example:
|
384
|
+
|
385
|
+
from rgwfuncs import expand_polynomial_expression
|
386
|
+
|
387
|
+
# Expand a simple polynomial expression and convert to LaTeX
|
388
|
+
latex_result1 = expand_polynomial_expression("(x + y)**2")
|
389
|
+
print(latex_result1) # Output: "x^{2} + 2 x y + y^{2}"
|
390
|
+
|
391
|
+
# Expand polynomial expression with substituted values
|
392
|
+
latex_result2 = expand_polynomial_expression("(x + y)**2", {"x": 3, "y": 4})
|
393
|
+
print(latex_result2) # Output: "49"
|
394
|
+
|
395
|
+
# Another example with partial substitution
|
396
|
+
latex_result3 = expand_polynomial_expression("(x + y)**2", {"x": 3})
|
397
|
+
print(latex_result3) # Output: "y^{2} + 6 y + 9"
|
398
|
+
|
399
|
+
# Handling trigonometric functions with symbolic variables
|
400
|
+
latex_result4 = expand_polynomial_expression("sin(x + z**2) + cos(y)", {"x": 55})
|
401
|
+
print(latex_result4) # Output: "cos y + sin \\left(z^{2} + 55\\right)"
|
402
|
+
|
403
|
+
--------------------------------------------------------------------------------
|
404
|
+
|
405
|
+
### 7. `factor_polynomial_expression`
|
406
|
+
|
407
|
+
Factors a polynomial expression written in Python syntax and converts it into a LaTeX formatted string. This function parses an algebraic expression, performs polynomial factoring using SymPy, and converts the factored expression into a LaTeX representation, ideal for academic or professional use. Optional substitutions can be made before factoring.
|
408
|
+
|
409
|
+
• Parameters:
|
410
|
+
- `expression` (str): The polynomial expression to factor and convert to LaTeX. This should be a valid expression formatted using Python syntax.
|
411
|
+
- `subs` (Optional[Dict[str, float]]): An optional dictionary of substitutions. The keys are variable names in the expression, and the values are numbers that replace these variables.
|
412
|
+
|
413
|
+
• Returns:
|
414
|
+
- `str`: The LaTeX formatted string representing the factored expression.
|
415
|
+
|
416
|
+
• Raises:
|
417
|
+
- `ValueError`: If the expression cannot be parsed due to syntax errors.
|
418
|
+
|
419
|
+
• Example:
|
420
|
+
|
421
|
+
from rgwfuncs import factor_polynomial_expression
|
422
|
+
|
423
|
+
# Factor a polynomial expression and convert to LaTeX
|
424
|
+
latex_result1 = factor_polynomial_expression("x**2 - 4")
|
425
|
+
print(latex_result1) # Output: "\left(x - 2\right) \left(x + 2\right)"
|
426
|
+
|
427
|
+
# Factor with substituted values
|
428
|
+
latex_result2 = factor_polynomial_expression("x**2 - y**2", {"y": 3})
|
429
|
+
print(latex_result2) # Output: "\left(x - 3\right) \left(x + 3\right)"
|
430
|
+
|
431
|
+
--------------------------------------------------------------------------------
|
432
|
+
|
433
|
+
### 8. `simplify_polynomial_expression`
|
434
|
+
|
435
|
+
Simplifies an algebraic expression in polynomial form and returns it in LaTeX format. Takes an algebraic expression, in polynomial form, written in Python syntax and simplifies it. The result is returned as a LaTeX formatted string, suitable for academic or professional documentation.
|
436
|
+
|
437
|
+
• Parameters:
|
438
|
+
- `expression` (str): The algebraic expression, in polynomial form, to simplify. For instance, `np.diff(8*x**30)` is a polynomial expression, whereas `np.diff([2,5,9,11])` is not.
|
439
|
+
- `subs` (Optional[Dict[str, float]]): An optional dictionary of substitutions where keys are variable names and values are the numbers to substitute them with.
|
440
|
+
|
441
|
+
• Returns:
|
442
|
+
- `str`: The simplified expression formatted as a LaTeX string.
|
443
|
+
|
444
|
+
• Example:
|
445
|
+
|
446
|
+
from rgwfuncs import simplify_polynomial_expression
|
447
|
+
|
448
|
+
# Example 1: Simplifying a polynomial expression without substitutions
|
449
|
+
simplified_expr1 = simplify_polynomial_expression("2*x + 3*x")
|
450
|
+
print(simplified_expr1) # Output: "5 x"
|
451
|
+
|
452
|
+
# Example 2: Simplifying a complex expression involving derivatives
|
453
|
+
simplified_expr2 = simplify_polynomial_expression("(np.diff(3*x**8)) / (np.diff(8*x**30) * 11*y**3)")
|
454
|
+
print(simplified_expr2) # Output: r"\frac{1}{110 x^{22} y^{3}}"
|
455
|
+
|
456
|
+
# Example 3: Simplifying with substitutions
|
457
|
+
simplified_expr3 = simplify_polynomial_expression("x**2 + y**2", subs={"x": 3, "y": 4})
|
458
|
+
print(simplified_expr3) # Output: "25"
|
459
|
+
|
460
|
+
# Example 4: Simplifying with partial substitution
|
461
|
+
simplified_expr4 = simplify_polynomial_expression("a*b + b", subs={"b": 2})
|
462
|
+
print(simplified_expr4) # Output: "2 a + 2"
|
463
|
+
|
464
|
+
--------------------------------------------------------------------------------
|
465
|
+
|
466
|
+
### 9. `cancel_polynomial_expression`
|
467
|
+
|
468
|
+
Cancels common factors within a polynomial expression written in Python syntax and converts it to a LaTeX formatted string. This function parses an algebraic expression, cancels common factors using SymPy, and translates the reduced expression into a LaTeX representation. It can also accommodate optional substitutions to be made prior to simplification.
|
469
|
+
|
470
|
+
• Parameters:
|
471
|
+
- `expression` (str): The algebraic expression to simplify and convert to LaTeX. This string should be formatted using Python syntax.
|
472
|
+
- `subs` (Optional[Dict[str, float]]): An optional dictionary of substitutions where the keys are variable names in the expression, and the values are the numbers to substitute.
|
473
|
+
|
474
|
+
• Returns:
|
475
|
+
- `str`: The LaTeX formatted string of the simplified expression. If the expression involves indeterminate forms due to operations like division by zero, a descriptive error message is returned instead.
|
476
|
+
|
477
|
+
• Raises:
|
478
|
+
- `ValueError`: If the expression cannot be parsed due to syntax errors or involves undefined operations, such as division by zero.
|
479
|
+
|
480
|
+
• Example:
|
481
|
+
|
482
|
+
from rgwfuncs import cancel_polynomial_expression
|
483
|
+
|
484
|
+
# Cancel common factors within a polynomial expression
|
485
|
+
latex_result1 = cancel_polynomial_expression("(x**2 - 4) / (x - 2)")
|
486
|
+
print(latex_result1) # Output: "x + 2"
|
487
|
+
|
488
|
+
# Cancel with substituted values
|
489
|
+
latex_result2 = cancel_polynomial_expression("(x**2 - 4) / (x - 2)", {"x": 2})
|
490
|
+
print(latex_result2) # Output: "Undefined result. This could be a division by zero error."
|
491
|
+
|
492
|
+
--------------------------------------------------------------------------------
|
493
|
+
|
494
|
+
### 10. `solve_homogeneous_polynomial_expression`
|
495
|
+
|
496
|
+
Solves a homogeneous polynomial expression for a specified variable and returns the solutions as a LaTeX formatted string. The expression is assumed to be homogeneous (i.e. equal to zero). Substitutions for the other variables in the equation may optionally be supplied.
|
497
|
+
|
498
|
+
• Parameters:
|
499
|
+
- `expression` (str): A string of the homogeneous polynomial expression to solve.
|
500
|
+
- `variable` (str): The variable to solve for.
|
501
|
+
- `subs` (Optional[Dict[str, float]]): Substitutions for variables.
|
502
|
+
|
503
|
+
• Returns:
|
504
|
+
- `str`: Solutions formatted in LaTeX.
|
505
|
+
|
506
|
+
• Example:
|
507
|
+
|
508
|
+
from rgwfuncs import solve_homogeneous_polynomial_expression
|
509
|
+
solutions1 = solve_homogeneous_polynomial_expression("a*x**2 + b*x + c", "x", {"a": 3, "b": 7, "c": 5})
|
510
|
+
print(solutions1) # Output: "\left[-7/6 - sqrt(11)*I/6, -7/6 + sqrt(11)*I/6\right]"
|
511
|
+
|
512
|
+
solutions2 = solve_homogeneous_polynomial_expression("x**2 - 4", "x")
|
513
|
+
print(solutions2) # Output: "\left[-2, 2\right]"
|
514
|
+
|
515
|
+
--------------------------------------------------------------------------------
|
516
|
+
|
517
|
+
### 11. `plot_polynomial_functions`
|
518
|
+
|
519
|
+
This function plots polynomial functions described by a list of expressions and their corresponding substitution dictionaries. It generates SVG markup of the plots, with options for specifying the domain, axis zoom, and legend display.
|
520
|
+
|
521
|
+
• Parameters:
|
522
|
+
- `functions` (`List[Dict[str, Dict[str, Any]]]`): A list of dictionaries, each containing:
|
523
|
+
- A key which is a string representing a Python/NumPy expression (e.g., `"x**2"`, `"np.diff(x,2)"`).
|
524
|
+
- A value which is a dictionary containing substitutions for the expression. Must include an `"x"` key, either as `"*"` for default domain or a NumPy array.
|
525
|
+
- `zoom` (`float`): Determines the numeric axis range from `-zoom` to `+zoom` for both x and y axes (default is `10.0`).
|
526
|
+
- `show_legend` (`bool`): Specifies whether to include a legend in the plot (default is `True`).
|
527
|
+
- `open_file` (`bool`): If `True`, opens the SVG in the system's default viewer, from `save_path` when one is given and from a temporary file otherwise (defaults to `False`).
|
528
|
+
- `save_path` (`Optional[str]`): If specified, saves the output string as a .svg at the indicated path (defaults to None).
|
529
|
+
|
530
|
+
• Returns:
|
531
|
+
- `str`: The raw SVG markup of the resulting plot.
|
532
|
+
|
533
|
+
• Example:
|
534
|
+
|
535
|
+
import numpy as np
from rgwfuncs import plot_polynomial_functions
|
536
|
+
|
537
|
+
# Generate the SVG
|
538
|
+
plot_svg_string = plot_polynomial_functions(
|
539
|
+
functions=[
|
540
|
+
{"x**2": {"x": "*"}}, # Single expression, "*" means plot all discernable points
|
541
|
+
{"x**2/(2 + a) + a": {"x": np.linspace(-3, 4, 101), "a": 1.23}},
|
542
|
+
{"np.diff(x**3, 2)": {"x": np.linspace(-2, 2, 10)}}
|
543
|
+
],
|
544
|
+
zoom=2
|
545
|
+
)
|
546
|
+
|
547
|
+
# Write the SVG to an actual file
|
548
|
+
with open("plot.svg", "w", encoding="utf-8") as file:
|
549
|
+
file.write(plot_svg_string)
|
550
|
+
|
551
|
+
• Displaying the SVG:
|
552
|
+
|
553
|
+

|
554
|
+
|
555
|
+
--------------------------------------------------------------------------------
|
556
|
+
|
557
|
+
## String Based Functions
|
558
|
+
|
559
|
+
### 1. send_telegram_message
|
154
560
|
|
155
561
|
Send a message to a Telegram chat using a specified preset from your configuration file.
|
156
562
|
|
@@ -176,20 +582,7 @@ Send a message to a Telegram chat using a specified preset from your configurati
|
|
176
582
|
|
177
583
|
Below is a quick reference of available functions, their purpose, and basic usage examples.
|
178
584
|
|
179
|
-
### 1.
|
180
|
-
Print a list of available function names in alphabetical order. If a filter is provided, print the matching docstrings.
|
181
|
-
|
182
|
-
• Parameters:
|
183
|
-
- `method_type_filter` (str): Optional, comma-separated to select docstring types, or '*' for all.
|
184
|
-
|
185
|
-
• Example:
|
186
|
-
|
187
|
-
import rgwfuncs
|
188
|
-
rgwfuncs.df_docs(method_type_filter='numeric_clean,limit_dataframe')
|
189
|
-
|
190
|
-
--------------------------------------------------------------------------------
|
191
|
-
|
192
|
-
### 2. `numeric_clean`
|
585
|
+
### 1. `numeric_clean`
|
193
586
|
Cleans the numeric columns in a DataFrame according to specified treatments.
|
194
587
|
|
195
588
|
• Parameters:
|
@@ -218,7 +611,7 @@ Cleans the numeric columns in a DataFrame according to specified treatments.
|
|
218
611
|
|
219
612
|
--------------------------------------------------------------------------------
|
220
613
|
|
221
|
-
###
|
614
|
+
### 2. `limit_dataframe`
|
222
615
|
Limit the DataFrame to a specified number of rows.
|
223
616
|
|
224
617
|
• Parameters:
|
@@ -239,7 +632,7 @@ Limit the DataFrame to a specified number of rows.
|
|
239
632
|
|
240
633
|
--------------------------------------------------------------------------------
|
241
634
|
|
242
|
-
###
|
635
|
+
### 3. `from_raw_data`
|
243
636
|
Create a DataFrame from raw data.
|
244
637
|
|
245
638
|
• Parameters:
|
@@ -265,7 +658,7 @@ Create a DataFrame from raw data.
|
|
265
658
|
|
266
659
|
--------------------------------------------------------------------------------
|
267
660
|
|
268
|
-
###
|
661
|
+
### 4. `append_rows`
|
269
662
|
Append rows to the DataFrame.
|
270
663
|
|
271
664
|
• Parameters:
|
@@ -290,7 +683,7 @@ Append rows to the DataFrame.
|
|
290
683
|
|
291
684
|
--------------------------------------------------------------------------------
|
292
685
|
|
293
|
-
###
|
686
|
+
### 5. `append_columns`
|
294
687
|
Append new columns to the DataFrame with None values.
|
295
688
|
|
296
689
|
• Parameters:
|
@@ -311,7 +704,7 @@ Append new columns to the DataFrame with None values.
|
|
311
704
|
|
312
705
|
--------------------------------------------------------------------------------
|
313
706
|
|
314
|
-
###
|
707
|
+
### 6. `update_rows`
|
315
708
|
Update specific rows in the DataFrame based on a condition.
|
316
709
|
|
317
710
|
• Parameters:
|
@@ -333,7 +726,7 @@ Update specific rows in the DataFrame based on a condition.
|
|
333
726
|
|
334
727
|
--------------------------------------------------------------------------------
|
335
728
|
|
336
|
-
###
|
729
|
+
### 7. `delete_rows`
|
337
730
|
Delete rows from the DataFrame based on a condition.
|
338
731
|
|
339
732
|
• Parameters:
|
@@ -354,7 +747,7 @@ Delete rows from the DataFrame based on a condition.
|
|
354
747
|
|
355
748
|
--------------------------------------------------------------------------------
|
356
749
|
|
357
|
-
###
|
750
|
+
### 8. `drop_duplicates`
|
358
751
|
Drop duplicate rows in the DataFrame, retaining the first occurrence.
|
359
752
|
|
360
753
|
• Parameters:
|
@@ -374,7 +767,7 @@ Drop duplicate rows in the DataFrame, retaining the first occurrence.
|
|
374
767
|
|
375
768
|
--------------------------------------------------------------------------------
|
376
769
|
|
377
|
-
###
|
770
|
+
### 9. `drop_duplicates_retain_first`
|
378
771
|
Drop duplicate rows based on specified columns, retaining the first occurrence.
|
379
772
|
|
380
773
|
• Parameters:
|
@@ -395,7 +788,7 @@ Drop duplicate rows based on specified columns, retaining the first occurrence.
|
|
395
788
|
|
396
789
|
--------------------------------------------------------------------------------
|
397
790
|
|
398
|
-
###
|
791
|
+
### 10. `drop_duplicates_retain_last`
|
399
792
|
Drop duplicate rows based on specified columns, retaining the last occurrence.
|
400
793
|
|
401
794
|
• Parameters:
|
@@ -417,34 +810,55 @@ Drop duplicate rows based on specified columns, retaining the last occurrence.
|
|
417
810
|
|
418
811
|
--------------------------------------------------------------------------------
|
419
812
|
|
420
|
-
###
|
813
|
+
### 11. `load_data_from_query`
|
421
814
|
|
422
|
-
Load data from a database query
|
815
|
+
Load data from a specified database using a SQL query and return the results in a Pandas DataFrame. The database connection configurations are determined by a preset name specified in a configuration file.
|
423
816
|
|
424
|
-
|
425
|
-
- `db_preset_name` (str): Name of the database preset in the configuration file.
|
426
|
-
- `query` (str): The SQL query to execute.
|
817
|
+
#### Features
|
427
818
|
|
428
|
-
-
|
429
|
-
|
819
|
+
- Multi-Database Support: This function supports different database types, including MSSQL, MySQL, ClickHouse, Google BigQuery, and AWS Athena, based on the configuration preset selected.
|
820
|
+
- Configuration-Based: It utilizes a configuration file to store database connection details securely, avoiding hardcoding sensitive information directly into the script.
|
821
|
+
- Dynamic Query Execution: Capable of executing custom user-defined SQL queries against the specified database.
|
822
|
+
- Automatic Result Loading: Fetches query results and loads them directly into a Pandas DataFrame for further manipulation and analysis.
|
430
823
|
|
431
|
-
|
432
|
-
- The configuration file is assumed to be located at `~/.rgwfuncsrc`.
|
824
|
+
#### Parameters
|
433
825
|
|
434
|
-
-
|
826
|
+
- `db_preset_name` (str): The name of the database preset found in the configuration file. This preset determines which database connection details to use.
|
827
|
+
- `query` (str): The SQL query string to be executed on the database.
|
435
828
|
|
436
|
-
|
829
|
+
#### Returns
|
830
|
+
|
831
|
+
- `pd.DataFrame`: Returns a DataFrame that contains the results from the executed SQL query.
|
832
|
+
|
833
|
+
#### Configuration Details
|
834
|
+
|
835
|
+
- The configuration file is expected to be in JSON format and located at `~/.rgwfuncsrc`.
|
836
|
+
- Each preset within the configuration file must include:
|
837
|
+
- `name`: Name of the database preset.
|
838
|
+
- `db_type`: Type of the database (`mssql`, `mysql`, `clickhouse`, `google_big_query`, `aws_athena`).
|
839
|
+
- `credentials`: Necessary credentials such as host, username, password, and potentially others depending on the database type.
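The preset structure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the helper `find_preset`, the way the preset list is nested inside `~/.rgwfuncsrc`, and the field names inside `credentials` are all hypothetical. Only `name`, `db_type`, `credentials`, and the list of supported types come from the documentation.

```python
import json

# Database types listed in the documentation above.
SUPPORTED_DB_TYPES = {"mssql", "mysql", "clickhouse", "google_big_query", "aws_athena"}


def find_preset(presets, db_preset_name):
    """Return the preset whose `name` matches, or raise ValueError.

    Hypothetical helper for illustration; it is not part of rgwfuncs.
    """
    for preset in presets:
        if preset.get("name") == db_preset_name:
            if preset.get("db_type") not in SUPPORTED_DB_TYPES:
                raise ValueError(f"Unsupported db_type: {preset.get('db_type')!r}")
            return preset
    raise ValueError(f"No preset named {db_preset_name!r}")


# Example preset list parsed from JSON; the credential field names are placeholders.
example_presets = json.loads("""
[
  {
    "name": "MyDBPreset",
    "db_type": "mysql",
    "credentials": {"host": "localhost", "username": "user", "password": "secret"}
  }
]
""")

preset = find_preset(example_presets, "MyDBPreset")
print(preset["db_type"])
```

Raising `ValueError` for a missing preset or an unsupported `db_type` mirrors the error behaviour described for `load_data_from_query`, which makes a misconfigured file fail early rather than at connection time.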
|
840
|
+
|
841
|
+
#### Example
|
842
|
+
|
843
|
+
from rgwfuncs import load_data_from_query
|
844
|
+
|
845
|
+
# Load data using a preset configuration
|
846
|
+
df = load_data_from_query(
|
847
|
+
    db_preset_name="MyDBPreset",
|
848
|
+
    query="SELECT * FROM my_table"
|
849
|
+
)
|
850
|
+
print(df)
|
437
851
|
|
438
|
-
|
439
|
-
db_preset_name="MyDBPreset",
|
440
|
-
query="SELECT * FROM my_table"
|
441
|
-
)
|
442
|
-
print(df)
|
852
|
+
#### Notes
|
443
853
|
|
854
|
+
- Security: Ensure that the configuration file (`~/.rgwfuncsrc`) is secure and accessible only to authorized users, as it contains sensitive information.
|
855
|
+
- Prerequisites: Ensure the necessary Python packages are installed for each database type you wish to query. For example, `pymssql` for MSSQL, `mysql-connector-python` for MySQL, and so on.
|
856
|
+
- Error Handling: The function raises a `ValueError` if the specified preset name does not exist or if the database type is unsupported. Additional exceptions may arise from network issues or database errors.
|
857
|
+
- Environment: For AWS Athena, ensure that AWS credentials are configured properly for the boto3 library to authenticate successfully. Consider using AWS IAM roles or AWS Secrets Manager for better security management.
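One way to act on the Security note is to check that the configuration file is readable only by its owner. The sketch below assumes a POSIX system (the library itself assumes Linux) and uses a hypothetical helper, `is_private`; it runs against a temporary file rather than the real `~/.rgwfuncsrc`.

```python
import os
import stat
import tempfile


def is_private(path):
    """Return True if neither group nor others have any permission on `path`."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Demonstrated on a temporary file, not on the real configuration file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name

os.chmod(demo_path, 0o644)  # readable by group and others
before = is_private(demo_path)

os.chmod(demo_path, 0o600)  # owner read/write only
after = is_private(demo_path)

os.remove(demo_path)
print(before, after)
```

Applying the same `0o600` mode to `~/.rgwfuncsrc` would restrict it to the owning user.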
|
444
858
|
|
445
859
|
--------------------------------------------------------------------------------
|
446
860
|
|
447
|
-
###
|
861
|
+
### 12. `load_data_from_path`
|
448
862
|
Load data from a file into a DataFrame based on the file extension.
|
449
863
|
|
450
864
|
• Parameters:
|
@@ -463,7 +877,7 @@ Load data from a file into a DataFrame based on the file extension.
|
|
463
877
|
|
464
878
|
--------------------------------------------------------------------------------
|
465
879
|
|
466
|
-
###
|
880
|
+
### 13. `load_data_from_sqlite_path`
|
467
881
|
Execute a query on a SQLite database file and return the results as a DataFrame.
|
468
882
|
|
469
883
|
• Parameters:
|
@@ -483,7 +897,7 @@ Execute a query on a SQLite database file and return the results as a DataFrame.
|
|
483
897
|
|
484
898
|
--------------------------------------------------------------------------------
|
485
899
|
|
486
|
-
###
|
900
|
+
### 14. `first_n_rows`
|
487
901
|
Display the first n rows of the DataFrame (prints out in dictionary format).
|
488
902
|
|
489
903
|
• Parameters:
|
@@ -501,7 +915,7 @@ Display the first n rows of the DataFrame (prints out in dictionary format).
|
|
501
915
|
|
502
916
|
--------------------------------------------------------------------------------
|
503
917
|
|
504
|
-
###
|
918
|
+
### 15. `last_n_rows`
|
505
919
|
Display the last n rows of the DataFrame (prints out in dictionary format).
|
506
920
|
|
507
921
|
• Parameters:
|
@@ -519,7 +933,7 @@ Display the last n rows of the DataFrame (prints out in dictionary format).
|
|
519
933
|
|
520
934
|
--------------------------------------------------------------------------------
|
521
935
|
|
522
|
-
###
|
936
|
+
### 16. `top_n_unique_values`
|
523
937
|
Print the top n unique values for specified columns in the DataFrame.
|
524
938
|
|
525
939
|
• Parameters:
|
@@ -538,7 +952,7 @@ Print the top n unique values for specified columns in the DataFrame.
|
|
538
952
|
|
539
953
|
--------------------------------------------------------------------------------
|
540
954
|
|
541
|
-
###
|
955
|
+
### 17. `bottom_n_unique_values`
|
542
956
|
Print the bottom n unique values for specified columns in the DataFrame.
|
543
957
|
|
544
958
|
• Parameters:
|
@@ -557,7 +971,7 @@ Print the bottom n unique values for specified columns in the DataFrame.
|
|
557
971
|
|
558
972
|
--------------------------------------------------------------------------------
|
559
973
|
|
560
|
-
###
|
974
|
+
### 18. `print_correlation`
|
561
975
|
Print correlation for multiple pairs of columns in the DataFrame.
|
562
976
|
|
563
977
|
• Parameters:
|
@@ -582,7 +996,7 @@ Print correlation for multiple pairs of columns in the DataFrame.
|
|
582
996
|
|
583
997
|
--------------------------------------------------------------------------------
|
584
998
|
|
585
|
-
###
|
999
|
+
### 19. `print_memory_usage`
|
586
1000
|
Print the memory usage of the DataFrame in megabytes.
|
587
1001
|
|
588
1002
|
• Parameters:
|
@@ -599,7 +1013,7 @@ Print the memory usage of the DataFrame in megabytes.
|
|
599
1013
|
|
600
1014
|
--------------------------------------------------------------------------------
|
601
1015
|
|
602
|
-
###
|
1016
|
+
### 20. `filter_dataframe`
|
603
1017
|
Return a new DataFrame filtered by a given query expression.
|
604
1018
|
|
605
1019
|
• Parameters:
|
@@ -625,7 +1039,7 @@ Return a new DataFrame filtered by a given query expression.
|
|
625
1039
|
|
626
1040
|
--------------------------------------------------------------------------------
|
627
1041
|
|
628
|
-
###
|
1042
|
+
### 21. `filter_indian_mobiles`
|
629
1043
|
Filter and return rows containing valid Indian mobile numbers in the specified column.
|
630
1044
|
|
631
1045
|
• Parameters:
|
@@ -647,7 +1061,7 @@ Filter and return rows containing valid Indian mobile numbers in the specified c
|
|
647
1061
|
|
648
1062
|
--------------------------------------------------------------------------------
|
649
1063
|
|
650
|
-
###
|
1064
|
+
### 22. `print_dataframe`
|
651
1065
|
Print the entire DataFrame and its column types. Optionally print a source path.
|
652
1066
|
|
653
1067
|
• Parameters:
|
@@ -665,7 +1079,7 @@ Print the entire DataFrame and its column types. Optionally print a source path.
|
|
665
1079
|
|
666
1080
|
--------------------------------------------------------------------------------
|
667
1081
|
|
668
|
-
###
|
1082
|
+
### 23. `send_dataframe_via_telegram`
|
669
1083
|
Send a DataFrame via Telegram using a specified bot configuration.
|
670
1084
|
|
671
1085
|
• Parameters:
|
@@ -692,7 +1106,7 @@ Send a DataFrame via Telegram using a specified bot configuration.
|
|
692
1106
|
|
693
1107
|
--------------------------------------------------------------------------------
|
694
1108
|
|
695
|
-
###
|
1109
|
+
### 24. `send_data_to_email`
|
696
1110
|
Send an email with an optional DataFrame attachment using the Gmail API via a specified preset.
|
697
1111
|
|
698
1112
|
• Parameters:
|
@@ -722,7 +1136,7 @@ Send an email with an optional DataFrame attachment using the Gmail API via a sp
|
|
722
1136
|
|
723
1137
|
--------------------------------------------------------------------------------
|
724
1138
|
|
725
|
-
###
|
1139
|
+
### 25. `send_data_to_slack`
|
726
1140
|
Send a DataFrame or message to Slack using a specified bot configuration.
|
727
1141
|
|
728
1142
|
• Parameters:
|
@@ -748,7 +1162,7 @@ Send a DataFrame or message to Slack using a specified bot configuration.
|
|
748
1162
|
|
749
1163
|
--------------------------------------------------------------------------------
|
750
1164
|
|
751
|
-
###
|
1165
|
+
### 26. `order_columns`
|
752
1166
|
Reorder the columns of a DataFrame based on a string input.
|
753
1167
|
|
754
1168
|
• Parameters:
|
@@ -770,7 +1184,7 @@ Reorder the columns of a DataFrame based on a string input.
|
|
770
1184
|
|
771
1185
|
--------------------------------------------------------------------------------
|
772
1186
|
|
773
|
-
###
|
1187
|
+
### 27. `append_ranged_classification_column`
|
774
1188
|
Append a ranged classification column to the DataFrame.
|
775
1189
|
|
776
1190
|
• Parameters:
|
@@ -794,7 +1208,7 @@ Append a ranged classification column to the DataFrame.
|
|
794
1208
|
|
795
1209
|
--------------------------------------------------------------------------------
|
796
1210
|
|
797
|
-
###
|
1211
|
+
### 28. `append_percentile_classification_column`
|
798
1212
|
Append a percentile classification column to the DataFrame.
|
799
1213
|
|
800
1214
|
• Parameters:
|
@@ -818,7 +1232,7 @@ Append a percentile classification column to the DataFrame.
|
|
818
1232
|
|
819
1233
|
--------------------------------------------------------------------------------
|
820
1234
|
|
821
|
-
###
|
1235
|
+
### 29. `append_ranged_date_classification_column`
|
822
1236
|
Append a ranged date classification column to the DataFrame.
|
823
1237
|
|
824
1238
|
• Parameters:
|
@@ -847,7 +1261,7 @@ Append a ranged date classification column to the DataFrame.
|
|
847
1261
|
|
848
1262
|
--------------------------------------------------------------------------------
|
849
1263
|
|
850
|
-
###
|
1264
|
+
### 30. `rename_columns`
|
851
1265
|
Rename columns in the DataFrame.
|
852
1266
|
|
853
1267
|
• Parameters:
|
@@ -869,7 +1283,7 @@ Rename columns in the DataFrame.
|
|
869
1283
|
|
870
1284
|
--------------------------------------------------------------------------------
|
871
1285
|
|
872
|
-
###
|
1286
|
+
### 31. `cascade_sort`
|
873
1287
|
Cascade sort the DataFrame by specified columns and order.
|
874
1288
|
|
875
1289
|
• Parameters:
|
@@ -895,7 +1309,7 @@ Cascade sort the DataFrame by specified columns and order.
|
|
895
1309
|
|
896
1310
|
--------------------------------------------------------------------------------
|
897
1311
|
|
898
|
-
###
|
1312
|
+
### 32. `append_xgb_labels`
|
899
1313
|
Append XGB training labels (TRAIN, VALIDATE, TEST) based on a ratio string.
|
900
1314
|
|
901
1315
|
• Parameters:
|
@@ -917,7 +1331,7 @@ Append XGB training labels (TRAIN, VALIDATE, TEST) based on a ratio string.
|
|
917
1331
|
|
918
1332
|
--------------------------------------------------------------------------------
|
919
1333
|
|
920
|
-
###
|
1334
|
+
### 33. `append_xgb_regression_predictions`
|
921
1335
|
Append XGB regression predictions to the DataFrame. Requires an `XGB_TYPE` column for TRAIN/TEST splits.
|
922
1336
|
|
923
1337
|
• Parameters:
|
@@ -949,7 +1363,7 @@ Append XGB regression predictions to the DataFrame. Requires an `XGB_TYPE` colum
|
|
949
1363
|
|
950
1364
|
--------------------------------------------------------------------------------
|
951
1365
|
|
952
|
-
###
|
1366
|
+
### 34. `append_xgb_logistic_regression_predictions`
|
953
1367
|
Append XGB logistic regression predictions to the DataFrame. Requires an `XGB_TYPE` column for TRAIN/TEST splits.
|
954
1368
|
|
955
1369
|
• Parameters:
|
@@ -981,7 +1395,7 @@ Append XGB logistic regression predictions to the DataFrame. Requires an `XGB_TY
|
|
981
1395
|
|
982
1396
|
--------------------------------------------------------------------------------
|
983
1397
|
|
984
|
-
###
|
1398
|
+
### 35. `print_n_frequency_cascading`
|
985
1399
|
Print the cascading frequency of top n values for specified columns.
|
986
1400
|
|
987
1401
|
• Parameters:
|
@@ -1001,27 +1415,36 @@ Print the cascading frequency of top n values for specified columns.
|
|
1001
1415
|
|
1002
1416
|
--------------------------------------------------------------------------------
|
1003
1417
|
|
1004
|
-
###
|
1005
|
-
Print the linear frequency of top n values for specified columns.
|
1418
|
+
### 36. `print_n_frequency_linear`
|
1006
1419
|
|
1007
|
-
|
1008
|
-
|
1009
|
-
|
1010
|
-
|
1011
|
-
|
1420
|
+
Prints the linear frequency of the top `n` values for specified columns.
|
1421
|
+
|
1422
|
+
#### Parameters:
|
1423
|
+
- **df** (`pd.DataFrame`): The DataFrame to analyze.
|
1424
|
+
- **n** (`int`): The number of top values to print for each column.
|
1425
|
+
- **columns** (`list`): A list of column names to be analyzed.
|
1426
|
+
- **order_by** (`str`): The order of frequency. The available options are:
|
1427
|
+
- `"ASC"`: Sort keys in ascending lexicographical order.
|
1428
|
+
- `"DESC"`: Sort keys in descending lexicographical order.
|
1429
|
+
- `"FREQ_ASC"`: Sort the frequencies in ascending order (least frequent first).
|
1430
|
+
- `"FREQ_DESC"`: Sort the frequencies in descending order (most frequent first).
|
1431
|
+
- `"BY_KEYS_ASC"`: Sort keys in ascending order, numerically where possible, treating special strings such as 'NaN' as ordinary entries.
|
1432
|
+
- `"BY_KEYS_DESC"`: Sort keys in descending order, numerically where possible, treating special strings such as 'NaN' as ordinary entries.
|
1433
|
+
|
1434
|
+
#### Example:
|
1012
1435
|
|
1013
|
-
• Example:
|
1014
|
-
|
1015
1436
|
from rgwfuncs import print_n_frequency_linear
|
1016
1437
|
import pandas as pd
|
1017
1438
|
|
1018
|
-
df = pd.DataFrame({'City': ['NY','LA','NY','SF','LA','LA']})
|
1019
|
-
print_n_frequency_linear(df, 2, 'City', 'FREQ_DESC')
|
1020
|
-
|
1439
|
+
df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'SF', 'LA', 'LA']})
|
1440
|
+
print_n_frequency_linear(df, 2, ['City'], 'FREQ_DESC')
|
1441
|
+
|
1442
|
+
This example analyzes the `City` column, printing the top 2 most frequent values in descending order of frequency.
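To make the orderings concrete, the same counts can be reproduced with plain pandas. This is only a sketch of the idea behind `FREQ_DESC` and `ASC`; it is not the library's implementation and does not reproduce the function's printed output format.

```python
import pandas as pd

df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'SF', 'LA', 'LA']})

# value_counts() sorts by frequency, most frequent first (like FREQ_DESC)
counts = df['City'].value_counts()
top_2 = counts.head(2).to_dict()
print(top_2)  # {'LA': 3, 'NY': 2}

# Sorting the same counts by key instead (like ASC)
by_key = counts.sort_index().to_dict()
print(by_key)  # {'LA': 3, 'NY': 2, 'SF': 1}
```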
|
1443
|
+
|
1021
1444
|
|
1022
1445
|
--------------------------------------------------------------------------------
|
1023
1446
|
|
1024
|
-
###
|
1447
|
+
### 37. `retain_columns`
|
1025
1448
|
Retain specified columns in the DataFrame and drop the others.
|
1026
1449
|
|
1027
1450
|
• Parameters:
|
@@ -1043,7 +1466,7 @@ Retain specified columns in the DataFrame and drop the others.
|
|
1043
1466
|
|
1044
1467
|
--------------------------------------------------------------------------------
|
1045
1468
|
|
1046
|
-
###
|
1469
|
+
### 38. `mask_against_dataframe`
|
1047
1470
|
Retain only rows with common column values between two DataFrames.
|
1048
1471
|
|
1049
1472
|
• Parameters:
|
@@ -1068,7 +1491,7 @@ Retain only rows with common column values between two DataFrames.
|
|
1068
1491
|
|
1069
1492
|
--------------------------------------------------------------------------------
|
1070
1493
|
|
1071
|
-
###
|
1494
|
+
### 39. `mask_against_dataframe_converse`
|
1072
1495
|
Retain only rows with uncommon column values between two DataFrames.
|
1073
1496
|
|
1074
1497
|
• Parameters:
|
@@ -1093,7 +1516,7 @@ Retain only rows with uncommon column values between two DataFrames.
|
|
1093
1516
|
|
1094
1517
|
--------------------------------------------------------------------------------
|
1095
1518
|
|
1096
|
-
###
|
1519
|
+
### 40. `union_join`
|
1097
1520
|
Perform a union join, concatenating two DataFrames and dropping duplicates.
|
1098
1521
|
|
1099
1522
|
• Parameters:
|
@@ -1116,7 +1539,7 @@ Perform a union join, concatenating two DataFrames and dropping duplicates.
|
|
1116
1539
|
|
1117
1540
|
--------------------------------------------------------------------------------
|
1118
1541
|
|
1119
|
-
###
|
1542
|
+
### 41. `bag_union_join`
|
1120
1543
|
Perform a bag union join, concatenating two DataFrames without dropping duplicates.
|
1121
1544
|
|
1122
1545
|
• Parameters:
|
@@ -1139,7 +1562,7 @@ Perform a bag union join, concatenating two DataFrames without dropping duplicat
|
|
1139
1562
|
|
1140
1563
|
--------------------------------------------------------------------------------
|
1141
1564
|
|
1142
|
-
###
|
1565
|
+
### 42. `left_join`
|
1143
1566
|
Perform a left join on two DataFrames.
|
1144
1567
|
|
1145
1568
|
• Parameters:
|
@@ -1164,7 +1587,7 @@ Perform a left join on two DataFrames.
|
|
1164
1587
|
|
1165
1588
|
--------------------------------------------------------------------------------
|
1166
1589
|
|
1167
|
-
###
|
1590
|
+
### 43. `right_join`
|
1168
1591
|
Perform a right join on two DataFrames.
|
1169
1592
|
|
1170
1593
|
• Parameters:
|
@@ -1189,7 +1612,7 @@ Perform a right join on two DataFrames.
|
|
1189
1612
|
|
1190
1613
|
--------------------------------------------------------------------------------
|
1191
1614
|
|
1192
|
-
###
|
1615
|
+
### 44. `insert_dataframe_in_sqlite_database`
|
1193
1616
|
|
1194
1617
|
Inserts a Pandas DataFrame into a SQLite database table. If the specified table does not exist, it will be created with column types automatically inferred from the DataFrame's data types.
|
1195
1618
|
|
@@ -1227,7 +1650,7 @@ Inserts a Pandas DataFrame into a SQLite database table. If the specified table
|
|
1227
1650
|
|
1228
1651
|
--------------------------------------------------------------------------------
|
1229
1652
|
|
1230
|
-
###
|
1653
|
+
### 45. `sync_dataframe_to_sqlite_database`
|
1231
1654
|
Processes and saves a DataFrame to an SQLite database, adding a timestamp column and replacing the existing table if needed. Creates the table if it does not exist.
|
1232
1655
|
|
1233
1656
|
• Parameters:
|
@@ -1251,6 +1674,8 @@ Processes and saves a DataFrame to an SQLite database, adding a timestamp column
|
|
1251
1674
|
|
1252
1675
|
--------------------------------------------------------------------------------
|
1253
1676
|
|
1677
|
+
|
1678
|
+
|
1254
1679
|
## Additional Info
|
1255
1680
|
|
1256
1681
|
For more information, refer to each function’s docstring by calling:
|