spells-mtg 0.9.4.tar.gz → 0.9.6.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.


@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: spells-mtg
-Version: 0.9.4
+Version: 0.9.6
 Summary: analaysis of 17Lands.com public datasets
 Author-Email: Joel Barnes <oelarnes@gmail.com>
 License: MIT
@@ -15,21 +15,23 @@ Description-Content-Type: text/markdown
 
 ```
 $ spells add DSK
-🪄 spells ✨ [data home]=/Users/joel/.local/share/spells/
+ 🪄 spells ✨ [data home]=/home/joel/.local/share/spells/
 
-🪄 add ✨ Downloading draft dataset from 17Lands.com
+ 🪄 add ✨ Downloading draft dataset from 17Lands.com
 100% [......................................................................] 250466473 / 250466473
-🪄 add ✨ Unzipping and transforming to parquet (this might take a few minutes)...
-🪄 add ✨ Wrote file /Users/joel/.local/share/spells/external/DSK/DSK_PremierDraft_draft.parquet
-🪄 clean ✨ No local cache found for set DSK
-🪄 add ✨ Fetching card data from mtgjson.com and writing card parquet file
-🪄 add ✨ Wrote file /Users/joel/.local/share/spells/external/DSK/DSK_card.parquet
-🪄 add ✨ Downloading game dataset from 17Lands.com
+ 🪄 add ✨ Unzipping and transforming to parquet (this might take a few minutes)...
+ 🪄 add ✨ Wrote file /home/joel/.local/share/spells/external/DSK/DSK_PremierDraft_draft.parquet
+ 🪄 clean ✨ No local cache found for set DSK
+ 🪄 add ✨ Fetching card data from mtgjson.com and writing card file
+ 🪄 add ✨ Wrote file /home/joel/.local/share/spells/external/DSK/DSK_card.parquet
+ 🪄 add ✨ Calculating set context
+ 🪄 add ✨ Wrote file /home/joel/.local/share/spells/external/DSK/DSK_PremierDraft_context.parquet
+ 🪄 add ✨ Downloading game dataset from 17Lands.com
 100% [........................................................................] 77145600 / 77145600
-🪄 add ✨ Unzipping and transforming to parquet (this might take a few minutes)...
-🪄 add ✨ Wrote file /Users/joel/.local/share/spells/external/DSK/DSK_PremierDraft_game.parquet
-🪄 clean ✨ No local cache found for set DSK
-$ ipython
+ 🪄 add ✨ Unzipping and transforming to parquet (this might take a few minutes)...
+ 🪄 add ✨ Wrote file /home/joel/.local/share/spells/external/DSK/DSK_PremierDraft_game.parquet
+ 🪄 clean ✨ Removed 1 files from local cache for set DSK
+ 🪄 clean ✨ Removed local cache dir /home/joel/.local/share/spells/cache/DSK
 ```
 
 ```python
@@ -70,6 +72,7 @@ Spells is not affiliated with 17Lands. Please review the [Usage Guidelines](http
 
 - Uses [Polars](https://docs.pola.rs/) for high-performance, multi-threaded aggregations of large datasets
 - Uses Polars to power an expressive query language for specifying custom extensions
+- Analyzes larger-than-memory datasets using Polars streaming mode
 - Converts csv datasets to parquet for 10x faster calculations and 20x smaller file sizes
 - Supports calculating the standard aggregations and measures out of the box with no arguments (ALSA, GIH WR, etc)
 - Caches aggregate DataFrames in the local file system automatically for instantaneous reproduction of previous analysis
@@ -249,7 +252,7 @@ Spells caches the results of expensive aggregations in the local file system as
 
 ### Memory Usage
 
-One of my goals in creating Spells was to eliminate issues with memory pressure by exclusively using the map-reduce paradigm and a technology that supports partitioned/streaming aggregation of larget-than-memory datasets. By default, Polars loads the entire dataset in memory, but the API exposes a parameter `streaming` which I have exposed as `use_streaming`. Unfortunately, that feature does not seem to work for my queries and the memory performance can be quite poor. The one feature that may assist in memory management is the local caching, since you can restart the kernel without losing all of your progress. In particular, be careful about opening multiple Jupyter tabs unless you have at least 32 GB. In general I have not run into issues on my 16 GB MacBook Air except with running multiple kernels at once. Supporting larger-than memory computations is on my roadmap, so check back periodically to see if I've made any progress.
+One of my goals in creating Spells was to eliminate issues with memory pressure by exclusively using the map-reduce paradigm and a technology that supports partitioned/streaming aggregation of larget-than-memory datasets. By default, Polars loads the entire dataset in memory, but the API exposes a parameter `streaming` which I have exposed as `use_streaming`. Further testing is needed to determine the performance impacts, but this is the first thing you should try if you run into memory issues.
 
 When refreshing a given set's data files from 17Lands using the provided cli, the cache for that set is automatically cleared. The `spells` CLI gives additional tools for managing the local and external caches.
 
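The rewritten memory note above pairs with the `use_streaming` parameter added to `summon` later in this diff. A minimal sketch of opting in, assuming the DSK data has already been fetched with `spells add DSK`:

```python
from spells import summon

# use_streaming asks Polars to aggregate in chunks rather than loading
# the whole dataset into memory; results should match the default path.
df = summon("DSK", use_streaming=True)
```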
@@ -289,9 +292,10 @@ To use `spells`, make sure Spells is installed in your environment using pip or
 ### Summon
 
 ```python
-from spell import summon
+from spells import summon
 
 summon(
+    set_code: list[str] | str,
     columns: list[str] | None = None,
     group_by: list[str] | None = None,
     filter_spec: dict | None = None,
@@ -300,11 +304,16 @@ summon(
     set_context: pl.DataFrame | dict[str, Any] | None = None,
     read_cache: bool = True,
     write_cache: bool = True,
+    use_streaming: bool = False,
+    log_to_console: int = logging.ERROR,
 ) -> polars.DataFrame
 ```
 
 #### parameters
 
+- `set_code`: a set code or list of set codes among those that you have added using `spells add`.
+You can use "expansion" as a group_by to separate results from multiple sets, or you can aggregate them together.
+
 - `columns`: a list of string or `ColName` values to select as non-grouped columns. Valid `ColTypes` are `PICK_SUM`, `NAME_SUM`, `GAME_SUM`, and `AGG`. Min/Max/Unique
 aggregations of non-numeric (or numeric) data types are not supported. If `None`, use a set of columns modeled on the commonly used values on 17Lands.com/card_data.
 
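The new `set_code` parameter documented above takes a single set code or a list. A sketch of both shapes; "BLB" here is just an illustrative second set, which would also need to have been added locally:

```python
from spells import summon

# Aggregate two sets together into one result...
combined = summon(["DSK", "BLB"])

# ...or keep results separated per set by grouping on "expansion".
by_set = summon(["DSK", "BLB"], group_by=["expansion"])
```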
@@ -326,6 +335,8 @@ aggregations of non-numeric (or numeric) data types are not supported. If `None`
 
 - `read_cache`/`write_cache`: Use the local file system to cache and retrieve aggregations to minimize expensive reads of the large datasets. You shouldn't need to touch these arguments unless you are debugging.
 
+- 'log_to_console': Set to `logging.INFO` to see useful messages on the progress of your aggregation, or `logging.WARNING` to see warning messages about potentially invalid column definitions.
+
 ### Enums
 
 ```python
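The `log_to_console` parameter added above takes standard `logging` levels, so verbosity is a per-call choice. A sketch:

```python
import logging

from spells import summon

# logging.INFO surfaces progress messages; logging.WARNING would instead
# flag potentially invalid column definitions, per the docs above.
df = summon("DSK", log_to_console=logging.INFO)
```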
@@ -505,13 +516,10 @@ A table of all included columns. Columns can be referenced by enum or by string
 # Roadmap to 1.0
 
 - [ ] Support Traditional and Premier datasets (currently only Premier is supported)
-- [ ] Group by all
 - [ ] Enable configuration using $XDG_CONFIG_HOME/cfg.toml
-- [ ] Support min and max aggregations over base views
 - [ ] Enhanced profiling
 - [ ] Optimized caching strategy
 - [ ] Organize and analyze daily downloads from 17Lands (not a scraper!)
 - [ ] Helper functions to generate second-order analysis by card name
 - [ ] Helper functions for common plotting paradigms
-- [ ] Example notebooks
 - [ ] Scientific workflows: regression, MLE, etc
@@ -4,21 +4,23 @@
 
 ```
 $ spells add DSK
-🪄 spells ✨ [data home]=/Users/joel/.local/share/spells/
+ 🪄 spells ✨ [data home]=/home/joel/.local/share/spells/
 
-🪄 add ✨ Downloading draft dataset from 17Lands.com
+ 🪄 add ✨ Downloading draft dataset from 17Lands.com
 100% [......................................................................] 250466473 / 250466473
-🪄 add ✨ Unzipping and transforming to parquet (this might take a few minutes)...
-🪄 add ✨ Wrote file /Users/joel/.local/share/spells/external/DSK/DSK_PremierDraft_draft.parquet
-🪄 clean ✨ No local cache found for set DSK
-🪄 add ✨ Fetching card data from mtgjson.com and writing card parquet file
-🪄 add ✨ Wrote file /Users/joel/.local/share/spells/external/DSK/DSK_card.parquet
-🪄 add ✨ Downloading game dataset from 17Lands.com
+ 🪄 add ✨ Unzipping and transforming to parquet (this might take a few minutes)...
+ 🪄 add ✨ Wrote file /home/joel/.local/share/spells/external/DSK/DSK_PremierDraft_draft.parquet
+ 🪄 clean ✨ No local cache found for set DSK
+ 🪄 add ✨ Fetching card data from mtgjson.com and writing card file
+ 🪄 add ✨ Wrote file /home/joel/.local/share/spells/external/DSK/DSK_card.parquet
+ 🪄 add ✨ Calculating set context
+ 🪄 add ✨ Wrote file /home/joel/.local/share/spells/external/DSK/DSK_PremierDraft_context.parquet
+ 🪄 add ✨ Downloading game dataset from 17Lands.com
 100% [........................................................................] 77145600 / 77145600
-🪄 add ✨ Unzipping and transforming to parquet (this might take a few minutes)...
-🪄 add ✨ Wrote file /Users/joel/.local/share/spells/external/DSK/DSK_PremierDraft_game.parquet
-🪄 clean ✨ No local cache found for set DSK
-$ ipython
+ 🪄 add ✨ Unzipping and transforming to parquet (this might take a few minutes)...
+ 🪄 add ✨ Wrote file /home/joel/.local/share/spells/external/DSK/DSK_PremierDraft_game.parquet
+ 🪄 clean ✨ Removed 1 files from local cache for set DSK
+ 🪄 clean ✨ Removed local cache dir /home/joel/.local/share/spells/cache/DSK
 ```
 
 ```python
@@ -59,6 +61,7 @@ Spells is not affiliated with 17Lands. Please review the [Usage Guidelines](http
 
 - Uses [Polars](https://docs.pola.rs/) for high-performance, multi-threaded aggregations of large datasets
 - Uses Polars to power an expressive query language for specifying custom extensions
+- Analyzes larger-than-memory datasets using Polars streaming mode
 - Converts csv datasets to parquet for 10x faster calculations and 20x smaller file sizes
 - Supports calculating the standard aggregations and measures out of the box with no arguments (ALSA, GIH WR, etc)
 - Caches aggregate DataFrames in the local file system automatically for instantaneous reproduction of previous analysis
@@ -238,7 +241,7 @@ Spells caches the results of expensive aggregations in the local file system as
 
 ### Memory Usage
 
-One of my goals in creating Spells was to eliminate issues with memory pressure by exclusively using the map-reduce paradigm and a technology that supports partitioned/streaming aggregation of larget-than-memory datasets. By default, Polars loads the entire dataset in memory, but the API exposes a parameter `streaming` which I have exposed as `use_streaming`. Unfortunately, that feature does not seem to work for my queries and the memory performance can be quite poor. The one feature that may assist in memory management is the local caching, since you can restart the kernel without losing all of your progress. In particular, be careful about opening multiple Jupyter tabs unless you have at least 32 GB. In general I have not run into issues on my 16 GB MacBook Air except with running multiple kernels at once. Supporting larger-than memory computations is on my roadmap, so check back periodically to see if I've made any progress.
+One of my goals in creating Spells was to eliminate issues with memory pressure by exclusively using the map-reduce paradigm and a technology that supports partitioned/streaming aggregation of larget-than-memory datasets. By default, Polars loads the entire dataset in memory, but the API exposes a parameter `streaming` which I have exposed as `use_streaming`. Further testing is needed to determine the performance impacts, but this is the first thing you should try if you run into memory issues.
 
 When refreshing a given set's data files from 17Lands using the provided cli, the cache for that set is automatically cleared. The `spells` CLI gives additional tools for managing the local and external caches.
 
@@ -278,9 +281,10 @@ To use `spells`, make sure Spells is installed in your environment using pip or
 ### Summon
 
 ```python
-from spell import summon
+from spells import summon
 
 summon(
+    set_code: list[str] | str,
     columns: list[str] | None = None,
     group_by: list[str] | None = None,
     filter_spec: dict | None = None,
@@ -289,11 +293,16 @@ summon(
     set_context: pl.DataFrame | dict[str, Any] | None = None,
     read_cache: bool = True,
     write_cache: bool = True,
+    use_streaming: bool = False,
+    log_to_console: int = logging.ERROR,
 ) -> polars.DataFrame
 ```
 
 #### parameters
 
+- `set_code`: a set code or list of set codes among those that you have added using `spells add`.
+You can use "expansion" as a group_by to separate results from multiple sets, or you can aggregate them together.
+
 - `columns`: a list of string or `ColName` values to select as non-grouped columns. Valid `ColTypes` are `PICK_SUM`, `NAME_SUM`, `GAME_SUM`, and `AGG`. Min/Max/Unique
 aggregations of non-numeric (or numeric) data types are not supported. If `None`, use a set of columns modeled on the commonly used values on 17Lands.com/card_data.
 
@@ -315,6 +324,8 @@ aggregations of non-numeric (or numeric) data types are not supported. If `None`
 
 - `read_cache`/`write_cache`: Use the local file system to cache and retrieve aggregations to minimize expensive reads of the large datasets. You shouldn't need to touch these arguments unless you are debugging.
 
+- 'log_to_console': Set to `logging.INFO` to see useful messages on the progress of your aggregation, or `logging.WARNING` to see warning messages about potentially invalid column definitions.
+
 ### Enums
 
 ```python
@@ -494,13 +505,10 @@ A table of all included columns. Columns can be referenced by enum or by string
 # Roadmap to 1.0
 
 - [ ] Support Traditional and Premier datasets (currently only Premier is supported)
-- [ ] Group by all
 - [ ] Enable configuration using $XDG_CONFIG_HOME/cfg.toml
-- [ ] Support min and max aggregations over base views
 - [ ] Enhanced profiling
 - [ ] Optimized caching strategy
 - [ ] Organize and analyze daily downloads from 17Lands (not a scraper!)
 - [ ] Helper functions to generate second-order analysis by card name
 - [ ] Helper functions for common plotting paradigms
-- [ ] Example notebooks
 - [ ] Scientific workflows: regression, MLE, etc
@@ -11,7 +11,7 @@ dependencies = [
 ]
 requires-python = ">=3.11"
 readme = "README.md"
-version = "0.9.4"
+version = "0.9.6"
 
 [project.license]
 text = "MIT"
@@ -28,6 +28,9 @@ build-backend = "pdm.backend"
 [tool.pdm]
 distribution = true
 
+[tool.pdm.scripts]
+post_install = "scripts/post_install.py"
+
 [tool.pdm.publish.upload]
 env_file = "$HOME/.pypienv"
 
@@ -25,7 +25,7 @@ class DataDir(StrEnum):
 
 
 def spells_print(mode, content):
-    print(f"🪄 {mode} ✨ {content}")
+    print(f" 🪄 {mode} ✨ {content}")
 
 
 def data_home() -> str:
@@ -78,7 +78,7 @@ def card_df(draft_set_code: str, names: list[str]) -> pl.DataFrame:
     draft_set_json = _fetch_mtg_json(draft_set_code)
     booster_info = draft_set_json["data"]["booster"]
 
-    booster_type = "play" if "play" in booster_info else "draft"
+    booster_type = "play" if "play" in booster_info else "draft" if "draft" in booster_info else list(booster_info.keys())[0]
     set_codes = booster_info[booster_type]["sourceSetCodes"]
     set_codes.reverse()
 
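The change turns a two-way choice into a chained conditional expression with a final fallback to whichever booster type mtgjson lists first. The same logic in isolation, with a made-up `booster_info` dict:

```python
# Made-up booster_info mirroring the shape of mtgjson's "booster" key.
booster_info = {"collector": {}, "draft": {}}

booster_type = (
    "play" if "play" in booster_info
    else "draft" if "draft" in booster_info
    else list(booster_info.keys())[0]
)
print(booster_type)  # "draft"; with neither "play" nor "draft" present,
                     # the first key ("collector") would be used instead.
```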
@@ -455,7 +455,7 @@ def _base_agg_df(
     return joined_df
 
 
-@make_verbose
+@make_verbose()
 def summon(
     set_code: str | list[str],
     columns: list[str] | None = None,
@@ -59,10 +59,12 @@ def console_logging(log_level):
         logger.removeHandler(console_handler)
 
 
-def make_verbose(func: Callable) -> Callable:
-    @wraps(func)
-    def wrapped(*args, logging: int=logging.ERROR, **kwargs):
-        with console_logging(logging):
-            return func(*args, **kwargs)
-    return wrapped
+def make_verbose(level: int=logging.ERROR) -> Callable:
+    def decorator(func: Callable) -> Callable:
+        @wraps(func)
+        def wrapped(*args, log_to_console: int=level, **kwargs):
+            with console_logging(log_to_console):
+                return func(*args, **kwargs)
+        return wrapped
+    return decorator
 
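`make_verbose` is now a decorator factory: calling it returns the actual decorator, which is why the call site earlier in this diff changes from `@make_verbose` to `@make_verbose()`. The wrapper consumes a `log_to_console` keyword before the wrapped function runs. A self-contained sketch of the pattern; the `console_logging` stand-in here only mimics what the project's context manager appears to do:

```python
import logging
from contextlib import contextmanager
from functools import wraps
from typing import Callable

logger = logging.getLogger("spells_demo")  # illustrative logger name

@contextmanager
def console_logging(log_level: int):
    # Attach a console handler at the requested level, then restore state.
    handler = logging.StreamHandler()
    prior_level = logger.level
    logger.addHandler(handler)
    logger.setLevel(log_level)
    try:
        yield
    finally:
        logger.setLevel(prior_level)
        logger.removeHandler(handler)

def make_verbose(level: int = logging.ERROR) -> Callable:
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapped(*args, log_to_console: int = level, **kwargs):
            # log_to_console is consumed here; func never receives it.
            with console_logging(log_to_console):
                return func(*args, **kwargs)
        return wrapped
    return decorator

@make_verbose()
def demo() -> None:
    logger.info("progress message")

demo(log_to_console=logging.INFO)  # opt in to INFO output for this call
demo()                             # default level: only ERROR and above
```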
7 files without changes