spells-mtg 0.2.2__tar.gz → 0.3.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
- spells_mtg-0.2.2/README.md → spells_mtg-0.3.0/PKG-INFO +18 -3
- spells_mtg-0.2.2/PKG-INFO → spells_mtg-0.3.0/README.md +7 -14
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/pyproject.toml +1 -1
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/spells/columns.py +13 -1
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/spells/draft_data.py +28 -13
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/spells/enums.py +3 -0
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/spells/manifest.py +26 -16
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/LICENSE +0 -0
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/spells/__init__.py +0 -0
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/spells/cache.py +0 -0
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/spells/cards.py +0 -0
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/spells/external.py +0 -0
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/spells/filter.py +0 -0
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/spells/schema.py +0 -0
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/tests/__init__.py +0 -0
- {spells_mtg-0.2.2 → spells_mtg-0.3.0}/tests/filter_test.py +0 -0
spells_mtg-0.2.2/README.md → spells_mtg-0.3.0/PKG-INFO

```diff
@@ -1,3 +1,14 @@
+Metadata-Version: 2.1
+Name: spells-mtg
+Version: 0.3.0
+Summary: analaysis of 17Lands.com public datasets
+Author-Email: Joel Barnes <oelarnes@gmail.com>
+License: MIT
+Requires-Python: >=3.11
+Requires-Dist: polars>=1.14.0
+Requires-Dist: wget>=3.2
+Description-Content-Type: text/markdown
+
 # 🪄 spells ✨
 
 **spells** is a python package that tutors up blazing-fast and extensible analysis of the public data sets provided by [17Lands](https://www.17lands.com/) and exiles the annoying and slow parts of your workflow. Spells exposes one first-class function, `summon`, which summons a Polars DataFrame to the battlefield.
```
```diff
@@ -211,6 +222,10 @@ Spells is built on top of Polars, a modern, well-supported DataFrame engine writ
 
 Spells caches the results of expensive aggregations in the local file system as parquet files, which by default are found under the `data/local` path from the execution directory, which can be configured using the environment variable `SPELLS_PROJECT_DIR`. Query plans which request the same set of first-stage aggregations (sums over base rows) will attempt to locate the aggregate data in the cache before calculating. This guarantees that a repeated call to `summon` returns instantaneously.
 
+### Memory Usage
+
+One of my goals in creating Spells was to eliminate issues with memory pressure by exclusively using the map-reduce paradigm and a technology that supports partitioned/streaming aggregation of larget-than-memory datasets. By default, Polars loads the entire dataset in memory, but the API exposes a parameter `streaming` which I have exposed as `use_streaming`. Unfortunately, that feature does not seem to work for my queries and the memory performance can be quite poor, including poor garbage collection. The one feature that may assist in memory management is the local caching, since you can restart the kernel without losing all of your progress. In particular, be careful about opening multiple Jupyter tabs unless you have at least 32 GB. In general I have not run into issues on my 16 GB MacBook Air except with running multiple kernels at once. Supporting larger-than memory computations is on my roadmap, so check back periodically to see if I've made any progress.
+
 When refreshing a given set's data files from 17Lands using the provided cli, the cache for that set is automatically cleared. The `spells` CLI gives additional tools for managing the local and external caches.
 
 # Documentation
```
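For context, the caching and `use_streaming` behavior described in the section added above can be exercised with a few lines. This is a sketch only: `SPELLS_PROJECT_DIR`, `use_streaming`, and the cache behavior come from the README text itself, while the import path, the placeholder set code, and the exact `summon` call shape are assumptions.

```python
# Sketch: exercising the parquet cache and the use_streaming flag described above.
# "DSK" is a placeholder set code; summon's exact signature is assumed, not confirmed.
import os

# Cache files land under data/local inside this directory (per the README above).
os.environ["SPELLS_PROJECT_DIR"] = "/path/to/project"

from spells import summon  # assumed import path

df = summon("DSK", use_streaming=True)        # first call: aggregates the 17Lands files and caches
df_again = summon("DSK", use_streaming=True)  # repeated call: read back from the cached parquet
```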
```diff
@@ -263,13 +278,13 @@ summon(
 
 #### parameters
 
-- columns: a list of string or `ColName` values to select as non-grouped columns. Valid `ColTypes` are `PICK_SUM`, `NAME_SUM`, `GAME_SUM`, `CARD_ATTR`, and `AGG`. Min/Max/Unique
+- columns: a list of string or `ColName` values to select as non-grouped columns. Valid `ColTypes` are `PICK_SUM`, `NAME_SUM`, `GAME_SUM`, `CARD_ATTR`, `CARD_SUM` and `AGG`. Min/Max/Unique
 aggregations of non-numeric (or numeric) data types are not supported. If `None`, use a set of columns modeled on the commonly used values on 17Lands.com/card_data.
 
 - group_by: a list of string or `ColName` values to display as grouped columns. Valid `ColTypes` are `GROUP_BY` and `CARD_ATTR`. By default, group by "name" (card name).
 
 - filter_spec: a dictionary specifying a filter, using a small number of paradigms. Columns used must be in each base view ("draft" and "game") that the `columns` and `group_by` columns depend on, so
-`AGG` and `CARD_ATTR` columns are not valid. `NAME_SUM` columns are also not supported. Derived columns are supported. No filter is applied by default. Yes, I should rewrite it to use the mongo query language. The specification is best understood with examples:
+`AGG`, `CARD_SUM` and `CARD_ATTR` columns are not valid. `NAME_SUM` columns are also not supported. Derived columns are supported. No filter is applied by default. Yes, I should rewrite it to use the mongo query language. The specification is best understood with examples:
 
 - `{'player_cohort': 'Top'}` "player_cohort" value equals "Top".
 - `{'lhs': 'player_cohort', 'op': 'in', 'rhs': ['Top', 'Middle']}` "player_cohort" value is either "Top" or "Middle". Supported values for `op` are `<`, `<=`, `>`, `>=`, `!=`, `=`, `in` and `nin`.
```
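To make the parameter list above concrete, here is a minimal sketch of a call in the documented shape. The `columns`, `group_by`, and `filter_spec` names and the filter syntax are taken from the text above; the positional set-code argument, the placeholder set code, and the import path are assumptions.

```python
# Sketch: a summon call using the parameters documented above; not verified against the API.
from spells import summon  # assumed import path

df = summon(
    "DSK",  # placeholder set code
    columns=["gih_wr_z", "deck_mana_value_avg"],  # deck_mana_value_avg is the new CARD_SUM-backed AGG column
    group_by=["name"],                            # the default grouping, by card name
    filter_spec={"lhs": "player_cohort", "op": "in", "rhs": ["Top", "Middle"]},
)
print(df.head())
```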
```diff
@@ -309,7 +324,7 @@ Used to define extensions in `summon`
 
 - `name`: any string, including existing columns, although this is very likely to break dependent columns, so don't do it. For `NAME_SUM` columns, the name is the prefix without the underscore, e.g. "drawn".
 
-- `col_type`: one of the `ColType` enum values, `FILTER_ONLY`, `GROUP_BY`, `PICK_SUM`, `NAME_SUM`, `GAME_SUM`, `CARD_ATTR`, and `AGG`. See documentation for `summon` for usage. All columns except `CARD_ATTR` and `AGG` must be derivable at the individual row level on one or both base views. `CARD_ATTR` must be derivable at the individual row level from the card file. `AGG` can depend on any column present after summing over groups, and can include polars Expression aggregations. Arbitrarily long chains of aggregate dependencies are supported.
+- `col_type`: one of the `ColType` enum values, `FILTER_ONLY`, `GROUP_BY`, `PICK_SUM`, `NAME_SUM`, `GAME_SUM`, `CARD_ATTR`, `CARD_SUM`, and `AGG`. See documentation for `summon` for usage. All columns except `CARD_ATTR`, `CARD_SUM` and `AGG` must be derivable at the individual row level on one or both base views. `CARD_ATTR` must be derivable at the individual row level from the card file. `AGG` can depend on any column present after summing over groups, and can include polars Expression aggregations. `CARD_SUM` columns are expressed similarly to `AGG`, but they are calculated before grouping by card name and are summed before the `AGG` selection stage (for example, to calculate average mana value. See example notebook "Card Attributes"). Arbitrarily long chains of aggregate dependencies are supported.
 
 - `expr`: A polars expression giving the derivation of the column value at the first level where it is defined. For `NAME_SUM` columns the `exprMap` attribute must be used instead. `AGG` columns that depend on `NAME_SUM` columns reference the prefix (`cdef.name`) only, since the unpivot has occured prior to selection.
```
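The new `CARD_SUM` type is easiest to read as a pair of column definitions. The sketch below mirrors the built-in specs added in the `spells/columns.py` hunks later in this diff; treating `ColumnSpec` as the user-facing extension hook and the import locations are assumptions.

```python
import polars as pl

from spells.columns import ColumnSpec     # assumed import location; see spells/columns.py below
from spells.enums import ColName, ColType

# CARD_SUM: evaluated per base row, then summed per card name before the AGG stage.
deck_mana_value = ColumnSpec(
    name=ColName.DECK_MANA_VALUE,
    col_type=ColType.CARD_SUM,
    expr=pl.col(ColName.MANA_VALUE) * pl.col(ColName.DECK),
    dependencies=[ColName.MANA_VALUE, ColName.DECK],
)

# AGG: the ratio of the two sums, i.e. the deck-weighted average mana value per card.
deck_mana_value_avg = ColumnSpec(
    name=ColName.DECK_MANA_VALUE_AVG,
    col_type=ColType.AGG,
    expr=pl.col(ColName.DECK_MANA_VALUE) / pl.col(ColName.DECK),
    dependencies=[ColName.DECK_MANA_VALUE, ColName.DECK],
)
```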
spells_mtg-0.2.2/PKG-INFO → spells_mtg-0.3.0/README.md

```diff
@@ -1,14 +1,3 @@
-Metadata-Version: 2.1
-Name: spells-mtg
-Version: 0.2.2
-Summary: analaysis of 17Lands.com public datasets
-Author-Email: Joel Barnes <oelarnes@gmail.com>
-License: MIT
-Requires-Python: >=3.11
-Requires-Dist: polars>=1.14.0
-Requires-Dist: wget>=3.2
-Description-Content-Type: text/markdown
-
 # 🪄 spells ✨
 
 **spells** is a python package that tutors up blazing-fast and extensible analysis of the public data sets provided by [17Lands](https://www.17lands.com/) and exiles the annoying and slow parts of your workflow. Spells exposes one first-class function, `summon`, which summons a Polars DataFrame to the battlefield.
@@ -222,6 +211,10 @@ Spells is built on top of Polars, a modern, well-supported DataFrame engine writ
 
 Spells caches the results of expensive aggregations in the local file system as parquet files, which by default are found under the `data/local` path from the execution directory, which can be configured using the environment variable `SPELLS_PROJECT_DIR`. Query plans which request the same set of first-stage aggregations (sums over base rows) will attempt to locate the aggregate data in the cache before calculating. This guarantees that a repeated call to `summon` returns instantaneously.
 
+### Memory Usage
+
+One of my goals in creating Spells was to eliminate issues with memory pressure by exclusively using the map-reduce paradigm and a technology that supports partitioned/streaming aggregation of larget-than-memory datasets. By default, Polars loads the entire dataset in memory, but the API exposes a parameter `streaming` which I have exposed as `use_streaming`. Unfortunately, that feature does not seem to work for my queries and the memory performance can be quite poor, including poor garbage collection. The one feature that may assist in memory management is the local caching, since you can restart the kernel without losing all of your progress. In particular, be careful about opening multiple Jupyter tabs unless you have at least 32 GB. In general I have not run into issues on my 16 GB MacBook Air except with running multiple kernels at once. Supporting larger-than memory computations is on my roadmap, so check back periodically to see if I've made any progress.
+
 When refreshing a given set's data files from 17Lands using the provided cli, the cache for that set is automatically cleared. The `spells` CLI gives additional tools for managing the local and external caches.
 
 # Documentation
@@ -274,13 +267,13 @@ summon(
 
 #### parameters
 
-- columns: a list of string or `ColName` values to select as non-grouped columns. Valid `ColTypes` are `PICK_SUM`, `NAME_SUM`, `GAME_SUM`, `CARD_ATTR`, and `AGG`. Min/Max/Unique
+- columns: a list of string or `ColName` values to select as non-grouped columns. Valid `ColTypes` are `PICK_SUM`, `NAME_SUM`, `GAME_SUM`, `CARD_ATTR`, `CARD_SUM` and `AGG`. Min/Max/Unique
 aggregations of non-numeric (or numeric) data types are not supported. If `None`, use a set of columns modeled on the commonly used values on 17Lands.com/card_data.
 
 - group_by: a list of string or `ColName` values to display as grouped columns. Valid `ColTypes` are `GROUP_BY` and `CARD_ATTR`. By default, group by "name" (card name).
 
 - filter_spec: a dictionary specifying a filter, using a small number of paradigms. Columns used must be in each base view ("draft" and "game") that the `columns` and `group_by` columns depend on, so
-`AGG` and `CARD_ATTR` columns are not valid. `NAME_SUM` columns are also not supported. Derived columns are supported. No filter is applied by default. Yes, I should rewrite it to use the mongo query language. The specification is best understood with examples:
+`AGG`, `CARD_SUM` and `CARD_ATTR` columns are not valid. `NAME_SUM` columns are also not supported. Derived columns are supported. No filter is applied by default. Yes, I should rewrite it to use the mongo query language. The specification is best understood with examples:
 
 - `{'player_cohort': 'Top'}` "player_cohort" value equals "Top".
 - `{'lhs': 'player_cohort', 'op': 'in', 'rhs': ['Top', 'Middle']}` "player_cohort" value is either "Top" or "Middle". Supported values for `op` are `<`, `<=`, `>`, `>=`, `!=`, `=`, `in` and `nin`.
@@ -320,7 +313,7 @@ Used to define extensions in `summon`
 
 - `name`: any string, including existing columns, although this is very likely to break dependent columns, so don't do it. For `NAME_SUM` columns, the name is the prefix without the underscore, e.g. "drawn".
 
-- `col_type`: one of the `ColType` enum values, `FILTER_ONLY`, `GROUP_BY`, `PICK_SUM`, `NAME_SUM`, `GAME_SUM`, `CARD_ATTR`, and `AGG`. See documentation for `summon` for usage. All columns except `CARD_ATTR` and `AGG` must be derivable at the individual row level on one or both base views. `CARD_ATTR` must be derivable at the individual row level from the card file. `AGG` can depend on any column present after summing over groups, and can include polars Expression aggregations. Arbitrarily long chains of aggregate dependencies are supported.
+- `col_type`: one of the `ColType` enum values, `FILTER_ONLY`, `GROUP_BY`, `PICK_SUM`, `NAME_SUM`, `GAME_SUM`, `CARD_ATTR`, `CARD_SUM`, and `AGG`. See documentation for `summon` for usage. All columns except `CARD_ATTR`, `CARD_SUM` and `AGG` must be derivable at the individual row level on one or both base views. `CARD_ATTR` must be derivable at the individual row level from the card file. `AGG` can depend on any column present after summing over groups, and can include polars Expression aggregations. `CARD_SUM` columns are expressed similarly to `AGG`, but they are calculated before grouping by card name and are summed before the `AGG` selection stage (for example, to calculate average mana value. See example notebook "Card Attributes"). Arbitrarily long chains of aggregate dependencies are supported.
 
 - `expr`: A polars expression giving the derivation of the column value at the first level where it is defined. For `NAME_SUM` columns the `exprMap` attribute must be used instead. `AGG` columns that depend on `NAME_SUM` columns reference the prefix (`cdef.name`) only, since the unpivot has occured prior to selection.
```
spells/columns.py

```diff
@@ -24,7 +24,7 @@ class ColumnDefinition:
     name: str
     col_type: ColType
     expr: pl.Expr | tuple[pl.Expr, ...]
-    views:
+    views: set[View]
     dependencies: tuple[str, ...]
     signature: str
 
@@ -504,6 +504,12 @@ _column_specs = [
         name=ColName.MANA_VALUE,
         col_type=ColType.CARD_ATTR,
     ),
+    ColumnSpec(
+        name=ColName.DECK_MANA_VALUE,
+        col_type=ColType.CARD_SUM,
+        expr=pl.col(ColName.MANA_VALUE) * pl.col(ColName.DECK),
+        dependencies=[ColName.MANA_VALUE, ColName.DECK],
+    ),
     ColumnSpec(
         name=ColName.MANA_COST,
         col_type=ColType.CARD_ATTR,
@@ -721,6 +727,12 @@ _column_specs = [
         expr=pl.col(ColName.GIH_WR_EXCESS) / pl.col(ColName.GIH_WR_STDEV),
         dependencies=[ColName.GIH_WR_EXCESS, ColName.GIH_WR_STDEV],
     ),
+    ColumnSpec(
+        name=ColName.DECK_MANA_VALUE_AVG,
+        col_type=ColType.AGG,
+        expr=pl.col(ColName.DECK_MANA_VALUE) / pl.col(ColName.DECK),
+        dependencies=[ColName.DECK_MANA_VALUE, ColName.DECK],
+    ),
 ]
 
 col_spec_map = {col.name: col for col in _column_specs}
```
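The arithmetic the two new specs encode is a ratio of sums, not a sum of ratios: `DECK_MANA_VALUE` is summed per card name first, and only then divided by the summed `DECK` count at the `AGG` stage. A toy Polars frame makes the ordering visible; the column names follow the diff, the data is made up.

```python
import polars as pl

# One row per (card, game): number of copies in the deck and the card's mana value (toy data).
rows = pl.DataFrame(
    {
        "name": ["Bolt", "Bolt", "Giant"],
        "mana_value": [1, 1, 6],
        "deck": [2, 1, 1],
    }
)

avg = (
    rows.with_columns((pl.col("mana_value") * pl.col("deck")).alias("deck_mana_value"))  # CARD_SUM expr
    .group_by("name")
    .agg(pl.col("deck_mana_value").sum(), pl.col("deck").sum())                          # sum per card name
    .with_columns((pl.col("deck_mana_value") / pl.col("deck")).alias("deck_mana_value_avg"))  # AGG expr
)
print(avg)  # Bolt: (1*2 + 1*1) / 3 = 1.0, Giant: 6 / 1 = 6.0
```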
spells/draft_data.py

```diff
@@ -53,22 +53,26 @@ def _get_names(set_code: str) -> tuple[str, ...]:
 
 
 def _hydrate_col_defs(set_code: str, col_spec_map: dict[str, ColumnSpec]):
-    def get_views(spec: ColumnSpec) ->
-        if spec.name == ColName.NAME or spec.col_type
-
+    def get_views(spec: ColumnSpec) -> set[View]:
+        if spec.name == ColName.NAME or spec.col_type in (
+            ColType.AGG,
+            ColType.CARD_SUM,
+        ):
+            return set()
         if spec.col_type == ColType.CARD_ATTR:
-            return
+            return {View.CARD}
         if spec.views is not None:
-            return spec.views
+            return set(spec.views)
         assert (
             spec.dependencies is not None
         ), f"Col {spec.name} should have dependencies"
 
-        views =
-
-
+        views = functools.reduce(
+            lambda prev, curr: prev.intersection(curr),
+            [get_views(col_spec_map[dep]) for dep in spec.dependencies],
+        )
 
-        return
+        return views
 
     names = _get_names(set_code)
     assert len(names) > 0, "there should be names"
@@ -118,7 +122,7 @@ def _hydrate_col_defs(set_code: str, col_spec_map: dict[str, ColumnSpec]):
         cdef = ColumnDefinition(
             name=spec.name,
             col_type=spec.col_type,
-            views=
+            views=views,
             expr=expr,
             dependencies=dependencies,
             signature=signature,
@@ -132,13 +136,18 @@ def _view_select(
     view_cols: frozenset[str],
     col_def_map: dict[str, ColumnDefinition],
    is_agg_view: bool,
+    is_card_sum: bool = False,
 ) -> DF:
     base_cols = frozenset()
     cdefs = [col_def_map[c] for c in view_cols]
     select = []
     for cdef in cdefs:
         if is_agg_view:
-            if
+            if (
+                cdef.col_type == ColType.AGG
+                or cdef.col_type == ColType.CARD_SUM
+                and is_card_sum
+            ):
                 base_cols = base_cols.union(cdef.dependencies)
                 select.append(cdef.expr)
             else:
@@ -155,7 +164,7 @@ def _view_select(
             select.append(cdef.expr)
 
     if base_cols != view_cols:
-        df = _view_select(df, base_cols, col_def_map, is_agg_view)
+        df = _view_select(df, base_cols, col_def_map, is_agg_view, is_card_sum)
 
     return df.select(select)
 
@@ -326,8 +335,14 @@ def summon(
         fp = data_file_path(set_code, View.CARD)
         card_df = pl.read_parquet(fp)
         select_df = _view_select(card_df, card_cols, m.col_def_map, is_agg_view=False)
-
         agg_df = agg_df.join(select_df, on="name", how="outer", coalesce=True)
+
+    if m.card_sum:
+        card_sum_df = _view_select(
+            agg_df, m.card_sum, m.col_def_map, is_agg_view=True, is_card_sum=True
+        )
+        agg_df = pl.concat([agg_df, card_sum_df], how="horizontal")
+
 
     if ColName.NAME not in m.group_by:
         agg_df = agg_df.group_by(m.group_by).sum()
```
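In the `summon` hunk above, the `CARD_SUM` columns are selected from `agg_df` itself and pasted back with a horizontal concat, which aligns frames purely by row position (same height, same order). A standalone sketch of that pattern with toy data, not the real schema:

```python
import polars as pl

agg_df = pl.DataFrame({"name": ["Bolt", "Giant"], "mana_value": [1, 6], "deck": [3, 1]})

# Derived from agg_df, so it necessarily has the same height and row order.
card_sum_df = agg_df.select((pl.col("mana_value") * pl.col("deck")).alias("deck_mana_value"))

combined = pl.concat([agg_df, card_sum_df], how="horizontal")
print(combined)
```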
spells/enums.py

```diff
@@ -19,6 +19,7 @@ class ColType(StrEnum):
     NAME_SUM = "name_sum"
     AGG = "agg"
     CARD_ATTR = "card_attr"
+    CARD_SUM = "card_sum"
 
 
 class ColName(StrEnum):
@@ -115,6 +116,7 @@ class ColName(StrEnum):
     CARD_TYPE = "card_type"
     SUBTYPE = "subtype"
     MANA_VALUE = "mana_value"
+    DECK_MANA_VALUE = "deck_mana_value"
     MANA_COST = "mana_cost"
     POWER = "power"
     TOUGHNESS = "toughness"
@@ -154,3 +156,4 @@ class ColName(StrEnum):
     GIH_WR_VAR = "gih_wr_var"
     GIH_WR_STDEV = "gh_wr_stdev"
     GIH_WR_Z = "gih_wr_z"
+    DECK_MANA_VALUE_AVG = "deck_mana_value_avg"
```
spells/manifest.py

```diff
@@ -14,6 +14,7 @@ class Manifest:
     view_cols: dict[View, frozenset[str]]
     group_by: tuple[str, ...]
     filter: spells.filter.Filter | None
+    card_sum: frozenset[str]
 
     def __post_init__(self):
         # No name filter check
@@ -94,19 +95,19 @@
 def _resolve_view_cols(
     col_set: frozenset[str],
     col_def_map: dict[str, ColumnDefinition],
-) -> dict[View, frozenset[str]]:
+) -> tuple[dict[View, frozenset[str]], frozenset[str]]:
     """
     For each view ('game', 'draft', and 'card'), return the columns
     that must be present at the aggregation step. 'name' need not be
     included, and 'pick' will be added if needed.
-
-    Dependencies within base views will be resolved by `col_df`.
     """
+    MAX_DEPTH = 1000
     unresolved_cols = col_set
     view_resolution = {}
+    card_sum = frozenset()
 
     iter_num = 0
-    while unresolved_cols and iter_num <
+    while unresolved_cols and iter_num < MAX_DEPTH:
         iter_num += 1
         next_cols = frozenset()
         for col in unresolved_cols:
@@ -115,6 +116,8 @@ def _resolve_view_cols(
                 view_resolution[View.DRAFT] = view_resolution.get(
                     View.DRAFT, frozenset()
                 ).union({ColName.PICK})
+            if cdef.col_type == ColType.CARD_SUM:
+                card_sum = card_sum.union({col})
             if cdef.views:
                 for view in cdef.views:
                     view_resolution[view] = view_resolution.get(
@@ -129,10 +132,10 @@ def _resolve_view_cols(
                 next_cols = next_cols.union({dep})
         unresolved_cols = next_cols
 
-    if iter_num >=
+    if iter_num >= MAX_DEPTH:
         raise ValueError("broken dependency chain in column spec, loop probable")
 
-    return view_resolution
+    return view_resolution, card_sum
 
 
 def create(
@@ -149,14 +152,6 @@ def create(
     else:
         cols = tuple(columns)
 
-    base_view_group_by = frozenset()
-    for col in gbs:
-        cdef = col_def_map[col]
-        if cdef.col_type == ColType.GROUP_BY:
-            base_view_group_by = base_view_group_by.union({col})
-        elif cdef.col_type == ColType.CARD_ATTR:
-            base_view_group_by = base_view_group_by.union({ColName.NAME})
-
     m_filter = spells.filter.from_spec(filter_spec)
 
     col_set = frozenset(cols)
@@ -164,14 +159,28 @@ def create(
     if m_filter is not None:
         col_set = col_set.union(m_filter.lhs)
 
-    view_cols = _resolve_view_cols(col_set, col_def_map)
+    view_cols, card_sum = _resolve_view_cols(col_set, col_def_map)
+    base_view_group_by = frozenset()
+
+    if card_sum:
+        base_view_group_by = base_view_group_by.union({ColName.NAME})
+
+    for col in gbs:
+        cdef = col_def_map[col]
+        if cdef.col_type == ColType.GROUP_BY:
+            base_view_group_by = base_view_group_by.union({col})
+        elif cdef.col_type == ColType.CARD_ATTR:
+            base_view_group_by = base_view_group_by.union({ColName.NAME})
 
     needed_views = frozenset()
     for view, cols_for_view in view_cols.items():
         for col in cols_for_view:
-            if col_def_map[col].views ==
+            if col_def_map[col].views == {view}:  # only found in this view
                 needed_views = needed_views.union({view})
 
+    if not needed_views:
+        needed_views = {View.DRAFT}
+
     view_cols = {v: view_cols[v] for v in needed_views}
 
     return Manifest(
@@ -181,4 +190,5 @@ def create(
         view_cols=view_cols,
         group_by=gbs,
         filter=m_filter,
+        card_sum=card_sum,
     )
```