PyPI - policyengine - Versions diffs - 3.2.4__tar.gz → 3.3.0__tar.gz - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (142)

policyengine-3.3.0/.github/workflows/pr_docs_changes.yaml ADDED

@@ -0,0 +1,27 @@
+# Workflow that runs when a pull request changes documentation or workflow files.
+name: Docs changes
+on:
+  pull_request:
+    branches:
+      - main
+    paths:
+      - docs/**
+      - .github/**
+  workflow_dispatch:
+jobs:
+  Test:
+    runs-on: ubuntu-latest
+    name: Test documentation builds
+    steps:
+      - name: Checkout repo
+        uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: 18.x
+      - name: Install MyST
+        run: npm install -g mystmd
+      - name: Test documentation builds
+        run: cd docs && myst build --html

{policyengine-3.2.4 → policyengine-3.3.0}/CHANGELOG.md RENAMED

@@ -1,3 +1,10 @@
+## [3.3.0] - 2026-03-20
+### Added
+- Added documentation for economic impact analysis, advanced outputs (DecileImpact, Poverty, Inequality, IntraDecileImpact), regions and scoping strategies, simulation lifecycle (ensure vs run), Dynamic class, data loading, and simulation modifiers. Added US budgetary impact example script. Fixed PR docs CI to use MyST matching production.
 ## [3.2.4] - 2026-03-17
 ### Changed

{policyengine-3.2.4 → policyengine-3.3.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: policyengine
-Version: 3.2.4
+Version: 3.3.0
 Summary: A package to conduct policy analysis using PolicyEngine tax-benefit models.
 Author-email: PolicyEngine <hello@policyengine.org>
 License:                     GNU AFFERO GENERAL PUBLIC LICENSE

policyengine-3.3.0/docs/advanced-outputs.md ADDED

@@ -0,0 +1,276 @@
+# Advanced outputs
+Beyond `Aggregate` and `ChangeAggregate` (covered in [Core concepts](core-concepts.md)), the package provides specialised output types for distributional analysis, poverty measurement, and inequality metrics.
+All output types follow the same pattern: create an instance, call `.run()`, read the result fields. Convenience functions are provided for common use cases.
+## OutputCollection
+Many convenience functions return an `OutputCollection[T]`, a container holding both the individual output objects and a pandas DataFrame:
+```python
+from policyengine.core import OutputCollection
+from policyengine.outputs.poverty import calculate_us_poverty_rates
+# Returned by calculate_decile_impacts(), calculate_us_poverty_rates(), etc.
+collection = calculate_us_poverty_rates(simulation)
+# Access individual objects
+for poverty in collection.outputs:
+    print(f"{poverty.poverty_type}: {poverty.rate:.4f}")
+# Access as DataFrame
+print(collection.dataframe)
+```
+## DecileImpact
+Calculates the impact of a policy reform on a single income decile: baseline and reform mean income, absolute and relative change, and counts of people better off, worse off, and unchanged.
+### Using the convenience function
+```python
+from policyengine.outputs.decile_impact import calculate_decile_impacts
+decile_impacts = calculate_decile_impacts(
+    dataset=dataset,
+    tax_benefit_model_version=us_latest,
+    baseline_policy=None,           # Current law
+    reform_policy=reform,
+    income_variable="household_net_income",  # Default for US
+)
+for d in decile_impacts.outputs:
+    print(f"Decile {d.decile}: "
+          f"baseline={d.baseline_mean:,.0f}, "
+          f"reform={d.reform_mean:,.0f}, "
+          f"change={d.absolute_change:+,.0f} "
+          f"({d.relative_change:+.2f}%)")
+```
+### Using directly
+```python
+from policyengine.outputs.decile_impact import DecileImpact
+impact = DecileImpact(
+    baseline_simulation=baseline_sim,
+    reform_simulation=reform_sim,
+    income_variable="household_net_income",
+    decile=5,  # 5th decile
+)
+impact.run()
+print(f"Count better off: {impact.count_better_off:,.0f}")
+print(f"Count worse off: {impact.count_worse_off:,.0f}")
+```
+### Parameters
+| Parameter | Default | Description |
+|---|---|---|
+| `income_variable` | `equiv_hbai_household_net_income` | Income variable to group by and measure changes |
+| `decile_variable` | `None` | Use a pre-computed grouping variable instead of `qcut` |
+| `entity` | Auto-detected | Entity level for the income variable |
+| `quantiles` | `10` | Number of quantile groups (10 = deciles, 5 = quintiles) |
+For US simulations, use `income_variable="household_net_income"`. The UK default (`equiv_hbai_household_net_income`) is the equivalised HBAI measure.
+## IntraDecileImpact
+Classifies people within each decile into five income change categories:
+| Category | Threshold |
+|---|---|
+| Lose more than 5% | change <= -5% |
+| Lose less than 5% | -5% < change <= -0.1% |
+| No change | -0.1% < change <= 0.1% |
+| Gain less than 5% | 0.1% < change <= 5% |
+| Gain more than 5% | change > 5% |
+Proportions are people-weighted (using `household_count_people * household_weight`).
+### Using the convenience function
+```python
+from policyengine.outputs.intra_decile_impact import compute_intra_decile_impacts
+intra = compute_intra_decile_impacts(
+    baseline_simulation=baseline_sim,
+    reform_simulation=reform_sim,
+    income_variable="household_net_income",
+)
+for row in intra.outputs:
+    if row.decile == 0:
+        label = "Overall"
+    else:
+        label = f"Decile {row.decile}"
+    print(f"{label}: "
+          f"lose>5%={row.lose_more_than_5pct:.2%}, "
+          f"lose<5%={row.lose_less_than_5pct:.2%}, "
+          f"no change={row.no_change:.2%}, "
+          f"gain<5%={row.gain_less_than_5pct:.2%}, "
+          f"gain>5%={row.gain_more_than_5pct:.2%}")
+```
+The function returns deciles 1-10 plus an overall average at `decile=0`.
+## Poverty
+Calculates poverty headcount and rates for a single simulation, with optional demographic filtering.
+### Poverty types
+**UK** (4 measures):
+- Absolute before housing costs (BHC)
+- Absolute after housing costs (AHC)
+- Relative before housing costs (BHC)
+- Relative after housing costs (AHC)
+**US** (2 measures):
+- SPM poverty
+- Deep SPM poverty (below 50% of SPM threshold)
+### Calculating all poverty rates
+```python
+from policyengine.outputs.poverty import (
+    calculate_uk_poverty_rates,
+    calculate_us_poverty_rates,
+)
+# US
+us_poverty = calculate_us_poverty_rates(simulation)
+for p in us_poverty.outputs:
+    print(f"{p.poverty_type}: headcount={p.headcount:,.0f}, rate={p.rate:.4f}")
+# UK
+uk_poverty = calculate_uk_poverty_rates(simulation)
+for p in uk_poverty.outputs:
+    print(f"{p.poverty_type}: headcount={p.headcount:,.0f}, rate={p.rate:.4f}")
+```
+### Poverty by demographic group
+```python
+from policyengine.outputs.poverty import (
+    calculate_us_poverty_by_age,
+    calculate_us_poverty_by_gender,
+    calculate_us_poverty_by_race,
+    calculate_uk_poverty_by_age,
+    calculate_uk_poverty_by_gender,
+)
+# By age group (child <18, adult 18-64, senior 65+)
+by_age = calculate_us_poverty_by_age(simulation)
+for p in by_age.outputs:
+    print(f"{p.filter_group} {p.poverty_type}: {p.rate:.4f}")
+# By gender
+by_gender = calculate_us_poverty_by_gender(simulation)
+# By race (US only: WHITE, BLACK, HISPANIC, OTHER)
+by_race = calculate_us_poverty_by_race(simulation)
+```
+### Custom filters
+```python
+from policyengine.outputs.poverty import Poverty
+# Child poverty only
+child_poverty = Poverty(
+    simulation=simulation,
+    poverty_variable="spm_unit_is_in_spm_poverty",
+    entity="person",
+    filter_variable="age",
+    filter_variable_leq=17,
+)
+child_poverty.run()
+print(f"Child SPM poverty rate: {child_poverty.rate:.4f}")
+```
+### Result fields
+| Field | Description |
+|---|---|
+| `headcount` | Weighted count of people in poverty |
+| `total_population` | Weighted total population (after filters) |
+| `rate` | `headcount / total_population` |
+| `filter_group` | Group label set by demographic convenience functions |
+## Inequality
+Calculates weighted inequality metrics for a single simulation: Gini coefficient and income share measures.
+### Using convenience functions
+```python
+from policyengine.outputs.inequality import (
+    calculate_uk_inequality,
+    calculate_us_inequality,
+)
+# US (uses household_net_income by default)
+ineq = calculate_us_inequality(simulation)
+print(f"Gini: {ineq.gini:.4f}")
+print(f"Top 10% share: {ineq.top_10_share:.4f}")
+print(f"Top 1% share: {ineq.top_1_share:.4f}")
+print(f"Bottom 50% share: {ineq.bottom_50_share:.4f}")
+# UK (uses equiv_hbai_household_net_income by default)
+ineq = calculate_uk_inequality(simulation)
+```
+### With demographic filters
+```python
+# Inequality among working-age adults only
+ineq = calculate_us_inequality(
+    simulation,
+    filter_variable="age",
+    filter_variable_geq=18,
+    filter_variable_leq=64,
+)
+```
+### Using directly
+```python
+from policyengine.outputs.inequality import Inequality
+ineq = Inequality(
+    simulation=simulation,
+    income_variable="household_net_income",
+    entity="household",
+)
+ineq.run()
+```
+### Result fields
+| Field | Description |
+|---|---|
+| `gini` | Weighted Gini coefficient (0 = perfect equality, 1 = perfect inequality) |
+| `top_10_share` | Share of total income held by top 10% |
+| `top_1_share` | Share of total income held by top 1% |
+| `bottom_50_share` | Share of total income held by bottom 50% |
+## Comparing baseline and reform
+Poverty and inequality are single-simulation outputs. To compare baseline and reform, compute both and take the difference:
+```python
+baseline_poverty = calculate_us_poverty_rates(baseline_sim)
+reform_poverty = calculate_us_poverty_rates(reform_sim)
+for bp, rp in zip(baseline_poverty.outputs, reform_poverty.outputs):
+    change = rp.rate - bp.rate
+    print(f"{bp.poverty_type}: {bp.rate:.4f} -> {rp.rate:.4f} ({change:+.4f})")
+baseline_ineq = calculate_us_inequality(baseline_sim)
+reform_ineq = calculate_us_inequality(reform_sim)
+print(f"Gini change: {reform_ineq.gini - baseline_ineq.gini:+.4f}")
+```
+The `economic_impact_analysis()` function does this automatically and returns both baseline and reform poverty/inequality in the `PolicyReformAnalysis` result. See [Economic impact analysis](economic-impact-analysis.md).
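
The `IntraDecileImpact` thresholds and people-weighting documented in the hunk above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's implementation: `classify_change` and `category_shares` are hypothetical helpers, relative changes are assumed to be fractions (0.05 = 5%), and the weights stand in for `household_count_people * household_weight`.

```python
from bisect import bisect_left

# Category names mirror the result fields shown in the docs example.
CATEGORIES = [
    "lose_more_than_5pct",
    "lose_less_than_5pct",
    "no_change",
    "gain_less_than_5pct",
    "gain_more_than_5pct",
]
# Inclusive upper bounds of the first four categories, as fractions.
UPPER_BOUNDS = [-0.05, -0.001, 0.001, 0.05]


def classify_change(relative_change):
    """Map a relative income change to one of the five categories."""
    # bisect_left finds the first bound >= the change, so a value that sits
    # exactly on a bound lands in the lower category (the "<=" side).
    return CATEGORIES[bisect_left(UPPER_BOUNDS, relative_change)]


def category_shares(relative_changes, household_weights, people_per_household):
    """Return the people-weighted share of each category."""
    totals = {name: 0.0 for name in CATEGORIES}
    rows = zip(relative_changes, household_weights, people_per_household)
    for change, weight, people in rows:
        totals[classify_change(change)] += weight * people
    total_people = sum(totals.values())
    return {name: value / total_people for name, value in totals.items()}


# Toy data: five households with made-up weights and sizes.
shares = category_shares(
    relative_changes=[-0.10, -0.02, 0.0, 0.03, 0.20],
    household_weights=[100, 200, 100, 100, 100],
    people_per_household=[2, 1, 3, 1, 2],
)
```

The sketch assumes at least one household with positive weight; an empty input would divide by zero.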

{policyengine-3.2.4 → policyengine-3.3.0}/docs/core-concepts.md RENAMED

@@ -117,6 +117,40 @@ dataset = PolicyEngineUKDataset(
 )
 ```
+## Data loading
+Before running simulations, you need representative microdata. The package provides three functions for managing datasets:
+- **`ensure_datasets()`**: Load from disk if available, otherwise download and compute (recommended)
+- **`create_datasets()`**: Always download from HuggingFace and compute from scratch
+- **`load_datasets()`**: Load previously saved HDF5 files from disk
+```python
+from policyengine.tax_benefit_models.us import ensure_datasets
+# First run: downloads from HuggingFace, computes variables, saves to ./data/
+# Subsequent runs: loads from disk instantly
+datasets = ensure_datasets(
+    datasets=["hf://policyengine/policyengine-us-data/enhanced_cps_2024.h5"],
+    years=[2026],
+    data_folder="./data",
+)
+dataset = datasets["enhanced_cps_2024_2026"]
+```
+```python
+from policyengine.tax_benefit_models.uk import ensure_datasets
+datasets = ensure_datasets(
+    datasets=["hf://policyengine/policyengine-uk-data/enhanced_frs_2023_24.h5"],
+    years=[2026],
+    data_folder="./data",
+)
+dataset = datasets["enhanced_frs_2023_24_2026"]
+```
+All datasets are stored as HDF5 files on disk. No database server is required.
 ## Simulations
 Simulations apply tax-benefit models to datasets, calculating all variables for the specified year.
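
In both data-loading examples in the hunk above, the dictionary key appears to be the dataset file stem followed by the simulation year. The helper below reproduces that pattern; `dataset_key` is a hypothetical name and the rule is inferred from the two documented examples only, so treat it as an assumption rather than the package's behaviour.

```python
from pathlib import PurePosixPath


def dataset_key(uri, year):
    """Build a key of the form '<file stem>_<year>' (inferred naming rule)."""
    # PurePosixPath only parses the string; nothing is downloaded or opened.
    stem = PurePosixPath(uri).stem
    return f"{stem}_{year}"


us_key = dataset_key(
    "hf://policyengine/policyengine-us-data/enhanced_cps_2024.h5", 2026
)
uk_key = dataset_key(
    "hf://policyengine/policyengine-uk-data/enhanced_frs_2023_24.h5", 2026
)
```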
@@ -141,6 +175,25 @@ output_household = simulation.output_dataset.data.household
 print(output_household[["household_id", "household_net_income", "household_tax"]])
 ```
+### Simulation lifecycle: `run()` vs `ensure()`
+The `Simulation` class provides two methods for computing results:
+| Method | Behaviour |
+|---|---|
+| `simulation.run()` | Always recomputes from scratch. No caching. |
+| `simulation.ensure()` | Checks in-memory LRU cache, then tries loading from disk, then falls back to `run()` + `save()`. |
+```python
+# One-off computation (no caching)
+simulation.run()
+# Cache-or-compute (preferred for production use)
+simulation.ensure()
+```
+`ensure()` uses a module-level LRU cache (max 100 simulations) and saves output datasets as HDF5 files alongside the input dataset. On repeated calls, it returns cached results instantly. For baseline-vs-reform comparisons, `economic_impact_analysis()` calls `ensure()` internally, so you rarely need to call it yourself.
 ### Accessing calculated variables
 After running a simulation, you can access the calculated variables from the output dataset:
@@ -211,6 +264,56 @@ reform = Simulation(
 reform.run()
 ```
+### Combining policies
+Policies can be combined using the `+` operator:
+```python
+combined = policy_a + policy_b
+# Concatenates parameter_values and chains simulation_modifiers
+```
+### Simulation modifiers
+For reforms that cannot be expressed as parameter value changes, `Policy` accepts a `simulation_modifier` callable that directly manipulates the underlying `policyengine_core` simulation:
+```python
+def my_modifier(sim):
+    """Custom reform logic applied to the core simulation object."""
+    p = sim.tax_benefit_system.parameters
+    # Modify parameters programmatically
+    return sim
+policy = Policy(
+    name="Custom reform",
+    simulation_modifier=my_modifier,
+)
+```
+Note: the UK model supports `simulation_modifier`. The US model currently only uses the `parameter_values` path.
+## Dynamic behavioural responses
+The `Dynamic` class is structurally identical to `Policy` and represents behavioural responses to policy changes (e.g., labour supply elasticities). It is applied after the policy in the simulation pipeline.
+```python
+from policyengine.core.dynamic import Dynamic
+dynamic = Dynamic(
+    name="Labour supply response",
+    parameter_values=[...],  # Same format as Policy
+)
+simulation = Simulation(
+    dataset=dataset,
+    tax_benefit_model_version=uk_latest,
+    policy=policy,
+    dynamic=dynamic,
+)
+```
+Dynamic responses can also be combined using the `+` operator and support `simulation_modifier` callables.
 ## Outputs
 Output classes provide structured analysis of simulation results.
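
The hunk above says that `+` concatenates `parameter_values` and chains `simulation_modifier` callables. A minimal sketch of that behaviour, using a stand-in dataclass rather than the package's Pydantic `Policy`: `SketchPolicy` is hypothetical, and the combined name and the left-to-right modifier order are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class SketchPolicy:
    """Stand-in for a policy object; not the package's Policy class."""

    name: str
    parameter_values: list = field(default_factory=list)
    simulation_modifier: Optional[Callable] = None

    def __add__(self, other):
        modifiers = [
            m
            for m in (self.simulation_modifier, other.simulation_modifier)
            if m is not None
        ]

        def chained(sim):
            # Apply each modifier in turn, left operand first (assumed order).
            for modifier in modifiers:
                sim = modifier(sim)
            return sim

        return SketchPolicy(
            name=f"{self.name} + {other.name}",  # assumed naming scheme
            parameter_values=self.parameter_values + other.parameter_values,
            simulation_modifier=chained if modifiers else None,
        )


# A list stands in for the simulation object so the call order is visible.
policy_a = SketchPolicy("A", ["value_a"], lambda sim: sim + ["a applied"])
policy_b = SketchPolicy("B", ["value_b"], lambda sim: sim + ["b applied"])
combined = policy_a + policy_b
```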
@@ -480,7 +583,7 @@ COLORS = {
 ### 1. Analyse employment income variation
-See `examples/employment_income_variation_uk.py` for a complete example of:
+See [UK employment income variation](examples.md#uk-employment-income-variation) for a complete example of:
 - Creating custom datasets with varied parameters
 - Running single simulations
 - Extracting results with filters
@@ -488,7 +591,7 @@ See `examples/employment_income_variation_uk.py` for a complete example of:
 ### 2. Policy reform analysis
-See `examples/policy_change_uk.py` for:
+See [UK policy reform analysis](examples.md#uk-policy-reform-analysis) for:
 - Applying parametric reforms
 - Comparing baseline and reform
 - Analysing winners/losers by decile
@@ -496,7 +599,7 @@ See `examples/policy_change_uk.py` for:
 ### 3. Distributional analysis
-See `examples/income_distribution_us.py` for:
+See [US income distribution](examples.md#us-income-distribution) for:
 - Loading representative microdata
 - Calculating statistics by income decile
 - Mapping variables across entity levels
@@ -549,8 +652,11 @@ See `examples/income_distribution_us.py` for:
 ## Next steps
-- See `examples/` for complete working examples
-- Review country-specific documentation:
+- [Economic impact analysis](economic-impact-analysis.md): Full baseline-vs-reform comparison workflow
+- [Advanced outputs](advanced-outputs.md): DecileImpact, Poverty, Inequality, IntraDecileImpact
+- [Regions and scoping](regions-and-scoping.md): Sub-national analysis (states, constituencies, districts)
+- Country-specific documentation:
   - [UK tax-benefit model](country-models-uk.md)
   - [US tax-benefit model](country-models-us.md)
-- Explore the API reference for detailed class documentation
+- [Visualisation](visualisation.md): Publication-ready charts
+- [Examples](examples.md): Complete working scripts

{policyengine-3.2.4 → policyengine-3.3.0}/docs/country-models-uk.md RENAMED

@@ -363,11 +363,9 @@ When creating custom datasets, validate:
 ## Examples
-See working examples in the `examples/` directory:
-- `employment_income_variation_uk.py`: Vary employment income, analyse benefit phase-outs
-- `policy_change_uk.py`: Apply reforms, analyse winners/losers
-- `income_bands_uk.py`: Create income band scenarios
+- [UK employment income variation](examples.md#uk-employment-income-variation): Vary employment income, analyse benefit phase-outs
+- [UK policy reform analysis](examples.md#uk-policy-reform-analysis): Apply reforms, analyse winners/losers
+- [UK income bands](examples.md#uk-income-bands): Calculate net income and tax by income decile
 ## References

{policyengine-3.2.4 → policyengine-3.3.0}/docs/country-models-us.md RENAMED

@@ -431,11 +431,10 @@ When creating custom datasets, validate:
 ## Examples
-See working examples in the `examples/` directory:
-- `income_distribution_us.py`: Analyse benefit distribution by income decile
-- `employment_income_variation_us.py`: Vary employment income, analyse phase-outs
-- `speedtest_us_simulation.py`: Performance benchmarking
+- [US income distribution](examples.md#us-income-distribution): Analyse benefit distribution by income decile
+- [US employment income variation](examples.md#us-employment-income-variation): Vary employment income, analyse phase-outs
+- [US budgetary impact](examples.md#us-budgetary-impact): Full baseline-vs-reform comparison
+- [Simulation performance](examples.md#simulation-performance): Performance benchmarking
 ## References

policyengine-3.3.0/docs/dev.md ADDED

@@ -0,0 +1,101 @@
+# Development
+## Principles
+1. **STRONG** preference for simplicity. Let's make this package as simple as it possibly can be.
+2. Remember the goal of this package: to make it easy to create, run, save and analyse PolicyEngine simulations. When considering further features, always ask: can we instead *make it super easy* for people to do this outside the package?
+3. Be consistent about property names. `name` = a few human-readable words that always work as the noun in a sentence. `id` = unique identifier, ideally a UUID. `description` = longer human-readable text that describes the object. `created_at` and `updated_at` = timestamps for when the object was created and last updated.
+4. Constraints can be good. We should set constraints where they help us simplify the codebase and usage, but not where they unnecessarily block useful functionality.
+## Setup
+```bash
+git clone https://github.com/PolicyEngine/policyengine.py.git
+cd policyengine.py
+uv pip install -e ".[dev]"
+```
+This installs both UK and US country models plus dev dependencies (pytest, ruff, mypy, towncrier).
+## Common commands
+```bash
+make format           # ruff format
+make test             # pytest with coverage
+make docs             # build documentation site
+make clean            # remove caches, build artifacts, .h5 files
+```
+## Testing
+Tests require a `HUGGING_FACE_TOKEN` environment variable for downloading datasets:
+```bash
+export HUGGING_FACE_TOKEN=hf_...
+make test
+```
+To run a specific test:
+```bash
+pytest tests/test_models.py -v
+pytest tests/test_parametric_reforms.py -k "test_uk" -v
+```
+## Linting and formatting
+```bash
+ruff format .                    # format code
+ruff check .                     # lint
+mypy src/policyengine            # type check (informational)
+```
+## CI pipeline
+PRs trigger the following checks:
+| Check | Status | Command |
+|---|---|---|
+| Lint + format | Required | `ruff check .` + `ruff format --check .` |
+| Tests (Python 3.13) | Required | `make test` |
+| Tests (Python 3.14) | Required | `make test` |
+| Mypy | Informational | `mypy src/policyengine` |
+| Docs build | Required | MyST build |
+## Versioning and releases
+This project uses [towncrier](https://towncrier.readthedocs.io/) for changelog management. When making a PR, add a changelog fragment:
+```bash
+# Fragment types: breaking, added, changed, fixed, removed
+echo "Description of change" > changelog.d/my-change.added
+```
+On merge, the versioning workflow bumps the version, builds the changelog, and creates a GitHub Release.
+## Architecture
+### Package layout
+```
+src/policyengine/
+├── core/                  # Domain models (Simulation, Dataset, Policy, etc.)
+├── tax_benefit_models/
+│   ├── uk/                # UK model, datasets, analysis, outputs
+│   └── us/                # US model, datasets, analysis, outputs
+├── outputs/               # Output templates (Aggregate, Poverty, etc.)
+├── countries/             # Geographic region registries
+└── utils/                 # Helpers (reforms, entity mapping, plotting)
+```
+### Key design decisions
+**Pydantic everywhere**: All domain objects are Pydantic `BaseModel` subclasses. This gives us validation, serialisation, and clear field documentation.
+**HDF5 for storage**: Datasets and simulation outputs are stored as HDF5 files. No database server is required. The `MicroDataFrame` from the `microdf` package wraps pandas DataFrames with weight-aware `.sum()`, `.mean()`, `.count()`.
+**Country-specific model classes**: `PolicyEngineUSLatest` and `PolicyEngineUKLatest` each implement `run()`, `save()`, and `load()`. The US model passes reforms as a dict at `Microsimulation(reform=...)` construction time. The UK model supports both parametric reforms and `simulation_modifier` callables applied post-construction.
+**LRU cache + file caching**: `Simulation.ensure()` checks an in-process LRU cache (max 100 entries), then tries loading from disk, then falls back to `run()` + `save()`.
+**Output pattern**: All output types inherit from `Output`, implement `.run()`, and populate result fields. Convenience functions (e.g., `calculate_us_poverty_rates()`) create, run, and return collections of output objects.
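
The output pattern described above (create an instance, call `.run()`, read the result fields) can be illustrated with the poverty fields documented in `docs/advanced-outputs.md`, where `rate` is `headcount / total_population`. `SketchPovertyOutput` is a hypothetical dataclass operating on plain lists; the real output types are Pydantic models that take a simulation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SketchPovertyOutput:
    """Stand-in output type: inputs first, result fields filled by run()."""

    in_poverty: Sequence[bool]
    weights: Sequence[float]
    headcount: Optional[float] = None
    total_population: Optional[float] = None
    rate: Optional[float] = None

    def run(self):
        # Weighted count of people flagged as being in poverty.
        self.headcount = sum(
            weight
            for poor, weight in zip(self.in_poverty, self.weights)
            if poor
        )
        # Weighted total population.
        self.total_population = sum(self.weights)
        self.rate = self.headcount / self.total_population


output = SketchPovertyOutput(
    in_poverty=[True, False, False, True],
    weights=[100.0, 300.0, 400.0, 200.0],
)
output.run()
```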
