PyPI - dml-dev - Versions diffs - 0.1.0__tar.gz → 0.1.1__tar.gz - Mend

dml-dev 0.1.0.tar.gz → 0.1.1.tar.gz


Files changed (51)

dml_dev-0.1.1/MANIFEST.in ADDED

@@ -0,0 +1,4 @@
+recursive-include dml_code *.py
+recursive-include project_configuration *.py *.yaml
+include README.md
+include agent.md

dml_dev-0.1.1/PKG-INFO ADDED

@@ -0,0 +1,137 @@
+Metadata-Version: 2.4
+Name: dml-dev
+Version: 0.1.1
+Summary: DoubleML build, estimation, plotting, and utility pipelines.
+Author: DML Pipeline Contributors
+Keywords: administrative-data,causal-inference,doubleml,observational-data,program-evaluation
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Science/Research
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: Scientific/Engineering
+Requires-Python: >=3.12
+Description-Content-Type: text/markdown
+Requires-Dist: doubleml
+Requires-Dist: joblib
+Requires-Dist: oi-tools[figures]
+Requires-Dist: plotnine
+Requires-Dist: polars
+Requires-Dist: pyarrow
+Requires-Dist: psutil
+Requires-Dist: PyYAML
+Requires-Dist: scikit-learn
+Requires-Dist: threadpoolctl
+Provides-Extra: dev
+Requires-Dist: build; extra == "dev"
+Requires-Dist: twine; extra == "dev"
+# DML Pipeline
+This repo is a small framework for running DoubleML on administrative-style
+program data. It separates project-specific choices from reusable pipeline code:
+you edit `project_configuration/`, then run the pipeline in `dml_code/`.
+The repo is currently filled with a synthetic example so you can run the whole
+flow before replacing it with real project data.
+## Mental Model
+The workflow has two main steps:
+1. **Build an analysis dataset.** Start from a databank and program file,
+   join them, construct event-time variables, and write processed panels to
+   `data/build_output/`.
+2. **Estimate DML effects.** Read a YAML experiment, resolve its program,
+   covariates, filters, and models from the registries, then write logs to
+   `outputs/raw/`.
+After estimation, scripts can turn the raw logs into plots and tables.
+```text
+project_configuration/ + data/build/
+        |
+        v
+dml_code.pipeline.step1_build
+        |
+        v
+data/build_output/
+        |
+        v
+dml_code.pipeline.step2_estimate
+        |
+        v
+outputs/raw/ -> outputs/plots/ and outputs/tables/
+```
+## Run The Example
+```bash
+python project_scripts/generate_example.py
+python -m dml_code.pipeline.step1_build example_program
+python -m dml_code.pipeline.step2_estimate synthetic_example
+python project_scripts/plot_example.py
+```
+The first command creates synthetic input data in `data/build/`. Step 1 writes
+processed panels to `data/build_output/`. Step 2 writes estimation and
+prediction logs to `outputs/raw/`. The plotting script writes diagnostics to
+`outputs/plots/` and `outputs/tables/`.
+## What You Edit
+Most project setup happens in `project_configuration/`.
+- `project_configuration/build_spec.py`: define the databank files, columns to carry through,
+  relative-time columns to generate, and any generated features created after
+  panel construction.
+- `project_configuration/registries/programs.py`: define each program: its source file,
+  treatment column, enrollment-year column, and program-specific columns.
+- `project_configuration/registries/covariate_sets.py`: name reusable covariate lists and mark
+  categorical covariates for dummy encoding.
+- `project_configuration/registries/filter_sets.py`: name reusable Polars filters for
+  estimation samples.
+- `project_configuration/registries/models.py`: name outcome and propensity learners.
+- `project_configuration/estimation_experiments/*.yaml`: choose combinations of programs, outcomes,
+  covariates, filters, models, and control sampling rates to estimate.
+The pipeline code in `dml_code/` is meant to stay reusable.
+- `dml_code/pipeline/`: runnable steps, `step1_build.py` and
+  `step2_estimate.py`.
+- `dml_code/src/`: shared helpers for building, estimating, paths, outputs,
+  and logging.
+`project_scripts/` is for ad hoc project work tied to particular runs:
+generating example data, viewing outputs, making plots, running diagnostics,
+and writing small experiment-specific analyses.
+## How To Add A Real Project
+1. Put source parquet files somewhere under `data/` or point `project_configuration/` at their
+   real locations.
+2. Update `project_configuration/build_spec.py` with the databank files and feature-generation
+   logic.
+3. Add program definitions in `project_configuration/registries/programs.py`.
+4. Add covariate sets, filters, and models in the registry files.
+5. Create or copy a YAML file in `project_configuration/estimation_experiments/`.
+6. Run step 1 for a program, then step 2 for an experiment.
+Example:
+```bash
+python -m dml_code.pipeline.step1_build my_program
+python -m dml_code.pipeline.step2_estimate my_experiment
+```
+Use `project_scripts/` for project-specific follow-up work: viewing outputs
+from particular runs, making plots and tables, running diagnostics, robustness
+checks, and other exploratory analyses.
+## Where Results Go
+- `data/build/`: input data used by the example.
+- `data/build_output/`: processed analysis datasets created by step 1.
+- `outputs/raw/`: machine-readable estimation, prediction, and diagnostic logs.
+- `outputs/plots/`: generated figures.
+- `outputs/tables/`: generated tables.

dml_dev-0.1.1/README.md ADDED

@@ -0,0 +1,109 @@
+# DML Pipeline
+This repo is a small framework for running DoubleML on administrative-style
+program data. It separates project-specific choices from reusable pipeline code:
+you edit `project_configuration/`, then run the pipeline in `dml_code/`.
+The repo is currently filled with a synthetic example so you can run the whole
+flow before replacing it with real project data.
+## Mental Model
+The workflow has two main steps:
+1. **Build an analysis dataset.** Start from a databank and program file,
+   join them, construct event-time variables, and write processed panels to
+   `data/build_output/`.
+2. **Estimate DML effects.** Read a YAML experiment, resolve its program,
+   covariates, filters, and models from the registries, then write logs to
+   `outputs/raw/`.
+After estimation, scripts can turn the raw logs into plots and tables.
+```text
+project_configuration/ + data/build/
+        |
+        v
+dml_code.pipeline.step1_build
+        |
+        v
+data/build_output/
+        |
+        v
+dml_code.pipeline.step2_estimate
+        |
+        v
+outputs/raw/ -> outputs/plots/ and outputs/tables/
+```
+## Run The Example
+```bash
+python project_scripts/generate_example.py
+python -m dml_code.pipeline.step1_build example_program
+python -m dml_code.pipeline.step2_estimate synthetic_example
+python project_scripts/plot_example.py
+```
+The first command creates synthetic input data in `data/build/`. Step 1 writes
+processed panels to `data/build_output/`. Step 2 writes estimation and
+prediction logs to `outputs/raw/`. The plotting script writes diagnostics to
+`outputs/plots/` and `outputs/tables/`.
+## What You Edit
+Most project setup happens in `project_configuration/`.
+- `project_configuration/build_spec.py`: define the databank files, columns to carry through,
+  relative-time columns to generate, and any generated features created after
+  panel construction.
+- `project_configuration/registries/programs.py`: define each program: its source file,
+  treatment column, enrollment-year column, and program-specific columns.
+- `project_configuration/registries/covariate_sets.py`: name reusable covariate lists and mark
+  categorical covariates for dummy encoding.
+- `project_configuration/registries/filter_sets.py`: name reusable Polars filters for
+  estimation samples.
+- `project_configuration/registries/models.py`: name outcome and propensity learners.
+- `project_configuration/estimation_experiments/*.yaml`: choose combinations of programs, outcomes,
+  covariates, filters, models, and control sampling rates to estimate.
+The pipeline code in `dml_code/` is meant to stay reusable.
+- `dml_code/pipeline/`: runnable steps, `step1_build.py` and
+  `step2_estimate.py`.
+- `dml_code/src/`: shared helpers for building, estimating, paths, outputs,
+  and logging.
+`project_scripts/` is for ad hoc project work tied to particular runs:
+generating example data, viewing outputs, making plots, running diagnostics,
+and writing small experiment-specific analyses.
+## How To Add A Real Project
+1. Put source parquet files somewhere under `data/` or point `project_configuration/` at their
+   real locations.
+2. Update `project_configuration/build_spec.py` with the databank files and feature-generation
+   logic.
+3. Add program definitions in `project_configuration/registries/programs.py`.
+4. Add covariate sets, filters, and models in the registry files.
+5. Create or copy a YAML file in `project_configuration/estimation_experiments/`.
+6. Run step 1 for a program, then step 2 for an experiment.
+Example:
+```bash
+python -m dml_code.pipeline.step1_build my_program
+python -m dml_code.pipeline.step2_estimate my_experiment
+```
+Use `project_scripts/` for project-specific follow-up work: viewing outputs
+from particular runs, making plots and tables, running diagnostics, robustness
+checks, and other exploratory analyses.
+## Where Results Go
+- `data/build/`: input data used by the example.
+- `data/build_output/`: processed analysis datasets created by step 1.
+- `outputs/raw/`: machine-readable estimation, prediction, and diagnostic logs.
+- `outputs/plots/`: generated figures.
+- `outputs/tables/`: generated tables.
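
The README's split between named registries and experiments that point at them can be pictured with a minimal sketch. Everything here is an illustrative assumption: the registry contents, the `resolve` helper, and the experiment keys are hypothetical, and the string filters stand in for the Polars expressions the README mentions. The only detail borrowed from the package is the error style, which mirrors the `ValueError(f"Unknown program: {program}")` raised in `build_helpers.py`.

```python
# Hypothetical registries: short names map to reusable definitions.
COVARIATE_SETS = {
    "baseline": ["age", "prior_earnings"],
    "extended": ["age", "prior_earnings", "region"],
}
FILTER_SETS = {
    # Plain strings stand in for real filter expressions.
    "working_age": "age between 25 and 54",
    "recent_cohorts": "birth year 1980 or later",
}


def resolve(registry: dict, name: str, kind: str):
    """Look up one named entry and fail loudly on an unknown pointer."""
    try:
        return registry[name]
    except KeyError as e:
        raise ValueError(f"Unknown {kind}: {name}") from e


# Hypothetical experiment, shaped like a parsed YAML file.
experiment = {"covariates": "baseline", "filters": "working_age"}

x_cols = resolve(COVARIATE_SETS, experiment["covariates"], "covariate set")
row_filter = resolve(FILTER_SETS, experiment["filters"], "filter set")
print(x_cols, row_filter)
```

Keeping definitions in registries means an experiment file only carries names, so a typo in a pointer surfaces as an explicit error instead of a silently empty selection.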

dml_dev-0.1.1/agent.md ADDED

@@ -0,0 +1,78 @@
+# Repository Guide
+This repository contains a general-purpose Polars build pipeline and DoubleML
+estimation pipeline for administrative observational data. The package exposes
+the full implementation under `dml_code`, including executable pipeline
+entrypoints and shared helper modules.
+## Runtime Paths
+Runtime locations are defined in `dml_code/src/paths.py`. `LOCAL_DIR` is
+resolved from the repository location, and data, project configuration, and
+outputs are defined relative to it.
+## Main Pipeline
+### Build Processed Panels
+`dml_code/pipeline/step1_build.py` builds program-specific processed parquet
+files from a configured administrative data source and program registry entry.
+The script:
+1. Loads source panel files from the build spec.
+2. Loads a selected program source definition.
+3. Joins program records to the source panel on the configured join key.
+4. Constructs relative-time panels for treated and eligible comparison records.
+5. Creates relative-time variables from calendar-time source columns.
+6. Applies configured post-panel feature transforms.
+7. Writes processed parquet files for downstream estimation.
+CLI usage:
+```bash
+python dml_code/pipeline/step1_build.py <program_pointer>
+```
+### Estimate Effects
+`dml_code/pipeline/step2_estimate.py` runs DoubleML estimation from a YAML
+experiment spec.
+The script:
+1. Loads an experiment YAML file.
+2. Expands registry pointer lists into concrete runs.
+3. Loads processed parquet files for the selected program.
+4. Cleans missing values and applies configured filters.
+5. Selects treatment, outcome, covariates, and join key columns.
+6. Encodes configured categorical covariates.
+7. Fits DoubleML IRM with configured outcome and propensity models.
+8. Logs estimation summaries and row-level predictions.
+CLI usage:
+```bash
+python dml_code/pipeline/step2_estimate.py <experiment_name>
+```
+## Configuration
+The repository includes a runnable example configuration under `project_configuration`.
+Expected project configuration shape:
+- `project_configuration/build_spec.py`: databank spec and generated feature transforms.
+- `project_configuration/estimation_experiments/*.yaml`: experiment specs with registry pointer lists.
+- `project_configuration/registries/programs.py`: program spec registry.
+- `project_configuration/registries/covariate_sets.py`: covariate set registry.
+- `project_configuration/registries/filter_sets.py`: filter expression registry.
+- `project_configuration/registries/models.py`: outcome and propensity model registries.
+## Shared Modules
+- `dml_code/src/paths.py`: path configuration and output path helpers.
+- `dml_code/src/build_helpers.py`: build dataclasses and panel construction helpers.
+- `dml_code/src/estimate_helpers.py`: experiment loading, run expansion, validation, and DoubleML fitting.
+- `dml_code/src/outputs.py`: diagnostic plots and tables from estimation and prediction logs.
+- `dml_code/src/utils.py`: logging, table export, timing, and resource helpers.
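
Step 2 of the estimation script ("Expands registry pointer lists into concrete runs") can be sketched as below. This assumes the expansion is a plain Cartesian product over the pointer lists, which the guide does not confirm; the `Run` fields, the `expand_runs` helper, and the experiment keys are hypothetical and are not the package's `unpack_runs`.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Run:
    """One concrete combination of registry pointers (hypothetical fields)."""

    program: str
    outcome: str
    covariates: str
    model: str


def expand_runs(experiment: dict[str, list[str]]) -> list[Run]:
    """Expand pointer lists into every combination, in a stable order."""
    keys = ("programs", "outcomes", "covariates", "models")
    return [Run(*combo) for combo in product(*(experiment[k] for k in keys))]


# Hypothetical experiment, shaped like a parsed YAML file.
experiment = {
    "programs": ["example_program"],
    "outcomes": ["earnings_t1", "earnings_t2"],
    "covariates": ["baseline", "extended"],
    "models": ["lasso"],
}

runs = expand_runs(experiment)
for run_number, run in enumerate(runs, start=1):
    print(f"Run {run_number}/{len(runs)}: {run.program}, outcome={run.outcome}")
```

Under this assumption, two outcomes and two covariate sets yield four runs, which is why a short experiment file can produce many estimation logs.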

dml_dev-0.1.1/dml_code/pipeline/__init__.py ADDED

@@ -0,0 +1 @@
+"""Executable DML pipeline steps."""

dml_dev-0.1.0/project_code/pipeline/build.py → dml_dev-0.1.1/dml_code/pipeline/step1_build.py RENAMED

@@ -12,12 +12,12 @@ from pathlib import Path
 LOCAL_DIR = Path(__file__).resolve().parents[2]
 sys.path.insert(0, str(LOCAL_DIR))
-from project_code.src.build_helpers import (
+from dml_code.src.build_helpers import (
     backup_existing_output,
     build_cohort_file,
-    get_post_panel_transforms,
+    get_generated_features,
     get_program_spec,
-    get_source_data_spec,
+    get_databank_spec,
     time_elapsed,
 )
@@ -27,19 +27,19 @@ def main(program: str) -> None:
     start = time.time()
-    source_data_spec = get_source_data_spec()
+    databank_spec = get_databank_spec()
     program_spec = get_program_spec(program)
-    post_panel_transforms = get_post_panel_transforms()
+    generated_features = get_generated_features()
     backup_existing_output(program)
-    for source_data_path in source_data_spec.paths:
+    for databank_path in databank_spec.paths:
         build_cohort_file(
-            source_data_path=source_data_path,
+            databank_path=databank_path,
             program=program,
-            source_data_spec=source_data_spec,
+            databank_spec=databank_spec,
             program_spec=program_spec,
-            post_panel_transforms=post_panel_transforms,
+            generated_features=generated_features,
         )
     end = time.time()
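
The loop above calls `build_cohort_file` once per databank path. In the `build_helpers.py` diff, that function reads the cohort from the file stem with `int(databank_path.stem.split("=")[1])` and returns early outside a temporary 1940 to 1995 window. The sketch below isolates that parsing and filtering step. The `birth_cohort=<year>.parquet` file names and both helper functions are hypothetical; only the parsing expression and the window bounds come from the diff.

```python
from pathlib import Path

# Window bounds copied from the temporary guard in build_cohort_file.
COHORT_MIN, COHORT_MAX = 1940, 1995


def cohort_from_path(path: Path) -> int:
    """Parse the cohort year from a stem such as 'birth_cohort=1975'."""
    return int(path.stem.split("=")[1])


def in_scope(paths: list[Path]) -> list[tuple[int, Path]]:
    """Pair each path with its cohort and drop cohorts outside the window."""
    pairs = [(cohort_from_path(p), p) for p in paths]
    return [(c, p) for c, p in pairs if COHORT_MIN <= c <= COHORT_MAX]


# Hypothetical file names; the real databank layout is not shown in the diff.
paths = [Path(f"data/build/birth_cohort={year}.parquet") for year in (1935, 1975, 1996)]
for cohort, path in in_scope(paths):
    print(cohort, path.name)
```

The parsing expression requires an `=` in the file stem, so a path without one would raise an `IndexError` before the window check is reached.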

dml_dev-0.1.0/project_code/pipeline/estimate.py → dml_dev-0.1.1/dml_code/pipeline/step2_estimate.py RENAMED

@@ -19,14 +19,14 @@ if os.environ.get("NCPUS"):
 LOCAL_DIR = Path(__file__).resolve().parents[2]
 sys.path.insert(0, str(LOCAL_DIR))
-from project_code.src.estimate_helpers import (
+from dml_code.src.estimate_helpers import (
     fit_doubleml_irm,
     get_experiment,
     prepare_estimation_data,
     unpack_runs,
     validate_runs,
 )
-from project_code.src.utils import log_process_resources, log_results, time_elapsed, trim_memory
+from dml_code.src.utils import log_process_resources, log_results, time_elapsed, trim_memory
 def main(experiment_name: str) -> None:
@@ -39,7 +39,7 @@ def main(experiment_name: str) -> None:
     stop_resource_logging = log_process_resources(interval=30)
     try:
         for run_number, run in enumerate(runs, start=1):
-            print(f"Starting run #{run_number} of {len(runs)} \n")
+            print(f"\nRun {run_number}/{len(runs)}: {run.program_name}, outcome={run.outcome}")
             start = time.time()
             df, x_cols, summary = prepare_estimation_data(run)
@@ -56,7 +56,7 @@ def main(experiment_name: str) -> None:
             estimation_run_time = time_elapsed(start_estimation, end)
             estimation_run_time_hours = (end - start_estimation) / (60 * 60)
-            print("\n Starting logging...\n")
+            print("Writing estimation and prediction logs...")
             estimation_log = pl.DataFrame({
                 "program": [run.program_name],
                 "treatment": [run.treatment],
@@ -78,12 +78,12 @@ def main(experiment_name: str) -> None:
                 "estimation_run_time": [estimation_run_time],
                 "estimation_run_time_hours": [estimation_run_time_hours],
                 "timestamp": [datetime.now()],
-                "n_controls": [summary["n_controls"]],
-                "n_unique_controls": [summary["n_unique_controls"]],
-                "n_covariates": [summary["n_covariates"]],
-                "n_treated": [summary["n_treated"]],
-                "n_null_rows_dropped": [summary["n_null_rows_dropped"]],
-                "n_rows": [summary["n_rows"]],
+                "num_controls": [summary["num_controls"]],
+                "num_unique_controls": [summary["num_unique_controls"]],
+                "num_covariates": [summary["num_covariates"]],
+                "num_treated": [summary["num_treated"]],
+                "num_null_rows_dropped": [summary["num_null_rows_dropped"]],
+                "num_rows": [summary["num_rows"]],
                 "run_number": [run_number],
             })
             predictions_log = pl.DataFrame({
@@ -98,9 +98,10 @@ def main(experiment_name: str) -> None:
             log_results("estimation", estimation_log, experiment_name, run_number)
             log_results("predictions", predictions_log, experiment_name, run_number)
-            print(f"""\n Run #{run_number} complete
-                  \n Estimation run time: {estimation_run_time}
-                  \n Total run time: {total_run_time}\n \n""")
+            print(
+                f"Run {run_number}/{len(runs)} complete. "
+                f"Estimation: {estimation_run_time}; total: {total_run_time}.\n"
+            )
             del dml_obj, df
             trim_memory()
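
The estimation log columns change from an `n_` prefix to a `num_` prefix in this release, so logs written by 0.1.0 and 0.1.1 will not share column names. A minimal sketch of normalizing older rows follows. The mapping is copied from the diff above; the `upgrade_row` helper and the sample row with its values are hypothetical and not part of the package.

```python
# Old-to-new column names, copied from the step2_estimate.py diff.
RENAMED_COLUMNS = {
    "n_controls": "num_controls",
    "n_unique_controls": "num_unique_controls",
    "n_covariates": "num_covariates",
    "n_treated": "num_treated",
    "n_null_rows_dropped": "num_null_rows_dropped",
    "n_rows": "num_rows",
}


def upgrade_row(row: dict) -> dict:
    """Return a copy of one log row with 0.1.0 keys renamed to 0.1.1 keys."""
    return {RENAMED_COLUMNS.get(key, key): value for key, value in row.items()}


# Hypothetical 0.1.0 log row with made-up values.
old_row = {"program": "example_program", "n_treated": 120, "n_rows": 1500}
print(upgrade_row(old_row))
```

Keys that are not in the mapping pass through unchanged, so the same helper can be applied to rows already written with the new names.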

{dml_dev-0.1.0/project_code → dml_dev-0.1.1/dml_code}/src/build_helpers.py RENAMED

@@ -1,5 +1,6 @@
 from collections.abc import Callable, Sequence
 from dataclasses import dataclass, field
+import importlib
 from pathlib import Path
 import shutil
 import sys
@@ -8,8 +9,13 @@ import time
 import polars as pl
-from project_code.src.paths import CONFIG_DIR, processed_data_out_folder, processed_data_out_path
-from project_code.src.utils import time_elapsed, trim_memory
+from dml_code.src.paths import (
+    CONFIG_DIR,
+    CONFIG_PACKAGE,
+    processed_data_out_folder,
+    processed_data_out_path,
+)
+from dml_code.src.utils import time_elapsed, trim_memory
 TREATMENT_COL = "treatment"
 OBSERVATION_COL = "observation_year"
@@ -26,8 +32,8 @@ class RelativeCol:
 @dataclass(frozen=True, kw_only=True)
-class BuildSource:
-    """Input files and columns to carry from one side of the build join."""
+class DatabankSpec:
+    """Shared input files and columns to carry into every program build."""
     paths: Sequence[Path]
     passthrough_cols: Sequence[pl.Expr]
@@ -36,7 +42,7 @@ class BuildSource:
 @dataclass(frozen=True, kw_only=True)
-class ProgramSource(BuildSource):
+class ProgramSpec(DatabankSpec):
     """Program-specific source data and column mappings."""
     name: str
@@ -46,36 +52,36 @@ class ProgramSource(BuildSource):
 @dataclass(frozen=True, init=False)
 class BuildSpec:
-    """Complete build recipe: source data, programs, and post-panel transforms."""
+    """Complete build recipe: databank, programs, and generated features."""
-    source_data: BuildSource
-    programs: dict[str, ProgramSource]
-    post_panel_transforms: Sequence[Transform]
+    databank: DatabankSpec
+    programs: dict[str, ProgramSpec]
+    generated_features: Sequence[Transform]
     def __init__(
         self,
-        source_data: BuildSource | None = None,
-        programs: dict[str, ProgramSource] | None = None,
-        post_panel_transforms: Sequence[Transform] = (),
+        databank: DatabankSpec | None = None,
+        programs: dict[str, ProgramSpec] | None = None,
+        generated_features: Sequence[Transform] = (),
     ):
-        if source_data is None:
-            raise ValueError("BuildSpec requires source_data")
+        if databank is None:
+            raise ValueError("BuildSpec requires databank")
-        object.__setattr__(self, "source_data", source_data)
+        object.__setattr__(self, "databank", databank)
         object.__setattr__(self, "programs", programs or {})
-        object.__setattr__(self, "post_panel_transforms", post_panel_transforms)
+        object.__setattr__(self, "generated_features", generated_features)
 def get_build_spec() -> BuildSpec:
     """Load the configured build recipe lazily to avoid import cycles."""
     sys.path.insert(0, str(CONFIG_DIR.parent))
-    from config.build_spec import BUILD_SPEC
+    build_spec_module = importlib.import_module(f"{CONFIG_PACKAGE}.build_spec")
-    return BUILD_SPEC
+    return build_spec_module.BUILD_SPEC
-def get_program_spec(program: str) -> ProgramSource:
+def get_program_spec(program: str) -> ProgramSpec:
     """Return the configured source definition for one program."""
     try:
@@ -84,16 +90,16 @@ def get_program_spec(program: str) -> ProgramSource:
         raise ValueError(f"Unknown program: {program}") from e
-def get_source_data_spec() -> BuildSource:
-    """Return the shared source data input definition."""
+def get_databank_spec() -> DatabankSpec:
+    """Return the shared databank input definition."""
-    return get_build_spec().source_data
+    return get_build_spec().databank
-def get_post_panel_transforms() -> Sequence[Transform]:
-    """Return transforms applied after event-time panel construction."""
+def get_generated_features() -> Sequence[Transform]:
+    """Return generated feature transforms applied after panel construction."""
-    return get_build_spec().post_panel_transforms
+    return get_build_spec().generated_features
 def backup_existing_output(program: str) -> None:
@@ -110,7 +116,7 @@ def backup_existing_output(program: str) -> None:
-def load_program_lf(program_spec: ProgramSource) -> pl.LazyFrame:
+def load_program_lf(program_spec: ProgramSpec) -> pl.LazyFrame:
     """Load treated program records and normalize key build columns."""
     return (
@@ -180,11 +186,11 @@ def apply_transforms(
 def build_cohort_file(
-    source_data_path: Path,
+    databank_path: Path,
     program: str,
-    source_data_spec: BuildSource,
-    program_spec: ProgramSource,
-    post_panel_transforms: Sequence[Transform],
+    databank_spec: DatabankSpec,
+    program_spec: ProgramSpec,
+    generated_features: Sequence[Transform],
 ) -> None:
     """Build and write one processed parquet file for one birth cohort.
@@ -193,21 +199,21 @@ def build_cohort_file(
     """
     start = time.time()
-    cohort = int(source_data_path.stem.split("=")[1])
+    cohort = int(databank_path.stem.split("=")[1])
     print(f"\n \n Starting cohort {cohort}")
     # Temporary cohort window used to avoid scanning out-of-scope source files.
     if cohort < 1940 or cohort > 1995:
         return
-    source_data_lf = pl.scan_parquet(source_data_path).with_columns(
-        source_data_spec.join_key_col.alias(JOIN_KEY)
+    databank_lf = pl.scan_parquet(databank_path).with_columns(
+        databank_spec.join_key_col.alias(JOIN_KEY)
     )
     program_lf = load_program_lf(program_spec)
     treated_enrollment_years = get_treated_enrollment_years(program_lf)
     # Join once at calendar time, then slice into event-time panels below.
-    merged_lf = source_data_lf.join(program_lf, on=JOIN_KEY, how="left")
+    merged_lf = databank_lf.join(program_lf, on=JOIN_KEY, how="left")
     merged_lf = merged_lf.with_columns(pl.col(TREATMENT_COL).fill_null(0))
     available_cols = set(merged_lf.collect_schema().names())
@@ -216,11 +222,11 @@ def build_cohort_file(
         pl.col(OBSERVATION_COL),
         pl.col(TREATMENT_COL),
         *program_spec.passthrough_cols,
-        *source_data_spec.passthrough_cols,
+        *databank_spec.passthrough_cols,
     ]
     passthrough_cols_as_lag = [
         *program_spec.passthrough_cols_as_lag,
-        *source_data_spec.passthrough_cols_as_lag,
+        *databank_spec.passthrough_cols_as_lag,
     ]
     missing_cols = set()
@@ -263,8 +269,8 @@ def build_cohort_file(
         [pl.scan_parquet(path) for path in cohort_panel_paths],
         how="vertical_relaxed",
     )
-    # Add common post-panel features after all relative columns exist.
-    result = apply_transforms(result, post_panel_transforms)
+    # Add generated features after all relative columns exist.
+    result = apply_transforms(result, generated_features)
     out_path = processed_data_out_path(program, cohort)
     result.sink_parquet(out_path, engine="streaming")
@@ -278,12 +284,12 @@ def build_cohort_file(
 def add_derived_columns(program: str) -> None:
-    """Re-apply post-panel transforms to files that have already been built."""
+    """Re-apply generated feature transforms to files that have already been built."""
     folder = processed_data_out_folder(program)
-    post_panel_transforms = get_post_panel_transforms()
+    generated_features = get_generated_features()
     for path in folder.iterdir():
         lf = pl.scan_parquet(path)
-        lf = apply_transforms(lf, post_panel_transforms)
+        lf = apply_transforms(lf, generated_features)
         lf.sink_parquet(path, engine="streaming")
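
Beyond the renames, the one behavioral change in this file is that `get_build_spec` now loads the configuration with `importlib.import_module(f"{CONFIG_PACKAGE}.build_spec")` instead of a hard-coded `from config.build_spec import BUILD_SPEC`. The sketch below shows that pattern in isolation. The value of `CONFIG_PACKAGE` is an assumption (the `paths.py` diff is not shown here), `load_build_spec` is a hypothetical stand-in for `get_build_spec`, and a plain dict replaces the real `BuildSpec` object.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumed value; the real constant lives in dml_code/src/paths.py.
CONFIG_PACKAGE = "project_configuration"


def load_build_spec(config_parent: Path):
    """Import <CONFIG_PACKAGE>.build_spec at call time and return BUILD_SPEC."""
    sys.path.insert(0, str(config_parent))
    importlib.invalidate_caches()  # pick up modules created after startup
    module = importlib.import_module(f"{CONFIG_PACKAGE}.build_spec")
    return module.BUILD_SPEC


# Create a throwaway config package so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / CONFIG_PACKAGE
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "build_spec.py").write_text("BUILD_SPEC = {'databank': 'demo'}\n")
    spec = load_build_spec(Path(tmp))
    print(spec)
```

Resolving the module name from a constant means the configuration folder can be renamed in one place, and deferring the import to call time keeps the helper module importable even when the configuration itself imports from it.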
