diffindiff 2.3.4__tar.gz → 2.3.5__tar.gz

This diff compares the contents of two publicly released versions of the package as they appear in their public registry. It is provided for informational purposes only.
Files changed (22)
  1. {diffindiff-2.3.4 → diffindiff-2.3.5}/PKG-INFO +7 -4
  2. {diffindiff-2.3.4 → diffindiff-2.3.5}/README.md +6 -3
  3. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff/config.py +3 -3
  4. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff/didanalysis.py +31 -4
  5. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff/didanalysis_helper.py +15 -3
  6. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff/didtools.py +80 -19
  7. diffindiff-2.3.5/diffindiff/tests/__init__.py +0 -0
  8. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff/tests/tests_diffindiff.py +28 -6
  9. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff.egg-info/PKG-INFO +7 -4
  10. {diffindiff-2.3.4 → diffindiff-2.3.5}/setup.py +1 -1
  11. diffindiff-2.3.4/diffindiff/__init__.py +0 -4
  12. {diffindiff-2.3.4 → diffindiff-2.3.5}/MANIFEST.in +0 -0
  13. {diffindiff-2.3.4/diffindiff/tests → diffindiff-2.3.5/diffindiff}/__init__.py +0 -0
  14. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff/diddata.py +0 -0
  15. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff/tests/data/Corona_Hesse.xlsx +0 -0
  16. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff/tests/data/counties_DE.csv +0 -0
  17. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff/tests/data/curfew_DE.csv +0 -0
  18. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff.egg-info/SOURCES.txt +0 -0
  19. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff.egg-info/dependency_links.txt +0 -0
  20. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff.egg-info/requires.txt +0 -0
  21. {diffindiff-2.3.4 → diffindiff-2.3.5}/diffindiff.egg-info/top_level.txt +0 -0
  22. {diffindiff-2.3.4 → diffindiff-2.3.5}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: diffindiff
3
- Version: 2.3.4
3
+ Version: 2.3.5
4
4
  Summary: diffindiff: Python library for convenient Difference-in-Differences analyses
5
5
  Author: Thomas Wieland
6
6
  Author-email: geowieland@googlemail.com
@@ -27,7 +27,7 @@ Thomas Wieland [ORCID](https://orcid.org/0000-0001-5168-9846) [EMail](mailto:geo
27
27
 
28
28
  If you use this software, please cite:
29
29
 
30
- Wieland, T. (2026). diffindiff: A Python library for convenient difference-in-differences analyses (Version 2.3.4) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.18656820
30
+ Wieland, T. (2026). diffindiff: A Python library for convenient difference-in-differences analyses (Version 2.3.5) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.18656820
31
31
 
32
32
 
33
33
  ## Installation
@@ -173,7 +173,10 @@ See the /tests directory for usage examples of most of the included functions.
173
173
  This software was developed without the use of AI-generated code. The Continue Agent in Microsoft Visual Studio Code using the GPT-5 mini model (by OpenAI) was used solely to assist in drafting and refining docstrings for documentation. The corresponding guidelines and constraints defined by the author are documented in `AGENTS-docstrings.md` in the [public GitHub repository](https://github.com/geowieland/diffindiff_official).
174
174
 
175
175
 
176
- ## What's new (v2.3.4)
176
+ ## What's new (v2.3.5)
177
177
 
178
178
  - Bugfixes:
179
- - Fixed bug in DiffData instance creation in diddata.merge_data()
179
+ - Test whether input data is panel data via didtools.is_panel() which is included in didanalysis_helper.data_diagnostics()
180
+ - Fixed false test results given continuous treatments are accepted in didtools.is_simultaneous()
181
+ - Fixed false test results given continuous treatments are accepted in didtools.is_prepost()
182
+ - Argument 'pre_post' is passed to is_simultaneous() in didanalysis_helper.treatment_diagnostics()
@@ -19,7 +19,7 @@ Thomas Wieland [ORCID](https://orcid.org/0000-0001-5168-9846) [EMail](mailto:geo
19
19
 
20
20
  If you use this software, please cite:
21
21
 
22
- Wieland, T. (2026). diffindiff: A Python library for convenient difference-in-differences analyses (Version 2.3.4) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.18656820
22
+ Wieland, T. (2026). diffindiff: A Python library for convenient difference-in-differences analyses (Version 2.3.5) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.18656820
23
23
 
24
24
 
25
25
  ## Installation
@@ -165,7 +165,10 @@ See the /tests directory for usage examples of most of the included functions.
165
165
  This software was developed without the use of AI-generated code. The Continue Agent in Microsoft Visual Studio Code using the GPT-5 mini model (by OpenAI) was used solely to assist in drafting and refining docstrings for documentation. The corresponding guidelines and constraints defined by the author are documented in `AGENTS-docstrings.md` in the [public GitHub repository](https://github.com/geowieland/diffindiff_official).
166
166
 
167
167
 
168
- ## What's new (v2.3.4)
168
+ ## What's new (v2.3.5)
169
169
 
170
170
  - Bugfixes:
171
- - Fixed bug in DiffData instance creation in diddata.merge_data()
171
+ - Test whether input data is panel data via didtools.is_panel() which is included in didanalysis_helper.data_diagnostics()
172
+ - Fixed false test results given continuous treatments are accepted in didtools.is_simultaneous()
173
+ - Fixed false test results given continuous treatments are accepted in didtools.is_prepost()
174
+ - Argument 'pre_post' is passed to is_simultaneous() in didanalysis_helper.treatment_diagnostics()
@@ -4,15 +4,15 @@
4
4
  # Author: Thomas Wieland
5
5
  # ORCID: 0000-0001-5168-9846
6
6
  # mail: geowieland@googlemail.com
7
- # Version: 1.0.12
8
- # Last update: 2026-03-14 11:28
7
+ # Version: 1.0.13
8
+ # Last update: 2026-03-16 17:54
9
9
  # Copyright (c) 2025-2026 Thomas Wieland
10
10
  #-----------------------------------------------------------------------
11
11
 
12
12
  # Basic config:
13
13
 
14
14
  PACKAGE_NAME = "diffindiff"
15
- PACKAGE_VERSION = "2.3.4"
15
+ PACKAGE_VERSION = "2.3.5"
16
16
 
17
17
  VERBOSE = False
18
18
 
@@ -4,8 +4,8 @@
4
4
  # Author: Thomas Wieland
5
5
  # ORCID: 0000-0001-5168-9846
6
6
  # mail: geowieland@googlemail.com
7
- # Version: 2.3.3
8
- # Last update: 2026-03-12 19:40
7
+ # Version: 2.3.4
8
+ # Last update: 2026-03-16 17:39
9
9
  # Copyright (c) 2024-2026 Thomas Wieland
10
10
  #-----------------------------------------------------------------------
11
11
 
@@ -1290,7 +1290,7 @@ class DiffModel:
1290
1290
 
1291
1291
  TG_col_ = f"{config.TG_COL}{config.DELIMITER}{treatment}"
1292
1292
  TT_col_ = f"{config.TT_COL}{config.DELIMITER}{treatment}"
1293
- TGxTT_ = f"Placebo{config.DELIMITER}{treatment}"
1293
+ TGxTT_ = f"Placebo{config.DELIMITER}{treatment}"
1294
1294
 
1295
1295
  if TG_col is None and TG_col_ not in model_config["TG_col"]:
1296
1296
  raise ValueError(f"No treatment group identification variable for treatment {treatment}. Please state TG_col = your_treatment_group_dummy.")
@@ -2199,6 +2199,29 @@ def did_analysis(
2199
2199
  ... intercept=False
2200
2200
  ... )
2201
2201
  >>> Hesse_model1.summary()
2202
+ >>> Hesse_model5=did_analysis(
2203
+ ... data=Corona_Hesse,
2204
+ ... unit_col="REG_NAME",
2205
+ ... time_col="infection_date",
2206
+ ... treatment_col=["Nighttime_curfew", "Mobility_restrictions", "Retail_closed", "CR_private_2"],
2207
+ ... covariates=["infections_cum", "R7_rm_lag10"],
2208
+ ... outcome_col="R7_rm"
2209
+ ... )
2210
+ >>> Hesse_model5.summary()
2211
+ >>> Hesse_model6=did_analysis(
2212
+ ... data=Corona_Hesse,
2213
+ ... unit_col="REG_NAME",
2214
+ ... time_col="infection_date",
2215
+ ... treatment_col=["Nighttime_curfew", "Mobility_restrictions"],
2216
+ ... covariates=["infections_cum", "R7_rm_lag10"],
2217
+ ... interactions={
2218
+ ... 0: {
2219
+ ... "name": "curfew_and_mobility",
2220
+ ... "treatments": ["Nighttime_curfew", "Nighttime_curfew"]
2221
+ ... }
2222
+ ... },
2223
+ ... outcome_col="R7_rm"
2224
+ ... )
2202
2225
  """
2203
2226
 
2204
2227
  if TG_col is None:
@@ -2274,7 +2297,7 @@ def did_analysis(
2274
2297
  verbose=verbose
2275
2298
  )
2276
2299
  treatment_diagnostics = treatment_diagnostics_results[0]
2277
- staggered_adoption = treatment_diagnostics_results[1]
2300
+ staggered_adoption = treatment_diagnostics_results[1]
2278
2301
 
2279
2302
  if no_treatments > 1:
2280
2303
 
@@ -2445,6 +2468,10 @@ def did_analysis(
2445
2468
 
2446
2469
  pre_post = True
2447
2470
 
2471
+ FE_unit = False
2472
+ FE_time = False
2473
+ FE_group = False
2474
+
2448
2475
  if log_outcome:
2449
2476
 
2450
2477
  if missing_replace_by_zero:
@@ -4,8 +4,8 @@
4
4
  # Author: Thomas Wieland
5
5
  # ORCID: 0000-0001-5168-9846
6
6
  # mail: geowieland@googlemail.com
7
- # Version: 1.1.1
8
- # Last update: 2025-03-12 20:44
7
+ # Version: 1.1.2
8
+ # Last update: 2025-03-16 17:46
9
9
  # Copyright (c) 2025-2026 Thomas Wieland
10
10
  #-----------------------------------------------------------------------
11
11
 
@@ -531,6 +531,16 @@ def data_diagnostics(
531
531
  if cols_relevant is None:
532
532
  cols_relevant = []
533
533
 
534
+ modeldata_ispanel = tools.is_panel(
535
+ data=data,
536
+ unit_col=unit_col,
537
+ time_col=time_col,
538
+ verbose=verbose
539
+ )
540
+
541
+ if not modeldata_ispanel[0]:
542
+ raise TypeError(f"A difference-in-differences analysis requires panel data with at least two observational units and time points, respectively. Input data is likely {modeldata_ispanel[1]}")
543
+
534
544
  modeldata_ismissing = tools.is_missing(
535
545
  data,
536
546
  drop_missing = drop_missing,
@@ -560,7 +570,8 @@ def data_diagnostics(
560
570
  modeldata_isprepost = tools.is_prepost(
561
571
  data = data,
562
572
  unit_col = unit_col,
563
- time_col = time_col
573
+ time_col = time_col,
574
+ verbose = verbose
564
575
  )
565
576
  if modeldata_isprepost:
566
577
  data_type = config.PREPOST_PANELDATA_DESCRIPTION
@@ -666,6 +677,7 @@ def treatment_diagnostics(
666
677
  unit_col = unit_col,
667
678
  time_col = time_col,
668
679
  treatment_col = treatment,
680
+ pre_post = pre_post,
669
681
  verbose = verbose
670
682
  )
671
683
  if is_simultaneous_result:
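With `pre_post` forwarded here, `is_simultaneous()` appears to treat pre-post data as simultaneous without further checks. For multi-period data, the didtools.py hunk of this release replaces the pivot-based test with a per-period comparison of a binarized treatment indicator. The following is a minimal sketch of both checks on toy data, not the package's own code: the function names `old_continuous_branch` and `per_period_check` and the columns `unit`, `time`, and `dose` are illustrative assumptions.

```python
import pandas as pd


def old_continuous_branch(data_TG, unit_col, time_col, treatment_col):
    # Mirrors the removed branch for accepted continuous treatments:
    # every period with at least one non-missing value passes the test.
    pivot = data_TG.pivot_table(
        index=time_col, columns=unit_col, values=treatment_col
    )
    return bool((pivot.nunique(axis=1) > 0).all())


def per_period_check(data_TG, time_col, treatment_col):
    # Mirrors the added logic: binarize the treatment (> 0) and require
    # that all treated units share the same status in every period.
    treated = data_TG[treatment_col] > 0
    return bool(
        data_TG.assign(treated=treated)
        .groupby(time_col)["treated"]
        .nunique()
        .le(1)
        .all()
    )


# Toy data (hypothetical): unit A is treated from t=2, unit B from t=3,
# each with a continuous treatment intensity.
staggered = pd.DataFrame({
    "unit": ["A"] * 4 + ["B"] * 4,
    "time": [1, 2, 3, 4] * 2,
    "dose": [0.0, 0.5, 0.7, 0.9, 0.0, 0.0, 0.4, 0.8],
})

# Toy data (hypothetical): both units are treated from t=3, with
# different intensities.
simultaneous = pd.DataFrame({
    "unit": ["A"] * 4 + ["B"] * 4,
    "time": [1, 2, 3, 4] * 2,
    "dose": [0.0, 0.0, 0.5, 0.9, 0.0, 0.0, 0.3, 0.6],
})

print(old_continuous_branch(staggered, "unit", "time", "dose"))  # True
print(per_period_check(staggered, "time", "dose"))               # False
print(per_period_check(simultaneous, "time", "dose"))            # True
```

On the staggered toy data the removed branch reports a simultaneous treatment, because `nunique(axis=1) > 0` holds for any period with data, while the per-period check flags the period in which only one unit is treated.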
@@ -4,8 +4,8 @@
4
4
  # Author: Thomas Wieland
5
5
  # ORCID: 0000-0001-5168-9846
6
6
  # mail: geowieland@googlemail.com
7
- # Version: 2.2.1
8
- # Last update: 2026-03-03 17:34
7
+ # Version: 2.2.2
8
+ # Last update: 2026-03-16 18:04
9
9
  # Copyright (c) 2025-2026 Thomas Wieland
10
10
  #-----------------------------------------------------------------------
11
11
 
@@ -498,11 +498,11 @@ def is_simultaneous(
498
498
  --------
499
499
  >>> is_simultaneous(df, 'unit', 'time', 'treat')
500
500
  """
501
-
501
+
502
502
  if pre_post:
503
503
 
504
504
  if verbose:
505
- print(f"Checking whether treatment '{treatment_col}' is simultaneous or staggered", end = " ... ")
505
+ print(f"Data for treatment '{treatment_col}' is pre-post data and considered as simultaneous.")
506
506
 
507
507
  simultaneous = True
508
508
 
@@ -521,25 +521,24 @@ def is_simultaneous(
521
521
  treatment_group = data_isnotreatment[1]
522
522
  data_TG = data[data[unit_col].isin(treatment_group)]
523
523
 
524
- data_TG_pivot = data_TG.pivot_table(
525
- index = time_col,
526
- columns = unit_col,
527
- values = treatment_col
528
- )
524
+ treated = data_TG[treatment_col] > 0
529
525
 
530
- if config.ACCEPT_CONTINUOUS_TREATMENTS:
531
- simultaneous = (data_TG_pivot.nunique(axis=1) > 0).all()
532
- else:
533
- simultaneous = (data_TG_pivot.nunique(axis=1) == 1).all()
526
+ simultaneous = (
527
+ data_TG.assign(treated=treated)
528
+ .groupby(time_col)["treated"]
529
+ .nunique()
530
+ .le(1)
531
+ .all()
532
+ )
534
533
 
535
- if verbose:
536
- print("OK")
534
+ if verbose:
535
+ print("OK")
537
536
 
538
537
  if not simultaneous and data_isnotreatment[0]:
539
538
  print(f"NOTE: treatment '{treatment_col}' is not simultaneous.")
540
539
 
541
- if simultaneous and not data_isnotreatment[0]:
542
- print(f"WARNING: treatment '{treatment_col}' is simultaneous and does not include a {config.NO_TREATMENT_CG_DESCRIPTION}")
540
+ if simultaneous and not data_isnotreatment[0]:
541
+ print(f"WARNING: treatment '{treatment_col}' is simultaneous and does not include a {config.NO_TREATMENT_CG_DESCRIPTION}")
543
542
 
544
543
  return simultaneous
545
544
 
@@ -905,6 +904,65 @@ def is_parallel(
905
904
  test_ols_model
906
905
  ]
907
906
 
907
+ def is_panel(
908
+ data: pd.DataFrame,
909
+ unit_col: str,
910
+ time_col: str,
911
+ verbose: bool = config.VERBOSE
912
+ ):
913
+
914
+ """
915
+ Check whether panel data is panel data
916
+ (>=2 units and >= 2 timepoints).
917
+
918
+ Parameters
919
+ ----------
920
+ data : pandas.DataFrame
921
+ Panel data.
922
+ unit_col : str
923
+ Column name for units.
924
+ time_col : str
925
+ Column name for time.
926
+ verbose : bool, optional
927
+ If True, print progress messages.
928
+
929
+ Returns
930
+ -------
931
+ bool
932
+ True if panel data, False otherwise.
933
+
934
+ Examples
935
+ --------
936
+ >>> is_panel(df, 'unit', 'time')
937
+ """
938
+
939
+ if verbose:
940
+ print("Checking whether input data is panel data", end = " ... ")
941
+
942
+ panel = True
943
+ other_data_type = ""
944
+
945
+ no_units = data[unit_col].nunique()
946
+ no_timepoints = data[time_col].nunique()
947
+
948
+ if no_units < 2 or no_timepoints < 2:
949
+ panel = False
950
+
951
+ if verbose:
952
+ print("OK")
953
+
954
+ if no_units < 2 and no_timepoints >= 2:
955
+ other_data_type = "Single time series data"
956
+ elif no_units > 2 and no_timepoints < 2:
957
+ other_data_type = "Cross-sectional data"
958
+ elif no_units < 2 and no_timepoints < 2:
959
+ other_data_type = "Single observation"
960
+
961
+ if not panel:
962
+ print(f"WARNING: Input data contains {no_units} units and {no_timepoints} time points. It is not panel data but likely: {other_data_type}.")
963
+
964
+ return panel, other_data_type
965
+
908
966
  def is_prepost(
909
967
  data: pd.DataFrame,
910
968
  unit_col: str,
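The classification added in `is_panel()` can be sketched as a standalone function. Note that the released cross-sectional branch tests `no_units > 2`, so a frame with exactly two units and a single time point is reported as non-panel with an empty label. The sketch below uses `>= 2` instead, which is an assumption about the intended behavior; the name `classify_panel` and the toy columns are hypothetical and not part of the package.

```python
import pandas as pd


def classify_panel(data, unit_col, time_col):
    # Returns (is_panel, label), following the return shape of is_panel().
    no_units = data[unit_col].nunique()
    no_timepoints = data[time_col].nunique()

    if no_units >= 2 and no_timepoints >= 2:
        return True, ""
    if no_units < 2 and no_timepoints >= 2:
        return False, "Single time series data"
    # Assumption: ">= 2" rather than the released "> 2", so that exactly
    # two units observed once are also labeled as cross-sectional.
    if no_units >= 2 and no_timepoints < 2:
        return False, "Cross-sectional data"
    return False, "Single observation"


panel = pd.DataFrame({"unit": ["A", "A", "B", "B"], "time": [1, 2, 1, 2]})
series = pd.DataFrame({"unit": ["A", "A", "A"], "time": [1, 2, 3]})
cross_section = pd.DataFrame({"unit": ["A", "B"], "time": [1, 1]})

print(classify_panel(panel, "unit", "time"))          # (True, '')
print(classify_panel(series, "unit", "time"))         # (False, 'Single time series data')
print(classify_panel(cross_section, "unit", "time"))  # (False, 'Cross-sectional data')
```

In `data_diagnostics()`, the first element of this tuple decides whether a `TypeError` is raised, and the second element is interpolated into the error message.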
@@ -937,9 +995,12 @@ def is_prepost(
937
995
  """
938
996
 
939
997
  if verbose:
940
- print("Checking whether panel data is pre-post or multi-period", end = " ... ")
998
+ print("Checking whether panel data is pre-post or multi-period", end = " ... ")
999
+
1000
+ prepost = False
941
1001
 
942
- prepost = (data.groupby(unit_col)[time_col].nunique().le(2).all())
1002
+ if data[time_col].nunique() == 2:
1003
+ prepost = True
943
1004
 
944
1005
  if verbose:
945
1006
  print("OK")
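The change to `is_prepost()` swaps a per-unit count of time points for a count over the whole data set. A minimal sketch of a case where the two checks disagree is given below; the helper names `prepost_per_unit` and `prepost_global` and the toy data are hypothetical, and the source does not state whether this is the case that motivated the change.

```python
import pandas as pd


def prepost_per_unit(data, unit_col, time_col):
    # Removed check: every unit has at most two distinct time points.
    return bool(data.groupby(unit_col)[time_col].nunique().le(2).all())


def prepost_global(data, time_col):
    # Added check: the data set as a whole has exactly two time points.
    return data[time_col].nunique() == 2


# Toy data (hypothetical): each unit is observed twice, but in
# different periods, so the data set spans four time points.
unbalanced = pd.DataFrame({
    "unit": ["A", "A", "B", "B"],
    "time": [1, 2, 3, 4],
})

# Toy data (hypothetical): a classic two-period design.
two_period = pd.DataFrame({
    "unit": ["A", "A", "B", "B"],
    "time": [1, 2, 1, 2],
})

print(prepost_per_unit(unbalanced, "unit", "time"))  # True
print(prepost_global(unbalanced, "time"))            # False
print(prepost_global(two_period, "time"))            # True
```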
File without changes
@@ -4,8 +4,8 @@
4
4
  # Author: Thomas Wieland
5
5
  # ORCID: 0000-0001-5168-9846
6
6
  # mail: geowieland@googlemail.com
7
- # Version: 2.0.12
8
- # Last update: 2026-03-01 11:29
7
+ # Version: 2.0.14
8
+ # Last update: 2026-03-16 17:35
9
9
  # Copyright (c) 2025-2026 Thomas Wieland
10
10
  #-----------------------------------------------------------------------
11
11
 
@@ -82,7 +82,7 @@ curfew_data_prepost=create_data(
82
82
  curfew_data_prepost.summary()
83
83
  # Summary of created data
84
84
 
85
- curfew_model_prepost=curfew_data_prepost.analysis()
85
+ curfew_model_prepost=curfew_data_prepost.analysis(verbose=True)
86
86
  # Model analysis of created data
87
87
 
88
88
  print(curfew_model_prepost.treatment_effects())
@@ -184,7 +184,7 @@ curfew_data=create_data(
184
184
  curfew_data.summary()
185
185
  # Summary of created treatment data
186
186
 
187
- curfew_model=curfew_data.analysis()
187
+ curfew_model=curfew_data.analysis(verbose=True)
188
188
  # Model analysis of created data
189
189
 
190
190
  curfew_model.summary()
@@ -211,6 +211,7 @@ curfew_placebo = curfew_model.placebo(
211
211
  curfew_placebo.summary()
212
212
  # Summary of placebo test
213
213
 
214
+
214
215
  # Two-way-fixed-effects model:
215
216
 
216
217
  curfew_model_FE=curfew_data.analysis(
@@ -340,7 +341,8 @@ Hesse_model1=did_analysis(
340
341
  time_col="infection_date",
341
342
  treatment_col="Nighttime_curfew",
342
343
  outcome_col="R7_rm",
343
- intercept=False
344
+ intercept=False,
345
+ verbose=True
344
346
  )
345
347
  # Model with staggered adoption (FE automatically)
346
348
 
@@ -430,8 +432,28 @@ Hesse_model5=did_analysis(
430
432
  time_col="infection_date",
431
433
  treatment_col=["Nighttime_curfew", "Mobility_restrictions", "Retail_closed", "CR_private_2"],
432
434
  covariates=["infections_cum", "R7_rm_lag10"],
433
- outcome_col="R7_rm")
435
+ outcome_col="R7_rm"
436
+ )
434
437
  # Model with four interventions (two staggered, two without control conditions)
435
438
 
436
439
  Hesse_model5.summary()
440
+ # Model summary
441
+
442
+ Hesse_model6=did_analysis(
443
+ data=Corona_Hesse,
444
+ unit_col="REG_NAME",
445
+ time_col="infection_date",
446
+ treatment_col=["Nighttime_curfew", "School_holidays"],
447
+ covariates=["infections_cum", "R7_rm_lag10"],
448
+ interactions={
449
+ 0: {
450
+ "name": "curfew_and_holidays",
451
+ "treatments": ["Nighttime_curfew", "School_holidays"]
452
+ }
453
+ },
454
+ outcome_col="R7_rm"
455
+ )
456
+ # Model with two interventions and one interaction of the two treatments
457
+
458
+ Hesse_model6.summary()
437
459
  # Model summary
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: diffindiff
3
- Version: 2.3.4
3
+ Version: 2.3.5
4
4
  Summary: diffindiff: Python library for convenient Difference-in-Differences analyses
5
5
  Author: Thomas Wieland
6
6
  Author-email: geowieland@googlemail.com
@@ -27,7 +27,7 @@ Thomas Wieland [ORCID](https://orcid.org/0000-0001-5168-9846) [EMail](mailto:geo
27
27
 
28
28
  If you use this software, please cite:
29
29
 
30
- Wieland, T. (2026). diffindiff: A Python library for convenient difference-in-differences analyses (Version 2.3.4) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.18656820
30
+ Wieland, T. (2026). diffindiff: A Python library for convenient difference-in-differences analyses (Version 2.3.5) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.18656820
31
31
 
32
32
 
33
33
  ## Installation
@@ -173,7 +173,10 @@ See the /tests directory for usage examples of most of the included functions.
173
173
  This software was developed without the use of AI-generated code. The Continue Agent in Microsoft Visual Studio Code using the GPT-5 mini model (by OpenAI) was used solely to assist in drafting and refining docstrings for documentation. The corresponding guidelines and constraints defined by the author are documented in `AGENTS-docstrings.md` in the [public GitHub repository](https://github.com/geowieland/diffindiff_official).
174
174
 
175
175
 
176
- ## What's new (v2.3.4)
176
+ ## What's new (v2.3.5)
177
177
 
178
178
  - Bugfixes:
179
- - Fixed bug in DiffData instance creation in diddata.merge_data()
179
+ - Test whether input data is panel data via didtools.is_panel() which is included in didanalysis_helper.data_diagnostics()
180
+ - Fixed false test results given continuous treatments are accepted in didtools.is_simultaneous()
181
+ - Fixed false test results given continuous treatments are accepted in didtools.is_prepost()
182
+ - Argument 'pre_post' is passed to is_simultaneous() in didanalysis_helper.treatment_diagnostics()
@@ -7,7 +7,7 @@ def read_README():
7
7
 
8
8
  setup(
9
9
  name='diffindiff',
10
- version='2.3.4',
10
+ version='2.3.5',
11
11
  description='diffindiff: Python library for convenient Difference-in-Differences analyses',
12
12
  packages=find_packages(include=["diffindiff", "diffindiff.tests"]),
13
13
  include_package_data=True,
@@ -1,4 +0,0 @@
1
- from diffindiff.didanalysis import DiffModel, did_analysis
2
- from diffindiff.diddata import DiffGroups, create_groups, DiffTreatment, create_treatment, DiffData, merge_data, create_data
3
- from diffindiff.didtools import is_balanced, is_missing, is_simultaneous, is_notreatment, date_counter, check_columns, is_binary, is_parallel, unique, model_wrapper, treatment_times, clean_column_name
4
- from diffindiff.didanalysis_helper import create_fixed_effects, create_specific_time_trends, create_specific_treatment_effects, create_spillover
File without changes
File without changes