datafc 1.2.0__tar.gz → 1.3.0__tar.gz

Files changed (28)
  1. {datafc-1.2.0 → datafc-1.3.0}/PKG-INFO +80 -25
  2. {datafc-1.2.0 → datafc-1.3.0}/README.md +79 -24
  3. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/__init__.py +3 -1
  4. datafc-1.3.0/datafc/sofascore/fetch_past_matches_data.py +142 -0
  5. datafc-1.3.0/datafc/utils/_config.py +8 -0
  6. {datafc-1.2.0 → datafc-1.3.0}/datafc.egg-info/PKG-INFO +80 -25
  7. {datafc-1.2.0 → datafc-1.3.0}/datafc.egg-info/SOURCES.txt +1 -0
  8. {datafc-1.2.0 → datafc-1.3.0}/setup.py +1 -1
  9. datafc-1.2.0/datafc/utils/_config.py +0 -6
  10. {datafc-1.2.0 → datafc-1.3.0}/LICENSE +0 -0
  11. {datafc-1.2.0 → datafc-1.3.0}/datafc/__init__.py +0 -0
  12. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/fetch_coordinates_data.py +0 -0
  13. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/fetch_goal_networks_data.py +0 -0
  14. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/fetch_lineups_data.py +0 -0
  15. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/fetch_match_data.py +0 -0
  16. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/fetch_match_odds_data.py +0 -0
  17. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/fetch_match_stats_data.py +0 -0
  18. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/fetch_momentum_data.py +0 -0
  19. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/fetch_shots_data.py +0 -0
  20. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/fetch_standings_data.py +0 -0
  21. {datafc-1.2.0 → datafc-1.3.0}/datafc/sofascore/fetch_substitutions_data.py +0 -0
  22. {datafc-1.2.0 → datafc-1.3.0}/datafc/utils/__init__.py +0 -0
  23. {datafc-1.2.0 → datafc-1.3.0}/datafc/utils/_save_files.py +0 -0
  24. {datafc-1.2.0 → datafc-1.3.0}/datafc/utils/_setup_webdriver.py +0 -0
  25. {datafc-1.2.0 → datafc-1.3.0}/datafc.egg-info/dependency_links.txt +0 -0
  26. {datafc-1.2.0 → datafc-1.3.0}/datafc.egg-info/requires.txt +0 -0
  27. {datafc-1.2.0 → datafc-1.3.0}/datafc.egg-info/top_level.txt +0 -0
  28. {datafc-1.2.0 → datafc-1.3.0}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: datafc
- Version: 1.2.0
+ Version: 1.3.0
  Summary: A scalable Python library for fetching, processing, and exporting structured football match data.
  Home-page: https://github.com/urazakgul/datafc
  Author: Uraz Akgül
@@ -14,7 +14,7 @@ Requires-Python: >=3.8
  Description-Content-Type: text/markdown
  License-File: LICENSE
 
- # datafc v1.2.0
+ # datafc v1.3.0
 
  ## Overview
 
@@ -53,7 +53,7 @@ pip install git+https://github.com/urazakgul/datafc.git
  To install a specific version of `datafc`, use:
 
  ```bash
- pip install datafc==1.2.0
+ pip install datafc==1.3.0
  ```
 
  If you already have `datafc` installed and want to upgrade to the latest version, run:
@@ -105,6 +105,7 @@ from datafc.sofascore import (
      match_odds_data,
      match_stats_data,
      momentum_data,
+     past_matches_data,
      lineups_data,
      coordinates_data,
      substitutions_data,
@@ -140,20 +141,19 @@ The `lineups_data` function fetches player lineup details for each match and is
 
  Without `lineups_data`, these dependent functions will not work as expected.
 
- Exception: `standings_data`
+ Exception: `standings_data` and `past_matches_data`
 
- Unlike the other functions, `standings_data` does not require `match_data` or `lineups_data`. It can be executed independently using only `tournament_id` and `season_id`.
+ Unlike other functions, `standings_data` and `past_matches_data` do not require `match_data` or `lineups_data`. They can be executed independently using only `tournament_id` and `season_id`. Additionally, `past_matches_data` includes an extra field: `week_number`.
 
  ### Match Data
 
  #### `match_data`
 
- The `match_data` function fetches match data for a specified tournament, season, and matchweek. It returns a DataFrame containing details such as country, tournament name, season, week number, game ID, home team, home team ID, away team, away team ID, added injury times for both halves, start timestamp, match status, and score-related information.
+ The `match_data` function fetches match data for a specified tournament, season, and matchweek.
 
  Example Usage:
 
  ```python
- # Fetch match data for a specific tournament, season, and week
  match_df = match_data(
      tournament_id=52,
      season_id=63814,
@@ -210,12 +210,11 @@ Dependencies:
 
  #### `match_odds_data`
 
- The `match_odds_data` function fetches betting odds data for each match in the provided match dataset. It returns a DataFrame containing match odds details, including market names, odds values, and whether the odds changed.
+ The `match_odds_data` function fetches betting odds data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch match odds data
  match_odds_df = match_odds_data(
      match_df=match_df,
      data_source="sofascore",
@@ -258,12 +257,11 @@ Dependencies:
 
  #### `match_stats_data`
 
- The `match_stats_data` function fetches statistical data for each match in the provided match dataset. It returns a DataFrame containing key match statistics, including team performance metrics.
+ The `match_stats_data` function fetches statistical data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch match statistics data
  match_stats_df = match_stats_data(
      match_df=match_df,
      data_source="sofascore",
@@ -303,12 +301,11 @@ Dependencies:
 
  #### `momentum_data`
 
- The `momentum_data` function fetches momentum data for each match in the provided match dataset. It returns a DataFrame containing match momentum values over time.
+ The `momentum_data` function fetches momentum data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch momentum data
  momentum_df = momentum_data(
      match_df=match_df,
      data_source="sofascore",
@@ -343,16 +340,76 @@ Dependencies:
 
  * Requires `match_data` output as `match_df`.
 
+ #### `past_matches_data`
+
+ The `past_matches_data` function fetches past match data for a specified tournament, season, and week number.
+
+ Example Usage:
+
+ ```python
+ past_matches_df = past_matches_data(
+     tournament_id=52,
+     season_id=63814,
+     week_number=21,
+     data_source="sofascore",
+     enable_json_export=True,
+     enable_excel_export=True
+ )
+
+ print(past_matches_df)
+ ```
+
+ Parameters:
+
+ * `tournament_id` (int): The unique identifier for the tournament.
+ * `season_id` (int): The unique identifier for the season.
+ * `week_number` (int): The matchweek number within the season.
+ * `data_source` (str): The data source (`sofavpn` or `sofascore`). Defaults to `sofascore`.
+ * `element_load_timeout` (int): The maximum time (in seconds) to wait for the API response. Defaults to 10.
+ * `enable_json_export` (bool): If `True`, exports the fetched data as a JSON file. Defaults to `False`.
+ * `enable_excel_export` (bool): If `True`, exports the fetched data as an Excel file. Defaults to `False`.
+
+ Data Structure:
+
+ The returned DataFrame includes the following columns:
+
+ * `country`: The country where the tournament is held.
+ * `tournament`: The name of the tournament.
+ * `season`: The season year.
+ * `week`: The matchweek number.
+ * `game_id`: The unique identifier for the match.
+ * `home_team`: The name of the home team.
+ * `home_team_id`: The unique identifier for the home team.
+ * `away_team`: The name of the away team.
+ * `away_team_id`: The unique identifier for the away team.
+ * `injury_time_1`: Added injury time in the first half.
+ * `injury_time_2`: Added injury time in the second half.
+ * `start_timestamp`: The start time of the match.
+ * `status`: The current status of the match.
+ * `home_score_current`: The latest recorded score for the home team.
+ * `home_score_display`: The displayed score of the home team.
+ * `home_score_period1`: The home team's score at the end of the first half.
+ * `home_score_period2`: The home team's goals scored in the second half.
+ * `home_score_normaltime`: The home team's final score at the end of normal time (90 minutes).
+ * `away_score_current`: The latest recorded score for the away team.
+ * `away_score_display`: The displayed score of the away team.
+ * `away_score_period1`: The away team's score at the end of the first half.
+ * `away_score_period2`: The away team's goals scored in the second half.
+ * `away_score_normaltime`: The away team's final score at the end of normal time (90 minutes).
+
+ Dependencies:
+
+ * No prior function dependency required.
+
  ### Player Data
 
  #### `lineups_data`
 
- The `lineups_data` function fetches lineup data for each match in the provided match dataset. It returns a DataFrame containing lineup details such as country, tournament name, season, week number, game ID, team, player name, player ID, statistic name, and statistic value.
+ The `lineups_data` function fetches lineup data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch lineups data based on match data
  lineups_df = lineups_data(
      match_df=match_df,
      data_source="sofascore",
@@ -392,12 +449,11 @@ Dependencies:
 
  #### `coordinates_data`
 
- The `coordinates_data` function fetches coordinate data for each player in the provided lineup dataset. It returns a DataFrame containing coordinate details such as country, tournament name, season, week number, game ID, team, player ID, player name, and x-y coordinates.
+ The `coordinates_data` function fetches coordinate data for each player in the provided lineup dataset.
 
  Example Usage:
 
  ```python
- # Fetch coordinates data
  coordinates_df = coordinates_data(
      lineups_df=lineups_df,
      data_source="sofascore",
@@ -437,12 +493,11 @@ Dependencies:
 
  #### `substitutions_data`
 
- The `substitutions_data` function fetches substitution data for each match in the provided match dataset. It returns a DataFrame containing details of player substitutions, including the players involved and the time of the substitution.
+ The `substitutions_data` function fetches substitution data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch substitution data
  substitutions_df = substitutions_data(
      match_df=match_df,
      data_source="sofascore",
@@ -484,12 +539,11 @@ Dependencies:
 
  #### `goal_networks_data`
 
- The `goal_networks_data` function fetches goal network data for each match in the provided match dataset. It returns a DataFrame containing goal-related events, including passing networks and shot locations.
+ The `goal_networks_data` function fetches goal network data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch goal networks data
  goal_networks_df = goal_networks_data(
      match_df=match_df,
      data_source="sofascore",
@@ -541,12 +595,11 @@ Dependencies:
 
  #### `shots_data`
 
- The `shots_data` function fetches shot data for each match in the provided match dataset. It returns a DataFrame containing detailed shot-related information, including player coordinates, xG values, shot types, and goal mouth locations.
+ The `shots_data` function fetches shot data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch shot data
  shots_df = shots_data(
      match_df=match_df,
      data_source="sofascore",
@@ -608,12 +661,11 @@ Dependencies:
 
  #### `standings_data`
 
- The `standings_data` function fetches league standings for a specific tournament and season. It returns a DataFrame containing team rankings, match results, and points.
+ The `standings_data` function fetches league standings for a specific tournament and season.
 
  Example Usage:
 
  ```python
- # Fetch league standings
  standings_df = standings_data(
      tournament_id=52,
      season_id=63814,
@@ -658,6 +710,9 @@ Dependencies:
 
  ## Changelog
 
+ * v1.3.0
+ * Added `past_matches_data` function to fetch historical match data.
+
  * v1.2.0
  * Added match score columns to `match_data`
 
@@ -1,4 +1,4 @@
- # datafc v1.2.0
+ # datafc v1.3.0
 
  ## Overview
 
@@ -37,7 +37,7 @@ pip install git+https://github.com/urazakgul/datafc.git
  To install a specific version of `datafc`, use:
 
  ```bash
- pip install datafc==1.2.0
+ pip install datafc==1.3.0
  ```
 
  If you already have `datafc` installed and want to upgrade to the latest version, run:
@@ -89,6 +89,7 @@ from datafc.sofascore import (
      match_odds_data,
      match_stats_data,
      momentum_data,
+     past_matches_data,
      lineups_data,
      coordinates_data,
      substitutions_data,
@@ -124,20 +125,19 @@ The `lineups_data` function fetches player lineup details for each match and is
 
  Without `lineups_data`, these dependent functions will not work as expected.
 
- Exception: `standings_data`
+ Exception: `standings_data` and `past_matches_data`
 
- Unlike the other functions, `standings_data` does not require `match_data` or `lineups_data`. It can be executed independently using only `tournament_id` and `season_id`.
+ Unlike other functions, `standings_data` and `past_matches_data` do not require `match_data` or `lineups_data`. They can be executed independently using only `tournament_id` and `season_id`. Additionally, `past_matches_data` includes an extra field: `week_number`.
 
  ### Match Data
 
  #### `match_data`
 
- The `match_data` function fetches match data for a specified tournament, season, and matchweek. It returns a DataFrame containing details such as country, tournament name, season, week number, game ID, home team, home team ID, away team, away team ID, added injury times for both halves, start timestamp, match status, and score-related information.
+ The `match_data` function fetches match data for a specified tournament, season, and matchweek.
 
  Example Usage:
 
  ```python
- # Fetch match data for a specific tournament, season, and week
  match_df = match_data(
      tournament_id=52,
      season_id=63814,
@@ -194,12 +194,11 @@ Dependencies:
 
  #### `match_odds_data`
 
- The `match_odds_data` function fetches betting odds data for each match in the provided match dataset. It returns a DataFrame containing match odds details, including market names, odds values, and whether the odds changed.
+ The `match_odds_data` function fetches betting odds data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch match odds data
  match_odds_df = match_odds_data(
      match_df=match_df,
      data_source="sofascore",
@@ -242,12 +241,11 @@ Dependencies:
 
  #### `match_stats_data`
 
- The `match_stats_data` function fetches statistical data for each match in the provided match dataset. It returns a DataFrame containing key match statistics, including team performance metrics.
+ The `match_stats_data` function fetches statistical data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch match statistics data
  match_stats_df = match_stats_data(
      match_df=match_df,
      data_source="sofascore",
@@ -287,12 +285,11 @@ Dependencies:
 
  #### `momentum_data`
 
- The `momentum_data` function fetches momentum data for each match in the provided match dataset. It returns a DataFrame containing match momentum values over time.
+ The `momentum_data` function fetches momentum data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch momentum data
  momentum_df = momentum_data(
      match_df=match_df,
      data_source="sofascore",
@@ -327,16 +324,76 @@ Dependencies:
 
  * Requires `match_data` output as `match_df`.
 
+ #### `past_matches_data`
+
+ The `past_matches_data` function fetches past match data for a specified tournament, season, and week number.
+
+ Example Usage:
+
+ ```python
+ past_matches_df = past_matches_data(
+     tournament_id=52,
+     season_id=63814,
+     week_number=21,
+     data_source="sofascore",
+     enable_json_export=True,
+     enable_excel_export=True
+ )
+
+ print(past_matches_df)
+ ```
+
+ Parameters:
+
+ * `tournament_id` (int): The unique identifier for the tournament.
+ * `season_id` (int): The unique identifier for the season.
+ * `week_number` (int): The matchweek number within the season.
+ * `data_source` (str): The data source (`sofavpn` or `sofascore`). Defaults to `sofascore`.
+ * `element_load_timeout` (int): The maximum time (in seconds) to wait for the API response. Defaults to 10.
+ * `enable_json_export` (bool): If `True`, exports the fetched data as a JSON file. Defaults to `False`.
+ * `enable_excel_export` (bool): If `True`, exports the fetched data as an Excel file. Defaults to `False`.
+
+ Data Structure:
+
+ The returned DataFrame includes the following columns:
+
+ * `country`: The country where the tournament is held.
+ * `tournament`: The name of the tournament.
+ * `season`: The season year.
+ * `week`: The matchweek number.
+ * `game_id`: The unique identifier for the match.
+ * `home_team`: The name of the home team.
+ * `home_team_id`: The unique identifier for the home team.
+ * `away_team`: The name of the away team.
+ * `away_team_id`: The unique identifier for the away team.
+ * `injury_time_1`: Added injury time in the first half.
+ * `injury_time_2`: Added injury time in the second half.
+ * `start_timestamp`: The start time of the match.
+ * `status`: The current status of the match.
+ * `home_score_current`: The latest recorded score for the home team.
+ * `home_score_display`: The displayed score of the home team.
+ * `home_score_period1`: The home team's score at the end of the first half.
+ * `home_score_period2`: The home team's goals scored in the second half.
+ * `home_score_normaltime`: The home team's final score at the end of normal time (90 minutes).
+ * `away_score_current`: The latest recorded score for the away team.
+ * `away_score_display`: The displayed score of the away team.
+ * `away_score_period1`: The away team's score at the end of the first half.
+ * `away_score_period2`: The away team's goals scored in the second half.
+ * `away_score_normaltime`: The away team's final score at the end of normal time (90 minutes).
+
+ Dependencies:
+
+ * No prior function dependency required.
+
  ### Player Data
 
  #### `lineups_data`
 
- The `lineups_data` function fetches lineup data for each match in the provided match dataset. It returns a DataFrame containing lineup details such as country, tournament name, season, week number, game ID, team, player name, player ID, statistic name, and statistic value.
+ The `lineups_data` function fetches lineup data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch lineups data based on match data
  lineups_df = lineups_data(
      match_df=match_df,
      data_source="sofascore",
@@ -376,12 +433,11 @@ Dependencies:
 
  #### `coordinates_data`
 
- The `coordinates_data` function fetches coordinate data for each player in the provided lineup dataset. It returns a DataFrame containing coordinate details such as country, tournament name, season, week number, game ID, team, player ID, player name, and x-y coordinates.
+ The `coordinates_data` function fetches coordinate data for each player in the provided lineup dataset.
 
  Example Usage:
 
  ```python
- # Fetch coordinates data
  coordinates_df = coordinates_data(
      lineups_df=lineups_df,
      data_source="sofascore",
@@ -421,12 +477,11 @@ Dependencies:
 
  #### `substitutions_data`
 
- The `substitutions_data` function fetches substitution data for each match in the provided match dataset. It returns a DataFrame containing details of player substitutions, including the players involved and the time of the substitution.
+ The `substitutions_data` function fetches substitution data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch substitution data
  substitutions_df = substitutions_data(
      match_df=match_df,
      data_source="sofascore",
@@ -468,12 +523,11 @@ Dependencies:
 
  #### `goal_networks_data`
 
- The `goal_networks_data` function fetches goal network data for each match in the provided match dataset. It returns a DataFrame containing goal-related events, including passing networks and shot locations.
+ The `goal_networks_data` function fetches goal network data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch goal networks data
  goal_networks_df = goal_networks_data(
      match_df=match_df,
      data_source="sofascore",
@@ -525,12 +579,11 @@ Dependencies:
 
  #### `shots_data`
 
- The `shots_data` function fetches shot data for each match in the provided match dataset. It returns a DataFrame containing detailed shot-related information, including player coordinates, xG values, shot types, and goal mouth locations.
+ The `shots_data` function fetches shot data for each match in the provided match dataset.
 
  Example Usage:
 
  ```python
- # Fetch shot data
  shots_df = shots_data(
      match_df=match_df,
      data_source="sofascore",
@@ -592,12 +645,11 @@ Dependencies:
 
  #### `standings_data`
 
- The `standings_data` function fetches league standings for a specific tournament and season. It returns a DataFrame containing team rankings, match results, and points.
+ The `standings_data` function fetches league standings for a specific tournament and season.
 
  Example Usage:
 
  ```python
- # Fetch league standings
  standings_df = standings_data(
      tournament_id=52,
      season_id=63814,
@@ -642,6 +694,9 @@ Dependencies:
 
  ## Changelog
 
+ * v1.3.0
+ * Added `past_matches_data` function to fetch historical match data.
+
  * v1.2.0
  * Added match score columns to `match_data`
 
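The column list that the README diff documents for `past_matches_data` amounts to flattening each nested event record into one flat row. A minimal sketch of that flattening for a subset of the columns, using the nested key names (`roundInfo`, `homeTeam`, `homeScore`, and so on) that appear in the new module in this release. The helper name `flatten_event` and the sample record are hypothetical, and unlike the package code this sketch uses `.get` for every level so a missing block yields an empty string rather than a `KeyError`:

```python
from typing import Any, Dict


def flatten_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one nested event record into a flat row (subset of the documented columns)."""
    home_score = event.get("homeScore", {})
    away_score = event.get("awayScore", {})
    return {
        "week": event.get("roundInfo", {}).get("round", ""),
        "game_id": event.get("id", ""),
        "home_team": event.get("homeTeam", {}).get("name", ""),
        "home_team_id": event.get("homeTeam", {}).get("id", ""),
        "away_team": event.get("awayTeam", {}).get("name", ""),
        "away_team_id": event.get("awayTeam", {}).get("id", ""),
        "home_score_period1": home_score.get("period1", ""),
        "home_score_period2": home_score.get("period2", ""),
        "home_score_normaltime": home_score.get("normaltime", ""),
        "away_score_period1": away_score.get("period1", ""),
        "away_score_period2": away_score.get("period2", ""),
        "away_score_normaltime": away_score.get("normaltime", ""),
    }


# Made-up placeholder record, only to exercise the function; not real match data.
sample_event = {
    "id": 1,
    "roundInfo": {"round": 21},
    "homeTeam": {"name": "Home FC", "id": 10},
    "awayTeam": {"name": "Away FC", "id": 20},
    "homeScore": {"period1": 1, "period2": 2, "normaltime": 3},
    "awayScore": {"period1": 0, "period2": 1},  # "normaltime" deliberately absent
}

row = flatten_event(sample_event)
```

Fields absent from the record come back as empty strings, which mirrors the `""` defaults used throughout the package's own dictionary.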
@@ -8,6 +8,7 @@ from .fetch_coordinates_data import coordinates_data
  from .fetch_substitutions_data import substitutions_data
  from .fetch_match_odds_data import match_odds_data
  from .fetch_momentum_data import momentum_data
+ from .fetch_past_matches_data import past_matches_data
 
  __all__ = [
      "match_data",
@@ -19,5 +20,6 @@ __all__ = [
      "coordinates_data",
      "substitutions_data",
      "match_odds_data",
-     "momentum_data"
+     "momentum_data",
+     "past_matches_data"
  ]
@@ -0,0 +1,142 @@
+ import json
+ import pandas as pd
+ from selenium.webdriver.common.by import By
+ from selenium.webdriver.support.ui import WebDriverWait
+ from selenium.webdriver.support import expected_conditions as EC
+ from selenium.common.exceptions import TimeoutException, WebDriverException
+ from datafc.utils._setup_webdriver import setup_webdriver
+ from datafc.utils._save_files import save_json, save_excel
+ from datafc.utils._config import ALLOWED_SOURCES, API_BASE_URLS
+
+ def past_matches_data(
+     tournament_id: int,
+     season_id: int,
+     week_number: int,
+     data_source: str = "sofascore",
+     element_load_timeout: int = 10,
+     enable_json_export: bool = False,
+     enable_excel_export: bool = False
+ ) -> pd.DataFrame:
+     """
+     Fetches past match data for a specified tournament, season, and week number.
+
+     Args:
+         tournament_id (int): The unique identifier for the tournament.
+         season_id (int): The unique identifier for the season.
+         week_number (int): The matchweek number within the season.
+         data_source (str): The data source ('sofavpn' or 'sofascore'). Defaults to 'sofascore'.
+         element_load_timeout (int): The maximum time (in seconds) to wait for the API response. Defaults to 10.
+         enable_json_export (bool): If `True`, exports the fetched data as a JSON file. Defaults to `False`.
+         enable_excel_export (bool): If `True`, exports the fetched data as an Excel file. Defaults to `False`.
+     """
+     if data_source not in ALLOWED_SOURCES:
+         raise ValueError(f"Invalid data source: {data_source}. Must be one of {ALLOWED_SOURCES}")
+
+     api_request_url = f"{API_BASE_URLS[data_source]}/api/v1/unique-tournament/{tournament_id}/season/{season_id}/events/round/{week_number}"
+
+     try:
+         webdriver_instance = setup_webdriver()
+         webdriver_instance.get(api_request_url)
+
+         response_element = WebDriverWait(webdriver_instance, element_load_timeout).until(
+             EC.visibility_of_element_located((By.TAG_NAME, "pre"))
+         )
+         response_text = response_element.text.strip()
+         if not response_text:
+             raise RuntimeError("API response is empty.")
+
+         api_response_data = json.loads(response_text)
+         if "events" not in api_response_data or not isinstance(api_response_data["events"], list):
+             raise ValueError("Invalid API response format: 'events' key is missing or not a list.")
+
+         events_df = pd.DataFrame(api_response_data.get("events", []))
+         if events_df.empty:
+             raise ValueError("No match data found for the specified parameters.")
+
+         fn_country = events_df.iloc[0]["tournament"].get("category", {}).get("name", "")
+         fn_tournament = events_df.iloc[0]["tournament"].get("name", "")
+         fn_season = events_df.iloc[0]["season"].get("year", "")
+         fn_week = week_number
+
+         custom_ids = events_df["customId"].tolist()
+         all_matches_data = []
+
+         for custom_id in custom_ids:
+             h2h_url = f"{API_BASE_URLS[data_source + '2']}/api/v1/event/{custom_id}/h2h/events"
+             webdriver_instance.get(h2h_url)
+             h2h_response_element = WebDriverWait(webdriver_instance, element_load_timeout).until(
+                 EC.visibility_of_element_located((By.TAG_NAME, "pre"))
+             )
+             h2h_response_text = h2h_response_element.text.strip()
+             if not h2h_response_text:
+                 continue
+
+             h2h_data = json.loads(h2h_response_text)
+             if "events" not in h2h_data or not isinstance(h2h_data["events"], list):
+                 continue
+
+             for event in h2h_data["events"]:
+                 match_info = {
+                     "country": event["tournament"].get("category", {}).get("name", ""),
+                     "tournament": event["tournament"].get("name", ""),
+                     "season": event["season"].get("year", ""),
+                     "week": event.get("roundInfo", {}).get("round", ""),
+                     "game_id": event.get("id", ""),
+                     "home_team": event["homeTeam"].get("name", ""),
+                     "home_team_id": event["homeTeam"].get("id", ""),
+                     "away_team": event["awayTeam"].get("name", ""),
+                     "away_team_id": event["awayTeam"].get("id", ""),
+                     "injury_time_1": event.get("time", {}).get("injuryTime1", ""),
+                     "injury_time_2": event.get("time", {}).get("injuryTime2", ""),
+                     "start_timestamp": event.get("startTimestamp", ""),
+                     "status": event["status"].get("description", ""),
+                     "home_score_current": event["homeScore"].get("current", ""),
+                     "home_score_display": event["homeScore"].get("display", ""),
+                     "home_score_period1": event["homeScore"].get("period1", ""),
+                     "home_score_period2": event["homeScore"].get("period2", ""),
+                     "home_score_normaltime": event["homeScore"].get("normaltime", ""),
+                     "away_score_current": event["awayScore"].get("current", ""),
+                     "away_score_display": event["awayScore"].get("display", ""),
+                     "away_score_period1": event["awayScore"].get("period1", ""),
+                     "away_score_period2": event["awayScore"].get("period2", ""),
+                     "away_score_normaltime": event["awayScore"].get("normaltime", "")
+                 }
+                 all_matches_data.append(match_info)
+
+         detailed_matches_df = pd.DataFrame(all_matches_data)
+
+         if enable_json_export or enable_excel_export:
+             if enable_json_export:
+                 save_json(
+                     data=detailed_matches_df,
+                     data_source=data_source,
+                     country=fn_country,
+                     tournament=fn_tournament,
+                     season=fn_season,
+                     week_number=fn_week
+                 )
+
+             if enable_excel_export:
+                 save_excel(
+                     data=detailed_matches_df,
+                     data_source=data_source,
+                     country=fn_country,
+                     tournament=fn_tournament,
+                     season=fn_season,
+                     week_number=fn_week
+                 )
+
+         return detailed_matches_df
+
+     except TimeoutException:
+         raise RuntimeError("Timeout occurred while waiting for the page or API response.")
+     except WebDriverException as e:
+         raise RuntimeError(f"Selenium WebDriver error: {str(e)}")
+     except json.JSONDecodeError:
+         raise RuntimeError("Failed to decode API response as JSON.")
+     except Exception as e:
+         raise RuntimeError(f"Unexpected error while fetching past matches data: {e.__class__.__name__} - {e}")
+
+     finally:
+         if webdriver_instance:
+             webdriver_instance.quit()
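In the new module, `webdriver_instance` is first bound inside the `try` block, while the `finally` clause reads it unconditionally. If `setup_webdriver()` itself raised, that read would fail with `UnboundLocalError` and mask the intended `RuntimeError`. A minimal sketch of the usual guard, binding the name to `None` before `try`. `FakeDriver`, `fetch_with_cleanup`, and `failing_factory` are hypothetical stand-ins so the snippet runs without Selenium; this is an illustration of the pattern, not the package's code:

```python
from typing import Callable, Optional


class FakeDriver:
    """Hypothetical stand-in for a Selenium WebDriver; not part of datafc."""

    def __init__(self) -> None:
        self.closed = False

    def quit(self) -> None:
        self.closed = True


def fetch_with_cleanup(make_driver: Callable[[], FakeDriver]) -> str:
    # Bind the name before `try`, so `finally` can always read it safely.
    driver: Optional[FakeDriver] = None
    try:
        driver = make_driver()
        return "ok"
    except Exception as exc:
        # Wrap any failure, including one raised while creating the driver.
        raise RuntimeError(f"driver error: {exc}") from exc
    finally:
        # Only quit a driver that was actually created.
        if driver is not None:
            driver.quit()


def failing_factory() -> FakeDriver:
    """Simulates driver start-up failing before any assignment happens."""
    raise OSError("browser binary not found")
```

With this shape, a start-up failure surfaces as the wrapped `RuntimeError`, and a successfully created driver is still closed on every path.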
@@ -0,0 +1,8 @@
+ ALLOWED_SOURCES = {"sofavpn", "sofascore"}
+
+ API_BASE_URLS = {
+     "sofavpn": "https://api.sofavpn.com",
+     "sofascore": "https://api.sofascore.com",
+     "sofavpn2": "https://www.sofavpn.com",
+     "sofascore2": "https://www.sofascore.com"
+ }
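The two new `...2` keys pair each data source with a second base URL, which the new module looks up as `data_source + '2'` for its head-to-head requests. A minimal sketch of how the two URL shapes resolve, reusing the dictionary values and path templates from this release. The helper names `round_events_url` and `h2h_events_url` and the `custom_id` value `"abc"` are hypothetical, introduced only for illustration:

```python
ALLOWED_SOURCES = {"sofavpn", "sofascore"}

API_BASE_URLS = {
    "sofavpn": "https://api.sofavpn.com",
    "sofascore": "https://api.sofascore.com",
    "sofavpn2": "https://www.sofavpn.com",
    "sofascore2": "https://www.sofascore.com",
}


def _check_source(data_source: str) -> None:
    # Same membership check the module performs before building any URL.
    if data_source not in ALLOWED_SOURCES:
        raise ValueError(f"Invalid data source: {data_source}")


def round_events_url(data_source: str, tournament_id: int, season_id: int, week_number: int) -> str:
    """URL for the events of one matchweek, built on the primary base URL."""
    _check_source(data_source)
    base = API_BASE_URLS[data_source]
    return f"{base}/api/v1/unique-tournament/{tournament_id}/season/{season_id}/events/round/{week_number}"


def h2h_events_url(data_source: str, custom_id: str) -> str:
    """URL for head-to-head events, built on the secondary ('<source>2') base URL."""
    _check_source(data_source)
    base = API_BASE_URLS[data_source + "2"]
    return f"{base}/api/v1/event/{custom_id}/h2h/events"


round_url = round_events_url("sofascore", 52, 63814, 21)
h2h_url = h2h_events_url("sofavpn", "abc")
```

Because `ALLOWED_SOURCES` lists only the unsuffixed names, the `...2` keys cannot be passed in as a `data_source`; they are reachable only through the suffix lookup.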
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: datafc
- Version: 1.2.0
+ Version: 1.3.0
  Summary: A scalable Python library for fetching, processing, and exporting structured football match data.
  Home-page: https://github.com/urazakgul/datafc
  Author: Uraz Akgül
@@ -14,7 +14,7 @@ Requires-Python: >=3.8
  Description-Content-Type: text/markdown
  License-File: LICENSE

- # datafc v1.2.0
+ # datafc v1.3.0

  ## Overview

@@ -53,7 +53,7 @@ pip install git+https://github.com/urazakgul/datafc.git
  To install a specific version of `datafc`, use:

  ```bash
- pip install datafc==1.2.0
+ pip install datafc==1.3.0
  ```

  If you already have `datafc` installed and want to upgrade to the latest version, run:
@@ -105,6 +105,7 @@ from datafc.sofascore import (
      match_odds_data,
      match_stats_data,
      momentum_data,
+     past_matches_data,
      lineups_data,
      coordinates_data,
      substitutions_data,
@@ -140,20 +141,19 @@ The `lineups_data` function fetches player lineup details for each match and is

  Without `lineups_data`, these dependent functions will not work as expected.

- Exception: `standings_data`
+ Exception: `standings_data` and `past_matches_data`

- Unlike the other functions, `standings_data` does not require `match_data` or `lineups_data`. It can be executed independently using only `tournament_id` and `season_id`.
+ Unlike other functions, `standings_data` and `past_matches_data` do not require `match_data` or `lineups_data`. They can be executed independently using only `tournament_id` and `season_id`. Additionally, `past_matches_data` includes an extra field: `week_number`.

  ### Match Data

  #### `match_data`

- The `match_data` function fetches match data for a specified tournament, season, and matchweek. It returns a DataFrame containing details such as country, tournament name, season, week number, game ID, home team, home team ID, away team, away team ID, added injury times for both halves, start timestamp, match status, and score-related information.
+ The `match_data` function fetches match data for a specified tournament, season, and matchweek.

  Example Usage:

  ```python
- # Fetch match data for a specific tournament, season, and week
  match_df = match_data(
      tournament_id=52,
      season_id=63814,
@@ -210,12 +210,11 @@ Dependencies:

  #### `match_odds_data`

- The `match_odds_data` function fetches betting odds data for each match in the provided match dataset. It returns a DataFrame containing match odds details, including market names, odds values, and whether the odds changed.
+ The `match_odds_data` function fetches betting odds data for each match in the provided match dataset.

  Example Usage:

  ```python
- # Fetch match odds data
  match_odds_df = match_odds_data(
      match_df=match_df,
      data_source="sofascore",
@@ -258,12 +257,11 @@ Dependencies:

  #### `match_stats_data`

- The `match_stats_data` function fetches statistical data for each match in the provided match dataset. It returns a DataFrame containing key match statistics, including team performance metrics.
+ The `match_stats_data` function fetches statistical data for each match in the provided match dataset.

  Example Usage:

  ```python
- # Fetch match statistics data
  match_stats_df = match_stats_data(
      match_df=match_df,
      data_source="sofascore",
@@ -303,12 +301,11 @@ Dependencies:

  #### `momentum_data`

- The `momentum_data` function fetches momentum data for each match in the provided match dataset. It returns a DataFrame containing match momentum values over time.
+ The `momentum_data` function fetches momentum data for each match in the provided match dataset.

  Example Usage:

  ```python
- # Fetch momentum data
  momentum_df = momentum_data(
      match_df=match_df,
      data_source="sofascore",
@@ -343,16 +340,76 @@ Dependencies:

  * Requires `match_data` output as `match_df`.

+ #### `past_matches_data`
+
+ The `past_matches_data` function fetches past match data for a specified tournament, season, and week number.
+
+ Example Usage:
+
+ ```python
+ past_matches_df = past_matches_data(
+     tournament_id=52,
+     season_id=63814,
+     week_number=21,
+     data_source="sofascore",
+     enable_json_export=True,
+     enable_excel_export=True
+ )
+
+ print(past_matches_df)
+ ```
+
+ Parameters:
+
+ * `tournament_id` (int): The unique identifier for the tournament.
+ * `season_id` (int): The unique identifier for the season.
+ * `week_number` (int): The matchweek number within the season.
+ * `data_source` (str): The data source (`sofavpn` or `sofascore`). Defaults to `sofascore`.
+ * `element_load_timeout` (int): The maximum time (in seconds) to wait for the API response. Defaults to 10.
+ * `enable_json_export` (bool): If `True`, exports the fetched data as a JSON file. Defaults to `False`.
+ * `enable_excel_export` (bool): If `True`, exports the fetched data as an Excel file. Defaults to `False`.
+
+ Data Structure:
+
+ The returned DataFrame includes the following columns:
+
+ * `country`: The country where the tournament is held.
+ * `tournament`: The name of the tournament.
+ * `season`: The season year.
+ * `week`: The matchweek number.
+ * `game_id`: The unique identifier for the match.
+ * `home_team`: The name of the home team.
+ * `home_team_id`: The unique identifier for the home team.
+ * `away_team`: The name of the away team.
+ * `away_team_id`: The unique identifier for the away team.
+ * `injury_time_1`: Added injury time in the first half.
+ * `injury_time_2`: Added injury time in the second half.
+ * `start_timestamp`: The start time of the match.
+ * `status`: The current status of the match.
+ * `home_score_current`: The latest recorded score for the home team.
+ * `home_score_display`: The displayed score of the home team.
+ * `home_score_period1`: The home team's score at the end of the first half.
+ * `home_score_period2`: The home team's goals scored in the second half.
+ * `home_score_normaltime`: The home team's final score at the end of normal time (90 minutes).
+ * `away_score_current`: The latest recorded score for the away team.
+ * `away_score_display`: The displayed score of the away team.
+ * `away_score_period1`: The away team's score at the end of the first half.
+ * `away_score_period2`: The away team's goals scored in the second half.
+ * `away_score_normaltime`: The away team's final score at the end of normal time (90 minutes).
+
+ Dependencies:
+
+ * No prior function dependency required.
+
  ### Player Data

  #### `lineups_data`

- The `lineups_data` function fetches lineup data for each match in the provided match dataset. It returns a DataFrame containing lineup details such as country, tournament name, season, week number, game ID, team, player name, player ID, statistic name, and statistic value.
+ The `lineups_data` function fetches lineup data for each match in the provided match dataset.

  Example Usage:

  ```python
- # Fetch lineups data based on match data
  lineups_df = lineups_data(
      match_df=match_df,
      data_source="sofascore",
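Note on the `past_matches_data` columns documented in the hunk above: the new fetcher reads the away-score fields with `.get(..., "")`, so a score the API omits arrives as an empty string rather than a number. The sketch below shows one way a consumer might handle that when deriving a result from `home_score_normaltime` and `away_score_normaltime`. It works on plain dicts, as `past_matches_df.to_dict("records")` would produce. `normaltime_result` is a hypothetical helper, and it assumes the home-score fields use the same empty-string fallback, which this part of the diff does not show.

```python
def normaltime_result(row):
    """Return 'home', 'away', or 'draw', or None when a score is missing."""
    home = row.get("home_score_normaltime", "")
    away = row.get("away_score_normaltime", "")

    # Assumption: a missing score is "" (the fetcher's fallback) or None.
    if home in ("", None) or away in ("", None):
        return None

    home, away = int(home), int(away)
    if home > away:
        return "home"
    if away > home:
        return "away"
    return "draw"


# Hand-written sample rows; the values are illustrative, not real match data.
sample_rows = [
    {"home_score_normaltime": 2, "away_score_normaltime": 1},
    {"home_score_normaltime": "", "away_score_normaltime": ""},
    {"home_score_normaltime": 1, "away_score_normaltime": 1},
]
results = [normaltime_result(row) for row in sample_rows]
```

Checking for the empty string before converting avoids a `ValueError` from `int("")` on matches that have no normal-time score recorded yet.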
@@ -392,12 +449,11 @@ Dependencies:

  #### `coordinates_data`

- The `coordinates_data` function fetches coordinate data for each player in the provided lineup dataset. It returns a DataFrame containing coordinate details such as country, tournament name, season, week number, game ID, team, player ID, player name, and x-y coordinates.
+ The `coordinates_data` function fetches coordinate data for each player in the provided lineup dataset.

  Example Usage:

  ```python
- # Fetch coordinates data
  coordinates_df = coordinates_data(
      lineups_df=lineups_df,
      data_source="sofascore",
@@ -437,12 +493,11 @@ Dependencies:

  #### `substitutions_data`

- The `substitutions_data` function fetches substitution data for each match in the provided match dataset. It returns a DataFrame containing details of player substitutions, including the players involved and the time of the substitution.
+ The `substitutions_data` function fetches substitution data for each match in the provided match dataset.

  Example Usage:

  ```python
- # Fetch substitution data
  substitutions_df = substitutions_data(
      match_df=match_df,
      data_source="sofascore",
@@ -484,12 +539,11 @@ Dependencies:

  #### `goal_networks_data`

- The `goal_networks_data` function fetches goal network data for each match in the provided match dataset. It returns a DataFrame containing goal-related events, including passing networks and shot locations.
+ The `goal_networks_data` function fetches goal network data for each match in the provided match dataset.

  Example Usage:

  ```python
- # Fetch goal networks data
  goal_networks_df = goal_networks_data(
      match_df=match_df,
      data_source="sofascore",
@@ -541,12 +595,11 @@ Dependencies:

  #### `shots_data`

- The `shots_data` function fetches shot data for each match in the provided match dataset. It returns a DataFrame containing detailed shot-related information, including player coordinates, xG values, shot types, and goal mouth locations.
+ The `shots_data` function fetches shot data for each match in the provided match dataset.

  Example Usage:

  ```python
- # Fetch shot data
  shots_df = shots_data(
      match_df=match_df,
      data_source="sofascore",
@@ -608,12 +661,11 @@ Dependencies:

  #### `standings_data`

- The `standings_data` function fetches league standings for a specific tournament and season. It returns a DataFrame containing team rankings, match results, and points.
+ The `standings_data` function fetches league standings for a specific tournament and season.

  Example Usage:

  ```python
- # Fetch league standings
  standings_df = standings_data(
      tournament_id=52,
      season_id=63814,
@@ -658,6 +710,9 @@ Dependencies:

  ## Changelog

+ * v1.3.0
+   * Added `past_matches_data` function to fetch historical match data.
+
  * v1.2.0
    * Added match score columns to `match_data`

@@ -15,6 +15,7 @@ datafc/sofascore/fetch_match_data.py
  datafc/sofascore/fetch_match_odds_data.py
  datafc/sofascore/fetch_match_stats_data.py
  datafc/sofascore/fetch_momentum_data.py
+ datafc/sofascore/fetch_past_matches_data.py
  datafc/sofascore/fetch_shots_data.py
  datafc/sofascore/fetch_standings_data.py
  datafc/sofascore/fetch_substitutions_data.py
@@ -5,7 +5,7 @@ with open("README.md", "r", encoding="utf-8") as fh:

  setup(
      name="datafc",
-     version="1.2.0",
+     version="1.3.0",
      author="Uraz Akgül",
      author_email="urazdev@gmail.com",
      description="A scalable Python library for fetching, processing, and exporting structured football match data.",
@@ -1,6 +0,0 @@
- ALLOWED_SOURCES = {"sofavpn", "sofascore"}
-
- API_BASE_URLS = {
-     "sofavpn": "https://api.sofavpn.com",
-     "sofascore": "https://api.sofascore.com"
- }
File without changes