@tricoteuses/assemblee 2.2.1 → 2.3.0

package/README.md CHANGED
@@ -42,53 +42,74 @@ npm install
 
  ### Basic usage
 
- Create a folder where the data will be downloaded and run the following command to download, reorganize and clean the data.
-
+ Create a directory to store the data, then run the following command to download, reorganize and clean the data.
  ```bash
  mkdir ../assemblee-data/
-
- # Download and clean open data
  npm run data:download ../assemblee-data
  ```
 
- Data from other sources is also available :
-
- ```bash
- # Retrieval of députés' pictures from Assemblée nationale's website
- npm run data:retrieve_deputes_photos ../assemblee-data
+ ### Available Commands
 
- # Retrieval of sénateurs' pictures from Assemblée nationale's website
- npm run data:retrieve_senateurs_photos ../assemblee-data
+ - `npm run data:download <dir>`: Download, reorganize, and clean data
+ - `npm run data:retrieve_open_data <dir>`: Download raw data files.
+ - `npm run data:reorganize_data <dir>`: Reorganize raw files by entity.
+ - `npm run data:clean_data <dir>`: Clean and validate reorganized files.
+ - `npm run data:retrieve_deputes_photos <dir>`: Retrieval of députés' pictures from Assemblée nationale's website
+ - `npm run data:retrieve_senateurs_photos <dir>`: Retrieval of sénateurs' pictures from Assemblée nationale's website
+ - `npm run data:retrieve_documents <dir>`: Retrieval of legislative documents from Assemblée nationale's website
+ - `npm run data:retrieve_pending_amendements <dir>`: Retrieval of pending amendments from Assemblée nationale's website (waiting to be processed by Assemblée services)
 
- # Retrieval of pending amendments from Assemblée nationale's website (waiting to be processed by Assemblée services)
- npm run data:retrieve_pending_amendements ../assemblee-data
- ```
 
  _Notes_:
 
  - Reorganized files (generated by the _data:reorganize_data_ command) are also available in [Tricoteuses / Data / Données brutes de l'Assemblée](https://git.en-root.org/tricoteuses/data/assemblee-brut). They are updated on a regular basis.
  - Split & cleaned files (generated by the _data:clean_data_ command) are also available in [Tricoteuses / Data / Données nettoyées de l'Assemblée](https://git.en-root.org/tricoteuses/data/assemblee-nettoye) with the `_nettoye` suffix. They are updated on a regular basis.
 
- ### Filtering options
+ ### Filtering Options
 
  Downloading and cleaning all the data is long and takes up a lot of disk space. It is possible to choose the type of data that you want to retrieve to reduce the load.
 
- To download only a type of dataset, use the _--categories_ option (shortcut _-k_) :
+ Examples:
 
  ```bash
- # Available options : ActeursEtOrganes, Agendas, Amendements, DossiersLegislatifs, Photos, Scrutins, Questions, ComptesRendusSeances
- npm run data:download ../assemblee-data -- --categories Amendements
- ```
+ # Only download amendments
+ npm run data:download ../assemblee-data -- -k Amendements
 
- To download a specific or multiple legislatures, use the *--legislature* option (shortcut *-l*):
- ```bash
- # Available options : 14, 15, 16, 17
+ # Only process 16th and 17th legislatures
  npm run data:download ../assemblee-data -- -l 16 -l 17
-
  ```
 
+ ### Common Options
+
+ - `--categories` or `-k <name>`: Filter by dataset categories (Available options : `ActeursEtOrganes`, `Agendas`, `Amendements`, `DossiersLegislatifs`, `Photos`, `Scrutins`, `Questions`, `ComptesRendusSeances`)
+
+ - `--legislature` or `-l <number>`: Specify one or more legislatures to process (e.g., `-l 15 -l 16`)
+ - `--dataDir <path>` (Mandatory): Path to the working directory where all data is stored (required)
+ - `--silent` or `-s`: Disable logging
+ - `--verbose` or `-v`: Enable verbose logging
+ - `--fetch` or `-f`: Force re-download of data even if already present
+ - `--commit` or `-c`: Automatically commit cleaned data
+ - `--pull` or `-p`: Pull repositories before starting
+ - `--clone` or `-C <url>`: Clone Git repositories from a remote group or organization
+ - `--remote` or `-r <name>`: Push commits to specified Git remote(s)
+
  If you use such options, use them in all subsequent commands too (_data:reorganize_data_ and _data:clean_data_).
 
+ ### Options for Cleaning Data
+
+ - `--dataset` or `-d <name>`: Clean a specific dataset only
+ - `--no-reset-after-commit`: Skip Git reset after committing (useful to preserve local changes)
+ - `--no-validate` or `-V`: Skip schema validation during cleaning
+ - `--fullCompteRenduCommissions`: Force reprocessing of commission reports
+ - `--fetchDocuments` : Specify to retrieve documents like reports, videos metadata files
+ - `--parseDocuments`: Specify to parse documents into cleaned json
+
+ ### Options for Retrieving Documents
+
+ - `--full` or `-f`: Retrieve all documents, even those already downloaded
+ - `--document-type` or `-T <type>`: Restrict to specific document types (e.g., `PION`)
+
+
  ## Download using Docker
 
  A Docker image that downloads and cleans the data all at once is available. Build it locally or run it from the container registry.
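
The README hunk above asks that any filter options be repeated in every subsequent command. A minimal sketch of one way to do that is a small wrapper that keeps the options in a single variable and passes them to each step. The helper names (`run_step`, `run_pipeline`) and the variables `DATA_DIR`, `FILTERS` and `DRY_RUN` are hypothetical and not part of the package. The script names and flags are taken from the README, and the sketch assumes each step accepts the directory and the filter flags the same way `data:download` does. By default it only prints the commands.

```bash
# Sketch only: run_step, run_pipeline, DATA_DIR, FILTERS and DRY_RUN are
# hypothetical names, not part of @tricoteuses/assemblee.
set -eu

DATA_DIR="${DATA_DIR:-../assemblee-data}"
# Filters are kept in one place so that every step receives the same ones.
FILTERS="-k Amendements -l 16 -l 17"
# DRY_RUN=1 (the default) prints the commands; DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
  # $1 is the npm script name, e.g. data:clean_data.
  if [ "$DRY_RUN" = "1" ]; then
    echo "npm run $1 $DATA_DIR -- $FILTERS"
  else
    # FILTERS is left unquoted on purpose so it splits into separate flags.
    npm run "$1" "$DATA_DIR" -- $FILTERS
  fi
}

run_pipeline() {
  # Assumed order: download raw files, reorganize them, then clean them.
  run_step data:retrieve_open_data
  run_step data:reorganize_data
  run_step data:clean_data
}

run_pipeline
```

Running it with `DRY_RUN=0` would execute the three steps in order. According to the README, `data:download` already covers download, reorganize and clean, so a wrapper like this is mainly useful when the steps are run separately.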
@@ -1 +1,2 @@
+ export declare function cleanDecompteVoix(decompteVoix: any): void;
  export declare function cleanScrutin(scrutin: any): void;