eqcctpro 0.5.5__tar.gz → 0.5.7__tar.gz
- {eqcctpro-0.5.5 → eqcctpro-0.5.7}/PKG-INFO +124 -51
- {eqcctpro-0.5.5 → eqcctpro-0.5.7}/README.md +123 -50
- {eqcctpro-0.5.5 → eqcctpro-0.5.7}/eqcctpro.egg-info/PKG-INFO +124 -51
- {eqcctpro-0.5.5 → eqcctpro-0.5.7}/pyproject.toml +1 -1
- {eqcctpro-0.5.5 → eqcctpro-0.5.7}/eqcctpro/__init__.py +0 -0
- {eqcctpro-0.5.5 → eqcctpro-0.5.7}/eqcctpro.egg-info/SOURCES.txt +0 -0
- {eqcctpro-0.5.5 → eqcctpro-0.5.7}/eqcctpro.egg-info/dependency_links.txt +0 -0
- {eqcctpro-0.5.5 → eqcctpro-0.5.7}/eqcctpro.egg-info/requires.txt +0 -0
- {eqcctpro-0.5.5 → eqcctpro-0.5.7}/eqcctpro.egg-info/top_level.txt +0 -0
- {eqcctpro-0.5.5 → eqcctpro-0.5.7}/setup.cfg +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: eqcctpro
|
|
3
|
-
Version: 0.5.
|
|
3
|
+
Version: 0.5.7
|
|
4
4
|
Summary: EQCCTPro: A powerful seismic event detection toolkit
|
|
5
5
|
Author-email: Constantinos Skevofilax <constantinos.skevofilax@austin.utexas.edu>, Victor Salles <victor.salles@beg.utexas.edu>
|
|
6
6
|
Project-URL: Homepage, https://pypi.org/project/eqcctpro/
|
|
@@ -126,20 +126,25 @@ For additional details and package updates, visit the **EQCCTPro PyPI page**:
|
|
|
126
126
|
### **Using Sample Waveform Data**
|
|
127
127
|
To understand how **EQCCTPro** works, it is **highly recommended** to use provided sample seismic waveform data as the data source when testing the package.
|
|
128
128
|
|
|
129
|
-
|
|
129
|
+
1-minute long sample seismic waveforms from 229 TexNet stations have been provided in the repository under the `230_stations_1_min_dt.zip` file.
|
|
130
130
|
|
|
131
131
|
### **Step 1: Unzip the Sample Waveform Data**
|
|
132
132
|
After downloading the `.zip` file through the GitHub methods above, run:
|
|
133
133
|
```sh
|
|
134
|
-
[skevofilaxc] unzip
|
|
134
|
+
[skevofilaxc] unzip 230_stations_1_min_dt.zip
|
|
135
135
|
```
|
|
136
136
|
### **Step 2: Check and Understand the Directory Structure**
|
|
137
|
-
The extracted data will contain multiple station directories:
|
|
137
|
+
The extracted data will contain a timechunk subdirectory, which in turn contains multiple station directories:
|
|
138
138
|
```sh
|
|
139
|
-
[skevofilaxc
|
|
140
|
-
|
|
141
|
-
|
|
142
|
-
BP01
|
|
139
|
+
[skevofilaxc 230_stations_1_min_dt]$ ls
|
|
140
|
+
20241215T120000Z_20241215T120100Z
|
|
141
|
+
[skevofilaxc 230_stations_1_min_dt]$ ls 20241215T120000Z_20241215T120100Z
|
|
142
|
+
237B BP01 CT02 DG02 DG10 EE04 EF07 EF54 EF63 EF69 EF77 FOAK3 FW06 FW14 HBVL LWM2 MB05 MB12 MB19 MBBB3 MID03 NM01 OG02 PB05 PB11 PB19 PB26 PB34 PB41 PB51 PB57 PH03 SA06 SGCY SN02 SN10 WB03 WB09 YK01
|
|
143
|
+
435B BRDY CV01 DG04 DKNS EF02 EF08 EF56 EF64 EF71 ELG6 FOAK4 FW07 FW15 HNDO LWM3 MB06 MB13 MB21 MBBB5 MLDN NM02 OG04 PB06 PB12 PB21 PB28 PB35 PB42 PB52 PB58 PL01 SA07 SM01 SN03 SNAG WB04 WB10
|
|
144
|
+
ALPN BW01 CW01 DG05 DRIO EF03 EF09 EF58 EF65 EF72 ET02 FW01 FW09 GV01 HP01 MB01 MB07 MB15 MB22 MBBB6 MNHN NM03 OZNA PB07 PB14 PB22 PB29 PB37 PB43 PB53 PB59 PLPT SA09 SM02 SN04 TREL WB05 WB11
|
|
145
|
+
APMT CF01 DB02 DG06 DRZT EF04 EF51 EF59 EF66 EF74 FLRS FW02 FW11 GV02 HP02 MB02 MB08 MB16 MB25 MG01 MO01 ODSA PB01 PB08 PB16 PB23 PB30 PB38 PB44 PB54 PCOS POST SAND SM03 SN07 VHRN WB06 WB12
|
|
146
|
+
AT01 CRHG DB03 DG07 EE02 EF05 EF52 EF61 EF67 EF75 FOAK1 FW04 FW12 GV03 INDO MB03 MB09 MB17 MBBB1 MID01 NGL01 OE01 PB03 PB09 PB17 PB24 PB32 PB39 PB46 PB55 PECS SA02 SD01 SM04 SN08 VW01 WB07 WTFS
|
|
147
|
+
BB01 CT01 DB04 DG09 EE03 EF06 EF53 EF62 EF68 EF76 FOAK2 FW05 FW13 GV04 LWM1 MB04 MB11 MB18 MBBB2 MID02 NGL02 OG01 PB04 PB10 PB18 PB25 PB33 PB40 PB47 PB56 PH02 SA04 SE01 SMWD SN09 WB02 WB08 WW01
|
|
143
148
|
```
|
|
144
149
|
Each subdirectory contains **mSEED** files of different waveform components:
|
|
145
150
|
```sh
|
|
@@ -175,14 +180,29 @@ To process mSEED from various seismic stations, use the **EQCCTMSeedRunner** cla
|
|
|
175
180
|
**EQCCTMSeedRunner** enables users to process multiple mSEED from a given input directory, which consists of station directories formatted as follows:
|
|
176
181
|
|
|
177
182
|
```sh
|
|
178
|
-
[skevofilaxc
|
|
179
|
-
|
|
180
|
-
|
|
181
|
-
BP01
|
|
183
|
+
[skevofilaxc 230_stations_1_min_dt]$ ls
|
|
184
|
+
20241215T120000Z_20241215T120100Z
|
|
185
|
+
[skevofilaxc 230_stations_1_min_dt]$ ls 20241215T120000Z_20241215T120100Z
|
|
186
|
+
237B BP01 CT02 DG02 DG10 EE04 EF07 EF54 EF63 EF69 EF77 FOAK3 FW06 FW14 HBVL LWM2 MB05 MB12 MB19 MBBB3 MID03 NM01 OG02 PB05 PB11 PB19 PB26 PB34 PB41 PB51 PB57 PH03 SA06 SGCY SN02 SN10 WB03 WB09 YK01
|
|
187
|
+
435B BRDY CV01 DG04 DKNS EF02 EF08 EF56 EF64 EF71 ELG6 FOAK4 FW07 FW15 HNDO LWM3 MB06 MB13 MB21 MBBB5 MLDN NM02 OG04 PB06 PB12 PB21 PB28 PB35 PB42 PB52 PB58 PL01 SA07 SM01 SN03 SNAG WB04 WB10
|
|
188
|
+
ALPN BW01 CW01 DG05 DRIO EF03 EF09 EF58 EF65 EF72 ET02 FW01 FW09 GV01 HP01 MB01 MB07 MB15 MB22 MBBB6 MNHN NM03 OZNA PB07 PB14 PB22 PB29 PB37 PB43 PB53 PB59 PLPT SA09 SM02 SN04 TREL WB05 WB11
|
|
189
|
+
APMT CF01 DB02 DG06 DRZT EF04 EF51 EF59 EF66 EF74 FLRS FW02 FW11 GV02 HP02 MB02 MB08 MB16 MB25 MG01 MO01 ODSA PB01 PB08 PB16 PB23 PB30 PB38 PB44 PB54 PCOS POST SAND SM03 SN07 VHRN WB06 WB12
|
|
190
|
+
AT01 CRHG DB03 DG07 EE02 EF05 EF52 EF61 EF67 EF75 FOAK1 FW04 FW12 GV03 INDO MB03 MB09 MB17 MBBB1 MID01 NGL01 OE01 PB03 PB09 PB17 PB24 PB32 PB39 PB46 PB55 PECS SA02 SD01 SM04 SN08 VW01 WB07 WTFS
|
|
191
|
+
BB01 CT01 DB04 DG09 EE03 EF06 EF53 EF62 EF68 EF76 FOAK2 FW05 FW13 GV04 LWM1 MB04 MB11 MB18 MBBB2 MID02 NGL02 OG01 PB04 PB10 PB18 PB25 PB33 PB40 PB47 PB56 PH02 SA04 SE01 SMWD SN09 WB02 WB08 WW01
|
|
182
192
|
```
|
|
183
|
-
Where each subdirectory is named after station code. If you wish to use create your own input directory with custom waveform mSEED files, **please follow the above naming
|
|
193
|
+
Each subdirectory is named after its station code. If you wish to create your own input directory with custom waveform mSEED files, **please follow the above naming conventions.** Otherwise, EQCCTPro will **not** work.
|
|
194
|
+
Create subdirectories for each timechunk (sub-parent directories) and for each station (child directories). The station directories should be named as shown above.
|
|
195
|
+
Each timechunk directory spans from the **start of the analysis period minus the waveform overlap** to the **end of the analysis period**, based on the defined timechunk duration.
|
|
184
196
|
|
|
185
|
-
|
|
197
|
+
For example:
|
|
198
|
+
```sh
|
|
199
|
+
[skevofilaxc 230_stations_2hr_1_hr_dt]$ ls
|
|
200
|
+
20241215T115800Z_20241215T130000Z 20241215T125800Z_20241215T140000Z
|
|
201
|
+
```
|
|
202
|
+
Each timechunk is 1 hour long, and we use a waveform overlap of 2 minutes. Hence, `20241215T115800Z_20241215T130000Z` spans from `11:58:00 to 13:00:00 UTC on Dec 15, 2024` and `20241215T125800Z_20241215T140000Z` spans from `12:58:00 to 14:00:00 UTC on Dec 15, 2024`.
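The naming rule above can be sketched in a few lines of Python. This is only an illustration: `timechunk_dir_names` is a hypothetical helper, not part of EQCCTPro, and it assumes that every chunk starts `waveform_overlap` minutes before its nominal start and that the last chunk is clipped at the end of the analysis period.

```python
from datetime import datetime, timedelta


def timechunk_dir_names(start_time, end_time, timechunk_dt, waveform_overlap):
    """Hypothetical helper (not an EQCCTPro function).

    start_time / end_time use the 'YYYY-MM-DD HH:MM:SS' format;
    timechunk_dt and waveform_overlap are given in minutes.
    """
    fmt_in = "%Y-%m-%d %H:%M:%S"
    fmt_out = "%Y%m%dT%H%M%SZ"
    start = datetime.strptime(start_time, fmt_in)
    end = datetime.strptime(end_time, fmt_in)
    step = timedelta(minutes=timechunk_dt)
    overlap = timedelta(minutes=waveform_overlap)

    names = []
    chunk_start = start
    while chunk_start < end:
        # Assumption: the final chunk is clipped at the end of the analysis period.
        chunk_end = min(chunk_start + step, end)
        # The directory name starts `overlap` before the nominal chunk start.
        padded_start = chunk_start - overlap
        names.append(f"{padded_start.strftime(fmt_out)}_{chunk_end.strftime(fmt_out)}")
        chunk_start = chunk_end
    return names


print(timechunk_dir_names("2024-12-15 12:00:00", "2024-12-15 14:00:00", 60, 2))
```

For a 2-hour analysis period with 60-minute timechunks and a 2-minute overlap, this yields the two directory names shown in the listing above.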
|
|
203
|
+
|
|
204
|
+
|
|
205
|
+
Each station subdirectory, such as PB35, is made up of mSEED files from the seismometer's different components (EX. N, E, Z):
|
|
186
206
|
```sh
|
|
187
207
|
[skevofilaxc PB35]$ ls
|
|
188
208
|
TX.PB35.00.HH1__20241215T115800Z__20241215T120100Z.mseed TX.PB35.00.HHZ__20241215T115800Z__20241215T120100Z.mseed
|
|
@@ -207,17 +227,22 @@ eqcct_runner = EQCCTMSeedRunner(
|
|
|
207
227
|
S_threshold=0.02,
|
|
208
228
|
p_model_filepath='/path/to/model_p.h5',
|
|
209
229
|
s_model_filepath='/path/to/model_s.h5',
|
|
210
|
-
|
|
230
|
+
number_of_concurrent_station_predictions=5,
|
|
231
|
+
number_of_concurrent_timechunk_predictions=2,
|
|
211
232
|
best_usecase_config=True,
|
|
212
233
|
csv_dir='/path/to/csv',
|
|
213
234
|
selected_gpus=[0],
|
|
214
235
|
set_vram_mb=24750,
|
|
215
|
-
specific_stations='AT01, BP01, DG05'
|
|
216
|
-
|
|
236
|
+
specific_stations='AT01, BP01, DG05',
|
|
237
|
+
start_time='2024-12-14 12:00:00',
|
|
238
|
+
end_time='2024-12-15 12:00:00',
|
|
239
|
+
timechunk_dt=1,
|
|
240
|
+
waveform_overlap=2)
|
|
241
|
+
|
|
217
242
|
eqcct_runner.run_eqcctpro()
|
|
218
243
|
```
|
|
219
244
|
|
|
220
|
-
**EQCCTMseedRunner** has multiple input
|
|
245
|
+
**EQCCTMSeedRunner** has multiple input parameters that need to be configured. They are defined below:
|
|
221
246
|
|
|
222
247
|
- **`use_gpu (bool)`: True or False**
|
|
223
248
|
- Tells Ray to use either the GPU(s) (True) or CPUs (False) on your computer to process the waveforms in the entire workflow
|
|
@@ -232,7 +257,7 @@ eqcct_runner.run_eqcctpro()
|
|
|
232
257
|
- "I want this program to run only on these specific cores."
|
|
233
258
|
- **`input_dir (str)`**
|
|
234
259
|
- Directory path to the mSEED directory
|
|
235
|
-
- EX. `/home/skevofilaxc/my_work_directory/eqcct/eqcctpro/
|
|
260
|
+
- EX. `/home/skevofilaxc/my_work_directory/eqcct/eqcctpro/230_stations_1_min_dt`
|
|
236
261
|
- **`output_dir (str)`**
|
|
237
262
|
- Directory path to where the output picks and logs will be sent
|
|
238
263
|
- Doesn't need to exist, will be created if doesn't exist
|
|
@@ -249,13 +274,16 @@ eqcct_runner.run_eqcctpro()
|
|
|
249
274
|
- Filepath to where the P EQCCT detection model is stored
|
|
250
275
|
- **`s_model_filepath (str)`**
|
|
251
276
|
- Filepath to where the S EQCCT detection model is stored
|
|
252
|
-
- **`
|
|
277
|
+
- **`number_of_concurrent_station_predictions (int)`**
|
|
253
278
|
- The number of concurrent EQCCT detection tasks that can happen simultaneously on a given number of resources
|
|
254
|
-
- EX. if
|
|
255
|
-
-
|
|
279
|
+
- EX. if number_of_concurrent_station_predictions = 5, up to 5 EQCCT instances can simultaneously analyze waveforms from 5 distinct seismic stations
|
|
280
|
+
- To use the optimal parameter value for this param, use the **EvaluateSystem** class (can be found below)
|
|
281
|
+
- **`number_of_concurrent_timechunk_predictions (int)`: default = None**
|
|
282
|
+
- The number of timechunks running in parallel
|
|
283
|
+
- Avoids the sequential processing of timechunks by processing multiple timechunks in parallel, which can substantially reduce runtime
|
|
256
284
|
- **`best_usecase_config (bool)`: default = False**
|
|
257
|
-
- If True, will override inputted cpu_id_list, number_of_concurrent_predictions, intra_threads, inter_threads values for the best overall
|
|
258
|
-
- Best overall
|
|
285
|
+
- If True, will override inputted cpu_id_list, number_of_concurrent_predictions, intra_threads, inter_threads values for the best overall use-case configurations
|
|
286
|
+
- Best overall use-case configurations are defined as the best overall input configurations that minimize runtime while doing the most amount of processing with your available hardware
|
|
259
287
|
- Can only be used if EvaluateSystem has been run
|
|
260
288
|
- **`csv_dir (str)`**
|
|
261
289
|
- Directory path containing the CSV's outputted by EvaluateSystem that contain the trial data that will be used to find the best_usecase_config
|
|
@@ -268,12 +296,26 @@ eqcct_runner.run_eqcctpro()
|
|
|
268
296
|
- Must be a real value that is based on your hardware's physical memory space, if it exceeds the space the code will break due to **OutOfMemoryError**
|
|
269
297
|
- **`specific_stations (str)`: default = None**
|
|
270
298
|
- String that contains the "list" of stations you want to only analyze
|
|
271
|
-
- EX. Out of the 50 sample stations in `
|
|
299
|
+
- EX. Out of the 50 sample stations in `230_stations_1_min_dt`, if I only want to analyze AT01, BP01, DG05, then specific_stations='AT01, BP01, DG05'.
|
|
272
300
|
- Removes the need to move station directories around to be used as input, can contain all stations in one directory for access
|
|
273
|
-
- **`
|
|
274
|
-
-
|
|
275
|
-
-
|
|
276
|
-
|
|
301
|
+
- **`start_time (str)`: default = None**
|
|
302
|
+
- The start time of the period being analyzed
|
|
303
|
+
- EX. 2024-12-14 12:00:00
|
|
304
|
+
- Must follow the format YYYY-MM-DD HH:MM:SS
|
|
305
|
+
- Used to create a list of defined timechunks from the defined analysis timeframe
|
|
306
|
+
- Also used in the EvaluateSystem() class to help users note the analysis timeframe in the results CSV file for future result review
|
|
307
|
+
- **`end_time (str)`: default = None**
|
|
308
|
+
- The end time of the period being analyzed
|
|
309
|
+
- EX. 2024-12-15 12:00:00
|
|
310
|
+
- Must follow the format YYYY-MM-DD HH:MM:SS
|
|
311
|
+
- Used to create a list of defined timechunks from the defined analysis timeframe
|
|
312
|
+
- Also used in the EvaluateSystem() class to help users note the analysis timeframe in the results CSV file for future result review
|
|
313
|
+
- **`timechunk_dt (int)`: default = None**
|
|
314
|
+
- The length of each timechunk (in minutes)
|
|
315
|
+
- EX. if timechunk_dt = 10 and the analysis period is 30 minutes, three 10-minute timechunks will be created
|
|
316
|
+
- **`waveform_overlap (int)`: default = None**
|
|
317
|
+
- The duration (in minutes) for which each waveform overlaps with the others
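
As a rough illustration of how the time and station parameters above fit together, the sketch below checks the `start_time` / `end_time` format, counts the timechunks that a given `timechunk_dt` produces, and splits a `specific_stations` string. Both function names are hypothetical and are not part of EQCCTPro. Rounding a partial final chunk up to a full chunk is an assumption.

```python
import math
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # e.g. '2024-12-14 12:00:00'


def count_timechunks(start_time, end_time, timechunk_dt):
    """Hypothetical helper: number of timechunks in the analysis period."""
    # strptime raises ValueError if a string does not match the format.
    start = datetime.strptime(start_time, TIME_FORMAT)
    end = datetime.strptime(end_time, TIME_FORMAT)
    if end <= start:
        raise ValueError("end_time must be later than start_time")
    total_minutes = (end - start).total_seconds() / 60
    # Assumption: a partial final chunk still counts as one timechunk.
    return math.ceil(total_minutes / timechunk_dt)


def parse_station_list(specific_stations):
    """Hypothetical helper: split 'AT01, BP01, DG05' into station codes."""
    return [code.strip() for code in specific_stations.split(",") if code.strip()]


print(count_timechunks("2024-12-15 12:00:00", "2024-12-15 12:30:00", 10))
print(parse_station_list("AT01, BP01, DG05"))
```

With a 30-minute analysis period and `timechunk_dt = 10`, the count is three, matching the example given for `timechunk_dt`.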
|
|
318
|
+
|
|
277
319
|
|
|
278
320
|
---
|
|
279
321
|
|
|
@@ -295,10 +337,10 @@ eval_gpu = EvaluateSystem(
|
|
|
295
337
|
S_threshold=0.02,
|
|
296
338
|
p_model_filepath='/path/to/model_p.h5',
|
|
297
339
|
s_model_filepath='/path/to/model_s.h5',
|
|
298
|
-
stations2use=2,
|
|
299
340
|
cpu_id_list=[0,1],
|
|
300
341
|
set_vram_mb=24750,
|
|
301
|
-
selected_gpus=[0]
|
|
342
|
+
selected_gpus=[0],
|
|
343
|
+
stations2use=2
|
|
302
344
|
)
|
|
303
345
|
eval_gpu.evaluate()
|
|
304
346
|
```
|
|
@@ -318,17 +360,24 @@ eval_cpu = EvaluateSystem(
|
|
|
318
360
|
S_threshold=0.02,
|
|
319
361
|
p_model_filepath='/path/to/model_p.h5',
|
|
320
362
|
s_model_filepath='/path/to/model_s.h5',
|
|
321
|
-
|
|
322
|
-
|
|
323
|
-
|
|
363
|
+
cpu_id_list=range(0,49),
|
|
364
|
+
min_cpu_amount=20,
|
|
365
|
+
cpu_test_step_size=1,
|
|
366
|
+
stations2use=50,
|
|
367
|
+
starting_amount_of_stations=25,
|
|
324
368
|
station_list_step_size=1,
|
|
325
|
-
|
|
326
|
-
|
|
327
|
-
|
|
369
|
+
min_conc_stations=25,
|
|
370
|
+
conc_station_tasks_step_size=5,
|
|
371
|
+
start_time='2024-12-15 12:00:00',
|
|
372
|
+
end_time='2024-12-15 14:00:00',
|
|
373
|
+
conc_timechunk_tasks_step_size=1,
|
|
374
|
+
timechunk_dt=30,
|
|
375
|
+
waveform_overlap=2,
|
|
376
|
+
tmp_dir=tmp_dir)
|
|
328
377
|
eval_cpu.evaluate()
|
|
329
378
|
```
|
|
330
379
|
|
|
331
|
-
**EvaluateSystem** will iterate through different combinations of CPU(s), Concurrent
|
|
380
|
+
**EvaluateSystem** will iterate through different combinations of CPU(s), Concurrent Timechunk and Station Tasks, as well as GPU(s), and the amount of VRAM (MB) each Concurrent Prediction can use.
|
|
332
381
|
**EvaluateSystem** will take time, depending on the number of CPU/GPUs, the amount of VRAM available, and the total workload that needs to be tested. However, after doing the testing once for most if not all usecases,
|
|
333
382
|
the trial data will be available and can be used to identify the optimal input parallelization configurations for **EQCCTMSeedRunner** to use to get the maximum amount of processing out of your system in the shortest amount of time.
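
To make the trial sequences more concrete, here is a minimal sketch of the two stepping patterns this README documents for **EvaluateSystem**: the default sequence (1 through 10 in steps of 1, then steps of 5 up to the maximum) and a fixed step size between a starting value and a maximum. `default_trial_steps` and `stepped_values` are hypothetical names, not EQCCTPro functions, and always appending the maximum as the last value is an assumption.

```python
def default_trial_steps(maximum):
    """Hypothetical: 1..10 by ones, then multiples of 5, ending at `maximum`."""
    if maximum < 1:
        raise ValueError("maximum must be at least 1")
    steps = list(range(1, min(maximum, 10) + 1))
    steps += list(range(15, maximum + 1, 5))
    # Assumption: the maximum itself is always tested last.
    if steps[-1] != maximum:
        steps.append(maximum)
    return steps


def stepped_values(start, maximum, step):
    """Hypothetical: values from `start` to `maximum` with a fixed step size."""
    if step < 1 or start > maximum:
        raise ValueError("need step >= 1 and start <= maximum")
    values = list(range(start, maximum + 1, step))
    # Assumption: the maximum itself is always included.
    if values[-1] != maximum:
        values.append(maximum)
    return values


print(default_trial_steps(50))
print(stepped_values(50, 100, 10))
```

The first call reproduces the documented default sequence for the 50 sample stations, and the second reproduces a station list that starts at 50, uses a step size of 10, and ends at 100.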
|
|
334
383
|
|
|
@@ -336,14 +385,14 @@ The following input parameters need to be configurated for **EvaluateSystem** to
|
|
|
336
385
|
|
|
337
386
|
- **`mode (str)`**
|
|
338
387
|
- Can be either `cpu` or `gpu`
|
|
339
|
-
- Tells `EvaluateSystem` which
|
|
388
|
+
- Tells `EvaluateSystem` which computing approach the trials should iterate with
|
|
340
389
|
- **`intra_threads (int)`: default = 1**
|
|
341
390
|
- Controls how many intra-parallelism threads Tensorflow can use
|
|
342
391
|
- **`inter_threads (int)`: default = 1**
|
|
343
392
|
- Controls how many inter-parallelism threads Tensorflow can use
|
|
344
393
|
- **`input_dir (str)`**
|
|
345
394
|
- Directory path to the mSEED directory
|
|
346
|
-
- EX. /home/skevofilaxc/my_work_directory/eqcct/eqcctpro/
|
|
395
|
+
- EX. /home/skevofilaxc/my_work_directory/eqcct/eqcctpro/230_stations_1_min_dt
|
|
347
396
|
- **`output_dir (str)`**
|
|
348
397
|
- Directory path to where the output picks and logs will be sent
|
|
349
398
|
- Doesn't need to exist, will be created if doesn't exist
|
|
@@ -363,14 +412,20 @@ The following input parameters need to be configurated for **EvaluateSystem** to
|
|
|
363
412
|
- Filepath to where the P EQCCT detection model is stored
|
|
364
413
|
- **`s_model_filepath (str)`**
|
|
365
414
|
- Filepath to where the S EQCCT detection model is stored
|
|
366
|
-
- **`stations2use (int)`: default = None**
|
|
367
|
-
- Controls the maximum amount of stations EvaluateSystem can use in its trial iterations
|
|
368
|
-
- Sample data has been provided so that the maximum is 50, however, if using custom data, configure for your specific usecase
|
|
369
415
|
- **`cpu_id_list (list)`: default = [1]**
|
|
370
416
|
- List that defines which specific CPU cores sched_setaffinity will allocate for executing the current EQCCTPro process and **is the maximum amount of cores EvaluateSystem can use in its trial iterations**
|
|
371
417
|
- Allows for specific allocation and limitation of CPUs for a given EQCCTPro process
|
|
372
418
|
- "I want this program to run only on these specific cores."
|
|
373
419
|
- Must be at least 1 CPU if using GPUs (Ray needs CPUs to manage the Raylets (concurrent tasks), however the processing of the waveform is done on the GPU)
|
|
420
|
+
- **`min_cpu_amount (int)`: default = 1**
|
|
421
|
+
- Is the minimum amount of CPUs you want to start your trials with
|
|
422
|
+
- By default, trials will start iterating with 1 CPU up to the maximum allocated
|
|
423
|
+
- Can now set a value as the starting point, such as 15 CPUs up to the maximum of for instance 25
|
|
424
|
+
- **`cpu_test_step_size`: default = 1**
|
|
425
|
+
- Is the step size with which the trials march from `min_cpu_amount` to `len(cpu_id_list)`
|
|
426
|
+
- **`stations2use (int)`: default = None**
|
|
427
|
+
- Controls the maximum amount of stations EvaluateSystem can use in its trial iterations
|
|
428
|
+
- Sample data has been provided so that the maximum is 50, however, if using custom data, configure for your specific usecase
|
|
374
429
|
- **`starting_amount_of_stations (int)`: default = 1**
|
|
375
430
|
- For evaluating your system, you have the option to set a starting amount of stations you want to use in the test
|
|
376
431
|
- By default, the test will start using 1 station but now is configurable
|
|
@@ -378,16 +433,34 @@ The following input parameters need to be configurated for **EvaluateSystem** to
|
|
|
378
433
|
- You can set a step size for the station list that is generated
|
|
379
434
|
- For example, if the step size is set to 10 and you start with 50 stations with a max of 100, then your list would be: [50, 60, 70, 80, 90, 100]
|
|
380
435
|
- Using 1 will use the default step size of 1 from 1 to 10, then a step size of 5 up to `stations2use`
|
|
381
|
-
- **`
|
|
382
|
-
- Is the minimum amount of
|
|
383
|
-
- By default, trials will start iterating with 1 CPU up to the maximum allocated
|
|
384
|
-
- Can now set a value as the starting point, such as 15 CPUs up to the maximum of for instance 25
|
|
385
|
-
- **`min_conc_predictions (int)`: default = 1**
|
|
386
|
-
- Is the minimum amount of concurrent predictions you want each trial iteration to start with
|
|
436
|
+
- **`min_conc_stations (int)`: default = 1**
|
|
437
|
+
- Is the minimum amount of concurrent station predictions you want each trial iteration to start with
|
|
387
438
|
- By default, if `min_conc_stations` and `conc_station_tasks_step_size` are set to 1, a custom step size iteration will be applied to test the 50 sample waveforms. The sequence follows: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, ..., 50] (steps of 5 after 10).
|
|
388
|
-
- **`
|
|
389
|
-
- Is the concurrent predictions step size you want each trial iteration to iterate with
|
|
439
|
+
- **`conc_station_tasks_step_size (int)`: default = 1**
|
|
440
|
+
- Is the concurrent station predictions step size you want each trial iteration to iterate with
|
|
390
441
|
- By default, if `min_conc_stations` and `conc_station_tasks_step_size` are set to 1, a custom step size iteration will be applied to test the 50 sample waveforms. The sequence follows: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, ..., 50] (steps of 5 after 10).
|
|
442
|
+
- **`start_time (str)`: default = None**
|
|
443
|
+
- The start time of the period being analyzed
|
|
444
|
+
- EX. 2024-12-14 12:00:00
|
|
445
|
+
- Must follow the format YYYY-MM-DD HH:MM:SS
|
|
446
|
+
- Used to create a list of defined timechunks from the defined analysis timeframe
|
|
447
|
+
- Also used in the EvaluateSystem() class to help users note the analysis timeframe in the results CSV file for future result review
|
|
448
|
+
- **`end_time (str)`: default = None**
|
|
449
|
+
- The end time of the period being analyzed
|
|
450
|
+
- EX. 2024-12-15 12:00:00
|
|
451
|
+
- Must follow the format YYYY-MM-DD HH:MM:SS
|
|
452
|
+
- Used to create a list of defined timechunks from the defined analysis timeframe
|
|
453
|
+
- Also used in the EvaluateSystem() class to help users note the analysis timeframe in the results CSV file for future result review
|
|
454
|
+
- **`conc_timechunk_tasks_step_size (int)`: default = 1**
|
|
455
|
+
- Is the concurrent timechunk predictions step size you want each trial iteration to iterate with
|
|
456
|
+
- **`timechunk_dt (int)`: default = None**
|
|
457
|
+
- The length of each timechunk (in minutes)
|
|
458
|
+
- EX. if timechunk_dt = 10 and the analysis period is 30 minutes, three 10-minute timechunks will be created
|
|
459
|
+
- **`waveform_overlap (int)`: default = None**
|
|
460
|
+
- The duration (in minutes) for which each waveform overlaps with the others
|
|
461
|
+
- **`tmp_dir (str)`: default = 1**
|
|
462
|
+
- A temporary directory to store all temp files produced by EQCCTPro
|
|
463
|
+
- Used to help ease system cleanup and to not write to system's default temporary directory
|
|
391
464
|
- **`set_vram_mb (float)`**
|
|
392
465
|
- Value of the maximum amount of VRAM EQCCTPro can use
|
|
393
466
|
- Must be a real value that is based on your hardware's physical memory space, if it exceeds the space the code will break due to OutOfMemoryError
|
|
@@ -93,20 +93,25 @@ For additional details and package updates, visit the **EQCCTPro PyPI page**:
|
|
|
93
93
|
### **Using Sample Waveform Data**
|
|
94
94
|
To understand how **EQCCTPro** works, it is **highly recommended** to use provided sample seismic waveform data as the data source when testing the package.
|
|
95
95
|
|
|
96
|
-
|
|
96
|
+
1-minute long sample seismic waveforms from 229 TexNet stations have been provided in the repository under the `230_stations_1_min_dt.zip` file.
|
|
97
97
|
|
|
98
98
|
### **Step 1: Unzip the Sample Waveform Data**
|
|
99
99
|
After downloading the `.zip` file through the GitHub methods above, run:
|
|
100
100
|
```sh
|
|
101
|
-
[skevofilaxc] unzip
|
|
101
|
+
[skevofilaxc] unzip 230_stations_1_min_dt.zip
|
|
102
102
|
```
|
|
103
103
|
### **Step 2: Check and Understand the Directory Structure**
|
|
104
|
-
The extracted data will contain multiple station directories:
|
|
104
|
+
The extracted data will contain a timechunk subdirectory, which in turn contains multiple station directories:
|
|
105
105
|
```sh
|
|
106
|
-
[skevofilaxc
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
BP01
|
|
106
|
+
[skevofilaxc 230_stations_1_min_dt]$ ls
|
|
107
|
+
20241215T120000Z_20241215T120100Z
|
|
108
|
+
[skevofilaxc 230_stations_1_min_dt]$ ls 20241215T120000Z_20241215T120100Z
|
|
109
|
+
237B BP01 CT02 DG02 DG10 EE04 EF07 EF54 EF63 EF69 EF77 FOAK3 FW06 FW14 HBVL LWM2 MB05 MB12 MB19 MBBB3 MID03 NM01 OG02 PB05 PB11 PB19 PB26 PB34 PB41 PB51 PB57 PH03 SA06 SGCY SN02 SN10 WB03 WB09 YK01
|
|
110
|
+
435B BRDY CV01 DG04 DKNS EF02 EF08 EF56 EF64 EF71 ELG6 FOAK4 FW07 FW15 HNDO LWM3 MB06 MB13 MB21 MBBB5 MLDN NM02 OG04 PB06 PB12 PB21 PB28 PB35 PB42 PB52 PB58 PL01 SA07 SM01 SN03 SNAG WB04 WB10
|
|
111
|
+
ALPN BW01 CW01 DG05 DRIO EF03 EF09 EF58 EF65 EF72 ET02 FW01 FW09 GV01 HP01 MB01 MB07 MB15 MB22 MBBB6 MNHN NM03 OZNA PB07 PB14 PB22 PB29 PB37 PB43 PB53 PB59 PLPT SA09 SM02 SN04 TREL WB05 WB11
|
|
112
|
+
APMT CF01 DB02 DG06 DRZT EF04 EF51 EF59 EF66 EF74 FLRS FW02 FW11 GV02 HP02 MB02 MB08 MB16 MB25 MG01 MO01 ODSA PB01 PB08 PB16 PB23 PB30 PB38 PB44 PB54 PCOS POST SAND SM03 SN07 VHRN WB06 WB12
|
|
113
|
+
AT01 CRHG DB03 DG07 EE02 EF05 EF52 EF61 EF67 EF75 FOAK1 FW04 FW12 GV03 INDO MB03 MB09 MB17 MBBB1 MID01 NGL01 OE01 PB03 PB09 PB17 PB24 PB32 PB39 PB46 PB55 PECS SA02 SD01 SM04 SN08 VW01 WB07 WTFS
|
|
114
|
+
BB01 CT01 DB04 DG09 EE03 EF06 EF53 EF62 EF68 EF76 FOAK2 FW05 FW13 GV04 LWM1 MB04 MB11 MB18 MBBB2 MID02 NGL02 OG01 PB04 PB10 PB18 PB25 PB33 PB40 PB47 PB56 PH02 SA04 SE01 SMWD SN09 WB02 WB08 WW01
|
|
110
115
|
```
|
|
111
116
|
Each subdirectory contains **mSEED** files of different waveform components:
|
|
112
117
|
```sh
|
|
@@ -142,14 +147,29 @@ To process mSEED from various seismic stations, use the **EQCCTMSeedRunner** cla
|
|
|
142
147
|
**EQCCTMSeedRunner** enables users to process multiple mSEED from a given input directory, which consists of station directories formatted as follows:
|
|
143
148
|
|
|
144
149
|
```sh
|
|
145
|
-
[skevofilaxc
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
BP01
|
|
150
|
+
[skevofilaxc 230_stations_1_min_dt]$ ls
|
|
151
|
+
20241215T120000Z_20241215T120100Z
|
|
152
|
+
[skevofilaxc 230_stations_1_min_dt]$ ls 20241215T120000Z_20241215T120100Z
|
|
153
|
+
237B BP01 CT02 DG02 DG10 EE04 EF07 EF54 EF63 EF69 EF77 FOAK3 FW06 FW14 HBVL LWM2 MB05 MB12 MB19 MBBB3 MID03 NM01 OG02 PB05 PB11 PB19 PB26 PB34 PB41 PB51 PB57 PH03 SA06 SGCY SN02 SN10 WB03 WB09 YK01
|
|
154
|
+
435B BRDY CV01 DG04 DKNS EF02 EF08 EF56 EF64 EF71 ELG6 FOAK4 FW07 FW15 HNDO LWM3 MB06 MB13 MB21 MBBB5 MLDN NM02 OG04 PB06 PB12 PB21 PB28 PB35 PB42 PB52 PB58 PL01 SA07 SM01 SN03 SNAG WB04 WB10
|
|
155
|
+
ALPN BW01 CW01 DG05 DRIO EF03 EF09 EF58 EF65 EF72 ET02 FW01 FW09 GV01 HP01 MB01 MB07 MB15 MB22 MBBB6 MNHN NM03 OZNA PB07 PB14 PB22 PB29 PB37 PB43 PB53 PB59 PLPT SA09 SM02 SN04 TREL WB05 WB11
|
|
156
|
+
APMT CF01 DB02 DG06 DRZT EF04 EF51 EF59 EF66 EF74 FLRS FW02 FW11 GV02 HP02 MB02 MB08 MB16 MB25 MG01 MO01 ODSA PB01 PB08 PB16 PB23 PB30 PB38 PB44 PB54 PCOS POST SAND SM03 SN07 VHRN WB06 WB12
|
|
157
|
+
AT01 CRHG DB03 DG07 EE02 EF05 EF52 EF61 EF67 EF75 FOAK1 FW04 FW12 GV03 INDO MB03 MB09 MB17 MBBB1 MID01 NGL01 OE01 PB03 PB09 PB17 PB24 PB32 PB39 PB46 PB55 PECS SA02 SD01 SM04 SN08 VW01 WB07 WTFS
|
|
158
|
+
BB01 CT01 DB04 DG09 EE03 EF06 EF53 EF62 EF68 EF76 FOAK2 FW05 FW13 GV04 LWM1 MB04 MB11 MB18 MBBB2 MID02 NGL02 OG01 PB04 PB10 PB18 PB25 PB33 PB40 PB47 PB56 PH02 SA04 SE01 SMWD SN09 WB02 WB08 WW01
|
|
149
159
|
```
|
|
150
|
-
Where each subdirectory is named after station code. If you wish to use create your own input directory with custom waveform mSEED files, **please follow the above naming
|
|
160
|
+
Each subdirectory is named after its station code. If you wish to create your own input directory with custom waveform mSEED files, **please follow the above naming conventions.** Otherwise, EQCCTPro will **not** work.
|
|
161
|
+
Create subdirectories for each timechunk (sub-parent directories) and for each station (child directories). The station directories should be named as shown above.
|
|
162
|
+
Each timechunk directory spans from the **start of the analysis period minus the waveform overlap** to the **end of the analysis period**, based on the defined timechunk duration.
|
|
151
163
|
|
|
152
|
-
|
|
164
|
+
For example:
|
|
165
|
+
```sh
|
|
166
|
+
[skevofilaxc 230_stations_2hr_1_hr_dt]$ ls
|
|
167
|
+
20241215T115800Z_20241215T130000Z 20241215T125800Z_20241215T140000Z
|
|
168
|
+
```
|
|
169
|
+
Each timechunk is 1 hour long, and we use a waveform overlap of 2 minutes. Hence, `20241215T115800Z_20241215T130000Z` spans from `11:58:00 to 13:00:00 UTC on Dec 15, 2024` and `20241215T125800Z_20241215T140000Z` spans from `12:58:00 to 14:00:00 UTC on Dec 15, 2024`.
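When preparing custom data, it can help to check that a timechunk directory name decodes to the span you expect. The sketch below parses a directory name back into UTC datetimes. `parse_timechunk_dir` is a hypothetical helper, not an EQCCTPro function, and it assumes the `<start>_<end>` layout with `YYYYMMDDTHHMMSSZ` timestamps shown above.

```python
from datetime import datetime, timezone

STAMP_FORMAT = "%Y%m%dT%H%M%SZ"  # e.g. 20241215T115800Z


def parse_timechunk_dir(name):
    """Hypothetical helper: return (start, end) as UTC datetimes."""
    # Assumption: exactly one underscore separates the two timestamps.
    start_str, end_str = name.split("_")
    start = datetime.strptime(start_str, STAMP_FORMAT).replace(tzinfo=timezone.utc)
    end = datetime.strptime(end_str, STAMP_FORMAT).replace(tzinfo=timezone.utc)
    if end <= start:
        raise ValueError(f"timechunk ends before it starts: {name}")
    return start, end


start, end = parse_timechunk_dir("20241215T115800Z_20241215T130000Z")
print(start.isoformat(), end.isoformat(), (end - start).total_seconds() / 60)
```

For the first directory in the example, the decoded span is 62 minutes: the 60-minute timechunk plus the 2-minute waveform overlap.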
|
|
170
|
+
|
|
171
|
+
|
|
172
|
+
Each station subdirectory, such as PB35, is made up of mSEED files from the seismometer's different components (EX. N, E, Z):
|
|
153
173
|
```sh
|
|
154
174
|
[skevofilaxc PB35]$ ls
|
|
155
175
|
TX.PB35.00.HH1__20241215T115800Z__20241215T120100Z.mseed TX.PB35.00.HHZ__20241215T115800Z__20241215T120100Z.mseed
|
|
@@ -174,17 +194,22 @@ eqcct_runner = EQCCTMSeedRunner(
|
|
|
174
194
|
S_threshold=0.02,
|
|
175
195
|
p_model_filepath='/path/to/model_p.h5',
|
|
176
196
|
s_model_filepath='/path/to/model_s.h5',
|
|
177
|
-
|
|
197
|
+
number_of_concurrent_station_predictions=5,
|
|
198
|
+
number_of_concurrent_timechunk_predictions=2,
|
|
178
199
|
best_usecase_config=True,
|
|
179
200
|
csv_dir='/path/to/csv',
|
|
180
201
|
selected_gpus=[0],
|
|
181
202
|
set_vram_mb=24750,
|
|
182
|
-
specific_stations='AT01, BP01, DG05'
|
|
183
|
-
|
|
203
|
+
specific_stations='AT01, BP01, DG05',
|
|
204
|
+
start_time='2024-12-14 12:00:00',
|
|
205
|
+
end_time='2024-12-15 12:00:00',
|
|
206
|
+
timechunk_dt=1,
|
|
207
|
+
waveform_overlap=2)
|
|
208
|
+
|
|
184
209
|
eqcct_runner.run_eqcctpro()
|
|
185
210
|
```
|
|
186
211
|
|
|
187
|
-
**EQCCTMseedRunner** has multiple input
|
|
212
|
+
**EQCCTMSeedRunner** has multiple input parameters that need to be configured. They are defined below:
|
|
188
213
|
|
|
189
214
|
- **`use_gpu (bool)`: True or False**
|
|
190
215
|
- Tells Ray to use either the GPU(s) (True) or CPUs (False) on your computer to process the waveforms in the entire workflow
|
|
@@ -199,7 +224,7 @@ eqcct_runner.run_eqcctpro()
|
|
|
199
224
|
- "I want this program to run only on these specific cores."
|
|
200
225
|
- **`input_dir (str)`**
|
|
201
226
|
- Directory path to the mSEED directory
|
|
202
|
-
- EX. `/home/skevofilaxc/my_work_directory/eqcct/eqcctpro/
|
|
227
|
+
- EX. `/home/skevofilaxc/my_work_directory/eqcct/eqcctpro/230_stations_1_min_dt`
|
|
203
228
|
- **`output_dir (str)`**
|
|
204
229
|
- Directory path to where the output picks and logs will be sent
|
|
205
230
|
- Doesn't need to exist, will be created if doesn't exist
|
|
@@ -216,13 +241,16 @@ eqcct_runner.run_eqcctpro()
|
|
|
216
241
|
- Filepath to where the P EQCCT detection model is stored
|
|
217
242
|
- **`s_model_filepath (str)`**
|
|
218
243
|
- Filepath to where the S EQCCT detection model is stored
|
|
219
|
-
- **`
|
|
244
|
+
- **`number_of_concurrent_station_predictions (int)`**
|
|
220
245
|
- The number of concurrent EQCCT detection tasks that can happen simultaneously on a given number of resources
|
|
221
|
-
- EX. if
|
|
222
|
-
-
|
|
246
|
+
- EX. if number_of_concurrent_station_predictions = 5, up to 5 EQCCT instances can simultaneously analyze waveforms from 5 distinct seismic stations
|
|
247
|
+
- To use the optimal parameter value for this param, use the **EvaluateSystem** class (can be found below)
|
|
248
|
+
- **`number_of_concurrent_timechunk_predictions (int)`: default = None**
|
|
249
|
+
- The number of timechunks running in parallel
|
|
250
|
+
- Avoids the sequential processing of timechunks by processing multiple timechunks in parallel, which can substantially reduce runtime
|
|
223
251
|
- **`best_usecase_config (bool)`: default = False**
|
|
224
|
-
- If True, will override inputted cpu_id_list, number_of_concurrent_predictions, intra_threads, inter_threads values for the best overall
|
|
225
|
-
- Best overall
|
|
252
|
+
- If True, will override inputted cpu_id_list, number_of_concurrent_predictions, intra_threads, inter_threads values for the best overall use-case configurations
|
|
253
|
+
- Best overall use-case configurations are defined as the best overall input configurations that minimize runtime while doing the most amount of processing with your available hardware
|
|
226
254
|
- Can only be used if EvaluateSystem has been run
|
|
227
255
|
- **`csv_dir (str)`**
|
|
228
256
|
- Directory path containing the CSV's outputted by EvaluateSystem that contain the trial data that will be used to find the best_usecase_config
|
|
@@ -235,12 +263,26 @@ eqcct_runner.run_eqcctpro()
|
|
|
235
263
|
- Must be a real value that is based on your hardware's physical memory space, if it exceeds the space the code will break due to **OutOfMemoryError**
|
|
236
264
|
- **`specific_stations (str)`: default = None**
|
|
237
265
|
- String that contains the "list" of stations you want to only analyze
|
|
238
|
-
- EX. Out of the 50 sample stations in `
|
|
266
|
+
- EX. Out of the 50 sample stations in `230_stations_1_min_dt`, if I only want to analyze AT01, BP01, DG05, then specific_stations='AT01, BP01, DG05'.
|
|
239
267
|
- Removes the need to move station directories around to be used as input, can contain all stations in one directory for access
|
|
240
|
-
- **`
|
|
241
|
-
-
|
|
242
|
-
-
|
|
243
|
-
|
|
268
|
+
- **`start_time (str)`: default = None**
|
|
269
|
+
- The start time of the area of time that is being analyzed
|
|
270
|
+
- EX. 2024-12-14 12:00:00
|
|
271
|
+
- Must follow the following convention YYYY-MO-DA HR:MI:SC
|
|
272
|
+
- Used to create a list of defined timechunks from the defined analysis timeframe
|
|
273
|
+
- Also used in the EvaluateSystem() class to help users note the analysis timeframe in the results CSV file for future result review
|
|
274
|
+
- **`end_time (str)`: default = None**
|
|
275
|
+
- The end time of the area of time that is being analyzed
|
|
276
|
+
- EX. 2024-12-15 12:00:00
|
|
277
|
+
- Must follow the following convention YYYY-MO-DA HR:MI:SC
|
|
278
|
+
- Used to create a list of defined timechunks from the defined analysis timeframe
|
|
279
|
+
- Also used in the EvaluateSystem() class to help users note the analysis timeframe in the results CSV file for future result review
|
|
280
|
+
- **`timechunk_dt (int)`: default = None**
|
|
281
|
+
- The length each time chunk is (in minutes)
|
|
282
|
+
- EX. timechunk_dt = 10 and the analysis period is 30 minutes, then three 10-minute long timechunks will be created
|
|
283
|
+
- **`waveform_overlap (int)`: default = None**
|
|
284
|
+
- The duration (in minutes) for which each waveform overlaps with the others
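The `start_time`, `end_time`, and `timechunk_dt` bullets above can be illustrated with a minimal sketch that uses only the Python standard library. The helper name `count_timechunks` is hypothetical and is not part of EQCCTPro, and rounding a partial final chunk up to a full chunk is an assumption.

```python
import math
from datetime import datetime

# strptime equivalent of the YYYY-MO-DA HR:MI:SC convention
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def count_timechunks(start_time: str, end_time: str, timechunk_dt: int) -> int:
    """Hypothetical helper: number of timechunks covering the analysis period."""
    start = datetime.strptime(start_time, TIME_FORMAT)  # raises ValueError on a bad format
    end = datetime.strptime(end_time, TIME_FORMAT)
    if end <= start:
        raise ValueError("end_time must be later than start_time")
    if timechunk_dt <= 0:
        raise ValueError("timechunk_dt must be a positive number of minutes")
    total_minutes = (end - start).total_seconds() / 60
    # Assumption: a trailing partial chunk still counts as one chunk
    return math.ceil(total_minutes / timechunk_dt)


# A 30-minute analysis period with timechunk_dt = 10 gives three timechunks
print(count_timechunks("2024-12-15 12:00:00", "2024-12-15 12:30:00", 10))
```

Parsing the strings up front like this is one way to catch a malformed `start_time` or `end_time` before any processing starts.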
|
|
285
|
+
|
|
244
286
|
|
|
245
287
|
---
|
|
246
288
|
|
|
@@ -262,10 +304,10 @@ eval_gpu = EvaluateSystem(
|
|
|
262
304
|
S_threshold=0.02,
|
|
263
305
|
p_model_filepath='/path/to/model_p.h5',
|
|
264
306
|
s_model_filepath='/path/to/model_s.h5',
|
|
265
|
-
stations2use=2,
|
|
266
307
|
cpu_id_list=[0,1],
|
|
267
308
|
set_vram_mb=24750,
|
|
268
|
-
selected_gpus=[0]
|
|
309
|
+
selected_gpus=[0],
|
|
310
|
+
stations2use=2
|
|
269
311
|
)
|
|
270
312
|
eval_gpu.evaluate()
|
|
271
313
|
```
|
|
@@ -285,17 +327,24 @@ eval_cpu = EvaluateSystem(
|
|
|
285
327
|
S_threshold=0.02,
|
|
286
328
|
p_model_filepath='/path/to/model_p.h5',
|
|
287
329
|
s_model_filepath='/path/to/model_s.h5',
|
|
288
|
-
|
|
289
|
-
|
|
290
|
-
|
|
330
|
+
cpu_id_list=range(0,49),
|
|
331
|
+
min_cpu_amount=20,
|
|
332
|
+
cpu_test_step_size=1,
|
|
333
|
+
stations2use=50,
|
|
334
|
+
starting_amount_of_stations=25,
|
|
291
335
|
station_list_step_size=1,
|
|
292
|
-
|
|
293
|
-
|
|
294
|
-
|
|
336
|
+
min_conc_stations=25,
|
|
337
|
+
conc_station_tasks_step_size=5,
|
|
338
|
+
start_time='2024-12-15 12:00:00',
|
|
339
|
+
end_time='2024-12-15 14:00:00',
|
|
340
|
+
conc_timechunk_tasks_step_size=1,
|
|
341
|
+
timechunk_dt=30,
|
|
342
|
+
waveform_overlap=2,
|
|
343
|
+
tmp_dir=tmp_dir)
|
|
295
344
|
eval_cpu.evaluate()
|
|
296
345
|
```
|
|
297
346
|
|
|
298
|
-
**EvaluateSystem** will iterate through different combinations of CPU(s), Concurrent
|
|
347
|
+
**EvaluateSystem** will iterate through different combinations of CPU(s), Concurrent Timechunk and Station Tasks, as well as GPU(s), and the amount of VRAM (MB) each Concurrent Prediction can use.
|
|
299
348
|
**EvaluateSystem** will take time, depending on the number of CPU/GPUs, the amount of VRAM available, and the total workload that needs to be tested. However, after doing the testing once for most if not all usecases,
|
|
300
349
|
the trial data will be available and can be used to identify the optimal input parallelization configurations for **EQCCTMSeedRunner** to use to get the maximum amount of processing out of your system in the shortest amount of time.
|
|
301
350
|
|
|
@@ -303,14 +352,14 @@ The following input parameters need to be configurated for **EvaluateSystem** to
|
|
|
303
352
|
|
|
304
353
|
- **`mode (str)`**
|
|
305
354
|
- Can be either `cpu` or `gpu`
|
|
306
|
-
- Tells `EvaluateSystem` which
|
|
355
|
+
- Tells `EvaluateSystem` which computing approach (CPU or GPU) the trials should iterate with
|
|
307
356
|
- **`intra_threads (int)`: default = 1**
|
|
308
357
|
- Controls how many intra-parallelism threads Tensorflow can use
|
|
309
358
|
- **`inter_threads (int)`: default = 1**
|
|
310
359
|
- Controls how many inter-parallelism threads Tensorflow can use
|
|
311
360
|
- **`input_dir (str)`**
|
|
312
361
|
- Directory path to the mSEED directory
|
|
313
|
-
- EX. /home/skevofilaxc/my_work_directory/eqcct/eqcctpro/
|
|
362
|
+
- EX. /home/skevofilaxc/my_work_directory/eqcct/eqcctpro/230_stations_1_min_dt
|
|
314
363
|
- **`output_dir (str)`**
|
|
315
364
|
- Directory path to where the output picks and logs will be sent
|
|
316
365
|
- Doesn't need to exist, will be created if doesn't exist
|
|
@@ -330,14 +379,20 @@ The following input parameters need to be configurated for **EvaluateSystem** to
|
|
|
330
379
|
- Filepath to where the P EQCCT detection model is stored
|
|
331
380
|
- **`s_model_filepath (str)`**
|
|
332
381
|
- Filepath to where the S EQCCT detection model is stored
|
|
333
|
-
- **`stations2use (int)`: default = None**
|
|
334
|
-
- Controls the maximum amount of stations EvaluateSystem can use in its trial iterations
|
|
335
|
-
- Sample data has been provided so that the maximum is 50, however, if using custom data, configure for your specific usecase
|
|
336
382
|
- **`cpu_id_list (list)`: default = [1]**
|
|
337
383
|
- List that defines which specific CPU cores sched_setaffinity will allocate for executing the current EQCCTPro process and **is the maximum number of cores EvaluateSystem can use in its trial iterations**
|
|
338
384
|
- Allows for specific allocation and limitation of CPUs for a given EQCCTPro process
|
|
339
385
|
- "I want this program to run only on these specific cores."
|
|
340
386
|
- Must be at least 1 CPU if using GPUs (Ray needs CPUs to manage the Raylets (concurrent tasks), however the processing of the waveform is done on the GPU)
|
|
387
|
+
- **`min_cpu_amount (int)`: default = 1**
|
|
388
|
+
- Is the minimum amount of CPUs you want to start your trials with
|
|
389
|
+
- By default, trials will start iterating with 1 CPU up to the maximum allocated
|
|
390
|
+
- Can now set a value as the starting point, such as 15 CPUs up to the maximum of for instance 25
|
|
391
|
+
- **`cpu_test_step_size`: default = 1**
|
|
392
|
+
- Is the step size with which the trials march from `min_cpu_amount` to `len(cpu_id_list)`
|
|
393
|
+
- **`stations2use (int)`: default = None**
|
|
394
|
+
- Controls the maximum amount of stations EvaluateSystem can use in its trial iterations
|
|
395
|
+
- Sample data has been provided so that the maximum is 50, however, if using custom data, configure for your specific usecase
|
|
341
396
|
- **`starting_amount_of_stations (int)`: default = 1**
|
|
342
397
|
- For evaluating your system, you have the option to set a starting amount of stations you want to use in the test
|
|
343
398
|
- By default, the test will start using 1 station but now is configurable
|
|
@@ -345,16 +400,34 @@ The following input parameters need to be configurated for **EvaluateSystem** to
|
|
|
345
400
|
- You can set a step size for the station list that is generated
|
|
346
401
|
- For example, if the step size is set to 10 and you start with 50 stations with a max of 100, then your list would be: [50, 60, 70, 80, 90, 100]
|
|
347
402
|
- Using 1 will use the default step size of 1-10, then step size of 5 up to station2use
|
|
348
|
-
- **`
|
|
349
|
-
- Is the minimum amount of
|
|
350
|
-
- By default, trials will start iterating with 1 CPU up to the maximum allocated
|
|
351
|
-
- Can now set a value as the starting point, such as 15 CPUs up to the maximum of for instance 25
|
|
352
|
-
- **`min_conc_predictions (int)`: default = 1**
|
|
353
|
-
- Is the minimum amount of concurrent predictions you want each trial iteration to start with
|
|
403
|
+
- **`min_conc_stations (int)`: default = 1**
|
|
404
|
+
- Is the minimum amount of concurrent stations predictions you want each trial iteration to start with
|
|
354
405
|
- By default, if `min_conc_predictions` and `conc_predictions_step_size` are set to 1, a custom step size iteration will be applied to test the 50 sample waveforms. The sequence follows: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, n+5, 50].
|
|
355
|
-
- **`
|
|
356
|
-
- Is the concurrent predictions step size you want each trial iteration to iterate with
|
|
406
|
+
- **`conc_station_tasks_step_size (int)`: default = 1**
|
|
407
|
+
- Is the concurrent station predictions step size you want each trial iteration to iterate with
|
|
357
408
|
- By default, if `min_conc_predictions` and `conc_predictions_step_size` are set to 1, a custom step size iteration will be applied to test the 50 sample waveforms. The sequence follows: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, n+5, 50]
|
|
409
|
+
- **`start_time (str)`: default = None**
|
|
410
|
+
- The start time of the area of time that is being analyzed
|
|
411
|
+
- EX. 2024-12-14 12:00:00
|
|
412
|
+
- Must follow the following convention YYYY-MO-DA HR:MI:SC
|
|
413
|
+
- Used to create a list of defined timechunks from the defined analysis timeframe
|
|
414
|
+
- Also used in the EvaluateSystem() class to help users note the analysis timeframe in the results CSV file for future result review
|
|
415
|
+
- **`end_time (str)`: default = None**
|
|
416
|
+
- The end time of the area of time that is being analyzed
|
|
417
|
+
- EX. 2024-12-15 12:00:00
|
|
418
|
+
- Must follow the following convention YYYY-MO-DA HR:MI:SC
|
|
419
|
+
- Used to create a list of defined timechunks from the defined analysis timeframe
|
|
420
|
+
- Also used in the EvaluateSystem() class to help users note the analysis timeframe in the results CSV file for future result review
|
|
421
|
+
- **`conc_timechunk_tasks_step_size (int)`: default = 1**
|
|
422
|
+
- Is the concurrent timechunk predictions step size you want each trial iteration to iterate with
|
|
423
|
+
- **`timechunk_dt (int)`: default = None**
|
|
424
|
+
- The length each time chunk is (in minutes)
|
|
425
|
+
- EX. timechunk_dt = 10 and the analysis period is 30 minutes, then three 10-minute long timechunks will be created
|
|
426
|
+
- **`waveform_overlap (int)`: default = None**
|
|
427
|
+
- The duration (in minutes) for which each waveform overlaps with the others
|
|
428
|
+
- **`tmp_dir (str)`: default = 1**
|
|
429
|
+
- A temporary directory to store all temp files produced by EQCCTPro
|
|
430
|
+
- Used to help ease system cleanup and to not write to system's default temporary directory
|
|
358
431
|
- **`set_vram_mb (float)`**
|
|
359
432
|
- Value of the maximum amount of VRAM EQCCTPro can use
|
|
360
433
|
- Must be a real value that is based on your hardware's physical memory space, if it exceeds the space the code will break due to OutOfMemoryError
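The step-size bullets in this list can be sketched as follows. Both function names are hypothetical and do not come from EQCCTPro. The first mirrors the station-list example (start 50, maximum 100, step 10). The second mirrors the documented default concurrency sequence (1 to 10 in steps of 1, then steps of 5), and assumes the sequence always ends at the maximum.

```python
def station_counts(start: int, maximum: int, step: int) -> list[int]:
    """Hypothetical helper: station counts from start to maximum, inclusive."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(start, maximum + 1, step))


def default_concurrency_steps(maximum: int) -> list[int]:
    """Hypothetical helper: 1..10 by 1, then by 5, ending at maximum (assumed)."""
    if maximum < 1:
        raise ValueError("maximum must be at least 1")
    steps = list(range(1, min(maximum, 10) + 1))
    steps.extend(range(15, maximum + 1, 5))
    if steps[-1] != maximum:
        steps.append(maximum)  # assumption: the maximum itself is always tested
    return steps


print(station_counts(50, 100, 10))    # station list for a step size of 10
print(default_concurrency_steps(50))  # default sequence for the 50 sample waveforms
```

Each extra value in these lists is another trial, so larger step sizes shorten the evaluation at the cost of a coarser search.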
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: eqcctpro
|
|
3
|
-
Version: 0.5.
|
|
3
|
+
Version: 0.5.7
|
|
4
4
|
Summary: EQCCTPro: A powerful seismic event detection toolkit
|
|
5
5
|
Author-email: Constantinos Skevofilax <constantinos.skevofilax@austin.utexas.edu>, Victor Salles <victor.salles@beg.utexas.edu>
|
|
6
6
|
Project-URL: Homepage, https://pypi.org/project/eqcctpro/
|
|
@@ -126,20 +126,25 @@ For additional details and package updates, visit the **EQCCTPro PyPI page**:
|
|
|
126
126
|
### **Using Sample Waveform Data**
|
|
127
127
|
To understand how **EQCCTPro** works, it is **highly recommended** to use provided sample seismic waveform data as the data source when testing the package.
|
|
128
128
|
|
|
129
|
-
|
|
129
|
+
1-minute long sample seismic waveforms from 229 TexNet stations have been provided in the repository under the `230_stations_1_min_dt.zip` file.
|
|
130
130
|
|
|
131
131
|
### **Step 1: Unzip the Sample Waveform Data**
|
|
132
132
|
After downloading the `.zip` file through the GitHub methods above, run:
|
|
133
133
|
```sh
|
|
134
|
-
[skevofilaxc] unzip
|
|
134
|
+
[skevofilaxc] unzip 230_stations_1_min_dt.zip
|
|
135
135
|
```
|
|
136
136
|
### **Step 2: Check and Understand the Directory Structure**
|
|
137
|
-
The extracted data will contain multiple station directories:
|
|
137
|
+
The extracted data will contain timechunk subdirectories, each comprised of multiple station directories:
|
|
138
138
|
```sh
|
|
139
|
-
[skevofilaxc
|
|
140
|
-
|
|
141
|
-
|
|
142
|
-
BP01
|
|
139
|
+
[skevofilaxc 230_stations_1_min_dt]$ ls
|
|
140
|
+
20241215T120000Z_20241215T120100Z
|
|
141
|
+
[skevofilaxc 230_stations_1_min_dt]$ cd 20241215T120000Z_20241215T120100Z
|
|
142
|
+
237B BP01 CT02 DG02 DG10 EE04 EF07 EF54 EF63 EF69 EF77 FOAK3 FW06 FW14 HBVL LWM2 MB05 MB12 MB19 MBBB3 MID03 NM01 OG02 PB05 PB11 PB19 PB26 PB34 PB41 PB51 PB57 PH03 SA06 SGCY SN02 SN10 WB03 WB09 YK01
|
|
143
|
+
435B BRDY CV01 DG04 DKNS EF02 EF08 EF56 EF64 EF71 ELG6 FOAK4 FW07 FW15 HNDO LWM3 MB06 MB13 MB21 MBBB5 MLDN NM02 OG04 PB06 PB12 PB21 PB28 PB35 PB42 PB52 PB58 PL01 SA07 SM01 SN03 SNAG WB04 WB10
|
|
144
|
+
ALPN BW01 CW01 DG05 DRIO EF03 EF09 EF58 EF65 EF72 ET02 FW01 FW09 GV01 HP01 MB01 MB07 MB15 MB22 MBBB6 MNHN NM03 OZNA PB07 PB14 PB22 PB29 PB37 PB43 PB53 PB59 PLPT SA09 SM02 SN04 TREL WB05 WB11
|
|
145
|
+
APMT CF01 DB02 DG06 DRZT EF04 EF51 EF59 EF66 EF74 FLRS FW02 FW11 GV02 HP02 MB02 MB08 MB16 MB25 MG01 MO01 ODSA PB01 PB08 PB16 PB23 PB30 PB38 PB44 PB54 PCOS POST SAND SM03 SN07 VHRN WB06 WB12
|
|
146
|
+
AT01 CRHG DB03 DG07 EE02 EF05 EF52 EF61 EF67 EF75 FOAK1 FW04 FW12 GV03 INDO MB03 MB09 MB17 MBBB1 MID01 NGL01 OE01 PB03 PB09 PB17 PB24 PB32 PB39 PB46 PB55 PECS SA02 SD01 SM04 SN08 VW01 WB07 WTFS
|
|
147
|
+
BB01 CT01 DB04 DG09 EE03 EF06 EF53 EF62 EF68 EF76 FOAK2 FW05 FW13 GV04 LWM1 MB04 MB11 MB18 MBBB2 MID02 NGL02 OG01 PB04 PB10 PB18 PB25 PB33 PB40 PB47 PB56 PH02 SA04 SE01 SMWD SN09 WB02 WB08 WW01
|
|
143
148
|
```
|
|
144
149
|
Each subdirectory contains **mSEED** files of different waveform components:
|
|
145
150
|
```sh
|
|
@@ -175,14 +180,29 @@ To process mSEED from various seismic stations, use the **EQCCTMSeedRunner** cla
|
|
|
175
180
|
**EQCCTMSeedRunner** enables users to process multiple mSEED from a given input directory, which consists of station directories formatted as follows:
|
|
176
181
|
|
|
177
182
|
```sh
|
|
178
|
-
[skevofilaxc
|
|
179
|
-
|
|
180
|
-
|
|
181
|
-
BP01
|
|
183
|
+
[skevofilaxc 230_stations_1_min_dt]$ ls
|
|
184
|
+
20241215T120000Z_20241215T120100Z
|
|
185
|
+
[skevofilaxc 230_stations_1_min_dt]$ cd 20241215T120000Z_20241215T120100Z
|
|
186
|
+
237B BP01 CT02 DG02 DG10 EE04 EF07 EF54 EF63 EF69 EF77 FOAK3 FW06 FW14 HBVL LWM2 MB05 MB12 MB19 MBBB3 MID03 NM01 OG02 PB05 PB11 PB19 PB26 PB34 PB41 PB51 PB57 PH03 SA06 SGCY SN02 SN10 WB03 WB09 YK01
|
|
187
|
+
435B BRDY CV01 DG04 DKNS EF02 EF08 EF56 EF64 EF71 ELG6 FOAK4 FW07 FW15 HNDO LWM3 MB06 MB13 MB21 MBBB5 MLDN NM02 OG04 PB06 PB12 PB21 PB28 PB35 PB42 PB52 PB58 PL01 SA07 SM01 SN03 SNAG WB04 WB10
|
|
188
|
+
ALPN BW01 CW01 DG05 DRIO EF03 EF09 EF58 EF65 EF72 ET02 FW01 FW09 GV01 HP01 MB01 MB07 MB15 MB22 MBBB6 MNHN NM03 OZNA PB07 PB14 PB22 PB29 PB37 PB43 PB53 PB59 PLPT SA09 SM02 SN04 TREL WB05 WB11
|
|
189
|
+
APMT CF01 DB02 DG06 DRZT EF04 EF51 EF59 EF66 EF74 FLRS FW02 FW11 GV02 HP02 MB02 MB08 MB16 MB25 MG01 MO01 ODSA PB01 PB08 PB16 PB23 PB30 PB38 PB44 PB54 PCOS POST SAND SM03 SN07 VHRN WB06 WB12
|
|
190
|
+
AT01 CRHG DB03 DG07 EE02 EF05 EF52 EF61 EF67 EF75 FOAK1 FW04 FW12 GV03 INDO MB03 MB09 MB17 MBBB1 MID01 NGL01 OE01 PB03 PB09 PB17 PB24 PB32 PB39 PB46 PB55 PECS SA02 SD01 SM04 SN08 VW01 WB07 WTFS
|
|
191
|
+
BB01 CT01 DB04 DG09 EE03 EF06 EF53 EF62 EF68 EF76 FOAK2 FW05 FW13 GV04 LWM1 MB04 MB11 MB18 MBBB2 MID02 NGL02 OG01 PB04 PB10 PB18 PB25 PB33 PB40 PB47 PB56 PH02 SA04 SE01 SMWD SN09 WB02 WB08 WW01
|
|
182
192
|
```
|
|
183
|
-
Where each subdirectory is named after station code. If you wish to use create your own input directory with custom waveform mSEED files, **please follow the above naming
|
|
193
|
+
Where each subdirectory is named after its station code. If you wish to create your own input directory with custom waveform mSEED files, **please follow the above naming conventions.** Otherwise, EQCCTPro will **not** work.
|
|
194
|
+
Create subdirectories for each timechunk (sub-parent directories) and for each station (child directories). The station directories should be named as shown above.
|
|
195
|
+
Each timechunk directory spans from the **start of the analysis period minus the waveform overlap** to the **end of the analysis period**, based on the defined timechunk duration.
|
|
184
196
|
|
|
185
|
-
|
|
197
|
+
For example:
|
|
198
|
+
```sh
|
|
199
|
+
[skevofilaxc 230_stations_2hr_1_hr_dt]$ ls
|
|
200
|
+
20241215T115800Z_20241215T130000Z 20241215T125800Z_20241215T140000Z
|
|
201
|
+
```
|
|
202
|
+
The timechunk time length is 1 hour long. At the same time, we use a waveform overlap of 2 minutes. Hence: `20241215T115800Z_20241215T130000Z` spans from `11:58:00 to 13:00:00 UTC on Dec 15, 2024` and `20241215T125800Z_20241215T140000Z` spans from `12:58:00 to 14:00:00 UTC on Dec 15, 2024`
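The naming rule described above can be reproduced with a minimal sketch. It assumes the directory name is `<chunk start minus overlap>_<chunk end>` in `YYYYMMDDTHHMMSSZ` form, as in the example listing. The function name `timechunk_dir_names` is hypothetical and is not part of EQCCTPro.

```python
from datetime import datetime, timedelta


def timechunk_dir_names(start_time: str, end_time: str,
                        timechunk_dt: int, waveform_overlap: int) -> list[str]:
    """Hypothetical helper: expected timechunk directory names for an analysis period."""
    fmt_in = "%Y-%m-%d %H:%M:%S"
    fmt_out = "%Y%m%dT%H%M%SZ"
    if timechunk_dt <= 0:
        raise ValueError("timechunk_dt must be a positive number of minutes")
    start = datetime.strptime(start_time, fmt_in)
    end = datetime.strptime(end_time, fmt_in)
    step = timedelta(minutes=timechunk_dt)
    overlap = timedelta(minutes=waveform_overlap)

    names = []
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + step, end)
        # Each directory starts waveform_overlap minutes before its chunk
        padded_start = chunk_start - overlap
        names.append(f"{padded_start.strftime(fmt_out)}_{chunk_end.strftime(fmt_out)}")
        chunk_start = chunk_end
    return names


# Two-hour period, 1-hour timechunks, 2-minute overlap
for name in timechunk_dir_names("2024-12-15 12:00:00", "2024-12-15 14:00:00", 60, 2):
    print(name)
```

Comparing this list against the contents of a custom input directory is a quick way to check the layout before running EQCCTPro.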
|
|
203
|
+
|
|
204
|
+
|
|
205
|
+
Each station subdirectory, such as PB35, is made up of mSEED files from the seismometer's different components (EX. N, E, Z):
|
|
186
206
|
```sh
|
|
187
207
|
[skevofilaxc PB35]$ ls
|
|
188
208
|
TX.PB35.00.HH1__20241215T115800Z__20241215T120100Z.mseed TX.PB35.00.HHZ__20241215T115800Z__20241215T120100Z.mseed
|
|
@@ -207,17 +227,22 @@ eqcct_runner = EQCCTMSeedRunner(
|
|
|
207
227
|
S_threshold=0.02,
|
|
208
228
|
p_model_filepath='/path/to/model_p.h5',
|
|
209
229
|
s_model_filepath='/path/to/model_s.h5',
|
|
210
|
-
|
|
230
|
+
number_of_concurrent_station_predictions=5,
|
|
231
|
+
number_of_concurrent_timechunk_predictions=2,
|
|
211
232
|
best_usecase_config=True,
|
|
212
233
|
csv_dir='/path/to/csv',
|
|
213
234
|
selected_gpus=[0],
|
|
214
235
|
set_vram_mb=24750,
|
|
215
|
-
specific_stations='AT01, BP01, DG05'
|
|
216
|
-
|
|
236
|
+
specific_stations='AT01, BP01, DG05',
|
|
237
|
+
start_time='2024-12-14 12:00:00',
|
|
238
|
+
end_time='2024-12-15 12:00:00',
|
|
239
|
+
timechunk_dt=1,
|
|
240
|
+
waveform_overlap=2)
|
|
241
|
+
|
|
217
242
|
eqcct_runner.run_eqcctpro()
|
|
218
243
|
```
|
|
219
244
|
|
|
220
|
-
**EQCCTMseedRunner** has multiple input
|
|
245
|
+
**EQCCTMSeedRunner** has multiple input parameters that need to be configured and are defined below:
|
|
221
246
|
|
|
222
247
|
- **`use_gpu (bool)`: True or False**
|
|
223
248
|
- Tells Ray to use either the GPU(s) (True) or CPUs (False) on your computer to process the waveforms in the entire workflow
|
|
@@ -232,7 +257,7 @@ eqcct_runner.run_eqcctpro()
|
|
|
232
257
|
- "I want this program to run only on these specific cores."
|
|
233
258
|
- **`input_dir (str)`**
|
|
234
259
|
- Directory path to the mSEED directory
|
|
235
|
-
- EX. `/home/skevofilaxc/my_work_directory/eqcct/eqcctpro/
|
|
260
|
+
- EX. `/home/skevofilaxc/my_work_directory/eqcct/eqcctpro/230_stations_1_min_dt`
|
|
236
261
|
- **`output_dir (str)`**
|
|
237
262
|
- Directory path to where the output picks and logs will be sent
|
|
238
263
|
- Doesn't need to exist, will be created if doesn't exist
|
|
@@ -249,13 +274,16 @@ eqcct_runner.run_eqcctpro()
|
|
|
249
274
|
- Filepath to where the P EQCCT detection model is stored
|
|
250
275
|
- **`s_model_filepath (str)`**
|
|
251
276
|
- Filepath to where the S EQCCT detection model is stored
|
|
252
|
-
- **`
|
|
277
|
+
- **`number_of_concurrent_station_predictions (int)`**
|
|
253
278
|
- The number of concurrent EQCCT detection tasks that can happen simultaneously on a given number of resources
|
|
254
|
-
- EX. if
|
|
255
|
-
-
|
|
279
|
+
- EX. if number_of_concurrent_station_predictions = 5, up to 5 EQCCT instances can simultaneously analyze waveforms from 5 distinct seismic stations
|
|
280
|
+
- To use the optimal parameter value for this param, use the **EvaluateSystem** class (can be found below)
|
|
281
|
+
- **`number_of_concurrent_timechunk_predictions (int)`: default = None**
|
|
282
|
+
- The number of timechunks running in parallel
|
|
283
|
+
- Avoids the sequential processing of timechunks by processing multiple timechunks in parallel, which can substantially reduce runtime
|
|
256
284
|
- **`best_usecase_config (bool)`: default = False**
|
|
257
|
-
- If True, will override inputted cpu_id_list, number_of_concurrent_predictions, intra_threads, inter_threads values for the best overall
|
|
258
|
-
- Best overall
|
|
285
|
+
- If True, will override inputted cpu_id_list, number_of_concurrent_predictions, intra_threads, inter_threads values for the best overall use-case configurations
|
|
286
|
+
- Best overall use-case configurations are defined as the best overall input configurations that minimize runtime while doing the most amount of processing with your available hardware
|
|
259
287
|
- Can only be used if EvaluateSystem has been run
|
|
260
288
|
- **`csv_dir (str)`**
|
|
261
289
|
- Directory path containing the CSV's outputted by EvaluateSystem that contain the trial data that will be used to find the best_usecase_config
|
|
@@ -268,12 +296,26 @@ eqcct_runner.run_eqcctpro()
|
|
|
268
296
|
- Must be a real value that is based on your hardware's physical memory space, if it exceeds the space the code will break due to **OutOfMemoryError**
|
|
269
297
|
- **`specific_stations (str)`: default = None**
|
|
270
298
|
- String that contains the "list" of stations you want to only analyze
|
|
271
|
-
- EX. Out of the 50 sample stations in `
|
|
299
|
+
- EX. Out of the 50 sample stations in `230_stations_1_min_dt`, if I only want to analyze AT01, BP01, DG05, then specific_stations='AT01, BP01, DG05'.
|
|
272
300
|
- Removes the need to move station directories around to be used as input, can contain all stations in one directory for access
|
|
273
|
-
- **`
|
|
274
|
-
-
|
|
275
|
-
-
|
|
276
|
-
|
|
301
|
+
- **`start_time (str)`: default = None**
|
|
302
|
+
- The start time of the area of time that is being analyzed
|
|
303
|
+
- EX. 2024-12-14 12:00:00
|
|
304
|
+
- Must follow the following convention YYYY-MO-DA HR:MI:SC
|
|
305
|
+
- Used to create a list of defined timechunks from the defined analysis timeframe
|
|
306
|
+
- Also used in the EvaluateSystem() class to help users note the analysis timeframe in the results CSV file for future result review
|
|
307
|
+
- **`end_time (str)`: default = None**
|
|
308
|
+
- The end time of the area of time that is being analyzed
|
|
309
|
+
- EX. 2024-12-15 12:00:00
|
|
310
|
+
- Must follow the following convention YYYY-MO-DA HR:MI:SC
|
|
311
|
+
- Used to create a list of defined timechunks from the defined analysis timeframe
|
|
312
|
+
- Also used in the EvaluateSystem() class to help users note the analysis timeframe in the results CSV file for future result review
|
|
313
|
+
- **`timechunk_dt (int)`: default = None**
|
|
314
|
+
- The length each time chunk is (in minutes)
|
|
315
|
+
- EX. timechunk_dt = 10 and the analysis period is 30 minutes, then three 10-minute long timechunks will be created
|
|
316
|
+
- **`waveform_overlap (int)`: default = None**
|
|
317
|
+
- The duration (in minutes) for which each waveform overlaps with the others
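To illustrate the `specific_stations` bullet above, the sketch below shows one way a comma-separated string such as `'AT01, BP01, DG05'` could be turned into a station filter. Both function names are hypothetical. EQCCTPro's own parsing is not shown in this document and may differ.

```python
from typing import Iterable, Optional


def parse_specific_stations(specific_stations: Optional[str]) -> Optional[list[str]]:
    """Hypothetical helper: split the station "list" string into station codes."""
    if specific_stations is None:
        return None  # None means "use every station"
    return [code.strip() for code in specific_stations.split(",") if code.strip()]


def select_stations(available: Iterable[str],
                    specific_stations: Optional[str]) -> list[str]:
    """Hypothetical helper: keep only the requested station directories."""
    wanted = parse_specific_stations(specific_stations)
    if wanted is None:
        return list(available)
    return [station for station in available if station in wanted]


stations_on_disk = ["AT01", "BB01", "BP01", "DG05", "PB35"]
print(select_stations(stations_on_disk, "AT01, BP01, DG05"))
```

Filtering by name like this is what lets all station directories stay in one input directory.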
|
|
318
|
+
|
|
277
319
|
|
|
278
320
|
---
|
|
279
321
|
|
|
@@ -295,10 +337,10 @@ eval_gpu = EvaluateSystem(
|
|
|
295
337
|
S_threshold=0.02,
|
|
296
338
|
p_model_filepath='/path/to/model_p.h5',
|
|
297
339
|
s_model_filepath='/path/to/model_s.h5',
|
|
298
|
-
stations2use=2,
|
|
299
340
|
cpu_id_list=[0,1],
|
|
300
341
|
set_vram_mb=24750,
|
|
301
|
-
selected_gpus=[0]
|
|
342
|
+
selected_gpus=[0],
|
|
343
|
+
stations2use=2
|
|
302
344
|
)
|
|
303
345
|
eval_gpu.evaluate()
|
|
304
346
|
```
|
|
@@ -318,17 +360,24 @@ eval_cpu = EvaluateSystem(
|
|
|
318
360
|
S_threshold=0.02,
|
|
319
361
|
p_model_filepath='/path/to/model_p.h5',
|
|
320
362
|
s_model_filepath='/path/to/model_s.h5',
|
|
321
|
-
|
|
322
|
-
|
|
323
|
-
|
|
363
|
+
cpu_id_list=range(0,49),
|
|
364
|
+
min_cpu_amount=20,
|
|
365
|
+
cpu_test_step_size=1,
|
|
366
|
+
stations2use=50,
|
|
367
|
+
starting_amount_of_stations=25,
|
|
324
368
|
station_list_step_size=1,
|
|
325
|
-
|
|
326
|
-
|
|
327
|
-
|
|
369
|
+
min_conc_stations=25,
|
|
370
|
+
conc_station_tasks_step_size=5,
|
|
371
|
+
start_time='2024-12-15 12:00:00',
|
|
372
|
+
end_time='2024-12-15 14:00:00',
|
|
373
|
+
conc_timechunk_tasks_step_size=1,
|
|
374
|
+
timechunk_dt=30,
|
|
375
|
+
waveform_overlap=2,
|
|
376
|
+
tmp_dir=tmp_dir)
|
|
328
377
|
eval_cpu.evaluate()
|
|
329
378
|
```
|
|
330
379
|
|
|
331
|
-
**EvaluateSystem** will iterate through different combinations of CPU(s), Concurrent
|
|
380
|
+
**EvaluateSystem** will iterate through different combinations of CPU(s), Concurrent Timechunk and Station Tasks, as well as GPU(s), and the amount of VRAM (MB) each Concurrent Prediction can use.
|
|
332
381
|
**EvaluateSystem** will take time, depending on the number of CPU/GPUs, the amount of VRAM available, and the total workload that needs to be tested. However, after doing the testing once for most if not all usecases,
|
|
333
382
|
the trial data will be available and can be used to identify the optimal input parallelization configurations for **EQCCTMSeedRunner** to use to get the maximum amount of processing out of your system in the shortest amount of time.
|
|
334
383
|
|
|
@@ -336,14 +385,14 @@ The following input parameters need to be configurated for **EvaluateSystem** to
|
|
|
336
385
|
|
|
337
386
|
- **`mode (str)`**
|
|
338
387
|
- Can be either `cpu` or `gpu`
|
|
339
|
-
- Tells `EvaluateSystem` which
|
|
388
|
+
- Tells `EvaluateSystem` which computing approach (CPU or GPU) the trials should iterate with
|
|
340
389
|
- **`intra_threads (int)`: default = 1**
|
|
341
390
|
- Controls how many intra-parallelism threads Tensorflow can use
|
|
342
391
|
- **`inter_threads (int)`: default = 1**
|
|
343
392
|
- Controls how many inter-parallelism threads Tensorflow can use
|
|
344
393
|
- **`input_dir (str)`**
|
|
345
394
|
- Directory path to the mSEED directory
|
|
346
|
-
- EX. /home/skevofilaxc/my_work_directory/eqcct/eqcctpro/
|
|
395
|
+
- EX. /home/skevofilaxc/my_work_directory/eqcct/eqcctpro/230_stations_1_min_dt
|
|
347
396
|
- **`output_dir (str)`**
|
|
348
397
|
- Directory path to where the output picks and logs will be sent
|
|
349
398
|
- Doesn't need to exist, will be created if doesn't exist
|
|
@@ -363,14 +412,20 @@ The following input parameters need to be configurated for **EvaluateSystem** to
|
|
|
363
412
|
- Filepath to where the P EQCCT detection model is stored
|
|
364
413
|
- **`s_model_filepath (str)`**
|
|
365
414
|
- Filepath to where the S EQCCT detection model is stored
|
|
366
|
-
- **`stations2use (int)`: default = None**
|
|
367
|
-
- Controls the maximum amount of stations EvaluateSystem can use in its trial iterations
|
|
368
|
-
- Sample data has been provided so that the maximum is 50, however, if using custom data, configure for your specific usecase
|
|
369
415
|
- **`cpu_id_list (list)`: default = [1]**
|
|
370
416
|
- List that defines which specific CPU cores sched_setaffinity will allocate for executing the current EQCCTPro process and **is the maximum number of cores EvaluateSystem can use in its trial iterations**
|
|
371
417
|
- Allows for specific allocation and limitation of CPUs for a given EQCCTPro process
|
|
372
418
|
- "I want this program to run only on these specific cores."
|
|
373
419
|
- Must be at least 1 CPU if using GPUs (Ray needs CPUs to manage the Raylets (concurrent tasks), however the processing of the waveform is done on the GPU)
|
|
420
|
+
- **`min_cpu_amount (int)`: default = 1**
|
|
421
|
+
- Is the minimum amount of CPUs you want to start your trials with
|
|
422
|
+
- By default, trials will start iterating with 1 CPU up to the maximum allocated
|
|
423
|
+
- Can now set a value as the starting point, such as 15 CPUs up to the maximum of for instance 25
|
|
424
|
+
- **`cpu_test_step_size`: default = 1**
|
|
425
|
+
- Is the step size with which the trials march from `min_cpu_amount` to `len(cpu_id_list)`
|
|
426
|
+
- **`stations2use (int)`: default = None**
|
|
427
|
+
- Controls the maximum amount of stations EvaluateSystem can use in its trial iterations
|
|
428
|
+
- Sample data has been provided so that the maximum is 50, however, if using custom data, configure for your specific usecase
|
|
374
429
|
- **`starting_amount_of_stations (int)`: default = 1**
|
|
375
430
|
- For evaluating your system, you have the option to set a starting amount of stations you want to use in the test
|
|
376
431
|
- By default, the test will start using 1 station but now is configurable
|
|
@@ -378,16 +433,34 @@ The following input parameters need to be configurated for **EvaluateSystem** to
|
|
|
378
433
|
- You can set a step size for the station list that is generated
|
|
379
434
|
- For example, if the step size is set to 10 and you start with 50 stations with a max of 100, then your list would be: [50, 60, 70, 80, 90, 100]
|
|
380
435
|
- Using 1 will use the default step size of 1-10, then step size of 5 up to station2use
|
|
381
|
-
- **`
|
|
382
|
-
- Is the minimum amount of
|
|
383
|
-
- By default, trials will start iterating with 1 CPU up to the maximum allocated
|
|
384
|
-
- Can now set a value as the starting point, such as 15 CPUs up to the maximum of for instance 25
|
|
385
|
-
- **`min_conc_predictions (int)`: default = 1**
|
|
386
|
-
- Is the minimum amount of concurrent predictions you want each trial iteration to start with
|
|
436
|
+
- **`min_conc_stations (int)`: default = 1**
|
|
437
|
+
- Is the minimum amount of concurrent stations predictions you want each trial iteration to start with
|
|
387
438
|
- By default, if `min_conc_predictions` and `conc_predictions_step_size` are set to 1, a custom step size iteration will be applied to test the 50 sample waveforms. The sequence follows: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, n+5, 50].
|
|
388
|
-
- **`
|
|
389
|
-
- Is the concurrent predictions step size you want each trial iteration to iterate with
|
|
439
|
+
- **`conc_station_tasks_step_size (int)`: default = 1**
|
|
440
|
+
- Is the concurrent station predictions step size you want each trial iteration to iterate with
|
|
390
441
|
- By default, if `min_conc_predictions` and `conc_predictions_step_size` are set to 1, a custom step size iteration will be applied to test the 50 sample waveforms. The sequence follows: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, n+5, 50]
|
|
442
|
+
- **`start_time (str)`: default = None**
|
|
443
|
+
- The start time of the area of time that is being analyzed
|
|
444
|
+
- EX. 2024-12-14 12:00:00
|
|
445
|
+
- Must follow the convention YYYY-MO-DA HR:MI:SC
|
|
446
|
+
- Used to split the defined analysis timeframe into a list of timechunks
|
|
447
|
+
- Also used by the EvaluateSystem() class to note the analysis timeframe in the results CSV file for later review
|
|
448
|
+
- **`end_time (str)`: default = None**
|
|
449
|
+
- The end time of the period being analyzed
|
|
450
|
+
- EX. 2024-12-15 12:00:00
|
|
451
|
+
- Must follow the convention YYYY-MO-DA HR:MI:SC
|
|
452
|
+
- Used to split the defined analysis timeframe into a list of timechunks
|
|
453
|
+
- Also used by the EvaluateSystem() class to note the analysis timeframe in the results CSV file for later review
|
|
454
|
+
- **`conc_timechunk_tasks_step_size (int)`: default = 1**
|
|
455
|
+
- The step size by which the number of concurrent timechunk predictions increases between trial iterations
|
|
456
|
+
- **`timechunk_dt (int)`: default = None**
|
|
457
|
+
- The length of each timechunk (in minutes)
|
|
458
|
+
- EX. If timechunk_dt = 10 and the analysis period is 30 minutes, then three 10-minute long timechunks will be created
|
|
459
|
+
- **`waveform_overlap (int)`: default = None**
|
|
460
|
+
- The duration (in minutes) for which each waveform overlaps with the others
|
|
461
|
+
- **`tmp_dir (str)`: default = 1**
|
|
462
|
+
- A temporary directory to store all temp files produced by EQCCTPro
|
|
463
|
+
- Eases system cleanup and avoids writing to the system's default temporary directory
|
|
391
464
|
- **`set_vram_mb (float)`**
|
|
392
465
|
- Value of the maximum amount of VRAM EQCCTPro can use
|
|
393
466
|
- Must be a real value based on your hardware's physical memory space; if it exceeds that space, the code will break with an OutOfMemoryError
|
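The step-size behavior described above for the station list and for the concurrent station predictions can be sketched in a few lines of Python. This is only an illustration of the sequences quoted in the text, not EQCCTPro's actual implementation: the helper name `build_trial_steps` is hypothetical, and always appending the maximum value when the step would skip it is an assumption.

```python
from __future__ import annotations


def build_trial_steps(start: int, stop: int, step: int) -> list[int]:
    """Hypothetical helper: values a trial sweep visits from start to stop (inclusive)."""
    if start < 1 or stop < start or step < 1:
        raise ValueError("expected 1 <= start <= stop and step >= 1")
    if start == 1 and step == 1:
        # Default pattern quoted in the text: 1..10 in steps of 1, then steps of 5.
        values = [v for v in range(1, 11) if v <= stop]
        values += list(range(15, stop + 1, 5))
    else:
        # Custom starting point and step size.
        values = list(range(start, stop + 1, step))
    # Assumption: the maximum is always tested, even if the step would skip it.
    if values[-1] != stop:
        values.append(stop)
    return values


print(build_trial_steps(50, 100, 10))  # [50, 60, 70, 80, 90, 100]
print(build_trial_steps(1, 50, 1))     # 1 through 10, then 15, 20, ..., 50
```

With a start of 50, a maximum of 100, and a step size of 10, the sketch reproduces the station list from the example; with both the minimum and the step size set to 1, it reproduces the default sequence for the 50 sample waveforms.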
|
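To make the relationship between `start_time`, `end_time`, `timechunk_dt`, and `waveform_overlap` concrete, the sketch below splits an analysis timeframe into timechunks using only the Python standard library. It is not taken from EQCCTPro: the function name `build_timechunks` is hypothetical, the `strptime` format string is an assumed equivalent of the `YYYY-MO-DA HR:MI:SC` convention, and applying the overlap by moving each chunk's start earlier (except for the first chunk) is an assumption about how overlapping could work.

```python
from datetime import datetime, timedelta

# Assumed strptime equivalent of the YYYY-MO-DA HR:MI:SC convention.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_timechunks(start_time, end_time, timechunk_dt, waveform_overlap=0):
    """Hypothetical helper: split [start_time, end_time] into (start, end) datetime pairs."""
    start = datetime.strptime(start_time, TIME_FORMAT)
    end = datetime.strptime(end_time, TIME_FORMAT)
    if end <= start or timechunk_dt <= 0 or waveform_overlap < 0:
        raise ValueError("need end_time > start_time, timechunk_dt > 0, overlap >= 0")

    chunk = timedelta(minutes=timechunk_dt)
    overlap = timedelta(minutes=waveform_overlap)
    chunks = []
    cursor = start
    while cursor < end:
        # Assumption: the overlap extends a chunk backwards into the previous one,
        # but never before the start of the analysis timeframe.
        chunk_start = max(cursor - overlap, start)
        chunk_end = min(cursor + chunk, end)
        chunks.append((chunk_start, chunk_end))
        cursor += chunk
    return chunks


# A 30-minute analysis period with timechunk_dt = 10 gives three timechunks.
for chunk_start, chunk_end in build_timechunks(
    "2024-12-14 12:00:00", "2024-12-14 12:30:00", timechunk_dt=10
):
    print(chunk_start.strftime(TIME_FORMAT), "->", chunk_end.strftime(TIME_FORMAT))
```

Parsing both timestamps with one fixed format string rejects malformed input early, which is one reason a strict convention for `start_time` and `end_time` is useful.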
File without changes
|
|