sl-shared-assets 5.0.1-py3-none-any.whl → 5.1.0-py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of sl-shared-assets might be problematic.

@@ -1,10 +1,10 @@
- """This module provides the core Job class, used as the starting point for all SLURM-managed job executed on lab compute
- server(s). Specifically, the Job class acts as a wrapper around the SLURM configuration and specific logic of each
- job. During runtime, the Server class interacts with input job objects to manage their transfer and execution on the
- remote servers.
+ """This module provides the core Job class, used as the starting point for all SLURM-managed jobs executed on remote
+ compute server(s). Specifically, the Job class encapsulates the SLURM configuration and specific logic of each job.
+ During runtime, the Server class interacts with input Job objects to manage their transfer and execution on the remote
+ compute servers.

- Since version 3.0.0, this module also provides the specialized JupyterJob class used to launch remote Jupyter
- notebook servers.
+ Since version 3.0.0, this module also provides the specialized JupyterJob class used to launch remote Jupyter notebook
+ servers.
  """

  import re
@@ -22,8 +22,8 @@ class _JupyterConnectionInfo:
  """Stores the data used to establish the connection with a Jupyter notebook server running under SLURM control on a
  remote Sun lab server.

- More specifically, this class is used to transfer the connection metadata collected on the remote server back to
- the local machine that requested the server to be established.
+ This class is used to transfer the connection metadata collected on the remote server back to the local machine
+ that requested the server to be established.
  """

  compute_node: str
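For orientation, a class like this typically reduces to a small dataclass whose fields travel from the remote server back to the requesting machine. A minimal sketch follows; only the compute_node attribute is visible in this hunk, so the port and token fields are assumptions added purely for illustration:

    from dataclasses import dataclass


    @dataclass
    class _JupyterConnectionInfo:
        """Transfers Jupyter connection metadata from the remote server to the requesting local machine."""

        compute_node: str  # The SLURM compute node hosting the notebook server (shown in the hunk above).
        port: int = 8888   # Hypothetical field: the port the notebook server listens on.
        token: str = ""    # Hypothetical field: the access token embedded in the server URL.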
@@ -52,7 +52,8 @@ class Job:

  This class provides the API for constructing any server-side job in the Sun lab. Internally, it wraps an instance
  of a Slurm class to package the job data into the format expected by the SLURM job manager. All jobs managed by this
- class instance should be submitted to an initialized Server class 'submit_job' method to be executed on the server.
+ class instance should be submitted to an initialized Server instance's submit_job() method to be executed on the
+ server.

  Notes:
  The initialization method of the class contains the arguments for configuring the SLURM and Conda environments
@@ -61,20 +62,16 @@ class Job:

  Each job can be conceptualized as a sequence of shell instructions to execute on the remote compute server. For
  the lab, that means that the bulk of the command consists of calling various CLIs exposed by data processing or
- analysis pipelines, installed in the Conda environment on the server. Other than that, the job contains commands
- for activating the target conda environment and, in some cases, doing other preparatory or cleanup work. The
- source code of a 'remote' job is typically identical to what a human operator would type in a 'local' terminal
- to run the same job on their PC.
+ analysis pipelines, installed in the calling user's Conda environments on the server. The Job instance also
+ contains commands for activating the target conda environment and, in some cases, doing other preparatory or
+ cleanup work. The source code of a 'remote' job is typically identical to what a human operator would type in a
+ 'local' terminal to run the same job on their PC.

  A key feature of server-side jobs is that they are executed on virtual machines managed by SLURM. Since the
  server has a lot more compute and memory resources than likely needed by individual jobs, each job typically
  requests a subset of these resources. Upon being executed, SLURM creates an isolated environment with the
  requested resources and runs the job in that environment.

- Since all jobs are expected to use the CLIs from python packages (pre)installed on the BioHPC server, make sure
- that the target environment is installed and configured before submitting jobs to the server. See notes in
- ReadMe to learn more about configuring server-side conda environments.
-
  Args:
  job_name: The descriptive name of the SLURM job to be created. Primarily, this name is used in terminal
  printouts to identify the job to human operators.
@@ -145,8 +142,8 @@ class Job:
  def add_command(self, command: str) -> None:
  """Adds the input command string to the end of the managed SLURM job command list.

- This method is a wrapper around simple_slurm's 'add_cmd' method. It is used to iteratively build the shell
- command sequence of the job.
+ This method is a wrapper around simple-slurm's add_cmd() method. It is used to iteratively build the shell
+ command sequence for the managed job.

  Args:
  command: The command string to add to the command list, e.g.: 'python main.py --input 1'.
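Read together, the Job docstrings in the hunks above describe a build-then-submit workflow. The sketch below illustrates it under stated assumptions: only job_name, add_command(), and submit_job() are named in the diff, while the import path, the conda_environment keyword, and the Server constructor arguments are placeholders:

    from sl_shared_assets import Job, Server  # Assumed import path for the classes named above.

    # Assumed keyword arguments: only job_name is documented verbatim in this diff; the
    # conda_environment keyword mirrors the argument described for JupyterJob below.
    job = Job(job_name="example-processing-job", conda_environment="processing")

    # add_command() iteratively appends shell instructions to the job's command sequence.
    job.add_command("python main.py --input 1")

    # Per the class docstring, jobs are executed by submitting them to an initialized Server
    # instance's submit_job() method; the Server constructor arguments are placeholders here.
    server = Server()
    server.submit_job(job)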
@@ -159,8 +156,8 @@ class Job:
  """Translates the managed job data into a shell-script-writable string and returns it to caller.

  This method is used by the Server class to translate the job into the format that can be submitted to and
- executed on the remote compute server. Do not call this method manually unless you know what you are doing.
- The returned string is safe to dump into a .sh (shell script) file and move to the BioHPC server for execution.
+ executed on the remote compute server. The returned string is safe to dump into a .sh (shell script) file and
+ move to the remote compute server for execution.
  """

  # Appends the command to clean up (remove) the temporary script file after processing runtime is over
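The docstring above only states that the translated string can be dumped into a .sh file; the method's name is not visible in this hunk, so build_script_string() below is a hypothetical stand-in used purely to illustrate the idea, continuing the Job sketch from earlier:

    from pathlib import Path

    # Hypothetical accessor name; the diff shows the method's docstring but not its signature.
    script_text = job.build_script_string()

    # Per the docstring, the returned string is safe to write to a .sh file and move to the
    # remote compute server for execution.
    Path("example_job.sh").write_text(script_text)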
@@ -178,11 +175,11 @@ class Job:


  class JupyterJob(Job):
- """Specialized Job instance designed to launch a Jupyter notebook server on SLURM.
+ """Aggregates the data of a specialized job used to launch a Jupyter notebook server under SLURM's control.

- This class extends the base Job class to include Jupyter-specific configuration and commands for starting a
- notebook server in a SLURM environment. Using this specialized job allows users to set up remote Jupyter servers
- while still benefitting from SLURM's job management and fair airtime policies.
+ This class extends the base Job class to include specific configuration and commands for starting a Jupyter notebook
+ server in a SLURM environment. Using this specialized job allows users to set up remote Jupyter servers while
+ benefitting from SLURM's job scheduling and resource management policies.

  Notes:
  Jupyter servers directly compete for resources with headless data processing jobs. Therefore, it is important
@@ -195,34 +192,28 @@ class JupyterJob(Job):
  data of the job.
  error_log: The absolute path to the .txt file on the processing server, where to store the standard error
  data of the job.
- working_directory: The absolute path to the directory where temporary job files will be stored. During runtime,
- classes from this library use that directory to store files such as the job's shell script. All such files
- are automatically removed from the directory at the end of a non-errors runtime.
- conda_environment: The name of the conda environment to activate on the server before running the job logic. The
+ working_directory: The absolute path to the directory where to store temporary job files.
+ conda_environment: The name of the conda environment to activate on the server before running the job. The
  environment should contain the necessary Python packages and CLIs to support running the job's logic. For
  Jupyter jobs, this necessarily includes the Jupyter notebook and jupyterlab packages.
- port: The connection port number for the Jupyter server. Do not change the default value unless you know what
- you are doing, as the server has most common communication ports closed for security reasons.
- notebook_directory: The directory to use as Jupyter's root. During runtime, Jupyter will only have access to
- items stored in or under this directory. For most runtimes, this should be set to the user's root data or
- working directory.
- cpus_to_use: The number of CPUs to allocate to the Jupyter server. Keep this value as small as possible to avoid
- interfering with headless data processing jobs.
- ram_gb: The amount of RAM, in GB, to allocate to the Jupyter server. Keep this value as small as possible to
- avoid interfering with headless data processing jobs.
- time_limit: The maximum Jupyter server uptime, in minutes. Set this to the expected duration of your jupyter
- session.
- jupyter_args: Stores additional arguments to pass to jupyter notebook initialization command.
+ port: The connection port to use for the Jupyter server.
+ notebook_directory: The root directory where to run the Jupyter notebook. During runtime, the notebook will
+ only have access to items stored under this directory. For most runtimes, this should be set to the user's
+ root working directory.
+ cpus_to_use: The number of CPUs to allocate to the Jupyter server.
+ ram_gb: The amount of RAM, in GB, to allocate to the Jupyter server.
+ time_limit: The maximum Jupyter server uptime, in minutes.
+ jupyter_args: Stores additional arguments to pass to the jupyter notebook initialization command.

  Attributes:
- port: Stores the connection port of the managed Jupyter server.
- notebook_dir: Stores the absolute path to the directory used as Jupyter's root, relative to the remote server
- root.
+ port: Stores the connection port for the managed Jupyter server.
+ notebook_dir: Stores the absolute path to the directory used to run the Jupyter notebook, relative to the
+ remote server root.
  connection_info: Stores the JupyterConnectionInfo instance after the Jupyter server is instantiated.
  host: Stores the hostname of the remote server.
  user: Stores the username used to connect with the remote server.
- connection_info_file: The absolute path to the file that stores connection information relative to the remote
- server root.
+ connection_info_file: Stores the absolute path to the file that contains the connection information for the
+ initialized Jupyter session, relative to the remote server root.
  _command: Stores the shell command for launching the Jupyter server.
  """
 
@@ -301,10 +292,12 @@ class JupyterJob(Job):
  self.add_command(jupyter_cmd_str)

  def parse_connection_info(self, info_file: Path) -> None:
- """Parses the connection information file created by the Jupyter job on the server.
+ """Parses the connection information file created by the Jupyter job on the remote server.

- Use this method to parse the connection file fetched from the server to finalize setting up the Jupyter
- server job.
+ This method is used to finalize the remote Jupyter session initialization by parsing the connection session
+ instructions from the temporary storage file created by the remote Job running on the server. After this
+ method's runtime, the print_connection_info() method can be used to print the connection information to the
+ terminal.

  Args:
  info_file: The path to the .txt file generated by the remote server that stores the Jupyter connection
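Once the remote job has written its connection file, the two methods named above combine into a short parse-then-print sequence. A minimal sketch, assuming the .txt file has already been fetched from the server to a local path (the local path and the fetching step are placeholders):

    from pathlib import Path

    # Assumes the connection file written by the remote job has already been copied to the
    # local machine (for example, by the Server class that manages the job).
    local_info_file = Path("jupyter_connection_info.txt")

    # parse_connection_info() finalizes the remote session setup from the fetched file...
    jupyter_job.parse_connection_info(info_file=local_info_file)

    # ...after which print_connection_info() reports the SSH tunnel command and localhost URL.
    jupyter_job.print_connection_info()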
@@ -336,7 +329,7 @@ class JupyterJob(Job):

  The SSH command should be used via a separate terminal or subprocess call to establish the secure SSH tunnel to
  the Jupyter server. Once the SSH tunnel is established, the printed localhost url can be used to view the
- server from the local machine.
+ server from the local machine's browser.
  """

  # If connection information is not available, there is nothing to print
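The docstring above notes that the printed SSH command can be run from a separate terminal or through a subprocess call. A sketch of the subprocess route follows; the username, hostnames, and port are placeholders for whatever print_connection_info() actually reports:

    import subprocess
    import webbrowser

    # Placeholder connection details; substitute the values printed by print_connection_info().
    user, host, compute_node, port = "researcher", "server.example.org", "node01", 8888

    # Forward the local port to the Jupyter server's port on the compute node via the login host.
    tunnel = subprocess.Popen(["ssh", "-N", "-L", f"{port}:{compute_node}:{port}", f"{user}@{host}"])

    # Once the tunnel is up, the printed localhost URL opens in the local machine's browser.
    webbrowser.open(f"http://localhost:{port}")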