lsst-ctrl-bps-htcondor 29.2025.2200.tar.gz → 29.2025.2400.tar.gz

This diff covers publicly available package versions as released to one of the supported registries; it is provided for informational purposes only and reflects the changes between those versions as they appear in their respective public registries.
Files changed (33)
  1. {lsst_ctrl_bps_htcondor-29.2025.2200/python/lsst_ctrl_bps_htcondor.egg-info → lsst_ctrl_bps_htcondor-29.2025.2400}/PKG-INFO +1 -1
  2. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/doc/lsst.ctrl.bps.htcondor/userguide.rst +52 -10
  3. lsst_ctrl_bps_htcondor-29.2025.2400/python/lsst/ctrl/bps/htcondor/version.py +2 -0
  4. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400/python/lsst_ctrl_bps_htcondor.egg-info}/PKG-INFO +1 -1
  5. lsst_ctrl_bps_htcondor-29.2025.2200/python/lsst/ctrl/bps/htcondor/version.py +0 -2
  6. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/COPYRIGHT +0 -0
  7. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/LICENSE +0 -0
  8. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/MANIFEST.in +0 -0
  9. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/README.rst +0 -0
  10. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/bsd_license.txt +0 -0
  11. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/doc/lsst.ctrl.bps.htcondor/CHANGES.rst +0 -0
  12. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/doc/lsst.ctrl.bps.htcondor/index.rst +0 -0
  13. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/gpl-v3.0.txt +0 -0
  14. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/pyproject.toml +0 -0
  15. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst/ctrl/bps/htcondor/__init__.py +0 -0
  16. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst/ctrl/bps/htcondor/etc/__init__.py +0 -0
  17. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst/ctrl/bps/htcondor/etc/htcondor_defaults.yaml +0 -0
  18. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst/ctrl/bps/htcondor/final_post.sh +0 -0
  19. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst/ctrl/bps/htcondor/handlers.py +0 -0
  20. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst/ctrl/bps/htcondor/htcondor_config.py +0 -0
  21. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst/ctrl/bps/htcondor/htcondor_service.py +0 -0
  22. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst/ctrl/bps/htcondor/lssthtc.py +0 -0
  23. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst/ctrl/bps/htcondor/provisioner.py +0 -0
  24. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst_ctrl_bps_htcondor.egg-info/SOURCES.txt +0 -0
  25. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst_ctrl_bps_htcondor.egg-info/dependency_links.txt +0 -0
  26. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst_ctrl_bps_htcondor.egg-info/requires.txt +0 -0
  27. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst_ctrl_bps_htcondor.egg-info/top_level.txt +0 -0
  28. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/python/lsst_ctrl_bps_htcondor.egg-info/zip-safe +0 -0
  29. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/setup.cfg +0 -0
  30. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/tests/test_handlers.py +0 -0
  31. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/tests/test_htcondor_service.py +0 -0
  32. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/tests/test_lssthtc.py +0 -0
  33. {lsst_ctrl_bps_htcondor-29.2025.2200 → lsst_ctrl_bps_htcondor-29.2025.2400}/tests/test_provisioner.py +0 -0
PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: lsst-ctrl-bps-htcondor
-Version: 29.2025.2200
+Version: 29.2025.2400
 Summary: HTCondor plugin for lsst-ctrl-bps.
 Author-email: Rubin Observatory Data Management <dm-admin@lists.lsst.org>
 License: BSD 3-Clause License
doc/lsst.ctrl.bps.htcondor/userguide.rst
@@ -181,7 +181,7 @@ DAG, this status can lag behind by a few minutes. Also, DAGMan tracks
 deletion of individual jobs as failures (no separate counts for
 deleted jobs). So the summary report flag column will show ``F`` when
 there are either failed or deleted jobs. If getting a detailed report
-(``bps report --id <id>``), the plugin reads detailed job information
+(``bps report --id <ID>``), the plugin reads detailed job information
 from files. So, the detailed report can distinguish between failed and
 deleted jobs, and thus will show ``D`` in the flag column for a running
 workflow if there is a deleted job.
@@ -202,7 +202,7 @@ jobs are being held, use

 .. code-block:: bash

-   condor_q -hold <id>     # to see a specific job being held
+   condor_q -hold <ID>     # to see a specific job being held
    condor-q -hold <user>   # to see all held jobs owned by the user

 .. _htc-plugin-cancel:
@@ -231,18 +231,18 @@ See `bps restart`_.
 .. Describe any plugin specific aspects of restarting failed jobs below
    if any.

-A valid run id is one of the following:
+A valid run ID is one of the following:

-* job id, e.g., ``1234.0`` (using just the cluster id, ``1234``, will also
+* job ID, e.g., ``1234.0`` (using just the cluster ID, ``1234``, will also
   work),
-* global job id (e.g.,
+* global job ID (e.g.,
   ``sdfrome002.sdf.slac.stanford.edu#165725.0#1699393748``),
 * run's submit directory (e.g.,
   ``/sdf/home/m/mxk/lsst/bps/submit/u/mxk/pipelines_check/20230713T135346Z``).

 .. note::

-   If you don't remember any of the run's id you may try running
+   If you don't remember any of the run's ID you may try running

    .. code::

@@ -299,7 +299,7 @@ alongside the other payload jobs in the workflow that should automatically
 create and maintain glideins required for the payload jobs to run.

 If you enable automatic provisioning of resources, you will see the status of
-the provisioning job in the output of the ``bps report --id <id>`` command.
+the provisioning job in the output of the ``bps report --id <ID>`` command.
 Look for the line starting with "Provisioning job status". For example

 .. code-block:: bash
@@ -446,7 +446,7 @@ If any of your jobs are being held, it will display something similar to::

 The job that is in the hold state can be released from it with
 `condor_release`_ providing the issue that made HTCondor put it in this state
-has been resolved. For example, if your job with id 1234.0 was placed in the
+has been resolved. For example, if your job with ID 1234.0 was placed in the
 hold state because during the execution it exceeded 2048 MiB you requested for
 it during the submission, you can double the amount of memory it should request with
@@ -538,7 +538,49 @@ Troubleshooting
 Where is stdout/stderr from pipeline tasks?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-For now, stdout/stderr can be found in files in the run submit directory.
+For now, stdout/stderr can be found in files in the run submit directory
+after the job is done. Python logging goes to stderr so the majority
+of the pipetask output will be in the \*.err file. One exception is
+``finalJob`` which does print some information to stdout (\*.out file)
+
+While the job is running, the owner of the job can use ``condor_tail``
+command to peek at the stdout/stderr of a job. ``bps`` uses the ID for
+the entire workflow. But for the HTCondor command ``condor_tail``
+you will need the ID for the individual job. Run the following command
+and look for the ID for the job (undefined's are normal and normally
+correspond to the DAGMan jobs).
+
+.. code-block::
+
+   condor_q -run -nobatch -af:hj bps_job_name bps_run
+
+Once you have the HTCondor ID for the particular job you want to peek
+at the output, run this command:
+
+.. code-block::
+
+   condor_tail -stderr -f <ID>
+
+If you want to instead see the stdout, leave off the ``-stderr``.
+If you need to see more of the contents specify ``-maxbytes <numbytes>``
+(defaults to 1024 bytes).
+
+I need to look around on the compute node where my job is running.
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If using glideins, you might be able to just ``ssh`` to the compute
+node from the submit node. First, need to find out on which node the
+job is running.
+
+.. code-block::
+
+   condor_q -run -nobatch -af:hj RemoteHost bps_job_name bps_run
+
+Alternatively, HTCondor has the command ``condor_ssh_to_job`` where you
+just need the job ID. This is not the workflow ID (the ID that ``bps``
+commands use), but an individual job ID. The command above also prints
+the job IDs.
+

 Why did my submission fail?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -556,7 +598,7 @@ will continue normally until the existing gliedins expire. As a result,
 payload jobs may get stuck in the job queue if the glideins were not created
 or expired before the execution of the workflow could be completed.

-Firstly, use ``bps report --id <run id>`` to display the run report and look
+Firstly, use ``bps report --id <run ID>`` to display the run report and look
 for the line

 .. code-block::
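Note: the new troubleshooting text names ``condor_ssh_to_job`` but stops short of showing an invocation. As a hedged sketch, assuming a running job with ID 1234.0 taken from the ``condor_q`` output described above, a session could look like:

    # Illustrative only: open a shell in the sandbox of running job 1234.0
    # on its compute node, look around, then leave.
    condor_ssh_to_job 1234.0
    ls -la    # inspect the job's scratch directory
    exit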
python/lsst/ctrl/bps/htcondor/version.py (new file)
@@ -0,0 +1,2 @@
+__all__ = ["__version__"]
+__version__ = "29.2025.2400"
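Note: this regenerated version module is the only Python code change in the release. As a quick sanity check (a sketch; it assumes the package is importable in your environment), the new version string can be read back like so:

    # Illustrative only: print the installed plugin version.
    python -c "from lsst.ctrl.bps.htcondor.version import __version__; print(__version__)"
    # expected output: 29.2025.2400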
python/lsst_ctrl_bps_htcondor.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: lsst-ctrl-bps-htcondor
-Version: 29.2025.2200
+Version: 29.2025.2400
 Summary: HTCondor plugin for lsst-ctrl-bps.
 Author-email: Rubin Observatory Data Management <dm-admin@lists.lsst.org>
 License: BSD 3-Clause License
python/lsst/ctrl/bps/htcondor/version.py (removed)
@@ -1,2 +0,0 @@
-__all__ = ["__version__"]
-__version__ = "29.2025.2200"