Software update: HTCondor 10.5.0 / 10.0.5

The HTCondor Team at the University of Wisconsin-Madison has released new feature and long-term support versions of its workload management system, HTCondor. The version numbers are 10.5.0 and 10.0.5. HTCondor focuses on managing compute-intensive tasks and can distribute them across several connected nodes. Users submit their tasks to HTCondor, which runs them based on configured policies and the availability of connected resources, and finally returns the results to the user. For example, HTCondor can manage a dedicated Beowulf cluster, but also ordinary desktop machines that are otherwise idle. During SC16, Google, Fermilab and the HTCondor Team demonstrated a 160k-core cloud-based elastic compute cluster, and in 2020 the National Science Foundation chose HTCondor as part of its Partnership to Advance Throughput Computing. The brief release notes for these releases are as follows:

Version 10.5.0 – Feature Channel

  • Can now define DAGMan save points to be able to rerun DAGs from there
  • Expand default list of environment variables passed to the DAGMan manager
  • Administrators can prevent users from using “getenv = true” in submit files
  • Improved throughput when submitting a large number of ARC-CE jobs
  • Execute events contain the slot name, sandbox path, resource quantities
  • Can add attributes of the execution point to be recorded in the user log
  • Enhanced condor_transform_ads tool to ease offline job transform testing
  • Fixed a bug where memory limits over 2 GiB might not be correctly enforced
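
For submit files that still rely on “getenv = true”, a quick check before submission can flag the line in case an administrator has disallowed it. The following is a minimal sketch in Python. The helper name `uses_getenv_true` is hypothetical and not part of HTCondor, and it only recognizes simple `key = value` lines rather than the full submit language.

```python
import re

# Hypothetical helper, not part of HTCondor. Matches lines such as
# "getenv = true" or "GetEnv=True"; the match is case-insensitive.
_GETENV_TRUE = re.compile(r"^\s*getenv\s*=\s*true\s*$", re.IGNORECASE)


def uses_getenv_true(submit_text: str) -> bool:
    """Return True if any line of the submit description sets getenv to true."""
    return any(_GETENV_TRUE.match(line) for line in submit_text.splitlines())


example = """\
executable = analyze.sh
getenv = True
queue
"""
print(uses_getenv_true(example))
```

A line such as `getenv = HOME, PATH`, which names individual variables, is deliberately not flagged by this sketch.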

Version 10.0.5 – Long Term Support Channel

  • Rename upgrade9to10checks.py script to condor_upgrade_check
  • Fix spurious warning from condor_upgrade_check about regexes with spaces

Version 10.0.4 – Long Term Support Channel

  • Provides script to assist updating from HTCondor version 9 to version 10
  • Fixes a bug where rarely an output file would not be transferred back
  • Fixes counting of submitted jobs, so MAX_JOBS_SUBMITTED works correctly
  • Fixes SSL Authentication failure when PRIVATE_NETWORK_NAME was set
  • Fixes rare crash when SSL or SCITOKENS authentication was attempted
  • Can allow client to present an X.509 proxy during SSL authentication
  • Fixes issue where a user's jobs were ignored by the HTCondor-CE on restart
  • Fixes issues where some events that HTCondor-CE depends on were missing

Version 10.0.3 – Long Term Support Channel

  • GPU metrics continue to be reported after the startd is reconfigured
  • Fixed issue where GPU metrics could be wildly over-reported
  • Fixed issue that kept jobs from running when installed on Debian or Ubuntu
  • Fixed DAGMan problem when retrying a proc failure in a multi-proc node

Version 10.0.2 – Long Term Support Channel

  • HTCondor can optionally create intermediate directories for output files
  • Improved condor_schedd scalability when a user runs more than 1,000 jobs
  • Fix issue where condor_ssh_to_job fails if the user is not in /etc/passwd
  • The Python Schedd.query() now returns the ServerTime attribute for Fifemon
  • VM Universe jobs pass through the host CPU model to support newer kernels
  • HTCondor Python wheel is now available for Python 3.11
  • Fix issue that prevented HTCondor installation on Ubuntu 18.04

Version 10.0.1 – Long Term Support Channel

  • Add Ubuntu 22.04 (Jammy Jellyfish) support
  • Add file transfer plugin that supports stash:// and osdf:// URLs
  • Fix bug where cgroup memory limits were not enforced on Debian and Ubuntu
  • Fix bug where forcibly removing DAG jobs could crash the condor_schedd
  • Fix bug where Docker repository images cannot be run under Singularity
  • Fix issue where blahp scripts were missing on Debian and Ubuntu platforms
  • Fix bug where curl file transfer plugins would fail on Enterprise Linux 8

Version 10.0.0 – Long Term Support Channel

  • Users can prevent runaway jobs by specifying an allowed duration
  • Able to extend submit commands and create job submit templates
  • Initial implementation of htcondor command line interface
  • Initial implementation of Job Sets in the htcondor CLI tool
  • Add Container Universe
  • Support for heterogeneous GPUs
  • Improved File transfer error reporting
  • GSI Authentication method has been removed
  • HTCondor now utilizes ARC-CE’s REST interface
  • Support for ARM and PowerPC for Enterprise Linux 8
  • For IDTOKENS, signing key not required on every execution point
  • Trust on first use ability for SSL connections
  • Improvements against replay attacks
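
As an illustration of the first item in this list, the sketch below renders a small submit description with a run-time cap. It assumes that the `allowed_job_duration` submit command, which takes a value in seconds, is the feature this item refers to. The function name and the executable `simulate.sh` are made up for the example.

```python
def submit_description(executable: str, max_seconds: int) -> str:
    """Render a minimal submit description with a cap on run time.

    Hypothetical helper for illustration; it only builds text and does
    not talk to HTCondor.
    """
    if max_seconds <= 0:
        raise ValueError("max_seconds must be positive")
    lines = [
        f"executable = {executable}",
        # Assumption: allowed_job_duration is given in seconds.
        f"allowed_job_duration = {max_seconds}",
        "queue",
    ]
    return "\n".join(lines) + "\n"


# Cap a job at two hours.
print(submit_description("simulate.sh", 2 * 60 * 60))
```

The resulting text could be saved to a file and passed to condor_submit.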

Version number: 10.5.0 / 10.0.5
Release status: Final
Operating systems: Linux, BSD, macOS, Solaris, UNIX, Windows Server 2012, Windows 10, Windows Server 2016, Windows Server 2019, Windows 11
Website: HTCondor
Download: https://research.cs.wisc.edu/htcondor/htcondor/download/
License type: Conditions (GNU/BSD/etc.)