Software Update: HTCondor 8.9.5

The HTCondor Team at the University of Wisconsin-Madison has released a new stable version of its workload management system, HTCondor. The version number is 8.9.5. HTCondor focuses on managing compute-intensive tasks and can distribute them across multiple connected nodes. Users submit their tasks to HTCondor, which runs them according to configured policies and the availability of connected resources, and then returns the results to the user. HTCondor can, for example, manage a dedicated Beowulf cluster, but also ordinary desktops that are temporarily idle. At SC16, Google, Fermilab and the HTCondor Team demonstrated a 160k-core cloud-based elastic compute cluster. The release notes for this version read as follows:

New Features:

  • Implemented a data flow mode for jobs. When enabled, a job whose 1) pre-declared output files already exist, and 2) output files are more recent than its input files, is considered a dataflow job and gets skipped. This feature can be enabled by setting the SHADOW_SKIP_DATAFLOW_JOBS configuration option to True. (Ticket #7231)
  • Added a new tool, classad_eval, that can evaluate a ClassAd expression in the context of ClassAd attributes, and print the result in ClassAd format. (Ticket #7339)
  • You may now specify ports to forward into your Docker container. See Docker and Networking for details. (Ticket #7322)
  • Added the ability to edit certain properties of a running condor_dagman workflow: MaxJobs, MaxIdle, MaxPreScripts, MaxPostScripts. A user can call condor_qedit to set new values in the job ad, which will then be updated in the running workflow. (Ticket #7236)
  • Jobs which must use temporary credentials for S3 access may now specify the “session token” in their submit files. Set +EC2SessionToken to the name of a file whose only content is the session token. Temporary credentials have a limited lifetime, which HTCondor does not help you manage; as a result, file transfers may fail because the temporary credentials expired. (Ticket #7407)
  • Improved the performance of the negotiator by simplifying the definition of the condor_startd’s WithinResourceLimits attribute when custom resources are defined. (Ticket #7323)
  • If you configure a condor_startd with different SLOT_TYPEs, you can use the SLOT_TYPE as a prefix for configuration entries. This can be useful to set different BASE_CGROUPs for different slot types within the same condor_startd. For example, SLOT_TYPE_1.BASE_CGROUP = hi_prio. (Ticket #7390)
  • Added a new knob SUBMIT_ALLOW_GETENV. This defaults to true. When set to false, a submit file with getenv = true will become an error. Administrators may want to set this to false to prevent users from submitting jobs that depend on the local environment of the submit machine. (Ticket #7383)
  • condor_submit will no longer set the Owner attribute of jobs it submits to the name of the current user. It now leaves this attribute up to the condor_schedd to set. This change was made because the condor_schedd will reject the submission if the Owner attribute is set but does not match the name of the mapped authenticated user submitting the job, and it is difficult for condor_submit to know what the mapped name is when there is a map file configured. (Ticket #7355)
  • Added ability for a condor_startd to log the state of Ads when shutting down using STARTD_PRINT_ADS_ON_SHUTDOWN and STARTD_PRINT_ADS_FILTER. (Ticket #7328)
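
The data flow check described in the first item above resembles a make-style freshness test. The following is a minimal sketch of that idea in Python, not HTCondor's actual implementation: the helper name is_dataflow_job is hypothetical, and the sketch assumes that "more recent" means the oldest output file is newer than the newest input file, and that a job with a missing input is not skipped.

```python
import os


def is_dataflow_job(input_files, output_files):
    """Return True if a job could be skipped under the two conditions above.

    Hypothetical helper, not HTCondor code: both arguments are lists of paths.
    """
    # Condition 1: every pre-declared output file must already exist.
    if not output_files or not all(os.path.exists(p) for p in output_files):
        return False

    # Assumption: a missing input cannot be compared, so do not skip the job.
    if not all(os.path.exists(p) for p in input_files):
        return False

    # Condition 2 (assumed interpretation): the oldest output must be
    # strictly newer than the newest input.
    oldest_output = min(os.path.getmtime(p) for p in output_files)
    newest_input = max(
        (os.path.getmtime(p) for p in input_files), default=float("-inf")
    )
    return oldest_output > newest_input
```

Under these assumptions, modifying any input file after the outputs were written makes the job eligible to run again, which is the behavior one would expect from a data flow mode.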

Bugs Fixed:

  • condor_submit -i now works with Docker universe jobs. (Ticket #7394)
  • Fixed a bug on a Linux condor_startd running as root where a running job approaching its RequestMemory limit could get stuck: it was neither held with an out of memory error, nor killed, nor allowed to run. (Ticket #7367)
  • The Python 3 bindings no longer segfault when putting a ClassAd constructed from a Python dictionary into another ClassAd. (Ticket #7371)
  • The Python 3 bindings were missing the division operator for ExprTree. (Ticket #7372)
  • When calling classad.ClassAd.setdefault() without a default, or with a default of None, if the default is used, it is now treated as the classad.Value.Undefined ClassAd value. (Ticket #7370)
  • Fixed a bug where, when file transfers failed with an error message containing a newline (\n) character, the error message was not propagated to the job’s hold message. (Ticket #7395)
  • SciTokens support is now available on all Linux and macOS platforms. (Ticket #7406)
  • Fixed a bug that caused the Python bindings included in the tarball package to fail due to a missing library dependency. (Ticket #7435)
  • Fixed a bug where the library that is pre-loaded to provide a sane passwd entry when using condor_ssh_to_job was placed in the wrong directory in the RPM packaging. (Ticket #7408)

Version number: 8.9.5
Release status: Final
Operating systems: Windows 7, Linux, BSD, macOS, Solaris, UNIX, Windows Server 2012, Windows 8, Windows 10, Windows Server 2016
Website: HTCondor
Download: http://htcondor.org/downloads/
License type: Conditions (GNU/BSD/etc.)