Condor is a specialized workload management system for compute-intensive jobs. Like other full-featured batch systems, Condor provides a job queuing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management. Users submit their jobs to Condor, and Condor places them into a queue, chooses when and where to run them based upon a policy, monitors their progress, and ultimately informs the user upon completion.
While providing functionality similar to that of a more traditional batch queuing system, Condor's novel architecture allows it to succeed in areas where traditional scheduling systems fail. Condor can be used to manage a cluster of dedicated Beowulf nodes. In addition, several unique mechanisms enable Condor to effectively harness wasted CPU power from otherwise idle desktop workstations. Condor can be used to seamlessly combine all of your organization's computational power into one resource.
Condor is the product of the Condor Research Project at the University of Wisconsin-Madison (UW-Madison) and was first installed as a production system in the UW-Madison Department of Computer Sciences nearly ten years ago. This Condor installation has since served as a major source of computing cycles to UW-Madison faculty and students. Today, just in our department alone, Condor manages more than one thousand workstations, including the department's 500-CPU Linux Beowulf cluster. On a typical day, Condor delivers more than 650 CPU-days to UW researchers. Additional Condor installations have been established over the years across our campus and the world. Hundreds of organizations in industry, government, and academia have used Condor to establish compute environments ranging in size from a handful to hundreds of workstations.
Condor's features are extensive. Condor provides great flexibility for both the user submitting jobs and for the owner of a machine that provides CPU time toward running jobs. The following list summarizes some of Condor's capabilities.
ClassAds: The ClassAd mechanism in Condor provides an extremely flexible and expressive framework for matching resource requests (jobs) with resource offers (machines). Jobs can easily state both job requirements and job preferences. Likewise, machines can specify requirements and preferences about the jobs they are willing to run. These requirements and preferences can be described in powerful expressions, resulting in Condor's adaptation to nearly any desired policy.
Distributed submission: There is no single, centralized submission machine. Instead, Condor allows jobs to be submitted from many machines, and each machine contains its own job queue. Users may submit to a cluster from their own desktop machines.
User priorities: Administrators may assign priorities to users using a flexible mechanism that enables a policy of fair share, strict ordering, fractional ordering, or a combination of policies.
Job priorities: Users can assign priorities to their submitted jobs in order to control the execution order of the jobs. A "nice-user" mechanism requests the use of only those machines that would have otherwise been idle.
Job dependency: Some sets of jobs require an ordering because of dependencies between jobs. "Start job X only after jobs Y and Z successfully complete" is an example of a dependency. Enforcing dependencies is easily handled.
Support for multiple job models: Condor handles both serial jobs and parallel jobs incorporating PVM, dynamic PVM, and MPI.
Job checkpoint and migration: With certain types of jobs, Condor can transparently take a checkpoint and subsequently resume the application. A checkpoint is a snapshot of a job's complete state. Given a checkpoint, the job can later continue its execution from where it left off at the time of the checkpoint. A checkpoint also enables the transparent migration of a job from one machine to another machine. Condor will take a checkpoint of a job when it schedules the resource to a different job or the resource returns to the owner. Condor will also periodically produce a checkpoint for a job. This provides a form of fault tolerance and safeguards the accumulated computation time of a job. It reduces the loss in the event of a system failure such as the machine being shut down or hardware failure.
Job suspend and resume: Based on policy rules, Condor can ask the operating system to suspend and later resume a job.
Remote system calls: Despite running jobs on remote machines, Condor can often preserve the local execution environment via remote system calls. Users do not need to make data files available or even obtain a login account on remote workstations before Condor executes their programs there. The program behaves under Condor as if it were running as the user that submitted the job on the workstation where it was originally submitted, regardless of where it really executes.
Authentication and authorization: Administrators have fine-grained control of access permissions, and Condor can perform strong network authentication using a variety of mechanisms including Kerberos and X.509 public key certificates.
Heterogeneous platforms: In addition to Linux, Condor has been ported to most of the other primary flavors of Unix as well as Windows NT. A single pool can contain multiple platforms. Jobs to be executed under one platform may be submitted from a different platform. As an example, an executable that runs under Windows 2000 may be submitted from a machine running Linux.
Pools of machines working together: Flocking allows jobs to be scheduled across multiple Condor pools. It can be done across pools of machines owned by different organizations that impose their own policies.
Grid computing: Condor incorporates many of the emerging grid-based computing methodologies and protocols. Condor can submit jobs into resources managed via other scheduling systems such as PBS using the Globus Toolkit. Condor also includes all of the necessary software to receive jobs from other sites using the Globus Toolkit.
The ClassAd is a flexible representation of the characteristics and constraints of both machines and jobs in the Condor system. Matchmaking is the mechanism by which Condor matches an idle job with an available machine. Understanding this unique framework is the key to harness the full flexibility of the Condor system. ClassAds are employed by users to specify which machines should service their jobs. Administrators use them to customize scheduling policy.
Condor's ClassAds are analogous to the classified advertising section of the newspaper. Sellers advertise specifics about what they have to sell, hoping to attract a buyer. Buyers may advertise specifics about what they wish to purchase. Both buyers and sellers list constraints that must be satisfied. For instance, a buyer has a maximum spending limit, and a seller requires a minimum purchase price. Furthermore, both want to rank requests to their own advantage. Certainly a seller would rank one offer of $50 higher than a different offer of $25. In Condor, users submitting jobs can be thought of as buyers of compute resources and machine owners are sellers.
All machines in a Condor pool advertise their attributes, such as available RAM memory, CPU type and speed, virtual memory size, current load average, current time and date, and other static and dynamic properties. This machine ClassAd also advertises under what conditions it is willing to run a Condor job and what type of job it prefers. These policy attributes can reflect the individual terms and preferences by which the different owners have allowed their machines to participate in the Condor pool.
After a job is submitted to Condor, a job ClassAd is created. This ClassAd includes attributes about the job, such as the amount of memory the job uses, the name of the program to run, the user who submitted the job, and the time it was submitted. The job can also specify requirements and preferences (or rank) for the machine that will run the job. For instance, perhaps you are looking for the fastest floating-point performance available. You want Condor to rank available machines based on floating-point performance. Perhaps you care only that the machine has a minimum of 256 MBytes of RAM. Or, perhaps you will take any machine you can get! These job attributes and requirements are bundled up into a job ClassAd.
Condor plays the role of matchmaker by continuously reading all the job ClassAds and all the machine ClassAds, matching and ranking job ads with machine ads. Condor ensures that the requirements in both ClassAds are satisfied.
A ClassAd is a set of uniquely named expressions. Each named expression is called an attribute. Each attribute has an attribute name and an attribute value. The attribute value can be a simple integer, string, or floating-point value, such as
Memory = 512 OpSys = "LINUX" NetworkLatency = 7.5
An attribute value can also consist of a logical expression that will evaluate to TRUE, FALSE, or UNDEFINED. The syntax and operators allowed in these expressions are similar to those in C or Java, that is, == for equals, ! = for not equals, && for logical and, | | for logical or, and so on. Furthermore, ClassAd expressions can incorporate attribute names to refer to other attribute values. For instance, consider the following small sample ClassAd:
MemoryInMegs = 512 MemoryInBytes = MemoryInMegs * 1024 * 1024 Cpus = 4 BigMachine = (MemoryInMegs > 256) && (Cpus >= 4) VeryBigMachine = (MemoryInMegs > 512) && (Cpus >= 8) FastMachine = BigMachine && SpeedRating
In this example, BigMachine evaluates to TRUE and VeryBigMachine evaluates to FALSE. But, because attribute SpeedRating is not specified, FastMachine would evaluate to UNDEFINED.
Condor provides meta-operators that allow you to explicitly compare with the UNDEFINED value by testing both the type and value of the operands. If both the types and values match, the two operands are considered identical; =?= is used for meta-equals (or, is-identical-to) and =!= is used for meta-not-equals (or, is-not-identical-to). These operators always return TRUE or FALSE and therefore enable Condor administrators to specify explicit policies given incomplete information.
A complete description of ClassAd semantics and syntax is documented in the Condor manual.
ClassAds can be matched with one another. This is the fundamental mechanism by which Condor matches jobs with machines. Figure 15.1 displays a ClassAd from Condor representing a machine and another representing a queued job. Each ClassAd contains a MyType attribute, describing what type of resource the ad represents, and a TargetType attribute. The TargetType specifies the type of resource desired in a match. Job ads want to be matched with machine ads and vice versa.
Job ClassAd |
Machine ClassAd |
---|---|
|
|
MyType = "Job" |
MyType = "Machine" |
TargetType = "Machine" |
TargetType = "Job" |
Requirements = ((Arch== "INTEL" && Op-Sys=="LINUX") && Disk > DiskUsage) |
Requirements = Start |
Rank = TARGET. Department==MY. Department |
|
Rank = (Memory * 10000) + KFlops |
Activity = "Idle" |
Args = "-ini ./ies.ini" |
Arch = "INTEL" |
ClusterId = 680 |
ClockDay = 0 |
Cmd = "/home/tannenba/bin/sim-exe" |
ClockMin = 614 |
Department = "CompSci" |
CondorLoadAvg = 0.000000 |
DiskUsage = 465 |
Cpus = 1 |
StdErr = "sim.err" |
CurrentRank = 0.000000 |
ExitStatus = 0 |
Department = "CompSci" |
FileReadBytes = 0.000000 |
Disk = 3076076 |
FileWriteBytes = 0.000000 |
EnteredCurrentActivity = 990371564 |
ImageSize = 465 |
EnteredCurrentState = 990330615 |
StdIn = "/dev/null" |
FileSystemDomain = "cs.wisc.edu" |
Iwd = "/home/tannenba/sim-m/run_55" |
Islnstructional = FALSE |
JobPrio = 0 |
KeyboardIdle = 15 |
JobStartDate = 971403010 |
KFlops = 145811 |
JobStatus = 2 |
LoadAvg = 0.220000 |
StdOut = "sim.out" |
Machine = "nostos.cs.wisc.edu" |
Owner = "tannenba" |
Memory = 511 |
ProcId = 64 |
Mips = 732 |
QDate = 971377131 |
OpSys = "LINUX" |
RemoteSysCpu = 0.000000 |
Start = (LoadAvg <= 0.300000) && (KeyboardIdle > (15 * 60)) |
RemoteUserCpu = 0.000000 |
|
RemoteWallClockTime = 2401399.000000 |
State = "Unclaimed" |
TransferFiles = "NEVER" |
Subnet = "128.105.165" |
WantCheckpoint = FALSE |
TotalVirtualMemory = 787144 |
WantRemoteSyscalls = FALSE |
⋮ |
⋮ |
Each ClassAd engaged in matchmaking specifies a Requirements and a Rank attribute. In order for two ClassAds to match, the Requirements expression in both ads must evaluate to TRUE. An important component of matchmaking is the Requirements and Rank expression can refer not only to attributes in their own ad but also to attributes in the candidate matching ad. For instance, the Requirements expression for the job ad specified in Figure 15.1 refers to Arch, OpSys, and Disk, which are all attributes found in the machine ad.
What happens if Condor finds more than one machine ClassAd that satisfies the constraints specified by Requirements? That is where the Rank expression comes into play. The Rank expression specifies the desirability of the match (where higher numbers mean better matches). For example, the job ad in Figure 15.1 specifies
Requirements = ((Arch=="INTEL" && OpSys=="LINUX") && Disk > DiskUsage) Rank = (Memory * 100000) + KFlops
In this case, the job requires a computer running the Linux operating system and more local disk space than it will use. Among all such computers, the user prefers those with large physical memories and fast floating-point CPUs (KFlops is a metric of floating-point performance). Since the Rank is a user-specified metric, any expression may be used to specify the perceived desirability of the match. Condor's matchmaking algorithms deliver the best resource (as defined by the Rank expression) while satisfying other criteria.