Condor: Submitting A Job
Introduction
Condor is a high-throughput computing environment that harnesses the power of multiple workstations connected over a network. Condor manages the workstations and their resources automatically.
Currently, Condor is available on Red Hat Enterprise Linux hosts (not available on Solaris 10).
This document briefly describes how to compile and submit a job to the ECN Condor computing cluster.
Submitting a job
1. Set up your environment
If you're running Red Hat Enterprise Linux version 4, add the following directory to your PATH variable:
/usr/local/condor/bin
(On Red Hat Enterprise Linux version 5, Condor is already in the default path.)
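For a Bourne-style shell such as bash, the PATH change for Red Hat Enterprise Linux version 4 might look like the sketch below. The choice of shell is an assumption (the % prompt used in this document suggests csh or tcsh, where setenv would be used instead), and the CONDOR_BIN variable name is made up for illustration.

```shell
# Assumes sh or bash; CONDOR_BIN is a hypothetical helper variable.
CONDOR_BIN=/usr/local/condor/bin

# Append the Condor directory only if it is not already in PATH,
# so running this twice does not add a duplicate entry.
case ":$PATH:" in
    *":$CONDOR_BIN:"*) ;;
    *) PATH="$PATH:$CONDOR_BIN" ;;
esac
export PATH
```

Placing these lines in your shell startup file would make the change apply to every new login session.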
2. Compiling an executable to run in the Condor pool
The object files need to be linked to the Condor libraries when making the executable.
(a) If compiling simple jobs on the command line, then just replace
% cc myprog.c
with
% condor_compile cc myprog.c
(You can use any compiler in place of cc: gcc, CC, g++, f77, f90, etc.)
(b) In a Makefile:
Replace the line
CC = cc
with
CC = condor_compile cc
OR
Make the above substitution only in the rule where the executable is linked from the object files.
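If you would rather not edit the Makefile at all, make also lets a variable assignment given on the command line override the one inside the file. The sketch below wraps that in a small shell function. The function name build_with_condor is hypothetical, and the sketch assumes that your Makefile uses $(CC) for compiling and linking and that condor_compile is on your PATH. Like the first substitution above, this affects every rule that uses $(CC), not only the link step.

```shell
# Hypothetical helper: run make with CC replaced by the condor_compile wrapper.
# A variable given on the make command line overrides the assignment inside
# the Makefile, so the "CC = cc" line in the file does not need to be edited.
build_with_condor() {
    make CC="condor_compile cc" "$@"
}

# Usage (from the directory containing the Makefile):
#   build_with_condor          # builds the default target
#   build_with_condor clean    # any other target is passed through to make
```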
3. Submit the job to the Condor pool
Once you have compiled a binary linked with the Condor libraries, you need to create a submit description file in order to submit the job to the Condor pool. The Condor submit description file, filename.cmd, describes the job to be run. Type man condor_submit to read the manual page, which describes in detail all the options available in a submit description file.
If you normally run your program on the command line like the following:
% sim-safe -a 200
you would then specify executable and command line arguments/options like the following in the submit file:
executable = sim-safe
arguments = -a 200
You can also define macros and use them elsewhere in the file:
X = output
Y = input
You MUST redirect stdin, stdout, and stderr to files or to /dev/null if your code uses them, and it's a good bet that it uses them somewhere. They are set with the commands input, output, and error, respectively:
input = $(Y)/my_input.in
output = $(X)/my_output.out
error = /dev/null
You can keep a log of the Condor job execution:
log = my_run.log
Define which architectures you want your job to run on. The machines in the Condor pool are updated regularly, so contact the Condor maintainer to confirm which platforms the pool is currently running. If, for example, the pool contains both 32-bit and 64-bit Linux machines, you need to configure your submission so that your job will run on either kind (unless it must run on one or the other). To run only on 32-bit machines, add the following line (anywhere before the final "queue" line):
Requirements = Arch == "INTEL" && OpSys == "LINUX"
And for just 64-bit machines:
Requirements = Arch == "X86_64" && OpSys == "LINUX"
If you do not specify an architecture, the default should be the same as that of your current submit host.
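The two single-architecture lines above can be combined with a logical OR to accept either kind of machine. The sketch below writes such a line to a scratch file so that it can be pasted into a submit description file before the final "queue" line. The file name requirements_either.txt is hypothetical. Whether one binary actually runs on both architectures depends on how it was built, so treat this as a starting point and check with the Condor maintainer.

```shell
# Write a Requirements line that matches 32-bit or 64-bit Linux machines.
# The quoted 'EOF' keeps the shell from interpreting anything in the line.
cat > requirements_either.txt <<'EOF'
Requirements = (Arch == "INTEL" || Arch == "X86_64") && OpSys == "LINUX"
EOF

# Show the line so it can be copied into your own .cmd file.
cat requirements_either.txt
```

The parentheses matter: without them, && binds more tightly than ||, and the OpSys test would apply only to the 64-bit case.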
Any line that begins with a # is a comment:
# submit the job
queue
# submit 2 more copies of the job
queue 2
# submit another copy but with different arguments
arguments = -d 600
queue
Before you can submit a job, you need to find out which ECN machines are allowed to submit jobs to the Condor pool.
Once you are logged in to one of these "submit" machines, run the following command there to submit the job:
% condor_submit x.cmd
where x.cmd is your Condor description file.
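Putting the pieces from this step together, a complete description file for the sim-safe example could be created and submitted roughly as follows. This is a sketch under stated assumptions: the file name sim-safe.cmd is hypothetical, the directories named by the X and Y macros are assumed to exist in the directory you submit from, and the job is only submitted when condor_submit is found on the current machine.

```shell
# Assemble the submit description file from the fragments shown above.
# The quoted 'EOF' stops the shell from expanding the $(X) and $(Y) macros,
# which must reach Condor unchanged.
cat > sim-safe.cmd <<'EOF'
# macros used further down
X = output
Y = input

executable = sim-safe
arguments  = -a 200

# redirect stdin, stdout, and stderr
input  = $(Y)/my_input.in
output = $(X)/my_output.out
error  = /dev/null

log = my_run.log

# submit the job
queue
EOF

# Submit only when condor_submit is available, i.e. on a submit machine.
if command -v condor_submit >/dev/null 2>&1; then
    condor_submit sim-safe.cmd
else
    echo "condor_submit not found; wrote sim-safe.cmd without submitting"
fi
```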
Example #1
Simple command file: loop.cmd
#
# loop 200 > my.output
# (loop is the Condor compiled binary)
#
executable = loop
arguments = 200
input = /dev/null
output = my.output
error = my.error
queue
# end of loop.cmd
Submit the job:
% condor_submit loop.cmd
Example #2
Here is a more complex example that submits a "simplescalar" simulation: wave5.cmd
#####
# condor command file for wave5 on "simplescalar"
#####
PROGRAM_NAME = wave5
MIN_SIZE = 64
THRESHOLD = 12
INTERVAL = 1048576
DIR = /home/machine/a/user/bss/condor
CONFIG = $(DIR)/run/icalp0.cfg
OUTDIR = $(DIR)/run/condor
INDIR = $(DIR)/run/bench/wave5
FILE = $(PROGRAM_NAME)_$(MIN_SIZE)_$(THRESHOLD)_$(INTERVAL)
OUTFILE = $(OUTDIR)/$(PROGRAM_NAME).$(MIN_SIZE)_$(THRESHOLD)_$(INTERVAL)
executable = $(DIR)/sim-icalp-outorder
input = $(INDIR)/wave5.in
output = $(OUTFILE).out
error = $(OUTFILE).stat
arguments = -config $(CONFIG) -filename $(OUTDIR)/$(FILE) \
    -filename2 $(OUTDIR)/$(FILE).count \
    -icalp:icalp_min_size $(MIN_SIZE) \
    -icalp:icalp_sense_interval $(INTERVAL) \
    -icalp:icalp_change_threshold $(THRESHOLD) \
    $(INDIR)/wave5.ss
queue
# end of wave5.cmd
Submit the job:
% condor_submit wave5.cmd
4. Read and reference the on-line documentation
We at ECN are neither users of the Condor pool nor experts in using it. We just maintain the pool's integrity and make sure everything is working correctly. For more help, refer to the on-line documentation.
Last modified: 2008/08/22 09:11:47.575000 GMT-4 by curtis.f.smith.1
Created: 2007/10/23 16:34:47.098000 GMT-4 by brian.r.brinegar.1.