A Light-Weight Access Layer for Distributed Computing Infrastructure
SAGA (Simple API for Grid Applications) defines a high-level interface to the most commonly used distributed computing functionality. SAGA provides an access layer and mechanisms for distributed infrastructure components such as job schedulers, file transfer services and resource provisioning services. Given the heterogeneity of distributed infrastructure, SAGA provides a much-needed interoperability layer that lowers the complexity of using distributed infrastructure while enhancing the sustainability of distributed applications, services and tools.
SAGA-Python provides a Python module that is compliant with the OGF GFD.90 SAGA specification. Behind the API façade, SAGA-Python implements a flexible adaptor architecture. Adaptors are dynamically loadable modules that interface the API with different middleware systems and services. Most application developers use the adaptors that are already part of SAGA-Python, but you can easily implement your own in case your backend system is not supported yet.
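To make the adaptor idea concrete, here is a minimal sketch of how a plug-in registry can bind a URL scheme to a backend class. It is not SAGA-Python's actual adaptor code: every name in it (ADAPTOR_REGISTRY, register_adaptor, load_adaptor and the two adaptor classes) is hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: this is NOT SAGA-Python's internal adaptor code.
# All names below are hypothetical.
from urllib.parse import urlparse

# Maps a URL scheme (e.g. "pbs+ssh") to the adaptor class that handles it.
ADAPTOR_REGISTRY = {}


def register_adaptor(*schemes):
    """Class decorator that registers an adaptor for one or more URL schemes."""
    def decorator(cls):
        for scheme in schemes:
            ADAPTOR_REGISTRY[scheme] = cls
        return cls
    return decorator


@register_adaptor("pbs", "pbs+ssh")
class PBSJobAdaptor:
    """Stand-in for a backend that would talk to a PBS scheduler."""

    def __init__(self, url):
        self.url = url

    def describe(self):
        return "PBS adaptor for %s" % self.url


@register_adaptor("ssh")
class SSHJobAdaptor:
    """Stand-in for a backend that would run jobs over plain SSH."""

    def __init__(self, url):
        self.url = url

    def describe(self):
        return "SSH adaptor for %s" % self.url


def load_adaptor(url):
    """Pick the adaptor class from the URL scheme and instantiate it."""
    scheme = urlparse(url).scheme
    try:
        adaptor_cls = ADAPTOR_REGISTRY[scheme]
    except KeyError:
        raise ValueError("no adaptor registered for scheme %r" % scheme)
    return adaptor_cls(url)


if __name__ == "__main__":
    print(load_adaptor("pbs+ssh://india.futuregrid.org").describe())
```

In this sketch, supporting a new backend only requires writing one more decorated class; the API-facing code that calls load_adaptor does not change.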
SAGA-Python's main focus is ease of use and simple user-space deployment in heterogeneous distributed computing environments. It supports a wide range of application use-cases from simple, uncoupled tasks to complex workflows. SAGA-Python is being used on many distributed cyberinfrastructures, including XSEDE, OSG and FutureGrid where it marshals tens of thousands of cores and moves terabytes of data in production science applications.
SAGA-Python supports a wide range of distributed computing middleware and services via a flexible adaptor architecture. Adaptors are plug-ins that bind API calls to the respective backend. SAGA-Python currently supports the following backends:
A variety of projects and a host of applications seeking to utilize distributed resources in advanced and scalable ways use and contribute actively to the SAGA-Python project. The SAGA-Python user community spans a broad field of research topics, tools and frameworks. Here are a few of them:
GRIB uses SAGA-Python to implement workflows for IntOGen-mutations analysis and other biomedical applications.
BNL uses SAGA-Python to extend PanDA (the workload management system for the ATLAS project) to U.S. HPC resources.
KISTI uses SAGA-Python to develop their web-based National Supercomputing Service Platform.
LESC uses SAGA-Python in libHPC, a framework for specification of scientific HPC applications using high-level functional constructs.
IC3 uses SAGA-Python in the Autosubmit project to manage multi-model multi-member climate prediction experiments.
For more detailed examples, check out the Tutorial Pages!
The easiest way to install SAGA-Python is via virtualenv and pip.
$ virtualenv $HOME/saga
$ source $HOME/saga/bin/activate
$ pip install SAGA-Python
More detailed installation instructions can be found in the manual.
This example connects to a remote PBS cluster via SSH and runs a simple /bin/echo job. It then uses the SFTP file adaptor to copy the output file back to the local machine. More information about the SAGA-Python job API can be found in the respective manual section.
If you don't have access to a PBS cluster, you can try a different adaptor by changing the URL scheme passed to saga.job.Service(). For example, you can use slurm[+ssh]://... to access a SLURM cluster, sge[+ssh]://... to access an SGE cluster, condor[+ssh]://... to access a Condor(-G) gateway, or just ssh://... to access a remote host (e.g., a cloud VM) via SSH. A full list of available adaptors and their URL schemes can be found in the manual.
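The scheme substitution described above can be sketched as a small helper that assembles the job service URL. The helper job_service_url and the tuple SCHEDULER_SCHEMES are hypothetical and not part of SAGA-Python; only the scheme names themselves come from the text above.

```python
# Hypothetical helper, not part of SAGA-Python: it only builds URL strings
# of the form described in the text, e.g. "slurm+ssh://host".
SCHEDULER_SCHEMES = ("pbs", "slurm", "sge", "condor")


def job_service_url(host, scheduler=None, via_ssh=True):
    """Return a job service URL for the given host.

    scheduler: one of SCHEDULER_SCHEMES, or None for a plain SSH host.
    via_ssh:   append "+ssh" to reach the scheduler through SSH.
    """
    if scheduler is None:
        return "ssh://%s" % host
    if scheduler not in SCHEDULER_SCHEMES:
        raise ValueError("unknown scheduler scheme: %r" % scheduler)
    scheme = scheduler + "+ssh" if via_ssh else scheduler
    return "%s://%s" % (scheme, host)


if __name__ == "__main__":
    for name in SCHEDULER_SCHEMES:
        print(job_service_url("india.futuregrid.org", name))
    print(job_service_url("india.futuregrid.org"))
```

In the example script, the resulting string would take the place of the "pbs+ssh://%s" % REMOTE_HOST argument passed to saga.job.Service().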
import sys
import saga

REMOTE_HOST = "india.futuregrid.org"


def main():
    try:
        # Your ssh identity on the remote machine
        ctx = saga.Context("ssh")
        ctx.user_id = "your_username"

        session = saga.Session()
        session.add_context(ctx)

        # Create a job service object that represents a remote PBS cluster.
        # The keyword 'pbs' in the URL scheme triggers the PBS adaptor
        # and '+ssh' enables PBS remote access via SSH.
        js = saga.job.Service("pbs+ssh://%s" % REMOTE_HOST, session=session)

        # Next, we describe the job we want to run. A complete set of job
        # description attributes can be found in the API documentation.
        jd = saga.job.Description()
        jd.environment = {'MYOUTPUT': '"Hello from SAGA"'}
        jd.executable = '/bin/echo'
        jd.arguments = ['$MYOUTPUT']
        jd.output = "/tmp/mysagajob.stdout"
        jd.error = "/tmp/mysagajob.stderr"

        # Create a new job from the job description. The initial state of
        # the job is 'New'.
        myjob = js.create_job(jd)

        # Check our job's id and state
        print("Job ID    : %s" % (myjob.id))
        print("Job State : %s" % (myjob.state))

        print("\n...starting job...\n")

        # Now we can start our job.
        myjob.run()

        print("Job ID    : %s" % (myjob.id))
        print("Job State : %s" % (myjob.state))

        print("\n...waiting for job...\n")

        # Wait for the job to either finish or fail
        myjob.wait()

        print("Job State : %s" % (myjob.state))
        print("Exitcode  : %s" % (myjob.exit_code))

        # Copy the job's output file back to the local machine via SFTP
        outfilesource = 'sftp://%s/tmp/mysagajob.stdout' % REMOTE_HOST
        outfiletarget = 'file://localhost/tmp/'
        out = saga.filesystem.File(outfilesource, session=session)
        out.copy(outfiletarget)

        print("Staged out %s to %s (size: %s bytes)\n" %
              (outfilesource, outfiletarget, out.get_size()))
        return 0

    except saga.SagaException as ex:
        # Catch all saga exceptions
        print("An exception occurred: (%s) %s " % (ex.type, (str(ex))))
        # Trace back the exception. That can be helpful for debugging.
        print(" \n*** Backtrace:\n %s" % ex.traceback)
        return -1


if __name__ == "__main__":
    sys.exit(main())
Paste the above script into a file and run it. Your output should look similar to this:
$ python ./sleepjob.py
Job ID : None
Job State : New
...starting job...
Job ID : [ssh://gw68.quarry.iu.teragrid.org]-[9969]
Job State : Done
...waiting for job...
Job State : Done
Exitcode : 0
Staged out sftp://india.futuregrid.org/tmp/mysagajob.stdout to file://localhost/tmp/ (size: 16 bytes)
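Judging from the sample output above, a job ID has the form [backend-url]-[native-id] once the job has been started, and is None before that. The following sketch splits such a string into its two parts, which can be handy for logging or for looking a job up on the backend. The function split_job_id and the pattern are hypothetical helpers, not part of SAGA-Python, and they assume the format stays as shown in the output.

```python
# Hypothetical helper, not part of SAGA-Python. It assumes job IDs look
# like "[ssh://host]-[9969]", as in the sample output above.
import re

JOB_ID_PATTERN = re.compile(r"^\[(?P<backend>.+)\]-\[(?P<native_id>.+)\]$")


def split_job_id(job_id):
    """Split '[backend-url]-[native-id]' into (backend_url, native_id)."""
    if job_id is None:
        # Before run() is called, the example prints 'Job ID : None'.
        raise ValueError("job has no ID yet")
    match = JOB_ID_PATTERN.match(job_id)
    if match is None:
        raise ValueError("unexpected job ID format: %r" % job_id)
    return match.group("backend"), match.group("native_id")


if __name__ == "__main__":
    backend, native_id = split_job_id("[ssh://gw68.quarry.iu.teragrid.org]-[9969]")
    print("backend  : %s" % backend)
    print("native id: %s" % native_id)
```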