SAGA-Python

A Light-Weight Access Layer for Distributed Computing Infrastructure


Introduction

SAGA (Simple API for Grid Applications) defines a high-level interface to the most commonly used distributed computing functionality. SAGA provides an access layer and mechanisms for distributed infrastructure components such as job schedulers, file transfer services and resource provisioning services. Given the heterogeneity of distributed infrastructure, SAGA provides a much-needed interoperability layer that reduces the complexity of using distributed infrastructure while improving the sustainability of distributed applications, services and tools.

SAGA-Python provides a Python module that is compliant with the OGF GFD.90 SAGA specification. Behind the API façade, SAGA-Python implements a flexible adaptor architecture. Adaptors are dynamically loadable modules that interface the API with different middleware systems and services. Most application developers use the adaptors that are already part of SAGA-Python, but you can easily implement your own in case your backend system is not supported yet.
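To make the adaptor idea concrete, here is a minimal sketch of an API facade that picks a backend binding from the URL scheme. It is not SAGA-Python's actual implementation: the registry, the `register` decorator and the `PbsAdaptor`, `SshAdaptor` and `JobService` classes are all hypothetical names, the adaptors only return a description instead of contacting a backend, and dynamic module loading is left out. It uses the Python 3 standard library only.

```python
from urllib.parse import urlparse

# Hypothetical registry: URL scheme -> adaptor class. All names are illustrative.
ADAPTORS = {}


def register(*schemes):
    """Class decorator that registers an adaptor under one or more URL schemes."""
    def decorator(cls):
        for scheme in schemes:
            ADAPTORS[scheme] = cls
        return cls
    return decorator


@register("pbs", "pbs+ssh")
class PbsAdaptor:
    """Stand-in for a batch-system binding; it only describes what it would do."""
    def submit(self, executable, arguments):
        return "qsub-style submission of: " + " ".join([executable] + arguments)


@register("ssh")
class SshAdaptor:
    """Stand-in for a plain remote-shell binding."""
    def submit(self, executable, arguments):
        return "remote shell execution of: " + " ".join([executable] + arguments)


class JobService:
    """API facade: selects an adaptor from the URL scheme and delegates to it."""
    def __init__(self, url):
        scheme = urlparse(url).scheme
        if scheme not in ADAPTORS:
            raise ValueError("no adaptor registered for scheme %r" % scheme)
        self.adaptor = ADAPTORS[scheme]()

    def submit(self, executable, arguments):
        return self.adaptor.submit(executable, arguments)


service = JobService("pbs+ssh://india.futuregrid.org")
print(service.submit("/bin/echo", ["hello"]))
```

In this sketch, supporting a new backend means writing one more class and registering it under a new scheme; the calling code does not change.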

SAGA-Python's main focus is ease of use and simple user-space deployment in heterogeneous distributed computing environments. It supports a wide range of application use-cases, from simple, uncoupled tasks to complex workflows. SAGA-Python is used on many distributed cyberinfrastructures, including XSEDE, OSG and FutureGrid, where it marshals tens of thousands of cores and moves terabytes of data in production science applications.


Supported Middleware

SAGA-Python supports a wide range of distributed computing middleware and services via a flexible adaptor architecture. Adaptors are plug-ins that bind API calls to the respective backend. SAGA-Python currently supports the following backends:

Job Submission Systems

(All queuing system adaptors can also access clusters remotely by tunneling commands through SSH and GSISSH)

File / Data Management

Resource Management / Clouds


Who is Using It

A variety of projects and applications that use distributed resources in advanced and scalable ways rely on, and actively contribute to, the SAGA-Python project. The SAGA-Python user community spans a broad field of research topics, tools and frameworks. Here are a few of them:


Get Started

For more detailed examples, check out the Tutorial Pages!


Installation

The easiest way to install SAGA-Python is via virtualenv and pip.

$ virtualenv $HOME/saga
$ source $HOME/saga/bin/activate
$ pip install SAGA-Python

More detailed installation instructions can be found in the manual.
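After installing, one quick sanity check is to confirm that the module can be found from the active virtualenv. The sketch below assumes a Python 3 interpreter and uses only the standard library; the module name `saga` is taken from the `import saga` line in the example code, and the helper name `is_importable` is made up for illustration.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_importable("saga"):
    print("saga module found")
else:
    print("saga module not found; is the virtualenv activated?")
```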


Example Code

This example connects to a remote PBS cluster via SSH and runs a simple /bin/echo job. It then uses the SFTP file adaptor to copy the output file back to the local machine. More information about the SAGA-Python job API can be found in the respective manual section.

If you don't have access to a PBS cluster, you can try a different adaptor by changing the URL scheme passed to saga.job.Service (line 18 of the script). For example, you can use slurm[+ssh]://... to access a SLURM cluster, sge[+ssh]://... to access an SGE cluster, condor[+ssh]://... to access a Condor(-G) gateway or just ssh://... to access a remote host (e.g., a cloud VM) via SSH. A full list of available adaptors and their URL schemes can be found in the manual.

import sys
import saga

REMOTE_HOST = "india.futuregrid.org"

def main():
    try:
        # Your ssh identity on the remote machine
        ctx = saga.Context("ssh")
        ctx.user_id = "your_username"

        session = saga.Session()
        session.add_context(ctx)

        # Create a job service object that represents a remote PBS cluster.
        # The keyword 'pbs' in the URL scheme selects the PBS adaptor
        # and '+ssh' enables PBS remote access via SSH.
        js = saga.job.Service("pbs+ssh://%s" % REMOTE_HOST, session=session)

        # Create an empty job description object.
        jd = saga.job.Description()

        # Next, we describe the job we want to run. A complete set of job
        # description attributes can be found in the API documentation.
        jd.environment     = {'MYOUTPUT':'"Hello from SAGA"'}
        jd.executable      = '/bin/echo'
        jd.arguments       = ['$MYOUTPUT']
        jd.output          = "/tmp/mysagajob.stdout"
        jd.error           = "/tmp/mysagajob.stderr"

        # Create a new job from the job description. The initial state of
        # the job is 'New'.
        myjob = js.create_job(jd)

        # Check our job's id and state
        print "Job ID    : %s" % (myjob.id)
        print "Job State : %s" % (myjob.state)

        print "\n...starting job...\n"

        # Now we can start our job.
        myjob.run()

        print "Job ID    : %s" % (myjob.id)
        print "Job State : %s" % (myjob.state)

        print "\n...waiting for job...\n"
        # wait for the job to either finish or fail
        myjob.wait()

        print "Job State : %s" % (myjob.state)
        print "Exitcode  : %s" % (myjob.exit_code)

        outfilesource = 'sftp://%s/tmp/mysagajob.stdout' % REMOTE_HOST
        outfiletarget = 'file://localhost/tmp/'
        out = saga.filesystem.File(outfilesource, session=session)
        out.copy(outfiletarget)

        print "Staged out %s to %s (size: %s bytes)\n" % (outfilesource, outfiletarget, out.get_size())


        return 0

    except saga.SagaException as ex:
        # Catch all saga exceptions
        print "An exception occurred: (%s) %s " % (ex.type, (str(ex)))
        # Trace back the exception. That can be helpful for debugging.
        print " \n*** Backtrace:\n %s" % ex.traceback
        return -1


if __name__ == "__main__":
    sys.exit(main())

Paste the above script into a file and run it. Your output should look similar to this:

$ python ./sleepjob.py
Job ID    : None
Job State : New

...starting job...

Job ID    : [ssh://gw68.quarry.iu.teragrid.org]-[9969]
Job State : Done

...waiting for job...

Job State : Done
Exitcode  : 0

Staged out sftp://india.futuregrid.org/tmp/mysagajob.stdout to file://localhost/tmp/ (size: 16 bytes)
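The URL scheme convention used for the job service (for example pbs+ssh://... or plain ssh://...) can be illustrated with a small parser. This is only a sketch of how such a scheme could be split into a backend part and an optional transport part; the function name `split_scheme` is hypothetical, and how SAGA-Python itself interprets the scheme is not shown here. It uses the Python 3 standard library.

```python
from urllib.parse import urlparse


def split_scheme(url):
    """Split 'backend+transport://host' into (backend, transport, host).

    transport is None when the scheme contains no '+' part.
    """
    parsed = urlparse(url)
    backend, _, transport = parsed.scheme.partition("+")
    return backend, (transport or None), parsed.hostname


for url in ("pbs+ssh://india.futuregrid.org",
            "slurm://india.futuregrid.org",
            "ssh://india.futuregrid.org"):
    print(url, "->", split_scheme(url))
```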