Target Machines for Executing AL Workflows

ROSE orchestrates Active Learning (AL) workflows on diverse computing resources using radical.asyncflow. The examples below show how to specify either your local computer or a remote HPC machine as the target resource through the RadicalExecutionBackend.

Local Computer

For local execution, users can run their AL workflows on desktops, laptops, or their own small clusters as follows:

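import asyncio

from radical.asyncflow import WorkflowEngine
from radical.asyncflow import RadicalExecutionBackend

from rose.al.active_learner import SequentialActiveLearner


async def main():
    # Run on the local machine; 'runtime' is the pilot walltime in minutes.
    engine = await RadicalExecutionBackend(
        {'runtime': 30,
         'resource': 'local.localhost'})

    # Create the workflow engine on top of the execution backend.
    asyncflow = await WorkflowEngine.create(engine)

    # The active learner submits its tasks through the workflow engine.
    acl = SequentialActiveLearner(asyncflow)


asyncio.run(main())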

HPC Resources

To execute AL workflows on HPC machines, users must have an active allocation on the target machine and must specify their resource requirements as well as the time needed to execute their workflows. ROSE uses the RadicalExecutionBackend from RADICAL-AsyncFlow, which is an interface to the RADICAL-Pilot runtime system. For more information on how to access, set up, and execute workflows on HPC machines, refer to RADICAL-Pilot Job Submission. The following example targets TACC Frontera:

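import asyncio

from radical.asyncflow import WorkflowEngine
from radical.asyncflow import RadicalExecutionBackend

from rose.al.active_learner import SequentialActiveLearner


async def main():
    # Request 4096 cores and 4 GPUs on TACC Frontera for 30 minutes.
    hpc_engine = await RadicalExecutionBackend(
        {'runtime': 30, 'cores': 4096,
         'gpus': 4, 'resource': 'tacc.frontera'})

    # Create the workflow engine on top of the HPC execution backend.
    asyncflow = await WorkflowEngine.create(hpc_engine)

    # The active learner submits its tasks through the workflow engine.
    acl = SequentialActiveLearner(asyncflow)


asyncio.run(main())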
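
Because the local and HPC examples differ only in the dictionary passed to RadicalExecutionBackend, the choice of target can be kept in one place. The sketch below is one possible way to do that and is not part of ROSE or RADICAL-AsyncFlow: the names TARGETS and resource_description are hypothetical helpers, and the preset values are copied from the two examples on this page rather than being recommended defaults.

```python
# Hypothetical presets mirroring the two examples on this page.
# The keys ('runtime', 'cores', 'gpus', 'resource') are the ones used above.
TARGETS = {
    "local": {"runtime": 30, "resource": "local.localhost"},
    "frontera": {"runtime": 30, "cores": 4096, "gpus": 4,
                 "resource": "tacc.frontera"},
}


def resource_description(target="local", **overrides):
    """Return a copy of the preset for `target`, with optional overrides.

    This is a plain-dictionary helper (an assumption of this sketch);
    it does not contact any machine or validate the allocation.
    """
    if target not in TARGETS:
        raise ValueError(
            f"unknown target {target!r}; expected one of {sorted(TARGETS)}")
    description = dict(TARGETS[target])  # copy so the preset stays unchanged
    description.update(overrides)
    return description


if __name__ == "__main__":
    # Example: same HPC preset, but with a longer walltime.
    print(resource_description("frontera", runtime=60))
```

The returned dictionary would then take the place of the literal dictionary in the examples above, for instance `await RadicalExecutionBackend(resource_description("local"))`, so that switching between a laptop and an HPC machine changes a single argument.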