radical.asyncflow.data

File

File()

Base class for file handling in task execution systems.

Provides the filename and filepath attributes shared by all file types, plus a static helper for downloading remote files.

Initialize a File object with default None values.

Sets filename and filepath attributes to None, to be populated by subclasses during file resolution.

Source code in doc_env/lib/python3.13/site-packages/radical/asyncflow/data.py
def __init__(self) -> None:
    """Initialize a File object with default None values.

    Sets filename and filepath attributes to None, to be populated
    by subclasses during file resolution.
    """
    self.filename = None
    self.filepath = None

download_remote_url staticmethod

download_remote_url(url: str) -> Path

Download a remote file to the current working directory and return its absolute path.

Streams the response body in 8192-byte chunks so that large files are not held in memory. The file is saved under the text that follows the last "/" in the URL, or as downloaded_file if the URL ends with "/".

Parameters:

url (str, required): The remote URL to download from.

Returns:

Path: Absolute path to the downloaded file.

Raises:

requests.exceptions.RequestException: If the download fails or the URL is invalid.

Example

file_path = File.download_remote_url("https://example.com/data.txt")
print(f"Downloaded to: {file_path}")

Source code in doc_env/lib/python3.13/site-packages/radical/asyncflow/data.py
@staticmethod
def download_remote_url(url: str) -> Path:
    """Download a remote file to the current directory and return its full path.

    Downloads file content from a remote URL using streaming to handle large files
    efficiently. Saves the file with a name derived from the URL.

    Args:
        url: The remote URL to download from.

    Returns:
        Path: Absolute path to the downloaded file.

    Raises:
        requests.exceptions.RequestException: If the download fails or URL is invalid.

    Example:
        ::

            file_path = File.download_remote_url("https://example.com/data.txt")
            print(f"Downloaded to: {file_path}")
    """
    response = requests.get(url, stream=True)
    response.raise_for_status()  # Check if the download was successful

    # Use the file name from the URL, defaulting if not available
    filename = url.split("/")[-1] or "downloaded_file"
    file_path = Path(filename)

    # Save the file content
    with open(file_path, 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)

    return file_path.resolve()  # Return the absolute path
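
The source above takes everything after the last "/" as the file name, so a URL that carries a query string would keep the query in the saved name. A minimal sketch of a stricter name extraction and chunked write follows, assuming the standard-library urllib.parse is acceptable for this purpose. Both filename_from_url and save_chunks are hypothetical helpers for illustration, not part of radical.asyncflow.

```python
from pathlib import Path
from typing import Iterable
from urllib.parse import unquote, urlparse


def filename_from_url(url: str, default: str = "downloaded_file") -> str:
    """Return the last path segment of ``url``, ignoring query and fragment."""
    # urlparse separates the path from "?query" and "#fragment" parts.
    path = unquote(urlparse(url).path)
    name = path.rsplit("/", 1)[-1]
    # Fall back to a default when the path is empty or ends with "/".
    return name or default


def save_chunks(chunks: Iterable[bytes], target: Path) -> Path:
    """Write an iterable of byte chunks to ``target`` and return its absolute path."""
    with open(target, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
    return target.resolve()
```

With requests, the chunks argument could be response.iter_content(chunk_size=8192) from a call such as requests.get(url, stream=True, timeout=30); the timeout value here is an assumption, chosen only so that a stalled server cannot block the caller indefinitely.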

InputFile

InputFile(file)

Bases: File

Represents an input file that can be sourced from remote URLs, local paths, or task outputs.

Automatically detects the file source type and handles appropriate resolution. Supports remote file downloading, local file path resolution, and task output file references.

Initialize an InputFile with automatic source type detection and resolution.

Determines whether the input is a remote URL, local file path, or reference to another task's output file, then resolves the appropriate file path.

Parameters:

file (required): Input file specification. One of:
- Remote URL (http, https, ftp, s3, etc.)
- Local file path (absolute or relative)
- Task output file reference (filename for future resolution)

Raises:

Exception: If file resolution fails or the file source cannot be determined.

Attributes:

remote_url (str): URL if the file is remote, None otherwise.
local_file (str): Local path if the file exists locally, None otherwise.
other_task_file (str): Task reference if the file comes from another task, None otherwise.
filepath (Path): Resolved file path.
filename (str): File name extracted from the resolved path.

Source code in doc_env/lib/python3.13/site-packages/radical/asyncflow/data.py
def __init__(self, file):
    """Initialize an InputFile with automatic source type detection and resolution.

    Determines whether the input is a remote URL, local file path, or reference
    to another task's output file, then resolves the appropriate file path.

    Args:
        file: Input file specification. Can be:
            - Remote URL (http, https, ftp, s3, etc.)
            - Local file path (absolute or relative)
            - Task output file reference (filename for future resolution)

    Raises:
        Exception: If file resolution fails or file source cannot be determined.

    Attributes:
        remote_url (str): URL if file is remote, None otherwise.
        local_file (str): Local path if file exists locally, None otherwise.
        other_task_file (str): Task reference if file is from another task, None otherwise.
        filepath (Path): Resolved file path.
        filename (str): Extracted filename from the resolved path.
    """
    # Initialize file-related variables
    self.remote_url = None
    self.local_file = None
    self.other_task_file = None

    self.filepath = None  # Ensure that filepath is initialized

    # Determine file type (remote, local, or task-produced)
    possible_url = ru.Url(file)
    if possible_url.scheme in URL_SCHEMES:
        self.remote_url = file
    elif os.path.exists(file):  # Check if it's a local file
        self.local_file = file
    else:
        self.other_task_file = file

    # Handle remote file (download and resolve path)
    if self.remote_url:
        self.filepath = self.download_remote_url(self.remote_url)

    # Handle local file (ensure it exists and resolve path)
    elif self.local_file:
        self.filepath = Path(self.local_file).resolve()  # Convert to absolute path

    # Handle file from another task. We do not resolve Path here as this
    # file is not created yet and it will be resolved when the task is executed.
    elif self.other_task_file:
        self.filepath = Path(self.other_task_file)

    # If file resolution failed, raise an exception with a more descriptive message
    if not self.filepath:
        raise Exception(f"Failed to resolve InputFile: {file}. "
                         "Ensure it's a valid URL, local path, or task output.")

    # Set the filename from the resolved filepath
    self.filename = self.filepath.name
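
The detection order used by the constructor (URL scheme first, then an existing local path, then a task output) can be sketched without the library as follows. This is a stand-in rather than the library's code: it uses urllib.parse.urlparse in place of ru.Url, and ASSUMED_URL_SCHEMES is a hypothetical set, since the contents of URL_SCHEMES are not shown here.

```python
import os
from urllib.parse import urlparse

# Hypothetical stand-in for the library's URL_SCHEMES constant.
ASSUMED_URL_SCHEMES = {"http", "https", "ftp", "s3"}


def classify_input(file: str) -> str:
    """Return 'remote', 'local', or 'task', following the constructor's order."""
    if urlparse(file).scheme in ASSUMED_URL_SCHEMES:
        return "remote"
    if os.path.exists(file):
        return "local"
    # Anything else is treated as a file that another task will produce later.
    return "task"


print(classify_input("https://example.com/data.txt"))
print(classify_input(os.getcwd()))
```

One consequence of this order is that a misspelled local path is not rejected: because it does not exist on disk, it falls through to the task-output branch.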

download_remote_url staticmethod

Inherited from File; see File.download_remote_url above for the signature, parameters, and source.

OutputFile

OutputFile(filename)

Bases: File

Represents an output file that will be produced by a task.

Handles filename validation and extraction from file paths, ensuring proper output file naming for task execution.

Initialize an OutputFile with filename validation.

Extracts the base name from the provided path and rejects input that is empty or that ends with a path separator, since neither yields a file name.

Parameters:

filename (required): The output filename or path. Can be a simple filename or a path, but must resolve to a valid filename.

Raises:

ValueError: If filename is empty or resolves to an invalid file path.

Attributes:

filename (str): The extracted filename for the output file.

Example

# Valid initializations
output1 = OutputFile("result.txt")
output2 = OutputFile("path/to/result.txt")

# Invalid - will raise ValueError
output3 = OutputFile("")  # Empty filename
output4 = OutputFile("path/")  # Path ends with separator

Source code in doc_env/lib/python3.13/site-packages/radical/asyncflow/data.py
def __init__(self, filename):
    """Initialize an OutputFile with filename validation.

    Extracts the filename from the provided path and validates that it
    represents a valid file (not a directory or empty path).

    Args:
        filename: The output filename or path. Can be a simple filename
            or a path, but must resolve to a valid filename.

    Raises:
        ValueError: If filename is empty or resolves to an invalid file path.

    Attributes:
        filename (str): The extracted filename for the output file.

    Example:
        ::

            # Valid initializations
            output1 = OutputFile("result.txt")
            output2 = OutputFile("path/to/result.txt")

            # Invalid - will raise ValueError
            output3 = OutputFile("")  # Empty filename
            output4 = OutputFile("path/")  # Path ends with separator
    """
    if not filename:
        raise ValueError("Filename cannot be empty")

    # Use os.path.basename() to handle paths
    self.filename = os.path.basename(filename)

    # Edge case: If the filename ends with a separator (e.g., '/')
    if not self.filename:
        raise ValueError(f"Invalid filename, the path {filename} does not include a file")
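
The validation above reduces to an emptiness check followed by os.path.basename. A minimal sketch, with output_name as a hypothetical stand-alone function that mirrors the constructor's checks rather than the class itself:

```python
import os


def output_name(filename: str) -> str:
    """Return the base name of ``filename`` or raise ValueError, mirroring OutputFile."""
    if not filename:
        raise ValueError("Filename cannot be empty")
    name = os.path.basename(filename)
    # basename() returns "" when the path ends with a separator.
    if not name:
        raise ValueError(f"Invalid filename, the path {filename} does not include a file")
    return name


for candidate in ["result.txt", "path/to/result.txt", "", "path/"]:
    try:
        print(repr(candidate), "->", output_name(candidate))
    except ValueError as exc:
        print(repr(candidate), "-> ValueError:", exc)
```

The check is purely lexical and never touches the file system, so a name such as "results" passes even if a directory with that name already exists.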

download_remote_url staticmethod

Inherited from File; see File.download_remote_url above for the signature, parameters, and source.