Changelog¶
[Unreleased]¶
[0.3.1] - 2026-05-19¶
Fixed¶
- Pinned
dragonhpc==0.13.2in thedragonoptional dependency. Dragon 0.14.0 changed DDict write routing inBatchsuch thatDragonExecutionBackendV3's monitor loop blocks indefinitely on every task result. Users on Dragon 0.14.0 should upgrade to the forthcoming release that includes the full migration fix.
[0.3.0] - 2026-05-15¶
Added¶
-
SpanBuffer— new public, thread-safe span collector that replacesInMemorySpanExporteras RHAPSODY's internal span store. Useslist.extend()for O(1) appends (vs. the exporter's O(n²)tuple +=per flush).shutdown()is a no-op so spans remain accessible aftertracer_provider.shutdown().get_finished_spans(),clear(), andforce_flush()match theInMemorySpanExporterAPI. -
Provider-agnostic OTel injection —
TelemetryManager.__init__()andSession.start_telemetry()accept three new optional parameters: span_processors: list | None— OTelSpanProcessorinstances (e.g.BatchSpanProcessor(OTLPSpanExporter())) added to RHAPSODY's privateTracerProvideralongside the internalSpanBuffer. Callers own exporter construction; RHAPSODY never imports specific exporter classes.metric_readers: list | None— OTelMetricReaderinstances (e.g.PeriodicExportingMetricReader(OTLPMetricExporter())) added to RHAPSODY's privateMeterProvideralongsideInMemoryMetricReader.-
resource: Resource | None— optional OTelResourceoverride.None(default) callsResource.create()which readsOTEL_SERVICE_NAMEandOTEL_RESOURCE_ATTRIBUTESfrom the environment automatically. -
TelemetryManager.annotate_current_span(event)— records aBaseEventas an OTel span event (with serialized attributes) on the currently active span. No-op when no valid span is active or telemetry is stopped. -
TelemetryManager.register_span_enricher(func)— registers a callback invoked atTaskCreatedtime that returns extradictattributes to stamp onto the task's OTel span. Multiple enrichers are applied in registration order. Used by orchestration layers (e.g. AsyncFlow) to inject workflow-level grouping keys. -
TelemetryManager.export_as_otlp(path=None)— serializes all finished spans to a standards-compliant OTLP JSON payload (the POST/v1/traceswire format: 32/16-char hex IDs, nanosecond timestamps as strings, typed KV attributes). Returns the JSON string and optionally writes topath. -
TelemetryManager._task_contexts— internal dict that captures the OTel context (context.get_current()) atTaskCreatedemit time and restores it viaattach/detachin_process_event(). This is the standard in-process OTel propagation pattern across asyncio task boundaries. -
TelemetryManager._session_ctx_token— stores the attach token for the session span context, detached cleanly instop().
Performance¶
- Batch-drain dispatch loop — replaced the per-event
asyncio.wait_for(queue.get(), timeout=0.05)pattern in_dispatch_loopwith a batch-drain approach that reduces asyncio event-loop interactions from O(N) to O(N/500) under burst load.
Key changes:
- Fast path: drain up to 500 events per iteration with queue.get_nowait()
(plain deque.popleft() — zero asyncio scheduling overhead, no timer handle,
no Task allocation per event). Fall back to a blocking await queue.get()
only when the queue is empty.
- asyncio.sleep(0) yielded once per full batch, not once per event.
Partial batches block naturally on the next await queue.get().
- _flush_batch() — new helper that processes OTel instruments, writes the
JSONL checkpoint, and dispatches to subscribers for an entire batch at once.
Subscriber dispatch is guarded by if self._subscribers to skip the inner
loop when no subscribers are registered.
- _write_batch() — new helper that serializes a batch of events and issues a
single file.write() call per batch. Uses
dataclasses.fields() + getattr instead of dataclasses.asdict(): avoids
the deep-recursive copy, and correctly captures init=False class-level
fields (e.g. event_type) that vars() / __dict__ miss on frozen
dataclass subclasses.
- Checkpoint file opened with buffering=131072 (128 KB block buffer) instead
of buffering=1 (line-buffered). The previous setting issued one write()
syscall per event — up to 80 000 syscalls for a 40 K-task session.
- Sentinel-based shutdown — stop() sets _running=False, awaits
queue.join() with a 5 s timeout to guarantee full drain, then pushes
_STOP_SENTINEL to unblock the dispatch loop cleanly. emit() is now a
no-op after stop() to prevent late emissions racing with the sentinel.
Measured at 40 000 tasks (DragonExecutionBackend):
| Telemetry overhead | |
|---|---|
Before (per-event wait_for) |
+7 s (3.66× slower than no-telemetry) |
| After (batch-drain) | +2 s (1.73× slower than no-telemetry) |
71% reduction in telemetry overhead at 40 K tasks.
Fixed¶
-
Task callback contract (
stdout/stderr) —task["stdout"]andtask["stderr"]are now guaranteed to bestr(never absent, neverNone) after any DONE or FAILED callback across all backends.DragonExecutionBackendV3,ConcurrentExecutionBackend, andDaskExecutionBackendeach had paths that left the key missing or set it toNone, causingKeyError: 'stdout'in consumers using bracket notation. A newpytest.mark.result_contractmarker groups seven cross-backend regression tests that enforce this invariant. -
TaskCreatedas task span head — task spans were previously opened atTaskSubmitted, leavingTaskCreated(the earliest lifecycle event) with only atrace_idand nospan_idin the JSONL. The span is now opened atTaskCreatedso all lifecycle events carry bothtrace_idandspan_id, making the JSONL 100% OTel-aligned from the first event. -
Same-batch span ordering hazard — when
TaskCreatedandTaskCompleted/TaskFailedland in the same 500-event batch,_process_eventcloses the span before_span_ids_for_eventcan look it up for intermediate events in the batch. Fixed by: _completed_span_ctxdict: stores theSpanContextsnapshot when a span is ended so intermediate events in the same batch can still retrieve it via.get()(not.pop()).- Terminal events (Completed/Failed/Canceled)
.pop()the entry, ensuring no unbounded growth. -
Same fix applied to the session span:
_completed_session_ctxsnapshot is taken atSessionEndedso JSONL serialization for same-batch session events still finds a valid context. -
ResourceUpdateJSONL correlation —ResourceUpdateevents now carry the session span'strace_idandspan_id(previously(None, None)), making node and GPU metrics queryable within the session trace in Jaeger / Tempo. -
TaskSubmitted/TaskQueuedfallback — ifTaskCreatedwas never seen (incomplete lifecycle, e.g. task registered before telemetry started),TaskSubmittedopens the task span instead, preserving best-effort correlation. -
External trace/span injection — components outside RHAPSODY (e.g. an orchestration layer or application code) can now emit events that carry an explicit
trace_id/span_idand have them written to the JSONL with the correct OTel correlation IDs, enabling cross-component trace stitching. -
Memory leak in
ConcurrentExecutionBackend._run_executable— file handles (stdout_f,stderr_f) opened forcapture_stdio=Truetasks were not closed on exception paths (subprocess launch failure, timeout, etc.). Refactored to initialize both handles andexit_codebefore thetryblock and close them infinally, eliminating the leak. -
OTel span parent hierarchy — task spans previously had no parent span and scattered across multiple unrelated traces. Root cause:
_dispatch_loopruns in a separate asyncioTaskand does not inherit the calling coroutine's OTel context. Fix: capturecontext.get_current()atTaskCreatedemit time, store in_task_contexts, restore viaattach/detachin_process_event(). All task spans now correctly nest under the session root span in the trace hierarchy. -
span_scope()session span fallback — when called from a context with no valid active span (e.g. inside a block body spawned beforestart_telemetry()),span_scope()now explicitly usesself._session_spanas the parent context instead of creating a root span. Replaces custom_null_context()with stdlibnullcontext.
Changed¶
-
TelemetryManager.export_as_otlp()— renamed fromexport_traces_otlp_json()for consistency with the OTel ecosystem naming convention. -
TelemetryManager.start()OTel provider setup — always usesResource.create({"service.name": "rhapsody", "session_id": ...})as the default resource (readsOTEL_SERVICE_NAMEfrom env automatically as fallback). Caller-suppliedspan_processorsare added viatracer_provider.add_span_processor()after the internalSpanBufferprocessor. Caller-suppliedmetric_readersare appended to theMeterProviderreader list alongside the internalInMemoryMetricReader.
Docs¶
docs/telemetry/integrations.md:- Fixed "Connecting an OTLP exporter" section: removed the broken
set_tracer_provider()pattern (RHAPSODY creates private providers and ignores globals) and replaced with the correctspan_processors=injection. Added tip block showing env-var configuration for Honeycomb/Grafana Cloud. - Fixed "Grafana + Prometheus Step 3": removed
set_meter_provider()pattern; replaced withmetric_readers=[PrometheusMetricReader()]. -
Updated "What RHAPSODY produces" table:
InMemorySpanExporter→SpanBuffer. -
docs/telemetry/quickstart.md: added three missing rows to thestart_telemetryparameters table (span_processors,metric_readers,resource). -
docs/telemetry/reference.md: added "Span annotation and enrichment API" section documentingspan_scope(),annotate_current_span(),register_span_enricher(), andexport_as_otlp(). -
docs/telemetry/index.md: updated "OpenTelemetry compatibility" paragraph to accurately describe the private-provider model and direct users to thespan_processors/metric_readersinjection API. -
ComputeTask: newcapture_stdio: bool = Falseparameter. WhenTrue, stdout and stderr from executable tasks are redirected directly to files ({work_dir}/{session_uid}/{task_uid}.stdout/.stderr) with zero in-memory buffering.task.stdoutandtask.stderrthen hold file paths instead of decoded strings. Function tasks are unaffected regardless of the flag value. ConcurrentExecutionBackend: honourscapture_stdioby passing open file handles toasyncio.create_subprocess_exec/asyncio.create_subprocess_shell; the kernel writes directly, no Python-level buffering.DaskExecutionBackend: honourscapture_stdioin the module-level_run_executablefunction; file handles are passed tosubprocess.runinside the worker process.DragonExecutionBackendV3: honourscapture_stdioby wrapping the command inbash -c "cmd 1>out 2>err", bypassing Dragon's internal PIPE infrastructure entirely.BaseBackend: added_work_dir: str(defaults toos.getcwd()) andis_attached: bool = False.Session.add_backend()is now the single authority that sets_work_dir = {session.work_dir}/{session.uid}and flipsis_attached = True.Session.add_backend(): raisesRuntimeErrorif the same backend instance is added to more than one session, preventing silent sharing bugs.- Unit tests for
capture_stdioacrossConcurrentExecutionBackend,DaskExecutionBackend, andDragonExecutionBackendV3. - Getting-started guide: new Capturing stdout/stderr to Files section in
docs/getting-started/advanced-usage.md.
[0.2.0] - 2026-04-04¶
Added¶
- Telemetry Abstraction Layer (TAL): new
rhapsody.telemetrypackage providing event-driven, backend-agnostic observability built on the OpenTelemetry Python SDK.TelemetryManager: central orchestrator owning the asyncio event bus, OTelMeterProvider+TracerProvider(in-memory), JSONL checkpoint writer, and subscriber fan-out.Session.enable_telemetry(resource_poll_interval, checkpoint_interval, checkpoint_path): one-call activation; returns aTelemetryManagerwired into the task state manager and all registered backends.- Event types:
SessionStarted,SessionEnded,TaskSubmitted,TaskQueued,TaskStarted,TaskCompleted,TaskFailed,ResourceUpdate— all frozen dataclasses with consistentevent_id,event_time,emit_time,session_id,backendfields. - OTel trace hierarchy: session span (root) → task spans (children, opened on
TaskStarted, closed onTaskCompleted/TaskFailed) →ResourceUpdateevents anchored to the session span. Every JSONL line carriestrace_idandspan_id. ResourceUpdate.resource_scope: computed discriminator field ("per_node"or"per_gpu") set automatically fromgpu_id. Replaces ad-hocgpu_id is Nonechecks throughout consumers.- Per-GPU telemetry:
ResourceUpdateevents withresource_scope="per_gpu"carry onlygpu_percentandgpu_id;cpu_percent,memory_percent, and all I/O fields areNoneon per-GPU events. Node-level aggregates useresource_scope="per_node". - Backend adapters:
ConcurrentTelemetryAdapter: psutil for CPU/memory/disk/net; pynvml (NVIDIA only) for per-device GPU, initialized once at thread start.ImportErroron missing pynvml is silent; driver failures (e.g., no NVIDIA kernel module loaded) log a one-timeWARNING.DaskTelemetryAdapter: pollsscheduler_info()per worker; emits disk I/O deltas from cumulative Dask counters; net I/O and GPU not exposed by Dask.DragonTelemetryAdapter: reads Dragon'sdpsmetric queue; emits per-device GPU events tagged withdevice_id; supports NVIDIA, AMD, and Intel GPU vendors.
- Adapter registration: failures during
_attach_telemetry_adapternow log atWARNINGwith full traceback instead of silently passing. TelemetryReaderandTelemetrySubscriber: typed pull/push interfaces overTelemetryManager.TelemetryManager.summary(),.task_spans(),.task_count(),.session_duration(): convenience methods returning plain dicts; no OTel knowledge required.- JSONL checkpoint file (
rhapsody.session.<id>.<ts>.telemetry.jsonl): line-buffered, readable during a live run; containsevent,metric, andspansections. - OTel contract test suite:
tests/unit/telemetry/conftest.py(AdapterCapabilities,assert_resource_update_contract) +tests/unit/telemetry/test_otel_contract.py(parametrized across Concurrent, Dask, Dragon). - Full telemetry documentation:
docs/telemetry/(overview, events & metrics reference, quick start, integrations).
DragonExecutionBackendV3: migrated to the Dragon batch.py streaming pipeline. Tasks submitted viasession.submit_tasks()are dispatched individually by a continuously running background thread — there is no compile or start step.DragonExecutionBackendV3: accepts two new constructor parameters:num_nodes(total nodes) andpool_nodes(nodes per worker pool), forwarded directly toBatch().DragonExecutionBackendV3.fence(): new method that delegates tobatch.fence(), allowing callers to wait for all in-flight tasks submitted by this client to complete.DragonExecutionBackendV3.create_ddict(): new helper that creates a DragonDDictinstance directly from the backend.DragonExecutionBackendV3._deliver_batch(): replaces_deliver_result/_deliver_failure. All tasks that complete within a single monitor sweep are collected into a list and delivered to the asyncio event loop in onecall_soon_threadsafecall instead of one per task — reducing cross-thread wakeups from O(tasks) to O(sweeps).DragonExecutionBackendV3result monitoring reads directly fromBatch.results_ddict(the same distributed dict Dragon workers write to) instead of going through theTaskobject. The membership check and value read are now a singletry/except KeyErroroperation, eliminating the redundant__contains__network round-trip (was two DDict RTTs per ready task, now one).DragonExecutionBackendV3: monitor thread now starts lazily on firstsubmit_tasks()call instead of at_async_inittime, eliminating the idle-spin phase while tasks are being built and registered.DragonExecutionBackendV3._cancelled_tasksconverted fromlisttoset: O(1) membership checks and automatic deduplication. Cancelled UIDs are now removed from the set once the monitor loop processes the task, preventing unbounded growth.DragonExecutionBackendV3._task_registryentries are now removed atomically (viadict.pop) on result or failure delivery, eliminating unbounded registry growth for long-running sessions.
Changed¶
ResourceUpdate.cpu_percentandResourceUpdate.memory_percent: type changed fromfloat(default0.0) tofloat | None(defaultNone). These fields areNoneonresource_scope="per_gpu"events and non-null floats onresource_scope="per_node"events. Consumers filtering node-level events should useevent.resource_scope == "per_node"instead ofevent.gpu_id is None.
Fixed¶
ConcurrentTelemetryAdapter: replaced nvidia-smi subprocess (one blocking call per poll, no per-device breakdown) with pynvml initialized once at thread start. Per-device GPU utilization is now collected with device index matching Dragon's pattern.ConcurrentTelemetryAdapter: pynvml driver failure (e.g., driver not loaded, permission denied) now logs a one-timeWARNINGinstead of silently disabling GPU metrics with no indication.ConcurrentExecutionBackend: regular (synchronous) functions are now executed correctly in bothThreadPoolExecutorandProcessPoolExecutor. Previously, all function tasks were dispatched viaasyncio.run(func(...)), which raisedValueErrorfor non-coroutine callables. The executor now detectsasyncio.iscoroutinefunctionand calls sync functions directly.Session/TaskStateManager: task futures now propagate exceptions on failure. Previously all futures were resolved withset_result(task)regardless of outcome. Failures now resolve as:- Function task raises → original exception propagated via
fut.set_exception(exc). - Executable task returns non-zero exit code →
fut.set_exception(TaskExecutionError(uid, stderr, exit_code)). - Successful tasks →
fut.set_result(task)(unchanged).
- Function task raises → original exception propagated via
Session.wait_tasks: replaced bareexcept Exception(which silently swallowed all errors) withasyncio.gather(*futures, return_exceptions=True). Task failures no longer propagate throughwait_tasks; callers inspecttask.state/task.exception/task.stderrdirectly.- API documentation: new Wait Modes reference in
docs/api/index.mdwith a comparison table ofwait_tasks,asyncio.gather(*futures), andawait future / await task, including exception types and when to use each. - API documentation: new Callable Task API section documenting sync and async function support, executor behaviour, and result access fields.
- Getting-started guide: new Wait Modes and Sync and Async Callable Tasks sections added to
docs/getting-started/advanced-usage.mdwith runnable code examples. ConcurrentExecutionBackend: executable tasks now honour per-task working directory viatask.cwd(task-level) ortask_backend_specific_kwargs={"cwd": "..."}(overrides task-level); passed ascwd=toasyncio.create_subprocess_exec/asyncio.create_subprocess_shell.DaskExecutionBackend: executable tasks now honourcwdfromtask_backend_specific_kwargs(overrides task-leveltask.cwd); forwarded to_run_executablewhich passes it ascwd=tosubprocess.run.DragonExecutionBackendV3: documented that per-task working directory must be set viatask_backend_specific_kwargs={"process_template": {"cwd": "..."}}(or"process_templates"for MPI jobs); this is the only supported mechanism in V3.ConcurrentExecutionBackend: executable tasks now honourenvfromtask_backend_specific_kwargs={"env": {...}}; passed asenv=toasyncio.create_subprocess_exec/asyncio.create_subprocess_shell.None(default) inherits the parent process environment.- Unit tests for
cwdandenvbehaviour acrossConcurrentExecutionBackend,DaskExecutionBackend, andDragonExecutionBackendV3. DaskExecutionBackendnow supports all threeComputeTasktypes: sync functions, async functions, and executable tasks (via subprocess inside a Dask worker).- Cluster-agnostic design: pass
cluster=orclient=to target any Dask-compatible cluster (SLURM, Kubernetes, LocalCUDACluster, etc.) without changing task code. - Resource scheduling via
task_backend_specific_kwargs={"resources": {"GPU": 1}}routes tasks to workers advertising matching Dask resources. shell=Truefor executable tasks viatask_backend_specific_kwargs={"shell": True}.DaskExecutionBackend._check_resources_satisfiable(): pre-submit check that immediately fails tasks with unsatisfiable resource constraints instead of hanging indefinitely. Function tasks settask.exception; executable tasks settask.stderrandtask.exit_code = 1.- Integration tests for Dask backend: end-to-end sync/async/executable task execution, cluster injection (cluster and client), and resource constraint failure behavior.
Performance¶
DragonExecutionBackendV3._monitor_loop: eliminated redundantBatch.results_ddict.__contains__check — each ready task previously required two distributed DDict network round-trips (membership test + value read); now a singletry/except KeyErrorread is used, saving ~570µs per task on HPC interconnects (~5.7s for 10K tasks).DragonExecutionBackendV3._monitor_loop: monitor thread now starts lazily at firstsubmit_tasks()call instead of at backend initialisation, eliminating ~1.2s of idle spinning while tasks are being built.DragonExecutionBackendV3._deliver_batch: cross-thread wakeups reduced from O(tasks) to O(sweeps) — all completions found in a single sweep are batched into onecall_soon_threadsafecall (~0.78s saved for 10K tasks).DragonExecutionBackendV3: removed per-tasklogger.debugcalls from the monitor hot path, eliminating 10K f-string allocations per run (29ms at 10K tasks, scales linearly).TaskStateManager: removed_task_statesshadow dict — task state is now read directly fromtask["state"](single source of truth), eliminating one dict write per task completion.TaskStateManager._update_task_impl:_task_futuresentries are now removed viadict.popon resolution, preventing unbounded memory growth at scale.Session: removed history recording (task["history"]timestamps) — twotime.time()syscalls and two dict writes per task eliminated.Session.get_statistics: method removed entirely; it depended on history timestamps and iterated all tasks on every call.
Measured on 2 nodes / 128 workers (HPC), 10K function tasks:
| Run 1 | Run 2 | Run 3 | Run 4 | |
|---|---|---|---|---|
| Dragon batch only | 9.82s | 10.92s | 9.99s | 10.08s |
| RHAPSODY + Dragon batch | 10.50s | 12.17s | 9.92s | 11.18s |
RHAPSODY overhead reduced from ~7.8s (before) to ≤1.3s (after) — within normal run-to-run variance of the Dragon batch itself.
Fixed¶
ConcurrentExecutionBackend._run_in_thread: calling a regular (sync) function no longer raisesValueError: a coroutine was expected.ConcurrentExecutionBackend._run_in_process: same fix — sync functions are called directly after cloudpickle deserialization.TaskStateManager._update_task_impl: futures for failed tasks are now resolved withset_exceptioninstead ofset_result, soawait futureandawait asyncio.gather(*futures)correctly raise on task failure.TaskStateManager.get_wait_future: same fix applied to the race-condition path (task already completed beforeget_wait_futurewas called).Session.wait_tasks: removedexcept Exception: logger.error(...)which was silently eating all non-timeout exceptions including programming errors.
Changed¶
ComputeTask: renamedworking_directoryparameter and dict key tocwdto distinguish task-level working directory from backend-level working directory (used by Dragon V1/V2 viaresources["working_dir"]). Update call sites:ComputeTask(cwd="/path")andtask["cwd"].DragonExecutionBackendV3: removed the silently-ignoredworking_directoryparameter from__init__andcreate()— it was never stored or used (unlike V1/V2 which honourresources["working_dir"]). Usetask_backend_specific_kwargs={"process_template": {"cwd": "..."}}instead.DaskExecutionBackend: resource constraints are now specified exclusively viatask_backend_specific_kwargs={"resources": {...}}(previously viagpu=/cpu_threads=onComputeTask).DaskExecutionBackend._build_dask_resources(): reads fromtask_backend_specific_kwargs["resources"]instead of task-levelgpu/cpu_threadsfields.DaskExecutionBackend._submit_executable():shell=now reads fromtask_backend_specific_kwargsfor consistency.- Optimized Dragon backend tests by using module-scoped fixtures, reusing a single backend instance per Dragon version (V1, V2, V3) across all tests instead of creating/destroying one per test.
Breaking Changes¶
DragonExecutionBackendV3: removednum_workers,disable_background_batching, anddisable_batch_submissionconstructor parameters. These were incompatible with the new streaming pipeline — the Dragon batch always runs in streaming mode and worker counts are controlled vianum_nodes/pool_nodes. Migration:DragonExecutionBackendV3(num_workers=16)→DragonExecutionBackendV3(num_nodes=4, pool_nodes=2)(or simply omit both to let Dragon decide)DragonExecutionBackendV3(disable_batch_submission=True)→ remove (streaming is now always on)DragonExecutionBackendV3(disable_background_batching=True)→ remove
ComputeTaskandAITask: removedranks,memory,gpu,cpu_threads, andenvironmentparameters. These fields were never consumed by any execution backend — they were silently ignored, creating a misleading API. Resource requirements are now backend-specific and must be passed viatask_backend_specific_kwargs. Migration:ComputeTask(executable=..., gpu=2)→ComputeTask(executable=..., task_backend_specific_kwargs={"resources": {"GPU": 2}})(Dask) ortask_backend_specific_kwargs={"process_template": {"gpu": 2}}(Dragon V3)ComputeTask(executable=..., environment={"K": "V"})→ComputeTask(executable=..., task_backend_specific_kwargs={"env": {"K": "V"}})(Concurrent/Dask)ComputeTask: removedcwdandshellas named parameters. All execution context is now exclusively specified viatask_backend_specific_kwargsfor consistency. Migration:ComputeTask(executable=..., cwd="/path")→ComputeTask(executable=..., task_backend_specific_kwargs={"cwd": "/path"})ComputeTask(executable=..., shell=True)→ComputeTask(executable=..., task_backend_specific_kwargs={"shell": True})
Fixed¶
_run_executable: replacedstdout=subprocess.PIPE, stderr=subprocess.PIPEwithcapture_output=True(ruff UP022).
[0.1.2] - 2026-02-16¶
Fixed¶
- Fixed project URLs in
pyproject.tomlto point to the correct GitHub repository. - Removed Python 3.8 classifier (package requires Python >= 3.9).
[0.1.1] - 2026-02-16¶
Fixed¶
- Fixed
ImportErrorwhen installingrhapsody-pywithout optional backends (Dragon, Dask, RADICAL-Pilot). Optional backend imports inbackends/__init__.pyare now wrapped intry/except.
[0.1.0] - 2025-02-16¶
Added¶
- Initial PyPI release as
rhapsody-py(import name:rhapsody). - Execution backends: Concurrent (built-in), Dask, RADICAL-Pilot, Dragon (V1, V2, V3).
- Inference backend: Dragon-VLLM for LLM serving on HPC.
- Task API:
ComputeTaskfor executables/functions,AITaskfor LLM inference. - Session API for task submission, monitoring, and lifecycle management.
- Backend discovery and registry system.
- Removed ThreadPoolExecutor wait approach (
_wait_executor), streamlining concurrency management. - Added
disable_batch_submissionparameter toDragonExecutionBackendV3for configurable task submission strategy. - Introduced polling-based batch monitoring with
_monitor_loopand_process_batch_results.