ClearML Integration

ROSE ships a plug-and-play ClearMLTracker that wires ClearML into any learner with a single line. For parallel learners, each sub-learner's metrics appear as separate series inside the same task — directly overlaid in the ClearML UI for convergence comparison.

pip install "rose[clearml]"

Quick start

from rose.integrations.clearml_tracker import ClearMLTracker

learner.add_tracker(
    ClearMLTracker(
        project_name="ROSE-Materials-UQ",
        task_name="ensemble-run-01",
    )
)

async for state in learner.start(learner_names=["A", "B"], max_iter=15):
    print(f"[{state.learner_id}] iter {state.iteration}: mse={state.metric_value:.4f}")
    # tracking is fully automatic — no clearml calls here

Open the ClearML web UI, navigate to project ROSE-Materials-UQ, and select the task ensemble-run-01. The Scalars tab shows overlaid curves per learner.

A complete runnable example is at examples/integrations/tracking/clearml/run_me.py.


What gets logged automatically

Hyperparameters — logged once in on_start

The entire pipeline manifest is logged to the ClearML task as hyperparameters, without any user annotation:

| ClearML hyperparameter | Source |
| --- | --- |
| learner_type | Learner class name |
| criterion/metric_name | as_stop_criterion(metric_name=...) |
| criterion/threshold | as_stop_criterion(threshold=...) |
| criterion/operator | as_stop_criterion(operator=...) |
| task/&lt;name&gt;/as_executable | Per registered task |
| task/&lt;name&gt;/&lt;key&gt; | Explicit log_params dict declared in the task decorator |
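The table above can be illustrated with a short sketch of how a nested manifest might flatten into ClearML's section/key hyperparameter namespace. The manifest shape and the flatten_manifest helper are assumptions for illustration, not ROSE's actual internal structure:

```python
# Hypothetical sketch: flatten a nested pipeline manifest into the
# flat {clearml_key: value} mapping shown in the table above.
# The manifest layout here is illustrative, not ROSE's real schema.

def flatten_manifest(manifest: dict) -> dict:
    """Build a flat hyperparameter dict using "section/key" names."""
    params = {"learner_type": manifest["learner_type"]}
    for key, value in manifest.get("criterion", {}).items():
        params[f"criterion/{key}"] = value
    for task_name, task_cfg in manifest.get("tasks", {}).items():
        params[f"task/{task_name}/as_executable"] = task_cfg.get("as_executable", False)
        for k, v in task_cfg.get("log_params", {}).items():
            params[f"task/{task_name}/{k}"] = v
    return params

manifest = {
    "learner_type": "SequentialActiveLearner",
    "criterion": {"metric_name": "mean_squared_error_mse",
                  "threshold": 0.05, "operator": "<"},
    "tasks": {"training": {"as_executable": False,
                           "log_params": {"n_estimators": 200}}},
}
flat = flatten_manifest(manifest)
# e.g. flat["criterion/threshold"] == 0.05
# and  flat["task/training/n_estimators"] == 200
```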

Scalars — logged per iteration in on_iteration

| ClearML scalar | Source |
| --- | --- |
| &lt;metric_name&gt; (e.g. mean_squared_error_mse) | Stop criterion value |
| Any numeric key in state.state | Auto-extracted from task dict returns |

For parallel learners, state.learner_id is included automatically. The tracker logs each learner as a separate series under the same scalar title, making per-learner curves directly comparable without any user code.

Task tags — logged in on_stop

| ClearML tag | Value |
| --- | --- |
| stop:&lt;reason&gt; | stop:criterion_met / stop:max_iter_reached / stop:stopped / stop:error |
| final_iter:&lt;n&gt; | Last completed iteration number |

Tags make it easy to filter tasks in the ClearML UI by outcome.


Parallel learner comparison

The ClearML tracker is designed with parallel learners in mind. Each on_iteration call carries state.learner_id — the tracker logs each learner as a separate scalar series under the same title:

Scalars tab in ClearML UI:
  ┌─ mean_squared_error_mse ──────────────────────────────┐
  │  ensemble-A  ───────────\                             │
  │  ensemble-B  ────────────\──────────────────────────  │
  └───────────────────────────────────────────────────────┘

No user code is required to achieve this — state.learner_id is already set by the parallel learner framework.
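The one-title, one-series-per-learner convention amounts to grouping iteration values by learner_id. A minimal self-contained sketch of that grouping (the record shape is an assumption for illustration):

```python
from collections import defaultdict

def group_by_series(states):
    """Group (learner_id, iteration, value) records into per-learner
    series, mirroring how curves overlay under one scalar title."""
    series = defaultdict(list)
    for learner_id, iteration, value in states:
        series[learner_id].append((iteration, value))
    return dict(series)

states = [("A", 1, 0.9), ("B", 1, 0.8), ("A", 2, 0.5), ("B", 2, 0.6)]
group_by_series(states)
# {"A": [(1, 0.9), (2, 0.5)], "B": [(1, 0.8), (2, 0.6)]}
```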


Multiple trackers

Attach ClearML alongside other trackers — they are independent observers:

from rose.integrations.clearml_tracker import ClearMLTracker

learner.add_tracker(HPC_FileTracker("run.jsonl"))   # safety net on HPC
learner.add_tracker(ClearMLTracker(project_name="x", task_name="y"))
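The independent-observer contract can be sketched in a few lines: each attached tracker receives every event, and no tracker depends on the others. The class names and the _notify hook below are illustrative, not ROSE's actual implementation:

```python
# Minimal sketch of the independent-observer pattern: every attached
# tracker sees every state; removing one does not affect the rest.

class RecordingTracker:
    def __init__(self, name):
        self.name = name
        self.events = []

    def on_iteration(self, state):
        self.events.append(state)

class Learner:
    def __init__(self):
        self._trackers = []

    def add_tracker(self, tracker):
        self._trackers.append(tracker)

    def _notify(self, state):
        for tracker in self._trackers:  # fan out to all observers
            tracker.on_iteration(state)

learner = Learner()
file_tracker = RecordingTracker("file")
clearml_tracker = RecordingTracker("clearml")
learner.add_tracker(file_tracker)
learner.add_tracker(clearml_tracker)
learner._notify({"iteration": 1, "mse": 0.3})
# both trackers now hold the same recorded state
```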

Extending ClearMLTracker

To log additional artifacts (model checkpoints, prediction plots), override on_stop:

from rose.integrations.clearml_tracker import ClearMLTracker

class ClearMLCheckpointTracker(ClearMLTracker):
    def on_stop(self, final_state, reason: str) -> None:
        # Log model checkpoint as a ClearML artifact before closing the task
        if final_state and self._task:
            checkpoint = final_state.get("model_checkpoint")
            if checkpoint:
                self._task.upload_artifact(
                    name="best_model",
                    artifact_object=checkpoint,
                )
        super().on_stop(final_state, reason)