Usage with Cylc workflow manager

The autora.workflow command line interface can be used with workflow managers such as cylc, running inside a virtualenv environment.

Note

This page covers basic usage of cylc. For usage with Slurm, see cylc-slurm-pip.

Prerequisites

This example requires:

familiarity with and a working installation of cylc (e.g. by going through the tutorial)
virtualenv
python3.8 (so you can run virtualenv venv -p python3.8)
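
Before installing the workflow, it can help to confirm that these tools are on the PATH. The following is a minimal sketch using only the Python standard library; the helper name find_prerequisites is hypothetical, and the executable names are simply those from the list above.

```python
import shutil
from typing import Dict, Optional, Sequence

# Executable names taken from the prerequisites list.
REQUIRED_EXECUTABLES = ("cylc", "virtualenv", "python3.8")


def find_prerequisites(
    names: Sequence[str] = REQUIRED_EXECUTABLES,
) -> Dict[str, Optional[str]]:
    """Map each executable name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_prerequisites().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

Any entry reported as NOT FOUND needs to be installed (or added to the PATH) before running the workflow.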

A new environment will be created during the setup phase of the cylc workflow run.

Setup

To initialize the workflow, we define a file in the lib/python directory (a cylc convention) containing the code for the experiment, lib/python/components.py, with all the required functions:

import pandas as pd
from sklearn.linear_model import LinearRegression

from autora.experimentalist.grid import grid_pool
from autora.state import StandardState, estimator_on_state, on_state
from autora.variable import Variable, VariableCollection


def initial_state(_):
    state = StandardState(
        variables=VariableCollection(
            independent_variables=[Variable(name="x", allowed_values=range(100))],
            dependent_variables=[Variable(name="y")],
            covariates=[],
        ),
        conditions=None,
        experiment_data=pd.DataFrame({"x": [], "y": []}),
        models=[],
    )
    return state


experimentalist = on_state(grid_pool, output=["conditions"])

experiment_runner = on_state(
    lambda conditions: conditions.assign(y=2 * conditions["x"] + 0.5),
    output=["experiment_data"],
)

theorist = estimator_on_state(LinearRegression(fit_intercept=True))

These functions will be called in turn by the autora.workflow script.
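
Conceptually, each invocation of the script loads a state from an input path, applies one named function to it, and writes the result to an output path. The sketch below illustrates that pattern only; it is not the actual autora.workflow implementation. The names resolve and run_step are hypothetical, and the use of pickle for serialization is an assumption made for illustration.

```python
import importlib
import pickle
from pathlib import Path
from typing import Any, Callable, Optional


def resolve(qualified_name: str) -> Callable[[Any], Any]:
    """Import "package.module.attr" and return the named attribute."""
    module_name, _, attr = qualified_name.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def run_step(qualified_name: str, in_path: Optional[Path], out_path: Path) -> None:
    """Load a state (if any), apply one component to it, and save the new state."""
    state = None  # the first step (e.g. initial_state) has no input
    if in_path is not None:
        with open(in_path, "rb") as f:
            state = pickle.load(f)  # assumption: states are stored as pickles
    new_state = resolve(qualified_name)(state)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "wb") as f:
        pickle.dump(new_state, f)
```

Because every step communicates only through files, a workflow manager can schedule each step as a separate task.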

The flow.cylc file defines the workflow.

[scheduling]
    cycling mode = integer
    initial cycle point = 0
    final cycle point = 5
    [[graph]]
        R1/0 = """
            setup_python => initial_state
        """
        R1/1 = """
            initial_state[^] => experimentalist => experiment_runner => theorist
        """
        2/P1 = """
            theorist[-P1] => experimentalist => experiment_runner => theorist
        """

[runtime]
    [[setup_python]]
        script = """
            virtualenv "$CYLC_WORKFLOW_SHARE_DIR/env" -p python3.8
            source "$CYLC_WORKFLOW_SHARE_DIR/env/bin/activate"
            pip install --upgrade pip
            pip install -r "$CYLC_WORKFLOW_RUN_DIR/requirements.txt"
        """
    [[initial_state]]
        script = """
            $CYLC_WORKFLOW_SHARE_DIR/env/bin/python \
                -m autora.workflow \
                components.initial_state \
                --out-path "$CYLC_WORKFLOW_SHARE_DIR/$CYLC_TASK_CYCLE_POINT/result"
        """
    [[experimentalist]]
        script = """
            $CYLC_WORKFLOW_SHARE_DIR/env/bin/python -m autora.workflow \
                components.experimentalist \
                --in-path "$CYLC_WORKFLOW_SHARE_DIR/$((CYLC_TASK_CYCLE_POINT - 1))/result" \
                --out-path "$CYLC_WORKFLOW_SHARE_DIR/$CYLC_TASK_CYCLE_POINT/conditions"
        """
    [[experiment_runner]]
        script = """
            $CYLC_WORKFLOW_SHARE_DIR/env/bin/python -m autora.workflow \
                components.experiment_runner \
                --in-path "$CYLC_WORKFLOW_SHARE_DIR/$CYLC_TASK_CYCLE_POINT/conditions" \
                --out-path "$CYLC_WORKFLOW_SHARE_DIR/$CYLC_TASK_CYCLE_POINT/data"
        """
    [[theorist]]
        script = """
            $CYLC_WORKFLOW_SHARE_DIR/env/bin/python -m autora.workflow \
                components.theorist \
                --in-path "$CYLC_WORKFLOW_SHARE_DIR/$CYLC_TASK_CYCLE_POINT/data" \
                --out-path "$CYLC_WORKFLOW_SHARE_DIR/$CYLC_TASK_CYCLE_POINT/result"
        """

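The data flow in the flow.cylc file above can be traced by listing which file each task reads and writes at a given cycle point. The following sketch mirrors the --in-path and --out-path arguments from the [runtime] section; the function name task_paths is hypothetical and the share directory is passed in as a plain string.

```python
from typing import Dict, Optional, Tuple

# (task, input file name, output file name), in the order given by the graph.
CYCLE_TASKS = (
    ("experimentalist", "result", "conditions"),
    ("experiment_runner", "conditions", "data"),
    ("theorist", "data", "result"),
)


def task_paths(share_dir: str, cycle: int) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {task: (in_path, out_path)} for one integer cycle point."""
    if cycle == 0:
        # Cycle 0 only creates the initial state (after setup_python); it reads nothing.
        return {"initial_state": (None, f"{share_dir}/0/result")}
    paths = {}
    for task, in_name, out_name in CYCLE_TASKS:
        # Only the experimentalist reads from the previous cycle point.
        in_cycle = cycle - 1 if task == "experimentalist" else cycle
        paths[task] = (
            f"{share_dir}/{in_cycle}/{in_name}",
            f"{share_dir}/{cycle}/{out_name}",
        )
    return paths


for cycle in range(3):
    print(cycle, task_paths("share", cycle))
```

Each cycle therefore starts from the previous cycle's result file and ends by writing a new result file under its own cycle point directory.
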
Note that the first step, setup_python, creates a new Python virtual environment and installs the packages listed in the workflow's requirements.txt file. This example uses the following requirements; yours will likely differ:

autora-core>=4.0.0
autora-workflow>=v0.5.0
cylc-flow==8.1.4
cylc-uiserver==1.2.2

Execution

We can call the cylc command line interface as follows, in a shell session:

First, we validate the flow.cylc file:

cylc validate .

We install the workflow:

cylc install .

We tell cylc to play the workflow:

cylc play "cylc-pip"

(As a shortcut for "validate, install and play", use cylc vip .)
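
For scripted runs, the same three commands can be issued from Python. This is a minimal sketch using subprocess; the helper names cylc_commands and run_all are hypothetical, and the workflow name "cylc-pip" is the one used on this page.

```python
import subprocess
from typing import List


def cylc_commands(source_dir: str = ".", workflow: str = "cylc-pip") -> List[List[str]]:
    """Build the validate, install and play commands as argument lists."""
    return [
        ["cylc", "validate", source_dir],
        ["cylc", "install", source_dir],
        ["cylc", "play", workflow],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all(cylc_commands())
```

With check=True, a failed validation raises an exception, so the workflow is not installed or played.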

We can view the workflow running in the graphical user interface (GUI):

cylc gui

... or the text user interface (TUI):

cylc tui "cylc-pip"

Results

We can load and interrogate the results as follows:

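from pathlib import Path
from autora.serializer import load_state

# Expand "~" explicitly, since Python does not shell-expand file paths.
# Each cycle point writes its own result; 5 is the final cycle point in flow.cylc.
path = Path("~/cylc-run/cylc-pip/runN/share/5/result").expanduser()
state = load_state(path)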
print(state)
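
As a sanity check, the fitted model in the loaded state should roughly recover the relationship hard-coded in experiment_runner, y = 2x + 0.5. The sketch below reproduces that expectation with a closed-form least-squares fit from the standard library, as a stand-in for LinearRegression. It assumes the conditions cover every x in range(100); fit_line is a hypothetical helper, not part of autora.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Same synthetic data as the experiment_runner in components.py.
xs = list(range(100))
ys = [2 * x + 0.5 for x in xs]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # expected to be close to 2 and 0.5
```

If the model stored in the state has a coefficient or intercept far from these values, the data passed between tasks is worth inspecting.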