autora.state

Classes to represent cycle state \(S\) as \(S_n = S_{0} + \sum_{i=1}^n \Delta S_{i}\).
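
The accumulation formula can be read as a left fold over a sequence of deltas. The sketch below is a simplified stand-in, not the `autora.state` implementation: it uses plain dictionaries and a hypothetical `apply_delta` helper, and it assumes every field extends rather than replaces.

```python
from functools import reduce
from typing import Dict, List

# Hypothetical stand-in: a "state" is a dict of lists, and every field extends.
StateDict = Dict[str, List]


def apply_delta(state: StateDict, delta: StateDict) -> StateDict:
    """Return a new state with each delta value concatenated onto the matching field."""
    new_state = {key: list(value) for key, value in state.items()}
    for key, value in delta.items():
        new_state[key] = new_state.get(key, []) + list(value)
    return new_state


s_0: StateDict = {"experiment_data": []}
deltas = [{"experiment_data": [1, 2]}, {"experiment_data": [3]}]

# S_n = S_0 + dS_1 + ... + dS_n, evaluated left to right.
s_n = reduce(apply_delta, deltas, s_0)
print(s_n)
```

The initial state `s_0` is left untouched, which mirrors the immutable (frozen) dataclasses used throughout this module.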

Result = Delta module-attribute

Result is an alias for Delta.

Delta

Bases: UserDict, Generic[S]

Represents a delta where the base object determines the extension behavior.

Examples:

>>> from dataclasses import dataclass

First we define the dataclass to act as the basis:

>>> from typing import Optional, List
>>> @dataclass(frozen=True)
... class ListState:
...     l: Optional[List] = None
...     m: Optional[List] = None
...
Source code in autora/state.py
class Delta(UserDict, Generic[S]):
    """
    Represents a delta where the base object determines the extension behavior.

    Examples:
        >>> from dataclasses import dataclass

        First we define the dataclass to act as the basis:
        >>> from typing import Optional, List
        >>> @dataclass(frozen=True)
        ... class ListState:
        ...     l: Optional[List] = None
        ...     m: Optional[List] = None
        ...
    """

    pass
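
Because `Delta` adds nothing to `UserDict`, it behaves like an ordinary mapping from field names to new values. The sketch below redefines the class locally so that it runs without `autora` installed; the field names `l` and `m` are only illustrative.

```python
from collections import UserDict
from typing import Generic, TypeVar

S = TypeVar("S")


class Delta(UserDict, Generic[S]):
    """Local copy of the class shown above: a UserDict with a type parameter."""


# Keyword arguments become the entries of the mapping.
d = Delta(l=["a", "b"], m=["x"])

print(sorted(d.keys()))  # the field names carried by the delta
print(d["l"])            # values are stored unchanged
print(dict(d))           # converts cleanly to a plain dict
```

How each entry is combined with existing data is decided by the object the delta is added to, not by the `Delta` itself.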

DeltaAddable

Bases: Protocol[C]

A class which a Delta or other Mapping can be added to, returning the same class

Source code in autora/state.py
class DeltaAddable(Protocol[C]):
    """A class which a Delta or other Mapping can be added to, returning the same class"""

    def __add__(self: C, other: Union[Delta, Mapping]) -> C:
        ...
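
A protocol like this is satisfied structurally: any class with a compatible `__add__` qualifies, without inheriting from it. The following is a minimal sketch under that assumption. `SeenLog` and `apply_all` are hypothetical names, and the protocol is redeclared locally with `Mapping` only so that the block is self-contained.

```python
from dataclasses import dataclass, field, replace
from typing import List, Mapping, Protocol, TypeVar

C = TypeVar("C")


class DeltaAddable(Protocol[C]):
    """Local redeclaration of the protocol, accepting any Mapping."""

    def __add__(self: C, other: Mapping) -> C:
        ...


@dataclass(frozen=True)
class SeenLog:
    """Hypothetical class; it matches the protocol without subclassing it."""

    seen: List[str] = field(default_factory=list)

    def __add__(self, other: Mapping) -> "SeenLog":
        # Return a new instance of the same class, as the protocol requires.
        return replace(self, seen=self.seen + list(other.get("seen", [])))


def apply_all(start: DeltaAddable, *updates: Mapping) -> DeltaAddable:
    """Add each mapping in turn to the starting object."""
    result = start
    for update in updates:
        result = result + update
    return result


total = apply_all(SeenLog(), {"seen": ["a"]}, {"seen": ["b", "c"]})
print(total)
```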

StandardState dataclass

Bases: State

Examples:

The state can be initialized empty:

>>> import pandas as pd
>>> from autora.variable import VariableCollection, Variable
>>> s = StandardState()
>>> s
StandardState(variables=None, conditions=None, experiment_data=None, models=[])

The variables can be updated using a Delta:

>>> dv1 = Delta(variables=VariableCollection(independent_variables=[Variable("1")]))
>>> s + dv1
StandardState(variables=VariableCollection(independent_variables=[Variable(name='1',...)

... and are replaced by each Delta:

>>> dv2 = Delta(variables=VariableCollection(independent_variables=[Variable("2")]))
>>> s + dv1 + dv2
StandardState(variables=VariableCollection(independent_variables=[Variable(name='2',...)

The conditions can be updated using a Delta:

>>> dc1 = Delta(conditions=pd.DataFrame({"x": [1, 2, 3]}))
>>> (s + dc1).conditions
   x
0  1
1  2
2  3

... and are replaced by each Delta:

>>> dc2 = Delta(conditions=pd.DataFrame({"x": [4, 5]}))
>>> (s + dc1 + dc2).conditions
   x
0  4
1  5

Datatypes other than pd.DataFrame will be coerced into a DataFrame if possible.

>>> import numpy as np
>>> dc3 = Delta(conditions=np.core.records.fromrecords([(8, "h"), (9, "i")], names="n,c"))
>>> (s + dc3).conditions
   n  c
0  8  h
1  9  i

If they are passed without column names, no column names are inferred. This is to ensure that accidental mislabeling of columns cannot occur. Column names should usually be provided.

>>> dc4 = Delta(conditions=[(6,), (7,)])
>>> (s + dc4).conditions
   0
0  6
1  7

Datatypes which are incompatible with a pd.DataFrame will throw an error:

>>> s + Delta(conditions="not compatible with pd.DataFrame")
Traceback (most recent call last):
...
ValueError: ...

Experiment data can be updated using a Delta:

>>> ded1 = Delta(experiment_data=pd.DataFrame({"x": [1,2,3], "y": ["a", "b", "c"]}))
>>> (s + ded1).experiment_data
   x  y
0  1  a
1  2  b
2  3  c

... and are extended with each Delta:

>>> ded2 = Delta(experiment_data=pd.DataFrame({"x": [4, 5, 6], "y": ["d", "e", "f"]}))
>>> (s + ded1 + ded2).experiment_data
   x  y
0  1  a
1  2  b
2  3  c
3  4  d
4  5  e
5  6  f

If they are passed without column names, no column names are inferred. This is to ensure that accidental mislabeling of columns cannot occur.

>>> ded3 = Delta(experiment_data=pd.DataFrame([(7, "g"), (8, "h")]))
>>> (s + ded3).experiment_data
   0  1
0  7  g
1  8  h

If data are already present, the column names should match; otherwise the result holds the union of the columns, with the gaps filled by NaN:

>>> (s + ded2 + ded3).experiment_data
     x    y    0    1
0  4.0    d  NaN  NaN
1  5.0    e  NaN  NaN
2  6.0    f  NaN  NaN
3  NaN  NaN  7.0    g
4  NaN  NaN  8.0    h

experiment_data other than pd.DataFrame will be coerced into a DataFrame if possible.

>>> import numpy as np
>>> ded4 = Delta(
...     experiment_data=np.core.records.fromrecords([(1, "a"), (2, "b")], names=["x", "y"]))
>>> (s + ded4).experiment_data
   x  y
0  1  a
1  2  b

experiment_data which are incompatible with a pd.DataFrame will throw an error:

>>> s + Delta(experiment_data="not compatible with pd.DataFrame")
Traceback (most recent call last):
...
ValueError: ...

models can be updated using a Delta:

>>> from sklearn.dummy import DummyClassifier
>>> dm1 = Delta(models=[DummyClassifier(constant=1)])
>>> dm2 = Delta(models=[DummyClassifier(constant=2), DummyClassifier(constant=3)])
>>> (s + dm1).models
[DummyClassifier(constant=1)]
>>> (s + dm1 + dm2).models
[DummyClassifier(constant=1), DummyClassifier(constant=2), DummyClassifier(constant=3)]
Source code in autora/state.py
@dataclass(frozen=True)
class StandardState(State):
    """
    Examples:
        The state can be initialized empty:
        >>> import pandas as pd
        >>> from autora.variable import VariableCollection, Variable
        >>> s = StandardState()
        >>> s
        StandardState(variables=None, conditions=None, experiment_data=None, models=[])

        The `variables` can be updated using a `Delta`:
        >>> dv1 = Delta(variables=VariableCollection(independent_variables=[Variable("1")]))
        >>> s + dv1 # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
        StandardState(variables=VariableCollection(independent_variables=[Variable(name='1',...)

        ... and are replaced by each `Delta`:
        >>> dv2 = Delta(variables=VariableCollection(independent_variables=[Variable("2")]))
        >>> s + dv1 + dv2 # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
        StandardState(variables=VariableCollection(independent_variables=[Variable(name='2',...)

        The `conditions` can be updated using a `Delta`:
        >>> dc1 = Delta(conditions=pd.DataFrame({"x": [1, 2, 3]}))
        >>> (s + dc1).conditions
           x
        0  1
        1  2
        2  3

        ... and are replaced by each `Delta`:
        >>> dc2 = Delta(conditions=pd.DataFrame({"x": [4, 5]}))
        >>> (s + dc1 + dc2).conditions
           x
        0  4
        1  5

        Datatypes other than `pd.DataFrame` will be coerced into a `DataFrame` if possible.
        >>> import numpy as np
        >>> dc3 = Delta(conditions=np.core.records.fromrecords([(8, "h"), (9, "i")], names="n,c"))
        >>> (s + dc3).conditions
           n  c
        0  8  h
        1  9  i

        If they are passed without column names, no column names are inferred.
        This is to ensure that accidental mislabeling of columns cannot occur.
        Column names should usually be provided.
        >>> dc4 = Delta(conditions=[(6,), (7,)])
        >>> (s + dc4).conditions
           0
        0  6
        1  7

        Datatypes which are incompatible with a pd.DataFrame will throw an error:
        >>> s + Delta(conditions="not compatible with pd.DataFrame") \
# doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
        Traceback (most recent call last):
        ...
        ValueError: ...

        Experiment data can be updated using a Delta:
        >>> ded1 = Delta(experiment_data=pd.DataFrame({"x": [1,2,3], "y": ["a", "b", "c"]}))
        >>> (s + ded1).experiment_data
           x  y
        0  1  a
        1  2  b
        2  3  c

        ... and are extended with each Delta:
        >>> ded2 = Delta(experiment_data=pd.DataFrame({"x": [4, 5, 6], "y": ["d", "e", "f"]}))
        >>> (s + ded1 + ded2).experiment_data
           x  y
        0  1  a
        1  2  b
        2  3  c
        3  4  d
        4  5  e
        5  6  f

        If they are passed without column names, no column names are inferred.
        This is to ensure that accidental mislabeling of columns cannot occur.
        >>> ded3 = Delta(experiment_data=pd.DataFrame([(7, "g"), (8, "h")]))
        >>> (s + ded3).experiment_data
           0  1
        0  7  g
        1  8  h

        If data are already present, the column names should match; otherwise the result
        holds the union of the columns, with the gaps filled by NaN:
        >>> (s + ded2 + ded3).experiment_data
             x    y    0    1
        0  4.0    d  NaN  NaN
        1  5.0    e  NaN  NaN
        2  6.0    f  NaN  NaN
        3  NaN  NaN  7.0    g
        4  NaN  NaN  8.0    h

        `experiment_data` other than `pd.DataFrame` will be coerced into a `DataFrame` if possible.
        >>> import numpy as np
        >>> ded4 = Delta(
        ...     experiment_data=np.core.records.fromrecords([(1, "a"), (2, "b")], names=["x", "y"]))
        >>> (s + ded4).experiment_data
           x  y
        0  1  a
        1  2  b

        `experiment_data` which are incompatible with a pd.DataFrame will throw an error:
        >>> s + Delta(experiment_data="not compatible with pd.DataFrame") \
# doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
        Traceback (most recent call last):
        ...
        ValueError: ...

        `models` can be updated using a Delta:
        >>> from sklearn.dummy import DummyClassifier
        >>> dm1 = Delta(models=[DummyClassifier(constant=1)])
        >>> dm2 = Delta(models=[DummyClassifier(constant=2), DummyClassifier(constant=3)])
        >>> (s + dm1).models
        [DummyClassifier(constant=1)]

        >>> (s + dm1 + dm2).models
        [DummyClassifier(constant=1), DummyClassifier(constant=2), DummyClassifier(constant=3)]

    """

    variables: Optional[VariableCollection] = field(
        default=None, metadata={"delta": "replace"}
    )
    conditions: Optional[pd.DataFrame] = field(
        default=None, metadata={"delta": "replace", "converter": pd.DataFrame}
    )
    experiment_data: Optional[pd.DataFrame] = field(
        default=None, metadata={"delta": "extend", "converter": pd.DataFrame}
    )
    models: List[BaseEstimator] = field(
        default_factory=list,
        metadata={"delta": "extend"},
    )
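
The update rule for each field lives in its `metadata`, which can be read back with `dataclasses.fields`. The sketch below uses a hypothetical `ExampleState` with the same metadata layout as above, but with `list` as the converter so that it runs without pandas.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass(frozen=True)
class ExampleState:
    """Hypothetical stand-in using the same metadata keys as StandardState."""

    conditions: Optional[list] = field(
        default=None, metadata={"delta": "replace", "converter": list}
    )
    models: List = field(default_factory=list, metadata={"delta": "extend"})


# Collect the delta behavior and the optional converter declared on each field.
behaviors = {
    f.name: (f.metadata["delta"], f.metadata.get("converter"))
    for f in fields(ExampleState)
}
print(behaviors)
```

Keeping the rule next to the field declaration means the behavior of `+` can be checked by reading the class definition alone.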

State dataclass

Base object for dataclasses which use the Delta mechanism.

Examples:

>>> from dataclasses import dataclass, field
>>> from typing import List, Optional

We define a dataclass where each field (which is going to be delta-ed) has additional metadata "delta" which describes its delta behaviour.

>>> @dataclass(frozen=True)
... class ListState(State):
...    l: List = field(default_factory=list, metadata={"delta": "extend"})
...    m: List = field(default_factory=list, metadata={"delta": "replace"})

Now we instantiate the dataclass...

>>> l = ListState(l=list("abc"), m=list("xyz"))
>>> l
ListState(l=['a', 'b', 'c'], m=['x', 'y', 'z'])

... and can add deltas to it. l will be extended:

>>> l + Delta(l=list("def"))
ListState(l=['a', 'b', 'c', 'd', 'e', 'f'], m=['x', 'y', 'z'])

... whereas m will be replaced:

>>> l + Delta(m=list("uvw"))
ListState(l=['a', 'b', 'c'], m=['u', 'v', 'w'])

... they can be chained:

>>> l + Delta(l=list("def")) + Delta(m=list("uvw"))
ListState(l=['a', 'b', 'c', 'd', 'e', 'f'], m=['u', 'v', 'w'])

... and we can update multiple fields with one Delta:

>>> l + Delta(l=list("ghi"), m=list("rst"))
ListState(l=['a', 'b', 'c', 'g', 'h', 'i'], m=['r', 's', 't'])

A non-existent field will be ignored:

>>> l + Delta(o="not a field")
ListState(l=['a', 'b', 'c'], m=['x', 'y', 'z'])

... but will trigger a warning:

>>> with warnings.catch_warnings(record=True) as w:
...     _ = l + Delta(o="not a field")
...     print(w[0].message)
These fields: ['o'] could not be used to update ListState,
which has these fields & aliases: ['l', 'm']

We can also use the .update method to do the same thing:

>>> l.update(l=list("ghi"), m=list("rst"))
ListState(l=['a', 'b', 'c', 'g', 'h', 'i'], m=['r', 's', 't'])

We can also define fields which append the last result:

>>> @dataclass(frozen=True)
... class AppendState(State):
...    n: List = field(default_factory=list, metadata={"delta": "append"})
>>> m = AppendState(n=list("ɑβɣ"))
>>> m
AppendState(n=['ɑ', 'β', 'ɣ'])

n will be appended:

>>> m + Delta(n="∂")
AppendState(n=['ɑ', 'β', 'ɣ', '∂'])

The metadata key "converter" is used to coerce types (inspired by PEP 712):

>>> @dataclass(frozen=True)
... class CoerceStateList(State):
...    o: Optional[List] = field(default=None, metadata={"delta": "replace"})
...    p: List = field(default_factory=list, metadata={"delta": "replace",
...                                                    "converter": list})
>>> r = CoerceStateList()

If there is no metadata["converter"] set for a field, no coercion occurs

>>> r + Delta(o="not a list")
CoerceStateList(o='not a list', p=[])

If there is a metadata["converter"] set for a field, the data are coerced:

>>> r + Delta(p="not a list")
CoerceStateList(o=None, p=['n', 'o', 't', ' ', 'a', ' ', 'l', 'i', 's', 't'])

If the input data are of the correct type, they are returned unaltered:

>>> r + Delta(p=["a", "list"])
CoerceStateList(o=None, p=['a', 'list'])

With a converter, inputs are converted to the type output by the converter:

>>> @dataclass(frozen=True)
... class CoerceStateDataFrame(State):
...    q: pd.DataFrame = field(default_factory=pd.DataFrame,
...                            metadata={"delta": "replace",
...                                      "converter": pd.DataFrame})

If the type is already correct, the object is passed to the converter, but should be returned unchanged:

>>> s = CoerceStateDataFrame()
>>> (s + Delta(q=pd.DataFrame([("a",1,"alpha"), ("b",2,"beta")], columns=list("xyz")))).q
   x  y      z
0  a  1  alpha
1  b  2   beta

If the type is not correct, the object is converted if possible. For a dataframe, we can convert records:

>>> (s + Delta(q=[("a",1,"alpha"), ("b",2,"beta")])).q
   0  1      2
0  a  1  alpha
1  b  2   beta

... or an array:

>>> (s + Delta(q=np.linspace([1, 2], [10, 15], 3))).q
      0     1
0   1.0   2.0
1   5.5   8.5
2  10.0  15.0

... or a dictionary:

>>> (s + Delta(q={"a": [1,2,3], "b": [4,5,6]})).q
   a  b
0  1  4
1  2  5
2  3  6

... or a list:

>>> (s + Delta(q=[11, 12, 13])).q
    0
0  11
1  12
2  13

... but not, for instance, a string:

>>> (s + Delta(q="not compatible with pd.DataFrame")).q
Traceback (most recent call last):
...
ValueError: DataFrame constructor not properly called!

Without a converter:

>>> @dataclass(frozen=True)
... class CoerceStateDataFrameNoConverter(State):
...    r: pd.DataFrame = field(default_factory=pd.DataFrame, metadata={"delta": "replace"})

... there is no coercion – the object is passed unchanged

>>> t = CoerceStateDataFrameNoConverter()
>>> (t + Delta(r=np.linspace([1, 2], [10, 15], 3))).r
array([[ 1. ,  2. ],
       [ 5.5,  8.5],
       [10. , 15. ]])

A converter can cast from a DataFrame to a np.ndarray (with a single datatype), for instance:

>>> @dataclass(frozen=True)
... class CoerceStateArray(State):
...    r: Optional[np.ndarray] = field(default=None,
...                            metadata={"delta": "replace",
...                                      "converter": np.asarray})

Here we pass a dataframe, but expect a numpy array:

>>> (CoerceStateArray() + Delta(r=pd.DataFrame([("a",1), ("b",2)], columns=list("xy")))).r
array([['a', 1],
       ['b', 2]], dtype=object)

We can define aliases which can transform between different potential field names.

Source code in autora/state.py
@dataclass(frozen=True)
class State:
    """
    Base object for dataclasses which use the Delta mechanism.

    Examples:
        >>> from dataclasses import dataclass, field
        >>> from typing import List, Optional

        We define a dataclass where each field (which is going to be delta-ed) has additional
        metadata "delta" which describes its delta behaviour.
        >>> @dataclass(frozen=True)
        ... class ListState(State):
        ...    l: List = field(default_factory=list, metadata={"delta": "extend"})
        ...    m: List = field(default_factory=list, metadata={"delta": "replace"})

        Now we instantiate the dataclass...
        >>> l = ListState(l=list("abc"), m=list("xyz"))
        >>> l
        ListState(l=['a', 'b', 'c'], m=['x', 'y', 'z'])

        ... and can add deltas to it. `l` will be extended:
        >>> l + Delta(l=list("def"))
        ListState(l=['a', 'b', 'c', 'd', 'e', 'f'], m=['x', 'y', 'z'])

        ... whereas `m` will be replaced:
        >>> l + Delta(m=list("uvw"))
        ListState(l=['a', 'b', 'c'], m=['u', 'v', 'w'])

        ... they can be chained:
        >>> l + Delta(l=list("def")) + Delta(m=list("uvw"))
        ListState(l=['a', 'b', 'c', 'd', 'e', 'f'], m=['u', 'v', 'w'])

        ... and we can update multiple fields with one Delta:
        >>> l + Delta(l=list("ghi"), m=list("rst"))
        ListState(l=['a', 'b', 'c', 'g', 'h', 'i'], m=['r', 's', 't'])

        A non-existent field will be ignored:
        >>> l + Delta(o="not a field")
        ListState(l=['a', 'b', 'c'], m=['x', 'y', 'z'])

        ... but will trigger a warning:
        >>> with warnings.catch_warnings(record=True) as w:
        ...     _ = l + Delta(o="not a field")
        ...     print(w[0].message) # doctest: +NORMALIZE_WHITESPACE
        These fields: ['o'] could not be used to update ListState,
        which has these fields & aliases: ['l', 'm']

        We can also use the `.update` method to do the same thing:
        >>> l.update(l=list("ghi"), m=list("rst"))
        ListState(l=['a', 'b', 'c', 'g', 'h', 'i'], m=['r', 's', 't'])

        We can also define fields which `append` the last result:
        >>> @dataclass(frozen=True)
        ... class AppendState(State):
        ...    n: List = field(default_factory=list, metadata={"delta": "append"})

        >>> m = AppendState(n=list("ɑβɣ"))
        >>> m
        AppendState(n=['ɑ', 'β', 'ɣ'])

        `n` will be appended:
        >>> m + Delta(n="∂")
        AppendState(n=['ɑ', 'β', 'ɣ', '∂'])

        The metadata key "converter" is used to coerce types (inspired by
        [PEP 712](https://peps.python.org/pep-0712/)):
        >>> @dataclass(frozen=True)
        ... class CoerceStateList(State):
        ...    o: Optional[List] = field(default=None, metadata={"delta": "replace"})
        ...    p: List = field(default_factory=list, metadata={"delta": "replace",
        ...                                                    "converter": list})

        >>> r = CoerceStateList()

        If there is no `metadata["converter"]` set for a field, no coercion occurs
        >>> r + Delta(o="not a list")
        CoerceStateList(o='not a list', p=[])

        If there is a `metadata["converter"]` set for a field, the data are coerced:
        >>> r + Delta(p="not a list")
        CoerceStateList(o=None, p=['n', 'o', 't', ' ', 'a', ' ', 'l', 'i', 's', 't'])

        If the input data are of the correct type, they are returned unaltered:
        >>> r + Delta(p=["a", "list"])
        CoerceStateList(o=None, p=['a', 'list'])

        With a converter, inputs are converted to the type output by the converter:
        >>> @dataclass(frozen=True)
        ... class CoerceStateDataFrame(State):
        ...    q: pd.DataFrame = field(default_factory=pd.DataFrame,
        ...                            metadata={"delta": "replace",
        ...                                      "converter": pd.DataFrame})

        If the type is already correct, the object is passed to the converter,
        but should be returned unchanged:
        >>> s = CoerceStateDataFrame()
        >>> (s + Delta(q=pd.DataFrame([("a",1,"alpha"), ("b",2,"beta")], columns=list("xyz")))).q
           x  y      z
        0  a  1  alpha
        1  b  2   beta

        If the type is not correct, the object is converted if possible. For a dataframe,
        we can convert records:
        >>> (s + Delta(q=[("a",1,"alpha"), ("b",2,"beta")])).q
           0  1      2
        0  a  1  alpha
        1  b  2   beta

        ... or an array:
        >>> (s + Delta(q=np.linspace([1, 2], [10, 15], 3))).q
              0     1
        0   1.0   2.0
        1   5.5   8.5
        2  10.0  15.0

        ... or a dictionary:
        >>> (s + Delta(q={"a": [1,2,3], "b": [4,5,6]})).q
           a  b
        0  1  4
        1  2  5
        2  3  6

        ... or a list:
        >>> (s + Delta(q=[11, 12, 13])).q
            0
        0  11
        1  12
        2  13

        ... but not, for instance, a string:
        >>> (s + Delta(q="not compatible with pd.DataFrame")).q
        Traceback (most recent call last):
        ...
        ValueError: DataFrame constructor not properly called!

        Without a converter:
        >>> @dataclass(frozen=True)
        ... class CoerceStateDataFrameNoConverter(State):
        ...    r: pd.DataFrame = field(default_factory=pd.DataFrame, metadata={"delta": "replace"})

        ... there is no coercion – the object is passed unchanged
        >>> t = CoerceStateDataFrameNoConverter()
        >>> (t + Delta(r=np.linspace([1, 2], [10, 15], 3))).r
        array([[ 1. ,  2. ],
               [ 5.5,  8.5],
               [10. , 15. ]])


        A converter can cast from a DataFrame to a np.ndarray (with a single datatype),
        for instance:
        >>> @dataclass(frozen=True)
        ... class CoerceStateArray(State):
        ...    r: Optional[np.ndarray] = field(default=None,
        ...                            metadata={"delta": "replace",
        ...                                      "converter": np.asarray})

        Here we pass a dataframe, but expect a numpy array:
        >>> (CoerceStateArray() + Delta(r=pd.DataFrame([("a",1), ("b",2)], columns=list("xy")))).r
        array([['a', 1],
               ['b', 2]], dtype=object)

        We can define aliases which can transform between different potential field
        names.

    """

    def __add__(self, other: Union[Delta, Mapping]):
        updates = dict()
        other_fields_unused = list(other.keys())
        for self_field in fields(self):
            other_value, key = _get_value(self_field, other)
            if other_value is None:
                continue
            other_fields_unused.remove(key)

            self_field_key = self_field.name
            self_value = getattr(self, self_field_key)
            delta_behavior = self_field.metadata["delta"]

            if (constructor := self_field.metadata.get("converter", None)) is not None:
                coerced_other_value = constructor(other_value)
            else:
                coerced_other_value = other_value

            if delta_behavior == "extend":
                extended_value = _extend(self_value, coerced_other_value)
                updates[self_field_key] = extended_value
            elif delta_behavior == "append":
                appended_value = _append(self_value, coerced_other_value)
                updates[self_field_key] = appended_value
            elif delta_behavior == "replace":
                updates[self_field_key] = coerced_other_value
            else:
                raise NotImplementedError(
                    "delta_behaviour=`%s` not implemented" % delta_behavior
                )

        if len(other_fields_unused) > 0:
            warnings.warn(
                "These fields: %s could not be used to update %s, "
                "which has these fields & aliases: %s"
                % (
                    other_fields_unused,
                    type(self).__name__,
                    _get_field_names_and_aliases(self),
                ),
            )

        new = replace(self, **updates)
        return new

    def update(self, **kwargs):
        """
        Return a new version of the State with values updated.

        This is identical to adding a `Delta`.

        If you need to replace values, ignoring the State value aggregation rules,
        use `dataclasses.replace` instead.
        """
        return self + Delta(**kwargs)

update(**kwargs)

Return a new version of the State with values updated.

This is identical to adding a Delta.

If you need to replace values, ignoring the State value aggregation rules, use dataclasses.replace instead.

Source code in autora/state.py
def update(self, **kwargs):
    """
    Return a new version of the State with values updated.

    This is identical to adding a `Delta`.

    If you need to replace values, ignoring the State value aggregation rules,
    use `dataclasses.replace` instead.
    """
    return self + Delta(**kwargs)
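
The difference between `update` and `dataclasses.replace` is easiest to see on a field whose rule is "extend". The sketch below is a simplified stand-in: `MiniState` is a hypothetical class that handles only "extend" and "replace" and omits converters, aliases and warnings.

```python
from dataclasses import dataclass, field, fields, replace
from typing import List


@dataclass(frozen=True)
class MiniState:
    """Simplified stand-in for State: supports only "extend" and "replace"."""

    data: List[int] = field(default_factory=list, metadata={"delta": "extend"})

    def update(self, **kwargs):
        updates = {}
        for f in fields(self):
            if f.name not in kwargs:
                continue
            if f.metadata["delta"] == "extend":
                updates[f.name] = getattr(self, f.name) + list(kwargs[f.name])
            else:  # "replace"
                updates[f.name] = kwargs[f.name]
        return replace(self, **updates)


s = MiniState(data=[1, 2])
extended = s.update(data=[3])       # follows the field's "extend" rule
overwritten = replace(s, data=[3])  # bypasses the rule entirely

print(extended)
print(overwritten)
```

Both calls return a new object and leave `s` unchanged; only the first one honors the aggregation rule.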

combined_functions_on_state(functions, output=None)

Decorator (factory) to turn a list of functions into a single function on a State. The resulting function uses a state field as input and combines the outputs of the functions.

Parameters:

Name       Type                        Description                                                               Default
functions  List[Tuple[str, Callable]]  the list of (name, function) tuples to be wrapped                         required
output     Optional[Sequence[str]]     list specifying State field names for the return values of the functions  None

Examples:

>>> @dataclass(frozen=True)
... class U(State):
...     conditions: List[int] = field(metadata={"delta": "replace"})
>>> identity = lambda conditions : conditions
>>> double_conditions = combined_functions_on_state(
...     [('id_1', identity), ('id_2', identity)], output=['conditions'])
>>> s = U([1, 2])
>>> s_double = double_conditions(s)
>>> s
U(conditions=[1, 2])
>>> s_double
U(conditions=[1, 2, 1, 2])

We can also pass parameters to the functions:

>>> def multiply(conditions, multiplier):
...     return [el * multiplier for el in conditions]
>>> double_and_triple = combined_functions_on_state(
...     [('doubler', multiply), ('tripler', multiply)], output=['conditions']
... )
>>> s = U([1, 2])
>>> s_double_triple = double_and_triple(
...     s, params={'doubler': {'multiplier': 2}, 'tripler': {'multiplier': 3}}
... )
>>> s_double_triple
U(conditions=[2, 4, 3, 6])

If the functions return a Delta object, we don't need to provide an output argument

>>> def decrement(conditions, dec):
...     return Delta(conditions=[el-dec for el in conditions])
>>> def increment(conditions, inc):
...     return Delta(conditions=[el+inc for el in conditions])
>>> dec_and_inc = combined_functions_on_state(
...     [('decrement', decrement), ('increment', increment)])
>>> s_dec_and_inc = dec_and_inc(
...     s, params={'decrement': {'dec': 10}, 'increment': {'inc': 2}})
>>> s_dec_and_inc
U(conditions=[-9, -8, 3, 4])
Source code in autora/state.py
def combined_functions_on_state(
    functions: List[Tuple[str, Callable]], output: Optional[Sequence[str]] = None
):
    """
    Decorator (factory) to turn a list of `functions` into a single function on a
    `State`. The resulting function uses a state field as input and combines the
    outputs of the `functions`.

    Args:
        functions: the list of `(name, function)` tuples to be wrapped
        output: list specifying State field names for the return values of the `functions`

    Examples:
        >>> @dataclass(frozen=True)
        ... class U(State):
        ...     conditions: List[int] = field(metadata={"delta": "replace"})
        >>> identity = lambda conditions : conditions
        >>> double_conditions = combined_functions_on_state(
        ...     [('id_1', identity), ('id_2', identity)], output=['conditions'])
        >>> s = U([1, 2])
        >>> s_double = double_conditions(s)
        >>> s
        U(conditions=[1, 2])
        >>> s_double
        U(conditions=[1, 2, 1, 2])

        # We can also pass parameters to the functions:
        >>> def multiply(conditions, multiplier):
        ...     return [el * multiplier for el in conditions]
        >>> double_and_triple = combined_functions_on_state(
        ...     [('doubler', multiply), ('tripler', multiply)], output=['conditions']
        ... )
        >>> s = U([1, 2])
        >>> s_double_triple = double_and_triple(
        ...     s, params={'doubler': {'multiplier': 2}, 'tripler': {'multiplier': 3}}
        ... )
        >>> s_double_triple
        U(conditions=[2, 4, 3, 6])

        # If the functions return a Delta object, we don't need to provide an output argument
        >>> def decrement(conditions, dec):
        ...     return Delta(conditions=[el-dec for el in conditions])
        >>> def increment(conditions, inc):
        ...     return Delta(conditions=[el+inc for el in conditions])
        >>> dec_and_inc = combined_functions_on_state(
        ...     [('decrement', decrement), ('increment', increment)])
        >>> s_dec_and_inc = dec_and_inc(
        ...     s, params={'decrement': {'dec': 10}, 'increment': {'inc': 2}})
        >>> s_dec_and_inc
        U(conditions=[-9, -8, 3, 4])

    """

    def f_(_state: State, params: Optional[Dict] = None):
        result_delta = None
        for name, function in functions:
            _f_input_from_state = inputs_from_state(function)
            if params is None:
                _params = {}
            else:
                _params = params
            if name in _params.keys():
                _delta = _f_input_from_state(_state, **_params[name])
            else:
                _delta = _f_input_from_state(_state)
            result_delta = _extend(result_delta, _delta)
        return result_delta

    if output:
        f_ = outputs_to_delta(*output)(f_)
    f_ = delta_to_state(f_)
    return f_
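
The routing of `params` by function name can be shown in isolation. The sketch below is a simplified illustration, not the library code: `combine` is a hypothetical helper that works on plain lists and skips the `State` and `Delta` handling.

```python
from typing import Callable, Dict, List, Optional, Tuple


def combine(functions: List[Tuple[str, Callable]]):
    """Hypothetical combiner: run each function and concatenate the results."""

    def combined(conditions: List[int], params: Optional[Dict] = None) -> List[int]:
        params = params or {}
        result: List[int] = []
        for name, function in functions:
            # Each function only receives the keyword arguments filed under its own name.
            result.extend(function(conditions, **params.get(name, {})))
        return result

    return combined


def multiply(conditions, multiplier=1):
    return [el * multiplier for el in conditions]


double_and_triple = combine([("doubler", multiply), ("tripler", multiply)])
out = double_and_triple(
    [1, 2], params={"doubler": {"multiplier": 2}, "tripler": {"multiplier": 3}}
)
print(out)
```

Keying the parameters by name lets the same function appear twice in the list with different settings.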

delta_to_state(f)

Decorator which makes a function f, which takes a State and returns a Delta, return an updated State instead.

This wrapper handles adding a returned Delta to an input State object.

Parameters:

Name  Type  Description                                Default
f           the function which returns a Delta object  required

Returns: the function modified to return a State object
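
The idea behind the wrapper can be sketched in a few lines: call the function, check that it returned something mapping-like, and add that to the input. This is an assumption-laden illustration rather than the library's code. `delta_to_state_sketch` and `Total` are hypothetical names, and the real decorator may differ in its checks and error messages.

```python
from functools import wraps
from typing import Callable, Mapping


def delta_to_state_sketch(f: Callable) -> Callable:
    """Hypothetical sketch of the wrapper idea: return `state + f(state)`."""

    @wraps(f)
    def wrapped(state, *args, **kwargs):
        delta = f(state, *args, **kwargs)
        assert isinstance(delta, Mapping), "output must be a mapping-like delta"
        return state + delta

    return wrapped


class Total:
    """Hypothetical state: a single value which each delta replaces."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Total(other.get("value", self.value))


@delta_to_state_sketch
def plus_ten(state):
    return {"value": state.value + 10}


result = plus_ten(Total(1))
print(result.value)
```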

Examples:

>>> from dataclasses import dataclass, field
>>> import pandas as pd
>>> from typing import List, Optional

The State it operates on needs to have the metadata described in the state module:

>>> @dataclass(frozen=True)
... class U(State):
...     conditions: List[int] = field(metadata={"delta": "replace"})

We indicate the inputs required by the parameter names. The output must be (compatible with) a Delta object.

>>> @delta_to_state
... @inputs_from_state
... def experimentalist(conditions):
...     new_conditions = [c + 10 for c in conditions]
...     return Delta(conditions=new_conditions)
>>> experimentalist(U(conditions=[1,2,3,4]))
U(conditions=[11, 12, 13, 14])
>>> experimentalist(U(conditions=[101,102,103,104]))
U(conditions=[111, 112, 113, 114])

If the output of the function is not a Delta object (or something compatible with its interface), an AssertionError is raised.

>>> @delta_to_state
... @inputs_from_state
... def returns_bare_conditions(conditions):
...     new_conditions = [c + 10 for c in conditions]
...     return new_conditions
>>> returns_bare_conditions(U(conditions=[1]))
Traceback (most recent call last):
...
AssertionError: Output of <function returns_bare_conditions at 0x...> must be a `Delta`,
`UserDict`, or `dict`.

A dictionary can be returned and used:

>>> @delta_to_state
... @inputs_from_state
... def returns_a_dictionary(conditions):
...     new_conditions = [c + 10 for c in conditions]
...     return {"conditions": new_conditions}
>>> returns_a_dictionary(U(conditions=[2]))
U(conditions=[12])

... as can an object which subclasses UserDict (like Delta):

>>> from collections import UserDict
>>> class MyDelta(UserDict):
...     pass
>>> @delta_to_state
... @inputs_from_state
... def returns_a_userdict(conditions):
...     new_conditions = [c + 10 for c in conditions]
...     return MyDelta(conditions=new_conditions)
>>> returns_a_userdict(U(conditions=[3]))
U(conditions=[13])

We recommend using the Delta object rather than a UserDict or dict, as its functionality may be expanded in the future.

The same pattern works for a theorist which fits a model from fields on the state:

>>> from autora.variable import VariableCollection, Variable
>>> from sklearn.base import BaseEstimator
>>> from sklearn.linear_model import LinearRegression
>>> @delta_to_state
... @inputs_from_state
... def theorist(experiment_data: pd.DataFrame, variables: VariableCollection, **kwargs):
...     ivs = [vi.name for vi in variables.independent_variables]
...     dvs = [vi.name for vi in variables.dependent_variables]
...     X, y = experiment_data[ivs], experiment_data[dvs]
...     new_model = LinearRegression(fit_intercept=True).set_params(**kwargs).fit(X, y)
...     return Delta(model=new_model)
>>> @dataclass(frozen=True)
... class V(State):
...     variables: VariableCollection  # field(metadata={"delta":... }) omitted ∴ immutable
...     experiment_data: pd.DataFrame = field(metadata={"delta": "extend"})
...     model: Optional[BaseEstimator] = field(metadata={"delta": "replace"}, default=None)
>>> v = V(
...     variables=VariableCollection(independent_variables=[Variable("x")],
...                                  dependent_variables=[Variable("y")]),
...     experiment_data=pd.DataFrame({"x": [0,1,2,3,4], "y": [2,3,4,5,6]})
... )
>>> v_prime = theorist(v)
>>> v_prime.model.coef_, v_prime.model.intercept_
(array([[1.]]), array([2.]))

Arguments from the state can be overridden by passing them in as keyword arguments (kwargs):

>>> theorist(v, experiment_data=pd.DataFrame({"x": [0,1,2,3], "y": [12,13,14,15]}))\
...     .model.intercept_
array([12.])

... and other arguments supported by the inner function can also be passed (if and only if the inner function allows for and handles **kwargs arguments alongside the values from the state).

>>> theorist(v, fit_intercept=False).model.intercept_
0.0

Any parameters not provided by the state must be provided by default values or by the caller. If the default is specified:

>>> @delta_to_state
... @inputs_from_state
... def experimentalist(conditions, offset=25):
...     new_conditions = [c + offset for c in conditions]
...     return Delta(conditions=new_conditions)

... then it need not be passed.

>>> experimentalist(U(conditions=[1,2,3,4]))
U(conditions=[26, 27, 28, 29])

If a default isn't specified:

>>> @delta_to_state
... @inputs_from_state
... def experimentalist(conditions, offset):
...     new_conditions = [c + offset for c in conditions]
...     return Delta(conditions=new_conditions)

... then calling the experimentalist without it raises a TypeError:

>>> experimentalist(U(conditions=[1,2,3,4]))
Traceback (most recent call last):
...
TypeError: experimentalist() missing 1 required positional argument: 'offset'

... which can be fixed by passing the argument as a keyword to the wrapped function.

>>> experimentalist(U(conditions=[1,2,3,4]), offset=2)
U(conditions=[3, 4, 5, 6])

The state itself is passed through if the inner function requests the state:

>>> @delta_to_state
... @inputs_from_state
... def function_which_needs_whole_state(state, conditions):
...     print("Doing something on: ", state)
...     new_conditions = [c + 2 for c in conditions]
...     return Delta(conditions=new_conditions)
>>> function_which_needs_whole_state(U(conditions=[1,2,3,4]))
Doing something on:  U(conditions=[1, 2, 3, 4])
U(conditions=[3, 4, 5, 6])
Source code in autora/state.py
def delta_to_state(f):
    """Decorator to make `f` which takes a `State` and returns a `Delta` return an updated `State`.

    This wrapper handles adding a returned Delta to an input State object.

    Args:
        f: the function which returns a `Delta` object

    Returns: the function modified to return a State object

    Examples:
        >>> from dataclasses import dataclass, field
        >>> import pandas as pd
        >>> from typing import List, Optional

        The `State` it operates on needs to have the metadata described in the state module:
        >>> @dataclass(frozen=True)
        ... class U(State):
        ...     conditions: List[int] = field(metadata={"delta": "replace"})

        We indicate the inputs required by the parameter names.
        The output must be (compatible with) a `Delta` object.
        >>> @delta_to_state
        ... @inputs_from_state
        ... def experimentalist(conditions):
        ...     new_conditions = [c + 10 for c in conditions]
        ...     return Delta(conditions=new_conditions)

        >>> experimentalist(U(conditions=[1,2,3,4]))
        U(conditions=[11, 12, 13, 14])

        >>> experimentalist(U(conditions=[101,102,103,104]))
        U(conditions=[111, 112, 113, 114])

        If the output of the function is not a `Delta` object (or something compatible with its
        interface), then an error is thrown.
        >>> @delta_to_state
        ... @inputs_from_state
        ... def returns_bare_conditions(conditions):
        ...     new_conditions = [c + 10 for c in conditions]
        ...     return new_conditions

        >>> returns_bare_conditions(U(conditions=[1])) # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE
        Traceback (most recent call last):
        ...
        AssertionError: Output of <function returns_bare_conditions at 0x...> must be a `Delta`,
        `UserDict`, or `dict`.

        A dictionary can be returned and used:
        >>> @delta_to_state
        ... @inputs_from_state
        ... def returns_a_dictionary(conditions):
        ...     new_conditions = [c + 10 for c in conditions]
        ...     return {"conditions": new_conditions}
        >>> returns_a_dictionary(U(conditions=[2]))
        U(conditions=[12])

        ... as can an object which subclasses UserDict (like `Delta`)
        >>> class MyDelta(UserDict):
        ...     pass
        >>> @delta_to_state
        ... @inputs_from_state
        ... def returns_a_userdict(conditions):
        ...     new_conditions = [c + 10 for c in conditions]
        ...     return MyDelta(conditions=new_conditions)
        >>> returns_a_userdict(U(conditions=[3]))
        U(conditions=[13])

        We recommend using the `Delta` object rather than a `UserDict` or `dict` as its
        functionality may be expanded in future.

        >>> from autora.variable import VariableCollection, Variable
        >>> from sklearn.base import BaseEstimator
        >>> from sklearn.linear_model import LinearRegression

        >>> @delta_to_state
        ... @inputs_from_state
        ... def theorist(experiment_data: pd.DataFrame, variables: VariableCollection, **kwargs):
        ...     ivs = [vi.name for vi in variables.independent_variables]
        ...     dvs = [vi.name for vi in variables.dependent_variables]
        ...     X, y = experiment_data[ivs], experiment_data[dvs]
        ...     new_model = LinearRegression(fit_intercept=True).set_params(**kwargs).fit(X, y)
        ...     return Delta(model=new_model)

        >>> @dataclass(frozen=True)
        ... class V(State):
        ...     variables: VariableCollection  # field(metadata={"delta":... }) omitted ∴ immutable
        ...     experiment_data: pd.DataFrame = field(metadata={"delta": "extend"})
        ...     model: Optional[BaseEstimator] = field(metadata={"delta": "replace"}, default=None)

        >>> v = V(
        ...     variables=VariableCollection(independent_variables=[Variable("x")],
        ...                                  dependent_variables=[Variable("y")]),
        ...     experiment_data=pd.DataFrame({"x": [0,1,2,3,4], "y": [2,3,4,5,6]})
        ... )
        >>> v_prime = theorist(v)
        >>> v_prime.model.coef_, v_prime.model.intercept_
        (array([[1.]]), array([2.]))

        Arguments from the state can be overridden by passing them in as keyword arguments (kwargs):
        >>> theorist(v, experiment_data=pd.DataFrame({"x": [0,1,2,3], "y": [12,13,14,15]}))\\
        ...     .model.intercept_
        array([12.])

        ... and other arguments supported by the inner function can also be passed
        (if and only if the inner function allows for and handles `**kwargs` arguments alongside
        the values from the state).
        >>> theorist(v, fit_intercept=False).model.intercept_
        0.0

        Any parameters not provided by the state must be provided by default values or by the
        caller. If the default is specified:
        >>> @delta_to_state
        ... @inputs_from_state
        ... def experimentalist(conditions, offset=25):
        ...     new_conditions = [c + offset for c in conditions]
        ...     return Delta(conditions=new_conditions)

        ... then it need not be passed.
        >>> experimentalist(U(conditions=[1,2,3,4]))
        U(conditions=[26, 27, 28, 29])

        If a default isn't specified:
        >>> @delta_to_state
        ... @inputs_from_state
        ... def experimentalist(conditions, offset):
        ...     new_conditions = [c + offset for c in conditions]
        ...     return Delta(conditions=new_conditions)

        ... then calling the experimentalist without it will throw an error:
        >>> experimentalist(U(conditions=[1,2,3,4]))
        Traceback (most recent call last):
        ...
        TypeError: experimentalist() missing 1 required positional argument: 'offset'

        ... which can be fixed by passing the argument as a keyword to the wrapped function.
        >>> experimentalist(U(conditions=[1,2,3,4]), offset=2)
        U(conditions=[3, 4, 5, 6])

        The state itself is passed through if the inner function requests the `state`:
        >>> @delta_to_state
        ... @inputs_from_state
        ... def function_which_needs_whole_state(state, conditions):
        ...     print("Doing something on: ", state)
        ...     new_conditions = [c + 2 for c in conditions]
        ...     return Delta(conditions=new_conditions)
        >>> function_which_needs_whole_state(U(conditions=[1,2,3,4]))
        Doing something on:  U(conditions=[1, 2, 3, 4])
        U(conditions=[3, 4, 5, 6])

    """

    @wraps(f)
    def _f(state_: S, **kwargs) -> S:
        delta = f(state_, **kwargs)
        assert isinstance(delta, Mapping), (
            "Output of %s must be a `Delta`, `UserDict`, " "or `dict`." % f
        )
        new_state = state_ + delta
        return new_state

    return _f
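
The wrapper's core behavior (call `f`, check the result is a mapping, add it to the state) can be sketched without autora. In the sketch below, `MiniState` and `delta_to_state_sketch` are hypothetical names, and `MiniState.__add__` implements only "replace" semantics, which is an assumption made to keep the example short. Unlike the listing, the sketch raises a `TypeError` explicitly rather than using `assert`, because Python removes `assert` statements when run with `-O`.

```python
from collections.abc import Mapping
from dataclasses import dataclass, replace
from functools import wraps
from typing import List


@dataclass(frozen=True)
class MiniState:
    """Hypothetical stand-in for a State whose only field is replaced by a delta."""

    conditions: List[int]

    def __add__(self, delta: Mapping) -> "MiniState":
        # Only "replace" semantics are sketched; a new object is returned.
        return replace(self, **dict(delta))


def delta_to_state_sketch(f):
    """Make `f`, which returns a mapping, return an updated state instead."""

    @wraps(f)
    def wrapper(state, **kwargs):
        delta = f(state, **kwargs)
        if not isinstance(delta, Mapping):
            raise TypeError(
                f"{f.__name__} must return a mapping, got {type(delta).__name__}"
            )
        return state + delta  # the input state itself is left untouched

    return wrapper


@delta_to_state_sketch
def shift(state, offset=10):
    return {"conditions": [c + offset for c in state.conditions]}


before = MiniState(conditions=[1, 2, 3])
after = shift(before, offset=5)
print(before)
print(after)
```

Because the state is a frozen dataclass, `before` still holds its original values after the call, and `after` is a separate object carrying the update.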

estimator_on_state(estimator)

Convert a scikit-learn compatible estimator into a function on a State object.

Supports passing additional **kwargs which are used to update the estimator's params before fitting.

Examples:

Initialize a function, state_fn, which operates on the state and runs a LinearRegression:

>>> from sklearn.linear_model import LinearRegression
>>> state_fn = estimator_on_state(LinearRegression())

Define the state on which to operate (here an instance of the StandardState):

>>> from autora.state import StandardState
>>> from autora.variable import Variable, VariableCollection
>>> import pandas as pd
>>> s = StandardState(
...     variables=VariableCollection(
...         independent_variables=[Variable("x")],
...         dependent_variables=[Variable("y")]),
...     experiment_data=pd.DataFrame({"x": [1,2,3], "y":[3,6,9]})
... )

Run the function, which fits the model and adds the result to the StandardState as the last entry in the .models list.

>>> state_fn(s).models[-1].coef_
array([[3.]])
Source code in autora/state.py
def estimator_on_state(estimator: BaseEstimator) -> StateFunction:
    """
    Convert a scikit-learn compatible estimator into a function on a `State` object.

    Supports passing additional `**kwargs` which are used to update the estimator's params
    before fitting.

    Examples:
        Initialize a function, `state_fn`, which operates on the state and runs a LinearRegression:
        >>> from sklearn.linear_model import LinearRegression
        >>> state_fn = estimator_on_state(LinearRegression())

        Define the state on which to operate (here an instance of the `StandardState`):
        >>> from autora.state import StandardState
        >>> from autora.variable import Variable, VariableCollection
        >>> import pandas as pd
        >>> s = StandardState(
        ...     variables=VariableCollection(
        ...         independent_variables=[Variable("x")],
        ...         dependent_variables=[Variable("y")]),
        ...     experiment_data=pd.DataFrame({"x": [1,2,3], "y":[3,6,9]})
        ... )

        Run the function, which fits the model and adds the result to the `StandardState` as the
        last entry in the .models list.
        >>> state_fn(s).models[-1].coef_
        array([[3.]])

    """

    @on_state()
    def theorist(
        experiment_data: pd.DataFrame, variables: VariableCollection, **kwargs
    ):
        ivs = [v.name for v in variables.independent_variables]
        dvs = [v.name for v in variables.dependent_variables]
        X, y = experiment_data[ivs], experiment_data[dvs]
        new_model = estimator.set_params(**kwargs).fit(X, y)
        return Delta(models=[new_model])

    return theorist
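
One consequence of the listing above is worth illustrating: `estimator.set_params(**kwargs).fit(X, y)` operates on the single estimator instance captured by the closure, and under the scikit-learn convention both `set_params` and `fit` return `self`. The sketch below uses a hypothetical `SlopeEstimator` stand-in (not a scikit-learn class) and a hypothetical `make_fitter` helper to show that repeated calls hand back the same, refitted object.

```python
class SlopeEstimator:
    """Hypothetical stand-in following the scikit-learn convention that
    `set_params` and `fit` both return `self`."""

    def __init__(self):
        self.coef_ = None

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self

    def fit(self, X, y):
        # Least-squares slope of a line through the origin.
        self.coef_ = sum(a * b for a, b in zip(X, y)) / sum(a * a for a in X)
        return self


def make_fitter(estimator):
    """Capture one estimator instance, as the closure in the listing does."""

    def fit_once(X, y, **kwargs):
        return estimator.set_params(**kwargs).fit(X, y)

    return fit_once


fitter = make_fitter(SlopeEstimator())
first = fitter([1, 2, 3], [3, 6, 9])   # slope 3.0 at this point
second = fitter([1, 2, 3], [2, 4, 6])  # refits the same object to slope 2.0
print(first is second)  # True
print(first.coef_)      # 2.0
```

If independent models per call are needed, one option (an assumption about usage, not something the listing does) is to fit a fresh copy each time, for example one produced by `sklearn.base.clone`.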

experiment_runner_on_state(f)

Wrapper for experiment_runner of the form \(f(x) \rightarrow (x,y)\), where f returns both \(x\) and \(y\) values in a complete dataframe.

Examples:

The conditions are some x-values in a StandardState object:

>>> import pandas as pd
>>> from autora.state import StandardState
>>> s = StandardState(conditions=pd.DataFrame({"x": [1, 2, 3]}))

The function can be defined on a DataFrame, allowing the explicit inclusion of metadata like column names.

>>> def x_to_xy_fn(c: pd.DataFrame) -> pd.DataFrame:
...     result = c.assign(y=lambda df: 2 * df.x + 1)
...     return result

We apply the wrapped function to s and look at the returned experiment_data:

>>> experiment_runner_on_state(x_to_xy_fn)(s).experiment_data
   x  y
0  1  3
1  2  5
2  3  7

We can also define functions of several variables:

>>> def xs_to_xy_fn(c: pd.DataFrame) -> pd.DataFrame:
...     result = c.assign(y=c.x0 + c.x1)
...     return result

With the relevant variables as conditions:

>>> t = StandardState(conditions=pd.DataFrame({"x0": [1, 2, 3], "x1": [10, 20, 30]}))
>>> experiment_runner_on_state(xs_to_xy_fn)(t).experiment_data
   x0  x1   y
0   1  10  11
1   2  20  22
2   3  30  33
Source code in autora/state.py
def experiment_runner_on_state(f: Callable[[X], XY]) -> StateFunction:
    """Wrapper for experiment_runner of the form $f(x) \\rightarrow (x,y)$, where `f`
    returns both $x$ and $y$ values in a complete dataframe.

    Examples:
        The conditions are some x-values in a StandardState object:
        >>> from autora.state import StandardState
        >>> s = StandardState(conditions=pd.DataFrame({"x": [1, 2, 3]}))

        The function can be defined on a DataFrame, allowing the explicit inclusion of
        metadata like column names.
        >>> def x_to_xy_fn(c: pd.DataFrame) -> pd.DataFrame:
        ...     result = c.assign(y=lambda df: 2 * df.x + 1)
        ...     return result

        We apply the wrapped function to `s` and look at the returned experiment_data:
        >>> experiment_runner_on_state(x_to_xy_fn)(s).experiment_data
           x  y
        0  1  3
        1  2  5
        2  3  7

        We can also define functions of several variables:
        >>> def xs_to_xy_fn(c: pd.DataFrame) -> pd.DataFrame:
        ...     result = c.assign(y=c.x0 + c.x1)
        ...     return result

        With the relevant variables as conditions:
        >>> t = StandardState(conditions=pd.DataFrame({"x0": [1, 2, 3], "x1": [10, 20, 30]}))
        >>> experiment_runner_on_state(xs_to_xy_fn)(t).experiment_data
           x0  x1   y
        0   1  10  11
        1   2  20  22
        2   3  30  33

    """

    @on_state()
    def experiment_runner(conditions: pd.DataFrame, **kwargs):
        x = conditions
        experiment_data = f(x, **kwargs)
        return Delta(experiment_data=experiment_data)

    return experiment_runner
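
The inner `experiment_runner` above forwards any extra `**kwargs` to `f`, so settings of the experiment can be supplied at call time. A dependency-free sketch of that forwarding follows. Lists of dicts stand in for DataFrames, `runner_on_state_sketch` and `linear_experiment` are hypothetical names, and appending to `experiment_data` is an assumption that mimics an "extend" field.

```python
from typing import Callable, Dict, List

Rows = List[Dict[str, float]]  # stand-in for a pandas DataFrame


def runner_on_state_sketch(f: Callable[..., Rows]):
    """Wrap `f` so that it reads conditions from, and writes data to, a dict state."""

    def experiment_runner(state: Dict[str, Rows], **kwargs) -> Dict[str, Rows]:
        # Conditions come from the state; extra kwargs are forwarded to f.
        observed = f(state["conditions"], **kwargs)
        previous = state.get("experiment_data", [])
        return {**state, "experiment_data": previous + observed}

    return experiment_runner


def linear_experiment(conditions: Rows, slope: float = 2.0, intercept: float = 1.0) -> Rows:
    # Return complete rows: the x-values together with the observed y-values.
    return [{**row, "y": slope * row["x"] + intercept} for row in conditions]


run = runner_on_state_sketch(linear_experiment)
state = {"conditions": [{"x": 1}, {"x": 2}, {"x": 3}]}

default_run = run(state)
steeper_run = run(state, slope=3.0, intercept=0.0)
print([row["y"] for row in default_run["experiment_data"]])  # [3.0, 5.0, 7.0]
print([row["y"] for row in steeper_run["experiment_data"]])  # [3.0, 6.0, 9.0]
```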

inputs_from_state(f, input_mapping={})

Decorator to make target f into a function on a State and **kwargs.

This wrapper makes it easier to pass arguments to a function from a State.

It was inspired by the pytest "fixtures" mechanism.

Parameters:

f (required): a function with arguments that could be fields on a State; it typically returns a Delta, although the return value is passed through unchanged.

input_mapping (Dict, default {}): a dict that maps the input arguments of the function to the state fields

Returns: a version of f which takes a State (plus **kwargs) and returns whatever f returns.

Examples:

>>> from dataclasses import dataclass, field
>>> import pandas as pd
>>> from typing import List, Optional

The State it operates on needs to have the metadata described in the state module:

>>> @dataclass(frozen=True)
... class U(State):
...     conditions: List[int] = field(metadata={"delta": "replace"})

We indicate the inputs required by the parameter names. The return value of the function is passed through unchanged.

>>> @inputs_from_state
... def experimentalist(conditions):
...     new_conditions = [c + 10 for c in conditions]
...     return new_conditions
>>> experimentalist(U(conditions=[1,2,3,4]))
[11, 12, 13, 14]
>>> experimentalist(U(conditions=[101,102,103,104]))
[111, 112, 113, 114]

If our function uses a different keyword argument than the state field, we can use the input mapping:

>>> def experimentalist_(X):
...     new_conditions = [x + 10 for x in X]
...     return new_conditions
>>> experimentalist_on_state = inputs_from_state(experimentalist_, {'X': 'conditions'})
>>> experimentalist_on_state(U(conditions=[1,2,3,4]))
[11, 12, 13, 14]

Both also work when the State is a UserDict. Here, we use the StandardState:

>>> experimentalist(StandardState(conditions=[1, 2, 3, 4]))
[11, 12, 13, 14]
>>> experimentalist_on_state(StandardState(conditions=[1, 2, 3, 4]))
[11, 12, 13, 14]

A dictionary can be returned; it is passed through as-is:

>>> @inputs_from_state
... def returns_a_dictionary(conditions):
...     new_conditions = [c + 10 for c in conditions]
...     return {"conditions": new_conditions}
>>> returns_a_dictionary(U(conditions=[2]))
{'conditions': [12]}
>>> from autora.variable import VariableCollection, Variable
>>> from sklearn.base import BaseEstimator
>>> from sklearn.linear_model import LinearRegression
>>> @inputs_from_state
... def theorist(experiment_data: pd.DataFrame, variables: VariableCollection, **kwargs):
...     ivs = [vi.name for vi in variables.independent_variables]
...     dvs = [vi.name for vi in variables.dependent_variables]
...     X, y = experiment_data[ivs], experiment_data[dvs]
...     model = LinearRegression(fit_intercept=True).set_params(**kwargs).fit(X, y)
...     return model
>>> @dataclass(frozen=True)
... class V(State):
...     variables: VariableCollection  # field(metadata={"delta":... }) omitted ∴ immutable
...     experiment_data: pd.DataFrame = field(metadata={"delta": "extend"})
...     model: Optional[BaseEstimator] = field(metadata={"delta": "replace"}, default=None)
>>> v = V(
...     variables=VariableCollection(independent_variables=[Variable("x")],
...                                  dependent_variables=[Variable("y")]),
...     experiment_data=pd.DataFrame({"x": [0,1,2,3,4], "y": [2,3,4,5,6]})
... )
>>> model = theorist(v)
>>> model.coef_, model.intercept_
(array([[1.]]), array([2.]))

Arguments from the state can be overridden by passing them in as keyword arguments (kwargs):

>>> theorist(v, experiment_data=pd.DataFrame({"x": [0,1,2,3], "y": [12,13,14,15]}))\
...     .intercept_
array([12.])

... and other arguments supported by the inner function can also be passed (if and only if the inner function allows for and handles **kwargs arguments alongside the values from the state).

>>> theorist(v, fit_intercept=False).intercept_
0.0

Any parameters not provided by the state must be provided by default values or by the caller. If the default is specified:

>>> @inputs_from_state
... def experimentalist(conditions, offset=25):
...     new_conditions = [c + offset for c in conditions]
...     return new_conditions

... then it need not be passed.

>>> experimentalist(U(conditions=[1,2,3,4]))
[26, 27, 28, 29]

If a default isn't specified:

>>> @inputs_from_state
... def experimentalist(conditions, offset):
...     new_conditions = [c + offset for c in conditions]
...     return new_conditions

... then calling the experimentalist without it will throw an error:

>>> experimentalist(U(conditions=[1,2,3,4]))
Traceback (most recent call last):
...
TypeError: experimentalist() missing 1 required positional argument: 'offset'

... which can be fixed by passing the argument as a keyword to the wrapped function.

>>> experimentalist(U(conditions=[1,2,3,4]), offset=2)
[3, 4, 5, 6]

The same is true for arguments which are not covered by the input mapping:

>>> def experimentalist_(X, offset):
...     new_conditions = [x + offset for x in X]
...     return new_conditions
>>> experimentalist_on_state = inputs_from_state(experimentalist_, {'X': 'conditions'})
>>> experimentalist_on_state(StandardState(conditions=[1,2,3,4]), offset=2)
[3, 4, 5, 6]

The state itself is passed through if the inner function requests the state:

>>> @inputs_from_state
... def function_which_needs_whole_state(state, conditions):
...     print("Doing something on: ", state)
...     new_conditions = [c + 2 for c in conditions]
...     return new_conditions
>>> function_which_needs_whole_state(U(conditions=[1,2,3,4]))
Doing something on:  U(conditions=[1, 2, 3, 4])
[3, 4, 5, 6]
Source code in autora/state.py
def inputs_from_state(f, input_mapping: Dict = {}):
    """Decorator to make target `f` into a function on a `State` and `**kwargs`.

    This wrapper makes it easier to pass arguments to a function from a State.

    It was inspired by the pytest "fixtures" mechanism.

    Args:
        f: a function with arguments that could be fields on a `State`;
            it typically returns a `Delta`, although the return value is passed through unchanged.
        input_mapping: a dict that maps the input arguments of the function to the state fields

    Returns: a version of `f` which takes a `State` (plus `**kwargs`) and returns whatever `f` returns.

    Examples:
        >>> from dataclasses import dataclass, field
        >>> import pandas as pd
        >>> from typing import List, Optional

        The `State` it operates on needs to have the metadata described in the state module:
        >>> @dataclass(frozen=True)
        ... class U(State):
        ...     conditions: List[int] = field(metadata={"delta": "replace"})

        We indicate the inputs required by the parameter names.
        The return value of the function is passed through unchanged.
        >>> @inputs_from_state
        ... def experimentalist(conditions):
        ...     new_conditions = [c + 10 for c in conditions]
        ...     return new_conditions

        >>> experimentalist(U(conditions=[1,2,3,4]))
        [11, 12, 13, 14]

        >>> experimentalist(U(conditions=[101,102,103,104]))
        [111, 112, 113, 114]

        If our function uses a different keyword argument than the state field, we can use
        the input mapping:
        >>> def experimentalist_(X):
        ...     new_conditions = [x + 10 for x in X]
        ...     return new_conditions
        >>> experimentalist_on_state = inputs_from_state(experimentalist_, {'X': 'conditions'})
        >>> experimentalist_on_state(U(conditions=[1,2,3,4]))
        [11, 12, 13, 14]

        Both also work with the `State` as UserDict. Here, we use the StandardState
        >>> experimentalist(StandardState(conditions=[1, 2, 3, 4]))
        [11, 12, 13, 14]

        >>> experimentalist_on_state(StandardState(conditions=[1, 2, 3, 4]))
        [11, 12, 13, 14]

        A dictionary can be returned and used:
        >>> @inputs_from_state
        ... def returns_a_dictionary(conditions):
        ...     new_conditions = [c + 10 for c in conditions]
        ...     return {"conditions": new_conditions}
        >>> returns_a_dictionary(U(conditions=[2]))
        {'conditions': [12]}

        >>> from autora.variable import VariableCollection, Variable
        >>> from sklearn.base import BaseEstimator
        >>> from sklearn.linear_model import LinearRegression

        >>> @inputs_from_state
        ... def theorist(experiment_data: pd.DataFrame, variables: VariableCollection, **kwargs):
        ...     ivs = [vi.name for vi in variables.independent_variables]
        ...     dvs = [vi.name for vi in variables.dependent_variables]
        ...     X, y = experiment_data[ivs], experiment_data[dvs]
        ...     model = LinearRegression(fit_intercept=True).set_params(**kwargs).fit(X, y)
        ...     return model

        >>> @dataclass(frozen=True)
        ... class V(State):
        ...     variables: VariableCollection  # field(metadata={"delta":... }) omitted ∴ immutable
        ...     experiment_data: pd.DataFrame = field(metadata={"delta": "extend"})
        ...     model: Optional[BaseEstimator] = field(metadata={"delta": "replace"}, default=None)

        >>> v = V(
        ...     variables=VariableCollection(independent_variables=[Variable("x")],
        ...                                  dependent_variables=[Variable("y")]),
        ...     experiment_data=pd.DataFrame({"x": [0,1,2,3,4], "y": [2,3,4,5,6]})
        ... )
        >>> model = theorist(v)
        >>> model.coef_, model.intercept_
        (array([[1.]]), array([2.]))

        Arguments from the state can be overridden by passing them in as keyword arguments (kwargs):
        >>> theorist(v, experiment_data=pd.DataFrame({"x": [0,1,2,3], "y": [12,13,14,15]}))\\
        ...     .intercept_
        array([12.])

        ... and other arguments supported by the inner function can also be passed
        (if and only if the inner function allows for and handles `**kwargs` arguments alongside
        the values from the state).
        >>> theorist(v, fit_intercept=False).intercept_
        0.0

        Any parameters not provided by the state must be provided by default values or by the
        caller. If the default is specified:
        >>> @inputs_from_state
        ... def experimentalist(conditions, offset=25):
        ...     new_conditions = [c + offset for c in conditions]
        ...     return new_conditions

        ... then it need not be passed.
        >>> experimentalist(U(conditions=[1,2,3,4]))
        [26, 27, 28, 29]

        If a default isn't specified:
        >>> @inputs_from_state
        ... def experimentalist(conditions, offset):
        ...     new_conditions = [c + offset for c in conditions]
        ...     return new_conditions

        ... then calling the experimentalist without it will throw an error:
        >>> experimentalist(U(conditions=[1,2,3,4]))
        Traceback (most recent call last):
        ...
        TypeError: experimentalist() missing 1 required positional argument: 'offset'

        ... which can be fixed by passing the argument as a keyword to the wrapped function.
        >>> experimentalist(U(conditions=[1,2,3,4]), offset=2)
        [3, 4, 5, 6]

        The same is true for arguments which are not covered by the input mapping:
        >>> def experimentalist_(X, offset):
        ...     new_conditions = [x + offset for x in X]
        ...     return new_conditions
        >>> experimentalist_on_state = inputs_from_state(experimentalist_, {'X': 'conditions'})
        >>> experimentalist_on_state(StandardState(conditions=[1,2,3,4]), offset=2)
        [3, 4, 5, 6]

        The state itself is passed through if the inner function requests the `state`:
        >>> @inputs_from_state
        ... def function_which_needs_whole_state(state, conditions):
        ...     print("Doing something on: ", state)
        ...     new_conditions = [c + 2 for c in conditions]
        ...     return new_conditions
        >>> function_which_needs_whole_state(U(conditions=[1,2,3,4]))
        Doing something on:  U(conditions=[1, 2, 3, 4])
        [3, 4, 5, 6]

    """
    # Get the set of parameter names from function f's signature

    reversed_mapping = {v: k for k, v in input_mapping.items()}

    parameters_ = set(inspect.signature(f).parameters.keys())
    missing_func_params = set(input_mapping.keys()).difference(parameters_)
    if missing_func_params:
        raise ValueError(
            f"The following keys in input_state_mapping are not parameters of the function: "
            f"{missing_func_params}"
        )

    @wraps(f)
    def _f(state_: S, /, **kwargs) -> S:
        # Get the parameters needed which are available from the state_.
        # All others must be provided as kwargs or default values on f.
        assert is_dataclass(state_) or isinstance(state_, UserDict)
        if is_dataclass(state_):
            from_state = parameters_.intersection({i.name for i in fields(state_)})
            arguments_from_state = {k: getattr(state_, k) for k in from_state}
            from_state_input_mapping = {
                reversed_mapping.get(field.name, field.name): getattr(
                    state_, field.name
                )
                for field in fields(state_)
                if reversed_mapping.get(field.name, field.name) in parameters_
            }
            arguments_from_state.update(from_state_input_mapping)
        elif isinstance(state_, UserDict):
            from_state = parameters_.intersection(set(state_.keys()))
            arguments_from_state = {k: state_[k] for k in from_state}
            from_state_input_mapping = {
                reversed_mapping.get(key, key): state_[key]
                for key in state_.keys()
                if reversed_mapping.get(key, key) in parameters_
            }
            arguments_from_state.update(from_state_input_mapping)
        if "state" in parameters_:
            arguments_from_state["state"] = state_
        arguments = dict(arguments_from_state, **kwargs)
        result = f(**arguments)
        return result

    return _f
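
The argument lookup performed by the wrapper above can be tried in isolation. The following is a simplified sketch using only the standard library, not the autora implementation: `DemoState`, `resolve_arguments` and `shift` are hypothetical names introduced for illustration, and only dataclass states are handled.

```python
import inspect
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class DemoState:  # hypothetical state, for illustration only
    conditions: List[int]
    label: str = "demo"


def resolve_arguments(f, state, input_mapping=None, **kwargs):
    """Collect keyword arguments for `f` from `state`, then from `kwargs`."""
    input_mapping = input_mapping or {}
    # Invert {parameter name: field name} into {field name: parameter name}.
    reversed_mapping = {v: k for k, v in input_mapping.items()}
    parameters = set(inspect.signature(f).parameters)
    arguments = {}
    for field_ in fields(state):
        # Use the mapped parameter name if there is one, else the field name.
        name = reversed_mapping.get(field_.name, field_.name)
        if name in parameters:
            arguments[name] = getattr(state, field_.name)
    # Explicit keyword arguments take precedence over values from the state.
    arguments.update(kwargs)
    return arguments


def shift(X, offset):
    return [x + offset for x in X]


args = resolve_arguments(shift, DemoState([1, 2, 3]), {"X": "conditions"}, offset=2)
print(args)            # {'X': [1, 2, 3], 'offset': 2}
print(shift(**args))   # [3, 4, 5]
```

Fields that the function does not ask for (here `label`) are ignored, and parameters that the state cannot supply (here `offset`) have to arrive as keyword arguments.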

on_state(function=None, input_mapping={}, output=None)

Decorator (factory) to make target function into a function on a State and **kwargs.

This combines the functionality of outputs_to_delta and inputs_from_state.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| function | Optional[Callable] | the function to be wrapped | None |
| output | Optional[Sequence[str]] | list specifying State field names for the return values of function | None |
| input_mapping | Dict | a dict that maps the keywords of the functions to the state fields | {} |

Returns:

Examples:

>>> from dataclasses import dataclass, field
>>> import pandas as pd
>>> from typing import List, Optional

The State it operates on needs to have the metadata described in the state module:

>>> @dataclass(frozen=True)
... class W(State):
...     conditions: List[int] = field(metadata={"delta": "replace"})

We indicate the inputs required by the parameter names.

>>> def add_ten(conditions):
...     return [c + 10 for c in conditions]
>>> experimentalist = on_state(function=add_ten, output=["conditions"])
>>> experimentalist(W(conditions=[1,2,3,4]))
W(conditions=[11, 12, 13, 14])

You can wrap functions which return a Delta object natively, by omitting the output argument:

>>> @on_state()
... def add_five(conditions):
...     return Delta(conditions=[c + 5 for c in conditions])
>>> add_five(W(conditions=[1, 2, 3, 4]))
W(conditions=[6, 7, 8, 9])

If you fail to declare outputs for a function which doesn't return a Delta:

>>> @on_state()
... def missing_output_param(conditions):
...     return [c + 5 for c in conditions]

... an exception is raised:

>>> missing_output_param(W(conditions=[1]))
Traceback (most recent call last):
...
AssertionError: Output of <function missing_output_param at 0x...> must be a `Delta`,
`UserDict`, or `dict`.

You can use the @on_state(output=[...]) as a decorator:

>>> @on_state(output=["conditions"])
... def add_six(conditions):
...     return [c + 6 for c in conditions]
>>> add_six(W(conditions=[1, 2, 3, 4]))
W(conditions=[7, 8, 9, 10])

You can also declare an input mapping if the keyword arguments of the function don't match the state fields:

>>> @on_state(input_mapping={'X': 'conditions'}, output=["conditions"])
... def add_six(X):
...     return [x + 6 for x in X]
>>> add_six(W(conditions=[1, 2, 3, 4]))
W(conditions=[7, 8, 9, 10])

This also works on the StandardState or other States that are defined as UserDicts:

>>> add_six(StandardState(conditions=[1, 2, 3,4])).conditions
    0
0   7
1   8
2   9
3  10
Source code in autora/state.py
def on_state(
    function: Optional[Callable] = None,
    input_mapping: Dict = {},
    output: Optional[Sequence[str]] = None,
):
    """Decorator (factory) to make target `function` into a function on a `State` and `**kwargs`.

    This combines the functionality of `outputs_to_delta` and `inputs_from_state`

    Args:
        function: the function to be wrapped
        output: list specifying State field names for the return values of `function`
        input_mapping: a dict that maps the keywords of the functions to the state fields

    Returns:

    Examples:
        >>> from dataclasses import dataclass, field
        >>> import pandas as pd
        >>> from typing import List, Optional

        The `State` it operates on needs to have the metadata described in the state module:
        >>> @dataclass(frozen=True)
        ... class W(State):
        ...     conditions: List[int] = field(metadata={"delta": "replace"})

        We indicate the inputs required by the parameter names.
        >>> def add_ten(conditions):
        ...     return [c + 10 for c in conditions]
        >>> experimentalist = on_state(function=add_ten, output=["conditions"])

        >>> experimentalist(W(conditions=[1,2,3,4]))
        W(conditions=[11, 12, 13, 14])

        You can wrap functions which return a Delta object natively, by omitting the `output`
        argument:
        >>> @on_state()
        ... def add_five(conditions):
        ...     return Delta(conditions=[c + 5 for c in conditions])

        >>> add_five(W(conditions=[1, 2, 3, 4]))
        W(conditions=[6, 7, 8, 9])

        If you fail to declare outputs for a function which doesn't return a Delta:
        >>> @on_state()
        ... def missing_output_param(conditions):
        ...     return [c + 5 for c in conditions]

        ... an exception is raised:
        >>> missing_output_param(W(conditions=[1])) # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE
        Traceback (most recent call last):
        ...
        AssertionError: Output of <function missing_output_param at 0x...> must be a `Delta`,
        `UserDict`, or `dict`.

        You can use the @on_state(output=[...]) as a decorator:
        >>> @on_state(output=["conditions"])
        ... def add_six(conditions):
        ...     return [c + 6 for c in conditions]

        >>> add_six(W(conditions=[1, 2, 3, 4]))
        W(conditions=[7, 8, 9, 10])

        You can also declare an input mapping if the keyword arguments of the function
        don't match the state fields:
        >>> @on_state(input_mapping={'X': 'conditions'}, output=["conditions"])
        ... def add_six(X):
        ...     return [x + 6 for x in X]

        >>> add_six(W(conditions=[1, 2, 3, 4]))
        W(conditions=[7, 8, 9, 10])

        This also works on the StandardState or other States that are defined as UserDicts:
        >>> add_six(StandardState(conditions=[1, 2, 3,4])).conditions
            0
        0   7
        1   8
        2   9
        3  10
    """

    def decorator(f):
        f_ = f
        if output is not None:
            f_ = outputs_to_delta(*output)(f_)
        f_ = inputs_from_state(f_, input_mapping)
        f_ = delta_to_state(f_)
        return f_

    if function is None:
        return decorator
    else:
        return decorator(function)
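
The `function is None` branch is what lets the same callable work both as `on_state(f, ...)` and as `@on_state(...)`. Below is a condensed sketch of that pattern together with the three stages (inputs from the state, named outputs, merge back into the state). It uses only the standard library and is not the autora implementation: `on_state_sketch` and `DemoState` are hypothetical names, input mappings are omitted, and fields are always replaced rather than following the `delta` metadata.

```python
import inspect
from dataclasses import dataclass, replace
from functools import wraps
from typing import List


@dataclass(frozen=True)
class DemoState:  # hypothetical state, for illustration only
    conditions: List[int]


def on_state_sketch(function=None, output=None):
    """Hypothetical stand-in for `on_state`, supporting both calling styles."""

    def decorator(f):
        parameters = set(inspect.signature(f).parameters)

        @wraps(f)
        def wrapper(state, **kwargs):
            # 1. Inputs: take the parameters that the state can supply.
            arguments = {k: getattr(state, k) for k in parameters if hasattr(state, k)}
            arguments.update(kwargs)
            result = f(**arguments)
            # 2. Outputs: name the return value(s) if `output` was given.
            if output is not None:
                values = (result,) if len(output) == 1 else result
                result = dict(zip(output, values))
            # 3. Merge: build a new state with the named fields replaced.
            return replace(state, **result)

        return wrapper

    # `on_state_sketch(f, ...)` wraps directly; `@on_state_sketch(...)` returns
    # the decorator so that Python applies it to the function that follows.
    return decorator if function is None else decorator(function)


@on_state_sketch(output=["conditions"])
def add_six(conditions):
    return [c + 6 for c in conditions]


def add_ten(conditions):
    return [c + 10 for c in conditions]


add_ten_on_state = on_state_sketch(add_ten, output=["conditions"])

print(add_six(DemoState([1, 2, 3])))           # DemoState(conditions=[7, 8, 9])
print(add_ten_on_state(DemoState([1, 2, 3])))  # DemoState(conditions=[11, 12, 13])
```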

outputs_to_delta(*output)

Decorator factory to wrap outputs from a function as Deltas.

Examples:

>>> @outputs_to_delta("conditions")
... def add_five(x):
...     return [xi + 5 for xi in x]
>>> add_five([1, 2, 3])
{'conditions': [6, 7, 8]}
>>> @outputs_to_delta("c")
... def add_six(conditions):
...     return [c + 6 for c in conditions]
>>> add_six([1, 2, 3])
{'c': [7, 8, 9]}
>>> @outputs_to_delta("+1", "-1")
... def plus_minus_1(x):
...     a = [xi + 1 for xi in x]
...     b = [xi - 1 for xi in x]
...     return a, b
>>> plus_minus_1([1, 2, 3])
{'+1': [2, 3, 4], '-1': [0, 1, 2]}

If the number of values returned does not match the number of output names, an error may be raised. If multiple outputs are expected but only a single value is returned, an AssertionError is raised:

>>> @outputs_to_delta("1", "2")
... def returns_single_result_when_more_expected():
...     return "a"
>>> returns_single_result_when_more_expected()
Traceback (most recent call last):
...
AssertionError: function `<function returns_single_result_when_more_expected at 0x...>`
has to return multiple values to match `('1', '2')`. Got `a` instead.

If multiple outputs are expected but the wrong number is returned, an AssertionError is also raised:

>>> @outputs_to_delta("1", "2", "3")
... def returns_wrong_number_of_results():
...     return "a", "b"
>>> returns_wrong_number_of_results()
Traceback (most recent call last):
...
AssertionError: function `<function returns_wrong_number_of_results at 0x...>`
has to return exactly `3` values to match `('1', '2', '3')`. Got `('a', 'b')` instead.

However, if a single output is expected, and multiple are returned, these are treated as a single object and no error occurs:

>>> @outputs_to_delta("foo")
... def returns_a_tuple():
...     return "a", "b", "c"
>>> returns_a_tuple()
{'foo': ('a', 'b', 'c')}

If we fail to specify output names, an error is raised immediately:

>>> @outputs_to_delta()
... def decorator_missing_arguments():
...     return "a", "b", "c"
Traceback (most recent call last):
...
ValueError: `output` names must be specified.
Source code in autora/state.py
def outputs_to_delta(*output: str):
    """
    Decorator factory to wrap outputs from a function as Deltas.

    Examples:
        >>> @outputs_to_delta("conditions")
        ... def add_five(x):
        ...     return [xi + 5 for xi in x]

        >>> add_five([1, 2, 3])
        {'conditions': [6, 7, 8]}

        >>> @outputs_to_delta("c")
        ... def add_six(conditions):
        ...     return [c + 6 for c in conditions]

        >>> add_six([1, 2, 3])
        {'c': [7, 8, 9]}

        >>> @outputs_to_delta("+1", "-1")
        ... def plus_minus_1(x):
        ...     a = [xi + 1 for xi in x]
        ...     b = [xi - 1 for xi in x]
        ...     return a, b

        >>> plus_minus_1([1, 2, 3])
        {'+1': [2, 3, 4], '-1': [0, 1, 2]}


        If the number of values returned does not match the number of output names, an error
        may be raised. If multiple outputs are expected but only a single value is returned,
        an AssertionError is raised:
        >>> @outputs_to_delta("1", "2")
        ... def returns_single_result_when_more_expected():
        ...     return "a"
        >>> returns_single_result_when_more_expected()  # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
        Traceback (most recent call last):
        ...
        AssertionError: function `<function returns_single_result_when_more_expected at 0x...>`
        has to return multiple values to match `('1', '2')`. Got `a` instead.

        If multiple outputs are expected but the wrong number is returned, an AssertionError
        is also raised:
        >>> @outputs_to_delta("1", "2", "3")
        ... def returns_wrong_number_of_results():
        ...     return "a", "b"
        >>> returns_wrong_number_of_results()  # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
        Traceback (most recent call last):
        ...
        AssertionError: function `<function returns_wrong_number_of_results at 0x...>`
        has to return exactly `3` values to match `('1', '2', '3')`. Got `('a', 'b')` instead.

        However, if a single output is expected, and multiple are returned, these are treated as
        a single object and no error occurs:
        >>> @outputs_to_delta("foo")
        ... def returns_a_tuple():
        ...     return "a", "b", "c"
        >>> returns_a_tuple()
        {'foo': ('a', 'b', 'c')}

        If we fail to specify output names, an error is raised immediately:
        >>> @outputs_to_delta()
        ... def decorator_missing_arguments():
        ...     return "a", "b", "c"
        Traceback (most recent call last):
        ...
        ValueError: `output` names must be specified.

    """

    def decorator(f):
        if len(output) == 0:
            raise ValueError("`output` names must be specified.")

        elif len(output) == 1:

            @wraps(f)
            def inner(*args, **kwargs):
                result = f(*args, **kwargs)
                delta = Delta(**{output[0]: result})
                return delta

        else:

            @wraps(f)
            def inner(*args, **kwargs):
                result = f(*args, **kwargs)
                assert isinstance(result, tuple), (
                    "function `%s` has to return multiple values "
                    "to match `%s`. Got `%s` instead." % (f, output, result)
                )
                assert len(output) == len(result), (
                    "function `%s` has to return "
                    "exactly `%s` values "
                    "to match `%s`. "
                    "Got `%s` instead."
                    "" % (f, len(output), output, result)
                )
                delta = Delta(**dict(zip(output, result)))
                return delta

        return inner

    return decorator
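
One consequence of the `isinstance(result, tuple)` check above is that, when several output names are given, a function returning a list of the right length is still rejected. The sketch below illustrates this with a dict-based stand-in: `name_outputs` and the two `bounds_*` functions are hypothetical, a plain `dict` takes the place of `Delta`, and a `TypeError` is raised where the original uses `assert`.

```python
from functools import wraps


def name_outputs(*output):
    """Hypothetical, dict-based stand-in for `outputs_to_delta`."""

    def decorator(f):
        if not output:
            raise ValueError("`output` names must be specified.")

        @wraps(f)
        def inner(*args, **kwargs):
            result = f(*args, **kwargs)
            if len(output) == 1:
                # A single name captures the whole return value, tuple or not.
                return {output[0]: result}
            # Several names require a tuple of exactly matching length.
            if not isinstance(result, tuple) or len(result) != len(output):
                raise TypeError(
                    f"expected a tuple of {len(output)} values, got {result!r}"
                )
            return dict(zip(output, result))

        return inner

    return decorator


@name_outputs("low", "high")
def bounds_as_tuple(x):
    return min(x), max(x)


@name_outputs("low", "high")
def bounds_as_list(x):
    return [min(x), max(x)]  # a list, not a tuple


print(bounds_as_tuple([3, 1, 2]))  # {'low': 1, 'high': 3}
try:
    bounds_as_list([3, 1, 2])
except TypeError as error:
    print("rejected:", error)
```

Returning `min(x), max(x)` rather than a list is therefore the safer habit for functions wrapped with more than one output name.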