autora.experimentalist.bandit_random
Experimentalist that returns either probability sequences (sequences of vectors with elements between 0 and 1) or reward sequences (sequences of vectors with binary elements).
pool(num_rewards, sequence_length, initial_probabilities=None, sigmas=None, num_samples=1, random_state=None)
Returns a list of reward sequences.
A reward sequence is a sequence of vectors of dimension num_rewards. Each entry of this vector is either 0 or 1.
We can set a fixed initial value for the reward probability of the first vector of each sequence and a constant drift rate.
We can also set a range to randomly sample these values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_rewards | int | The number of rewards, i.e. the dimension of each element of the sequence | required |
sequence_length | int | The length of the sequence | required |
initial_probabilities | Optional[Iterable[Union[float, Iterable]]] | A list of initial reward probabilities. Each entry can be a range to be sampled from. | None |
sigmas | Optional[Iterable[Union[float, Iterable]]] | A list of constant drift rates, one for each element of the probabilities. Each entry can be a range to be sampled from. | None |
num_samples | int | The number of experimental conditions to select | 1 |
random_state | Optional[int] | The seed value for the random number generator | None |
Returns: Sampled pool of experimental conditions
Examples:
We create a reward sequence of length 3 for a two-armed bandit task. By default, the reward probability of each arm is .5 and constant.
>>> pool(num_rewards=2, sequence_length=3, num_samples=1, random_state=42)
[[[1, 0], [1, 1], [0, 1]]]
If we want more arms:
>>> pool(num_rewards=4, sequence_length=3, num_samples=1, random_state=42)
[[[1, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]]]
A longer sequence:
>>> pool(num_rewards=2, sequence_length=5, num_samples=1, random_state=42)
[[[1, 0], [1, 1], [0, 1], [1, 1], [0, 0]]]
More sequences:
>>> pool(num_rewards=2, sequence_length=3, num_samples=2, random_state=42)
[[[1, 0], [1, 1], [0, 1]], [[1, 1], [0, 0], [0, 1]]]
We can set fixed initial values:
>>> pool(num_rewards=2, sequence_length=3,
... initial_probabilities=[0.,.4],
... random_state=42)
[[[0, 0], [0, 1], [0, 1]]]
And drift rates:
>>> pool(num_rewards=2, sequence_length=3,
... initial_probabilities=[0.,.4],
... sigmas=[.2, .3],
... random_state=42)
[[[0, 0], [0, 1], [0, 1]]]
We can also sample the initial values by passing a range:
>>> pool(num_rewards=2, sequence_length=3,
... initial_probabilities=[[0, .2],[.8, 1.]],
... sigmas=[[0., .2], [0., .3]],
... random_state=42)
[[[0, 1], [1, 1], [0, 1]]]
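The nesting of the returned list can be easy to misread, so here is a minimal sketch that unpacks it. It uses only the literal output of the `num_samples=2` example above and does not call the library; the variable names are illustrative.

```python
# Output of the `num_samples=2` example above, copied verbatim.
rewards = [[[1, 0], [1, 1], [0, 1]], [[1, 1], [0, 0], [0, 1]]]

# The outer list holds one entry per sampled sequence,
# each sequence holds one vector per trial,
# and each vector holds one binary reward per arm.
num_samples = len(rewards)
sequence_length = len(rewards[0])
num_rewards = len(rewards[0][0])

# Reward of arm 1 on trial 0 of the first sequence.
first_trial_arm_1 = rewards[0][0][1]

# Number of rewarded trials per arm in the first sequence.
rewards_per_arm = [
    sum(trial[arm] for trial in rewards[0]) for arm in range(num_rewards)
]

print(num_samples, sequence_length, num_rewards)  # 2 3 2
print(rewards_per_arm)  # [2, 2]
```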
Source code in temp_dir/bandit-random/src/autora/experimentalist/bandit_random/__init__.py
pool_from_proba(probability_sequence, random_state=None)
Samples rewards (0 or 1) from a given probability sequence.
Example:
>>> proba_sequence = pool_proba(num_probabilities=2, sequence_length=3,
... initial_probabilities=[.2,.8],
... sigmas=[.2, .1], random_state=42)
>>> proba_sequence
[[[0.2, 0.8], [0.26094341595088627, 0.8750451195806458], [0.05294659470278715, 0.9691015912197671]]]
>>> pool_from_proba(proba_sequence, 42)
[[[0, 1], [1, 1], [0, 1]]]
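Sampling a binary reward from a probability is a Bernoulli draw: the reward is 1 with probability p and 0 otherwise. The sketch below illustrates that idea with the standard library only. The function name `sample_rewards_sketch` is hypothetical, and because it uses Python's `random` module rather than the library's random number generator, its outputs will generally not match those of `pool_from_proba` for the same seed.

```python
import random
from typing import List, Optional


def sample_rewards_sketch(
    probability_sequence: List[List[List[float]]],
    seed: Optional[int] = None,
) -> List[List[List[int]]]:
    """Draw one Bernoulli reward per probability, keeping the nested shape.

    Hypothetical helper for illustration; not the library's implementation.
    """
    rng = random.Random(seed)
    return [
        # rng.random() lies in [0, 1), so p=0 never pays and p=1 always pays.
        [[1 if rng.random() < p else 0 for p in trial] for trial in sequence]
        for sequence in probability_sequence
    ]


# Degenerate probabilities give a deterministic result.
deterministic = sample_rewards_sketch([[[0.0, 1.0], [1.0, 0.0]]], seed=1)
print(deterministic)  # [[[0, 1], [1, 0]]]
```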
Source code in temp_dir/bandit-random/src/autora/experimentalist/bandit_random/__init__.py
pool_proba(num_probabilities, sequence_length, initial_probabilities=None, sigmas=None, num_samples=1, random_state=None)
Returns a list of probability sequences.
A probability sequence is a sequence of vectors of dimension num_probabilities. Each entry of this vector is a number between 0 and 1.
We can set a fixed initial value for the first vector of each sequence and a constant drift rate.
We can also set a range to randomly sample these values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_probabilities | int | The number of probabilities, i.e. the dimension of each element of the sequence | required |
sequence_length | int | The length of the sequence | required |
initial_probabilities | Optional[Iterable[Union[float, Iterable]]] | A list of initial values for each element of the probabilities. Each entry can be a range to be sampled from. | None |
sigmas | Optional[Iterable[Union[float, Iterable]]] | A list of sigmas of the normal distribution for the drift rate of each arm. Each entry can be a range to be sampled from. The drift rate is defined as the change per step. | None |
num_samples | int | The number of experimental conditions to select | 1 |
random_state | Optional[int] | The seed value for the random number generator | None |
Returns: Sampled pool of experimental conditions
Examples:
We create a reward probability sequence of length 3 for a two-armed bandit task. By default, the reward probability of each arm is .5 and constant.
>>> pool_proba(num_probabilities=2, sequence_length=3, num_samples=1, random_state=42)
[[[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]]
If we want more arms:
>>> pool_proba(num_probabilities=4, sequence_length=3, num_samples=1, random_state=42)
[[[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]]
A longer sequence:
>>> pool_proba(num_probabilities=2, sequence_length=5, num_samples=1, random_state=42)
[[[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]]
More sequences:
>>> pool_proba(num_probabilities=2, sequence_length=3, num_samples=2, random_state=42)
[[[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]]
We can set fixed initial values:
>>> pool_proba(num_probabilities=2, sequence_length=3,
... initial_probabilities=[0.,.4], random_state=42)
[[[0.0, 0.4], [0.0, 0.4], [0.0, 0.4]]]
And drift rates:
>>> pool_proba(num_probabilities=2, sequence_length=3,
... initial_probabilities=[0.,.4],
... sigmas=[.1, .5], random_state=42)
[[[0.0, 0.4], [0.030471707975443137, 0.7752255979032286], [0.0, 1.0]]]
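In the output above the last step lands exactly on 0.0 and 1.0, which suggests that each probability takes a normally distributed step per trial and is then clipped to the interval [0, 1]. The following is a minimal sketch of such a clipped Gaussian random walk, under that assumption. The name `drift_sequence_sketch` is hypothetical, carrying the clipped value forward to the next step is also an assumption, and the standard-library generator used here will not reproduce the numbers printed by `pool_proba`.

```python
import random
from typing import List, Optional


def drift_sequence_sketch(
    initial: List[float],
    sigmas: List[float],
    sequence_length: int,
    seed: Optional[int] = None,
) -> List[List[float]]:
    """Clipped Gaussian random walk, one coordinate per arm.

    Hypothetical helper for illustration; not the library's implementation.
    """
    rng = random.Random(seed)
    current = [float(p) for p in initial]
    sequence = [list(current)]
    for _ in range(sequence_length - 1):
        # Each arm moves by a N(0, sigma) step, then is clipped to [0, 1].
        # Assumption: the clipped value is the starting point of the next step.
        current = [
            min(1.0, max(0.0, p + rng.gauss(0.0, sigma)))
            for p, sigma in zip(current, sigmas)
        ]
        sequence.append(list(current))
    return sequence


# With sigma = 0 for every arm the probabilities stay constant.
constant = drift_sequence_sketch([0.0, 0.4], [0.0, 0.0], sequence_length=3, seed=42)
print(constant)  # [[0.0, 0.4], [0.0, 0.4], [0.0, 0.4]]

# With non-zero sigmas the values drift but never leave [0, 1].
walk = drift_sequence_sketch([0.0, 0.4], [0.1, 0.5], sequence_length=3, seed=42)
```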
We can also sample the initial values by passing a range:
>>> pool_proba(num_probabilities=2, sequence_length=3,
... initial_probabilities=[[0, .2],[.8, 1.]],
... sigmas=[[0., .25], [0., .5]],
... random_state=42)
[[[0.15479120971119267, 0.81883546957753], [0.23713042219259264, 0.8811974469636589], [0.34032881599649456, 0.7269307761486841]]]
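Each entry of `initial_probabilities` and `sigmas` can thus be either a single number or a two-element range. One way to handle both forms is a small resolver like the sketch below. The helper name `resolve_value_sketch` is hypothetical, and drawing uniformly from the range is an assumption; the documentation above only says the values are sampled from the range.

```python
import random
from typing import Sequence, Union

Spec = Union[float, Sequence[float]]


def resolve_value_sketch(spec: Spec, rng: random.Random) -> float:
    """Return a fixed number as-is, or draw one value from a [low, high] range.

    Hypothetical helper for illustration; not the library's implementation.
    """
    if isinstance(spec, (int, float)):
        return float(spec)
    low, high = spec
    # Assumption: values are drawn uniformly from the given range.
    return rng.uniform(low, high)


rng = random.Random(42)
fixed = resolve_value_sketch(0.4, rng)          # always 0.4
sampled = resolve_value_sketch([0.8, 1.0], rng)  # some value in [0.8, 1.0]

# Resolve a whole specification, mixing fixed values and ranges.
initial = [resolve_value_sketch(s, rng) for s in [[0, 0.2], 0.9]]
```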
Source code in temp_dir/bandit-random/src/autora/experimentalist/bandit_random/__init__.py