Selection Rules#
Triage uses selection rules to compare the performance of trained model groups over time and to select a model group for future predictions. A selection rule tries to predict which model group will perform best in some train/test period, based on each model group's historical performance on some metric.

For example, a simple selection rule might predict that the model group that performed best in one train/test period will also perform best in the following period.

A selection rule can be evaluated by its regret: the difference between the performance of the model group it selected and that of the best-performing model group in a given period.
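To make regret concrete, here is a minimal sketch (not Triage's implementation) using hypothetical per-model-group scores for a single test period:

```python
# Hypothetical precision@300_abs scores for one test period.
scores = {"mg_1": 0.62, "mg_2": 0.58, "mg_3": 0.67}

selected = "mg_2"            # model group picked by some selection rule
best = max(scores.values())  # score of the best-performing model group

# Regret: how far the selected group fell short of the best in this period.
regret = best - scores[selected]
print(round(regret, 2))  # → 0.09
```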
Triage supports 8 model selection rules. Each is represented internally by one of the following functions:
best_average_two_metrics(df, train_end_time, metric1, parameter1, metric2, parameter2, metric1_weight=0.5, n=1)#

Pick the model group with the highest average combined value to date of two metrics, weighted together using metric1_weight.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
df | pandas.DataFrame | dataframe containing the columns model_group_id, train_end_time, metric, parameter, raw_value, below_best | required |
train_end_time | Timestamp | current train end time | required |
metric1 | string | model evaluation metric, such as 'precision@' | required |
parameter1 | string | model evaluation metric parameter, such as '300_abs' | required |
metric2 | string | model evaluation metric, such as 'precision@' | required |
parameter2 | string | model evaluation metric parameter, such as '300_abs' | required |
metric1_weight | float | relative weight of metric1, between 0 and 1 | 0.5 |
n | int | the number of model group ids to return | 1 |
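A minimal pure-Python sketch of this rule's logic, using hypothetical metric values (the real function operates on the dataframe described above):

```python
from collections import defaultdict

# Hypothetical history: (model_group_id, precision@100_abs, recall@300_abs)
# for each past test period.
history = [
    ("mg_1", 0.60, 0.20), ("mg_1", 0.70, 0.30),
    ("mg_2", 0.50, 0.40), ("mg_2", 0.55, 0.45),
]
metric1_weight = 0.5

combined = defaultdict(list)
for mg, m1, m2 in history:
    # Weight the two metrics together, as best_average_two_metrics does.
    combined[mg].append(metric1_weight * m1 + (1 - metric1_weight) * m2)

# Average the combined values to date and pick the top model group.
averages = {mg: sum(v) / len(v) for mg, v in combined.items()}
best = max(averages, key=averages.get)
print(best)  # → mg_2
```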
best_average_value(df, train_end_time, metric, parameter, n=1)#

Pick the model group with the highest average metric value so far.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
df | pandas.DataFrame | dataframe containing the columns model_group_id, train_end_time, metric, parameter, raw_value, dist_from_best_case | required |
train_end_time | Timestamp | current train end time | required |
metric | string | model evaluation metric, such as 'precision@' | required |
parameter | string | model evaluation metric parameter, such as '300_abs' | required |
n | int | the number of model group ids to return | 1 |
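The rule's logic, sketched in plain Python with hypothetical values rather than the dataframe the real function takes:

```python
from collections import defaultdict

# Hypothetical raw_value history of a metric per model group.
rows = [("mg_1", 0.60), ("mg_1", 0.64), ("mg_2", 0.70), ("mg_2", 0.50)]

values = defaultdict(list)
for mg, value in rows:
    values[mg].append(value)

# The highest mean value to date wins.
best = max(values, key=lambda mg: sum(values[mg]) / len(values[mg]))
print(best)  # → mg_1
```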
best_avg_recency_weight(df, train_end_time, metric, parameter, curr_weight, decay_type, n=1)#

Pick the model group with the highest average metric value so far, placing less weight on older results. Two parameters control the weighting: the shape of the decay (decay_type, linear or exponential) and the relative weight of the most recent point (curr_weight).

Parameters:

Name | Type | Description | Default |
---|---|---|---|
df | pandas.DataFrame | dataframe containing the columns model_group_id, train_end_time, metric, parameter, raw_value, below_best | required |
train_end_time | Timestamp | current train end time | required |
metric | string | model evaluation metric, such as 'precision@' | required |
parameter | string | model evaluation metric parameter, such as '300_abs' | required |
curr_weight | float | amount of weight to put on the most recent point, relative to the first point (e.g., a value of 5.0 would mean the current data is weighted 5 times as much as the first one) | required |
decay_type | string | either 'linear' or 'exponential'; the shape of how the weights fall off between the current and first point | required |
n | int | the number of model group ids to return | 1 |
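A sketch of one plausible linear decay (Triage's exact weighting scheme may differ; this only illustrates the idea of up-weighting recent periods):

```python
# Hypothetical metric values for one model group, oldest first.
values = [0.50, 0.55, 0.60, 0.70]
curr_weight = 5.0  # most recent point weighted 5x the first point

# Linear decay: weights rise evenly from 1.0 (oldest) to curr_weight (newest).
n = len(values)
weights = [1 + (curr_weight - 1) * i / (n - 1) for i in range(n)]

# Recency-weighted average for this model group.
weighted_avg = sum(w * v for w, v in zip(weights, values)) / sum(weights)
print(round(weighted_avg, 3))  # → 0.624
```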
best_avg_var_penalized(df, train_end_time, metric, parameter, stdev_penalty, n=1)#

Pick the model group with the highest average metric value so far, penalized for relative variance as: avg_value - stdev_penalty * (stdev - min_stdev), where min_stdev is the minimum standard deviation of the metric across all model groups.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
df | pandas.DataFrame | dataframe containing the columns model_group_id, train_end_time, metric, parameter, raw_value, below_best | required |
train_end_time | Timestamp | current train end time | required |
metric | string | model evaluation metric, such as 'precision@' | required |
parameter | string | model evaluation metric parameter, such as '300_abs' | required |
stdev_penalty | float | penalty for instability | required |
n | int | the number of model group ids to return | 1 |
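The penalty formula above, sketched with hypothetical metric histories (not the real function, which takes the dataframe described in the table):

```python
from statistics import mean, stdev

# Hypothetical metric histories per model group.
history = {"mg_1": [0.60, 0.62, 0.61], "mg_2": [0.50, 0.80, 0.65]}
stdev_penalty = 0.5

stdevs = {mg: stdev(vals) for mg, vals in history.items()}
min_stdev = min(stdevs.values())

# Penalize each group's average by its excess variability over the
# steadiest group: avg_value - stdev_penalty * (stdev - min_stdev).
scores = {mg: mean(vals) - stdev_penalty * (stdevs[mg] - min_stdev)
          for mg, vals in history.items()}
best = max(scores, key=scores.get)
print(best)  # → mg_1
```

Here mg_2 has the higher raw average (0.65 vs. 0.61) but loses after the variance penalty.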
best_current_value(df, train_end_time, metric, parameter, n=1)#

Pick the model group with the best current metric value.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
df | pandas.DataFrame | dataframe containing the columns model_group_id, train_end_time, metric, parameter, raw_value, dist_from_best_case | required |
train_end_time | Timestamp | current train end time | required |
metric | string | model evaluation metric, such as 'precision@' | required |
parameter | string | model evaluation metric parameter, such as '300_abs' | required |
n | int | the number of model group ids to return | 1 |
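A sketch of the rule with hypothetical evaluation rows: only the most recent period matters here.

```python
# Hypothetical evaluations: (model_group_id, train_end_time, raw_value).
rows = [
    ("mg_1", "2021-01-01", 0.60), ("mg_2", "2021-01-01", 0.70),
    ("mg_1", "2021-04-01", 0.68), ("mg_2", "2021-04-01", 0.64),
]
train_end_time = "2021-04-01"

# Restrict to the current period and take the top value.
current = [(mg, v) for mg, t, v in rows if t == train_end_time]
best = max(current, key=lambda r: r[1])[0]
print(best)  # → mg_1
```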
lowest_metric_variance(df, train_end_time, metric, parameter, n=1)#

Pick the model group with the lowest metric variance so far.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
df | pandas.DataFrame | dataframe containing the columns model_group_id, train_end_time, metric, parameter, raw_value, below_best | required |
train_end_time | Timestamp | current train end time | required |
metric | string | model evaluation metric, such as 'precision@' | required |
parameter | string | model evaluation metric parameter, such as '300_abs' | required |
n | int | the number of model group ids to return | 1 |
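Sketched with hypothetical histories, this rule simply prefers the steadiest model group regardless of its level:

```python
from statistics import variance

# Hypothetical metric histories per model group.
history = {"mg_1": [0.60, 0.62, 0.61], "mg_2": [0.50, 0.80, 0.65]}

# Lowest variance to date wins, even though mg_2's average is higher.
best = min(history, key=lambda mg: variance(history[mg]))
print(best)  # → mg_1
```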
most_frequent_best_dist(df, train_end_time, metric, parameter, dist_from_best_case, n=1)#

Pick the model group that is most frequently within dist_from_best_case of the best-performing model group across the test sets so far.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
df | pandas.DataFrame | dataframe containing the columns model_group_id, train_end_time, metric, parameter, raw_value, below_best | required |
train_end_time | Timestamp | current train end time | required |
metric | string | model evaluation metric, such as 'precision@' | required |
parameter | string | model evaluation metric parameter, such as '300_abs' | required |
dist_from_best_case | float | distance from the best performing model | required |
n | int | the number of model group ids to return | 1 |
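A pure-Python sketch with hypothetical per-period values; the count of "close to best" periods, not the raw values, decides the winner:

```python
from collections import defaultdict

# Hypothetical values; each inner dict is one test period.
periods = [
    {"mg_1": 0.60, "mg_2": 0.70},
    {"mg_1": 0.68, "mg_2": 0.55},
    {"mg_1": 0.66, "mg_2": 0.55},
]
dist_from_best_case = 0.05

counts = defaultdict(int)
for period in periods:
    best_value = max(period.values())
    for mg, value in period.items():
        # Count periods where the group lands within dist of that period's best.
        if best_value - value <= dist_from_best_case:
            counts[mg] += 1

best = max(counts, key=counts.get)
print(best)  # → mg_1
```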
random_model_group(df, train_end_time, n=1)#

Pick a random model group (as a baseline).

Parameters:

Name | Type | Description | Default |
---|---|---|---|
df | pandas.DataFrame | dataframe containing the columns model_group_id, train_end_time, metric, parameter, raw_value, below_best | required |
train_end_time | Timestamp | current train end time | required |
n | int | the number of model group ids to return | 1 |
RuleMakers#
Triage uses RuleMaker classes to conveniently format the parameter grids accepted by make_selection_rule_grid. Each type of RuleMaker class holds methods that build parameter grids for a subset of the available selection rules. The arguments of each add_rule_* method map to the arguments of the corresponding model selection function.
RandomGroupRuleMaker (BaseRules)#

The RandomGroupRuleMaker class generates a rule that randomly selects n model groups for each train set. Unlike the other two RuleMaker classes, it generates its selection rule spec on __init__.

__init__(self, n=1)#
SimpleRuleMaker (BaseRules)#

Holds methods that generate parameter grids for selection rules that evaluate the performance of a model group in terms of a single metric. These include:
- best_current_value
- best_average_value
- lowest_metric_variance
- most_frequent_best_dist
- best_avg_var_penalized
- best_avg_recency_weight
add_rule_best_average_value(self, metric=None, parameter=None, n=1)#

add_rule_best_avg_recency_weight(self, metric=None, parameter=None, n=1, curr_weight=[1.5, 2.0, 5.0], decay_type=['linear'])#

add_rule_best_avg_var_penalized(self, metric=None, parameter=None, stdev_penalty=0.5, n=1)#

add_rule_best_current_value(self, metric=None, parameter=None, n=1)#

add_rule_lowest_metric_variance(self, metric=None, parameter=None, n=1)#

add_rule_most_frequent_best_dist(self, metric=None, parameter=None, n=1, dist_from_best_case=[0.01, 0.05, 0.1, 0.15])#
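For orientation, here is a hand-built example of the kind of rule_groups entry these add_rule_* methods assemble (illustrative and constructed by hand, not produced by calling the Triage API; field names follow the rule_groups format accepted by make_selection_rule_grid):

```python
# Hypothetical rule group pairing one shared metric/parameter set
# with two single-metric selection rules.
rule_group = {
    "shared_parameters": [{"metric": "precision@", "parameter": "100_abs"}],
    "selection_rules": [
        {"name": "best_current_value", "n": 1},
        {"name": "most_frequent_best_dist",
         "dist_from_best_case": [0.01, 0.05], "n": 1},
    ],
}
print(len(rule_group["selection_rules"]))  # → 2
```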
TwoMetricsRuleMaker (BaseRules)#

The TwoMetricsRuleMaker class allows for the specification of rules that evaluate a model group's performance in terms of two metrics. It currently supports one rule:

add_rule_best_average_two_metrics(self, metric1='precision@', parameter1='100_abs', metric2='recall@', parameter2='300_abs', metric1_weight=[0.5], n=1)#
Selection Grid#
make_selection_rule_grid(rule_groups)#
Convert a compact selection rule group representation to a list of bound selection rules.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
rule_groups | list | List of dicts used to specify the selection rule grid. | required |

Most users will want to use RuleMaker objects to generate their rule_groups specifications.

An example rule_groups specification:
[{
'shared_parameters': [
{'metric': 'precision@', 'parameter': '100_abs'},
{'metric': 'recall@', 'parameter': '100_abs'},
],
'selection_rules': [
{'name': 'most_frequent_best_dist', 'dist_from_best_case': [0.1, 0.2, 0.3]},
{'name': 'best_current_value'}
]
}, {
'shared_parameters': [
{'metric1': 'precision@', 'parameter1': '100_abs'},
],
'selection_rules': [
{
'name': 'best_average_two_metrics',
'metric2': ['recall@'],
'parameter2': ['100_abs'],
'metric1_weight': [0.4, 0.5, 0.6]
},
]
}]
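To show how such a spec fans out, here is a sketch of the expansion logic (illustrative, not Triage's implementation): each shared parameter set is crossed with every combination of each rule's own hyperparameter lists.

```python
# Count the bound rules a rule_groups spec expands into.
def count_bound_rules(rule_groups):
    total = 0
    for group in rule_groups:
        for rule in group["selection_rules"]:
            # Every non-'name' key holds a list of candidate values
            # (or a scalar, treated as a one-element list).
            combos = 1
            for key, value in rule.items():
                if key != "name":
                    combos *= len(value) if isinstance(value, list) else 1
            total += len(group["shared_parameters"]) * combos
    return total

# First group of the example spec above: 2 shared parameter sets,
# one rule with 3 candidate distances and one rule with none.
rule_groups = [{
    "shared_parameters": [
        {"metric": "precision@", "parameter": "100_abs"},
        {"metric": "recall@", "parameter": "100_abs"},
    ],
    "selection_rules": [
        {"name": "most_frequent_best_dist", "dist_from_best_case": [0.1, 0.2, 0.3]},
        {"name": "best_current_value"},
    ],
}]
print(count_bound_rules(rule_groups))  # → 8
```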
Returns:

Type | Description |
---|---|
list | list of audition.selection_rules.BoundSelectionRule objects |