Intro to Audition
What is Audition?#
Audition is the Triage model selection module. Model selection is the process of selecting a model group that is likely to perform well on future data. Audition makes model selection easy and repeatable, even when comparing hundreds of model groups.
Start by defining a set of coarse filtering rules that prune the worst-performing model groups. You can then apply more complex selection rules that rank model groups on factors like their performance over time or variability in performance.
Audition can compare the performance of these selection rules, helping you choose a selection rule that is likely to identify well-performing models in the future.
Audition works well in a Jupyter notebook, or any other environment that will conveniently display Audition's matplotlib output.
The Audition Python interface#
Initializing Auditioner#
This Auditioner object reads performance data from a database populated by a Triage Experiment. It loads information about the model groups selected by model_group_ids
, over the training sets specified by train_end_times
, and filters them as defined by initial_metric_filters
.
from triage.component.audition import Auditioner
aud = Auditioner(
db_engine = conn, # connection to a database populated by a Triage experiment
model_group_ids=[i for i in range(101, 150)], # selecting model groups to evaluate
train_end_times=end_times,
initial_metric_filters=
[{'metric': 'precision@',
'parameter': '50_abs',
'max_from_best': 0.3,
'threshold_value': 0.5}],
models_table='models',
distance_table='best_dist',
agg_type='worst'
)
- Achieves precision at least 0.3 worse than the best performing model group
- Achieves precision worse than 0.5
Note that the agg_type
parameter is optional for aggregating metric values across multiple models for a given model_group_id
and train_end_time
combination (e.g., from different random seeds) -- mean
, best
, or worst
(the default)
Selection rules#
Selection rules allow you to pare down your model groups even more.
Start by adding rules to a RuleMaker
object:
from triage.component.audition.rules_maker import SimpleRuleMaker, create_selection_grid
rule = SimpleRuleMaker()
rule.add_rule_best_current_value(metric='precision@', parameter='50_abs', n=3)
rule.add_rule_best_average_value(metric='precision@', parameter='50_abs', n=3)
- Ranked by precision in the most recent training set
- Ranked by average precision over all training sets.
Create a selection grid from your RuleMaker
, and register it in your Auditioner
object. This will generate a set of plots showing the performance of each rule, and allow you to view the top n
models selected by each rule.
grid = create_selection_grid(rule)
aud.register_selection_rule_grid(grid, plot=True)
aud.selection_rule_model_group_ids
Metric Filters#
Auditioner implements two coarse filters for pruning worst-performing model groups.
These filters are defined on the metrics specified by the metric
and parameter
arguments in the initial_metric_filters
dict. The combination of these two arguments should refer to a metric calculated by a Triage experiment.
max_from_best
: Model groups that perform this much worse than the best-performing model group in any period will be pruned.
threshold_value
: Model groups that perform worse than this threshold in any period will be pruned.
Adding metric filters#
After defining an Auditioner
instance, we can output the models permitted by the initial thresholds.
# Output the thresholded model group ids
aud.thresholded_model_group_ids
If that didn't thin things out too much, let's get a bit more agressive with both parameters. If we want to have multiple filters, then use update_metric_filters
to apply a set of filters to the model groups we're considering in order to eliminate poorly performing ones. The model groups will be plotted again after updating the filters.
aud.update_metric_filters([{
'metric': 'precision@',
'parameter': '50_abs',
'max_from_best': 0.5,
'threshold_value': 0.12
}])
aud.thresholded_model_group_ids
The Audition CLI#
Besides its Python interface, Audition exposes a full-featured CLI.
Start by defining an Audition config file. Parameters in an Audition config map to arguments in the Python interface introduced above.
triage -d dbconfig.yaml audition --config audition_config.yaml --directory audition_output
This command will run Audition against the database specified in dbconfig.yaml
, using options from audition_config.yaml
. It will store the resulting plots in the directory audition_output
.