Looking for results#

Before cooking the looking-for-results recipes#

When a triage run finishes, it generates three types of outputs:

  • An experiment summary report (HTML) that you can generate

  • Objects stored on the local filesystem (or on S3, if you specified that). Two types of objects will be written to disk under the project_path specified when creating the experiment object:

    • The matrices used for model training and validation, stored as (compressed) CSV files and associated metadata in yaml format.

    • The trained model objects themselves, stored as joblib pickles, which can be loaded and applied to new data.

  • Tables generated in your database
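For instance, the trained model objects mentioned above can be pulled back into Python with joblib. A minimal sketch, assuming Triage's default on-disk layout where model pickles live under <project_path>/trained_models keyed by model hash; the path and hash below are placeholders:

from pathlib import Path

import joblib

# Placeholders -- substitute your own project_path and model hash
project_path = Path('/path/to/project_path')
model = joblib.load(project_path / 'trained_models' / '98112c011d842c43e841c415116ef179')

# The result is a fitted scikit-learn-style estimator, so it can score new data:
# scores = model.predict_proba(X)[:, 1]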

Summary report of the experiment run#

The summary gives an overview of what happened in the experiment.

  • The number of temporal splits generated based on temporal configuration

  • The number of unique datetimes in those temporal splits

  • The average size of the cohorts and their base rates

  • The number of features generated and used in your models

  • The number of feature groups

  • The number of different types of models generated, e.g., Random Forest, Decision Tree, etc.

  • The number of models generated based on your grid configuration

  • The best average performance metric (based on your definition) and which model type generated it

  • A first glance at the disparity metric computed over the groups you have defined

If the information looks correct based on what you intended to run, this is a good sanity check. If not, you can revisit your config file and look for inconsistencies, e.g., the start and end dates for features and labels.

How to generate the Experiment Summary#

We need the hash of the triage run that just finished, the performance metric and threshold we want the summary to show results for, the bias metric, and the priority groups we would like to focus on.


from sqlalchemy import create_engine

from triage.component.postmodeling.experiment_summarizer import ExperimentReport

# Connection to the database where Triage stored its results
# (substitute your own credentials)
db_engine = create_engine('postgresql://user:password@host:5432/dbname')

# Triage-created hash(es) of the experiment(s) you are interested in.
# It has to be a list (even if it has a single element)
experiment_hashes = ['98112c011d842c43e841c415116ef179']

# Model performance metric and threshold
# These default to 'recall@' and '1_pct'
performance_metric = 'precision@'
threshold = '10_pct'

# Bias metric; defaults to 'tpr_disparity', with bias metric values for all
# groups generated (if a bias audit was specified in the experiment config)
bias_metric = 'tpr_disparity'
bias_priority_groups = {'teacher_prefix': ['Dr.', 'Mr.', 'Mrs.', 'Ms.']}

# Create the Experiment Report
rep = ExperimentReport(
    engine=db_engine,
    experiment_hashes=experiment_hashes,
    performance_priority_metric=performance_metric,
    threshold=threshold,
    bias_priority_metric=bias_metric,
    bias_priority_groups=bias_priority_groups
)

We can then run the following to generate the summary:

rep.generate_summary()

Results stored in the database#

Triage generates a series of tables where all the metadata and data from experiments is stored.

Triage will generate the following schemas:

  • triage_metadata: Has tables that store all the metadata associated with an experiment. For example: Experiments and Triage runs (Fig. 6).

Fig. 6 Triage Metadata schema.#

  • train_results: Has tables that store all the data associated with training the ML models set up in a Triage experiment run. For example: Feature importances (Fig. 7).

Fig. 7 Triage Train schema.#

  • test_results: Has tables that store all the data associated with validating/testing the models set up in a Triage experiment run. For example: Evaluations and Predictions (Fig. 8).

Fig. 8 Triage Test schema.#

  • triage_production: Has tables that store all the data associated with predictions for a production environment.
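To orient yourself, you can list the tables Triage created in these schemas. A small sketch using pandas and the db_engine from the summary example above; the query is standard information_schema SQL, nothing Triage-specific:

import pandas as pd

# List every table Triage created, grouped by schema
tables = pd.read_sql(
    """
    select table_schema, table_name
    from information_schema.tables
    where table_schema in ('triage_metadata', 'train_results',
                           'test_results', 'triage_production')
    order by table_schema, table_name
    """,
    db_engine,
)
print(tables)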

Recipes:

Getting the hash of the experiment#

In Triage, the hash of an experiment is called run_hash or experiment_hash.

🥕 Ingredients

  • A connection to the DB (DBeaver or psql)

  • Optional: the date when you ran the experiment

👩‍🍳 How to cook

--this will retrieve the experiment hash from the last run 
select run_hash, start_time, os_user, current_status
from triage_metadata.triage_runs 
--if you want to check the runs on a specific date, comment out the limit 1 line and 
--uncomment the following line(s)
--where start_time::date = '2025-01-01'
--or, if you would like to search within an interval of time:
--where start_time between '2025-10-09 08:00' and '2025-10-09 22:00' 
order by start_time desc 
limit 1;

🍲 What to look for

The query will give you the run_hash, when it was run, who ran it, and the status of the experiment.

| run_hash | start_time | os_user | current_status |
| --- | --- | --- | --- |
| 86f34f9694ace01751fbd3dbe85dac48 | 2025-10-08 20:20:30.675 | liliana | completed |
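If you prefer to do the same lookup from Python, a sketch that feeds the most recent run’s hash straight into the ExperimentReport shown earlier (reusing db_engine):

import pandas as pd

# Grab the hash of the most recent run and use it as the report input
last_run = pd.read_sql(
    """
    select run_hash
    from triage_metadata.triage_runs
    order by start_time desc
    limit 1
    """,
    db_engine,
)
experiment_hashes = [last_run['run_hash'].iloc[0]]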

Getting cohort and label tables of an experiment#

🥕 Ingredients

  • A connection to the DB (DBeaver or psql)

  • The experiment_hash of the experiment. In case you don’t know it, follow the recipe Getting the hash of the experiment or use the second How to cook below

👩‍🍳 How to cook given a specific experiment hash

--we are retrieving the hash of the experiment, when the experiment ran,
--the name of the cohort and label tables associated with this experiment. 
select distinct 
    run_hash, 
    start_time, 
    os_user, 
    current_status, 
    cohort_table_name, 
    labels_table_name 
from triage_metadata.triage_runs 
--note that on triage_metadata.triage_runs table, the experiment hash 
--is called run_hash, on the rest of the tables is called experiment_hash
where run_hash = '86f34f9694ace01751fbd3dbe85dac48';

👩‍🍳 How to cook if you also need to look up the experiment hash

--note that on triage_metadata.triage_runs table, the experiment hash 
--is called run_hash, on the rest of the tables is called experiment_hash
select 
    run_hash, 
    start_time, 
    os_user, 
    current_status, 
    cohort_table_name, 
    labels_table_name 
from triage_metadata.triage_runs 
order by start_time desc 
limit 1;

🍲 What to look for

| run_hash | start_time | os_user | current_status | cohort_table_name | labels_table_name |
| --- | --- | --- | --- | --- | --- |
| 1dfae8fce8d582cdeebe1a82d5b7d906 | 2025-10-09 17:17:32.469 | rmk2 | completed | cohort_default_96a3340714ab3166717fd3d04d974926 | labels_reincarceration_96a3340714ab3166717fd3d04d974926 |

Getting model groups of an experiment#

🥕 Ingredients

  • A connection to the DB (DBeaver or psql)

  • The experiment_hash of the experiment. In case you don’t know it, follow the recipe Getting the hash of the experiment or the How to cook (2) of this recipe

👩‍🍳 How to cook

--given an experiment_hash 86f34f9694ace01751fbd3dbe85dac48
select distinct 
    model_group_id, 
    model_type 
from triage_metadata.experiment_models a 
join triage_metadata.models b 
using (model_hash)
where experiment_hash = '86f34f9694ace01751fbd3dbe85dac48'

👩‍🍳 How to cook (2) If you don’t know the experiment_hash

--if you don't know the experiment hash
with last_experiment as (
   select run_hash 
   from triage_metadata.triage_runs 
   order by start_time desc 
   limit 1
)

select distinct 
   model_group_id, 
   model_type 
from triage_metadata.experiment_models a
join last_experiment b 
   on b.run_hash = a.experiment_hash
join triage_metadata.models c
using (model_hash)

🍲 What to look for

The query will give you one model_group_id for each model type set up in your experiment. For example,

| model_group_id | model_type |
| --- | --- |
| 4 | sklearn.dummy.DummyClassifier |
| 5 | triage.component.catwalk.estimators.classifiers.ScaledLogisticRegression |
| 6 | sklearn.tree.DecisionTreeClassifier |

Getting model ids of an experiment#

🥕 Ingredients

  • A connection to the DB (DBeaver or psql)

  • The experiment_hash of the experiment. In case you don’t know it, follow the recipe Getting the hash of the experiment or the How to cook (2) of this recipe

  • Optional: The model group(s) that you would like to get the model ids from. In case you don’t know/have them, follow the How to cook (2) of this recipe

👩‍🍳 How to cook

select distinct model_group_id, model_id, train_end_time 
from triage_metadata.experiment_models a 
join triage_metadata.models b
 using (model_hash)
where experiment_hash = '86f34f9694ace01751fbd3dbe85dac48'
and model_group_id = 6
order by 3;

👩‍🍳 How to cook (2) If you don’t know/have the model_group_id and/or the experiment_hash

with last_experiment as (
    select run_hash 
    from triage_metadata.triage_runs 
    order by start_time desc 
    limit 1
)

select distinct 
    model_group_id, 
    model_id, 
    train_end_time 
from triage_metadata.experiment_models a 
join last_experiment b
 on a.experiment_hash = b.run_hash
join triage_metadata.models c
 using (model_hash)

🍲 What to look for

For each model group in your experiment, you will get the different model ids and their train end times.

| model_group_id | model_id | train_end_time |
| --- | --- | --- |
| 16 | 1344 | 2017-07-01 00:00:00.000 |
| 16 | 1349 | 2018-01-01 00:00:00.000 |
| 16 | 1354 | 2018-07-01 00:00:00.000 |
| 16 | 1359 | 2019-01-01 00:00:00.000 |
| 16 | 1364 | 2019-07-01 00:00:00.000 |
| 16 | 1369 | 2020-01-01 00:00:00.000 |
| 16 | 1374 | 2020-07-01 00:00:00.000 |
| 16 | 1379 | 2021-01-01 00:00:00.000 |
| 16 | 1384 | 2021-07-01 00:00:00.000 |
| 16 | 1389 | 2022-01-01 00:00:00.000 |
| 16 | 1394 | 2022-07-01 00:00:00.000 |
| 16 | 1399 | 2023-01-01 00:00:00.000 |
| 16 | 1404 | 2023-07-01 00:00:00.000 |

Getting the number of models and matrices generated on an experiment#

This recipe will give you the number of models and matrices that were part of an experiment as well as how many were actually built, skipped and errored.

🥕 Ingredients

  • A connection to the DB (DBeaver or psql)

  • The experiment_hash of the experiment. In case you don’t know it, follow the recipe Getting the hash of the experiment

👩‍🍳 How to cook

select
    experiment_hash,
    start_time, 
    os_user, 
    current_status,
    models
from triage_metadata.triage_runs a 
join triage_metadata.experiments b 
 on a.run_hash = b.experiment_hash
where experiment_hash = '86f34f9694ace01751fbd3dbe85dac48';

🍲 What to look for

Getting the performance evaluation of a model#

This recipe will help you retrieve from the DB all the metrics you defined in your configuration to be calculated for each of your models.

🥕 Ingredients

  • A connection to the DB (DBeaver or psql)

  • The model ids for which you would like to retrieve the evaluations. In case you don’t know them, you can follow the recipe Getting model ids of an experiment

  • The performance metric you would like to retrieve, e.g., precision (precision@), recall (recall@), ROC-AUC (roc-auc), accuracy (accuracy)

  • The “threshold” you would like to retrieve, e.g., top 100 (100_abs), 10% (10_pct)

  • Optional: The experiment_hash of the experiment, if you would like to retrieve the evaluations for all the models built in an experiment. In case you don’t know it, follow the recipe Getting the hash of the experiment

👩‍🍳 How to cook: For a specific model id

select 
    model_id, 
    evaluation_end_time, 
    metric, 
    parameter, 
    stochastic_value 
from test_results.evaluations 
where model_id = 1344
and metric in ('precision@', 'recall@')
and parameter in ('100_abs', '2_pct');

🍲 What to look for: For a specific model id

For each metric and threshold defined in your query, you will get the stochastic value of the performance evaluation.

| model_id | evaluation_end_time | metric | parameter | stochastic_value |
| --- | --- | --- | --- | --- |
| 1344 | 2017-07-01 00:00:00.000 | precision@ | 100_abs | 0.19 |
| 1344 | 2017-07-01 00:00:00.000 | precision@ | 2_pct | 0.048199152542372885 |
| 1344 | 2017-07-01 00:00:00.000 | recall@ | 100_abs | 0.07421875 |
| 1344 | 2017-07-01 00:00:00.000 | recall@ | 2_pct | 0.35546875 |

👩‍🍳 How to cook: For all the models on a model group id

select model_group_id, model_id, evaluation_end_time, parameter, stochastic_value 
from test_results.evaluations a
join triage_metadata.models b
 using (model_id)
where model_group_id = 4
and metric in ('precision@', 'recall@')
and parameter in ('100_abs', '2_pct')
order by 3

🍲 What to look for: For all the models on a specific model group

For each model id in the model group, you will get the threshold (parameter) defined in your query and the stochastic value of the performance evaluation. Note that this query does not select the metric column, so precision@ and recall@ rows look alike; add metric to the select list if you need to tell them apart.

| model_group_id | model_id | evaluation_end_time | parameter | stochastic_value |
| --- | --- | --- | --- | --- |
| 4 | 1 | 2018-05-01 00:00:00.000 | 2_pct | 0.11258366800535476 |
| 4 | 1 | 2018-05-01 00:00:00.000 | 2_pct | 0.019704779756326146 |
| 4 | 1 | 2018-05-01 00:00:00.000 | 100_abs | 0.11833333333333333 |
| 4 | 1 | 2018-05-01 00:00:00.000 | 100_abs | 0.0027725710715401438 |

👩‍🍳 How to cook: For all the model groups on an experiment run

select model_group_id, model_id, evaluation_end_time, parameter, stochastic_value 
from triage_metadata.experiment_models a 
join triage_metadata.models b 
  using (model_hash)
join test_results.evaluations c
 using (model_id)
where experiment_hash = '86f34f9694ace01751fbd3dbe85dac48'
and metric in ('precision@', 'recall@')
and parameter in ('100_abs', '2_pct')
order by 1, 3

🍲 What to look for: For all the model groups on an experiment run

For each model group id and model id that was part of the experiment run, you will get the threshold defined in your query and the stochastic value of the performance evaluation.

| model_group_id | model_id | evaluation_end_time | parameter | stochastic_value |
| --- | --- | --- | --- | --- |
| 4 | 1 | 2018-05-01 00:00:00.000 | 2_pct | 0.11258366800535476 |
| 4 | 1 | 2018-05-01 00:00:00.000 | 2_pct | 0.019704779756326146 |
| 4 | 1 | 2018-05-01 00:00:00.000 | 100_abs | 0.11833333333333333 |
| 4 | 1 | 2018-05-01 00:00:00.000 | 100_abs | 0.0027725710715401438 |
| 5 | 2 | 2018-05-01 00:00:00.000 | 2_pct | 0.570281124497992 |
| 5 | 2 | 2018-05-01 00:00:00.000 | 2_pct | 0.09981255857544517 |
| 5 | 2 | 2018-05-01 00:00:00.000 | 100_abs | 0.6556666666666667 |
| 5 | 2 | 2018-05-01 00:00:00.000 | 100_abs | 0.015362386754139331 |
| 6 | 3 | 2018-05-01 00:00:00.000 | 2_pct | 0.2819723337795627 |
| 6 | 3 | 2018-05-01 00:00:00.000 | 2_pct | 0.04935176507341456 |
| 6 | 3 | 2018-05-01 00:00:00.000 | 100_abs | 0.274 |
| 6 | 3 | 2018-05-01 00:00:00.000 | 100_abs | 0.0064198687910028114 |
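If you would rather explore these evaluations in Python, a sketch that reads the same query into pandas and pivots one metric/threshold pair to compare model groups over time (the hash and metric choice are placeholders):

import pandas as pd

evals = pd.read_sql(
    """
    select model_group_id, model_id, evaluation_end_time, stochastic_value
    from triage_metadata.experiment_models a
    join triage_metadata.models b using (model_hash)
    join test_results.evaluations c using (model_id)
    where experiment_hash = '86f34f9694ace01751fbd3dbe85dac48'
    and metric = 'precision@'
    and parameter = '100_abs'
    """,
    db_engine,
)

# One column per model group, one row per validation date
print(evals.pivot_table(index='evaluation_end_time',
                        columns='model_group_id',
                        values='stochastic_value'))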

Getting predictions of a model#

Note

You can only retrieve predictions from the DB if, in your run.py, you set the flag save_predictions to True.
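For reference, a minimal sketch of where that flag lives, assuming you build and run the experiment with SingleThreadedExperiment; the config path and project_path are placeholders:

import yaml

from triage.experiments import SingleThreadedExperiment

# Load the same experiment config you ran with (path is a placeholder)
with open('config.yaml') as f:
    experiment_config = yaml.safe_load(f)

experiment = SingleThreadedExperiment(
    config=experiment_config,
    db_engine=db_engine,                   # same SQLAlchemy engine as above
    project_path='/path/to/project_path',  # placeholder
    save_predictions=True,                 # persist predictions to test_results.predictions
)
experiment.run()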

🥕 Ingredients

  • A connection to the DB (DBeaver or psql)

  • The model ids for which you would like to retrieve the predictions. In case you don’t know them, you can follow the recipe Getting model ids of an experiment

  • Optional: The “threshold” you would like to retrieve, e.g., the 100 entities with the highest scores (rank_abs_no_ties)

  • Optional: The experiment_hash of the experiment, if you would like to retrieve the predictions for all the models built in an experiment. In case you don’t know it, follow the recipe Getting the hash of the experiment

👩‍🍳 How to cook: For a specific model id

--we are getting the model id, the prediction date, the id of the entity
--the rank with no ties based on the score, the score (output of the model), and the 
--label (outcome) for that entity (ground truth). 
select model_id, as_of_date, entity_id, rank_abs_no_ties, score, label_value
from test_results.predictions 
where model_id = 3 
--we will retrieve the scores generated by the model for the top (untied) 100,
--but there are other rankings you could use: rank_abs_with_ties, rank_pct_no_ties, rank_pct_with_ties
and rank_abs_no_ties <= 100
order by 4

🍲 What to look for

You will get, for each entity scored by a model id, the score it got (the output of the model’s predict_proba), its rank within the cohort, and the true label.

| model_id | as_of_date | entity_id | rank_abs_no_ties | score | label_value |
| --- | --- | --- | --- | --- | --- |
| 1344 | 2017-07-01 00:00:00.000 | 100081853 | 1 | 0.58583 | 1 |
| 1344 | 2017-07-01 00:00:00.000 | 100126970 | 2 | 0.56869 | 0 |
| 1344 | 2017-07-01 00:00:00.000 | 100781591 | 3 | 0.52514 | 1 |
| 1344 | 2017-07-01 00:00:00.000 | 100228572 | 4 | 0.49065 | 1 |
| 1344 | 2017-07-01 00:00:00.000 | 100035417 | 5 | 0.47441 | 1 |
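As a sanity check, you can recompute a metric from these predictions and compare it with test_results.evaluations. A sketch for precision at the top 100 (the fraction of positive labels among the top-ranked 100), assuming those entities all have labels:

import pandas as pd

preds = pd.read_sql(
    """
    select entity_id, rank_abs_no_ties, score, label_value
    from test_results.predictions
    where model_id = 3
    and rank_abs_no_ties <= 100
    """,
    db_engine,
)

# Precision@100 is the fraction of positive labels in the top 100;
# it should line up with the precision@ / 100_abs row in test_results.evaluations
print(preds['label_value'].mean())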

Getting the train and test matrices used on a model#

Often you’ll need to know which matrices were used by a specific model in order to run other tests or analyses. This recipe will give you the name (UUID) of the train and test matrices used by a particular model id, or by all the model ids in a model group.

🥕 Ingredients

  • A connection to the DB (DBeaver or psql)

  • The model_id or model_group_id in case you would like to retrieve the matrices from all the models on a model group. In case you don’t know the model id or the model group id, you can follow the recipe Getting model ids of an experiment

  • Optional: An experiment_hash. In case you don’t know the experiment hash, you can follow the recipe Getting the hash of the experiment

👩‍🍳 How to cook for a specific model id

select distinct 
    model_group_id,
    model_id,
    train_end_time, 
    train_matrix_uuid,
    matrix_uuid as test_matrix_uuid
from triage_metadata.models a
-- getting the uuid from matrices used in validation 
join test_results.evaluations b 
 using (model_id)
where model_id = 1399
order by 1, 3 

👩‍🍳 How to cook for all model ids on a model group

select distinct 
    model_group_id,
    model_id,
    train_end_time, 
    train_matrix_uuid,
    matrix_uuid as test_matrix_uuid
from triage_metadata.models a
-- getting the uuid from matrices used in validation 
join test_results.evaluations b 
 using (model_id)
where model_group_id = 16
order by 1, 3 

👩‍🍳 How to cook for all model groups in an experiment

select distinct 
    model_group_id,
    model_id,
    train_end_time, 
    train_matrix_uuid,
    matrix_uuid as test_matrix_uuid
--in case you don't know the model id but you know the experiment_hash 
from triage_metadata.experiment_models a 
join triage_metadata.models b 
 using (model_hash)
-- getting the uuid from matrices used in validation 
join test_results.evaluations d 
 using (model_id)
where experiment_hash = 'f2614123549000597dbda80cb6e629b4'
order by 1, 3 

🍲 What to look for

| model_group_id | model_id | train_end_time | train_matrix_uuid | test_matrix_uuid |
| --- | --- | --- | --- | --- |
| 16 | 1344 | 2017-07-01 00:00:00.000 | 91fb7fa570e7aa83ea52d29587b12c46 | 7cf9e79bdcd3016de726cb5fd8c596a5 |
| 16 | 1349 | 2018-01-01 00:00:00.000 | 8206235d5d9b55fe58d7fac576832e28 | 79fe4ee54a1e462acb0e3dc1d44cacb9 |
| 117 | 1275 | 2021-07-01 00:00:00.000 | d5e9350e7a81a5a66eddeb06f597702e | bc7a98e98464ba038b4d02efd3aeac7d |
| 117 | 1288 | 2022-01-01 00:00:00.000 | 3bc65e9f083a204c417c508c2a4c0e87 | 2fdd45fb0003af1694d0319c262ac4e6 |
| 118 | 1198 | 2018-07-01 00:00:00.000 | ec4d424507aa99b48e7ad9b9dec08e56 | db8feadb6d2e983afa8a12a308d6349f |
| 118 | 1211 | 2019-01-01 00:00:00.000 | 50b09b93403c94c674a1d2b3cae74222 | 1797ae01ada3a0b4a39d6d489ecf5fa1 |
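Once you have a UUID, you can pull the matrix itself off disk. A sketch assuming the default layout described at the top of this page, with matrices stored as compressed CSVs under <project_path>/matrices; the path and UUID are placeholders:

from pathlib import Path

import pandas as pd

# Placeholders -- substitute your project_path and one of the UUIDs above
project_path = Path('/path/to/project_path')
matrix_uuid = '91fb7fa570e7aa83ea52d29587b12c46'

matrix = pd.read_csv(project_path / 'matrices' / f'{matrix_uuid}.csv.gz')
print(matrix.shape)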

Getting the feature importances of a model#

This recipe will help you retrieve the feature importances of a specific model or set of models.

🥕 Ingredients

  • A connection to the DB (DBeaver, DbVisualizer, psql, or the query IDE you use)

  • A model_id, or the set of model ids, for which you would like to retrieve the feature importances

  • Optional: an experiment_hash, to retrieve the feature importances of all the models generated in that experiment

👩‍🍳 How to cook

select 
    model_id,
    --the name of the feature 
    feature, 
    feature_importance,
    --rank associated with the value of the feature importance
    --the most important feature has rank 1 
    rank_abs
from train_results.feature_importances 
where model_id = 1399
--we are retrieving the 10 most important features based 
--on the feature importance 
and rank_abs < 11

🍲 What to look for

You should get a table with the 10 features with the highest feature importance values and their corresponding ranks.

| model_id | feature | feature_importance | rank_abs |
| --- | --- | --- | --- |
| 1399 | b_age_entity_id_all_age_avg | 0.0128745883 | 1 |
| 1399 | b_all_event_entity_id_all_dsl_min | 0.0096761433 | 6 |
| 1399 | b_all_event_gaps_entity_id_1year_days_btwn_avg | 0.0095655872 | 7 |
| 1399 | b_all_event_gaps_entity_id_1year_days_btwn_max | 0.0098250766 | 2 |
| 1399 | b_all_event_gaps_entity_id_3years_days_btwn_avg | 0.0094501876 | 8 |
| 1399 | b_all_event_gaps_entity_id_3years_days_btwn_max | 0.0097146209 | 5 |
| 1399 | b_all_event_gaps_entity_id_5years_days_btwn_avg | 0.0093690388 | 9 |
| 1399 | b_all_event_gaps_entity_id_5years_days_btwn_max | 0.0097221884 | 4 |
| 1399 | b_all_event_gaps_entity_id_all_days_btwn_avg | 0.0093398857 | 10 |
| 1399 | b_all_event_gaps_entity_id_all_days_btwn_max | 0.0097322057 | 3 |

Getting the configuration associated with an experiment#

Given all the different experiments we run, it is common to forget which experiment had which setup. This recipe retrieves the configuration associated with an experiment.

🥕 Ingredients

  • A connection to the DB (DBeaver, DbVisualizer, psql, or the query IDE you use)

  • An experiment_hash. In case you don’t know the experiment hash, you can follow the recipe Getting the hash of the experiment

👩‍🍳 How to cook

select 
    config as all_config, 
    config->'temporal_config' as temporal_config,
    config->'scoring' as scoring, 
    config->'bias_audit_config' as bias_and_audit,
    config->'grid_config' as grid_config,
    config->'feature_aggregations' as feature_aggregations
from triage_metadata.triage_runs 
where run_hash = 'f2614123549000597dbda80cb6e629b4'

🍲 What to look for

The whole configuration file is stored as a jsonb object in the DB. You can retrieve each section, and specific elements within a section. The following output omits the columns all_config, bias_and_audit, and feature_aggregations to make the output more legible.

temporal_config:

{"label_end_time": "2024-01-01", "test_durations": ["0day"], "feature_end_time": "2024-01-01", "label_start_time": "2017-01-01", "feature_start_time": "2014-01-01", "test_label_timespans": ["6month"], "max_training_histories": ["3year"], "model_update_frequency": "6month", "training_label_timespans": ["6month"], "test_as_of_date_frequencies": ["6month"], "training_as_of_date_frequencies": ["6month"]}

scoring:

{"testing_metric_groups": [{"metrics": ["precision@", "recall@"], "thresholds": {"top_n": [50, 100, 200], "percentiles": [0.01, 0.1, 0.2, 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]}}]}

grid_config:

{"xgboost.XGBClassifier": {"booster": ["dart"], "nthread": [10], "max_depth": [10, 50, 100], "eval_metric": ["logloss"], "tree_method": ["hist"]}, "lightgbm.LGBMClassifier": {"n_jobs": [-5], "max_depth": [100], "num_leaves": [10], "is_unbalance": ["false"], "n_estimators": [100], "boosting_type": ["dart"]}, "sklearn.dummy.DummyClassifier": {"strategy": ["prior"]}, "sklearn.tree.DecisionTreeClassifier": {"max_depth": [3, 10, 50, 100], "min_samples_split": [30]}, "sklearn.ensemble.RandomForestClassifier": {"n_jobs": [-5], "max_depth": [150], "max_features": ["sqrt"], "n_estimators": [5000], "min_samples_split": [10]}, "triage.component.catwalk.baselines.thresholders.SimpleThresholder": {"rules": [["j_rsc_suic_entity_id_all_high_risk_max > 0", "j_rsc_selfharm_entity_id_all_high_risk_max > 0", "j_rsc_selfcare_entity_id_all_high_risk_max > 0", "j_rsc_physagg_entity_id_all_high_risk_max > 0", "j_rsc_substanceabuse_entity_id_all_high_risk_max > 0", "j_rsc_hosp_entity_id_all_high_risk_max > 0", "j_rsc_harmtoothers_entity_id_all_high_risk_max > 0"]], "logical_operator": ["or"]}, "triage.component.catwalk.baselines.rankers.BaselineRankMultiFeature": {"rules": [[{"feature": "b_all_event_entity_id_all_dsl_min", "low_value_high_score": true}], [{"feature": "b_ambulance_entity_id_all_total_count", "low_value_high_score": false}], [{"feature": "b_all_event_entity_id_all_total_count", "low_value_high_score": false}], [{"feature": "b_diagnoses_entity_id_all_dsl_min", "low_value_high_score": true}]]}, "triage.component.catwalk.estimators.classifiers.ScaledLogisticRegression": {"C": [0.01, 0.1, 0.5, 1], "solver": ["saga"], "penalty": ["l1"]}}
{“xgboost.XGBClassifier”: {“booster”: [“dart”], “nthread”: [10], “max_depth”: [10, 50, 100], “eval_metric”: [“logloss”], “tree_method”: [“hist”]}, “lightgbm.LGBMClassifier”: {“n_jobs”: [-5], “max_depth”: [100], “num_leaves”: [10], “is_unbalance”: [“false”], “n_estimators”: [100], “boosting_type”: [“dart”]}, “sklearn.dummy.DummyClassifier”: {“strategy”: [“prior”]}, “sklearn.tree.DecisionTreeClassifier”: {“max_depth”: [3, 10, 50, 100], “min_samples_split”: [30]}, “sklearn.ensemble.RandomForestClassifier”: {“n_jobs”: [-5], “max_depth”: [150], “max_features”: [“sqrt”], “n_estimators”: [5000], “min_samples_split”: [10]}, “triage.component.catwalk.baselines.thresholders.SimpleThresholder”: {“rules”: [[“j_rsc_suic_entity_id_all_high_risk_max > 0”, “j_rsc_selfharm_entity_id_all_high_risk_max > 0”, “j_rsc_selfcare_entity_id_all_high_risk_max > 0”, “j_rsc_physagg_entity_id_all_high_risk_max > 0”, “j_rsc_substanceabuse_entity_id_all_high_risk_max > 0”, “j_rsc_hosp_entity_id_all_high_risk_max > 0”, “j_rsc_harmtoothers_entity_id_all_high_risk_max > 0”]], “logical_operator”: [“or”]}, “triage.component.catwalk.baselines.rankers.BaselineRankMultiFeature”: {“rules”: [[{“feature”: “b_all_event_entity_id_all_dsl_min”, “low_value_high_score”: true}], [{“feature”: “b_ambulance_entity_id_all_total_count”, “low_value_high_score”: false}], [{“feature”: “b_all_event_entity_id_all_total_count”, “low_value_high_score”: false}], [{“feature”: “b_diagnoses_entity_id_all_dsl_min”, “low_value_high_score”: true}]]}, “triage.component.catwalk.estimators.classifiers.ScaledLogisticRegression”: {“C”: [0.01, 0.1, 0.5, 1], “solver”: [“saga”], “penalty”: [“l1”]}}