Configuration options can be set in a .yml file. See the example config.yml on GitHub.

Define reference groups

--ref-group-method <type>

Fairness is always determined in relation to a reference group. By default, aequitas uses the majority group (the largest group for a given attribute) as the reference group.

  • "majority" Define fairness in relation to the majority in a group.
  • "min_metric" Define fairness in relation to the subgroup with the lowest value on a given metric.
  • "predefined" Define fairness in relation to groups of your choice.

The predefined reference groups are set in the configuration file.

ref_groups_method: "predefined"

ref_groups:
  "gender": "male"
  "age_cat": "35-50"
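The default "majority" rule can be sketched as follows. This is an illustrative stand-in, not the aequitas implementation: for each attribute, the most frequent group value becomes the reference group.

```python
# Sketch of the default "majority" reference-group rule: for each
# attribute, the largest group becomes the reference group.
# (Illustrative only -- not the aequitas implementation.)
from collections import Counter

def majority_ref_groups(rows, attr_cols):
    """Return {attribute: most frequent group value}."""
    refs = {}
    for col in attr_cols:
        counts = Counter(row[col] for row in rows)
        refs[col] = counts.most_common(1)[0][0]
    return refs

rows = [
    {"gender": "male", "age_cat": "35-50"},
    {"gender": "male", "age_cat": "18-35"},
    {"gender": "female", "age_cat": "35-50"},
]
print(majority_ref_groups(rows, ["gender", "age_cat"]))
# {'gender': 'male', 'age_cat': '35-50'}
```

With "predefined", the dictionary above is supplied directly in the configuration file instead of being computed.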

Score thresholds

If the input score column is not binary, you can test the impact of alternative cutoffs on fairness metrics.

Thresholds are set in the configuration file.

   thresholds:
     rank_abs: [300]
     rank_pct: [1.0, 5.0, 10.0]

With rank_abs, the top n observations (by score) are classified as 1 and the remainder as 0. With rank_pct, the top n percent are classified as 1 and the rest as 0.
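The two cutoff types can be sketched as below. This is an illustrative reimplementation, not the code aequitas runs internally:

```python
# Sketch of how rank_abs / rank_pct cutoffs binarize a continuous
# score column (illustrative; aequitas applies these internally).
def binarize(scores, rank_abs=None, rank_pct=None):
    """Return 0/1 labels: top-n (rank_abs) or top-n% (rank_pct) by score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if rank_abs is not None:
        k = rank_abs
    else:
        k = int(round(len(scores) * rank_pct / 100.0))
    labels = [0] * len(scores)
    for i in order[:k]:
        labels[i] = 1
    return labels

scores = [0.9, 0.2, 0.7, 0.4]
print(binarize(scores, rank_abs=2))      # [1, 0, 1, 0]
print(binarize(scores, rank_pct=25.0))   # [1, 0, 0, 0]
```

Listing several thresholds in the config runs the full set of fairness metrics once per cutoff, so their impact can be compared.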

Choosing Metrics

Choose from:

  • 'Statistical Parity'
  • 'Impact Parity'
  • 'FOR Parity'
  • 'FDR Parity'
  • 'FPR Parity'
  • 'FNR Parity'

The chosen metrics are set in the configuration file.

fairness_measures: ["FPR Parity", "FNR Parity"]

Fairness threshold

Disparity is determined in terms of ratios relative to the reference group. When a group's disparity ratio crosses the fairness threshold, the decision is deemed unfair on that metric. Note that this differs from the webapp, which uses 1 - fairness_threshold.

fairness_threshold: 0.8
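A minimal sketch of the threshold rule, assuming (as aequitas does) that a ratio is acceptable when it lies between the threshold and its reciprocal:

```python
# Sketch of the threshold rule: a disparity ratio (group metric divided
# by reference-group metric) passes when it lies between tau and 1/tau.
# With tau = 0.8, ratios in [0.8, 1.25] are deemed fair.
def is_fair(disparity_ratio, fairness_threshold=0.8):
    tau = fairness_threshold
    return tau <= disparity_ratio <= 1.0 / tau

print(is_fair(1.1))   # True
print(is_fair(1.5))   # False
print(is_fair(0.7))   # False
```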

Attribute columns

You can manually set attribute columns to be assessed for fairness.

attr_cols: ["zipcode_pct_black", "zipcode_median_income"]

Project information

Your project title and goal will be inserted into the report.

    title: "Insert project title"
    goal: "Insert project goal."


To connect to a database instead of using "--input", use the db key, credentials, and input_query. These are set in config.yaml.

        db:
            host: your_host
            database: your_db
            user: your_user
            password: your_pass
            port: 5432

The input query should return a table with score and label_value columns, plus one column for each attribute you want to use in the bias analysis.

input_query: "select id, score, label_value, attribute_1, attribute_2 from results.predictions left join ..."
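The expected shape of the query result can be illustrated with an in-memory SQLite table. The table and column names here are hypothetical; the real setup runs the input_query against the database configured above:

```python
# Sketch of what the input query must produce: one row per entity with
# score, label_value, and one column per attribute. Demonstrated with an
# in-memory SQLite table (the real setup uses the db credentials above).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table predictions (id int, score real, label_value int, gender text)"
)
conn.executemany(
    "insert into predictions values (?, ?, ?, ?)",
    [(1, 0.9, 1, "male"), (2, 0.3, 0, "female")],
)

input_query = "select id, score, label_value, gender from predictions"
rows = conn.execute(input_query).fetchall()
print(rows)  # [(1, 0.9, 1, 'male'), (2, 0.3, 0, 'female')]
```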

The output schema is optional; the default is public.

output_schema: results

Note: database functionality is not compatible with CSV input.