{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [
 "# Configurations\n",
 "\n",
 "Configurations can be set in a `.yml` file. See the example [config.yml](https://github.com/dssg/aequitas/tree/master/src/aequitas_cli) on GitHub.\n",
 "\n",
 "## Define reference groups\n",
 "`--ref-group-method`\n",
 "\n",
 "Fairness is always determined in relation to a reference group. By default, aequitas uses the majority group within each attribute as the reference group.\n",
 "\n",
 "- `\"majority\"`\n",
 "Define fairness in relation to the largest group within each attribute.\n",
 "\n",
 "- `\"min_metric\"`\n",
 "Define fairness in relation to the group with the lowest value on a given metric.\n",
 "\n",
 "- `\"predefined\"`\n",
 "Define fairness in relation to groups of your choice.\n",
 "\n",
 "Predefined reference groups are set in the configuration file.\n",
 "\n",
 "```\n",
 "ref_groups_method: \"predefined\"\n",
 "\n",
 "reference_groups:\n",
 "    \"gender\": \"male\"\n",
 "    \"age_cat\": \"35-50\"\n",
 "```\n",
 "\n",
 "## Score thresholds\n",
 "\n",
 "If the input `score` column is not binary, you can test the impact of alternative cutoffs on fairness metrics.\n",
 "\n",
 "Thresholds are set in the configuration file.\n",
 "```\n",
 "thresholds:\n",
 "    rank_abs: [300]\n",
 "    rank_pct: [1.0, 5.0, 10.0]\n",
 "```\n",
 "With `rank_abs`, the top n observations are classified as 1 and the remainder as 0.\n",
 "With `rank_pct`, the top n percent are classified as 1 and the rest as 0.\n",
 "\n",
 "## Choosing Metrics\n",
 "Choose from:\n",
 "\n",
 "* `'Statistical Parity'`\n",
 "* `'Impact Parity'`\n",
 "* `'FOR Parity'`\n",
 "* `'FDR Parity'`\n",
 "* `'FPR Parity'`\n",
 "* `'FNR Parity'`\n",
 "\n",
 "```\n",
 "fairness_measures: [\"FPR Parity\", \"FNR Parity\"]\n",
 "```\n",
 "\n",
 "## Fairness threshold\n",
 "Disparity is determined in terms of ratios. When a disparity ratio crosses the fairness threshold, the decision is deemed unfair on that metric. Note that this differs from the webapp, which uses `1 - fairness_threshold`.\n",
 "\n",
 "```\n",
 "fairness_threshold: 0.8\n",
 "```\n",
 "\n",
 "## Attribute columns\n",
 "You can manually set the attribute columns to be assessed for fairness.\n",
 "\n",
 "```\n",
 "attr_cols: [\"zipcode_pct_black\", \"zipcode_median_income\"]\n",
 "```\n",
 "\n",
 "## Project information\n",
 "Your project title and goal will be inserted into the report.\n",
 "\n",
 "```\n",
 "project_description:\n",
 "    title: \"Insert project title\"\n",
 "    goal: \"Insert project goal.\"\n",
 "```\n",
 "\n"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
 "## Databases\n",
 "\n",
 "To connect to a database instead of using `--input`, use the `db` key, credentials, and `input_query`. These are set in the configuration file.\n",
 "\n",
 "```\n",
 "db:\n",
 "    db_credentials:\n",
 "        host: your_host\n",
 "        database: your_db\n",
 "        user: your_user\n",
 "        password: your_pass\n",
 "        port: 5432\n",
 "```\n",
 "\n",
 "The input query should return a table with `score` and `label_value` columns, followed by each attribute you want to use for bias analysis.\n",
 "\n",
 "```\n",
 "input_query: \"select id, score, label_value, attribute_1, attribute_2 from results.predictions left join ...\"\n",
 "```\n",
 "\n",
 "The output schema is optional (default: `public`).\n",
 "\n",
 "```\n",
 "    output_schema: results\n",
 "```\n",
 "\n",
 "Note: database functionality is not compatible with CSV input."
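] }, { "cell_type": "markdown", "metadata": {}, "source": [
 "## Putting it together\n",
 "\n",
 "The sketch below assembles the snippets from the sections above into one example configuration file. The attribute names, credentials, and query are placeholders taken from the examples on this page, and the nesting of the database keys under `db` is an assumption here; check the example config.yml linked above for the canonical layout. The `db` block is only needed when reading from a database instead of passing `--input`.\n",
 "\n",
 "```\n",
 "project_description:\n",
 "    title: \"Insert project title\"\n",
 "    goal: \"Insert project goal.\"\n",
 "\n",
 "ref_groups_method: \"predefined\"\n",
 "reference_groups:\n",
 "    \"gender\": \"male\"\n",
 "    \"age_cat\": \"35-50\"\n",
 "\n",
 "thresholds:\n",
 "    rank_abs: [300]\n",
 "    rank_pct: [1.0, 5.0, 10.0]\n",
 "\n",
 "fairness_measures: [\"FPR Parity\", \"FNR Parity\"]\n",
 "fairness_threshold: 0.8\n",
 "\n",
 "attr_cols: [\"gender\", \"age_cat\"]\n",
 "\n",
 "# Assumed layout; only needed when reading from a database instead of --input\n",
 "db:\n",
 "    db_credentials:\n",
 "        host: your_host\n",
 "        database: your_db\n",
 "        user: your_user\n",
 "        password: your_pass\n",
 "        port: 5432\n",
 "    output_schema: results\n",
 "    input_query: \"select id, score, label_value, gender, age_cat from results.predictions\"\n",
 "```"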
] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }