· Prepare Input Data:
1 Responsible data acquisition, handling, classification, and management, in line with best practice, must be a priority to ensure that results and outcomes align with the AI system’s goals and objectives. Sound data quality and procurement begin with ensuring the integrity of the data source and the accuracy of the data in representing all observations, so that underrepresented groups are not systematically disadvantaged and overrepresented groups are not systematically advantaged. The data sets should be sufficient in quantity and quality to serve the purpose of the system. The sample size of the data collected or procured has a significant impact on the accuracy and fairness of a trained model’s outputs.
2 Sensitive personal data attributes defined in the plan and design phase should not be included in the model data, so that existing bias associated with them is not reinforced. Proxies of the sensitive features should also be identified and excluded from the input data. In some cases this may not be possible because of the accuracy requirements or the objective of the AI system. In such cases, a justification for using the sensitive personal data attributes or their proxies should be provided.
3 Causality-based feature selection should be ensured. Selected features should be verified with business owners and non-technical teams.
4 Automated decision support technologies present major risks of bias and unwanted application at the deployment phase, so it is critical to set out mechanisms to prevent harmful and discriminatory results at this phase.
Published by SDAIA in AI Ethics Principles, Sept 14, 2022
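
As an illustration only, and not part of the SDAIA text, items 1 and 2 above could be checked with something like the following minimal sketch. It assumes tabular records held as Python dictionaries. The function names (`group_shares`, `underrepresented`, `proxy_candidates`), the sample records, the reference shares, and the thresholds 0.8 and 0.7 are all hypothetical choices made for this example. Pearson correlation is used as a simple stand-in for proxy analysis and only detects linear relationships with numeric features.

```python
from collections import Counter
from math import sqrt


def group_shares(rows, attribute):
    """Return each group's share of the records for one attribute."""
    counts = Counter(row[attribute] for row in rows)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def underrepresented(shares, reference, tolerance=0.8):
    """Groups whose sample share is below tolerance * reference share.

    `reference` holds the shares expected in the target population;
    the 0.8 tolerance is an arbitrary illustrative cut-off.
    """
    return sorted(
        group for group, ref_share in reference.items()
        if shares.get(group, 0.0) < tolerance * ref_share
    )


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (spread_x * spread_y)


def proxy_candidates(rows, sensitive, positive_value, features, threshold=0.7):
    """Numeric features that correlate strongly with a sensitive attribute.

    The sensitive attribute is encoded as 1.0 when it equals
    `positive_value` and 0.0 otherwise; 0.7 is an illustrative threshold.
    """
    indicator = [1.0 if row[sensitive] == positive_value else 0.0 for row in rows]
    return sorted(
        feature for feature in features
        if abs(pearson([row[feature] for row in rows], indicator)) > threshold
    )


# Hypothetical records: "group" is the sensitive attribute.
rows = [
    {"group": "A", "area_income": 50, "tenure_years": 1},
    {"group": "A", "area_income": 52, "tenure_years": 5},
    {"group": "A", "area_income": 55, "tenure_years": 3},
    {"group": "A", "area_income": 51, "tenure_years": 7},
    {"group": "A", "area_income": 53, "tenure_years": 2},
    {"group": "A", "area_income": 54, "tenure_years": 6},
    {"group": "B", "area_income": 20, "tenure_years": 4},
    {"group": "B", "area_income": 22, "tenure_years": 4},
]
reference_shares = {"A": 0.5, "B": 0.5}  # assumed population shares

flagged_groups = underrepresented(group_shares(rows, "group"), reference_shares)
proxies = proxy_candidates(rows, "group", "A", ["area_income", "tenure_years"])

print(flagged_groups)  # ['B']
print(proxies)         # ['area_income']
```

A flagged feature is a candidate for review rather than an automatic exclusion: as item 2 notes, a sensitive attribute or its proxy may be retained where a justification is provided.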