Alexander von Humboldt Institute for Internet and Society (HIIG)
Shirley Ogolla is a researcher at the Alexander von Humboldt Institute for Internet and Society (HIIG) in Berlin, investigating emerging forms of Internet-enabled participation. At HIIG, her research focuses on worker participation on digital platforms, examining new forms and processes of employee participation online.
Shirley has a background in Media Studies from Humboldt University of Berlin, Sorbonne University in Paris, and New York University. She also spent the summer of 2017 at the Berkman Klein Center for Internet and Society, working on machine learning bias, artificial intelligence and ethics. Shirley is also the co-founder of the collective no:topia, an Italian-German artist collective based in Torino and Berlin that builds interactive art installations on technologically embedded futures for the broader public (http://collectivenotopia.com).
Rethinking ML bias - Ensuring inclusive design to mitigate biases
This paper introduces an inclusivity framework to foster a higher degree of participation in the design of Machine Learning (ML) systems. Unlike earlier technological shifts, whose impacts were comparatively muted and manageable, exclusion in ML-enabled automated systems will be far larger and more encompassing because of the scale of deployment and the speed of operation. We need to ensure that the already marginalised are not further neglected in our race to enrich our lives with technology. I believe this will be a concrete first step in changing some of the core methods practised in ML today to better address the issues of fairness and interpretability.
The overestimation of the autonomy of ML systems (Walsh, 2018) dominates public debates and demands disambiguation in order to make room for more immediate and important discussions such as inclusion. Moreover, information asymmetries among users, designers, practitioners and scholars must be compensated for through educational and professional training. Our societal ideas and ideals must be implemented in the design of ML systems, as everyone will eventually be affected by their deployment in daily practice. When it comes to decision-making by algorithms, some processes still lack explanation, which prevents the results from being human-interpretable (Lipton, 2017). This is especially concerning when decisions are made about humans and their access to credit (Datta, 2017), health (Hart, 2017) or justice (Angwin et al., 2016), for instance.
Furthermore, the diversity of the underlying data is crucial to the success of ML systems in the long run (Hardesty, 2016). As an algorithm is only as good as the data it works with, it relies on large-scale data in order to detect patterns and make predictions (Barocas & Selbst, 2016). This data reflects human history, and therefore the biases and prejudices of prior decision makers, which reinforce the marginalisation of minorities. The dataset compilation process today represents only a certain group of people, and it can be improved (Howard et al., 2017).
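The data-side concern above can be made concrete with a small audit step before any model is trained. The following Python sketch (all group and label names are hypothetical, for illustration only) checks how well each demographic group is represented in a dataset and whether the historical outcomes it records already favour one group over another:

```python
# Minimal sketch of a pre-training dataset audit. The data below is
# invented: records of past decisions with a demographic "group" and a
# binary "label" (1 = favourable outcome, e.g. credit granted).
from collections import Counter

def group_counts(records):
    """Count how often each demographic group appears in the data."""
    return Counter(r["group"] for r in records)

def positive_rate(records, group):
    """Share of favourable outcomes (label == 1) within one group."""
    rows = [r for r in records if r["group"] == group]
    return sum(r["label"] for r in rows) / len(rows)

def parity_gap(records, group_a, group_b):
    """Demographic-parity gap: difference in favourable-outcome rates."""
    return positive_rate(records, group_a) - positive_rate(records, group_b)

# Hypothetical historical lending decisions reflecting prior bias:
data = (
    [{"group": "A", "label": 1}] * 80 + [{"group": "A", "label": 0}] * 20 +
    [{"group": "B", "label": 1}] * 10 + [{"group": "B", "label": 0}] * 10
)

print(group_counts(data))          # group B is heavily under-represented
print(parity_gap(data, "A", "B"))  # 0.8 - 0.5, a sizeable outcome gap
```

An algorithm trained on such data would learn the 0.8 vs. 0.5 outcome disparity as a pattern to reproduce; surfacing both the representation skew and the outcome gap before training is one tangible way to operationalise the dataset-compilation improvements called for here.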
I propose a framework, rooted in social science research, that will ensure compliance with the proposed inclusivity matrix (Table 1) for both the academic and applied communities. Despite the socio-economic and historical distinctions between cultures and nations, there are global, cross-cutting issues that arise when addressing inclusion in the context of ML systems. Issues regarding transparency, explainability, accountability and liability in ML systems must be taken into account. All stakeholders involved, from the design to the development, maintenance and end-of-life of these systems, should be considered in this process of attributing responsibility and liability, in order to ensure culturally and contextually sensitive inclusive standards.
Current practice and research agree on abstract moral imperatives for the inclusive design of such systems, but there is very little concrete guidance for practitioners on applying these principles to their work. I propose this framework of inclusive design to ensure a high degree of participation within ML systems, giving recommendations for diverse stakeholders and justifying my call for inclusion at the policy level, with the aim of providing tangible guidance for designers and policy makers.
Table & References
Walsh, Toby “Elon Musk is wrong. The AI singularity won't kill us all” Wired.co.uk. (2018), accessed 23.04.2018: https://www.wired.co.uk/article/elon-musk-artificial-intelligence-scaremongering
Lipton, Zachary C. "The Doctor Just Won't Accept That!" arXiv:1711.08037 (2017)
Datta, Anupam. "Did Artificial Intelligence Deny You Credit?." The Conversation (2017), accessed 23.04.2018: https://theconversation.com/did-artificial-intelligence-deny-you-credit-73259
Hart, Robert. "When Artificial Intelligence Botches Your Medical Diagnosis, Who's to Blame?" Quartz (2017), accessed 23.04.2018: https://qz.com/989137/when-a-robot-ai-doctor-misdiagnoses-you-whos-to-blame/
Angwin, Julia & Mattu, Surya. "Machine Bias." ProPublica (2016), accessed 23.04.2018: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencin
Hardesty, Larry. "Data diversity: Preserving variety in subsets of unmanageably large data sets should aid machine learning." MIT News Office (2016), accessed 23.04.2018: https://news.mit.edu/2016/variety-subsets-large-data-sets-machine-learning-1216
Barocas, Solon & Andrew D. Selbst. "Big data's disparate impact." Cal. L. Rev. 104 (2016): 671
Howard, Ayanna, Cha Zhang, & Eric Horvitz. "Addressing bias in machine learning algorithms: A pilot study on emotion recognition for intelligent systems." Advanced Robotics and its Social Impacts (ARSO) IEEE Workshop (2017)
Table 1. Inclusivity-Matrix

(1) Bias prevention
(2) Bias detection

(i) Transfer knowledge; transfer methods; build models
    (1) Social Scientists
    (2) Legal experts
    (3) Policy Makers
    (5) Public at large

(ii) Educational changes in the training of all the above, in order to:
    (i) overcome information asymmetries
    (ii) demystify the field of ML
    (iii) develop knowledge & expertise on the nature of ML practices
    (iv) rediscover methods & apply them interdisciplinarily
    (v) “enlighten” the ML community on ethics, etc., at different maturity levels (education; academia; industry)