Kellie Webster & Kevin Robinson, Research Scientists, Google Research

Abstract: Language modeling improvements and emerging zero-shot capabilities have led to an explosion of interest in potential applications. Alongside this power, many have cautioned about the potential shortcomings and harms of the current technology. In this talk, we draw a deliberate distinction between shortcomings intrinsic to a model, notably those discussed in the bias literature, and the potential for a model used in a system to cause harm to real people. This distinction lets us discuss which factors in pre-training influence the encoding of social stereotypes, and observe that improving language modeling may improve bias measurements even in the presence of data skew. Measuring potential harms, on the other hand, remains an outstanding challenge and requires new forms of cross-disciplinary understanding. We discuss our experience exploring how to formulate scalable, relevant, and actionable measures of potential harm, with a case study on machine translation.

Kai-Wei Chang, Assistant Professor, University of California, Los Angeles

Abstract: Natural Language Generation (NLG) technologies have advanced drastically in recent years and now power various real-world applications that touch our daily lives. Despite their remarkable performance, recent studies have shown that NLG models risk aggravating the societal biases present in their training data. Without properly quantifying and reducing the models' reliance on such biased correlations, the broad adoption of these models may have the undesirable effect of magnifying prejudice or harmful implicit biases tied to sensitive demographic attributes. In this talk, I will discuss metrics and datasets for evaluating gender bias in language generation models. I will review existing bias measurements and demonstrate how intrinsic bias metrics can be inconsistent with extrinsic ones. I will further discuss the harms of gender exclusivity and the challenges of representing non-binary gender in NLP.