As part of our AI research series, this research note explores bias in a natural language context, focusing specifically on word embeddings.
Word embeddings are mathematical representations of words that capture their meanings and relationships to other words or phrases. They are a valuable tool for many natural language processing (NLP) applications and continue to be widely used in industry as a cost-effective and easy-to-deploy alternative to large language models (LLMs).
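To make this concrete, the sketch below uses toy three-dimensional vectors to show how an embedding represents words and how semantic relatedness is measured as cosine similarity. The vectors are invented for illustration only; real embeddings such as word2vec or GloVe typically have hundreds of dimensions and are learned from large text corpora.

```python
import numpy as np

# Toy 3-dimensional embeddings, invented for illustration only.
# Real embeddings are learned from large corpora and are much higher-dimensional.
embeddings = {
    "king":  np.array([0.8, 0.6, 0.1]),
    "queen": np.array([0.7, 0.6, 0.4]),
    "apple": np.array([0.1, 0.9, 0.8]),
}

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Semantically related words sit closer together in the vector space.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # ~0.95 (high)
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # ~0.58 (lower)
```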
Word embeddings are widely used in NLP and LLM systems, yet they can encode harmful biases against demographic groups, for example on the basis of gender, disability, or ethnicity. These biases could cause tangible harm if biased word embeddings are deployed in consumer-facing applications.
Although there has been research into bias in such settings, there is no consensus on the best way to tackle it. This research note examines whether biases in word embeddings can be identified and removed at source using current methodologies.
We find that, while it is possible to measure some aspects of language bias, and mitigation techniques can remove some elements of gender and ethnicity bias, current methods remain limited.
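As an illustration of the kind of technique involved, the sketch below measures gender bias as a word vector's projection onto a "he minus she" direction and removes it by subtracting that projection, in the spirit of projection-based debiasing methods such as Bolukbasi et al. (2016). The vectors are toy values, not the data or methodology of this note.

```python
import numpy as np

# Toy vectors for illustration; not the embeddings or method used in this note.
he    = np.array([0.9, 0.1, 0.0])
she   = np.array([0.1, 0.9, 0.0])
nurse = np.array([0.2, 0.8, 0.3])   # deliberately skewed towards "she"

# A simple gender direction: the normalised difference between two anchor words.
gender_dir = (he - she) / np.linalg.norm(he - she)

# Measure bias as the scalar projection of a word onto the gender direction.
bias_before = float(np.dot(nurse, gender_dir))

# Mitigate by removing the component of the vector along that direction.
nurse_debiased = nurse - bias_before * gender_dir
bias_after = float(np.dot(nurse_debiased, gender_dir))

print(f"bias before: {bias_before:+.3f}, after: {bias_after:+.3f}")  # after ~ 0
```

One known limitation of this style of approach, consistent with the findings above, is that removing a single projected direction does not necessarily eliminate more subtle associations encoded across the remaining dimensions.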
We welcome further research in this area, as well as the development of new techniques to measure and mitigate bias.
Our AI research series
We want to enable the safe and responsible use of AI in UK financial markets, driving growth, competitiveness and innovation in the sector. As part of our effort to deepen our understanding of AI and its potential impact on financial services, our team is undertaking research in the area of AI bias.
We will be publishing a series of research notes on how AI intersects with financial services to spark discussion on these issues, drawing on a variety of academic and regulatory perspectives. We hope that these notes are of interest to those who build models, financial firms, and consumer groups in understanding complex debates on building and implementing AI systems.
Disclaimer
Research notes contribute to the work of the FCA by providing rigorous research results and stimulating debate. While they may not necessarily represent the position of the FCA, they are one source of evidence that the FCA may use while discharging its functions and to inform its views. The FCA endeavours to ensure that research outputs are correct, through checks including independent referee reports, but the nature of such research and choice of research methods is a matter for the authors using their expert judgement. To the extent that research notes contain any errors or omissions, they should be attributed to the individual authors, rather than to the FCA.
Authors
Lesley Dwyer, Will Francis, Shalini Tyagi.