A little story of bias
A father was driving his two sons to watch a football match when they were involved in a terrible accident. The father was killed instantly, as was one of the boys. The younger child, who was strapped into his car seat in the back, survived the accident but was seriously injured.
The young child was rushed to hospital and straight into an operating theatre, where the staff hoped they could save his life.
The doctor entered the room, looked at the patient, froze, and said, “I cannot operate on this boy, he is my son!”
Bias within Data
If you found yourself asking how the boy could possibly be the doctor’s son, you have fallen into a trap of bias. The doctor in the story is, of course, the child’s mother, but that may not be the first solution that comes to mind. In many societies we are brought up to see doctors as male and nurses as female. This has serious implications when we use computers to search for information, because a search engine that draws on content generated by humans will reproduce the bias that unintentionally sits within that content.
One source of bias is the way a system is built. For example, if a company offers a face-recognition service trained on photos posted on the internet (categorized in some way, for example by Google), the collection will contain far more white men than, say, women of Asian background. The results will be most accurate for the category with the largest presence in the database.
If a banking system processes the case of a couple who declare a joint income, it may presume that the man’s income is higher than the woman’s and treat the individuals accordingly, because the historical data show that men’s incomes are, on average, higher than women’s, and this generalization becomes part of the system’s structure.
The problem with language is also easy to see. If the doctor riddle above can in some way be ‘seen’ in the vast amount of text analyzed and used to build an algorithm, then proposals and offers will differ according to gender.
Consider for a moment how we describe ourselves. A male manager will typically use a different set of descriptive terms from those used by a woman: he might call himself assertive, while she is more likely to describe herself as understanding and supportive. A system that unwittingly uses a dataset based upon (or even referring to) the language of job adverts and the profiles of successful candidates will replicate a gender bias, because more proposals will be sent to people who use the language that reflects the current make-up of the workforce.
In short: more men than women use the language the system picks up on, because more men than women hold the powerful positions where that type of language is used. The bias will be recreated and reinforced.
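A toy sketch can make this mechanism concrete. The snippet below is purely illustrative: the profiles, labels, and word lists are invented, and the "model" is nothing more than a word-frequency score. The point is that if the training records reflect a historically skewed hiring pattern, the system learns to reward the language of the group that was hired before, not actual ability.

```python
from collections import Counter

# Hypothetical training data: past candidate profiles with a label of
# 1 (hired) or 0 (not hired). The labels reflect a historically
# male-dominated record, so words like "assertive" co-occur with
# success far more often than "supportive" does.
profiles = [
    ("assertive decisive leader", 1),
    ("assertive competitive driven", 1),
    ("supportive understanding organised", 0),
    ("supportive collaborative careful", 0),
    ("assertive ambitious focused", 1),
    ("understanding patient thorough", 0),
]

# "Train" by counting how often each word appears in successful
# versus unsuccessful profiles.
word_scores = Counter()
for text, hired in profiles:
    for word in text.split():
        word_scores[word] += 1 if hired else -1

def score(profile: str) -> int:
    """Sum the learned word scores for a new candidate profile."""
    return sum(word_scores[w] for w in profile.split())

# Two equally qualified candidates, described in gendered language:
print(score("assertive experienced engineer"))   # 3
print(score("supportive experienced engineer"))  # -2
```

The two candidates differ only in one self-descriptive word, yet the first receives a positive score and the second a negative one. Nothing in the code mentions gender; the bias arrives entirely through the historical data it was built from.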
In 2018 the State of New York proposed a law on accountability for algorithms, and in 2020 the European Commission released a white paper, Artificial Intelligence – A European approach to excellence and trust. The issue may be more important than it first appears.
There is plenty of literature on this problem if you are interested; a quick online search will offer you ample food for thought.