Some Thoughts on Bias

A Little Story of Bias

A father was driving his two children to watch a football match when they were involved in a terrible accident. The driver was killed instantly, as was one of the boys. The younger child, who was sitting in the back in his car seat, survived the accident but was seriously injured.

The young child was taken to hospital and rushed into an operating theatre, where the surgical team hoped to save his life.

The doctor entered the room, looked at the patient, froze, and said, “I cannot operate on this boy, he is my son!”

Bias within Data

If you found yourself asking how the boy could be the doctor’s son, you have fallen into a trap of bias. The doctor in the story is the child’s mother (obviously), but that may not be the first solution that comes to mind. In many societies we are brought up to see doctors as male and nurses as female. This has big implications when we use computers to search for information, though: a search engine that draws on content generated by humans will reproduce the bias that unintentionally sits within that content.

The source of the bias can lie in how the system is built. For example, if a company offers a face recognition service trained on photos posted on the internet (categorized in some way by Google, for example), the collection will contain far more white men than, say, women of Asian background. The results will be most accurate for the category with the largest presence in the database.
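
To make this concrete, here is a minimal sketch in Python, using entirely invented numbers rather than a real face recognition pipeline. One classifier is trained on a dataset where the ‘majority’ group outnumbers the ‘minority’ group nine to one; the shared model ends up fitting the majority group’s patterns and scores noticeably worse on the minority group.

```python
# A toy sketch (invented data, not a real face recognition pipeline)
# showing how group imbalance in training data alone skews accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_group(n, shift):
    # Simulate each group's samples around a group-specific centre;
    # the label is a noisy function whose threshold depends on the group.
    X = rng.normal(loc=shift, scale=1.0, size=(n, 5))
    y = (X.sum(axis=1) + rng.normal(scale=1.0, size=n) > shift * 5).astype(int)
    return X, y

# 9,000 samples for the majority group, 1,000 for the minority group.
X_a, y_a = make_group(9000, shift=0.0)
X_b, y_b = make_group(1000, shift=2.0)

X = np.vstack([X_a, X_b])
y = np.concatenate([y_a, y_b])
group = np.array([0] * len(y_a) + [1] * len(y_b))

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0)

# One shared model is fitted for everyone.
model = LogisticRegression().fit(X_tr, y_tr)
pred = model.predict(X_te)

for g, name in [(0, "majority"), (1, "minority")]:
    mask = g_te == g
    print(f"{name} accuracy: {(pred[mask] == y_te[mask]).mean():.2%}")
```

Nothing in the code discriminates on purpose; the imbalance alone produces the accuracy gap.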

If a banking system takes the case of a couple who declare a joint income, it may presume that the man’s income is higher than the woman’s and treat the two individuals accordingly, because historically the data shows men’s incomes to be higher than women’s, and this generalization becomes part of the system’s structure.
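
As a hedged sketch of the mechanism (not a description of any real bank’s system): in the contrived example below, a model is trained on invented historical decisions in which the first-named applicant, typically the man, was credited with roughly 65% of the joint income. The model never sees gender at all, yet a new couple with a genuine 50/50 split is still scored according to the old pattern.

```python
# A contrived sketch of a generalization being baked in: the model never
# sees gender, yet learns an invented historical pattern in which the
# first-named applicant was credited with ~65% of the joint income.
# All figures below are made up.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 1000

joint_income = rng.normal(80_000, 15_000, n)   # historical applications
share_app1 = rng.normal(0.65, 0.05, n)         # historical income split
limit_app1 = joint_income * share_app1 * 0.2   # past credit decisions

model = LinearRegression().fit(joint_income.reshape(-1, 1), limit_app1)

# A new couple with a genuine 50/50 income split is still scored by the
# historical 65/35 pattern baked into the fitted coefficients.
print(model.predict([[80_000]]))   # limit offered to "applicant 1"
```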

The problem with language is also easy to see. If the doctor example above can in some way be ‘seen’ in the vast amount of text analyzed and used to train an algorithm, then the proposals and offers that the algorithm generates will differ according to gender.

Let’s take how we describe ourselves for a moment. A male manager will use a set of descriptive terms for himself that differs from those used by a woman: he might be ‘assertive’, while she is more likely to be ‘understanding’ and ‘supportive’. A system that unwittingly uses a dataset based upon (or even referring to) the language used in job adverts and in the profiles of successful candidates will replicate a gender bias, because more proposals will be sent to people whose language reflects the current make-up of the employment situation.

In short: more men will be using the language that the system picks up on, because more men than women in powerful positions use that type of language. The bias will be recreated and reinforced, as the sketch below illustrates.
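
Here is a deliberately tiny illustration in Python, built on a handful of invented profiles rather than real job-advert data: a model trained on the language of past ‘successful’ candidates learns to reward the words that the majority group happens to use.

```python
# A made-up illustration (not real data): a model trained on the language
# of past "successful" profiles rewards the majority group's vocabulary.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training set: profile text -> was the candidate promoted?
# In this invented history, assertive language correlates with promotion
# simply because most past senior hires happened to write that way.
profiles = [
    "assertive decisive leader driven results",
    "assertive competitive ambitious driven",
    "decisive leader competitive results",
    "supportive understanding collaborative team",
    "understanding empathetic supportive listener",
    "collaborative supportive team empathetic",
]
promoted = [1, 1, 1, 0, 0, 0]

vec = CountVectorizer()
model = LogisticRegression().fit(vec.fit_transform(profiles), promoted)

# Two new, equally qualified candidates described in different styles:
candidates = [
    "driven assertive decisive problem solver",
    "supportive collaborative problem solver",
]
for text, score in zip(candidates,
                       model.predict_proba(vec.transform(candidates))[:, 1]):
    print(f"{score:.2f}  {text}")
# The 'assertive' style scores higher: the model has encoded the
# historical make-up of the workforce, not actual ability.
```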

In 2018 the State of New York proposed a law on accountability within algorithms, and in 2020 the European Commission released its white paper Artificial Intelligence – A European approach to excellence and trust. The argument may be more important than it first appears.

There is lots of literature about this problem if you are interested; a quick online search will offer you plenty of food for thought.

3 thoughts on “Some Thoughts on Bias”

  1. It feels like we need some positive discrimination built into algorithms, to help combat biases.

    Some difficult questions are:

    • How do you change algorithms that are fed by huge amounts of data?
    • How do you decide which changes are necessary and appropriate, versus those which are manipulation?
  2. A similar story was used to describe bias in a college lecture I attended. Unsurprisingly, most attendees were confused at first, assuming the doctor had to be the child’s father before realizing the doctor was his mother. With this automated response hardwired into parts of our minds, how does one fully eliminate this form of bias? The same could be said for the machine learning algorithms used in data analytics. While the technology is programmed to learn from data and find trends and patterns, similar problems of bias remain present. To what degree, if not entirely, could bias be eliminated from this process?

    • Here you touch on the problem of which data is used to train your machines. If the training materials contain bias, then the machine will reproduce that bias too. How can we train machines with unbiased data? And how can we determine whether data is biased if we are all guilty of bias ourselves?
