A guide to Base Rate Fallacy in machine learning

The base rate fallacy, also known as base rate bias or base rate neglect, arises when two kinds of information are available: the base rate (how common something is in general) and specific, individuating information about the case at hand. The fallacy is the tendency to ignore the base rate in favor of the individuating information.

We measure the performance of machine learning models by testing them. Many statistical tests are available for this, but no statistical test is perfect. Some errors in models are easy to understand but hard to detect, and the base rate fallacy is one of them. The concept comes from behavioral science. In this article, we discuss this fallacy and how it applies to machine learning.

In statistics, the base rate is the probability of a class before any evidence from the features is taken into account. We can also think of base rates as prior probabilities. Take the number of engineers in the world as an example: if 2% of all people are engineers, then the base rate of engineers is simply 2%.
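
To make the idea concrete, here is a minimal Python sketch. It estimates base rates from a list of class labels and then uses Bayes' theorem to show why ignoring them is misleading. The helper names `base_rates` and `posterior_positive` are illustrative, not part of any library, and the classifier figures (90% sensitivity, 10% false positive rate) are assumed purely for the example.

```python
from collections import Counter


def base_rates(labels):
    """Return the fraction of samples in each class (the empirical base rates)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def posterior_positive(base_rate, sensitivity, false_positive_rate):
    """P(class | positive prediction), computed with Bayes' theorem."""
    true_pos = sensitivity * base_rate
    false_pos = false_positive_rate * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


# Hypothetical population matching the example: 2 engineers per 100 people.
labels = ["engineer"] * 2 + ["other"] * 98
rates = base_rates(labels)
print(rates["engineer"])  # 0.02

# Assumed classifier: flags 90% of engineers, but also 10% of non-engineers.
p = posterior_positive(rates["engineer"], sensitivity=0.90, false_positive_rate=0.10)
print(round(p, 3))  # 0.155
```

Under these assumed numbers, a person flagged as an engineer is an engineer only about 15.5% of the time, even though the classifier catches 90% of engineers. Reading the 90% figure as the chance that a flagged person is an engineer is exactly the base rate fallacy.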