The scale and its data

Thiago Santana
4 min read · Apr 8, 2021


We are often invited to review something: an app in our phone’s store, a service, a restaurant. And we base our decisions on these ratings. Because of “insensitivity to prior probability of outcomes”, as Amos Tversky and Daniel Kahneman call it in their famous article, one might expect actual review scores to be spread across the whole rating scale. However, this is not necessarily the case.

Picture this: you take an Uber ride. After the ride, you are asked to rate it on a scale of 1 to 5 stars.

Two questions for you:

1. What is the mean of the possible ratings on the scale?

This one is evident:

On a scale from 1 to 5 stars, the mean rating is 3 stars.

2. What is the mean rating of Uber services? If a driver gives the most average service possible, what rating would be associated with their Uber profile? Is it 3 stars?

No.

According to Business Insider, a score of 4.8 is considered average for Uber drivers:

The average Uber rider rating, according to Uber, is 4.89 stars.

This mismatch between the mean of the scale and the mean of the evaluations is not self-evident, and can raise questions. Let’s check some of them:

Is the scale related to the mean evaluation?

The gap between the two means (scale and evaluations) shows that we cannot say what an average rating is without knowing the data. A 4-star Uber driver may feel they are in the top 25%, when they are in fact below average.
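To make the "4 stars feels like top 25%" intuition concrete, here is a minimal Python sketch. The function name `naive_top_share` is made up for illustration, and it deliberately assumes that ratings are spread evenly over the scale, which is exactly the assumption the real data contradicts. The 4.8 figure is the average cited earlier in this article.

```python
def naive_top_share(rating, scale_min=1.0, scale_max=5.0):
    """Share of the scale lying above `rating`, assuming (wrongly, for real
    review data) that ratings are spread evenly over the scale."""
    if not scale_min <= rating <= scale_max:
        raise ValueError("rating must lie within the scale")
    return (scale_max - rating) / (scale_max - scale_min)


scale_mean = (1 + 5) / 2        # midpoint of a 1-to-5 scale: 3.0
reported_average = 4.8          # average driver rating cited in this article

driver_rating = 4.0
print(naive_top_share(driver_rating))    # 0.25: "top 25%" judged by the scale alone
print(driver_rating > scale_mean)        # True: above the midpoint of the scale
print(driver_rating > reported_average)  # False: below the reported average
```

The same rating looks good against the scale and poor against the data; only the second comparison says anything about how the driver ranks.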

Scales can induce this bias. Netflix knew that and took action. Do you remember when it was possible to rate Netflix programs from 1 to 5? I do, and I remember having a hard time giving another movie 5 stars, the same rating I had given “The Godfather”.

Realizing that simpler evaluation systems exist, Netflix dropped the 5-star rating system and let subscribers either ‘like’ or ‘dislike’ the title they had just watched. When it tested the new rating system before launch, Netflix saw a 200 percent jump in ratings activity.

More evaluations give the suggestion algorithm more to work with, which leads to more precise recommendations. Everybody wins.

So, can scales induce bias?

Yes, they can. This scale-induced bias is called response bias. It was first detected in research that relied on participants’ self-reports, where it affected (and still affects) the results.

Another example: Data Science course evaluations on Udemy

Consider the ratings of Data Science courses on Udemy.

At the time of writing, the ratings of these courses are distributed as follows:

[Figure: Ratings for Data Science courses on Udemy.]

Let’s count the number of rated courses in each bin.

[Figure: Quantity of courses in each evaluation bin.]
[Table: Data that supports the graphic above.]

About the table above:

  • Courses were grouped into bins based on their rating.
  • The “Bin Average” field is taken to be the midpoint of the bin’s upper and lower numeric boundaries.
  • For each bin, we multiply the “Bin Average” by the number of courses in the bin, sum these products, and divide by the total number of courses. This gives an average course rating of 3.87.
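The bin-based calculation described above can be sketched in Python as follows. The bin boundaries and course counts are invented placeholders, not the Udemy figures from the table, so the result will not match 3.87. Only the method (each bin's midpoint weighted by its number of courses) follows the text, and the name `binned_mean` is made up for this example.

```python
def binned_mean(bins):
    """Approximate mean rating from (lower, upper, count) bins, treating
    every course in a bin as if it were rated at the bin's midpoint."""
    total_courses = sum(count for _, _, count in bins)
    if total_courses == 0:
        raise ValueError("no courses in any bin")
    weighted_sum = sum((lower + upper) / 2 * count for lower, upper, count in bins)
    return weighted_sum / total_courses


# HYPOTHETICAL data, for illustration only:
# (lower bound, upper bound, number of courses in the bin)
example_bins = [
    (3.0, 3.5, 10),
    (3.5, 4.0, 30),
    (4.0, 4.5, 40),
    (4.5, 5.0, 20),
]

average = binned_mean(example_bins)
print(average)        # 4.1 with the placeholder counts above
print(3.5 < average)  # True: a 3.5-rated course sits below this average
```

Using the midpoint is an approximation: the true mean depends on where ratings fall inside each bin, which the binned table alone cannot tell us.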

Thus, a course rated 3.5 is not in the top 37.5%, as its position on the scale would suggest; it is below average.

Conclusion

These Rio de Janeiro Samba Schools are very well evaluated, aren’t they? SOURCE: https://www.youtube.com/watch?v=91oAPK65mHA

I remember, as a kid, the very high scores the Rio de Janeiro Samba Schools got for their technical performances. It is glaring evidence that the ratings were not spread across the scale, but concentrated at the top.

But data distribution is not about the scale. It is about the data.

Being aware of this effect enables us to have a better understanding and make better decisions. After all, a 4.8-stars-out-of-five Uber driver is an average one.

Written by Thiago Santana

I am a data driven Customer Service professional with an aerospace industry background. Always looking for win-win situations and effective communication.