March 2nd, 2005

angle brackets, fed up, html

The scoring system, again

Okay, here's a radically different possibility.

I had a long discussion with Dad about this, and he thought that my approach was all wrong.  So we talked for a while, and I realized he was right.

First of all, there's no inherent benefit to more votes--they simply mean that more people have looked at the story, not that it's better.  (Actually, they may indicate that the average will be more reliable, but I think that effect falls to nothing pretty quickly.)  So vote count will no longer be a direct factor in the calculation, the way it was in formulae like avg(votes) * ln(count(votes)).
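To see why the vote count was doing too much work, here's a rough Python sketch of the avg(votes) * ln(count(votes)) idea.  The function name and the two sample stories are made up for illustration; only the formula comes from the text above.

```python
import math

def log_weighted_score(votes):
    """Mean rating multiplied by the natural log of the number of votes."""
    if not votes:
        return 0.0
    return (sum(votes) / len(votes)) * math.log(len(votes))

# Two hypothetical stories: one rated higher, one simply read by more people.
well_liked = [5] * 10      # ten votes, all 5s
widely_read = [4] * 100    # a hundred votes, all 4s

print(round(log_weighted_score(well_liked), 2))   # 11.51
print(round(log_weighted_score(widely_read), 2))  # 18.42
```

The story with the lower average wins comfortably, purely because more people clicked on it.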

The real problem is this: there's a certain amount of noise in the way people vote; thus, a simple average can be easily spoiled.  So what I've decided to do is to throw away a few votes.

Here's how it works: imagine a story has twenty 5 votes, ten 4 votes, two 3 votes, and one 1 vote (from a vindictive rival author, perhaps, or from somebody slipping and clicking the wrong button, or from somebody who didn't read the FAQ and doesn't know they're supposed to rate on quality rather than whether they like Scully/Skinner).  That means it has 33 votes total.
5: XXXXXXXXXXXXXXXXXXXX
4: XXXXXXXXXX
3: XX
2:
1: X
A naive average will give 4.45 or so--that 1 vote dragged it down from 4.56.
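For anyone who wants to check the arithmetic, here's a small Python sketch of the plain average, computed from the vote counts in the example.  The function and variable names are just mine for illustration.

```python
def mean_from_counts(counts):
    """Plain average of a {rating: number_of_votes} tally."""
    total = sum(counts.values())
    return sum(rating * n for rating, n in counts.items()) / total

# The example story, with and without the stray 1 vote.
with_rival = {5: 20, 4: 10, 3: 2, 2: 0, 1: 1}
without_rival = {5: 20, 4: 10, 3: 2, 2: 0, 1: 0}

print(round(mean_from_counts(with_rival), 2))     # 4.45  (147 / 33)
print(round(mean_from_counts(without_rival), 2))  # 4.56  (146 / 32)
```

A single vote out of 33 moves the average by about a tenth of a point.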

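Instead, we're going to throw away up to 20% of those 33 votes.  Specifically, we'll take 20% of the total (6.6), divide into fifths (1.32, one share for each of the five rating values), round down (1), and remove that number of votes from each of the groups.  A group with fewer votes than that just loses whatever it has.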
5: X|XXXXXXXXXXXXXXXXXXX
4: X|XXXXXXXXX
3: X|X
2:  |
1: X|
This leaves us with nineteen 5 votes, nine 4 votes, and one 3 vote, which averages to 4.62--not far from where we would have been without that asshat giving it a 1.  (Realize, though, that all scores are affected in this way--this story would now be tied with one that had identical voting except for the 1 vote.) This score would then be rounded to a reasonable number of digits, and tied stories would be sorted according to age (newest first, so they can get more exposure).
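Here's a minimal Python sketch of the whole thing, assuming votes are kept as a tally per rating value.  The names (trimmed_score, rank, the story dictionaries and their dates) are all invented for the example, and two decimal places is just my guess at "a reasonable number of digits".

```python
from datetime import date

RATINGS = (1, 2, 3, 4, 5)

def trimmed_score(counts, trim_percent=20):
    """Drop the same number of votes from every rating group, then average."""
    total = sum(counts.get(r, 0) for r in RATINGS)
    # 20% of the total, split into fifths, rounded down (done in integers).
    per_group = (total * trim_percent) // (100 * len(RATINGS))
    # A group with fewer votes than per_group simply ends up empty.
    kept = {r: max(counts.get(r, 0) - per_group, 0) for r in RATINGS}
    kept_total = sum(kept.values())
    if kept_total == 0:
        return None  # nothing left to average
    return sum(r * n for r, n in kept.items()) / kept_total

def rank(stories, digits=2):
    """Highest rounded score first; ties go to the newest story.

    Assumes every story has enough votes for trimmed_score to return a number.
    """
    return sorted(
        stories,
        key=lambda s: (round(trimmed_score(s["votes"]), digits), s["posted"]),
        reverse=True,
    )

stories = [
    {"title": "A", "votes": {5: 20, 4: 10, 3: 2, 1: 1}, "posted": date(2005, 2, 1)},
    {"title": "B", "votes": {5: 20, 4: 10, 3: 2}, "posted": date(2005, 3, 1)},
]

print(round(trimmed_score(stories[0]["votes"]), 2))  # 4.62
print([s["title"] for s in rank(stories)])           # ['B', 'A']
```

Story B is the same as A minus the 1 vote, so the two tie at 4.62 and the newer one gets listed first.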

The percentage probably needs tweaking, and may need to be replaced with a more dynamic calculation, but I think this system will work fairly well overall.

(By the way, I do realize that this system doesn't work at all with very few votes.  When there aren't many votes, the system will mark the story as "not enough data to score", and the story will be listed either at the top--perhaps with a "new!" marker where the stars would normally be--or in a separate column.)
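One way to pick the cutoff, sketched in Python: with 20% split five ways and rounded down, nothing gets thrown away at all until a story has 25 votes, so that's a candidate threshold.  The min_votes default and the star_label helper are my own assumptions here, not something I've settled on.

```python
GROUPS = 5          # rating values 1 through 5
TRIM_PERCENT = 20   # the percentage discussed above

def votes_dropped_per_group(total_votes):
    """How many votes the trimming rule removes from each rating group."""
    return (total_votes * TRIM_PERCENT) // (100 * GROUPS)

def star_label(total_votes, score, min_votes=25):
    """Text to show where the stars would go; min_votes is an assumed cutoff."""
    if total_votes < min_votes:
        return "new!"
    return f"{score:.2f}"

print(votes_dropped_per_group(24))  # 0
print(votes_dropped_per_group(25))  # 1
print(star_label(7, 4.9))           # new!
print(star_label(33, 4.6207))       # 4.62
```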
Current Music: Evanescence - Anywhere