RankDCG: Rank-Ordering Evaluation Measure

A novel evaluation measure for ranking systems that addresses limitations in existing metrics

Project Details

Ranking is used for a wide array of problems, most notably information retrieval (search). There are a number of popular approaches to evaluating rankings, such as Kendall’s tau, Average Precision (AP), and normalized Discounted Cumulative Gain (nDCG).

The Problem

When applied to problems such as user ranking or recommendation systems, traditional evaluation measures suffer from several limitations:

  • Inability to deal with ties - Many real-world ranking scenarios include tied rankings
  • Inconsistent lower bound scores - Existing measures lack clear and meaningful baseline scores
  • Ambiguous scoring - Difficulty interpreting what scores actually represent
  • Inappropriate cost functions - Cost functions that don’t align with practical ranking needs

The Solution

We propose RankDCG, a new evaluation measure that addresses these fundamental problems. RankDCG provides:

  • Proper handling of tied rankings
  • Consistent and interpretable lower bound scores
  • Clear, unambiguous evaluation metrics
  • Cost functions appropriate for real-world ranking applications

Key Features

The package provides implementations of:

  • Rank Discounted Cumulative Gain (RankDCG) - the primary novel contribution
  • Normalized Discounted Cumulative Gain (nDCG)
  • Discounted Cumulative Gain (DCG)
  • Average Precision (AP)
  • Mean Average Precision (MAP)
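
The classical measures in this list follow standard textbook definitions: DCG discounts each item’s relevance by the logarithm of its rank position, and nDCG divides by the DCG of the ideal (descending) ordering. A minimal self-contained sketch of those two (the package’s own implementations may differ in details such as the gain function):

```python
from math import log2

def dcg(relevances):
    """Discounted Cumulative Gain: the gain at 1-based rank i is
    discounted by log2(i + 1)."""
    return sum(rel / log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    """Normalize by the DCG of the ideal (descending) ordering,
    so a perfectly ordered list scores 1.0."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0
```

An already-sorted list such as `[3, 2, 1]` yields an nDCG of 1.0, while any misordering yields a value strictly below 1.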

RankDCG possesses three fundamental properties:

  1. Consistent lower and upper score bounds (from 0 to 1)
  2. Works with non-normal value distributions
  3. Has the transitivity property
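
Property 1 can be illustrated with min-max normalization: scoring a predicted ordering against the best and worst achievable orderings pins the result to [0, 1]. The sketch below uses a simple positional gain of 1/position and is purely illustrative, not the exact rankDCG cost function:

```python
def positional_gain(scores):
    """Toy ranking score: relevance weighted by 1/position (1-based)."""
    return sum(s / (i + 1) for i, s in enumerate(scores))

def normalized_score(scores):
    """Min-max normalize against the best (descending) and worst
    (ascending) orderings, guaranteeing a value in [0, 1].
    Illustrative only -- not the exact rankDCG formula."""
    best = positional_gain(sorted(scores, reverse=True))
    worst = positional_gain(sorted(scores))
    if best == worst:  # every item tied: any ordering is already ideal
        return 1.0
    return (positional_gain(scores) - worst) / (best - worst)
```

Under this scheme the best ordering always scores exactly 1, the worst exactly 0, and an all-tied list is treated as perfectly ordered, which mirrors the consistent-bounds and tie-handling goals described above.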

Installation & Usage

The setup is straightforward: download the package and add the ranking_measures directory to your Python path or project folder.

from ranking_measures import measures
print(measures.find_rankdcg([9,3,1], [5,1,7]))
# Returns: 0.125

This example demonstrates evaluating a movie recommendation algorithm where reference scores [9,3,1] represent ideal preferences, while [5,1,7] represent the algorithm’s predictions.
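
For comparison with the other measures in the feature list, Average Precision and MAP can be sketched for binary relevance judgements (a textbook version, which may differ from the package’s handling of graded scores):

```python
def average_precision(relevance):
    """AP over a ranked list of binary relevance labels (1 = relevant):
    the mean of precision@k taken at each relevant position."""
    hits, precisions = 0, []
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / hits if hits else 0.0

def mean_average_precision(queries):
    """MAP: the mean AP over a collection of ranked result lists."""
    return sum(average_precision(q) for q in queries) / len(queries)
```

For example, a ranking that places all relevant items first, such as `[1, 1, 0, 0]`, scores an AP of 1.0.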

Resources

This work contributes to the advancement of evaluation methodologies in information retrieval, recommendation systems, and other ranking-based applications.