NDCG
class keras_rs.metrics.NDCG(
    k: Optional[int] = None,
    gain_fn: Callable[[Any], Any] = default_gain_fn,
    rank_discount_fn: Callable[[Any], Any] = default_rank_discount_fn,
    **kwargs: Any
)
Computes Normalized Discounted Cumulative Gain (nDCG).
This metric evaluates ranking quality. It normalizes the Discounted
Cumulative Gain (DCG) by the Ideal Discounted Cumulative Gain (IDCG)
for each list. The metric compares true relevance labels in y_true
(graded relevance scores, i.e., non-negative numbers where higher
values indicate greater relevance) against predicted scores in y_pred.
The scores in y_pred determine the rank order of items by sorting in
descending order. A normalized score between 0 and 1 is returned: a
score of 1 represents the perfect ranking according to true relevance
(within the top-k), while 0 typically represents a ranking with no
relevant items. Higher scores indicate better ranking quality relative
to the best possible ranking.
For each list of predicted scores s in y_pred and the corresponding
list of true labels y in y_true, the per-query nDCG score is
calculated as follows:

nDCG@k = DCG@k / IDCG@k

where DCG@k is calculated based on the ranking induced by y_pred:

DCG@k(y') = sum_{i=1}^{k} (gain_fn(y'_i) / rank_discount_fn(i))

and IDCG@k is the Ideal DCG, calculated using the same formula but on
items sorted perfectly by their true relevance (y_true):

IDCG@k(y'') = sum_{i=1}^{k} (gain_fn(y''_i) / rank_discount_fn(i))
where:

- y'_i: the true relevance of the item at rank i in the ranking
  induced by y_pred.
- y''_i: the true relevance of the item at rank i in the ideal ranking
  (sorted by y_true, descending).
- gain_fn: the user-provided function mapping relevance to gain. The
  default function (default_gain_fn) is typically equivalent to
  lambda y: 2**y - 1.
- rank_discount_fn: the user-provided function mapping rank i
  (1-based) to a discount value. The default function
  (default_rank_discount_fn) is typically equivalent to
  lambda rank: 1 / log2(rank + 1).

The final nDCG score reported is typically the weighted average of
these per-query scores across all queries/lists in the dataset.
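The formulas above can be illustrated with a minimal NumPy reference
for a single list, using the default gain (2**y - 1) and discount
(1 / log2(rank + 1)) functions. This is a sketch for intuition, not
keras_rs's actual implementation:

```python
import numpy as np

def ndcg_at_k(y_true, y_pred, k=None):
    """Reference nDCG@k for a single list (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    # Ranking induced by y_pred: sort items by predicted score, descending.
    order = np.argsort(-np.asarray(y_pred, dtype=float))
    # Ideal ranking: sort true labels themselves, descending.
    ideal = np.sort(y_true)[::-1]
    if k is None:
        k = len(y_true)
    ranks = np.arange(1, k + 1)
    discount = 1.0 / np.log2(ranks + 1)          # default rank_discount_fn
    dcg = np.sum((2.0 ** y_true[order][:k] - 1.0) * discount)
    idcg = np.sum((2.0 ** ideal[:k] - 1.0) * discount)
    return dcg / idcg if idcg > 0 else 0.0

# The predicted order matches the true order, so nDCG is 1.0:
print(ndcg_at_k([3, 1, 0], [0.9, 0.5, 0.1]))  # → 1.0
```

Note how a list with no relevant items (all labels zero) yields an
IDCG of 0, which this sketch maps to a score of 0.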
Note: sample_weight is handled differently for ranking metrics. For
batched inputs, sample_weight can be scalar, 1D, or 2D. The scalar
case and the 1D case (list-wise weights) are straightforward. In the
2D case (item-wise weights), the sample weights are aggregated to
obtain 1D list-wise weights. For more details, refer to
keras_rs.src.metrics.ranking_metrics_utils.get_list_weights.
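To make the 2D case concrete, here is one plausible aggregation: a
relevance-weighted average of the item weights in each list. This is
an illustrative assumption only, not necessarily the exact rule
implemented by get_list_weights:

```python
import numpy as np

# Hypothetical aggregation of item-wise (2D) weights into list-wise
# (1D) weights: weight each item's sample weight by its relevance,
# then average per list. Illustrative assumption, not keras_rs's
# actual get_list_weights rule.
labels = np.array([[3.0, 1.0, 0.0],
                   [0.0, 2.0, 2.0]])
item_weights = np.array([[1.0, 2.0, 4.0],
                         [1.0, 1.0, 3.0]])
rel_sum = labels.sum(axis=1)
list_weights = (item_weights * labels).sum(axis=1) / np.maximum(rel_sum, 1e-9)
print(list_weights)  # one weight per list, shape (batch_size,)
```

Under this scheme, items with zero relevance contribute nothing to
their list's weight.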
Arguments

- k: int. The cutoff for computing nDCG@k: only the top k ranked
  items are considered. Defaults to None, in which case all items
  are used.
- gain_fn: Callable. Maps true relevance labels (y_true) to gain
  values. The default implements 2**y - 1.
- rank_discount_fn: Callable. Maps rank i (1-based) to a discount
  value. The default (default_rank_discount_fn) implements
  1 / log2(rank + 1).
- dtype: The dtype of the metric's computations. Defaults to None,
  which means using keras.backend.floatx(). keras.backend.floatx() is
  "float32" unless set to a different value
  (via keras.backend.set_floatx()). If a keras.DTypePolicy is
  provided, then the compute_dtype will be utilized.

Example
>>> batch_size = 2
>>> list_size = 5
>>> labels = np.random.randint(0, 3, size=(batch_size, list_size))
>>> scores = np.random.random(size=(batch_size, list_size))
>>> metric = keras_rs.metrics.NDCG()(
... y_true=labels, y_pred=scores
... )
Mask certain elements (can be used for uneven inputs):
>>> batch_size = 2
>>> list_size = 5
>>> labels = np.random.randint(0, 3, size=(batch_size, list_size))
>>> scores = np.random.random(size=(batch_size, list_size))
>>> mask = np.random.randint(0, 2, size=(batch_size, list_size), dtype=bool)
>>> metric = keras_rs.metrics.NDCG()(
... y_true={"labels": labels, "mask": mask}, y_pred=scores
... )
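The mask pattern above is typically paired with padding when lists
have uneven lengths. A minimal sketch of building padded labels and
the matching mask with NumPy (the variable names are illustrative):

```python
import numpy as np

# Two lists of different lengths, padded to a common list_size.
# Padding positions get label 0 and mask False, so they are ignored.
lists = [[2, 0, 1], [3, 1, 0, 0, 2]]
list_size = max(len(l) for l in lists)
labels = np.zeros((len(lists), list_size), dtype=int)
mask = np.zeros((len(lists), list_size), dtype=bool)
for i, l in enumerate(lists):
    labels[i, :len(l)] = l
    mask[i, :len(l)] = True
print(mask)
```

The resulting labels and mask can then be passed together as
y_true={"labels": labels, "mask": mask}, as in the example above.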