NDCG class

keras_rs.metrics.NDCG(
    k: int | None = None,
    gain_fn: Callable[[Any], Any] = default_gain_fn,
    rank_discount_fn: Callable[[Any], Any] = default_rank_discount_fn,
    **kwargs: Any
)
Computes Normalized Discounted Cumulative Gain (nDCG).

This metric evaluates ranking quality by normalizing the Discounted
Cumulative Gain (DCG) of each list with its Ideal Discounted Cumulative
Gain (IDCG). The true relevance labels in y_true are graded relevance
scores: non-negative numbers where higher values indicate greater
relevance. The predicted scores in y_pred determine the rank order of
items, by sorting in descending order. The metric returns a normalized
score between 0 and 1. A score of 1 represents a perfect ranking
according to true relevance (within the top-k), while 0 typically
represents a ranking with no relevant items. Higher scores indicate
better ranking quality relative to the best possible ranking.
For each list of predicted scores s in y_pred and the corresponding
list of true labels y in y_true, the per-query nDCG score is
calculated as follows:
nDCG@k = DCG@k / IDCG@k
where DCG@k is calculated based on the predicted ranking (y_pred):
DCG@k(y') = sum_{i=1}^{k} (gain_fn(y'_i) / rank_discount_fn(i))
And IDCG@k is the Ideal DCG, calculated using the same formula but on
items sorted perfectly by their true relevance (y_true):
IDCG@k(y'') = sum_{i=1}^{k} (gain_fn(y''_i) / rank_discount_fn(i))
where:
- y'_i: the true relevance of the item at rank i in the ranking induced
  by y_pred.
- y''_i: the true relevance of the item at rank i in the ideal ranking
  (sorted by y_true, descending).
- gain_fn is the user-provided function mapping relevance to gain. The
  default function (default_gain_fn) is typically equivalent to
  lambda y: 2**y - 1.
- rank_discount_fn is the user-provided function mapping the 1-based
  rank i to a discount value. The default function
  (default_rank_discount_fn) is typically equivalent to
  lambda rank: 1 / log2(rank + 1).

The final nDCG score reported is typically the weighted average of
these per-query scores across all queries/lists in the dataset.
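The formulas above can be sketched in plain NumPy for a single query
(an illustration only, not the keras_rs implementation; unweighted,
using the default gain and discount functions):

```python
import numpy as np

def dcg_at_k(relevance, k):
    """DCG@k for a list of true relevances already in rank order."""
    rel = np.asarray(relevance, dtype=float)[:k]
    gains = 2.0**rel - 1.0                # default gain_fn: 2**y - 1
    ranks = np.arange(1, rel.size + 1)
    discounts = 1.0 / np.log2(ranks + 1)  # default rank_discount_fn
    return float(np.sum(gains * discounts))

def ndcg_at_k(y_true, y_pred, k=None):
    """Per-query nDCG@k = DCG@k / IDCG@k; 0 when there is no relevant item."""
    y_true = np.asarray(y_true, dtype=float)
    k = y_true.size if k is None else k
    # Rank items by predicted score, descending, then take their true labels.
    order = np.argsort(-np.asarray(y_pred))
    dcg = dcg_at_k(y_true[order], k)
    # Ideal ranking: items sorted by true relevance, descending.
    idcg = dcg_at_k(np.sort(y_true)[::-1], k)
    return dcg / idcg if idcg > 0 else 0.0
```

A perfectly ordered list scores 1.0, any misordering scores lower, and
an all-zero relevance list scores 0.0.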
Note: sample_weight is handled differently for ranking metrics. For
batched inputs, sample_weight can be a scalar, 1D or 2D. The scalar
case and the 1D case (list-wise weights) are straightforward. In the 2D
case (item-wise weights), the item weights are first aggregated into 1D
list-wise weights. For more details, refer to
keras_rs.src.metrics.ranking_metrics_utils.get_list_weights.
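As a simplified illustration of such an aggregation (this is not
necessarily the exact scheme get_list_weights uses; the per-list weight
here is just the mean weight over relevant items, an assumption made
for the sketch):

```python
import numpy as np

def aggregate_item_weights(item_weights, y_true):
    """Collapse 2D item-wise weights into 1D list-wise weights.

    Hypothetical scheme: average the weights of items with nonzero
    relevance; lists with no relevant items fall back to weight 1.
    """
    item_weights = np.asarray(item_weights, dtype=float)
    relevant = np.asarray(y_true) > 0
    num_relevant = relevant.sum(axis=1)
    sums = (item_weights * relevant).sum(axis=1)
    return np.where(num_relevant > 0,
                    sums / np.maximum(num_relevant, 1),
                    1.0)
```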
Arguments

- k: Optional cutoff for computing the metric. If set, only the top-k
  ranked items are considered. Defaults to None, which means all items
  are considered.
- gain_fn: Callable mapping the relevance labels (y_true) to gain
  values. The default implements 2**y - 1.
- rank_discount_fn: Callable mapping the 1-based rank to a discount
  value. The default (default_rank_discount_fn) implements
  1 / log2(rank + 1).
- shuffle_ties: Whether to randomly shuffle tied scores before sorting,
  so that ties are broken randomly rather than by position. Defaults to
  True.
- name: Optional name for the metric instance.
- dtype: Data type of the metric result. Defaults to None, which
  means using keras.backend.floatx(). keras.backend.floatx() is
  "float32" unless set to a different value
  (via keras.backend.set_floatx()). If a keras.DTypePolicy is
  provided, then the compute_dtype will be utilized.

Example
>>> batch_size = 2
>>> list_size = 5
>>> labels = np.random.randint(0, 3, size=(batch_size, list_size))
>>> scores = np.random.random(size=(batch_size, list_size))
>>> result = keras_rs.metrics.NDCG()(
...     y_true=labels, y_pred=scores
... )
Mask certain elements (can be used for uneven inputs):
>>> batch_size = 2
>>> list_size = 5
>>> labels = np.random.randint(0, 3, size=(batch_size, list_size))
>>> scores = np.random.random(size=(batch_size, list_size))
>>> mask = np.random.randint(0, 2, size=(batch_size, list_size), dtype=bool)
>>> result = keras_rs.metrics.NDCG()(
...     y_true={"labels": labels, "mask": mask}, y_pred=scores
... )
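Customize gain_fn and rank_discount_fn by passing any callables with
the signatures above to the constructor. As a self-contained NumPy
check (again an illustration, not the keras_rs implementation), the
default exponential gain penalizes misplacing a highly relevant item
more than a linear gain would:

```python
import numpy as np

def ndcg(y_true, y_pred, gain_fn, rank_discount_fn):
    """Per-query nDCG with pluggable gain and rank-discount callables."""
    y_true = np.asarray(y_true, dtype=float)
    order = np.argsort(-np.asarray(y_pred))
    ranks = np.arange(1, y_true.size + 1, dtype=float)
    dcg = np.sum(gain_fn(y_true[order]) * rank_discount_fn(ranks))
    idcg = np.sum(gain_fn(np.sort(y_true)[::-1]) * rank_discount_fn(ranks))
    return dcg / idcg

labels = np.array([3.0, 1.0, 2.0])
scores = np.array([0.2, 0.9, 0.5])  # the label-3 item is ranked last

log_discount = lambda r: 1.0 / np.log2(r + 1.0)
# Default-style exponential gain vs. a linear gain on the same ranking.
exp_gain = ndcg(labels, scores, lambda y: 2.0**y - 1.0, log_discount)
lin_gain = ndcg(labels, scores, lambda y: y, log_discount)
```

With exponential gain the score drops further for the same mistake,
because the misplaced relevance-3 item carries gain 7 instead of 3.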