AlibiBias
class keras_nlp.layers.AlibiBias(alibi_bias_max=8, **kwargs)
A layer that adds the alibi bias to attention scores.
This layer adds the alibi bias to the attention scores. The alibi bias is a linear, non-learned bias, defined and formalized in "Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation".
This layer takes the attention scores as input and returns them after adding the alibi bias. The output has the same shape as the input.
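To make "linear, non-learned bias" concrete, here is a minimal NumPy sketch for a single attention head. It follows the formulation in the cited paper, where each key is penalized in proportion to its distance behind the query. The slope value and the variable names are assumptions chosen for illustration, and this is not the layer's own code.

```python
import numpy as np

seq_len = 4
slope = 0.25  # hypothetical slope for one head, chosen for illustration

positions = np.arange(seq_len)
# distance[i, j] = i - j: how far key j lies behind query i.
distance = positions[:, None] - positions[None, :]
# Linear penalty on causal positions (j <= i). Future positions are left
# at zero here; a causal mask would normally remove them.
bias = -slope * np.tril(distance)

scores = np.zeros((seq_len, seq_len))
biased_scores = scores + bias  # same shape as the input scores
print(biased_scores)
```

The bias has no trainable parameters: it is fully determined by the slope and the positions, and adding it leaves the shape of the scores unchanged.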
Arguments
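alibi_bias_max: int. Used to compute the slope of each head. The heads' slopes form a geometric sequence that starts at 2**(-alibi_bias_max/num_heads) and uses that same value as its ratio. Defaults to 8.
**kwargs: other keyword arguments passed to keras.layers.Layer, including name, trainable, dtype etc.
Call arguments
attention_scores: The attention scores to which the alibi bias is added, with shape (batch_size, num_heads, query_length, key_length).
Example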
import numpy as np
import keras
import keras_nlp

query_length = 10
key_length = 10
num_heads = 4
batch_size = 2
hidden_dim = 8
# Create new alibi layer.
alibi_layer = keras_nlp.layers.AlibiBias()
query = np.zeros((batch_size, num_heads, query_length, hidden_dim))
key = np.zeros((batch_size, num_heads, hidden_dim, key_length))
attention_scores = keras.ops.matmul(query, key)
# Add alibi bias to attention scores.
attention_scores = alibi_layer(attention_scores)
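The slope formula given under Arguments can be checked by hand. The NumPy sketch below computes the per-head slopes for the values used in the example and builds one possible bias tensor that broadcasts against the attention scores. It assumes num_heads is a power of two, and the way the bias is laid out (one offset per key position, shared across queries) is an assumption made for illustration; the layer's internals may differ.

```python
import numpy as np

alibi_bias_max = 8
num_heads = 4      # assumed to be a power of two in this sketch
batch_size = 2
query_length = 10
key_length = 10

# Geometric sequence whose first term and ratio are both
# 2**(-alibi_bias_max / num_heads).
start = 2.0 ** (-alibi_bias_max / num_heads)
slopes = start ** np.arange(1, num_heads + 1)  # shape (num_heads,)

# Key offsets relative to the last position: -(key_length - 1), ..., -1, 0.
offsets = np.arange(1 - key_length, 1, dtype="float32")

# Shape (1, num_heads, 1, key_length), which broadcasts against
# (batch_size, num_heads, query_length, key_length).
bias = (slopes[:, None] * offsets[None, :])[None, :, None, :]

attention_scores = np.zeros((batch_size, num_heads, query_length, key_length))
biased_scores = attention_scores + bias

print(slopes)
print(biased_scores.shape)
```

With alibi_bias_max=8 and num_heads=4, the first slope is 2**(-2) = 0.25 and each subsequent head's slope is a quarter of the previous one, so later heads penalize distant keys less.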
References