
PositionEmbedding layer


PositionEmbedding class

keras_nlp.layers.PositionEmbedding(
    sequence_length, initializer="glorot_uniform", **kwargs
)

A layer which learns a position embedding for input sequences.

This class assumes that in the input tensor, the last dimension corresponds to the features, and the dimension before the last corresponds to the sequence.

This layer does not support masking, but can be combined with a keras.layers.Embedding for padding mask support.
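The shape convention described above can be illustrated with a NumPy-only sketch. It is not the library's implementation: `position_table` stands in for the learned weights, and `position_embedding` and its two shape checks are hypothetical names and behavior introduced here for illustration.

```python
import numpy as np

sequence_length = 10  # maximum length the table covers
hidden_dim = 16       # feature size, read from the last input axis

# Stand-in for the learned weights (assumption): one row per position.
rng = np.random.default_rng(0)
position_table = rng.uniform(-0.1, 0.1, size=(sequence_length, hidden_dim))


def position_embedding(inputs, table):
    """Return position embeddings broadcast to the shape of `inputs`."""
    # Second-to-last axis is the sequence, last axis is the features.
    seq_len, feat = inputs.shape[-2], inputs.shape[-1]
    # These checks belong to this sketch, not necessarily to the real layer.
    if feat != table.shape[-1]:
        raise ValueError("feature dimension does not match the table")
    if seq_len > table.shape[0]:
        raise ValueError("input is longer than sequence_length")
    # Only the first `seq_len` rows are needed; the batch axis broadcasts.
    return np.broadcast_to(table[:seq_len], inputs.shape)


short_batch = np.zeros((8, 6, hidden_dim))
out = position_embedding(short_batch, position_table)
print(out.shape)
```

Every example in the batch receives the same rows, because the result depends only on the position along the sequence axis.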


Arguments

  • sequence_length: The maximum length of the dynamic sequence.
  • initializer: The initializer to use for the embedding weights. Defaults to "glorot_uniform".
  • seq_axis: The axis of the input tensor where we add the embeddings.
  • **kwargs: other keyword arguments passed to keras.layers.Layer, including name, trainable, dtype etc.

Call arguments

  • inputs: The tensor inputs to compute an embedding for, with shape (batch_size, sequence_length, hidden_dim). Only the input shape will be used, as the position embedding does not depend on the input sequence content.
  • start_index: An integer or integer tensor. The starting position to compute the position embedding from. This is useful during cached decoding, where each position is predicted separately in a loop.
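To make the role of `start_index` concrete, here is a hedged NumPy sketch of the idea rather than the layer's actual code. It assumes the position embedding amounts to slicing rows from a learned table, and `embed_positions` and `position_table` are hypothetical names. It compares one full pass against a loop that handles one position per step, as in cached decoding.

```python
import numpy as np

max_length, hidden_dim = 10, 4

# Stand-in for learned weights (assumption): row i embeds position i.
position_table = np.arange(
    max_length * hidden_dim, dtype="float32"
).reshape(max_length, hidden_dim)


def embed_positions(inputs, table, start_index=0):
    """Slice `table` starting at `start_index`, using only the input shape."""
    seq_len = inputs.shape[-2]
    rows = table[start_index : start_index + seq_len]
    return np.broadcast_to(rows, inputs.shape)


# Full pass over five positions at once.
full = embed_positions(np.zeros((2, 5, hidden_dim)), position_table)

# Cached decoding: one position per step, advancing start_index each time.
steps = [
    embed_positions(np.zeros((2, 1, hidden_dim)), position_table, start_index=i)
    for i in range(5)
]
stepwise = np.concatenate(steps, axis=1)

# The input values never matter, only the input shape does.
from_ones = embed_positions(np.ones((2, 5, hidden_dim)), position_table)
```

In this sketch the step-by-step result matches the full pass, which is the property that makes `start_index` useful when each position is predicted separately.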


Example

Called directly on input.

>>> import numpy as np
>>> import keras_nlp
>>> layer = keras_nlp.layers.PositionEmbedding(sequence_length=10)
>>> layer(np.zeros((8, 10, 16)))

Combine with a token embedding.

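import keras
import keras_nlp

seq_length = 50
vocab_size = 5000
embed_dim = 128
inputs = keras.Input(shape=(seq_length,))
token_embeddings = keras.layers.Embedding(
    input_dim=vocab_size, output_dim=embed_dim
)(inputs)  # shape: (batch_size, seq_length, embed_dim)
position_embeddings = keras_nlp.layers.PositionEmbedding(
    sequence_length=seq_length
)(token_embeddings)  # only the shape of token_embeddings is used
outputs = token_embeddings + position_embeddings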
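For intuition about where padding information comes from in this combination, the following NumPy-only sketch mirrors the sum above with plain arrays. It assumes token id 0 marks padding, which is the convention keras.layers.Embedding uses when mask_zero=True. The tables and the names `token_table`, `position_table`, and `padding_mask` are illustrative stand-ins, not library objects.

```python
import numpy as np

vocab_size, seq_length, embed_dim = 20, 6, 4
rng = np.random.default_rng(0)

# Stand-ins for the two learned weight matrices (assumption).
token_table = rng.normal(size=(vocab_size, embed_dim))
position_table = rng.normal(size=(seq_length, embed_dim))

# Two sequences padded with id 0 up to seq_length.
token_ids = np.array(
    [
        [5, 9, 3, 0, 0, 0],
        [7, 2, 8, 4, 1, 0],
    ]
)

# Token lookup depends on content; position rows depend only on the index.
token_embeddings = token_table[token_ids]  # (2, 6, 4)
position_embeddings = np.broadcast_to(position_table, token_embeddings.shape)
outputs = token_embeddings + position_embeddings

# The padding mask is derived from the token ids, not from the positions.
padding_mask = token_ids != 0  # (2, 6), True where a real token is present
print(outputs.shape, padding_mask.sum(axis=1))
```

The position term is added at every index, padded or not, so any masking has to be carried by the token side of the pipeline.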