Embedding layer


Embedding class


Turns positive integers (indexes) into dense vectors of fixed size.

e.g. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]

This layer can only be used on positive integer inputs of a fixed range.


>>> model = keras.Sequential()
>>> model.add(keras.layers.Embedding(1000, 64))
>>> # The model will take as input an integer matrix of size (batch,
>>> # input_length), and the largest integer (i.e. word index) in the input
>>> # should be no larger than 999 (vocabulary size).
>>> # Now model.output_shape is (None, 10, 64), where `None` is the batch
>>> # dimension.
>>> input_array = np.random.randint(1000, size=(32, 10))
>>> model.compile('rmsprop', 'mse')
>>> output_array = model.predict(input_array)
>>> print(output_array.shape)
(32, 10, 64)


  • input_dim: Integer. Size of the vocabulary, i.e. maximum integer index + 1.
  • output_dim: Integer. Dimension of the dense embedding.
  • embeddings_initializer: Initializer for the embeddings matrix (see keras.initializers).
  • embeddings_regularizer: Regularizer function applied to the embeddings matrix (see keras.regularizers).
  • embeddings_constraint: Constraint function applied to the embeddings matrix (see keras.constraints).
  • mask_zero: Boolean, whether or not the input value 0 is a special "padding" value that should be masked out. This is useful when using recurrent layers which may take variable length input. If this is True, then all subsequent layers in the model need to support masking or an exception will be raised. If mask_zero is set to True, as a consequence, index 0 cannot be used in the vocabulary (input_dim should equal size of vocabulary + 1).
  • weights: Optional floating-point matrix of size (input_dim, output_dim). The initial embeddings values to use.
  • lora_rank: Optional integer. If set, the layer's forward pass will implement LoRA (Low-Rank Adaptation) with the provided rank. LoRA sets the layer's embeddings matrix to non-trainable and replaces it with a delta over the original matrix, obtained via multiplying two lower-rank trainable matrices. This can be useful to reduce the computation cost of fine-tuning large embedding layers. You can also enable LoRA on an existing Embedding layer by calling layer.enable_lora(rank).

Input shape

2D tensor with shape: (batch_size, input_length).

Output shape

3D tensor with shape: (batch_size, input_length, output_dim).