
MultiSegmentPacker layer

[source]

MultiSegmentPacker class

keras_nlp.layers.MultiSegmentPacker(
    sequence_length,
    start_value,
    end_value,
    pad_value=None,
    truncator="round_robin",
    **kwargs
)

Packs multiple sequences into a single fixed width model input.

This layer packs multiple input sequences into a single fixed-width sequence containing start and end delimiters, forming a dense input suitable for classification tasks with BERT and BERT-like models.

Takes as input a list or tuple of token segments. The layer will process inputs as follows:

  • Truncate all input segments to fit within sequence_length according to the truncator strategy.
  • Concatenate all input segments, adding a single start_value at the start of the entire sequence, and an end_value at the end of each segment.
  • Pad the resulting sequence to sequence_length using pad_value.
  • Calculate a separate tensor of "segment ids", with integer type and the same shape as the packed token output, where each integer is the index of the segment the token originated from. The segment id of the start_value is always 0, and the segment id of each end_value is that of the segment that precedes it.

Input should be either a tf.RaggedTensor or a dense tf.Tensor, and either rank-1 or rank-2.
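
The packing steps above can be sketched in plain Python. This is a simplified illustration for unbatched integer lists with the default "round_robin" truncator, not the layer's actual implementation:

```python
def pack_segments(segments, sequence_length, start_value, end_value,
                  pad_value=0):
    """Sketch of multi-segment packing for rank-1 integer lists."""
    # Budget for real tokens: one start delimiter plus one end per segment.
    budget = sequence_length - 1 - len(segments)
    kept = [0] * len(segments)
    # Round-robin: hand out one slot at a time to each segment that still
    # has tokens left, until the budget is exhausted.
    while budget > 0 and any(k < len(s) for k, s in zip(kept, segments)):
        for i, seg in enumerate(segments):
            if budget > 0 and kept[i] < len(seg):
                kept[i] += 1
                budget -= 1
    tokens = [start_value]
    segment_ids = [0]  # the start token always belongs to segment 0
    for i, seg in enumerate(segments):
        tokens += seg[:kept[i]] + [end_value]
        segment_ids += [i] * (kept[i] + 1)  # end token keeps its segment's id
    # Pad the result out to the fixed width.
    pad = sequence_length - len(tokens)
    return tokens + [pad_value] * pad, segment_ids + [0] * pad
```

Running this on the examples below reproduces the layer's outputs, e.g. `pack_segments([[1, 2, 3, 4], [11, 12, 13, 14]], 8, 101, 102)` yields `([101, 1, 2, 3, 102, 11, 12, 102], [0, 0, 0, 0, 0, 1, 1, 1])`.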

Arguments

  • sequence_length: The desired output length.
  • start_value: The id or token that is to be placed at the start of each sequence (called "[CLS]" for BERT). The dtype must match the dtype of the input tensors to the layer.
  • end_value: The id or token that is to be placed at the end of each input segment (called "[SEP]" for BERT). The dtype must match the dtype of the input tensors to the layer.
  • pad_value: The id or token that is to be placed into the unused positions after the last segment in the sequence (called "[PAD]" for BERT).
  • truncator: The algorithm to truncate a list of batched segments to fit a per-example length limit. The value can be either "round_robin" or "waterfall":
      • "round_robin": Available space is assigned one token at a time in a round-robin fashion to the inputs that still need some, until the limit is reached.
      • "waterfall": The allocation of the budget is done using a "waterfall" algorithm that allocates quota in a left-to-right manner and fills up the buckets until we run out of budget. It supports an arbitrary number of segments.
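
The difference between the two strategies can be sketched in plain Python by computing how many tokens each segment keeps. This is an illustrative sketch of the two budget-allocation schemes, not the layer's implementation:

```python
def round_robin_budget(lengths, budget):
    """Assign `budget` token slots one at a time across segments."""
    kept = [0] * len(lengths)
    while budget > 0 and any(k < n for k, n in zip(kept, lengths)):
        for i, n in enumerate(lengths):
            if budget > 0 and kept[i] < n:
                kept[i] += 1
                budget -= 1
    return kept

def waterfall_budget(lengths, budget):
    """Fill segments left to right until the budget runs out."""
    kept = []
    for n in lengths:
        take = min(n, budget)
        kept.append(take)
        budget -= take
    return kept
```

For two segments of length 4 and a budget of 5 content tokens (sequence_length 8, minus one start and two end delimiters), round-robin keeps 3 and 2 tokens, while waterfall keeps 4 and 1.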

Returns

A tuple with two elements. The first is the dense, packed token sequence. The second is an integer tensor of the same shape, containing the segment ids.

Examples

Pack a single input for classification.

>>> seq1 = tf.constant([1, 2, 3, 4])
>>> packer = keras_nlp.layers.MultiSegmentPacker(
...     8, start_value=101, end_value=102)
>>> packer(seq1)
(<tf.Tensor: shape=(8,), dtype=int32,
    numpy=array([101, 1, 2, 3, 4, 102, 0, 0], dtype=int32)>,
 <tf.Tensor: shape=(8,), dtype=int32,
    numpy=array([0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)>)

Pack multiple inputs for classification.

>>> seq1 = tf.constant([1, 2, 3, 4])
>>> seq2 = tf.constant([11, 12, 13, 14])
>>> packer = keras_nlp.layers.MultiSegmentPacker(
...     8, start_value=101, end_value=102)
>>> packer((seq1, seq2))
(<tf.Tensor: shape=(8,), dtype=int32,
    numpy=array([101,   1,   2,   3, 102,  11,  12, 102], dtype=int32)>,
 <tf.Tensor: shape=(8,), dtype=int32,
    numpy=array([0, 0, 0, 0, 0, 1, 1, 1], dtype=int32)>)

Reference

Devlin et al., 2018.