BartSeq2SeqLM model


BartSeq2SeqLM class

keras_nlp.models.BartSeq2SeqLM(backbone, preprocessor=None, **kwargs)

An end-to-end BART model for seq2seq language modeling.

A seq2seq language model (LM) is an encoder-decoder model used for conditional text generation. The encoder is given a "context" text, and the decoder predicts the next token based on both the encoder inputs and the previously generated tokens. You can fine-tune BartSeq2SeqLM to generate text for any seq2seq task (e.g., translation or summarization).

This model has a generate() method, which generates text based on encoder inputs and an optional prompt for the decoder. The generation strategy used is controlled by an additional sampler argument passed to compile(). You can recompile the model with different keras_nlp.samplers objects to control the generation. By default, "top_k" sampling will be used.
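To give a feel for what a "top_k" strategy does at each decoding step, here is a minimal pure-Python sketch. It is an illustration only, not KerasNLP's sampler implementation; the function name `top_k_sample` and the example logits are assumptions made for this sketch.

```python
import math
import random


def top_k_sample(logits, k, rng):
    """Sample one token id from the k highest-scoring logits (illustrative)."""
    # Keep the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits only, shifted by the max for stability.
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    # Draw one index in proportion to its renormalized probability.
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(0)
logits = [2.0, 0.5, 1.5, -1.0, 0.1]  # hypothetical scores for a 5-token vocabulary
samples = {top_k_sample(logits, k=2, rng=rng) for _ in range(200)}
```

With `k=2`, only the two highest-scoring token ids (0 and 2 here) can ever be drawn; with `k=1` the procedure reduces to greedy decoding.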

This model can optionally be configured with a preprocessor layer, in which case it will automatically apply preprocessing to string inputs during fit(), predict(), evaluate() and generate(). This is done by default when creating the model with from_preset().

Disclaimer: Pre-trained models are provided on an "as is" basis, without warranties or conditions of any kind. The underlying model is provided by a third party and subject to a separate license, available here.


  • backbone: A keras_nlp.models.BartBackbone instance.
  • preprocessor: A keras_nlp.models.BartSeq2SeqLMPreprocessor or None. If None, this model will not apply preprocessing, and inputs should be preprocessed before calling the model.


Use generate() to do text generation, given an input context.

import keras_nlp

bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset("bart_base_en")
bart_lm.generate("The quick brown fox", max_length=30)

# Generate with batched inputs.
bart_lm.generate(["The quick brown fox", "The whale"], max_length=30)

Compile the generate() function with a custom sampler.

bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset("bart_base_en")
bart_lm.compile(sampler="greedy")
bart_lm.generate("The quick brown fox", max_length=30)

Use generate() with encoder inputs and an incomplete decoder input (prompt).

bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset("bart_base_en")
bart_lm.generate(
    {"encoder_text": "The quick brown fox", "decoder_text": "The fast"}
)

Use generate() without preprocessing.

import numpy as np

# Preprocessed inputs, with encoder inputs corresponding to
# "The quick brown fox", and the decoder inputs to "The fast". Use
# `"padding_mask"` to indicate values that should not be overridden.
prompt = {
    "encoder_token_ids": np.array([[0, 133, 2119, 6219, 23602, 2, 1, 1]]),
    "encoder_padding_mask": np.array(
        [[True, True, True, True, True, True, False, False]]
    ),
    "decoder_token_ids": np.array([[2, 0, 133, 1769, 2, 1, 1]]),
    "decoder_padding_mask": np.array(
        [[True, True, True, True, False, False, False]]
    ),
}

bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset(
    "bart_base_en",
    preprocessor=None,
)
bart_lm.generate(prompt)
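
Each padding mask in the preprocessed inputs above has to line up, position by position, with its token ids. The sketch below shows one way to build such a pair by hand. `pad_with_mask` is a hypothetical helper, not part of KerasNLP, and the pad id of 1 is an assumption read off the example arrays.

```python
def pad_with_mask(token_ids, length, pad_id=1):
    """Right-pad token ids to `length` and build the matching boolean mask."""
    if len(token_ids) > length:
        raise ValueError("token_ids is longer than the target length")
    n_pad = length - len(token_ids)
    # Real tokens first, then padding.
    padded = list(token_ids) + [pad_id] * n_pad
    # True marks real tokens, False marks padding positions.
    mask = [True] * len(token_ids) + [False] * n_pad
    return padded, mask


# Reproduces the encoder arrays from the example (before wrapping in np.array).
encoder_ids, encoder_mask = pad_with_mask([0, 133, 2119, 6219, 23602, 2], length=8)
```

Building both lists from the same call keeps their lengths equal, which avoids shape mismatches when the arrays are passed to the model.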

Call fit() on a single batch.

features = {
    "encoder_text": ["The quick brown fox jumped.", "I forgot my homework."],
    "decoder_text": ["The fast hazel fox leapt.", "I forgot my assignment."]
}
bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset("bart_base_en")
bart_lm.fit(x=features, batch_size=2)

Call fit() without preprocessing.

x = {
    "encoder_token_ids": np.array([[0, 133, 2119, 2, 1]] * 2),
    "encoder_padding_mask": np.array([[1, 1, 1, 1, 0]] * 2),
    "decoder_token_ids": np.array([[2, 0, 133, 1769, 2]] * 2),
    "decoder_padding_mask": np.array([[1, 1, 1, 1, 1]] * 2),
}
y = np.array([[0, 133, 1769, 2, 1]] * 2)
sw = np.array([[1, 1, 1, 1, 0]] * 2)

bart_lm = keras_nlp.models.BartSeq2SeqLM.from_preset(
    "bart_base_en",
    preprocessor=None,
)
bart_lm.fit(x=x, y=y, sample_weight=sw, batch_size=2)
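
In the arrays above, the labels `y` are the decoder token ids shifted left by one position, and `sw` is the decoder padding mask shifted the same way, so the final padded label is ignored by the loss. The sketch below reproduces that relationship for one row. `make_labels` is a hypothetical helper and the pad id of 1 is an assumption taken from the example values.

```python
def make_labels(decoder_token_ids, decoder_padding_mask, pad_id=1):
    """Derive next-token labels and sample weights by shifting left by one."""
    # Position i is trained to predict the token at position i + 1.
    y = list(decoder_token_ids[1:]) + [pad_id]
    # The appended pad label gets weight 0 so it does not contribute to the loss.
    sample_weight = list(decoder_padding_mask[1:]) + [0]
    return y, sample_weight


y_row, sw_row = make_labels([2, 0, 133, 1769, 2], [1, 1, 1, 1, 1])
```

Here `y_row` and `sw_row` match one row of `y` and `sw` from the example.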

Custom backbone and vocabulary.

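features = {
    "encoder_text": [" afternoon sun"],
    "decoder_text": ["noon sun"],
}
vocab = {
    "<s>": 0,
    "<pad>": 1,
    "</s>": 2,
    "Ġafter": 5,
    "noon": 6,
    "Ġsun": 7,
}
merges = ["Ġ a", "Ġ s", "Ġ n", "e r", "n o", "o n", "Ġs u", "Ġa f", "no on"]
merges += ["Ġsu n", "Ġaf t", "Ġaft er"]

tokenizer = keras_nlp.models.BartTokenizer(vocabulary=vocab, merges=merges)
preprocessor = keras_nlp.models.BartSeq2SeqLMPreprocessor(
    tokenizer=tokenizer,
    encoder_sequence_length=128,
    decoder_sequence_length=128,
)
# Illustrative sizes; adjust them to your own model.
backbone = keras_nlp.models.BartBackbone(
    vocabulary_size=50265,
    num_layers=6,
    num_heads=12,
    hidden_dim=768,
    intermediate_dim=3072,
    max_sequence_length=128,
)
bart_lm = keras_nlp.models.BartSeq2SeqLM(
    backbone=backbone,
    preprocessor=preprocessor,
)
bart_lm.fit(x=features, batch_size=2)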


from_preset method

BartSeq2SeqLM.from_preset(preset, load_weights=True, **kwargs)

Instantiate a keras_nlp.models.Task from a model preset.

A preset is a directory of configs, weights and other file assets used to save and load a pre-trained model. The preset can be passed as one of:

  1. a built-in preset identifier like 'bert_base_en'
  2. a Kaggle Models handle like 'kaggle://user/bert/keras/bert_base_en'
  3. a Hugging Face handle like 'hf://user/bert_base_en'
  4. a path to a local preset directory like './bert_base_en'
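
The four forms listed above can be told apart by their prefix. The sketch below is a hypothetical helper written for illustration; it is not how KerasNLP resolves presets internally, and the returned labels are made-up names.

```python
def preset_kind(preset):
    """Classify a preset string into one of the four forms listed above."""
    if preset.startswith("kaggle://"):
        return "kaggle"
    if preset.startswith("hf://"):
        return "hugging_face"
    # Assumption: local paths are written with an explicit path prefix.
    if preset.startswith(("./", "../", "/")):
        return "local_path"
    # Anything else is treated as a built-in preset identifier.
    return "builtin"


kinds = [
    preset_kind("bert_base_en"),
    preset_kind("kaggle://user/bert/keras/bert_base_en"),
    preset_kind("hf://user/bert_base_en"),
    preset_kind("./bert_base_en"),
]
```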

For any Task subclass, you can run cls.presets.keys() to list all built-in presets available on the class.

This constructor can be called in one of two ways: either from a task-specific base class like keras_nlp.models.CausalLM.from_preset(), or from a model class like keras_nlp.models.BertClassifier.from_preset(). If calling from a base class, the subclass of the returned object will be inferred from the config in the preset directory.


  • preset: string. A built-in preset identifier, a Kaggle Models handle, a Hugging Face handle, or a path to a local directory.
  • load_weights: bool. If True, the weights will be loaded into the model architecture. If False, the weights will be randomly initialized.


# Load a Gemma generative task.
causal_lm = keras_nlp.models.CausalLM.from_preset(
    "gemma_2b_en",  # example preset name
)

# Load a Bert classification task.
model = keras_nlp.models.Classifier.from_preset(
    "bert_base_en",
    num_classes=2,
)


Preset name       | Parameters | Description
bart_base_en      | 139.42M    | 6-layer BART model where case is maintained. Trained on BookCorpus, English Wikipedia and CommonCrawl.
bart_large_en     | 406.29M    | 12-layer BART model where case is maintained. Trained on BookCorpus, English Wikipedia and CommonCrawl.
bart_large_en_cnn | 406.29M    | The bart_large_en backbone model fine-tuned on the CNN+DM summarization dataset.


generate method

BartSeq2SeqLM.generate(inputs, max_length=None, stop_token_ids="auto")

Generate text given prompt inputs.

This method generates text based on given inputs. The sampling method used for generation can be set via the compile() method.

If inputs are a tf.data.Dataset, outputs will be generated "batch-by-batch" and concatenated. Otherwise, all inputs will be handled as a single batch.

If a preprocessor is attached to the model, inputs will be preprocessed inside the generate() function and should match the structure expected by the preprocessor layer (usually raw strings). If a preprocessor is not attached, inputs should match the structure expected by the backbone. See the example usage above for a demonstration of each.


  • inputs: python data, tensor data, or a tf.data.Dataset. If a preprocessor is attached to the model, inputs should match the structure expected by the preprocessor layer. If a preprocessor is not attached, inputs should match the structure expected by the backbone model.
  • max_length: Optional. int. The max length of the generated sequence. Will default to the max configured sequence_length of the preprocessor. If preprocessor is None, inputs should be padded to the desired maximum length and this argument will be ignored.
  • stop_token_ids: Optional. None, "auto", or tuple of token ids. Defaults to "auto", which uses the preprocessor.tokenizer.end_token_id. Using "auto" without a preprocessor will produce an error. None stops generation after generating max_length tokens. You may also specify a list of token ids the model should stop on. Note that each token id in the sequence is interpreted as a separate stop token; multi-token stop sequences are not supported.
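
To illustrate the stop_token_ids semantics described above, the sketch below cuts a generated sequence at the first occurrence of any listed id. It is a hypothetical post-processing helper, not KerasNLP code. Whether the library keeps or drops the stop token in its output is not stated here, so dropping it is an assumption of this sketch.

```python
def truncate_at_stop(token_ids, stop_token_ids):
    """Cut a generated sequence just before the first stop token, if any."""
    # None means "no stop tokens": keep everything up to max_length.
    if stop_token_ids is None:
        return list(token_ids)
    stops = set(stop_token_ids)
    for i, tok in enumerate(token_ids):
        # Each id is checked on its own; (133, 2) means "133 or 2",
        # not the two-token sequence "133 followed by 2".
        if tok in stops:
            return list(token_ids[:i])
    return list(token_ids)


single = truncate_at_stop([0, 133, 1769, 2, 1, 1], (2,))
either = truncate_at_stop([0, 2, 133], (133, 2))
no_stop = truncate_at_stop([0, 133], None)
```

In the second call, generation stops at id 2 even though 133 appears first in the tuple, because the order of ids in stop_token_ids carries no meaning.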

backbone property


A keras_nlp.models.Backbone model with the core architecture.

preprocessor property


A keras_nlp.models.Preprocessor layer used to preprocess input.