TextToImage class
keras_hub.models.TextToImage()
Base class for text-to-image tasks.
TextToImage tasks wrap a keras_hub.models.Backbone and a keras_hub.models.Preprocessor to create a model that can be used for generation and generative fine-tuning.
TextToImage tasks provide an additional, high-level generate() function, which can be used to generate images with a string-in, image-out signature.
All TextToImage tasks include a from_preset() constructor, which can be used to load a pre-trained config and weights.
Example
import keras_hub

# Load the Stable Diffusion 3 model with pre-trained weights.
text_to_image = keras_hub.models.TextToImage.from_preset(
    "stable_diffusion_3_medium",
)
text_to_image.generate(
    "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
)

# Load the same preset at bfloat16 precision.
text_to_image = keras_hub.models.TextToImage.from_preset(
    "stable_diffusion_3_medium",
    dtype="bfloat16",
)
text_to_image.generate(
    "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
)
from_preset method
TextToImage.from_preset(preset, load_weights=True, **kwargs)
Instantiate a keras_hub.models.Task from a model preset.
A preset is a directory of configs, weights and other file assets used to save and load a pre-trained model. The preset can be passed as one of:
a built-in preset identifier like 'bert_base_en'
a Kaggle Models handle like 'kaggle://user/bert/keras/bert_base_en'
a Hugging Face handle like 'hf://user/bert_base_en'
a path to a local preset directory like './bert_base_en'
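As a rough illustration of the four forms above, the following sketch classifies a preset string by its prefix. The helper `describe_preset` is hypothetical and is not part of keras_hub; the library's own resolution logic may differ.

```python
def describe_preset(preset: str) -> str:
    """Return which of the four documented preset forms a string looks like.

    Hypothetical helper for illustration only; not a keras_hub API.
    """
    if preset.startswith("kaggle://"):
        return "kaggle"
    if preset.startswith("hf://"):
        return "hugging_face"
    # Treat anything that looks like a filesystem path as a local directory.
    if preset.startswith(("./", "../", "/", "~")):
        return "local"
    # Otherwise assume a built-in preset identifier such as "bert_base_en".
    return "builtin"


for p in [
    "bert_base_en",
    "kaggle://user/bert/keras/bert_base_en",
    "hf://user/bert_base_en",
    "./bert_base_en",
]:
    print(p, "->", describe_preset(p))
```

The prefix checks run before the fallback, so a bare name is only treated as built-in when nothing else matches.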
For any Task subclass, you can run cls.presets.keys() to list all built-in presets available on the class.
This constructor can be called in one of two ways: either from a task-specific base class like keras_hub.models.CausalLM.from_preset(), or from a model class like keras_hub.models.BertTextClassifier.from_preset(). If calling from a base class, the subclass of the returned object will be inferred from the config in the preset directory.
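The inference described above can be pictured as a lookup from a class name stored in the preset config to a registered subclass. The sketch below is a simplified, hypothetical model of that idea; the `REGISTRY` dict, the `"class_name"` config key, and the toy classes are assumptions for illustration, not keras_hub internals.

```python
REGISTRY = {}


def register(cls):
    """Record a class under its name so a config can refer to it."""
    REGISTRY[cls.__name__] = cls
    return cls


class Task:
    @classmethod
    def from_config(cls, config: dict):
        # Look up the concrete class named in the config.
        target = REGISTRY[config["class_name"]]
        # The named class must be the calling class or one of its subclasses.
        if not issubclass(target, cls):
            raise ValueError(
                f"{target.__name__} is not a subclass of {cls.__name__}"
            )
        return target()


@register
class CausalLM(Task):
    pass


@register
class GemmaCausalLM(CausalLM):
    pass


config = {"class_name": "GemmaCausalLM"}
model = CausalLM.from_config(config)  # called on the base class
print(type(model).__name__)
```

Calling on the base class returns the more specific subclass named in the config, while calling on an unrelated or narrower class raises an error.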
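Arguments
preset: string. A built-in preset identifier, a Kaggle Models handle, a Hugging Face handle, or a path to a local directory.
load_weights: bool. If True, saved weights will be loaded into the model architecture. If False, all weights will be randomly initialized.
Examples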
# Load a Gemma generative task.
causal_lm = keras_hub.models.CausalLM.from_preset(
    "gemma_2b_en",
)

# Load a Bert classification task.
model = keras_hub.models.TextClassifier.from_preset(
    "bert_base_en",
    num_classes=2,
)
Preset | Parameters | Description |
---|---|---|
stable_diffusion_3_medium | 2.99B | 3 billion parameter model, including CLIP L and CLIP G text encoders, an MMDiT generative model, and a VAE autoencoder. Developed by Stability AI. |
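To put the parameter count in context, a back-of-the-envelope estimate of weight memory at different precisions may help when choosing a dtype. This is a rough sketch: it counts weights only (no activations or framework overhead) and assumes every parameter is stored at the stated dtype.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


params = 2.99e9  # parameter count listed for stable_diffusion_3_medium
for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: ~{weight_memory_gb(params, dtype):.2f} GB")
```

Under these assumptions, halving the bytes per parameter halves the estimated weight footprint.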
compile method
TextToImage.compile(optimizer="auto", loss="auto", metrics="auto", **kwargs)
Configures the TextToImage task for training.
The TextToImage task extends the default compilation signature of keras.Model.compile with defaults for optimizer, loss, and metrics. To override these defaults, pass any value to these arguments during compilation.
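The "auto" convention can be sketched as a sentinel that is swapped for a task default, while any other value passes through unchanged. The function name and the default values below are hypothetical placeholders that only mirror the behavior described in this section; they are not the keras_hub implementation.

```python
def resolve_compile_args(optimizer="auto", loss="auto", metrics="auto"):
    """Replace "auto" placeholders with illustrative task defaults."""
    # Placeholder defaults, chosen for illustration only.
    defaults = {
        "optimizer": "default_optimizer",
        "loss": "mean_squared_error",
        "metrics": ["mean_squared_error"],
    }
    given = {"optimizer": optimizer, "loss": loss, "metrics": metrics}
    resolved = {}
    for name, value in given.items():
        # Only the literal string "auto" is replaced; None or objects pass through.
        use_default = isinstance(value, str) and value == "auto"
        resolved[name] = defaults[name] if use_default else value
    return resolved


print(resolve_compile_args())
print(resolve_compile_args(optimizer="adam", metrics=None))
```

Using a string sentinel rather than None keeps None available as an explicit "no metrics" or "no loss" choice.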
Arguments
optimizer: "auto", an optimizer name, or a keras.Optimizer instance. Defaults to "auto", which uses the default optimizer for the given model and task. See keras.Model.compile and keras.optimizers for more info on possible optimizer values.
loss: "auto", a loss name, or a keras.losses.Loss instance. Defaults to "auto", where a keras.losses.MeanSquaredError loss will be applied. See keras.Model.compile and keras.losses for more info on possible loss values.
metrics: "auto", or a list of metrics to be evaluated by the model during training and testing. Defaults to "auto", where a keras.metrics.MeanSquaredError will be applied to track the loss of the model during training. See keras.Model.compile and keras.metrics for more info on possible metrics values.
**kwargs: See keras.Model.compile for a full list of arguments supported by the compile method.
save_to_preset method
TextToImage.save_to_preset(preset_dir)
Save task to a preset directory.
Arguments
preset_dir: The path to the local model preset directory.
preprocessor property
keras_hub.models.TextToImage.preprocessor
A keras_hub.models.Preprocessor layer used to preprocess input.
backbone property
keras_hub.models.TextToImage.backbone
A keras_hub.models.Backbone model with the core architecture.
generate method
TextToImage.generate(inputs, num_steps, guidance_scale, seed=None)
Generate an image based on the provided inputs.
Typically, inputs contains a text description (known as a prompt) used to guide the image generation.
Some models support a negative_prompts key, which helps steer the model away from generating certain styles and elements. To enable this, pass prompts and negative_prompts as a dict:
text_to_image.generate(
    {
        "prompts": "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k",
        "negative_prompts": "green color",
    }
)
If inputs is a tf.data.Dataset, outputs will be generated "batch-by-batch" and concatenated. Otherwise, all inputs will be processed as batches.
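One way to picture the non-dataset inputs is to normalize them into a common dict of prompt lists before batching. The sketch below is a hypothetical helper, not keras_hub code; it handles a plain string, a list of strings, or a dict with "prompts" and/or "negative_prompts" keys.

```python
def normalize_prompts(inputs):
    """Return a dict mapping each provided key to a list of strings.

    Hypothetical helper for illustration only; not a keras_hub API.
    """
    if isinstance(inputs, str):
        return {"prompts": [inputs]}
    if isinstance(inputs, (list, tuple)):
        return {"prompts": list(inputs)}
    if isinstance(inputs, dict):
        unknown = set(inputs) - {"prompts", "negative_prompts"}
        if unknown:
            raise ValueError(f"Unexpected keys: {sorted(unknown)}")
        return {
            # Wrap a lone string in a list so every value has the same shape.
            key: [value] if isinstance(value, str) else list(value)
            for key, value in inputs.items()
        }
    raise TypeError(f"Unsupported input type: {type(inputs).__name__}")


print(normalize_prompts("a red bicycle"))
print(normalize_prompts({"prompts": "a red bicycle", "negative_prompts": "blurry"}))
```

Normalizing early means the rest of a pipeline only has to deal with one input shape.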
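Arguments
inputs: Python data, tensor data, or a tf.data.Dataset. The format must be one of the following: a single string; a list of strings; a dict with "prompts" and/or "negative_prompts" keys; or a tf.data.Dataset with "prompts" and/or "negative_prompts" keys.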