StableDiffusion3TextToImage class

keras_hub.models.StableDiffusion3TextToImage(backbone, preprocessor, **kwargs)

An end-to-end Stable Diffusion 3 model for text-to-image generation.

This model has a generate() method, which generates images based on a prompt.
Arguments

- backbone: A keras_hub.models.StableDiffusion3Backbone instance.
- preprocessor: A keras_hub.models.StableDiffusion3TextToImagePreprocessor instance.

Examples

Use generate() to do image generation.
text_to_image = keras_hub.models.StableDiffusion3TextToImage.from_preset(
"stable_diffusion_3_medium", height=512, width=512
)
text_to_image.generate(
"Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
)
# Generate with batched prompts.
text_to_image.generate(
["cute wallpaper art of a cat", "cute wallpaper art of a dog"]
)
# Generate with different `num_steps` and `guidance_scale`.
text_to_image.generate(
"Astronaut in a jungle, cold color palette, muted colors, detailed, 8k",
num_steps=50,
guidance_scale=5.0,
)
# Generate with `negative_prompts`.
text_to_image.generate(
{
"prompts": "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k",
"negative_prompts": "green color",
}
)
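The generated pixels may need rescaling before saving or display. A minimal sketch, assuming the image comes back as a float array in [-1, 1] (the Stable Diffusion VAE's output range); skip the rescale if your build already returns uint8 (the helper name `to_uint8` is illustrative, not part of the API):

```python
import numpy as np

def to_uint8(image):
    # Assumption: `image` holds floats in [-1, 1]. Map to [0, 1],
    # then scale to displayable 8-bit pixel values.
    image = np.clip((image + 1.0) / 2.0, 0.0, 1.0)
    return (image * 255).astype("uint8")

# Usage sketch, e.g. with Keras' image utilities:
# keras.utils.save_img("astronaut.png", to_uint8(image))
```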
from_preset method

StableDiffusion3TextToImage.from_preset(preset, load_weights=True, **kwargs)
Instantiate a keras_hub.models.Task from a model preset.

A preset is a directory of configs, weights and other file assets used
to save and load a pre-trained model. The preset can be passed as
one of:

- a built-in preset identifier like 'bert_base_en'
- a Kaggle Models handle like 'kaggle://user/bert/keras/bert_base_en'
- a Hugging Face handle like 'hf://user/bert_base_en'
- a path to a local preset directory like './bert_base_en'
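To make the four accepted formats concrete, here is an illustrative helper that classifies a preset string by its prefix (the function is hypothetical; keras_hub performs this dispatch internally):

```python
def preset_kind(preset: str) -> str:
    # Hypothetical helper mirroring how a preset string is interpreted.
    if preset.startswith("kaggle://"):
        return "Kaggle Models handle"
    if preset.startswith("hf://"):
        return "Hugging Face handle"
    if preset.startswith((".", "/", "~")):
        return "local preset directory"
    return "built-in preset identifier"

print(preset_kind("kaggle://user/bert/keras/bert_base_en"))  # Kaggle Models handle
print(preset_kind("./bert_base_en"))                         # local preset directory
```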
For any Task subclass, you can run cls.presets.keys() to list all
built-in presets available on the class.

This constructor can be called in one of two ways: either from a task-specific
base class like keras_hub.models.CausalLM.from_preset(), or from a model class
like keras_hub.models.BertTextClassifier.from_preset(). If calling from a base
class, the subclass of the returned object will be inferred from the config in
the preset directory.
Arguments

- preset: string. A built-in preset identifier, a Kaggle Models handle, a
  Hugging Face handle, or a path to a local preset directory.
- load_weights: bool. If True, saved weights will be loaded into the model
  architecture. If False, all weights will be randomly initialized.

Examples
# Load a Gemma generative task.
causal_lm = keras_hub.models.CausalLM.from_preset(
"gemma_2b_en",
)
# Load a Bert classification task.
model = keras_hub.models.TextClassifier.from_preset(
"bert_base_en",
num_classes=2,
)
Preset | Parameters | Description |
---|---|---|
stable_diffusion_3_medium | 2.99B | 3 billion parameter model, including CLIP L and CLIP G text encoders, MMDiT generative model, and VAE autoencoder. Developed by Stability AI. |
backbone property

keras_hub.models.StableDiffusion3TextToImage.backbone

A keras_hub.models.Backbone model with the core architecture.
generate method

StableDiffusion3TextToImage.generate(
    inputs, num_steps=28, guidance_scale=7.0, seed=None
)

Generate images based on the provided inputs.

Typically, inputs contains a text description (known as a prompt) used
to guide the image generation.
Some models support a negative_prompts key, which helps steer the
model away from generating certain styles and elements. To enable this,
pass prompts and negative_prompts as a dict:
text_to_image.generate(
{
"prompts": "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k",
"negative_prompts": "green color",
}
)
If inputs are a tf.data.Dataset, outputs will be generated
"batch-by-batch" and concatenated. Otherwise, all inputs will be
processed as batches.
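For intuition, guidance_scale controls classifier-free guidance: the denoiser is evaluated both with and without the prompt conditioning (negative prompts take the place of the unconditional branch), and the two predictions are blended. A minimal sketch of that blend, with illustrative names (the real computation happens inside the sampling loop):

```python
import numpy as np

def classifier_free_guidance(cond_pred, uncond_pred, guidance_scale):
    # Push the prediction toward the prompt-conditioned output and away
    # from the unconditional (or negative-prompt) output.
    return uncond_pred + guidance_scale * (cond_pred - uncond_pred)

cond = np.array([1.0, 2.0])    # toy prompt-conditioned prediction
uncond = np.array([0.5, 1.0])  # toy unconditional prediction
print(classifier_free_guidance(cond, uncond, 7.0))  # [4. 8.]
```

Note that a scale of 1.0 simply recovers the conditional prediction; larger scales push further in the prompt's direction.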
Arguments

- inputs: python data, tensor data, or a tf.data.Dataset. The format must be
  one of the following: a string, a list of strings, a dict with "prompts"
  and/or "negative_prompts" keys, or a tf.data.Dataset with "prompts" and/or
  "negative_prompts" keys.
- num_steps: int. The number of diffusion steps to take.
- guidance_scale: float. The classifier-free guidance scale. A higher scale
  encourages generating images more closely related to the prompts, typically
  at the cost of some image quality.
- seed: optional int. Used as a random seed.

preprocessor property

keras_hub.models.StableDiffusion3TextToImage.preprocessor

A keras_hub.models.Preprocessor layer used to preprocess input.