SegFormerImageSegmenter
class keras_hub.models.SegFormerImageSegmenter(
backbone, num_classes, preprocessor=None, **kwargs
)
A Keras model implementing SegFormer for semantic segmentation.
This class implements the segmentation head of the SegFormer architecture described in [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) and is [based on the TensorFlow implementation from DeepVision](https://github.com/DavidLandup0/deepvision/tree/main/deepvision/models/segmentation/segformer).
SegFormers are meant to be used with the MixTransformer (MiT) encoder family, and use a very lightweight all-MLP decoder head.
The MiT encoder uses a hierarchical transformer which outputs features at multiple scales, similar to the hierarchical feature maps typically produced by CNNs.
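The multi-scale behavior can be sketched with plain arithmetic: with the stage strides used in the custom-encoder example below ([4, 2, 2, 2]), each stage downsamples by its stride, so a 512x512 input yields features at 1/4, 1/8, 1/16, and 1/32 resolution, corresponding to the "P2" through "P5" pyramid keys. The helper below is illustrative only (not part of keras_hub):

```python
# Illustrative arithmetic only, not keras_hub code: each MiT stage
# downsamples by its stride, so a stage's feature resolution is the
# input size divided by the cumulative stride up to that stage.
def pyramid_resolutions(input_size, strides=(4, 2, 2, 2)):
    sizes = {}
    total = 1
    for level, stride in enumerate(strides, start=2):  # keys "P2".."P5"
        total *= stride
        sizes[f"P{level}"] = input_size // total
    return sizes

print(pyramid_resolutions(512))
# {'P2': 128, 'P3': 64, 'P4': 32, 'P5': 16}
```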
Arguments

backbone: keras.Model. The backbone network for the model that is used as a feature extractor for the SegFormer encoder. It is intended to be used only with the MiT backbone model (keras_hub.models.MiTBackbone), which was created specifically for SegFormers. Alternatively, it can be a keras_hub.models.Backbone, a model subclassing keras_hub.models.FeaturePyramidBackbone, or a keras.Model that has a pyramid_outputs property which is a dictionary with keys "P2", "P3", "P4", and "P5" and layer names as values.

Example
Using presets:
segmenter = keras_hub.models.SegFormerImageSegmenter.from_preset(
"segformer_b0_ade20k_512"
)
images = np.random.rand(1, 512, 512, 3)
segmenter(images)
Using the SegFormer backbone:
encoder = keras_hub.models.MiTBackbone.from_preset(
"mit_b0_ade20k_512"
)
backbone = keras_hub.models.SegFormerBackbone(
image_encoder=encoder,
projection_filters=256,
)
Using the SegFormer backbone with a custom encoder:
images = np.ones(shape=(1, 96, 96, 3))
labels = np.zeros(shape=(1, 96, 96, 1))
encoder = keras_hub.models.MiTBackbone(
depths=[2, 2, 2, 2],
image_shape=(96, 96, 3),
hidden_dims=[32, 64, 160, 256],
num_layers=4,
blockwise_num_heads=[1, 2, 5, 8],
blockwise_sr_ratios=[8, 4, 2, 1],
max_drop_path_rate=0.1,
patch_sizes=[7, 3, 3, 3],
strides=[4, 2, 2, 2],
)
backbone = keras_hub.models.SegFormerBackbone(
image_encoder=encoder,
projection_filters=256,
)
segformer = keras_hub.models.SegFormerImageSegmenter(
backbone=backbone,
num_classes=4,
)
segformer(images)
Using the segmenter class with a preset backbone:
image_encoder = keras_hub.models.MiTBackbone.from_preset(
"mit_b0_ade20k_512"
)
backbone = keras_hub.models.SegFormerBackbone(
image_encoder=image_encoder,
projection_filters=256,
)
segformer = keras_hub.models.SegFormerImageSegmenter(
backbone=backbone,
num_classes=4,
)
from_preset
SegFormerImageSegmenter.from_preset(preset, load_weights=True, **kwargs)
Instantiate a keras_hub.models.Task
from a model preset.
A preset is a directory of configs, weights and other file assets used
to save and load a pre-trained model. The preset
can be passed as
one of:
'bert_base_en'
'kaggle://user/bert/keras/bert_base_en'
'hf://user/bert_base_en'
'./bert_base_en'
For any Task
subclass, you can run cls.presets.keys()
to list all
built-in presets available on the class.
This constructor can be called in one of two ways. Either from a task
specific base class like keras_hub.models.CausalLM.from_preset()
, or
from a model class like
keras_hub.models.BertTextClassifier.from_preset()
.
If calling from a base class, the subclass of the returned object
will be inferred from the config in the preset directory.
Arguments

preset: string. A built-in preset identifier, a Kaggle Models handle, a Hugging Face handle, or a path to a local directory.
load_weights: bool. If True, saved weights will be loaded into the model architecture. If False, all weights will be randomly initialized.

Examples
# Load a Gemma generative task.
causal_lm = keras_hub.models.CausalLM.from_preset(
"gemma_2b_en",
)
# Load a Bert classification task.
model = keras_hub.models.TextClassifier.from_preset(
"bert_base_en",
num_classes=2,
)
Preset | Parameters | Description |
---|---|---|
segformer_b0_ade20k_512 | 3.72M | SegFormer model with MiTB0 backbone fine-tuned on ADE20k in 512x512 resolution. |
segformer_b0_cityscapes_1024 | 3.72M | SegFormer model with MiTB0 backbone fine-tuned on Cityscapes in 1024x1024 resolution. |
segformer_b1_ade20k_512 | 13.68M | SegFormer model with MiTB1 backbone fine-tuned on ADE20k in 512x512 resolution. |
segformer_b1_cityscapes_1024 | 13.68M | SegFormer model with MiTB1 backbone fine-tuned on Cityscapes in 1024x1024 resolution. |
segformer_b2_ade20k_512 | 24.73M | SegFormer model with MiTB2 backbone fine-tuned on ADE20k in 512x512 resolution. |
segformer_b2_cityscapes_1024 | 24.73M | SegFormer model with MiTB2 backbone fine-tuned on Cityscapes in 1024x1024 resolution. |
segformer_b3_ade20k_512 | 44.60M | SegFormer model with MiTB3 backbone fine-tuned on ADE20k in 512x512 resolution. |
segformer_b3_cityscapes_1024 | 44.60M | SegFormer model with MiTB3 backbone fine-tuned on Cityscapes in 1024x1024 resolution. |
segformer_b4_ade20k_512 | 61.37M | SegFormer model with MiTB4 backbone fine-tuned on ADE20k in 512x512 resolution. |
segformer_b4_cityscapes_1024 | 61.37M | SegFormer model with MiTB4 backbone fine-tuned on Cityscapes in 1024x1024 resolution. |
segformer_b5_ade20k_640 | 81.97M | SegFormer model with MiTB5 backbone fine-tuned on ADE20k in 640x640 resolution. |
segformer_b5_cityscapes_1024 | 81.97M | SegFormer model with MiTB5 backbone fine-tuned on Cityscapes in 1024x1024 resolution. |
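The preset names in the table above follow a consistent pattern: model family, MiT backbone variant, fine-tuning dataset, and training resolution, separated by underscores. A small helper (illustrative only, not part of the keras_hub API) makes the convention explicit:

```python
# Hypothetical helper, not part of keras_hub: split a SegFormer preset
# name into its components (family, backbone variant, dataset, resolution).
def parse_preset(name):
    family, variant, dataset, resolution = name.split("_")
    return {
        "family": family,          # e.g. "segformer"
        "variant": variant,        # MiT backbone size, "b0" through "b5"
        "dataset": dataset,        # "ade20k" or "cityscapes"
        "resolution": int(resolution),
    }

print(parse_preset("segformer_b0_ade20k_512"))
# {'family': 'segformer', 'variant': 'b0', 'dataset': 'ade20k', 'resolution': 512}
```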
backbone
keras_hub.models.SegFormerImageSegmenter.backbone
A keras_hub.models.Backbone
model with the core architecture.
preprocessor
keras_hub.models.SegFormerImageSegmenter.preprocessor
A keras_hub.models.Preprocessor
layer used to preprocess input.