MetaCLIP2CausalLMPreprocessor class
keras_hub.models.MetaCLIP2CausalLMPreprocessor(
tokenizer,
image_converter=None,
sequence_length=77,
add_start_token=True,
add_end_token=True,
to_lower=True,
**kwargs
)
MetaCLIP 2 preprocessor.
This preprocessing layer handles both text and image preprocessing for MetaCLIP 2 models. It tokenizes text inputs and resizes/normalizes images to match the model's expected input format.
Arguments
tokenizer: A keras_hub.models.MetaCLIP2Tokenizer instance.
image_converter: A keras_hub.models.MetaCLIP2ImageConverter instance.
add_start_token: If True, the preprocessor will prepend the tokenizer start token to each input sequence.
add_end_token: If True, the preprocessor will append the tokenizer end token to each input sequence.
Call arguments
x: A dict with "prompts" and "images" keys, where "prompts" is a tf.Tensor or a list of Python strings and "images" are the image tensors.
y: Label data. Should always be None, since MetaCLIP 2 doesn't need labels to calculate the loss.
sequence_length: Pass to override the configured sequence_length of the layer.
Examples
import keras_hub
import numpy as np

# Load the preprocessor from a preset.
preprocessor = keras_hub.models.MetaCLIP2CausalLMPreprocessor.from_preset(
"metaclip_2_vit_huge_patch14_224"
)
# Tokenize the sentence and preprocess the image.
preprocessor(
{
"prompts": "The quick brown fox jumped.",
"images": np.ones(shape=(224, 224, 3)),
}
)
# Tokenize a batch of sentences and preprocess a batch of images.
preprocessor(
{
"prompts": ["The quick brown fox jumped.", "The fox slept."],
"images": np.ones(shape=(2, 224, 224, 3)),
}
)
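To make the roles of add_start_token, add_end_token, and sequence_length concrete, here is a minimal pure-Python sketch of how a token sequence might be wrapped and packed to a fixed length. It is an illustration only, not the layer's actual implementation: the helper name pack_tokens, the token ids (1 for start, 2 for end, 0 for padding), the returned mask, and the truncation rule that keeps the end token in the last slot are all assumptions made for this example.

```python
def pack_tokens(token_ids, sequence_length, start_id=None, end_id=None, pad_id=0):
    """Hypothetical helper: wrap ids with start/end markers, then truncate or pad.

    Pass start_id / end_id as None to mimic add_start_token=False /
    add_end_token=False. All ids here are made up for illustration.
    """
    ids = list(token_ids)
    if start_id is not None:
        ids = [start_id] + ids
    if end_id is not None:
        ids = ids + [end_id]

    # Truncate to the target length. Assumption: when an end marker is
    # requested, it is kept in the final position after truncation.
    if len(ids) > sequence_length:
        ids = ids[:sequence_length]
        if end_id is not None:
            ids[-1] = end_id

    # Mark real tokens with 1 and padding with 0, then pad the ids.
    num_padding = sequence_length - len(ids)
    mask = [1] * len(ids) + [0] * num_padding
    ids = ids + [pad_id] * num_padding
    return ids, mask


# A short sequence is wrapped and padded.
short_ids, short_mask = pack_tokens([11, 12, 13], 8, start_id=1, end_id=2)

# A long sequence is truncated, keeping the end marker.
long_ids, long_mask = pack_tokens(range(10, 20), 5, start_id=1, end_id=2)
```

With these made-up ids, short_ids comes out as [1, 11, 12, 13, 2, 0, 0, 0] and long_ids as [1, 10, 11, 12, 2]. The real layer works on tokenizer output and uses the tokenizer's own special token ids, so inspect the preprocessor's output directly when exact values matter.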
from_preset method
MetaCLIP2CausalLMPreprocessor.from_preset(
preset, config_file="preprocessor.json", **kwargs
)
Instantiate a keras_hub.models.Preprocessor from a model preset.
A preset is a directory of configs, weights and other file assets used
to save and load a pre-trained model. The preset can be passed as
one of:
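1. a built-in preset identifier like 'bert_base_en'
2. a Kaggle Models handle like 'kaggle://user/bert/keras/bert_base_en'
3. a Hugging Face handle like 'hf://user/bert_base_en'
4. a path to a local preset directory like './bert_base_en'
For any Preprocessor subclass, you can run cls.presets.keys() to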
list all built-in presets available on the class.
As there are usually multiple preprocessing classes for a given model,
this method should be called on a specific subclass like
keras_hub.models.BertTextClassifierPreprocessor.from_preset().
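The four accepted preset forms differ in their prefix, which the following small sketch uses to tell them apart. The function describe_preset_handle is a hypothetical helper written for this illustration; it is not how keras_hub resolves presets internally, and it assumes that local paths start with "./", "../", or "/".

```python
def describe_preset_handle(preset):
    """Hypothetical helper: classify a preset string by its prefix.

    This only mirrors the four forms listed in the text. It does not
    check that the preset exists or can be loaded.
    """
    if preset.startswith("kaggle://"):
        return "kaggle"
    if preset.startswith("hf://"):
        return "huggingface"
    # Assumption: local directories are written as relative or absolute paths.
    if preset.startswith(("./", "../", "/")):
        return "local"
    # Anything else is treated as a built-in preset identifier.
    return "builtin"


for handle in [
    "metaclip_2_vit_huge_patch14_224",
    "kaggle://user/bert/keras/bert_base_en",
    "hf://user/bert_base_en",
    "./bert_base_en",
]:
    print(f"{handle!r}: {describe_preset_handle(handle)}")
```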
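Arguments
preset: string. A built-in preset identifier, a Kaggle Models handle, a Hugging Face handle, or a path to a local directory.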
Examples
import keras_hub

# Load a preprocessor for Gemma generation.
preprocessor = keras_hub.models.CausalLMPreprocessor.from_preset(
"gemma_2b_en",
)
# Load a preprocessor for BERT classification.
preprocessor = keras_hub.models.TextClassifierPreprocessor.from_preset(
"bert_base_en",
)
| Preset | Parameters | Description |
|---|---|---|
| metaclip_2_vit_huge_patch14_224 | 1.86B | 2 billion parameter, 32-layer for vision and 24-layer for text, patch size of 14, image resolution 224x224. MetaCLIP 2 worldwide huge model (ViT-H-14-quickgelu-worldwide) trained on 29B seen pairs with QuickGELU activation. |
| metaclip_2_vit_huge_patch14_378 | 1.86B | 2 billion parameter, 32-layer for vision and 24-layer for text, patch size of 14, image resolution 378x378. MetaCLIP 2 worldwide huge model (ViT-H-14-378-worldwide) trained on 29B seen pairs. |
| metaclip_2_vit_giant_patch14_224 | 3.63B | 4 billion parameter, 40-layer for vision and 24-layer for text, patch size of 14, image resolution 224x224. MetaCLIP 2 worldwide giant model (ViT-bigG-14-worldwide) trained on 29B seen pairs. |
| metaclip_2_vit_giant_patch14_378 | 3.63B | 4 billion parameter, 40-layer for vision and 24-layer for text, patch size of 14, image resolution 378x378. MetaCLIP 2 worldwide giant model (ViT-bigG-14-378-worldwide) trained on 29B seen pairs. |
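The patch size and image resolution in the table determine how many patches the vision encoder sees per image. As a worked example, the sketch below reads both numbers from a preset name and computes the patch grid. The helper patch_grid is hypothetical, and the "patch<size>_<resolution>" naming pattern is inferred only from the four presets listed above. The count covers image patches only and ignores any extra tokens a vision encoder may add.

```python
import re


def patch_grid(preset_name):
    """Hypothetical helper: derive (resolution, patches per side, total patches).

    Assumes the name ends in 'patch<size>_<resolution>', as the four
    presets in the table do.
    """
    match = re.search(r"patch(\d+)_(\d+)$", preset_name)
    if match is None:
        raise ValueError(f"Unrecognized preset name: {preset_name!r}")
    patch_size = int(match.group(1))
    resolution = int(match.group(2))
    if resolution % patch_size != 0:
        raise ValueError("Resolution is not a multiple of the patch size.")
    per_side = resolution // patch_size
    return resolution, per_side, per_side * per_side


for name in [
    "metaclip_2_vit_huge_patch14_224",
    "metaclip_2_vit_huge_patch14_378",
]:
    resolution, per_side, total = patch_grid(name)
    print(f"{name}: {resolution}x{resolution} image, "
          f"{per_side}x{per_side} grid, {total} patches")
```

For the 224x224 presets this gives a 16x16 grid (256 patches), and for the 378x378 presets a 27x27 grid (729 patches), so the higher-resolution variants process roughly 2.8 times as many patches per image.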