Computer Vision

Image classification

Image classification from scratch
Simple MNIST convnet
Image classification via fine-tuning with EfficientNet
Image classification with Vision Transformer
Image Classification using BigTransfer (BiT)
Classification using Attention-based Deep Multiple Instance Learning
Image classification with modern MLP models
A mobile-friendly Transformer-based model for image classification
Pneumonia Classification on TPU
Compact Convolutional Transformers
Image classification with ConvMixer
Image classification with EANet (External Attention Transformer)
Involutional neural networks
Image classification with Perceiver
Few-Shot learning with Reptile
Semi-supervised image classification using contrastive pretraining with SimCLR
Image classification with Swin Transformers
Train a Vision Transformer on small datasets
A Vision Transformer without Attention

Image segmentation

Image segmentation with a U-Net-like architecture
Multiclass semantic segmentation using DeepLabV3+
Highly accurate boundaries segmentation using BASNet

Object detection

Object Detection with RetinaNet
Keypoint Detection with Transfer Learning
Object detection with Vision Transformers

3D

3D image classification from CT scans
Monocular depth estimation
3D volumetric rendering with NeRF
Point cloud classification

OCR

OCR model for reading Captchas
Handwriting recognition

Image enhancement

Convolutional autoencoder for image denoising
Low-light image enhancement using MIRNet
Image Super-Resolution using an Efficient Sub-Pixel CNN
Enhanced Deep Residual Networks for single-image super-resolution
Zero-DCE for low-light image enhancement

Data augmentation

CutMix data augmentation for image classification
MixUp augmentation for image classification
RandAugment for Image Classification for Improved Robustness

Image & Text

Image captioning
Natural language image search with a Dual Encoder

Vision models interpretability

Visualizing what convnets learn
Model interpretability with Integrated Gradients
Investigating Vision Transformer representations
Grad-CAM class activation visualization

Image similarity search

Near-duplicate image search
Semantic Image Clustering
Image similarity estimation using a Siamese Network with a contrastive loss
Image similarity estimation using a Siamese Network with a triplet loss
Metric learning for image similarity search
Metric learning for image similarity search using TensorFlow Similarity

Video

Video Classification with a CNN-RNN Architecture
Next-Frame Video Prediction with Convolutional LSTMs
Video Classification with Transformers
Video Vision Transformer

Other

Semi-supervision and domain adaptation with AdaMatch
Barlow Twins for Contrastive SSL
Class Attention Image Transformers with LayerScale
Consistency training with supervision
Distilling Vision Transformers
FixRes: Fixing train-test resolution discrepancy
Focal Modulation: A replacement for Self-Attention
Using the Forward-Forward Algorithm for Image Classification
Image Segmentation using Composable Fully-Convolutional Networks
Gradient Centralization for Better Training Performance
Knowledge Distillation
Learning to Resize in Computer Vision
Masked image modeling with Autoencoders
Self-supervised contrastive learning with NNCLR
Augmenting convnets with aggregated attention
Point cloud segmentation with PointNet
Semantic segmentation with SegFormer and Hugging Face Transformers
Self-supervised contrastive learning with SimSiam
Supervised Contrastive Learning
When Recurrence meets Transformers
Learning to tokenize in Vision Transformers
Efficient Object Detection with YOLOV8 and KerasCV