Context Encoding for Semantic Segmentation (EncNet)¶
Install Package¶
Clone the GitHub repo:
git clone https://github.com/zhanghang1989/PyTorch-Encoding
Install PyTorch Encoding (if you have not done so already) by following the installation guide, Installing PyTorch Encoding.
Test Pre-trained Model¶
Hint
The model names encode the training configuration. For instance, FCN_ResNet50_PContext:
- FCN indicates the algorithm, "Fully Convolutional Network for Semantic Segmentation".
- ResNet50 is the name of the backbone network.
- PContext means the PASCAL in Context dataset.
To get a pretrained model, for example FCN_ResNet50_PContext:
model = encoding.models.get_model('FCN_ResNet50_PContext', pretrained=True)
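As a quick sanity check after the download, you can inspect the returned model with standard PyTorch calls. A minimal sketch (the parameter count printed here is only illustrative):
import encoding
# Download (and cache) the pretrained weights, then build the model.
model = encoding.models.get_model('FCN_ResNet50_PContext', pretrained=True)
# Count trainable parameters with plain PyTorch as a sanity check.
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print('%.1fM trainable parameters' % (n_params / 1e6))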
Prepare the datasets by running the scripts in the scripts/ folder, for example preparing the PASCAL Context dataset:
python scripts/prepare_pcontext.py
The test script is in the experiments/segmentation/ folder. To evaluate a model with multi-scale (MS) inference, for example Encnet_ResNet50_PContext:
python test.py --dataset PContext --model-zoo Encnet_ResNet50_PContext --eval
# pixAcc: 0.792, mIoU: 0.510: 100%|████████████████████████| 1276/1276 [46:31<00:00, 2.19s/it]
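For reference, pixAcc and mIoU can be computed from a confusion matrix accumulated over the validation set. A minimal NumPy sketch of the metrics themselves (an illustration, not the repo's own utilities; it assumes integer label maps where negative values mark ignored pixels):
import numpy as np
def confusion_matrix(pred, target, nclass):
    # pred/target: integer label arrays of identical shape; labels < 0 are ignored.
    valid = target >= 0
    return np.bincount(nclass * target[valid] + pred[valid],
                       minlength=nclass ** 2).reshape(nclass, nclass)
def pixacc_miou(hist):
    # pixAcc: correctly labeled pixels over all valid pixels.
    pix_acc = np.diag(hist).sum() / hist.sum()
    # Per-class IoU: intersection over union; mIoU averages over classes.
    iou = np.diag(hist) / (hist.sum(0) + hist.sum(1) - np.diag(hist))
    return pix_acc, np.nanmean(iou)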
The command for training each model can be found by clicking cmd in the table below.
| Model | pixAcc | mIoU | Command |
|---|---|---|---|
| Encnet_ResNet50_PContext | 79.2% | 51.0% | cmd |
| EncNet_ResNet101_PContext | 80.7% | 54.1% | cmd |
| EncNet_ResNet50_ADE | 80.1% | 41.5% | cmd |
| EncNet_ResNet101_ADE | 81.3% | 44.4% | cmd |
| EncNet_ResNet101_VOC | N/A | 85.9% | cmd |
Quick Demo¶
import torch
import encoding
# Get the model
model = encoding.models.get_model('Encnet_ResNet50_PContext', pretrained=True).cuda()
model.eval()
# Prepare the image
url = 'https://github.com/zhanghang1989/image-data/blob/master/' + \
'encoding/segmentation/pcontext/2010_001829_org.jpg?raw=true'
filename = 'example.jpg'
img = encoding.utils.load_image(
encoding.utils.download(url, filename)).cuda().unsqueeze(0)
# Make prediction
output = model.evaluate(img)
predict = torch.max(output, 1)[1].cpu().numpy() + 1
# Get the color palette for visualization
mask = encoding.utils.get_mask_pallete(predict, 'pcontext')
mask.save('output.png')
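To eyeball the result, you can blend the saved mask with the input image using plain PIL (a small optional add-on to the demo; the file names follow the demo above):
from PIL import Image
# Overlay the color mask on the original image for a quick visual check.
base = Image.open('example.jpg').convert('RGBA')
overlay = Image.open('output.png').convert('RGBA').resize(base.size)
Image.blend(base, overlay, alpha=0.5).save('overlay.png')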


Train Your Own Model¶
Prepare the datasets by running the scripts in the scripts/ folder, for example preparing the PASCAL Context dataset:
python scripts/prepare_pcontext.py
The training script is in the experiments/segmentation/ folder. An example training command:
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset pcontext --model encnet --aux --se-loss
For detailed training options, please run python train.py -h. Commands for reproducing the pre-trained models can be found in the table above.
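Here, --aux adds an auxiliary segmentation head and --se-loss adds the Semantic Encoding loss from the paper, a per-image class-presence prediction. A minimal sketch of how such a combined criterion could look (the output tuple ordering and the 0.2 weights are assumptions for illustration, not the repo's exact defaults):
import torch.nn.functional as F
def encnet_criterion(outputs, target, se_target, aux_weight=0.2, se_weight=0.2):
    # outputs: (main segmentation logits, SE-loss logits, auxiliary logits).
    pred, se_pred, aux_pred = outputs
    loss = F.cross_entropy(pred, target, ignore_index=-1)
    # Auxiliary head: extra supervision on an intermediate feature map.
    loss = loss + aux_weight * F.cross_entropy(aux_pred, target, ignore_index=-1)
    # SE-loss: binary classification of which classes appear in the image.
    loss = loss + se_weight * F.binary_cross_entropy_with_logits(se_pred, se_target)
    return loss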
Hint
The validation metrics reported during training use only center crops and are intended for monitoring training correctness. To evaluate the pretrained model on the validation set with multi-scale (MS) inference, use the following command:
CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset pcontext --model encnet --aux --se-loss --resume mycheckpoint --eval
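Conceptually, MS evaluation averages predictions over rescaled copies of the input. A rough sketch of the idea (the scales are illustrative, and the repo's own multi-scale evaluator is more involved, e.g. it also flips the input):
import torch.nn.functional as F
def ms_evaluate(model, img, scales=(0.75, 1.0, 1.25)):
    # img: 1 x 3 x H x W tensor; average the scores across scales.
    h, w = img.shape[2:]
    total = 0
    for s in scales:
        x = F.interpolate(img, scale_factor=s, mode='bilinear', align_corners=True)
        out = model.evaluate(x)
        total = total + F.interpolate(out, size=(h, w), mode='bilinear', align_corners=True)
    return total / len(scales)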
Citation¶
Note
Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal. “Context Encoding for Semantic Segmentation” The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018:
@InProceedings{Zhang_2018_CVPR,
  author    = {Zhang, Hang and Dana, Kristin and Shi, Jianping and Zhang, Zhongyue and Wang, Xiaogang and Tyagi, Ambrish and Agrawal, Amit},
  title     = {Context Encoding for Semantic Segmentation},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2018}
}