Context Encoding for Semantic Segmentation (EncNet)

Install Package

  • Clone the GitHub repo:

    git clone
  • Install PyTorch Encoding if you have not already, following the installation guide Installing PyTorch Encoding.

Test Pre-trained Model


The model names encode the training configuration. For instance, in FCN_ResNet50_PContext:
  • FCN indicates the algorithm, “Fully Convolutional Network for Semantic Segmentation”.
  • ResNet50 is the name of the backbone network.
  • PContext means the PASCAL in Context dataset.
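
The naming convention above can be unpacked mechanically. The following is a minimal sketch, assuming every model name has exactly three underscore-separated fields; `ModelName` and `parse_model_name` are hypothetical helpers for illustration and are not part of the encoding package.

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    algorithm: str  # e.g. "FCN" or "EncNet"
    backbone: str   # e.g. "ResNet50"
    dataset: str    # e.g. "PContext" or "ADE"


def parse_model_name(name: str) -> ModelName:
    """Split '<algorithm>_<backbone>_<dataset>' into its three fields."""
    parts = name.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected 3 underscore-separated fields, got {name!r}")
    return ModelName(*parts)


info = parse_model_name("FCN_ResNet50_PContext")
print(info.algorithm, info.backbone, info.dataset)
```

A name that does not follow the three-field pattern raises `ValueError` rather than being silently misread.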

To load a pretrained model, for example FCN_ResNet50_PContext:

model = encoding.models.get_model('FCN_ResNet50_PContext', pretrained=True)

The test script is in the experiments/segmentation/ folder. To evaluate a model with multi-scale (MS) testing, for example Encnet_ResNet50_PContext:

python --dataset PContext --model-zoo Encnet_ResNet50_PContext --eval
# pixAcc: 0.7888, mIoU: 0.5056: 100%|████████████████████████| 1276/1276 [46:31<00:00,  2.19s/it]

The command for training the model can be found by clicking cmd in the table.

Model                       pixAcc   mIoU    Note   Command   Logs
Encnet_ResNet50_PContext    78.9%    50.6%          cmd       ENC50PC
EncNet_ResNet101_PContext   80.3%    53.2%          cmd       ENC101PC
EncNet_ResNet50_ADE         79.9%    41.2%          cmd       ENC50ADE
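
The pixAcc and mIoU columns are pixel accuracy and mean intersection-over-union. The sketch below assumes the standard textbook definitions of both metrics on flattened label maps; `pixel_accuracy` and `mean_iou` are hypothetical helpers, and the repository's own evaluation code may differ in details such as ignored labels.

```python
from typing import Sequence


def pixel_accuracy(pred: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of pixels whose predicted label equals the ground truth."""
    correct = sum(p == t for p, t in zip(pred, target))
    return correct / len(target)


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, target))
        union = sum(p == c or t == c for p, t in zip(pred, target))
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Flattened 2x3 label maps with 3 classes (toy data, not from the benchmark)
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(pixel_accuracy(pred, target), mean_iou(pred, target, 3))
```

Because mIoU averages per-class scores, a class that covers few pixels weighs as much as a dominant one, which is why mIoU is typically much lower than pixAcc in the table.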

Quick Demo

import torch
import encoding

# Get the model
model = encoding.models.get_model('Encnet_ResNet50_PContext', pretrained=True).cuda()

# Prepare the image
url = ''  # image URL elided in this copy; set it before running
filename = 'example.jpg'
img = encoding.utils.load_image(
    encoding.utils.download(url, filename)).cuda().unsqueeze(0)

# Make prediction
output = model.evaluate(img)
predict = torch.max(output, 1)[1].cpu().numpy() + 1

# Get the color palette for visualization and save the mask
mask = encoding.utils.get_mask_pallete(predict, 'pcontext')
mask.save('output.png')
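
The prediction step in the demo takes, for each pixel, the index of the highest-scoring class along dimension 1 (the class dimension) and then adds 1. The sketch below reproduces that logic in plain Python on a toy score grid, without the batch dimension, so it runs without a GPU. `argmax_plus_one` is a hypothetical helper, and reading the `+ 1` as a shift from 0-based class indices to 1-based labels is an assumption.

```python
from typing import List

Scores = List[List[List[float]]]  # indexed as [class][row][col]


def argmax_plus_one(scores: Scores) -> List[List[int]]:
    """Per-pixel index of the highest-scoring class, shifted by one."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best + 1)  # mirrors the `+ 1` in the demo (assumed 1-based labels)
        labels.append(row)
    return labels


# Toy input: 3 classes over a 1x2 image
scores = [
    [[0.1, 0.7]],  # class 0
    [[0.8, 0.2]],  # class 1
    [[0.1, 0.1]],  # class 2
]
print(argmax_plus_one(scores))
```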

Train Your Own Model

  • Prepare the datasets by running the scripts in the scripts/ folder, for example to prepare the PASCAL Context dataset:

    python scripts/
  • The training script is in the experiments/segmentation/ folder. An example training command:

    CUDA_VISIBLE_DEVICES=0,1,2,3 python --dataset pcontext --model encnet --aux --se-loss
  • For detailed training options, run python -h.

  • The validation metrics reported during training use only a center crop and are meant solely for monitoring training correctness. To evaluate a pretrained model on the validation set with multi-scale (MS) testing, use the command:

    CUDA_VISIBLE_DEVICES=0,1,2,3 python --dataset pcontext --model encnet --aux --se-loss --resume mycheckpoint --eval
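
MS evaluation is read here as multi-scale evaluation: the model is run on several resized copies of the image and the per-class scores are combined before the per-pixel argmax. The following is a minimal sketch of the combining step only, assuming the score maps have already been resized back to a common resolution and are averaged with equal weight. `average_scores` and `predict_labels` are hypothetical helpers, and the repository's actual procedure (choice of scales, flipping, weighting) may differ.

```python
from typing import List


def average_scores(per_scale: List[List[List[float]]]) -> List[List[float]]:
    """Average [scale][class][pixel] scores over the scale axis."""
    num_scales = len(per_scale)
    num_classes = len(per_scale[0])
    num_pixels = len(per_scale[0][0])
    return [
        [sum(s[c][p] for s in per_scale) / num_scales for p in range(num_pixels)]
        for c in range(num_classes)
    ]


def predict_labels(avg: List[List[float]]) -> List[int]:
    """Per-pixel argmax over the class axis of [class][pixel] scores."""
    num_classes, num_pixels = len(avg), len(avg[0])
    return [max(range(num_classes), key=lambda c: avg[c][p]) for p in range(num_pixels)]


# Two scales, two classes, two pixels (toy numbers)
per_scale = [
    [[0.6, 0.2], [0.4, 0.8]],  # scores at scale A
    [[0.3, 0.1], [0.7, 0.9]],  # scores at scale B
]
print(predict_labels(average_scores(per_scale)))
```

In the toy data, scale A alone would assign the first pixel to class 0, while the averaged scores assign it to class 1, which illustrates how combining scales can change individual predictions.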



  • Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal. “Context Encoding for Semantic Segmentation” The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018:

    @InProceedings{Zhang_2018_CVPR,
    author = {Zhang, Hang and Dana, Kristin and Shi, Jianping and Zhang, Zhongyue and Wang, Xiaogang and Tyagi, Ambrish and Agrawal, Amit},
    title = {Context Encoding for Semantic Segmentation},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2018}
    }