Semantic Segmentation
=====================

Install Package
---------------

- Clone the GitHub repo::

    git clone https://github.com/zhanghang1989/PyTorch-Encoding

- Install PyTorch Encoding (if not yet done). Please follow the installation guide `Installing PyTorch Encoding <../notes/compile.html>`_.

Get Pre-trained Model
---------------------

.. hint::
    The model names contain the training information. For instance, ``EncNet_ResNet50s_ADE``:

    - ``EncNet`` indicates the algorithm is "Context Encoding for Semantic Segmentation";
    - ``ResNet50`` is the name of the backbone network;
    - ``ADE`` means the model was trained on the ADE20K dataset.

    To get a pretrained model, for example ``EncNet_ResNet50s_ADE``::

        model = encoding.models.get_model('EncNet_ResNet50s_ADE', pretrained=True)

    After clicking ``cmd`` in the table, the command for training the model can be found below the table.

.. role:: raw-html(raw)
    :format: html

ResNeSt Backbone Models
-----------------------

ADE20K Dataset
~~~~~~~~~~~~~~

=======================  ========  ========  ===============
Model                    pixAcc    mIoU      Command
=======================  ========  ========  ===============
FCN_ResNeSt50_ADE        80.18%    42.94%    :raw-html:`cmd`
DeepLab_ResNeSt50_ADE    81.17%    45.12%    :raw-html:`cmd`
DeepLab_ResNeSt101_ADE   82.07%    46.91%    :raw-html:`cmd`
DeepLab_ResNeSt200_ADE   82.45%    48.36%    :raw-html:`cmd`
DeepLab_ResNeSt269_ADE   82.62%    47.60%    :raw-html:`cmd`
=======================  ========  ========  ===============

Pascal Context Dataset
~~~~~~~~~~~~~~~~~~~~~~

============================  ========  ========  ===============
Model                         pixAcc    mIoU      Command
============================  ========  ========  ===============
FCN_ResNeSt50_PContext        79.19%    51.98%    :raw-html:`cmd`
DeepLab_ResNeSt50_PContext    80.41%    53.19%    :raw-html:`cmd`
DeepLab_ResNeSt101_PContext   81.91%    56.49%    :raw-html:`cmd`
DeepLab_ResNeSt200_PContext   82.50%    58.37%    :raw-html:`cmd`
DeepLab_ResNeSt269_PContext   83.06%    58.92%    :raw-html:`cmd`
============================  ========  ========  ===============

ResNet Backbone Models
----------------------

ADE20K Dataset
~~~~~~~~~~~~~~

======================  ========  ========  ===============
Model                   pixAcc    mIoU      Command
======================  ========  ========  ===============
FCN_ResNet50s_ADE       78.7%     38.5%     :raw-html:`cmd`
EncNet_ResNet50s_ADE    80.1%     41.5%     :raw-html:`cmd`
EncNet_ResNet101s_ADE   81.3%     44.4%     :raw-html:`cmd`
======================  ========  ========  ===============
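The pixAcc and mIoU columns above are the standard pixel accuracy and mean intersection-over-union metrics. For reference, both can be derived from a confusion matrix over flat label/prediction arrays; below is a minimal NumPy sketch, independent of this package's own evaluation utilities (the helper name and the toy arrays are illustrative only, and background/ignore labels are treated as regular classes here):

.. code-block:: python

    import numpy as np

    def segmentation_metrics(target, predict, num_classes):
        """Return (pixAcc, mIoU) for integer label maps of the same shape."""
        target = np.asarray(target).ravel()
        predict = np.asarray(predict).ravel()
        # Confusion matrix: rows = ground truth, columns = prediction
        conf = np.bincount(target * num_classes + predict,
                           minlength=num_classes ** 2).reshape(num_classes, num_classes)
        # pixAcc: correctly labeled pixels over all pixels
        pix_acc = np.diag(conf).sum() / conf.sum()
        # Per-class IoU = TP / (TP + FP + FN); skip classes absent from both maps
        union = conf.sum(axis=0) + conf.sum(axis=1) - np.diag(conf)
        iou = np.diag(conf)[union > 0] / union[union > 0]
        return pix_acc, iou.mean()

    # Toy example: 2 classes, 4 pixels
    acc, miou = segmentation_metrics(np.array([0, 0, 1, 1]),
                                     np.array([0, 1, 1, 1]), 2)
    print(acc, miou)  # pixAcc = 0.75, mIoU = (1/2 + 2/3) / 2

Note that multi-scale evaluation (the ``--eval`` commands below) averages predictions over several input scales before this per-pixel scoring, which is why it reports higher numbers than single-crop validation.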
Pascal Context Dataset
~~~~~~~~~~~~~~~~~~~~~~

===========================  ========  ========  ===============
Model                        pixAcc    mIoU      Command
===========================  ========  ========  ===============
Encnet_ResNet50s_PContext    79.2%     51.0%     :raw-html:`cmd`
EncNet_ResNet101s_PContext   80.7%     54.1%     :raw-html:`cmd`
===========================  ========  ========  ===============

Pascal VOC Dataset
~~~~~~~~~~~~~~~~~~

======================  ========  ========  ===============
Model                   pixAcc    mIoU      Command
======================  ========  ========  ===============
EncNet_ResNet101s_VOC   N/A       85.9%     :raw-html:`cmd`
======================  ========  ========  ===============

Test Pretrained
~~~~~~~~~~~~~~~

- Prepare the dataset by running the scripts in the ``scripts/`` folder, for example preparing the ``ADE20K`` dataset::

    python scripts/prepare_ade20k.py

- The test script is in the ``experiments/segmentation/`` folder.
- For evaluating a model with multi-scale (MS) testing, for example ``EncNet_ResNet50s_ADE``::

    python test.py --dataset ADE20K --model-zoo EncNet_ResNet50s_ADE --eval
    # pixAcc: 0.801, mIoU: 0.415: 100%|████████████████████████| 250/250

Train Your Own Model
--------------------

- Prepare the dataset by running the scripts in the ``scripts/`` folder, for example preparing the ``ADE20K`` dataset::

    python scripts/prepare_ade20k.py

- The training script is in the ``experiments/segmentation/`` folder. Example training command::

    python train.py --dataset ade20k --model encnet --aux --se-loss

- For detailed training options, please run ``python train.py -h``. Commands for reproducing the pre-trained models can be found in the tables above.

.. hint::
    The validation metrics during training are computed with center-crop only and are intended just for monitoring training correctness. To evaluate a pretrained model on the validation set with multi-scale testing, please use the command::

        python test.py --dataset pcontext --model encnet --aux --se-loss --resume mycheckpoint --eval

Quick Demo
~~~~~~~~~~

.. code-block:: python

    import torch
    import encoding

    # Get the model
    model = encoding.models.get_model('Encnet_ResNet50s_PContext', pretrained=True).cuda()
    model.eval()

    # Prepare the image
    url = 'https://github.com/zhanghang1989/image-data/blob/master/' + \
          'encoding/segmentation/pcontext/2010_001829_org.jpg?raw=true'
    filename = 'example.jpg'
    img = encoding.utils.load_image(
        encoding.utils.download(url, filename)).cuda().unsqueeze(0)

    # Make prediction
    output = model.evaluate(img)
    predict = torch.max(output, 1)[1].cpu().numpy() + 1

    # Get color palette for visualization
    mask = encoding.utils.get_mask_pallete(predict, 'pascal_voc')
    mask.save('output.png')

.. image:: https://raw.githubusercontent.com/zhanghang1989/image-data/master/encoding/segmentation/pcontext/2010_001829_org.jpg
    :width: 45%

.. image:: https://raw.githubusercontent.com/zhanghang1989/image-data/master/encoding/segmentation/pcontext/2010_001829.png
    :width: 45%

Citation
--------
.. note::
    * Hang Zhang et al. "ResNeSt: Split-Attention Networks" *arXiv 2020*::

        @article{zhang2020resnest,
            title={ResNeSt: Split-Attention Networks},
            author={Zhang, Hang and Wu, Chongruo and Zhang, Zhongyue and Zhu, Yi and Zhang, Zhi and Lin, Haibin and Sun, Yue and He, Tong and Muller, Jonas and Manmatha, R. and Li, Mu and Smola, Alexander},
            journal={arXiv preprint arXiv:2004.08955},
            year={2020}
        }

    * Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal. "Context Encoding for Semantic Segmentation" *The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018*::

        @InProceedings{Zhang_2018_CVPR,
            author = {Zhang, Hang and Dana, Kristin and Shi, Jianping and Zhang, Zhongyue and Wang, Xiaogang and Tyagi, Ambrish and Agrawal, Amit},
            title = {Context Encoding for Semantic Segmentation},
            booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
            month = {June},
            year = {2018}
        }