Semantic segmentation is a dense, pixel-level label prediction task that is usually expensive in both resources and computation. Our approach focuses on balancing speed and accuracy, and it outperforms the state of the art on both for real-time segmentation. We propose a new efficient deep backbone that extracts more semantic detail, reduces computation cost, and remains easy to deploy. We call this backbone the Cascaded Mobile Network, and it proves effective. Our proposed model achieves 72.1 mIoU on the Cityscapes val set and 69.5 mIoU on CamVid, a good balance between speed and accuracy.
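For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how mean intersection-over-union is commonly computed from flattened label maps. It is not the authors' evaluation code: the function name `mean_iou`, its parameters, and the use of 255 as an ignore label are assumptions made for illustration. Benchmark scripts typically accumulate the counts over the whole dataset rather than per image.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes, for flat sequences of integer class labels.

    Hypothetical helper for illustration; ignore_index=255 is an assumption.
    """
    intersection = [0] * num_classes  # pixels where prediction and label agree
    pred_count = [0] * num_classes    # pixels predicted as each class
    target_count = [0] * num_classes  # pixels labelled as each class

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip unlabelled pixels
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and label
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Two classes, four pixels: class 0 has IoU 1/2, class 1 has IoU 2/3.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

A reported score such as 72.1 corresponds to this value expressed as a percentage.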
Wang, Z. H., Zhao, S., Shen, J., & Lei, Z. (2020). Efficient Light Deep Network for Street Scene Parsing. In 2020 IEEE International Conference on Visual Communications and Image Processing, VCIP 2020 (pp. 42–45). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/VCIP49819.2020.9301795