U-HRNet: Delving into Improving Semantic Representation of High Resolution Network for Dense Prediction

by Jian Wang, et al.

High resolution and advanced semantic representation are both vital for dense prediction. Empirically, low-resolution feature maps often achieve stronger semantic representation, while high-resolution feature maps are generally better at identifying local features such as edges but contain weaker semantic information. Existing state-of-the-art frameworks such as HRNet keep low-resolution and high-resolution feature maps in parallel and repeatedly exchange information across resolutions. However, we believe that the lowest-resolution feature map often contains the strongest semantic information and needs to pass through more layers before it is merged with the high-resolution feature maps. For high-resolution feature maps, by contrast, each convolutional layer is computationally expensive, and there is no need to pass through so many layers. Therefore, we design a U-shaped High-Resolution Network (U-HRNet), which adds more stages after the feature map with the strongest semantic representation and relaxes the constraint in HRNet that all resolutions must be computed in parallel for a newly added stage. More computation is allocated to low-resolution feature maps, which significantly improves the overall semantic representation. U-HRNet is a substitute for the HRNet backbone and achieves significant improvements on multiple semantic segmentation and depth prediction datasets under exactly the same training and inference settings, with almost no increase in the amount of computation. Code is available at PaddleSeg: https://github.com/PaddlePaddle/PaddleSeg.
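
The cost argument in the abstract can be illustrated with a rough back-of-the-envelope sketch. The snippet below is not taken from the paper or from PaddleSeg. It counts multiply-accumulate operations (MACs) for a single stride-1, 3x3 convolution at several downsampling strides, under the simplifying assumption that every resolution uses the same number of channels. The function names, the 512x1024 input size, and the channel count of 64 are all hypothetical choices for illustration. Real backbones often widen the channels at lower resolutions, which changes the ratios shown here.

```python
def conv_macs(height, width, c_in, c_out, kernel=3):
    """MACs for one stride-1, same-padded 2D convolution (bias ignored)."""
    return height * width * c_in * c_out * kernel * kernel


def macs_by_stride(input_hw=(512, 1024), strides=(4, 8, 16, 32), channels=64):
    """MACs of a single conv layer at each downsampling stride.

    Assumption (illustrative only): the channel count is identical at
    every resolution, so only the spatial area changes.
    """
    h, w = input_hw
    return {s: conv_macs(h // s, w // s, channels, channels) for s in strides}


if __name__ == "__main__":
    costs = macs_by_stride()
    lowest = costs[32]
    for stride, macs in costs.items():
        ratio = macs // lowest
        print(f"1/{stride} resolution: {macs:,} MACs ({ratio}x the 1/32 map)")
```

Under these assumptions, the per-layer cost scales with the spatial area of the feature map, so an extra layer on a low-resolution branch is far cheaper than one on a high-resolution branch.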




High-Resolution Swin Transformer for Automatic Medical Image Segmentation

The resolution of feature maps is critical for medical image segmentatio...

Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos

Video semantic segmentation (VSS) is a computationally expensive task du...

Resolution Adaptive Networks for Efficient Inference

Recently, adaptive inference is gaining increasing attention due to its ...

Using Features at Multiple Temporal and Spatial Resolutions to Predict Human Behavior in Real Time

When performing complex tasks, humans naturally reason at multiple tempo...

OmniZoomer: Learning to Move and Zoom in on Sphere at High-Resolution

Omnidirectional images (ODIs) have become increasingly popular, as their...

HR-CAM: Precise Localization of Pathology Using Multi-level Learning in CNNs

We propose a CNN based technique that aggregates feature maps from its m...

FatNet: High Resolution Kernels for Classification Using Fully Convolutional Optical Neural Networks

This paper describes the transformation of a traditional in-silico class...
