Analyzing Modular CNN Architectures for Joint Depth Prediction and Semantic Segmentation

02/26/2017
by   Omid Hosseini Jafari, et al.
0

This paper addresses the task of designing a modular neural network architecture that jointly solves different tasks. As an example we use the tasks of depth estimation and semantic segmentation given a single RGB image. The main focus of this work is to analyze the cross-modality influence between depth and semantic prediction maps on their joint refinement. While most previous works solely focus on measuring improvements in accuracy, we propose a way to quantify the cross-modality influence. We show that there is a relationship between final accuracy and cross-modality influence, although not a simple linear one. Hence a larger cross-modality influence does not necessarily translate into an improved accuracy. We find that a beneficial balance between the cross-modality influences can be achieved by network architecture and conjecture that this relationship can be utilized to understand different network design choices. Towards this end we propose a Convolutional Neural Network (CNN) architecture that fuses the state of the state-of-the-art results for depth estimation and semantic labeling. By balancing the cross-modality influences between depth and semantic prediction, we achieve improved results for both tasks using the NYU-Depth v2 benchmark.

READ FULL TEXT

page 3

page 6

research
12/17/2018

Learning Common Representation from RGB and Depth Images

We propose a new deep learning architecture for the tasks of semantic se...
research
04/25/2016

Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks

Multi-scale deep CNNs have been used successfully for problems mapping e...
research
09/13/2018

Real-Time Joint Semantic Segmentation and Depth Estimation Using Asymmetric Annotations

Deployment of deep learning models in robotics as sensory information ex...
research
01/19/2021

SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular images

Depth estimation and semantic segmentation play essential roles in scene...
research
05/20/2017

Recurrent Scene Parsing with Perspective Understanding in the Loop

Objects may appear at arbitrary scales in perspective images of a scene,...
research
06/21/2022

Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth

Multi-task learning (MTL) paradigm focuses on jointly learning two or mo...
research
07/03/2018

Viewpoint Estimation-Insights & Model

This paper addresses the problem of viewpoint estimation of an object in...

Please sign up or login with your details

Forgot password? Click here to reset