Improving Generalization of Deep Networks for Estimating Physical Properties of Containers and Fillings

03/02/2022
by Hengyi Wang, et al.

We present methods to estimate the physical properties of household containers and their fillings as they are manipulated by humans. Our pipelines use a lightweight, pre-trained convolutional neural network with coordinate attention as the backbone model to accurately locate the object of interest and estimate its physical properties on the CORSMAL Containers Manipulation (CCM) dataset. We address filling type classification with audio data, and then combine the audio information with the video modality to address filling level classification. For container capacity, dimension, and mass estimation, we present a data augmentation strategy and a consistency measurement to alleviate the over-fitting caused by the limited number of containers in the CCM dataset. We augment the training data with an object-of-interest-based re-scaling that increases the variety of physical values of the containers. We then use the consistency measurement to choose a model with low prediction variance for the same container across different scenes, which supports the generalization ability of the model. Our method improves the ability of the models to estimate the properties of containers that were not seen during training.
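
The consistency-based model selection described above can be illustrated with a minimal sketch. The abstract does not specify the exact criterion, so this assumes the score is the mean, over containers, of the population variance of a model's predictions for that container across scenes. The function names, checkpoint names, and capacity values are hypothetical and only serve the illustration.

```python
from collections import defaultdict
from statistics import pvariance


def consistency_score(predictions):
    """Mean per-container variance of predictions (lower is more consistent).

    `predictions` is a list of (container_id, predicted_value) pairs, one pair
    per recording of a container in some scene. This scoring rule is an
    assumption, not the criterion from the paper.
    """
    by_container = defaultdict(list)
    for container_id, value in predictions:
        by_container[container_id].append(value)
    # Variance of the predictions for each container across its scenes.
    variances = [pvariance(values) for values in by_container.values()]
    return sum(variances) / len(variances)


def select_most_consistent(candidates):
    """Return the name of the candidate model with the lowest score."""
    return min(candidates, key=lambda name: consistency_score(candidates[name]))


# Hypothetical capacity predictions (in mL) from two candidate checkpoints,
# each evaluated on the same two containers in two different scenes.
candidates = {
    "checkpoint_a": [("cup", 300.0), ("cup", 340.0), ("box", 900.0), ("box", 1100.0)],
    "checkpoint_b": [("cup", 310.0), ("cup", 320.0), ("box", 980.0), ("box", 1000.0)],
}
best = select_most_consistent(candidates)
print(best)
```

Here the second checkpoint is selected because its predictions for each container vary less from scene to scene. A score like this needs no ground-truth labels, so in practice it would likely be combined with a validation error rather than used alone.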

research
03/15/2023

SegPrompt: Using Segmentation Map as a Better Prompt to Finetune Deep Models for Kidney Stone Classification

Recently, deep learning has produced encouraging results for kidney ston...
research
03/07/2022

A study on joint modeling and data augmentation of multi-modalities for audio-visual scene classification

In this paper, we propose two techniques, namely joint modeling and data...
research
05/25/2022

Augmentation-induced Consistency Regularization for Classification

Deep neural networks have become popular in many supervised learning tas...
research
09/08/2020

Simple is Better! Lightweight Data Augmentation for Low Resource Slot Filling and Intent Classification

Neural-based models have achieved outstanding performance on slot fillin...
research
07/27/2021

The CORSMAL benchmark for the prediction of the properties of containers

Acoustic and visual sensing can support the contactless estimation of th...
research
06/27/2023

Multi-perspective Information Fusion Res2Net with RandomSpecmix for Fake Speech Detection

In this paper, we propose the multi-perspective information fusion (MPIF...
