Decomposed Generation Networks with Structure Prediction for Recipe Generation from Food Images

07/27/2020
by Hao Wang, et al.

Recipe generation from food images and ingredients is a challenging task that requires interpreting information from another modality. Unlike image captioning, where a caption usually consists of a single sentence, cooking instructions span multiple sentences and have a clear structure. To help the model capture the recipe structure and avoid omitting cooking details, we propose a novel framework, Decomposed Generation Networks (DGN) with structure prediction, which produces more structured and complete recipes. Specifically, we split each cooking instruction into several phases and assign a different sub-generator to each phase. Our approach includes two novel ideas: (i) learning recipe structures with a global structure prediction component, and (ii) producing recipe phases in a sub-generator output component based on the predicted structure. Extensive experiments on the challenging large-scale Recipe1M dataset validate the effectiveness of the proposed DGN model, which improves over state-of-the-art results.
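
The two-stage idea described in the abstract (predict a structure first, then let a phase-specific sub-generator produce each phase) can be illustrated with a toy sketch. This is not the authors' implementation: the phase names, the rule-based `predict_structure`, and the template sub-generators below are all hypothetical stand-ins for what would be learned neural components conditioned on image and ingredient features.

```python
from typing import Callable, Dict, List

# Hypothetical phase labels; the abstract does not name the actual phases.
PHASES = ["prepare", "cook", "finish"]


def predict_structure(ingredients: List[str]) -> List[str]:
    """Toy stand-in for a global structure prediction component.

    A learned model would predict the phase sequence from image and
    ingredient features; here a fixed rule is assumed for illustration.
    """
    structure = ["prepare", "cook"]
    if len(ingredients) > 2:
        structure.append("finish")
    return structure


# Toy sub-generators, one per phase (templates instead of learned decoders).
def gen_prepare(ingredients: List[str]) -> str:
    return "Prepare " + ", ".join(ingredients) + "."


def gen_cook(ingredients: List[str]) -> str:
    return "Cook the " + ingredients[0] + "."


def gen_finish(ingredients: List[str]) -> str:
    return "Serve and enjoy."


SUB_GENERATORS: Dict[str, Callable[[List[str]], str]] = {
    "prepare": gen_prepare,
    "cook": gen_cook,
    "finish": gen_finish,
}


def generate_recipe(ingredients: List[str]) -> List[str]:
    """Predict the phase structure, then dispatch each phase to its sub-generator."""
    structure = predict_structure(ingredients)
    return [SUB_GENERATORS[phase](ingredients) for phase in structure]


if __name__ == "__main__":
    for step in generate_recipe(["egg", "butter", "salt"]):
        print(step)
```

The point of the sketch is only the control flow: the predicted structure decides which sub-generators run and in what order, so each generator is responsible for a single phase of the recipe.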


research
10/04/2021

Learning Structural Representations for Recipe Generation and Food Retrieval

Food is significant to human daily life. In this paper, we are intereste...
research
09/02/2020

Structure-Aware Generation Network for Recipe Generation from Images

Sharing food has become very popular with the development of social medi...
research
04/28/2021

Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning

Unsupervised image captioning is a challenging task that aims at generat...
research
12/02/2021

Controllable Video Captioning with an Exemplar Sentence

In this paper, we investigate a novel and challenging task, namely contr...
research
07/23/2020

Comprehensive Image Captioning via Scene Graph Decomposition

We address the challenging problem of image captioning by revisiting the...
research
07/01/2021

Egocentric Image Captioning for Privacy-Preserved Passive Dietary Intake Monitoring

Camera-based passive dietary intake monitoring is able to continuously c...
research
05/03/2016

Improving Image Captioning by Concept-based Sentence Reranking

This paper describes our winning entry in the ImageCLEF 2015 image sente...
