Image ad understanding is a crucial task with wide real-world applicatio...
Diffusion models, such as Stable Diffusion, have shown incredible perfor...
Creativity is an indispensable part of human cognition and also an inher...
Large-scale diffusion models have achieved state-of-the-art results on t...
Prompt tuning is a new few-shot transfer learning technique that only tu...
Vision-and-language navigation (VLN) is a multimodal task where an agent...
A major challenge in visually grounded language generation is to build r...
As applications in large organizations evolve, the machine learning (ML)...
In the vision-and-language navigation (VLN) task, an agent follows natur...
Multi-sentence summarization is a well-studied problem in NLP, while gen...
There is a recent surge of interest in cross-modal representation learni...