Standard practice in pretraining multimodal models, such as vision-langu...
NLP benchmarks have largely focused on short texts, such as sentences an...
A key limitation in current datasets for multi-hop reasoning is that the...
With models reaching human performance on many popular reading comprehen...