This report describes a test of the large language model GPT-4 with the
...
More than one hundred benchmarks have been developed to test the commons...
The paper discusses the capacities and limitations of current artificial...
Drori et al. (2022) report that "A neural network solves, explains, and
...
The DALL-E 2 system generates original synthetic images corresponding to...
Pronoun disambiguation in understanding text and discourse often require...
Most work on physical reasoning, both in artificial intelligence and in
...
The Winograd Schema Challenge – a set of twin sentences involving pronou...
A recent paper by Davies et al. (2021) describes how deep learning (DL)
t...
Arabshahi, Singh, and Anandkumar (2018) propose a method for creating a
...
The TransCoder system translates source code between Java, C++, and Pyth...
The Winograd Schema Challenge is both a commonsense reasoning and natura...
Lample and Charton (2019) describe a system that uses deep learning
tech...
A Winograd schema is a pair of sentences that differ in a single word an...
It has been proposed that human physical reasoning consists largely of
r...
In this position paper, I argue that standardized tests for elementary
s...