Subword tokenization is a key part of many NLP pipelines. However, littl...
Byte-Pair Encoding (BPE) is a popular algorithm used for tokenizing data...
Causal inference is one of the hallmarks of human intelligence. While th...
Solving math story problems is a complex task for students and NLP model...
The Abstraction and Reasoning Corpus (ARC) (Chollet, 2019) and its most ...
Adaptive learning aims to provide customized educational activities (e.g...
Membership inference attacks (MIAs) aim to predict whether a data sample...
We show that most structured prediction problems can be solved in linear...
Mathematical reasoning in large language models (LLMs) has garnered atte...
Transformer models have propelled advances in various NLP tasks, thus ...
Although automatic dialogue tutors hold great potential in making educat...
Multi-task learning (MTL) aims at achieving a better model by leveraging...
The fixed-size context of the Transformer makes GPT models incapable of gene...
Topic models are used to make sense of large text collections. However, ...
The primary way of building AI applications is shifting from training sp...
Several recent papers claim human parity at sentence-level Machine Trans...
We present a novel extension of the traditional neural network approach ...
With the recent advances in natural language processing (NLP), a vast nu...
NLP datasets are richer than just input-output pairs; rather, they carry...
Large language models generate fluent texts and can follow natural langu...
Textbooks are the primary vehicle for delivering quality education to st...
Word embeddings that map words into a fixed-dimensional vector space are...
Ideally, dialogue systems should generate responses that are faithful to...
Conversational tutoring systems (CTSs) aim to help students master educa...
Designing dialog tutors has been challenging as it involves modeling the...
Machine translation quality estimation (QE) predicts human judgements of...
Generated texts from large pretrained language models have been shown to...
Step-by-step reasoning approaches like chain-of-thought (CoT) have prove...
Socratic questioning is an educational method that allows students to di...
Recent work has demonstrated that pre-trained language models (PLMs) are...
Recent years have seen a paradigm shift in NLP towards using pretrained ...
Centering theory (CT; Grosz et al., 1995) provides a linguistic analysis...
Machine translation (MT) has almost achieved human parity at sentence-le...
To protect the privacy of individuals whose data is being shared, it is ...
Large language models appear to learn facts from the large text corpora ...
We have recently witnessed a number of impressive results on hard mathem...
OntoNotes has served as the most important benchmark for coreference res...
AI systems are becoming increasingly intertwined with human life. In ord...
Probing is a popular method to discern what linguistic information is co...
Many natural language processing tasks, e.g., coreference resolution and...
Human-translated text displays distinct features from naturally written ...
In typical machine learning systems, an estimate of the probability of t...
Languages are continuously undergoing changes, and the mechanisms that u...
Reasoning is central to human intelligence. However, fallacious argument...
Pretrained language models (LMs) do not capture factual knowledge very w...
Text-based games (TBG) have emerged as promising environments for drivin...
In this work we introduce KERNELIZED TRANSFORMER, a generic, scalable, d...
The principle of independent causal mechanisms (ICM) states that generat...
When reading a literary piece, readers often make inferences about vario...
Multi-head attention, a collection of several attention mechanisms that ...