research ∙ 08/16/2023
Time Travel in LLMs: Tracing Data Contamination in Large Language Models
Data contamination, i.e., the presence of test data from downstream task...
research ∙ 07/14/2023
Do not Mask Randomly: Effective Domain-adaptive Pre-training by Masking In-domain Keywords
We propose a novel task-agnostic in-domain pre-training method that sits...
research ∙ 08/25/2022