Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System

by Xinnian Liang et al.

Large-scale Language Models (LLMs) are constrained by their inability to process lengthy inputs. To address this limitation, we propose the Self-Controlled Memory (SCM) system to unleash infinite-length input capacity for large-scale language models. Our SCM system is composed of three key modules: the language model agent, the memory stream, and the memory controller. The language model agent iteratively processes ultra-long inputs and stores all historical information in the memory stream. The memory controller provides the agent with both long-term memory (archived memory) and short-term memory (flash memory) to generate precise and coherent responses. The controller determines which memories from archived memory should be activated and how to incorporate them into the model input. Our SCM system can be integrated with any LLM to enable it to process ultra-long texts without any modification or fine-tuning. Experimental results show that our SCM system enables LLMs, which are not optimized for multi-turn dialogue, to achieve multi-turn dialogue capabilities comparable to ChatGPT, and to outperform ChatGPT in scenarios involving ultra-long document summarization or long-term conversations. Additionally, we will supply a test set, covering common long-text input scenarios, for evaluating the ability of LLMs to process long documents. [Work in progress.][<https://github.com/wbbeyourself/SCM4LLMs>]
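The abstract's three-module design (agent, memory stream, memory controller) can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: the class names, the word-overlap relevance score used to decide which archived memories to activate, and the `flash_size`/`top_k` parameters are all assumptions chosen for readability, since the abstract does not specify the controller's activation policy.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStream:
    """Stores every historical turn the agent has processed."""
    turns: list = field(default_factory=list)

    def add(self, text: str) -> None:
        self.turns.append(text)


class MemoryController:
    """Assembles the agent's next input from short- and long-term memory.

    NOTE: word-overlap scoring is an illustrative stand-in; the SCM
    paper's actual activation mechanism is not described in the abstract.
    """

    def __init__(self, stream: MemoryStream, flash_size: int = 2, top_k: int = 1):
        self.stream = stream
        self.flash_size = flash_size  # most recent turns = flash (short-term) memory
        self.top_k = top_k            # how many archived memories to activate

    def build_context(self, query: str) -> list:
        flash = self.stream.turns[-self.flash_size:]
        archived = self.stream.turns[:-self.flash_size]
        q = set(query.lower().split())
        # Rank archived memories by crude relevance to the current query.
        scored = sorted(
            archived,
            key=lambda t: len(q & set(t.lower().split())),
            reverse=True,
        )
        # Activate only memories that share at least one word with the query.
        activated = [t for t in scored[: self.top_k] if q & set(t.lower().split())]
        # Activated long-term memories are prepended to the flash memory.
        return activated + flash
```

In use, the agent would call `build_context` each turn and feed the resulting list (plus the new query) to the underlying LLM, which is how an unmodified model can draw on arbitrarily old turns.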


