Focusing Memory in LLMs: Semantic and Temporal Aware Eviction for Enhanced Context Management
Project Details
- Student(s): Charbel Boutros
- Advisor(s): Dr. Samer Saab Jr.
- Department: Electrical & Computer Engineering
- Academic Year(s): 2024-2025
Abstract
Large Language Models (LLMs) are constrained by finite context windows, limiting their ability to maintain long-term conversational coherence and to process extensive information. Systems like MemGPT address this by implementing hierarchical memory, but MemGPT’s main-context eviction relies on a simple First-In-First-Out (FIFO) strategy. This paper proposes a novel Semantic and Temporal Aware Eviction (STAE) mechanism for MemGPT’s main-context queue. STAE embeds conversational turns (user-LLM message pairs) using OpenAI’s text-embedding-ada-002 and computes a “core conversational meaning” as the centroid of these embeddings. Eviction decisions are based on a weighted score that combines a message pair’s semantic distance from this centroid with its age in the queue. Preliminary qualitative results on multi-session chat tasks using GPT-4 indicate that STAE successfully identifies and evicts semantically outlying message pairs, potentially preserving more relevant long-term context than FIFO. This work is a step toward more intelligent active context management in memory-augmented LLMs.
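The scoring rule described above can be sketched in a few lines. The sketch below is illustrative, not the project’s implementation: the weight values, the use of cosine distance, and the min-max-style normalization are all assumptions, and placeholder vectors stand in for real text-embedding-ada-002 embeddings.

```python
import numpy as np

def stae_eviction_scores(embeddings, ages, w_sem=0.6, w_age=0.4):
    """Score each message pair for eviction (higher = evict sooner).

    embeddings: (n, d) array, one row per user-LLM message pair
                (in practice, text-embedding-ada-002 vectors).
    ages:       (n,) array of positions in the queue, 0 = newest.
    w_sem/w_age: illustrative weights, not values from the paper.
    """
    E = np.asarray(embeddings, dtype=float)
    # The "core conversational meaning" is the centroid of all embeddings.
    centroid = E.mean(axis=0)
    # Cosine distance of each pair from the centroid.
    sims = (E @ centroid) / (
        np.linalg.norm(E, axis=1) * np.linalg.norm(centroid)
    )
    dists = 1.0 - sims
    # Scale both terms into [0, 1] so the weights are comparable
    # (a normalization choice assumed here for illustration).
    ages = np.asarray(ages, dtype=float)
    dists_n = dists / dists.max() if dists.max() > 0 else dists
    ages_n = ages / ages.max() if ages.max() > 0 else ages
    return w_sem * dists_n + w_age * ages_n

def pick_eviction_candidate(embeddings, ages):
    """Index of the message pair STAE would evict first."""
    return int(np.argmax(stae_eviction_scores(embeddings, ages)))
```

With these weights, a semantically outlying pair can be evicted even when it is not the oldest, which is precisely where the behavior diverges from FIFO (FIFO would always evict the pair with the largest age).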