Memory Scheduling

1. What Is Memory Scheduling

Memory scheduling is MemOS's runtime ability to manage memory availability. It is not just about "which memory is found". Based on the current task, user state, historical topics, and memory heat, it decides which memories should stay closer to model context and which can remain in low-frequency storage.

You can think of memory scheduling as attention management in the memory system. When a user enters a task scenario, the system prepares the memories most likely to be needed and reduces interference from irrelevant history.

2. Why Scheduling Is Needed

If every request relies solely on on-the-spot full retrieval, the system faces three problems:

Slower response: waiting until the user asks before searching all history increases first-token latency.
Context overload: recalling too much history may bury the information actually needed for the current task.
Unnatural topic switching: when the user's recent focus changes, the system needs to raise the priority of the new topic and downgrade the old one, which per-request full retrieval alone does not do.

The goal of scheduling is not to store more content, but to make the right memories available at the right time.

3. What Scheduling Looks At

Signal	Role
Current task	Determines which topic or scenario the user is working on
Memory relevance	Identifies which memories are closer to the current input, conversation, and business goal
Freshness	Prioritizes information that is still valid and reduces the impact of outdated content
Usage frequency	Frequently used memories are more likely to be prepared in advance or kept active
Permission scope	Ensures scheduling respects user, Agent, tenant, and business isolation rules

Scheduling affects later recall and context injection. Relevant, active, and trustworthy memories are more likely to be used first; low-frequency, outdated, or context-inappropriate memories are delayed.
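One way to picture how these signals combine is a small scoring function. The sketch below is illustrative only: the `Memory` fields, the `priority` function, the weights, and the half-life are assumptions made for this example, not MemOS's actual API or algorithm. Permission scope is treated as a hard filter, while relevance, freshness, and usage frequency are blended into one score.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    topic: str
    owner: str        # permission scope, e.g. a user id (hypothetical field)
    age_days: float   # days since the memory was last known to be valid
    use_count: int    # how many times the memory has been used


def priority(mem, current_topic, requester, half_life_days=30.0):
    """Return a score in [0, 1], or None when the memory is out of scope."""
    if mem.owner != requester:
        return None  # permission scope is a hard filter, not a weight
    # Crude relevance: same topic as the current task, or not.
    relevance = 1.0 if mem.topic == current_topic else 0.2
    # Freshness halves every `half_life_days`.
    freshness = 0.5 ** (mem.age_days / half_life_days)
    # Usage frequency saturates toward 1.0 as use_count grows.
    frequency = 1.0 - math.exp(-mem.use_count / 5.0)
    # Assumed weights; a real system would tune or learn these.
    return 0.5 * relevance + 0.3 * freshness + 0.2 * frequency


tiles = Memory("Look at tiles this weekend", "renovation", "u1", 2, 4)
mortgage = Memory("Latest mortgage rate changes", "home buying", "u1", 60, 6)
other = Memory("Another user's note", "renovation", "u2", 1, 9)

scores = {
    m.text: priority(m, current_topic="renovation", requester="u1")
    for m in (tiles, mortgage, other)
}
```

With these assumed weights, the recent renovation memory scores higher than the older home-buying memory, and the memory owned by another user is excluded rather than merely ranked low.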

4. Example: From Buying a Home to Renovation

Earlier stage: buying a home is the core topic

User input

"Help me check the average second-hand home price around Binjiang."
"Remind me to view houses on Saturday."
"Record the latest mortgage rate changes."

Scheduling result

Generate memories about communities, house-viewing schedules, and mortgage rates.
Determine that "home buying" is a recent high-frequency topic.
Keep home-buying memories at a higher priority.

Recently: renovation becomes the new active topic

User input

"I'm going to look at tiles this weekend."
"Remind me to confirm plumbing and electrical work with the contractor."
"Note next week's furniture delivery time."

Scheduling result

Generate renovation-related memories as new input arrives.
Determine that "renovation" has become the new high-frequency topic.
Move renovation memories to a higher priority.
Keep home-buying memories, but gradually downgrade them.
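The shift from home buying to renovation can be sketched as a per-topic heat counter with exponential decay. This is a minimal illustration under stated assumptions: the `TopicHeat` class and the decay factor of 0.8 are hypothetical and do not describe how MemOS tracks topics internally.

```python
from collections import defaultdict


class TopicHeat:
    """Hypothetical tracker: each new input cools all topics, then heats one."""

    def __init__(self, decay=0.8):
        self.decay = decay
        self.heat = defaultdict(float)

    def observe(self, topic):
        # Cool every known topic, so older topics fade gradually.
        for name in self.heat:
            self.heat[name] *= self.decay
        # Heat the topic of the current input.
        self.heat[topic] += 1.0

    def ranked(self):
        # Hottest topic first.
        return sorted(self.heat, key=self.heat.get, reverse=True)


tracker = TopicHeat()
for topic in ["home buying"] * 3 + ["renovation"] * 3:
    tracker.observe(topic)

order = tracker.ranked()
```

After three home-buying inputs followed by three renovation inputs, `order` puts "renovation" first. The home-buying heat has decayed but is still above zero, which mirrors "keep, but gradually downgrade".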

The user casually says: "I feel like a lot of things are piling up. Please sort them out for me."

Without scheduling: temporary full retrieval

The system has to retrieve from all memories on the spot.
May mix in low-relevance items such as checking housing prices, viewing houses, grocery shopping, or watching movies.
The answer is slower and more likely to drift away from the current task.

With scheduling: prepare the current topic first

Prioritize renovation memories such as looking at tiles, confirming plumbing and electrical work, and furniture delivery.
No need to re-evaluate the full history every time.
Responses are faster and closer to what the user is currently worried about.

5. Relationship with Recall

Scheduling and recall are not the same thing:

Capability	Focus
Memory scheduling	Which memories should be more active and closer to model context at the current stage
Memory recall	Which memories should be retrieved and used in a specific request

Scheduling is runtime preparation and priority management. Recall is retrieval and selection for one request. When scheduling works well, recall is usually faster, more accurate, and less affected by irrelevant history.
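The division of labor can be sketched as two separate functions. Everything here is an assumption made for illustration: `schedule`, `recall`, the dictionary layout, and the word-overlap matching are hypothetical stand-ins, not MemOS interfaces. Scheduling runs ahead of the request and maintains a small active set. Recall runs per request, searches the active set first, and falls back to the full store only when nothing matches.

```python
def schedule(store, hot_topics, capacity=3):
    """Runtime preparation: choose which memories to keep active."""
    active = [m for m in store if m["topic"] in hot_topics]
    return active[:capacity]


def recall(query, active, store, limit=3):
    """Per-request selection: active set first, full store as fallback."""
    words = set(query.lower().split())

    def matches(pool):
        # Naive word overlap stands in for real relevance matching.
        return [m for m in pool if words & set(m["text"].lower().split())]

    hits = matches(active) or matches(store)
    return hits[:limit]


store = [
    {"topic": "home buying", "text": "record mortgage rate changes"},
    {"topic": "renovation", "text": "look at tiles this weekend"},
    {"topic": "renovation", "text": "confirm plumbing with contractor"},
]

active = schedule(store, hot_topics={"renovation"})
from_active = recall("tiles weekend", active, store)
from_fallback = recall("mortgage rate", active, store)
```

In this sketch a renovation query is answered from the prepared active set, while a home-buying query still succeeds through the fallback. Downgraded memories stay reachable; they simply are not searched first.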
