If a new topic is not started after each memory extraction, then once the number of historical conversation messages exceeds the trigger count for automatic memory extraction, the summarized memory and the corresponding historical conversation end up being sent to the model at the same time. Starting a new topic manually in such cases is an acceptable workaround, but is there a better way to avoid this duplication?
Take the example of "60 historical conversation messages kept, with automatic memory extraction triggered every 20 messages." Could the memory corresponding to the first 20 messages be loaded into context only at the 80th message, at which point the summarized conversations have completely exited the current context? But that would leave a 20-message memory gap: between the 61st and 80th messages, the first 20 messages gradually exit the context while their corresponding memory has not yet been loaded.
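To make the timing concrete, here is a minimal sketch of the arithmetic in the question above. The window size, trigger count, and function name are my own assumptions for illustration, not anything from the actual app:

```python
# Hypothetical model of the scenario above: a 60-message history window
# with extraction triggered every 20 messages. When can the summary of
# messages 1-20 be injected without overlapping them in context?
WINDOW = 60   # messages kept in context (assumption)
TRIGGER = 20  # extraction interval (assumption)

def context_range(total_messages):
    """Messages currently visible to the model (1-indexed, inclusive)."""
    start = max(1, total_messages - WINDOW + 1)
    return start, total_messages

# The summary of messages 1-20 stops overlapping the context only once
# message 20 has left the window, i.e. at the 80th message:
assert context_range(80) == (21, 80)  # messages 1-20 fully gone
assert context_range(79) == (20, 79)  # message 20 still in context
# Between messages 61 and 79 the first 20 leave one by one, so delaying
# the summary until message 80 produces the "memory gap" described above.
```

This is exactly the trade-off the question describes: inject the summary early and it overlaps the raw history; inject it late and there is a gap.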
Or is there some other, better solution?
This question has puzzled me a bit ever since I saw Teacher Fangtang mention in other posts the hope that "long-term memory summarization becomes imperceptible" (I completely agree!). Looking forward to everyone's thoughts~!
Apologies in advance: I'm not proficient with Omate, my analysis may be incomplete, and perhaps I'm the only one nitpicking about this. Just looking for friendly discussion, and sorry for taking up Teacher Fangtang's and everyone's time. If anything here is inappropriate, I apologize.
There is overlap, but it is not complete repetition.
The short conclusion: I don't think this issue needs to be addressed.
The detailed explanation is as follows:
When the automatic memory extraction trigger is set to 20, it summarizes the latest 20 records, not the earliest 20, so there is no memory gap.
Long-term memory focuses on key content automatically extracted according to the prompt, and is not equivalent to the conversation history (for example, it can analyze a character's emotions and record them in memory). So I consider it reasonable for long-term memory and chat history to overlap.
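The "no memory gap" point above can be sketched in a few lines. This is only my illustration of the described behavior, not Omate's actual code; the function and constant names are made up:

```python
# Sketch of the trigger behavior described above: each time the counter
# reaches the threshold, the *latest* TRIGGER messages are summarized,
# so successive extractions cover contiguous, non-overlapping segments
# of the history and no messages are skipped.
TRIGGER = 20  # assumed trigger setting

def extraction_segments(total_messages, trigger=TRIGGER):
    """Return the (start, end) message ranges summarized so far (1-indexed)."""
    return [(i + 1, i + trigger)
            for i in range(0, total_messages - trigger + 1, trigger)]

# After 60 messages, three extractions have run, back to back:
assert extraction_segments(60) == [(1, 20), (21, 40), (41, 60)]
```

Because each segment starts exactly where the previous one ended, summarizing the latest batch (rather than the earliest) leaves no uncovered span of history.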
Yes, Notepad can indeed be used this way, but it needs an agent to drive it, so it counts as an advanced solution. In fact, agent mode also provides a keyword-search tool.
As for memory, Mem0's solution is currently quite good, but its pipeline is a bit complicated and somewhat slow, and I think it may also need tuning for role-play.
Wow, thank you for the reply, teacher! There's actually a concrete implementation plan, amazing! I haven't used Notepad yet, so I'll study it when I have time. Since I haven't used it, may I ask a follow-up: are all these steps performed by the AI itself (operations like incrementing the counter, resetting it, and moving the temporarily stored memory into the displayable memory)? Is current AI intelligent and reliable enough for this work, or does the Notepad feature provide some scaffolding that makes the AI more reliable at it? I know AI is very powerful, haha, but I always feel a subtle distrust when letting it execute tasks with high reliability requirements (after all, when the model's intelligence drops, even the output format can break). Looking forward to your reply; I'll also try out the Notepad feature myself. Thank you!
Thank you, Teacher Fangtang, for your reply! Are you referring to the permanent memory feature? I tried it many versions ago, but it seemed to have some issues (retrieving the wrong historical dialogue, and NSFW content being surfaced as system prompts, which could raise the chance of the model tripping its dynamic content review). I'm not sure whether that was my mistake or a bug, so I stopped using it. I'll study and experiment with it again when I have time. The video content is also very helpful. Thank you, teacher!
Write the above process into concrete prompts, then use prompt injection to make the AI call tools according to that flow. Like role-play rules and other preset prompts, these go into your overall prompt architecture.
The prompt logic needs to be rigorous and detailed to prevent errors. I usually use DeepSeek, and I store my character settings, worldview, memory, and other data in the Notepad. In practice, DeepSeek executes tool-call prompts with high accuracy and few mistakes.
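The counter-driven flow described in this thread (increment on each turn, summarize and reset at the trigger, flush temporary memory into the displayable memory) can be modeled roughly like this. This is my own Python mock-up under stated assumptions, not the actual Notepad tool API; `summarize` is a placeholder for the model's summarization call:

```python
# Rough model of the counter-driven Notepad flow discussed above.
# All names and the dict layout are assumptions for illustration.
TRIGGER = 20  # assumed trigger setting

notepad = {"counter": 0, "temp_memory": [], "long_term_memory": []}

def summarize(messages):
    # Placeholder for the model's actual summarization call.
    return f"summary of {len(messages)} messages"

def on_new_message(history):
    notepad["counter"] += 1                        # counter +1
    if notepad["counter"] >= TRIGGER:
        # Summarize the latest batch into a temporary buffer.
        notepad["temp_memory"].append(summarize(history[-TRIGGER:]))
        notepad["counter"] = 0                     # reset
        # Flush the temporary memory into the displayable memory.
        notepad["long_term_memory"].extend(notepad["temp_memory"])
        notepad["temp_memory"].clear()

history = []
for i in range(40):
    history.append(f"msg {i + 1}")
    on_new_message(history)

# Two extractions after 40 turns, with the counter back at zero:
assert len(notepad["long_term_memory"]) == 2
assert notepad["counter"] == 0
```

Writing each of these steps explicitly into the prompt, as the reply above suggests, is what keeps the model from skipping the reset or the flush when its attention slips.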