Subconscious Memory Recipe¶
New to Agent Memory? Start with the Quickstart Guide for a progressive adoption path.
Automatic memory injection and extraction around every LLM turn. The agent's memory works silently -- assembling relevant context before each call and saving new observations after each response -- without the application needing to manage memory explicitly.
Architecture¶
```
User message
     |
     v
[Pre-turn: ContextAssembler.assemble() -> inject into system message]
     |
     v
[LLM inference]
     |
     v
[Post-turn: extract facts from response -> save as Memory records]
     |
     v
[Outcome: report acted/dismissed/contradicted via ObservationProtocol]
     |
     v
Agent response
```
Quick Start¶
```python
from popoto import (
    Model, AutoKeyField, KeyField, StringField, FloatField,
    DecayingSortedField, ConfidenceField,
    WriteFilterMixin, AccessTrackerMixin,
)
from popoto.recipes.subconscious_memory import SubconsciousMemory

# Define your Memory model (any level from the quickstart guide)
class Memory(WriteFilterMixin, AccessTrackerMixin, Model):
    memory_id = AutoKeyField()
    agent_id = KeyField()
    content = StringField(default="")
    importance = FloatField(default=1.0)
    relevance = DecayingSortedField(
        base_score_field="importance",
        partition_by="agent_id",
    )
    confidence = ConfidenceField(initial_confidence=0.5)

    _wf_min_threshold = 0.1  # default after sweep 2026-04-17 (was 0.2)
    _wf_priority_threshold = 0.7

    def compute_filter_score(self):
        return self.importance or 0.0

# Create the subconscious memory layer
sm = SubconsciousMemory(
    model_class=Memory,
    agent_id="agent-1",
    score_weights={"relevance": 0.6, "confidence": 0.3},
    max_items=10,
    max_tokens=4000,
)
```
OpenAI SDK Integration¶
Wire subconscious memory into a standard OpenAI chat completion call:
```python
from openai import OpenAI

client = OpenAI()  # uses OPENAI_API_KEY env var

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's our deployment strategy?"},
]

# Pre-turn: inject relevant memories into messages
messages, assembly_result = sm.inject_context(messages)

# Call the LLM (messages now include memory context in the system message)
response = client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=messages,
)
answer = response.choices[0].message.content

# Post-turn: extract facts from the response and save as new memories
new_memories = sm.extract_memories(answer, importance=0.6)

# Report outcomes: memories were used successfully
sm.report_outcomes(assembly_result, outcome="acted")
```
How It Works¶
Pre-turn: inject_context(messages)¶
- Extracts the last user message as a query cue
- Calls `ContextAssembler.assemble()` with the agent's memory model
- Appends the formatted context to the system message (creates one if absent)
- Returns the modified messages and an `AssemblyResult` for later outcome reporting
If no relevant memories are found, messages are returned unchanged.
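The injection step can be pictured with a small standalone sketch. This is a simplified illustration of the described behavior, not the library's implementation; the context formatting and the `preamble` default are assumptions:

```python
def inject_context(messages, memory_lines, preamble="You are a helpful assistant."):
    """Append formatted memory context to the system message (create one if absent)."""
    if not memory_lines:
        return messages  # no relevant memories: messages returned unchanged
    context = "Relevant memories:\n" + "\n".join(f"- {m}" for m in memory_lines)
    messages = [dict(m) for m in messages]  # shallow-copy so the caller's list is untouched
    for msg in messages:
        if msg["role"] == "system":
            msg["content"] = msg["content"] + "\n\n" + context
            break
    else:
        # No system message present: create one with the preamble plus context
        messages.insert(0, {"role": "system", "content": preamble + "\n\n" + context})
    return messages
```

The real method also returns an `AssemblyResult`; this sketch shows only the message mutation.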
Post-turn: extract_memories(response_text, importance)¶
- Splits the LLM response into sentences
- Filters out sentences shorter than `extraction_min_length` (default 10 chars)
- Saves each sentence as a new Memory record with the specified importance
The built-in extraction uses a simple sentence-splitting heuristic. For more accurate extraction, override this method or extract facts using a secondary LLM call and save them directly via your model class.
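The heuristic is roughly equivalent to the following sketch (a simplified stand-in; the exact splitting logic in the library may differ):

```python
import re

def extract_sentences(response_text, min_length=10):
    """Split a response on sentence boundaries and drop fragments below min_length."""
    sentences = re.split(r"(?<=[.!?])\s+", response_text.strip())
    return [s.strip() for s in sentences if len(s.strip()) >= min_length]
```

Each surviving sentence would then be saved as a Memory record with the given importance.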
Outcome: report_outcomes(assembly_result, outcome)¶
Reports how the agent used the injected memories via `ObservationProtocol.on_context_used()`. Outcomes strengthen or weaken memories for future retrieval:

- `"acted"` -- the agent used this memory (strengthens confidence)
- `"dismissed"` -- the agent ignored this memory (mild weakening)
- `"contradicted"` -- the agent found this memory incorrect (strong weakening)
- `"deferred"` -- the agent noted but deferred action (neutral)
- `"used"` -- the memory informed reasoning without appearing in the response (confirms access, no strength signal)
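The direction of each signal can be pictured with a toy update rule. The deltas below are illustrative assumptions only; the actual adjustments live inside `ObservationProtocol`:

```python
# Illustrative signal directions -- real magnitudes are internal to the library.
OUTCOME_DELTAS = {
    "acted": +0.10,         # strengthens confidence
    "dismissed": -0.02,     # mild weakening
    "contradicted": -0.20,  # strong weakening
    "deferred": 0.0,        # neutral
    "used": 0.0,            # confirms access, no strength signal
}

def apply_outcome(confidence, outcome):
    """Apply the outcome's delta and clamp confidence to [0, 1]."""
    return min(1.0, max(0.0, confidence + OUTCOME_DELTAS[outcome]))
```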
Tuning¶
| Parameter | Default | Description |
|---|---|---|
| `max_items` | 10 | Maximum memories injected per turn |
| `max_tokens` | 4000 | Soft token budget for injected context |
| `extraction_min_length` | 10 | Minimum chars for a sentence to become a memory |
| `score_weights` | (required) | Weight dict for composite scoring (e.g. `{"relevance": 0.6, "confidence": 0.3}`) |
| `system_preamble` | `"You are a helpful assistant."` | Prefix for auto-created system messages |
| `content_field` | `"content"` | Name of the text content field on your model |
| `importance_field` | `"importance"` | Name of the importance score field |
These constants can be tuned experimentally using the Tier 4 benchmark harness. See the Tuning Magic Numbers guide for the full constant catalog, optimal ranges, and how to run parameter sweeps.
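Composite scoring with `score_weights` can be sketched as a weighted sum over the named fields. This illustrates the weighting idea only; the library's actual scorer may normalize or combine fields differently:

```python
def composite_score(memory_fields, score_weights):
    """Weighted sum of per-field scores, e.g. relevance and confidence."""
    return sum(weight * memory_fields.get(field, 0.0)
               for field, weight in score_weights.items())

# With score_weights={"relevance": 0.6, "confidence": 0.3}, a memory with
# relevance 0.8 and confidence 0.5 scores 0.6*0.8 + 0.3*0.5 = 0.63.
```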
Extensibility¶
Custom Fact Extraction¶
Subclass SubconsciousMemory and override extract_memories() for LLM-based extraction:
```python
class SmartSubconsciousMemory(SubconsciousMemory):
    def extract_memories(self, response_text, importance=0.5):
        # Use a secondary LLM call to extract structured facts
        facts = my_extraction_function(response_text)
        saved = []
        for fact in facts:
            m = self.model_class(
                agent_id=self.agent_id,
                content=fact["text"],
                importance=fact.get("importance", importance),
            )
            m.save()
            saved.append(m)
        return saved
```
Custom Query Cues¶
The default implementation uses the last user message as the query cue. For more sophisticated cue extraction, subclass and override the relevant portion of inject_context().
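As an illustration of an alternative cue strategy (a hypothetical helper, not part of the recipe's API), a cue could combine the last few user messages instead of only the most recent one:

```python
def build_query_cue(messages, n_recent=3):
    """Join the text of the last n_recent user messages into one retrieval cue."""
    user_texts = [m["content"] for m in messages if m["role"] == "user"]
    return " ".join(user_texts[-n_recent:])
```

A subclass's `inject_context()` override would use such a helper to build the query before retrieval.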
See Also¶
- Agent Memory Quickstart -- progressive adoption guide
- ContextAssembler -- retrieval-to-injection bridge
- PolicyCache Recipe -- RL-style learned action selection