Models and Fields¶
Models are the foundation of Popoto. They define the structure of your Redis-stored
data using Python classes with field declarations. If you are familiar with Django or
SQLAlchemy, the pattern will feel natural: inherit from Model, declare fields as
class attributes, and Popoto handles persistence, indexing, and querying.
Here is a simple restaurant model to illustrate the basics:
from popoto import Model, KeyField, Field, SortedField, GeoField
class Restaurant(Model):
    name = KeyField()
    cuisine = Field(type=str)
    rating = SortedField(type=float)
    location = GeoField()
    active = Field(type=bool, default=True)
Each field type controls how data is validated, stored, and indexed in Redis. This guide covers every field type and its configuration options, working through a food delivery system as a running example.
KeyField¶
A KeyField determines how Popoto stores and retrieves your objects in Redis. The
values of all KeyFields on a model are concatenated to form the Redis key, making
lookups on KeyFields extremely fast -- a direct key lookup rather than a scan.
For the Restaurant model above, a restaurant with name="Siam Garden" is stored at
the Redis key Restaurant:Siam Garden. Create a restaurant and retrieve it by name:
restaurant = Restaurant.create(
    name="Siam Garden",
    cuisine="Thai",
    rating=4.5,
    location=GeoField.Coordinates(latitude=40.7128, longitude=-74.0060),
)
loaded = Restaurant.load(name="Siam Garden")
print(loaded.cuisine)
# => "Thai"
Because name is a KeyField, the lookup reads a single, known Redis key -- the
fastest possible read path. See Making Queries for additional ways to
retrieve instances.
Changing a KeyField Value¶
When you change a KeyField value and call save(), the Redis key itself changes.
Popoto handles the transition automatically: it deletes the old hash, removes the old
key from the class set, migrates all field indexes (sorted sets, geo sets, unique
constraints) from the old key to the new one, and adds the new key to the class set.
Both full saves and partial saves (save(update_fields=[...])) handle this correctly.
restaurant = Restaurant.create(name="Taco Shack", cuisine="Mexican", rating=4.0)
# Change the KeyField value
restaurant.name = "Taco Palace"
restaurant.save()
# Old key is cleaned up, new key is active
loaded = Restaurant.load(name="Taco Palace")
print(loaded.cuisine)
# => "Mexican"
# The old key no longer exists
print(Restaurant.load(name="Taco Shack"))
# => None
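The migration steps listed above can be pictured with plain Python containers. This is only a sketch under stated assumptions, not Popoto's code: the dictionaries and the save and rename helpers are hypothetical stand-ins for the model hash, the class set, and one sorted-set index.

```python
# In-memory stand-ins for the Redis structures named above (all hypothetical).
hashes = {}        # redis_key -> field dict (the model hash)
class_set = set()  # every redis_key that belongs to the model class
rating_index = {}  # redis_key -> score (stand-in for a sorted set)


def save(key, fields):
    """Store an instance and register it in the class set and the index."""
    hashes[key] = dict(fields)
    class_set.add(key)
    rating_index[key] = fields["rating"]


def rename(old_key, new_key):
    """Move an instance to a new key, mirroring the steps described above."""
    fields = hashes.pop(old_key)                        # delete the old hash
    class_set.discard(old_key)                          # drop the old key from the class set
    rating_index[new_key] = rating_index.pop(old_key)   # migrate the index entry
    hashes[new_key] = fields                            # write the hash under the new key
    class_set.add(new_key)                              # register the new key


save("Restaurant:Taco Shack", {"cuisine": "Mexican", "rating": 4.0})
rename("Restaurant:Taco Shack", "Restaurant:Taco Palace")
print(sorted(class_set))
```

After the rename, nothing in any of the three structures still refers to the old key, which is why loading by the old name returns None.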
Uniqueness¶
When two restaurants share the same name, the second save overwrites the first. If
you need to guarantee that a field value is globally unique across all instances, use
UniqueKeyField or AutoKeyField.
UniqueKeyField enforces a per-value uniqueness constraint. AutoKeyField generates
a unique value automatically, ensuring every instance has a distinct key.
from popoto import Model, KeyField, AutoKeyField, UniqueKeyField
from popoto import Field, SortedField, GeoField
class Customer(Model):
    username = KeyField()
    email = UniqueKeyField()
    name = Field(type=str)
    address = GeoField()

class Driver(Model):
    driver_id = AutoKeyField()
    name = Field(type=str)
    phone = UniqueKeyField()
    rating = SortedField(type=float)
    location = GeoField()
    active = Field(type=bool, default=True)
The Customer model uses username as its primary KeyField and enforces that every
email is unique across all customers. The Driver model uses AutoKeyField so each
driver gets a unique ID without you supplying one.
customer = Customer.create(
    username="foodie42",
    email="foodie42@example.com",
    name="Jane Doe",
)
# Attempting a duplicate email raises an exception
try:
    Customer.create(
        username="another_user",
        email="foodie42@example.com",
        name="Someone Else",
    )
except Exception as e:
    print(e)
    # => UniqueKeyField 'email' value 'foodie42@example.com' already exists
Drivers get an auto-generated key, so you never need to supply driver_id:
driver = Driver.create(
    name="Carlos",
    phone="+1-555-0101",
    rating=4.8,
    location=GeoField.Coordinates(latitude=40.7580, longitude=-73.9855),
)
print(driver.driver_id)
# => "a1b2c3d4e5f6..." (auto-generated UUID4 hex)
AutoKeyField ID Strategies¶
AutoKeyField supports multiple ID generation strategies via the strategy parameter.
The default strategy is uuid4 for backward compatibility.
| Strategy | Length | Time-Sortable | Installation |
|---|---|---|---|
| uuid4 | 32 chars | No | Built-in (default) |
| ulid | 26 chars | Yes | pip install popoto[ulid] |
| ksuid | 27 chars | Yes | pip install popoto[ksuid] |
UUID4 (Default)¶
The default strategy generates a 32-character random hexadecimal string using Python's
uuid.uuid4(). UUIDs are excellent for general-purpose unique identifiers but are not
time-sortable, meaning queries cannot rely on ID order to determine creation order.
class Article(Model):
    id = AutoKeyField()  # Default: strategy="uuid4"
    title = Field(type=str)
article = Article.create(title="Hello World")
print(article.id)
# => "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6" (32 chars)
Use UUID4 when:
- You do not need chronological ordering by ID
- You want zero external dependencies
- Backward compatibility with existing Popoto models is important
ULID (Time-Sortable)¶
ULID (Universally Unique Lexicographically Sortable Identifier) generates 26-character IDs that are time-sortable. The first 10 characters encode a millisecond-precision timestamp, and the remaining 16 characters are random. ULIDs use Crockford's Base32 encoding for URL-safety and readability.
class Order(Model):
    id = AutoKeyField(strategy="ulid")
    product = Field(type=str)
    quantity = Field(type=int)
order1 = Order.create(product="Widget", quantity=5)
order2 = Order.create(product="Gadget", quantity=3)
print(order1.id)
# => "01ARZ3NDEKTSV4RRFFQ69G5FAV" (26 chars)
# IDs are lexicographically sortable by creation time (millisecond precision)
print(order1.id < order2.id)
# => True (ordering within the same millisecond depends on the ULID library)
Use ULID when:
- You need IDs that sort chronologically (e.g., orders, events, logs)
- You want shorter IDs than UUID4 (26 vs 32 characters)
- You need URL-safe identifiers without special characters
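To make the 10 + 16 character layout concrete, here is a minimal sketch of a ULID-shaped generator in plain Python. It is not what Popoto runs (the ulid strategy relies on the optional popoto[ulid] dependency), and the helper names encode_base32 and make_ulid are made up for illustration.

```python
import os
import time

# Crockford's Base32 alphabet: digits plus letters, without I, L, O, and U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_base32(value, length):
    """Encode a non-negative integer as a fixed-length Crockford Base32 string."""
    chars = []
    for _ in range(length):
        chars.append(CROCKFORD[value & 0x1F])  # take the lowest 5 bits
        value >>= 5
    return "".join(reversed(chars))


def make_ulid(timestamp_ms=None):
    """Build a 26-character ID: 10 timestamp characters + 16 random characters."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    return encode_base32(timestamp_ms, 10) + encode_base32(randomness, 16)


earlier = make_ulid(timestamp_ms=1_000)
later = make_ulid(timestamp_ms=2_000)
print(len(earlier), earlier < later)
```

Because the timestamp occupies the leading characters and the alphabet is in ascending ASCII order, plain string comparison orders IDs by their timestamps.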
KSUID (Time-Sortable)¶
KSUID (K-Sortable Unique Identifier) generates 27-character IDs that are also time-sortable. KSUIDs encode a second-precision timestamp and 16 bytes of random data using Base62 encoding. The 32-bit timestamp counts from a custom epoch, so IDs remain valid until about the year 2150 (ULID's 48-bit millisecond timestamp covers a far longer range).
class Event(Model):
    id = AutoKeyField(strategy="ksuid")
    name = Field(type=str)
    timestamp = Field(type=str)
event1 = Event.create(name="user_signup", timestamp="2025-01-15T10:30:00Z")
event2 = Event.create(name="purchase", timestamp="2025-01-15T10:31:00Z")
print(event1.id)
# => "0ujsswThIGTUYm2K8FjOOfXtY1K" (27 chars)
# IDs are lexicographically sortable by creation time (second precision,
# so IDs created within the same second may sort in either order)
print(event1.id < event2.id)
# => True when the two IDs were created in different seconds
Use KSUID when:
- You need IDs that sort chronologically
- You prefer Base62 encoding (alphanumeric only, case-sensitive)
- Second-precision timestamps are sufficient (IDs remain valid until about year 2150)
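The KSUID layout can be sketched the same way. The code below assumes the commonly published KSUID format: a 4-byte timestamp counted in seconds from a custom epoch (1400000000), followed by 16 random bytes, Base62-encoded to 27 characters. It is an illustration only, and make_ksuid is a hypothetical helper, not part of Popoto or of the popoto[ksuid] dependency.

```python
import os
import time
from datetime import datetime, timezone

BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
KSUID_EPOCH = 1_400_000_000  # assumed custom epoch, in Unix seconds (May 2014)


def make_ksuid(unix_seconds=None):
    """Build a 27-character ID from a 4-byte timestamp and 16 random bytes."""
    if unix_seconds is None:
        unix_seconds = int(time.time())
    timestamp = (unix_seconds - KSUID_EPOCH).to_bytes(4, "big")
    value = int.from_bytes(timestamp + os.urandom(16), "big")  # 160 bits total
    chars = []
    for _ in range(27):  # 27 Base62 digits are enough for any 160-bit value
        value, remainder = divmod(value, 62)
        chars.append(BASE62[remainder])
    return "".join(reversed(chars))


# The last second a 32-bit offset from the custom epoch can represent.
last_valid = datetime.fromtimestamp(KSUID_EPOCH + 2**32 - 1, tz=timezone.utc)
print(make_ksuid(), last_valid.year)
```

Two IDs built from different seconds compare in timestamp order; within one second only the random bytes differ, so their relative order is arbitrary.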
Choosing a Strategy¶
| Use Case | Recommended Strategy |
|---|---|
| General-purpose unique IDs | uuid4 (default) |
| Event logs, audit trails | ulid or ksuid |
| Orders, transactions | ulid or ksuid |
| Time-series data | ulid or ksuid |
| Existing Popoto models | uuid4 (backward compatible) |
| Minimal dependencies | uuid4 (built-in) |
| Shortest IDs | ulid (26 chars) |
Tip
Time-sortable IDs like ULID and KSUID are particularly useful when you want to query "most recent" records efficiently, since lexicographic sorting on the ID field naturally orders by creation time.
Composite Keys¶
When no single field is unique, you can use multiple KeyFields to form a composite key. The combination of all KeyField values must be unique together. This is useful for junction or reservation models.
from popoto import Model, KeyField, Field
from popoto import Relationship
class Reservation(Model):
    restaurant = KeyField()
    customer = KeyField()
    party_size = Field(type=int)
    notes = Field(type=str, null=True)
Two distinct reservations exist as long as the restaurant-customer pair differs:
Reservation.create(
    restaurant="Siam Garden",
    customer="foodie42",
    party_size=4,
    notes="Window seat please",
)
Reservation.create(
    restaurant="Bella Napoli",
    customer="foodie42",
    party_size=2,
)
# Retrieve by the composite key
reservation = Reservation.load(
    restaurant="Siam Garden", customer="foodie42"
)
print(reservation.party_size)
# => 4
The Redis key for this instance is Reservation:Siam Garden:foodie42.
Warning
If two reservations share the same restaurant and customer values, the second
save silently overwrites the first. Add a UniqueKeyField or AutoKeyField if
you need to allow duplicates on the composite fields.
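The key format and the overwrite behavior described above can be illustrated with a dictionary standing in for Redis. This is a sketch, not Popoto's key builder: build_key and save are hypothetical helpers, and the ":" separator and declaration-order joining are inferred from the example keys shown in this guide.

```python
store = {}  # stand-in for Redis: key string -> stored fields


def build_key(model_name, *key_values):
    """Join the model name and key values with ':' (assumed format)."""
    return ":".join([model_name, *key_values])


def save(model_name, key_values, fields):
    """Write fields under the composite key, replacing any previous value."""
    key = build_key(model_name, *key_values)
    store[key] = fields
    return key


first = save("Reservation", ("Siam Garden", "foodie42"), {"party_size": 4})
second = save("Reservation", ("Siam Garden", "foodie42"), {"party_size": 6})
other = save("Reservation", ("Bella Napoli", "foodie42"), {"party_size": 2})

print(first)
print(store[first])
print(len(store))
```

The first two saves resolve to the same key, so only the later party size survives, while the Bella Napoli reservation lives under its own key.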
Models Without KeyFields¶
You can declare a model without any explicit KeyField. Popoto automatically adds a
hidden AutoKeyField named _auto_key, giving every instance a unique UUID4-based
Redis key. This is convenient when you always query by other fields like SortedField
or GeoField.
Note
The automatically added _auto_key uses the default uuid4 strategy. If you need
time-sortable IDs, declare an explicit AutoKeyField(strategy="ulid") or
AutoKeyField(strategy="ksuid") instead.
The MenuItem model uses an explicit AutoKeyField, but the effect is the same as
omitting all key fields entirely:
from popoto import Model, AutoKeyField, Field, SortedField
from popoto import Relationship
class MenuItem(Model):
    item_id = AutoKeyField()
    name = Field(type=str)
    price = SortedField(type=float)
    restaurant = Relationship(Restaurant)
    available = Field(type=bool, default=True)
Create items without worrying about key collisions:
pad_thai = MenuItem.create(
    name="Pad Thai",
    price=14.99,
    restaurant=restaurant,
    available=True,
)
green_curry = MenuItem.create(
    name="Green Curry",
    price=16.50,
    restaurant=restaurant,
)
print(pad_thai.item_id)
# => "e5f6a7b8..." (auto-generated UUID4 hex)
Tip
Use AutoKeyField when your model represents items that do not have a natural
unique identifier, such as orders, menu items, or log entries. Consider using
strategy="ulid" or strategy="ksuid" for models where chronological ordering
by ID is useful.
Field¶
The base Field class stores a typed value with optional validation. If you do not
specify a type, it defaults to str.
Popoto supports the following types: int, float, Decimal, str, bool, list,
set, tuple, dict, bytes, datetime.date, datetime.datetime, datetime.time.
Here are the types used across our food delivery models: Restaurant.cuisine is
str, Restaurant.rating is float, Restaurant.active is bool, Order.total
is float, and Order.status is str. Popoto validates field types when you save:
restaurant = Restaurant(name="Bella Napoli", cuisine="Italian", rating=4.2)
restaurant.active = "yes" # wrong type, should be bool
print(restaurant.is_valid())
# => False
Named Field Shortcuts¶
Popoto provides shortcut classes so you can avoid the type= parameter. These are
functionally identical to Field(type=...).
| Shortcut | Equivalent |
|---|---|
| IntField | Field(type=int) |
| FloatField | Field(type=float) |
| DecimalField | Field(type=Decimal) |
| StringField | Field(type=str) |
| BooleanField | Field(type=bool) |
| ListField | Field(type=list) |
| SetField | Field(type=set) |
| TupleField | Field(type=tuple) |
| DictField | Field(type=dict) |
| BytesField | Field(type=bytes) |
| DateField | Field(type=date) |
| DatetimeField | Field(type=datetime) |
| TimeField | Field(type=time) |
| IndexedField | Field(indexed=True) |
| UniqueField | Field(indexed=True, unique=True) |
Import shortcuts from popoto.fields.shortcuts:
from popoto import Model, KeyField
from popoto.fields.shortcuts import IntField, FloatField, BooleanField, StringField
class Restaurant(Model):
    name = KeyField()
    cuisine = StringField()
    seat_count = IntField()
    avg_price = FloatField()
    active = BooleanField()
Use whichever style you prefer -- the behavior is identical.
Capped ListField (max_length)¶
When you pass max_length=N to a ListField, the list is stored in a separate Redis
list key instead of the model hash. This enables efficient push() operations using
Redis LPUSH + LTRIM without reading the full list.
from popoto import Model, KeyField, ListField
class EventLog(Model):
    session_id = KeyField()
    events = ListField(max_length=100)  # Capped at 100 items
# Save the model first
log = EventLog(session_id="abc", events=[])
log.save()
# Push items directly to Redis (newest first)
log.events.push({"action": "click", "target": "button"})
log.events.push({"action": "scroll", "offset": 500})
# Reload to see the data
loaded = EventLog.query.get(session_id="abc")
print(loaded.events) # [{"action": "scroll", ...}, {"action": "click", ...}]
Key behaviors:
- push() prepends items (newest first) and automatically trims to max_length
- save() replaces the entire Redis list with the current field value
- delete() cleans up the separate Redis list key
- Complex types (tuples, dicts, Decimals) round-trip correctly through push/read
- Without max_length, ListField works exactly as before (stored in model hash)
- The model must be saved before calling push() (it needs a Redis key)
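The push-then-trim behavior can be mimicked on an ordinary Python list. This is only a model of the LPUSH + LTRIM semantics, with a hypothetical push helper; it does not talk to Redis or use Popoto.

```python
def push(items, value, max_length):
    """Mimic LPUSH followed by LTRIM 0 (max_length - 1) on a Python list."""
    items.insert(0, value)   # LPUSH: the newest item goes to the head
    del items[max_length:]   # LTRIM: drop everything past the cap
    return items


events = []
for seq in range(5):
    push(events, {"seq": seq}, max_length=3)

print(events)
# => [{'seq': 4}, {'seq': 3}, {'seq': 2}]
```

Only the three most recent entries remain, newest first, and no step needed to read the whole list back before writing.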
Null Values¶
KeyField and SortedField are required (null=False) by default. All other fields
are optional (null=True) by default. You can override this with the null keyword.
The Driver model illustrates this: driver_id (AutoKeyField) and phone
(UniqueKeyField) are always required, rating (SortedField) is required by default,
while name (Field) and location (GeoField) are optional by default.
driver = Driver(name=None, phone="+1-555-0199", rating=4.0)
print(driver.is_valid())
# => True (name is optional, None is acceptable)
Note
UniqueKeyField and AutoKeyField cannot be set to null=True. Attempting to
do so raises a ModelException at class definition time.
Default Values¶
Fields accept a default value used when creating instances without specifying that
field. The Order model demonstrates this with its status field:
order = Order.create(
    customer=customer,
    restaurant=restaurant,
    total=42.50,
)
print(order.status)
# => "pending"
Defaults can also be callables. The callable is invoked each time a new instance is created, ensuring each instance gets a fresh value. This is critical for mutable types like lists and dicts.
import uuid
from popoto import Model, KeyField, Field
class Restaurant(Model):
    name = KeyField()
    tags = Field(type=list, default=list)  # fresh list per instance
    metadata = Field(type=dict, default=dict)  # fresh dict per instance
    internal_id = Field(default=lambda: uuid.uuid4().hex)  # unique hex string per instance (Field defaults to str)
    display_order = Field(type=int, default=lambda: 0)  # any callable works
Warning
Never use a mutable literal as a default (e.g., default=[] or default={}).
This shares a single object across all instances. Always use default=list or
default=dict instead.
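The difference is easy to reproduce without Popoto. The sketch below uses a hypothetical resolve_default helper that calls the default when it is callable and otherwise returns it unchanged, which is the behavior the text above describes.

```python
def resolve_default(default):
    """Return the default value, calling it first if it is callable."""
    return default() if callable(default) else default


shared = []                   # a mutable literal, like default=[]
a = resolve_default(shared)
b = resolve_default(shared)
a.append("vegan")             # b sees the change: a and b are the same object

c = resolve_default(list)     # a callable, like default=list
d = resolve_default(list)
c.append("vegan")             # d is unaffected: c and d are separate objects

print(b, d)
# => ['vegan'] []
```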
String Max Length¶
You can set a maximum character length for string fields. Redis itself has no
practical string length limit, so max_length is purely a validation guard.
class MenuItem(Model):
    item_id = AutoKeyField()
    name = Field(type=str, max_length=100)
    description = Field(type=str, max_length=500)
item = MenuItem(name="A" * 150)
print(item.is_valid())
# => False
Tip
The default max_length for string fields is 1024 characters. Set it explicitly
only when you need a stricter or looser limit.
SortedField¶
SortedField enables fast range queries using Redis sorted sets. This is one of
Redis's most powerful features, allowing queries like "menu items under $15" or
"restaurants rated above 4.0" without scanning every instance.
A SortedField is required for the range filters __lt, __lte, __gt, __gte.
See Making Queries for complete filter documentation.
Using the MenuItem model (which has price = SortedField(type=float)), create some
items and query by price range:
MenuItem.create(name="Pad Thai", price=14.99, restaurant=restaurant)
MenuItem.create(name="Green Curry", price=16.50, restaurant=restaurant)
MenuItem.create(name="Spring Rolls", price=8.99, restaurant=restaurant)
MenuItem.create(name="Mango Sticky Rice", price=9.50, restaurant=restaurant)
# Find affordable items under $10
budget_items = MenuItem.query.filter(price__lt=10.0)
print(len(budget_items))
# => 2
# Find premium items $15 and above
premium = MenuItem.query.filter(price__gte=15.0)
print([item.name for item in premium])
# => ["Green Curry"]  (Pad Thai at 14.99 is below the threshold)
Restaurant ratings can be queried the same way, for example with a rating__gte=4.0 filter on Restaurant.
Range queries work with int, float, Decimal, datetime, date, and time.
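To see why a sorted index avoids scanning every instance, here is a small pure-Python model of the idea using bisect on a score-ordered list. It is an analogy for a Redis sorted set, not Popoto's query code, and price_lt and price_gte are hypothetical helpers.

```python
import bisect

# (score, member) pairs kept sorted by score, like a Redis sorted set.
prices = sorted([
    (14.99, "Pad Thai"),
    (16.50, "Green Curry"),
    (8.99, "Spring Rolls"),
    (9.50, "Mango Sticky Rice"),
])
scores = [score for score, _ in prices]


def price_lt(limit):
    """Members with score strictly below limit, found by binary search."""
    return [name for _, name in prices[:bisect.bisect_left(scores, limit)]]


def price_gte(limit):
    """Members with score at or above limit, found by binary search."""
    return [name for _, name in prices[bisect.bisect_left(scores, limit):]]


print(price_lt(10.0))
# => ['Spring Rolls', 'Mango Sticky Rice']
print(price_gte(15.0))
# => ['Green Curry']
```

Each query locates its boundary with a binary search and then reads a contiguous slice, so the cost depends on the size of the result rather than on the number of stored items.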
SortedKeyField¶
SortedKeyField combines the direct-lookup speed of KeyField with the range query
capabilities of SortedField. Use it when a field serves as both a primary identifier
and a range-query target.
from popoto import Model, SortedKeyField, Field
class DailySpecial(Model):
    day_number = SortedKeyField(type=int)
    dish = Field(type=str)
    price = Field(type=float)
Query by exact key or by range:
DailySpecial.create(day_number=1, dish="Tacos", price=9.99)
DailySpecial.create(day_number=2, dish="Pasta", price=12.99)
DailySpecial.create(day_number=3, dish="Sushi", price=15.99)
# Direct key lookup
monday = DailySpecial.load(day_number=1)
print(monday.dish)
# => "Tacos"
# Range query
early_week = DailySpecial.query.filter(day_number__lte=2)
print(len(early_week))
# => 2
IndexedField¶
IndexedField provides Set-based secondary indexing on non-key fields. Unlike
KeyField, an IndexedField does not become part of the Redis key -- it only
enables efficient exact-match queries via filter().
This decouples querying from identity: you can filter on status, category, or
region without those fields affecting the Redis storage key.
from popoto import Model, AutoKeyField, IndexedField, Field
class Order(Model):
    order_id = AutoKeyField()
    status = IndexedField(type=str)
    region = IndexedField(type=str, null=True)
    notes = Field(type=str)
Query indexed fields with exact match, __in, __isnull, __startswith, and
__endswith lookups:
Order.query.filter(status="shipped")
Order.query.filter(status__in=["pending", "processing"])
Order.query.filter(region__startswith="US-")
Order.query.filter(region__isnull=False)
You can also enable indexing on a plain Field by passing indexed=True, for example Field(type=str, indexed=True).
See Indexed Fields for full details on index key patterns, performance characteristics, and the comparison table.
UniqueField¶
UniqueField combines secondary indexing with a per-value uniqueness constraint. It
guarantees that no two model instances share the same value for this field, without
making the field part of the Redis key.
from popoto import Model, AutoKeyField, UniqueField, Field
class User(Model):
    user_id = AutoKeyField()
    email = UniqueField(type=str)
    name = Field(type=str)
user = User.create(email="alice@example.com", name="Alice")
# Duplicate email raises ModelException
try:
    User.create(email="alice@example.com", name="Not Alice")
except Exception as e:
    print(e)
    # => Uniqueness violation on User.email: value 'alice@example.com' is already taken
UniqueField cannot be null and cannot have unique=False. It supports the same
query lookups as IndexedField.
See Indexed Fields for the uniqueness trade-offs under concurrent writes and the full comparison table.
DecayingSortedField¶
DecayingSortedField is a SortedField subclass where records lose relevance over time
following a power-law decay curve. The sorted set score is always a timestamp, and a
Lua script computes decay-ranked results server-side.
With the default decay_rate=0.1 (empirically tuned in sweep 2026-04-17; prior default
was 0.5), a record scores 1.0 after 1 day, 0.87 after 4 days, and 0.63 after 100 days.
from popoto import Model, KeyField, Field, FloatField
from popoto.fields.decaying_sorted_field import DecayingSortedField
class Memory(Model):
    agent_id = KeyField()
    content = Field(type=str)
    importance = FloatField(default=1.0)
    relevance = DecayingSortedField(base_score_field="importance")
Query for the most relevant recent records with top_by_decay(). When a model has
exactly one DecayingSortedField, the field name is auto-detected:
# Auto-detect field_name (works because Memory has exactly one DecayingSortedField)
top = Memory.query.filter(agent_id="agent-1").top_by_decay(n=10)
# Explicit field_name also works
top = Memory.query.filter(agent_id="agent-1").top_by_decay("relevance", n=10)
# Override decay rate for this query (aggressive — only very recent)
hot = Memory.query.filter(agent_id="agent-1").top_by_decay(n=5, decay_rate=1.0)
Refresh a record's timestamp without a full save using touch():
memory = Memory.query.get(agent_id="agent-1", content="deployment procedure")
memory.touch("relevance") # Resets the decay clock
| Parameter | Type | Default | Description |
|---|---|---|---|
| decay_rate | float | 0.1 | Controls how fast scores drop. Higher = faster decay. Must be > 0. (Empirically tuned in sweep 2026-04-17; prior default was 0.5.) |
| base_score_field | str | None | Name of a companion field whose value multiplies the decay curve. When None, base score is 1.0. |
| partition_by | str or tuple | () | Partition the sorted set by key field values (inherited from SortedField). |
Use InteractionWeight constants with base_score_field for source/role-based importance
weighting in multi-agent teams. See Agent Memory — Source weighting
for the full pattern.
All standard SortedField range filters (__gt, __gte, __lt, __lte, __between)
work against the timestamp score. See Agent Memory for the
full agent memory primitives overview.
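The scores quoted earlier for the default decay_rate (1.0 after 1 day, 0.87 after 4 days, 0.63 after 100 days) are consistent with a plain power law of the record's age in days. The sketch below assumes the form base_score * age_days ** -decay_rate, inferred from those figures; the actual Lua script may differ, for example in how it treats records younger than one day, and decay_score is a hypothetical helper.

```python
def decay_score(age_days, decay_rate=0.1, base_score=1.0):
    """Assumed power-law decay: base_score * age_days ** -decay_rate."""
    if age_days <= 0 or decay_rate <= 0:
        raise ValueError("age_days and decay_rate must be positive")
    return base_score * age_days ** -decay_rate


for days in (1, 4, 100):
    print(days, round(decay_score(days), 2))
# => 1 1.0
# => 4 0.87
# => 100 0.63
```

Under the same assumption, the previous default of 0.5 would have halved a record's score after only four days, which shows how much gentler the current default is.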
CyclicDecayField¶
CyclicDecayField extends DecayingSortedField with two additional temporal forces computed
atomically in a single Lua script:
- Cyclical resonance: Periodic boosts following cosine curves.
- Homeostatic pressure: Urgency that builds linearly while an item goes unresolved.
The effective score is: decay + cyclic_resonance + pressure. When cycles=[] and
pressure_rate=0.0, behavior is identical to DecayingSortedField.
from popoto import Model, KeyField, Field, CyclicDecayField
from popoto.fields.constants import TemporalPeriod
class Directive(Model):
    agent_id = KeyField()
    content = Field(type=str)
    relevance = CyclicDecayField(
        decay_rate=0.5,  # override default (0.1) for faster forgetting
        cycles=[(TemporalPeriod.QUARTERLY, 5.0, 0)],
        pressure_rate=0.1,
    )
Query with the same top_by_decay() interface, and discharge pressure with resolve_pressure():
# Top 10 by combined decay + cyclic + pressure score (field_name auto-detected)
top = Directive.query.filter(agent_id="agent-1").top_by_decay(n=10)
# Discharge accumulated urgency
directive.resolve_pressure("relevance")
# Refresh the decay clock (same as DecayingSortedField)
directive.touch("relevance")
| Parameter | Type | Default | Description |
|---|---|---|---|
| decay_rate | float | 0.1 | Power-law decay exponent (inherited). Empirically tuned in sweep 2026-04-17; prior default was 0.5. |
| base_score_field | str | None | Companion field whose value multiplies the decay curve (inherited). |
| cycles | list | [] | List of (period, amplitude, phase) tuples. Use TemporalPeriod constants for period. |
| pressure_rate | float | 0.0 | Rate of urgency buildup per unresolved day. |
| partition_by | str or tuple | () | Partition the sorted set by key field values (inherited). |
See CyclicDecayField feature docs for the full reference including
the scoring formula, Redis data model, TemporalPeriod constants, and error handling.
See Agent Memory for the broader agent memory primitives overview.
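As a rough mental model of how the three terms of the effective score combine, consider the sketch below. Everything about the individual terms is an assumption made for illustration: the power-law form of the decay term, periods and phases measured in days, an unclamped cosine, and pressure growing as pressure_rate times unresolved days. The real scoring formula is in the CyclicDecayField feature docs, and effective_score is a hypothetical helper.

```python
import math


def effective_score(age_days, unresolved_days, decay_rate=0.1,
                    cycles=(), pressure_rate=0.0):
    """Illustrative sum: decay + cyclic resonance + pressure (assumed forms)."""
    decay = age_days ** -decay_rate
    resonance = sum(
        amplitude * math.cos(2 * math.pi * (age_days - phase) / period)
        for period, amplitude, phase in cycles  # (period, amplitude, phase)
    )
    pressure = pressure_rate * unresolved_days
    return decay + resonance + pressure


plain = effective_score(age_days=4, unresolved_days=10)
pressured = effective_score(age_days=4, unresolved_days=10, pressure_rate=0.1)
print(round(plain, 2), round(pressured, 2))
```

With no cycles and a zero pressure rate the result collapses to the decay term alone, matching the statement that the field then behaves like DecayingSortedField.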
CoOccurrenceField¶
CoOccurrenceField maintains weighted association edges between model instances using
per-PK Redis sorted sets. Weights strengthen via co-retrieval and decay when not
reinforced. A server-side Lua BFS script enables multi-hop associative retrieval.
from popoto import Model, UniqueKeyField, StringField
from popoto.fields.co_occurrence_field import CoOccurrenceField
class Memory(Model):
    key = UniqueKeyField()
    content = StringField()
    associations = CoOccurrenceField(symmetric=True, max_edges=100)
# Create and link
mem_a = Memory.create(key="ml", content="Machine learning")
mem_b = Memory.create(key="nn", content="Neural networks")
field = Memory._meta.fields["associations"]
field.link(Memory, mem_a.db_key.redis_key, mem_b.db_key.redis_key)
# Propagate associations
scores = field.propagate(Memory, [mem_a.db_key.redis_key], depth=2)
| Parameter | Type | Default | Description |
|---|---|---|---|
| symmetric | bool | True | If True, edges are bidirectional. |
| max_edges | int | 500 | Maximum edges per PK; lowest-weight pruned when exceeded. |
| decay_factor | float | 0.95 | Default multiplicative decay for weaken_all(). |
See CoOccurrenceField docs for the full reference including methods, Redis key patterns, and synergy with other memory fields. See Agent Memory for the broader agent memory primitives overview.
ExistenceFilter¶
ExistenceFilter is a Bloom filter for O(1) probabilistic membership checks. It answers
"have I ever stored anything about X?" without touching any sorted set or hash. False
positives are possible; false negatives are impossible.
Implemented entirely with Redis strings (SETBIT/GETBIT) and Lua scripts. No Redis
modules required -- works on both Redis and Valkey.
ExistenceFilter is a "side-effect field" -- it does not store a value on the model
instance. It maintains a Bloom filter index via on_save() hooks.
from popoto import Model, KeyField, Field
from popoto.fields.existence_filter import ExistenceFilter
class Memory(Model):
    agent_id = KeyField()
    topic = Field(type=str)
    bloom = ExistenceFilter(
        error_rate=0.01,
        capacity=100_000,
        fingerprint_fn=lambda inst: inst.topic,
    )
Check membership before running expensive queries:
# Fast pre-check before expensive retrieval
if not Memory.bloom.definitely_missing(Memory, "kubernetes deployments"):
    # Pass n by keyword; the first positional argument is the field name
    results = Memory.query.filter(agent_id="agent-1").top_by_decay(n=5)
else:
    results = []  # skip retrieval entirely
| Parameter | Type | Default | Description |
|---|---|---|---|
| error_rate | float | 0.01 | Target false positive rate. Lower = more bits required. |
| capacity | int | 100_000 | Expected number of distinct items. Exceeding this degrades the error rate. |
| fingerprint_fn | Callable | None | Takes a model instance, returns a string fingerprint. Falls back to redis_key if not set. |

| Method | Returns | Description |
|---|---|---|
| might_exist(model_class, fingerprint) | bool | True if fingerprint is possibly present (may be false positive). |
| definitely_missing(model_class, fingerprint) | bool | True if fingerprint is guaranteed absent. |
| fill_ratio(model_class) | float | Proportion of set bits (0.0-1.0). Monitor for capacity warnings. |
Tokenization: Fingerprints are automatically tokenized on save. A fingerprint like
"kubernetes deployment guide" is split into individual tokens ("kubernetes",
"deployment", "guide"), each added to the bloom filter separately. This enables
word-level queries: might_exist("kubernetes") returns True after saving that fingerprint.
Queries are normalized with the same rules (lowercase, stop word removal, min-length 3).
See ExistenceFilter feature docs for details.
on_delete() is a no-op -- Bloom filters do not support removal. Stale positives are
harmless for a pre-filter use case.
See Agent Memory -- ExistenceFilter for the full agent memory context.
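For readers new to Bloom filters, the following in-memory sketch shows how error_rate and capacity translate into a bit-array size and a number of hash functions, and why removal is not possible. It uses the textbook sizing formulas and SHA-256 as a stand-in hash; Popoto's bit layout, hash functions, and tokenization are not reproduced here, and the BloomFilter class is hypothetical.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, capacity=100_000, error_rate=0.01):
        # Textbook sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.size = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.hashes = max(1, round(self.size / capacity * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, fingerprint):
        # Derive one bit position per hash function from seeded SHA-256 digests.
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{fingerprint}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, fingerprint):
        for pos in self._positions(fingerprint):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_exist(self, fingerprint):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(fingerprint))

    def definitely_missing(self, fingerprint):
        return not self.might_exist(fingerprint)

    def fill_ratio(self):
        return sum(bin(byte).count("1") for byte in self.bits) / self.size


bloom = BloomFilter(capacity=1_000, error_rate=0.01)
bloom.add("kubernetes")
print(bloom.might_exist("kubernetes"), bloom.hashes, round(bloom.fill_ratio(), 4))
```

Bits set by different fingerprints overlap, so clearing the bits of one item could erase evidence of another; that is the reason a delete hook has nothing safe to do.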
BM25Field¶
BM25Field provides ranked keyword search using BM25 scoring, backed entirely by Redis
sorted sets and Lua scripts. No Redis modules required -- works on both Redis and Valkey.
Like ExistenceFilter, BM25Field is a "side-effect field" -- it does not store a value
on the model instance. It maintains an inverted index and corpus statistics via
on_save()/on_delete() hooks, and computes BM25 scores at query time server-side.
from popoto import Model, AutoKeyField
from popoto.fields.bm25_field import BM25Field
from popoto.fields.content_field import ContentField
class Memory(Model):
    key = AutoKeyField()
    raw_content = ContentField()
    content_bm25 = BM25Field(source="raw_content")
Search via keyword_search() on the query builder:
results = Memory.query.keyword_search("redis deployment timeout", limit=10)
for memory in results:
    print(f"{memory.key}: {memory._bm25_score:.3f}")
| Parameter | Type | Required | Description |
|---|---|---|---|
| source | str | Yes | Name of the field to read content from for indexing. |

| Class Constant | Default | Description |
|---|---|---|
| BM25_K1 | 1.2 | Term frequency saturation parameter. |
| BM25_B | 0.75 | Document length normalization parameter (0 = none, 1 = full). |

| Method | Returns | Description |
|---|---|---|
| search(model_class, field_name, query_text, limit) | list[tuple[str, float]] | Raw BM25 search returning (redis_key, score) tuples. |
| recompute_stats(model_class, field_name) | None | Recompute avgdl/n from scratch to correct floating-point drift. |
Tokenization uses the same shared tokenizer as ExistenceFilter (fields/_tokenizer.py):
lowercase, split on non-word characters, filter tokens shorter than 3 characters, remove
stop words. BM25Field preserves duplicate tokens (unique=False) for accurate term
frequency counts.
See Hybrid Retrieval for the full feature documentation including RRF fusion and the hybrid retrieval recipe.
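To show what the K1 and B constants do, here is a compact in-memory BM25 scorer. It follows the tokenization rules described above but makes several assumptions: the stop-word list is a tiny made-up sample, the IDF uses the common ln(1 + (N - df + 0.5) / (df + 0.5)) variant, and tokenize and bm25_scores are hypothetical helpers rather than Popoto functions.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"the", "and", "for", "with"}  # hypothetical, deliberately tiny
K1, B = 1.2, 0.75


def tokenize(text):
    """Lowercase, split on non-word characters, drop short tokens and stop words."""
    tokens = re.split(r"\W+", text.lower())
    return [t for t in tokens if len(t) >= 3 and t not in STOP_WORDS]


def bm25_scores(query, documents):
    """Return (key, score) pairs for matching documents, best first."""
    docs = {key: Counter(tokenize(text)) for key, text in documents.items()}
    lengths = {key: sum(tf.values()) for key, tf in docs.items()}
    avgdl = sum(lengths.values()) / len(docs)
    scores = {}
    for term in set(tokenize(query)):
        df = sum(1 for tf in docs.values() if term in tf)
        idf = math.log(1 + (len(docs) - df + 0.5) / (df + 0.5))
        for key, tf in docs.items():
            freq = tf[term]
            if freq:
                norm = freq + K1 * (1 - B + B * lengths[key] / avgdl)
                scores[key] = scores.get(key, 0.0) + idf * freq * (K1 + 1) / norm
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


documents = {
    "m1": "Redis deployment timeout during the rollout",
    "m2": "Redis cache warmup",
    "m3": "Quarterly planning notes",
}
ranked = bm25_scores("redis deployment timeout", documents)
print([key for key, _ in ranked])
# => ['m1', 'm2']
```

K1 caps how much repeated occurrences of a term can add, and B controls how strongly longer documents are penalized relative to the average length.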
FrequencySketch¶
FrequencySketch is a Count-Min Sketch for approximate frequency counting. Tracks how
many times a fingerprint has been saved, with possible overcounting but never undercounting.
Implemented entirely with Redis hashes (HINCRBY/HGET) and Lua scripts. No Redis
modules required -- works on both Redis and Valkey.
from popoto import Model, KeyField, Field
from popoto.fields.existence_filter import FrequencySketch
class Memory(Model):
    agent_id = KeyField()
    topic = Field(type=str)
    freq = FrequencySketch(
        fingerprint_fn=lambda inst: inst.topic,
    )
Query the approximate frequency with get_frequency(model_class, fingerprint), listed in the method table below.
| Parameter | Type | Default | Description |
|---|---|---|---|
| width | int | 2000 | Number of counters per row. Higher = less overcounting. |
| depth | int | 7 | Number of hash functions (rows). Higher = more accurate. |
| fingerprint_fn | Callable | None | Takes a model instance, returns a string fingerprint. Falls back to redis_key if not set. |

| Method | Returns | Description |
|---|---|---|
| get_frequency(model_class, fingerprint) | int | Approximate count. May overcount, never undercounts. |
on_delete() is a no-op -- CMS counters are monotonically increasing.
Both ExistenceFilter and FrequencySketch can be used together on the same model.
See Agent Memory -- FrequencySketch for
the full agent memory context.
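The "may overcount, never undercounts" guarantee follows directly from how a Count-Min Sketch is built, as this in-memory sketch shows. It uses nested Python lists instead of Redis hashes and SHA-256 as a stand-in hash; the CountMinSketch class and its increment method are hypothetical, and only width, depth, and get_frequency mirror names from the tables above.

```python
import hashlib


class CountMinSketch:
    def __init__(self, width=2000, depth=7):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _columns(self, fingerprint):
        # One counter per row, chosen by a row-seeded hash of the fingerprint.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{fingerprint}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def increment(self, fingerprint):
        for row, col in self._columns(fingerprint):
            self.rows[row][col] += 1

    def get_frequency(self, fingerprint):
        # Collisions can only inflate a counter, so the minimum is the best estimate.
        return min(self.rows[row][col] for row, col in self._columns(fingerprint))


sketch = CountMinSketch()
for topic in ["kubernetes", "kubernetes", "redis", "kubernetes"]:
    sketch.increment(topic)
print(sketch.get_frequency("kubernetes"), sketch.get_frequency("redis"))
```

Every counter touched by a fingerprint is incremented on each save, so each row holds at least the true count, and taking the minimum across rows discards as much collision noise as possible.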
PredictionLedgerMixin¶
PredictionLedgerMixin adds prediction recording, resolution, and error tracking to any
Popoto model. Agents record predictions before acting, then resolve them against actual
outcomes. The mixin computes prediction error and feeds it back into ConfidenceField
when error is high.
from popoto import Model, UniqueKeyField, StringField
from popoto.fields.prediction_ledger import PredictionLedgerMixin
from popoto.fields.confidence_field import ConfidenceField
class Memory(PredictionLedgerMixin, Model):
    key = UniqueKeyField()
    content = StringField()
    certainty = ConfidenceField()
    _pl_partition = "default"
Record a prediction and resolve it:
memory = Memory.create(key="fact1", content="sky is blue")
PredictionLedgerMixin.record_prediction(memory, predicted={"relevance": 0.9})
error = PredictionLedgerMixin.resolve_prediction(memory, actual={"relevance": 0.3})
# error ≈ 0.6
Auto-resolution from ObservationProtocol outcomes (acted, dismissed, contradicted)
is handled automatically when the model uses both mixins. Resolution is idempotent --
resolving an already-resolved prediction is a no-op.
| Method | Returns | Description |
|---|---|---|
| record_prediction(instance, predicted, pipeline=None) | None | Store a prediction for a saved instance |
| resolve_prediction(instance, actual, pipeline=None) | float or None | Resolve with actual values, returns error |
| auto_resolve(instance, outcome, pipeline=None) | float or None | Resolve using outcome-to-error mapping |
| get_prediction_data(instance) | dict or None | Read current prediction metadata |
| get_highest_errors(model_class, partition, limit) | list | Query instances with highest errors |
| compute_prediction_error(predicted, actual) | float | Overridable error computation |
Redis keys: $PL:{ClassName}:meta:{pk} (hash), $PL:{ClassName}:errors:{partition} (sorted set).
Implemented entirely with Redis hashes, sorted sets, and Lua scripts. No Redis modules required -- works on both Redis and Valkey.
See Agent Memory -- PredictionLedger for the full agent memory context.
partition_by¶
When you always query a SortedField together with a specific KeyField, you can
dramatically improve performance by defining partition_by. This creates a composite index
scoped to the values of the fields listed in the tuple.
The tradeoff is that queries on this SortedField must include the partition_by
fields. This is ideal for the MenuItem.price field, where you typically filter items
for a specific restaurant.
class MenuItem(Model):
    item_id = AutoKeyField()
    name = Field(type=str)
    price = SortedField(type=float, partition_by=('restaurant',))
    restaurant = Relationship(Restaurant)
    available = Field(type=bool, default=True)
Now price queries must include the restaurant:
# Fast query: uses the composite index
affordable = MenuItem.query.filter(
    restaurant=restaurant,
    price__lt=12.00,
)
# This would fail because 'restaurant' is required by partition_by
try:
    MenuItem.query.filter(price__lt=12.00)
except Exception as e:
    print(e)
Tip
Use partition_by when you have a natural parent-child relationship. Menu items
always belong to a restaurant, so scoping the price index to the restaurant
reduces the sorted set size and speeds up range queries. See
Multi-Tenancy for using partition_by with a KeyField
to isolate data by tenant or project.
Partition key changes are handled automatically
If you change a partition field's value (e.g., moving an item from one restaurant to another), Popoto automatically removes the entry from the old partition's sorted set and adds it to the new one. No manual cleanup is needed.
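One way to picture a partitioned index is as one sorted collection per partition value. The sketch below models that with plain Python lists instead of Redis sorted sets. `PartitionedPriceIndex` and its methods are hypothetical names, and the code only illustrates the scoping and the move between partitions, not how Popoto lays out its keys.

```python
from bisect import bisect_left, insort
from collections import defaultdict


class PartitionedPriceIndex:
    """One sorted list of (price, item_id) per restaurant (illustrative only)."""

    def __init__(self):
        self._partitions = defaultdict(list)
        self._where = {}  # item_id -> (restaurant, price) currently indexed

    def upsert(self, item_id, restaurant, price):
        # Drop any entry in the old partition before adding to the new one.
        self.remove(item_id)
        insort(self._partitions[restaurant], (price, item_id))
        self._where[item_id] = (restaurant, price)

    def remove(self, item_id):
        previous = self._where.pop(item_id, None)
        if previous is not None:
            restaurant, price = previous
            self._partitions[restaurant].remove((price, item_id))

    def cheaper_than(self, restaurant, limit):
        # The partition value is a required argument: only that restaurant's
        # (much smaller) sorted list is searched.
        entries = self._partitions.get(restaurant, [])
        cutoff = bisect_left(entries, (limit,))
        return [item_id for _, item_id in entries[:cutoff]]


index = PartitionedPriceIndex()
index.upsert("pad-thai", "Siam Garden", 11.5)
index.upsert("green-curry", "Siam Garden", 13.0)
index.upsert("margherita", "Bella Napoli", 10.0)
before_move = index.cheaper_than("Siam Garden", 12.0)

# Moving an item to another restaurant re-homes its index entry.
index.upsert("pad-thai", "Bella Napoli", 11.5)
after_move_old = index.cheaper_than("Siam Garden", 12.0)
after_move_new = index.cheaper_than("Bella Napoli", 12.0)
```

The range lookup touches only one partition's entries, which is where the speedup over a single model-wide index comes from.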
Deprecation Notice
The sort_by parameter is deprecated and will be removed in a future major version.
Use partition_by instead. sort_by still works but emits a DeprecationWarning.
DatetimeField¶
DatetimeField extends the base Field with automatic timestamp management.
It supports two special parameters: auto_now_add and auto_now.
- auto_now_add=True -- Sets the field to the current datetime on first save only.
- auto_now=True -- Updates the field to the current datetime on every save.
The Order model uses both:
from popoto.fields.datetime_field import DatetimeField
class Order(Model):
    order_id = AutoKeyField()
    # ... other fields ...
    created_at = DatetimeField(auto_now_add=True)  # set on first save
    updated_at = DatetimeField(auto_now=True)      # refreshed on every save
Timestamps are managed automatically:
order = Order.create(
customer=customer,
restaurant=restaurant,
total=28.99,
)
print(order.created_at)
# => 2025-06-15 14:30:00.123456
print(order.updated_at)
# => 2025-06-15 14:30:00.123456
# Update the order status
order.status = "confirmed"
order.save()
print(order.created_at)
# => 2025-06-15 14:30:00.123456 (unchanged)
print(order.updated_at)
# => 2025-06-15 14:32:10.654321 (refreshed)
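The difference between the two flags comes down to one condition at save time. The plain-Python sketch below mimics that behavior without Redis. `TimestampedRecord` and the `now` argument are hypothetical and exist only to make the example deterministic.

```python
from datetime import datetime


class TimestampedRecord:
    """Mimics auto_now_add / auto_now semantics (illustrative only)."""

    def __init__(self):
        self.created_at = None
        self.updated_at = None

    def save(self, now=None):
        now = now or datetime.now()
        if self.created_at is None:
            self.created_at = now  # auto_now_add: filled on the first save only
        self.updated_at = now      # auto_now: overwritten on every save


t1 = datetime(2025, 6, 15, 14, 30, 0)
t2 = datetime(2025, 6, 15, 14, 32, 10)

record = TimestampedRecord()
record.save(now=t1)
first_created, first_updated = record.created_at, record.updated_at
record.save(now=t2)
```

After the second save, created_at still holds the first timestamp while updated_at has moved forward.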
See Model Meta Options for configuring order_by and ttl on the Meta
class.
Timestampable Mixin¶
If you find yourself adding created_at and updated_at to many models, use the
Timestampable mixin to include both fields automatically. It provides
DatetimeField(auto_now_add=True) for created_at and
DatetimeField(auto_now=True) for updated_at.
from popoto import Model, KeyField, Field
from popoto.utils.mixins.timestampable import Timestampable
class Restaurant(Timestampable, Model):
    name = KeyField()
    cuisine = Field(type=str)
    # created_at and updated_at are included automatically
WriteFilterMixin¶
WriteFilterMixin gates save() calls based on a scoring function you define. Records
scoring below a minimum threshold are silently discarded (never persisted). Records
scoring above a priority threshold are persisted and tagged in a Redis sorted set for
preferential retrieval.
from popoto import Model, KeyField, Field
from popoto.fields.write_filter import WriteFilterMixin
class Memory(WriteFilterMixin, Model):
    agent_id = KeyField()
    content = Field(type=str)
    importance = Field(type=float, default=0.5)

    def compute_filter_score(self):
        return self.importance or 0.0
The mixin adds three behaviors to your model:
- Gate on save: Before persisting, compute_filter_score() is called. If the score is below _wf_min_threshold (default 0.1 after sweep 2026-04-17; prior default was 0.2), a SkipSaveException is raised and caught by Model.save(), silently aborting the write.
- Priority tagging: If the score meets or exceeds _wf_priority_threshold (default 0.7), the record's Redis key is added to a sorted set at $WF:{ClassName}:priority with the score as its rank.
- Cleanup on delete: When delete() is called, the record is removed from the priority sorted set automatically.
# Silently discarded — score 0.05 < min_threshold 0.1
low = Memory(agent_id="a1", content="noise", importance=0.05)
low.save() # No error, but record is NOT in Redis
# Persisted normally — score 0.5 between thresholds
mid = Memory(agent_id="a1", content="useful", importance=0.5)
mid.save() # Stored in Redis
# Persisted AND priority-tagged — score 0.9 >= priority_threshold 0.7
high = Memory(agent_id="a1", content="critical", importance=0.9)
high.save() # Stored in Redis AND added to $WF:Memory:priority
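The gate, tag, and cleanup behaviors can be sketched without Redis as follows. This is a simplified model under stated assumptions: `ToyStore` and `SkipSave` are hypothetical names, a dict stands in for the priority sorted set, and the boolean returned by `save()` is an illustration aid rather than something Model.save() is documented to return.

```python
class SkipSave(Exception):
    """Raised by the gate to abort a write (stand-in for SkipSaveException)."""


class ToyStore:
    def __init__(self, min_threshold=0.1, priority_threshold=0.7):
        self.min_threshold = min_threshold
        self.priority_threshold = priority_threshold
        self.records = {}   # key -> content, stand-in for Redis hashes
        self.priority = {}  # key -> score, stand-in for the priority sorted set

    def _gate(self, score):
        if score < self.min_threshold:
            raise SkipSave(f"score {score} below {self.min_threshold}")

    def save(self, key, content, score):
        try:
            self._gate(score)
        except SkipSave:
            return False  # swallowed: the caller sees no error, nothing is stored
        self.records[key] = content
        if score >= self.priority_threshold:
            self.priority[key] = score
        return True

    def delete(self, key):
        self.records.pop(key, None)
        self.priority.pop(key, None)  # cleanup keeps the priority set consistent


store = ToyStore()
saved_low = store.save("a1:noise", "noise", 0.05)
saved_mid = store.save("a1:useful", "useful", 0.5)
saved_high = store.save("a1:critical", "critical", 0.9)
priority_before_delete = dict(store.priority)
store.delete("a1:critical")
```

Only the 0.9 record lands in the priority collection, and deleting it removes both the record and its priority entry.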
Override the thresholds by setting class attributes:
class StrictMemory(WriteFilterMixin, Model):
    _wf_min_threshold = 0.4       # Higher bar to persist
    _wf_priority_threshold = 0.8  # Higher bar for priority
    # ...
| Attribute | Default | Description |
|---|---|---|
| _wf_min_threshold | 0.1 | Minimum score to persist. (Empirically tuned in sweep 2026-04-17; prior default was 0.2.) |
| _wf_priority_threshold | 0.7 | Minimum score for priority tagging |
Tip
WriteFilterMixin must appear before Model in the inheritance list so that
its on_save() hook is called. The scoring function is application logic — Popoto
provides the gating mechanism, not the scoring logic.
See Agent Memory — WriteFilter for the broader agent memory context and how WriteFilter fits with DecayingSortedField and AccessTracker.
ContentField¶
ContentField routes large content values (documents, text, binary data) to filesystem
storage, keeping Redis memory usage minimal. Redis stores only a compact reference string
($CF:{hash}:{path}), and the content is lazy-loaded from the filesystem when the
attribute is accessed.
This is ideal for storing long-form text, HTML, markdown, or any content too large to keep in Redis comfortably.
No additional dependencies are needed -- ContentField uses the filesystem by default.
import popoto
from popoto.fields.content_field import ContentField
class Document(popoto.Model):
    name = popoto.KeyField()
    body = ContentField()
doc = Document.create(name="readme", body="# Hello World\n\nThis is a large document...")
On save, the content is written to the filesystem first, then a reference string is stored in Redis. On attribute access, the reference is detected and the content is transparently loaded from the filesystem:
loaded = Document.query.get(name="readme")
print(loaded.body)
# => "# Hello World\n\nThis is a large document..."
Content-Addressable Storage¶
ContentField uses SHA-256 hashing for content-addressable storage. Identical content produces the same hash and file path, so duplicate writes are deduplicated automatically. Writes are atomic (temp file + rename) to prevent partial reads.
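The combination of content addressing and atomic writes can be sketched with the standard library alone. The function below is an illustration, not FilesystemStore's code: `store_content` is a hypothetical helper, and the two-character shard directory layout is an assumption.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def store_content(base_dir, data: bytes) -> Path:
    """Write data under a path derived from its SHA-256 digest (illustrative)."""
    digest = hashlib.sha256(data).hexdigest()
    target = Path(base_dir) / digest[:2] / digest
    if target.exists():
        return target  # identical content already stored: nothing to write
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename it into place,
    # so readers never observe a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


with tempfile.TemporaryDirectory() as base:
    first = store_content(base, b"# Hello World")
    second = store_content(base, b"# Hello World")
    same_path = first == second
    round_trip = first.read_bytes()
    file_count = sum(1 for p in Path(base).rglob("*") if p.is_file())
```

Storing the same bytes twice yields the same path and leaves a single file on disk, which is the deduplication described above.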
Custom Content Store¶
By default, ContentField uses FilesystemStore which writes to ~/.popoto/content/
(or the path set via POPOTO_CONTENT_PATH). You can pass a custom store per-field
or set a global default via popoto.configure():
from popoto.stores.filesystem import FilesystemStore
class Document(popoto.Model):
    name = popoto.KeyField()
    body = ContentField(store=FilesystemStore(base_path="/data/documents"))
Or configure globally:
| Parameter | Type | Default | Description |
|---|---|---|---|
| store | AbstractContentStore or "filesystem" | "filesystem" | The content store backend. |
ContentField deletion is a no-op -- content files are append-only. Use
ContentField.garbage_collect(ModelClass) to remove orphaned files not referenced
by any live model instance.
EmbeddingField¶
EmbeddingField generates vector embeddings from a source field on save, stores them
as .npy files on the filesystem, and maintains an in-memory cache of pre-normalized
numpy matrices for fast cosine similarity computation at query time.
pip install popoto[embeddings] # numpy
pip install popoto[voyage] # numpy + voyageai
pip install popoto[openai] # numpy + openai
import popoto
from popoto.fields.content_field import ContentField
from popoto.fields.embedding_field import EmbeddingField
from popoto.embeddings.voyage import VoyageProvider
popoto.configure(
embedding_provider=VoyageProvider(api_key="your-key"),
)
class Memory(popoto.Model):
    topic = popoto.KeyField()
    content = ContentField()
    embedding = EmbeddingField(source="content")
m = Memory.create(topic="revenue", content="Q4 revenue exceeded projections...")
# Embedding is generated automatically on save
Redis stores only the embedding dimension count (an integer). The actual vector
is stored as a .npy file under ~/.popoto/content/.embeddings/. On save,
EmbeddingField reads the source field value, calls the configured provider to
generate an embedding, and writes the vector atomically to disk.
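Pre-normalizing cached vectors pays off at query time because cosine similarity between unit vectors reduces to a dot product. The sketch below shows the idea with plain Python lists standing in for the numpy matrices. The helper names and the three-dimensional toy vectors are made up for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def rank_by_cosine(query, normalized_rows):
    """Return (score, key) pairs, best match first."""
    q = normalize(query)
    scores = [
        (sum(a * b for a, b in zip(q, row)), key)  # dot product == cosine here
        for key, row in normalized_rows.items()
    ]
    return sorted(scores, reverse=True)


# Normalize once when the cache is built, not on every query.
raw_vectors = {
    "revenue": [0.9, 0.1, 0.0],
    "hiring": [0.1, 0.9, 0.2],
    "weather": [0.0, 0.2, 0.9],
}
cache = {key: normalize(vec) for key, vec in raw_vectors.items()}

ranked = rank_by_cosine([1.0, 0.2, 0.0], cache)
best_score, best_key = ranked[0]
```

Each query then costs one normalization of the query vector plus one dot product per cached row.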
Embedding Providers¶
Popoto ships with three built-in providers. You can also implement your own by
subclassing AbstractEmbeddingProvider.
VoyageProvider (recommended for retrieval):
from popoto.embeddings.voyage import VoyageProvider
provider = VoyageProvider(
api_key="your-voyage-api-key",
model="voyage-3-lite", # default
dimensions=512, # default
)
OpenAIProvider:
from popoto.embeddings.openai import OpenAIProvider
provider = OpenAIProvider(
api_key="your-openai-api-key",
model="text-embedding-3-small", # default
dimensions=1536, # default
)
OllamaProvider (local, no API key):
from popoto.embeddings.ollama import OllamaProvider
provider = OllamaProvider(
base_url="http://localhost:11434", # default
model="nomic-embed-text", # default (768-dim)
dim=None, # auto-detect on first embed()
)
Requires a running Ollama server (ollama serve) with the model pulled
(ollama pull nomic-embed-text). Uses stdlib only -- no extras to install.
See Content and Embedding Fields
for setup details.
Querying with semantic_search()¶
Once models have embeddings, use semantic_search() to find semantically similar
instances. See Making Queries -- semantic_search() for the
full query interface.
| Parameter | Type | Default | Description |
|---|---|---|---|
| source | str | None | Name of the field to read content from for embedding generation. |
| provider | AbstractEmbeddingProvider | None | Provider instance, or None to use the global default set via popoto.configure(). |
| auto_embed | bool | True | Generate embeddings automatically on save. |
| cache | bool | True | Cache embeddings in memory for fast similarity search. |
See Content and Embedding Fields for the full feature reference including storage layout, cache management, and provider API.
GeoField¶
GeoField uses Redis geospatial indexes for location-based queries. This is perfect
for finding nearby restaurants, tracking driver positions, or searching by delivery
address.
Values are GeoField.Coordinates(latitude, longitude) namedtuples, though plain
(latitude, longitude) tuples also work.
Using the Restaurant model (which has location = GeoField()), create restaurants
with locations and search by radius:
siam = Restaurant.create(
name="Siam Garden",
cuisine="Thai",
rating=4.5,
location=GeoField.Coordinates(latitude=40.7484, longitude=-73.9857),
)
bella = Restaurant.create(
name="Bella Napoli",
cuisine="Italian",
rating=4.2,
location=GeoField.Coordinates(latitude=40.7527, longitude=-73.9772),
)
taco = Restaurant.create(
name="Taco Loco",
cuisine="Mexican",
rating=4.0,
location=GeoField.Coordinates(latitude=40.7589, longitude=-73.9851),
)
# Find restaurants within 2km of Midtown Manhattan
nearby = Restaurant.query.filter(
location=(40.7505, -73.9834),
location_radius=2,
location_radius_unit='km',
)
print(len(nearby))
# => 3
Supported radius units are "m" (meters), "km" (kilometers), "ft" (feet), and
"mi" (miles).
Query with Distances¶
Add {field_name}_with_distances=True to include distance information in results.
Each returned object gets _geo_distance and _geo_distance_unit attributes, and
results are sorted closest first.
results = Restaurant.query.filter(
location=(40.7505, -73.9834),
location_radius=5,
location_radius_unit='km',
location_with_distances=True,
)
for r in results:
    print(f"{r.name}: {r._geo_distance} {r._geo_distance_unit}")
# => Siam Garden: 0.3 km
# => Bella Napoli: 0.6 km
# => Taco Loco: 0.9 km
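Radius filtering and closest-first ordering can be approximated offline with the haversine formula, which is handy for sanity-checking coordinates before they reach Redis. The sketch below is independent of Popoto: the function names are hypothetical, the Earth radius is a rounded mean value, and the unit table is a standard conversion, so results may differ slightly from what Redis reports.

```python
import math

EARTH_RADIUS_M = 6_371_000  # rounded mean Earth radius (assumption)
UNIT_TO_M = {"m": 1.0, "km": 1000.0, "ft": 0.3048, "mi": 1609.344}


def haversine_m(a, b):
    """Great-circle distance in meters between two (latitude, longitude) pairs."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def within_radius(center, points, radius, unit="km"):
    """Names of points inside the radius, closest first."""
    limit_m = radius * UNIT_TO_M[unit]
    hits = sorted((haversine_m(center, loc), name) for name, loc in points.items())
    return [name for dist, name in hits if dist <= limit_m]


restaurants = {
    "Siam Garden": (40.7484, -73.9857),
    "Bella Napoli": (40.7527, -73.9772),
    "Taco Loco": (40.7589, -73.9851),
}
center = (40.7505, -73.9834)

all_nearby = within_radius(center, restaurants, 2, "km")
walkable = within_radius(center, restaurants, 500, "m")
half_mile = within_radius(center, restaurants, 0.5, "mi")
```

Unlike this linear scan, a Redis geo index narrows the search to the area around the center point before measuring distances.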
You can also use a model instance as the center point with _member:
results = Restaurant.query.filter(
location_member=siam,
location_radius=3,
location_radius_unit='km',
location_with_distances=True,
)
for r in results:
    print(f"{r.name}: {r._geo_distance} km")
# => Siam Garden: 0.0 km
# => Bella Napoli: 0.9 km
# => Taco Loco: 1.2 km
Tip
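Geo queries are backed by Redis GEORADIUS commands, which run in O(N+log(M)) time, where N is the number of elements inside the bounding box of the search area and M is the number of items in the index. Queries stay fast even with millions of locations as long as the search radius keeps N small.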
DataFrameField¶
DataFrameField stores Pandas DataFrame
objects directly in Redis. This is useful for analytics or caching computed datasets.
Because it depends on Pandas, the example below uses a separate model outside the canonical food delivery set.
import pandas as pd
from popoto import Model, KeyField
from popoto.fields.dataframe_field import DataFrameField
class SalesReport(Model):
    name = KeyField()
    data = DataFrameField()
Store and retrieve a sales report:
sales_data = pd.DataFrame({
'restaurant': ['Siam Garden', 'Bella Napoli', 'Taco Loco'],
'orders': [142, 98, 215],
'revenue': [3408.58, 2156.02, 3225.85],
})
report = SalesReport.create(name="weekly_summary", data=sales_data)
loaded = SalesReport.load(name="weekly_summary")
print(loaded.data['revenue'].sum())
# => 8790.45
report.delete()
Note
DataFrameField requires the pandas package. Install it separately if it is
not already in your environment.
Reserved Field Names¶
The following names are reserved and cannot be used as field names:
- limit -- Used in query.filter() to limit the number of returned objects
- values -- Used in query.filter() to restrict which fields are returned
- order_by -- Used in query.filter() to sort results
Model Methods¶
This section summarizes the core methods available on every Popoto model.
Creating and Saving¶
Create and save in one step with create(), or instantiate and call save() later:
# One-step creation
restaurant = Restaurant.create(
name="Siam Garden",
cuisine="Thai",
rating=4.5,
location=GeoField.Coordinates(latitude=40.7484, longitude=-73.9857),
)
# Two-step creation
restaurant = Restaurant(name="Bella Napoli")
restaurant.cuisine = "Italian"
restaurant.rating = 4.2
restaurant.save()
Loading¶
Load an instance by its KeyField values. For AutoKeyField models, load by the
generated key:
restaurant = Restaurant.load(name="Siam Garden")
print(restaurant.cuisine)
# => "Thai"
order = Order.load(order_id=order.order_id)
print(order.status)
# => "pending"
Updating¶
Modify fields and call save() to persist changes. For models with
DatetimeField(auto_now=True), the timestamp is refreshed automatically:
restaurant = Restaurant.load(name="Siam Garden")
restaurant.rating = 4.7
restaurant.save()
order.status = "delivered"
order.save()
# order.updated_at is automatically refreshed
Deleting¶
Call delete() on an instance (for example, restaurant.delete()) to remove it and
clean up all associated indexes: sorted set entries, geo set entries, unique field
indexes, and relationship indexes.
Validation¶
Use is_valid() to check whether a model instance passes all field validations
before saving:
restaurant = Restaurant(name=None, cuisine="Thai", rating=4.5)
print(restaurant.is_valid())
# => False (name is a required KeyField)
restaurant.name = "Siam Garden"
print(restaurant.is_valid())
# => True
The db_key Property¶
Every saved instance has a db_key property that exposes its Redis key:
restaurant = Restaurant.create(name="Siam Garden", cuisine="Thai", rating=4.5)
print(restaurant.db_key.redis_key)
# => "Restaurant:Siam Garden"
reservation = Reservation.create(
restaurant="Siam Garden", customer="foodie42", party_size=4
)
print(reservation.db_key.redis_key)
# => "Reservation:Siam Garden:foodie42"
The db_key is useful for debugging, logging, or performing custom Redis operations
outside of Popoto's query API.
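The two keys shown above follow a simple pattern: the class name followed by each KeyField value, joined with colons. A minimal sketch of that pattern is below. `build_redis_key` is a hypothetical helper, and it ignores any escaping or field-ordering rules Popoto may apply, so use db_key.redis_key rather than rebuilding keys by hand in real code.

```python
def build_redis_key(class_name, *key_values):
    """Join the class name and KeyField values with ':' (illustrative only)."""
    return ":".join([class_name, *map(str, key_values)])


single = build_redis_key("Restaurant", "Siam Garden")
composite = build_redis_key("Reservation", "Siam Garden", "foodie42")
```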