popoto.models.encoding
Serialization and deserialization of Popoto model instances using msgpack.
Custom types (Decimal, tuple, set, datetime, date, time, DataFrame) are encoded with tagged dicts so they round-trip through msgpack faithfully.
This module bridges Python's rich type system with Redis's binary storage using MessagePack as the serialization format. MessagePack was chosen over JSON for its compactness and speed, and over pickle for safety and cross-language compatibility.
Design Philosophy
Redis stores all values as binary strings, but Popoto models use rich Python types (Decimal, datetime, pandas DataFrames, etc.). This module provides a type-preserving serialization system that encodes these types into a format MessagePack can handle, then decodes them back to their original Python types.
The encoding strategy uses sentinel keys (e.g., "Decimal", "datetime") embedded in dictionaries to identify special types during decoding. This allows the decoder to distinguish between a regular dict and an encoded Decimal without requiring schema information.
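A minimal sketch of the tagged-dict idea, using only the standard library. Here `json` stands in for msgpack so the sketch runs without third-party packages (`msgpack.packb` accepts a `default` callable in the same spirit). The helper name `encode_custom_types` and the `True` stored under the sentinel key are assumptions for illustration, not Popoto's actual implementation.

```python
import datetime
import json
from decimal import Decimal


def encode_custom_types(obj):
    """Fallback encoder: wrap types the serializer cannot handle in a tagged dict."""
    if isinstance(obj, Decimal):
        return {"Decimal": True, "as_encodable": str(obj)}
    if isinstance(obj, datetime.date):
        return {"date": True, "as_encodable": obj.isoformat()}
    raise TypeError(f"cannot encode {type(obj).__name__}")


# An ordinary user dict that happens to contain a key named "Decimal".
plain = {"Decimal": "just a user key"}

payload = json.dumps(
    {"price": Decimal("19.99"), "meta": plain},
    default=encode_custom_types,
)

# Loading without any hook shows the form that travels over the wire.
wire_form = json.loads(payload)
```

In this sketch the encoded Decimal carries both its sentinel key and "as_encodable", while the user dict has no "as_encodable" key, which is what lets a decoder tell the two apart without schema information.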
Architecture
- TYPE_ENCODER_DECODERS: Registry mapping Python types to their encode/decode functions. Extensible for new types.
- encode_popoto_model_obj(): Entry point for serializing a Model instance to a Redis hash (dict of field_name -> packed_value).
- decode_popoto_model_hashmap(): Entry point for deserializing a Redis hash back into a Model instance.
Integration
Called by Model.save() to persist objects and by Query/DB_key to reconstruct objects from Redis. The encoding is transparent to model users.
Example
    # Automatic during save
    person = Person(name="Alice", birthday=datetime.date(1990, 1, 15))
    person.save()  # Internally calls encode_popoto_model_obj

    # Automatic during query
    person = Person.query.get(name="Alice")  # Internally calls decode_popoto_model_hashmap
EncoderDecoder = namedtuple('EncoderDecoder', 'key, encoder, decoder')
module-attribute
A named tuple defining how to serialize and deserialize a specific Python type.
Attributes:
| Name | Type | Description |
|---|---|---|
| key | | A unique sentinel string (e.g., "Decimal") used to identify this type in serialized data. Appears as a key in the encoded dict. |
| encoder | | A callable that transforms a Python object into a dict with the sentinel key and an "as_encodable" key containing the serializable form. |
| decoder | | A callable that reconstructs the original Python object from the encoded dict. |
DECODERS_BY_KEYSTRING = {encoder_decoder.key: encoder_decoder.decoder for encoder_decoder in TYPE_ENCODER_DECODERS.values()}
module-attribute
Lookup table for fast decoder resolution during deserialization.
Maps sentinel key strings (e.g., "Decimal") directly to decoder functions, avoiding the need to iterate through TYPE_ENCODER_DECODERS during decode. This is a performance optimization for the hot path of object reconstruction.
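The relationship between the type registry and the derived lookup table can be sketched as follows. The two registry entries, their lambdas, and the `True` stored under each sentinel key are hypothetical; the real TYPE_ENCODER_DECODERS covers more types and may encode them differently.

```python
from collections import namedtuple
from decimal import Decimal

EncoderDecoder = namedtuple("EncoderDecoder", "key, encoder, decoder")

# Hypothetical registry keyed by Python type.
TYPE_ENCODER_DECODERS = {
    Decimal: EncoderDecoder(
        key="Decimal",
        encoder=lambda obj: {"Decimal": True, "as_encodable": str(obj)},
        decoder=lambda obj: Decimal(obj["as_encodable"]),
    ),
    set: EncoderDecoder(
        key="set",
        # sorted() only makes the example deterministic; it assumes orderable items.
        encoder=lambda obj: {"set": True, "as_encodable": sorted(obj)},
        decoder=lambda obj: set(obj["as_encodable"]),
    ),
}

# Derived table: sentinel key string -> decoder, so decoding is one dict lookup.
DECODERS_BY_KEYSTRING = {
    entry.key: entry.decoder for entry in TYPE_ENCODER_DECODERS.values()
}

encoded = TYPE_ENCODER_DECODERS[set].encoder({3, 1, 2})
restored = DECODERS_BY_KEYSTRING["set"](encoded)
```

Encoding looks up by Python type, decoding looks up by sentinel string, and neither direction needs to scan the registry.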
decode_custom_types(obj)
Msgpack object-hook that restores tagged dicts to their Python types.
This is the counterpart to the type-specific encoders in TYPE_ENCODER_DECODERS. When MessagePack deserializes data, custom types come back as plain dicts with sentinel keys. This function detects those sentinel keys and applies the appropriate decoder to reconstruct the original Python type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obj | | Any value returned from msgpack.unpackb(). If it's a dict with "as_encodable" and a recognized sentinel key, it will be decoded. Otherwise, returned unchanged. | required |
Returns:
| Type | Description |
|---|---|
| | The decoded Python object (Decimal, datetime, etc.) if obj was an encoded custom type, otherwise obj unchanged. |
Design Note
The "as_encodable" check is a fast-path optimization. Most dicts in user data won't have this key, so we can skip the sentinel key scan for the common case.
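A hedged sketch of an object hook with this fast path. The function name `decode_tagged` and the one-entry `DECODERS` table are hypothetical, and `json.loads` stands in for `msgpack.unpackb`, which accepts an `object_hook` argument in the same way.

```python
import json
from decimal import Decimal

# Hypothetical lookup table; stands in for DECODERS_BY_KEYSTRING.
DECODERS = {"Decimal": lambda obj: Decimal(obj["as_encodable"])}


def decode_tagged(obj):
    """Object hook: restore a tagged dict, or hand back anything else untouched."""
    if not isinstance(obj, dict) or "as_encodable" not in obj:
        return obj  # fast path: ordinary values and ordinary dicts
    for key in obj:
        decoder = DECODERS.get(key)
        if decoder is not None:
            return decoder(obj)
    return obj  # has "as_encodable" but no recognized sentinel key


raw = '{"price": {"Decimal": true, "as_encodable": "19.99"}, "meta": {"Decimal": "user data"}}'
result = json.loads(raw, object_hook=decode_tagged)
```

The hook runs on every decoded dict, innermost first, so only the dict carrying both keys is converted; the "meta" dict is returned as-is after a single membership test.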
Source code in src/popoto/models/encoding.py
encode_popoto_model_obj(obj)
Encode a model instance into a dict of {field_name_bytes: msgpack_bytes}.
Transforms all field values on a model into a dictionary suitable for Redis HSET operations. Each field name becomes a UTF-8 encoded key, and each value is MessagePack-serialized (with custom type handling).
Relationship fields are stored as the related instance's redis_key.
Custom types (Decimal, datetime, etc.) use tagged-dict encoding.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obj | Model | A Popoto Model instance to serialize. Must have _meta.fields populated by the metaclass. | required |
Returns:
| Type | Description |
|---|---|
| dict | A dict mapping bytes (field names) to bytes (packed values), ready for direct use with Redis HSET/HMSET commands. |
Raises:
| Type | Description |
|---|---|
| ModelException | If a Relationship field contains a value that isn't an instance of the expected related model. |
Encoding Strategy
- Relationship fields: Store the related object's db_key (Redis key string), not the full object. This enables lazy loading and avoids circular serialization.
- Custom types (Decimal, datetime, etc.): Use TYPE_ENCODER_DECODERS to wrap the value with a sentinel key for later type reconstruction.
- All other types: Direct MessagePack serialization (handles None, str, int, float, bool, list, dict natively).
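The three cases above can be sketched as a small function that builds a bytes-to-bytes mapping. Everything here is an illustrative assumption: `pack` uses JSON in place of msgpack so the sketch needs no third-party packages, `encode_fields` and `relationship_keys` are hypothetical names, and the key string "Company:acme" is made up.

```python
import json
from decimal import Decimal


def pack(value):
    """Stand-in for msgpack packing: JSON bytes, with Decimal as a tagged dict."""
    def tag(obj):
        if isinstance(obj, Decimal):
            return {"Decimal": True, "as_encodable": str(obj)}
        raise TypeError(f"cannot encode {type(obj).__name__}")
    return json.dumps(value, default=tag).encode("utf-8")


def encode_fields(values, relationship_keys):
    """Build a {field_name_bytes: packed_bytes} mapping from field values.

    `relationship_keys` maps a relationship field name to the related
    object's Redis key, which is stored in place of the object itself.
    """
    hashmap = {}
    for name, value in values.items():
        if name in relationship_keys:
            value = relationship_keys[name]  # store the key, not the object
        hashmap[name.encode("utf-8")] = pack(value)
    return hashmap


encoded = encode_fields(
    {"name": "Alice", "balance": Decimal("10.50"), "employer": object()},
    relationship_keys={"employer": "Company:acme"},
)
```

Because the related object is swapped for its key before packing, the serializer never has to traverse it, which is how this approach sidesteps circular references.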
Integration
Called by Model.save() as part of the persistence pipeline. The returned dict is passed directly to Redis via HSET.
Note
NumPy array support is enabled via msgpack_numpy patching, allowing fields to store numpy arrays efficiently.
Source code in src/popoto/models/encoding.py
decode_popoto_model_hashmap(model_class, redis_hash, fields_only=False, lazy=False)
Decode a Redis hash into a model instance (or a raw fields dict).
The inverse of encode_popoto_model_obj(). Takes raw Redis hash data (bytes keys and MessagePack-encoded values) and reconstructs either a fully-instantiated Model object or a plain dictionary of field values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_class | Model | The Model subclass to instantiate. | required |
| redis_hash | dict | Mapping of | required |
| fields_only | | If | False |
| lazy | | If | False |
Returns:
| Type | Description |
|---|---|
| Model | A model instance, a dict (when fields_only), or is empty. |
Decoding Process
- Each value is unpacked via msgpack.unpackb()
- decode_custom_types() checks for sentinel keys and reconstructs special types (Decimal, datetime, etc.)
- Field names are decoded from bytes to strings (unless fields_only)
- The resulting dict is passed to model_class() to create the instance
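The steps above can be sketched end to end with stand-ins. `unpack` uses JSON instead of msgpack, `decode_hash` is a hypothetical helper, and the `Person` dataclass is a placeholder for a Popoto Model subclass, so this shows the shape of the process rather than the library's code.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


def unpack(value_bytes):
    """Stand-in for msgpack.unpackb with an object hook for tagged dicts."""
    def hook(obj):
        if "as_encodable" in obj and "Decimal" in obj:
            return Decimal(obj["as_encodable"])
        return obj
    return json.loads(value_bytes.decode("utf-8"), object_hook=hook)


def decode_hash(redis_hash):
    """Turn {bytes: bytes} as read from Redis into {str: python_value}."""
    return {
        name.decode("utf-8"): unpack(value)
        for name, value in redis_hash.items()
    }


@dataclass
class Person:  # hypothetical stand-in for a Model subclass
    name: str
    balance: Decimal


redis_hash = {
    b"name": b'"Alice"',
    b"balance": b'{"Decimal": true, "as_encodable": "10.50"}',
}
fields = decode_hash(redis_hash)
person = Person(**fields)
```

Stopping at `fields` and skipping the final constructor call corresponds roughly to the fields-only mode, where a plain dictionary is returned instead of an instance.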
Integration
Called by:
- DB_key.get() for single-object retrieval
- Query iteration for bulk object loading
- Query.values() for projection queries (with fields_only=True)
Note
Relationship fields are stored as Redis key strings, not full objects. The Model's __getattribute__ handles lazy loading of related objects when they are accessed.
Source code in src/popoto/models/encoding.py
decode_lazy_field(value_bytes)
Decode a single msgpack-encoded field value.
Called by Model.__getattribute__ when a lazily loaded field is accessed for the first time.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value_bytes | bytes | Raw msgpack bytes from Redis. | required |
Returns:
| Type | Description |
|---|---|
| | The decoded Python value with custom types restored. |
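One way to picture decode-on-first-access is the sketch below. It is not Popoto's mechanism: `LazyRecord` is a hypothetical class, it uses `__getattr__` rather than `__getattribute__` to keep the example short, and `decode_value` uses JSON in place of msgpack.

```python
import json


def decode_value(value_bytes):
    """Stand-in for decode_lazy_field: JSON replaces msgpack in this sketch."""
    return json.loads(value_bytes)


class LazyRecord:
    """Hypothetical holder that decodes each raw field only on first access."""

    def __init__(self, raw_fields):
        self._raw = dict(raw_fields)  # field name -> undecoded bytes
        self._decoded = {}            # cache of already-decoded values

    def __getattr__(self, name):
        # Invoked only when normal lookup fails, i.e. for field names.
        raw = self.__dict__.get("_raw", {})
        if name not in raw:
            raise AttributeError(name)
        decoded = self.__dict__["_decoded"]
        if name not in decoded:
            decoded[name] = decode_value(raw[name])
        return decoded[name]


record = LazyRecord({"tags": b'["a", "b"]', "age": b"30"})
first = record.tags  # decodes "tags" now; "age" stays as raw bytes
```

Fields that are never read are never decoded, which is the saving a lazy mode aims for when only a few attributes of each loaded object are used.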