Snapshots and Performance¶
In an event-sourced system, the state of an entity is rebuilt by replaying all past events. That works well for most use cases – but what if the stream contains thousands of events?
At some point, performance becomes a concern.
Why Long Streams Are Expensive¶
To handle a new command, the system needs to:
1. Load the full event stream for the entity
2. Replay all events to reconstruct the current state
3. Apply business logic
4. Append new events
If the stream is very long, step 2 becomes slow – especially when the entity's state is expensive to rebuild.
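Here is a minimal sketch of that loop in TypeScript. The `EventStore` interface and the `evolve`/`decide` functions are assumptions for illustration, not the API of any particular library:

```typescript
// Types and the store interface are assumptions for this sketch.
type Event = { type: string; data: Record<string, unknown> };
type State = { version: number; balance: number };
type Command = { type: "Withdraw"; amount: number };

interface EventStore {
  readStream(streamId: string): Promise<Event[]>;
  appendToStream(streamId: string, events: Event[]): Promise<void>;
}

const initialState = (): State => ({ version: 0, balance: 0 });

// Fold a single event into the state; replay means applying this to every event in order.
function evolve(state: State, event: Event): State {
  switch (event.type) {
    case "Deposited":
      return { version: state.version + 1, balance: state.balance + (event.data.amount as number) };
    case "Withdrawn":
      return { version: state.version + 1, balance: state.balance - (event.data.amount as number) };
    default:
      return { ...state, version: state.version + 1 };
  }
}

// Business logic: decide which new events (if any) the command produces.
function decide(state: State, command: Command): Event[] {
  if (state.balance < command.amount) {
    throw new Error("Insufficient funds");
  }
  return [{ type: "Withdrawn", data: { amount: command.amount } }];
}

async function handleCommand(store: EventStore, streamId: string, command: Command): Promise<void> {
  const events = await store.readStream(streamId);      // 1. load the full event stream
  const state = events.reduce(evolve, initialState());  // 2. replay all events
  const newEvents = decide(state, command);             // 3. apply business logic
  await store.appendToStream(streamId, newEvents);      // 4. append new events
}
```

Every command pays the cost of the full read plus the full fold – that is the part that degrades as the stream grows.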
Enter Snapshots¶
A snapshot is a cached version of an entity's state at a given point in time.
Instead of replaying all events from the beginning, the system:
- Loads the latest snapshot
- Replays only the events that occurred after the snapshot
This can dramatically reduce load times – especially for entities with long histories.
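One possible shape for snapshot-aware loading, again with hypothetical names (`SnapshotStore`, `readStreamFrom`). The essential idea is that replay starts from the snapshot's position instead of from zero:

```typescript
// Snapshot and store shapes are assumptions for this sketch.
type Event = { type: string; data: Record<string, unknown> };
type State = { version: number; balance: number };

interface Snapshot {
  streamId: string;
  state: State;
  lastEventPosition: number; // position of the last event the snapshot covers
}

interface SnapshotStore {
  load(streamId: string): Promise<Snapshot | null>;
}

interface EventStore {
  // Read only the events recorded after the given position.
  readStreamFrom(streamId: string, fromPosition: number): Promise<Event[]>;
}

// Simplified fold: a real one would also apply the event's data to the state.
function evolve(state: State, _event: Event): State {
  return { ...state, version: state.version + 1 };
}

async function loadState(
  events: EventStore,
  snapshots: SnapshotStore,
  streamId: string,
  initial: State,
): Promise<State> {
  const snapshot = await snapshots.load(streamId);

  // Start from the snapshot if one exists, otherwise from the initial state.
  const startState = snapshot ? snapshot.state : initial;
  const startPosition = snapshot ? snapshot.lastEventPosition : 0;

  // Replay only the events that occurred after the snapshot.
  const tail = await events.readStreamFrom(streamId, startPosition);
  return tail.reduce(evolve, startState);
}
```

If no snapshot exists yet, the function simply falls back to a full replay – correctness never depends on a snapshot being present.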
Example¶
- Stream has 5,000 events
- Snapshot taken at event 4,000
- To rebuild state, only 1,000 events need to be replayed
When to Use Snapshots¶
Not every entity needs snapshots. Consider them when:
- Streams grow very long (thousands of events)
- Startup time becomes slow (e.g. loading on user login)
- Command latency is noticeable
Some systems create snapshots:
- After a fixed number of events (e.g. every 500 – see the sketch after this list)
- Based on time (e.g. once a day)
- On demand (e.g. triggered by specific workflows)
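A fixed-interval policy could look roughly like this; the threshold, the `SnapshotStore` interface, and the function name are illustrative:

```typescript
// Threshold and store names are illustrative, not from any particular library.
type State = { version: number };

interface SnapshotStore {
  save(streamId: string, state: State, lastEventPosition: number): Promise<void>;
}

const SNAPSHOT_EVERY = 500;

async function maybeSnapshot(
  snapshots: SnapshotStore,
  streamId: string,
  state: State,
  currentPosition: number,      // position of the last event just appended
  lastSnapshotPosition: number, // position covered by the most recent snapshot (0 if none)
): Promise<void> {
  // Only write a new snapshot once enough events have accumulated since the last one.
  if (currentPosition - lastSnapshotPosition >= SNAPSHOT_EVERY) {
    await snapshots.save(streamId, state, currentPosition);
  }
}
```

Because snapshots are only an optimization, this write can also happen out-of-band without affecting correctness.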
How to Store Them¶
Snapshots can be:
- Stored alongside the event stream (in the same store)
- Stored in a separate store (e.g. a faster, memory-optimized cache)
They are typically keyed by stream ID and last event position.
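A possible shape for such a snapshot record, together with a trivial in-memory store that keeps only the latest snapshot per stream (all names are illustrative):

```typescript
// One possible shape for a stored snapshot record (names are illustrative).
interface SnapshotRecord<TState> {
  streamId: string;          // which entity/stream the snapshot belongs to
  lastEventPosition: number; // the snapshot covers all events up to and including this position
  state: TState;             // the serialized entity state
  takenAt: Date;             // when the snapshot was created (useful for time-based policies)
}

// A trivial in-memory store that keeps only the latest snapshot per stream.
class InMemorySnapshotStore<TState> {
  private snapshots = new Map<string, SnapshotRecord<TState>>();

  async save(record: SnapshotRecord<TState>): Promise<void> {
    this.snapshots.set(record.streamId, record);
  }

  async load(streamId: string): Promise<SnapshotRecord<TState> | null> {
    return this.snapshots.get(streamId) ?? null;
  }
}
```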
Snapshots are an optimization – not a requirement. You can always rebuild from scratch.
Next up: GDPR and Data Privacy – learn how to deal with personal data in an immutable event store.