Technical Advisory 10012
Date and Version​
Version: 2.63.0
Date: 2024-09-26
Description​
In version 2.63.0 we've increased the transaction duration for projections.
ZITADEL has an event driven architecture. After events are pushed to the eventstore, they are reduced into projections in bulk batches. Projections allow for efficient lookup of data through normalized SQL tables.
We've investigated multiple reports of outdated projections. For example created users missing in get requests, or missing data after a ZITADEL upgrade1. The conclusion is that the transaction in which we perform a bulk of queries can timeout. The old setting defined a transaction duration of 500ms for a bulk of 200 events. A single event may create multiple statements in a single projection. A timeout may occur even if the actual bulk size is less than 200, which then results in more back-pressure on a busy system, leading to more timeouts and effectively dead-locking a projection.
Increasing or disabling the projection transaction duration solved dead-locks in all reported cases. We've decided to increase the transaction duration to 1 minute. Due to the high value it is functionally similar to disabling, however it still provides a safety net for transaction that do freeze, perhaps due to connection issues with the database.
Statement​
A summary of bug reports can be found in the following issue: Missing data due to outdated projections. This change was submitted in the following PR: fix(projection): increase transaction duration, which will be released in Version 2.63.0
Mitigation​
If you have a custom configuration for projections, this update will not apply to your system or some projections. When encountering projection dead-lock consider increasing the timeout to the new default value.
Note that entries under Customizations
overwrite the global settings for a single projection.
Projections:
TransactionDuration: 1m # ZITADEL_PROJECTIONS_TRANSACTIONDURATION
BulkLimit: 200 # ZITADEL_PROJECTIONS_BULKLIMIT
Customizations:
custom_texts:
BulkLimit: 400
project_grant_fields:
TransactionDuration: 0s
BulkLimit: 2000
Impact​
Once this update has been released and deployed, transactions are allowed to run longer. No other functional impact is expected.
Footnotes​
-
Changes written to the eventstore are the main source of truth. When a projection is out of date, some request may serve incomplete or no data. The data itself is however not lost. ↩