One of the biggest problems with coordination in distributed systems is dealing with isolation from the coordination system itself.
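Isolation can have several causes, and the ideal response to each may differ. Unfortunately, some causes look identical to the application even though they are very different: for example, a crashed coordinator and a network partition both show up as requests that time out.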
Replicante chooses consistency over availability: if the coordinator is not responsive, application processes will assume they have no right to the resources they have and stop working.
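The consistency-first rule above can be sketched as follows. This is a minimal illustration, not Replicante's actual code: `FakeCoordinator`, `CoordinatorUnavailable` and `may_use` are hypothetical names introduced here, and the coordinator is an in-memory stand-in.

```python
class CoordinatorUnavailable(Exception):
    """Hypothetical error: the coordinator did not answer in time."""


class FakeCoordinator:
    """In-memory stand-in for a real coordinator (an assumption of this sketch)."""

    def __init__(self):
        self.reachable = True
        self.locks = {}  # lock name -> owner id

    def owner_of(self, lock):
        if not self.reachable:
            raise CoordinatorUnavailable(lock)
        return self.locks.get(lock)


def may_use(coordinator, lock, me):
    """Return True only if the coordinator confirms that `me` owns `lock`.

    Any failure to reach the coordinator is treated as "not owned":
    consistency is preferred over availability.
    """
    try:
        return coordinator.owner_of(lock) == me
    except CoordinatorUnavailable:
        return False


coordinator = FakeCoordinator()
coordinator.locks["shard-1"] = "node-a"
print(may_use(coordinator, "shard-1", "node-a"))  # True

coordinator.reachable = False  # the process is now isolated
print(may_use(coordinator, "shard-1", "node-a"))  # False
```

The important design choice is in the `except` branch: the process cannot tell why the coordinator is silent, so it gives up the resource in every case.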
The problem is that an application process cannot guarantee that it still holds a lock, and that coordination is working, at the moment a write is performed: the lock may be lost for a number of reasons between the check and the write. At the same time, checking the lock very often makes code complexity grow quickly and degrades the performance of both the application and the coordinator.
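The gap between the check and the write can be made visible with a small deterministic sketch. All names here (`LockTable`, `guarded_write`, the `between` hook) are hypothetical and exist only to simulate an event, such as a session expiry, landing inside that gap.

```python
class LockTable:
    """Hypothetical in-memory coordinator state: lock name -> owner."""

    def __init__(self):
        self.owners = {}

    def holds(self, lock, who):
        return self.owners.get(lock) == who


def guarded_write(table, lock, who, store, value, between=None):
    """Check the lock, then write. `between` simulates events in the gap."""
    if not table.holds(lock, who):
        return False
    if between is not None:
        between()  # e.g. the lock expires and another node takes over
    store.append((who, value))  # the write goes ahead on the earlier check
    return True


table = LockTable()
table.owners["primary"] = "node-a"
store = []


def takeover():
    # The lock moves to another node after node-a's check has passed.
    table.owners["primary"] = "node-b"


wrote = guarded_write(table, "primary", "node-a", store, "v1", between=takeover)
print(wrote)                             # True
print(table.holds("primary", "node-a"))  # False
```

The check succeeded and the write was applied, yet by the time the write happened the lock belonged to another node. Checking more often shrinks this gap but cannot close it.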
Replicante checks that it holds its locks before operations are performed and, for long-held locks, periodically at reasonable places (at the beginning and/or end of tasks). This means the window in which the application believes it holds a lock that the coordinator no longer grants it is limited but still large.
Replicante could check its locks before every write operation to shrink that window to the duration of a single write operation.
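The trade-off between checking at task boundaries and checking before every write can be sketched as below. This is an illustration under stated assumptions, not Replicante's implementation: `run_task` and `simulate` are hypothetical, and the lock is assumed to be silently lost once two writes have landed.

```python
def run_task(writes, holds_lock, apply, check_every_write=False):
    """Apply `writes`, checking the lock at task boundaries and optionally per write.

    Returns (checks, still_held): the number of lock checks made and whether
    the last check confirmed the lock.
    """
    checks = 0

    def check():
        nonlocal checks
        checks += 1
        return holds_lock()

    if not check():  # beginning of the task
        return checks, False
    for write in writes:
        if check_every_write and not check():
            return checks, False  # stop before writing without the lock
        apply(write)
    still_held = check()  # end of the task
    return checks, still_held


def simulate(check_every_write):
    store = []
    # Hypothetical failure: the lock is lost once two writes have landed.
    checks, still_held = run_task(
        "abcd", lambda: len(store) < 2, store.append, check_every_write
    )
    return store, checks, still_held


print(simulate(False))  # (['a', 'b', 'c', 'd'], 2, False)
print(simulate(True))   # (['a', 'b'], 4, False)
```

With boundary checks only, two writes land after the lock is gone and the loss is noticed at the end of the task. With a check before every write, the task stops as soon as the loss is visible, at the cost of one coordinator round trip per write.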