Let's assume a relational DB-backed API. I understand the use of optimistic concurrency control (e.g. via a version field) to prevent lost updates from clients, where the read/update/write cycle is performed by the client. But now imagine a PATCH request sent by a client: the client need not have any prior knowledge of the resource (or only minimal knowledge, e.g. its ID) and just sends the data to update (e.g. in JSON Patch format, or whatever).
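(For reference, this is roughly what I mean by client-driven OCC; a minimal sketch where the `resources` table and the `db` helper, which returns the number of affected rows, are made up:)

```python
class ConflictError(Exception):
    pass

def put_resource(db, resource_id, new_data, expected_version):
    # The client previously read the resource and got `expected_version`
    # (e.g. sent back via an If-Match header or a version field in the body).
    affected = db.execute(
        "UPDATE resources SET data = %s, version = version + 1 "
        "WHERE id = %s AND version = %s",
        (new_data, resource_id, expected_version),
    )
    if affected == 0:
        # The row changed (or disappeared) since the client read it:
        # reject with 409/412 and let the client re-read and retry.
        raise ConflictError(resource_id)
```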
For the purposes of this discussion, imagine the resource has a field that is updated according to some business rules: for example, it's a set of fingerprints, and each client should be able to add its own fingerprint to the set without deleting the existing ones (the real case is more complex, but this gives the idea). So the order in which the requests are evaluated is not critical, but it is critical that no request be lost, otherwise some client would end up without its fingerprint in the set.
When it receives the request, to apply the patch the server must read the resource from the backend (to check that it exists and to fetch any values that need to be merged/patched), apply the requested changes, and write back the result, all on the server side without any client involvement. It's not hard to see that such a sequence of operations is racy and can easily lead to lost updates if multiple clients send a PATCH for the same resource and fields at the same time.
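To make the race concrete, this is the naive server-side flow I have in mind (a sketch; `db`, `apply_patch` and the `resources` table are placeholders):

```python
def patch_resource_naive(db, resource_id, patch):
    # 1. Read the current state.
    row = db.query_one("SELECT data FROM resources WHERE id = %s", (resource_id,))
    if row is None:
        raise LookupError(resource_id)  # 404

    # 2. Apply the patch in application code (e.g. add the client's fingerprint).
    new_data = apply_patch(row["data"], patch)

    # 3. Write back the merged result.
    # If another request ran steps 1-3 between our step 1 and our step 3,
    # its changes are silently overwritten: a lost update.
    db.execute("UPDATE resources SET data = %s WHERE id = %s",
               (new_data, resource_id))
```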
How to prevent this? I can think of a couple of strategies:
- use pessimistic locking (i.e. real DB locking) on the resource for the duration of the operation (see the first sketch after this list)
- use optimistic locking "internally", i.e. fetch the resource and, when writing it back, check that it didn't change in the meantime; if it did, either fail the operation right away (the client will have to retry or whatever) or retry with the new version of the resource, and so on (perhaps up to a fixed number of times) until it succeeds (see the second sketch below)
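Sketch of the pessimistic variant, assuming PostgreSQL-style `SELECT ... FOR UPDATE` and the same placeholder `db`/`apply_patch` helpers as above:

```python
def patch_resource_pessimistic(db, resource_id, patch):
    with db.transaction():
        # Row-level lock: concurrent PATCHes on the same resource block here
        # until this transaction commits or rolls back.
        row = db.query_one(
            "SELECT data FROM resources WHERE id = %s FOR UPDATE",
            (resource_id,),
        )
        if row is None:
            raise LookupError(resource_id)  # 404
        new_data = apply_patch(row["data"], patch)
        db.execute("UPDATE resources SET data = %s WHERE id = %s",
                   (new_data, resource_id))
    # The lock is released on commit; the next waiting request then sees
    # the fingerprint we just added.
```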
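And the internal-optimistic variant with a bounded retry loop (again a sketch; the `version` column and the helpers are assumptions):

```python
MAX_RETRIES = 5

class ConflictError(Exception):
    pass

def patch_resource_optimistic(db, resource_id, patch):
    for _ in range(MAX_RETRIES):
        row = db.query_one(
            "SELECT data, version FROM resources WHERE id = %s",
            (resource_id,),
        )
        if row is None:
            raise LookupError(resource_id)  # 404
        new_data = apply_patch(row["data"], patch)

        # Conditional write: only succeeds if nobody bumped the version
        # since we read it.
        affected = db.execute(
            "UPDATE resources SET data = %s, version = version + 1 "
            "WHERE id = %s AND version = %s",
            (new_data, resource_id, row["version"]),
        )
        if affected == 1:
            return  # success
        # Lost the race: loop around, re-read the fresh state, re-apply the patch.
    raise ConflictError(f"gave up after {MAX_RETRIES} attempts on {resource_id}")
```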
Anything I'm missing here? Any additional strategies?