VERSION_MISMATCH on batch-upsert despite sending freshly fetched version

We’re hitting a VERSION_MISMATCH error on /v2/catalog/batch-upsert for certain catalog items, and we cannot explain why.

Our flow:

  1. GET the item to retrieve the latest version
  2. Immediately POST to batch-upsert with that exact version on both the parent ITEM and nested ITEM_VARIATION
  3. Get VERSION_MISMATCH back
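
For anyone trying to reproduce this, here is a minimal sketch of how step 2 might build the request body from the object returned in step 1. The sample IDs, names, and version number are placeholders, and `build_batch_upsert_body` and `versions_in` are hypothetical helpers, not part of any Square SDK. It only shows that the version on the parent ITEM and the nested ITEM_VARIATION is copied through unchanged.

```python
import copy
import uuid

# Placeholder shape of a fetched catalog ITEM (IDs and version are made up).
FETCHED_ITEM = {
    "type": "ITEM",
    "id": "ITEM_ID",
    "version": 1700000000000,
    "item_data": {
        "name": "Old name",
        "variations": [
            {
                "type": "ITEM_VARIATION",
                "id": "VARIATION_ID",
                "version": 1700000000000,
                "item_variation_data": {"item_id": "ITEM_ID", "name": "Regular"},
            }
        ],
    },
}


def build_batch_upsert_body(fetched_item, new_name):
    """Build a batch-upsert body that keeps every fetched version as-is."""
    item = copy.deepcopy(fetched_item)  # never mutate the fetched object
    item["item_data"]["name"] = new_name
    return {
        "idempotency_key": str(uuid.uuid4()),
        "batches": [{"objects": [item]}],
    }


def versions_in(body):
    """Return {object_id: version} for each object and its nested variations."""
    versions = {}
    for batch in body["batches"]:
        for obj in batch["objects"]:
            versions[obj["id"]] = obj["version"]
            for variation in obj.get("item_data", {}).get("variations", []):
                versions[variation["id"]] = variation["version"]
    return versions


body = build_batch_upsert_body(FETCHED_ITEM, "New name")
print(versions_in(body))
```

Logging the output of `versions_in` right before the POST gives a record of exactly which versions were sent, which can be compared against the API logs.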

What makes this strange:
After the failed POST, if we GET the item again, the parent ITEM version has changed but the ITEM_VARIATION version is unchanged. So it appears Square is committing the parent write and then failing on the variation version check, even though we sent the correct version for both.

This only happens on specific items and is 100% reproducible on those items every time. Our integration code and payload structure are identical for all items.

Has anyone seen this before? Is Square partially committing writes on failed requests by design? And why would the parent version change on a request that returned an error?

Any insights would be appreciated!

Hey @Shahzain! Usually when this happens, it’s because there are multiple asynchronous operations being performed on the item in a short time frame. For example:

  1. Item update request made
  2. Item is retrieved, but the update from (1) has not completed
  3. Item update request from (1) completes, changing the version
  4. Another item update request is made, using the version number from (2), which is now out of date
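
The sequence above can be replayed against a toy in-memory store. This is only an illustration of optimistic version checking in general; `FakeCatalog` and `replay_race` are made-up names and say nothing about how the Catalog API is implemented internally.

```python
class VersionMismatch(Exception):
    """Raised when an update carries a version that is no longer current."""


class FakeCatalog:
    """Toy store holding a single object guarded by a version number."""

    def __init__(self):
        self.version = 1
        self.name = "original"

    def get(self):
        return {"version": self.version, "name": self.name}

    def update(self, name, version):
        if version != self.version:
            raise VersionMismatch(f"sent {version}, current is {self.version}")
        self.name = name
        self.version += 1


def replay_race():
    catalog = FakeCatalog()
    first = catalog.get()
    # (1) An update is requested with first["version"] but has not landed yet.
    stale = catalog.get()  # (2) Retrieved before (1) completes.
    catalog.update("first update", first["version"])  # (3) Version changes.
    try:
        catalog.update("second update", stale["version"])  # (4) Stale version.
    except VersionMismatch:
        return "VERSION_MISMATCH"
    return "OK"


print(replay_race())
```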

Does that sound possible with the application logic you’re working with?

If you believe it’s a different issue, can you let me know the Catalog Object ID of an item you’re working with where you’re seeing this happen?

Hi @josh-square

Thanks for the response. The Catalog Object ID is: O5EH6PLJOQURPTFU5TMGUSZK

However, the scenario you described doesn’t apply here. Our integration performs a GET immediately before every POST, and there are no other concurrent operations running on these items at the time. We’ve also confirmed that the version is stable across repeated GETs before the POST, which rules out any async update in progress.

What we can’t explain is this: after the failed POST, re-fetching the item shows the parent ITEM version has changed but the ITEM_VARIATION version is unchanged. This suggests Square is committing the parent write and then failing on the variation check, on a request that ultimately returns an error. Is this expected behavior on your end?

Update: Upon further investigation on this specific item, we noticed that in some cases after receiving the VERSION_MISMATCH error, both the parent ITEM and ITEM_VARIATION versions have been bumped to the same new value, suggesting the write may actually be going through despite the error response. We need visibility from your end into what’s happening during the write for these specific catalog objects.

Thanks

Thanks, that’s very helpful! I found the logs for the version history of your catalog object, and there are some things I’d like to follow up with our catalog team on. I’ll escalate to them and let you know when I have an update!

Thanks Josh, really appreciate you digging into the logs and escalating this. Looking forward to hearing what the catalog team finds!

Hi @josh-square,

I’d like to follow up on this issue.

Kind regards,

Shahzain

@Shahzain I don’t have an update from the team here, but in the meantime I did take a look at your API Logs. Based on the sequence of requests, it does look like there are some attempts to update the object that happen without first retrieving the item to get its most up-to-date version (see attached screenshot). Can you take a look at your app logic and confirm whether that’s the case?

I do still think it’s unusual that failed requests are seemingly updating the version, but I’m also curious if ensuring those updated versions are being retrieved first will resolve your issue in the short-term.
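
As a rough sketch of that short-term approach: re-read the object before every attempt, and skip the write if an earlier attempt that reported an error actually landed. Everything here is hypothetical. `fetch`, `upsert`, and `apply_change` are callables you would supply yourself, and `upsert` is assumed to return a list of error codes that is empty on success.

```python
def update_with_fresh_version(fetch, upsert, object_id, apply_change, max_attempts=3):
    """Re-fetch before each write; stop early if the change is already applied.

    fetch(object_id) -> dict           current object, including its version
    upsert(obj) -> list of str         error codes, empty list on success
    apply_change(obj) -> dict          new dict with the desired edits applied
    """
    for _ in range(max_attempts):
        current = fetch(object_id)  # always start from the latest version
        desired = apply_change(current)
        if desired == current:
            # A previous attempt may have been written despite its error.
            return current
        errors = upsert(desired)
        if not errors:
            return fetch(object_id)
        if "VERSION_MISMATCH" not in errors:
            raise RuntimeError(f"upsert failed: {errors}")
    raise RuntimeError(f"still mismatched after {max_attempts} attempts")
```

Comparing `desired` with `current` is what prevents a second write when the first one went through anyway, which matches the behavior described earlier in this thread.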

We are experiencing something similar with some of our customers: the batch-upsert endpoint creates new items even though we get an error message back. This causes us to generate duplicates on our next call, because we expect an errored upsert to fail the entire batch. We are not getting an initial VERSION_MISMATCH, but that is probably because these are new items. This is the specific error we are seeing at one of our customers’ sites: (this.Category = INVALID_REQUEST_ERROR, this.Code = BAD_REQUEST, this.Detail = sake INVALID_REQUEST: INVALID_REQUEST: location_overrides[LJ362V5ZBA3AY]: requested rule with token L5V6NRQRQAJKYAIOXYQSPLKS may not be changed in place, this.Field = null)

I was going to create another post for this as it is slightly different, but this does appear to be the same root cause, in that errored batch upserts are actually updating the catalog when they shouldn’t.

Has there been any progress on this issue?

Yes, I do have an update here! The actual root cause is fairly complex, but the team confirmed there are situations where asynchronous operations can result in some parts of a request succeeding before it ultimately fails. We’re looking into ways to clean this up for a clearer developer experience.

In the meantime, we recommend continuing to retrieve the latest version immediately before making catalog updates. To prevent duplicated requests when an error/retry mechanism is used, you can set an idempotency_key on each request. This ensures that a request which is actually a duplicate is rejected if it has already been processed.
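
A minimal sketch of the idempotency_key suggestion, assuming a retry loop around the request: the key is generated once per logical write and reused on every retry, so a retry is recognizable as a duplicate. `upsert_with_retry` and the `send` callable are hypothetical; `send` stands in for whatever HTTP client or SDK call you use to POST the body to /v2/catalog/batch-upsert.

```python
import time
import uuid


def upsert_with_retry(send, objects, max_attempts=3, backoff_seconds=0.0):
    """POST one batch, reusing a single idempotency_key across all retries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    body = {
        "idempotency_key": str(uuid.uuid4()),  # created once, outside the loop
        "batches": [{"objects": objects}],
    }
    last_error = None
    for attempt in range(max_attempts):
        try:
            return send(body)
        except ConnectionError as exc:  # retry only on transport failures
            last_error = exc
            time.sleep(backoff_seconds * (2 ** attempt))
    raise last_error
```

Generating the key inside the loop would defeat the purpose, since every retry would then look like a brand-new request.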