Widespread issues creating instances via datacrunch SDK/API

proteindesigner · September 23, 2025, 4:30am

I am seeing widespread issues creating instances with the Python SDK/API. The API returns an APIException “server_error”. In some cases the dashboard shows the instance status as “Error”. In other cases the instance is created anyway despite the API error, which then requires manual cleanup in the DataCrunch dashboard. As you can see from my forum posts, we are having a lot of issues with the API/SDK. It is no longer limited to FIN-03/B200s: I’m seeing it with other instance types at other locations, such as A100 (FIN-01) and H200 (FIN-02/ICE-01). This is a NEW issue. I have been using the exact same scripts to spin up instances without issue for several months, using the same volumes, etc.

TeamVerda · September 23, 2025, 7:01am

Hi,

Sorry to hear that you’re having trouble. Please let us know your User ID and Project ID so we can investigate the problem ASAP.

proteindesigner · September 24, 2025, 5:56am

I replied in a private conversation, since I prefer to keep the User ID and Project ID confidential.

FYI, I have now tried spinning up instances using direct HTTP API calls (https://api.datacrunch.io/v1/instances) and am getting the same “server_error” response. So it’s not a Python SDK issue.

proteindesigner · September 24, 2025, 6:30am

Here’s what’s happening and what DataCrunch needs to address:

We create instances through the documented flow: POST /v1/instances with instance_type, hostname, etc. That call returns HTTP 202 with the new instance ID, and the instance really does come up (we can see it running later).
The problem is the very next step: calling GET /v1/instances/{id}, either directly over HTTPS or via the official Python SDK’s InstancesService.get_by_id, almost immediately after the POST. That GET frequently comes back with HTTP 500 and the JSON error {"code":"server_error","message":null}. Because the SDK just wraps that response in APIException, every client that does “create → fetch details” fails even though the instance exists.
To prove it’s a transient backend issue, we added a trivial workaround: after the POST returns the new ID, we wait a couple of seconds (and retry up to a few times). Once we delay, GET /v1/instances/{id} succeeds and returns the full instance payload. That’s why we’re confident the instance is being created correctly: the API just needs a breather before the single-instance endpoint stops returning 500.
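
For anyone who wants to reproduce the workaround, here is a minimal sketch of the "wait, then look up, retry a few times" idea. It is not our exact script: the helper name `fetch_with_retry`, its parameters, and the default values are made up for illustration, and the lookup is passed in as a callable so the sketch does not assume any particular SDK signature.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    attempts: int = 4,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Wait, call fetch(), and repeat until it succeeds or attempts run out.

    Hypothetical helper: `fetch` stands in for the single-instance lookup,
    and `sleep` is injectable so the logic can be tested without waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        # Give the backend a moment before every lookup, including the first.
        sleep(delay_seconds)
        try:
            return fetch()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
    raise AssertionError("unreachable")
```

In practice the callable would be something like `lambda: client.instances.get_by_id(instance_id)` with `retry_on=(APIException,)`, where `client` and `instance_id` are whatever your own script already has. The fixed delay keeps it simple; it only hides the problem on the client side and does not fix it.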

So the ask for DataCrunch:

Restore GET /v1/instances/{id} so it’s stable immediately after a successful create, or explicitly document/handle the consistency delay in the SDK.
Update the SDK (instances.get_by_id) to incorporate a retry/backoff if the backend keeps requiring a startup delay, so downstream clients don’t all have to implement their own hacks.
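
To make the second ask concrete, this is roughly the behavior we have in mind, sketched as a client-side decorator rather than an actual SDK change. Everything here is an assumption for illustration: the names `with_backoff` and `is_transient_server_error`, the default delays, and in particular the idea that the raised exception exposes the API error code as a `.code` attribute. Only errors classified as transient are retried, so a genuine failure still surfaces immediately.

```python
import functools
import time
from typing import Any, Callable


def is_transient_server_error(exc: BaseException) -> bool:
    # Assumption: the exception carries the API error code as `.code`.
    return getattr(exc, "code", None) == "server_error"


def with_backoff(
    is_retryable: Callable[[BaseException], bool] = is_transient_server_error,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Retry the wrapped call with exponential backoff on retryable errors."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    last_attempt = attempt == max_attempts - 1
                    if last_attempt or not is_retryable(exc):
                        raise  # give up, or not a transient error at all
                    # Delays grow 0.5, 1, 2, 4, ... capped at max_delay.
                    sleep(min(max_delay, base_delay * 2 ** attempt))

        return wrapper

    return decorator
```

A caller would wrap its own lookup function with `@with_backoff()`. Capping the delay keeps the worst-case wait bounded, and the predicate can be swapped out if the error turns out to be exposed differently.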

Until then we’re sleeping a couple of seconds before the lookup, but it’s a stopgap.

TeamVerda · September 25, 2025, 10:58am

Hello proteindesigner,

Thank you very much for all the details you have provided. We deployed a fix yesterday that resolves this issue.

We’ve added credit to your DataCrunch account since your input helped us track this down.

Alexey

proteindesigner · September 25, 2025, 4:19pm

Thanks so much. I’m just glad to see everything running super smoothly at my favorite neocloud provider.
