"PutObjectPart has failed" (the S3 upload failure monster revival)

This is a re-roll of the major (and clearly, easily reproducible) issue I commented on in NoSuchKey: Cells silently mis-uploaded files on S3 storage (now closed, but it contains all the gory details).

Multi-MB files consistently fail to upload to S3, triggering a variety of errors:


{"level":"info","ts":1671762577.533683,"msg":"http: proxy error: context canceled"}


{"level":"error","ts":"2022-12-23T03:34:59+01:00","logger":"pydio.gateway.data","msg":"PutObjectPart has failed","error":"Put \"https://xxxyyy.s3.dualstack.eu-central-1.amazonaws.com/pydio/c2960499-49d9-45ef-aeb0-86be1197abbb?partNumber=2&uploadId=lPi4HhnfP8Nj..GuzBHx0gHMArDzwyD4YZn8NG4twvqVKIGYgvPmmvbtqa4.19LpFWM5MIYwn6bRIPvgPgIBksRHoSAydcco1jKBeUifUUmJ5qIERhPTScbZktQe0v_.\": context canceled","RemoteAddress":"<client IP>","UserAgent":"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:108.0) Gecko/20100101 Firefox/108.0","ContentType":"application/octet-stream","HttpProtocol":"HTTP/1.1","UserName":"admin","UserUuid":"a69dd805-fd54-4c56-9b5f-xxx","GroupPath":"/","Profile":"admin","Roles":"ROOT_GROUP,ADMINS,a69dd805-fd54-4c56-9b5f-xxx"}

Unlike with 3.0.9, caddy_errors.log no longer outputs anything (not even the HTTP 502 error codes it used to show).

The ability to DUMP THE TRAFFIC and DEBUG WHAT’S GOING ON is more pressing than ever.

I’d love to pinpoint, from the traffic, where in the stack the error lies. Is it a missing header/query-string/timeout/token? A content type? Is it between the client and Pydio? Between Pydio and AWS? In a request? A response?

But I can’t because Pydio makes me BLIND <add here tons of !>

With HTTPS everywhere, tcpdump is of no use. So please, please, please, at the very least, add an environment variable like DUMP_TRAFFIC=1 (modeled after Apache2’s mod_dumpio) that would log every possible packet to an all-packets.log file.

Because the only thing more frustrating than a six-month-old, critical, reproducible bug in production affecting file uploads is … the inability to actually debug/pinpoint its origin.

Thank you


I’ll look into why caddy_errors.log doesn’t output anything anymore. We’ve changed the version of Caddy, so it might just be a misconfiguration.

We’re also looking at improving traffic tracking within the microservices, but I doubt that is where we’ll find answers: a context canceled usually means that an incoming request has been cancelled on the client side, or has timed out (though in most cases a timeout explicitly states that it was cancelled for that reason).

I know it may have been asked already, but since you mention that it consistently fails: do you have any reproducible steps we could follow?
I see in the other ticket that you installed Cells without any proxy, but do you have any firewall or network rules that could explain a client connection being cancelled?
Have you tried a bare local version of Cells (we provide Docker images for easy, small installations) to see if it is still reproducible there? Have you tried another S3 instance to see if it also happens there?


I’ll try with local storage, but in the meantime I found a lot of the following HEAD /pydio/xxxx HTTP/1.1" 404 NoSuchKey entries in my AWSLogs/ (I’ll try to reproduce so I can relate the exact timestamps):

(Line split for readability)

e1bb965e7382d09f7199fed58f4e9709a9d310da395facc0ba8e559acd4e6ad4 xxxx-pydio
[23/Dec/2022:02:30:03 +0000] 2001:::1324 arn:aws:iam::240676001234:user/pydio
5B0ZY1PJR73M8HVQ REST.HEAD.OBJECT pydio/ddfe5c26-3667-4fa2-a9b6-8cb4e9cd3637
"HEAD /pydio/ddfe5c26-3667-4fa2-a9b6-8cb4e9cd3637 HTTP/1.1" 404 NoSuchKey
313 - 11 - "-"
"MinIO (linux; amd64) minio-go/v7.0.21" - 
SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader xxxx-pydio.s3.dualstack.eu-central-1.amazonaws.com TLSv1.2

Happy new year

Basically, context canceled mostly means a request timeout.

So if you are uploading big files to S3 over a slow server => S3 connection, you may have to increase the request timeout defined in the “Console > Uploader” options.