Rclone can't perform a S3 multipart upload to Pydio

I was getting an invalid MD5 error when trying to perform a multipart upload with Rclone.
Here is what I have found.

The function FromContentMD5 from minio/pkg/etag/etag.go expects a base64-encoded string but receives a hexadecimal-encoded one.
However, I have checked that Rclone is sending the correct Content-Md5 header (run it with args -vvv --dump=headers). I can also see the valid base64-encoded header in the Pydio log.
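To make the difference concrete, here is a small sketch that prints both encodings of the same MD5 digest. The helper name `md5Encodings` is made up for this illustration; only standard library calls are used. The point is that a 16-byte digest is 24 characters in base64 but 32 characters in hex, so the two forms can be told apart by length.

```go
package main

import (
	"crypto/md5"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// md5Encodings (hypothetical helper) returns the base64 and the hexadecimal
// form of the MD5 digest of data.
func md5Encodings(data []byte) (b64, hexStr string) {
	sum := md5.Sum(data) // 16-byte digest
	return base64.StdEncoding.EncodeToString(sum[:]), hex.EncodeToString(sum[:])
}

func main() {
	b64, hexStr := md5Encodings([]byte("hello"))
	// base64 is the form expected in a Content-MD5 header.
	fmt.Println("base64:", len(b64), "chars:", b64)
	// hex is the form FromContentMD5 was actually receiving.
	fmt.Println("hex:   ", len(hexStr), "chars:", hexStr)
}
```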
My fix:

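	if v[0] == "" {
		return nil, errors.New("etag: content-md5 is set but contains no value")
	}
	// A hex-encoded MD5 is always 32 characters long; accept it as a fallback.
	if len(v[0]) == 32 {
		// Named "decoded" so it does not shadow the http.Header parameter h.
		decoded, err := hex.DecodeString(v[0])
		if err == nil {
			return ETag(decoded), nil
		}
	}
	// Otherwise continue with the original strict base64 decoding.
	b, err := base64.StdEncoding.Strict().DecodeString(v[0])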

Now I’m able to upload files successfully.
Additionally, I recommend using the --low-level-retries 1 flag to avoid slowdowns when performing operations.
@GwynethLlewelyn I think you might be interested in this topic.

Gosh, can you believe that I have only read this today — a year after you posted this??

I suppose that this comes from FromContentMD5(), right?

// FromContentMD5 decodes and returns the Content-MD5
// as ETag, if set. If no Content-MD5 header is set
// it returns an empty ETag and no error.
func FromContentMD5(h http.Header) (ETag, error) {
	v, ok := h["Content-Md5"]
	if !ok {
		return nil, nil
	}
	if v[0] == "" {
		return nil, errors.New("etag: content-md5 is set but contains no value")
	}
	b, err := base64.StdEncoding.Strict().DecodeString(v[0])
	if err != nil {
		return nil, err
	}
	if len(b) != md5.Size {
		return nil, errors.New("etag: invalid content-md5")
	}
	return ETag(b), nil
}

I guess it’s still wrong (according to your code, I mean).

The bad news?

I’ve actually copied that from the upstream package, i.e. from MinIO itself, not from Cells.

I suppose that @bsinou will frown upon changing a local dependency — only to get it overwritten by any subsequent upgrade from MinIO (which is in constant development). In other words: sure, this can be easily fixed in Cells — but the issues come from MinIO.

I wonder if you could try to add a PR to MinIO’s own repo on GitHub. With luck, they’ll accept it, and with the next release, it would neatly fix Cells… once and forever! Yay!

I did not try to compile it myself, though. Not yet! But I’m very curious if it’s just something that simple. Well — simple, because you have spent so much time in debugging it, of course.

Because it’s not only Rclone that cannot connect via S3. A lot of things don’t. To name a few that I use every day (the list is not exhaustive): Rclone (obviously!), Transmit (macOS file transfer client), Cyberduck (platform-independent file transfer client), the remote sync mechanism of the Synology NAS… and then a few more esoteric things as well. And that’s just the beginning!..

Not to mention the ability to mount, via FUSE, an external S3 filesystem :blush: Ah, there are so many things that I wish to do via S3…

I suddenly realise that the only reason that I still keep “our” Cells installation active is because I have been waiting for the past few years for a solution like yours. Because nobody wants to use something using a Web page any longer. Even our self-proclaimed “power users” are wary of using “Web pages” when it’s so easy and convenient to mount a remote filesystem these days — Dropbox, Google Drive, OneDrive, pCloud, etc. etc. and even things such as ownDrive. Alas, these tools only do file storage, of course, they don’t have the richness and flexibility of Pydio Cells — and many don’t support S3, obviously, but they are “established” services for which everybody writes drivers… while Pydio’s own protocol is naturally not supported. S3 is sort of a good compromise between richness of features and portability across different providers, although possibly with a certain lack of performance (I’ve never benchmarked it). It certainly beats WebDAV easily! :sweat_smile:

I love your loquacity.
I would like to have your level of proficiency in the English language – something tells me that English is not your mother tongue, am I right? – to discuss here a lot of improvements and bug fixes to Pydio Cells.

Pydio uses an old version of Minio, an outdated fork.

I’ve made a number of corrections, but I haven’t shared the code on GitHub.
Maybe I will when I have time.

These fixes have made Pydio Cells usable for large-scale uploads (terabytes or petabytes of data). It works great with Rclone.
However, there are still a few things to fix in the code: for example, the file modification time, which should preserve timestamps with 9-digit nanosecond precision.

Hi guys
Indeed, we have to stick to the fork for licensing reasons :sob:
@jaimedelano we would be super interested in getting your bunch of fixes. You can even send us a big patch file if you do not want to create a PR :wink:
-c

I’d like to help.

I’ve lost the code of the last changes I made. So I’m still recreating it.
I’m currently fixing the file modification time to make it compatible with S3/Rclone.

Hi @charles
I’m going to submit a pull request.
I’ve added support for writing and reading metadata via the S3 gateway.
Also, the file modification date is now saved and displayed correctly with nanosecond precision.

mmm, we are actually implementing that already in the v5 branch …
But will definitely have a look at your suggestions

I, for one, am eager to try these changes out :slight_smile:

So, if I understand you correctly, you didn’t modify the MinIO code — just the Cells code? I’m just asking because of the issue @charles mentioned regarding the usage of MinIO as part of Cells.

Granted, I’m looking forward to v5 as well… hehe

@GwynethLlewelyn These changes are only made to the Pydio Cells code. I’m already using it in production.
@charles Should I PR it to the current version, or to v5 instead?

Hi @jaimedelano
If you already have your changes implemented, they are most probably in the v4 code. So the simplest option is probably to create a PR on the current version (branch @main), and I’ll have a look at how this would apply to the new one (branch @next).

Thx !

PS : If not already done, do not forget to sign the CLA.