CEC scp doesn't use the multipart part size configured in the admin console of the Pydio server

For first post, please answer the questions below!

Describe your issue in detail

I sent a file to the server using the cec scp command under Linux.
The number of parts is not consistent with the part size set in the uploader menu.

What version of Cells are you using?

4.2.5

What is the server OS? Database name/version? Browser name or mobile device description (if issue appears client-side)?

Debian 11

What steps have you taken to resolve this issue already?

No idea why

The number of parts is not consistent with the part size set in the uploader menu.

You are talking about the settings/parameters/uploader page of the admin console, right?

If yes, you are right: cec does not use this. The multipart configuration is hard-coded.
Funnily enough, we were talking about that yesterday, so you might be lucky and see this implemented soon.

Would you mind sharing your use case?

Hi Sir,

Yes, I was speaking about the Uploader page in the admin console, which sets the part size, queue size and timeout per part.
We are testing the transfer of very large files on our platform with various clients: cells-sync, CEC and the web GUI.

  • CEC is valuable because it lets us script the transfers.
  • The web client is for human users, because it is user-friendly.

For us, multipart is not optional: we need it.
Thus, we also need to be able to use CEC in multipart mode.

Do you know whether Pydio plans to update CEC so that the multipart settings can be set manually?

Thanks in advance
Philippe

Do you know whether Pydio plans to update CEC so that the multipart settings can be set manually?

We’ll discuss this and I’ll try to let you know.
Could you still specify the approximate size of the files you are talking about? Do you have an idea of the bandwidth you’ll use between your servers for such a use case?

Hi

We transfer files whose size reaches 1 TB.
Our network bandwidth is variable, depending on the WAN/LAN used.
We tested the transfer with network rates between 27 MB/s and 120 MB/s.
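
To put these figures in perspective, here is a rough back-of-the-envelope estimate of how long a single 1 TB transfer takes at the two rates mentioned. It is only a sketch: it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), a sustained rate, and no protocol overhead or retries. The function name is made up for illustration.

```python
def transfer_hours(size_bytes: float, rate_bytes_per_sec: float) -> float:
    """Ideal transfer duration in hours at a constant rate."""
    return size_bytes / rate_bytes_per_sec / 3600


ONE_TB = 10**12  # assumption: decimal terabyte
MB = 10**6       # assumption: decimal megabyte

# The two rates reported for the tests.
for rate_mb in (27, 120):
    hours = transfer_hours(ONE_TB, rate_mb * MB)
    print(f"{rate_mb} MB/s: about {hours:.1f} h")
```

Under these assumptions the transfer takes a bit more than 10 hours at 27 MB/s and a bit more than 2 hours at 120 MB/s, which is why resumable, per-part uploads matter for files of this size.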

Thanks for informing us regarding this topic.

Philippe N

For info, we released a first alpha version that makes multipart configurable when launching the scp command (multipart already existed before, but its parameters were hard-coded).

With cec v4.0.1-alpha1, you now have these flags:

Flags:
  -h, --help                      help for scp
      --max-parts-number int      Maximum number of parts, S3 supports 10000 but some storage require less parts. (default 5000)
      --multipart-threshold int   Files bigger than this size (in MB) will be uploaded using Multipart Upload. (default 100)
      --part-size int             Default part size (MB), must always be a multiple of 10MB. It will be recalculated based on the max-parts-number value. (default 50)
      --parts-concurrency int     Number of concurrent part uploads. (default 3)
  -q, --quiet                     Reduce the amount of logs
      --skip-md5                  Do not compute md5 (for files bigger than 5GB, it is not computed by default for smaller files).

See cec scp --help for more info
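
As an illustration of how these flags interact for a very large file, the sketch below estimates the part size needed to stay under --max-parts-number. The help text only says that the part size "will be recalculated"; the exact rule is not given in this thread, so the rounding used here (smallest multiple of 10 MB that fits) is an assumption, as are the helper names. Sizes are in decimal MB.

```python
import math


def effective_part_size_mb(file_size_mb: int, part_size_mb: int = 50,
                           max_parts: int = 5000, step_mb: int = 10) -> int:
    """Smallest part size (a multiple of step_mb, at least part_size_mb)
    that keeps the upload within max_parts parts.

    ASSUMPTION: this only mimics the recalculation hinted at by the
    --part-size help text; it is not taken from the cec source code.
    """
    needed = math.ceil(file_size_mb / max_parts)
    size = max(part_size_mb, needed)
    return math.ceil(size / step_mb) * step_mb


def part_count(file_size_mb: int, part_size_mb: int) -> int:
    """Number of parts needed to cover the whole file."""
    return math.ceil(file_size_mb / part_size_mb)


file_mb = 1_000_000  # a 1 TB file, decimal units
size = effective_part_size_mb(file_mb)
print(f"part size: {size} MB, parts: {part_count(file_mb, size)}")
# Flags one might then pass explicitly to cec scp:
print(f"--part-size {size} --max-parts-number 5000")
```

With the defaults (50 MB parts, at most 5000 parts), a 1 TB file would need 20000 parts, so under this assumed rule the part size has to grow to 200 MB to fit.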

I used version 4.0.1 of CEC to keep testing the transfer of very large files.
I have a set of questions regarding the use of this tool:

1/ Do you plan to provide a debug mode for CEC to help us trace the messages sent and received by CEC?
2/ Can you confirm that the parameters --max-parts-number, --multipart-threshold, --part-size and --parts-concurrency take precedence over the parameters set in the uploader menu of the admin console?
3/ Why is the request timeout not one of the CEC input parameters, like the other parameters listed in 2/?

Thanks in advance
Philippe Nantier

I did not really get your question, but the server logs from when the problem happens, together with the corresponding output of the cec command on the client side, would certainly help to diagnose the issue.

yes

Technically speaking, this parameter is not handled in the same place in the underlying transfer libraries, which might explain why it has been kind of forgotten.
But we are working on this.

Hi

We still use CEC v4.0.0 and expect to move to 4.0.1.

Do you know what the hard-coded multipart parameters of CEC v4.0.0 and lower are?

  • part size?
  • number of parts sent in parallel?
  • timeout per part?

thanks
Philippe

Hi

Sorry, I have no time to dig that out, but feel free to have a look at the corresponding code if you really want or have to: GitHub - pydio/cells-client: Command line client to communicate with cells REST api.

The good news is that we took the time to rework both the sdk-go and the cells client to use the aws-sdk-go-v2 instead, which brings nice new features.

We have just pushed a working version to the main branch and we would be glad to have your feedback on it.

There are still a few things we want to add before releasing a 4.1.0, but it is already working well:

  • fixes a few bugs
  • adds a request timeout parameter (default is no timeout)
  • uses the latest AWS SDK with more options, improved performance and security
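
Since the default is no timeout, anyone who sets one needs a value that a slow link can still meet. The sketch below is one rough way to estimate it. It assumes that concurrent parts share the bandwidth evenly and applies an arbitrary safety factor; the function name and the factor are illustrative and are not part of cec.

```python
def part_timeout_seconds(part_size_mb: float, bandwidth_mb_per_sec: float,
                         concurrency: int = 3,
                         safety_factor: float = 3.0) -> float:
    """Rough per-part timeout: ideal upload time of one part when
    `concurrency` parts share the link, multiplied by a safety margin.

    ASSUMPTION: even bandwidth sharing and a constant rate.
    """
    per_part_rate = bandwidth_mb_per_sec / concurrency
    return part_size_mb / per_part_rate * safety_factor


# Example: a 200 MB part on the slowest tested link (27 MB/s),
# with 3 parts uploading at once.
print(round(part_timeout_seconds(200, 27)))
```

For a 200 MB part at 27 MB/s with three concurrent parts, this gives about 67 seconds; a larger safety factor is worth considering on a variable WAN.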

You can find the binary there:

https://download.pydio.com/pub/cells-client/dev/