Sync DataSource

Hello,

When I upload a file by ftp, it doesn’t sync in the pydio web interface.
I need to go to /settings/admin/scheduler and manually refresh the datasource.

Is this intentional or a bug?
Can I tell the scheduler to sync the datasource every 5 minutes, for example?
Or better, sync the datasource when I click on the refresh icon in the web interface?

Thanks !

Hello @lubovic,

You can use the cells-ctl client to resync a datasource if you put data directly on the datasource (without going through the web UI).

https://download.pydio.com/pub/cells/release/

You can download the cells-ctl matching your Cells version and then run the following command, replacing pydiods1 with your datasource name.
You could then script this to run after your FTP uploads to make sure that everything stays in sync.

./cells-ctl data sync --service pydio.grpc.data.sync.pydiods1

Awesome! It’s working nicely.
Thanks a lot!

Just for the record, during the last adjustments of the API before v2.0, we merged all useful commands from cells-ctl directly into the main cells CLI.

So as of 2.0.0-rc2, to launch a resync from your shell, you should now run:

./cells data sync --service pydio.grpc.data.sync.pydiods1
# or even, with the new flag:
./cells data sync --datasource pydiods1
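
For the periodic option asked about earlier in the thread, here is a minimal sketch of a wrapper script that cron could call. It only wraps the `./cells data sync --datasource` command quoted above. The script location, the lock file path and the `run` argument are assumptions made for illustration, not something Cells ships.

```shell
#!/bin/sh
# Sketch only: paths and the "run" argument are hypothetical.
CELLS_BIN="${CELLS_BIN:-./cells}"      # path to the cells binary (assumed)
DATASOURCE="${DATASOURCE:-pydiods1}"   # datasource name from the example above

# Run the resync command quoted above for one datasource.
resync_datasource() {
    "$CELLS_BIN" data sync --datasource "$1"
}

# Only trigger the sync when the script is called as: resync.sh run
if [ "${1:-}" = "run" ]; then
    resync_datasource "$DATASOURCE"
fi

# Example crontab entry (hypothetical paths), every 5 minutes.
# flock -n skips a run while the previous one still holds the lock:
#   */5 * * * * flock -n /tmp/cells-resync.lock /opt/cells/resync.sh run
```

The `flock -n` guard is there so that runs do not pile up if a sync takes longer than the cron interval.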

I am just moving from Pydio 8 to cells and I am sure that with 8 FTP’d (or samba’d) files were automatically shown in the web interface when browsing. Are you guys saying that this has now changed and we need to somehow run ./cells data sync every time a new file is added not via the web interface?

How are you triggering that sync, or should it just be periodic via a cronjob?

Thanks.

Yes.

For the time being:

  • SAMBA file systems are not supported
  • if you add files directly on the file system (without going through the API), you have to explicitly trigger a datasource resync for the files to be seen by Cells. That said, you have more tools than just the web UI to add files; typically, the cells-client is a good option, depending on what you want to achieve.

Thanks for letting me know. Unfortunately that seems pretty cumbersome: I have users that connect to the drive directly (via samba) and other users that use the Cells web interface. A datasource resync currently takes about 5 minutes, so if a user connecting via samba wants to make a file available to someone connecting via the Cells interface, that person won’t have access to it until a resync has been run. Why can’t Cells resync the current folder/path whenever it is accessed or opened? Surely resyncing that way would be very quick and would solve this problem completely. I am pretty sure this is how Pydio 8 worked, going all the way back to AjaXplorer.

Please tell me this is something that might come back.

Mentioning Issue with adding Datasource here since it’s about problems creating a datasource from a folder already containing files unknown (yet) to Cells.

Although the cells-ctl hint is very useful, I also think that “quick / immediate / inotify-based” sync out-of-the-box would be a very important feature. It’s the only alternative when the web-importer is not practical (and there could be a variety of reasons):

  • too many / too big files
  • files not present locally / files already on the servers
  • files must not be imported but only discovered to avoid duplication
  • backend is slow (S3 / Swift) and upload has been done via a side-channel

For all these cases a good sync is a strongly needed fallback (if not a first-class feature).
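
Until something like that exists out of the box, a rough workaround could be to watch the datasource folder yourself and trigger the resync command from earlier in the thread. The sketch below assumes a local Linux file system with `inotifywait` (from inotify-tools) installed. The watched path, the quiet period and the `watch` argument are hypothetical, and it would not help for S3 / Swift backends.

```shell
#!/usr/bin/env bash
# Sketch only, not a Cells feature: paths and defaults are assumptions.
CELLS_BIN="${CELLS_BIN:-./cells}"
DATASOURCE="${DATASOURCE:-pydiods1}"
WATCH_DIR="${WATCH_DIR:-/var/cells/data/pydiods1}"   # hypothetical datasource folder
QUIET_SECONDS="${QUIET_SECONDS:-30}"                 # wait for changes to settle

# Run the resync command quoted earlier in the thread.
resync() {
    "$CELLS_BIN" data sync --datasource "$DATASOURCE"
}

# inotifywait -m prints one line per file system event.
watch_and_resync() {
    inotifywait -m -r -q -e close_write -e moved_to -e create -e delete "$WATCH_DIR" |
    while read -r _line; do
        # Drain further events until none arrive for QUIET_SECONDS,
        # so a batch of uploads triggers a single resync.
        while read -r -t "$QUIET_SECONDS" _line; do :; done
        resync </dev/null
    done
}

# Only start watching when called as: watch-resync.sh watch
if [ "${1:-}" = "watch" ]; then
    watch_and_resync
fi
```

Since a full resync can take minutes, the quiet period matters: without it every single file event would start a new run. Recursive watches on large trees can also hit the kernel's inotify watch limit.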