Paths excluded from nested path restriction that shouldn't be

Hello there, I’ve encountered an issue with adding a datasource in Cells 2.0.6. It seems that a restriction was added that prevents nested paths from being added as separate datasources. However, this restriction seems to rely on some incorrect pattern matching. For example, I had a data folder under /home/user/data that was the main datasource set up at install. When I tried to add /home/user/files as another datasource, it was detected as a nested path and could not be added. This is true even when the former is nested further: I had it at /home/user/pydio/data, and adding /home/user/files still failed with a nested path error. This bug is the only thing really preventing me from using Pydio currently, because I have several folders in the same home directory where other apps put files, and I would like to add them as separate datasources.

Interestingly the bit of code responsible for this should be working correctly:

I tested out this case briefly here:

Here is the specific error in the log:

Rest Error 500	{"error": "{\"id\":\"datasource.nested.path\",\"code\":409,\"detail\":\"object service local1 is already pointing to /home/gondola/data, make sure to avoid using nested paths for different datasources\",\"status\":\"Conflict\"}"}

Interesting, I didn’t notice this section before. It actually takes the basedir and strips off the top folder to use as the bucket.

So really, the way the edge case plays out is that it believes /home/gondola/data is nested inside of /home/gondola, which is true. However, what we really want to know is whether it is nested inside of /home/gondola/files, which it is not.
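To make the edge case concrete, here is a minimal Python sketch. It is not the Cells code (which is written in Go); the helper names `split_datasource` and `is_inside` are hypothetical, and the split into base directory and bucket is an assumption based on the behaviour described above.

```python
import posixpath


def split_datasource(path):
    """Split a datasource path into (base_dir, bucket).

    Assumption: the last path component is used as the bucket and
    everything before it is the base directory, as described in this thread.
    """
    norm = posixpath.normpath(path)
    return posixpath.dirname(norm), posixpath.basename(norm)


def is_inside(child, parent):
    """Return True if child equals parent or lies somewhere below it."""
    child = posixpath.normpath(child)
    parent = posixpath.normpath(parent)
    return posixpath.commonpath([child, parent]) == parent


existing_root = "/home/gondola/data"    # path reported in the error message
new_datasource = "/home/gondola/files"  # datasource being added

new_base, new_bucket = split_datasource(new_datasource)

# Comparing against the base directory only: reports a conflict.
base_level_conflict = is_inside(existing_root, new_base)

# Comparing against the full path of the new datasource: no conflict.
full_path_conflict = is_inside(existing_root, new_datasource)

print(new_base, new_bucket, base_level_conflict, full_path_conflict)
```

The first comparison mirrors what the error message suggests is happening, the second is the question that actually matters for this use case.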

I tried making a fork where we pass in this top level folder as well like so:

However, the problem then is that if we try to add yet another folder, say /home/gondola/test, it will flip the problem and think that /home/gondola/test is nested inside of /home/gondola/files.

I will have to play with this some more to see if there’s a better way to detect nested directories.

Ok I believe I’ve found a workable solution. I’ve created a pull request here with the code changes required and provided a detailed explanation regarding the issue and how these changes solve it:

Hello @gondola,

Thanks for the report, and sorry for the late answer. We have been working a lot on both the 2.0.7 version and 2.1.0-RC0, which will bring a lot of improvements that should make life much easier for installers, among others…

That said, we are having a look at your issue and the corresponding pull request to see if we can ship this with the RC0, and I first want to be sure I really understand your problem.

So your default datasources folder is:

            |- pydiods1
            |- cellsdata
            |- ....

And you want to expose some of the other folders of your user home directory, let’s say for instance:

        |- Documents
        |- Music
        |- SharedFiles
        | ....

If this is your use case, it is expected that it does not work.

The best practice we suggest for such a setup is rather to point the default datasource path toward /var/lib/cells/data, and then expose the folders of the user home directory you want as custom datasources.

Just a quick “why”:

The datasources rely on the minio lib to expose folders as S3-compatible buckets.
When the default datasources are created at /home/user/data, we in fact create one minio server that lives in /home/user/data/.minio.sys and can expose each sub-folder of /home/user/data as an S3-compatible bucket.

If you then create a datasource under e.g. /home/user/SharedFiles, the corresponding minio folder will be /home/user/.minio.sys, and the former /home/user/data is a potential bucket for this second minio.

That is exactly what we must avoid…
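The layout described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not code from Cells or from the minio lib: `minio_layout` and `exposed_as_bucket` are hypothetical helpers, and the rule that every direct sub-folder of a server root is a potential bucket is taken from the explanation in this post.

```python
import posixpath


def minio_layout(datasource_path):
    """Describe where the minio server for a datasource would live.

    Assumption: the server root is the parent folder of the datasource,
    and its metadata folder is <root>/.minio.sys.
    """
    norm = posixpath.normpath(datasource_path)
    root = posixpath.dirname(norm)
    return {
        "root": root,
        "bucket": posixpath.basename(norm),
        "meta": posixpath.join(root, ".minio.sys"),
    }


def exposed_as_bucket(folder, layout):
    """True if folder is a direct child of the server root (a potential bucket)."""
    return posixpath.dirname(posixpath.normpath(folder)) == layout["root"]


default_ds = minio_layout("/home/user/data/pydiods1")
custom_ds = minio_layout("/home/user/SharedFiles")

# The first server's root becomes a potential bucket of the second server.
problem = exposed_as_bucket(default_ds["root"], custom_ds)

# With the suggested layout, the default root is out of reach of the second server.
suggested_ds = minio_layout("/var/lib/cells/data/pydiods1")
still_a_problem = exposed_as_bucket(suggested_ds["root"], custom_ds)

print(default_ds["meta"], custom_ds["meta"], problem, still_a_problem)
```

Moving the default datasources to /var/lib/cells/data keeps the first server root outside of any folder that a home-directory datasource would treat as its own root.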

So please let us know if we have understood your use case correctly. In that case, I would sadly have to reject your pull request :frowning: . Otherwise, correct me where I’m wrong.

Anyway, thanks again for the feedback and have a great week.

Hi @bsinou, thanks for the response! Glad to hear about the new upcoming versions, I’m excited to try them out.

I understand the issue with minio that you explained, and it makes sense that you wouldn’t want a bucket that contains another minio server, so I understand why the check exists. This is why I chose to enable the requested functionality by passing the bucket in as a parameter and checking whether that bucket is already a datasource (and therefore a minio server). I will admit this solution could probably be circumvented by simply nesting the paths deeply enough, although I believe the current check can be circumvented as well, if I remember correctly. It has been a while, so I will have to confirm the latter thought.
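As a rough illustration of that idea, here is a Python sketch of a bucket-aware check. It is not the code from the pull request: the function names are hypothetical, and the rule used (reject only when an existing server root sits at or below the new bucket folder) is an assumption about what such a check could look like.

```python
import posixpath


def is_inside(child, parent):
    """Return True if child equals parent or lies somewhere below it."""
    child = posixpath.normpath(child)
    parent = posixpath.normpath(parent)
    return posixpath.commonpath([child, parent]) == parent


def bucket_conflicts(base_dir, bucket, existing_roots):
    """List the existing server roots located at or below base_dir/bucket."""
    target = posixpath.join(base_dir, bucket)
    return [root for root in existing_roots if is_inside(root, target)]


existing_roots = ["/home/gondola/data"]

# A sibling folder is accepted because it contains no existing server root.
sibling = bucket_conflicts("/home/gondola", "files", existing_roots)

# A bucket that is itself an existing server root is rejected.
same = bucket_conflicts("/home/gondola", "data", existing_roots)

# A bucket that contains an existing server root is rejected too.
parent = bucket_conflicts("/home", "gondola", existing_roots)

print(sibling, same, parent)
```

Note that this sketch only looks at the new bucket folder itself. It does not address the concern raised earlier in the thread that an existing root becomes a potential bucket of the new server.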

Anyway, thanks for taking a look. Good luck on the new releases!