V2.2.2 crash, screenshots included

V2.2.2 this time. The first screenshot shows where the problem appears to have started in the logs. There are a hundred or more of these errors, all timestamped 9:43:03. The second screenshot shows another set of errors; these were happening for every user of the system starting sometime after that first set of errors. I restarted the service and it appears to be working again.

Hello @scott.bentley

This might look like a “nested datasource” issue, but we normally prevent this.
Would it be possible to share your pydio.json config file ?

Thanks

Hi @bsinou

I’m happy to send you the config file. Can you please provide an email I can send it to? I don’t want to post it on a public forum.

I don’t think we have any nested datasources; it’s a very vanilla install for the most part. In my humble and non-expert opinion, I think the errors in the screenshots above indicate there was some kind of file upload error that caused the tree service to bug out. We likely have somewhat unique needs compared to your typical clients. Our file names tend to be super long and stupidly complicated, and the file sizes are frequently > 100 MB, sometimes > 1 GB. I’ve also seen staff upload literally 1000 pictures (.jpg, .png) instead of zipping them first. Our needs seem to be a pretty good stress test for each new version of Cells :wink:

Regards,

Scott

Hi,
For the indexation issues, these are issues that were happening before but that are better reported since 2.2. So it’s nothing new really, and nothing blocking, although we are trying to track them better to solve them better.
As for the metaStreamer errors, that may indicate a lack of resources. How performant is your server regarding your overall load?
-c

Hi @bsinou and @charles

I was never told where to send the pydio.json config file. Please let me know where I can send it, as the problem persists.

The crash happened again this morning, and similar logs to the second screenshot above. I looked at the paths on disk and it looks like the paths there are ok. For example, where the log complains about “Cannot find…[melissa.parry@hhangus.com/125-09/.pydio/.pydio/recycle_bin/125-09” the files on disk look like “…/cellsdata/melissa.parry@hhangus.com/125-09/recycle_bin/” and there is a “.pydio” file in each folder.

In an attempt to fix the issue, I manually deleted the contents of …/cellsdata/melissa.parry@hhangus.com/125-09, deleting all files and folders from disk, including the .pydio files. I then synchronized the datasource. After that I created a test folder and Untitled Document.docx within Cells followed by another synchronize. Here are the log files from before deleting the files (pic1), after creating the test folder/file (pic2) and the errors when running another sync on the datasource (pic3).

Please note that we have a Workspace that points to /cellsdata in order to administratively create/delete files from users’ cells; only the Administrator account has read/write permissions to this Workspace. I’m not sure, but these errors may be related to using this workspace under the Administrator user. I used this workspace to create the test files here. I’m doing some testing with staff to see if this has anything to do with the issue.
Here’s a screenshot of the workspace settings:

Ok, this post is getting long, but there’s more to the story. I had Melissa show me her account and she has a Cell titled “2171306” showing in her account, yet that does not exist on disk.

I can only assume something in the index in the database is totally whack :confused:

hi scott
Can you use a DM to send your json config?
1 - About the “crash”: you refer to a crash, but did Cells really crash, or did it just hang? Don’t you use something like systemd to always restart the process if it stops? Or do you have logs (from systemd maybe) showing the stack dumped during the crash?
2 - About the index: again, these are probably indexation issues that were created during previous versions and that are now shown in the logs. Seeing stuff under a .pydio is indeed very strange, as it’s always a file…
Do your users use the sync client?
The top-level workspace should not create any issue.

Seems like some things are messed up in the index indeed. Will be pretty hard to reproduce. Your point of deleting on disk then re-sync is generally a good approach indeed, but from what I understand it keeps failing after that just after you re-add new data?

Thanks @charles,

How do I send a file to a DM for you? I need instructions because I don’t know how to DM in this forum.

1 - Sometimes the “crash” just means users cannot login or that file operations fail. In this case today, the entire service was inaccessible and the browser just reported “This site cannot be reached”, so it was a pretty hard crash. We do have “Restart=on-failure” in the systemd service definition, it doesn’t seem to work though. See screenshot below for the service config.

2 - We figured out what “2171306” was, it’s a shared location that was created way back in v1.0 of Cells. It’s shared from under one user to several others, and appears to be working as expected. The other issues, especially the “.pydio/.pydio/.pydio” stuff is a complete mystery though.

“Do your users use the sync client?”
No, we only use the web browser.

“The top-level workspace should not create any issue.”
I removed it for now, just in case. I also deleted offending files and directories from the file system with “rm -rf ./” and removed users who had logs exhibiting the above issues.

“from what I understand it keeps failing after that just after you re-add new data?”

It seems to be variable. I used the Administrator account and the Cells Data workspace to create a test folder and test file under “melissa.parry@hhangus.com/125-09/” and you can see the screenshots above for the results. However, after clearing it all out again, Melissa herself created new files and folders there without causing any issues. I’m still somewhat convinced this has to do with me as Administrator removing/creating files using the Cells Data workspace. No idea why it would cause issues, maybe permissioning, maybe file indexes get screwy when the real user tries to use the cell? Note that I can delete the files just fine as Admin, the issues don’t seem to appear until later when users try to use the cell.

Service config screenshot:

EDIT: I checked with the other admin staff and no one had deleted any user files in the past month. So, that seems to rule out administrative deletions as the most likely cause of this :frowning:

Ahh, maybe you are right after all: creating a workspace at a higher level may create issues with the specific management of the recycle_bin. When “deleting”, we look for a recycle_root flag higher up in the tree to decide where the target recycle bin is. For a standard workspace, this would be in the workspace root. For “personal files” or “cells”, that should walk up the tree to the original owner folder.
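To make the lookup described above a bit more concrete, here is a minimal Python sketch of "walk up the tree until a node flagged as recycle root is found". It is only an illustration under assumptions: the function names, the use of a plain set of flagged paths, and the example paths are all hypothetical, and this is not the actual Cells implementation (which is written in Go).

```python
from pathlib import PurePosixPath
from typing import Optional, Set


def find_recycle_root(node_path: str, recycle_roots: Set[str]) -> Optional[str]:
    """Return the nearest ancestor of node_path that is flagged as a recycle root.

    recycle_roots is a hypothetical stand-in for the recycle_root flag:
    a set of folder paths that carry the flag.
    """
    # parents yields the closest ancestor first, then walks up to the top.
    for ancestor in PurePosixPath(node_path).parents:
        candidate = str(ancestor)
        if candidate in recycle_roots:
            return candidate
    return None


def recycle_target(node_path: str, recycle_roots: Set[str]) -> Optional[str]:
    """Return the recycle_bin folder a deleted node would be moved to, if any."""
    root = find_recycle_root(node_path, recycle_roots)
    if root is None:
        return None
    return str(PurePosixPath(root) / "recycle_bin")
```

With a hypothetical flagged folder `alice/project`, `recycle_target("alice/project/docs/report.pdf", {"alice/project"})` resolves to `alice/project/recycle_bin`. If a workspace mounted higher in the tree changed which flagged ancestor is found first, the target would change too, which is the kind of interaction being suspected here.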
I’ll have to think about this.
@bsinou any thoughts about the systemd config?

@charles One other thing of note, the whole reason we’re using an admin workspace to delete these files is because Cells does not have the “delete after X days” feature that Pydio v8 had. If that was built-in again, we would not need that administrative workspace at all.

Hi Scott, I know it’s not something you’ll necessarily want to hear but… this is part of the Enterprise version through the advanced scheduling tool (Cells Flows).

haha, thanks @charles
It’s not that I don’t want to hear it, I appreciate that the feature exists. I just haven’t convinced the senior management to put money down on Cells, for various reasons. I will have another chat with my super about it.
Strictly speaking, it’s not hard to write a shell script to do the deletions and then synchronize the datasource in Cells, so I may try going at it that way in the future. It’s unfortunate that the admin-based workspace seems to cause problems and I hope you can get that resolved, as it seems to be a regression or flaw in Cells’ implementation.
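For what it’s worth, the deletion half of that idea could look roughly like the sketch below, written in Python rather than shell. It is only a sketch under assumptions: the function names and the dry-run default are mine, skipping the per-folder “.pydio” file is a guess at what is safest, and the datasource re-synchronization in Cells is not covered and would still have to be triggered separately afterwards.

```python
import os
import time
from pathlib import Path
from typing import List, Optional


def find_expired_files(root: str, max_age_days: float,
                       now: Optional[float] = None) -> List[Path]:
    """List regular files under root whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    expired = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name == ".pydio":
                # Assumption: leave the per-folder marker file alone.
                continue
            path = Path(dirpath) / name
            try:
                if path.stat().st_mtime < cutoff:
                    expired.append(path)
            except FileNotFoundError:
                # File vanished between listing and stat; ignore it.
                continue
    return sorted(expired)


def delete_expired_files(root: str, max_age_days: float,
                         dry_run: bool = True) -> List[Path]:
    """Delete expired files under root; with dry_run=True only report them."""
    expired = find_expired_files(root, max_age_days)
    if not dry_run:
        for path in expired:
            path.unlink()
    return expired
```

Running it first with `dry_run=True` gives a list to review before anything is removed from disk.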

“It’s unfortunate that the admin-based workspace seems to cause problems and I hope you can get that resolved, as it seems to be a regression or flaw in Cells’ implementation.”

You are right that it must be resolved, but I don’t really think it’s a regression (or do you mean compared to Pydio?)

@charles and @bsinou
Can anyone point me to the location in the source code where I might find the source of this error?

Ts : 1615480783
Level : info
Logger : pydio.grpc.data.sync.cellsdata
Msg : {"level":"error","ts":"2021-03-11T11:39:43-05:00","msg":"Cannot find parent path for node, this is not normal - skipping node!","NodePath":"melissa.parry@hhangus.com/2191351/.pydio/WPP Sample Elec and ICAT Drawings.pdf/recycle_bin"}
SpanUuid : span-270

The error (and others like it) are generated even though the listed file(s) have been deleted from the disk. The errors are generated during the datasource re-synchronize process. I would like to understand why these are generated despite the files having been deleted long ago. Thanks!
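In case it helps with triage, here is a small Python sketch that picks out log messages like the one above, where “.pydio” shows up in the middle of a NodePath instead of as the last element (it was noted earlier in the thread that .pydio is always a file). The function name is hypothetical, and it assumes the Msg field is plain JSON with straight quotes and a "NodePath" key, as in the entry quoted here.

```python
import json
from pathlib import PurePosixPath
from typing import Optional

MARKER = ".pydio"


def suspicious_node_path(msg: str) -> Optional[str]:
    """Return the NodePath if .pydio appears as a non-final path component."""
    try:
        record = json.loads(msg)
    except json.JSONDecodeError:
        return None  # not a JSON message, nothing to check
    if not isinstance(record, dict):
        return None
    node_path = record.get("NodePath")
    if not isinstance(node_path, str):
        return None
    parts = PurePosixPath(node_path).parts
    # .pydio is expected only as the last component (a file inside a folder).
    if MARKER in parts[:-1]:
        return node_path
    return None
```

Feeding it the Msg value from the entry above returns that NodePath, while a path that merely ends in “.pydio” returns None. This only flags the affected paths; it does not explain why the index still holds them.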
