Questions about data recovery and how the data is stored

For first post, please answer the questions below!

Describe your issue in detail

Not an issue, but a question:

I am worried that if my database is lost, I will not be able to recover my data. Right now I have Pydio Cells running and connected to my NAS, where it stores all of the data. The NAS does not store the database, only the data. If the device that Cells is running on quits for whatever reason, what should I do to restore the files?

I know that the files will be on the nas still, but the names and paths will be different:

image

  • Is there a way to convert the hashed names back into the original names and paths if I do not have the database anymore?

  • Is there a good way to keep running backups of the database?

What version of Cells are you using?

I am running on docker:

pydio/cells:latest
mariadb:latest
mongo:4.4.6

What is the server OS? Database name/version? Browser name or mobile device description (if issue appears client-side)?

Debian 5.10.179-1

What steps have you taken to resolve this issue already?

I haven't done anything yet since there is no issue so far.

First question: I think it is not possible, but I am not sure.
Second question: This is my backup script.

#!/bin/bash
# Dump the Cells MariaDB database and MongoDB collections into a dated folder.
set -euo pipefail

date=$(date '+%Y_%m_%d')
time=$(date '+%H_%M_%S')

path=/storage/backup/pydio
directory="${path}/${date}"
file="${directory}/${time}.sql"
mkdir -p "${directory}"

# MariaDB: one SQL file per run.
mysqldump -u DATABASE_USERNAME -p'DATABASE_PASSWORD' DATABASE_NAME > "${file}"

# MongoDB: dump into a temporary folder so only this run's files get renamed.
tmp=$(mktemp -d)
mongodump --host=MONGODB_HOST --port=MONGODB_PORT --username=MONGODB_USERNAME --password=MONGODB_PASSWORD --db=MONGODB_NAME --out="${tmp}"

# Prefix each collection file (.bson and .json) with the time of this run.
for collection in "${tmp}"/*/*.{b,j}son
do
	[ -e "${collection}" ] || continue
	mv -f "${collection}" "${directory}/${time}.$(basename "${collection}")"
done
rm -rf "${tmp}"

The MariaDB .sql files and the MongoDB .bson/.json files are saved under /storage/backup/pydio, in one folder per day.
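Since a script like this keeps adding one folder per day, it may be worth pruning old ones. Here is a minimal sketch, not something from the Cells docs: the function name prune_old_backups and the 30-day retention are made up for illustration, and it assumes GNU date (as on Debian) and the YYYY_MM_DD folder names used above.

```shell
#!/bin/bash
# Delete dated backup folders (YYYY_MM_DD) older than a given number of days.
# prune_old_backups is a hypothetical helper, not part of Cells.
prune_old_backups() {
	local root=$1
	local keep_days=$2
	local cutoff dir name
	# GNU date syntax; the folder names sort the same way as the dates do.
	cutoff=$(date -d "${keep_days} days ago" '+%Y_%m_%d')
	for dir in "${root}"/*/
	do
		[ -d "${dir}" ] || continue
		name=$(basename "${dir}")
		# Only touch folders whose name looks like a date.
		[[ "${name}" =~ ^[0-9]{4}_[0-9]{2}_[0-9]{2}$ ]] || continue
		if [[ "${name}" < "${cutoff}" ]]
		then
			rm -rf -- "${dir}"
		fi
	done
}

# Example: prune_old_backups /storage/backup/pydio 30
```

To run the backup every night, a crontab line such as "0 3 * * * /root/backup_pydio.sh" would do, where the script path is only an example.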

Okay I see, if anyone knows about the first question please let me know. That is what I am most worried about.

Read more about the flat format in the “Datasource Format” page of the Pydio docs. The “snapshot” exists precisely to make sure you have all the info if the DB dies. It can be re-imported into a new installation.

Plus, we now provide a dedicated cells-fuse tool to read these snapshots. We still need to write some docs for it, but you can try it out here: https://download.pydio.com/pub/cells/release/4.2.3/linux-amd64/cells-fuse. It uses FUSE to mount your snapshot.db + files folder as a structured file/folder tree.
You would have to run it directly on the NAS.

-c

I think I understand a bit better now. Those docs are really well written and clear up a lot of things. Thank you for sending the link.

Can you let me know whether I should back up the database by running a script like this from a cron job:

#!/bin/bash
# Dump the Cells MariaDB database and MongoDB collections into a dated folder.
set -euo pipefail

date=$(date '+%Y_%m_%d')
time=$(date '+%H_%M_%S')

path=/storage/backup/pydio
directory="${path}/${date}"
file="${directory}/${time}.sql"
mkdir -p "${directory}"

# MariaDB: one SQL file per run.
mysqldump -u DATABASE_USERNAME -p'DATABASE_PASSWORD' DATABASE_NAME > "${file}"

# MongoDB: dump into a temporary folder so only this run's files get renamed.
tmp=$(mktemp -d)
mongodump --host=MONGODB_HOST --port=MONGODB_PORT --username=MONGODB_USERNAME --password=MONGODB_PASSWORD --db=MONGODB_NAME --out="${tmp}"

# Prefix each collection file (.bson and .json) with the time of this run.
for collection in "${tmp}"/*/*.{b,j}son
do
	[ -e "${collection}" ] || continue
	mv -f "${collection}" "${directory}/${time}.$(basename "${collection}")"
done
rm -rf "${tmp}"

Or should I instead take snapshots by running the command from the docs in a cron job:

cells admin datasource snapshot --datasource=ds1 --operation=dump --basename=snapshot.db 

  • Are there any advantages of backing up the database vs making snapshots?
  • Should I be doing both?
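For reference, the snapshot command quoted above could be wrapped for cron roughly as follows. This is only a sketch: CELLS_CMD and snapshot_datasource are hypothetical names, and with Docker the command would presumably have to run inside the Cells container (for example "docker exec cells cells", where the container name is a guess).

```shell
#!/bin/bash
# CELLS_CMD is a hypothetical variable that says how to invoke the cells binary.
# With Docker it might be "docker exec cells cells" (container name assumed).
CELLS_CMD=${CELLS_CMD:-cells}

# Run the snapshot dump from the docs for one datasource.
snapshot_datasource() {
	local datasource=$1
	local basename=${2:-snapshot.db}
	# CELLS_CMD is left unquoted on purpose so a multi-word command splits into words.
	$CELLS_CMD admin datasource snapshot \
		--datasource="${datasource}" \
		--operation=dump \
		--basename="${basename}"
}

# Snapshot every datasource named on the command line, e.g. "./snapshot.sh ds1".
for ds in "$@"
do
	snapshot_datasource "${ds}"
done
```

The flags are exactly the ones from the command above; only the wrapper around them is new.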

Are there any advantages of backing up the database vs making snapshots?

With the snapshot, you can restore the files and their tree structure, but you lose all the metadata that might have been added via Cells.

On the other hand, with just the snapshot and the “flattened” files you can fully recover your files without re-installing Cells (e.g. with cells-fuse).

Should I be doing both?

You can, but if you do not plan to leave Cells, a correct backup should be enough.
You can always launch a snapshot when necessary.
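What counts as a “correct backup” is not spelled out here. One cheap sanity check, as a sketch only: mysqldump normally ends a successful dump with a “-- Dump completed” comment line, so a dump file without it was probably cut short. The function name verify_sql_dump is made up for illustration, and the check assumes comments were not disabled (e.g. with --skip-comments).

```shell
#!/bin/bash
# Return 0 if the SQL dump looks complete, 1 otherwise.
# verify_sql_dump is a hypothetical helper, not part of Cells or MariaDB.
verify_sql_dump() {
	local file=$1
	# An empty or missing file means the dump never produced anything.
	if [ ! -s "${file}" ]
	then
		echo "empty or missing: ${file}" >&2
		return 1
	fi
	# mysqldump writes "-- Dump completed ..." as its last line on success.
	if tail -n 1 "${file}" | grep -q '^-- Dump completed'
	then
		return 0
	fi
	echo "dump looks truncated: ${file}" >&2
	return 1
}

# Example: verify_sql_dump /storage/backup/pydio/2023_07_01/03_00_00.sql
```

This only shows that the dump finished writing. The real test of a backup is still restoring it into a scratch database from time to time.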

Sounds good, thanks everyone :)
