Cells purging question

Hey, we’ve been trialling Cells for a while now and have noticed that files are not being purged. The retention policy is set to “30 days max”. The documentation doesn’t say I need to set up a cron job, as I had to for Pydio v8, so I assumed it would purge automatically. Am I missing something?

Hello @scott.bentley,

If you are referring to versioning, then you should have only the last 30 versions of a file (one for each of the previous 30 days).

To make sure I understood your request: you want files that are 30 days old to be permanently deleted from your Cells?

Thanks @zayn,

Yes, we want the file to be permanently deleted. We do not use the product as a file management tool and specifically do not want users storing tons of documents forever. Our use-case is to provide file transfer and limited collaboration between staff and clients. Purging files regularly is necessary to (a) relieve staff from having to remember to do it themselves and (b) ensure we aren’t using any more disk space than absolutely necessary for our use-case purposes.

Hey @zayn, just wanted to check in on this and see if you have anything else to add. Is purging a feature that is likely to be implemented in the future? Should I attempt to write a plugin of some sort? Where would I start if I wanted to write a plugin that would perform file purging?

Thanks,

Scott

A plugin development guide with some boiler-plate code would be, indeed, very nice to have.

One way would be to use the SDK to create a custom task that deletes the files (run with cron or systemd).

The Go SDK lets you perform CRUD operations on resources, so you can list and delete data (you could list the data, check whether each item is older than X days, then delete it).

If you are stuck, tell me and I’ll write a snippet for you.

You could also script something with the cells-client

Thanks Zayn. I was looking into this and then COVID hit. I’m back looking at it again now though!

So, I installed the cec client and wanted to list all existing cells, but I don’t see how to do this. If I run “cec ls” I only get the workspaces and cells of the authenticated user (admin, in this case). I need a listing of ALL cells and/or files so that I can then delete those older than 30 days. Would you be able to provide me with a snippet that might accomplish this?

Thank you so much, and I hope you and your loved ones have been well throughout this pandemic!

Scott

Hey @zayn, I don’t want to be pushy and I understand if there are more pressing issues you need to respond to. However, if there is an example of using cec to find and remove files from accounts other than the logged-in account, could you please point it out to me?

Otherwise, can you please advise how I can find files that are older than a number of days and administratively clear them for ALL accounts?

Also, a related question, how do I force an expiry on all public shares so the user cannot create shares without at least a minimum expiry of, for example, 60 days?

Hello @scott.bentley,

Here is a Go snippet to help you get started. It lists nodes with ListAdminTree from the AdminTreeService (which lets you see all the nodes), and I have added comments showing where to add the functions you need.

package main

import (
	"log"
	"path/filepath"
	"strconv"
	"time"

	cells_sdk "github.com/pydio/cells-sdk-go"
	"github.com/pydio/cells-sdk-go/client/admin_tree_service"
	"github.com/pydio/cells-sdk-go/example/cmd"
	"github.com/pydio/cells-sdk-go/models"
)

var (
	config = &cells_sdk.SdkConfig{
		Url:        "https://my-cells.com",
		ClientKey:  "cells-front",
		User:       "admin",
		Password:   "",
		SkipVerify: false,
	}
)

func main() {
	ctx, cli, err := cmd.GetApiClient(config)
	if err != nil {
		log.Fatalf("Could not GetApiClient, cause: %v\n", err)
	}

	params := &admin_tree_service.ListAdminTreeParams{Body: &models.TreeListNodesRequest{
		// Lists all the nodes under the personal datasource (meaning, personal-files/admin, personal-files/johndoe, etc...)
		Node: &models.TreeNode{Path: "personal"},
		Recursive: true,
	}, Context: ctx}

	// ListAdminTree lists all the nodes
	result, err := cli.AdminTreeService.ListAdminTree(params)
	if err != nil {
		log.Fatal(err)
	}

	for _, n := range result.Payload.Children {

		// ignores .pydio files
		if filepath.Base(n.Path) == ".pydio" {
			continue
		}

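		// Parse the node's MTime (Unix seconds, stored as a string);
		// skip nodes whose MTime cannot be parsed
		i, err := strconv.ParseInt(n.MTime, 10, 64)
		if err != nil {
			continue
		}
		tu := time.Unix(i, 0)
		d := time.Since(tu)

		// Check whether the node was last modified more than 30 days ago
		if d > 30*24*time.Hour {
			// add function to delete the nodes
			// see TreeService (nil is a placeholder; pass real delete parameters for n.Path)
			cli.TreeService.DeleteNodes(nil)
		}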

	}
}

@zayn thanks that looks great! Of course, now I need to figure out how to use GO as I’ve never touched it before.

Was it your intention that this snippet should be used to create a separate app/script or is this something that could be turned into a plugin? I don’t see any resources in the documentation about developing plugins and I was kind of hoping to make this something that could be used from within Cells itself so other admins wouldn’t need to use command line scripts for this purpose.

Also, I see that you’ve used AdminTreeService to list all nodes, and TreeService to delete nodes. Can I do this with the REST API? Can an admin delete any node using TreeService as long as they have the path reference?

Thanks so much, and sorry to be asking so many questions lol

Hey @scott.bentley,

Yes, all of those operations are available through the REST API as well. Sorry that I wrote the snippet in Go; it is the main language I use for my scripts.

You could write the same script in bash, or in other languages, for instance Java (Cells Java SDK).

Adding this as a plugin would be possible. There is no direct documentation, but you could analyze the code and see how it is done for the other plugins.