Failed to delete files; delete job stuck in job scheduler

For a first post, please answer the questions below!

Describe your issue in detail

What version of Cells are you using?

Pydio Cells Home Edition 4.4.6

What is the server OS? Database name/version? Browser name or mobile device description (if issue appears client-side)?

server OS: Ubuntu 22.04 LTS
Databases / storage:

  • MariaDB: 10.6.11-debian-11-r22
  • MongoDB: 6.0.4-debian-11-r0
  • MinIO: 2023.1.25-debian-11-r0

What steps have you taken to resolve this issue already?

I was trying to delete files through the web client, but sometimes the deletion would not complete in time. I switched to the admin user to check the job scheduler, and it seems the delete job was stuck, with no Last Execution or Status information showing (a normal job shows both of these, especially Status, e.g. Running or Finished).

When this happens, I need to delete the file again, which starts a new job to delete it, but the original job cannot be cancelled or deleted; it can still be seen in the Scheduler.

The problem occurs when I hover my mouse over the following icon (the bell icon):

It will end up like this:


The error message is “Oops, something went wrong; please reload the window”.

So I have to reload the web page, but the same problem still occurs whenever I click the bell icon…

A few questions that would help us get an idea:

  • are you using a flat or structured DS?
  • how big is your VM in comparison to your data?
  • do you see any errors in the logs (in pydio.log or in tasks.log) that could give some more hints?
  • how is the server deployed?

Sorry for the late reply. I tested it again this week, and right now the only workaround is deleting the pydio-cells pods and letting them be recreated…
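
For reference, that workaround boils down to a couple of kubectl commands; a minimal sketch, assuming the release runs in a pydio namespace and the pods carry the chart's default app.kubernetes.io/name=cells label (namespace, label and workload name are all assumptions, adjust them to your own release):

# Delete the Cells pods; their controller recreates them
kubectl -n pydio delete pod -l app.kubernetes.io/name=cells

# Watch the replacement pods come back up
kubectl -n pydio get pods -l app.kubernetes.io/name=cells -w

# Alternatively, trigger a rolling restart of the workload
# (workload kind/name are assumptions; check with: kubectl -n pydio get deploy,sts)
# kubectl -n pydio rollout restart statefulset/cells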

  1. I’m using flat DS to store the data, in S3-compatible buckets created on MinIO; an example of the DataSource settings is as follows:

  2. I deployed pydio-cells with the Helm chart (cells chart version: 0.1.2) on my Kubernetes cluster, so my worker nodes are large enough in comparison to my data (mostly office files);

  3. Unfortunately, I didn’t see any error logs related to this bug in the pydio-cells pods (see the kubectl sketch after the values file below), only the error message I mentioned above…

  4. As mentioned above, I have a Kubernetes cluster and deployed pydio-cells on it. The values.yaml I use is as follows:

# Default values for cells.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

replicaCount: 3

image:
  repository: pydio/cells
  pullPolicy: IfNotPresent
  # Overrides the image tag whose default is the chart appVersion.
  tag: 4.4.6

imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""
clusterDomain: cluster.local

serviceAccount:
  create: true
  annotations: {}
  name: "app"

podAnnotations: {
  "vault.hashicorp.com/agent-inject": "true",
  "vault.hashicorp.com/role": "app",
  "vault.hashicorp.com/agent-init-first": "true",
  "vault.hashicorp.com/agent-inject-token": "true"
}

podSecurityContext: {}

securityContext: {}

service:
  type: ClusterIP
  port: 8080
  discoveryPort: 8002
  binds:
    # Set values here if you want to bind the port elsewhere
  reverseproxyurl:
  tlsconfig:

  customconfigs: {
    # Initial license
    "defaults/license/data": "FAKE",

    # Creates a kind-of sticky session for grpc requests, priority is given to local grpc servers for any outgoing request going to grpc
    #"cluster/clients/grpc/loadBalancingStrategies[0]/name": "priority-local",

    #
    "frontend/plugin/core.pydio/APPLICATION_TITLE": "My Pydio Cells Cluster"
  }
  
  # Set up an external IP from LoadBalancer


## Enable tls in front of Cells containers.
##
tls:
  ## @param tls.enabled Enable tls in front of the container
  ##
  enabled: true
  ## @param tls.autoGenerated Generate automatically self-signed TLS certificates
  ##
  autoGenerated: true
  ## @param tls.existingSecret Name of an existing secret holding the certificate information
  ##
  existingSecret: ""

  ## @param tls.mountPath The mount path where the secret will be located
  ## Custom mount path where the certificates will be located, if empty will default to /certs
  mountPath: ""

ingress:
  ## @param ingress.enabled Enable ingress controller resource for Cells
  ##
  enabled: true
  ## @param ingress.apiVersion Force Ingress API version (automatically detected if not set)
  ##
  apiVersion: ""
  ## @param ingress.ingressClassName IngressClass that will be used to implement the Ingress (Kubernetes 1.18+)
  ## This is supported in Kubernetes 1.18+ and required if you have more than one IngressClass marked as the default for your cluster.
  ## ref: https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/
  ##
  ingressClassName: nginx
  ## @param ingress.hostname Default host for the ingress resource
  ##
  hostname: pydio-cloud.com
  ## @param ingress.path The Path to Pydio Cells®. You may need to set this to '/*' in order to use this with ALB ingress controllers.
  ##
  path: /
  ## @param ingress.pathType Ingress path type
  ##
  pathType: Prefix
  ## @param ingress.servicePort Service port to be used
  ## Default is http. Alternative is https.
  ##
  servicePort: https
  ## @param ingress.annotations Additional annotations for the Ingress resource. To enable certificate autogeneration, place here your cert-manager annotations.
  ## For a full list of possible ingress annotations, please see
  ## ref: https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/annotations.md
  ## Use this parameter to set the required annotations for cert-manager, see
  ## ref: https://cert-manager.io/docs/usage/ingress/#supported-annotations
  ##
  ## e.g:
  ## annotations:
  ##   kubernetes.io/ingress.class: nginx
  ##   cert-manager.io/cluster-issuer: cluster-issuer-name
  ##
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "ca-cluster-issuer"
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-http-version: "1.1"
    nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-ssl-server-name: "on"
  ## @param ingress.tls Enable TLS configuration for the hostname defined at `ingress.hostname` parameter
  ## TLS certificates will be retrieved from a TLS secret with name: `{{- printf "%s-tls" .Values.ingress.hostname }}`
  ## You can:
  ##   - Use the `ingress.secrets` parameter to create this TLS secret
  ##   - Rely on cert-manager to create it by setting the corresponding annotations
  ##   - Rely on Helm to create self-signed certificates by setting `ingress.selfSigned=true`
  ##
  tls: true
  ## @param ingress.selfSigned Create a TLS secret for this ingress record using self-signed certificates generated by Helm
  ##
  selfSigned: false
  ## @param ingress.extraHosts The list of additional hostnames to be covered with this ingress record.
  ## Most likely the hostname above will be enough, but in the event more hosts are needed, this is an array
  ## e.g:
  ## extraHosts:
  ##   - name: cells.local
  ##     path: /
  ##
  extraHosts: []
  ## @param ingress.extraPaths Any additional paths that may need to be added to the ingress under the main host
  ## For example: The ALB ingress controller requires a special rule for handling SSL redirection.
  ## extraPaths:
  ## - path: /*
  ##   backend:
  ##     serviceName: ssl-redirect
  ##     servicePort: use-annotation
  ##
  extraPaths: []
  ## @param ingress.extraTls The tls configuration for additional hostnames to be covered with this ingress record.
  ## see: https://kubernetes.io/docs/concepts/services-networking/ingress/#tls
  ## e.g:
  ## extraTls:
  ## - hosts:
  ##     - cells.local
  ##   secretName: cells.local-tls
  ##
  extraTls: []
  ## @param ingress.secrets If you're providing your own certificates, please use this to add the certificates as secrets
  ## key and certificate are expected in PEM format
  ## name should line up with a secretName set further up
  ##
  ## If it is not set and you're using cert-manager, this is unneeded, as it will create a secret for you with valid certificates
  ## If it is not set and you're NOT using cert-manager either, self-signed certificates will be created valid for 365 days
  ## It is also possible to create and manage the certificates outside of this helm chart
  ## Please see README.md for more information
  ##
  ## Example
  ## secrets:
  ##   - name: cells.local-tls
  ##     key: ""
  ##     certificate: ""
  ##
  secrets: []
  ## @param ingress.extraRules Additional rules to be covered with this ingress record
  ## ref: https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-rules
  ## e.g:
  ## extraRules:
  ## - host: example.local
  ##     http:
  ##       path: /
  ##       backend:
  ##         service:
  ##           name: example-svc
  ##           port:
  ##             name: http
  ##
  extraRules: []

ingress-nginx:
  enabled: false
  controller:
    admissionWebhooks:
      enabled: false
    hostPort:
      enabled: true
    ingressClassResource:
      default: true
      enabled: true
    kind: DaemonSet
    service:
      type: ClusterIP

resources: {}
  # We usually recommend not to specify default resources and to leave this as a conscious
  # choice for the user. This also increases chances charts run on environments with little
  # resources, such as Minikube. If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  # limits:
  #   cpu: 100m
  #   memory: 128Mi
  # requests:
  #   cpu: 100m
  #   memory: 128Mi

autoscaling:
  enabled: false
  minReplicas: 1
  maxReplicas: 6
  targetCPUUtilizationPercentage: 80
  # targetMemoryUtilizationPercentage: 80

#nodeSelector: {}
nodeSelector:
  kubernetes.io/arch: amd64
  
tolerations: []

affinity: {}

#------------------------------
# Dependency settings
#------------------------------
mariadb:
  global:
    storageClass: nfs-fpt-storage
  enabled: true
  architecture: standalone
  volumePermissions:
    enabled: true
  auth:
    rootPassword: cloud123
  tls:
    enabled: false
  persistence:
    enabled: true
    accessModes:
      - ReadWriteOnce
    size: 50Gi
  nodeSelector:
    kubernetes.io/arch: amd64
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: server-type
                operator: In
                values:
                - database
  
mariadb-galera:
  enabled: false
  volumePermissions:
    enabled: true

redis:
  enabled: true
  volumePermissions:
    enabled: true
  auth:
    enabled: false
  global:
    storageClass: nfs-fpt-storage
  master:
    persistence:
      enabled: true
      accessMode: ReadWriteOnce
      size: 20Gi
    nodeSelector:
      kubernetes.io/arch: amd64
  replica:
    persistence:
      enabled: true
      accessMode: ReadWriteOnce
      size: 20Gi
    nodeSelector:
      kubernetes.io/arch: amd64

nats:
  enabled: true
  auth:
    enabled: false
  volumePermissions:
    enabled: true
  nodeSelector:
    kubernetes.io/arch: amd64

etcd:
  enabled: true
  commonAnnotations: {
    "helm.sh/hook": "pre-install",
    "helm.sh/hook-weight": "-2"
  }
  auth:
    rbac:
      create: false
    peer:
      secureTransport: false
      useAutoTLS: false
    client:
      secureTransport: false
      enableAuthentication: false
      existingSecret: "etcd-client-certs"
      certFilename: "tls.crt"
      certKeyFilename: "tls.key"
      caFilename: "ca.crt"
  volumePermissions:
    enabled: true
  persistence:
    enabled: true
    storageClass: nfs-fpt-storage
    accessModes:
      - ReadWriteOnce
    size: 50Gi
  nodeSelector:
    kubernetes.io/arch: amd64
    
minio:
  enabled: true
  defaultBuckets: "thumbnails pydiods1 personal versions cellsdata binaries"
  auth:
    rootUser: admin
    rootPassword: cloud123
  volumePermissions:
    enabled: true
  persistence:
    enabled: true
    storageClass: nfs-fpt-storage
    accessModes:
      - ReadWriteMany
    size: 8Ti
  nodeSelector:
    kubernetes.io/arch: amd64

mongodb:
  enabled: true
  auth: 
    enabled: false
  volumePermissions:
    enabled: true
  nodeSelector:
    kubernetes.io/arch: amd64
  persistence:
    enabled: true
    storageClass: nfs-fpt-storage
    accessModes:
      - ReadWriteOnce
    size: 200Gi
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
            - key: server-type
              operator: In
              values:
              - database   
  
vault:
  enabled: true
  injector:
    annotations: {
      "helm.sh/hook": "pre-install",
      "helm.sh/hook-weight": "-5"
    }
    webhook:
      annotations: {
        "helm.sh/hook": "pre-install",
        "helm.sh/hook-weight": "-5"
      }
      failurePolicy: Fail
      namespaceSelector:
        matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["vault","kube-system","kube-public","kube-node-lease"]
  server:
    annotations: {
      "helm.sh/hook": "pre-install",
      "helm.sh/hook-weight": "-5"
    }
    dataStorage:
      mountPath: /tmp/vault/data
      storageClass: nfs-fpt-storage
      accessModes: ReadWriteOnce

      size: 20Gi
    extraVolumes:
    - type: configMap
      name: cells-vault
    postStart:
    - "/bin/sh"
    - "-c"
    - "sleep 5 && cp /vault/userconfig/cells-vault/bootstrap.sh /tmp/bootstrap.sh && chmod +x /tmp/bootstrap.sh && /tmp/bootstrap.sh"
  statefulset:
    annotations: {
      "helm.sh/hook": "pre-install",
      "helm.sh/hook-weight": "-5"
    }
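
For completeness, the pydio.log and tasks.log mentioned earlier can be checked both from the container output and from files inside the Cells pods; a minimal sketch, assuming a pydio namespace, the chart's default app.kubernetes.io/name=cells label and the default /var/cells working directory (all three are assumptions):

# Container stdout/stderr of the Cells pods
kubectl -n pydio logs -l app.kubernetes.io/name=cells --tail=200

# The scheduler and main service also write log files under the working directory
# (paths assume the image default CELLS_WORKING_DIR=/var/cells)
kubectl -n pydio exec -it <cells-pod-name> -- tail -n 200 /var/cells/logs/tasks.log
kubectl -n pydio exec -it <cells-pod-name> -- tail -n 200 /var/cells/logs/pydio.log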

I think it’s most likely to happen when deleting many files at once, e.g. deleting a folder with many files inside. The job scheduler seems to fail to run or sync this user flow on the first attempt and then gets stuck with no status showing. Because the admin can’t cancel or delete this flow, it stays in the job scheduler queue. When the user tries deleting the files a second time, a new flow is created and added to the queue. If the second attempt succeeds, the first flow can no longer complete, because the files are already deleted, so there is a conflict. So it would be great if the admin could delete a user flow manually.

I also ran into this bug when extracting archives (.zip, .tar.gz, etc.) with lots of files inside, but I haven’t seen it when deleting only one or two files.
