Initial setup not going well on Docker

I’ve been trying for a while to set up a Pydio server for trial. I’m coming over from Nextcloud (it has crashed on me for the last time) and trying to get this going, but even the Docker method isn’t working.

TL;DR:
The documentation is probably great for directly hosted configurations, but it is no help when running behind a Docker nginx reverse proxy image. So far I haven’t been able to get things working with either the binary in a VM or the Docker image.

Setup:
I have a workstation running CentOS 8. Docker runs on the base OS, alongside some VMs. From my Nextcloud days I’ve got nginx-proxy and nginx-proxy-letsencrypt running, which do a great job of automatically setting up web services (both VM and Docker based). I’ve since stopped using Nextcloud; it is deactivated and not occupying any ports. A short background on how I spin up a new web service:

  • Create a Docker container, or a docker-compose.yml, and spin it up, feeding VIRTUAL_HOST, LETSENCRYPT_HOST, and VIRTUAL_PORT as environment variables (the first two being the external address, the last the port the container itself listens on, which is where the proxy connects)
  • Spinning up the container (for testing I’ve created a proper website via plain ol’ httpd) with these set properly, the letsencrypt side properly acquires a certificate and registers it for the proxy side
  • Access the appropriate address specified in VIRTUAL_HOST/LETSENCRYPT_HOST and you’ll see the server (ranging from “It works!”, the default, to a proper website) pretty much immediately if your DNS is already set up.
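
The workflow in the list above can be sketched as a compose file for a throwaway test site. This is only a minimal sketch under stated assumptions: the service name `testsite` and the hostname `test.mydomain.ca` are placeholders, and it assumes the stock `httpd` image, which listens on port 80 inside the container.

```yaml
version: '3.7'
services:
  testsite:                                  # hypothetical service name
    image: httpd:2.4                         # stock Apache image, serves "It works!" on port 80
    restart: unless-stopped
    environment:
      - VIRTUAL_HOST=test.mydomain.ca        # placeholder hostname nginx-proxy routes on
      - LETSENCRYPT_HOST=test.mydomain.ca    # placeholder hostname to request a certificate for
      - VIRTUAL_PORT=80                      # port the container listens on
    network_mode: "bridge"                   # same default bridge network as the proxy containers
```

No `ports:` mapping is needed for a container like this, since nginx-proxy reaches it over the bridge network at the container's own address.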

I’ve set up a pydio subdomain (pydio.mydomain.ca here, not my real address, for security), and my router forwards the appropriate ports (I’ve been able to access test sites internally and externally, as described above). I had no issues with Nextcloud on this same setup (different subdomain, obviously) until an update, which is why I’m looking elsewhere now (I think 5 failures is enough abuse for this relationship). For the uninitiated, the nginx-proxy and nginx-proxy-letsencrypt containers work as a pair. The first does the routing (and web serving, if I set it up to do that) and automatically generates configurations for new containers that show up with a VIRTUAL_HOST environment variable. The second automatically arranges a new SSL certificate for any newly started container that has LETSENCRYPT_HOST set and no currently valid certificate. The two share the certificate directory and handle the external SSL verification.

I’ve set up a docker-compose.yml as shown below, which I think is more or less the recommended default, modified to use my external directories. Important note: I have a multi-terabyte RAID array for mass storage, so I want to reference that external directory for storing the data. The main drive the system OS is on doesn’t have much space (plenty remaining for day to day, not enough for videos, images, binaries, family stuff, etc. to be stored and shared).

version: '3.7'
services:
  cells:
    image: pydio/cells:latest
    restart: unless-stopped
    ports: ["8080:8080"]
    environment:
      - CELLS_LOG_LEVEL=production
      - CELLS_BIND=0.0.0.0:8080
      - CELLS_EXTERNAL=pydio.mydomain.ca
      - CELLS_NO_SSL=1
      - VIRTUAL_HOST=pydio.mydomain.ca
      - LETSENCRYPT_HOST=pydio.mydomain.ca
      - VIRTUAL_PORT=8080
    volumes:
      - /srv/storage/docker/pydio/data:/var/cells/data
      - /srv/storage/docker/pydio:/var/cells
    network_mode: "bridge"
  mysql:
    image: mysql:5.7
    restart: unless-stopped
    environment:
      MYSQL_ROOT_PASSWORD: P@ssw0rd
      MYSQL_DATABASE: cells
      MYSQL_USER: pydio
      MYSQL_PASSWORD: P@ssw0rd
    command: [mysqld, --character-set-server=utf8mb4, --collation-server=utf8mb4_unicode_ci]
    volumes:
      - /srv/storage/docker/mysql:/var/lib/mysql
    network_mode: "bridge"

If I try to access the server, I’ll get one of four errors via my browser and the nginx-proxy logs, depending on how I’ve attempted to configure the CELLS_BIND and CELLS_EXTERNAL settings:

  • 503 error
  • Client sent an HTTP request to an HTTPS server.
  • A 400 error which internally is shown via the logs as:
    nginx.1 | pydio.mydomain.ca 1.2.3.4 - - [20/Jul/2021:01:59:08 -0400] "GET / HTTP/2.0" 400 48 "-" "Mozilla/5.0" "172.17.0.6:8080"
  • Or a 502 error shown via the logs as:
    nginx.1     | pydio.mydomain.ca 1.2.3.4 - - [20/Jul/2021:01:02:54 -0400] "GET / HTTP/2.0" 502 157 "-" "Mozilla/5.0" "172.17.0.6:8080"
    nginx.1     | 2021/07/20 01:02:54 [error] 243#243: *301 connect() failed (111: Connection refused) while connecting to upstream, client: 1.2.3.4, server: pydio.mydomain.ca, request: "GET / HTTP/2.0", upstream: "http://172.17.0.6:8080/", host: "pydio.mydomain.ca"

I HAVE had the setup page load when I use localhost:8080 sometimes, but that doesn’t resolve my external access issue. The generated default.conf for nginx-proxy (created inside of the container) is as follows:

# If we receive X-Forwarded-Proto, pass it through; otherwise, pass along the
# scheme used to connect to this server
map $http_x_forwarded_proto $proxy_x_forwarded_proto {
  default $http_x_forwarded_proto;
  ''      $scheme;
}
# If we receive X-Forwarded-Port, pass it through; otherwise, pass along the
# server port the client connected to
map $http_x_forwarded_port $proxy_x_forwarded_port {
  default $http_x_forwarded_port;
  ''      $server_port;
}
# If we receive Upgrade, set Connection to "upgrade"; otherwise, delete any
# Connection header that may have been passed to this server
map $http_upgrade $proxy_connection {
  default upgrade;
  '' close;
}
# Apply fix for very long server names
server_names_hash_bucket_size 128;
# Default dhparam
ssl_dhparam /etc/nginx/dhparam/dhparam.pem;
# Set appropriate X-Forwarded-Ssl header based on $proxy_x_forwarded_proto
map $proxy_x_forwarded_proto $proxy_x_forwarded_ssl {
  default off;
  https on;
}
gzip_types text/plain text/css application/javascript application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
log_format vhost '$host $remote_addr - $remote_user [$time_local] '
                 '"$request" $status $body_bytes_sent '
                 '"$http_referer" "$http_user_agent" '
                 '"$upstream_addr"';
access_log off;
                ssl_protocols TLSv1.2 TLSv1.3;
                ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384';
                ssl_prefer_server_ciphers off;
resolver 192.168.0.254;
# HTTP 1.1 support
proxy_http_version 1.1;
proxy_buffering off;
proxy_set_header Host $http_host;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $proxy_connection;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $proxy_x_forwarded_proto;
proxy_set_header X-Forwarded-Ssl $proxy_x_forwarded_ssl;
proxy_set_header X-Forwarded-Port $proxy_x_forwarded_port;
# Mitigate httpoxy attack (see README for details)
proxy_set_header Proxy "";
server {
        server_name _; # This is just an invalid value which will never trigger on a real hostname.
        server_tokens off;
        listen 80;
        access_log /var/log/nginx/access.log vhost;
        return 503;
}
server {
        server_name _; # This is just an invalid value which will never trigger on a real hostname.
        server_tokens off;
        listen 443 ssl http2;
        access_log /var/log/nginx/access.log vhost;
        return 503;
        ssl_session_cache shared:SSL:50m;
        ssl_session_tickets off;
        ssl_certificate /etc/nginx/certs/default.crt;
        ssl_certificate_key /etc/nginx/certs/default.key;
}
# pydio.mydomain.ca
upstream pydio.mydomain.ca-upstream {
        ## Can be connected with "bridge" network
        # pydio_cells_1
        server 172.17.0.6:8080;
}
server {
        server_name pydio.mydomain.ca;
        listen 80 ;
        access_log /var/log/nginx/access.log vhost;
        # Do not HTTPS redirect Let'sEncrypt ACME challenge
        location ^~ /.well-known/acme-challenge/ {
                auth_basic off;
                auth_request off;
                allow all;
                root /usr/share/nginx/html;
                try_files $uri =404;
                break;
        }
        location / {
                return 301 https://$host$request_uri;
        }
}
server {
        server_name pydio.mydomain.ca;
        listen 443 ssl http2 ;
        access_log /var/log/nginx/access.log vhost;
        ssl_session_timeout 5m;
        ssl_session_cache shared:SSL:50m;
        ssl_session_tickets off;
        ssl_certificate /etc/nginx/certs/pydio.mydomain.ca.crt;
        ssl_certificate_key /etc/nginx/certs/pydio.mydomain.ca.key;
        ssl_dhparam /etc/nginx/certs/pydio.mydomain.ca.dhparam.pem;
        ssl_stapling on;
        ssl_stapling_verify on;
        ssl_trusted_certificate /etc/nginx/certs/pydio.mydomain.ca.chain.pem;
        add_header Strict-Transport-Security "max-age=31536000" always;
        include /etc/nginx/vhost.d/default;
        location / {
                proxy_pass http://pydio.mydomain.ca-upstream;
        }
}

Now to be clear, I’m not 100% certain that my connection attempts are getting through to the Cells container, because I don’t see any responses in that container’s logs, in spite of the multitude of errors I’m able to bring up with nginx-proxy. Since I get different error codes, I feel the Cells web server is responding; it’s just not telling me what the errors are in the Cells container logs. At this point I’m lost. I’ve been at this for a few weeks now without success and will accept any advice possible.

Hello @Feynt and welcome to our forum.

Sorry for the late reply, but we are short-staffed during the summer season and have a lot to do with the new major v3 version that is around the corner (and that will be massive :slight_smile: !)

That said, I’ve quickly read your post, and on a first pass I see one misconfiguration that could be triggering your issue:

The external URL must be a full valid URL, including the protocol.

Could you first try with:

      - CELLS_EXTERNAL=https://pydio.mydomain.ca

If this does not solve your problem, we will dig further. In that case, the startup log would be a great help so that we can understand your issue quickly.

Sorry for the delay, I saw your response but work got me busy and I promptly forgot to reply.

Changing CELLS_EXTERNAL to include the https:// prefix didn’t seem to change anything according to nginx-proxy:

nginx.1     | pydio.mydomain.ca 1.2.3.4 - - [31/Jul/2021:16:29:05 -0400] "GET / HTTP/2.0" 400 48 "-" "Mozilla/5.0" "172.17.0.7:8080"
nginx.1     | 2021/07/31 16:29:05 [error] 640#640: *22303 recv() failed (104: Connection reset by peer) while sending to client, client: 1.2.3.4, server: pydio.mydomain.ca, request: "GET / HTTP/2.0", upstream: "http://172.17.0.7:8080/", host: "pydio.mydomain.ca"

I get a similar response when connecting from the server itself, pointing at the external address:

nginx.1     | pydio.mydomain.ca 1.2.3.4 - - [31/Jul/2021:16:33:41 -0400] "GET / HTTP/2.0" 400 48 "-" "Mozilla/5.0" "172.17.0.7:8080"

The startup logs of the MySQL container and the Pydio container, in that order, are:

2021-07-31 20:27:26+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.34-1debian10 started.
2021-07-31 20:27:28+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
2021-07-31 20:27:28+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.34-1debian10 started.
2021-07-31T20:27:28.916987Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2021-07-31T20:27:28.964840Z 0 [Note] mysqld (mysqld 5.7.34) starting as process 1 ...
2021-07-31T20:27:29.078202Z 0 [Note] InnoDB: PUNCH HOLE support available
2021-07-31T20:27:29.078235Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2021-07-31T20:27:29.078240Z 0 [Note] InnoDB: Uses event mutexes
2021-07-31T20:27:29.078245Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2021-07-31T20:27:29.078249Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-07-31T20:27:29.078253Z 0 [Note] InnoDB: Using Linux native AIO
2021-07-31T20:27:29.078500Z 0 [Note] InnoDB: Number of pools: 1
2021-07-31T20:27:29.078600Z 0 [Note] InnoDB: Using CPU crc32 instructions
2021-07-31T20:27:29.080397Z 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2021-07-31T20:27:29.088133Z 0 [Note] InnoDB: Completed initialization of buffer pool
2021-07-31T20:27:29.090185Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2021-07-31T20:27:29.114385Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2021-07-31T20:27:29.304763Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2021-07-31T20:27:29.409727Z 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2021-07-31T20:27:29.832789Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2021-07-31T20:27:29.835708Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2021-07-31T20:27:29.835720Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2021-07-31T20:27:29.836343Z 0 [Note] InnoDB: Waiting for purge to start
2021-07-31T20:27:29.887255Z 0 [Note] InnoDB: 5.7.34 started; log sequence number 4035213
2021-07-31T20:27:29.887658Z 0 [Note] Plugin 'FEDERATED' is disabled.
2021-07-31T20:27:29.887757Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2021-07-31T20:27:29.903949Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2021-07-31T20:27:29.903966Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
2021-07-31T20:27:29.905758Z 0 [Warning] CA certificate ca.pem is self signed.
2021-07-31T20:27:29.905987Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
2021-07-31T20:27:29.907332Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2021-07-31T20:27:29.907362Z 0 [Note] IPv6 is available.
2021-07-31T20:27:29.907374Z 0 [Note]   - '::' resolves to '::';
2021-07-31T20:27:29.907395Z 0 [Note] Server socket created on IP: '::'.
2021-07-31T20:27:29.911350Z 0 [Note] InnoDB: Buffer pool(s) load completed at 210731 20:27:29
2021-07-31T20:27:29.916789Z 0 [Warning] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2021-07-31T20:27:29.979678Z 0 [Note] Event Scheduler: Loaded 0 events
2021-07-31T20:27:29.979997Z 0 [Note] mysqld: ready for connections.
Version: '5.7.34'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
### Pydio Cells Home Edition,
 Version: 	2.2.8,
 Built: 	31 May 21 07:37 +0000,
 Git commit: 	b67c983cc5ed26767324a596eb3cd4b0464ddc7d,
 OS/Arch: 	linux/amd64,
 Go version: 	go1.15.11,
### About to execute: [cells configure],

Welcome to Pydio Cells Home Edition installation ,
Pydio Cells Home Edition (v2.2.8) will be configured to run on this machine.,
Make sure to prepare access and credentials to a MySQL 5.6+ (or MariaDB equivalent) server.,
Pick your installation mode when you are ready.,

2021-07-31T20:27:37.423Z	INFO	pydio.gateway.rest	started,
2021-07-31T20:27:37.574Z	INFO	pydio.rest.install	started,

✅ Using the local CA at "/var/cells/certs/rootCA.pem" ✨,
✅ Created a new certificate valid for the following names 📜 - "127.0.0.1" - "172.17.0.7" - "localhost",
✅ The certificate is at "/var/cells/certs/2f76a1dfd0fe1a5d9383280e4a9f56fa.pem" ,
 and the key at "/var/cells/certs/2f76a1dfd0fe1a5d9383280e4a9f56fa-key.pem",

👉 If you are behind a reverse proxy, you can either install the RootCA on the proxy machine trust store, or configure your proxy to `insecure_skip_verify` for pointing to Cells.,
👉 If you are developing locally, you may install the RootCA in your system trust store to see a green light in your browser!,
🗒  To easily install the RootCA in your trust store, use https://github.com/FiloSottile/mkcert. Set the $CAROOT environment variable to the rootCA folder then use 'mkcert -install',

Activating privacy features... done.,
https://0.0.0.0:8080,

Installation Server is starting...,
Listening to: 0.0.0.0:8080,

2021-07-31T20:27:40.371Z	INFO	pydio.gateway.proxy	Restarting proxy	{"caddyfile": "\n\n0.0.0.0:8080  {\n\troot \"/var/cells/static/install\"\n\tproxy /install 172.17.0.7:37841\n\n\t\n\ttls \"/var/cells/certs/2f76a1dfd0fe1a5d9383280e4a9f56fa.pem\" \"/var/cells/certs/2f76a1dfd0fe1a5d9383280e4a9f56fa-key.pem\"\n}\n\n\t "},
2021-07-31T20:27:40.872Z	INFO	pydio.gateway.proxy	Restart done,

Opening URL https://pydio.mydomain.ca in your browser. Please copy/paste it if the browser is not on the same machine.

I’ll try playing around a bit more to get it going, but further diagnosis would be appreciated. I know I’m not the first to attempt using nginx-proxy as a reverse proxy (particularly those coming from Nextcloud!), but so far all I can find online are people saying “why not just use traefik/insert other reverse proxy?” If I have to switch I can, but I’ve already got a solid grasp of nginx and its partner letsencrypt container, and it would be a shame to move away from them.

For confirmation as well, the nginx-proxy-letsencrypt container does have a certificate installed:


2021/07/31 16:27:26 Received event start for container 690838948bab,
2021/07/31 16:27:31 Debounce minTimer fired,
2021/07/31 16:27:26 Received event start for container 77082cfdda1c,
2021/07/31 16:27:31 Generated '/app/letsencrypt_service_data' from 6 containers,
2021/07/31 16:27:31 Running '/app/signal_le_service',
Creating/renewal pydio.mydomain.ca certificates... (pydio.mydomain.ca),
[Sat Jul 31 16:27:32 EDT 2021] Domains not changed.,
[Sat Jul 31 16:27:32 EDT 2021] Skip, Next renewal time is: Fri Aug 20 05:15:38 UTC 2021,
[Sat Jul 31 16:27:32 EDT 2021] Add '--force' to force to renew.,
Reloading nginx proxy (nginx-proxy)...,
2021/07/31 16:27:32 Generated '/etc/nginx/conf.d/default.conf' from 6 containers,
2021/07/31 16:27:32 [notice] 639#639: signal process started

I’m not sure Pydio needs to go through the trouble of generating its own certificate when that is already being handled.

If you do TLS termination on the reverse proxy, the flag that tells the Cells internal web server to accept plain HTTP requests is CELLS_NO_TLS. In your config file, it seems you have put CELLS_NO_SSL.

For the record, we changed this between the v1 and v2 major versions of the application.

Could you please try again with this flag?
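
For reference, here is a minimal sketch of the `environment` section of the `cells` service with both changes suggested in this thread applied (the full URL in CELLS_EXTERNAL and the v2 flag name). It assumes the rest of the posted docker-compose.yml stays unchanged, and it has not been tested against this exact setup.

```yaml
    environment:
      - CELLS_LOG_LEVEL=production
      - CELLS_BIND=0.0.0.0:8080
      - CELLS_EXTERNAL=https://pydio.mydomain.ca   # full URL, including the protocol
      - CELLS_NO_TLS=1                             # v2 name, replaces CELLS_NO_SSL
      - VIRTUAL_HOST=pydio.mydomain.ca
      - LETSENCRYPT_HOST=pydio.mydomain.ca
      - VIRTUAL_PORT=8080
```

With TLS disabled on the Cells side, the `proxy_pass http://...` line that nginx-proxy generates matches what the upstream actually speaks. That mismatch is what the “Client sent an HTTP request to an HTTPS server” message points to.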

Hey, significant improvement, it finally got to the setup screen. I believe I got the CELLS_NO_SSL off of a pydio instruction page for Docker, but maybe it was the v1 page instead of the v2 page? There wasn’t a clear distinction.

My new roadblock is an error connecting to the MySQL container:

Error 1130:  Host '172.17.0.7' is not allowed to connect to this MySQL server

The default, localhost, obviously doesn’t work because the pydio container doesn’t host the database itself. Pointing it at the correct Docker address for that container (172.17.0.6) gives me the above message. I’m guessing this has to do with a permission thing (allow “user” to log in from address x.x.x.x). I’m looking into ways to auto configure that permission via the docker-compose script.

Addendum:

I added some entries to my docker-compose.yml to specify ports for MySQL (to ensure it was using the right ones, probably superfluous) and to set MYSQL_ROOT_HOST: '%'. I then wiped the locally stored data for both mysql and pydio, which allowed the setup to go through, mostly. I’m now stuck in a setup reload loop (the log is too long to include in the post, so I put it on Pastebin).
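
The MySQL change described above can be sketched roughly as follows. This is a reconstruction from the description, not the exact file. The port entries are left out because the post does not show them. `'%'` lets root connect from any host, which is convenient on a private bridge network but broader than strictly necessary.

```yaml
  mysql:
    image: mysql:5.7
    restart: unless-stopped
    environment:
      MYSQL_ROOT_PASSWORD: P@ssw0rd
      MYSQL_ROOT_HOST: '%'          # allow root logins from other hosts, e.g. the Cells container
      MYSQL_DATABASE: cells
      MYSQL_USER: pydio
      MYSQL_PASSWORD: P@ssw0rd
    command: [mysqld, --character-set-server=utf8mb4, --collation-server=utf8mb4_unicode_ci]
    volumes:
      - /srv/storage/docker/mysql:/var/lib/mysql   # must be empty on first start for the variables to apply
    network_mode: "bridge"
```

The official MySQL image only reads these variables when it initializes an empty data directory, which is why the change took effect only after the stored data was wiped.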

Addendum 2:

And after just letting it do its restart loop long enough, it seems to have stabilised and brought me to a login page which is letting me in and doing the starting walkthrough wizard. So, great progress, definitely. I’ve tried making some folders and I’m getting errors “Unknown Source”, as well as from the logs:

2021-08-02T20:55:47.981Z	INFO	pydio.gateway.proxy	Restart done
2021-08-02T20:55:48.577Z	INFO	pydio.rest.workspace	Creating a Personal workspace
2021-08-02T20:55:48.808Z	INFO	pydio.rest.workspace	Settings ACLS for workspace
2021-08-02T20:55:49.875Z	INFO	pydio.rest.workspace	Creating a Common Files workspace on pydiods1
2021-08-02T20:55:50.008Z	INFO	pydio.rest.workspace	Settings ACLS for workspace
2021-08-02T20:58:16.829Z	INFO	pydio.grpc.tasks	Run Job internal-prune-jobs on timer event Iso8601Schedule:"R/2012-06-04T19:25:16.828696-07:03/PT10M" 
2021-08-02T21:00:16.829Z	INFO	pydio.grpc.tasks	Run Job flush-mailer-queue on timer event Iso8601Schedule:"R/2012-06-04T19:25:16.828696-07:00/PT5M" 
2021-08-02T21:05:16.829Z	INFO	pydio.grpc.tasks	Run Job flush-mailer-queue on timer event Iso8601Schedule:"R/2012-06-04T19:25:16.828696-07:00/PT5M" 
2021-08-02T21:08:20.378Z	ERROR	pydio.rest.meta	Rest Error 404	{"error": "{\"id\":\"undefined\",\"code\":403,\"detail\":\"Unknown data source\",\"status\":\"Forbidden\"}"}
2021-08-02T21:08:16.829Z	INFO	pydio.grpc.tasks	Run Job internal-prune-jobs on timer event Iso8601Schedule:"R/2012-06-04T19:25:16.828696-07:03/PT10M" 
2021-08-02T21:08:40.829Z	ERROR	pydio.rest.tree	Rest Error 500	{"error": "{\"id\":\"undefined\",\"code\":403,\"detail\":\"Unknown data source\",\"status\":\"Forbidden\"}"}
2021-08-02T21:09:03.683Z	ERROR	pydio.rest.tree	Rest Error 500	{"error": "{\"id\":\"undefined\",\"code\":403,\"detail\":\"Unknown data source\",\"status\":\"Forbidden\"}"}