Upload failed: undefined

Hi @bsinou ,
I added the virtual IP on eno1 like you suggested. I don't see the eno1 interface when I run ip addr or ifconfig -a, but names are resolved. Is it correct that I currently bind Cells to 0.0.0.0? I don't have my reverse proxy set up yet (that'll be my next step).
I tried disabling the systemd service and starting Cells manually with cells start; I got the same errors as before and the services didn't start. I then added the line TasksMax=infinity to the cells.service file, and now when I run it through systemd, 9 out of 10 times all the services start!
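
For anyone following along, a cleaner way to do the same thing is a systemd drop-in rather than editing the unit file directly; a rough sketch (assuming the unit really is called cells.service, as above):

sudo systemctl edit cells.service
# in the editor that opens, add:
#   [Service]
#   TasksMax=infinity
sudo systemctl restart cells.service
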
Now when I try to delete a file through the front end I still get this error:
(screenshot of the error)
and in the log:

Level : error
Logger : pydio.rest.jobs
Msg : Rest Error 503 - {"id":"go.micro.client","code":503,"detail":"none available","status":"Internal Server Error"}
Level : error
Msg : Streamer PutTaskStream - {"id":"go.micro.client","code":500,"detail":"none available","status":"Internal Server Error"}

Thanks for all the help,

Ben

Well, I would personally not bind Cells to 0.0.0.0 — I only get errors that way.

Instead, try binding to 1.0.0.1. After all, you did configure that ‘virtual’ interface (or interface alias) quite correctly IMHO — so this ought to work!

Are you using nginx or Apache or something else as a reverse proxy in front of everything? If so, you will most likely need to add the environment variable CELLS_GRPC_EXTERNAL=33060 before launching Cells; if you're starting everything from systemd, then you should have at the bottom of the [Service] section the following line:

Environment=CELLS_GRPC_EXTERNAL=33060

Also see the Finalisation section in the general configurations KB page.
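
For what it's worth, if you launch Cells by hand just to test (the binary path here is an assumption, adjust to wherever your cells binary lives), the shell equivalent would be roughly:

export CELLS_GRPC_EXTERNAL=33060
./cells start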

Oh, just to clear things up, you are still using interfaces venet0 and venet0:1 as per your earliest comments, right? Because eno1 or eno1:0 are very likely not the 'correct' names for the Ethernet interface(s) on a virtualised Linux running on the x86_64 architecture. It's just that it gets a little confusing later in this topic when you suddenly start to talk about eno1 — which you shouldn't be seeing in your configuration (so no wonder that ip addr or ifconfig -a do not show that interface — these commands will only show the virtualised Ethernet interfaces venet0 and venet0:1).
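
A quick way to double-check which interface names actually exist on the VM (and which addresses they carry):

ip -br link show   # brief list of every interface the kernel knows about
ip -br addr show   # the same list, with the addresses currently assigned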

Hi,
I added the Environment variable to the service and tried to bind Cells to 1.0.0.1 (did you mean 10.0.0.1 by any chance? I tried that too, modifying the Caddyfile accordingly), and set up a reverse proxy through Caddy. This is my Caddyfile (when Cells is bound to 1.0.0.1):

<My domain name> {

  tls <Email address>

  reverse_proxy 1.0.0.1:8080 {
    # Use https with a self signed cert between Caddy and Cells
    transport http {
      tls
      tls_insecure_skip_verify
    }
  }
}

# You might also want to redirect HTTP traffic toward HTTPS  
http://<My domain name> {
  redir https://<My domain name>{uri}
}

When I try to access Pydio through my domain name I get:
421 Site <My domain> is not served on this interface
Yeah, sorry, the venet0 / eno1 thing was confusing for me as well; I set it up as venet0:1.

I think I found a possible reason why deleting files isn't possible: when I do cells ps I see that pydio.grpc.jobs [ ] is not active. What can I do about it?

Thanks,

Ben

Hi,

Just a quick update: when binding Cells to 0.0.0.0 I was able to run it through Caddy as a reverse proxy with a generated SSL certificate. 10.0.0.1 doesn't work. I still can't delete files.

@bsinou ,
The lines:

auto lo
iface lo inet loopback

are automatically added after the contents of the interfaces.template file. I did not find a way to get rid of them. I still cannot resolve names, so neither apt-get update nor ping google.com works. Resolution works again if I remove the virtual IP.

Thanks,

Ben

Hi @GwynethLlewelyn, any ideas how I could troubleshoot (or even solve!) the binding to 10.0.0.1? I have a feeling that it's close to the actual problem. When I bind Cells to 10.0.0.1:8080, my Caddyfile for the reverse proxy is:

https://<My domain> {

  tls <My Email>

  reverse_proxy 10.0.0.1:8080 {
    # Use https with a self signed cert between Caddy and Cells
    transport http {
      tls
      tls_insecure_skip_verify
    }
  }
}

# You might also want to redirect HTTP traffic toward HTTPS  
http://<My domain> {
  redir https://<My domain>{uri}
}

this is what cells ps looks like:

GENERIC SERVICES                          
 # data                                    
 pydio.test.objects             [ ]        
 # gateway                                 
 pydio.gateway.data             [X]        
 pydio.gateway.dav              [ ]        
 pydio.gateway.grpc             [ ]        
 pydio.gateway.proxy            [X]        
 pydio.gateway.rest             [X]        
 pydio.gateway.websocket        [X]        
 pydio.gateway.wopi             [ ]        
                                           
 GRPC SERVICES                             
 # broker                                  
 pydio.grpc.activity            [ ]        
 pydio.grpc.chat                [X]        
 pydio.grpc.log                 [X]        
 pydio.grpc.mailer              [X]        
 # data                                    
 pydio.grpc.data-key            [ ]        
 pydio.grpc.docstore            [X]        
 pydio.grpc.meta                [ ]        
 pydio.grpc.search              [ ]        
 pydio.grpc.tree                [ ]        
 pydio.grpc.versions            [ ]        
 # datasource                              
 pydio.grpc.data.index          [ ]        
 pydio.grpc.data.objects        [X]        
 pydio.grpc.data.sync           [X]        
 # discovery                               
 pydio.grpc.config              [ ]        
 pydio.grpc.healthcheck         [ ]        
 pydio.grpc.update              [X]        
 # frontend                                
 pydio.grpc.statics             [X]        
 # idm                                     
 pydio.grpc.acl                 [ ]        
 pydio.grpc.oauth               [ ]        
 pydio.grpc.policy              [ ]        
 pydio.grpc.role                [ ]        
 pydio.grpc.token               [ ]        
 pydio.grpc.user                [ ]        
 pydio.grpc.user-key            [ ]        
 pydio.grpc.user-meta           [ ]        
 pydio.grpc.workspace           [ ]        
 # scheduler                               
 pydio.grpc.jobs                [ ]        
 pydio.grpc.tasks               [X]        
 pydio.grpc.timer               [X]        
                                           
 REST SERVICES                             
 # broker                                  
 pydio.rest.activity            [ ]        
 pydio.rest.log                 [ ]        
 pydio.rest.mailer              [ ]        
 # data                                    
 pydio.rest.meta                [ ]        
 pydio.rest.search              [ ]        
 pydio.rest.templates           [ ]        
 pydio.rest.tree                [ ]        
 # discovery                               
 pydio.rest.config              [ ]        
 pydio.rest.update              [ ]        
 # frontend                                
 pydio.rest.frontend            [ ]        
 pydio.web.statics              [ ]        
 # idm                                     
 pydio.rest.acl                 [ ]        
 pydio.rest.auth                [ ]        
 pydio.rest.graph               [ ]        
 pydio.rest.policy              [ ]        
 pydio.rest.role                [ ]        
 pydio.rest.share               [ ]        
 pydio.rest.user                [ ]        
 pydio.rest.user-meta           [ ]        
 pydio.rest.workspace           [ ]        
 pydio.web.oauth                [ ]        
 # scheduler                               
 pydio.rest.jobs                [ ]       

Any ideas? It's really driving me mad.

Thanks,

Ben


I can confirm a fix for this “upload failed: undefined”

Symptoms:

  • uploads fail
  • lots of “proxy restart” messages in the app output
  • in the “services” panel, the “pydio.gateway.data”, “pydio.gateway.grpc” and “pydio.gateway.proxy” statuses are red

Solution:

  • add a local IP address to one of the Ethernet devices
  • in my case
    ip a a 10.0.0.1/24 dev ens3

After a restart of Cells, all the above symptoms disappeared.
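
For completeness, that is the short form of ip addr add; the long form plus a quick sanity check (the interface name will differ on your system) would be roughly:

ip addr add 10.0.0.1/24 dev ens3   # same as "ip a a 10.0.0.1/24 dev ens3"
ip -br addr show dev ens3          # confirm the address is now present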


Hi, thanks for the response.
I removed 10.0.0.1 from the interfaces file and ran ip a a 10.0.0.1/24 dev venet0 as you recommended. Now, when Cells is bound to 10.0.0.1:8080, the only two services not running in cells ps are pydio.grpc.jobs and pydio.grpc.healthcheck. It's annoying that I can't modify the interfaces file directly; I guess a solution would be to run the command that adds the IP on boot.
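Something like a small one-shot systemd unit ordered before Cells might do it; this is only a sketch, and the unit name, paths and interface here are taken from this thread, so adjust to your setup:

sudo tee /etc/systemd/system/cells-alias-ip.service > /dev/null <<'EOF'
[Unit]
Description=Add alias IP used by Pydio Cells
After=network-online.target
Before=cells.service

[Service]
Type=oneshot
# the leading "-" tells systemd to ignore a non-zero exit (e.g. if the address is already set)
ExecStart=-/sbin/ip addr add 10.0.0.1/24 dev venet0

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable cells-alias-ip.service
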
When I try to access my domain from the browser I get the error 421 Site is not served on this interface. When I look at the Cells log there are two messages that keep repeating (just to be clear, my domain's form is www.cloud.<domain_name>.info):

{
  "level": "info",
  "ts": "2021-03-26T14:27:43+01:00",
  "logger": "pydio.grpc.data.sync.pydiods1",
  "msg": "{\"level\":\"error\",\"ts\":\"2021-03-26T14:27:43+01:00\",\"msg\":\"Streamer PutTaskStream\",\"error\":\"{\\\"id\\\":\\\"go.micro.client\\\",\\\"code\\\":500,\\\"detail\\\":\\\"none available\\\",\\\"status\\\":\\\"Internal Server Error\\\"}\"}"
}

and:

{
  "level": "error",
  "ts": "2021-03-26T14:27:43+01:00",
  "logger": "pydio.grpc.tasks",
  "msg": "Streamer PutTaskStream",
  "error": "{\"id\":\"go.micro.client\",\"code\":500,\"detail\":\"none available\",\"status\":\"Internal Server Error\"}"
}

The reverse proxy in the Caddyfile is pointing to 10.0.0.1 as I wrote above; I just added a logging directive. Caddy logs these two messages when I try to access Pydio from the browser:

{
  "request": {
    "remote_addr": "<redacted>:55455",
    "proto": "HTTP/2.0",
    "method": "GET",
    "host": "cloud.<redacted>.info",
    "uri": "/",
    "headers": {
      "User-Agent": [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:87.0) Gecko/20100101 Firefox/87.0"
      ],
      "Accept": [
        "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
      ],
      "Accept-Language": [
        "en-US,en;q=0.5"
      ],
      "Accept-Encoding": [
        "gzip, deflate, br"
      ],
      "Dnt": [
        "1"
      ],
      "Upgrade-Insecure-Requests": [
        "1"
      ],
      "Cache-Control": [
        "max-age=0"
      ],
      "Te": [
        "trailers"
      ]
    },
    "tls": {
      "resumed": false,
      "version": 772,
      "cipher_suite": 4865,
      "proto": "h2",
      "proto_mutual": true,
      "server_name": "cloud.<redacted>.info"
    }
  },
  "common_log": "<redacted> - - [26/Mar/2021:14:35:01 +0100] \"GET / HTTP/2.0\" 421 64",
  "duration": 0.009347468,
  "size": 64,
  "status": 421,
  "resp_headers": {
    "Content-Length": [
      "64"
    ],
    "Date": [
      "Fri, 26 Mar 2021 13:35:01 GMT"
    ],
    "Content-Type": [
      "text/plain; charset=utf-8"
    ],
    "X-Content-Type-Options": [
      "nosniff"
    ],
    "Server": [
      "Caddy",
      ""
    ]
  }
}
{
  "request": {
    "remote_addr": "<redacted>:55456",
    "proto": "HTTP/2.0",
    "method": "GET",
    "host": "cloud.<redacted>.info",
    "uri": "/",
    "headers": {
      "Accept-Encoding": [
        "gzip, deflate, br"
      ],
      "Dnt": [
        "1"
      ],
      "Upgrade-Insecure-Requests": [
        "1"
      ],
      "Cache-Control": [
        "max-age=0"
      ],
      "Te": [
        "trailers"
      ],
      "User-Agent": [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:87.0) Gecko/20100101 Firefox/87.0"
      ],
      "Accept": [
        "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
      ],
      "Accept-Language": [
        "en-US,en;q=0.5"
      ]
    },
    "tls": {
      "resumed": false,
      "version": 772,
      "cipher_suite": 4865,
      "proto": "h2",
      "proto_mutual": true,
      "server_name": "cloud.<redacted>.info"
    }
  },
  "common_log": "<redacted> - - [26/Mar/2021:14:35:02 +0100] \"GET / HTTP/2.0\" 421 64",
  "duration": 0.000890917,
  "size": 64,
  "status": 421,
  "resp_headers": {
    "Server": [
      "Caddy",
      ""
    ],
    "Content-Type": [
      "text/plain; charset=utf-8"
    ],
    "X-Content-Type-Options": [
      "nosniff"
    ],
    "Content-Length": [
      "64"
    ],
    "Date": [
      "Fri, 26 Mar 2021 13:35:02 GMT"
    ]
  }
}

Does anyone have any ideas?
Thanks,

Ben

Hi,
Healthcheck service: it's normal that it's not running (unless you set a dedicated flag for it).
Jobs service: not normal, and probably the source of your "PutTaskStream" errors.

Any errors regarding the jobs service at the very start?

Hi @charles , thanks for the response.
I added ip a a 10.0.0.1/24 dev venet0 to my rc.local file, but I still have to manually systemctl restart cells to get all the services except jobs to start.
I checked with ip addr; the address 10.0.0.1 is there.
Good to know re the Healthcheck. The log is obviously very long; the errors I'm getting from journalctl -u cells.service after restarting Cells are:

"pydio.grpc.tasks","msg":"Run Job resync-ds-pydiods1 on demand","SpanUuid":"bc1ca694-8ee6-11eb-bad1-024258ebb111"}
 panic: runtime error: index out of range [-1]
 goroutine 740 [running]:
 github.com/pydio/cells/broker/log.(*SyslogServer).getWriteIndex(...)
         github.com/pydio/cells/broker/log/syslog.go:119
 github.com/pydio/cells/broker/log.(*SyslogServer).watchInserts(0xc000bf3f80)
         github.com/pydio/cells/broker/log/syslog.go:199 +0x590
 created by github.com/pydio/cells/broker/log.(*SyslogServer).Open
         github.com/pydio/cells/broker/log/syslog.go:114 +0x3eb
{"level":"error","ts":"2021-03-27T11:25:23+01:00","msg":"SubProcess finished with error: trying to restart now"}
{"level":"info","ts":"2021-03-27T11:25:23+01:00","msg":"[pydio.grpc.log] Cannot open bleve index /var/cells/services/pydio.grpc.jobs/tasklogs.bleve cannot create new index, path already exists"}
{"level":"info","ts":"2021-03-27T11:25:26+01:00","logger":"pydio.grpc.tasks","msg":"Run Job resync-ds-cellsdata on demand","SpanUuid":"bf2a5c47-8ee6-11eb-bad1-024258ebb111"}
 panic: runtime error: index out of range [-1]
 goroutine 716 [running]:
 github.com/pydio/cells/broker/log.(*SyslogServer).getWriteIndex(...)
         github.com/pydio/cells/broker/log/syslog.go:119
 github.com/pydio/cells/broker/log.(*SyslogServer).watchInserts(0xc000521c80)
         github.com/pydio/cells/broker/log/syslog.go:199 +0x590
 created by github.com/pydio/cells/broker/log.(*SyslogServer).Open
         github.com/pydio/cells/broker/log/syslog.go:114 +0x3eb
{"level":"error","ts":"2021-03-27T11:25:28+01:00","msg":"SubProcess finished with error: trying to restart now"}
{"level":"error","ts":"2021-03-27T11:25:33+01:00","logger":"pydio.grpc.tasks","msg":"Streamer PutTaskStream","error":"{\"id\":\"go.micro.client\",\"code\":500,\"detail\":\"Error creating stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp 10.0.0.1:43502: connect: connection refused\\\"\",\"status\":\"Internal Server Error\"}"}
{"level":"info","ts":"2021-03-27T11:26:14+01:00","logger":"pydio.grpc.data.sync.pydiods1","msg":"{\"level\":\"error\",\"ts\":\"2021-03-27T11:26:14+01:00\",\"msg\":\"Streamer PutTaskStream\",\"error\":\"{\\\"id\\\":\\\"go.micro.client\\\",\\\"code\\\":500,\\\"detail\\\":\\\"Error creating stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\\\\\"transport: Error while dialing dial tcp 10.0.0.1:34601: connect: connection refused\\\\\\\"\\\",\\\"status\\\":\\\"Internal Server Error\\\"}\"}"}

Rinse and repeat for various services on various ports. I still can't check which functions are working and which aren't because of the "Site is not served on this interface" error.

Thanks,

Ben

Apologies for my silence here.

First, you’re right, I mistyped the address, you correctly spotted that it’s indeed 10.0.0.1 that I meant (1.0.0.1 is actually the secondary DNS server I use, thus my mistake! :slight_smile: ).

Secondly, I’m really not familiar with Caddy’s configuration files (I know, I should make an effort to understand it better!); so I have no idea if there is any issue with that… I can just speculate and say that your configuration looks right.

Thirdly — and perhaps more importantly! — as mentioned before, I have no idea how the Strato VM works and what its tricks and 'special' configurations are. Therefore most of my comments assume that you have full control of your VM instance and that it behaves 'as if' it were running on bare metal, i.e. a physical server. This may or may not be the case… I did read a bit about Strato (they have awesome prices, BTW!) but although they list the hardware they use for provisioning VPS services (the HP 3PAR platform from Hewlett-Packard), I couldn't find any information related to what software they use. Ultimately, it should be irrelevant, but sometimes… it may make a difference, I don't know.

Anyway, taking into account my overall limited experience, it seems to me that your Cells is binding some services to the correct ports, but seems to fail in some cases. I had a very similar issue recently, when adding a brand-new workspace, from new storage that I had just added. When I looked at pydio.json, I saw that Cells had picked port 9001 (if I remember correctly) to run the services related to that workspace/storage, which, unfortunately, is in use by my server for other purposes. I think this happened because I had temporarily shut down the service running on ports 9000-9150 to upgrade to a more recent version, and, while I waited for that to finish, I was tinkering with Cells. This might explain why Cells suddenly 'grabbed' a port that shouldn't be available. When rebooting the system, everything worked, because Cells starts before the other service — I only found that out when the other service was not working and complaining about the 'stolen' port!

The fix, in my case, was super-simple — I just shut Cells down, manually added a free port on pydio.json, and restarted Cells (after guaranteeing that the other service was up and running, too). Cells promptly bound its services to the ports I had configured and my problem disappeared!

Now, again, I'm not 100% sure that this was exactly what was happening, and much less sure if that's what you are experiencing, but there are hints in some of the errors you posted that Cells is having trouble binding services to some ports. Assuming that the Strato VM doesn't limit how many ports can be simultaneously used (there is no reason to assume otherwise, but who knows how their services work…), it seems that there might be a few ports that are being 'stolen' by other services running on your VM instance. This would explain why, when you reboot the instance and start Cells immediately, it seems to work for a few moments and then services start dropping errors: other things running on your server could be attempting to use the same ports as Cells and, at some point, succeed in 'stealing' the port from Cells… thus the errors.

All of the above is not supposed to happen, but, again, it's impossible for me to say under which special circumstances it may occur. One thing is for sure: if you have several services running on your VPS and some of them attempt to use the same ports, there will be a race condition, as some services succeed in binding to their ports and others fail; and since this happens in parallel, it will be next to impossible to predict which services will actually run and which won't…

OK, to check my assumption, you need to find out which ports are in use. There are several possible ways of doing that; use the one you're most familiar with. I'm personally a fan of lsof -i :8080, but you can get even more interesting information from fuser -n tcp -v -a 8080 or ss -tulpn | grep :8080. You will have to take a look at all ports mentioned in pydio.json. From the errors you posted, you should most definitely look at ports 43502 and 34601, and make sure that they are, indeed, being used by the cells process…
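
For instance, to check the two ports from your logs (swap in whatever pydio.json lists), something along these lines should do:

ss -tlnp | grep -E ':(43502|34601)\b'   # is anything listening on those ports?
lsof -nP -iTCP:43502 -sTCP:LISTEN       # if so, which process owns it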

Last but not least… if everything else fails, you can always attempt to run a new ./cells configure and start your configuration from scratch — different ports will be assigned. I did that once or twice when manually hacking pydio.json was leading nowhere…

Good hunting!

Hey, thanks for the detailed response :slight_smile:
I ditched Caddy in favor of nginx and it's working like a charm with a self-signed certificate; the next step is a proper one. I guess Caddy isn't very mature yet; it looks very good and simple, though, just a bit too opaque for my taste.
I re-ran cells configure, followed your advice, and checked all the involved ports.
(screenshot)
Except for 33593, the results are all empty; it seems that Cells doesn't bind to them at all, and neither does anything else.
I can now upload and download files, but when I try to delete a file I get the Streamer PutTaskStream error. As you can see, the service looks green in the UI, but cells ps still shows that it's not running. I also still need to restart the cells service manually after a reboot of the server.
Interestingly enough, before I restart the service after a reboot, pydio.grpc.jobs is active in cells ps but almost nothing else is, and when I try to access Pydio through the browser I get an endless "Server is starting". These are the active services before I restart the cells service:

 pydio.gateway.data             [X]
 pydio.gateway.proxy            [X]
 pydio.gateway.rest             [X]
 pydio.gateway.websocket        [X]

 pydio.grpc.chat                [X]
 pydio.grpc.log                 [X]
 pydio.grpc.mailer              [X]

 pydio.grpc.docstore            [X]

 pydio.grpc.data.objects        [X]
 pydio.grpc.data.sync           [X]
 pydio.grpc.update              [X]
 # frontend
 pydio.grpc.statics             [X]

 pydio.grpc.jobs                [X]
 pydio.grpc.tasks               [X]
 pydio.grpc.timer               [X]

On a side note, there's a slight mistake in the nginx config file in the installation instructions. There are some @ characters which shouldn't be there. There are also a couple of listen [::]:443 ssl http lines; the http shouldn't be there as far as I can tell from the nginx documentation, and it also produces a config error.
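
In case it helps anyone copying that config, the corrected listen lines I ended up with look roughly like this (shown here as comments), followed by a syntax re-check:

#   listen 443 ssl http2;
#   listen [::]:443 ssl http2;
sudo nginx -t   # validate the configuration after editing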

Thanks,

Ben

Hi, I've been poking around and still haven't gotten any further. I thought that maybe part of the port range is blocked, but the only place controlling ports that I could find was /proc/sys/net/ipv4/ip_local_port_range, and that seems in order (32768 60999). Is there maybe another place where this is controlled, or am I heading in completely the wrong direction?
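
For reference, these are the knobs I checked (the reserved-ports list can also silently keep ports out of use):

sysctl net.ipv4.ip_local_port_range       # the ephemeral port range (32768 60999 here)
sysctl net.ipv4.ip_local_reserved_ports   # ports explicitly held back from that range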

Thanks,

Ben

In terms of your overall networking configuration, I’d guess that you’ve got everything right. There is indeed no obvious reason for Cells not to bind to those ports.

There may be a few other non-obvious reasons, though!

Firstly, the ports may be blocked for some stupid reason at the level of the hosting hypervisor (or whatever technology Strato uses to provide virtual machines).

Without Cells running, you can try this simple trick to see if you can bind to a port:

nc -lp 44775 > /dev/null

(44775 was one of the ports listed on your jobs page; nc is the netcat command, which should be present on most Linux distributions)

If that command gives some sort of error, you know that this particular port cannot be bound to, for whatever reason (it might not help you much further, but at least you’ll know why Cells couldn’t bind to the port — for some reason, nothing can, it’s not a Cells-related issue).
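
One caveat: netcat flavours differ in their flags, and it's worth binding to the alias address explicitly, since that's what Cells does; roughly:

nc -l 10.0.0.1 44775 > /dev/null          # OpenBSD-style netcat
nc -l -p 44775 -s 10.0.0.1 > /dev/null    # traditional/GNU netcat
ss -tln | grep :44775                     # from another shell: confirm the listener is there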

Next: have you checked your firewall? Does it block any range of ports for some reason? (Probably not — or else you couldn't bind Cells to 33593, either…) What about at the level below, i.e., the hypervisor or whatever runs your VM? Does their firewall block anything? What about the total number of ports that are free to bind to? (Imagine that Strato is restricting the number of ports to, say, the first thousand, and you hit that limit by coincidence.)

Also: do you have SELinux active on either the VM or at the level of the VM hypervisor? SELinux is usually a nightmare to debug (deliberately so: it's supposed to make it very hard to circumvent security policies!). Maybe you need to add a new policy rule to ensure that you have access to those extra ports — or have Strato do so on their hypervisor software.

Last (but not least!): I understand that you have some errors when attempting to do more complex things beyond uploading/downloading. But personally, I'm a bit confused about what kind of errors you get when launching Cells — which, admittedly, is very chatty and difficult to follow. I would expect to see Cells grumbling in the logs if it tried to bind one microservice to a port and failed. With luck, it might even say why it failed (don't hold your breath, though).

BTW,

@benw wrote:
[…]
listen [::]:443 ssl http

Heh. Indeed. Someone forgot to add a 2 (for http2) on that line…

Don't forget that [::] is the notation for IPv6; in your case you've only been using IPv4 in your examples.

Cheers, Gwyn

[EDIT] It's not the pydio.rest.jobs service showing twice in cells ps as I wrote below; one is pydio.grpc.jobs, which is not running, and the other is pydio.rest.jobs, which is running. I misread.

Hey,

thanks for the quick response!
I did nc -lp <port_number> > /dev/null and then, in a separate window, lsof -i :<port_number> for all the port numbers that appear in the startup log; they all bind to nc as I would expect. My firewall is ufw; even when I turn it off it doesn't work, so I doubt that's the problem.
I’m not running SELinux but AppArmor is built into the Kernel. I’ve tried to disable the service as described here (I didn’t disable it in the kernel, only disabled the systemd service) and it didn’t make a difference.
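For what it's worth, this is how I checked what AppArmor was actually doing:

sudo aa-status                    # lists loaded profiles and whether they are in enforce or complain mode
sudo systemctl status apparmor    # is the AppArmor service itself active?
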
I did notice something odd though: when I run cells ps I see pydio.rest.jobs twice, once enabled and once disabled. Could it be clashing with itself?
I wrote Strato an email asking whether they have some firewall or are blocking any ports, even though I doubt it, because a dig into the security section of their website didn't reveal anything.
Here are the parts of the Cells startup log that seemed error-relevant to me:

 09:57:34 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:34+02:00","msg":"[pydio.grpc.log] Cannot open bleve index /var/cells/services/pydio.grpc.jobs/tasklogs.bleve cannot create new index, path already exists"}
 09:57:34 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:34+02:00","logger":"pydio.grpc.jobs","msg":"started"}
 09:57:37 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:37+02:00","logger":"pydio.grpc.tasks","msg":"Run Job resync-ds-cellsdata on demand","SpanUuid":"140626b6-9cf7-11eb-b6bd-0242947f3aaf"}
 09:57:37 <Hostname>: panic: runtime error: index out of range [-1]
 09:57:37 <Hostname>: goroutine 740 [running]:
 09:57:37 <Hostname>: github.com/pydio/cells/broker/log.(*SyslogServer).getWriteIndex(...)
 09:57:37 <Hostname>:         github.com/pydio/cells/broker/log/syslog.go:119
 09:57:37 <Hostname>: github.com/pydio/cells/broker/log.(*SyslogServer).watchInserts(0xc0004add80)
 09:57:37 <Hostname>:         github.com/pydio/cells/broker/log/syslog.go:199 +0x590
 09:57:37 <Hostname>: created by github.com/pydio/cells/broker/log.(*SyslogServer).Open
 09:57:37 <Hostname>:         github.com/pydio/cells/broker/log/syslog.go:114 +0x3eb
 09:57:37 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:37+02:00","logger":"pydio.gateway.proxy","msg":"Restarting proxy","caddyfile":"\n\n\n\n\n\n\n10.0.0.1:8085 {\n\t\n\n\t\n\n\tproxy /a  10.0.0.1:35361 {\n\t\twithout /a\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t}\n\tproxy /oidc 10.0.0.1:34661 {\n\t\tinsecure_skip_verify\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t}\n\tproxy /io   10.0.0.1:40359 {\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t\theader_downstream Content-Security-Policy \"script-src 'none'\"\n\t\theader_downstream X-Content-Security-Policy \"sandbox\"\n\t}\n\tproxy /data 10.0.0.1:40359 {\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t\theader_downstream Content-Security-Policy \"script-src 'none'\"\n\t\theader_downstream X-Content-Security-Policy \"sandbox\"\n\t}\n\tproxy /ws   10.0.0.1:35121 {\n\t\twebsocket\n\t\twithout /ws\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t}\n\tproxy /dav 10.0.0.1:34189 {\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t\theader_downstream Content-Security-Policy \"script-src 'none'\"\n\t\theader_downstream X-Content-Security-Policy \"sandbox\"\n\t}\n\t\n\n\tproxy /plug/ 10.0.0.1:35449 {\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t\theader_downstream Cache-Control \"public, max-age=31536000\"\n\t}\n\tproxy /public/ 10.0.0.1:35449 {\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t}\n\tproxy /public/plug/ 10.0.0.1:35449 {\n\t\twithout /public\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t\theader_downstream Cache-Control \"public, max-age=31536000\"\n\t}\n\tproxy /user/reset-password/ 10.0.0.1:35449 {\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t}\n\t\n\tproxy /robots.txt 10.0.0.1:35449 {\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t}\n\t\n\tproxy /login 10.0.0.1:35449/gui {\n\t\twithout /login\n\t\theader_upstream Host <External address>\n\t\theader_upstream X-Real-IP {remote}\n\t\theader_upstream X-Forwarded-Proto {scheme}\n\t}\n\n\n\tproxy /grpc https://10.0.0.1:33767 {\n\t\twithout /grpc\n\t\tinsecure_skip_verify\n\t}\n\t\n\trewrite {\n\t\tif {>Content-type} has \"application/grpc\"\n\t\tto /grpc/{path}\n\t}\n\n\n\tredir 302 {\n\t\tif {>Content-type} not_has \"application/grpc\"\n\t\tif {path} is /\n\t\t/ /login\n\t}\n\t\n\t\n\t\n        proxy /wopi/ 10.0.0.1:45610 {\n            transparent\n\t\t\theader_upstream Host <External address>\n\t\t\theader_upstream X-Real-IP {remote}\n\t\t\theader_upstream X-Forwarded-Proto {scheme}\n        }\n\n        proxy /loleaflet/ http://<External address of collabora>:9980/loleaflet {\n            transparent\n            
insecure_skip_verify\n            without /loleaflet/\n        }\n\n        proxy /hosting/discovery http://<External address of collabora>:9980/hosting/discovery {\n            transparent\n            insecure_skip_verify\n            without /hosting/discovery\n        }\n\n        proxy /lool/ http://<External address of collabora>:9980/lool/ {\n            transparent\n            insecure_skip_verify\n            websocket\n            without /lool/\n        }\n    \n\t\n\t\n\trewrite {\n\t\tif {path} not_starts_with \"/a/\"\n\t\tif {path} not_starts_with \"/oidc/\"\n\t\tif {path} not_starts_with \"/io\"\n\t\tif {path} not_starts_with \"/data\"\n\t\tif {path} not_starts_with \"/ws/\"\n\t\tif {path} not_starts_with \"/plug/\"\n\t\tif {path} not_starts_with \"/dav\"\n\t\t\n\t\tif {path} not_starts_with \"/wopi/\"\n\t\t\n\t\tif {path} not_starts_with \"/loleaflet/\"\n\t\t\n\t\tif {path} not_starts_with \"/hosting/discovery\"\n\t\t\n\t\tif {path} not_starts_with \"/lool/\"\n\t\t\n\t\tif {path} not_starts_with \"/public/\"\n\t\tif {path} not_starts_with \"/user/reset-password\"\n\t\tif {path} not_starts_with \"/robots.txt\"\n\t\tto {path} {path}/ /login\n\t}\n\n\troot \"/76126d21-b5d9-434a-b3f7-9f968b18f718\"\n\n\t\n\ttls \"/var/cells/certs/190dafab69706a67221c1226360de7dc.pem\" \"/var/cells/certs/190dafab69706a67221c1226360de7dc-key.pem\"\n\terrors \"/var/cells/logs/caddy_errors.log\"\n}\n\n\n\n\n\t"}
 09:57:38 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:38+02:00","logger":"pydio.gateway.proxy","msg":"Restart done"}
 09:57:39 <Hostname>: {"level":"error","ts":"2021-04-14T09:57:39+02:00","msg":"SubProcess finished with error: trying to restart now"}
 09:57:39 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:39+02:00","msg":"[pydio.grpc.log] Cannot open bleve index /var/cells/services/pydio.grpc.jobs/tasklogs.bleve cannot create new index, path already exists"}
 09:57:39 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:39+02:00","logger":"pydio.grpc.jobs","msg":"started"}
 09:57:42 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:42+02:00","logger":"pydio.grpc.tasks","msg":"Run Job resync-ds-pydiods1 on demand","SpanUuid":"16f9b78e-9cf7-11eb-b6bd-0242947f3aaf"}
 09:57:42 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:42+02:00","logger":"pydio.grpc.tasks","msg":"Run Job resync-ds-buvo on demand","SpanUuid":"17081283-9cf7-11eb-b6bd-0242947f3aaf"}
 09:57:42 <Hostname>: panic: runtime error: index out of range [-1]
 09:57:42 <Hostname>: goroutine 750 [running]:
 09:57:42 <Hostname>: github.com/pydio/cells/broker/log.(*SyslogServer).getWriteIndex(...)
 09:57:42 <Hostname>:         github.com/pydio/cells/broker/log/syslog.go:119
 09:57:42 <Hostname>: github.com/pydio/cells/broker/log.(*SyslogServer).watchInserts(0xc000772c00)
 09:57:42 <Hostname>:         github.com/pydio/cells/broker/log/syslog.go:199 +0x590
 09:57:42 <Hostname>: created by github.com/pydio/cells/broker/log.(*SyslogServer).Open
 09:57:42 <Hostname>:         github.com/pydio/cells/broker/log/syslog.go:114 +0x3eb
 09:57:43 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:43+02:00","logger":"pydio.grpc.data.sync.personal","msg":"{\"level\":\"error\",\"ts\":\"2021-04-14T09:57:43+02:00\",\"msg\":\"Streamer PutTaskStream\",\"error\":\"{\\\"id\\\":\\\"go.micro.client\\\",\\\"code\\\":500,\\\"detail\\\":\\\"Error creating stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\\\\\"trans\\port: Error while dialing dial tcp 10.0.0.1:46218: connect: connection refused\\\\\\\"\\\",\\\"status\\\":\\\"Internal Server Error\\\"}\"}"}
 09:57:44 <Hostname>: {"level":"error","ts":"2021-04-14T09:57:44+02:00","msg":"SubProcess finished with error: trying to restart now"}
 09:57:44 <Hostname>: {"level":"error","ts":"2021-04-14T09:57:44+02:00","logger":"pydio.grpc.tasks","msg":"Streamer PutTaskStream","error":"{\"id\":\"go.micro.client\",\"code\":500,\"detail\":\"Error creating stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp 10.0.0.1:46218: connect: connection refused\\\"\",\"status\":\"Internal Server Error\"}"}
 09:57:44 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:44+02:00","msg":"[pydio.grpc.log] Cannot open bleve index /var/cells/services/pydio.grpc.jobs/tasklogs.bleve cannot create new index, path already exists"}
 09:57:44 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:44+02:00","logger":"pydio.grpc.jobs","msg":"started"}
 09:57:56 <Hostname>: panic: runtime error: index out of range [-1]
 09:57:56 <Hostname>: goroutine 730 [running]:
 09:57:56 <Hostname>: github.com/pydio/cells/broker/log.(*SyslogServer).getWriteIndex(...)
 09:57:56 <Hostname>:         github.com/pydio/cells/broker/log/syslog.go:119
 09:57:56 <Hostname>: github.com/pydio/cells/broker/log.(*SyslogServer).watchInserts(0xc00012d580)
 09:57:56 <Hostname>:         github.com/pydio/cells/broker/log/syslog.go:199 +0x590
 09:57:56 <Hostname>: created by github.com/pydio/cells/broker/log.(*SyslogServer).Open
 09:57:56 <Hostname>:         github.com/pydio/cells/broker/log/syslog.go:114 +0x3eb
 09:57:58 <Hostname>: {"level":"error","ts":"2021-04-14T09:57:58+02:00","msg":"SubProcess finished with error: trying to restart now"}
 09:57:58 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:58+02:00","logger":"pydio.grpc.jobs","msg":"started"}
 09:57:58 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:58+02:00","msg":"[pydio.grpc.log] Cannot open bleve index /var/cells/services/pydio.grpc.jobs/tasklogs.bleve cannot create new index, path already exists"}
 09:57:58 <Hostname>: {"level":"info","ts":"2021-04-14T09:57:58+02:00","logger":"pydio.grpc.jobs","msg":"Setting task 01403925-5626-49e4-84e2-3b2f5fac6614 in error status as it was saved as running"}
 09:58:05 <Hostname>: {"level":"error","ts":"2021-04-14T09:58:05+02:00","logger":"pydio.grpc.tasks","msg":"Streamer PutTaskStream","error":"{\"id\":\"go.micro.client\",\"code\":500,\"detail\":\"Error creating stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp 10.0.0.1:35593: connect: connection refused\\\"\",\"status\":\"Internal Server Error\"}"}
 09:58:16 <Hostname>: {"level":"info","ts":"2021-04-14T09:58:16+02:00","logger":"pydio.grpc.tasks","msg":"Run Job internal-prune-jobs on timer event Iso8601Schedule:\"R/2012-06-04T19:25:16.828696-07:03/PT10M\" ","SpanUuid":"2ba9306f-9cf7-11eb-b6bd-0242947f3aaf"}
 09:58:26 <Hostname>: {"level":"error","ts":"2021-04-14T09:58:26+02:00","logger":"pydio.grpc.tasks","msg":"Streamer PutTaskStream","error":"{\"id\":\"go.micro.client\",\"code\":500,\"detail\":\"Error creating stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp 10.0.0.1:35593: connect: connection refused\\\"\",\"status\":\"Internal Server Error\"}"}
 09:58:37 <Hostname>: {"level":"info","ts":"2021-04-14T09:58:37+02:00","logger":"pydio.grpc.data.sync.cellsdata","msg":"Got Stats for index://cellsdata","SpanRootUuid":"140626b6-9cf7-11eb-b6bd-0242947f3aaf","SpanParentUuid":"140626b6-9cf7-11eb-b6bd-0242947f3aaf","SpanUuid":"382115db-9cf7-11eb-bfee-0242947f3aaf","stats":{"HasChildrenInfo":false,"HasSizeInfo":false,"Size":0,"Folders":0,"Files":0}}
 09:58:37 <Hostname>: {"level":"info","ts":"2021-04-14T09:58:37+02:00","logger":"pydio.grpc.data.sync.cellsdata","msg":"Got Stats for s3://<IP Address>:33099/cellsdata","SpanRootUuid":"140626b6-9cf7-11eb-b6bd-0242947f3aaf","SpanParentUuid":"140626b6-9cf7-11eb-b6bd-0242947f3aaf","SpanUuid":"382115db-9cf7-11eb-bfee-0242947f3aaf","stats":{"HasChildrenInfo":false,"HasSizeInfo":false,"Size":0,"Folders":0,"Files":0}}

 09:58:37 <Hostname>: {"level":"error","ts":"2021-04-14T09:58:37+02:00","logger":"pydio.grpc.tasks","msg":"cannot run action actions.internal.prune-jobs: {\"id\":\"\",\"code\":0,\"detail\":\"all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp 10.0.0.1:46218: connect: connection refused\\\" - Request was JobService.DetectStuckTasks on pydio.grpc.jobs - Micro-registry had node(s) : [10.0.0.1:43629]\",\"status\":\"\"}","OperationUuid":"internal-prune-jobs-0192e240","SpanUuid":"2ba9306f-9cf7-11eb-b6bd-0242947f3aaf","OperationUuid":"internal-prune-jobs-0192e240","SchedulerJobUuid":"internal-prune-jobs","SchedulerTaskUuid":"0192e240-c4b6-4ae6-8524-9cffef7e097a","SchedulerTaskActionPath":"ROOT/actions.internal.prune-jobs$0"}

Thanks a lot,

Ben

Hi,

I got an answer from Strato. They say that it's impossible to add another LAN IP because of the hypervisor they're using, and they suggested an IPv6 /56 subnet as an alternative. How can I go about doing this?
From my limited knowledge I'd say that I have to change the A DNS record, which points to my IPv4 address, to an AAAA record pointing to my IPv6 address, and modify the binding address of Cells from my previous 10.0.0.1 to an IPv6 address (surrounded by []) in a subnet that I still have to set up (I guess that's going to take some research). Is that possible?
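If I understand it right, the rough checks would be something like this (the domain here is just a placeholder following the pattern above):

ip -6 addr show scope global               # which global IPv6 addresses the VM already has
dig AAAA cloud.<domain_name>.info +short   # verify the new AAAA record once it exists
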
By the way, I tried to bind Cells to 127.0.0.1 and localhost to see what happens; that doesn't work either.

Cheers,

Ben

Hi,

I just uninstalled Docker and with it the IP that it creates for itself. Now, when I start Cells bound to localhost or 127.0.0.1 or 0.0.0.0, I get the Warning: no private IP detected for binding broker. Will bind to <My public IP>, which may give public access to the broker. message in the log. It did not solve the pydio.grpc.jobs issue, and I still can't delete files.
They also told me that the limits of the server are set in /proc/user_beancounters; these are its contents. I can't see anything related to ports. Am I missing something?

Version: 2.5
       uid  resource                     held              maxheld              barrier                limit              failcnt
   2907792: kmemsize                 68116480             72425472  9223372036854775807  9223372036854775807                    0
            lockedpages                     0                    0  9223372036854775807  9223372036854775807                    0
            privvmpages               1045210              1135320  9223372036854775807  9223372036854775807                    0
            shmpages                      298                 2073  9223372036854775807  9223372036854775807                    0
            dummy                           0                    0  9223372036854775807  9223372036854775807                    0
            numproc                       298                  298                 1100                 1100                    0
            physpages                  685136               687348              8388608              8388608                    0
            vmguarpages                     0                    0  9223372036854775807  9223372036854775807                    0
            oomguarpages               685136               687348                    0                    0                    0
            numtcpsock                      0                    0  9223372036854775807  9223372036854775807                    0
            numflock                        0                    0  9223372036854775807  9223372036854775807                    0
            numpty                          2                    2  9223372036854775807  9223372036854775807                    0
            numsiginfo                      0                  147  9223372036854775807  9223372036854775807                    0
            tcpsndbuf                       0                    0  9223372036854775807  9223372036854775807                    0
            tcprcvbuf                       0                    0  9223372036854775807  9223372036854775807                    0
            othersockbuf                    0                    0  9223372036854775807  9223372036854775807                    0
            dgramrcvbuf                     0                    0  9223372036854775807  9223372036854775807                    0
            numothersock                    0                    0  9223372036854775807  9223372036854775807                    0
            dcachesize               26824704             27123712  9223372036854775807  9223372036854775807                    0
            numfile                      2279                 2700  9223372036854775807  9223372036854775807                    0
            dummy                           0                    0  9223372036854775807  9223372036854775807                    0
            dummy                           0                    0  9223372036854775807  9223372036854775807                    0
            dummy                           0                    0  9223372036854775807  9223372036854775807                    0
            numiptent                     361                  365                 2000                 2000                    0

Thanks,

Ben

I finally found out a bit about this user_beancounters file; I guess it doesn't deliver any answers, because failcnt is 0 for everything. The only parameter I couldn't figure out is oomguarpages, which has a held value higher than 0 while the barrier/limit are 0, but its failcnt is also 0.
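For reference, I filtered for non-zero failcnt roughly like this (the last column is failcnt, and the first two lines are headers):

awk 'NR > 2 && $NF + 0 > 0' /proc/user_beancounters   # print only rows where failcnt is non-zero
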
At this point I assume it's related to the hypervisor, but I'm not sure how to check this hypothesis (that way I could go to Strato with something concrete).

Thanks,

Ben

Thanks for the help everyone. I just installed Debian 10 instead of Ubuntu, and now I can delete files :slight_smile:
If anyone runs into this problem in the future: the runtime/cgo: pthread_create failed error was also there. Setting TasksMax=infinity didn't work for me on Debian 10; I just set TasksMax=1000 and the error disappeared.
My guess is that the issues were some weird V-Server config stuff or some Ubuntu bloat, but I can't pinpoint it.
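If anyone wants to confirm that the limit actually applied, the effective value can be checked with:

systemctl show cells.service -p TasksMax   # prints the TasksMax currently in effect for the unit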

I had a different issue with Upload failed: undefined when running Cells inside a small Kubernetes cluster behind a proxy (nginx-ingress). It's not really related to this issue, but it may help someone.

It wasn't rocket science to make it work over HTTPS, but I couldn't upload some files (while others uploaded fine).

What I didn't realize at first is that there is a strict default body-size limit, something like 1 MB. Double-check your setup if you use any reverse proxy for the job, and set a bigger value, like 1024 MB.

Sample code for Ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "1024m"
(...)
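
For a plain nginx reverse proxy (no Ingress), the equivalent knob is client_max_body_size; roughly:

# in the server/location block that proxies to Cells:
#   client_max_body_size 1024m;
sudo nginx -s reload   # apply after editing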