Is there any (soft or hard) limit on the parallel python executions (onedata-server config pythonService.allowedConnections)?
Are there any risks when setting it too high? What would be the worst thing to happen?
There are no limits as far as I know. It really depends on the available memory and CPU, and on what the Python Processors are used for. There is no official rule of thumb either.
The risks really depend on the setup:
Is there maybe also a possibility to monitor PyData, or to monitor how many python scripts are executed in parallel?
E.g. is it possible to get that info from the OD API?
If you are running PyData in a Docker container, you might be able to monitor the resources used by PyData (if you have DevOps access), but that won’t tell you how many scripts are executed in parallel.
There is no OD API that returns such info but such a feature should be quite feasible.
Additionally, I forgot to mention this before, “pythonService.allowedConnections” is not the only property that should be configured. There is a property on PyData’s side to throttle parallel executions called “max-concurrent-containers”.
Is this actually still in use? It sounds as if this could be a relic of the “old” PyData that spawned a container for each python processor executed, but the recent version does not do that anymore (if still in use, this setting would most likely be called something along the lines of “max-concurrent-subprocesses”).
Hi @christoph.schober, yes, that property was used in the past to limit the number of containers created by PyData. However, as you said, PyData now creates subprocesses instead. The property was not renamed because renaming it would break backward compatibility. Therefore, the same property is now used to limit the creation of subprocesses.
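To illustrate what a limit on concurrent subprocesses means in practice, here is a minimal Python sketch. It is not PyData's actual implementation; the class `SubprocessThrottle` and its parameter `max_concurrent` are hypothetical names that only stand in for the idea behind “max-concurrent-containers”. The sketch also keeps a counter of running scripts, which is roughly the kind of number one would want to expose for monitoring.

```python
import subprocess
import sys
import threading
from concurrent.futures import ThreadPoolExecutor


class SubprocessThrottle:
    """Hypothetical helper: runs Python snippets as subprocesses,
    never more than max_concurrent at the same time."""

    def __init__(self, max_concurrent: int) -> None:
        self.max_concurrent = max_concurrent
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self.running = 0  # subprocesses currently executing
        self.peak = 0     # highest value of `running` seen so far

    def run(self, code: str) -> int:
        # Block until a slot is free; this is the throttling step.
        with self._slots:
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                result = subprocess.run([sys.executable, "-c", code], check=False)
                return result.returncode
            finally:
                with self._lock:
                    self.running -= 1

    def run_all(self, scripts: list[str], workers: int = 8) -> list[int]:
        # More worker threads than slots, so excess requests wait in line.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(self.run, scripts))


if __name__ == "__main__":
    throttle = SubprocessThrottle(max_concurrent=2)
    codes = throttle.run_all(["import time; time.sleep(0.2)"] * 6)
    print("exit codes:", codes)
    print("peak parallel executions:", throttle.peak)
```

With a limit like this, requests beyond the limit wait instead of starting more processes, so setting the value too high mainly risks exhausting memory or CPU on the host, while setting it too low makes executions queue up.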
One additional note: For setting max-concurrent-containers, the environment variable ALLOWED_CONNECTIONS needs to be set (e.g. in the environment section of the pydata component in the helm chart override file).
One additional note: Setting the environment variable ALLOWED_CONNECTIONS for PyData only has an effect starting from PyData version 1.5.1.