Integrate additional Python packages into ONE DATA

Dear ONE DATA experts,

I currently use a Python script to parse some HTML with BeautifulSoup4.
Now I’m thinking about integrating this functionality into a workflow with a Python Processor.
The problem I face is that BeautifulSoup is not among the libraries pre-installed in ONE DATA (at least I assume so, since an error is thrown).

My question is whether it is possible to add additional Python packages so that I can use my code within a Python Processor.

  • If yes, how can I do that?
  • If no, is there any other nice way to parse HTML in ONE DATA?

Thanks in advance!


Hey Matthias,

I only have limited information, but I can at least tell you that this user need is known and will be considered. We have already had kickoff meetings where this was discussed, and the definite goal is to make things such as Python libraries / versions easily configurable for particular parts of the application, or rather wherever it is needed. This will be especially important if you imagine developing something locally, transferring it to a development / testing environment, and finally putting it into production. In this user journey, where we might need even more granularity and different versions / libraries for different use cases, this is crucial functionality that we need to support.

For now, I think this is a server-wide configuration that would need to be handled by the DevOps department. Andreas Wölfl or anyone else in our DevOps department can surely provide more information here.

In the current Python processor there is no official way to install additional packages without DevOps support.
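
Before contacting DevOps, it may help to confirm which modules are actually importable on the instance. The following is only a minimal sketch in plain Python: the helper name `is_installed` is made up for illustration, and how the script is wired into a Python Processor is not shown here.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter.

    `is_installed` is a hypothetical helper name, not part of ONE DATA.
    """
    return importlib.util.find_spec(module_name) is not None


# "bs4" is the import name of BeautifulSoup4; "html" ships with Python itself.
for name in ("bs4", "html"):
    print(name, "available" if is_installed(name) else "missing")
```

If `bs4` is reported as missing, that would confirm the assumption from the original question and gives DevOps a concrete package name to work with.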

If you require an additional package, you need to contact them and they can try to install the package for you. This might be easy or difficult depending on the target ONE DATA instance.
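
As a stopgap for the HTML parsing itself, Python's standard library ships `html.parser`, which needs no extra installation. The sketch below is an assumption about what might be enough for simple cases, not a replacement for BeautifulSoup; the names `LinkCollector` and `extract_links` are made up for illustration, and the ONE DATA input/output handling is left out.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from <a> tags using only the stdlib."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None  # href of the <a> tag we are inside, if any
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside an <a href=...> element.
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


def extract_links(html_text):
    """Return a list of (href, text) tuples found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


sample = '<p>See <a href="https://example.com">the <b>docs</b></a>.</p>'
print(extract_links(sample))
```

This event-based style is more verbose than BeautifulSoup and less forgiving with badly broken markup, so it is probably only worth it for small, well-defined extraction tasks until the package can be installed.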

Thank you guys for the quick and helpful responses! :slightly_smiling_face:

Follow-up question: Is this the same issue, and the same way to go, if R packages / libraries are missing?