Can Query Helper leverage Spark?

Dear Community,

Could someone tell me how the Query Helper Processor works internally, or more specifically, whether it can leverage Spark for parallel computation?
My use case: I have a potentially huge table (millions of rows) of key-value pairs. I want to run queries that apply an RLIKE operation to the values, i.e. check whether each value matches a regular expression. The number of queries depends on the number of regular expressions, which are fed into the workflow via another input; I expect that number to be in the two-digit range.
So, concretely: can the Query Helper run these queries in parallel, so that all (or at least some) of the RLIKE queries execute at the same time?


The answer as usual is: It depends.
Can you generate your query to have multiple RLIKE statements in the same SELECT statement?
These will be performed together.
If you use one SELECT statement per RLIKE statement (i.e., many rows in your query input), each statement is executed individually and the results are combined with a UNION. This hurts performance (a lot) when there are many such statements.
I’d recommend going for as few queries as possible. Maybe you can make use of the transform function to achieve your goal of doing many RLIKEs in the same statement.
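
To illustrate the "one SELECT, many RLIKEs" idea, here is a minimal sketch in Python that turns a list of regular expressions into a single Spark-SQL-style query string with one boolean column per pattern. It only builds text and does not touch Spark or the Query Helper. The table and column names (`kv_table`, `key`, `value`), the `match_<n>` column aliases, and both helper functions are made up for the example. The escaping assumes backslash-escaped string literals, which is Spark SQL's default parser behaviour as far as I know, so check it against your setup.

```python
def sql_literal(text):
    """Quote text as a SQL string literal.

    Assumption: the SQL dialect uses backslash escapes inside string
    literals, so backslashes and single quotes are both escaped.
    """
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_single_select(patterns, table="kv_table", key_col="key", value_col="value"):
    """Build ONE SELECT with one RLIKE expression per pattern.

    All names are hypothetical placeholders; adapt them to your table.
    """
    if not patterns:
        raise ValueError("at least one pattern is required")
    # One boolean column per regular expression: match_0, match_1, ...
    flags = [
        f"{value_col} RLIKE {sql_literal(p)} AS match_{i}"
        for i, p in enumerate(patterns)
    ]
    return f"SELECT {key_col}, {value_col}, " + ", ".join(flags) + f" FROM {table}"


if __name__ == "__main__":
    # Two example patterns: values starting with "a", values containing digits.
    print(build_single_select(["^a.*", "\\d+"]))
```

The resulting string contains a single SELECT, so it can go into one row of the query input instead of one row per regular expression.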