For a project, I started moving parts of workflows into “services” using the integrated workflow processor. Now I get the impression that the workflows are getting slower (they take longer to execute) compared to the “original” WF.
Therefore the question: What is the performance overhead of an integrated workflow processor? Is it really doing an HTTP request in the background to execute the microservice workflow (which I guess would be a large overhead especially with larger data) or is it doing some magic to “glue” together the workflows and execute a single one?
And related: What about data storage (Spark) and optimizations? Would an integrated workflow processor break any Spark optimization?
AFAIK it does not make a separate HTTP request to trigger the microservice (actually, IMO that would be better than how it is right now, because it would be easier to debug).
Instead, the contents of the (inner) microservice WF (i.e. the processors) are “copy/pasted” into the (outer) workflow temporarily for the execution, i.e. the workflow should behave as if it was a single big workflow.
@Flogge should be the expert here though.
Integrated Workflow Execution works by replacing (integrating) the IWF processor node with the “slave” workflow that is to be integrated. The Microservice input and output nodes serve as the connectors and disappear during the integration. So, there is no HTTP communication involved. What runs is just one large workflow. Thus, the overhead should be negligible.
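To make the idea of "integration" more concrete, here is a minimal sketch of how an IWF node could be replaced by the inner workflow's processors in a plain graph model. This is only a toy illustration, not the product's actual implementation: the `Workflow` class, the `integrate` function, and all node names are hypothetical, and the sketch assumes a single input and a single output connector.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Toy workflow model (hypothetical): processor names plus directed edges."""
    nodes: set[str] = field(default_factory=set)
    edges: set[tuple[str, str]] = field(default_factory=set)


def integrate(outer: Workflow, iwf_node: str, inner: Workflow,
              inner_input: str, inner_output: str) -> Workflow:
    """Return a new workflow in which iwf_node is replaced by inner's processors.

    The inner input/output connector nodes disappear; their neighbours are
    wired directly to the neighbours of iwf_node in the outer workflow.
    """
    # Inner processors without the two connector nodes.
    inner_nodes = inner.nodes - {inner_input, inner_output}

    # Prefix inner names to avoid collisions with outer processor names.
    def rename(name: str) -> str:
        return f"{iwf_node}/{name}"

    nodes = (outer.nodes - {iwf_node}) | {rename(n) for n in inner_nodes}

    # Keep outer edges that do not touch the IWF node.
    edges = {(s, d) for s, d in outer.edges if iwf_node not in (s, d)}
    # Keep inner edges that run between real processors only.
    edges |= {(rename(s), rename(d)) for s, d in inner.edges
              if s in inner_nodes and d in inner_nodes}

    # Rewire: predecessors of the IWF node feed the processors that followed
    # the input connector; processors that fed the output connector feed the
    # successors of the IWF node.
    preds = [s for s, d in outer.edges if d == iwf_node]
    succs = [d for s, d in outer.edges if s == iwf_node]
    first = [d for s, d in inner.edges if s == inner_input and d in inner_nodes]
    last = [s for s, d in inner.edges if d == inner_output and s in inner_nodes]
    edges |= {(p, rename(f)) for p in preds for f in first}
    edges |= {(rename(l), s) for l in last for s in succs}

    return Workflow(nodes, edges)


# Usage: outer is load -> iwf -> save, inner is in -> clean -> dedupe -> out.
outer = Workflow({"load", "iwf", "save"}, {("load", "iwf"), ("iwf", "save")})
inner = Workflow({"in", "clean", "dedupe", "out"},
                 {("in", "clean"), ("clean", "dedupe"), ("dedupe", "out")})
merged = integrate(outer, "iwf", inner, "in", "out")
```

In this toy model, `merged` is a single flat graph (load, then clean, then dedupe, then save) with no trace of the IWF node or the connectors, which is why no request/response step between the two workflows is needed at execution time.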