Hi all,
I’m using if/else processors in combination with asserts as a checking mechanism before and after crucial joins (to interrupt execution if duplicate join partners are detected).
The idea is the following:
- count rows before join
- count rows after join
- use if else processor to compare these two row counts
4.a if row counts are equal → processor forwards input to left output port and execution proceeds as intended
4.b if row counts are not equal → processor forwards input to right output port → use an assert processor with a false statement (e.g., SELECT 1=0 from inputTable
) here to interrupt the execution
Issue: If I’m missing all join partners in the join I’d like to check, the input of the if/else processor is empty (due to inner join) and my assert processor is executed (i.e., it shows green check marks) but it does not fail.
Expected behaviour: if the right output branch is used and the assert is executed, the assert will always fail, no matter of the input data
Workaround: Union with a dummy column
Apparently, if the input to the assert p. is empty the assert does not do anything?
What am I missing here? Can anybody explain this to me?
EDIT: If looking for an Assert Processor that will always throw an Assertion Error regardless of its input, see my other answer!
Try an assertion query like this:
SELECT COUNT(*) != 0 FROM inputTable AS i UNION SELECT 1=0 FROM inputTable
This should work in both cases.
Technical background:
SELECT 1=0 from inputTable
produces false for every single row in the input. These are lazily evaluated using an AND operator by the Assert Processor. This chain has to start somewhere technically, so the first starting value is “true”. When there are no rows, the result will therefore be “true” since there is no single row the above statement is validated for.
By introducing this count(*) part, we make sure that we also cover this case. This approach works since when the Assert Processor is reached, we already know that we are in the else-branch and in any case we want an error here (at least that what I can imagine from your post).
So with that count-part, we make sure that there is at least one row that evaluates to false and therefore the whole evaluation result is false in any case. If there are no rows at all, the first part
SELECT COUNT(*) != 0 FROM inputTable AS i
evaluates to a row with one cell being “false”, whereas if there is at least one row, the first part will evaluate to “true” but the
SELECT 1=0 from inputTable
will yield as many “false” rows as there are rows in the input. Since one “false” is sufficient (AND-logic), this will result in a “false” for the overall evaluation in any case.
Hope that helps.
…and I just found an even easier way of achieving this (“this” being an Assert Processor that will always throw an Assertion Error regardless of its input):
Use
SELECT 1=0
(without the FROM-part) to always generate exactly one row containing false regardless of the input.
This is way more efficient since it will not do any counting or other transformations to the input data.
1 Like
Nice, thanks a lot!
Using SELECT 1=0
now