Mixed-up columns in union

Dear Community,

In one of my workflows I ran into the following problem: when unioning two tables (same schema and same data types, but differently ordered columns) with the UNION processor, it seemed that my second input table was not appended, although it was not empty. The processor showed no error message.
I therefore used the DOUBLE INPUT QUERY shown below, which completely scrambled the columns but gave no error message either. When I explicitly list all the columns in the query, it works as expected.

So it seems that column order matters for a union, both in the processor and in the query. How should I order columns in OD so that I can use the UNION processor safely?

SELECT s.*
FROM secondInputTable s
UNION 
SELECT f.*
FROM firstInputTable f

I would never use UNION in Spark SQL: it matches columns by position, not by name, so it can mix up your columns as described.
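To make the positional matching concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Spark SQL, since both pair UNION columns by position; type handling may differ between the two engines. The table names are taken from the query above, while the columns id and name and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: same columns and types, but in a different order.
conn.execute("CREATE TABLE firstInputTable (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE secondInputTable (name TEXT, id INTEGER)")
conn.execute("INSERT INTO firstInputTable VALUES (1, 'alice')")
conn.execute("INSERT INTO secondInputTable VALUES ('bob', 2)")

# SELECT * on both sides: columns are paired by position, so ids and
# names from the two tables land in the same output column.
scrambled = conn.execute(
    "SELECT s.* FROM secondInputTable s "
    "UNION SELECT f.* FROM firstInputTable f"
).fetchall()

# Explicit column lists in the same order on both sides keep values aligned.
aligned = conn.execute(
    "SELECT s.id, s.name FROM secondInputTable s "
    "UNION SELECT f.id, f.name FROM firstInputTable f"
).fetchall()

print("scrambled:", scrambled)
print("aligned:  ", aligned)
```

Neither query raises an error, which matches what was observed: the engine has no way to know that the column order was unintended.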
In general, the UNION processor is fine to use, but keep in mind that it treats the left and right inputs differently.
From the documentation:

The processor has two input ports for the tables to be combined. The two input tables need to contain the same columns (name and representation type). However, the second (right input) table can contain additional columns, but they will be ignored.
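The behaviour described in that passage can be sketched in plain Python. This is only an illustration of a name-based union, not the processor's actual implementation: the function union_by_name, its error handling, and the sample data are hypothetical, and it assumes rows are simply appended without removing duplicates.

```python
def union_by_name(left_cols, left_rows, right_cols, right_rows):
    """Append right rows to left rows, matching columns by name.

    Extra right-hand columns are dropped; a column missing on the
    right raises ValueError (an assumption made for this sketch).
    """
    missing = [c for c in left_cols if c not in right_cols]
    if missing:
        raise ValueError(f"right table lacks columns: {missing}")
    # Position of each left column inside the right table.
    idx = [right_cols.index(c) for c in left_cols]
    # Reorder every right row into the left table's column order.
    reordered = [tuple(row[i] for i in idx) for row in right_rows]
    return list(left_rows) + reordered


result = union_by_name(
    ["id", "name"], [(1, "alice")],
    ["name", "extra", "id"], [("bob", "x", 2)],
)
print(result)
```

Because the right-hand rows are reordered by column name before being appended, a different column order in the second input does not scramble the result.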


Thanks, Kai! I guess this is a friendly reminder not to use the Spark SQL UNION.
The OD processor does indeed do the right thing; I just missed it because too few rows were shown in the result table.