Does Spark or One Data autocast in the CONCAT_WS function?

  • Do OD or Spark autocast non-string values to string when using the CONCAT_WS function?

  • Also, where can I learn more about the autocast behavior?

I can use the Spark CONCAT_WS function in a query processor with arguments of different non-string types:

  SELECT CONCAT_WS(' ,', i.a, i.b, i.c) AS my_new_string
  FROM InputTable i

where a is of type BigInt, b is of type Datetime, and c is of type string.
The Spark documentation, however, states:

Returns the concatenation of the strings separated

Motivation: I need to know whether SHA2(my_new_string, 256) can create unwanted collisions due to non-injective behavior in the autocast.

Yes, non-string types will be autocast (where applicable).

If you want a specific (and lossless) conversion, you can use dedicated conversion functions (like date_format for timestamps). Floating-point numbers are tricky here: you will always have some potential for loss, but these types are lossy in and of themselves.
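As a rough sketch of what such explicit conversions could look like for the query from the question: the pattern string and the alias my_hash are illustrative assumptions, not something prescribed by Spark or One Data. date_format renders the timestamp in the session time zone, and the accepted pattern letters differ between Spark versions, so the pattern should be checked against the version in use.

```sql
-- Sketch only: make every conversion explicit instead of relying on autocast.
SELECT
  SHA2(
    CONCAT_WS(
      ' ,',
      CAST(i.a AS STRING),                              -- BigInt: explicit cast to string
      DATE_FORMAT(i.b, 'yyyy-MM-dd HH:mm:ss.SSSSSS'),   -- Datetime: fixed, explicit pattern (assumed)
      i.c                                               -- already a string
    ),
    256
  ) AS my_hash                                          -- hypothetical alias
FROM InputTable i
```

With the format pinned in the query itself, the hash input no longer depends on whatever string representation the implicit cast happens to produce.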

The autocasting will basically use the (Java) data type’s toString() functionality. So the chain to get to the information you are looking for is: map the Spark data type to its Java data type, then check the toString() behavior of that Java data type.
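One way to check this empirically, rather than reading it off the Java types, might be to cast a few sample values to string and look at the result. This is a minimal sketch; the literal values and column aliases are made up for illustration, and the output can depend on the Spark version and session settings.

```sql
-- Sketch only: inspect the string form produced for each type of interest.
SELECT
  CAST(CAST(42 AS BIGINT) AS STRING)              AS bigint_as_string,
  CAST(TIMESTAMP '2021-03-04 05:06:07' AS STRING) AS timestamp_as_string,
  CAST(CAST(0.1 AS DOUBLE) AS STRING)             AS double_as_string
```

Comparing these strings with the output of CONCAT_WS on the same values shows whether the implicit and explicit conversions agree in a given environment.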