Hi,
Is there any way to check if a column exists in Spark SQL within a Query or an Extended Mathematical operation?
Ideally, I’d like to do the following:
IF not_exists(column_name) THEN
COALESCE(column_name, "")
-- column_name exists
ELSE
column_name
No Python or R must be used for this.
what about using a Describe and checking if the column you need exists, then using an if/else processor?
1 Like
Yes, that sounds like a good idea. Or a Double Input Query, getting the original data table as first input table and the DESCRIBED version as the second input.
Then, checking if the column_name is contained in the columns of the DESCRIBED version.
Attached a Service WF (via Integrated WF Processor) I always use to check if incoming data have missing columns (using an anti-join), and if so create them as NULL columns to be able to Union to the existing data without schema mismatch.
Inputs are the old data and the new data.
Probably you can use this or/and adapt it for your needs.
JE9EQ0xCUkQkeyJvbmVkYXRhVmVyc2lvbiI6eyJjbGllbnRWZXJzaW9uIjoiMS4yMTMuMSIsInNlcnZlclZlcnNpb24iOiI1Mi40LjQiLCJidW5kbGVWZXJzaW9uIjoiNS4yLjAifSwicmVwb3J0cyI6W10sIndvcmtmbG93Ijp7ImlkIjoiNjEwMGRjNTYtNDFkMS00OTFkLWIxN2MtYTdjNzgxMjMyNDg0IiwidmVyc2lvbk51bWJlciI6MSwibmFtZSI6IltTRVJWSUNFXSBBZGQgbWlzc2luZyBjb2x1bW5zIHdpdGggTlVMTCBiZWZvcmUgVW5pb24iLCJ0YWdzIjpbIk1TIl0sImRlc2NyaXB0aW9uIjoiIiwibm90ZXMiOm51bGwsImNyZWF0ZWRBdCI6IjIwMjItMDktMjFUMTQ6MzI6MjIuMzE3WiIsInVwZGF0ZWRBdCI6IjIwMjItMDktMjFUMTQ6MzI6MjQuMDE4WiIsInByb2plY3RzIjpbeyJpZCI6ImFkM2VkZWNkLWFjYWQtNDNmYi1iNjcyLTc2ZDUzMDQ3ZWI0ZiIsInByb2plY3RUeXBlIjoiVVNFX0NBU0VTIiwib3duZXIiOiIzNTY2ZTQ0Mi1kNjFiLTQ2ZTktYTRmMC1kMDY3Y2M5YjRlZTUiLCJuYW1lIjoiQ0RMUCIsImRlc2NyaXB0aW9uIjpudWxsLCJub3RlcyI6bnVsbCwidGFncyI6W10sImNyZWF0ZWRBdCI6IjIwMjItMDktMDVUMTI6MzM6MDcuNzcyWiIsInVwZGF0ZWRBdCI6IjIwMjItMDktMTlUMjA6MTA6NTIuODY5WiIsInNlY3VyZWQiOmZhbHNlLCJvd25lckluZm9ybWF0aW9uIjpudWxsfV0sIm5vZGVzIjp7ImRmMGE5Y2MzLWU0NGMtNGY0Zi1hM2FlLTQyODI4MTk1MjMxMyI6eyJpZCI6ImRmMGE5Y2MzLWU0NGMtNGY0Zi1hM2FlLTQyODI4MTk1MjMxMyIsInByb2Nlc3NvciI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMTYzLTAwMDAwMDAwMDAwMSIsImNvbmZpZyI6eyJmaWx0ZXIiOlt7Im5hbWUiOiJ3aGVyZV9zdGF0ZW1lbnQifV0sInNlbGVjdCI6W3sibmFtZSI6InNlbGVjdF9zdGF0ZW1lbnQifV0sInRhYmxlbmFtZSI6IiJ9LCJyZXN1bHQiOm51bGwsIm9wdGlvbnMiOnsiY29uZmlnRWRpdEV4cGVydE1vZGVFbmFibGVkIjpmYWxzZSwieCI6OTk4LCJ5Ijo0NDcsIm5hbWUiOiJRdWVyeSBIZWxwZXIiLCJjb2xvciI6IiMwMDk3YTcifSwib3V0Ijp7ImRhdGEiOnsiNmU4ZmYzYzYtY2MxZi00OTgwLWJkNTgtY2FjNjk5NmQxN2U0IjoidG9SZXR1cm4ifX19LCIwOGE0YmY0Zi05Y2MyLTRlZDUtOTNjMi05Y2YyNGUxN2MzNWEiOnsiaWQiOiIwOGE0YmY0Zi05Y2MyLTRlZDUtOTNjMi05Y2YyNGUxN2MzNWEiLCJwcm9jZXNzb3IiOiIwMDAwMDAwMC0wMDAwLTAwMDAtMDAwMy0wMDAwMDAwMDAwMDEiLCJjb25maWciOnsidGFibGVuYW1lIjoiIiwicXVlcnkiOiJkZXNjcmliZSBpbnB1dFRhYmxlICJ9LCJyZXN1bHQiOm51bGwsIm9wdGlvbnMiOnsiY29uZmlnRWRpdEV4cGVydE1vZGVFbmFibGVkIjpmYWxzZSwieCI6MTAwOCwieSI6LTE1OSwibmFtZSI6IlF1ZXJ5IChkZXNjcmliZSB0byBnZXQgY29sdW1uIG5hbWVzIGluIG9uZSBjb2x1bW4pIiwiY29sb3IiOiIjMDA5N2E3In0sIm91dCI6eyJkYXRhIjp7IjY4OWZhODU4LTAwMTQtNDcwMC04NGI0LTE3MDRjOTE2Y2FjMSI6InRhYmxlMiJ9fX0sIjZlOGZmM2M2LWNjMWYtNDk4MC1iZDU4LWNhYzY5OTZkMTdlNCI6eyJpZCI6IjZlOGZmM2M2LWNjMWYtNDk4MC1iZDU4LWNhYzY5OTZkMTdlNCIsInByb2Nlc3NvciI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDk2LTAwMDAwMDAwMDAwMSIsImNvbmZpZyI6eyJ0YWJsZW5hbWUiOiIiLCJpZGVudGlmaWVyIjoib3V0cHV0In0sInJlc3VsdCI6bnVsbCwib3B0aW9ucyI6eyJjb25maWdFZGl0RXhwZXJ0TW9kZUVuYWJsZWQiOmZhbHNlLCJ4IjoxMDAxLCJ5Ijo2MDIsIm5hbWUiOiJNaWNyb3NlcnZpY2UgT3V0cHV0IChvdXRwdXQpIiwiY29sb3IiOiIjZTkxZTYzIn0sIm91dCI6e319LCI2ODlmYTg1OC0wMDE0LTQ3MDAtODRiNC0xNzA0YzkxNmNhYzEiOnsiaWQiOiI2ODlmYTg1OC0wMDE0LTQ3MDAtODRiNC0xNzA0YzkxNmNhYzEiLCJwcm9jZXNzb3IiOiIwMDAwMDAwMC0wMDAwLTAwMDAtMDEwMy0wMDAwMDAwMDAwMDEiLCJjb25maWciOnsidGFibGVuYW1lIjoiIiwicXVlcnkiOiJTRUxFQ1QgXG4gICAgZi5jb2xfbmFtZSBcbkZST00gZmlyc3RJbnB1dFRhYmxlIGYgTEVGVCBBTlRJIEpPSU4gc2Vjb25kSW5wdXRUYWJsZSBzIE9OIGYuY29sX25hbWUgPSBzLmNvbF9uYW1lICJ9LCJyZXN1bHQiOm51bGwsIm9wdGlvbnMiOnsiY29uZmlnRWRpdEV4cGVydE1vZGVFbmFibGVkIjpmYWxzZSwieCI6NTU3LCJ5Ijo5MywibmFtZSI6IkRvdWJsZSBJbnB1dCBRdWVyeSAoTEVGVCBBTlRJIEpPSU4gdG8gZ2V0IGNvbHVtbiBuYW1lcyBvZiB2YWx1ZXMgbm90IGluIHRoZSBjdXJyZW50IEFQSSByZXRyaWV2YWwpIiwiY29sb3IiOiIjMDA5N2E3In0sIm91dCI6eyJkYXRhIjp7ImM0YWQ3ZGRjLWZkNWMtNDQ2OS1iMjJmLTcyOGVmMzA5MWQxNCI6ImRhdGEifX19LCJjNGFkN2RkYy1mZDVjLTQ0NjktYjIyZi03MjhlZjMwOTFkMTQiOnsiaWQiOiJjNGFkN2RkYy1mZDVjLTQ0NjktYjIyZi03MjhlZjMwOTFkMTQiLCJwcm9jZXNzb3IiOiIwMDAwMDAwMC0wMDAwLTAwMDAtMDAwMy0wMDAwMDAwMDAwMDEiLCJjb25maWciOnsidGFibGVuYW1lIjoiIiwicXVlcnkiOiJTRUxFQ1RcbiAgICBjb25jYXQoY29uY2F0X3dzKCcsJywgJyonLCBjb2xsZWN0X2xpc3Qoai5udWxsZWRfY29scykpKSBBUyBzZWxlY3Rfc3RhdGVtZW50LFxuICAgICcxPTEnIEFTIHdoZXJlX3N0YXRlbWVudFxuICAgIEZST01cbiAgICAgICAgKFNFTEVDVCBcbiAgICAgICAgICAgIGNvbmNhdCgnTlVMTCBBUyAnLCBpLmNvbF9uYW1lKSBBUyBudWxsZWRfY29scyBcbiAgICAgICAgRlJPTSBpbnB1dFRhYmxlIGkpIGpcbiAgICAgICAgXG4gICAgICAgIFxuIn0sInJlc3VsdCI6bnVsbCwib3B0aW9ucyI6eyJjb25maWdFZGl0RXhwZXJ0TW9kZUVuYWJsZWQiOmZhbHNlLCJ4Ijo1NTksInkiOjI1MSwibmFtZSI6IlF1ZXJ5IChwcmVwYXJlIHF1ZXJ5IGZvciBjcmVhdGluZyBOVUxMIGNvbHVtbnMpIiwiY29sb3IiOiIjMDA5N2E3In0sIm91dCI6eyJkYXRhIjp7ImRmMGE5Y2MzLWU0NGMtNGY0Zi1hM2FlLTQyODI4MTk1MjMxMyI6InRhYmxlMiJ9fX0sImQ1NGE5ODIwLWQ4NjgtNDk0NS1hODQ0LTM0OTYyNWY0ZTUzOCI6eyJpZCI6ImQ1NGE5ODIwLWQ4NjgtNDk0NS1hODQ0LTM0OTYyNWY0ZTUzOCIsInByb2Nlc3NvciI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDk3LTAwMDAwMDAwMDAwMSIsImNvbmZpZyI6eyJpZGVudGlmaWVyIjoiaW5wdXRfc2F2ZWRfc2NoZW1hIiwidGFibGVuYW1lIjoiIiwiaWdub3JlU2NoZW1hIjpmYWxzZSwiaW5wdXRUYWJsZSI6eyJoZWFkZXJzIjpbeyJuYW1lIjoiY29sX25hbWUiLCJ1bml0IjoiMjc2MjYxMDctNjg1Yi00MWIwLWFjMDctYzZiY2FlNjhlM2EyIn0seyJuYW1lIjoiZGF0YV90eXBlIiwidW5pdCI6IjI3NjI2MTA3LTY4NWItNDFiMC1hYzA3LWM2YmNhZTY4ZTNhMiJ9LHsibmFtZSI6ImNvbW1lbnQiLCJ1bml0IjoiZGU0MzgzMTQtNGYwNy00N2U2LTk2NWEtNTVlM2YxY2IxOTFkIn1dLCJyb3dzIjpbWyJBIiwic3RyaW5nIiwiIl0sWyJCIiwic3RyaW5nIiwiIl0sWyJDIiwic3RyaW5nIiwiIl0sWyJEIiwic3RyaW5nIiwiIl1dfX0sInJlc3VsdCI6bnVsbCwib3B0aW9ucyI6eyJjb25maWdFZGl0RXhwZXJ0TW9kZUVuYWJsZWQiOmZhbHNlLCJ4IjotMTEyLCJ5IjotMjkzLCJuYW1lIjoiTWljcm9zZXJ2aWNlIElucHV0IChpbnB1dF9zYXZlZF9zY2hlbWEpIiwiY29sb3IiOiIjZTkxZTYzIn0sIm91dCI6eyJpbnB1dERhdGEiOnsiOTQ0YjBkNGUtNDlmNy00MzgwLWFjODctZTc2YmQyMGU2ZGJmIjoiZGF0YSJ9fX0sImQ4MTQ1ZDMxLTI0NmQtNDEzNS05MTIyLTg4NzNjYzI4NTEyYSI6eyJpZCI6ImQ4MTQ1ZDMxLTI0NmQtNDEzNS05MTIyLTg4NzNjYzI4NTEyYSIsInByb2Nlc3NvciI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDk3LTAwMDAwMDAwMDAwMSIsImNvbmZpZyI6eyJpZGVudGlmaWVyIjoiaW5wdXRfaW5jb21pbmdfc2NoZW1hIiwidGFibGVuYW1lIjoiIiwiaWdub3JlU2NoZW1hIjpmYWxzZSwiaW5wdXRUYWJsZSI6eyJoZWFkZXJzIjpbeyJuYW1lIjoiY29sX25hbWUiLCJ1bml0IjoiMjc2MjYxMDctNjg1Yi00MWIwLWFjMDctYzZiY2FlNjhlM2EyIn0seyJuYW1lIjoiZGF0YV90eXBlIiwidW5pdCI6IjI3NjI2MTA3LTY4NWItNDFiMC1hYzA3LWM2YmNhZTY4ZTNhMiJ9LHsibmFtZSI6ImNvbW1lbnQiLCJ1bml0IjoiZGU0MzgzMTQtNGYwNy00N2U2LTk2NWEtNTVlM2YxY2IxOTFkIn1dLCJyb3dzIjpbWyJBIiwic3RyaW5nIiwiIl0sWyJCIiwic3RyaW5nIiwiIl0sWyJDIiwic3RyaW5nIiwiIl0sWyJEIiwic3RyaW5nIiwiIl1dfX0sInJlc3VsdCI6bnVsbCwib3B0aW9ucyI6eyJjb25maWdFZGl0RXhwZXJ0TW9kZUVuYWJsZWQiOmZhbHNlLCJ4IjoxMzc2LCJ5IjotMzAwLCJuYW1lIjoiTWljcm9zZXJ2aWNlIElucHV0IChpbnB1dF9pbmNvbWluZ19zY2hlbWEpIiwiY29sb3IiOiIjZTkxZTYzIn0sIm91dCI6eyJpbnB1dERhdGEiOnsiZGYwYTljYzMtZTQ0Yy00ZjRmLWEzYWUtNDI4MjgxOTUyMzEzIjoidGFibGUxIiwiMDhhNGJmNGYtOWNjMi00ZWQ1LTkzYzItOWNmMjRlMTdjMzVhIjoiZGF0YSJ9fX0sIjk0NGIwZDRlLTQ5ZjctNDM4MC1hYzg3LWU3NmJkMjBlNmRiZiI6eyJpZCI6Ijk0NGIwZDRlLTQ5ZjctNDM4MC1hYzg3LWU3NmJkMjBlNmRiZiIsInByb2Nlc3NvciI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAzLTAwMDAwMDAwMDAwMSIsImNvbmZpZyI6eyJ0YWJsZW5hbWUiOiIiLCJxdWVyeSI6ImRlc2NyaWJlIGlucHV0VGFibGUgIn0sInJlc3VsdCI6bnVsbCwib3B0aW9ucyI6eyJjb25maWdFZGl0RXhwZXJ0TW9kZUVuYWJsZWQiOmZhbHNlLCJ4IjotMTA5LCJ5IjotMTU0LCJuYW1lIjoiUXVlcnkgKGRlc2NyaWJlIHRvIGdldCBjb2x1bW4gbmFtZXMgaW4gb25lIGNvbHVtbikiLCJjb2xvciI6IiMwMDk3YTcifSwib3V0Ijp7ImRhdGEiOnsiNjg5ZmE4NTgtMDAxNC00NzAwLTg0YjQtMTcwNGM5MTZjYWMxIjoidGFibGUxIn19fX0sIm5vZGVOb3RlcyI6e30sIm93bmVyIjoiMzU2NmU0NDItZDYxYi00NmU5LWE0ZjAtZDA2N2NjOWI0ZWU1IiwicmVzb3VyY2VUeXBlIjoiV09SS0ZMT1ciLCJncm91cHMiOltdLCJhdXhpbGlhcmllcyI6W10sInZhcmlhYmxlcyI6W10sInJlY2VudEV4ZWN1dGlvbiI6bnVsbCwiam9iTmFtaW5nUnVsZSI6IiNAc3lzX2pvYl92ZXJzaW9uQCIsIm93bmVySW5mb3JtYXRpb24iOnsidHlwZSI6InVzZXIiLCJvd25lcklkIjoiMzU2NmU0NDItZDYxYi00NmU5LWE0ZjAtZDA2N2NjOWI0ZWU1IiwiZmlyc3ROYW1lIjoiTWF4aW1pbGlhbiIsImxhc3ROYW1lIjoiU2NocmltbCJ9LCJob21lUHJvamVjdCI6bnVsbCwicGVybWlzc2lvbnMiOlsiUkVBRCIsIldSSVRFIiwiQ1JFQVRFIiwiREVMRVRFIiwiRVhFQ1VURSIsIlNIQVJFIl0sInNpbmtzIjpbXX19
1 Like
Thanks! This was very useful to solve our use case with @ThomasZ , where we had to create columns in the secondInputTable of a UNION, with names matching the ones of the firstInputTable, in case those are missing.
Only sidenote: Instead of generating NULL values with default type “String”, I realized we actually had to generate default values of 0.0, so that missing columns get assigned “Double” as a Data Type and our UNION can work.
In the “Query (prepare query for creating NULL columns)” processor, this is the change I applied:
SELECT
concat(concat_ws(',', '*', collect_list(j.nulled_cols))) AS select_statement,
'1=1' AS where_statement
FROM
(SELECT
--CONCAT("CAST(", i.col_name ,") as string)") AS nulled_cols
concat('0.0 AS ', i.col_name) AS nulled_cols
FROM inputTable i) j