Assert statement for data type of one column


Let’s suppose I have a very simple table, containing one column with data type “string”.

column_name | type
my_column | string

I would like to create an assert statement in an “Assert” processor to check if my_column is of data type “string”.

How could I do that with Spark SQL? I know about the DESCRIBE statement, but I can’t find a way to check its output within an assert statement.

Important: Since I am working with the “Quality App” in ONE DATA Cartography, I cannot use multiple processors. I must put all of the code within one single Assert processor.

EDIT: With either of the following statements, I am able to get the data type of a single column, but I am still not able to verify that data type in the same processor:

describe (SELECT i.my_column FROM inputTable i)
describe inputTable my_column

Solution by @maximilian.schriml: the “typeof” function returns the data type of its argument, so applying it to a column yields the same type for every row (see Spark SQL, Built-in Functions).
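
As a minimal sketch of how “typeof” behaves, the query below runs it against an inline table instead of inputTable. The alias t and the sample values are made up for illustration; only typeof itself comes from the solution above. Every row should report the type string.

```sql
-- Hypothetical inline data standing in for inputTable; my_column holds strings.
SELECT typeof(my_column) AS my_column_type
FROM VALUES ('a'), ('b') AS t(my_column);
```

Because the type belongs to the column and not to the individual values, all rows return the same result, which is why the checks further down only need to look for any row whose type differs from the expected one.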

Therefore, to get the data type of one column and then verify the returned type, the following code does the job in the ONE DATA Cartography - Data Quality app within one single processor:

Code Data Quality app - Generic Rule:
SELECT DISTINCT typeof(i.column_1) AS type_column_1 FROM inputTable i GROUP BY type_column_1 HAVING type_column_1 != 'string'

Code for Assert statement in a processor:

SELECT COUNT(*) = 0 FROM (SELECT DISTINCT typeof(column_1) AS type_column_1 FROM inputTable
GROUP BY type_column_1
HAVING type_column_1 != 'string') AS i
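
A possibly simpler variant, offered only as a sketch and not tested in the Quality App: since the assert only needs to know whether any row has an unexpected type, the DISTINCT / GROUP BY / HAVING subquery could be replaced by a plain WHERE filter. It assumes the same inputTable and column_1 names as above.

```sql
-- Evaluates to true when no row of column_1 has a type other than string.
-- Like the statement above, it is also true for an empty inputTable.
SELECT COUNT(*) = 0
FROM inputTable
WHERE typeof(column_1) != 'string'
```

Both forms return a single boolean, so either one should fit into one Assert processor.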