I am trying to find a way to build a check in a ONE DATA workflow to find out if a string respects the format mm/dd/yyyy (i.e., it’s a valid date). If the string does not respect this format, then an error statement should be thrown.
Does anyone know of a way to do that? Bear in mind that the input string may also be NULL.
CAST(to_date( i.date_col , "MM/dd/yyyy") AS STRING) as date_col_parsed,
if( i.date_col IS null, 1, 0) AS was_null_in_input,
if(i.date_col IS NOT null AND to_date( i.date_col , "MM/dd/yyyy") IS NOT null, 1, 0) AS was_valid_timestamp
FROM inputTable i
adds (next to the parsed and re-stringified date) two other columns, one indicating whether the input was null and the other indicating whether or not a valid date (following above pattern) was encountered.
Please note that in your example (mm/dd/yyyy) mm is interpreted as 2-digit minutes in Java. You need to use the upper case version MM instead (see my query).
The query is created for transformation but the indicator columns can easily be used (or adapted) in an Assert processor.