A quick way to check if you have this type of problem comes with the TRIM()
function in SQL or trimws()
in R. These functions allow you to remove leading and trailing white spaces. Simply check if TRIM(MY_COLUMN) = MY_COLUMN
. If this evaluates to FALSE at any point, you have leading or trailing white spaces in one of your text fields. Please note that this doesn't work with all databases, because they will sometimes ignore the leading and trailing spaces. If you want to be on the safe side, check if LENGTH(TRIM(MY_COLUMN)) = LENGTH(MY_COLUMN)
. Alternatively, you can check if the string satisfies a regular expression like '^[ ]+'
or '[ ]+#x27;
. Be aware that the space character U+0020
is not the only type of whitespace out there. In text data, you will often find the non-breaking space character U+00A0
, the tab character U+0009
or even more obscure characters. Therefore, it is safer to use regular expressions like '^[\s]+'
or '[\s]+#x27;
.