RK_ACC_NR_16
which is a CHAR(16)
. You are trying to build a recommendation engine, so naturally, you need to define a list of relevant customers in your product. This is done by a script that fetches a unique list of RK_ACC_NR_16
IDs from the data warehouse, perhaps including some additional filter conditions. This script is the basis for all other scripts to follow and the column RK_ACC_NR_16
is referenced everywhere in your code and in your table schemas.CHAR
data type to store the customer ID when a 4 Byte INTEGER
would have been the obvious and more efficient choice. They may also wonder why you chose a weird name like RK_ACC_NR_16
instead of the more obvious CUSTOMER_ID
. Of course, you had a good reason for these two choices, but it requires everybody who ever looks at your code to be familiar with the way customer IDs are implemented in your data warehouse. If the implementation in the data warehouse ever changes, for example due to data privacy laws being passed that require the customer ID to be encrypted, you will need to change the table schemas and queries in hundreds of different places across your code and database.RK_ACC_NR_16
in the data warehouse and assigns a unique CUSTOMER_ID
to every new RK_ACC_NR_16
. If changes are made to the data warehouse, you don't need to change every single part of your product. All you need to do is to update the customer management component in order to deal with whatever the data warehouse engineers come up with. As long as whatever they develop can still be mapped to your integer type CUSTOMER_ID
, you're good.WHERE
statement that encode the constraints demanded by the interface definition. For example, if the interface is for exchanging product information, you can add a filter PRODUCT_PRICE >= 0
or IS_PRODUCT_AVAILABLE IN ('y', 'n')
. This means that even if somebody sends you bullshit data, it will not enter your data pipeline where it can potentially break something. It will already be filtered out at the very beginning.