RK_ACC_NR_16which is a
CHAR(16). You are trying to build a recommendation engine, so naturally, you need to define a list of relevant customers in your product. This is done by a script that fetches a unique list of
RK_ACC_NR_16IDs from the data warehouse, perhaps including some additional filter conditions. This script is the basis for all other scripts to follow and the column
RK_ACC_NR_16is referenced everywhere in your code and in your table schemas.
CHARdata type to store the customer ID when a 4 Byte
INTEGERwould have been the obvious and more efficient choice. They may also wonder why you chose a weird name like
RK_ACC_NR_16instead of the more obvious
CUSTOMER_ID. Of course, you had a good reason for these two choices, but it requires everybody who ever looks at your code to be familiar with the way customer IDs are implemented in your data warehouse. If the implementation in the data warehouse ever changes, for example due to data privacy laws being passed that require the customer ID to be encrypted, you will need to change the table schemas and queries in hundreds of different places across your code and database.
RK_ACC_NR_16in the data warehouse and assigns a unique
CUSTOMER_IDto every new
RK_ACC_NR_16. If changes are made to the data warehouse, you don't need to change every single part of your product. All you need to do is to update the customer management component in order to deal with whatever the data warehouse engineers come up with. As long as whatever they develop can still be mapped to your integer type
CUSTOMER_ID, you're good.
WHEREstatement that encode the constraints demanded by the interface definition. For example, if the interface is for exchanging product information, you can add a filter
PRODUCT_PRICE >= 0or
IS_PRODUCT_AVAILABLE IN ('y', 'n'). This means that even if somebody sends you bullshit data, it will not enter your data pipeline where it can potentially break something. It will already be filtered out at the very beginning.