I suggest that we add temporal foreign keys to relational database design.
Programs move data. Databases store data in a consistent fashion. These different purposes can lead to different organizing principles. One of the key reasons for Object Oriented structures is to minimize cost and complexity when things change over time. This creates a temporal relationship between various designs, a relationship that is powerfully supported by object oriented development.
However, while programs hold functions that change over time, we don’t have a good structure for isolating complexity caused by data that changes over time. And so I ask the question: How do we begin to create the notion, in the data storage layer, that data can hold a temporal relationship with other data.
I don’t mean the notion that a data record would have a “last updated date”. I mean the notion that a data table may contain a foreign key to another table, where a record keeps the values in the related table that existed on a particular date, even if the data in the related table is later changed.
For example: let’s say that company A sells products. Their products are P1 and P2, and they sell for $20 each. Now, company B makes an order for product P1. In current RDBMS systems, we actively copy the price at the time of the order from the ‘products’ table to the ‘purchase order details’ table because the price could change later, and we want to remember the price on the date it was made.
However, this is a workaround. The fact is that the purchase order has a temporal relationship with the products table… a relationship that the notion of RDBMS cannot handle… so we copy fields around. The decision of what fields to copy belongs to the ‘purchase order details’ table, and the code becomes complex by the notion that specific fields have to be selected from that table instead of the products table when evaluating a product. It’s a kind of “overlay”. The relationship says: pick fields from the related table unless a field by that name happens to exist in the current table.
Should this relationship be defined by the ‘order details’ table? Shouldn’t the owner of the data (the products table) decide what columns to expose as “time related” while other columns are not? I submit that this would put responsibilities where they belong and reduce the complexity of the data systems themselves.
I suggest this innovative feature for RDBMS systems: a temporal foreign key. The owner of a table indicates what fields are likely to change frequently and would have their data kept in a temporal structure, while other fields are not temporal (like a relationship with the bill of materials used for new shipments). Then, when a foreign key is made, by placing the product id into the ‘order details’ table, the date of the relationship is noted. Temporal field values are fixed at the value in place at that time. No reason to copy fields to another table.
The code in the calling systems would be much simpler, as would the designs of the databases themselves. The complexity of the database system would increase, but not moreso than other forms of referential integrity.
It is time to consider this kind of relationship as an innovation to the now 30 year old basic notions of relational databases.
5 thoughts on “Considering: Temporal database relationships”
Very very interesting…
You speak about database creators implementing "real" structures.
But a problem arises : how we, the developer will handle that ?
Otherwise, is a brilliant idea!
In your example, the Price stored in the ‘Products’ table should be stored in another ‘ProductPrices’ table. The ‘ProductPrices’ table should have a ProductID, Price, and effective date range for that price.
There is no need to get fancy with "temporal foreign keys."
This has been done. It’s just that there are no "approved" ways of doing it, no patterns, no standards.
I almost saw this implemented 10 – 13 years ago in FoxPro, we designed it, but it wasn’t something the customer wanted to pay for.
By only storing sale price in the invoice, a small handful of data is being lost. Especially if a price is set high enough that NO items are purchased at that price. The price that was "too high" gets lost because it was never recorded.
Also, if there is a gap in sales data, what is lost is the day the price was altered. All we know is that Item A was purchased on the 1st for $10, and it was purchased on the 20th for $12. Question, when was the price changed?
The more I think of it, the more I’m almost willing to bet that SAP has implemented this in some of their modules.
thanks to Omen for the book link. I’m actually glad that this isn’t an original idea. I would hate to think of the state of db design if a geek who isn’t involved in Big Thinking on db design came up with something this important as an original thought.
To Joe: yes, you can create structures in tables to OVERCOME the complete lack of support in RDBMS systems. I’m asking for a change in the RDBMS systems so that we don’t need really clever people to have to do that.
To Malcolm: The notion of temporal data would have to include the historical aspect of change across time even if there is no referencing key. This is a good point. It is in alignment with my suggestion, not in conflict with it.
Methinks I have a book to read.