My name is Todd Forsyth. I’m the Technical Lead at RSC LLC. We specialize in installing, hosting and managing ARCHIBUS IWMS systems. This is one of a series of posts on how to make that process easier if you’re doing something similar yourself.
I started out my technical life not as an ARCHIBUS developer, but as a technology consultant at a major technology consulting firm. In this role, I both installed packaged applications (including Oracle Applications and Kana Customer Service) and developed web-based applications and data warehouses from scratch.
One of the biggest surprises for me in digging into the ARCHIBUS application is how it uses database keys. And when I say “keys,” I mean the unique values that identify a record. Things like the Work Request Code (wr_id) in the Work Requests (wr) table.
I’ll talk about what I found surprising, the pros and cons I find with the ARCHIBUS approach, and give some practical advice on what you need to do to use these database keys intelligently.
First, a little background on database keys. As I said, the key is a single value that uniquely identifies each record. Things like a Work Request Code, or an Employee ID, or a Room Number in ARCHIBUS. In ADDITION to identifying records in their primary table, these values are ALSO used to tie a lot of data tables in your database together behind the scenes. For example, each employee record in the employee table also contains columns for Room Code, Floor Code and Building Code, so you’ll know where the employee sits.
Before joining RSC, all the applications and data warehouses I’d had a deep look at used what are called “surrogate keys.” That is, the key was always a made-up value, most often an integer number that had no intrinsic information stored in it. In such a system an employee record might have an employee_id value of ‘291716’. This number wouldn’t have anything to do with the real employee. It wouldn’t be their employee number or badge number, and CERTAINLY not their name. To find those, you’d have to use this number to look them up in the employee table.
By contrast, ARCHIBUS uses what are called “natural keys,” where the keys are meant to contain some information about the record they are the key to. An employee_id in such a system might be something like “TFORSYTH,” telling us something about the employee’s name.
It turns out that there are at least a couple of different schools of thought about the “right” way to build application databases. Some say that all primary keys should be surrogate, or arbitrary value, keys. Others find good reasons to use natural, meaningful data as a key. Let’s take a closer look at why both groups think they’re right:
The Pros and Cons of Surrogate, or “Arbitrary Value” Keys
This is the world I was used to; the way Oracle Applications, and SAP, and PeopleSoft all handle their databases. There are strengths to this approach, and challenges:
- The key has no intelligence built into it. You cannot derive any meaning from it, or any relationship between the surrogate key and the rest of the data columns in a row. If things change in a way which would require you to update the basic information about a record (say you want to re-number all your rooms, or institute a new employee numbering system), this can be done without changing this value in a host of tables. You simply change the meaningful value in the primary or “home” table where that value lives. You could just update the room number in the room table, for example. This sure makes it easier when these values need to be changed.
- Surrogate keys are usually integers, which typically require only 4 bytes to store (8 for a big integer), so the keys, and any database indexes which use them, will be smaller in size than their natural key counterparts. All a fancy way of saying that big queries with lots of tables tend to run faster with surrogate keys.
- If foreign key tables use surrogate keys then you will be required to have a join to retrieve the real foreign key value. (Meaning if you store the room id where an employee is seated in the employee table, you ALWAYS have to look up the “real” room number in the room table). You wouldn’t have to do this with natural keys. Some meaningful data (like a room number) would already be right there. If you needed to dig deeper, though, like getting the room name, you’d STILL need to go back to the room table to get it.
- So surrogate keys are not useful when searching for data, since they have no meaning. You have to go back to the primary table.
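To make the renumbering and join points concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The table and column names (rm, em, rm_key, rm_number) and all the sample values are made up for illustration; they are not taken from any real application’s schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rm (
        rm_key    INTEGER PRIMARY KEY,  -- surrogate key, carries no meaning
        rm_number TEXT NOT NULL         -- the "real" room number
    );
    CREATE TABLE em (
        em_key INTEGER PRIMARY KEY,     -- surrogate key
        name   TEXT NOT NULL,
        rm_key INTEGER REFERENCES rm(rm_key)
    );
    INSERT INTO rm VALUES (17, '101A');
    INSERT INTO em VALUES (291716, 'Todd Forsyth', 17);
""")


def room_of(conn, em_key):
    """Return an employee's room number; the join to rm is unavoidable."""
    row = conn.execute(
        "SELECT rm.rm_number FROM em JOIN rm ON rm.rm_key = em.rm_key "
        "WHERE em.em_key = ?",
        (em_key,),
    ).fetchone()
    return row[0]


before = room_of(conn, 291716)

# Renumbering the room touches exactly one row in one table.
changed = conn.execute(
    "UPDATE rm SET rm_number = '2101A' WHERE rm_key = 17"
).rowcount

after = room_of(conn, 291716)
print(before, changed, after)
```

The employee row never changes during the renumbering, because it only stores the meaningless value 17. The price is that every read of the room number goes through the join in room_of.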
The Pros and Cons of Natural or “Real Data” Keys
This is the type of key structure ARCHIBUS uses, and all of these strengths and weaknesses are those that ARCHIBUS is subject to:
- Since the keys store some useful data, you will usually require fewer joins/tables when writing a query. I’ve definitely found this to be true in ARCHIBUS. Often you don’t need to join to the building or employee table; it’s enough that you know the key value stored in the local table you’re looking at.
- Searches are easier because natural keys have meaning, and you don’t need to do so many joins to get to something meaningful.
- Much more work is required to change the value of the key. Changing a Building Code, or “bl_id” value, for example, requires that ARCHIBUS look in over 100 tables where this key might be stored. The ARCHIBUS applications are smart enough to make this change, but a developer who builds on top of ARCHIBUS must constantly keep this in mind, especially if these keys are being stored in custom tables or fields, or if such value updates happen OUTSIDE of ARCHIBUS logic (which can happen when those key values come in from outside, as through an Employee Sync with an HR system).
- Your primary key columns, and any indexes that look at them, will be larger because natural keys are usually strings, which take more space to store than “arbitrary” integers. Larger key columns and indexes mean queries that take longer to run. However, since ARCHIBUS databases are typically small in size, this isn’t usually a major concern. Some tuning may need to be done as the database grows, however.
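The trade-off between easy reads and expensive key changes can be sketched the same way, again in Python with sqlite3. The schema below is a drastically simplified stand-in, not the real ARCHIBUS schema: custom_badge is a hypothetical custom table, and rename_building and stale_rows are hypothetical helpers written for this example. This is not how ARCHIBUS itself performs the change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bl (bl_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE em (em_id TEXT PRIMARY KEY, bl_id TEXT);
    CREATE TABLE wr (wr_id INTEGER PRIMARY KEY, bl_id TEXT);
    -- Hypothetical custom table added by a developer.
    CREATE TABLE custom_badge (badge_no TEXT PRIMARY KEY, bl_id TEXT);
    INSERT INTO bl VALUES ('HQ', 'Headquarters');
    INSERT INTO em VALUES ('TFORSYTH', 'HQ');
    INSERT INTO wr VALUES (1001, 'HQ');
    INSERT INTO custom_badge VALUES ('B-77', 'HQ');
""")

# No join needed: the building code is already on the employee row.
building = conn.execute(
    "SELECT bl_id FROM em WHERE em_id = 'TFORSYTH'"
).fetchone()[0]


def rename_building(conn, old_id, new_id, tables):
    """Rewrite a building code in every listed table, all or nothing."""
    with conn:  # commits on success, rolls back if any UPDATE fails
        for table in tables:
            # Table names come from a trusted hard-coded list, not user input.
            conn.execute(
                f"UPDATE {table} SET bl_id = ? WHERE bl_id = ?",
                (new_id, old_id),
            )


def stale_rows(conn, old_id, tables):
    """Count the rows in each table that still carry the old code."""
    return {
        t: conn.execute(
            f"SELECT COUNT(*) FROM {t} WHERE bl_id = ?", (old_id,)
        ).fetchone()[0]
        for t in tables
    }


all_tables = ["bl", "em", "wr", "custom_badge"]

# Forgetting the custom table leaves an orphaned reference behind.
rename_building(conn, "HQ", "HQ1", ["bl", "em", "wr"])
leftover = stale_rows(conn, "HQ", all_tables)
print(building, leftover)
```

The leftover count for custom_badge is the failure mode to watch for: any table that stores the key but sits outside the list being updated keeps pointing at a building code that no longer exists.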
Using ARCHIBUS Keys Intelligently
Now, while the above has been an interesting exercise in the theoretical, it’s not a choice we really get to make in ARCHIBUS. ARCHIBUS uses Natural Keys. That said, this implies a few things you need to keep in mind in setting ARCHIBUS up:
- Don’t pretend you have Surrogate Keys – I’ve known clients who are absolutely SURE surrogate keys are the way to go, even in ARCHIBUS. They want to assign ONLY numbers to their building, employee, or department keys (or “Codes”.) This is counter-productive for a couple of reasons:
  - ARCHIBUS exposes the keys in places many other applications expose name or description fields. So if you want to have a clue what you’re looking at (who is this employee? Which building is this? Is this department Accounting or Legal?), you NEED to give this data some meaning.
  - You will find yourself customizing nearly EVERY form you use regularly to go look up the meaningful data you need from its primary table, where it’s being stored in the Description or Name field. This is a recipe for disaster when doing an upgrade. Save your customizations for things that really matter.
- Giving your data some meaning doesn’t mean keys are free-form text fields. You need to be VERY careful about what you put in here. You might want to think about subjecting your potential key value schemes to these four tests before calling them final:
  - Is the primary key unique? – My example above of using first initial, last name (TFORSYTH) as a primary key for the employee table is a scheme that DOESN’T pass this test, since two employees can easily share an initial and a last name. Adding the employee number might solve this: (TFORSYTH F2314)
  - Does it apply to all rows? – Are there some data points that just don’t fit the scheme? What if you have an employee with a last name like “Wolfeschlegelsteinhausenbergerdorff?” Probably a bad idea to use the WHOLE last name. Maybe X characters?
  - Is it minimal? Remember, big size is one of the problems with natural keys. Keep your values SHORT. ARCHIBUS does a good job of enforcing this through its default key field sizes.
  - Is it stable over time? This one is the real kicker. Can you GUARANTEE that these values will never change? Of COURSE you can’t. But if they change ALL THE TIME, you probably need to look harder for something to use as a key.
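The first three tests are easy to run against a sample of your real data before you commit to a scheme. Here is a rough Python sketch of how that might look. The function names (make_em_key, check_keys), the 8-character cut-off for last names, the 16-character size limit and the sample people are all assumptions made up for this example; check the actual field sizes in your own schema. The fourth test, stability over time, can’t be checked from a single snapshot of the data.

```python
from collections import Counter


def make_em_key(first_name, last_name, employee_no, max_last=8):
    """Hypothetical scheme: first initial + truncated last name + employee number."""
    stem = (first_name[:1] + last_name[:max_last]).upper()
    return f"{stem} {employee_no}"


def check_keys(keys, max_len):
    """Report keys failing the uniqueness, applies-to-all-rows and size tests."""
    counts = Counter(keys)
    return {
        "duplicates": sorted(k for k, n in counts.items() if n > 1),
        "blank": sum(1 for k in keys if not k.strip()),
        "too_long": sorted(k for k in counts if len(k) > max_len),
    }


# Made-up sample data, including a deliberately awkward last name.
people = [
    ("Todd", "Forsyth", "F2314"),
    ("Tina", "Forsyth", "F9001"),
    ("Hubert", "Wolfeschlegelsteinhausenbergerdorff", "F0042"),
]

# Scheme 1: first initial + whole last name.
name_only = [(first[:1] + last).upper() for first, last, _ in people]
# Scheme 2: truncated name plus employee number.
with_number = [make_em_key(first, last, no) for first, last, no in people]

name_only_report = check_keys(name_only, max_len=16)
with_number_report = check_keys(with_number, max_len=16)
print(name_only_report)
print(with_number_report)
```

With this sample, the name-only scheme reports TFORSYTH as a duplicate and the untruncated long name as oversized, while the truncated name plus employee number passes all three checks.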
I hope you’ve found this look at ARCHIBUS keys useful and informative, and that it can help inform the way you set up and use them in your system.