CREATE TABLE OrgChart( Business nvarchar(255), OperatingSegment nvarchar(255), ReportingSegment nvarchar(255), Company nvarchar(255) )My knee-jerk problems complaints against storing the data like this:
- No constraints! - Anything can be entered for any of the fields so we might end up with "texas" and "TX" for state.
- Data is repeated all over the place! - Why would I want to list "United States" for each record?
- No primary key!? - You are searching across nvarchars for each query?!
- What if I want to be able to persist more data about one of the properties, such as add a description field for a given reporting segment. If I were to follow the same pattern, I would just add another column to the OrgChart table named ReportingSegmentDescription. You can just use you imagination on how disgusting that would be.
- The data is LENDING ITSELF TO BEING STORED AS A TREE. It is a hierarchy for pity's sake. Fix it! Fix it!
So. Long story short, I created a much more elegant solution that I thought would solve all these problems and give us the flexibility to expand the data in every direction! Observe:
CREATE TABLE OrganizationType( Id int not null identity primary key, Name varchar(255) not null Description varchar(255) not null ) CREATE TABLE Organization( Id int not null identity primary key , Name varchar(255) not null, Description varchar(255) ParentId int, OrganizationTypeId int not null, CONSTRAINT FK_OrgType FOREIGN KEY OrganizationTypeId REFERENCES OrganizationType(Id) )As you can see this more or less solves all of our problems, doesn't it? We can do all sorts of things with the data in this tree form that would have been highly impractical before, right? This is better, right!? RIGHT!? Sadly, no. Not right. You can imagine my disappointment when I experienced how dreadfully slow reading the tree structure was. Think about it, if I want to read the entire tree using NHibernate, I have to recursively execute queries against the database for each node. Where the original less-elegant data structure was able to pull down the data and build a tree virtually instantly, the new more elegant solution took a good five seconds. FIVE SECONDS. Now, there are likely several things I could do to improve performance such as:
- Don't use NHibernate.
- Create a Stored Procedure that utilizes a CTE to recursively perform the tree traversal in SQL But.. think about it, if I do this, what data is the sproc going to return? THE FLAT DATA TABLE WE ORIGINALLY HAD...
Now, we are left with a dissatisfaction in the following two options:
1) An ugly flat table that is crying out for elegance and flexibility, or An elegant, but terribly non-performant tree structure.
2) Are these our only options?
Well, yes. And no. Why not use both?
Some possible options:
- Store the data as a tree as we have discussed. Do inserts directly to the Organization table and taylor a stored procedure with a CTE to your reading-needs (such as reading the entire tree).
- Keep both schemas (with a few modifications) and use SQL triggers to sync them. Perform inserts and updates against the tree schema and perform reads against the flat structure.
- Again: Relational DBs are built for flat data structures. Work with the database, not against it.
- Just because the current solution doesn't seem elegant (to you) doesn't mean it needs to be changed. Remember the simple axiom: "If it ain't broke, don't fix it".
- Our flat table solved the original problem of storing the data and allowing it to be read back quickly. This is 90% of what was needed. Just because something could possibly be useful or because it would be "so much better" is never a good enough reason to initiate a refactor. Don't feel the need to retro-actively over-architect.
Happy Coding!
No comments:
Post a Comment