Fixing your Mendix Domain model - part 1

In the past month, I've been asked to help other Mendix developers with what I'm seeing as a very common problem: domain models. Developers build their models, begin to create their pages and forms, and inevitably hit a wall where they can't quite get the attributes on their pages to align or associate, among a number of other issues. Worse yet are applications that have been in production for a while and require a redesign to accommodate a new feature. In this post I'm going to explore how I troubleshoot and help correct designs.

I'll be the first to tell you that there is never one "right" answer to your domain model design. OK, there is usually a 'best practice' to follow, but that doesn't mean there is only one way to implement design decisions. It's how we tech geeks get to pretend we're artists for a bit and explore our creativity. Heck, I'm used to designing models for fast read times, not write times, so I've had to go back to school myself by reading as much literature on OLTP design as I can get my hands on!

In most instances where I'm asked to help with design decisions, I come across the same two types of problems:

  1. The current model has been 'flattened' or denormalized, resulting in data duplication and limited flexibility, or
  2. The cardinality of the associations is not implemented correctly. In almost all of these cases, data ownership has been misunderstood.

Denormalized data model problems

To explore this problem, I first need to explain what 'denormalized' means for those who aren't familiar with the term. I'm not a textbook, so if you want the formal definition, go 'Google' it. In practical terms, it means that data that is normally split into separate tables is combined to form a single record in a new table.

Think about it like this. Suppose you had a workbook with three spreadsheets: one for Customers, a second for Products, and a third for Orders. The Orders spreadsheet contains the details of each order plus two extra columns: one holding the unique customer id, so you know who made the order, and one holding the id of the product that was ordered. That is a 'Normalized' workbook.

But what if you wanted to know the Customer Name and the Product Description? A denormalized table would copy the Order Details, retrieve the Customer Name and store it in a column, and retrieve the Product Description and store it in another. From a reporting standpoint this table is faster, because a single retrieve on the denormalized table returns everything the report needs. When writing a record, however, you have a lot more data to write into each column instead of simply setting the key values that relate the order to its customer and product.
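To make the workbook analogy concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Mendix itself. The table and column names (customers, products, orders, order_report) and the sample rows are assumptions made up for illustration; they are not part of any Mendix model.

```python
import sqlite3

# In-memory database standing in for the "workbook" (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products  (id INTEGER PRIMARY KEY, description TEXT);

    -- Normalized: the order row stores only key values for customer and product.
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        product_id  INTEGER REFERENCES products(id),
        quantity    INTEGER
    );

    -- Denormalized: the name and description are copied into every row.
    CREATE TABLE order_report (
        order_id            INTEGER PRIMARY KEY,
        quantity            INTEGER,
        customer_name       TEXT,
        product_description TEXT
    );

    INSERT INTO customers VALUES (1, 'Jane Doe');
    INSERT INTO products  VALUES (10, 'Blue widget');
    INSERT INTO orders    VALUES (100, 1, 10, 3);
    INSERT INTO order_report VALUES (100, 3, 'Jane Doe', 'Blue widget');
""")

# Normalized read: two joins are needed to reach the name and description.
normalized_row = conn.execute("""
    SELECT o.id, c.name, p.description
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    JOIN products  p ON p.id = o.product_id
""").fetchone()

# Denormalized read: a single retrieve on one table.
denormalized_row = conn.execute(
    "SELECT order_id, customer_name, product_description FROM order_report"
).fetchone()
```

Both reads return the same information; the difference is where the work happens. The normalized write stored three small values, while the denormalized write had to copy the name and description as well.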

OK, now with that set of descriptions understood, let's get back on track to why denormalized tables in your model are (usually) a bad idea.

First of all, you are duplicating data. In the example above, you are now storing Customer Name in two places: the Customer entity and the Order entity. Duplicate data not only takes up storage resources but also requires extra, unnecessary I/O (reads and writes) against the database, slowing down the application. Your goal as a developer is to remove as many extra steps as possible to keep your application running at peak performance. Additionally, duplicate data has a tendency to get out of sync. For example, let's say that a customer gets married and changes their name. Now future orders will use the new name and order history will use the old name, so the same customer appears under two different names. This is not a good practice; use associations instead.
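The name-change scenario can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Mendix code; the class names (Customer, DenormalizedOrder, NormalizedOrder) and the sample names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str


@dataclass
class DenormalizedOrder:
    order_id: int
    customer_name: str  # a copy of the name, taken when the order was written


@dataclass
class NormalizedOrder:
    order_id: int
    customer: Customer  # an association to the single Customer record


customer = Customer(name="Jane Smith")
old_copy = DenormalizedOrder(order_id=1, customer_name=customer.name)
old_assoc = NormalizedOrder(order_id=1, customer=customer)

# The customer gets married and changes their name.
customer.name = "Jane Jones"

new_copy = DenormalizedOrder(order_id=2, customer_name=customer.name)

# The copied names now disagree; the associated order follows the change.
names_in_copies = {old_copy.customer_name, new_copy.customer_name}
name_via_association = old_assoc.customer.name
```

After the change, the denormalized orders carry two different names for the same person, while the order that holds an association still resolves to the one current name.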

Second, you are making your model less flexible to change. If data is stored in one place and one place only, it is easy to associate (or de-associate) it to new or changing entities. Not so if it is being denormalized. Historical record maintenance becomes a mess. Steer clear of these denormalized tables in your application design.

The only exception to this is a separate reporting module, where you want to make the retrieval and display of data as efficient as possible. Reporting should be thought of as a separate module in your application, and I have a blog post related to designing a module for just this case.

Data Ownership domain model problems

This one is a little less cut and dried in terms of good or bad practice. But there are some triggers you should listen and look for when building a good model. Here's a paraphrased quote I heard recently when discussing data ownership:

The form should own the ...

Stop! Whoa! "The form"... That should be a trigger in your mind. Forms don't own data. Forms are a medium for collecting data. Always think of the data relationships in real-world terms, based on how the data is used. Does a Customer own the relationship to a Product? No, because a Product can exist with or without a Customer. So how does a Customer acquire a Product? By placing an Order. Now you know a relationship exists between Customer and Order, and between Order and Product. An Order is the key that brings these two disparate entities together, and it therefore owns the relationship. So draw your association FROM Order to Customer and FROM Order to Product.
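As a rough sketch of that ownership in plain Python (not Mendix, and with hypothetical class and function names), the Order is the only type that holds references; Customer and Product know nothing about each other:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    name: str  # no reference to Product or Order


@dataclass(frozen=True)
class Product:
    description: str  # exists with or without a Customer


@dataclass(frozen=True)
class Order:
    # The Order owns both associations: FROM Order to Customer
    # and FROM Order to Product.
    customer: Customer
    product: Product


def products_for(customer, orders):
    """Reach a customer's products only by going through their orders."""
    return [order.product for order in orders if order.customer == customer]


john = Customer("John Smith")
widget = Product("Blue widget")
gadget = Product("Red gadget")  # never ordered, yet perfectly valid

orders = [Order(customer=john, product=widget)]
johns_products = products_for(john, orders)
```

The unordered product needs no customer to exist, and the only path from John to the widget runs through an Order, which matches the direction of the associations described above.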

Listening while discussing the relationship offers many cues as to what might be wrong with the cardinality (is it a one-to-one relationship, one-to-many, or many-to-many?), as well as with the ownership of the relationship. If you need help with this, open the association where you set the cardinality and substitute real values for the entity names in the cardinality description. For example, instead of reading "There are multiple (*) Orders for each (1) Customer", change it to "There are multiple purchases for John Smith" to see if it still makes sense.

One other item to check is the direction of the association arrows coming out of or into an entity. This is a bit tricky because there are good reasons why you might have some coming into and out of the same entity (such as linking the AuditTrail tables with your entities), but generally speaking, if an entity mostly has associations out to other entities and you see one or two coming into it, suspect those. I'm not saying they are wrong, but they need to be looked at carefully. Entities usually relate one way or the other and rarely require both.
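If you list your associations as (owner, target) pairs, this check is easy to sketch. The snippet below is plain Python with made-up entity names and a hypothetical helper; it does not read a real Mendix model, and it only flags candidates for review rather than declaring anything wrong.

```python
from collections import Counter

# Each pair reads "association drawn FROM owner TO target" (sample data only).
associations = [
    ("Order", "Customer"),
    ("Order", "Product"),
    ("Invoice", "Order"),
]


def entities_with_both_directions(pairs):
    """Return entities that have arrows both going out and coming in."""
    outgoing = Counter(owner for owner, _ in pairs)
    incoming = Counter(target for _, target in pairs)
    return sorted(set(outgoing) & set(incoming))


to_review = entities_with_both_directions(associations)
```

With the sample data, only Order has arrows in both directions, so it is the one entity worth a second look.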

These are just a handful of the techniques and practices I use to troubleshoot and help developers with domain model issues. In part two, I'll follow up with a process to use once you've identified the issues and are ready to implement the changes. Hope this is helpful!