The Database is a Detail

    I've seen many applications where the database is in the center. In these applications the database is the most important part and everything else is just a shell around it. In most of the applications that I've seen, they use an ORM, and in some they use plain SQL. All of them had business objects without any behavior. All of the behavior, all of the business rules were in the database. I want you to think about that!

    What's an "object" in the sense of object-oriented programming? What characterizes an "object"? The first sentence in Wikipedia about object-oriented programming states:

Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code: data in the form of fields (often known as attributes or properties), and code, in the form of procedures (often known as methods).

    An object can contain data and code. In other words an object has data and behavior. Combining data with behavior is the very essence of object-oriented programming. Divorcing the behavior from the data leads to a more cumbersome design. In fact, when we are talking about business rules, removing the behavior from the object is a named anti-pattern: Anemic Domain Model. There's a name for an object that has only data without any behavior. It's called property bag. If you come to think about it, it's no different than a dictionary (also known associative array, map or a symbol table). A property bag is just a collection of key-value pairs. This design resembles more closely a procedural program rather than an object-oriented one. So why even bother to create those "objects" in the first place if you end up with a procedural program? Why do you even bother using an object-oriented language at all? But that's not the worst part about this design.

    The worst part about putting your behavior in the database is that it couples the application to the database so strongly, that there is simply no way out. You are doomed to use the same database for an eternity. You cannot change the database if you discover that there is a better, faster or a cheaper alternative. You are stuck with the choice that you made. Go ahead, change the database if you don't believe me!

    Programming is all about managing dependencies. Good design allows you to manage dependencies easily. Good design allows you to be flexible and change parts of the program when requirements change. It separates the important parts of the application from the less important ones. The most important part of the application is the business rules and use cases that make your business work. These are the things that earn you money, pay the bills, and keep your business going. Not the database. The database is not important, it's just a detail, and you should be able to change it without much effort whenever you decide to do so. So how do you do that?

    Well the first thing you need to do is quite obvious. Get rid of all the stored procedures and put them in code. There are three main patterns that you can use to organize your business rules: Transaction Script, Table Module and Domain Model. Hopefully I'll write separate articles about each one in the future. For now just pick one, get rid of all stored procedures and put your business rules in the code.

    The next step is to abstract the database. In statically typed languages like C# or Java you need to hide it behind an interface or an abstract class. Define a data access layer in a separate assembly. Write a method in it for each and every query that you have in your application. Be very strict about it. Do not leave a single query to leak from the data access layer. And I really do mean it. Put the Entity Framework or Hibernate entities in the data access layer and never use them outside of it. Remove all the dependencies to the ORM outside of the data access layer. Voila, you now have isolated your application from the database. Changing the database now is as simple as implementing several abstract classes and writing integration tests. Your business rules are separated from the database. They are not affected by the change and you do not have to worry about them.

    Writing code that's easy to change is hard. Writing code that's easy to change means managing complexity effectively. It means you have to separate the things that matter from the things that don't. Divide and conquer. The database is one of those things that don't matter that much. It should be easy to manage and easy to change. The business rules are hard to change, because they define your business. They are the heart of the application and not the database. The business rules are what's important. The database is just a detail!

Popular posts from this blog

SOLID Series: The Single Responsibility Principle

The Technical Debt Metaphor