Introduction
Most of the .NET code I see makes heavy use of datasets and a fairly simple client-server (2-tier) design. This is fine, because .NET has great support for using and displaying datasets on the screen. However, for more complex applications, the approach does not scale well in terms of maintainability. At some point, developers have to start coding business objects of some kind.
The problem is that using datasets insulates the developer from the basic incompatibility (called the impedance mismatch) between objects and databases. When you start using business objects, you are suddenly faced with a huge range of options and challenges that you never encountered when using datasets. The good news is that, once they start using objects, few developers go back. It is an exciting and productive environment, with great opportunities for code re-use.
This article is an introduction to the subject, with a view to helping readers understand the ways in which you can design object-oriented applications, and some of the challenges that you may encounter. The focus is on small to medium-sized business applications that can benefit from an object-oriented design, but probably won't be needing Microsoft Enterprise Services any time soon.
Apologies in Advance (to the Experts)
There is very little standard terminology in this field. If I call a widget a wodget, and you are used to calling it a wadget, don't be upset with me! I try to define what I mean by every term I use, so there should be no confusion.
A Multi-tiered Application
You've probably (hopefully) heard of 3-tiered systems. The common tiers mentioned are presentation, business logic, and database. In practice, most object-oriented applications have more than 3 tiers -- they have a framework of interconnected components, typically found inside multiple DLLs, EXEs, and third-party applications, generally categorized into layers/tiers. The components may all be located on a single computer, or they may be spread across multiple computers.
From the developer's perspective, a primary idea behind a tiered system is to split logic into different pieces of code, so that one or the other can be changed, extended, or rewritten without affecting the rest. There are other advantages too, such as scalability. Performance is not an advantage of object-oriented applications, except where it is obtained through scalability.
One of the driving forces used in the design and implementation of the tiers is that we should minimize duplication of logic. In a complex application, duplication of logic will lead to bugs. (This is true of a simpler system too, but in that case, it is easier to manage).
Some possible tiers of a multi-tiered system are:
- Database -- this consists of the DBMS (e.g. SQL Server), the table data and structure, and logic that is coded inside of the database. The most obvious examples of logic that is coded in the database are stored procedures and triggers. However, it is worth noting that the structure of the database itself is logic, in the form of data type validation, relationship validation, etc.
- Data Access Tier -- separate from the database is the code that accesses the database. There are many reasons to keep this code in its own layer -- support for multiple DBMSs, automated data auditing, connection management, hiding of connection strings, retrieving of Identity values, etc.
- Object/Relational mapping -- at some point in the application, we have to translate objects to SQL, and vice versa. An O/R Mapping component takes care of this requirement. We could just code the SQL inside of the higher level objects, but there are numerous benefits to encapsulating the logic at a single point in the system. Some of these benefits can include insulating developers from knowing the database structure, supporting powerful query interfaces for the users, and insulating other objects from database structure changes.
- Business Domain objects -- (also called entity/business objects). These are objects that contain properties that reflect data. At this level, you will find objects named address, client, etc. Commonly, they also embed relationship logic -- for example, a client object may have a property or method that retrieves the address object for that client instance. In this case, the objects are known as Active Domain Objects, because they can actively access related data.
- Service Layer -- Typically, service objects will not have names like client, or address. They are pieces of processing and workflow logic, that usually correspond closely to use case scenarios. For example, in a shopping cart application, the service layer could handle the processing of the order. This might include sending data to an external system, emailing the purchaser, and saving the order data in a database.
- Controller Logic -- (also called UI Process logic, Presentation logic). When dealing with multiple User Interfaces, e.g., both WinForms and Web, it makes sense to try and split the non-UI logic into its own layer. This is very difficult to do, but there are pre-built frameworks that help support this type of logic. They are known collectively as MVC (Model-View-Controller). The Controller logic is the C in MVC. In a shopping cart application, the controller logic could handle the flow of the web pages as the user progressed through the order process.
- UI/View -- This is the piece of the application that the user interacts with. When the controller logic is in its own layer, the UI typically only contains UI Mapping code, i.e., mapping object properties to and from screen fields. Of course, it also contains code to interact with the controller objects.
Disclaimer -- Most applications do not have all of these tiers. It is perfectly fine to combine tiers, according to your own unique needs.
Examples
At this point, I think we need some overly-simplistic examples:
Example | What is it? |
SQL Server, Oracle, MS Access | Database, DBMS |
stored procedure | if it has significant logic, Business Domain Object, otherwise Database |
ASP page | UI and Controller |
ASPX page | UI |
ASPX page code-behind | Controller |
Response.Redirect | Controller |
Windows Form | UI |
Windows Form code-behind | Controller |
Address class | Business domain object |
conn = new OleDbConnection(); | Data Access Tier |
myAddress.Street = Reader.GetString(3) or cmd.AddParameter(myAddress.Street) | O/R Mapping |
using (DBConn conn = ConnFactory.Create()) | Business Domain Object or Service Layer |
If you look closely, you can see some interesting things above:
- ASP pages encouraged the combination of UI and Controller logic. ASP.NET improves the situation through code-behind classes. The benefits are primarily that the code is easier to read and maintain, as is the HTML.
- Stored procedures that contain business logic automatically break up your business logic into multiple places. Since stored procedures are not compatible with your other code (i.e., you cannot make use of helper functions within a stored procedure), this can lead to duplicate code. It will be interesting to see how Yukon (support for C# on the database) opens up possibilities in this area.
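The O/R Mapping row in the table above can be sketched as a small mapper class. This is only an illustration: the Address properties, the AddressMapper class, and the column names (AddressId, Street, City) are assumptions made for the example, and a DataRow stands in for whatever reader or command a real data access tier would hand over.

```csharp
using System;
using System.Data;

// Business domain object: plain properties that reflect data.
public class Address
{
    public Guid Id { get; set; }
    public string Street { get; set; } = "";
    public string City { get; set; } = "";
}

// O/R mapping: the only place that knows the (assumed) column names.
public static class AddressMapper
{
    // Relational -> object
    public static Address FromRow(DataRow row)
    {
        return new Address
        {
            Id = (Guid)row["AddressId"],
            Street = (string)row["Street"],
            City = (string)row["City"]
        };
    }

    // Object -> relational
    public static void ToRow(Address address, DataRow row)
    {
        row["AddressId"] = address.Id;
        row["Street"] = address.Street;
        row["City"] = address.City;
    }
}
```

Because the column names live in one class, renaming a column touches the mapper and nothing above it.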
Challenge #1 - Validation
The first challenge in a tiered system is validation. It is a challenge, because it is almost impossible to write validation without duplicating the logic. This is because multiple layers of the system need access to the same validation logic, and some layers intrinsically contain validation logic:
- often, the user interface must be nice to the user, and show validation problems before the user presses the Save button.
- the controller objects and service layer may need to enforce validation.
- the business domain objects must enforce validation.
- the data objects usually have validation logic already, in the form of data types on the properties.
- the database has validation logic already, in the form of data types, foreign key relationships, and field length limits.
Often, application designers will just ignore this challenge, and accept that validation will have to be duplicated across multiple pieces of code. However, there are some other approaches:
- put the validation in a separate Rules component, that can be re-used by the UI and the other layers -- probably the best solution, but also the hardest to implement.
- put the validation in the business domain objects, or the controller logic, and hope that programmers do not bypass the layer.
- put validation in the UI, and in the database -- this is more common than you would think, because it is the simplest to implement.
Whatever the choice, it is practically impossible to achieve zero-duplication of validation logic. Depending on your particular needs, you will need to choose the best approach.
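As a rough sketch of the separate Rules component idea, the same validation routine can be called by the UI (to show problems before Save is pressed) and by the domain object (to enforce them). The AddressRules class, the 50-character limit, and the error messages are all assumptions for illustration; the database would still carry its own column-length constraint, which is the duplication that cannot be removed.

```csharp
using System;
using System.Collections.Generic;

// Shared rules component: no UI or database dependencies.
public static class AddressRules
{
    // Assumed to mirror the length of the Street column in the database.
    public const int MaxStreetLength = 50;

    public static IList<string> Validate(string street, string postalCode)
    {
        var errors = new List<string>();
        if (string.IsNullOrWhiteSpace(street))
            errors.Add("Street is required.");
        else if (street.Length > MaxStreetLength)
            errors.Add("Street is too long.");
        if (string.IsNullOrWhiteSpace(postalCode))
            errors.Add("Postal code is required.");
        return errors;
    }
}

// Business domain object: enforces the same rules the UI can preview.
public class Address
{
    public string Street { get; private set; } = "";
    public string PostalCode { get; private set; } = "";

    public void Update(string street, string postalCode)
    {
        IList<string> errors = AddressRules.Validate(street, postalCode);
        if (errors.Count > 0)
            throw new ArgumentException(string.Join(" ", errors));
        Street = street;
        PostalCode = postalCode;
    }
}
```

A form or page would call AddressRules.Validate directly and display the returned messages, while any code path that bypasses the UI still hits the check inside Update.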
Challenge #2 - Security
Like validation, security needs to be enforced at multiple levels.
- At the UI, we may want to disable a Save button, or certain fields/menus/buttons.
- in our business domain logic, we may want to prevent certain actions based on the context of the actions - for example, we may only allow the save of an address object if it is linked to the user's own user-id.
- in our database, we may want to apply permissions to tables and/or stored procedures.
Security is difficult when it is dynamic (specified in detail by an admin user), and context-sensitive, i.e., you cannot always tie security directly to a specific table or data object. If your security is simpler than that, then you may be able to make do by implementing your security at the Controller, and/or at the database.
For the more complex, dynamic cases, an approach that I have used in the past was to attach security information to the data objects. The business domain objects attach the security data, based on the current context. The security is then enforced by the code that persists the data objects, i.e., the O/R mapping layer. This worked well for me, but YMMV (Your Mileage May Vary).
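A minimal sketch of that approach might look like the following. It is not the original implementation: SecurityInfo, AddressDomain, and AddressPersister are hypothetical names, the ownership rule is borrowed from the address example earlier in this section, and an in-memory dictionary stands in for the database.

```csharp
using System;
using System.Collections.Generic;

// Security information attached to a data object (hypothetical shape).
public sealed class SecurityInfo
{
    public SecurityInfo(bool allowSave) { AllowSave = allowSave; }
    public bool AllowSave { get; }
}

public class Address
{
    public Guid Id { get; } = Guid.NewGuid();
    public string OwnerUserId { get; set; } = "";
    public string Street { get; set; } = "";
    // Stays null until the domain layer has evaluated the current context.
    public SecurityInfo Security { get; set; }
}

// Business domain logic: decides what is allowed in the current context.
public static class AddressDomain
{
    public static void AttachSecurity(Address address, string currentUserId)
    {
        bool ownsAddress = address.OwnerUserId == currentUserId;
        address.Security = new SecurityInfo(ownsAddress);
    }
}

// Persistence (O/R mapping) layer: enforces whatever was attached.
public class AddressPersister
{
    // Stand-in for the database.
    private readonly Dictionary<Guid, Address> _saved = new Dictionary<Guid, Address>();

    public int SavedCount { get { return _saved.Count; } }

    public void Save(Address address)
    {
        // Objects with no security attached are refused as well.
        if (address.Security == null || !address.Security.AllowSave)
            throw new UnauthorizedAccessException("Save is not permitted for this object.");
        _saved[address.Id] = address;
    }
}
```

The design choice here is that the persister never decides anything itself. It only refuses to write objects that arrive without permission, so forgetting to attach security fails closed.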
In addition to the problem of security enforcement, you have to deal with database connection strings -- if a user has access to the connection string, they can then establish a database connection directly (using SQL Query Analyzer), and are able to bypass object-based security. In that case, the only effective security is at the database level -- a very strong argument for using stored procedures (because of the fine-grained control of security that they offer). Another approach to the connection string dilemma is to disallow direct connections to the database. This is done by placing the data access tier on a server machine, together with the connection string. Security is then enforced at the data access level, through the use of a separate security component, usually using security tokens/tickets to communicate the user's permissions. A final approach is to encrypt the connection string at the client. This is difficult to do well, because it requires that a private key be stored somewhere, and it is hard to find a place that is secure. It is possible to do, but in .NET 1.1, you have to use some unmanaged APIs.
Challenge #3 - to MVC or not to MVC
MVC (Model-View-Controller) is a powerful technique to separate controller logic from the UI. You'll find MVC used most often in complex, workflow oriented web applications. This is because web-based applications lend themselves well to MVC -- MVC operates in a state-driven manner, just like a web application. It is a much less intuitive pattern for a WinForms application.
In the MVC acronym, M refers to the business data objects or service layer, V refers to the UI, and C to the controller logic. The intention of the C is to encapsulate input controller logic, i.e., the code-behind of the page or form that interacts with the model. Often though, the meaning is interpreted as application controller logic, i.e., controlling the flow of pages and forms.
Many consider MVC to be the most poorly understood and implemented pattern, primarily because of the confusion over what the C means. The problem is that application controller code can be re-used across different UIs, whereas an input controller is specific to a particular UI. Many MVC implementations combine the two types of controllers, which is OK, but not re-usable across different UIs.
For example, a web page has a completely different interface to a WinForms app, yet in MVC, we may code a single piece of controller logic to handle both. When we do this, the UI has a tendency to be created for the lowest common denominator, which leads to a very uninspiring application on the higher end interfaces, i.e., WinForms.
Of course, this can be managed, primarily by creating input controllers for each platform, and generic, shared application controllers -- but it is still a challenge.
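The split between a shared application controller and per-platform input controllers could be sketched roughly as follows. All names here (OrderStep, OrderFlowController, WebOrderInput, WinFormsOrderInput), the URL scheme, and the form naming are assumptions for illustration, not part of any particular MVC framework.

```csharp
using System;

public enum OrderStep { Cart, Shipping, Payment, Confirmation }

// Application controller: UI-neutral flow of the order process.
public class OrderFlowController
{
    public OrderStep Current { get; private set; } = OrderStep.Cart;

    public OrderStep Next()
    {
        switch (Current)
        {
            case OrderStep.Cart: Current = OrderStep.Shipping; break;
            case OrderStep.Shipping: Current = OrderStep.Payment; break;
            case OrderStep.Payment: Current = OrderStep.Confirmation; break;
            // Confirmation is the last step, so the flow stays there.
        }
        return Current;
    }
}

// Input controller for the web: turns a step into a URL to redirect to.
public class WebOrderInput
{
    private readonly OrderFlowController _flow;
    public WebOrderInput(OrderFlowController flow) { _flow = flow; }

    public string OnNextClicked()
    {
        return "/order/" + _flow.Next().ToString().ToLowerInvariant();
    }
}

// Input controller for WinForms: turns a step into the name of a form to show.
public class WinFormsOrderInput
{
    private readonly OrderFlowController _flow;
    public WinFormsOrderInput(OrderFlowController flow) { _flow = flow; }

    public string OnNextClicked()
    {
        return _flow.Next() + "Form";
    }
}
```

Only OrderFlowController is shared. Each input controller is free to exploit its own platform, which is what keeps the richer UI from being dragged down to the lowest common denominator.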
My experience with MVC is that it is nice and convenient to have the controller logic separated, especially when dealing with a web application with processes, e.g., an order process. However, I was at the mercy of the MVC framework that I chose, and in the end, I prefer control over convenience. But that's just me -- a control freak.
Challenge #4 - Transactions
The functions that the user performs drive the need for transactions, and functions are defined at the level of the controller or service layer logic -- so it makes sense that transactions should be initiated and committed at that level. In most environments, this implies that database connections should be initiated and terminated at the same level, since few platforms support transactions that transcend database connections.
Thus, the challenge is to provide database specific functionality (transactions), at a level of the application that may be substantially removed (in terms of intermediate layers) from the database. It may even be that the data access tier is on a completely separate machine.
One solution is conceptually simple - create an abstraction that represents a database connection/transaction, and can be used inside of the controller logic to begin, commit, and roll back transactions. This has a profound effect on the design of the data access tier. It means that the data access tier cannot automatically open and close database connections -- the connection becomes a managed resource at the level of the controller logic. In .NET terms, the data access tier has to support the IDisposable interface.
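To make the ownership pattern concrete, here is a toy sketch of such an abstraction. DataSession and OrderService are hypothetical names, and a List of strings stands in for the database so the example runs without a provider; a real version would wrap an IDbConnection and an IDbTransaction instead. The point is only that the service layer owns the scope, and that disposing without a commit rolls back.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a connection + transaction pair.
public sealed class DataSession : IDisposable
{
    private readonly List<string> _store;                       // committed "rows"
    private readonly List<string> _pending = new List<string>(); // uncommitted work
    private bool _committed;

    public DataSession(List<string> store) { _store = store; }

    // Work is buffered until Commit, mimicking an open transaction.
    public void Execute(string statement) { _pending.Add(statement); }

    public void Commit()
    {
        _store.AddRange(_pending);
        _pending.Clear();
        _committed = true;
    }

    // Disposing without a commit discards the pending work (rollback).
    public void Dispose()
    {
        if (!_committed) _pending.Clear();
    }
}

// Service layer: begins and ends the transaction around one user function.
public static class OrderService
{
    public static void PlaceOrder(List<string> store, string orderId, bool paymentFails)
    {
        using (var session = new DataSession(store))
        {
            session.Execute("INSERT order " + orderId);
            if (paymentFails)
                throw new InvalidOperationException("Payment declined.");
            session.Execute("INSERT audit " + orderId);
            session.Commit();
        }
    }
}
```

The using block guarantees that Dispose runs even when the exception is thrown, so the half-finished order never reaches the store.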
Alternatively, you can make use of automatic transactions, which are supported in .NET, COM+, and MTS. This has the additional benefit of supporting distributed transactions.
Achieving Reuse
Within your own organization, it can be extremely beneficial to establish some reusable components that fulfill the needs of particular parts of the system. Some you can code yourselves, but others (e.g., Object/Relational mappers) are usually easier to purchase.
For a single application, it may make sense to code everything from scratch. But after the first, it gets tiresome real fast, especially typing SQL strings, doing O/R mapping, and creating domain objects. I'd rather concentrate on the real application logic. As an example, in my own work environment, I work primarily with semi-complex web-based applications that have their own databases. Some of my reusable components are:
- OID generator - this is an article by itself, but basically, I prefer to use GUID fields instead of INT fields as primary keys in my database. This component generates guaranteed-unique GUIDs, for assigning to a Primary Key before saving to the database.
- Data Access DLL - a simple data access component that works with the OID generator, encourages parameterized queries, and allows management of transactions.
- Simple Security component - supports adding, editing, and authenticating of users, with support for roles. Passwords are salted and hashed in the database.
- Audited Security component - for those projects with more stringent security requirements, extends the simple security component by adding the ability to audit logins, and lock users out after x failed logins.
- POP Server component - based on a 3rd party component, supports forwarding of email based on configurable rules.
- Domain object templates - using a 3rd party tool, initial sets of domain objects and O/R mappings are generated based on the database structure.
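As a small illustration of why the OID generator in the list above is convenient, the sketch below assigns GUID keys when objects are created, so a related object can carry its foreign key before anything has been saved. The OidGenerator, Client, and Address shapes are assumptions for the example, and Guid.NewGuid is simply the easiest generator to reach for; the actual component may well use a different scheme.

```csharp
using System;

// Hypothetical OID generator: hands out keys on the client side.
public static class OidGenerator
{
    public static Guid NewOid()
    {
        return Guid.NewGuid();
    }
}

public class Client
{
    // The primary key exists as soon as the object does.
    public Guid Id { get; } = OidGenerator.NewOid();
    public string Name { get; set; } = "";
}

public class Address
{
    public Address(Client owner)
    {
        // Foreign key is known before any INSERT has run.
        ClientId = owner.Id;
    }

    public Guid Id { get; } = OidGenerator.NewOid();
    public Guid ClientId { get; }
    public string Street { get; set; } = "";
}
```

With INT identity columns, the Address could not be linked to its Client until the Client row had been inserted and its generated value read back.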
Microsoft promotes their own re-usable "Application Blocks". These are of varying quality, and I do not specifically recommend any of them. I do recommend against the Microsoft Data Access Application Block, because it is SQL Server specific and does not encourage good data access practices.
References
- Data Access Patterns: Database Interactions in Object-Oriented Applications - by Clifton Nock. This is an excellent reference that breaks down the different components of data access. It tackles everything from object relational mapping, down to data access tiers.
- Patterns of Enterprise Application Architecture - by Martin Fowler et al. This is the definitive book on designing object-oriented business systems. It tackles everything from the UI down to the data access, in a nicely clear and concise manner.
- Application Architecture for .NET - Designing Applications and Services - by the Microsoft Patterns and Practices division. It is a good, short book, focused on creating very complex business applications, making use of Microsoft tools and products.
- Expert One-on-One Visual Basic .NET Business Objects - by Rockford Lhotka. A very practical approach to developing a complex tiered application. The author presents his own vision of an architecture that can be reused across multiple applications.