survic's blog on Programming or Enterprise Computing: Domain Driven Design: data modeling as the core of the domain modeling

Sunday, July 23, 2006

Domain Driven Design: data modeling as the core of the domain modeling

Vikas wrote (http://vikasnetdev.blogspot.com/2006/07/domain-driven-design-vs-data-driven.html): I am excited about Domain Driven Design; otherwise I would not have spent $50 and my time reading it. After finishing it, I am going to rate this book either 1 out of 5 or 5 out of 5. This is one of the few books in which I am going to read the case studies. Generally I skip them and apply the ideas in real practice. Be patient with me. :)

This made me feel that I might be missing a gold mine, so I took a deeper look at the “Domain Driven Design” website and its pattern summary ( http://domaindrivendesign.org/book/PatternSummariesUnderCreativeCommons.doc ), the book's reviews on Amazon, and the Google results for related literature.

The following are three specifics:

1. In the past, when I was young ;-) I studied analytical/language philosophy pretty intensively, so I am very much at home when I read the Domain-Driven Design book -- believe it or not, I also found discussions of “ontology” in the domain-driven literature! ( see http://www.st.informatik.tu-darmstadt.de:8080/ecoop2005/maw/acceptedPapers/Hruby.pdf ). I really feel that they are seriously taking OO to the next level -- expanding its perspectives.

---- However, I feel it is interesting that DDD does not pay special attention to database modeling.

2. I honestly believe that I have been doing DDD for a long time; it started at least eight years ago, when I read Martin Fowler’s “Analysis Patterns” book (I noticed Vikas also lists it in the book list on his blog -- great people read alike ;-). However, I must add that I read that book together with two other books (I am not dramatizing; I really did read them at about the same time, by coincidence or luck): David C. Hay’s “Data Model Patterns: Conventions of Thought”, and (as an antidote to normalization) “The Data Warehouse Toolkit” by Ralph Kimball.

I especially appreciate the subtitle of Hay’s book, “Conventions of Thought”. OO is not the only way we can think. Further, I would argue that the relational model and the dimensional model are actually subsets of domain-modeling thinking, and that those subsets are the more robust and more important ones -- because persistent data are important in themselves (the “core domain”), and relational multiplicity is more important than anything else. Speaking of “deep insight”, I always find that recognizing which things need to be persistent, and what the multiplicities among those things are, is the deepest insight.

Also note that a database forces people to share data, so it is the natural place where people develop a ubiquitous language. Further, the data warehouse is the ultimate place where we have a ubiquitous language across the whole company. Talking about a ubiquitous language without digging into data warehousing again shows the tunnel vision of “pure” OO.

3. The last item is a whole-hearted recommendation: on the one hand, we need a “process” to reduce the chaos in software development; on the other hand, using the metaphor of manufacturing is really very misleading. So, this article is very convincing:
http://domaindrivendesign.org/articles/blog/evans_eric_software_is_not_like_that.html

20 Comments:

Vikas said...: Hi Survic,

Since I have been reading your blog for some time, I have never had an iota of doubt that you practice domain driven design.
http://www.martinfowler.com/eaaCatalog/domainModel.html
http://www.martinfowler.com/eaaCatalog/tableModule.html

Preaching Domain Driven Design over Data Driven Design to you is like teaching the benefits of writing tests first to Rockford Lhotka. Writing tests first makes sure that you won’t mix UI code with the business layer, which OO purists would never do in the first place. My blog is more focused on general process, practices and the real world. But we live in an imperfect world where we don’t make the rules. We OO purists form just 10% of the total .NET community.

http://vikasnetdev.blogspot.com/2005/05/was-using-object-oriented-programming.html

If we could evangelize processes and practices that encourage the remaining 90% to write good OO code, it would be a step in the right direction.
Now I realize that I am trying to change the world. LOL.; 7/23/2006 08:41:00 AM
Vikas said...: Vikas’s Thinking Starts
I am locked in mortal combat with Survic over Domain Modeling vs Data Modeling. Here is my strategy: concede tactical defeat to Survic, then fight him on ground that favors me, make sure he cannot use the best weapon in his arsenal, ‘small projects’, and win the strategic battle. Here comes my offense.
Vikas’s Thinking Ends

Hi Survic,

I agree with you that Data Modeling wins hands down for small projects and when the programmer is also designing the database.

What would you suggest in an environment where a DBA, not you, is in charge of database design? Do you agree that this kind of environment exists? I have been in such environments. Would you wait for the DBA to finish the database design and get caught without a domain model in the design phase? Mock screens are simply illusions of design and implementation, which have to be backed by a domain or data model. What approach do you suggest in such a situation? Taking the DBA out of the equation is not an option, and the project duration is 10 months.

I have added more thoughts to my posts
http://vikasnetdev.blogspot.com/2006/07/domain-driven-design-vs-data-driven.html; 7/23/2006 03:46:00 PM
survic said...: It seems that the best offense is a good defense -- the impression from the soccer games ;-)

I agree that such an environment exists (DBAs dominate database design). In that environment, my first try is to work with the DBAs to develop the UML (or ER) model for the database tables. This is an honest try -- I understand that it is also a politically risky approach, because it may be perceived as stepping on the DBAs’ territory. However, it is an honest approach, and it can be very fruitful, because it eliminates the communication problems between developers and DBAs.

Also – I know this is a surprise to traditional OO – this also removes the communication breakdowns with users: users only know UI and core data; they do not know “objects” (I know I am making a controversial argument here, but it is better being clearer than just being “correct”: I have to say that the saying that users know objects is a myth from traditional OO).

If the DBAs are too “powerful” to accept such an approach, then I do believe that OO (especially the “entity” objects) is a useful techno-political tool: developers can do database design under the disguise of OO design. However, my point is that we should not let this disguise eclipse the real thing.

Why? The heart of my argument is actually an academically based one: the relational model is more mathematically robust; therefore, the core of OO is actually the relational model!

This directly translates into practice: when you do OO, use the relational model first. You may say, what the heck is that?! OK, let’s express it in OO terms: when you do OO, inheritance is not the key; containment is, and you should always pay attention to the multiplicity of the containments. Also, although in memory you have a lot of freedom to change the data structures to make an algorithm more efficient, please do believe that there are “inherent” (or “primary”, “core”) data structures. Almost always, those “inherent” data structures correspond to the database’s table structure. So, if you cut straight to the chase and always hold on to those “inherent” data structures, everything else will be easy and straightforward.
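To make the containment-and-multiplicity point concrete, here is a minimal Python sketch. The `Customer`, `Order` and `OrderLine` classes, and the tables they are said to mirror, are hypothetical names chosen only for illustration; nothing in this discussion prescribes them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OrderLine:
    # Mirrors a hypothetical order_line table: many lines per order.
    product_code: str
    quantity: int
    unit_price_cents: int


@dataclass
class Order:
    # Mirrors a hypothetical order table; contains 1..* OrderLine rows.
    order_id: int
    lines: List[OrderLine] = field(default_factory=list)

    def total_cents(self) -> int:
        # Behavior follows the containment: sum over the contained lines.
        return sum(l.quantity * l.unit_price_cents for l in self.lines)


@dataclass
class Customer:
    # Mirrors a hypothetical customer table; contains 0..* Order rows.
    customer_id: int
    name: str
    orders: List[Order] = field(default_factory=list)


customer = Customer(1, "Ann")
order = Order(10, [OrderLine("A", 2, 150), OrderLine("B", 1, 200)])
customer.orders.append(order)
print(order.total_cents())  # 500
```

No inheritance appears anywhere in the sketch; the whole shape comes from what contains what, and how many.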

If you do not believe in this technique, just try to remember the last time you read a book with a large example. I bet that you, just like me, skipped the so-called OO analysis and jumped to the data model. In any book, except some introductory OO books, whoever spends time on the OO analysis is wasting the reader’s time. Just give the database schema and some example data, and you are done.

It seems that we are switching our context (or “offense strategy” ;-) -- I would say that this is especially important for a larger project. For smaller projects, you can get it right no matter what: either OO or database. For large projects, if you do not start with the database (i.e., those “inherent”, “core”, “primary” data structures, or, in DDD terms, the “core domain”), you are likely to get into trouble, because you are likely to get bogged down in a lot of non-essential but complicated things, without the DBAs’ help and without the users’ help.

I get the feeling that DDD is widening the perspective of OO: at least by its name (“DDD”, instead of “OO”), it can include database modeling as part of domain modeling. Strangely enough, though, DDD does not include data modeling. That is not logical ;-) -- the database is the natural ground for the “ubiquitous language” and the “core domain”; how in the world can a person talk about them without talking about databases and data warehousing?!

---- I tried to trace back why I was reading David Hay’s book together with “Analysis Patterns”, and I found the reason in the Foreword of “Analysis Patterns”. The Foreword was by Ralph Johnson, so at least the idea did originate from a well-known author. Of course, I am guilty of pushing it to extremes.

So, I am going to use “DDD” (thanks to Vikas!), however, I will keep my own interpretations, and begin to confuse people (;-) by saying that I am using DDD with data modeling.

Again, the advantages of database-first are that I can leverage my database analysis/design skills as the core of my OO skills, and I can learn from DBAs and users, either directly, and/or indirectly, by reading materials from DBAs’ world and users’ domain world.

Also, I noticed that Design patterns are low level patterns; J2EE patterns are architecture-oriented patterns. Analysis patterns (both OO and database) are more about entities. Other patterns are mostly re-naming of the same thing. Renaming is good in the sense that it forces us to re-think; and part of the rethink is to find out that it is indeed a renaming in a certain context.
For example, “repository” is just a feature-rich DAO; “aggregate” and “UI presenters” are DTOs (“data transfer objects”) in different context. By the way, I heard the term POJO is re-interpreted as “Plain Old clR Object” – it was originally interpreted as “Plain Old Java Object”. Let’s keep this “pattern” – let “J” be the last letter of “clR” -- my point is that .net is so similar to Java light-weight approach; their architecture design literature is really transferable; and Java is also part of DDD. However, I do not understand why DDD is doing it all over again.; 7/30/2006 11:14:00 PM
survic said...: One thing I want to emphasize is that I prefer “custom classes” over “datasets” in any situation (other than supporting other people’s code ;-). So, my argument for “database first” does not mean I am for using the so-called “Table Module” http://www.martinfowler.com/eaaCatalog/tableModule.html. We are in happy agreement in this regard :-)

Also, I want to point out that the so-called “anemic entity model” is the result of SOA: (a) its data must be easily convertible to XML, in memory only; (b) its behavior code can be translated into another language, e.g. JavaScript. Regardless of what you call them, those two things cannot be “abstracted” away; as a matter of fact, the more you try to hide them, the more confusing it gets.
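Point (a) can be sketched in a few lines of Python. This is only an illustration under assumptions: `CustomerData` and `to_xml` are hypothetical names, and real SOA stacks have their own serializers. The entity is deliberately "anemic" (fields only, no behavior), which is exactly what makes the conversion to XML mechanical.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class CustomerData:
    # An "anemic" entity: plain data, no business behavior.
    customer_id: int
    name: str
    city: str


def to_xml(entity) -> str:
    """Serialize any flat dataclass instance to an XML string, in memory only."""
    root = ET.Element(type(entity).__name__)
    for f in fields(entity):
        child = ET.SubElement(root, f.name)
        child.text = str(getattr(entity, f.name))
    return ET.tostring(root, encoding="unicode")


print(to_xml(CustomerData(7, "Ann", "Oslo")))
# <CustomerData><customer_id>7</customer_id><name>Ann</name><city>Oslo</city></CustomerData>
```

The generic `to_xml` works only because the entity carries nothing but data; once behavior and hidden state are mixed in, the mapping stops being this simple.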

In other words, in domain analysis there are two basic facts: that there are (relational) databases, and that the application is (potentially) distributed. Those two facts cannot and should not be hidden; they should be treated as the core of the domain knowledge, dealt with from the very beginning, and kept continuously in a prominent place.; 7/31/2006 12:51:00 PM
survic said...: I felt I needed a more definitive source to support my point of view. I did some research; frankly, I have to admit that I am not in the mainstream.

However, I did find a source: http://www.ambysoft.com/books/theObjectPrimer.html. I did not dig into it too much, and from what I saw it does not go to the extremes that I did, but it shares the same points: the database should be in the domain modeling picture, and developers should use data modeling as one important source of information.; 7/31/2006 04:21:00 PM
Vikas said...: survic wrote "I felt I need more definitive source to support my point of view. I did some research; frankly, I have to admit that I am not on the mainstream"

Because we solve real problems for paychecks rather than make a living out of writing books.
Jokes apart, our focus has been more on tactical projects (1-6 months), where Data Driven Design works fine. Rocky, Martin and Eric focus more on strategic projects, which span multiple years and have many layers of bureaucracy.; 8/01/2006 03:44:00 PM
survic said...: "TDD without mocks and DDD with data" -- now I know why I feel uneasiness with TDD and DDD.

I agree with you about the size of projects; but deep in my heart, I also believe that all larger business projects can be split into smaller projects, so, I do not believe “TDD with mocks and DDD without data” is good for anything -- I will keep an open mind though.

Thank you, Vikas. I notice that I reached both conclusions in response to your blog posts, and now my mind is clearer. Blogging does accelerate thinking a lot.; 8/01/2006 09:24:00 PM
Vikas said...: Welcome Survic. I have not yet given up on you :)

I read the Object Primer and found the following tit-bits.

Conceptual Modeling or Domain Modeling

Various types of models you might want to use for conceptual domain modeling:

1. Robustness diagrams;
2. Object Role Model (ORM) diagrams;
3. Class responsibility collaborator (CRC) models;
4. UML class diagrams;
5. Logical data models (LDMs);
6. Analysis patterns;
7. UML object diagrams.

The fit between object technology and RDB technology is not perfect. In the early 1990s, the difference between the two approaches was labeled the object/relational impedance mismatch, also referred to as the O/R impedance mismatch or simply the impedance mismatch; terms still in common use today. Why does a technological impedance mismatch exist? The object-oriented paradigm is based on proven software engineering principles. The relational paradigm however is based on proven mathematical principles. Because the underlying paradigms are different, the two technologies do not work together seamlessly.

There is also a cultural impedance mismatch between developers in the object community and data professionals in the data community. Object developers have been taking an evolutionary approach to development for years and are now moving towards agile development methods such as Extreme Programming. Unfortunately, many within the data community look upon evolutionary development as a questionable approach, and agile approaches are only now being considered.

When it gets right down to it the real issue is that data professionals view the world as data to be manipulated, whereas object developers view it as objects to be combined to perform behavior.

Data are important.

Data are one of many issues. Although data are important, so are telecommunications, user interface development, working with stakeholders, business component architectures, frameworks, understanding the business domain, and so on. It is quite common for data professionals to overestimate the importance of data, which is unfortunate.

You need to look at the enterprise picture. Luckily, data professionals are often very good at considering enterprise-level issues -- yet another reason developers should work with them closely.

Everyone needs to work together.

Common legacy data challenges

I am listing some from the book:
1. A single data field is used for several purposes.
2. The purpose of a data field is determined by the values of one or more columns.
3. Inconsistent values are stored in a single data field.
4. There is inconsistent/incorrect data formatting within a column.
5. Important entities, attributes and relationships are hidden and floating in text fields.
6. Data values can stray from their field descriptions and business rules.
7. One attribute is stored in several fields.

My Comments
Context: big projects, with a DBA doing the database design.
It is more important to nail down the requirements during design than to figure out the data storage requirements. Because of the O/R impedance mismatch, one may put extra effort into data storage rather than into discovering the problem domain. It is okay for the data storage requirements activity to extend into construction, but not for requirements gathering to do so. If requirements gathering spills into construction, it can be very expensive.

Tomorrow, I am going to post Rocky's comment on O/R impedance.; 8/02/2006 05:47:00 PM
Vikas said...: Now I understand why O/R mapping is such a big subject in the J2EE environment but not in the .NET environment.; 8/02/2006 05:48:00 PM
survic said...: Yes, OR mapping. Good topic -- because of OR mapping, I can concede to your pure OO approach, as long as when we model entity objects, the questions of whether they are persisted (not how), and what their multiplicities are, are two of the earliest and most important issues.

I cannot wait to read your OR mapping post. I would like to see how you do OR mapping and SOA without getting into the so-called “anemic entity model”.; 8/02/2006 09:33:00 PM
Vikas said...: Survic said
“I agree with you about the size of projects; but deep in my heart, I also believe that all larger business projects can be split into smaller projects, so, I do not believe “TDD with mocks and DDD without data” is good for anything -- I will keep an open mind though”

Hi Survic,

Thanks for pointing me to Scott W. Ambler’s book ‘The Object Primer’. It helped me write this post, which is what I wanted to articulate in the first place.
First of all, nobody is underestimating the importance of data in Domain Driven Design. But there are the following problems, or realities, in enterprise data design (Scott calls it legacy database design, but DBAs continue to employ it for the sake of consistency):

1. A single data field is used for several purposes.
2. The purpose of a data field is determined by the value of one or more other columns.
3. Inconsistent values are stored in a single data field.
4. There is inconsistent/incorrect data formatting within a column.
5. Some data values are missing within a data field.
6. One or more data fields that you require do not exist.
7. Additional data fields that your application will need to support if it uses the legacy data exist.
8. Multiple sources exist for the same data and it is not clear which one to use.
9. Important entities, attributes and relationships are hidden and floating in text fields.
10. Data values can stray from their field descriptions and business rules.
11. Various key strategies are used to identify the same type of entity.
12. You require a relationship between data records that is not supported by legacy data.
13. One attribute is stored in several fields.
14. Special characters within a data field are inconsistently used.
15. Different data types are used for similar columns.
16. The legacy data do not contain sufficient detail.
17. The legacy data contain too much data.
18. The legacy data are read-only, yet you require update access.
19. The timeliness of data varies from what you require.
20. The default value used by a legacy application does not reflect the default value required by your system.
21. Different representations of the data exist.
22. The naming conventions used are difficult to understand.

Some of these decisions may be right in their own way. In my past experience with DBAs, they are more concerned with efficient storage of data, the need to be consistent with existing databases (what Scott calls legacy), and future extensibility at the data storage level.

Some of the above factors limit how well a database design diagram can serve as an effective communication tool with all stakeholders. We need an abstraction above the database that hides these complexities and communicates the problem domain/system blueprints more clearly to users.

This is where entity class diagrams come in (since we both agree on the importance of process class diagrams, activity diagrams, sequence diagrams, etc.); they can hide all of the above-mentioned storage details from system users. They help the programmer nail down the requirements and prepare the blueprints without waiting for the database design to be over. Once the requirements are completely nailed down, data storage/retrieval is more of an implementation detail.
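As a rough sketch of what "hiding storage details behind an entity class" could look like, the Python below maps one messy legacy row onto a clean entity. Everything here is an assumption made for illustration: the `Contact` class, the column names `CONTACT_TYPE`, `CONTACT_VAL`, `SINCE` and `NAME`, and the two date formats are invented, not taken from the book or from any real schema. It touches items 2 and 4 of the list above (a field whose purpose depends on another column, and inconsistent formatting within a column).

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class Contact:
    # The clean entity that users and analysts talk about.
    name: str
    phone: Optional[str]
    email: Optional[str]
    since: date


def _parse_date(text: str) -> date:
    # Hypothetical legacy column that mixes two date formats.
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def contact_from_legacy_row(row: dict) -> Contact:
    # Hypothetical layout: CONTACT_VAL holds a phone number or an e-mail
    # address, depending on the value of CONTACT_TYPE ('P' or 'E').
    kind = row["CONTACT_TYPE"].strip().upper()
    value = row["CONTACT_VAL"].strip()
    return Contact(
        name=row["NAME"].strip(),
        phone=value if kind == "P" else None,
        email=value if kind == "E" else None,
        since=_parse_date(row["SINCE"]),
    )


legacy_row = {"NAME": " Ann ", "CONTACT_TYPE": "e",
              "CONTACT_VAL": "ann@example.com", "SINCE": "07/23/2006"}
contact = contact_from_legacy_row(legacy_row)
```

The mapping function is the only place that knows about the legacy quirks; class diagrams shown to users would contain only `Contact`.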

This debate is similar to the XML Hell debate. Nobody is arguing against the importance of XML, but we need an abstraction or graphical tools to hide its rawness, as discussed in the following post:
http://vikasnetdev.blogspot.com/2006/06/xml-hell.html

Reference:
The Object Primer – By Scott W. Ambler; 8/06/2006 08:52:00 PM
survic said...: If databases are not easily approachable, we can say that “let’s grow fast using entity class design, and get stronger in requirement/analysis phase, and then we can deal with databases”. I agree, with some reservations.

I agree because I do believe use-cases/stories (usually in the format of screen mockups) and entity class diagrams must go hand in hand – flows and structures must evolve simultaneously. We cannot do use cases first, then, do entity classes – I was in a project that did that, it was ridiculous and disastrous. While we are doing the mockups, we need to do the domain model (either code or database), we cannot wait for DBAs.

However, I have reservations. We need the data in the damn legacy databases! Why? Because we need to do “manual Fit” – I admit that it is just a fancy name for experimenting with real data to understand their multiplicity – because I know you like Fit ;-)
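A “manual Fit” session of this kind might look like the sketch below: load or connect to the data, then ask it what the multiplicities really are. It uses Python's built-in `sqlite3` with an in-memory database; the `customer` and `address` tables, their columns, and the two helper functions are hypothetical, and the parent table is assumed to have a primary key named `id`.

```python
import sqlite3


def max_children_per_parent(conn, child_table, fk_column):
    # Table and column names are interpolated directly, so pass only
    # trusted, hand-typed names (this is an exploration tool, not an API).
    sql = (
        f"SELECT MAX(n) FROM "
        f"(SELECT COUNT(*) AS n FROM {child_table} GROUP BY {fk_column}) AS t"
    )
    return conn.execute(sql).fetchone()[0] or 0


def parents_without_children(conn, parent_table, child_table, fk_column):
    # Assumes the parent table's primary key column is named "id".
    sql = (
        f"SELECT COUNT(*) FROM {parent_table} p WHERE NOT EXISTS "
        f"(SELECT 1 FROM {child_table} c WHERE c.{fk_column} = p.id)"
    )
    return conn.execute(sql).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE address (id INTEGER PRIMARY KEY, customer_id INTEGER, city TEXT);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO address VALUES (1, 1, 'Oslo'), (2, 1, 'Bergen'), (3, 2, 'Paris');
""")

print(max_children_per_parent(conn, "address", "customer_id"))               # 2
print(parents_without_children(conn, "customer", "address", "customer_id"))  # 1
```

Here the data says the customer-to-address relationship is 0..*, not 1..1: one customer has two addresses and one has none. That is the kind of answer that only comes from looking at real rows.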

So, if it is a new database, we design it ourselves; if it is a legacy one, then, we need the data within it. Either way or both ways, we need to deal with database early on.

After all, this is in the spirit of the “ubiquitous language”, isn’t it?

I am waiting for your ideas on a domain model that is not “anemic” -- I have seen a lot of derogatory comments about the anemic model, but have seen nothing that solves the problem. I do not know whether it is just me, or whether it is the emperor's new clothes, sigh.; 8/07/2006 02:48:00 PM
Vikas said...: So you agree with me with reservations and I agree 100% with you.
Not bad.

Here are my thoughts
http://vikasnetdev.blogspot.com/2006/08/domain-driven-design-based-on-entity.html; 8/08/2006 08:05:00 PM
survic said...: I added this to Vikas's blog:

http://vikasnetdev.blogspot.com/2006/08/domain-driven-design-based-on-entity.html
--------------------

I agree with you now, 100% :-)

The database storage details -- you are right: I have to hide them if they are too messy. If I insist on the “data driven” stuff, then I have to cheat, and persuade myself with the distinctions among conceptual/logical/physical data models.

However, as you may guess, I do not like those demarcations, because the advantage of a database is that the turn-around time between schema design and real data (the “manual Fit”) is a matter of minutes.

More about “manual Fit”: I remember that originally (in the early days of DB2 and Oracle) relational databases were built under the assumption that users would use SQL*Plus directly. That is the spirit of “manual Fit”!

Admittedly, that turned out to be an invalid assumption; however, business analysts, or developers who wear analyst hats, should do such “manual Fit”.

Anyway, I found “manual Fit” is a good concept. I am going to use it everywhere. On the other hand, it adds some non-pure-OO elements to Fit though ;-); 8/09/2006 11:46:00 AM
