One thing I’ve been thinking and talking about for the past few weeks is the relationship between four different concepts, a relationship that I didn’t fully grasp at first but have become more convinced of as time wears on.  Those terms are:

  • Enterprise Canonical Data Model
  • Canonical Message Schema
  • Event Driven Architecture
  • Business Event Ontology

I understood a general relationship between them, but as time has passed and I’ve been placing my mind directly in the space of delivering service oriented business applications, the meanings have crystallized and their relationship has become more important.  First, some definitions from my viewpoint.

  • Enterprise Canonical Data Model – The data we all agree on.  This is not ALL the data.  This is the data that we all need to agree on in order to do our business.  This is the entire model, as though the enterprise had one and only one relational database.  Of course, it is impossible for the enterprise to function with a single database.  So, in some respect, creating this model is an academic exercise.  Its usefulness doesn’t become apparent until you add in the following concepts, so read on.
  • Canonical Message Schema – When we pass a message from one application to another, over a Service Oriented Architecture or in EDI or in a batch file, we pass a set of data between applications.  Both the sender and the receiver have a shared understanding of each field’s (a) data type, (b) range of values, and (c) semantic meaning.  The first two we can handle with the service tools we have.  The third one is far and away the hardest to do, and this is where most of the cost of point-to-point integration comes from: creating a consistent agreement between two applications for what the data MEANS and how it will be used. 
  • Event Driven Architecture – a style of application and system architecture characterized by the development of a set of relatively independent actors who communicate events amongst themselves in order to achieve a coordinated goal.  This can be done at the application level, the distributed system level, the enterprise level, and the inter-enterprise level (B2B and EDI).  I’ve used this at many levels.  It’s probably my favorite model.  At the application level, I once participated in coding a component that interacted in an EDA application that ran in firmware on a high-speed modem.  At the system level, I helped design a system of messages and components that controls the creation of enterprise agreements.  At the enterprise level, I worked for numerous agencies, in my consulting days, to set up EDI transactions to share business messages between different business partners.
  • Business Event Ontology — A reasonably complete list of business events, usually in a hierarchy, that represents the points in the overall business process where two “things” need to communicate or share.  I’m not referring to a single event, but rather to the entire list.  Note that a business event is not the same as a process step.  An event may trigger a process step, but the event itself is a “notification of something that has occurred,” not the name of the process we follow next.

I guess what escaped me, until recently, was how closely related these concepts really are.

The way I’m approaching this starts from the business goal: use data to drive decisions.  Therefore, we need good data.  In order to have good data, we need to either integrate our applications or bring the data together at the end.   Either way, if the data is used consistently along the way, we will have a good data set to report from at the end. 

To create that consistency, we need the Enterprise Canonical Data Model.  Creating this bird is not easy.  It requires a lot of work and executive buy-in.  Note that the process of creating this model can generate a lot of heated discussions, mostly about variations in business process.  Usually the only way to mitigate these discussions is to create a data model that contains either none of the variations between processes, or contains them all.  Neither direction is “more correct” than the other.

However, in order to integrate the applications, either along the way or at the end of the data-generation processes, we need to use a particularly constrained definition of Canonical Schema: the Enterprise Canonical Message Schema is a subset of the Enterprise Canonical Data Model, representing the data that we will pass between systems because many people agree it is useful to share. Note that we added a constraint over the definition above.  Not only are we sharing the data, but we are sharing data drawn from the Enterprise CDM. 

By constraining our message schema to the elements in the Enterprise Canonical Data Model, we radically reduce the cost of producing good data “at the end” because we will not generate bad data along the way.  The key word is “subset.”  If you create a canonical schema without a canonical data model, you are building a house on sand.  The CDM provides the foundation for the schema, and creating the schema first is likely to cause problems later.
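To make the “subset” relationship concrete, here is a minimal sketch in Python. It treats the Enterprise Canonical Data Model as an agreed set of named, typed fields and checks that a proposed message schema draws only from it. All of the field names and the function are illustrative assumptions, not anything from a real CDM.

```python
# Hypothetical sketch: the Enterprise CDM as the authoritative set of
# named, typed fields.  Field names here are purely illustrative.
CANONICAL_DATA_MODEL = {
    "customer_id": str,
    "customer_name": str,
    "order_id": str,
    "order_total": float,
    "order_date": str,
}

def is_canonical_subset(message_schema: dict) -> bool:
    """A message schema is canonical only if every field it carries
    exists in the CDM with the same agreed type."""
    return all(
        field in CANONICAL_DATA_MODEL and CANONICAL_DATA_MODEL[field] is ftype
        for field, ftype in message_schema.items()
    )

# A message schema that stays inside the CDM passes the check...
order_placed = {"order_id": str, "customer_id": str, "order_total": float}
print(is_canonical_subset(order_placed))        # True

# ...while a locally invented field name does not.
print(is_canonical_subset({"ord_tot": float}))  # False
```

In a real shop this check would run against schema artifacts (XSD, JSON Schema, and so on) rather than Python dicts, but the governance rule it enforces is the same: no message field that the enterprise model does not define.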

Therefore, for my friends still debating if we should do SOA as a “code first” or “schema first” approach, I will say this: if you want to actually share the service, you have no choice but to create the service “schema first” and even then, only AFTER a sufficiently well understood part of the canonical data model is described and understood.

And for my friends creating schemas that are not a subset of the overall model, time to resync with the overall model.  Let’s get a single model that we all agree on as a necessary foundation for data integration.

The next relationship is between the Canonical Message Schema and the Event Driven Architecture approach.  If you build your application so that you are sending messages, and you want to create autonomy between the components (goodness), you need to send data that has a well understood interpretation and as little “business rule baggage” as you can get away with.  What better place than the Canonical Data Model to get that understanding?  Now, this is no longer an academic exercise.  Creating the enterprise level data model provides common understanding, so that these messages can have clear and consistent meaning.  That is imperative to the notion of Event Driven Architecture, where you are trying to keep the logic of one component from bleeding over into another. 
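The decoupling described above can be sketched with a toy event bus: components never call each other directly, they only publish and subscribe to named events whose payloads are assumed to conform to a canonical message schema. The class and event names are illustrative assumptions.

```python
from collections import defaultdict

class EventBus:
    """Toy publish/subscribe bus: components share only canonical
    messages, never each other's internals."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self._subscribers[event_name].append(handler)

    def publish(self, event_name, message: dict):
        # The message is assumed to conform to the Canonical Message
        # Schema for this event; subscribers rely on that shared
        # meaning, not on the publisher's implementation.
        for handler in self._subscribers[event_name]:
            handler(message)

bus = EventBus()
received = []

# A downstream component reacts to the event without knowing who sent it.
bus.subscribe("order_placed", lambda msg: received.append(msg["order_id"]))
bus.publish("order_placed", {"order_id": "ORD-42", "customer_id": "C-7"})
print(received)  # ['ORD-42']
```

The point of the sketch: because the only coupling between publisher and subscriber is the event name and the canonical payload, either side can be replaced without the other noticing.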

The business event ontology defines the list of events that will occur that require you to send data.  Creating an ontology requires that you understand the process well enough to generalize the process steps into common-held sharable events.  To get this, the data shared at the point of an event should be in the form of an Enterprise Canonical Message Schema.

Therefore, to summarize the relationship:

   Business Events occur in a business, causing an application to send a Canonical Message to another application.  The Canonical Message Schema is a subset of the Canonical Data Model.  Event Driven Architecture is most efficient when the messages you send between components conform to the Canonical Message Schema.  This provides you with more consistent data, which is better for creating a business intelligence data warehouse at the end.

Some agility notes:

The list of business events in a prospect ontology may include things like “receive prospect base information”, “receive prospect extended information”, “prospect questionnaire response received”, “prospect (re)assigned”, “prospect archived”, “prospect matched to existing customer”, “prospect assigned to marketing program,” etc. It is not a list of process steps.  Just the events that occur as inputs or outputs.
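One way to record such an ontology is as a small hierarchy of event names grouped by business area. The event names below are the examples from the post; the grouping into “intake” and “lifecycle” is an illustrative assumption about how a team might organize them.

```python
# A hypothetical prospect event ontology: events, not process steps.
PROSPECT_EVENT_ONTOLOGY = {
    "prospect": {
        "intake": [
            "receive prospect base information",
            "receive prospect extended information",
            "prospect questionnaire response received",
        ],
        "lifecycle": [
            "prospect (re)assigned",
            "prospect archived",
            "prospect matched to existing customer",
            "prospect assigned to marketing program",
        ],
    }
}

def all_events(ontology):
    """Flatten the hierarchy into the complete event list."""
    for area in ontology.values():
        for events in area.values():
            yield from events

print(len(list(all_events(PROSPECT_EVENT_ONTOLOGY))))  # 7
```

Keeping the ontology in a single reviewable structure like this makes it easy to check, per iteration, that every event surrounding a process has been captured.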

Clearly, this list can be created in iterations, but if it is, you need to make sure that you capture all of the events that surround a particular high level process, rather than focusing on the technology.  In other words, the business processes of “qualify prospect” or “validate order” may have many business events associated with them, and those events may need to touch many applications and people.  If you decide to focus on “qualify prospect” first, then understand all of the events surrounding “qualify prospect” before moving on to “validate order.”  But if both processes hit your Customer Relationship Management system, focus on the process, not the system. 


By Nick Malik

Former CIO and present Strategic Architect, Nick Malik is a Seattle-based business and technology advisor with over 30 years of professional experience in management, systems, and technology. He is the co-author of the influential paper "Perspectives on Enterprise Architecture" with Dr. Brian Cameron that effectively defined modern Enterprise Architecture practices, and he is a frequent speaker at public gatherings on Enterprise Architecture and related topics. He coauthored a book on Visual Storytelling with Martin Sykes and Mark West titled "Stories That Move Mountains".

12 thoughts on “Canonical Model, Canonical Schema, and Event Driven SOA”
  1. It is an efficient & Agile way to break down the Business Model/Architecture.

    I agree with you that, as our SOA implementation reaches some maturity, we do tend to think along these lines. But keeping these in mind beforehand will save us a lot of time and confusion.

    Thanks for relating these concepts..

  2. @Jack

    I could not have said it better myself.  


    The goal is to figure out how to communicate.  Think of this like the diplomatic community.  In a country, a lot goes on that the diplomatic community is not really worried about.  However, when we want to talk between ourselves (to create an international treaty, for example), we need to have a common language to negotiate and sign the treaty in.  That common language (not the content of the treaty) is the stuff defined by the Enterprise Canonical Data Model.

  3. +1. Like Jack, I really like how you’ve laid out the concepts. Very nice. App-independent messages are a key decoupling mechanism.

    "Creating this bird is not easy.  It requires a lot of work and executive buy-in."

    Indeed. The difficulty in creating ECDM and ECMS cannot be overstated, IMO. This can be really hard–especially when the diplomats in the community aren’t all that interested in participating in the exercise–"just send me the data I need". "Didn’t we just do this for data warehousing?"

    Lastly, you touched on this a little, but this exercise should not lose sight of the business processes. Only in the context of the processes do the events make sense. Integration, IMO, is best served by a "process first" approach. Events and services come after.


  4. @Rob,

    I agree with a process first approach and I think that’s one reason why the business event ontology is so important.  

    That said, processes have a hierarchy.  We speak of Level 1 processes like Marketing, Sales, Fulfillment, etc.  Level 2 processes would be under one of those top level ones.  For example, under Marketing would be things like Create Market Strategy, Segment Market, Build Programs, Execute Programs, Capture Response.  

    That is the level where the business event ontology really hits home.  This is because these Level 2 processes are the domains of large systems.  You will tend to find a system that spans the process, largely from end to end.  The business likes looking at data within these buckets, and cares a lot less about the individual data elements flowing between them.

    So I agree, we start with process.  On the other hand, I caution teams not to go all the way down to level 4 and level 5 before starting on Integration and Services.  While there are likely to be services developed to support level 4 processes, they will be built in the context of a single system and don’t need to be architected "from the center."  They need to be architected in the project that builds or maintains the systems that serve those needs.

    Process First, but stop before you go too deep.

  5. It was the part about the CDM being like a shared relational database that led me to think about "one true schema" and other common data model approaches. Jack’s post makes the distinction between the common data format and the message metadata CDM, and I recommend reading his post first 🙂

  6. In the practical world, a process needs to be put in place, perhaps by SOA governance, to make all this work effectively. I have been on consulting projects where the enterprise had the Canonical Model (not the database) in place. The process of subsetting was so cumbersome that we ended up creating our own "Rider" data model and schemas. Perhaps there is some kind of tooling that SOA platform vendors will start providing in the future to ease subsetting, transforming, and versioning from the Enterprise repository?

  7. @Anil,

    I’d love to hear more.  Can you send me an e-mail directly?  I’d like to know what, in the enterprise you worked in, made it so difficult to create a subset message from the Canonical Model?

    — N
