Udi Dahan posted an interesting reply to a recent posting of mine.  In my post, I go into detail to present a scenario where two services are coupled because the business itself is coupled.  He disagreed with my design and offered an alternative.  I will discuss his alternative and show that our designs are similar but that mine is more stable and more appropriate to the specific example I described.

I’ll see if I can add real diagrams tomorrow, when I’m on my own PC.  For now, we will have to make do with ‘text diagrams’.  [Addendum: I did my best to create diagrams.  The large one may get cut off in your browser.  Open it in a separate window to see all of it.]

From what I can tell of Udi’s model, it looks like this:

[create-co-op-partner service]

    — writes –> [co-op master db]

    — subscribes –> [partner change events] — generated by –> [partner-master service]

    — writes –> [local partner data cache]

    — calls async –> [insert-partner-master]

    — notifies –> [original caller]

[partner-master service]

    — writes –> [partner master db]

    — publishes change events –> to all subscribers


My model was a bit different.  Here’s mine in the same goofy notation.

[composite orchestration: create-co-op-partner]

    — calls sync –> [partner-master] (retrieve partner id)

    — calls sync –> [add-to-co-op-master] (retrieve co-op id)

    — notifies –> [original caller]

[partner-master]

    — writes –> [partner master db]

[add-to-co-op-master]

    — writes –> [co-op master db]

Addendum: I created this diagram to show my viewpoint in the context of some callers.  You may need to open it in a separate window to see all of it.

Some interesting comparisons: my message exchange pattern (MEP) is less reliant on async calls.  The assumption I made is that the orchestration itself is reliable, so if it cannot call one of the downstream services, the orchestration engine retries later (perhaps dehydrating as needed).  This probably lowers scalability.  On the other hand, my orchestration has two advantages: it is simpler to build, and in the 99% sunny-day flow it performs far better.
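To make that assumption concrete, here is a minimal sketch of the composite orchestration with retry around each sync call.  The function names and retry mechanics are mine, not from either design, and a real orchestration engine would dehydrate the instance between attempts rather than sleep in memory:

```python
import time

def call_with_retry(service, payload, attempts=3, delay=0.0):
    """Call a downstream service synchronously; if the call fails,
    retry later.  (A real engine would dehydrate between attempts
    instead of sleeping in-process.)"""
    for attempt in range(attempts):
        try:
            return service(payload)
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # give up only after the retry budget is spent
            time.sleep(delay)

def create_co_op_partner(doc, partner_master, add_to_co_op_master, notify):
    """The composite orchestration: two sync calls, then notify the caller."""
    partner_id = call_with_retry(partner_master, doc)
    co_op_id = call_with_retry(add_to_co_op_master, dict(doc, partner_id=partner_id))
    notify({"partner_id": partner_id, "co_op_id": co_op_id})
    return co_op_id
```

In the sunny-day flow this is just two in-order calls; the retry loop only matters when a downstream service is unreachable.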

So there are architectural tradeoffs between the two designs: Udi wins for scalability, while I get performance. 

What is the cost of the scalability?  Udi’s design is far more complex and thus more expensive to build and own.  Do we get so many new co-op partners every day that we need the added cost of Udi’s design?  I doubt it.  Perhaps if the example were dealing with orders, but it isn’t.  It is dealing with co-op partner agreements… negotiated legal documents.  Even very large companies may create a handful of these in a month.  So the added complexity (and cost) produces no return on investment.

The most important difference, however, is not the use of sync or async services.  In fact, both models assume that the orchestration lives in an async container, and if you started with my model, it would be a trivial change to move to async services and pub-sub.  So, while I can chide Udi’s design on the basis of cost, that isn’t my disagreement with it.  In fact, I quite like the notions of publish-subscribe and local distributed data cache.  However, his model is not “elegant” in my opinion.

The source of my discomfort is the coupling.  In my model, the ‘create-co-op-partner’ service performs ONLY orchestration.  It makes no attempt to call a local database or store cache records.  It calls only other services.  This allows the fine-grained services to be called directly by other consuming applications.  In effect, my model allows the business process to be encapsulated and separated from the fine-grained services that it calls.

Udi binds them tightly together.  In his model, a change to the business process affects all systems that call ‘create co-op partner’, while in my model, any systems that are consuming the fine-grained services would not be affected by the change.  These are three different things: two fine-grained services and one (composite) process service.  Tying the process service to one of the data services just doesn’t feel right to me.

Which one is better?

That’s not an easy question.  I sat at my desk for an hour before coming up with a situation where my model is better, but I don’t think it is all that common a situation.  I will say that, generically, I believe that decoupling these three things from one another feels better.  That said: the best I could do to prove it is an odd case.  Here goes:

Let’s say that we are implementing the following process in our orchestration (both models have an orchestration; the same process for both):  the ‘create-co-op-partner’ service is called and passed a data document that describes a co-op partner.  There is no ‘master partner id’, so we search against the master partner service (or local partner cache) to see if the partner already exists.  It does, so we get the partner id.  We then create the co-op partner with the partner id we found.
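That process could be sketched like this.  The service interfaces (`find`, `add`) are my own stand-ins for the fine-grained services, not names from either post:

```python
def create_co_op_partner(doc, partner_master, co_op_master):
    """Original process: no master partner id arrives on the document,
    so search for the partner first, then create the co-op record
    with the partner id we found."""
    partner_id = doc.get("master_partner_id")
    if partner_id is None:
        partner_id = partner_master.find(doc)  # the partner already exists
    return co_op_master.add(partner_id, doc)   # create the co-op record
```

Note that the orchestration touches no database of its own; each fine-grained service owns its own store.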

Along comes a change to the requirements.  (no… that NEVER happens ;-).

Our business wants another business process for the ‘spy toys’ division.  In this process, ‘create-co-op-partner’ will get a data document that describes a co-op partner.  The difference is that in this process, we don’t search for the partner first.  If no master partner id arrives on the data document, we always create a new partner first, and then create the co-op partner record.  Two different processes: two databases that need to be coordinated.

With my model, you simply create a new composite service that performs the new orchestration, have it call both fine-grained services, and move on.  No changes to the existing services.  No regression testing.
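Using the same stand-in interfaces as before (again, my names, not from either post), the new composite is just another orchestration over the unchanged fine-grained services:

```python
def create_co_op_partner_spy_toys(doc, partner_master, co_op_master):
    """New 'spy toys' process: never search.  If no master partner id
    arrives on the document, always create a new partner first, then
    create the co-op partner record."""
    partner_id = doc.get("master_partner_id")
    if partner_id is None:
        partner_id = partner_master.create(doc)  # new partner, no lookup
    return co_op_master.add(partner_id, doc)
```

The fine-grained `partner_master` and `co_op_master` services are untouched; only a new composite was added.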

With Udi’s model, you have two choices: either change an existing service to support both processes (and incur regression testing costs) or you create a new service that performs the new orchestration rules as well as performs fine-grained database work to add the co-op partner.

Let’s say we have Udi’s model in production and we assign this new requirement to Tom, our very cool support developer.  Of course, our good support programmer chooses to create a new service.  He doesn’t want to regression-test thousands of lines of code that he didn’t change.  So he copies the existing service, changes the process code, and puts it out on the test server.  Of course, Tom would also realize that he has copied the code for the fine-grained database work to two places.  That code is identical between the services today, but it could diverge.  Fixing a bug would have to happen twice.  Bad.  So Tom promptly creates a fine-grained service that both orchestration services will call and refactors out the common code.  Voilà!  Udi’s model just migrated and morphed into mine.

So with all due respect to an excellent architect, I say this: just as water seeks a level, design seeks stability.  If you build a design that, when kicked with a change, immediately folds to another design, the second design may be more stable than the first.  Why not start there? 

The sync/async point is not meaningful.  It is a tradeoff.  I maintain that I made the more appropriate one given the tangible example at hand. 

By Nick Malik

Former CIO and present Strategic Architect, Nick Malik is a Seattle-based business and technology advisor with over 30 years of professional experience in management, systems, and technology. He is the co-author, with Dr. Brian Cameron, of the influential paper "Perspectives on Enterprise Architecture" that effectively defined modern Enterprise Architecture practices, and he is a frequent speaker at public gatherings on Enterprise Architecture and related topics. He coauthored a book on visual storytelling with Martin Sykes and Mark West titled "Stories That Move Mountains".

2 thoughts on “Alas, We must differ…”
  1. "So there are architectural tradeoffs between the two designs: Udi wins for scalability, while I get performance."

    How do you measure performance? In my opinion, there is a function which translates scalability to performance.

    P = F(S, H)

    Where P is performance, S is scalability, and H is hardware (including network and all other non-software elements).

One thing we can say about F is that, for a scalable system (certain values of S), P increases with H.

    In which case, could not a scalable system achieve higher performance than a performant but non-scalable system?

  2. Interesting formula, Udi.  Problem is that your function is not continuous.  It is therefore not valid to use it for extrapolation.  

    In other words, if you need 350ms response time, and your architecture cannot respond faster than 1,200ms response time, with no load, then adding hardware isn’t going to help.  

    Performance doesn’t go up beyond the constrained maximum indicated by the architecture.  

    So, no, creating an architecture that doesn’t meet performance criteria cannot be papered over by adding hardware.  Note that data centers are reducing hardware these days, not adding it.  You won’t make friends by suggesting that the solution to a problem is to put in a server.

    Note that the async nature of your design is NOT what I am concerned about.  The entire ‘performance’ thread is a red herring.  I’m concerned about coupling.  You appear to be recommending a tightly coupled design.  That is the chief point of my disagreement.  Are you willing to address that point or do you concede that you are suggesting a tightly coupled service design?
