About silos and hierarchies in software development

Disclaimer: this is NOT a rant about people. In most of the situations all devs I know want to deliver a good work. This is a rant about organisations imposing such structures calling themselves “an agile company”.

To give you some context: a digital product, sold online as a subscription. The application in my scenario is the usual admin portal to manage customers, get an overview of their payment situation, like balance, etc.
The application is built and maintained by a frontend team. The team is using the GraphQL API built and maintained by a backend team. Every team has a team lead and over all of them is at least one other lead. (Of course there are also a lot of other middle-management, etc.) 

Some time ago somebody must have decided to include in the API a field called “total” containing the balance of the customer so that it can be displayed in the portal. Obviously I cannot know what happened (I’m just a user of this product), but fact is, this total was implemented as an integer. Do you see the problem? We are talking about money displayed on the website, about a balance which is almost never an integer. This small mistake made the whole feature unusable.

Point 1: Devs implement technical requests instead of improving the product 
I don’t know if the developer who implemented this made an error by not thinking about what this total should represent or he/she simple didn’t had the experience in e-commerce but it is not my point. My point is that this person was obviously not involved in the discussion about this feature, why it is needed, what is the benefit. I can see it with my spiritual eyes how this feature became turned in code: The team lead, software lead (xyz lead) decided that this task has to be done. The task didn’t referred to the customer benefit, it stripped everything down to “include a new property called total having as value the sum of some other numbers”. I can see it because I had a lot of meetings like this. I delivered a string to the other team and this string was sometimes a URL and sometimes a name. But I did this in a company which didn’t called himself agile. 

Point 2: No chance for feedback, no chance for commitment for the product
Again: I wasn’t there as this feature was requested and built, I just can imagine that this is what it happened, but it really doesn’t matter. It is not about a special company or about special people but about the ability to deliver features or only just some lines of code sold as a product. Back to my “total”: this code was reviewed, integrated, deployed to development, then to some in-between stages and finally to production. NOBODY on this whole chain asked himself if the new field included in a public(!) API is implemented as it should. And I would bet that nobody from the frontend team was asked to review the API to see if their needs can be fulfilled.

Point 3: Power play, information hiding makes teams slow artificially (and kills innovation and the wish to commit themselves to the product they build) 
If this structure wouldn’t be built on power and position and titles then the first person observing the error could have talked to the very first developer in the team responsible for the feature to correct it. They could have changed it in a few minutes (this was the first person noticing the error ergo nobody was using it yet) and everybody would have been happy. But not if you have leads of every kind who must be involved in everything (because this is why they have their position, isn’t it?) Then somebody young and enthusiastic wanting to deliver a good product would create a JIRA ticket. In a week or two this ticket will be eventually discussed (by the leads of course)  and analyzed and it will eventually moved forward in the backlog – or not. It doesn’t matter anyway because the frontend team had a deadline and they had to solve their problem somehow.

Epilogue: the culture of “talk only to the leads” bans the cooperation between teams
at this moment I did finally understood the reason behind of another annoying behavior in the admin panel: the balance is calculated in the frontend and is equal with the sum of the shown items. I needed some time to discover this and was always wondering WTF… Now I can see what happened: the total in the API was not a total (only the integer part of the balance) and the ticket had to be finished so that somebody had this idea to create a total adding the values from the displayed items. Unfortunately this was a very short-sighted idea because it only works if you have less then 25 payments, the default number of items pro page. Or you can use the calculator app to add the single totals on every page…

All this is on so many levels wrong! For every involved person is a lose-lose situation.Β 

What do you think? Is this only me arguing for better “habitat for devs” or it is time that this kind of structures disappear.

Base your decisions on heuristics and not on gut feeling

As a developer we tackle very often problems which can be solved in various ways. It is ok not to know how to solve a problem. The real question is: how to decide which way to go 😯

In this situations often I rather have a feeling as a concrete logical reason for my decisions. This gut feelings are in most cases correct – but this fact doesn’t help me if I want to discuss it with others. It is not enough to KNOW something. If you are not a nerd from the 80’s (working alone in a den) it is crucial to be able to formulate and explain and share your thoughts leading to those decisions.

Finally I found a solution for this problem as I saw the session of Mathias Verraes about Design Heuristics held by the KanDDDinsky.

The biggest take away seems to be a no-brainer but it makes a huge difference: formulate and visualize your heuristics so that you can talk about concrete ideas instead of having to memorize everything what was said – or what you think it was said.

Using this methodology …

  • … unfounded opinions like “I think this is good and this is bad” won’t be discussed. The question is, why is something good or bad.
  • … loop backs to the same subjects are avoided (to something already discussed)
  • … the participants can see all criteria at once
  • … the participants can weight the heuristics and so to find the probably best solution

What is necessary for this method? Actually nothing but a whiteboard and/or some stickies. And maybe to take some time beforehand to list your design heuristics. These are mine (for now):

  • Is this a solution for my problem?
  • Do I have to build it or can I buy it?
  • Can it be rolled out without breaking neither my features as everything else out of my control?
  • Breaks any architecture rules, any clean code rules? Do I have a valid reason to break these rules?
  • Can lead to security leaks?
  • Is it over engineered?
  • Is it much to simple, does it feel like a short cut?
  • If it is a short cut, can be corrected in the near future without having to throw away everything? = Is my short cut implemented driving my code in the right direction, but in more shallow way?
  • Does this solution introduce a new stack = a new unknown complexity?
  • Is it fast enough (for now and the near future)?
  • … to be continued πŸ™‚

The video for the talk can be found here. It was a workshop disguised as a talk (thanks again Mathias!!), we could have have continued for another hour if it weren’t for the cold beer waiting πŸ™‚

Event Storming with Specifications by Example

Event Storming is a technique defined and refined by Alberto Brandolini (@ziobrando). I fully agree the statement about this method, Event Storming is for now “The smartest approach to collaboration beyond silo boundaries”

I don’t want to explain what Event Storming is, the concept is present in the IT world for a few years already and there are a lot of articles or videos explaining the basics. What I want to emphasize is WHY do we need to learn and apply this technique:

The knowledge of the product experts may differ from the assumption of the developers
KanDDDinsky 2018 – Kenny Baas-Schwegler

On the 18-19.10.2018 I had the opportunity to not only hear a great talk about Event Storming but also to be part of a 2 hours long hands-on session, all this powered by Kandddinsky (for me the best conference I visited this year) and by @kenny_baas (and @use case driven and @brunoboucard). In the last few years I participated on a few Event Storming sessions, mostly on community events, twice at cleverbridge but this time it was different. Maybe ES is like Unit Testing, you have to exercise and reflect about what went well and what must be improved. Anyway this time I learned and observed a few rules and principles new for me and their effects on the outcome. This is what I want to share here.

  1. You need a facilitator.
    Both ES sessions I was part at cleverbridge have ended with frustration. All participants were willing to try it out but we had nobody to keep the chaos under control. Because as Kenny said “There will be chaos, this is guaranteed.” But this is OK, we – devs, product owners, sales people, etc. – have to learn fast to understand each other without learning the job of the “other party” or writing a glossary (I tried that already and didn’t helped 😐 ). Also we need somebody being able to feel and steer the dynamics in the room.


    The tweets were written during a discussion about who could be a good facilitator. You can read the whole thread on Twitter if you like. Another good article summarizing the first impressions of @mathiasverraes as facilitator is this one.

  2. Explain AND visualize the rules beforehand.
    I skip for now the basics like the necessity of a very long free wall and that the events should visualize the business process evolving in time.
    This are the additional rules I learned in the hands-on session:

      1. no dev-talk! The developer is per se a species able to transform EVERYTHING in patterns and techniques and tables and columns and this ability is not helpful if one wants to know if we can solve a problem together. By using dev-speech the discussion will be driven to the technical “solvability” based on the current technical constraints like architecture. With ES we want to create or deepen our ubiquitous language , and this surely not includes the word “Message Bus”  πŸ˜‰
      2. Every discussion should happen on the board. There will be a lot of discussions and we tend to talk a lot about opinions and feelings. This won’t happen if we keep discussing about the business processes and events which are visualized in front of us – on the board.
      3. No discussions regarding persons not in the room. Discussing about what we think other people would mind are not productive and cannot lead to real results. Do not waste time with it, time is too short anyway.
      4. Open questions occurring during the storming should not be discussed (see the point above) but marked prominently with a red sticky. Do not waste time
      5. Do not discuss about everything, look for the money! The most important goal is to generate benefit and not to create the most beautiful design!

Tips for the Storming:

  • “one person, one sharpie, one set of stickies”: everybody has important things to say, nobody should stay away from the board and the discussions.
  • start with describing the business process, business rules, eventual consistent business decisions aka policies, other constraints you – or the product owner whom the business “belongs” – would like to model, and write the most important information somewhere visible for everybody.
  • explain how ES works: every business relevant event should be placed on a time line and should be formulated in the past tense. Business relevant is everything somebody (Kibana is not a person, sorry πŸ˜‰ ) would like know about.
  • explain the rules and the legend (you need a color legend to be able to read the results later).
  • give the participants time (we had 15 minutes) to write every business event they think it is important to know about on orange stickies. Also write the business rules (the wide dark red ones) and the product decisions (the wide pink ones) on stickies and put them there where they are applied. The rules before the event, the policies after one event happened.
  • start putting the stickies on the wall, throw away the duplicates, discuss and maybe reformulate the rest. After you are done try to tell the story based on what you can read on the wand. After this read the stickies from the end to the start. With these methods you should be able to discover if you have gaps or used wrong assumptions by modelling the process you wanted to describe.
  • mark known processes (like “manual process”) with the same stickies as the policies and do not waste time discussing it further.
  • start to discuss the open questions. Almost always there are different ways to answer this questions and if you cannot decide in a few seconds than postpone it. But as default: decide to create the event and measure how often happens so that later on you can make the right decision!
    Event Storming – measure now, decide later


    Another good article for this topic is this one from @thinkb4coding

At this point we could have continued with the process to find aggregates and bounded contexts but we didn’t. Instead we switched the methodology to Specifications by Example – in my opinion a really good idea!

Specifications
Event Storming enhanced with Specifications by Example

We prioritized the rules and policies and for the most important ones we defined examples – just like we are doing it if we discuss a feature and try to find the algorithm.

Example: in our ticket reservation business we had a rule saying “no overbooking, one ticket per seat”. In order to find the algorithm we defined different examples:

  • 4 tickets should be reserved and there are 5 tickets left
  • 4 tickets should be reserved and there are 3 tickets left
  • 4 tickets should be reserved and all tickets are already reserved.

With this last step we can verify if our ideas and assumptions will work out and we can gain even more insights about the business rules and business policies we defined – and all this not as developer writing if-else blocks but together with the other stake holders. At the same time the non-techie people would understand in the future what impact these rules and decisions have on the product we build together. The side-effect having the specifications already defined is also a great benefit as these are the acceptance tests which will be built by the developer and read and used by the product owner.

More about the example and the results can you read on the blog of Kenny Baas-Schwegler.

I hope I covered everything and have succeeded to reproduce the most important learning of the 2 days ( I tend to oversee things thinking “it is obvious”). If not: feel free to ask, I will be happy to answer πŸ™‚

Happy Storming!

Update: we had our first event storming and it was good!

Unfortunately we didn’t get to define the examples (not enough time). Most of the rules described above were accepted really well (explain the rules, create a legend for the stickies, flag everything out of scope as Open Question). Where I as facilitator need more training is by keeping the discussion ON the board and not beside. I have also a few new takeaways:

  • the PO describes his feature and gives answers, but he doesn’t write stickies. The main goal ist to share his vision. This means, he should test us if we understood the same vision. As bonus he should complete his understanding of the feature trough the questions which appear during the storming.
  • one color means one action/meaning. We had policies and processes on the same red stickies and this was misleading.
  • if you have a really complex domain
    (like e-commerce for SaaS products in our case) or really complex features start with one happy path example. Define this example and create the event “stream” with this example. At the end you should still add the other, not so happy-path examples.