Tl;dr: Continuous integration and delivery are not about a pipeline, it is about trust, psychological safety, a common goal and real teamwork.
What is needed for CI/CD – and how to achieve those?
No feature branches but trunk-based development and feature toggles: feature branches mean discontinuous development. CI/CD works with only one temporary branch: the local copy on your machine getting integrated at the moment you want to push. “No feature branches” also means pushing your changes at least once a day.
A feeling of safety to commit and push your code: trust in yourself and trust in your environment to help you if you fall – or steady you to not fall at all.
Quality gates to keep the customer safe
Observing and reducing the outcome of your work (as a team, of course)
Resilience: accept that errors will happen and make sure that they are not fatal, that you can live with them. This means also being aware of the risk involved in your changes
What happens in the team, in the team-work:
It enables a growing maturity, autonomy due to fast feedback, failing fast and early
It makes us real team-workers, “we fail together, we succeed together”
It leads to better programmers due to the need for XP practices and the need to know how to deliver backwards compatible software
It has an impact on the architecture and the design (see Accelerate)
Psychological safety: eliminates the fear of coding, of making decisions, of having code reviews
It gives a common goal, valuable for everybody: customers, devs, testers, PO, company
It makes everybody involved happy because of much faster feedback from customers instead of the feedback of the PO => it allows to validate the assumption that the new feature is valuable
It drives new ideas, new capabilities bc it allows experiments
Sets the right priorities: not to jump to code but to think about how to deliver new capabilities, to solve problems (sometimes even by deleting code)
How to start:
Agree upon setting CI/CD as a goal for the whole team: focus on how to get there not on the reasons why it cannot work out
Consider all requirements (safety net, coding and review practices, creating the pipeline and the quality gates) as necessary steps and work on them, one after another
Agree upon team rules making CI/CD as a team responsibility (monitoring errors, fixing them, flickering tests, processes to improve leaks in the safety net, blameless post-mortems)
Learn to give and get feedback on a professional manner (“I am not my work”). For example by reading the book Agile Conversations and/or practice it in the meetup
– – – – –
This bullet-point list was born during this year’s CITCON, a great un-conference on continuous improvement. I am aware that they can trigger questions and needs for explanations – and I would be happy to answer them 🙂
I did all of this. One case I will never forget: I should implement a feature request resulting in returning some object property as a string. This property was containing a URL, but the feature didn’t say “I need to know how to navigate to X or Y” but “please include the URL X in the result”.
It turned out that another 2 teams used this “string” to build navigation on it or to include it in emails without ever telling me. Why should they? I was done with the feature: it was their turn. Both of them have validated this string, have built URLs with them (using information exclusively owned by the backend service…), etc.
Let me be more explicit:
Failure No. 1: If I would have changed some internals in the backend service, I could’ve broken the UI code without knowing. My colleagues relied on things they had no chance to control. We were dependent on each other without being able to see it.
Failure No. 2: the company paid 3 different developers to write the same validation functions and the customer flow had to pass the same validations 3 times instead of only once. A totally wrong decision, from an economical point of view.
I think that was the moment I decided to change the way we deliver features, the way we work together. This was 6 or 7 years ago and since then I followed the same method to reorganize not only the teams but also the source code. Because one thing is sure: changing one without the other only leads to bigger pains and even more frustration.
Step 1. Visit the “other side” of that wall and learn what they are doing and how they are doing it. You will observe bottlenecks and wasted time and energy in your value stream (the road a feature passes from the idea to the customer)
Step 2. Get the buy-in by the next level in your hierarchy: in most situations (in both cases I were in this situation) you are not the first one noticing these problems, but you could be the first one offering a solution. Grab this chance, don’t hesitate!
Step 3. Remove the wall between the silos: find a good time to make your move, after the biggest project ended or before the next one starts. Don’t wait too long, there always will be unfinished features.
Step 4. This depends on how many team members we are talking about. In both cases, we were around 15 people, and nobody wants stand-ups or even meetings with 15 people! You become even slower and even less capable to take decisions. But this step is important for a few things:
both “parties” should learn and understand what the others do, how the parts are connected, what language, concept, design is used to build them
all members should understand and accept that it is important to split up in teams – and this is always hard because it means “we have to change”! Developers are – against all expectations – very reluctant to change. Even more reluctant when they realize that they won’t work with their buddies anymore but with some hardly known people they do not really or trust.
you and/or your boss, your colleagues, your buddy in this change must start to discover how the domain is shaped, how can the teams being split up – because this will be the next step.
Up to this point you didn’t improve the developer experience, it will become rather worse. What you have improved is the life of the product manager or CTO or whoever brings the requests to the teams: instead of explaining two teams the two parts of a feature (cut in the “middle” between backend and frontend), he/she must explain it only once. At the same time, the Delivery Lead Time (the first key metric in measuring team performance) will become shorter because all the ping-pong between BE and FE can be eliminated before the feature development starts.
After you all spent a longer or shorter time together is time to take the next step: align the organization to the business
The most important part is to find the natural boundaries of the domain and create business teams who OWN this (sub) domains.
I did this 3 times in all kinds of environments: brownfield monolith or greenfield new biz, it doesn’t matter. Having a monolith as cash cow doesn’t make this change easy of course but it can be made, with discipline and a good plan on how to take over control. (this topic is much to complex to be included in this article)
The last thing which must be said is, when NOT to start this transformation:
If you don’t find any fellow to support you. In this case, either the problem isn’t big enough to be felt by the others, or you are in the wrong company and maybe should start to think to transform yourself instead (and leave).
If you or your fellow and/or boss aren’t patient people. Changing is hard and should be accompanied carefully and patiently – so that one does not need to repeat it again after even greater frustrations and chaos (was there, saw this :-/ )
If you expect that this is all. Because it isn’t: every change toward more transparency – because this is what happens when you break up silos and let others look at the existing solutions – all these changes will make issues transparent. A few of these issues will be technical (like CI/CD, code coupling, infrastructure coupling, etc.). But the hard problems will be missing communication skills and missing trust. Nothing that cannot be solved – but it will take time, that is sure.
If you reach this point, you can start to form an autonomous team: one which not only decides, what to do but also in charge to do it. Working in an environment created by you and your team allows you all to discover and live up to your creativity, to make mistakes and learn from them.
This ownership and responsibility make the difference between somebody hired to type lines of code and somebody solving problems.
What do you think? Could you start this change in your company? What would you need?
Now you know about my experience. I would be really happy to find out yours – here or on twitter.
One last question: what would you like more to read of: how to find the right boundaries or how can your team become a REALLY autonomous team – and how autonomous can that be?
I think I have to write a (short) rant about silos and team hierarchies and the costs (and obviously frustration) caused in such "agile" tech companies.
Disclaimer: this is NOT a rant about people. In most of the situations all devs I know want to deliver a good work. This is a rant about organisations imposing such structures calling themselves “an agile company”.
To give you some context: a digital product, sold online as a subscription. The application in my scenario is the usual admin portal to manage customers, get an overview of their payment situation, like balance, etc. The application is built and maintained by a frontend team. The team is using the GraphQL API built and maintained by a backend team. Every team has a team lead and over all of them is at least one other lead. (Of course there are also a lot of other middle-management, etc.)
Some time ago somebody must have decided to include in the API a field called “total” containing the balance of the customer so that it can be displayed in the portal. Obviously I cannot know what happened (I’m just a user of this product), but fact is, this total was implemented as an integer. Do you see the problem? We are talking about money displayed on the website, about a balance which is almost never an integer. This small mistake made the whole feature unusable.
Point 1: Devs implement technical requests instead of improving the product I don’t know if the developer who implemented this made an error by not thinking about what this total should represent or he/she simple didn’t had the experience in e-commerce but it is not my point. My point is that this person was obviously not involved in the discussion about this feature, why it is needed, what is the benefit. I can see it with my spiritual eyes how this feature became turned in code: The team lead, software lead (xyz lead) decided that this task has to be done. The task didn’t referred to the customer benefit, it stripped everything down to “include a new property called total having as value the sum of some other numbers”. I can see it because I had a lot of meetings like this. I delivered a string to the other team and this string was sometimes a URL and sometimes a name. But I did this in a company which didn’t called himself agile.
Point 2: No chance for feedback, no chance for commitment for the product Again: I wasn’t there as this feature was requested and built, I just can imagine that this is what it happened, but it really doesn’t matter. It is not about a special company or about special people but about the ability to deliver features or only just some lines of code sold as a product. Back to my “total”: this code was reviewed, integrated, deployed to development, then to some in-between stages and finally to production. NOBODY on this whole chain asked himself if the new field included in a public(!) API is implemented as it should. And I would bet that nobody from the frontend team was asked to review the API to see if their needs can be fulfilled.
Point 3: Power play, information hiding makes teams slow artificially (and kills innovation and the wish to commit themselves to the product they build) If this structure wouldn’t be built on power and position and titles then the first person observing the error could have talked to the very first developer in the team responsible for the feature to correct it. They could have changed it in a few minutes (this was the first person noticing the error ergo nobody was using it yet) and everybody would have been happy. But not if you have leads of every kind who must be involved in everything (because this is why they have their position, isn’t it?) Then somebody young and enthusiastic wanting to deliver a good product would create a JIRA ticket. In a week or two this ticket will be eventually discussed (by the leads of course) and analyzed and it will eventually moved forward in the backlog – or not. It doesn’t matter anyway because the frontend team had a deadline and they had to solve their problem somehow.
Epilogue: the culture of “talk only to the leads” bans the cooperation between teams at this moment I did finally understood the reason behind of another annoying behavior in the admin panel: the balance is calculated in the frontend and is equal with the sum of the shown items. I needed some time to discover this and was always wondering WTF… Now I can see what happened: the total in the API was not a total (only the integer part of the balance) and the ticket had to be finished so that somebody had this idea to create a total adding the values from the displayed items. Unfortunately this was a very short-sighted idea because it only works if you have less then 25 payments, the default number of items pro page. Or you can use the calculator app to add the single totals on every page…
All this is on so many levels wrong! For every involved person is a lose-lose situation.
What do you think? Is this only me arguing for better “habitat for devs” or it is time that this kind of structures disappear.