Continuous Delivery: The Price of Admission
Institutionalizing this gold-standard DevOps
Google had to make funding decisions for all of these. Most sit in operational-expenditure territory, where budgets are reviewed yearly. Some, like the Selenium farm and VM environment allocation, could be charged back to projects. “Could be” as in I’m unaware of the actual cost justification and applicable finances, but I trust that extensive analysis was done before funding significant infrastructure investments.
There’s a people factor though. How do you get people to go along with the technical directions you want at the top level? I think it involves picking appropriate staff (including self-selection) as well as cultivating the appropriate groupthink. So how, precisely, does Google prevent all of its DevOps excellence from regressing as it grows?
It does so in two ways, I think:
Start as you mean to carry on
I’ve not seen Google’s recruitment or on-boarding workflow, but I expect that many of the norms for developers are reinforced appropriately and progressively.
Imagine the worst-case scenario: developers arriving who hate source control. Google wants to dissuade those candidates at the earliest appropriate moment. In addition to apply-for-job channels, Google uses LinkedIn to find developers. That gives recruiters a moment to deselect individuals who are forward in their hatred of source control. Candidates with that strong feeling could still get through. Phone screens can further subset people, as can the face-to-face series of interviews, and so on. ThoughtWorks is the same, of course. The trick is to not give away too much at each of those gates, but still give an increasingly refined message about corporate culture.
Finally, for candidates who have the resolve to run through the whole interview series and receive a job offer, there’s the strong possibility that an individual could actually be hired.
The problem changes quite a bit at this stage. Now that you’ve made an offer, and it’s been accepted, you have to groom the developer towards being an advocate for the things you want to be valued. That may include a shift in their thinking. You really only have one more planned moment for this: the developer’s on-boarding. If you’ve got it right, that on-boarding is solid enough not only to leave the Noogler clear about the core rules, but also to let them go on and elaborate on those rules to others when needed. This is the best moment, and it is a “start as you mean to carry on” thing. That adage applies to many aspects of life, of course.
Redefining core (Dev) values
So what if core values change, after an institutional decision to do so?
Let us suppose that you wanted your developers to be industrious with internal documentation too, but only AFTER the first 1000 developers were hired. You could install a wiki (Google did – from open-source land), and you could remove permissions from pages to make it open to contribution, but what if you determine that not enough people are writing new content, or updating old?
The answer is that you need to make a big deal about changing culture, with someone of sufficient authority mandating the decision. Maybe they should elaborate on the rationale too. Maybe an all-hands meeting to socialize the applicable changes would be fruitful, but only if led by someone with sufficient technical standing. You could easily flip a core of developers from being allergic to documentation to being in favor of it. Perhaps only if contributing is not too odious – “use Lotus Notes” for example.
Pity the poor companies that don’t have strong technical leadership though. Companies with a CTO who is not respected, or that are missing committees that would push for excellence (with authority), are going to have IT assets that decay over time.
Cost Of Change
Defects still happen, but Google have moved them leftwards on the cost-of-change curve. That progression left even begins to categorize things as “could have been a defect if we did not catch it here”.
Their one big trunk setup allows them to maximize reuse. In their configuration it also forces them to do lock-step upgrades. Those upgrades could be binaries from outside Google (say Log4J), but also their own internal shared components and services. Applications can still become legacy of course, but the source-code is moved forward as lockstep upgrades happen, and the unit/integration/functional tests for it guard it despite the lack of an active development team. Many larger enterprises are in the same place, and need that desperately.
1000ft decision making
As I mentioned above, one big trunk and all that build/commit intelligence allow the higher-ups to pivot quickly on all their source-based assets:
- What apps are BEAST vulnerable?
- How big is the effort to upgrade Hibernate for all?
- How far along is the Spring Framework to Guice migration?
- What is the test coverage ranking of commits by team / developer / university / language?
- Do build breakages vary based on precipitation at the locale of the committer in question?
Centralizing everything allows this auditability.
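To make the first few of those questions a bit more concrete, here is a minimal sketch of the kind of query a single, centrally indexed trunk makes cheap. Everything in it is an assumption for illustration: the `TRUNK_INDEX` data, the application paths, and the function names are hypothetical, and I have no knowledge of how Google's real tooling answers these questions.

```python
# Hypothetical index: application path in the trunk -> libraries it depends on.
# In a real setup this would be derived from build files, not written by hand.
TRUNK_INDEX = {
    "apps/billing": {"hibernate", "spring"},
    "apps/search": {"guice"},
    "apps/mail": {"hibernate", "guice"},
}


def apps_using(index, library):
    """Return the sorted application paths that depend on `library`."""
    return sorted(app for app, deps in index.items() if library in deps)


def migration_progress(index, old, new):
    """Fraction of apps already on `new`, among apps using `old` or `new`."""
    on_old = set(apps_using(index, old))
    on_new = set(apps_using(index, new))
    total = len(on_old | on_new)
    return len(on_new) / total if total else 0.0


if __name__ == "__main__":
    # "How big is the effort to upgrade Hibernate for all?"
    print(apps_using(TRUNK_INDEX, "hibernate"))
    # "How far along is the Spring Framework to Guice migration?"
    print(migration_progress(TRUNK_INDEX, "spring", "guice"))
```

Note that an app using both libraries counts as already on the new one in this sketch, which is a simplification. The point is only that with one trunk there is one place to ask.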
Google prevent developers from sharing branches that are not the single anointed trunk. That much should be obvious.
Speaking to that, an Emerson quote applies and should make us feel a little guilty:
“A foolish consistency is the hobgoblin of little minds ...”
Concrete advice for the reader
In order, perhaps:
- Putting in a CI daemon is never bad, but it will quickly show you what your underlying problems are
- Review your source-control choices (all apps). Sunset all of ClearCase, PVCS, CVS, StarTeam, Synergy as soon as you can
- Review your range of branching models (all apps)
- Work out if one big trunk makes sense (facilitates common code / reuse)
- If yes to the last, you might want to enable checking out of a subset on a per-developer basis
- If your dev teams can boost their throughput, think about the capacity of CI infrastructure needed to do one build per commit
- Design some elastic infrastructure for transient environments for the CI processes to deploy applications into for automated functional testing
- Worry about the elapsed time to take a build all the way through the pipeline, including those automated functional tests
- Collect metrics about builds into some system for later analysis
- Write tools to aggregate build metrics, commits and allow analysis
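A couple of the items above (CI capacity for one build per commit, and aggregating build metrics) can be roughed out in a few lines. This is a sketch under stated assumptions, not a description of any real system: the record fields (`team`, `passed`, `minutes`), the function names, and the `headroom` safety factor are all hypothetical. The capacity estimate simply multiplies the commit arrival rate by the build duration to get the average number of builds in flight.

```python
import math
from collections import defaultdict
from statistics import median


def agents_needed(commits_per_hour, build_minutes, headroom=1.5):
    """Estimate concurrent build agents needed for one build per commit.

    Average builds in flight = arrival rate * build duration.
    `headroom` is an assumed safety factor for bursts of commits.
    """
    in_flight = commits_per_hour * build_minutes / 60
    return math.ceil(in_flight * headroom)


def summarize(builds):
    """Aggregate build records per team: count, breakage rate, median minutes."""
    by_team = defaultdict(list)
    for build in builds:
        by_team[build["team"]].append(build)
    summary = {}
    for team, rows in by_team.items():
        summary[team] = {
            "builds": len(rows),
            "breakage_rate": sum(1 for r in rows if not r["passed"]) / len(rows),
            "median_minutes": median(r["minutes"] for r in rows),
        }
    return summary


if __name__ == "__main__":
    # Hypothetical build records, as a CI server might export them.
    builds = [
        {"team": "search", "passed": True, "minutes": 18},
        {"team": "search", "passed": False, "minutes": 22},
        {"team": "mail", "passed": True, "minutes": 9},
    ]
    print(agents_needed(commits_per_hour=40, build_minutes=12))
    print(summarize(builds))
```

The median is used rather than the mean so that one pathological build does not dominate the elapsed-time figure; that is a design choice of the sketch, not a recommendation from any particular tool.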
Then there is also the question of which of those applications require legacy rejuvenation concurrently, and what that means for a larger CD-style reorganization. Rejuvenation strategies should really include improving or introducing tests. That’d be test coverage at the unit level, integration tests in smaller number, and at least happy-path functional tests.
ThoughtWorks spends a lot of time moving clients to more sophisticated DevOps places, often as part of a migration towards CD. There’s a lot to it, and I’ve only outlined a checklist of mostly mechanical achievements here. To get to true CD, it’s not just the developer ‘workspace’ that needs to be boosted; it’s management and stakeholder participation and expectations, shared team responsibility, and the flow of feature requests into working code. In short, many more ‘social’ and methodology-driven things than anything I’ve outlined in this article. The Agile/Lean industry speaks to some of that too. Anyway, you can’t get to true CD just by installing Jenkins on a machine with a high-end CPU and lots of RAM.
(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)