DevOps Zone is brought to you in partnership with:

I am an experienced software development manager, project manager and CTO focused on hard problems in software development and maintenance, software quality and security. For the last 15 years I have been managing teams building electronic trading platforms for stock exchanges and investment banks around the world. My special interest is how small teams can be most effective at building real software: high-quality, secure systems at the extreme limits of reliability, performance, and adaptability. Jim is a DZone MVB and is not an employee of DZone and has posted 91 posts at DZone. You can read more from them at their website. View Full User Profile

Developers working in Production. Of course! Maybe, sometimes. What, are you nuts?

01.13.2014
| 7780 views |
  • submit to reddit

One of the basic ideas in DevOps is that developers and operations should share responsibility for designing systems, for implementing them and keeping them running. Developers should be on call in case something goes wrong, and be the one to fix whatever breaks. Because the person who wrote the code is often the only one who knows how it really works. And because of the moral hazard argument: if programmers are held fully accountable for the work that they do, they will be incented to do a better job, instead of writing garbage and handing it off to somebody else.

But this means that developers need some kind of access to production. How much access developers need, how often, and how this can be made safe, are important questions that have to be answered.

Hire wicked smart people and give them all access to root.
Unnamed devops evangelist, Is Devops Subversive?

If you ask whether developers should have access to production you’ll find that people fall into one of 3 camps:

Yeah, sure, of course – who else is going to support the system?

This is a simple decision for online startups, where there’s often nobody else to install, configure and support the application any ways.

As these organizations grow, developers often continue to stay closely involved in deployment, support and application operations, and in some cases, still play a primary role, especially in shops that heavily leverage cloud infrastructure (think Netflix).

Read my lips: Never Ever! Are you out of your freakin’ mind?

Question: Should developers have access to production?

Answer: Not only no, but hell no.

kperrier, Slashdot: Should Developers Have Access to Production

The situation is much different in large enterprises and government organizations, where walls have been built up between development and operations for many different reasons. It’s not just mergers and acquisitions and inertia and internal politics and protectionism that made this happen. It’s also SOX and PCI and HIPAA and GLBA and other overlapping regulations and privacy rules, and ITIL and COBIT and ISOxxx and CMMI and other IT governance frameworks, and internal and external auditors enforcing separation of duties and need-to-know access limitations in order to ensure the integrity and confidentiality of system data.

The same rules also apply to leaner-and-meaner Devops shops. For example, at Etsy (a Devops leader), PCI DSS compliant functions are managed and supported by a different team in a different way from the rest of their online systems: while developers have R/O access to a lot of production “data porn” (metrics and graphs and logs), they do not have access to production databases; there are more requirements for activity logging; a push to QA is handled in a clearly different way than a push to production; and all changes to production must be tracked and approved through a ticketing system.

And there’s also the problem of shared infrastructure: the same networks and servers and databases and other parts of the stack may be used by many different applications and different business units. Developers of course only understand the applications that they are working on and are only familiar with the simplified test configurations that they use day-to-day – they may not know about other systems and their shared dependencies, and could easily make changes that break these systems without being aware of the risks.

In case of emergency, break glass

Most organizations fall somewhere in between a Noops web startup in the cloud and a legacy-bound enterprise weighted down by too much governance and management politics. Operations is usually run separately, management is still accountable to regulators and auditors, but most people understand and recognize the need for developers to help out, especially when something goes wrong.

When the shit has indeed and truly hit the fan, developers – although usually only senior developers or team leads – are brought in to help troubleshoot and recover. Their access is temporary, maybe using a “fire id” extracted from a vault, then locked down again as soon as they are done. Developers are often paired up with an operations buddy who does most of the driving, or at least watches them carefully when the developer has to take the wheel.

Question: Should developers have access to production?

Answer: Everyone agrees that developers should never have access to production… Unless they’re the developer, in which case it’s different.

SatanicPuppy, Slashdot: Should Developers Have Access to Production

Problems in production can be fixed much faster if developers can see the logs, stack traces and core dumps and look at production data when something goes wrong. Giving at least some developers read access to production logs and alerts and monitors – enough to recognize that something has gone wrong and to figure out what needs to be fixed – makes sense.

Sometimes really bad things happen and all that matters is getting the system back and up and running as quickly as possible. You want the best people you can find working on the problem, and this includes developers. You’ll need their help with diagnosis and deciding what options are safest to take for roll back or roll forward, putting in an emergency fix or workaround, and data repair and reconciliation. Everyone will need to check later to make sure that any temporary fixes or workarounds are implemented properly, checked-in and redeployed.

When you run incident management fire drills, make sure that developers are included. And developers should also be included in incident postmortem reviews, even if they weren't part of the incident management team, because this is an important opportunity to learn more about the system and to improve it. But if you have developers firefighting in production more than almost never, then you’re doing almost everything wrong.

Debugging in production?

Published at DZone with permission of Jim Bird, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)