Thinking of starting an Innovation Lab? Here’s what not to do.

I recently got to experience the exquisite joy and excruciating pain of helping to build a new software team for a large, well-established company. We had the mandate to do things differently and to help other teams modernize. While a lot of things went very well, I think the most helpful, or at least entertaining, things I can write about are what didn’t go well and why. So, at the risk of being a prescriptive hypocrite, here’s a list of what I think you should not do.

1. Do not muddle the message.

It’s about Agile! It’s about Lean Startup! It’s about DevOps! It’s about a green-field project! It’s about The Cloud! It’s about Open Source! It’s about automation! It’s about whatever personal pet perspective you care about. Yes, when your mandate is to innovate, you’re going to work with different tools and techniques, but if you’re known as “that cloud team” or “that agile team” you’ve lost control of your own identity. You should be about innovation, experimentation, and letting go of the way things have always been done. Everything else is just implementation details.

2. Do not change the experiment halfway through.

A few months into the project, a new executive informed us of the new reality. “You guys have been operating all on your own. That ends now.” Looking back on it now, that was the moment when all value in the experiment was lost. We started working with the existing processes (which nearly everyone agreed were terrible).

Between the radical process change and the lack of clear messaging, any person could take any result and use that to validate their own existing opinions.  If the project were a huge success, it was because of technology X. If it were a failure, that was obviously because of process Y.  

I don’t expect a lot of scientific rigor, but seriously….

3. Do not keep things a secret.

I fear that I’m being a little unkind to the anonymous executive in the point above. He was reacting to the fact that we were completely opaque. Apparently (I didn’t know this at the time) other groups in other offices were either not informed that we existed or specifically told not to interact with us. A more sustainable approach would have been to have full transparency, along with broad autonomy, from day one. Other groups should know what’s going on, and even give advice and suggestions, but not swoop in and micromanage.

4. Do not piss all over other people.

One of the first things that our team did was to look at an existing project underway from an existing team and to throw it away entirely. From a purely technological perspective, that may have been a good decision, but it was, if I may speak plainly, a total dick move. That, coupled with our opacity, created the perception that we were a bunch of elitists. I always wanted to be elite, but not elitist.  We never wanted to be a group of better technologists. Just technologists with a different mandate and a different toolkit.

The boastful and tone-deaf claims of “look at this team getting done in six months what that other team spent several years on” didn’t help matters. If anyone on that other team reads this: you have my sincere apologies.

5. Don’t conflate being an “innovator” with being an “agent of change”.

If your organization is large enough that an innovation lab makes sense, you’ll also need change agents who can evangelize and support changes. These are different things. Your change agents can point at what the innovation lab is doing, which will help work around the developer blind spot, but being a change agent requires more patience and political savvy than being an innovator.

Change agents are there to dismantle the dysfunctional engine; innovators are there to create fuel for new engines.

Software Architect as Hands on… Something

One of the good things about writing is that it forces you to organize your thinking. While organizing my thinking around the role of dedicated “software architects” I actually changed my opinion. Slightly.

My original, extremely persuasive argument was that the main goal of a software architect should be to minimize the day-to-day pain of the developers on the team, and that the architect should therefore have hands-on involvement with the code so that the pain is felt and understood.

While reading that over, I realized it was a bit too myopic and dev-centric (sorry, I’m only human), so I broadened the goal to minimizing the day-to-day pain of the cross-functional team that is responsible for developing and testing and deploying and supporting the product. After all, architectural decisions that make things easier for those developing the product but harder for those supporting the product are most likely bad decisions. You know, the whole “optimize the whole” concept that goes all the way back to the Toyota Production System.

So then, I guess a good software architect would be someone who still feels the day-to-day pain of their architecture decisions as an actual hands-on contributor to the cross-functional team, regardless of discipline. Ideally it should be someone who has gone through at least one cross-disciplinary crop rotation.

Crop Rotation as a metaphor for interdisciplinary software work

As long as you don’t abuse them (mostly by implicitly imposing constraints from the source concept), metaphors can be useful shorthand to talk about and explore ideas. I think that if you’re thoughtful, you can use metaphors from just about every source. I go back to biological metaphors a lot: thinking about ecosystems, monocultures, germ theory, immune response, etc. One of my very favorites is the agricultural practice of crop rotation.

If you didn’t grow up on a farm (as I sort of did… long story) the concept may be foreign. The basic idea is to use the same piece of land to grow different plants over sequential seasons. You know, rotate the crops. There are a few reasons to do this. One is that different plants interact with the soil differently. Some pull nitrogen out, some put nitrogen in. It also helps with pest control, as bugs who like to eat crop X might not like to eat crop Y. Wikipedia has a good writeup.

I started thinking about this metaphor many years ago after I was on a project where the dedicated tester left and I had to step in and be entirely test-focused for several months. It was a good experience for me. For one, I learned that even though I had a QA background (software testing was my first real job) I wasn’t half as good at being a test-focused developer (aka SDET) as the guy I was trying to replace.  I had to learn some different tools and techniques and it felt great.

The surprising part was that, after this project was over and I was doing lead development on a different project, I was a better developer as a result. Like the nitrogen left behind in the soil after growing legumes, validation strategies were left behind in my brain after letting myself focus on the test discipline.

I like to imagine a healthy-sized cross-functional team where the different team members deliberately rotate through which particular discipline/kind of effort that they “own” or pay attention to. The rotation could be along the frontend/middle/backend axis, or the dev/test/support axis, or whatever distinct kinds of efforts the team requires.  Even though there is a meaningful cost to letting your people get up to speed in multiple related disciplines (just as there is a meaningful cost in planting different crops from year to year) the benefits in terms of big-picture vision, creativity, preventing burnout, and team empathy could be huge.

Let’s look inside this complex opaque thing, shall we?

A few months ago, Mrs. Cron insisted that I go to the Emergency Room. She insisted that writhing around in pain on the bathroom floor isn’t a great way to spend an evening. She eventually brought me around to her way of thinking. Persuasive woman, Mrs. Cron.

I noticed that, after they gave me some industrial-strength painkillers, most of the effort went into trying to look inside my generally opaque body without cutting me open (thanks!). There were blood tests, urine tests, ultrasounds, x-rays, MRIs, CT scans. Probably other tests as well; I don’t remember too clearly, thanks to the aforementioned industrial-strength painkillers.

It turns out that I had gallstones, and they needed to remove my gallbladder. When it was time to cut me open, the surgeon already knew exactly what he would find inside and how to take care of it. The incisions were small and I recovered quickly. I’ll never be an underwear model, but that was never a reasonable goal of mine anyway.

The parallel to professional software development is pretty obvious. Many of the tools that we use are about looking inside a complex opaque thing without having to cut it open. We use debuggers, profilers, log analyzers, static analyzers, code coverage tools, and the like. Just like how the ER staff had a diverse set of tools for looking into me, it’s good for coders to have a diverse toolkit for looking into an application.

I just started using the free tier of New Relic for keeping track of how a very complex (and yes, very opaque) legacy application is behaving and it is my new favorite thing in the universe.

What exactly is culture and why am I wearing a suit today?


Alas, Preston just had to one-up me with the bow tie.

 

As I am every Friday, I’m dressed up today. I’m wearing a vintage two-button pinstripe suit with a paisley tie. It’s Fancy Friday, which emerged years ago when I started working at the Cheezburger network. It’s a bit of the culture that I’ve carried with me since then.

Why? There are a few reasons.

1. It’s an oblique “screw you” to the entire idea of corporate dress codes and “Casual Friday”. I once worked for a small startup that was bought up by a larger, more established company. One of our new corporate overlords came in, saw that I was wearing an aloha shirt, and said “You would LOVE our office in California! We do CASUAL FRIDAYS!” I left soon after that.

Every day is casual day when you look this good.

 

2. Collecting formal wear (specifically ties and cuff links) makes it easy for your loved ones to get you gifts. They don’t have to worry about whether I already have one (because I do) or whether it will fit (because it will). Today I’m wearing 3D-printed cuff links, modeled after the classic Anglepoise lamp, that were a gift from Mrs. Cron.

3. I like how mirrors look when I’m dressed up.

4. (And this is the most important.) I have come to the understanding that culture is the set of choices you get to make without having to explain yourself. When I first show up at a technology job wearing a suit and tie, I have to explain myself. Always. It just isn’t part of the culture. After a few weeks go by, Fancy Friday becomes part of the culture. Sometimes people join me in dressing up, as my old friend Preston did today, but mostly I stop having to explain myself because the culture absorbs this persistent choice.

The public choices you make over and over will influence the culture of your organization. This concept also applies to choices more important than what you wear. If you make a deliberate choice to take the time to help someone instead of narrowly focusing on your own tasks, that informs culture too.

 

 

 


Update on Continuous Deployment a few months in

Since I started doing continuous deployment on this project in early August 2011, I’ve averaged just under four fully automated, zero-downtime production deployments per day.

Obviously, that’s an average, so some days there are more, and some days there are fewer.

How many times during that span has improperly tested code broken things in a way that impacted users? Just once, and it was fixed in just a few minutes by adding back the code that I over-zealously removed.

Also in that time, we went from 0% automated test coverage to 18% automated test coverage, refactored a lot of duplicate code, and added new features.

Continuous Deployment for Existing Software – Do it Now

I recently wrote about doing continuous deployment from day one with a software project that had pretty good test automation and (if I do say so myself) a somewhat modern and decent architecture.

I’ve recently transitioned to working on a fairly ambitious overhaul of an existing project that has been around for a few years. The software has already reached a (mostly) working steady state and is used every day by real customers.

The very first thing I did, ahead of making any functional changes, was automate the deployment system using essentially the same kind of test gauntlet and approach to zero-downtime deployments I was using at Victors United. I added some very basic test automation, starting from the top of the test automation food pyramid with a simple “is the web server able to execute code, connect to the database server, and return an ‘OK’ response?” test. From there, I’ve been working my way down, getting around to writing my first “pure” unit test just today.
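A test like that at the top of the pyramid can be tiny. Here’s a minimal sketch in NUnit, assuming a hypothetical /status endpoint that runs a trivial query against the database and returns the literal string “OK”; the host name and route are placeholders, not the real ones.

    using System.Net;
    using NUnit.Framework;

    [TestFixture]
    public class SmokeTests
    {
        // Hypothetical status endpoint; the real host and route will differ.
        private const string StatusUrl = "http://staging.example.com/status";

        [Test]
        public void WebServer_CanExecuteCode_AndReachTheDatabase()
        {
            using (var client = new WebClient())
            {
                // The endpoint is expected to touch the database and return
                // "OK" only if the whole stack is alive.
                string body = client.DownloadString(StatusUrl);
                Assert.AreEqual("OK", body.Trim());
            }
        }
    }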

Only after I was confident that I could do zero-effort and zero-downtime deployments that

  1. Wouldn’t completely destroy the system (I had a test for that) and
  2. Could be rolled back very easily if something went screwy

did I make my first functional change to the software.

And then, I made the smallest functional changes that could work. I tested them locally, added some test automation around them, and then let the automatic deployment system do its magic.

A few years ago, I was interested in doing continuous deployment, but I felt that the level of test automation wasn’t good enough; I wanted to be at least at around 90% coverage. Now I think that the less test automation you have, the more important it is to start doing continuous deployment of tiny incremental changes right away.

If you are working on software that’s actually in use and are interested in doing continuous deployment someday, there’s no better someday than today. Seriously. Do it now. It will start making your life better almost immediately.

 

Always Be Shipping – Real-World Continuous Deployment

Have you ever edited code directly on a production server? I’ll admit that I have, years ago, before I knew better. It’s easy and fast and gets emergency fixes out there as quickly as possible. You get to know what works and what doesn’t because you make tiny changes. If you’re working with a system that anyone cares about, it’s also dangerous and stupid.

I’ve also worked with development organizations that take the opposite extreme. Even the most trivial server updates needed to be scheduled weeks in advance. In order to push the updated code, you had to notify everyone, take all servers down in the middle of the night, run through a long series of manual steps, and then do a lot of manual testing. When things in production aren’t quite how they are in the test environment, this is followed by a hasty rollback or panic-induced middle-of-the-night attempts at bug fixing.

Both of these extreme approaches are (maybe) appropriate in some environments. But neither is appropriate for a new web startup that’s trying to move quickly and still provide a reliable, trustworthy experience to its customers.

At the previous company I worked for, we often talked about IMVU-Style Continuous Deployment as our ideal process, but it was always something “in the future”. We were hesitant (some of us more than others) to do automatic deployment without at least a little manual intervention. We always wanted to have more test automation, or a smoother deployment system, or whatever.

Since it seemed to be hard (for me, anyway) to move an existing development organization to a continuous deployment system, I started to wonder what would happen if you did it that way from day one. I got a chance to answer that question when I co-founded a startup last year. One of the very first things I did, before we had anyone using the site, was to create a solid automated test and deployment system that was as fast and easy as possible without being dangerous and stupid.

Here’s the basic workflow that happens in our office multiple times every day.

Step 0. We make changes on our local dev environments, with a bias toward making the smallest possible change that adds value. That could be a bug fix, correcting a typo, a stylistic tweak, a stubbed-out new feature, whatever. Once I’m confident from my local (manual and automated) testing that the change is good (not perfect, not feature-complete, but just better), I push that to my GitHub repository.

From there, the continuous integration server pulls down the new code and does the following:

Step 1. Does the code still compile? If not, the build fails and everything stops.

Step 2. The build agent runs the unit tests (where “unit tests” are defined as tests that run with no external dependencies; these take just a few seconds). For anything that does require external (generally slow) dependencies (network APIs, databases, the filesystem, whatever), we use test doubles (fakes, mocks, stubs, whatever).

This first feedback loop is about catching and preventing errors in core business logic and is generally measured in seconds, not minutes.
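To make that concrete, here’s a sketch of what one of those fast tests can look like, with a hand-rolled stub standing in for a slow external dependency. The interface, stub, and business-logic class are all made up for the example; the real game code is different, but the shape is the same.

    using NUnit.Framework;

    // Hypothetical abstraction over a slow geolocation API.
    public interface IGeolocator
    {
        string CountryForIp(string ipAddress);
    }

    // Hand-rolled stub: no network calls, returns a canned answer instantly.
    public class StubGeolocator : IGeolocator
    {
        public string CountryForIp(string ipAddress) { return "US"; }
    }

    // Hypothetical business rule that depends only on the abstraction.
    public class MatchmakingRules
    {
        private readonly IGeolocator _geolocator;

        public MatchmakingRules(IGeolocator geolocator)
        {
            _geolocator = geolocator;
        }

        public bool CanPlayRankedGame(string ipAddress)
        {
            // Pretend ranked play is only open in supported regions.
            return _geolocator.CountryForIp(ipAddress) == "US";
        }
    }

    [TestFixture]
    public class MatchmakingRulesTests
    {
        [Test]
        public void RankedPlay_IsAllowed_ForSupportedRegions()
        {
            var rules = new MatchmakingRules(new StubGeolocator());
            Assert.IsTrue(rules.CanPlayRankedGame("192.0.2.1"));
        }
    }

Because nothing here touches the network or the database, hundreds of tests like this run in seconds.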

Step 3. The build agent runs a set of tests that rebuild the database from a reference schema and exercise all of the repository-layer code.
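For shape, here’s roughly what one of those looks like, shown with raw ADO.NET instead of our actual repository classes so the sketch stays self-contained. The connection string, table, and script name are illustrative, and the reference schema is assumed to be a single batch with no GO separators.

    using System.Data.SqlClient;
    using System.IO;
    using NUnit.Framework;

    [TestFixture]
    public class PersistenceTests
    {
        // Illustrative connection string; the real one points at a dedicated test database.
        private const string ConnectionString =
            "Server=localhost;Database=VictorsUnitedTest;Integrated Security=true";

        [SetUp]
        public void RebuildSchemaFromReferenceScript()
        {
            // Start every run from a known-good, empty schema.
            string schemaSql = File.ReadAllText("ReferenceSchema.sql");
            using (var connection = new SqlConnection(ConnectionString))
            {
                connection.Open();
                new SqlCommand(schemaSql, connection).ExecuteNonQuery();
            }
        }

        [Test]
        public void InsertedRow_CanBeReadBack()
        {
            // The real tests go through the repository layer; a raw round trip
            // shows the idea without dragging in the rest of the solution.
            using (var connection = new SqlConnection(ConnectionString))
            {
                connection.Open();
                new SqlCommand("INSERT INTO Players (Name) VALUES ('Hannibal')", connection)
                    .ExecuteNonQuery();
                object name = new SqlCommand(
                    "SELECT Name FROM Players WHERE Name = 'Hannibal'", connection).ExecuteScalar();
                Assert.AreEqual("Hannibal", name);
            }
        }
    }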

Step 4. The build agent runs another set of tests that exercise our dependencies on external APIs (Twitter, geolocation services, etc.).

These two sets of tests run in a few minutes, so the feedback loop isn’t quite as tight, but it’s still pretty darn fast. Basically, they make sure that the assumptions that we make with our test doubles in our unit tests aren’t totally wrong.

I wrote about these sorts of automated test distinctions a couple of years ago, in a post about the Automated Testing Food Pyramid.

Step 5. Provided that the entire gauntlet of tests has passed so far, the code gets automatically deployed to a staging server.

Step 6. There’s an additional set of tests that run against the staging web server. These tests can find configuration problems and code that just does the wrong thing in a web context. These tests are pretty shallow. They hit all of the user-facing pages/JSON endpoints and fail if anything is totally broken.
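Those shallow checks can be as simple as a parameterized test that requests each page and fails on anything other than a successful response. A sketch, with a made-up host name and routes:

    using System.Net;
    using NUnit.Framework;

    [TestFixture]
    public class StagingSmokeTests
    {
        // Placeholder staging host; the real test list covers every
        // user-facing page and JSON endpoint.
        private const string BaseUrl = "http://staging.example.com";

        [TestCase("/")]
        [TestCase("/games/lobby")]
        [TestCase("/api/games/open")]
        public void Page_ReturnsSuccessfully(string relativeUrl)
        {
            using (var client = new WebClient())
            {
                // DownloadString throws a WebException on 4xx/5xx responses,
                // which is enough to fail the build for anything totally broken.
                string body = client.DownloadString(BaseUrl + relativeUrl);
                Assert.IsNotEmpty(body);
            }
        }
    }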

Step 7. The build artifacts are copied from TeamCity to a new folder on our production server, and then the web server is reconfigured to serve from that folder instead of the folder it had been serving from.
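In our case that reconfiguration is done by a PowerShell script (more on that below), but the core move is just repointing the IIS site at the new folder and committing the change. Expressed against the IIS management API, it looks roughly like this; the site name and method are placeholders, not our actual script.

    using Microsoft.Web.Administration;

    public static class SiteSwitcher
    {
        // Point the IIS site's root application at a freshly deployed folder.
        // The site name is a placeholder; the real deployment is driven by a
        // PowerShell script run from TeamCity, but the idea is identical.
        public static void SwitchTo(string newPhysicalPath)
        {
            using (var manager = new ServerManager())
            {
                Site site = manager.Sites["VictorsUnited"];
                site.Applications["/"].VirtualDirectories["/"].PhysicalPath = newPhysicalPath;

                // Committing tells IIS to start serving from the new folder.
                // Rolling back is just pointing it at the previous folder again.
                manager.CommitChanges();
            }
        }
    }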

At this point, we’ve verified that the core business logic (game logic, in this case) is OK, that the persistence stack works as expected, that our integration with external APIs works as expected, and that the code doesn’t completely break in a web context. We’ve done a zero-downtime deploy to the production web server.

That’s cool, but we’re not quite done yet. There are two more steps.

Step 8. Run a set of tests against the production web site to make sure that all of the pages that worked a few moments ago still work.

Step 9. Have external monitoring systems in place, so that if your changes make things slow or unresponsive, you’ll know. We use Pingdom.

Yikes! There are a bunch of distinct steps here, and it seems really complicated (because it is). But it’s all totally automated. All I need to do is this:

git push origin master

Because there’s zero deployment effort on my part, I do this all the time.  I find it very energizing to know that I can just do stuff in minutes instead of hours or days or (heaven forbid) months.

If (when) something goes wrong, I’ll know immediately. If a bad bit of code manages to roll through the test gauntlet, I can roll back easily (just reconfiguring the web server to use the last known good set of code). I’ve only had to roll back a couple of times over the course of several months and 326 deployments.

Just like when the folks at IMVU wrote about this process, I’m sure that some people in the audience are convinced that I’m a crazy person advocating a crazy way of working. Here are the objections I’ve heard before.

Yeah, but this doesn’t give your QA team any time to test the code before it goes out!
We’re a small startup. We don’t have a QA team. Problem solved.

Yeah, but isn’t that incredibly dangerous?
No. The safest change you can make to a stable production system is the smallest change possible. Also, we design the individual parts of the system to be as encapsulated as possible, so we don’t tend to have crazy side-effects that ripple through and create unintended bugs.

When we make a change or add a new feature, we can manually test the hell out of that one thing in isolation (before checking in) instead of feeling like we need to spend a lot of time and effort manually testing everything in order to ship anything.

Yeah, but what about schema changes?
For schema changes that are backwards compatible with the code that’s already out there (e.g. new tables, whatever), we have a simple system that executes the appropriate DDL on application startup (a sketch of the idea is below).

For non-compatible schema changes and things like server migrations, we have to take down the site and do everything manually. Fortunately, we’ve only had to do that twice now.
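A system like that can be as simple as a folder of numbered scripts plus a table that records which ones have already run. Here’s a sketch of that idea; the folder, table, and class names are illustrative, not the real ones, and each script is assumed to be a single backwards-compatible batch.

    using System.Data.SqlClient;
    using System.IO;
    using System.Linq;

    public static class StartupSchemaChanges
    {
        // On application startup, run (in order) any .sql script in the
        // migrations folder that hasn't been applied yet, and record it.
        public static void Apply(string connectionString, string migrationsFolder)
        {
            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();

                // Bookkeeping table for scripts that have already run.
                Execute(connection,
                    @"IF OBJECT_ID('AppliedScripts') IS NULL
                          CREATE TABLE AppliedScripts (ScriptName NVARCHAR(260) PRIMARY KEY)");

                foreach (string path in Directory.GetFiles(migrationsFolder, "*.sql").OrderBy(p => p))
                {
                    string name = Path.GetFileName(path);

                    var check = new SqlCommand(
                        "SELECT COUNT(*) FROM AppliedScripts WHERE ScriptName = @name", connection);
                    check.Parameters.AddWithValue("@name", name);
                    if ((int)check.ExecuteScalar() > 0) continue;

                    // Assumed to be a single batch with no GO separators.
                    Execute(connection, File.ReadAllText(path));

                    var record = new SqlCommand(
                        "INSERT INTO AppliedScripts (ScriptName) VALUES (@name)", connection);
                    record.Parameters.AddWithValue("@name", name);
                    record.ExecuteNonQuery();
                }
            }
        }

        private static void Execute(SqlConnection connection, string sql)
        {
            new SqlCommand(sql, connection).ExecuteNonQuery();
        }
    }

Calling something like this from the application’s startup path means a deploy that includes a new script applies it automatically before the new code handles its first request.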

Yeah, but you have to spend all of that time writing tests. What a waste!
The time we spend writing tests is like the time a surgeon spends washing their hands before they cut you open. It’s a proven way to prevent bugs that can kill you.

Also, we get that time back, and then some, by not having to spend nearly as much time with manual testing and manual deployments.

Yeah, but what about big new features you can’t implement in just one sitting?
Traditional software development models (both waterfall and agile methods like Scrum) are organized around the idea of “multiple features per release (or iteration)”. Continuous deployment is organized around the idea of “multiple releases (or iterations) per feature”. As a result, we end up pushing a lot of code for features that aren’t done yet. For the most part, these are simply unavailable through the UI or only exposed to users who have particular half-built features associated with their accounts. I credit Flickr with this general approach.

Yeah, but that might work for a solo developer; it can’t work for a team.
There are actually three developers on the team.

Yeah, but I’m sure this only works with very experienced developers.
One of the guys on the team has only been programming for the last year or so and hasn’t ever worked on a web project before. Tight feedback loops help everyone.

Yeah, but what about code you want to share with others that you don’t want released?
We use GitHub, so creating additional branches and sharing them is trivial. We also have a dedicated “preview” branch that triggers a parallel test/deploy gauntlet that sends code to a staging server instead of the production servers.

Yeah, but this will never work at my organization because…
OK. That’s cool. Don’t try it if you feel that it won’t work for you. You’re probably right. You’re not going to hurt my feelings either way. I found something that’s working really well for me, and I want to share my experience to show other people that it’s possible.

What this really means

Half of what makes this process work is that we’re honest with ourselves that we’re human and will make mistakes. If we have multiple tight feedback loops between when we’ve broken something and when we know we’ve broken it, it’s faster and easier and cheaper to fix those mistakes and prevent similar mistakes from happening again.

The other half is the idea that if you design, implement, test, and release exactly one thing at a time, you know with certainty which change introduced a problem instead of having to ask, “which of the dozen or so changes that we rolled out this month/sprint/whatever is causing this problem?”

About the site

Victors United is an online turn-based strategic conquest game. You can play asynchronously or in real time. You can play against robots or humans. If you’re playing against humans, you can play against your friends or against strangers. Unlike some other popular web-based social games that I don’t like to mention, this is a real competitive game where strategy and gameplay matter.

About the tech 

The tech here is kind of beside the point. This general approach would work just as well with different technology stacks.

The front end is HTML5 + JavaScript + jQuery. The backend is IIS/ASP.NET MVC2/SQL Server/Entity Framework 4. Our servers are hosted in the SoftLayer cloud. External monitoring is provided by Pingdom.

The test gauntlet is a series of distinct nUnit assemblies, executed by TeamCity when we push new code to GitHub. There’s a single custom PowerShell script that pulls down the build artifacts and tells IIS to change what directory it serves code from.