Category Archives: C#

Update on Continuous Deployment a few months in

Since I started doing continuous deployment on this project in early August 2011, I’ve averaged just under four fully automated, zero-downtime production deployments per day.

Obviously, that’s an average, so some days there are more, and some days there are fewer.

How many times during that span has improperly tested code broken things in a way that impacted users? Just once, and it was fixed within a few minutes by adding back code that I had over-zealously removed.

Also in that time, we went from 0% automated test coverage to 18% automated test coverage, refactored a lot of duplicate code, and added new features.

Always Be Shipping – Real-World Continuous Deployment

Have you ever edited code directly on a production server? I’ll admit that I have, years ago, before I knew better. It’s easy and fast and gets emergency fixes out there as quickly as possible. You get to know what works and what doesn’t because you make tiny changes. If you’re working with a system that anyone cares about, it’s also dangerous and stupid.

I’ve also worked with development organizations that take the opposite extreme. Even the most trivial server updates needed to be scheduled weeks in advance. In order to push the updated code, you had to notify everyone, take all the servers down in the middle of the night, run through a long series of manual steps, and then do a lot of manual testing. When things in production weren’t quite how they were in the test environment, this was followed by a hasty rollback or panic-induced middle-of-the-night attempts at bug fixing.

Both of these extreme approaches are (maybe) appropriate in some environments. But neither is appropriate for a new web startup that’s trying to move quickly while still providing a reliable, trustworthy experience to its customers.

At the previous company I worked for, we often talked about IMVU-Style Continuous Deployment as our ideal process, but it was always something “in the future”. We were hesitant (some of us more than others) to do automatic deployment without at least a little manual intervention. We always wanted to have more test automation, or a smoother deployment system, or whatever.

Since it seemed hard (for me, anyway) to move an existing development organization to a continuous deployment system, I started to wonder what would happen if you did it that way from day one. I got a chance to answer that question when I co-founded a startup last year. One of the very first things I did, before we had anyone using the site, was to create a solid automated test & deployment system that was as fast and easy as possible without being dangerous and stupid.

Here’s the basic workflow that happens in our office multiple times every day.

Step 0. We make changes on our local dev environments, with a bias toward making the smallest possible change that adds value. That could be a bug fix, correcting a typo, a stylistic tweak, a stubbed-out new feature, whatever. Once I’m confident from my local (manual and automated) testing that the change is good (not perfect, not feature-complete, just better), I push it to my GitHub repository.

From there, the continuous integration server pulls down the new code and does the following:

Step 1. Does the code still compile? If not, the build fails and everything stops.

Step 2. The build agent runs the unit tests (where “unit tests” are defined as tests that run with no external dependencies; these take just a few seconds). For anything that does require external (generally slow) dependencies (network APIs, databases, filesystem, whatever) we use test doubles (fakes, mocks, stubs, whatever).

This first feedback loop is about catching and preventing errors in core business logic and is generally measured in seconds, not minutes.
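
To make that concrete, here’s a minimal sketch of the kind of test that runs at this stage. The interface and class names are made up for illustration; they aren’t from our actual codebase.

using NUnit.Framework;

public interface IGeolocationService { string GetCountryCode(string ipAddress); }

// Hypothetical bit of game logic that depends on a slow external service.
public class PlayerRegistration
{
    private readonly IGeolocationService _geo;
    public PlayerRegistration(IGeolocationService geo) { _geo = geo; }
    public string DefaultRegionFor(string ipAddress) { return _geo.GetCountryCode(ipAddress); }
}

[TestFixture]
public class PlayerRegistrationTests
{
    // A hand-rolled fake: canned answer, no network, so the test runs in milliseconds.
    private class FakeGeolocationService : IGeolocationService
    {
        public string GetCountryCode(string ipAddress) { return "US"; }
    }

    [Test]
    public void NewPlayer_DefaultsToGeolocatedRegion()
    {
        var registration = new PlayerRegistration(new FakeGeolocationService());
        Assert.AreEqual("US", registration.DefaultRegionFor("203.0.113.10"));
    }
}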

Step 3. The build agent runs a set of tests that rebuild the database from a reference schema and exercise all of the repository layer code.

Step 4. The build agent runs another set of tests that test our dependencies on external APIs (Twitter, geolocation services, etc.).

These two sets of tests run in a few minutes, so the feedback loop isn’t quite as tight, but it’s still pretty darn fast. Basically, they make sure that the assumptions that we make with our test doubles in our unit tests aren’t totally wrong.
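
For step 3, the rebuild-then-exercise pattern looks roughly like the sketch below. The real tests go through our repository classes; here I’ve used inline SQL and an invented connection string just to keep the example self-contained.

using System.Data.SqlClient;
using NUnit.Framework;

[TestFixture]
public class RepositoryLayerTests
{
    // Points at a throwaway test database, never at production.
    private const string ConnectionString =
        "Server=localhost;Database=GameTest;Integrated Security=true";

    [SetUp]
    public void RebuildSchema()
    {
        // Re-create the tables from the reference schema before every test,
        // so each test starts from a known-good, empty database.
        Execute("IF OBJECT_ID('Players') IS NOT NULL DROP TABLE Players");
        Execute("CREATE TABLE Players (Id INT IDENTITY PRIMARY KEY, Name NVARCHAR(50))");
    }

    [Test]
    public void SavedPlayer_CanBeReadBack()
    {
        Execute("INSERT INTO Players (Name) VALUES ('Hannibal')");
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand("SELECT Name FROM Players", connection))
        {
            connection.Open();
            Assert.AreEqual("Hannibal", (string)command.ExecuteScalar());
        }
    }

    private void Execute(string sql)
    {
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}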

I’ve written about these sorts of automated test distinction a couple of years ago, in a post about the Automated Testing Food Pyramid.

Step 5. Provided that the entire gauntlet of tests has passed so far, the code gets automatically deployed to a staging server.

Step 6. There’s an additional set of tests that run against the staging web server. These tests can find configuration problems and code that just does the wrong thing in a web context. These tests are pretty shallow. They hit all of the user-facing pages/JSON endpoints and fail if anything is totally broken.
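
Here’s a sketch of what one of those shallow tests might look like. The host name and page list are placeholders, not our real site map.

using System.Net;
using NUnit.Framework;

[TestFixture]
public class StagingSmokeTests
{
    private const string BaseUrl = "http://staging.example.com"; // placeholder host

    private static readonly string[] Pages =
        { "/", "/login", "/games", "/games/active.json" }; // illustrative list

    [Test]
    public void Page_ReturnsSuccess([ValueSource("Pages")] string page)
    {
        var request = (HttpWebRequest)WebRequest.Create(BaseUrl + page);
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            // GetResponse throws on 4xx/5xx, so reaching this line means the
            // page rendered without a server error.
            Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
        }
    }
}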

Step 7. The build artifacts are copied from TeamCity to a new folder on our production server, and then the web server is reconfigured to serve from that folder instead of the folder it had been serving from.
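
Our actual switch is done by a PowerShell script (more on that below), but conceptually it amounts to something like this sketch against the Microsoft.Web.Administration API, with an invented site name:

using Microsoft.Web.Administration;

public static class DeploymentSwitcher
{
    public static void SwitchTo(string newArtifactsFolder)
    {
        using (var serverManager = new ServerManager())
        {
            // Repoint the site's root at the freshly copied artifacts folder.
            Site site = serverManager.Sites["VictorsUnited"]; // hypothetical site name
            site.Applications["/"].VirtualDirectories["/"].PhysicalPath = newArtifactsFolder;
            serverManager.CommitChanges();
            // The previous folder stays on disk untouched, so rolling back is
            // just pointing the site at the last known good folder.
        }
    }
}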

At this point, we’ve verified that the core business logic (game logic, in this case) is OK, that the persistence stack works as expected, that our integration with external APIs works as expected, and that the code doesn’t completely break in a web context. We’ve done a zero-downtime deploy to the production web server.

That’s cool, but we’re not quite done yet. There are two more steps.

Step 8. Run a set of tests against the production web site to make sure that all of the pages that worked a few moments ago still work.

Step 9. Have external monitoring systems in place, so if your changes make things slow or unresponsive, you’ll know. We use Pingdom.

Yikes! There’s a bunch of distinct steps here, and it seems really complicated (because it is). But it’s all totally automated. All I need to do is this:

git push origin master

Because there’s zero deployment effort on my part, I do this all the time.  I find it very energizing to know that I can just do stuff in minutes instead of hours or days or (heaven forbid) months.

If (when) something goes wrong, I’ll know immediately. If a bad bit of code manages to roll through the test gauntlet, I can roll back easily (just reconfiguring the web server to use the last known good set of code). I’ve only had to roll back a couple of times over the course of several months and 326 deployments.

Just like when the folks at IMVU wrote about this process, I’m sure that some people in the audience are convinced that I’m a crazy person advocating a crazy way of working. Here are the objections I’ve heard before.

Yeah, but this doesn’t give your QA team any time to test the code before it goes out!
We’re a small startup. We don’t have a QA team. Problem solved.

Yeah, but isn’t that incredibly dangerous?
No. The safest change you can make to a stable production system is the smallest change possible. Also, we design the individual parts of the system to be as encapsulated as possible, so we don’t tend to have crazy side-effects that ripple through and create unintended bugs.

When we make a change or add a new feature, we can manually test the hell out of that one thing in isolation (before checking in) instead of feeling like we need to spend a lot of time and effort manually testing everything in order to ship anything.

Yeah, but what about schema changes?
For schema changes that are backwards compatible with the code that’s already out there (e.g. new tables), we have a simple system that executes the appropriate DML on application startup.
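
Here’s a sketch of that startup system. The SchemaVersions bookkeeping table and the script format are illustrative assumptions, not our exact implementation.

using System.Collections.Generic;
using System.Data.SqlClient;

public static class StartupMigrations
{
    // scripts maps a version number to the SQL for that change,
    // e.g. 7 -> "CREATE TABLE Alliances ..." (hypothetical example).
    public static void Run(string connectionString, IDictionary<int, string> scripts)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            foreach (var script in scripts)
            {
                // Each script runs exactly once, so startup is idempotent.
                if (AlreadyApplied(connection, script.Key)) continue;
                using (var command = new SqlCommand(script.Value, connection))
                    command.ExecuteNonQuery();
                RecordApplied(connection, script.Key);
            }
        }
    }

    private static bool AlreadyApplied(SqlConnection connection, int id)
    {
        using (var command = new SqlCommand(
            "SELECT COUNT(*) FROM SchemaVersions WHERE Id = @id", connection))
        {
            command.Parameters.AddWithValue("@id", id);
            return (int)command.ExecuteScalar() > 0;
        }
    }

    private static void RecordApplied(SqlConnection connection, int id)
    {
        using (var command = new SqlCommand(
            "INSERT INTO SchemaVersions (Id) VALUES (@id)", connection))
        {
            command.Parameters.AddWithValue("@id", id);
            command.ExecuteNonQuery();
        }
    }
}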

For non-compatible schema changes and things like server migrations, we have to take down the site and do everything manually. Fortunately, we’ve only had to do that twice now.

Yeah, but you have to spend all of that time writing tests. What a waste!
The time we spend writing tests is like the time a surgeon spends washing their hands before they cut you open. It’s a proven way to prevent bugs that can kill you.

Also, we get that time back, and then some, by not having to spend nearly as much time with manual testing and manual deployments.

Yeah, but what about big new features you can’t implement in just one sitting?
Traditional software development models (both waterfall and agile methods like Scrum) are organized around the idea of “multiple features per release (or iteration)”. Continuous deployment is organized around the idea of “multiple releases (or iterations) per feature”. As a result, we end up pushing a lot of code for features that aren’t done yet. For the most part, these are simply unavailable through the UI or only exposed to users who have particular half-built features associated with their accounts. I credit Flickr with this general approach.
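
Here’s roughly what that per-account gating looks like. The class and flag names are invented for the example.

using System.Collections.Generic;

public class FeatureFlags
{
    private readonly HashSet<string> _enabledForThisAccount;

    public FeatureFlags(IEnumerable<string> enabledForThisAccount)
    {
        _enabledForThisAccount = new HashSet<string>(enabledForThisAccount);
    }

    public bool IsEnabled(string feature)
    {
        return _enabledForThisAccount.Contains(feature);
    }
}

public class LobbyViewModel
{
    public bool ShowAlliancesPanel;
}

public class LobbyController // hypothetical MVC controller
{
    public LobbyViewModel BuildModel(FeatureFlags flags)
    {
        // The half-built feature ships dark: the code is deployed, but the UI
        // only appears for accounts that have the flag.
        return new LobbyViewModel
        {
            ShowAlliancesPanel = flags.IsEnabled("alliances-v2")
        };
    }
}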

Yeah, but even if that works for a solo developer, it can’t work for a team.
There are actually three developers on the team.

Yeah, but I’m sure this only works with very experienced developers.
One of the guys on the team has only been programming for the last year or so and hasn’t ever worked on a web project before. Tight feedback loops help everyone.

Yeah, but what about code you want to share with others that you don’t want released?
We use GitHub, so creating and sharing additional branches is trivial. We also have a dedicated “preview” branch that triggers a parallel test/deploy gauntlet that sends code to a staging server instead of the production servers.

Yeah, but this will never work at my organization because…
OK. That’s cool. Don’t try it if you feel that it won’t work for you. You’re probably right. You’re not going to hurt my feelings either way. I found something that’s working really well for me, and I want to share my experience to show other people that it’s possible.

What this really means

Half of what makes this process work is that we’re honest with ourselves that we’re human and will make mistakes. If we have multiple tight feedback loops between when we’ve broken something and when we know we’ve broken it, it’s faster and easier and cheaper to fix those mistakes and prevent similar mistakes from happening again.

The other half is the idea that if you design, implement, test, and release exactly one thing at a time, you know with certainty which change introduced a problem, instead of having to ask “which of the dozen or so changes that we rolled out this month/sprint/whatever is causing this problem?”

About the site

Victors United is an online turn-based strategic conquest game. You can play asynchronously or in real time. You can play against robots or humans. If you’re playing against humans, you can play against your friends or against strangers. Unlike some other popular web-based social games that I don’t like to mention, this is a real competitive game where strategy and gameplay matter.

About the tech 

The tech here is kind of beside the point. This general approach would work just as well with different technology stacks.

The front end is HTML5 + JavaScript + jQuery. The backend is IIS/ASP.NET MVC2/SQL Server/Entity Framework 4. Our servers are hosted in the SoftLayer cloud. External monitoring is provided by Pingdom.

The test gauntlet is a series of distinct nUnit assemblies, executed by TeamCity when we push new code to GitHub. There’s a single custom PowerShell script that pulls down the build artifacts and tells IIS to change what directory it serves code from.

Oh no, not more of the same TDD discussion tedium.

Every once in a while, otherwise reasonable people get together to argue about TDD with religious zeal. In the most recent flare-up, I’ve been disappointed with all sides, as nobody is saying anything new.


At the risk of adding yet more noise, I did have two nuanced thoughts that are at least new to me, and I thought I would share them, along with a recent personal anecdote. If this is obvious or old hat to you, then I’m sorry. If you think I’m too stupid for words and that I’m drinking and/or selling snake-oil-enriched Kool-Aid, I look forward to what will undoubtedly be informed and insightful feedback.

1. s/driven/aware/

The hardcore position of “never write a line of production code without a failing test for it” probably does more harm than good. Different kinds of code require varying degrees of effort (cost) to write and maintain tests for. Different kinds of code get varying degrees of benefit from test automation. Without always realizing it, developers make cost-benefit decisions all the time, and good development organizations empower their developers to act on those decisions.

That said, the cost-benefit decisions developers make must at least be informed decisions. A professional developer who hasn’t taken the time to learn how to use the appropriate test frameworks for their language/environment (jUnit, nUnit, whatever) is just plain negligent in 2009.  Test automation is just one tool in a competent developer’s toolbox, but a critical one.  I wouldn’t trust a carpenter who didn’t know what a hammer was, or a cardiologist who hadn’t bothered to learn about this newfangled angioplasty business.

Test-driven may not be appropriate for every context, but everyone needs to be at least test-aware.

2. s/first/concurrently/

Test-first is a really helpful approach, but it doesn’t work with the way everyone thinks, and mandating that everyone must always think in exactly the same way is the worst sort of micro-management.  The other extreme, writing test automation for a large system after it’s complete, is often prohibitively difficult and (frankly) boring as hell.

My advice is to always at least think about how you would write tests for your code before writing it. That will help keep you from painting yourself into untestable corners. Also, interlacing test writing immediately after you get a small subset of your system done is going to be much easier than testing the whole thing after the fact.  Personally, I move back and forth between writing the tests first and writing the code first. The key for me is that I’m working in short code-test-code-test cycles, using persistent (that is, I don’t throw it away when I’m done) test code as the primary mechanism for executing the code I’m writing as I’m writing it.  I don’t think of the process as being test-first, I think about testing concurrently with coding.

Recent Anecdote

Sure, anecdotes aren’t data, and they can’t prove anything, so take from it what you will.

I just finished a pretty big refactoring project (a “pure” refactoring: the external interfaces and behavior of the existing system stay the same, but the underlying implementation is improved) of a system that had some decent test automation. Every time I got an edge-case behavior wrong, introduced a side-effect, or removed a necessary side-effect (yuck), a test would go from green to red. This saved me at least a few days of development and testing time, and reduced the chances that I would release bugs to our QA guy (bad) or our production system (even worse).

My continuing love affair with ReSharper: Indicating Recursive Calls

I generally believe that comments should be about “why is this code doing this” instead of “what is this code doing”, because if you feel you need comments to explain what your code is doing, it could probably be refactored to be more readable and/or intention-revealing.

One exception that I’ve often made is for recursive calls. I generally write a comment indicating that the method is about to call itself and why. This may be left over from my first CS class at the University, where I had some kind of mental block around recursion; I don’t know.

So, I was delighted to see ReSharper give a little visual indication of a recursive call in the margin.

[Screenshot: ReSharper’s recursive call indicator in the margin]

This is useful when doing recursion on purpose (this example was using reflection to populate test data into properties; I would recurse when a property was a complex type), but it’s even more useful if you do recursion by accident, as in the public property returning itself below.

[Screenshot: ReSharper flagging a property that accidentally returns itself]
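
If you can’t see the screenshots, here’s roughly what the two cases look like in code (simplified from my test-data example, with cycle guards omitted for brevity):

using System;

public class RecursionExamples
{
    // Intentional recursion: walk an object's properties via reflection and
    // recurse into complex types. ReSharper marks the recursive call in the margin.
    public static void PopulateTestData(object target)
    {
        foreach (var property in target.GetType().GetProperties())
        {
            if (property.PropertyType == typeof(string))
            {
                property.SetValue(target, "test data", null);
            }
            else if (property.PropertyType.IsClass)
            {
                var child = Activator.CreateInstance(property.PropertyType);
                PopulateTestData(child); // flagged as recursive
                property.SetValue(target, child, null);
            }
        }
    }

    // Accidental recursion: this property returns itself, so reading it
    // recurses until the stack overflows. The margin icon makes it obvious.
    public int Score
    {
        get { return Score; }
    }
}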

Guilt as a Code Smell

One of the best things about working for Velocity Partners is that I get a chance to do presentations/brown bag seminars for their other clients/prospective clients. I like to think that, as a hands-on developer who actually works with this stuff every day, I have a different credibility from the full-time trainers/presenters/coaches/pundits. Note: I don’t want to disparage the full-time trainers that I know and respect; their credibility comes from the fact that they were hands-on coders for a long time, and they have more time to do the reading/research/writing that someone working on deadlines may or may not have.

I’m currently putting together a presentation on Refactoring, and how that relates to the other agile tools and techniques (where “agile” simply means “modern development practices that work”). While looking for examples, I’ve been re-reading Martin Fowler’s great Refactoring book.  One thing that struck me was the focus on “getting the inheritance hierarchy right” which, after living in the design patterns (favor composition over inheritance) realm for the last few years, felt kind of odd to me.

Meanwhile, in my “day job” I’ve been working on a well-encapsulated, generics-based .NET client for a family of REST-y XML services. After making one major component, I found that I had to make another major component that did much the same thing. So, I created a new abstract base class and made both my existing component and the new component concrete derived classes of the base class. As I needed functionality for the new class, I generalized it and moved it up to the abstract class (using the protected modifier, of course) so I could use it in the other derived class.

It was working pretty well. I had very little code duplication, and ReSharper is particularly good at the “Pull Members Up” refactoring, but I was starting to feel a little guilty about not doing things in a “pattern-oriented way”. Sure, you could never use the two concrete components interchangeably, so there was a violation of LSP, but I can be cool with that. The abstract class didn’t have any public methods on it, so there was no danger of someone coupling to its publicly-exposed interface.

Afterward, I figured out where my guilt was coming from. It’s a coupling/testability problem. Because the concrete types were tightly coupled to their base types, I couldn’t ever substitute a different type to handle that functionality. This was particularly important because the base type was all about making external (HTTP) service calls, so it was impossible for me to test any of the types in isolation.

The solution is pretty obvious: I just moved the functionality of the abstract base class into a service class with an interface. Now my two components (formerly derived classes) just have an instance typed to that interface. I can test all three parts in isolation (Q: How do you test an abstract class in isolation? A: You can’t) and I can re-use the same functionality across additional components in the project, further reducing duplication.
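
In before/after form, the change looked something like the sketch below. The names are invented, and the real client is more involved.

using System.Net;

// Before (sketch): shared behavior in an abstract base couples every
// component to real HTTP calls, so nothing can be tested in isolation.
public abstract class RestComponentBase
{
    protected string Get(string url)
    {
        using (var client = new WebClient())
            return client.DownloadString(url);
    }
}

// After: the shared behavior lives behind an interface and is injected in.
public interface IRestClient
{
    string Get(string url);
}

public class WebRestClient : IRestClient
{
    public string Get(string url)
    {
        using (var client = new WebClient())
            return client.DownloadString(url);
    }
}

public class CatalogComponent // formerly one of the derived classes
{
    private readonly IRestClient _client;

    public CatalogComponent(IRestClient client) { _client = client; }

    public string FetchCatalogXml()
    {
        // In tests, _client can be a fake returning canned XML, so this
        // component is testable without any network access.
        return _client.Get("http://example.com/catalog.xml");
    }
}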

So, what I’m saying is: take your guilt seriously. If you have a bad gut feel about a design, it might very well be bad.  It’s just like any of the code smells in the original Refactoring book. It doesn’t necessarily mean that there’s a problem, but it’s worth looking at.

But, at the same time, I’m not going to beat myself up over my interim design. It was a good, easy stepping stone to a more optimal design. This is the sort of thing that Scott Bain’s new book Emergent Design: The Evolutionary Nature of Professional Software Development is about. I’ve only skimmed it so far (I’m working on a new presentation, after all) but I know Scott, and I know his approach to the subject. From what I’ve read so far, it seems to be the right text for the professional programmer who wants to move beyond “just getting it working” to the level of “getting it working well”.  I wish that I could have read it years ago.

More on becoming a Java Developer

There’s an excellent SDET whom I worked with around a year ago who shares many of my software interests (Lean/Agile/TDD/Design Patterns, etc.) as well as some non-software interests (photography, left-handedness). Even though we haven’t been on the same project for a while, we sometimes get together to have design discussions. We’ve found it helpful for both of us to have an audience outside our immediate projects to give us honest feedback. His test code is better than most people’s production code, and my production code is getting more and more testable.

Anyway, after one of these discussions, I mentioned that I was working with Spring, something that the development group uses extensively for Java projects but not as much for the C# projects.

“What, you’re working with Java now?” he asked.

“Yes, I’m becoming bilingual.” I said.

“That’s not bilingual, that’s ecumenical!”

He has a point: many people have such a religious zeal around their choice of technology that it clouds their vision. Personally, I think that learning Java has done a lot to make me a better C# developer, because the contrast stands out and I think to myself “why are things this way in this language?” and “wow, anonymous classes are pretty handy for endo-testing” and “while I miss the get/set property accessor syntax, I see and understand why Java folks don’t like it.”

College Botany and the Java Ecosystem

When I was a student at the University of Washington, vaguely interested in plant life, I decided to sign up for a botany class on top of an already busy schedule. “How hard could it be?” I thought; after all, I had done very well in biology class in high school.

Big mistake.

Botany is hard, and not for the reasons you might expect. A first-year botany student learns more new words than a first-year foreign language student. Why is that? Because you have to learn all the different words to tell different parts of different kinds of plants apart. There’s a small amount of logically figuring stuff out (which I enjoy greatly) but it’s mostly rote memorization (which I enjoy not so much).

For the last few years, I’ve been working in the C#/.NET ecosystem, where most of the day-to-day tools and technologies are pretty clearly defined and packaged by Microsoft. Sure, you need source control, nUnit, ReSharper, and CruiseControl, but that’s about it.

I’ve just finished my first week as a developer working on a Java project. My first impression: the entire ecosystem is huge. I’ve had to learn about a billion new words. Maven, POMs, JMX, Spring, JBoss, the JDK, classpaths, VM parameters, Jetty, TeamCity, beans, WARs, JARs, SARs, EE/SE/ME, etc. Well, maybe it’s not as many new words as I tried to learn in botany, but it felt like a lot to me this week.

But it has been good. The core C# and Java languages are almost exactly the same, and the distinctions are really interesting. The Pragmatic Programmers suggest learning a new programming language every year, and I understand why. Even if I don’t keep at this whole Java thing long-term, I’ll come out of it a better C# developer.

Actually, this would make two new languages this year, if you’re willing to count Scratch.

The best thing about this learning experience is that IntelliJ generally has the same keyboard shortcuts as Visual Studio with ReSharper (of course). When I saw that hitting ALT+F7 (find usages) on a setter method would show me the spot in the Spring config XML where that value was set declaratively, I almost cried with joy.

More C# Partial Class Testing Strategies

I can’t take credit for this approach, and even if I could, I probably wouldn’t, because it makes me feel kind of icky.

Anyway, I recently heard about a legacy-code testing strategy where you mark your class as “partial”, and in another file, you add whatever public properties/methods you need for your tests. You make the contents of that second (test-only) file conditionally compiled (the classic #if DEBUG) so the encapsulation is still there for any release builds.

It’s kind of like endo-testing, but you’re extending the class “sideways” instead of “downwards”.
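
Here’s a sketch of the technique, with invented class and member names:

// File: Inventory.cs -- the production class; internals stay private.
public partial class Inventory
{
    private int _reorderThreshold = 10;

    private bool NeedsReorder(int quantityOnHand)
    {
        return quantityOnHand < _reorderThreshold;
    }
}

// File: Inventory.TestSurface.cs -- compiled only into debug builds,
// so release builds keep full encapsulation.
#if DEBUG
public partial class Inventory
{
    public bool NeedsReorderForTest(int quantityOnHand)
    {
        return NeedsReorder(quantityOnHand);
    }
}
#endif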

Basically, it’s breaking encapsulation in a controlled way, and for the most part, I think it’s a bad idea if you’re working with a new design. If, however, you’re trying to get some meaningful coverage for your legacy code (which wasn’t designed for testability), it can be a good stop-gap for dealing with the legacy-code refactoring catch-22: you don’t want to make changes without tests, but you can’t make tests without making changes.

Any strategy, such as this one, which allows you to get the first layer of tests down before further refactoring, should be embraced as a good thing.  If you find that you need to do this for any new/original classes, my guess is that your class is too big and needs to be decomposed further into more cohesive and testable classes.