Fidelity Life Case Study

Posted by Micah Thu, 02 Oct 2008 21:02:00 GMT

As craftsmen, we’re proud of our work. Yet it’s rare that we get the opportunity to show off what we do for clients. Fortunately the kind folks at Fidelity Life have given us permission to do just that.

Check out the case study summarizing several systems 8th Light built for Fidelity Life using mostly Ruby and Rails. This project is a whopper.

Some Thoughts On Software Defects

Posted by Paul Pagel Tue, 24 Jun 2008 06:13:00 GMT

Software defects are a part of software. This is a negative subject, and I don’t want to make it seem like the software I write is full of defects and bugs. This post is about how I have seen teams turn a roadblock into a success. I have read about, and heard at conferences about, teams who take small failures and create a culture of failure around them. What follows are examples of small failures I have seen and the successes built around them. Sometimes I see small failures as a path to larger positive culture changes.

I think it is safe to say all software that is used and sufficiently complex has defects. There are many reasons for the defects. Here are a few of the defect situations I have been in and how our team solved them.

I think the important thing I learned after evaluating these is that the resolution of these situations needs to be fast and high quality. Each resolution needs to leave the project better off than it was before. Success for the customer is the only true result of a project, and defects may not be the ideal path, but they must lead there.

The speed is important, but it doesn’t mean I should rush a solution and throw it in as soon as possible. To me it means giving the defect the red carpet treatment. I am trying to capture craftsmanship in the face of adversity. Every project has its ups and its downs, and a craftsman will take pride in success through any obstacles. Instead of letting defects become a slippery slope downhill, we treat each one as a single step back, after which we take two steps forward.

Bugs

Bugs are a part of every piece of software I have written. That statement sounds a lot worse than it is. I have worked on systems where the requirements are dramatically changed week to week (which can be pretty exciting). There are situations I didn’t take into account, or some behavior I didn’t imagine until a real user started hitting the system. Now, as a developer, I do more mental exercises and think thoroughly through my solutions as I get more experience. This has never made my code bug free. For that reason, my team needs to know how to deal with bugs (I am fairly certain I am not the only creator of bugs).

Here is a line from The Pragmatic Programmer: “It doesn’t matter whether the bug is your fault or someone else’s. It is still your problem.”

So, the team has a bug list. We keep the bugs as a to-do list in Basecamp, so they are not so formal that they can be forgotten about. We try to address them while also working on new stories. This worked well enough until there was a big release coming up and our customer came back to us with a big list of bugs. We put them on the list and continued to fix them, as well as work on stories. They were getting completed, but there was never an empty bug list. Then the customer (who can directly add to and edit the bug list) started assigning priorities to the bugs. This one is HIGH priority. This one is CRITICAL. This one is IMMEDIATE. I made the suggestion, “We need some real tool to manage our bug list,” because I couldn’t fit all this in my head. There isn’t that much room up there and I need to use it wisely. One of my team members suggested that maybe it wasn’t the craftsmanship way to ship ANY bugs. I was trying to solve the wrong end of the equation. So, the team lead put forth a no bugs policy that we all agreed with.

You cannot pick up a new story unless the bug list is empty.

This made sense to me, but I had reservations about a small, insignificant bug taking priority over an important story. To date, this has not happened, and the bug list has stayed near empty. That doesn’t mean fewer bugs are found; it means they are fixed, and the code is refactored to prevent a future occurrence. Most importantly, a tool to track bugs never made it into our system. That idea would have desensitized the team to bugs, whereas our current policy keeps us very sensitive to them. Challenging the team members, as craftsmen, to take buggy code personally was the right choice. Bugs got a first class ticket to termination in our system.

Now, the line between a bug and a small feature enhancement is a fine one. I know I have failed to define it well, and that might contribute to what gets called a “bug”. Oftentimes, the customer’s urgency takes up more mental space than me thinking through it, looking up the acceptance criteria for the story where it was implemented, and going back to the customer and saying, “No, that was clearly not a defined scenario, we are going to need a story to turn that button green.” The next time, it is more than turning a button green, but the precedent has already been set. All I have been able to do is strike a balance based upon how much effort it would take to make the bug or feature enhancement work. If it is a lot of effort, I will double check the bug to make sure it is a bug. If it is, I fix it. If not, I will push back to the customer to write a story card.

Production Support

Once a system goes into production, support begins. Following along with one of Paul Graham’s ideas, we have the developers doing the production support. We are the ones who wrote the system and know the system the best. When I look at a production support request, I can not only solve it, but make sure it doesn’t happen again. Or if it does, make sure it is easy to correct.

So, during our first deployment of a system, a single team member stepped up as the “production support” developer. I don’t know if he embraced it or was cornered into it, but as a craftsman, he took the responsibility and ran with it. As further systems were released, he would sometimes spend an entire day on production support. Production support can be a lot of debugging and fixing data, which can be fun, but more often than not is tedious and repetitive. Oftentimes when I saw a production support email, I would look to the “production support guy,” who could fix it in about half the time I could. This seems a lot like a silo to me. Everyone should be able to do production support on any system. I should have to, because it is a perspective of the system that is important to have.

In response to this, we came up with a system of triage. Each day of the week is assigned to a specific developer. If a support item comes up, it is the job of the triage developer to tell the customer that we are working on it. If the item is addressed to a specific person, the triage developer informs that person. Otherwise it is on the triage developer’s shoulders to fix the support request before continuing their work for the day. This ensures the client always has an open line of communication with a developer. No email slips through unaddressed. There is clear responsibility for who should be addressing the support item. I know the “production support” developer is in favor of this system. So is the customer, who asks who the triage developer is for the day and has no qualms about interrupting their work, as it should be.
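The triage rules above are simple enough to sketch in a few lines of Python. This is only an illustration and not the team’s actual tooling: the roster names, the function names, and the choice to leave weekends uncovered are all assumptions made for the example.

```python
import datetime

# Hypothetical roster: one developer per working day (Monday=0 ... Friday=4).
ROSTER = {0: "alice", 1: "bob", 2: "carol", 3: "dave", 4: "erin"}


def triage_developer(day):
    """Return who is on triage for the given date.

    Weekends are not covered in this sketch, so they return None.
    """
    return ROSTER.get(day.weekday())


def route_request(day, addressed_to=None):
    """Decide who acknowledges and who fixes a support request.

    The triage developer always answers the customer. If the request
    names a specific person, the triage developer passes it on to them;
    otherwise the triage developer fixes it before resuming other work.
    """
    triage = triage_developer(day)
    owner = addressed_to if addressed_to is not None else triage
    return {"acknowledged_by": triage, "fixed_by": owner}


if __name__ == "__main__":
    tuesday = datetime.date(2008, 6, 24)
    print(route_request(tuesday))
    print(route_request(tuesday, addressed_to="carol"))
```

The point of the sketch is that there is never a request without an owner: someone always acknowledges it, and someone is always responsible for the fix.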

Communication and Managing Expectations

Recently, I did some integration work with a third party vendor. They were developing their side of the integration at the same time as us. Not wanting to slow development and wait for their functionality, we decided to write a mock server and integrate with the host according to the spec from the third party vendor. We received a story from the customer for that and proceeded to write the client for the third party system calls. We finished our story. In the demo portion of the iteration planning meeting, we could only demo against our mock server. This caused some nervousness in the customer (rightly so). I replied, “Once their side of the system is done, we should be able to send our calls across.” Then we received three more similar stories, covering different system calls. We did them, removed the duplication and felt really good about the job we did. Then came their test server.

Nothing worked! There were all sorts of communication problems, questions about who implemented the system to what spec, and political questions. Despite us thinking we were in the right, the stories were signed and the customer said, “Well, you said this would work.” At first, I tried to communicate the reason why it didn’t work, and how we could move forward. We came to spend a lot of time on this, and the team felt the integration should be its own story. The customer pushed back, “Well, you said this would work.” There was some tension, because both sides were right. We didn’t believe it was our fault it didn’t work (neither did the customer), but we had told them it would. Before I could let out the righteous indignation I was feeling, one of the team members said:

“We should not have assured you it would work.”

That one line brought the real problem to the front for both sides. We didn’t know whether it would work or not, just that we wrote the right code to the specification we had at the time. That code by itself has no value to the customer without it working, though. Once that line was said, everyone in the room sat for a second, then understood. The expectation over defect versus behavior was out of sync. A little ownership over the defect was all we needed to ease the tension and move forward. It is unproductive to get stuck in a stalemate of expectations. When in doubt, the customer is always right.

Agile Production Support: Final brush strokes

Posted by Paul Pagel Mon, 18 Feb 2008 14:39:00 GMT

There is no perfect software. At least I have never seen it. Bugs and minor feature changes are indications people are using your software. Real users hit a system in ways that no control group can, and on non-critical applications, this is the best way to test your software. Let people use it and see what happens. This goes in line with the agile philosophy of release early and often. Get your application out there as fast as you can, so you can mold the finishing touches around the real users’ experience rather than a faux-environment.

There is some conversation about what is and what is not a “bug” in the software world. That is not a conversation I would like to partake in here, so let’s call bugs, integration items, minor feature enhancements, and things that fall through the cracks of development all “tweaks.” It doesn’t matter where they originate; these are all things that MUST get done.

After the release of one of our products, a load of tweaks came in from the customer. As proud craftsmen, we decided tweaks were our responsibility, and we would take them on in addition to our normal iteration. So we started to do them, to the detriment of our iteration. We accomplished only about half of our iteration’s velocity.

The next iteration, much to our surprise, we were twice as busy with production support. This is about the time that a developer loses a little faith. What did we miss? Is this high quality software we are writing? So we lost even more velocity in the second iteration after the release. Also, the customers were now unable to accurately plan new features moving forward due to an unstable velocity.

It is so hard to predict or estimate production support and tweaks. However, we needed to be able to, so that production support didn’t leave such a footprint on the project. It felt and looked like we were not getting very much done, even though we were working harder than usual. What troubled the project was that the time was going into a vacuum, unaccounted for. It also had a negative effect on the morale of the team.

We came up with a card we call the “Production Support Card.” The amount of the iteration’s velocity this card takes up is calculated by averaging the amount of time we spent on production support in the previous iteration with the amount of time allocated for that iteration (sound like a familiar formula?). It is added as a card to the next iteration. If the developers only spend 6 of the 10 points on production support, it is expected that they will complete 4 points’ worth of stories, which are automatically entered in the iteration. For the first iteration where it becomes apparent that we need a production support card, we set the point value of the card at 0 and track how much time we spend, bumping the least important stories out of the iteration if needed.
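The arithmetic behind the card can be sketched in Python. This is a rough illustration that assumes support effort is tracked in the same points as stories; the function names and the sample numbers in the loop are made up for the example and are not data from our project.

```python
def next_support_card(spent_last_iteration, allocated_last_iteration):
    """Size the next card by averaging what was spent with what was allocated."""
    return (spent_last_iteration + allocated_last_iteration) / 2


def story_points_owed(card_points, spent_on_support):
    """Support points that went unused are expected to become finished stories."""
    return max(card_points - spent_on_support, 0)


if __name__ == "__main__":
    # The example from the text: a 10 point card, 6 points actually spent.
    print(story_points_owed(10, 6))   # stories expected in place of unused support
    print(next_support_card(6, 10))   # size of the following iteration's card

    # First iteration after a release: the card starts at 0 and we only track.
    card = 0
    for spent in [12, 10, 6, 3]:      # hypothetical points spent on support
        print(f"card={card} spent={spent}")
        card = next_support_card(spent, card)
```

Because each new card averages in the previous allocation, one bad iteration does not swing the plan too far, and the card shrinks as the tweaks dry up.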

So, what does this tracking buy you, if you have to spend the same amount of time on tweaks? First, it allows transparency to the customer about what you are working on that week. When they see your normal velocity of 20 points turn into 5 points, they have a right to be worried. When you say, in a defeated voice “we were fixing bugs,” they also have a right to worry about the stability of the code you have been writing, even though this spike in minor changes to the application is a part of the normal process.

Second, it raises the morale of the team, because they are working towards a specific goal: to remove the production support cards from the iteration. Also, we get the satisfaction of maintaining a velocity in points, which is something we know so well it is hard to work without.

It takes a few iterations, and the team squeezes the life out of the production support card, putting you back on track. After those iterations, the footprint goes from sasquatch to mini-me.

It also helps the customer plan around production support. Their timelines and release dates are made from a projection of feature difficulty against development’s velocity. Over a long period of time, the velocity normalizes, and hiccups hurt the projections. If you have production support data, you can predict about how much time you will lose around the initial release of brand new development.
