Mar 2007
Engineering Wants to Rewrite!
03.29.07 Filed in: SVPG Blog
Few words are more dreaded by product managers than being told by engineering: “No more new features! We need to stop and rewrite! Our code base is a mess, it can’t keep up with the number of users, it’s a house of cards, we can’t maintain it, the site is a dog!”
This situation has happened to too many companies, and continues to happen. It happened to eBay in 1999, and the company came far closer to collapsing than most people ever realized. It happened to Friendster a few years ago and opened the door for MySpace. It happened to Netscape during the browser wars with Microsoft, and everyone knows who won. The truth is that most companies never recover from this. It is a really bad situation to find yourself in.
When a company does get into this situation, everyone typically blames engineering. But in my experience, the harsh truth is that it’s usually the fault of product management. The reason is that for the past several years the product managers have been pounding the engineering organization to deliver as many features as the engineering team can possibly produce. The result is that if you neglect the infrastructure, at some point the software will no longer be able to support the functionality it needs to.
During this rewrite, you’re forced to stop forward progress as far as what the customers see. You might think that the rewrite will only take a few months (more about that below) but invariably it takes far longer, and you are forced to sit by and watch your customers leave you for your competitors.
If you haven’t yet reached this situation, here’s what you need to do to make sure you never do. You need to allocate a percentage of your engineering capacity to what at eBay we called “headroom”. The idea is to avoid slamming your head into ceilings. You do this by creating room – headroom – room for growth in the user base, growth in transactions, growth in functionality.
The deal with engineering goes like this. Product management takes 20% of the capacity right off the top and gives this to engineering to spend as they see fit – they might use it to rewrite, rearchitect or refactor problematic parts of the code base, or to swap out database systems, improve system performance – whatever they believe is necessary to avoid ever having to come to the team and say “we need to stop and rewrite.” If you’re in really bad shape today, you might need to make this 30% or even more of the resources. I get nervous when I find teams that think they can get away with much less than 20%.
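To make the arithmetic concrete, here is a minimal sketch of the split. The function name, the engineer-week unit and the team size are all my own assumptions for illustration; the only numbers taken from the discussion above are the 20% and 30% shares.

```python
def split_capacity(total_engineer_weeks: float, headroom_percent: float = 20) -> dict:
    """Split a release's capacity into headroom and feature work.

    headroom_percent is taken off the top before any features are planned.
    The name, units, and default are illustrative assumptions.
    """
    if not 0 <= headroom_percent <= 100:
        raise ValueError("headroom_percent must be between 0 and 100")
    headroom = total_engineer_weeks * headroom_percent / 100
    return {"headroom": headroom, "features": total_engineer_weeks - headroom}


# Hypothetical team: 10 engineers over a 12-week release = 120 engineer-weeks.
normal = split_capacity(120)          # 20% off the top
bad_shape = split_capacity(120, 30)   # a code base in really bad shape
print(normal)
print(bad_shape)
```

With these made-up numbers, product management plans features against 96 engineer-weeks rather than 120, and against 84 if the code base needs the larger share.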
If you are currently in this situation, the truth is that your company may not survive this. But if you are to have a chance of pulling through, you’ll need to first put together a realistic schedule and timeline for making the necessary changes that engineering identifies.
Most of the time, an experienced engineering team will come up with estimates that are on the slightly conservative side. The exception to this rule is this case of rewrites. Here the estimates are often wildly optimistic. You must make informed decisions in this situation, so you have to go through every line item on the schedule to make sure that the dates are realistic.
Second, if there’s any way humanly possible to break up the rewrite into chunks to be done incrementally, you should absolutely do so. Even though the rewrite might now stretch over two years instead of 9 months, if you can find a way to continue to make forward progress on user visible functionality, even if it’s only with 25% to 50% of the capacity, this is incredibly important.
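As a rough way to reason about that trade-off, the sketch below stretches a full-time rewrite estimate when part of the capacity stays on user visible features. It assumes the work scales linearly with capacity, which is my simplification and almost certainly optimistic; the function name and parameters are hypothetical.

```python
def rewrite_duration_months(full_time_months: float, feature_percent: float) -> float:
    """Estimate how long an incremental rewrite takes.

    full_time_months: estimate if the whole team did nothing but the rewrite.
    feature_percent: share of capacity kept on user-visible features.
    Assumes perfectly linear scaling, so treat the result as a lower bound.
    """
    if not 0 <= feature_percent < 100:
        raise ValueError("feature_percent must be at least 0 and below 100")
    rewrite_percent = 100 - feature_percent
    return full_time_months * 100 / rewrite_percent


# A nine-month full-time rewrite, with 25% or 50% of capacity kept on features.
print(rewrite_duration_months(9, 25))  # 12.0
print(rewrite_duration_months(9, 50))  # 18.0
```

Even this idealized model adds months to the schedule, and the overhead of working incrementally will add more. The point is that you keep shipping something to customers the whole time.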
Third, since you’ll only have very limited ability to deliver user visible functionality, you will need to pick the right features, and make sure you define them right.
After eBay’s near-death experience, they made sure they wouldn’t put the company at risk again. They immediately began another rewrite, this time well in advance of issues. In fact, due to their very rapid growth, they ended up rewriting a third time, this time translating the entire site into a different programming language and architecture, and they did this massive multi-million line rewrite over several years, and most importantly, without impacting the user base and at the same time managing to deliver record amounts of new functionality. It’s the most impressive example of “rebuilding the engine mid-flight” that I know of.
But definitely the best strategy for dealing with this situation is to not get to this point. You need to pay your taxes and remember to dedicate at least 20% to headroom. If you haven’t had this discussion with your engineering counterpart, you should do so today.
That Dog Won’t Hunt
03.15.07 Filed in: SVPG Blog
In the previous article I argued for some very significant changes to the way most teams produce software. Several of you wrote to me and asked that I elaborate on my final point, which had to do with the fact that once you have a product definition that works, you can’t just “piecemeal” it up and expect the same results. I believe this is a hugely important point, and gets to the underlying reason for a great many failed products and wasted releases.
Have you seen this movie before? The one where the product manager comes up with this great PRD that is packed with features, all clearly marked as P1/Must Have, P2/High Want, or P3/Nice to Have. Then he hands the PRD off to engineering, and they estimate the costs of the various features, and lay the features out against their staff availability, and they come up with a schedule that’s typically months longer than the product manager needs, so then the negotiating game starts – arguing estimates, cutting features, minimizing QA and beta times, trying to hire some extra contract staff, etc, all while the clock is ticking. I’m sure you know the story. Even if you haven’t seen the movie you can guess the ending. The product that eventually ships is far from a coherent whole; and nobody is happy with it – not the product manager, not the engineers, and definitely not the end-users.
Many teams think this is just how the game is played. But this is really just the natural consequence of a flawed process.
Instead, I argue for a very different model:
First, the job of the product manager, working with his designer, is to come up with a high-fidelity prototype with the minimal functionality necessary to meet the business objectives, yet with a user experience that users can figure out how to use and actually want to use. The reason it’s so important that the team come up with the minimal functionality, is that you all want to minimize implementation time and complexity, and also because it’s actually more likely the user experience will be good if there aren’t extraneous features.
Second, starting at the very beginning of this design process, someone from the engineering team (typically an architect or lead engineer) needs to participate in reviewing the product ideas as expressed in the prototype, so that he can help the product manager and designer understand the relative and absolute costs of the various product ideas. He can point out any dangerous directions the product might be heading in, or he can go investigate any areas he’s unsure about. But by the time the prototype is ready, this architect must have provided detailed estimates of the surviving features, so the many trade-offs of what is in and what’s cut have already been made, and made collaboratively. At this point the engineering team must have a detailed estimate that they can commit to.
Third, it’s essential that this prototype be validated (tested) with real target users. Before committing the resources of the full product team, the product manager and designer must be confident they have come up with something that will succeed. It’s not enough to just believe the product definition is good; you have to test to make sure. You wouldn’t allow an engineer to ship code just because they believed it was good; you would insist that the code be tested to make sure.
This is why once you’ve come up with this minimal product and have tested it with target users to the point that you have evidence that it will work, you can’t later just cut out some more features and assume that it’ll still work with users. If you could, then you didn’t really identify the minimal product earlier.
You will still have some cases where you have the same tough decision – a common situation is when one or more features takes engineering longer to build than they anticipated – but in this model, the normal response is a schedule slip rather than a feature cut. Remember, you’ve already done the cutting. The good news is that the estimates in this process are better than normal because engineering has a high-fidelity prototype to base an estimate on rather than a paper document, they have had more time to evaluate the functionality, they feel more ownership in their estimates, and there is also less product to build. So when slips do occur, they’re not as severe or frequent as we are used to.
Similarly, once the engineering is underway, the product manager can’t just keep tossing in additional requirements, for essentially the same reasons. The good news here is that by far the most common reason that product managers add features is a consequence of not really thinking through the requirements in the first place, and the high-fidelity prototype will force most of these issues to the surface much earlier in the process.
Some people think that Agile methods like Scrum address these issues but in a different way. While I would love it if most teams switched to Agile methods tomorrow, as they really can make a positive difference, you’ll find they don’t really address these issues, and they create a couple of their own as well. More about that in an upcoming article.
So by all means prioritize as you’re thinking about the requirements and what’s most important, but by the time you come up with your final spec, make sure your product is already the minimal possible, and then yank all those P1/P2/P3 annotations from the spec, and make it clear to the team that this spec describes a whole product, and if you remove a leg, then as an old boss of mine would say, that dog won't hunt.
