Why Can't Developers Estimate Time?

by Ashley Moran
A few interesting points came up on a mailing list thread I was involved in. Here are a few of them. The original comments are presented as sub-headers / quoted blocks, with my response below. This isn't a thorough look at the issues involved, but what I thought were relevant responses. Note: I've done some editing to improve the flow and to clarify a few things.
Why can't developers estimate time?
We can't estimate the time for any individual task in software development because the nature of the work is creating new knowledge.
The goal of software development is to automate processes. Once a process is automated, it can be run repeatedly and, in most cases, in a predictable time. Source code is like a manufacturing blueprint, the computer is like a manufacturing plant, the inputs (data) are like raw materials, and the outputs (data) are like finished goods. To use another analogy, the reason Starbucks makes drinks so quickly and repeatably is that it invested a lot of time in the design of the process, which was (and continues to be) a complex and expensive task. Individual Starbucks franchises don't have to re-discover this process; they just buy the blueprint. I'll leave it as an exercise for the reader to infer my opinion of the Costa coffee-making process.
It's not actually always a problem that development time is unpredictable, because the flip side is that so is the value returned. A successful piece of software can make or save vastly more than its cost. Tom DeMarco argues for focussing on high-value projects for exactly this reason. Note that this does require a value-generation mindset, rather than the currently prevalent cost-control mindset. This is a non-trivial problem.
By far the best explanation I've read of variability and how to exploit it for value is Don Reinertsen's Principles of Product Development Flow, which is pretty much the adopted "PatchSpace Bible" for day-to-day process management. And when I say "by far the best", I mean by an order of magnitude above pretty much everything else I've read, apart from the Theory of Constraints literature.
Here is the data from my last development project. (Histogram generated in R with 5-hour buckets: the horizontal axis shows the duration in hours for the user stories - 0-5 hours, 5-10 hours, etc.; the vertical axis is the number of stories that took that duration.) We worked in 90-minute intervals and journaled the work on Wave, so we knew task durations to a pretty fine resolution. (We did this for both client communication and billing purposes.) The result: our development times were about as predictable as radioactive decay, but they were very consistently radioactive. Correlation with estimates was so poor I refused to estimate individual tasks, as it would have been wilfully misleading, but we had enough data to make sensible aggregates.
[Figure: histogram of user story development times, in 5-hour buckets]
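To get a rough feel for what that looks like, here is a minimal R sketch using synthetic data (not the project's actual numbers), assuming individual story durations roughly follow an exponential, radioactive-decay-like distribution with a mean of 10 hours:

    # Synthetic illustration only -- made-up numbers, not the real project data.
    set.seed(1)
    durations <- rexp(50, rate = 1 / 10)   # 50 hypothetical user stories, mean 10 hours

    # Histogram with 5-hour buckets, as described above
    hist(durations,
         breaks = seq(0, ceiling(max(durations) / 5) * 5, by = 5),
         xlab = "Story duration (hours)",
         ylab = "Number of stories",
         main = "Story development times")

    # Individual durations are wildly variable, but aggregates are much tamer:
    # the relative spread of a 50-story total shrinks by roughly sqrt(50).
    totals <- replicate(1000, sum(rexp(50, rate = 1 / 10)))
    sd(durations) / mean(durations)   # coefficient of variation around 1 for single stories
    sd(totals) / mean(totals)         # around 0.14 for the 50-story aggregate

This is why aggregate forecasts can be sensible even when individual task estimates are not.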
Rule of thumb: take the estimate of a developer, double it and add a bit
The double-and-add-a-bit rule is interesting. When managers do this, how often are tasks completed early? We generally pay much more attention to overruns than to underruns. If a team is not completing half of its tasks early, it is padding the estimates, and that means trading development cycle time for schedule predictability. Cycle time is usually much more valuable than predictability, as it means getting to market sooner. Again, see Reinertsen's work; the numbers can come out an order of magnitude apart.
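As a back-of-the-envelope check (with made-up numbers, not real data), you can compare padded estimates against actuals and see how often work really does come in early:

    # Hypothetical padded estimates vs. actual task durations, in hours.
    padded_estimates <- c(10, 6, 16, 4, 10, 6, 10, 16, 6, 10)
    actuals          <- c(9, 7, 17, 4, 11, 6, 12, 15, 7, 11)

    # With genuine slack in the estimates, roughly half the tasks (or more)
    # should finish ahead of their estimate. A much lower fraction suggests the
    # padding is simply being absorbed into the work.
    mean(actuals < padded_estimates)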
Also, this is the basis for Critical Chain project management, which halves the "safe" estimates to condense the timescale and puts the remaining time (the padding on individual tasks) at the end, as a "project buffer". This means that Parkinson's Law doesn't cause individual tasks to expand unduly. I'm unconvinced that Critical Chain is an appropriate method for software, though, as the actual content of development work can change significantly as feedback and learning improve the plan.
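A minimal sketch of that arithmetic, as described above (the task estimates are purely illustrative, and real Critical Chain scheduling involves more than this):

    # Hypothetical "safe" task estimates, in hours
    safe_estimates <- c(10, 20, 16, 8, 30)

    aggressive <- safe_estimates / 2                     # halve each estimate
    project_buffer <- sum(safe_estimates - aggressive)   # pool the removed padding at the end

    sum(aggressive) + project_buffer   # same total commitment as before...
    sum(aggressive)                    # ...but individual tasks no longer carry the slack,
                                       # so Parkinson's Law has less room to operate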
People in general just make shit up
It's not just developers that are bad with estimates, either. Everyone at some point is just winging it, because it's something they've never done before, and they won't be able to make a sound judgement until they have.
As a community we need to get away from this. If we don't know, we don't know, and we need to say so. Clients who see regular progress on tasks they were told were risky (and chose to invest in) have much more trust in their team than clients whose teams make shit up. It's true! Srsly. Don't just take my word for it, though - read David Anderson's Kanban.
Estimating is a very important skill and should be taught more in junior dev roles
I propose an alternative: what we need to do is teach junior devs the meaning of done. Estimation problems are bad enough, but finding out at some indeterminate point in the future that something went out unfinished (possibly in a rush to meet a commitment … I mean - estimate!) blows not only that estimate out of the water, but also the schedule of all the work currently in process. This is very common, and it can consume a significant amount of a development team's capacity.