Sunday, October 21, 2012

Process Bugs

Context: A production bug is reported by 55 customers. Most of them roll back the new release completely, while the others halt their online services waiting for a hot-fix.
Adam (the PM), shouting: Who is responsible for this terrible bug?
Amr (the team leader): Let's concentrate on resolving it and getting these customers up and running again.
Adam, frowning: OK
The team then spends more than 18 hours of continuous work fixing, testing, packaging, deploying, re-testing, etc. As they finish the last touches early the next morning, Adam enters the team room:
Adam: Now, it was a horrible day yesterday, and I would like to know exactly what led us to this situation.
Mona (tester): Developers are always careless about their code.
Sarah (developer): No way, it was due to your bad testing. If we are careless, isn't it your responsibility to be careful and catch such tiny bugs?
Adam is listening ...
Yousef (developer): Hey, are we going to keep blaming each other and forget about the tight schedules, late notices, and last-minute changes that we are always having?
A few seconds of silence pass while everyone stares away, accepting no responsibility.
Amr: It was a process bug.
Adam: what?
Amr: A process bug. It has nothing to do with any individual. One person, or even the whole team, cannot be blamed for such a recurring problem. We keep looking for someone to shout at and forget about the root cause of the problem, which is a process bug.
This is a typical discussion in software development houses. The last remark is interesting and worth attention. Process bugs are holes in the software development process which allow bugs to slip through to customer sites. Process bugs are what I always blame for bugs occurring in production. It is the development process that didn't help us prevent bugs from being injected into our code, and that didn't help us discover those bugs and resolve them before they reached customers.

It is very bad that we keep thinking of which team members to blame. What is worse is thinking that blaming actually resolves the root cause of the problem! If team leaders completely ignore these process bugs and keep the deadly habit of blaming the team, the following dynamic will inevitably occur:

The deadly cycle of bug-blame-stress

If bugs occur and the team is blamed for them, the blame leads to higher levels of stress, which usually results in more bugs, more blaming, and more stress, in an endless loop. This is actually an example of a 'positive feedback loop', in which factors reinforce each other until people are pushed out of the system. Typical exits are burn-out, being sidelined, quitting development altogether, and similar outcomes.

I have seen excellent software engineers quit the software industry because they lost confidence in themselves. Why? Due to continuous public and personal blaming :(

In the bug case mentioned above (which actually happens every day), the following is a very modest list of possible process bugs:

  • Code review is never planned. The team hardly finds any time to review.
  • Testing issues are reported by mail and get lost every now and then amid the hundreds of daily mails in the team leader's inbox
  • Bugs are assigned to module owners, who may be very busy responding to customers or developing new features in another project.
  • The team had several burn-outs, which put more stress on the remaining team members!
  • ...
The key take-away of this post is to look for root causes in the process rather than looking for someone to blame. Blaming may relieve a manager's stress and bad temper, but it will drive the team away very soon!

Instead of blaming, a sound management question for the team would be: take your time to pay off this technical debt, but let me know how you would prevent it in the future! As I said, part of fixing a problem is preventing it from recurring; in other words, fixing the process hole that led to it.

Monday, September 10, 2012

Iteration / Sprint Burn-down or Burn-up Chart?

Burn charts are an excellent light-weight tool for tracking software projects. The value of this tool is that it is an early, reliable, and visual indicator of team progress.

Burn charts are drawn at two levels: release and iteration. At the release level, burn charts track the points completed every iteration, and are excellent for tracking releases over a prolonged period of time (3-6 months). At the iteration level, burn charts are used on a daily basis to track the team's progress within the iteration and how far the team is from achieving the iteration goals.

It is very important to note that release burn charts come in two forms: burn-up and burn-down. At the iteration level, however, the chart almost always takes the form of a burn-down, and there is almost no literature on iteration burn-up charts. You may find examples of effort burn-up plotted on the iteration burn chart, like that of Mitch Lacey's Scrum template.

What's the problem with iteration burn-down?

In the following example, consider the flat line at days 4 and 5:

Iteration burn-down chart: does not indicate the reason for the lack of progress on days 4 & 5

On days 4 and 5, the remaining effort did not change. There are many reasons which may explain this, for example:
  • The team were on vacation
  • The team were assigned temporary tasks in another project
  • The team worked on the project, but some of them increased the remaining hours of their tasks due to very optimistic initial estimates
  • The team discovered more technical tasks that needed to be done but were overlooked in the iteration planning meeting. The estimates of these new tasks were added to the remaining hours of the iteration, so the overall remaining hours stayed the same
Any of these explanations may be valid. Usually such explanations are mentioned in the retrospective, and the experience is added to the 'group memory' of the team. But such contextual data is very important for whoever is observing the team from outside. It may also be very important for the team when analyzing their performance over a period longer than just the past iteration. In such cases, the contextual data may not be visible, or may be lost from the team's memory.

Another pitfall of this graph is that, in this specific case, it doesn't indicate the team's progress. It just indicates that there may be an issue which needs more investigation.
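A flat segment can at least be detected mechanically. The sketch below (all numbers are made up) flags the days on which the remaining effort did not move, which is exactly the point where the burn-down stops explaining and investigation has to start:

```python
# Illustrative sketch: flag "flat" days in an iteration burn-down,
# i.e. days where the reported remaining hours did not change.
# A flat day tells us *that* something needs investigation, not *why*.

def flat_days(remaining_hours):
    """remaining_hours[i] is the remaining effort at the end of day i+1."""
    flags = []
    for day in range(1, len(remaining_hours)):
        if remaining_hours[day] == remaining_hours[day - 1]:
            flags.append(day + 1)  # 1-based day number
    return flags

# Remaining hours for an 8-day iteration; days 4 and 5 are flat.
remaining = [120, 100, 85, 85, 85, 60, 30, 0]
print(flat_days(remaining))  # → [4, 5]
```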

Iteration burn-up captures more contextual data

Now, consider the following graph, which depicts the same data as the last example:

Iteration burn-up chart: actual work compared to total work clearly indicates the team's progress

There are two graphs:
  • Actual burn-up (blue): a plot of the team's cumulative actual work hours
  • Total work (red): the sum of the 'actual + remaining' work reported by the team
The rationale of this graph is that the iteration is over when the actual burn-up line intersects the total work line. This is exactly what we need from any progress indicator: to tell us when we will finish the iteration, given the variable and dynamic development environment.

In the above example, the reason behind the flat remaining line on days 4 and 5 is now clearer. On day 4, the team did not work on this iteration, so the total work line is also flat. On day 5, the team did work on the iteration, but the total work line is still flat, which means the team kept the remaining effort constant.

The powerful point of this iteration burn-up chart is that it shows the team's actual effort and the total work side by side, and reveals changes and patterns in both of them at the same time.
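The mechanics of the chart can be sketched in a few lines. The hours below are invented, mirroring the example's flat days 4 and 5 (no work on day 4; work on day 5 offset by the team holding the remaining constant):

```python
# Sketch of the iteration burn-up data series (hypothetical numbers).
# Each day the team reports actual hours worked and remaining hours;
# the blue line is cumulative actual, the red line is actual + remaining.
# The iteration is done when the two lines meet (remaining == 0).

import itertools

actual_per_day = [20, 18, 15, 0, 10, 25, 30, 32]    # day 4: no work done
remaining_eod  = [120, 102, 87, 87, 87, 62, 32, 0]  # day 5: held constant

cumulative_actual = list(itertools.accumulate(actual_per_day))
total_work = [a + r for a, r in zip(cumulative_actual, remaining_eod)]

for day, (a, t) in enumerate(zip(cumulative_actual, total_work), start=1):
    print(f"day {day}: actual={a:3d} total={t:3d}")

# The iteration ends on the first day where the two lines intersect:
done_day = next(d for d, (a, t) in
                enumerate(zip(cumulative_actual, total_work), start=1)
                if a == t)
print("iteration done on day", done_day)  # → iteration done on day 8
```

Note how the total work line jumps from 140 to 150 on day 5: the team worked 10 hours while keeping the remaining at 87, so the chart itself records the re-estimation that a plain burn-down hides.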

Wednesday, August 15, 2012

Agile Configuration Management (3): A Process Increments Approach

This post is an attempt to define CM in terms of its practices. Before you read it, please review the concept of Process Increments, which is the way we partition process improvements in Agile adoption and process improvement projects.

When we first thought of partitioning an SPI (software process improvement) project this way, we never thought that we would uncover such illuminating facts about configuration management. For example, the fact that workitems should be identified and tracked is central to CM; however, it had never been highlighted, by us or by anyone else in the field.

Another note is that collecting all the process increments which fall under the CM umbrella and really add value to the organization resulted in an excellent collection of practices. These practices are often overlooked by Agile teams. If you are Agile, have a look below, and think twice about whether each practice would add value for you.

Configuration Management partitioned into 7 process increments

Below is a short description of every process increment:

  • Version Control: Project configuration items are under version control, and the team is trained on the basic copy-update-merge and lock-modify-unlock procedures
  • Workitem Tracking: Workitem types are identified, and workitems are managed and tracked
  • Traceability: Bi-directional traceability between requirements and work products is defined and enforced
  • Release Management: Releases and release scope are identified; changes are received, prioritized, and planned; packaging, releasing, and post-release verification procedures are enforced
  • Baselining: A baselining procedure is defined and enforced at points where work products are delivered to an external party
  • CM Environment: The project structure is defined, access rights are enforced, backup/restore procedures are employed, and proper branching/merging techniques are in action
  • Continuous Integration & Deployment: Builds are automated; integration between team members and between teams is automated and frequent; deployment is automated across the different CM environments
In later posts, I will elaborate more on specific practices.

Saturday, July 14, 2012

Agile Configuration Management (4): Traceability is Not a Matrix!

In software development, traceability is a very famous term, especially in companies implementing CMMI-based process improvement. Usually the concept is replaced by another term, the 'traceability matrix'. Most probably the matrix was originally proposed as an example implementation of what is coined 'requirements traceability'. Later on, it became the de-facto implementation of requirements traceability, and finally replaced the original concept in many people's heads!

The fact is, traceability is not a matrix! This is especially true if you want to get value out of this very important technique. Rather, it is a dynamic network of relationships, in which requirements have a growing set of relationships to all other artifacts in the project.

Traceability is a Dynamic Network, Not a 2-Dimensional Simple Matrix

Consider the following network of relationships, which is typical in any software development project:


As you can see, any requirement is related to many other concepts, artifacts, or workitems. Requirements entities can trace to each other (white), to other information (green), and to physical artifacts (gray). A single requirement may be traceable to tens of other artifacts, and in turn, each of them may be traceable to others.

What Value Does Traceability Add?

Traceability empowers the team to do three very important activities:

1. Root-cause Analysis:

Suppose a customer reports a defect, and we want to know its root cause. In the scenario below, the defective code is traced back to the reasons for the change, which turns out to be a change requested by the customer some time ago. Other information about the change request can then be deduced.

This is also called Backward Traceability.

In healthy configuration management environments, defects can easily be traced back to the code, change requests, and much other information, which may enable the team to identify the real reasons behind the defect.

2. Impact Analysis

Traceability enables the team to study the impact of a change and assess its costs and risks. In the scenario below, the customer requests a change to an already implemented user story. To assess the change, the team revisits the written code, the impacted design and products, etc.

The team assesses the cost of the change request by evaluating the changes incurred on all artifacts which will possibly change due to this request.

This is also called Forward Traceability.

3. Requirement Completeness Assessment

The third value-add of traceability is assessing whether all requirements are complete with respect to some other artifact, like design or test cases. With a simple query, we can find the user stories or change requests which still do not have any related test cases or design artifacts.

Implementing Traceability

Traceability can be implemented with special-purpose requirements management tools, like Rational RequisitePro, DOORS, or Rational Requirements Composer. Another way, which is more lean and equally effective, is to use the capabilities of a workitem tracking system, like:
  • Workitem-to-workitem links
  • Workitem hyperlinks to external documents in version control or shared folders
  • Workitem links to code change-sets or code revision numbers
In fact, in Agile environments, implementing traceability takes very little overhead. There are techniques to minimize the overhead of linking artifacts and to build the traceability network seamlessly. Many of these techniques are exercised in the Agile Configuration Management workshop, which I deliver at SECC.
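As a sketch of the workitem-link approach (every ID, file name, and helper below is hypothetical), the traceability network can be stored as a plain directed graph, which already supports the three activities described above:

```python
# Minimal sketch of traceability as a network of workitem links rather
# than a matrix, with the three queries from the post: backward tracing,
# forward (impact) tracing, and completeness checks. All IDs are made up.

from collections import defaultdict

links = defaultdict(set)      # directed edge: source traces to target

def link(src, dst):
    links[src].add(dst)

# Build a tiny network: a user story traces to code, design, and a test;
# a change request and a bug also trace to the same code file.
link("US-12", "design.md"); link("US-12", "billing.py"); link("US-12", "TC-7")
link("CR-3", "US-12");      link("CR-3", "billing.py")
link("BUG-99", "billing.py")
link("US-13", "report.py")  # note: no test case linked yet

def trace(item, depth=1):
    """Everything reachable from `item` (forward / impact analysis)."""
    found, frontier = set(), {item}
    for _ in range(depth):
        frontier = {d for s in frontier for d in links[s]} - found
        found |= frontier
    return found

def trace_back(artifact):
    """Every workitem that traces to `artifact` (root-cause analysis)."""
    return {s for s, targets in links.items() if artifact in targets}

def missing_tests(stories, test_prefix="TC-"):
    """Completeness: stories with no linked test case."""
    return [s for s in stories
            if not any(t.startswith(test_prefix) for t in links[s])]

print(sorted(trace_back("billing.py")))  # → ['BUG-99', 'CR-3', 'US-12']
print(missing_tests(["US-12", "US-13"])) # → ['US-13']
```

A real workitem tracking system gives you exactly these links for free; the point of the sketch is only that a graph, unlike a matrix, answers backward, forward, and completeness questions with the same cheap traversals.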

Thursday, July 5, 2012

Finally Changed The Title of My Blog!

I spent a long time thinking about changing the title of my blog. For several years, it was "Tales of Software Process Improvement", and for all that time I narrated many experiences in software process improvement.

However, what is more realistic about this blog is that I was narrating experiences of adopting Agile or RUP values and practices, and this is why I decided to change it to "Tales of Agile Software Development".

I'm a big fan of Agile and Lean software development. I can see and feel how it changes people's lives and saves a lot of time and waste. Most importantly, it makes people master and foster the most precious skill they could have: learning and gaining knowledge.

Tuesday, May 22, 2012

Correlation between Cyclomatic Complexity and Bug Density: Is this the Real Issue?

The answer is no. Keeping size constant, studies show no correlation between CC and defect density (from a conversation with Radouane Oudrhiri, my mentor in Lean Six Sigma). However, there are two other interesting correlations to study:

The first one is: does CC strongly correlate with the duration of detecting and fixing defects? In other words, if CC is lower, would we spend less time debugging and fixing defects?

The second one is: does CC strongly correlate with the fault feedback ratio (FFR, the average number of defects introduced while coding one change or fixing one defect)?

It needs more investigation to see whether anyone has studied these correlations empirically. But my gut feeling, and the feedback I get from the teams I work with, is that there is a strong positive correlation between cyclomatic complexity on one side and the duration of detecting and fixing defects, or the change impact, on the other side.
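The experiment itself is simple to run once the data is collected. The sketch below, with entirely made-up numbers, computes the Pearson correlation between the complexity of changed modules and the hours spent detecting and fixing a defect in them:

```python
# Sketch of the proposed experiment with invented data: correlate the
# cyclomatic complexity of changed modules with the hours spent finding
# and fixing defects in them. The numbers below are purely illustrative.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

cc        = [4, 7, 12, 15, 22, 30]    # cyclomatic complexity per module
fix_hours = [1, 2, 4, 5, 9, 14]       # hours to detect + fix a defect

r = pearson(cc, fix_hours)
print(f"r = {r:.2f}")   # a value near +1 would support the hypothesis
```

With real project data, the interesting part is not the arithmetic but controlling for size, as the defect-density studies above warn.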

This is a good experiment to do. Keep alert for the results!

Tuesday, May 15, 2012

Process Increments: My Approach for Agile Adoption

Tomorrow, inshaAllah, I'm presenting at the first RECOCAPE gathering. I will be talking about the 'Process Increments' method, which my colleague Mohamed Amr and I authored in 2010 and presented at the Agile Conference 2011.

Now, this is a good opportunity to introduce the concept of 'Process Increments'.

A process increment is a chunk of process improvement which can be implemented in a relatively small time (1-2 weeks) and still provide value for the organization. A process increment is independent of any other process increment, although it may have prerequisites.

The concept of a process increment in software process improvement (SPI) projects is almost identical to that of user stories in Agile development projects, as the next diagram shows:

Process Increments
Process Increments mapping to Themes, Epics, and User stories

Furthermore, process increments are estimated in points and have a very well-defined definition of 'Done'. They can even be written on index cards!

Also, the whole project is planned in releases and iterations, and tracked using burn charts. In short, process increments are about running process improvement projects as typical Agile projects.

We found that this approach has excellent results, including:
  • Better project visibility
  • Faster adoption of Agile practices
  • Faster improvement velocity in general
  • Very high team morale!
If you would like to read more about the results of the study, here is the link to the paper in the IEEE Xplore digital library.

Also, you may download the paper for free from the Agile Alliance website. You may also download the presentation and watch me presenting it live at the Agile Conference 2011 in Salt Lake City, Utah.

Monday, May 7, 2012

Agile Solves Problems and Introduces Others!

Recently, I got to know one of the experts at ESI, the European partner of the SEI. I met him in the corridor of ITIDA during his visit to Egypt for one reason or another. We chatted for a couple of minutes, and I quote him saying:
"Agile solves problems and introduces others!"
It was clear that he has a negative attitude towards "Agile"; maybe he was talking about some of the bad failure patterns of Agile implementation, like those I mentioned in this blog item.

However, I would affirm that the problems Agile solves are far bigger than the ones it introduces.

I also forgot to ask him: what about CMMI-based process improvement, especially when following waterfall or phased development, does it really solve any problems? Or does it just introduce others? :)

Thursday, April 12, 2012

3Q's Method: Measuring the Progress of Code Refactoring

Currently, I'm assisting several projects in refactoring large code bases of legacy applications. These are products of very bad, or better said, deteriorated design, and other products which did not follow any design principles whatsoever.

First of all, let's agree on the basic idea that you cannot manage what you cannot measure. So, how would we measure how good or bad the code design is?

The strategy that I will employ is the 3Q's strategy, described in the diagram below:

Systematic refactoring of legacy code using the 3Q's strategy

The first Q: Quick Wins

The first stage is to pick the low-hanging fruit: identifying and removing dead code, removing duplicate code, and reducing method length. At this stage, the following measures are helpful:
  • Cyclomatic complexity: should be 10 or below
  • Average method length: 15 lines or below
  • Code duplication: for 3 (or more) lines of code
  • Overall code size: should be monitored, with targets set for reducing it
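For a Python code base, the first two thresholds can be checked with a rough script. This is only an illustrative sketch: it approximates cyclomatic complexity as 1 plus the number of branch points (a simplification of McCabe's definition), and a real project would use a dedicated metrics tool instead.

```python
# Rough sketch of checking the Quick Wins thresholds on Python source:
# cyclomatic complexity approximated as 1 + number of branch points,
# and method length as the span of lines the function occupies.

import ast

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)

def complexity(func_node):
    return 1 + sum(isinstance(n, BRANCH_NODES)
                   for n in ast.walk(func_node))

def check(source, max_cc=10, max_len=15):
    """Return (name, cc, length) for every function over a threshold."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            cc = complexity(node)
            length = node.end_lineno - node.lineno
            if cc > max_cc or length > max_len:
                offenders.append((node.name, cc, length))
    return offenders

sample = """
def tangled(x):
    if x > 0:
        for i in range(x):
            if i % 2 and i % 3:
                x += i
    return x
"""
print(check(sample, max_cc=3))  # → [('tangled', 5, 5)]
```

Run over the whole code base each iteration, the offender list itself becomes a trend line: it should shrink as the Quick Wins stage progresses.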

The Second Q: Divide & Conquer

The next stage is to start pulling apart components, whether business (or functional) components or utility components. This reorganization of the code is essential to break down its complexity, and will open up a large list of refactoring opportunities. At this stage, we add the following two measures:
  • Instability: to measure correct use of design layers
  • Efferent and afferent coupling between component interfaces and the outer world

The Third Q: Build Quality In

The last stage is to start writing unit tests for the component interfaces. This is necessary in order to baseline the code quality and start doing more profound refactorings. At this stage, we add the following measure:
  • Unit tests code coverage

Development Process Effectiveness and Efficiency Measures

All of the above are measures of "good and healthy code". However, how would I measure the improvement of the development process itself? In other words, how would I know whether or not these measures improved the development process's effectiveness and efficiency? The following measures serve:
  • Ripple effect (aka change impact): Number of touched methods per change. This measure should start high, then decrease over time
  • Fault feedback ratio (FFR): Number of injected bugs over number of resolved bugs. In healthy projects, this measure should be less than 0.3
  • Average defect density: Number of defects per code size unit, averaged for all changes in an iteration. This measures the amount of defects, whereas FFR measures the healthiness of the code fixing process
  • Average cost of one change per size unit: This is a bit tough to measure. But, depending on the nature of the product, changes can be sized and the cost can be normalized by the change size
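As a sketch of how these process measures might be tracked per iteration (the iteration numbers below are invented), the first two are simple ratios:

```python
# Sketch of the process-effectiveness measures with hypothetical
# iteration data: FFR and defect density computed per iteration,
# so the trend can be reported to management from day 1.

def ffr(injected_bugs, resolved_bugs):
    """Fault feedback ratio: bugs injected per bug resolved."""
    return injected_bugs / resolved_bugs

def defect_density(defects, kloc):
    """Defects per thousand lines of changed code."""
    return defects / kloc

# One record per iteration: (resolved, injected, defects, changed KLOC)
iterations = [(20, 14, 30, 4.0), (25, 12, 28, 4.5), (30, 10, 22, 5.0)]

for i, (resolved, injected, defects, kloc) in enumerate(iterations, 1):
    print(f"iteration {i}: FFR={ffr(injected, resolved):.2f} "
          f"density={defect_density(defects, kloc):.1f}/KLOC")
```

With this made-up data, the FFR falls from 0.70 towards the healthy 0.3 line over three iterations, which is exactly the kind of trend worth showing to senior management.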
It is worth mentioning that we should record readings for the development process measures starting from day 1. This would be the only evidence of improvement for higher management. It would be much more indicative to tell senior management that the FFR decreased from 0.7 to 0.34 than to tell them that the overall code size decreased from 35 KLOC to 20 KLOC :)

If you have previous experience with similar projects, which measures did you use?

Wednesday, April 4, 2012

Visio Activity Diagrams Stencil: A Valuable Tool for Lean Analysis of Your Process

UML activity diagrams are an excellent tool for process modeling, and I have been using them for about 9 years now. One of their great uses is to model your current process and visualize it to discover non-value-added activities, or waste in other words.

This is a simple model of a typical Scrum process, modeled as a UML activity diagram:

What I have for you is a Visio 2007 stencil for drawing such beautiful activity diagrams!

To download the stencil, follow this link: UML 2.2 -Activity Diagrams.vst

Tuesday, March 20, 2012

Agile Configuration Management (2): Towards a Practical Definition of Software Configuration Management

Although a definition should simplify a concept, in the case of Software Configuration Management the definition complicates it! Take, for example, this one:

"A discipline applying technical and administrative direction and surveillance to (1) identify and document the functional and physical characteristics of a configuration item, (2) control changes to those characteristics, (3) record and report change processing and implementation status, and (4) verify compliance with specified requirements" - SEI CMMI Glossary

To complement this definition, the SEI added references to 7 other definitions: configuration audit, configuration control, configuration identification, configuration status accounting, configuration item, product, and audit. What this effectively does is add to the complexity of the definition!

On the other hand, there are other definitions which are simple and to the point, and at the same time give a clear explanation of what configuration management means. They may not be unanimously accepted by theorists as correct. However, each proposes a clear definition of CM, and I personally believe they are excellent definitions. Here are two of them:

    "Software CM is a discipline for managing the evolution of computer program products, both during the initial stages of development and during all stages of maintenance" - ANSI/IEEE standard 1042-1987 (withdrawn standard)

    "In software engineering, software configuration management (SCM) is the task of tracking and controlling changes in the software" - Wikipedia 

    These two later definitions captures in simple terms the essence of Software configuration management as per its original intent. Building on this definition, I have added a 'Capability-oriented' definition, which defines SCM in terms of capabilities it adds to the team:

    "Software configuration management enables the team to trace releases, work-items, and work products to each other"

    A strong SCM environment empowers the team to relate workitems (what has been done) to work products (artifacts of work done) to releases (packaged and delivered software products). This is what I describe as a 'Strong configuration management environment'.

    Implications of this definition is huge. It means that the team may instantly know the history of a workitem (say  a bug), when it was released and which artifacts or code changed due to it. The team may also know every thing about a specific release to a customer, which bugs or user stories were included, and what code files or documents delivered as part of this release.

Saturday, March 3, 2012

Agile Configuration Management (1): Does it Make Any Sense?

Lean thinking is one of the pillars of Agile software development. "Lean Configuration Management for Agile Teams" is the title of the latest workshop I'm conducting at SECC. Now, I'm giving myself an opportunity to write about why this topic is important.

Configuration Management is one of the great successes of software engineering. It was marked by great figures like Gerald Weinberg as one of the achievements of software engineering in the 90's. However, the older the topic got, the heavier it became. It is currently perceived that a middle-sized company would need about 7-10 templates, 4-5 procedures, and many sub-activities to implement a "good" configuration management environment.

In this workshop, I tried to dig into the essence of CM and what is really useful about it. I went through texts dealing with the topic and reviewed all the implementations I have been part of. I tried to bridge the gap between theory and practice. I was doing this in order to answer the question of one of my colleagues: "All of this stuff has absolutely no value, why are we doing it?". I was also trying to answer another question: how to become a CMMI-L3 company while still Agile and Lean, especially with regard to the CM process area.

Actually, many of the practices currently implemented as part of the CM process add little or no value, or may add value in contexts other than software development. I have also seen many practices of real value implemented in a completely incorrect manner, which made them of no value for that specific company or team.

It is time to implement "lean" configuration management, which achieves the utmost benefit for the team with minimal waste or overhead. Keep watching for my next posts.