Why Programs Fail in the I&T Phase

May 4, 2020 By iwano@_84

Why do so many large software-centric programs fail in the Integration and Test (I&T) phase? The answer is quite simple. When system hardware contained the majority of system functionality, System Engineering groups were staffed by electrical engineers, mathematicians, and physicists whose job was to specify, design, build, and test specialty hardware cabinets that performed the system functions. With the advent of software-intensive systems, the system engineering hardware design function was replaced by the software factory, and system hardware is now ordered from Cisco and Jupiter. System engineering no longer carried the load from beginning to end, so it abdicated the responsibility to integrate and test the system. By default, the integration and test function became the responsibility of the software engineer, who was busy designing software and did not have the time, ability, or inclination to plan and execute successful system integration and test efforts.

The purpose of this paper is to address the system engineering abdication problem by presenting an I&T methodology that identifies the concept of “System Engineering dropped balls” and provides a solution for each dropped item. Individual organizations may wish to add or delete items from this list. The important part is the recognition of the “dropped ball” concept and the acceptance by the I&T community of the responsibility to solve the problem.

The I&T group must assume responsibility for each of these solutions, which may number twenty or more. They are critical to the success of the I&T portion of the program, and no one else will assume ownership. This means that the I&T group must act as an adjunct of system engineering. It does not mean that the I&T group must execute the solution for each item. However, it must assume the responsibility for making certain that each item is accommodated somewhere in the program. The items under discussion are found in column 1 of the chart presented in Annex A and are discussed in detail in this document.

A Change in Culture is Required

One problem is that I&T is considered to be purely “T” in many organizations with the emphasis on test activities at the end of the program. I&T must be considered as an essential equal at the very outset of a program (e.g., in the bid/tender phases). If not, then it is highly likely that the program has been planned at a high-risk level. At the very best, a substandard system is delivered that is degraded and compromised. It is too late to consider I&T mid-program or later. By the time a realization is made that there is a problem, corrective action is usually futile or prohibitively expensive. A culture change is required for success in the I&T area.

A second problem arises in small programs. When you have a 10-person team for the entire program, it is unlikely that anyone is given the I&T responsibility at the beginning. A possible solution for the small program environment is to have a designated staff person responsible for making certain that all small programs have adequate I&T planning. That way the cost of I&T planning can be shared across programs.

Let’s define what we mean by System Integration. “System Integration is a life-of-the-program process that encapsulates all levels of elemental product fusion (physical, electrical, software). Integration planning begins at the time the system is conceived. Integration activity ends when the system has been successfully sold off. This process includes all activities (planning, design, tools, and schedules) necessary to ensure that the system is designed and developed such that the independently developed parts can be successfully fused during the first part of the System Integration phase. The system functions and requirements are then verified late in the System Integration phase. These same functions and requirements are then validated in the presence of the procuring agency during the System Test phase.”

Problems in the integration and test development cycle can never be eliminated. This is a natural part of the process and one that has to be accepted as inevitable at the outset. What can be managed is the likelihood of those problems establishing a critical mass that causes a program to fail. Successful programs are difficult to execute – even with well-established processes and mature organizations. It is impossible to establish the conditions for success where these processes do not exist or have not been afforded adequate consideration.

To avoid the failures predicted above, which result from negligence at the beginning of a program, a step-by-step Due Diligence process has been defined. This process will aid the I&T engineer in planning for a successful I&T experience. The Due Diligence process is founded on one statement: the areas that are likely to be the major contributors to program failure (in the I&T phase) can be forecast at the beginning of the program, and preventative measures can be put into place at an early date. Some of these areas have characteristics that are unique to the specific program. The remainder are areas where programs traditionally get into serious trouble.

I&T as an Engineering Discipline

I&T is not normally treated as an engineering discipline. Most system development activities have the characteristics of deterministic homework problems. There is an instruction book, problems to solve, answers in the back of the book, classes, mentors, and tests to pass. The generic engineering community should be capable of performing tasks that resemble homework problems. Most I&T activities (during the actual integration of the system) do not resemble engineering homework problems. They more resemble forensic analysis. Therefore, we cannot expect the majority of the generic engineering community to successfully execute I&T activities.

The successful I&T professional has somehow acquired the desire to make things work even with limited insight into the particulars of the product. For some reason, they don’t mind working a complex problem on Saturday night. For some reason, they don’t mind figuring out why a particular problem occurs 5 hours and 22 minutes after system initialization (one time in every four). They truly have a forensic analysis skill set that is different from the engineering norm.

Experience has shown that about 5% of the generic engineering population “has the right stuff”. The rest are just not interested in participating in an area where the comfort level is low and there is no product development ownership. This 5% of the engineering population (if properly selected) can make a significant contribution to the efficient execution of a contract. Indiscriminate selection of personnel will result in an inefficient operation. Continuing to retrofit individuals and processes will result in substandard operations. I&T practitioners need to develop the same level of competence as System Engineering, Software, and Manufacturing practitioners. Because the universities are not likely to solve this problem, it remains for our internal engineering organizations to perform this service.

The Due Diligence Process

The Due Diligence process is the key to removing the state of denial that is prevalent in many program offices. The process is based on 20 key technical items. These items constitute the “abdicated System Engineering items” discussed earlier. Each of these items must be considered and evaluated when the system is defined. It is the responsibility of the I&T group to make certain that this happens. There will be significant areas of neglect. Therefore, it is necessary to establish where the program is in respect to these areas at an early date. It is the responsibility of the I&T group to make certain that these items are raised to a satisfactory state.

The Due Diligence will make visible the items of concern (the abdicated items) that must be corrected in order to have a successful I&T experience. Most of the items of concern will fall into the system engineering area of responsibility. It is highly likely that these items will have received little, if any, attention prior to the time of this exercise. It is imperative that these deficiencies be corrected prior to the Preliminary Design Review (PDR) for the program. Stated another way, it is recommended that a successful Due Diligence be made a criterion for entering the PDR.

The Due Diligence is executed at least twice during the life of the contract. The first time is at the end of the Proposal generation period to make certain that the project can be executed. The second is at the beginning of the System Design phase (phase 2).

Each of the items listed in the System Definition column is evaluated and color-coded (red/orange/yellow/green) by the most learned persons on the contract. Please refer to Annex A for an example of a color-coded Due Diligence summary.

1. Red – Not in the budget and schedule. Very bad.

2. Orange – In the schedule and budget, but inadequate. Hard to accomplish.

3. Yellow – In the schedule and budget and should be OK, if given adequate attention.

4. Green – Easy to accomplish, plenty of time and budget.

5. NA – Not applicable.

Each of the Due Diligence items is defined in detail in the following sections. A composite of the Due Diligence results will present an accurate picture of what needs to be done. Once the weak points (Red or Orange) have been identified, a “Return to Yellow (RTY) plan” is developed for each deficient item. We use Yellow as the desired state because Yellow means “OK, if attention is paid to the item.” An item is deemed deficient when several participants identify the item as Red or Orange. The timing of this evaluation is critical, as some of the corrections will require substantial amounts of time (years) and money. It is recommended that the status of this plan be reported by the I&T manager at the appropriate program reviews and that it be updated quarterly.
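The deficiency rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text says “several participants” without giving a number, so the threshold of 2 used here is an assumption, and the item names are hypothetical.

```python
from enum import Enum


class Rating(Enum):
    RED = "Red"        # not in the budget and schedule
    ORANGE = "Orange"  # in the schedule and budget, but inadequate
    YELLOW = "Yellow"  # should be OK, if given adequate attention
    GREEN = "Green"    # easy to accomplish
    NA = "NA"          # not applicable


# Assumption: "several participants" is read here as at least 2.
DEFICIENT_THRESHOLD = 2


def is_deficient(ratings, threshold=DEFICIENT_THRESHOLD):
    """Return True when enough participants rated the item Red or Orange."""
    weak = sum(1 for r in ratings if r in (Rating.RED, Rating.ORANGE))
    return weak >= threshold


def items_needing_rty(responses, threshold=DEFICIENT_THRESHOLD):
    """responses maps item name -> list of Rating, one per participant."""
    return [item for item, ratings in responses.items()
            if is_deficient(ratings, threshold)]


# Hypothetical item names, for illustration only.
example = {
    "Test tools": [Rating.RED, Rating.ORANGE, Rating.YELLOW],
    "Interface definitions": [Rating.YELLOW, Rating.GREEN, Rating.ORANGE],
    "Lab facilities": [Rating.NA, Rating.NA, Rating.NA],
}
print(items_needing_rty(example))  # ['Test tools']
```

A program would choose its own threshold; the point is that the rule is mechanical once the color codes have been collected.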

It is worth repeating that it is the responsibility of the I&T group to make certain that all items in column 1 are in a satisfactory state (yellow or green) early in the program.

Due Diligence Process Steps

The first step is to establish what the list of items in column 1 should look like for a particular program. You accomplish this by first asking all of the most learned people on the program to provide a list of the areas where they think I&T will most likely get into trouble and also to list the data necessary to get out of trouble. Include yourself among the list of learned folk. Hold the list to twenty or fewer items per person. You will find this item at the top of column 1 (20 ugly problems list).

The second step is to filter the above inputs by eliminating the hand-wringing and whining, the items that are redundant with the generic list in column 1 of the chart, and the superfluous inputs. Next, we generate the list of Due Diligence items for the specific program along with a list of definitions. The list consists of items 2-N from column 1 of the chart plus new items derived from the “tough problems list”. Most of the items from the tough problems lists will be redundant with the items in the generic list. However, there are usually 2-5 pearls of wisdom in the tough problems list.

Next, we submit to each of the learned folk the total list of items (along with detailed definitions) and ask them to report their opinion of the color code of each item. The list will likely have lots of Red and Orange items. Remember to include yourself as a participant. If you don’t get results from some individuals, schedule a one-hour meeting with them and help them do the color coding.

Next, we put together a summary diagram similar to the one in Annex A showing the composite opinions of the most learned folk. The items from column 1 form the vertical axis and the responses from the staff constitute the columns. The summary will likely show lots of Red and Orange. Now, we have easily comprehended evidence of the nature of the total problem. If the problem is severe, the program will no longer be able to maintain a state of denial.
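A minimal sketch of how such a summary could be assembled follows. It assumes each respondent's answers arrive as a simple mapping from item name to color; the respondent and item names are hypothetical, and the plain-text grid merely stands in for the colored chart of Annex A.

```python
WEAK = {"Red", "Orange"}
COLORS = {"Red", "Orange", "Yellow", "Green"}


def build_summary(items, responses):
    """Return (respondents, rows); rows[i] holds each respondent's color for items[i].

    responses maps respondent name -> {item name -> color}.
    A missing answer is recorded as "?".
    """
    respondents = list(responses)
    rows = [[responses[name].get(item, "?") for name in respondents]
            for item in items]
    return respondents, rows


def weak_share(rows):
    """Fraction of color-coded cells (NA and missing excluded) that are Red or Orange."""
    cells = [c for row in rows for c in row if c in COLORS]
    if not cells:
        return 0.0
    return sum(1 for c in cells if c in WEAK) / len(cells)


def render(items, respondents, rows):
    """Lay the summary out as text: items down the side, respondents across."""
    label_width = max(len(text) for text in items)
    col_width = max(len(text) for text in respondents + ["Orange"])
    lines = [" " * label_width + " | "
             + " | ".join(name.ljust(col_width) for name in respondents)]
    for item, row in zip(items, rows):
        lines.append(item.ljust(label_width) + " | "
                     + " | ".join(c.ljust(col_width) for c in row))
    return "\n".join(lines)


# Hypothetical data, for illustration only.
items = ["Test tools", "Simulators"]
responses = {
    "Engineer A": {"Test tools": "Red", "Simulators": "Orange"},
    "Engineer B": {"Test tools": "Orange", "Simulators": "Yellow"},
}
respondents, rows = build_summary(items, responses)
print(render(items, respondents, rows))
print(f"Red/Orange share: {weak_share(rows):.0%}")
```

The share of Red and Orange cells gives a single number that is hard to argue with when the chart is shown to program management.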

At this point we need to perform a little psychological analysis of the responses in the summary chart. Most times all of the responses will be approximately the same, with approximately 80% of the responses being either Red or Orange. Occasionally we get a response that is significantly different from the others and contains a large number of Yellow and Green entries. I have seen several reasons for these outliers.

1. The individual is conditioned to report good news no matter what the true conditions might be. This individual is not likely to be inclined to take the necessary corrective measures.

2. The individual is extraordinarily optimistic and sees everything through Yellow-colored glasses, to coin a phrase. This individual is not likely to be inclined to take the necessary corrective measures.

3. The individual feels a lack of responsibility for these critical I&T activities and reports everything OK on the assumption that he won’t be asked to fix the problem if he reports everything is OK. If the individual is in a position to heavily influence the I&T activities, it would be a good idea to consider a realignment exercise for this individual.
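Spotting such outliers can be partly automated before the psychological analysis begins. The sketch below compares each respondent's share of Red and Orange answers with the group median and flags anyone far below it. The 0.4 margin is an assumption chosen for illustration, not a figure from this paper, and the respondent names are hypothetical. The code only points at who to look at; deciding which of the three reasons applies remains a human judgment.

```python
from statistics import median

WEAK = {"Red", "Orange"}


def weak_fraction(ratings):
    """Share of a respondent's non-NA answers that are Red or Orange.

    Returns None when the respondent rated nothing.
    """
    rated = [c for c in ratings if c != "NA"]
    if not rated:
        return None
    return sum(1 for c in rated if c in WEAK) / len(rated)


def optimistic_outliers(responses, margin=0.4):
    """Flag respondents whose Red/Orange share is well below the group median.

    responses maps respondent name -> list of colors.
    margin is an assumed cut-off, not a value taken from the text.
    """
    fractions = {}
    for name, ratings in responses.items():
        share = weak_fraction(ratings)
        if share is not None:
            fractions[name] = share
    if not fractions:
        return []
    typical = median(fractions.values())
    return [name for name, share in fractions.items()
            if typical - share >= margin]


# Hypothetical responses: three respondents near 80% Red/Orange, one far below.
responses = {
    "A": ["Red", "Red", "Orange", "Orange", "Yellow"],
    "B": ["Red", "Orange", "Orange", "Red", "Green"],
    "C": ["Orange", "Red", "Red", "Yellow", "Orange"],
    "D": ["Yellow", "Green", "Green", "Yellow", "Orange"],
}
print(optimistic_outliers(responses))  # ['D']
```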

Next, we review the results and decide what a “return to yellow/green plan” (RTY) would involve for each item. Each RTY plan will state:

1. The nature of the problem

2. The cause of the problem

3. The proposed problem solution

4. The person identified as recommended to execute the plan
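The four elements above map naturally onto a small record. The sketch below is one possible shape for that record, with a check that nothing has been left blank before the plan goes to I&T management. The class name, field names, and example values are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class RtyPlan:
    item: str               # Due Diligence item the plan addresses
    problem: str            # 1. the nature of the problem
    cause: str              # 2. the cause of the problem
    solution: str           # 3. the proposed problem solution
    recommended_owner: str  # 4. the person recommended to execute the plan

    def missing_fields(self):
        """Return the names of any fields that are still empty."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]


# Hypothetical example with the owner not yet identified.
plan = RtyPlan(
    item="Test tools",
    problem="No budget for an interface simulator",
    cause="Simulator was omitted from the proposal estimate",
    solution="Add simulator development to the phase 2 schedule",
    recommended_owner="",
)
print(plan.missing_fields())  # ['recommended_owner']
```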

Next, we review the results and the RTY plan descriptions with the I&T management. Once a plan of attack has been agreed upon, we present the results to the program manager. When faced with a picture that shows that his most learned folk agree that there are serious problems, it is very difficult for a program manager to disagree. We have removed the “state of denial”.

Next, we get responsible individuals assigned to the RTY plans and develop a schedule for their completion. We have now established a path to success.