Introduction
In today’s highly competitive industrial environment, many high-tech businesses are using Technical Risk Management (TRM) in their engineering design programs as a means of improving the chances of success.
Objective
The student will learn how to construct a technical risk management plan in order to avoid potential risks to a project.
Study Time: 4.0 hours
Overview
Technical Risk Management (TRM) allows an engineering team to identify potential failure modes early in a project, so that corrective actions can be effectively implemented. TRM also allows management of an organisation to set priorities for the tasks in a project, so that available resources are appropriately applied. TRM requires use of a methodical process to ensure that all potential risks are accounted for.
The process identifies, evaluates, mitigates, and manages the technical risks impacting the program. This section will discuss implementation of the process and will provide a simple example to show how the process can be applied.
Introduction
Technical Risk Management (TRM) is a process that assesses the technical risks to a project. The risks are identified, ranked, and addressed in order to mitigate the chance of project failure. A number of high-tech industries have begun to include TRM as part of their design effort. Modarres (2006) wrote “engineering systems are becoming more complex and demand for risk analysis is greater than ever.” He adds, “recently, many legislators including U.S. Congress have advocated greater reliance on the use of risk information.” United States Department of Defense military standard MIL-STD-882, System Safety Program Requirements, states “A formal safety program that stresses early hazard identification and elimination or reduction of associated risk to a level acceptable to the managing activity is the principle contribution to effective system safety.” (US Department of Defense, 1984) Even when TRM is not a contractual requirement, many businesses are using it on their own, so as to ensure the success of their projects.
A TRM process involves the design team utilising critical thinking and problem solving skills to examine risks early in a project, so as to avoid last minute panics. TRM also helps the design team determine the priority of their tasks and it aids in the allocation of available resources. Project managers see it as a means of reducing the risks to the success of a project. It also helps them manage their staffing resources.
Within the aerospace industry, it has been stated, “Increasingly, Government customers and Industry contractors seek better methods to identify and manage technical, schedule, and cost risk.” (Black, 2001) The medical device industry also has increasing expectations in this regard, where Kaye and Crowley (2000) have described the use of TRM: “Risk Management is a systematic application of policies, procedures, and practices to the analysis, evaluation, and mitigation of risks. It is a key component of quality management systems, and is a central requirement of the implementation of design controls.”
The Four Steps of TRM
The TRM process consists of four steps: Risk Identification, Risk Ranking, Risk Mitigation and Risk Tracking.
In the first step, Risk Identification, the project team examines the system that they are designing. Attention is paid to possible failure modes for the design.
In the second step, Risk Ranking, the project team assesses the risks from step one and ranks them so as to determine which presents the biggest threat to the project. The results of this step allow the project manager to appropriately prioritise the design tasks necessary to address each risk. This step, therefore, aids proper scheduling of tasks and the allocation of resources.
In the third step, Risk Mitigation, the project team must define the actions (such as design, analysis, or testing tasks) that must be performed in order to minimise the risk presented by each failure mode which was identified in the first step and ranked in the second step. This requires the design team to perform critical problem solving. Doing this early in the project avoids last minute panics which often result in a project running late and over budget.
The final step of the TRM process is Risk Tracking, which entails an ongoing activity that follows through on the tasks of the Risk Mitigation plan. All tasks from step three must be completed, so that the risk is mitigated as intended. Unless the plan is properly followed to completion, the time spent on the first three steps was wasted.
Risk Identification
Identifying potential risks to a design project can never be started too early. Reaction and response times are always shorter early in a project than they are later. Early identification of risks improves the project team’s ability to resolve them with minimal backtracking and redesign, thus minimising impact on project cost and schedule. The first risk identification activity should occur soon after the initial concept is selected. The team should step back and brainstorm all possible failure modes that present a risk to the project. No effort should be made to determine relative importance or to define solutions at this stage. If a team comes up with only one or two risk items, then the team members are not being realistic. There are always things that can go wrong, and team members should examine all such possibilities, so as to identify all potential risks.
The project team should stay alert for new risks that arise as the project progresses and add them to the list of identified risks. New risks may emerge because the detailed design or analysis stage highlights potential issues; this is almost certain to happen as the design team moves through the design process.
Risk Ranking or Scoring
The first step forced the team to step back and consider possible failure modes of their design. The next step determines which risks are the greatest threat to success. The highest-ranked risks should be addressed first and given priority in the allocation of resources. The project team must rank every risk that was listed in step one. The likelihood of each risk’s occurrence and its potential consequence, or impact, must be evaluated. Likelihood measures the probability that a failure will actually occur, and could be quantitative, resulting from probabilistic design calculations. However, likelihood values may have to come from educated estimates by the engineers, based on preliminary design work, because exact calculations are often not possible early in a project. Qualitative assessments based on the judgment of the technical staff may have to suffice. Kumamoto and Henley (1996) present a mathematical treatment of probabilistic risk, saying that each risk should be “expressed as an objective probability, percentage or density per action or unit time, or during a specified time interval.” But they acknowledge, “Unfortunately, the likelihood is not always exact; probability, percentage, frequency, and ratios may be based on subjective evaluation. Verbal probabilities such as rare, possible, plausible, and frequent are also used.” For purposes of this discussion, we will use three subjective gradations, representing low, medium, and high probability.
Next, the consequence of failure must be assessed for each risk. Failure consequences are also difficult to quantify, and Kumamoto and Henley (1996) say that “verbal and ambiguous terms such as catastrophic, severe, and minor may be used instead of quantitative measures.” They also point out that consequences need to be tailored to particular projects, and significance may depend on intangible attributes such as culture, ethics, emotion, reconciliation, media coverage, context, or litigability, as well as the fact that people estimate the outcome significance differently when a population is at risk as compared to placing an individual at risk. For purposes of this discussion, we will designate the consequences to be low, medium, and high impact.
Once the probability and consequence for each risk have been established, that item can be assigned a Risk Score, via a scoring matrix like the one shown in the figure below. The scoring matrix both quantifies the risks (giving them a score based upon a combination of probability and consequence) and allows them to be categorised into easy-to-understand levels (usually colour coded red, yellow, and green). Although simplistic, this colour coding provides a means of quickly and clearly highlighting, to management, which risks are of most concern. Jarrett (2000) explains that corporate management views things more simplistically than most engineers do and that, as regards risk decisions, “even if it were possible to develop complex representations of risk accurately, it is difficult for the executive to deal with them. Instead, the executive is able to deal with a few scenarios and possible cases, and only with three general levels of conceptual risk associated with them: High Risk, Medium Risk, and Low Risk.”
This risk ranking matrix can be tailored to suit specific projects. If higher refinement is required, a five-by-five or larger matrix could be used, featuring finer gradations of both probability and consequence.
It can be seen that the scoring matrix in the figure below applies equal weight to both probability and consequence, which need not be the case. The process can be tailored to meet the specific needs of a particular project and the scoring matrix can be skewed slightly toward the consequence axis to better account for the impact of potential failures.
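A three-by-three scoring matrix of this kind can be sketched in a few lines of Python. The rule `probability + consequence - 1` (with low, medium, and high mapped to 1, 2, and 3) is an assumption rather than a reading of the figure; it was chosen because it weights both axes equally and agrees with the scores in the worked winch example later in this section. The names `LEVELS` and `risk_score` and the `consequence_weight` parameter are hypothetical, shown only to illustrate how a matrix might be skewed toward consequence.

```python
# Sketch of a 3x3 risk scoring matrix. The additive rule below is an
# assumption for illustration, not a transcription of the figure.
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_score(probability, consequence, consequence_weight=1):
    """Combine verbal probability and consequence ratings into a score.

    With consequence_weight=1 both axes carry equal weight and the score
    runs from 1 (low/low) to 5 (high/high). A larger weight skews the
    matrix toward the consequence axis (hypothetical tailoring).
    """
    p = LEVELS[probability.lower()]
    c = LEVELS[consequence.lower()]
    return p + consequence_weight * (c - 1)


# Print the equal-weight matrix, one row per probability level,
# with consequence running low -> medium -> high across each row.
for prob in ("high", "medium", "low"):
    row = [risk_score(prob, cons) for cons in ("low", "medium", "high")]
    print(f"{prob:>6}: {row}")
```

Calling `risk_score("low", "high", consequence_weight=2)` scores a low-probability, high-consequence risk above a high-probability, low-consequence one, which is one possible way to express the skew described above.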
Risk Mitigation Plan
Arguably the most useful part of the TRM process is the third step: development of a Risk Mitigation Plan, a list of steps that will attack each risk assessed as having an unacceptably high level. Project team members must use their combined knowledge, skills, and resources to outline a series of actions which will reduce the high risk scores to low risk scores. These steps may involve any of the following:
- Redesign of existing aspects of the project
- Analysis of parts that have been designed (either by hand calculation or finite element analysis)
- Testing
- Addition of new aspects to the design that lower the impact of a failure if it occurs
Risk Tracking
The fourth and final step of the TRM process is the execution of the Risk Mitigation Plan. This management oriented phase covers the longest period of time. It is the most important of the four steps because it involves the project team performing the steps outlined in the Risk Mitigation Plan, ensuring that each task of the plan is completed, resulting in the risk actually being reduced.
Identifying, ranking, and mitigation planning are all pointless unless the plan is actually executed.
Application of TRM
Practical Example
For purposes of example, let us look at a design project (Hylton, 2009): designing and prototyping a new hand-operated winch, as shown in Figure 2.
This figure presents an assembly drawing of a winch for use in a TRM example. We will now undertake each of the four steps of TRM in turn, starting with Risk Identification below.
Figure 2
Risk Identification
We begin by identifying potential risks, including not only possible failures of mechanical components, but also related technical problems that could impact our objectives. We might come up with a list of failure modes like that shown in Table A.
Table A: Risks identified by the example design team for the Winch Project
1. | Failure of the cable |
2. | Failure of the handle/lever which the user pushes on |
3. | Failure of the drum onto which the cable is wrapped |
4. | Failure of the shaft the drum spins on |
5. | Failure of the ratcheting gear |
6. | Failure of the latch which prevents ratchet from unwinding |
7. | Failure of the bolts mounting the winch to the trailer |
8. | Failure of vendor to supply components for assembly on time |
9. | Failure of fabrication shop to meet schedule |
Risk Ranking
Now we determine a ranking order for the identified risks, in terms of both probability and impact. Here is some of the logic that might guide our rankings:
- A cable size selected at random would not take into account the loading on the cable. We could therefore consider the probability of cable failure to be on the order of 50%, so let’s assign the likelihood of occurrence as medium.
- If the winch were used to pull a car onto a trailer (a typical use for such a mechanism), a failure of the cable would suddenly release the car, which would roll backwards.
- This would create the potential for damage or injury to anything or anyone behind the trailer at the time.
- This might well be treated as a high consequence.
Following a similar process for all of the identified risks, we would create a series of ranking scores like those shown in Table B:
Table B: Risk probabilities and consequences assigned for the Winch Project
No. | Failure mode | Probability | Consequence |
1. | Failure of the cable | Medium | High |
2. | Failure of the handle | Medium | Low |
3. | Failure of the drum | Low | Medium |
4. | Failure of the shaft | Medium | Medium |
5. | Failure of the gear | Low | High |
6. | Failure of the latch | High | High |
7. | Failure of the bolts | Medium | High |
8. | Failure of vendor supply | Low | High |
9. | Failure of fab shop | Medium | High |
Using the matrix from Figure 1, we would use the probabilities and impacts from Table B to generate the risk scores shown in Table C.
Table C: Risks scored by the example design team for the Winch Project
No. | Failure mode | Probability | Consequence | Risk Score |
1. | Failure of the cable | Medium | High | 4 |
2. | Failure of the handle | Medium | Low | 2 |
3. | Failure of the drum | Low | Medium | 2 |
4. | Failure of the shaft | Medium | Medium | 3 |
5. | Failure of the gear | Low | High | 3 |
6. | Failure of the latch | High | High | 5 |
7. | Failure of the bolts | Medium | High | 4 |
8. | Failure of vendor supply | Low | High | 3 |
9. | Failure of fab shop | Medium | High | 4 |
It becomes obvious, from the higher numbers, which risks deserve to receive attention first and which can be delayed until later in the program. Some can even possibly be ignored, if the risk is adequately low. We have turned qualitative assessments of the likelihoods and impacts for the potential risks into a quantitative risk score which gives the project manager direction as to which issues to address first, and therefore, where program resources should be directed.
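The scoring and prioritisation just described can be reproduced with a short Python sketch. The probability and consequence ratings are those of Table B. The additive rule `probability + consequence - 1` is an assumption; it is used here because it reproduces every score listed in Table C. The names `RISKS`, `score`, and `ranked` are illustrative.

```python
LEVELS = {"Low": 1, "Medium": 2, "High": 3}

# (failure mode, probability, consequence) as assigned in Table B.
RISKS = [
    ("Failure of the cable", "Medium", "High"),
    ("Failure of the handle", "Medium", "Low"),
    ("Failure of the drum", "Low", "Medium"),
    ("Failure of the shaft", "Medium", "Medium"),
    ("Failure of the gear", "Low", "High"),
    ("Failure of the latch", "High", "High"),
    ("Failure of the bolts", "Medium", "High"),
    ("Failure of vendor supply", "Low", "High"),
    ("Failure of fab shop", "Medium", "High"),
]


def score(probability, consequence):
    # Assumed equal-weight rule; it matches the scores listed in Table C.
    return LEVELS[probability] + LEVELS[consequence] - 1


# Sort highest score first. Python's sort is stable, so risks with equal
# scores keep the order in which they appear in the table.
ranked = sorted(
    ((name, score(p, c)) for name, p, c in RISKS),
    key=lambda item: item[1],
    reverse=True,
)

for name, s in ranked:
    print(f"{s}  {name}")
```

The latch heads the resulting list with a score of 5, followed by the three risks that score 4 (cable, bolts, and fabrication shop).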
Risk Mitigation Plan
The next step is to design a Risk Mitigation Plan for all identified risks. After the Risk Ranking step, the biggest issue was determined to be the ratcheting latch, a heavily loaded piece which prevents the winch from releasing its load.
Let us consider a Risk Mitigation Plan for this component.
Step 1. Determine how much load is on the latch, based on the design specifications for the device.
Step 2. Step 1 did not reduce the risk. But, using the information, we can perform hand calculations for the stress in the latch. This allows us to make a better assessment of the acceptability of the risk, or may allow for the latch design to be improved to accommodate the load.
Step 3. Create a 3D model of the latch, suitable for both manufacturing and analysis.
Step 4. Using the model from Step 3, create a Finite Element Analysis model and use it to predict the stresses in the latch, comparing predicted loads to material properties. This will either confirm acceptability, thus further reducing the likelihood of failure, or it will lead to improving the strength of the latch design.
Step 5. We could enhance the design in a manner that reduces the impact if the latch does fail. For example, incorporation of an inertia catch, intended to stop the drum if it begins to spool too quickly (as it would after a latch failure) lowers the consequence of failure.
The figure (opposite) shows the anticipated effect of the Risk Mitigation Plan on the Risk Ranking.
The fourth and final phase of the process involves tracking the design through to completion of the project.
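Tracking can be as simple as recording which tasks of the plan are complete. The Python sketch below uses the five latch mitigation steps listed above. The `MitigationTask` class, the `open_tasks` and `is_mitigated` helpers, and the completion states shown are hypothetical; they are intended only to illustrate that a risk counts as mitigated once every planned task is done.

```python
from dataclasses import dataclass


@dataclass
class MitigationTask:
    description: str
    done: bool = False


# The five steps of the latch Risk Mitigation Plan, paraphrased.
latch_plan = [
    MitigationTask("Determine the load on the latch"),
    MitigationTask("Hand-calculate the stress in the latch"),
    MitigationTask("Create a 3D model of the latch"),
    MitigationTask("Run a Finite Element Analysis of the latch"),
    MitigationTask("Add an inertia catch to limit the consequence of failure"),
]


def open_tasks(plan):
    """Return the tasks that still need to be completed."""
    return [task for task in plan if not task.done]


def is_mitigated(plan):
    """Treat a risk as mitigated only when every task is complete."""
    return not open_tasks(plan)


# Illustrative progress only: assume the first two steps are finished.
latch_plan[0].done = True
latch_plan[1].done = True

print("Latch risk mitigated:", is_mitigated(latch_plan))
for task in open_tasks(latch_plan):
    print("Open:", task.description)
```

Reviewing the open tasks at each project meeting is one way to make sure the plan is followed through to completion.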
Putting things in Perspective
Although it tended to be the high-tech industries that first began to implement TRM processes, more industries, producing a broader range of products, are now making use of the technique.
The benefits should be equally applicable to project management in any industry, regardless of the level of technology. Early identification, assessment, and mitigation of program risks reduce the chance of failure, and the associated loss of revenue, reputation, and jobs, no matter what the program.
A failure that results in the loss of a rocket ship has the potential for large financial impact on a company, due both to the immediate loss of the craft and its payload, and from the long term loss of business.
That does not even address the possibility that there may have been external damage caused depending on how and where the craft crashed. But consider a company whose business is making concrete blocks, and the decision is made to develop a newer, stronger, lighter, more durable block.
This should be great for the company’s future. However, what if mistakes are made in the design of the new product, resulting in a block that is subject to failure when used in heavy construction? Such a failure can just as easily bankrupt a small, low-tech firm, which makes it catastrophic in its own right. Therefore, there is merit in using the principles of TRM in any business regardless of the technology level.
Study any resource from the last two sections which you have not already completed.
Summary
The concepts of Technical Risk Management can be incorporated into the technical design process in any industry. The advantage is that the technical staff pinpoints the technical risks to program success early in the program. These risks are evaluated in a manner that ranks them in order of priority, so that the technical staff knows what to attack first, and the program manager can best evaluate how to distribute resources to address the appropriate risks. Developing and then implementing a plan for mitigating the risks will significantly improve the likelihood of program success.
References & Bibliography
Black, H. (2001). U.S. Aerospace Risk Analysis Survey. Journal of Cost Analysis & Management, Winter issue.
Hylton, P. (2009). Book Chapter: Technical Risk Management. Handbook of Research and Technology Project Management, Planning and Operation. Hershey, PA, USA: IGI-Global.
Jarrett, E. (2000). Effect of Technical Elements of Business Risk on Decision Making, Managing Technical Risk, US Department of Commerce, NIST GCR 00-787.
Kaye, R. & Crowley, J. (2000). Medical Device Use-Safety: Incorporating Human Factors Engineering into Risk Management. U.S. Dept. of Health and Human Services Guidance for Industry and FDA Premarket and Design Control Reviewers, U.S. Dept. of Health and Human Services, Washington D.C.
Kumamoto, H. & Henley, E. (1996). Probabilistic Risk Assessment and Management for Engineers and Scientists. New York: IEEE Press.
Modarres, M. (2006). Risk Analysis in Engineering. New York: Taylor and Francis.
US Department of Defense, (1984). MIL-STD-882B, System Safety Program Requirements, Washington D.C., AMSC F3329