How to Build an Effective FMEA: A Step-by-Step Guide

Reliability HQ, 1 February 2026, 14 min read
Introduction

Failure Mode and Effects Analysis (FMEA) is the heart of the Reliability-Centred Maintenance (RCM) process. It's where you systematically document how equipment can fail and what happens when it does. Well-executed FMEA drives effective maintenance strategy; poor FMEA leads to missed failures and wasted resources.

This guide walks you through creating effective FMEA documentation step by step. We'll use a practical example throughout—a motor-driven centrifugal pump—to illustrate each stage.

What is FMEA?

FMEA is a structured approach to identifying:
  • How an item can fail (failure modes)
  • What causes each failure (mechanisms)
  • What happens when it fails (effects)
  • How severe and likely each failure is
In the RCM context, FMEA feeds directly into maintenance task selection. The quality of your FMEA determines the quality of your maintenance programme.

RCM FMEA vs Design FMEA

There are different types of FMEA. Design FMEA (DFMEA) is used during product development to identify potential design weaknesses. Process FMEA (PFMEA) examines manufacturing processes.

RCM FMEA (sometimes called operational FMEA) focuses on:
  • Equipment already in service
  • The specific operating context
  • Maintenance task selection as the output
This guide focuses on RCM FMEA.

Before You Start: Preparation

Good preparation makes analysis sessions far more productive.

1. Define the System Boundaries

Be clear about what's included and excluded. For our pump example:

Included:
  • Pump casing and internals (impeller, wear rings, shaft, seals)
  • Drive motor
  • Coupling
  • Baseplate and foundation bolts
  • Local instrumentation (pressure gauge, flow indicator)
  • Suction strainer
Excluded:
  • Upstream and downstream piping (separate system)
  • Electrical supply (covered under electrical distribution)
  • Control system (covered under DCS/PLC analysis)
  • Lubrication supply system (separate system)
Document these boundaries so everyone is aligned.

2. Gather Reference Information

Collect before the session:
  • P&IDs and equipment drawings
  • Equipment datasheets and specifications
  • Manufacturer maintenance manuals
  • Operating procedures
  • Maintenance history and failure records
  • Any previous analysis documentation

3. Assemble the Team

The ideal FMEA team includes:
  • Facilitator: Guides the process, documents outputs
  • Operator: Knows how equipment behaves in service
  • Maintainer: Knows how it fails and what's practical
  • Engineer: Understands design intent and failure physics
For a pump, you might have a process operator, mechanical technician, and reliability engineer, plus the facilitator.

4. Schedule Adequate Time

For a system like our pump, plan for 3-4 hours. Complex systems may need multiple sessions. Avoid rushing—incomplete analysis is worse than no analysis.


Step 1: Define Functions

Start by listing everything the equipment must do. Functions answer: "What do users expect this equipment to accomplish?"

Primary Functions

These are the main reasons the equipment exists.

Example — Centrifugal Pump Primary Function:
Function No | Function Description
1 | Transfer cooling water from reservoir to heat exchangers at minimum 200 L/min at 5 bar discharge pressure

Be specific. Include:
  • What substance (cooling water)
  • From where to where (reservoir to heat exchangers)
  • How much (200 L/min minimum)
  • At what conditions (5 bar)

Secondary Functions

These are additional expectations beyond the primary purpose.

Example — Pump Secondary Functions:
Function No | Function Description
2 | Contain the cooling water (external leaks not to exceed 10 mL/hour)
3 | Allow flow to be isolated when required (suction and discharge isolation valves)
4 | Indicate discharge pressure locally (gauge reading within ±5% of actual)
5 | Indicate running status to control room (run signal within 2 seconds of start)
6 | Operate without excessive vibration (< 4.5 mm/s RMS at bearing housings)
7 | Operate within acceptable temperature limits (bearing temp < 80°C)
8 | Start reliably when called upon

Tips for Functions

  • Use the format "To [verb] [noun] [performance standard]"
  • Quantify wherever possible
  • Consider safety, environmental, control, and efficiency aspects
  • Ask operators: "What would make you say this isn't working properly?"
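
One way to keep functions quantified is to record each one as structured data, so that a missing performance standard stands out. The sketch below is a minimal illustration using a few of the pump functions above; the `Function` class and its field names are assumptions made for this example, not part of SAE JA1011 or any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Function:
    """One row of the function list (hypothetical structure)."""
    number: int
    description: str
    standard: Optional[str] = None  # quantified performance standard, if any
    primary: bool = False


functions = [
    Function(1, "Transfer cooling water from reservoir to heat exchangers",
             "minimum 200 L/min at 5 bar discharge pressure", primary=True),
    Function(2, "Contain the cooling water",
             "external leaks not to exceed 10 mL/hour"),
    Function(6, "Operate without excessive vibration",
             "< 4.5 mm/s RMS at bearing housings"),
    Function(8, "Start reliably when called upon"),  # no standard in the example
]

# Flag functions that still need a quantified performance standard.
unquantified = [f.number for f in functions if f.standard is None]
print(unquantified)
```

In this sample only function 8 is flagged, which is a prompt for the team to agree what "reliably" means in numbers.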

Step 2: Identify Functional Failures

For each function, determine how it can fail. Functional failures are states where the function is no longer fulfilled to the required standard.

Example — Functional Failures

Function | Functional Failure Code | Functional Failure Description
1 (Transfer water at 200 L/min at 5 bar) | 1A | Unable to transfer any water
1 (Transfer water at 200 L/min at 5 bar) | 1B | Unable to transfer water at 200 L/min (reduced flow)
1 (Transfer water at 200 L/min at 5 bar) | 1C | Unable to maintain 5 bar discharge pressure
2 (Contain water, <10 mL/hour leak) | 2A | External leak exceeding 10 mL/hour
5 (Indicate running status) | 5A | Indicates running when not running
5 (Indicate running status) | 5B | Indicates not running when actually running
8 (Start reliably) | 8A | Fails to start when commanded

Notice that functions can have multiple functional failures:
  • Complete loss vs partial loss
  • Fails high vs fails low (for instrumentation)

Step 3: Identify Failure Modes

For each functional failure, list the specific events that could cause it. These are failure modes—the "what" that goes wrong.

Example — Failure Modes for "Unable to transfer any water"

Functional Failure | FM | Failure Mode
1A (No water transfer) | 1 | Impeller completely worn/eroded
1A | 2 | Impeller loose on shaft/detached
1A | 3 | Shaft sheared
1A | 4 | Motor winding failure (burn out)
1A | 5 | Motor bearing seized
1A | 6 | Coupling failure (sheared/disconnected)
1A | 7 | Suction strainer completely blocked
1A | 8 | Pump casing cracked/split
1A | 9 | Loss of prime (air-bound)

Finding Failure Modes

Good sources include:
  • Team experience: "What failures have we seen?"
  • Maintenance history: Work order records, failure reports
  • Generic failure mode libraries: Standard lists for equipment types
  • Manufacturer documentation: Known failure modes
  • Industry publications: Papers, case studies, standards

Level of Detail

The right level is where:
  • Different failure modes need different maintenance responses
  • You can describe what happens when it occurs
  • You can assess likelihood and severity
  • Too vague: "Pump fails". Not useful for task selection.
  • Just right: "Mechanical seal fails (leaks excessively)". Actionable.
  • Too detailed: "Mechanical seal secondary O-ring fails due to chemical attack on Viton material". Probably unnecessary unless you're seeing this specific issue.

Step 4: Describe Failure Effects

For each failure mode, describe what happens when it occurs. This is crucial for consequence assessment and maintenance task justification.

What to Include

  1. Evidence of failure: How would someone know it happened?
  2. Immediate effects: What happens to the process/production?
  3. Safety/environmental impact: Any hazards created?
  4. Secondary damage: Does it damage other equipment?
  5. Corrective action required: What's needed to restore function?
  6. Downtime/duration: How long to repair?

Example — Failure Effect

Failure Mode: Motor bearing seized

Failure Effect:

"Motor makes increasing noise (high-pitched grinding) for approximately 24-48 hours before seizing. When bearing seizes, motor trips on overload. Control room receives motor trip alarm. Standby pump auto-starts. No safety or environmental impact. If detected early (on noise), bearing can be replaced in situ—4 hours, two fitters, £50 parts. If motor runs while seized, motor rewind required—remove motor, send to shop, 2 weeks, £1,200. No secondary equipment damage."

Tips for Failure Effects

  • Write in complete sentences
  • Be specific about times, people, costs
  • Distinguish between early detection and run-to-failure scenarios
  • Include evidence that would be apparent to operators/maintainers

Step 5: Assess Consequences

For each failure mode, categorise the consequences. This directly influences maintenance task selection.

Consequence Categories (SAE JA1011)

Category | Definition | Implications
Hidden | Failure not evident under normal circumstances | Failure-finding task mandatory
Safety/Environmental | Could cause injury or environmental breach | Proactive task must reduce risk to acceptable level; redesign if not possible
Operational | Affects output, quality, or customer service | Proactive task if cost-effective
Non-operational | Only involves repair cost | Run-to-failure often acceptable

Assessment Questions

Work through these in order:
  1. Is the failure evident to operators under normal conditions?
     - If NO → Hidden failure consequence
     - If YES → Continue to question 2
  2. Does the failure cause or contribute to a safety or environmental hazard?
     - If YES → Safety/Environmental consequence
     - If NO → Continue to question 3
  3. Does the failure affect operations (output, quality, service)?
     - If YES → Operational consequence
     - If NO → Non-operational consequence
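
The three questions form a simple decision chain, which can be sketched as a function. This is an illustration of the question order only; the function name and the boolean inputs are assumptions, and the answers themselves remain team judgements.

```python
def classify_consequence(evident: bool,
                         safety_or_environmental: bool,
                         operational: bool) -> str:
    """Apply the three assessment questions in order and return the category."""
    if not evident:
        return "Hidden"
    if safety_or_environmental:
        return "Safety/Environmental"
    if operational:
        return "Operational"
    return "Non-operational"


# Motor bearing seized: evident (noise, alarm), no hazard, standby pump covers output.
print(classify_consequence(evident=True, safety_or_environmental=False, operational=False))

# Relief valve fails to open: not evident until demanded.
print(classify_consequence(evident=False, safety_or_environmental=True, operational=False))
```

Because the sketch follows the questions literally, a hidden failure is classified at question 1. A team may still want to record the safety dimension alongside it, as the relief valve row in the example that follows does with "Hidden + Safety".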

Example — Consequence Assessment

Failure Mode | Evident? | Safety/Env? | Operational? | Consequence
Motor bearing seized | Yes (noise, alarm) | No | Minor (standby available) | Non-operational
Mechanical seal failure | Yes (visible leak) | No | Minor (continues operating) | Non-operational
Relief valve fails to open | No (only evident on demand) | Yes (overpressure possible) | | Hidden + Safety

Step 6: Select Maintenance Tasks

Based on failure effects and consequences, select appropriate maintenance tasks.

Task Selection Logic

For each failure mode, ask (in order):

For hidden failures:
  1. Is there an on-condition task that will detect potential failure?
  2. Is there a scheduled restoration or discard task?
  3. Is there a failure-finding task?
  4. Redesign may be necessary
For evident failures:
  1. Is there an on-condition task that's worth doing?
  2. Is there a scheduled restoration or discard task that's worth doing?
  3. For safety/environmental: redesign may be necessary
  4. For operational/non-operational: run-to-failure may be acceptable
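
The ordering of these questions can be written down as a short function. The sketch below is one possible reading of the two lists above, not a substitute for an RCM decision diagram; `select_task` and its parameters are hypothetical names, and each boolean stands for a judgement the team has already made (for example, that an on-condition task is technically feasible and worth doing).

```python
def select_task(consequence: str,
                on_condition: bool = False,
                restoration_or_discard: bool = False,
                failure_finding: bool = False) -> str:
    """Walk the task selection questions in order and return the outcome."""
    if on_condition:
        return "On-condition task"
    if restoration_or_discard:
        return "Scheduled restoration or discard task"
    if consequence == "Hidden":
        # Failure-finding is only considered for hidden failures.
        return "Failure-finding task" if failure_finding else "Redesign may be necessary"
    if consequence == "Safety/Environmental":
        return "Redesign may be necessary"
    return "Run-to-failure may be acceptable"


print(select_task("Non-operational", on_condition=True))
print(select_task("Hidden", failure_finding=True))
print(select_task("Operational"))
```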

Task Types Summary

Task Type | What It Does | When to Use
On-condition | Detects potential failure before functional failure | When P-F interval exists and task is practical
Scheduled restoration | Restores original capability | When wear-out age is identifiable and most survive to it
Scheduled discard | Replaces item | When wear-out age is identifiable and failure unacceptable
Failure-finding | Checks if item has failed | For hidden failures only
Combination | Multiple tasks together | When single task doesn't address adequately
Run-to-failure | No scheduled maintenance | When consequences are acceptable

Example — Task Selection

Failure Mode: Motor bearing wear leading to seizure

Task selection reasoning:
  • Is there a potential failure condition? Yes: vibration increase, noise, temperature rise
  • Is there a P-F interval? Yes: typically weeks to months
  • Is monitoring practical? Yes: monthly vibration readings take 5 minutes
  • Is it worth doing? Yes: early detection prevents expensive motor rewind

Selected task: Monthly vibration monitoring (portable). Set alert at 4.5 mm/s, action at 7 mm/s. Repair time: 4 hours vs 2 weeks if undetected.
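
The alert and action levels in the selected task translate into a simple check on each monthly reading. The sketch below assumes a reading exactly at a threshold counts as having reached it, and the sample readings are invented for illustration; neither detail comes from the example.

```python
ALERT_MM_S = 4.5   # alert level from the selected task
ACTION_MM_S = 7.0  # action level from the selected task


def vibration_status(reading_mm_s: float) -> str:
    """Classify a vibration reading (mm/s RMS) against the two thresholds."""
    if reading_mm_s >= ACTION_MM_S:
        return "action"
    if reading_mm_s >= ALERT_MM_S:
        return "alert"
    return "normal"


# Hypothetical monthly readings at a motor bearing housing.
for reading in (2.8, 3.1, 4.9, 7.4):
    print(reading, vibration_status(reading))
```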

Step 7: Document and Review

The FMEA Worksheet

Standard RCM FMEA uses two worksheets:

  • Information Worksheet: Captures functions, functional failures, failure modes, and failure effects
  • Decision Worksheet: Records consequence assessment, task selection logic, and recommended tasks

Example FMEA Information Worksheet Extract

Function | Functional Failure | Failure Mode | Failure Effect
1. Transfer cooling water at 200 L/min at 5 bar | 1A. Unable to transfer any water | 1A.1 Motor bearing seized | Motor makes grinding noise 24-48 hrs before seizing. Motor trips on overload. Standby pump starts. 4 hr repair if detected early (£50), 2 weeks + £1,200 if motor damaged.
1. Transfer cooling water at 200 L/min at 5 bar | 1A. Unable to transfer any water | 1A.2 Coupling failure | Sudden loss of flow. Motor continues running (no load). Control room sees flow alarm and motor amps drop. 2 hr repair, £150 coupling.
2. Contain water (<10 mL/hr leak) | 2A. External leak >10 mL/hr | 2A.1 Mechanical seal worn | Progressive increase in seal leak over weeks. Visible dripping. Pump continues operating. Seal replacement requires pump isolation, 3 hrs, £280 seal kit.
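
If the information worksheet is held as structured rows, it can be written out as CSV for review or onward loading. The sketch below uses Python's standard `csv` module with shortened versions of the rows above; the column names are assumptions for illustration and do not reflect any particular CMMS import format.

```python
import csv
import io

FIELDS = ["function", "functional_failure", "failure_mode", "failure_effect"]

rows = [
    {"function": "1. Transfer cooling water at 200 L/min at 5 bar",
     "functional_failure": "1A. Unable to transfer any water",
     "failure_mode": "1A.1 Motor bearing seized",
     "failure_effect": "Motor trips on overload. Standby pump starts."},
    {"function": "1. Transfer cooling water at 200 L/min at 5 bar",
     "functional_failure": "1A. Unable to transfer any water",
     "failure_mode": "1A.2 Coupling failure",
     "failure_effect": "Sudden loss of flow. 2 hr repair, £150 coupling."},
    {"function": "2. Contain water (<10 mL/hr leak)",
     "functional_failure": "2A. External leak >10 mL/hr",
     "failure_mode": "2A.1 Mechanical seal worn",
     "failure_effect": "Visible dripping. Pump continues operating."},
]

# Write to an in-memory buffer; swap in open("worksheet.csv", "w", newline="")
# to write a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The `csv` module quotes any field containing a comma, so free-text failure effects survive the round trip intact.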

Review and Approval

Before implementing, FMEA should be reviewed by:
  • Someone not on the analysis team (fresh eyes)
  • Operations management (will they accept the tasks?)
  • Maintenance management (can they resource the tasks?)
  • Engineering authority (technically sound?)
Document the review and any changes made.

Common Pitfalls to Avoid

1. Incomplete Functions

Missing secondary functions means missing failure modes. Always consider containment, safety, control, indication, and efficiency.

2. Copying Other Analyses

Generic FMEAs for "pumps" ignore your operating context. Adapt—don't adopt.

3. Risk Ranking as the Output

Some FMEA approaches focus on calculating Risk Priority Numbers (RPN). RCM FMEA focuses on task selection. High-risk items need effective tasks; the ranking itself isn't the point.
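
For context, RPN is conventionally the product of severity, occurrence, and detection scores, each usually rated on a 1 to 10 scale. The short sketch below, with invented scores, shows why the number alone is a weak guide: two very different failure modes can land on the same value.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: product of the three scores."""
    return severity * occurrence * detection


# Hypothetical scores for two unrelated failure modes.
rare_but_severe = rpn(severity=10, occurrence=1, detection=3)
frequent_but_minor = rpn(severity=3, occurrence=5, detection=2)

print(rare_but_severe, frequent_but_minor)
```

Both come out at 30, yet the first would probably need a safety-driven task or redesign, while the second might reasonably be run to failure. The consequence category and task selection logic capture that difference; the ranking does not.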

4. Stopping at FMEA

FMEA without implementation is just documentation. Every failure mode should result in either a task or a conscious decision to run to failure.

5. One-Time Exercise

Equipment, processes, and knowledge evolve. Review FMEAs after failures, modifications, or at regular intervals.

Key Takeaways

  • Prepare thoroughly: Gather documentation, assemble the right team, define boundaries
  • Start with functions: Clear, quantified functions drive good analysis
  • Be systematic: Work through functional failures, then failure modes for each
  • Write useful failure effects: Include evidence, impact, and repair requirements
  • Assess consequences properly: Hidden, safety, operational, non-operational
  • Select appropriate tasks: Based on P-F intervals, wear-out ages, and consequence severity
  • Document everything: Future reviewers (and future you) will thank you
  • Implement what you analyse: FMEA isn't complete until tasks are in your CMMS

Tools and Templates

A well-designed template guides analysis and ensures consistency. Key features to look for:
  • Clear prompts for each field
  • Space for adequate detail in failure effects
  • Decision logic diagram integration
  • Easy transfer to CMMS
Our RCM FMEA Template Pack includes:
  • Information Worksheet template
  • Decision Worksheet template
  • RCM Decision Diagram
  • Completed example analysis
  • Quick-start guide
Need a starting point for failure modes? Our Failure Mode Library contains 500+ documented failure modes for common industrial equipment.

FMEA is a skill that improves with practice. Your first analysis will be slower and rougher than your tenth. The key is to start, learn, and continuously improve. Every completed analysis makes your maintenance programme stronger and builds your team's capability.

Good luck—and thorough analysis!


Reliability HQ

Sharing practical reliability engineering knowledge to help maintenance professionals implement RCM effectively. Based on SAE JA1011 standards and real-world experience.
