Binary Pump Diagnosis

This example builds a small binary fault-diagnosis model for an industrial pump skid. Each variable has two states. Latent variables represent component health, and observed variables represent binary alarms from the control system. The graph contains a cycle through the motor, vibration, and temperature variables, so belief propagation is run iteratively and its marginals are approximate.


Variable Nodes

Each variable is binary. Component variables use the states :ok and :bad, and alarm variables use :no and :yes:

using FactorGraph

states = [:ok, :bad]
alarmStates = [:no, :yes]

variables = [
    DiscreteVariable(:power, 2; label = "power", states = states),
    DiscreteVariable(:fuse, 2; label = "fuse", states = states),
    DiscreteVariable(:motor, 2; label = "motor", states = states),
    DiscreteVariable(:vibration, 2; label = "vibration", states = alarmStates),
    DiscreteVariable(:temperature, 2; label = "temperature", states = alarmStates),
    DiscreteVariable(:trip, 2; label = "trip", states = alarmStates)
]

Component Priors

Unary factors encode prior fault rates. The power feed, fuse, and motor are most likely healthy, with prior fault probabilities of 3%, 4%, and 6%:

f1 = DiscreteFactor(:power, [0.97, 0.03]; label = "power_prior", initialize = true)
f2 = DiscreteFactor(:fuse, [0.96, 0.04]; label = "fuse_prior", initialize = true)
f3 = DiscreteFactor(:motor, [0.94, 0.06]; label = "motor_prior", initialize = true)

Fault Propagation Factors

Pairwise factors encode simple engineering relationships. A bad power feed makes a blown fuse more likely, a blown fuse makes motor trouble more likely, and a bad motor makes vibration and temperature alarms more likely:

f4 = DiscreteFactor(:power, :fuse, [0.98 0.02; 0.30 0.70]; label = "power_fuse")
f5 = DiscreteFactor(:fuse, :motor, [0.96 0.04; 0.25 0.75]; label = "fuse_motor")
f6 = DiscreteFactor(:motor, :vibration, [0.90 0.10; 0.15 0.85]; label = "motor_vibration")
f7 = DiscreteFactor(:motor, :temperature, [0.92 0.08; 0.2 0.8]; label = "motor_temperature")
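The pairwise tables can be sanity-checked in plain Julia without the package. The sketch below assumes that rows index the first variable of each factor and columns the second, which matches the description above but is not confirmed by the text. Under that reading every row is a conditional distribution, and pushing the power prior through the power_fuse table gives the fuse distribution implied by that factor alone.

```julia
# Pairwise tables copied from the factors above.
# Assumption: rows index the first variable, columns the second.
power_fuse        = [0.98 0.02; 0.30 0.70]
fuse_motor        = [0.96 0.04; 0.25 0.75]
motor_vibration   = [0.90 0.10; 0.15 0.85]
motor_temperature = [0.92 0.08; 0.2 0.8]

pairwise = [
    "power_fuse" => power_fuse,
    "fuse_motor" => fuse_motor,
    "motor_vibration" => motor_vibration,
    "motor_temperature" => motor_temperature,
]

# Each row should sum to one if it is a conditional distribution.
for (name, table) in pairwise
    println(name, ": row sums = ", vec(sum(table; dims = 2)))
end

# Distribution over fuse states implied by the power prior and power_fuse only.
power_prior = [0.97, 0.03]
fuse_message = power_fuse' * power_prior
println("fuse distribution from power alone: ", fuse_message)
```

The second entry of fuse_message is the probability of a blown fuse when only the power feed is taken into account; the full model multiplies this by the fuse prior and the other incoming messages.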

Protection Logic

The trip indication depends on both vibration and temperature, so it is modeled with a single ternary factor over three binary variables, stored as a 2×2×2 table. The first slice along the third dimension is trip = :no, and the second slice is trip = :yes:

f8 = DiscreteFactor(
    :vibration, :temperature, :trip,
    cat(
        [0.98 0.30; 0.25 0.05],
        [0.02 0.70; 0.75 0.95];
        dims = 3
    );
    label = "trip_logic"
)
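To make the layout of the ternary table concrete, the following sketch builds the same array in plain Julia and indexes it. It assumes the index order follows the argument order of the factor (vibration, temperature, trip), with index 1 for :no and index 2 for :yes; the names no, yes, and trip_table are illustrative only.

```julia
# Same table as in trip_logic, built with Base.cat along the third dimension.
trip_table = cat(
    [0.98 0.30; 0.25 0.05],   # slice 1: trip = :no
    [0.02 0.70; 0.75 0.95];   # slice 2: trip = :yes
    dims = 3
)

println(size(trip_table))     # dimensions of the array

# Assumed index order: trip_table[vibration, temperature, trip]
no, yes = 1, 2
p_trip_both_alarms = trip_table[yes, yes, yes]
p_trip_no_alarms   = trip_table[no, no, yes]
println("trip with both alarms: ", p_trip_both_alarms)
println("trip with no alarms:   ", p_trip_no_alarms)

# Summing over the trip dimension should give one for every alarm combination.
println(sum(trip_table; dims = 3))
```

Because the two slices are complementary, the table reads as a conditional distribution of trip given the two alarms.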

Observed Evidence

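The operator sees vibration and temperature alarms, and the protection relay has tripped. Each observation enters as a soft unary factor, which leaves a small probability that the indication is wrong: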

f9 = DiscreteFactor(:vibration, [0.05, 0.95]; label = "obs_vibration")
f10 = DiscreteFactor(:temperature, [0.10, 0.90]; label = "obs_temperature")
f11 = DiscreteFactor(:trip, [0.02, 0.98]; label = "obs_trip")
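Soft evidence of this kind reweights a distribution rather than clamping it. As a small illustration in plain Julia, the sketch below multiplies the vibration row for a healthy motor by the observation weights and renormalizes. It looks at these two factors only and ignores the trip factor, so it is a local product, not the posterior the full model produces. The helper to_distribution is a hypothetical name.

```julia
# Hypothetical helper: rescale nonnegative weights so they sum to one.
to_distribution(weights) = weights ./ sum(weights)

# Row of motor_vibration for motor = :ok, ordered as [:no, :yes].
vibration_given_motor_ok = [0.90, 0.10]

# Soft observation weights from obs_vibration, ordered as [:no, :yes].
obs_vibration = [0.05, 0.95]

# Elementwise product of the two factors, then normalization.
local_product = to_distribution(vibration_given_motor_ok .* obs_vibration)
println(local_product)
```

Even though a healthy motor rarely raises the alarm, the observation shifts most of the weight to :yes while still leaving some weight on :no.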

Factor Graph Construction

Collect the factors and build the factor graph:

factors = [f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11]
graph = factorGraph(variables, factors)

The graph can be saved as an SVG figure:

saveGraphFigure("../bpd.svg", graph; label = (showEdgeIds = true, tooltipDetail = :full))

Sum-Product Diagnosis

Run damped loopy sum-product belief propagation to estimate posterior fault probabilities:

inference = sumproduct(graph)

gbp!(graph, inference; iterations = 80, tolerance = 1e-8, damping = true)
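Damping is commonly used to stabilize message passing on graphs with cycles. The sketch below shows one generic form, a convex combination of the newly computed message and the previous one. It is an illustration only: the function damp and the weight alpha are hypothetical, and the rule the package applies when damping = true may differ.

```julia
# Hypothetical helper: blend a new message with the previous one.
# alpha = 0 keeps the new message unchanged; larger alpha slows the update.
damp(new_message, old_message; alpha = 0.5) =
    (1 - alpha) .* new_message .+ alpha .* old_message

old_message = [0.50, 0.50]
new_message = [0.90, 0.10]

damped = damp(new_message, old_message; alpha = 0.5)
println(damped)
```

Since both inputs are normalized, the blended message stays normalized, and the smaller step reduces the oscillations that undamped updates can show on a loop.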

Inspect the posterior health estimates:

printMarginal(graph, inference; variable = :power)
printMarginal(graph, inference; variable = :fuse)
printMarginal(graph, inference; variable = :motor)
Marginal for variable node "power" (sum-product form):
  probability = [0.9887902594439142, 0.011209740556085757]
Marginal for variable node "fuse" (sum-product form):
  probability = [0.996413107024802, 0.003586892975198048]
Marginal for variable node "motor" (sum-product form):
  probability = [0.9002693105904549, 0.09973068940954512]
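With only six binary variables there are 2^6 = 64 joint assignments, so the loopy estimates can be compared against exact marginals obtained by enumeration. The following is a minimal sketch in plain Julia that does not use the package. It assumes index 1 means :ok or :no, index 2 means :bad or :yes, and that table dimensions follow the argument order of each factor. The names joint_weight, variable_names, and exact are illustrative.

```julia
# Factor tables copied from the sections above.
power_prior = [0.97, 0.03]
fuse_prior  = [0.96, 0.04]
motor_prior = [0.94, 0.06]
power_fuse        = [0.98 0.02; 0.30 0.70]
fuse_motor        = [0.96 0.04; 0.25 0.75]
motor_vibration   = [0.90 0.10; 0.15 0.85]
motor_temperature = [0.92 0.08; 0.2 0.8]
trip_logic = cat([0.98 0.30; 0.25 0.05], [0.02 0.70; 0.75 0.95]; dims = 3)
obs_vibration   = [0.05, 0.95]
obs_temperature = [0.10, 0.90]
obs_trip        = [0.02, 0.98]

# Unnormalized joint weight: the product of all eleven factors.
function joint_weight(p, f, m, v, t, k)
    return power_prior[p] * fuse_prior[f] * motor_prior[m] *
           power_fuse[p, f] * fuse_motor[f, m] *
           motor_vibration[m, v] * motor_temperature[m, t] *
           trip_logic[v, t, k] *
           obs_vibration[v] * obs_temperature[t] * obs_trip[k]
end

# Accumulate the weight of every assignment into a 6×2 table,
# one row per variable and one column per state.
variable_names = (:power, :fuse, :motor, :vibration, :temperature, :trip)
exact = zeros(6, 2)
for states in Iterators.product(ntuple(_ -> 1:2, 6)...)
    w = joint_weight(states...)
    for (i, s) in enumerate(states)
        exact[i, s] += w
    end
end
exact ./= sum(exact; dims = 2)   # normalize each row

for (i, name) in enumerate(variable_names)
    println(name, ": ", exact[i, :])
end
```

Because the graph contains a cycle, the sum-product results are expected to be close to these values but not identical. Enumeration scales exponentially with the number of variables, so it is only practical as a check on small models like this one.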

MAP Diagnosis

Use min-sum when only the single most likely joint assignment is needed:

mapInference = minsum(graph)
gbp!(graph, mapInference; iterations = 80, tolerance = 0.0, schedule = :residual)

estimates(graph, mapInference)
6-element Vector{Union{Int64, String, Symbol}}:
 :ok
 :ok
 :ok
 :yes
 :no
 :yes

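The MAP estimate is a single most likely explanation, and here it marks all three components as :ok. The sum-product marginals show how much uncertainty remains around each component: the motor still has roughly a 10% probability of being :bad.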
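For a model this small, the MAP assignment can also be found by exhaustive search, which gives a reference point for the min-sum result. The sketch below works in plain Julia with negative log costs, the quantity that min-sum minimizes. It assumes index 1 means :ok or :no and index 2 means :bad or :yes; the names cost, assignments, best, and labels are illustrative.

```julia
# Factor tables copied from the sections above.
power_prior, fuse_prior, motor_prior = [0.97, 0.03], [0.96, 0.04], [0.94, 0.06]
power_fuse        = [0.98 0.02; 0.30 0.70]
fuse_motor        = [0.96 0.04; 0.25 0.75]
motor_vibration   = [0.90 0.10; 0.15 0.85]
motor_temperature = [0.92 0.08; 0.2 0.8]
trip_logic = cat([0.98 0.30; 0.25 0.05], [0.02 0.70; 0.75 0.95]; dims = 3)
obs_vibration, obs_temperature, obs_trip = [0.05, 0.95], [0.10, 0.90], [0.02, 0.98]

# Cost of one assignment: the sum of negative log factor values.
function cost(p, f, m, v, t, k)
    factor_values = (
        power_prior[p], fuse_prior[f], motor_prior[m],
        power_fuse[p, f], fuse_motor[f, m],
        motor_vibration[m, v], motor_temperature[m, t],
        trip_logic[v, t, k],
        obs_vibration[v], obs_temperature[t], obs_trip[k],
    )
    return -sum(log, factor_values)
end

# Enumerate all 64 assignments and keep the one with the lowest cost.
assignments = vec(collect(Iterators.product(ntuple(_ -> 1:2, 6)...)))
best = assignments[argmin([cost(a...) for a in assignments])]

# Translate state indices back to labels, in the variable order
# power, fuse, motor, vibration, temperature, trip.
labels = ((:ok, :bad), (:ok, :bad), (:ok, :bad), (:no, :yes), (:no, :yes), (:no, :yes))
map_states = [labels[i][s] for (i, s) in enumerate(best)]
println(map_states)
```

With these tables the search selects the same assignment as the min-sum estimate shown above, including temperature = :no. That outcome reflects how close the two temperature states are in weight once the motor is taken to be healthy.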