-
Notifications
You must be signed in to change notification settings - Fork 11
Fault Modeling
Faults are a core concept of S#. While verification techniques for functional correctness only consider the intended behavior of a system, safety analysis techniques, by contrast, also consider the degraded behavior in the presence of one or more faults. Consequently, models used for safety analysis have to contain both kinds of behaviors; in particular, the model has to contain the system's relevant faults as well as their effects on the system. Usually, creating a model of the intended behavior already poses a challenge, while the degraded behavior in case of faults is even harder to predict and model correctly.
S# helps by making it possible to model the local effects that a fault has on the affected component(s) only; S# is later able to determine the global impact of the faults on the system's overall behavior. Consequently, it is only necessary to consider local fault effects, which is easier to get right than predicting the global consequences of a fault, helping to increase the adequacy of the models. It is always possible and usually desirable to specify the intended behavior first and add the degraded behavior later in a separate step. In that sense, a S# model including the faults is an extension of the S# model without any faults. In S#, faults are modeled by adding certain fields to components and classes modeling fault effects; the actual fault injection (and also fault removal for analysis efficiency reasons) are done automatically in the background.
The fault terminology used by S# matches the terms defined by Avizienis et al. The following gives a brief overview of the relevant terms for working with S#.
For safety analysis purposes, situations in which faults cause some behavior that would not have occurred otherwise are the most interesting ones; in fact, faults are only of importance in such situations. These situations represent fault activations; that is, a fault is activated when it has some influence on the system by somehow affecting its behavior. Faults are dormant until they are activated and become active; they become dormant again when they are deactivated.
In contrast to other [safety analysis approaches and tools](Related Work and Tools), S# does not focus on fault activation states that indicate whether a fault is dormant or active. During modeling, property specification, model checking, or simulations, S# only considers fault activations. Fault activations are implicit in a S# model; S# therefore determines automatically whether a fault can be activated, i.e., whether the fault has any effect on the modeled system in a certain situation. The reasons for using this somewhat non-traditional approach are twofold:
-
We feel that fault activations are the actual concept that is of interest for safety analyses; fault activation states are only of secondary interest to model persistence constraints (see below).
-
Fault activations allow for significantly more efficient model checking; we've seen model checking time reductions by almost four orders of magnitude compared to the traditional state-based fault modeling approach that S# previously used.
In S#, fault propagations are not explicitly modeled; they are deduced automatically by the model checker. Only faults and their local effects have to be explicitly modeled in S# in order to be analyzed. Intra- and inter-component error propagation is implicit in S# to simplify modeling; the propagation chain is what S# is designed to uncover automatically. Only the dependencies between components are modeled through implicit or explicit port bindings.
A S# Fault
can represent any kind of fault, in particular (and most commonly) external
hardware faults caused by nature. Additionally, one can also model development faults in
that way, representing a mistake that was feared to be made during development, such as
software bugs. In any case, the reason why such a fault exists or might be activated is
irrelevant for S# and the safety analyses. For instance, a development fault might be
introduced by a malicious and intentional backdoor or by an overtired developer; a
hardware fault might be because of radiation, a flood wave, etc.
S# faults are dormant until they are activated, typically nondeterministically, to simulate a natural cause that results in the fault or to simulate that a development fault is triggered somehow, i.e., a line of code that contains a bug is executed and the bug is exposed. Fault activation results in fault effects that change the internal state or behavior of the affected components, thereby causing an error.
Errors are explicitly modeled, instead they represent a deviation of a component's internal state from what it should have been; therefore, error exist only at runtime. Such errors can propagate through the component, causing other errors. In S#, fault effects model the effect of a fault on the behavior or state of a component, indirectly describing a set of errors. A fault effect is caused by the activation of a fault, and induces or causes one or more errors to occur.
Eventually, an error can result in a failure, where the error manifests itself in a way
that is externally observable. The way a component fails is called a failure mode; for
instance, is full and is empty are two failure modes of the [pressure
sensor](Pressure Tank Case Study) that might result from various different errors and
faults. Such a failure can either be a system hazard, or it might cause a fault in
another component. Through implicit or explicit port bindings, a failure of
a component implicitly causes an input fault in another component, where the sequence of
error propagation continues.
The chain of fault activations causing errors that are propagated to failures can not
always be found like this in a model. For instance, consider the [pressure
sensor](Pressure Tank Case Study): To model the failure mode is full, we add a fault
with an effect that immediately results in a failure (shown below), without causing any
other errors first. In that case, the error caused by the fault immediately results in a
failure, and the fault and its effect are often named in accordance with the failure mode
the fault and its effect are intended to model.
A fault's persistence constrains the transitions between its active and dormant states.
Transient faults, for instance, are activated and deactivated completely
nondeterministically, whereas permanent faults, while also activated
nondeterministically, never become dormant again. S#'s standard library includes the types
TransientFault
and PermanentFault
that model these two kinds of fault persistence,
which are the most common ones.
In general, transient faults subsume all other kinds of persistence, but exclusively using transient faults might make a system seem less safe than it actually is. For instance, consider a hardware component that can completely break, i.e., it might no longer function at all. Obviously, such a component cannot magically repair itself, so it is a permanent fault. If the fault is modeled to be transient, the model of course includes all situations in which the fault behaves like a permanent one, but also many more. If a hazard occurs in such an additional situation, the fault appears to be critical, even though this situation is physically impossible in the real world. Persistence constraints are therefore similar to general fairness constraints, where modeled behavior is excluded from consideration by the model checker for reasons of adequacy.
Being designed for safety analyses, S# of course has first class support for modeling faults
and fault effects. It guarantees certain formal properties such as conservative
extension in order to ensure adequacy of the safety analysis results that
it computes. The following examples model the [pressure sensor's](Pressure Tank Case
Study) two faults is full and is empty.
To add a fault to a model, a field of type Fault
has to be added to some component. If
the fault affects a single component only, it is usually advisable to add the fault to the
affected component itself, as is done in the following for the PressureSensor
component:
public class PressureSensor : Component
{
public readonly Fault SuppressIsEmpty = new TransientFault();
public readonly Fault SuppressIsFull = new TransientFault();
// ...
}
We usually make the fault fields both public
and readonly
: That way, faults can be
referred to in formulas, but they cannot be reinitialized from outside the component. The
fields are initialized with new instances of type TransientFault
; consequently, the
two faults are activated and deactivated completely nondeterministically. Faults can be
initialized with any object of a type derived from Fault
; TransientFault
and
PermanentFault
are standard classes provided by S#. Custom Fault
-derived types are
also possible, see below.
A fault can affect multiple components in order to model common cause failures, for instance. In that case, it is typically advisable to add the fault to a parent component of all affected components and pass the reference to the fault to the affect components' constructors:
public class ChildComponent : Component
{
public ChildComponent(Fault fault)
{
// ...
}
}
public class ParentComponent : Component
{
public readonly Fault F = new TransientFault();
private ChildComponent _child1;
private ChildComponent _child2;
public ParentComponent()
{
_child1 = new ChildComponent(F);
_child2 = new ChildComponent(F);
}
}
Activated faults affect the internal state of a component or the behavior of one or more
of its ports. Fault effects are commonly modeled by adding a nested classes
derived from the affected component. Fault effects must always be marked
with the [FaultEffect]
attribute. The local effects are modeled by overriding provided
ports, required ports, or the affected component's Update
method:
public class PressureSensor : Component
{
// Faults
public readonly Fault SuppressIsEmpty = new TransientFault();
public readonly Fault SuppressIsFull = new TransientFault();
// Provided ports; must be virtual in order to be affected by faults
public virtual bool IsFull => PhysicalPressure >= Specification.SensorPressure;
public virtual bool IsEmpty => PhysicalPressure <= 0;
// Required port
public extern int PhysicalPressure { get; }
// Fault effect 1
[FaultEffect(Fault = nameof(SuppressIsFull))]
public class SuppressIsFullEffect : PressureSensor
{
public override bool IsFull => false;
}
// Fault effect 2
[FaultEffect(Fault = nameof(SuppressIsEmpty))]
public class SuppressIsEmptyEffect : PressureSensor
{
public override bool IsEmpty => false;
}
}
The nested classes that model the effects of the is full and is empty faults are
called SuppressIsFullEffect
and SuppressIsEmptyEffect
; both derive from
PressureSensor
. The IsFull
and IsEmpty
provided ports are overridden by the
effects. In order for that to work, the affected ports must be declared virtual
and
the overriding ports must be declared override
. Fault effects therefore make use of
C#'s regular virtual dispatch mechanism.
When PressureSensor.IsFull
is invoked by an implicit or explicit port binding, the
original behavior is only executed as long as SuppressIsFull
is not activated. That
is, when S# activates the fault SuppressIsFull
during model checking,
SuppressIsFullEffect.IsFull
is executed instead of PressureSensor.IsFull
, always
returning false
regardless of the actual physical pressure level that is usually
checked by the port. Consequently, the fault's effect causes a failure of the sensor in
that it no longer reports that the pressure tank is full.
While in the example above, only provided ports are affected by faults, it is indeed also
possible to override required ports or the component's Update
method. Additionally,
the above example only shows how fault activations immediately result in component
failures. It is of course also possible, especially when overriding Update
, that fault
activations result in errors that are not immediate failures: The overridden component
members could change some values of the component's fields, resulting in an error. If and
when such an error propagates to a failure is determined by S# automatically. Here's an
example:
public class C : Component
{
// The component's internal state
private int _x;
// Required port; must be virtual in order to be affected by faults
public virtual extern int M();
// Update method
public override void Update()
{
_x = 7;
}
[FaultEffect(Fault = /* ... */)]
public class E : C
{
public override int M() => base.M() + 1;
public override void Update()
{
_x = 0;
}
}
}
The fault effect C.E
affects both C
's Update
method and its required port
M
. Whenever C.Update
is invoked and C.E
's corresponding fault (not shown) is
activated, E.Update
is executed instead, setting _x
to 0
instead of 7
,
causing an error that is not immediately externally observable, hence is not a failure.
When C.M
is called and the fault is activated, E.M
is executed instead. It in turn
invokes C.M
using C#'s base method invocation syntax base.M
; the base call
computes the actual value through the required port's binding. The computed value is then
incremented by 1
, resulting in either a failure or an error depending on how M
's
returned value is actually used (not shown in the example).
Fault effects for a component C
can also be modeled by adding non-nested classes
derived from C
to the model. In that case, however, the fault effect has no access to
C
's non-public members, as per the usual C# accessibility rules.
There is a 1:n correspondence between faults and fault effects, that is, a Fault
instance can have multiple effect instances, but each effect instance belongs to exactly
one Fault
instance. There are two ways to associate faults and fault effects:
Statically at compile-time and dynamically at runtime. Statically, a fault effect is
associated with a fault by name matching; it is the preferred approach for faults that
affect a single component only. The association is specified by referencing a fault in the [FaultEffect]
attribute. For example:
public class PressureSensor : Component
{
public readonly Fault SuppressIsFull = new TransientFault();
[FaultEffect(Fault = nameof(SuppressIsFull))]
public class SuppressIsFullEffect : PressureSensor
{
// ...
}
// ...
}
If the association between a fault and its effects cannot be statically determined at
compile-time, there is the Fault.AddEffect<T>
method that can be called dynamically
when the model is initialized. This is especially useful when modeling common cause
faults. Revisiting the example from above where multiple component instances are affected
by the same fault instance, the effects can be hooked up as follows:
public class ChildComponent : Component
{
public ChildComponent(Fault fault)
{
var effect = fault.AddEffect<E>(this);
effect.X = 17;
}
[FaultEffect]
public class E : ChildComponent
{
public int X;
// ...
}
}
public class ParentComponent : Component
{
// see above
}
The ChildComponent
's constructor adds the fault effect ChildComponent.E
to the
fault
provided as a constructor argument using the Fault.AddEffect<T>
method. The
method returns an instance of ChildComponent.E
representing the added fault effect,
which can be used to initialize the effect as shown above.
Explicitly adding a fault effect to a fault dynamically always overrides static
associations. So in the example above, if the FaultEffect
attribute specified a fault,
it would be overwritten by the invocation of Fault.AddEffect<ChildComponent.E>
.
There can be multiple fault effects affecting the same component port. In that case,
however, the S# compiler by default issues a warning because the fault effect that
actually has an effect is chosen nondeterministically, which is often unintended. Assuming
the PressureSensor
to have an additional fault that causes it to report the tank to be
full even though it is not (the dual of is full), the S# model would look like this:
public class PressureSensor : Component
{
// ...
public virtual bool IsFull => PhysicalPressure >= Specification.SensorPressure;
[FaultEffect(/* ... */)]
public class SuppressIsFullEffect : PressureSensor
{
public override bool IsFull => false;
}
[FaultEffect(/* ... */)]
public class ForceIsFullEffect : PressureSensor
{
public override bool IsFull => true;
}
}
The warning generated by the S# compiler reads as follows:
Warning SS4005: 'PressureTank.PressureSensor.IsFull.get' is affected by multiple
fault effects without an explicit priority specification, resulting in possibly
unintended additional nondeterminism. Consider making priorities explicit by
adding the 'SafetySharp.Modeling.PriorityAttribute' to:
'PressureTank.PressureSensor.ForceIsFullEffect',
'PressureTank.PressureSensor.SuppressIsFullEffect'.
If the additional nondeterminism is indeed intentional, mark both fault effects with the
Priority
attribute, using the same priority level; for instance, [Priority(0)]
.
The warning is silenced while still allowing S# to choose between the two effects when
both faults are activated at the same time.
Otherwise, assign different priorities to the
effects: Marking PressureSensor.SuppressIsFullEffect
with [Priority(0)]
and
PressureSensor.ForceIsFullEffect
with [Priority(1)]
means that in situations in
which both faults are activated, PressureSensor.ForceIsFullEffect
takes precedence
while PressureSensor.SuppressIsFullEffect
is ignored. That is, fault effects with
higher priorities take precedence over effects with lower priorities.
Transient and permanent faults are the two most common kinds of fault persistence. S#
therefore provides the TransientFault
type to be used for faults that can be activated
and deactivated completely nondeterministically. For faults that, once activated
nondeterministically, cannot be deactivated again, PermanentFault
is provided. For
other kinds of fault persistence, a new class deriving from Fault
must be added to the
model.
The following MaintenanceFault
class represents a fault that is similar to a
PermanentFault
, except that the fault can be repaired at some point, allowing it to
be deactivated again. Such a fault can be implemented as follows:
public class MaintenanceFault : Fault
{
private bool _isActive;
public MaintenanceFault()
: base(requiresActivationNotification: true)
{
}
protected override Activation CheckActivation()
{
// Warning: No side effects are allowed in this method!
return _isActive ? Activation.Forced : Activation.Nondeterministic;
}
public override void OnActivated()
{
_isActive = true;
}
public void Repair()
{
_isActive = false;
}
}
The class is derived from Fault
and contains a field _isActive
indicating whether
the fault is currently active. In the constructor, we pass true
to the base class
constructor's requiresActivationNotification
parameter to indicate that S# must call
the overridden OnActivated
method.
The method CheckActivation
is invoked by S# to determine whether the fault can, must,
or cannot be activated. As long as _isActive
is false, that is, the fault is dormant,
we return Activation.Nondeterministic
, causing S# to consider both a potential
activation as well as no activation. Once the fault is active, however, _isActive
is
true
and the method returns Activation.Forced
, meaning that if an activation is
possible, S# must activate the fault; the case that the fault is not activated is no
longer considered. Once S# has determined that the fault can indeed be activated, it
invokes the OnActivated
method. This is where _isActive
is set to true
.
Important: Do not set _isActive
to true
in CheckActivation
! In fact,
that method must be completely side effect-free, otherwise the fault will always be
considered to be activatable.
The Repair
method can be called to make the fault dormant again by setting
_isActive
to false
. With that, MaintenanceFault
could be used as follows:
class C : Component
{
public readonly MaintenanceFault F = new MaintenanceFault();
public override void Update()
{
if (/* some condition indicating that F is repaired */)
F.Repair();
}
[FaultEffect(Fault = nameof(F))]
public class E : C
{
// ...
}
}