Software Fault Tolerance Techniques
There are two main techniques for obtaining fault-tolerant software:
1.) Recovery Block
2.) N-Version Software
Both the above techniques are based on design diversity.
Design diversity refers to a system's components being developed using different designs but providing the same service. The basic premise of design diversity is that components built in different ways will fail in different ways.
As a result, if one of the redundant versions fails, at least one of the others is expected to produce an acceptable output.
Let’s understand these techniques one by one.
Recovery Block Technique
The Recovery Block technique is a simple technique developed by Randell in the early 1970s.
In this technique, alternate software versions are organized like the dynamic redundancy (standby) approach in hardware.
The recovery block technique relies on an adjudicator (often called an acceptance test), which checks the results produced by several implementations of the same algorithm. A system that uses recovery blocks is divided into fault-recoverable blocks.
These fault-tolerant blocks are used to build the overall system. Each block contains at least a primary alternate, a secondary alternate, and exception-handling code, as well as an adjudicator. The adjudicator is the component that decides whether the result of each alternate is acceptable.
The adjudicator should be kept simple to ensure execution speed and correctness. On entry to a block, the primary alternate is executed first and the adjudicator checks its result. (A block may have N alternates to try.) If the adjudicator judges that the primary alternate failed, the system state is rolled back and the secondary alternate is tried.
If the adjudicator rejects the results of all the alternates, the exception handler is called, indicating that the software could not accomplish the required operation.
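The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the names recovery_block, RecoveryBlockError, primary_sort, secondary_sort, and accept are hypothetical, the system state is assumed to be a plain dictionary, and the recovery point is assumed to be a deep copy taken on entry to the block.

```python
import copy


class RecoveryBlockError(Exception):
    """Raised when no alternate passes the adjudicator (acceptance test)."""


def recovery_block(state, alternates, acceptance_test):
    """Run the alternates in order and return the first accepted result."""
    checkpoint = copy.deepcopy(state)  # recovery point taken on entry
    for alternate in alternates:
        working = copy.deepcopy(checkpoint)  # roll back to the recovery point
        try:
            result = alternate(working)
        except Exception:
            continue  # a crash is treated as a failed alternate
        if acceptance_test(working, result):
            state.clear()
            state.update(working)  # commit only the accepted alternate's changes
            return result
    # Exceptional case: every alternate was rejected.
    raise RecoveryBlockError("no alternate passed the acceptance test")


def primary_sort(state):
    # Deliberately faulty primary (for illustration): it drops duplicates.
    state["calls"] = state.get("calls", 0) + 1
    return sorted(set(state["data"]))


def secondary_sort(state):
    # Independent secondary alternate.
    state["calls"] = state.get("calls", 0) + 1
    return sorted(state["data"])


def accept(state, result):
    # Simple adjudicator: the output is ordered and no element was lost.
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and len(result) == len(state["data"])


state = {"data": [3, 1, 3, 2]}
print(recovery_block(state, [primary_sort, secondary_sort], accept))
print(state["calls"])
```

In this sketch the primary's result is rejected because an element is missing, so its change to the state is discarded and the secondary runs from the recovery point. That is why the call counter stored in the state ends up at 1 rather than 2.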
The recovery block technique places more pressure on the specification, which must be precise enough to allow several functionally equivalent alternates to be produced. The same issue arises with the N-version software technique.
N-Version Software Technique
N-version software seeks to replicate the N-way redundant hardware idea of classical hardware fault tolerance. Every module in an N-version software system is implemented in up to N different ways. Each version performs the same task, but preferably in a different way. Each version then submits its result to a voter or decider, which determines the correct result and returns it as the module's outcome.
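As a rough sketch of the voting step, the Python below runs three versions of an absolute-value routine and returns the majority result. Everything here is an assumption made for illustration: the names n_version_execute, NoMajorityError, and the three abs_* versions are hypothetical, the decider is assumed to be a strict-majority voter over exactly matching outputs, and the versions run one after another for simplicity, whereas a real system would typically run them concurrently.

```python
from collections import Counter


class NoMajorityError(Exception):
    """Raised when no output is produced by a strict majority of versions."""


def n_version_execute(versions, x):
    """Run every version on the same input and return the majority output."""
    outputs = []
    for version in versions:
        try:
            outputs.append(version(x))  # outputs must be hashable to be counted
        except Exception:
            pass  # a crashed version simply casts no vote
    if outputs:
        value, votes = Counter(outputs).most_common(1)[0]
        if votes > len(versions) // 2:  # strict majority of all N versions
            return value
    raise NoMajorityError("the versions did not agree on a result")


def abs_builtin(x):
    return abs(x)


def abs_branch(x):
    return x if x >= 0 else -x


def abs_faulty(x):
    # Deliberately faulty version (for illustration): wrong for negative input.
    return x


print(n_version_execute([abs_builtin, abs_branch, abs_faulty], -5))
```

With three versions, one faulty output is outvoted by the two correct ones. Exact-match voting suits integer results like these; comparing floating-point outputs would need a tolerance, which this sketch does not attempt.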
By relying on the design diversity principle, this approach should overcome the design faults that affect most software. An essential distinction of N-version software is that the system may comprise several types of hardware as well as several software versions.
N-version software can only be successful and tolerate faults if the required design diversity is met. The importance of suitable specifications in N-version software (as with recovery blocks) cannot be overstated. The specification must be specific enough that the various versions are completely interoperable, allowing a software decider to choose equally between them, but not so limiting that software programmers are unable to create diverse designs. It is challenging to encourage design diversity while maintaining version compatibility in the specification; nonetheless, most modern software fault tolerance approaches rely on this delicate balance.
In an N-version system, various faults may arise and yet be successfully masked and ignored by the system. However, it is important to detect and correct these faults before they cause errors. Faults in N-version software are classified as follows: if only one version of an N-version system is faulty, the fault is a simplex fault; if M versions are faulty, the fault is an M-plex fault.
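The classification can be illustrated with a small Python helper that counts how many versions disagree with a known-good reference value. This is only a sketch: classify_fault is a hypothetical name, and the availability of a reference value (for example, from a test oracle) is an assumption that the text above does not make.

```python
def classify_fault(outputs, expected):
    """Label a fault by the number of versions whose output is wrong."""
    faulty = sum(1 for out in outputs if out != expected)
    if faulty == 0:
        return "no fault"
    if faulty == 1:
        return "simplex fault"  # exactly one version is faulty
    return f"{faulty}-plex fault"  # M versions are faulty


print(classify_fault([4, 5, 4], expected=4))
print(classify_fault([5, 5, 4], expected=4))
```

Note that in the second call the two faulty versions agree with each other, so a majority voter would accept their wrong answer. This is the kind of fault that masking alone cannot handle.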
FAQs
What is Software Fault Tolerance?
Software fault tolerance is the ability of software to detect and recover from a fault that is occurring, or has already occurred, in either the software or the hardware of the system it runs on, so that it continues to provide service according to its specification.
What are the two Software Fault Tolerance Techniques?
The two software fault tolerance techniques are Recovery Block and N-Version Software.
List two major differences between Recovery Block and N-Version Software Techniques.
The two major differences between RB and NVS are:
RB uses an acceptance test to check each result, whereas NVS does not; it relies on a voter instead.
NVS is generally considered more suitable than RB for critical systems.
Key Takeaways
In this article, we have extensively discussed Software Fault Tolerance and different fault tolerance techniques.
We hope that this blog has helped you enhance your knowledge of Software Engineering. If you would like to learn more, check out our articles on Software Engineering and Software Reliability Models.
Happy Coding!