www.prismmodelchecker.org

IEEE 1394 FireWire Root Contention Protocol

Contents

Related publications: [DKN02, DKN04]

Introduction

This case study concerns the Tree Identify Protocol of the IEEE 1394 High Performance Serial Bus (called ''FireWire''). The 1394 High Performance serial bus is used to transport video and audio signals within a network of multimedia devices. It has a scalable architecture and is ''hot-pluggable'', meaning that at any time devices can easily be added or removed from the network.

The tree identify process of IEEE 1394 is a leader election protocol which takes place after a bus reset in the network, that is, when a node is added or removed from the network. After such a reset all nodes in the network have equal status, and know only to which nodes they are directly connected. A leader (root) needs to be chosen to act as the manager of the bus for subsequent phases of IEEE 1394. This is done by the nodes communicating ''be my parent'' messages, starting from the leaf nodes. It can happen that two nodes contend the leadership (root contention), in which case the contenders exchange additional messages and involve time delays and electronic coin tosses.

The protocol is designed for use on connected networks, will correctly elect a leader if the network is acyclic, and will report an error if a cycle is detected. For further information concerning the IEEE 1394 Tree Identify Protocol see here.

We consider the following probabilistic timed automata models of the root contention part of the tree identify protocol, which are based on probabilistic I/O automata models presented in [SV99].

  • The full protocol Impl: which consists of the parallel composition of two nodes (Node1 and Node2), and two communication channels (Wire12 for messages from Node1 to Node2, and Wire21 for messages from Node2 to Node1) and corresponds to the system Impl of [SV99]. For further details, see the Uppaal specification for this system presented in [SS01].

  • The abstract model Abst: which is represented by a single probabilistic timed automaton and is an abstraction of Impl based on the the probabilistic I/O automaton I1 of [SV99].

We first use timed automata model checking tool KRONOS to perform a symbolic forwards reachability analysis to generate the set of states that are reachable with non-zero probability from the initial state, and before the deadline expires. This information is then encoded as a Markov decision process which we analyse with PRISM. We apply this technique to compute the minimum probability of a leader being elected before a deadline, for different deadlines, and study the influence of using a biased coin on this minimal probability.

Further details are available in [DKN02, DKN04].

Forwards Reachability with KRONOS

The forwards reachability algorithm of KRONOS proceeds by a graph-theoretic traversal of the reachable state space using a symbolic representation of states called symbolic states [DT98]. For example, consider the following probabilistic timed automata.

The above probabilistic timed automata models a process which repeatedly tries to send a packet after between 4 and 5 seconds, and if successful waits for 3 seconds before trying to send another packet. The packet is sent with probability 0.99 and lost with probability 0.01 due to an error.

The reachability graph obtained for the above probabilistic timed automata for a deadline of 15 seconds, measured with an extra clock y is given below. Since4 y is never reset, its value would increase indefinitely. To obtain a finite reachability graph, we need to apply the extrapolation abstraction of [DT98], which abstract away the exact value of y when y>=15. Notice that the abstraction is exact with respect to reachability properties.

Encoding the reachability graph in PRISM

The reachability graph obtained with KRONOS is described by an explicit list of states and transitions. In order to model check probabilistic properties we must encode it as a Markov decision process using PRISM's description language.

The behaviour of a system is described by a set of guarded commands of the form:

	[] guard -> command. 

A guard is a predicate over variables of the system. A command describes a transition which the system can make if the guard is true, by giving a new value to primed variables as a function on the old values of unprimed variables. We consider two types of encoding of a reachability graph in this language.

Explicit Encoding

The first solution is a direct explicit encoding of the reachability graph using a single variable s whose value is the index of the state of the reachability graph. Transitions of the system are encoded by guarded commands such that the guard tests the value of s and the command updates it according to the transition relation of the reachability graph. For example, the encoding of the outgoing transitions from states 0, 4 and 7, corresponding to location send in the reachability graph above is:

	[] s=0 -> 0.99: (s'=1) + 0.01: (s'=2); 
	[] s=4 -> 0.99: (s'=5) + 0.01: (s'=6); 
	[] s=7 -> 0.99: (s'=8) + 0.01: (s'=9);

This encoding generates a description the size of the reachability graph, which can grow drastically with the value of the deadline. PRISM involves a model construction phase, during which the system description is parsed and converted into an MTBDD representation for further analysis. This phase can be extremely time consuming when the input file does not correspond to a modular and structured description of a system, such as with the explicit encoding. Thus, an encoding allowing for a more compact description of the system is needed. Below we outline a number of compaction techniques and also consider an alternative approach where instead we reduce the size of the reachability graph through applying a probabilistic bisimulation [SL94] quotient.

Compaction Techniques

Instances Encoding

Symbolic states in the reachability graph correspond to several instances of locations of the timed automaton from which it was generated, with different time constraints. We can then encode them with two variables: a location variable l and an instance variable n describing to which instance of the location it corresponds.

Let l = 0, 1, 2 and 3 be the values corresponding to locations send, wait, error, and error_before respectively. Then, symbolic states 0, 4 and 7 correspond to three different instances of location send, say n = 0, 1 and 2. The outgoing probabilistic transitions from these states can be specified by the guarded commands:

	[] l=0 & n=0 -> 0.99 : (l'=1) & (n'=0) + 0.01 : (l'=2) & (n'=0);
	[] l=0 & n=1 -> 0.99 : (l'=1) & (n'=1) + 0.01 : (l'=2) & (n'=1);
	[] l=0 & n=2 -> 0.99 : (l'=1) & (n'=2) + 0.01 : (l'=2) & (n'=2);

Similarly, symbolic state 3 is the unique instance of location error_before, encoded as l=3 and n=0. The incoming transitions to this state are described by:

	[] l=2 & n=0 -> 1 : (l'=3) & (n'=0);
	[] l=2 & n=1 -> 1 : (l'=3) & (n'=0);

Instances are computed by a breadth-first traversal of the reachability graph.

Absolute compaction

Note that, in the commands representing the outgoing transitions from location send, the instance variable n is left unchanged, meaning that the transition only affects the location variable for instances 0, 1 and 2. This is equivalent to write that n'=n, which can be omitted since, by default, a non updated variable takes its old value. Thus, the transitions above have the same update command, and can then be described more compactly in a single command line:

	[] l=0 & (n=0 | n=1 | n=2) -> 0.99 : (l'=1) + 0.01 : (l'=2);

In a reachability graph, a transition between two given locations is usually repeated several times for different instances of the locations. This encoding allows us to specify them all in a single command line.

We will refer to this as the relative compaction, because it is based on specifying the updated value n' relative to its old value n.

Absolute compaction

The previous compaction does not apply in the case of the two incoming transitions to error_before. However, if we specify the updated value n' with its absolute value, the update command of both transitions is the same. Thus, they can both be described more compactly in a single command line:

	[] l=2 & (n=0 | n=1) -> 1 : (l'=3) & (n'=0);

In a reachability graph, we encounter states which are the destination of many different transitions, such as the state encoded by l=3 and n=0. This encoding allows us to specify them all in a single command line.

We will refer to this as the absolute compaction, because it is based on specifying the absolute value of n'. Note that this compaction could also be applied to the explicit encoding. However, since in practice the relative compaction leads to a more compact description, compaction algorithms have only been implemented in the case of the instances encoding. Absolute compaction is especially interesting when used in combination with the relative one.

Combination

In order to obtain a further reduction, we can combine both compactions. Since in practice there are potentially more transitions to be compacted with the relative encoding than with the absolute one, the heuristic implemented consists in first applying the relative compaction and then, for those transitions that were not compacted, i.e. those whose guard correspond to a unique source state, to change the command updates for the instance variable n from relative to absolute, and then apply the absolute compaction.

Reduction through Probabilistic Bisimulation

As an alternative approach we use the CADP (Ceasar/Aldebaran Development Package) a tool set for the design and verification of complex systems, which has recently be extended to allow for performance evaluation [GH02]. In particular, we use the bcg_min tool which supports the minimisation of probabilistic systems with respect to probabilistic bisimulation. The main steps in this approach are: represent the reachability graph in a format suitable for the CADP toolset, use CADP to construct the bisimulation quotient, and finally translate the CADP output into the PRISM language.

To construct the input to CADP required only a straightforward modification of the output from KRONOS (the main step is converting the reachability graph to the alternating model [HJ94]). For the quotient system to preserve the maximal reachability probability of interest, we must identify the states where a leader has been elected after the deadline has passed. For simplicity, since these are the only states that we need to identify, we add loops to these states and label these transition with one action and then label all remaining transitions with another distinct action.

Using this version of the reachability graph, we then use the toolset CADP to construct the quotient under probabilistic bisimulation which preserves the maximal probability of electing a leader after the deadline has passed.

To translate the CADP output into the PRISM language we wrote a simple translator which takes as input the probabilistic system representing the quotient system (the output from CADP) and translates this into the PRISM language. We note that at this stage we have lost all information concerning which locations and zones correspond to which states, and hence we are restricted to an explicit encoding of the quotient system.

Results for the full protocol

In the case when DEADLINE=4,000, the KRONOS source code can be found here and the output files when using the different encodings and compaction algorithms are listed below.

  • The output file when using the explicit encoding can be found here.
  • The output file when using the instances encoding and absolute compaction algorithm can be found here.
  • The output file when using the instances encoding and relative compaction algorithm can be found here.
  • The output file when using the instances encoding and the absolute and relative compaction algorithms can be found here.

Also in the case when DEADLINE=4,000, the input and output files from CADP can be found here and here respectively.

In the table below we summarise the results concerning the generation with KRONOS of the reachability graph and of its encoding as an MDP in PRISM's description language.

DEADLINE (103ns):   forward reachability:   reduced model:   encoding in PRISM language (lines):
explicit: instances:
states: time (sec): states: unreduced: reduced absolute relative relative+absolute
4 25990.940104 3716124 2424 974 894
6 43371.64 182 6202219 4050 15451473
8 78312.93 285 11262344 7320 27482622
10 111194.27 486 15986588 10398 38643710
20 4101718.7 1639 592541988 384061406213730
30 8928356.1 3576 1291544340 836343034929843
40 155675129 6147 2254207462 1458545268152019
50 241125225 9484 - 11514 - - -
60 344719441 13501 - 16392 - - -
80 6078711312 23146 - 28104 - - -

Furthermore, the graph below shows the evolution of the number of lines of the generated file for the different values of the deadline. It demonstrates that the instances encoding allows for compactions which reduce drastically the number of lines of the MDP file, but quotienting by probabilistic bisimulation achieves the greatest reduction.

Results for the abstract model

In the case when DEADLINE=10,000, the KRONOS source code can be found here and the output files when using the different encodings and compaction algorithms are listed below.

  • The output file when using the explicit encoding can be found here.
  • The output file when using the instances encoding and absolute compaction algorithm can be found here.
  • The output file when using the instances encoding and relative compaction algorithm can be found here.
  • The output file when using the instances encoding and the absolute and relative compaction algorithms can be found here.

Also in the case when DEADLINE=10,000, the input and output files from CADP can be found here and here respectively.

In the table below we summarise the results concerning the generation with KRONOS of the reachability graph and of its encoding as an MDP in PRISM's description language.

DEADLINE (103ns):   forward reachability:   reduced model:   encoding in PRISM language (lines):
explicit: instances:
states: time (sec): states: unreduced: reduced absolute relative relative+absolute
4 131 0.0032 17439 104 42 26
6 216 0.0167 29083 173 64 27
8 372 0.0283 499103 297 91 36
10 526 0.03111 709138 421 126 39
20 1,876 0.09454 2531566 1501 368 72
30 4,049 0.20976 54661219 3240 734100
40 7,034 0.461764 94992205 56291223126
50 10,865 1.232732 146743417 86941842159
60 15,511 2.743854 209524821 124122586186
80 27,296 8.946608 368688267 218414437243
10042,401 22.210268 5727412841339266797303

Furthermore, the graph below shows the evolution of the number of lines of the generated file for the different values of the deadline. It demonstrates that the instances encoding allows for compactions which reduce drastically the number of lines of the MDP file, and in the case where both relative and absolute compactions are used, the number of lines grows less than linearly on the value of the deadline and offers a greater reduction than applying the probabilistic bisimulation quotient.

Verification with PRISM

We now report on the model checking results when using PRISM for both Impl and Abst models. We first compute the minimum probability of a leader being elected before a deadline, for different deadlines, and then study the influence of using a biased coin on this minimal probability.

Model Statistics

The table below shows statistics for the MTBDD which represents each model we have built, in terms of the the total number of nodes and the number of leaves (terminal nodes), for each of the encodings and compaction algorithms.

Impl Model
DEADLINE (103ns):   Nodes:   Leaves:
explicit: instances:
unreduced: reduced absolute relative relative+absolute
4 5,217 411 2,0171,787 1,791 3
6 7,582 727 2,5922,165 2,259 3
8 11,7361,117 3,6482,917 2,937 3
10 15,2021,821 4,5763,579 3,609 3
20 - 4,900 9,4646,799 6,853 3
30 - 9,624 - 9,720 9,785 3
40 - 15,520- 12,358 12,4373
50 - 18,130- - - 3
60 - 29,118- - - 3
80 - 45,537- - - 3

Abst Model
DEADLINE (103ns):   Nodes:   Leaves:
explicit: instances:
unreduced: reduced absolute relative relative+absolute
4 485 140 188 188 193 3
6 698 341 346 328 328 3
8 979 375 403 403 405 3
10 1,306468 488 473 473 3
20 2,6761,506 988 971 991 3
30 4,4222,874 1,6621,5151,5483
40 - 4,502 2,2142,0072,0533
50 - 6,267 2,5812,3112,3903
60 - 7,762 3,3873,0563,1653
80 - 11,1214,3963,8554,0073
100 - 15,114- 5,2935,4513

Model Construction Times

The tables below gives the times taken to construct the models.

Impl Model
DEADLINE:   Construction time (s):
explicit: instances:
unreduced: reduced relative absolute relative+absolute
4 270 0.2688.37 5.67 4.86
6 878 0.71119.2 13.4 10.9
8 2,6401.85 61.9 44.8 38.1
10 6,0785.08 129 96.5 82.1
20 - 81.8 2,099 1,719 1,469
30 - 389 - 9,057 7,614
40 - 1,305 - 30,97727,870
50 - 3,451 - - -
60 - 7,172 - - -
80 - 23,320- - -

Abst Model
DEADLINE:   Construction time (s):
explicit: instances:
unreduced: reduced absolute relative relative+absolute
4 0.4920.0920.124 0.1010.056
6 1.14 0.1670.202 0.1550.071
8 3.79 0.2080.390 0.0670.105
10 8.54 0.3390.645 0.2450.136
20 122 5.79 6.29 1.82 0.505
30 698 27.6 33.2 8.97 1.50
40 - 111 112 26.8 3.70
50 - 284 289 71.1 8.21
60 - 554 612 147 15.4
80 - 1,802 2,187 531 43.1
100 - 5,181 - 1,48691.9

Model Checking

The property we consider is: "the minimum probability that a leader (root) is chosen before a DEADLINE."

  • Using a fair coin: In the table below, summarises the model checking results when using a fair coin and both the most compact encoding (instances encoding with both the absolute and relative compaction algorithms) and the explicit encoding and the reduced reachability graph, when using the MTBDD, sparse and hybrid engines, where in each case we stop iterating when the relative error between subsequent iteration vectors is less than 1e-6, that is:

    Model checking times:

    Impl Model
    DEADLINE:   unreduced: instances rel+abs   reduced: explicit   Probability:
    Time per iteration (s): Iterations: Time per iteration (s): Iterations:
    MTBDD Sparse Hybrid MTBDD Sparse Hybrid
    4 0.00090<  0.00001 <  0.00001 11 <  0.00001 <  0.00001 <  0.00001 15 0.625000000
    6 0.00185<  0.00001 0.00037 27 0.00057 <  0.00001 <  0.00001 35 0.851562500
    8 0.00457<  0.00001 0.00029 35 0.00111 <  0.00001 <  0.00001 45 0.939453135
    10 0.00725 0.00039 0.00078 51 0.00215 <  0.00001 <  0.00001 65 0.974731455
    20 0.06609 0.00235 0.00469 115 0.01496 0.00006 0.00034 1450.999629565
    30 0.29515 0.00544 0.01128 171 0.07934 0.00014 0.00079 2150.999994454
    40 0.67436 0.01132 0.02118 227 0.21943 0.00021 0.00171 2850.999999919
    50 - - - - 0.37234 0.00031 0.00235 3530.999999998
    60 - - - - 0.83342 0.00045 0.00390 4150.999999999
    80 - - - - 2.23935 0.00095 0.00640 5451.000000000

    Abst Model
    DEADLINE:   unreduced: instances rel+abs   reduced: explicit   Probability:
    Time per iteration (s): Iterations: Time per iteration (s): Iterations:
    MTBDD Sparse Hybrid MTBDD Sparse Hybrid
    4 <  0.00001 <  0.00001 <  0.00001 6 <  0.00001 <  0.00001 <  0.00001 8 0.625000000
    6 <  0.00001 <  0.00001 <  0.00001 12 <  0.00001 <  0.00001 <  0.00001 16 0.851562500
    8 <  0.00001 <  0.00001 <  0.00001 15 <  0.00001 <  0.00001 <  0.00001 20 0.939453135
    10 0.00048 <  0.00001 <  0.00001 21 0.00036 <  0.00001 <  0.00001 28 0.974731455
    20 0.00089 <  0.00001 0.00022 45 0.00233 <  0.00001 <  0.00001 60 0.999629565
    30 0.00151 0.00015 0.00030 66 0.01022 0.00022 <  0.00001 88 0.999994454
    40 0.00218 0.00046 0.00058 87 0.02491 0.00035 0.00008 1160.999999919
    50 0.00428 0.00037 0.00084 107 0.05886 0.00049 0.00007 1420.999999998
    60 0.01131 0.00064 0.00151 126 0.09722 0.00071 0.00011 1680.999999999
    80 0.00793 0.00133 0.00290 165 0.22117 0.00146 0.00018 2201.000000000
    100 0.01374 0.00211 0.00456 204 0.06867 0.00365 0.00253 2601.000000000
  • Varying the communication delay between the nodes: In the figure below, we give the probability for electing a leader before a deadline as the communication delay (wire length) between the nodes varies. As expected the results show that the probability of electing a leader before a deadline decreases as the communication delay decreases.

  • Varying the coin bias: In the tables below, we give the probability for electing a leader before a deadline of 3,000ns, 4,0000ns, 6,000ns, 8,0000ns and 10,000ns, for different biased coins (different probabilities of choosing the fast and slow timings) when the communication delay is both 360ns and 30ns.

    fast: slow:   communication delay 30ns   communication delay 360ns
    3,000ns: 4,000ns: 6,000ns: 8,000ns: 10,000ns: 3,000ns: 4,000ns: 6,000ns: 8,000ns: 10,000ns:
    0.010.990.0198020.0392080.0582370.0768860.0951660.0198000.0198030.0392110.0582370.076886
    0.100.900.1818000.3276180.4522190.5517770.6332330.1800000.1818000.3305340.4522190.551770
    0.200.800.3328000.5381120.7023010.8010060.8668860.3200000.3328000.5545160.7023530.801000
    0.300.700.4578000.6670020.8374500.9109650.9509080.4200000.4578000.7043520.8380500.910950
    0.400.600.5568000.7418880.9048040.9572000.9800520.4800000.5568000.7991500.9076350.957090
    0.450.550.5952380.7652730.9220930.9685470.9863390.4950000.5952380.8300270.9270660.968230
    0.500.500.6250000.7812500.9316410.9754940.9899690.5000000.6250000.8515620.9394530.974731
    0.510.490.6297970.7836120.9327690.9764890.9904740.4998000.6297970.8548320.9412150.975592
    0.520.480.6341830.7856980.9336620.9773730.9909190.4992000.6341830.8577680.9427570.976322
    0.530.470.6381440.7875070.9343260.9781500.9913090.4982000.6381440.8603760.9440830.976927
    0.540.460.6416660.7890320.9347690.9788280.9916460.4968000.6416660.8626580.9451950.977409
    0.550.450.6447380.7902700.9349970.9794100.9919360.4950000.6447380.8646160.9460950.977771
    0.560.440.6473420.7912120.9350110.9798980.9921780.4928000.6473420.8662490.9467810.978015
    0.570.430.6494650.7918490.9348170.9802970.9923760.4902000.6494650.8675580.9472510.978140
    0.580.420.6510940.7921700.9344140.9806060.9925310.4872000.6510940.8685390.9475010.978147
    0.590.410.6522100.7921610.9338030.9808260.9926430.4838000.6522100.8691870.9475240.978033
    0.600.400.6528000.7918080.9329800.9809580.9927130.4800000.6528000.8694980.9473130.977795
    0.610.390.6528450.7910920.9319400.9809960.9927390.4758000.6528450.8694630.9468540.977429
    0.620.380.6523290.7899960.9306760.9809410.9927200.4712000.6523290.8690710.9461350.976930
    0.630.370.6512340.7884970.9291800.9807860.9926540.4662000.6512340.8683080.9451420.976291
    0.640.360.6495430.7865720.9274390.9805270.9925390.4608000.6495430.8671610.9438550.975504
    0.650.350.6472380.7841950.9254380.9801550.9923700.4550000.6472380.8656090.9422530.974558
    0.700.300.6258000.7644420.9107410.9761670.9904710.4200000.6258000.8508980.9285300.966919
    0.800.200.5248000.6686720.8398530.9453550.9734260.3200000.5248000.7689420.8532750.923030
    0.900.100.3258000.4456980.6253620.7917730.8592490.1800000.3258000.5442730.6291890.746829
    0.990.010.0392060.0582280.0951490.1478350.1812430.0198000.0392060.0768720.0951560.130600

    Furthermore, in the graph below we have plotted the results up to the deadline of 10,0000ns. Both the table and the graph show that the (timing) performance of the root contention protocol can be improved using a biased coin which has a higher probability of flipping ''fast''. This curious result is possible because, although using such a biased coin decreases the likelihood of the nodes flipping different values, when nodes flip the same values there is a greater chance (i.e. when both flip ''fast'') that less time elapses before they flip again. There is a compromise here though: as the coin becomes more biased towards ''fast'', the probability of the nodes actually flipping different values (which is required for a leader to be elected) decreases, even though the delay between coin flips will on average decrease. This decrease in probability is demonstrated in both the table and the graph as increasing the probability of flipping fast eventually leads to a decrease in the probability of electing a leader by any given deadline.


Case Studies