<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=iso-8859-15">
<TITLE>Bayesian Filtering Classes</TITLE>
<LINK REL="stylesheet" HREF="Deployment/paper.css" TYPE="text/css">
</HEAD>
<BODY>
<H1 ALIGN=CENTER><U>Bayesian Filtering Classes</U></H1>
<H1> Introduction</H1>
<P ALIGN=LEFT>
Bayesian Filtering is a probabilistic technique for data fusion. The technique combines a concise mathematical formulation of a system with observations of that system. Probabilities are used to represent the state of a system, likelihood functions to represent their relationships. In this form Bayesian inference can be applied and further related probabilities deduced. See <A HREF="http://www.wikipedia.org/">Wikipedia</A> for information on <A HREF="http://www.wikipedia.org/wiki/Probability_theory">Probability
theory</A>, <A HREF="http://www.wikipedia.org/wiki/Bayes'_theorem">Bayes
theorem</A>, <A HREF="http://www.wikipedia.org/wiki/Bayesian_inference">Bayesian
Inference</A>.</P>
<P ALIGN=LEFT>For <U>discrete</U> systems the Bayesian formulation
results in a naturally iterative data fusion solution. For <U>dynamic</U>
systems there is a class of solutions, discrete <U>filters</U>, that combine observed outputs of the system with the system's dynamic model. An <U>estimator</U> computes an estimate of the system's state with each observation
of the system. Linear estimators such as the Kalman Filter are commonly applied.</P>
<P>Bayes++ is an open source library of C++ classes. These
classes represent and implement a wide variety of numerical
algorithms for Bayesian Filtering of discrete systems. The classes
provide tested and consistent numerical methods and the class
hierarchy explicitly represents the variety of filtering algorithms
and system model types.</P>
<P>The following documentation summarizes the classes available and
provides some general comments on their use. Each class's <B>.hpp</B>
file provides a complete interface specification and the numerical
form of the solution. Implementation issues are documented in the
<B>.cpp</B> files.</P>
<H2>Numerical Schemes in Bayes++</H2>
<P>There is a wide range of different numerical techniques for
filtering. For example the Kalman filter (a linear estimator which
calculates the 2<SUP>nd</SUP> order statistics of the system) is
represented using either a state vector and a covariance matrix, or
alternatively in information form.</P>
<P>Each numerical technique is a <B><I>Scheme</I></B>, and each
Scheme is implemented as a class. Each scheme implements the
algorithms required by a filter's particular numerical form. The
schemes provide a common interface that allows:</P>
<DL>
<DD>i. initialization and update of the state, and</DD>
<DD>ii. predict and observe functions with parameterized models</DD>
</DL>
<P>Each Scheme has both advantages and disadvantages. The numerical
complexity and efficiency varies, as does the numerical stability.
The table below lists all Schemes together with any requirement on
the representation associated with the algorithm.</P>
<TABLE WIDTH=725 BORDER=5 CELLPADDING=0 CELLSPACING=0>
<TR VALIGN=TOP>
<TD WIDTH=30%>
<P><B><FONT FACE="Verdana" SIZE=5>Scheme</FONT></B>
</TD>
<TD WIDTH=43%>
<P><B><FONT FACE="Verdana" SIZE=4>Formulation and<BR>Algorithm
used</FONT></B>
</TD>
<TD WIDTH=25%>
<P><B><FONT FACE="Verdana" SIZE=4>Representation
Requirements</FONT></B>
</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=30%>
<H4>Covariance_scheme</H4>
<P>Extended Kalman Filter (EKF)</P>
<P><A HREF="#References">See Reference[6]</A></TD>
<TD WIDTH=43%>
<P>Classic 2nd order filter with state mean and covariance
representation. Non-linear models require a gradient linearisation
(Jacobian). Uses the common innovation update formulation.</TD>
<TD WIDTH=25%>
<P>Innovation covariance invertible</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=30%>
<H4>Unscented_scheme</H4>
<P>Kalman filter using unscented non-linear approximations</P>
<P><A HREF="#References">See Reference[1]</A></TD>
<TD WIDTH=43%>
<P>Unscented non-linear transformations replace the Jacobians used
in the EKF, reducing linearisation errors in all cases.</TD>
<TD WIDTH=25%>
<P>Innovation covariance invertible</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=30%>
<H4>Iterated_covariance_scheme</H4>
<P>Modified EKF update</P>
<P><A HREF="#References">See Reference[6]</A></TD>
<TD WIDTH=43%>
<P>Kalman filter for highly non-linear observation models.
Observation update is iterated using an Euler approximation.</TD>
<TD WIDTH=25%>
<P>State, observation and innovation covariance invertible</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=30%>
<H4>Information_scheme</H4>
<P>Extended Information Filter (EIF)</P>
<P><A HREF="#References">See Reference[7]</A></TD>
<TD WIDTH=43%>
<P>Information (inverse covariance) filter.</P>
<P>Invertible linear model allows direct information form
prediction.<BR>Non-linear prediction requires conversion back to
state representation.<BR>Observe with modified innovation
formulation equivalent to EKF.</TD>
<TD WIDTH=25%>
<P>State, observation and innovation covariance invertible</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=30%>
<H4>Information_root_scheme</H4>
<P>Square root information filter (SRIF)</P>
<P><A HREF="#References">See Reference[4]</A></TD>
<TD WIDTH=43%>
<P>Numerically stable solution using factorization of inverse
covariance as R'R</TD>
<TD WIDTH=25%>
<P>Invertible prediction model. Prediction and observation
covariance invertible</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=30%>
<H4>UD_scheme</H4>
<P>Square root covariance filter</P>
<P><A HREF="#References">See Reference[2]</A></TD>
<TD WIDTH=43%>
<P>Numerically stable solution using UdU' factorization of
covariance.</TD>
<TD WIDTH=25%>
<P>Innovation covariance invertible. Linearized observation
requires uncorrelated noise</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=30%>
<H4>CI_scheme</H4>
<P>Covariance Intersect filter</P>
<P><A HREF="#References">See Reference[8]</A></TD>
<TD>
<P>CI is interesting as it provides a weaker but more robust
fusion than traditional covariance-based methods such as the Kalman
filter. It estimates the state and an upper bound on what its
covariance could be.</TD>
<TD>
<P>No solution if covariances don't intersect.</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=30%>
<H4>SIR_scheme</H4>
<P>Sequential Importance Re-sampling filter</P>
<P><A HREF="#References">See Reference[5]</A></TD>
<TD WIDTH=43%>
<P>A bootstrap representation of the state distribution – no
linear or Gaussian assumptions required.</P>
<P>Uncorrelated linear roughening of samples.</TD>
<TD WIDTH=25%>
<P>Bootstrap samples collapse to a degenerate form with fewer
samples than states.</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=30%>
<H4>SIR_kalman_scheme</H4>
<P>Sequential Importance Re-sampling filter</TD>
<TD WIDTH=43%>
<P>A bootstrap representation of the state distribution – no
linear or Gaussian assumptions required.</P>
<P>Additionally computes the state mean and covariance. Correlated
linear roughening of samples.</TD>
<TD WIDTH=25%>
As above. Correlation non-computable from degenerate samples.</TD>
</TR>
</TABLE>
<H1>Three Examples</H1>
<H2>Simple Example (Initialize – Predict – Observe)</H2>
<P>This is a very simple example for those who have never used the
Bayesian Filtering Classes before. The example shows how two classes
are built. The first is the prediction model, the second the
observation model. In this example they represent a simple linear
problem with only one state variable and constant model noises.</P>
<P>The example then constructs a filter. The <B>Unscented</B> filter
scheme is chosen to illustrate how this works, even on a simple
linear problem. After construction the filter is given the problem’s
initial conditions. A prediction using the predefined prediction model
is then made. This prediction is then fused with an external
observation given by the example and the defined observation model.
At each stage, the filter’s state estimate and variance
estimate are printed.</P>
<H2>Position and Velocity</H2>
<P>This example solves a simple position and velocity observation
problem using the Bayesian filter classes. The system has two states,
position and velocity, and a noisy position observation is simulated.
Two variants:</P>
<DL>
<DD>1) direct filter</DD>
<DD>2) indirect filter, where filtering is performed on the error
and the state is estimated indirectly</DD>
</DL>
<P>are demonstrated using a specified numerical scheme. The example
shows how to build <U>linear</U> prediction and observation models
and to cycle the filter to produce state estimates.</P>
<H2>Quadratic Calibration</H2>
<P>This example implements a “Quadratic Calibration”
filter. The observer is trying to estimate the state of a system as
well as the system's scale factor and bias. A simple system is
simulated where the state is affected by a known perturbation.</P>
<P>The example shows how to build <U>linearized</U> prediction and
observation models and to cycle a filter to produce state estimates.</P>
<H1>Dual Abstraction</H1>
<P>The aim of Bayes++ is to provide a powerful and extensible
structure for Bayesian filtering. This is achieved by hierarchical
composition of related objects. In C++ these are represented by a
hierarchy of polymorphic classes. Numerical Schemes are grouped by
their state representation. The Schemes themselves provide the
filtering operations on these representations.</P>
<P>To complete a filter, an associated prediction and observation
model must also be specified. These model classes parameterize the
filtering operations. The user's responsibility is to choose a model
type and provide the numerical algorithm for that model. Again, model
classes are composed using a hierarchy of polymorphic abstract classes
to represent the structure of each model type.</P>
<P>This <EM>dual abstraction</EM>, separating numerical schemes from
model structure, provides a great deal of flexibility. In particular
it allows a well-specified model to be used with a variety of
numerical schemes.</P>
<H2>Filter Hierarchy</H2>
<P>The table below lists a few of the key abstract filter classes
upon which filter Schemes are built. The C++ class hierarchy
documentation gives the complete structure.</P>
<DL>
<DD>
<TABLE WIDTH=566 BORDER=3 CELLPADDING=2 CELLSPACING=0>
<COL WIDTH=210>
<COL WIDTH=342>
<TR VALIGN=TOP>
<TD WIDTH=210>
<P><STRONG>Bayes_filter</STRONG>
</TD>
<TD WIDTH=342>
<P>Base class for everything<BR>(contains no data)
</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=210>
<P><STRONG>Likelihood_filter</STRONG></TD>
<TD WIDTH=342>
<P>Represents only the Bayesian Likelihood of a state
observation</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=210>
<P><STRONG>Functional_filter</STRONG></TD>
<TD WIDTH=342>
<P>Represents only the filter prediction by a simple
functional (non-stochastic) model</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=210>
<P><STRONG>State_filter</STRONG></TD>
<TD WIDTH=342>
<P>Represents only the filter state and an update on that
state</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=210>
<P><STRONG>Kalman_state_filter</STRONG></TD>
<TD WIDTH=342>
<P>Kalman representation of state statistics.<BR>Represents a
state vector and a covariance matrix; that is, the 1st (mean)
and 2nd (covariance) moments of a distribution.</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=210>
<P><B>Information_state_filter</B></TD>
<TD WIDTH=342>
<P>Information representation of state statistics.<BR>Effectively
the inverse of the Kalman_state_filter representation.</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=210>
<P><STRONG>Linrz_filter</STRONG></TD>
<TD WIDTH=342>
<P>Model interface for linear or gradient-linearized Kalman
filters.<BR>Specifies filter operation using predict and
observe functions</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=210>
<P><STRONG>Sample_filter</STRONG></TD>
<TD WIDTH=342>
<P>Discrete representation of state statistics.<BR>Base class
for filters representing the state probability distribution by a
discrete sampling</TD>
</TR>
</TABLE>
</DD>
</DL>
<H2 ALIGN=LEFT>Model Hierarchy</H2>
<P ALIGN=LEFT>These two tables show some of the classes in the
hierarchy upon which models are built.</P><A NAME="predict models"></A>
<DL>
<DD>
<TABLE WIDTH=561 BORDER=3 CELLPADDING=2 CELLSPACING=0>
<COL WIDTH=223>
<COL WIDTH=324>
<TR VALIGN=TOP>
<TD WIDTH=223>
<P><STRONG>Sampled_predict_model</STRONG></TD>
<TD WIDTH=324>
<P>Sampled stochastic prediction model</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=223>
<P><STRONG>Functional_predict_model</STRONG>
</TD>
<TD WIDTH=324>
<P>Functional (non-stochastic) prediction model
</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=223>
<P><STRONG>Additive_predict_model</STRONG></TD>
<TD WIDTH=324>
<P>Additive Gaussian noise prediction model.<BR>The
fundamental model for linear/linearized filtering, with noise
added to a functional prediction</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=223>
<P><STRONG>Linrz_predict_model</STRONG></TD>
<TD WIDTH=324>
<P>Linearized prediction model with Jacobian of non-linear
functional part</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=223>
<P><STRONG>Linear_predict_model</STRONG></TD>
<TD WIDTH=324>
<P>Linear prediction model</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=223>
<P><STRONG>Linear_invertable_predict_model</STRONG></TD>
<TD WIDTH=324>
<P>Linear prediction model which is invertible</TD>
</TR>
</TABLE>
</DD>
</DL>
<P> </P><A NAME="observe models"></A>
<DL>
<DD>
<TABLE WIDTH=558 BORDER=3 CELLPADDING=2 CELLSPACING=0>
<COL WIDTH=222>
<COL WIDTH=322>
<TR VALIGN=TOP>
<TD WIDTH=222 HEIGHT=39>
<P><STRONG>Likelihood_observe_model</STRONG>
</TD>
<TD WIDTH=322>
<P>Likelihood observation model<BR>The most fundamental
Bayesian definition of an observation</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=222>
<P><STRONG>Functional_observe_model</STRONG></TD>
<TD WIDTH=322>
<P>Functional (non-stochastic) observation model</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=222>
<P><STRONG>Linrz_uncorrelated_observe_model</STRONG></TD>
<TD WIDTH=322>
<P>Linearized observation model with Jacobian of non-linear
functional part and additive uncorrelated noise</TD>
</TR>
<TR VALIGN=TOP>
<TD WIDTH=222 HEIGHT=23>
<P><STRONG>Linrz_correlated_observe_model</STRONG></TD>
<TD WIDTH=322>
<P>As above, but with additive correlated noise</TD>
</TR>
</TABLE>
</DD>
</DL>
<H1>Capabilities</H1>
<H2>Probabilistic representation of state</H2>
<P>Bayes rule is usually defined in terms of Probability Density
Functions. However, PDFs never appear in Bayes++. They are always
represented by their statistics.</P>
<P>This is for good reason: there is very little that can be done
algorithmically with a function. However, the sufficient statistics,
given the assumptions of a filter, can be easily manipulated to
implement Bayes rule. Each filter class is derived from base
classes that represent the statistics used, for example the
Kalman_filter and Sample_filter base classes.</P>
<P>It would be possible to use a common abstract base class that
enforces the implementation of a PDF function; a function that maps
state to probability. This would provide a very weak method to view
the PDF of state. However such a function could not be efficiently
implemented by all schemes. Therefore to enforce this requirement in
the base class would be highly restrictive.</P>
<H2>Linear and Linearized models</H2>
<P>Many of the filters use classical linear estimation techniques,
such as the Kalman filter. To make them useful they are applied in
modified forms to cope with <U>linearized</U> models of some kind.
Commonly a gradient linearized model is used to update uncertainty,
while the state is directly propagated through the non-linear model.
This is the <I>Extended</I> form used by the Extended Kalman Filter.</P>
<P>However, some numerical schemes cannot be modified using the
<I>Extended</I> form. In particular, it is not always possible to use
the extended form with correlated noises. Where this is the case, the
linearized model is used for uncertainty and state.</P>
<P>There are also many Bayesian filters that work with non-Gaussian
noise and non-linear models - such as the <I>SIR_filter</I>. The SIR
scheme has been built so it works with a Likelihood model.</P>
<P>The filters support discontinuous models such as those depending
on angles. In this case the model must be specifically formulated to
normalize the states. However, some schemes need to rely on an
additional normalization function. Normalization works well for
functions that are locally continuous after normalization (such as
angles). Non-linear functions that cannot be made into locally
continuous models are not appropriate for linear filters.</P>
<H2>Interface Regularity</H2>
<P>Where possible, the Schemes have been designed to all have the
same interface, providing for easy interchange. However, this is not
possible in all cases as the mathematical methods vary greatly. For
efficiency some schemes also implement additional functions. These
functions can be used to avoid inefficiencies where the general form
is not required.</P>
<P>Scheme class constructors are irregular (their parameter list
varies) and must be used with care. Each numerical scheme has
different requirements, and the representation size needs to be
parameterized. A template class Filter_scheme can be used to provide
a generic constructor interface. It provides specializations
for all the Schemes so they can be constructed with a common
parameter list.</P>
<H3>Open interface</H3>
<P>The design of the class hierarchy is deliberately open. Many of
the variables associated with schemes are exposed as <I>public
members</I>. For example the covariance filter’s innovation
covariance is public. This is to allow efficient algorithms to be
implemented using the classes. In particular it is often the case
that subsequent computations reuse the values that have already been
computed by the numerical schemes. Each scheme defines a <I>public
state representation</I>.</P>
<P>Furthermore, many temporaries are <I>protected members</I>, allowing
derived classes to modify a scheme without the additional
overhead of allocating their own temporaries.</P>
<P>Open interfaces are potentially hazardous. The danger is that
abuse could result in unnecessary dependencies on particular
implementation characteristics.
</P>
<H4>Public state representation Initialization and Update</H4>
<P>The two functions <B>init</B> and <B>update</B> are defined in the
filter class hierarchy for classes with names ending <B>_state_filter</B>.
They are very important in allowing the <I>public state
representation</I> to be managed.</P>
<BLOCKQUOTE><P><U>After</U> the public representation of a scheme is
changed (externally) the filter should be <I>initialized</I> with an
<B>init</B> function. The scheme may define additional init functions
that allow it to be initialized from alternative representations.
<B>init()</B> is defined for schemes derived from
<B>Kalman_state_filter</B>.</P></BLOCKQUOTE>
<BLOCKQUOTE><P><U>Before</U> the public representation of a scheme is
used the filter should be <I>updated</I> with an <B>update</B>
function. The scheme may define additional update functions that
allow it to update alternative representations. <B>update()</B> is
defined for schemes derived from <B>Kalman_state_filter</B>.</P></BLOCKQUOTE>
<H4>Sharing schemes state representation</H4>
<P>The filter hierarchy has been specifically designed to allow state
representation to be shared. Schemes' state representations are
inherited using one or more <B>_state_filter</B> classes as virtual base
classes. It is therefore possible to combine multiple schemes (using
multiple inheritance) and only a single copy of each state
representation will exist.</P>
<P>The <B>init</B> and <B>update</B> functions should be used to
coordinate between the schemes and the state representation. This allows
precise control of the numerical conversions necessary for different
schemes to share the state.</P>
<H4>Assignment and Copy Construction</H4>
<P>Filter classes can be assigned when they are of the same size. In
cases where the class includes members in addition to the public
representation, assignment is optimized so that only the public
representation is assigned. Assignment is equivalent to: <I>update</I>,
assignment of the public representation and <I>initialization</I> from
the new state.</P>
<P>Copy Constructors are NOT defined. Generally the classes are
expensive to copy and so copies should be avoided. Instead references
or (smart) pointers should be combined with assignment to create
copies if necessary.</P>
<H4>Changing the state representation size</H4>
<P>The classes assume their state representation is a constant size.
They define their matrix sizes on construction. Matrices (and
vectors!) in public representation <U>should NOT be externally
resized</U>.</P>
<P>Since matrix resizing invariably involves reallocation, it is just
as efficient to create new matrices as to resize them. Also, for a
filter scheme to change its size it must recompute its internal
states. Therefore if the size of a filter needs to change, the solution
is to create a new filter. The new filter can then be <I>initialized</I>
from the state of the old filter plus some new states. What you do
with these new states usually influences the state representation you
choose. Generally new states either enter the system with nothing
being known about them (zero information), or with their values exactly
known (zero covariance).</P>
<H2>Numerical and Exception Guarantees</H2>
<P>It is important to be able to rely on the numerical results of
Bayes++. Generally the Schemes are implemented using the most
numerically stable approach available in the literature. Bayes++
provides its own UdU' factorization functions to factorize
positive semi-definite (PSD) matrices. The factorization is numerically
stable, computing and checking the conditioning of matrices.</P>
<P>Exceptions are thrown in the case of numerical failure. They
follow a simple rule:<BR><EM><STRONG>Bayes_filter_exception</STRONG>
is thrown if an algorithm cannot continue or guarantee the numerical
post-conditions of a function</EM></P>
<P>The only <EM>exception guarantee</EM> that Bayes++ makes when
throwing <STRONG><EM>Bayes_filter_exception</EM> </STRONG>from any
function, is that no resources will be leaked. That is the <I>Weak
guarantee</I> as defined by <A HREF="#References">Reference[9]</A>.
Therefore, unless otherwise specified, the numeric value of a class's
public state is <STRONG>undefined</STRONG> after this exception is
thrown by a member of a class.</P>
<P>Be aware that numerical post-conditions may be met even with
extreme input values. For example, some inputs may result in overflow
of a result. This does not invalidate the post-condition that the
result is positive. Even some <EM>Not a Number</EM> values may be
valid. Therefore the results may be computed without exception, but
include <EM>NaN</EM> or <EM>non-number</EM> values.</P>
<H3>Using exceptions in filter schemes</H3>
<P>What does this mean when using a filter Scheme? If the Scheme
throws an exception then, unless otherwise specified, the numeric
value of its state is undefined. Therefore you <STRONG>cannot</STRONG>
use any function of the filter that has any preconditions on its
numerical state. To continue using a filter, either the <EM>init</EM>()
function should be used (which has no pre-conditions) or the filter
destroyed.</P>
<P>Generally to access the public state of a Scheme you must first
call the <EM>update</EM>() function. If no exception is thrown then
the post-conditions of update are guaranteed. The post-conditions of
<EM>update</EM>() may include such useful properties as the
covariance matrix being PSD. Many application-specific tests may also be
required. Just being PSD doesn't say much about X. It could even have
<EM>Infinity</EM> values on the diagonal! Conditioning, trace,
determinant, and variance bounds can all be applied at this point.
</P>
<H1>Polymorphic design</H1>
<P>Why are models and filter Schemes <STRONG>polymorphic</STRONG>?
That is, why are they implemented with virtual functions?</P>
<BLOCKQUOTE><P>One alternative would be to use <STRONG>generic</STRONG>
instead of <STRONG>polymorphic</STRONG> Schemes. This would implement
Schemes as templates. If the sizes of matrix operations could be
fixed, generic models and schemes would directly manipulate matrices
of predefined size.</P></BLOCKQUOTE>
<P>Bayes++ relies on a <STRONG>dual abstraction</STRONG>, separating
numerical schemes from model structure. To represent these, a
<STRONG>polymorphic design</STRONG> was a very important part of the
Bayes++ design. After a fair bit of use I still think it was the
correct one. Why?</P>
<P>A) <EM>Usage</EM> : There are many (run time) polymorphic things I
really want to do with the library. I want to be able to build
composite filtering algorithms that switch models and schemes at run
time. This requires<U> both</U> polymorphic filters and models <U>and</U>
runtime sizing of matrices.<BR>A generic approach, especially one
parametrized on matrix size, would not allow this. The code size
produced would also bloat massively. Not even the STL parameterizes
its container templates with size!</P>
<P>B) <EM>Type safety</EM>: There is surprisingly little orthogonality
in filtering schemes. The numerics often dictate restrictions or
additional functionality. The type hierarchy in Bayes++ provides a
succinct and safe way to enforce much of this structure.<BR>In a
generic approach, checking that a particular Scheme models a generic
concept correctly would be very difficult to enforce.</P>
<P>C) <EM>Compiler problems</EM>: Generic programming in C++ has a
nasty syntax, is very slow to compile and pushes the limits of
compiler reliability.</P>
<P>D) <EM>Efficiency</EM>: The run time overhead of a polymorphic
filter is negligible compared to the matrix algebra required. In fact
using common code may increase efficiency on many computing
architectures.</P>
<H1 ALIGN=LEFT>Matrix Representation</H1>
<P>Bayes++ requires an external library to represent matrices and
vectors, and to support basic linear algebra operations. Bayes++ uses
the <A HREF="http://www.boost.org/libs/numeric/ublas/doc/index.htm"><B>uBLAS</B></A>
library from <A HREF="http://www.boost.org/"><B>Boost</B></A> for all
these requirements. It is an excellent basic linear algebra library. The
interface and syntax are easy to use.</P>
<P>Previous versions of the filters were implemented with the <A HREF="http://www.osl.iu.edu/research/mtl/" TARGET="_top">Matrix
Template Library</A>, <I>MPP</I> and <I>TNT</I>. Public releases
using MTL are still available.
</P>
<P>The Bayes++ implementation uses a set of adapter classes in the
namespace <I>Bayesian_filter_matrix</I> (FM for short) for matrix
access. This system should make it simpler to port the library to
other matrix types. For uBLAS these functions generally have no
runtime overhead. This efficiency is due to uBLAS's excellent
expression template implementation. For MTL most functions are
reasonably efficient but some functions create matrix temporaries to
return values.</P>
<H2>Matrix library efficiency - Temporary object usage</H2>
<P>Most of the filters avoid using Matrix and Vector temporaries.
They also have optimizations for a constant (on construction) state
size, noise and observations. In particular, the UD_filter scheme
avoids creating any temporary objects. All matrices maintain a fixed
size, other than when the observation state size is varied.</P>
<P>Why are temporary matrix objects bad? The main difficulty is
construction and destruction of Matrices. This generally requires
dynamic memory allocation which is very slow. For small matrices this
dominates compared to the cost of simple operations such as addition
and multiplication.</P>
<P>There are three ways out of this problem.</P>
<BLOCKQUOTE><P>1. Use fixed size matrices; they can (nearly) always
be efficiently allocated on the stack.<BR>2. Use stack based dynamic
allocators (such as C99's variable-length arrays or alloca) for
temporaries.<BR>3. Don't create temporaries. This is achievable with
a combination of hard coding in the algorithms (pre-allocation) and
by using a Matrix library with expression templates to avoid creating
temporaries for return values.</P></BLOCKQUOTE>
<P>Bayes++ is implemented to avoid creating temporaries. At present
my solution is somewhat of a compromise. MTL is only moderately
efficient on most C++ compilers. In release code this results in a
50% performance reduction even when temporaries are avoided. uBLAS
attains close to optimal efficiency in many situations.</P>
<P>The UD_filter is a good illustration. It has been hand crafted to
avoid construction of any temporaries (I use it for embedding) and
can be compiled with either uBLAS, MTL or a special fast matrix
library. UD_filter is heavily optimized to use row operations to
avoid indexing overheads. uBLAS and the special library achieve
comparable results, which I believe are as fast as can be achieved.</P>
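<P>The row-operation idea can be sketched as follows. The dense row-major
Mat type below is invented for illustration (it is not the actual
UD_filter code): taking a raw pointer to each row once moves the
<I>i*cols</I> index arithmetic out of the inner loop.</P>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense row-major matrix; row(i) returns a contiguous pointer so the
// inner loop needs no 2-D index computation per element.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> d;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), d(r * c, 0.0) {}
    double* row(std::size_t i) { return &d[i * cols]; }
};

// Example row operation: row i += s * row j, the kind of elementary
// update used in UD-style factorization algorithms.
void axpy_row(Mat& m, std::size_t i, std::size_t j, double s) {
    double* ri = m.row(i);
    const double* rj = m.row(j);
    for (std::size_t k = 0; k < m.cols; ++k)
        ri[k] += s * rj[k];
}
```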
<P>On the flip side, the current Unscented_filter implementation
shows some of the problems. Because of my wrapper classes, the
Row/Column types of a 2D matrix are not compatible with the Vec type.
This results in a lot of unnecessary copying into pre-allocated
temporary vectors.</P>
<P>Future work includes a general method of avoiding implementation
temporaries. Filter schemes will be parameterised with helper objects
that deal with all the temporaries required. These helper objects can
then hide all the hard work of pre-allocation or dynamic allocation
of temporary objects.</P>
<H1>Restrictions</H1>
<P>For numerical precision, all filter calculations use <B>doubles</B>.
Templates could be used to provide numerical type parameterization,
but are not. Instead, the base class <I>Bayes_filter</I> defines <I>Float</I>,
which is used throughout as the numerical type.</P>
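<P>The pattern is simply a class-scope typedef. A minimal sketch with
simplified, hypothetical names (this is not the actual Bayes++ header): the
base class fixes the numeric type in one place, so a later switch to float
or long double would be a one-line change.</P>

```cpp
#include <cassert>
#include <type_traits>

// Sketch: one typedef fixes the numeric type for the whole hierarchy.
struct Bayes_filter_base {
    typedef double Float;       // used throughout as the numerical type
};

// Derived filters and client code use the typedef, never a raw double.
struct Example_filter : Bayes_filter_base {
    Float state;
    explicit Example_filter(Float x0) : state(x0) {}
};
```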
<P>The <I>Bayes_filter</I> base class is constructed with constant
state size. This allows efficient use of the matrix library to
implement the algorithms. Each derived class requires additional
representation and matrix temporaries, and where possible these are
restricted to a constant size. In general, the only unknown is the
size of the observation state as parameterized by the observation
model.</P>
<P>Where the state size varies, efficient solutions are still
possible, either by using sparse matrix representations or by
recreating filters with the increased size.</P>
<H1>The future</H1>
<P>The Bayesian Filtering Classes are now a stable set of solutions
for the majority of linear, linearisable and non-linear filtering
problems. I am very happy with their form and function. In
particular, the two-layer abstraction works extremely well. The
classes show best practice in both their design and in efficiency and
numerical stability of the algorithms used. So where to go from here?</P>
<P>The basic tenet of Bayesian theory is that the Likelihood
function completely captures all that is known to update the state.
The Bayesian Filtering Classes now allow systems to be modelled using
Likelihood functions. At present the SIR filter is the only
Likelihood filter. Sample filters will grow into a separate branch in
the class hierarchy. A general set of adapters will be required to
move between these varied representations.</P>
<P>To further improve the abstraction, the method of internal
variable exposure needs to be regularized. This will require the
addition of a few classes that hold, and limit access to, the
filters' internal variables.</P>
<P>The ordering of parameters used in Scheme class constructors is
prone to error, as the parameter list varies for each scheme used. An
extensible method of model parameterization is required. A common
parameterization could then be used, with each scheme constructor
extracting the information it requires.</P>
<H2>STATE ABSTRACTION</H2>
<P>Can the state representation be abstracted away from the numerics
of a filter Scheme?</P>
<P>At the moment Bayes++ filter Schemes are a combination of the
numerical solution (e.g. innovation update) AND the representation
(e.g. Covariance matrix). Can these two functions be separated?</P>
<P>A highly sophisticated solution could use polymorphic (or generic)
filter algorithms that depend on the type (or traits) of the
representation. I think this is unnecessarily complex. In Bayes++ a
Scheme should only implement one algorithm.</P>
<P>The problem I would like to address is a bit more limited. A lot
of representations are naturally hierarchical and dynamic.<BR>e.g.
state := vehicle states & feature states, feature states :=
{feature1 state & feature2 state ....}</P>
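<P>The augmented-state layout in the example can be illustrated directly
(hypothetical numbers and helper, not Bayes++ code): the combined state is
the vehicle block followed by one block per feature, concatenated into a
single vector.</P>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build state := vehicle states followed by each feature's states.
std::vector<double> augment(const std::vector<double>& vehicle,
                            const std::vector<std::vector<double> >& features) {
    std::vector<double> state(vehicle);
    for (std::size_t i = 0; i < features.size(); ++i)
        state.insert(state.end(), features[i].begin(), features[i].end());
    return state;
}
```

Note that building the flat vector copies every block; a graph-like sparse
representation, as suggested below, would avoid exactly this copy overhead.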
<P>There would seem to be two possible ways to solve this:</P>
<P>A) HARD: Use a state representation that represents this kind of
hierarchy. The filter schemes could therefore use hierarchical
numerical solutions that exploit the properties of the hierarchical
state.</P>
<P>B) EASY: Allow a sparse matrix representation of state. If the
sparse representation is a Graph then the sort of augmented state
representation in the example can be easily built without any copy
overhead. Each scheme would then just use sparse matrix techniques in
its numerical solution. Mostly what is needed is Cholesky and QR
decomposition and these have good sparse solutions. Obviously there
would be a runtime overhead for this, but it would be great for
Simultaneous Localisation and Mapping (SLAM) problems!</P>
<H1 ALIGN=LEFT>Copyright</H1>
<P ALIGN=LEFT>Copyright (c) 2003,2004,2005 Michael Stevens. This document is
part of the Bayes++ library - see Bayes++.html for copyright license
details.</P>
<H1><A NAME="References"></A>References</H1>
<P>[1] "A New Approach for Filtering Nonlinear Systems", S. J.
Julier, J. K. Uhlmann, H. F. Durrant-Whyte, Proc. American Control
Conference 1995</P>
<P>[2] "Factorisation Methods for Discrete Sequential
Estimation", Gerald J. Bierman, ISBN 0-12-097350-2</P>
<P>[3] "Real time Kalman Filter Application", Mohinder S.
Grewal, Angus P. Andrews, ISBN 0-13-211335-X</P>
<P>[4] "Extension of Square-Root Filtering to Include Process
Noise", P. Dyer and S. McReynolds, Journal of Optimization
Theory and Applications, Vol.3 No.6 1969</P>
<P>[5] "Novel approach to nonlinear/non-Gaussian Bayesian state
estimation", N. J. Gordon, D. J. Salmond, A. F. M. Smith, IEE
Proceedings-F, Vol.140 No.2 April 1993</P>
<P>[6] "Tracking and Data Association", Y Bar-Shalom and T
Fortmann, Academic Press, ISBN 0-12-079760-7</P>
<P>[7] "Stochastic Models, Estimation, and Control", Peter
S Maybeck, Academic Press, ISBN 0-12-480701-1</P>
<P>[8] "A non-divergent Estimation Algorithm in the Presence of
Unknown Correlations", S. J. Julier, J. K. Uhlmann, Proc. American
Control Conference 6/1997
</P>
<P>[9] "Exception-Safety in Generic Components", David
Abrahams, Proc. of a Dagstuhl Seminar 'Generic Programming', Lecture
Notes on Computer Science 1766,
<A HREF="http://www.boost.org/more/error_handling.html" TARGET="_top">http://www.boost.org/more/error_handling.html</A></P>
</BODY>
</HTML>