Bayesian Networks: An Introduction

Format: Hardcover

Author: Timo Koski

ISBN-10: 0470743042

ISBN-13: 9780470743041

Category: Neural Networks

Bayesian Networks: An Introduction provides a self-contained introduction to the theory and applications of Bayesian networks, a topic of interest and importance for statisticians, computer scientists and those involved in modelling complex data sets. The material has been extensively tested in classroom teaching and assumes a basic knowledge of probability, statistics and mathematics. All notions are carefully explained, with exercises throughout.

Features include:

- An introduction to the Dirichlet distribution, exponential families and their applications.
- A detailed description of learning algorithms and conditional Gaussian distributions using junction tree methods.
- A discussion of Pearl's intervention calculus, with an introduction to the notions of 'see' and 'do' conditioning (illustrated in the sketch below).

All concepts are clearly defined and illustrated with examples and exercises, and solutions are provided online. The book will prove a valuable resource for postgraduate students of statistics, computer engineering, mathematics, data mining, artificial intelligence and biology. Researchers and users of comparable modelling or statistical techniques such as neural networks will also find it of interest.
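To make the 'see'/'do' distinction concrete, here is a minimal sketch (my own illustration, not code from the book) of conditioning by observation versus conditioning by intervention on a three-node network Z -> X, Z -> Y, X -> Y, in which Z confounds the effect of X on Y. All probability values are invented for illustration.

```python
# Tiny hand-coded Bayesian network: Z -> X, Z -> Y, X -> Y.
# All numbers below are made up for illustration.

P_Z = {0: 0.6, 1: 0.4}                      # P(Z)
P_X_given_Z = {0: {0: 0.8, 1: 0.2},         # P(X | Z), indexed [z][x]
               1: {0: 0.3, 1: 0.7}}
P_Y_given_XZ = {(0, 0): {0: 0.9, 1: 0.1},   # P(Y | X, Z), indexed [(x, z)][y]
                (0, 1): {0: 0.6, 1: 0.4},
                (1, 0): {0: 0.5, 1: 0.5},
                (1, 1): {0: 0.2, 1: 0.8}}

def p_y_see_x(y, x):
    """P(Y = y | X = x): conditioning by observation (the Bayes update)."""
    num = sum(P_Z[z] * P_X_given_Z[z][x] * P_Y_given_XZ[(x, z)][y]
              for z in P_Z)
    den = sum(P_Z[z] * P_X_given_Z[z][x] for z in P_Z)
    return num / den

def p_y_do_x(y, x):
    """P(Y = y | do(X = x)): truncated factorization; the arc Z -> X
    is cut, so Z keeps its prior distribution."""
    return sum(P_Z[z] * P_Y_given_XZ[(x, z)][y] for z in P_Z)

print(p_y_see_x(1, 1))  # observational: seeing X = 1 also updates Z
print(p_y_do_x(1, 1))   # interventional: Z is left at its prior
```

Because seeing X = 1 also updates belief about the confounder Z while do(X = 1) does not, the two answers differ (0.71 versus 0.62 with the numbers above), which is the basic point of the book's intervention-calculus material.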

Contents

Preface
1 Graphical models and probabilistic reasoning
  1.1 Introduction
  1.2 Axioms of probability and basic notations
  1.3 The Bayes update of probability
  1.4 Inductive learning
    1.4.1 Bayes' rule
    1.4.2 Jeffrey's rule
    1.4.3 Pearl's method of virtual evidence
  1.5 Interpretations of probability and Bayesian networks
  1.6 Learning as inference about parameters
  1.7 Bayesian statistical inference
  1.8 Tossing a thumb-tack
  1.9 Multinomial sampling and the Dirichlet integral
  Notes
  Exercises: Probabilistic theories of causality, Bayes' rule, multinomial sampling and the Dirichlet density
2 Conditional independence, graphs and d-separation
  2.1 Joint probabilities
  2.2 Conditional independence
  2.3 Directed acyclic graphs and d-separation
    2.3.1 Graphs
    2.3.2 Directed acyclic graphs and probability distributions
  2.4 The Bayes ball
    2.4.1 Illustrations
  2.5 Potentials
  2.6 Bayesian networks
  2.7 Object-oriented Bayesian networks
  2.8 d-Separation and conditional independence
  2.9 Markov models and Bayesian networks
  2.10 I-maps and Markov equivalence
    2.10.1 The trek and a distribution without a faithful graph
  Notes
  Exercises: Conditional independence and d-separation
3 Evidence, sufficiency and Monte Carlo methods
  3.1 Hard evidence
  3.2 Soft evidence and virtual evidence
    3.2.1 Jeffrey's rule
    3.2.2 Pearl's method of virtual evidence
  3.3 Queries in probabilistic inference
    3.3.1 The chest clinic problem
  3.4 Bucket elimination
  3.5 Bayesian sufficient statistics and prediction sufficiency
    3.5.1 Bayesian sufficient statistics
    3.5.2 Prediction sufficiency
    3.5.3 Prediction sufficiency for a Bayesian network
  3.6 Time variables
  3.7 A brief introduction to Markov chain Monte Carlo methods
    3.7.1 Simulating a Markov chain
    3.7.2 Irreducibility, aperiodicity and time reversibility
    3.7.3 The Metropolis-Hastings algorithm
    3.7.4 The one-dimensional discrete Metropolis algorithm
  Notes
  Exercises: Evidence, sufficiency and Monte Carlo methods
4 Decomposable graphs and chain graphs
  4.1 Definitions and notations
  4.2 Decomposable graphs and triangulation of graphs
  4.3 Junction trees
  4.4 Markov equivalence
  4.5 Markov equivalence, the essential graph and chain graphs
  Notes
  Exercises: Decomposable graphs and chain graphs
5 Learning the conditional probability potentials
  5.1 Initial illustration: maximum likelihood estimate for a fork connection
  5.2 The maximum likelihood estimator for multinomial sampling
  5.3 MLE for the parameters in a DAG: the general setting
  5.4 Updating, missing data, fractional updating
  Notes
  Exercises: Learning the conditional probability potentials
6 Learning the graph structure
  6.1 Assigning a probability distribution to the graph structure
  6.2 Markov equivalence and consistency
    6.2.1 Establishing the DAG isomorphic property
  6.3 Reducing the size of the search
    6.3.1 The Chow-Liu tree
    6.3.2 The Chow-Liu tree: a predictive approach
    6.3.3 The K2 structural learning algorithm
    6.3.4 The MMHC algorithm
  6.4 Monte Carlo methods for locating the graph structure
  6.5 Women in mathematics
  Notes
  Exercises: Learning the graph structure
7 Parameters and sensitivity
  7.1 Changing parameters in a network
  7.2 Measures of divergence between probability distributions
  7.3 The Chan-Darwiche distance measure
    7.3.1 Comparison with the Kullback-Leibler divergence and Euclidean distance
    7.3.2 Global bounds for queries
    7.3.3 Applications to updating
  7.4 Parameter changes to satisfy query constraints
    7.4.1 Binary variables
  7.5 The sensitivity of queries to parameter changes
  Notes
  Exercises: Parameters and sensitivity
8 Graphical models and exponential families
  8.1 Introduction to exponential families
  8.2 Standard examples of exponential families
  8.3 Graphical models and exponential families
  8.4 Noisy 'or' as an exponential family
  8.5 Properties of the log partition function
  8.6 Fenchel-Legendre conjugate
  8.7 Kullback-Leibler divergence
  8.8 Mean field theory
  8.9 Conditional Gaussian distributions
    8.9.1 CG potentials
    8.9.2 Some results on marginalization
    8.9.3 CG regression
  Notes
  Exercises: Graphical models and exponential families
9 Causality and intervention calculus
  9.1 Introduction
  9.2 Conditioning by observation and by intervention
  9.3 The intervention calculus for a Bayesian network
    9.3.1 Establishing the model via a controlled experiment
  9.4 Properties of intervention calculus
  9.5 Transformations of probability
  9.6 A note on the order of 'see' and 'do' conditioning
  9.7 The 'Sure Thing' principle
  9.8 Back door criterion, confounding and identifiability
  Notes
  Exercises: Causality and intervention calculus
10 The junction tree and probability updating
  10.1 Probability updating using a junction tree
  10.2 Potentials and the distributive law
    10.2.1 Marginalization and the distributive law
  10.3 Elimination and domain graphs
  10.4 Factorization along an undirected graph
  10.5 Factorizing along a junction tree
    10.5.1 Flow of messages: initial illustration
  10.6 Local computation on junction trees
  10.7 Schedules
  10.8 Local and global consistency
  10.9 Message passing for conditional Gaussian distributions
  10.10 Using a junction tree with virtual evidence and soft evidence
  Notes
  Exercises: The junction tree and probability updating
11 Factor graphs and the sum-product algorithm
  11.1 Factorization and local potentials
    11.1.1 Examples of factor graphs
  11.2 The sum-product algorithm
  11.3 Detailed illustration of the algorithm
  Notes
  Exercise: Factor graphs and the sum-product algorithm
References
Index
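As a flavour of the opening chapter, the following sketch (my own illustration, not code from the book) carries out the Bayes update in the thumb-tack experiment of Section 1.8: a Beta prior, the two-outcome special case of the Dirichlet distribution of Section 1.9, is conjugate to Bernoulli sampling, so the posterior after k 'point up' outcomes in n tosses is Beta(a + k, b + n - k). The prior parameters and toss data are made up.

```python
# Conjugate Bayes update for the thumb-tack experiment:
# Beta(a, b) prior on the probability of 'point up',
# Bernoulli likelihood, Beta(a + k, b + n - k) posterior.

a, b = 1.0, 1.0                      # uniform Beta prior (hypothetical choice)
tosses = [1, 0, 1, 1, 0, 1, 1, 1]    # made-up data: 1 = point up

k = sum(tosses)                      # number of 'point up' outcomes
n = len(tosses)                      # total number of tosses
posterior = (a + k, b + n - k)       # Beta posterior parameters
posterior_mean = (a + k) / (a + b + n)

print(f"posterior Beta{posterior}, mean {posterior_mean:.3f}")
```

With the uniform Beta(1, 1) prior and 6 'point up' outcomes in 8 tosses, the posterior is Beta(7, 3) with mean 0.7; replacing the Beta by a Dirichlet and the Bernoulli by multinomial sampling gives the update treated in Section 1.9.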