E-book: Molecular Evolution: A Statistical Approach [Oxford Scholarship Online E-books]

Ziheng Yang (RA Fisher Professor of Statistical Genetics, Department of Genetics, Evolution and Environment, University College London)

Format: 512 pages
Publication date: 29 May 2014
Publisher: Oxford University Press
ISBN-13: 9780199602605

Partner website: http://www.oxfordscholarship.com/view/10.1093/acprof:oso/9780199602605.001.0001/acprof-9780199602605

Studies of evolution at the molecular level have experienced phenomenal growth in the last few decades, due to rapid accumulation of genetic sequence data, improved computer hardware and software, and the development of sophisticated analytical methods. The flood of genomic data has generated an acute need for powerful statistical methods and efficient computational algorithms to enable their effective analysis and interpretation.

Molecular Evolution: A Statistical Approach presents and explains modern statistical methods and computational algorithms for the comparative analysis of genetic sequence data in the fields of molecular evolution, molecular phylogenetics, statistical phylogeography, and comparative genomics. Written by an expert in the field, the book emphasizes conceptual understanding rather than mathematical proofs. The text is enlivened with numerous examples of real data analysis and numerical calculations to illustrate the theory, in addition to the working problems at the end of each chapter. The coverage of maximum likelihood and Bayesian methods is, in particular, up-to-date, comprehensive, and authoritative.

This advanced textbook is aimed at graduate-level students and professional researchers (both empiricists and theoreticians) in the fields of bioinformatics and computational biology, statistical genomics, evolutionary biology, molecular systematics, and population genetics. It will also be of relevance and use to a wider audience of applied statisticians, mathematicians, and computer scientists working in computational biology.

1 Models of nucleotide substitution

(34)

1.1 Introduction

(3)

1.2 Markov models of nucleotide substitution and distance estimation

(11)

1.2.1 The JC69 model

(3)

1.2.2 The K80 model

(2)

1.2.3 HKY85, F84, TN93, etc.

(4)

1.2.4 The transition/transversion rate ratio

(2)

1.3 Variable substitution rates across sites

(2)

1.4 Maximum likelihood estimation of distance

(9)

1.4.1 The JC69 model

(4)

1.4.2 The K80 model

(1)

1.4.3 Likelihood ratio test of substitution models

(2)

1.4.4 Profile and integrated likelihood methods

(2)

1.5 Markov chains and distance estimation under general models

(6)

1.5.1 Markov chains

(1)

1.5.2 Distance under the unrestricted (UNREST) model

(2)

1.5.3 Distance under the general time-reversible model

(3)

1.6 Discussions

(1)

1.6.1 Distance estimation under different substitution models

(1)

1.6.2 Limitations of pairwise comparison

(1)

1.7 Problems

(2)

2 Models of amino acid and codon substitution

(35)

2.1 Introduction

(1)

2.2 Models of amino acid replacement

(5)

2.2.1 Empirical models

(4)

2.2.2 Mechanistic models

(1)

2.2.3 Among-site heterogeneity

(1)

2.3 Estimation of distance between two protein sequences

(2)

2.3.1 The Poisson model

(1)

2.3.2 Empirical models

(1)

2.3.3 Gamma distances

(1)

2.4 Models of codon substitution

(5)

2.4.1 The basic model

(2)

2.4.2 Variations and extensions

(3)

2.5 Estimation of dS and dN

(18)

2.5.1 Counting methods

(8)

2.5.2 Maximum likelihood method

(2)

2.5.3 Comparison of methods

(1)

2.5.4 More distances and interpretation of the dN/dS ratio

(3)

2.5.5 Estimation of dS and dN in comparative genomics

(2)

2.5.6 Distances based on the physical-site definition

(2)

2.5.7 Utility of the distance measures

(1)

2.6 Numerical calculation of the transition probability matrix

(3)

2.7 Problems

(2)

3 Phylogeny reconstruction: overview

(32)

3.1 Tree concepts

(12)

3.1.1 Terminology

(9)

3.1.2 Species trees and gene trees

(2)

3.1.3 Classification of tree reconstruction methods

(1)

3.2 Exhaustive and heuristic tree search

(6)

3.2.1 Exhaustive tree search

(1)

3.2.2 Heuristic tree search

(2)

3.2.3 Branch swapping

(2)

3.2.4 Local peaks in the tree space

(2)

3.2.5 Stochastic tree search

(1)

3.3 Distance matrix methods

(7)

3.3.1 Least-squares method

(2)

3.3.2 Minimum evolution method

(1)

3.3.3 Neighbour-joining method

(4)

3.4 Maximum parsimony

(6)

3.4.1 Brief history

(1)

3.4.2 Counting the minimum number of changes on a tree

(1)

3.4.3 Weighted parsimony and dynamic programming

(3)

3.4.4 Probabilities of ancestral states

(1)

3.4.5 Long-branch attraction

(1)

3.4.6 Assumptions of parsimony

(1)

3.5 Problems

(1)

4 Maximum likelihood methods

(51)

4.1 Introduction

(1)

4.2 Likelihood calculation on tree

(12)

4.2.1 Data, model, tree, and likelihood

(1)

4.2.2 The pruning algorithm

(4)

4.2.3 Time reversibility, the root of the tree, and the molecular clock

(1)

4.2.4 A numerical example: phylogeny of apes

(2)

4.2.5 Amino acid, codon, and RNA models

(1)

4.2.6 Missing data, sequence errors, and alignment gaps

(4)

4.3 Likelihood calculation under more complex models

(11)

4.3.1 Mixture models for variable rates among sites

(8)

4.3.2 Mixture models for pattern heterogeneity among sites

(1)

4.3.3 Partition models for combined analysis of multiple datasets

(2)

4.3.4 Nonhomogeneous and nonstationary models

(1)

4.4 Reconstruction of ancestral states

(8)

4.4.1 Overview

(2)

4.4.2 Empirical and hierarchical Bayesian reconstruction

(3)

4.4.3 Discrete morphological characters

(1)

4.4.4 Systematic biases in ancestral reconstruction

(2)

4.5 Numerical algorithms for maximum likelihood estimation

(5)

4.5.1 Univariate optimization

(2)

4.5.2 Multivariate optimization

(2)

4.6 ML optimization in phylogenetics

(6)

4.6.1 Optimization on a fixed tree

(1)

4.6.2 Multiple local peaks on the likelihood surface for a fixed tree

(1)

4.6.3 Search in the tree space

(3)

4.6.4 Approximate likelihood method

(1)

4.7 Model selection and robustness

(7)

4.7.1 Likelihood ratio test applied to rbcL dataset

(2)

4.7.2 Test of goodness of fit and parametric bootstrap

(1)

4.7.3 Diagnostic tests to detect model violations

(1)

4.7.4 Akaike information criterion (AIC and AICC)

(1)

4.7.5 Bayesian information criterion

(1)

4.7.6 Model adequacy and robustness

(1)

4.8 Problems

(2)

5 Comparison of phylogenetic methods and tests on trees

(29)

5.1 Statistical performance of tree reconstruction methods

(4)

5.1.1 Criteria

(2)

5.1.2 Performance

(1)

5.2 Likelihood

(8)

5.2.1 Contrast with conventional parameter estimation

(1)

5.2.2 Consistency

(1)

5.2.3 Efficiency

(4)

5.2.4 Robustness

(2)

5.3 Parsimony

(6)

5.3.1 Equivalence with misbehaved likelihood models

(3)

5.3.2 Equivalence with well-behaved likelihood models

(1)

5.3.3 Assumptions and justifications

(2)

5.4 Testing hypotheses concerning trees

(10)

5.4.1 Bootstrap

(5)

5.4.2 Interior-branch test

(1)

5.4.3 K-H test and related tests

(1)

5.4.4 Example: phylogeny of apes

(1)

5.4.5 Indexes used in parsimony analysis

(1)

5.5 Problems

(1)

6 Bayesian theory

(32)

6.1 Overview

(1)

6.2 The Bayesian paradigm

(14)

6.2.1 The Bayes theorem

(1)

6.2.2 The Bayes theorem in Bayesian statistics

(5)

6.2.3 Classical versus Bayesian statistics

(8)

6.3 Prior

(6)

6.3.1 Methods of prior specification

(1)

6.3.2 Conjugate priors

(1)

6.3.3 Flat or uniform priors

(1)

6.3.4 The Jeffreys priors

(2)

6.3.5 The reference priors

(1)

6.4 Methods of integration

(9)

6.4.1 Laplace approximation

(1)

6.4.2 Mid-point and trapezoid methods

(1)

6.4.3 Gaussian quadrature

(1)

6.4.4 Marginal likelihood calculation for JC69 distance estimation

(4)

6.4.5 Monte Carlo integration

(1)

6.4.6 Importance sampling

(2)

6.5 Problems

(2)

7 Bayesian computation (MCMC)

(49)

7.1 Markov chain Monte Carlo

(7)

7.1.1 Metropolis algorithm

(4)

7.1.2 Asymmetrical moves and proposal ratio

(1)

7.1.3 The transition kernel

(1)

7.1.4 Single-component Metropolis–Hastings algorithm

(1)

7.1.5 Gibbs sampler

(1)

7.2 Simple moves and their proposal ratios

(5)

7.2.1 Sliding window using the uniform proposal

(1)

7.2.2 Sliding window using the normal proposal

(1)

7.2.3 Bactrian proposal

(1)

7.2.4 Sliding window using the multivariate normal proposal

(1)

7.2.5 Proportional scaling

(1)

7.2.6 Proportional scaling with bounds

(1)

7.3 Convergence, mixing, and summary of MCMC

(18)

7.3.1 Convergence and tail behaviour

(4)

7.3.2 Mixing efficiency, jump probability, and step length

(11)

7.3.3 Validating and diagnosing MCMC algorithms

(1)

7.3.4 Potential scale reduction statistic

(1)

7.3.5 Summary of MCMC output

(1)

7.4 Advanced Monte Carlo methods

(16)

7.4.1 Parallel tempering (MC3)

(2)

7.4.2 Trans-model and trans-dimensional MCMC

(9)

7.4.3 Bayes factor and marginal likelihood

(4)

7.5 Problems

(3)

8 Bayesian phylogenetics

(45)

8.1 Overview

(3)

8.1.1 Historical background

(1)

8.1.2 A sketch MCMC algorithm

(1)

8.1.3 The statistical nature of phylogeny estimation

(2)

8.2 Models and priors in Bayesian phylogenetics

(13)

8.2.1 Priors on branch lengths

(3)

8.2.2 Priors on parameters in substitution models

(7)

8.2.3 Priors on tree topology

(3)

8.3 MCMC proposals in Bayesian phylogenetics

(16)

8.3.1 Within-tree moves

(2)

8.3.2 Cross-tree moves

(3)

8.3.3 NNI for unrooted trees

(3)

8.3.4 SPR for unrooted trees

(2)

8.3.5 TBR for unrooted trees

(2)

8.3.6 Subtree swapping

(1)

8.3.7 NNI for rooted trees

(1)

8.3.8 SPR on rooted trees

(1)

8.3.9 Node slider

(1)

8.4 Summarizing MCMC output

(1)

8.5 High posterior probabilities for trees

(10)

8.5.1 High posterior probabilities for trees or splits

(2)

8.5.2 Star tree paradox

(2)

8.5.3 Fair coin paradox, fair balance paradox, and Bayesian model selection

(5)

8.5.4 Conservative Bayesian phylogenetics

(1)

8.6 Problems

(2)

9 Coalescent theory and species trees

(53)

9.1 Overview

(1)

9.2 The coalescent model for a single species

(11)

9.2.1 The backward time machine

(1)

9.2.2 Fisher-Wright model and the neutral coalescent

(3)

9.2.3 A sample of n genes

(3)

9.2.4 Simulating the coalescent

(1)

9.2.5 Estimation of θ from a sample of DNA sequences

(4)

9.3 Population demographic process

(5)

9.3.1 Homogeneous and nonhomogeneous Poisson processes

(1)

9.3.2 Deterministic population size change

(1)

9.3.3 Nonparametric population demographic models

(2)

9.4 Multispecies coalescent, species trees and gene trees

(24)

9.4.1 Multispecies coalescent

(6)

9.4.2 Species tree–gene tree conflict

(4)

9.4.3 Estimation of species trees

(8)

9.4.4 Migration

(6)

9.5 Species delimitation

(10)

9.5.1 Species concept and species delimitation

(2)

9.5.2 Simple methods for analysing genetic data

(1)

9.5.3 Bayesian species delimitation

(3)

9.5.4 The impact of guide tree, prior, and migration

(3)

9.5.5 Pros and cons of Bayesian species delimitation

(1)

9.6 Problems

(2)

10 Molecular clock and estimation of species divergence times

(29)

10.1 Overview

(2)

10.2 Tests of the molecular clock

(3)

10.2.1 Relative-rate tests

(1)

10.2.2 Likelihood ratio test

(1)

10.2.3 Limitations of molecular clock tests

(1)

10.2.4 Index of dispersion

(1)

10.3 Likelihood estimation of divergence times

(9)

10.3.1 Global clock model

(1)

10.3.2 Local clock model

(1)

10.3.3 Heuristic rate-smoothing methods

(2)

10.3.4 Uncertainties in calibrations

(2)

10.3.5 Dating viral divergences

(1)

10.3.6 Dating primate divergences

(2)

10.4 Bayesian estimation of divergence times

(13)

10.4.1 General framework

(1)

10.4.2 Approximate calculation of likelihood

(1)

10.4.3 Prior on evolutionary rates

(1)

10.4.4 Prior on divergence times and fossil calibrations

(4)

10.4.5 Uncertainties in time estimates

(2)

10.4.6 Dating viral divergences

(1)

10.4.7 Application to primate and mammalian divergences

(3)

10.5 Perspectives

(1)

10.6 Problems

(1)

11 Neutral and adaptive protein evolution

(28)

11.1 Introduction

(1)

11.2 The neutral theory and tests of neutrality

(7)

11.2.1 The neutral and nearly neutral theories

(2)

11.2.2 Tajima's D statistic

(1)

11.2.3 Fu and Li's D, and Fay and Wu's H statistics

(1)

11.2.4 McDonald–Kreitman test and estimation of selective strength

(2)

11.2.5 Hudson–Kreitman–Aguadé test

(1)

11.3 Lineages undergoing adaptive evolution

(2)

11.3.1 Heuristic methods

(1)

11.3.2 Likelihood method

(1)

11.4 Amino acid sites undergoing adaptive evolution

(8)

11.4.1 Three strategies

(2)

11.4.2 Likelihood ratio test of positive selection under random-site models

(3)

11.4.3 Identification of sites under positive selection

(1)

11.4.4 Positive selection at the human MHC

(2)

11.5 Adaptive evolution affecting particular sites and lineages

(3)

11.5.1 Branch-site test of positive selection

(1)

11.5.2 Other similar models

(1)

11.5.3 Adaptive evolution in angiosperm phytochromes

(1)

11.6 Assumptions, limitations, and comparisons

(3)

11.6.1 Assumptions and limitations of current methods

(1)

11.6.2 Comparison of methods for detecting positive selection

(1)

11.7 Adaptively evolving genes

(2)

11.8 Problems

(2)

12 Simulating molecular evolution

(24)

12.1 Introduction

(1)

12.2 Random number generator

(2)

12.3 Generation of discrete random variables

(4)

12.3.1 Inversion method for sampling from a general discrete distribution

(1)

12.3.2 The alias method for sampling from a discrete distribution

(1)

12.3.3 Discrete uniform distribution

(1)

12.3.4 Binomial distribution

(1)

12.3.5 The multinomial distribution

(1)

12.3.6 The Poisson distribution

(1)

12.3.7 The composition method for mixture distributions

(1)

12.4 Generation of continuous random variables

(6)

12.4.1 The inversion method

(1)

12.4.2 The transformation method

(1)

12.4.3 The rejection method

(3)

12.4.4 Generation of a standard normal variate using the polar method

(2)

12.4.5 Gamma, beta, and Dirichlet variables

(1)

12.5 Simulation of Markov processes

(6)

12.5.1 Simulation of the Poisson process

(1)

12.5.2 Simulation of the nonhomogeneous Poisson process

(2)

12.5.3 Simulation of discrete-time Markov chains

(2)

12.5.4 Simulation of continuous-time Markov chains

(1)

12.6 Simulating molecular evolution

(3)

12.6.1 Simulation of sequences on a fixed tree

(3)

12.6.2 Simulation of random trees

(1)

12.7 Validation of the simulation program

(1)

12.8 Problems

(2)

Appendices

(8)

Appendix A Functions of random variables

(4)

Appendix B The delta technique

(2)

Appendix C Phylogenetic software

(2)

References

(38)

Index

Ziheng Yang is currently RA Fisher Professor of Statistical Genetics at University College London. He obtained a PhD in agronomy from Beijing Agricultural University in 1992 and then held a few postdoctoral research positions in the UK and US. He joined UCL in 1997, first as a lecturer, then as reader and professor. He teaches statistical genetics. He has published about 150 research papers and book chapters in molecular evolution, phylogenetics, population genetics, and computational biology. His program package PAML is widely used in the molecular evolution community. He was elected a Fellow of the Royal Society in 2006.

Permanent link: https://www.kriso.lt/db/9780199602605_pe.html
