E-book: Building Bioinformatics Solutions, 2nd revised edition [Oxford Scholarship Online E-books]

Conrad Bessant (Professor of Bioinformatics, The School of Biological and Chemical Sciences, Queen Mary, University of London), Darren Oakley (Software Developer, Nature Publishing Group), Ian Shadforth (Director of Integrated Health, Alere Inc.)
  • Format: 366 pages
  • Publication date: 16-Jan-2014
  • Publisher: Oxford University Press
  • ISBN-13: 9780199658558
  • Oxford Scholarship Online E-books
  • Price: unknown
Bioinformatics encompasses a broad and ever-changing range of activities involved with the management and analysis of data from molecular biology experiments. Despite the diversity of activities and applications, the basic methodology and core tools needed to tackle bioinformatics problems are common to many projects. This unique book provides an invaluable introduction to three of the main tools used in the development of bioinformatics software (Perl, R and MySQL) and explains how these can be used together to tackle the complex data-driven challenges that typify modern biology. These industry-standard open-source tools form the core of many bioinformatics projects, both in academia and industry. The methodologies introduced are platform-independent, and all the examples featured have been tested on Windows, Linux and Mac OS.

Building Bioinformatics Solutions is suitable for graduate students and researchers in the life sciences who wish to automate analyses or create their own databases and web-based tools. No prior knowledge of software development is assumed. Having worked through the book, the reader should have the necessary core skills to develop computational solutions for their specific research programmes. The book will also help the reader overcome the inertia associated with penetrating this field, and provide them with the confidence and understanding required to go on to develop more advanced bioinformatics skills.
1 Introduction
1.1 From data to knowledge: the aim of bioinformatics
1.2 Using this book
1.2.1 About the coverage of this book
1.2.2 Choice of tools
1.2.3 Choice of operating system
1.2.4 www.bixsolutions.net
1.3 Principal applications of bioinformatics
1.3.1 Sequence analysis
1.3.2 Transcriptomics
1.3.3 Proteomics
1.3.4 Metabolomics
1.3.5 Systems biology
1.3.6 Literature mining
1.3.7 Structural biology
1.4 Building bioinformatics solutions
1.5 Publicly available bioinformatics resources
1.5.1 Publicly available data
1.5.2 Publicly available analysis tools
1.5.3 Publicly available workflow solutions
1.6 Some computing practicalities
1.6.1 Hardware requirements
1.6.2 The command line
1.6.3 Case sensitivity
1.6.4 Security, firewalls, and administration rights
References
2 Building biological databases with SQL
2.1 Common database types
2.1.1 Flat text files
2.1.2 XML
2.1.3 Relational databases
2.2 Relational database design: the 'natural' approach
2.2.1 Steps 1-3: gather, group, and name the data
2.2.2 Step 4: data types
2.2.3 Step 5: atomicity of data
2.2.4 Steps 6 and 7: indexing and linking tables
2.2.5 Departure from design
2.3 Installing and configuring a MySQL server
2.3.1 Download and installation
2.3.2 Creating a database and a user account
2.4 Alternatives to MySQL
2.4.1 PostgreSQL
2.4.2 Oracle
2.4.3 MariaDB
2.4.4 Microsoft Access
2.4.5 Big Data and NoSQL databases
2.5 Database access using SQL
2.5.1 Compatibility between RDBMSs
2.5.2 Error messages
2.5.3 Creating a database
2.5.4 Creating tables and enforcing referential integrity
2.5.5 Populating the database
2.5.6 Removing data and tables from the database
2.5.7 Creating and using source files
2.5.8 Querying the database
2.5.9 Transaction handling
2.5.10 Copying, moving, and backing up a database
2.6 MySQL Workbench: an alternative to the command line
2.7 Summary
References
3 Beginning programming in Perl
3.1 Downloading and installing Perl
3.1.1 Older versions of Perl on Mac OS
3.1.2 Older versions of Perl on Linux
3.1.3 Installing Perl on Windows
3.1.4 Compilers and other developer tools
3.1.5 Before getting started
3.2 Basic Perl syntax and logic
3.2.1 Scalar variables
3.2.2 Arrays
3.2.3 Hashes
3.2.4 Control structures and logic operators
3.2.5 Writing interactive programs: I/O basics
3.2.6 Some good coding practice
3.2.7 Summary
3.3 References
3.3.1 Multidimensional arrays
3.3.2 Multidimensional hashes
3.3.3 Viewing data structures with Data::Dumper
3.4 Subroutines and modules
3.4.1 Making a Perl module
3.5 Regular expressions
3.5.1 Defining regular expressions
3.5.2 More advanced regular expressions
3.5.3 Regular expressions in practice
3.6 File handling and directory operations
3.6.1 Reading text files
3.6.2 Writing text files
3.6.3 Directory operations
3.7 Error handling
3.8 Retrieving files from the Internet
3.8.1 Utilizing NCBI's eUtilities
3.9 Accessing relational databases using Perl DBI
3.9.1 Installing DBD::MySQL
3.9.2 Connecting to a database
3.9.3 Querying the database
3.9.4 Populating the database
3.9.5 Database transactions and error handling
3.10 Harnessing existing tools
3.10.1 CPAN
3.10.2 BioPerl
3.10.3 System commands
3.11 Object-oriented programming
3.11.1 Object-oriented programming in Perl using Moose
3.12 Summary
References
4 Analysis and visualisation of data using R
4.1 Introduction to R
4.1.1 Downloading and installing R
4.1.2 Basic R concepts and syntax
4.1.3 Vectors and data frames
4.1.4 The nature of experimental data
4.1.5 R modes, objects, lists, classes, and methods
4.1.6 Importing data into R
4.1.7 Data visualization in R
4.1.8 Writing programs in R
4.1.9 Some essential R functions
4.1.10 The RStudio integrated development environment
4.2 Multivariate data analysis
4.2.1 Exploratory data analysis
4.2.2 Scatter plots
4.2.3 Principal components analysis
4.2.4 Hierarchical cluster analysis
4.2.5 Pattern recognition
4.3 R packages
4.3.1 Installing and using Bioconductor packages
4.3.2 The RMySQL package for database connectivity
4.3.3 Packages for multivariate classification
4.3.4 Writing your own R packages
4.4 Integrating Perl and R
4.5 Alternatives to R
4.5.1 S+
4.5.2 Matlab
4.5.3 Octave
4.6 Summary
References
5 Developing web resources
5.1 Web servers
5.2 Introduction to HTML
5.2.1 Creating and editing HTML documents
5.2.2 The structure of a web page
5.2.3 HTML tags and general formatting
5.2.4 An example web page
5.2.5 Web standards and browser compatibility
5.3 Programming for the web using Perl
5.3.1 Mojolicious::Lite
5.3.2 Debugging Mojolicious applications
5.3.3 Routes
5.3.4 Interfacing with databases within a web application
5.3.5 Getting user input via forms
5.3.6 Deploying a Mojolicious application
5.3.7 Going further with Mojolicious
5.4 Advanced web techniques and languages
5.4.1 Cascading stylesheets
5.4.2 JavaScript, JavaScript libraries, and Ajax
5.5 Data visualization on the web
5.5.1 Using R graphics in Perl
5.5.2 Plotting graphs with Chart::Clicker
5.5.3 Plotting graphs with SVG::TT::Graph
5.5.4 Primitive graphics with Perl
5.5.5 Drawing graphs and graphics using JavaScript
5.6 Summary
References
6 Software engineering for bioinformatics
6.1 Unit testing
6.1.1 Unit testing in practice
6.2 Version control
6.2.1 The basics of version control
6.2.2 Centralized versus distributed version control
6.2.3 Git
6.2.4 Alternatives to Git
6.2.5 Hosting and sharing your code on the Internet
6.2.6 Running your own code repository
6.3 Creating useful documentation
6.3.1 Documenting command-line applications
6.3.2 Documenting Perl code
6.4 User-centred software design
6.5 Alternatives to Perl
6.5.1 Python
6.5.2 Ruby
6.5.3 Java
6.5.4 Using Galaxy
6.6 Summary
References
Appendix A: Using command-line interfaces
A.1 Getting to the operating system command line
A.2 General command-line concepts
A.3 Command-line tips
Appendix B: Getting started with Apache HTTP Server
B.1 Installing Apache
B.2 Apache fundamentals
Appendix C: Setting up a Linux virtual machine in Windows
C.1 Installing VirtualBox and configuring a virtual machine
C.2 Using the VM
C.3 Other uses of virtual machines
Index
Conrad Bessant is Professor of Bioinformatics at Queen Mary, University of London. He is active in both teaching and research, and has been involved in a number of software development projects in the areas of proteomics and metabolomics.

Darren Oakley is a Software Developer at Nature Publishing Group (NPG). During his time at NPG he has been involved in numerous projects to improve the metadata associated with NPG articles. His current role is lead developer on NPG's next-generation online publishing platform.

Ian Shadforth is Director of Integrated Health and Bioinformatics at Alere Inc., a global medical devices and health management company. In this role Ian leads new concept development across technology, analytics and web to better enable individuals to achieve their health goals.