Doctor of Philosophy
School of Biological Sciences and School of Mathematics and Applied Statistics
Caldwell, Rachel Amber, Investigation of the length distributions of coding and noncoding sequences in relation to gene architecture, function, and expression, Doctor of Philosophy thesis, School of Biological Sciences and School of Mathematics and Applied Statistics, University of Wollongong, 2015. http://ro.uow.edu.au/theses/4645
The last 20 years have seen the birth of bioinformatics, defined as the combination of mathematics, biology, and computational approaches. This discipline has ushered in an era of ontologies, extensive databases (covering sequences, structures, expression profiles, and genomes), and database cross-referencing (Ouzounis, 2012). Before this discipline emerged, scientists consulted atlas books, such as Margaret Dayhoff’s protein sequence collection (Strasser, 2010), which required long hours of letter counting. Through the development of sequencing technology over the past forty years, a tremendous amount of genomic sequence data has been collected. As the volume of such data surges, so do the challenges of data organisation, accessibility, and interpretation, with interpretation being the most challenging (Ouzounis, 2012).