My research can generally be summarised as the study of functionally important
genetic variation. To date, I have focused mainly on tandem repeat length
variation within the protein-coding regions and UTRs of genes, using the
redundant sequence database UniGene and the Tandem Repeats Finder (TRF)
algorithm. An analysis of human, fruit fly, nematode, cow and bacterial genomes
using the EST/mRNA data in UniGene has been carried out, incorporating Gene
Ontology (GO) annotation and statistical methods such as Fisher’s exact test,
the Mann-Whitney test and logistic regression. A script used as part of this
analysis is available as an online application, VNTRfinder (submitted), which
facilitates the identification of variable tandem repeats across multiple
sequences or genomes. Studies on repeat variation in Whole-Genome Shotgun (WGS)
sequences are also underway.
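
The idea of spotting repeat length variation among redundant sequences can be
illustrated with a small Python sketch. It is not the TRF algorithm and not the
VNTRfinder implementation. It assumes the repeat motifs are already known, and
the function names, the min_copies threshold and the toy cluster of sequences
are all hypothetical, chosen only for illustration.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


def variable_repeats(cluster, motifs, min_copies=3):
    """Report motifs whose copy number differs between sequences of a cluster.

    cluster maps a sequence name to its sequence (hypothetical input format).
    Runs shorter than min_copies are ignored when deciding on variability.
    """
    result = {}
    for motif in motifs:
        copies = {name: longest_repeat_run(seq, motif)
                  for name, seq in cluster.items()}
        observed = {c for c in copies.values() if c >= min_copies}
        if len(observed) > 1:  # at least two distinct repeat lengths seen
            result[motif] = copies
    return result


# Toy cluster: three made-up transcripts of the same gene (illustrative data).
cluster = {
    "est_1": "ATGGCC" + "CAG" * 5 + "TTGACC",
    "est_2": "ATGGCC" + "CAG" * 7 + "TTGACC",
    "est_3": "ATGGCC" + "CAG" * 5 + "TTGACC",
}
variable = variable_repeats(cluster, ["CAG", "GT"])
print(variable)
```

In this toy cluster only the CAG repeat is reported, because its copy number
differs between sequences. A real analysis would also need to handle imperfect
repeats and sequencing errors, which this sketch ignores.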
An additional analysis of potentially polymorphic tandem repeat loci has been
carried out using TRF and the literature-derived rules described by Wren et
al. [1]. Genotyping studies are underway to assess whether polymorphisms
within this set are associated with disease. I am also developing my own
polymorphism prediction algorithms using genotyping databases and data from my
analyses of UniGene and WGS sequences.
Other analyses include studies on functional constraint in Multi-species
Conserved Sequences using SNP data and the identification of frameshifting
sequence variants.
My research makes use of the following tools:
DATASETS
UniGene
TissueInfo
PROGRAMS
Tandem Repeats Finder
e-PCR
See my links page for more of the programs and datasets I use and, if bored, a
brief history of me.
[1] Wren, J. D.,