DR GREG ELGAR, Reader in Functional Genomics

(e-mail: G.Elgar@qmul.ac.uk)

Comparative and Functional Genomics in Vertebrates

My research covers a wide spectrum of genomic studies, combining lab-based in vivo functional assays, molecular biology and bioinformatics analyses. In my lab we recently identified a fascinating set of highly conserved non-coding sequences that are found in all bony vertebrates, from fish to humans. These sequences are closely associated with key genes that orchestrate early vertebrate development, and we have developed an assay in zebrafish embryos that allows us to test these Conserved Non-coding Elements (CNEs) for enhancer activity in vivo. Current areas of focus include:

a) Analysis of the function of CNEs associated with specific genes, including a number that are associated with genetic disease. In addition to assaying 'wild type' sequences, we will also be examining the effect of specific mutations on the function of CNEs. We are developing new methods to allow the assaying of combinations of CNEs, as well as whole genomic regions containing multiple CNEs, using BAC 'recombineering'.

b) Development of a relational database (CONDOR) which will store all sequence and functional data for the vertebrate CNEs and their associated genes. The database can be conceptually divided into two sections. The first stores data from comparative multiple alignments and other bioinformatics data associated with them, and the second stores functional (lab-based) data. The schema is generalised and intuitive enough to allow other groups carrying out similar research to easily add their functional results on any of the CNEs to the database. A web-based front end to the database will allow users to retrieve data based on input criteria or via a clickable graphical interface.
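The two-section layout described above can be illustrated with a minimal sketch. This is not the CONDOR schema itself: every table name, column name and example value below is hypothetical, and SQLite is used only because it ships with Python. It shows one way that comparative data and functional data could hang off a shared CNE record, so that a contributing group only needs to add rows to the functional table.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions,
# not the actual CONDOR design.
SCHEMA = """
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    symbol  TEXT NOT NULL UNIQUE
);
CREATE TABLE cne (
    cne_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE,
    gene_id INTEGER NOT NULL REFERENCES gene(gene_id)
);
-- Section 1: comparative data (one row per species in a multiple alignment)
CREATE TABLE alignment_entry (
    entry_id         INTEGER PRIMARY KEY,
    cne_id           INTEGER NOT NULL REFERENCES cne(cne_id),
    species          TEXT NOT NULL,
    aligned_sequence TEXT NOT NULL
);
-- Section 2: functional (lab-based) data, open to other contributing groups
CREATE TABLE functional_assay (
    assay_id          INTEGER PRIMARY KEY,
    cne_id            INTEGER NOT NULL REFERENCES cne(cne_id),
    submitting_lab    TEXT NOT NULL,
    expression_domain TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database populated with made-up placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO gene (gene_id, symbol) VALUES (1, 'GENE_A')")
    conn.execute("INSERT INTO cne (cne_id, name, gene_id) VALUES (1, 'CNE_1', 1)")
    conn.executemany(
        "INSERT INTO alignment_entry (cne_id, species, aligned_sequence) "
        "VALUES (?, ?, ?)",
        [(1, "human", "ACGT-ACGT"), (1, "fugu", "ACGTTACGT")],
    )
    conn.execute(
        "INSERT INTO functional_assay (cne_id, submitting_lab, expression_domain) "
        "VALUES (1, 'example lab', 'hindbrain')"
    )
    conn.commit()
    return conn


def assays_for_gene(conn, symbol):
    """Return (CNE name, expression domain) pairs for CNEs linked to a gene."""
    query = """
        SELECT cne.name, functional_assay.expression_domain
        FROM functional_assay
        JOIN cne  ON cne.cne_id = functional_assay.cne_id
        JOIN gene ON gene.gene_id = cne.gene_id
        WHERE gene.symbol = ?
        ORDER BY cne.name, functional_assay.assay_id
    """
    return conn.execute(query, (symbol,)).fetchall()
```

A query such as assays_for_gene(build_demo_db(), "GENE_A") is the kind of lookup a web front end could run when a user retrieves data by gene.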

c) Investigating the language and grammar of CNEs using computational methods. Our current hypothesis is that the CNEs carry out some regulatory function for genes important in development. We could therefore envisage a gene regulatory network (GRN) for vertebrate development containing these genes. A network like this can be built from different sets of data; the more independent data sources that are integrated, the higher the confidence in the network. A future plan is to integrate expression data from the enhancer assays with gene expression data from microarrays, co-citation in PubMed and possibly sequence motif composition to build interactions between regulatory proteins/genes.
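The idea that confidence grows with the number of independent data sources can be sketched very simply. The example below is an assumption-laden illustration, not a method described here: it treats each data source as a list of gene pairs, counts how many distinct sources support each pair, and keeps only pairs with a minimum level of support. The function names, source labels and gene names are all hypothetical.

```python
from collections import defaultdict


def integrate_evidence(sources):
    """Map each undirected gene pair to the set of sources that support it.

    `sources` is a dict of {source_name: iterable of (gene_a, gene_b) pairs}.
    """
    support = defaultdict(set)
    for source_name, pairs in sources.items():
        for gene_a, gene_b in pairs:
            edge = frozenset((gene_a, gene_b))
            if len(edge) == 2:  # ignore self-pairs such as (x, x)
                support[edge].add(source_name)
    return dict(support)


def confident_edges(support, min_sources=2):
    """Return sorted gene pairs backed by at least `min_sources` sources."""
    return sorted(
        tuple(sorted(edge))
        for edge, names in support.items()
        if len(names) >= min_sources
    )


# Made-up example input: three hypothetical sources and placeholder gene names.
example_sources = {
    "enhancer_assay": [("geneA", "geneB"), ("geneB", "geneC")],
    "microarray": [("geneB", "geneA")],
    "co_citation": [("geneA", "geneB"), ("geneC", "geneD")],
}
```

Counting sources is the crudest possible scoring; a real integration would probably weight each source by its reliability, which this sketch deliberately leaves out.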

d) Studying the evolution of CNEs in vertebrate genomes and their association with specific function. An unusual characteristic of the CNEs we have identified is that, despite their striking similarity between fish and mammals, they are unrecognisable in invertebrates. Although many of the genes identified in our analysis have clear homologues in invertebrate genomes, we can find no invertebrate CNE sequences. We are consequently intrigued to know when and where the CNEs first evolved and whether they are tightly associated with the emergence of the vertebrate lineage. Initial PCR data suggest that at least some of the CNEs are highly conserved in sharks. By building up phylogenetic profiles of the CNEs from a wide range of organisms (phylogenetic shadowing), we will be able to determine whether they are made up of smaller footprints of invariance, which will help in defining the language of these elements. By tracing the evolutionary origins of CNEs, we may be able to define them more clearly and, in addition, associate sequence variation within the CNEs with particular clades or lineages, thereby linking specific sequence variation with functional diversity.
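As a rough illustration of what "footprints of invariance" could mean computationally, the sketch below scans a multiple alignment for runs of columns that are identical in every sequence. It is a simplification under stated assumptions: the sequences are already aligned to equal length, gaps are written as '-', comparison is case-sensitive, and the function name and minimum run length are hypothetical choices rather than parameters taken from this work.

```python
def invariant_footprints(aligned_seqs, min_length=4):
    """Find runs of alignment columns that are identical in all sequences.

    Returns a list of (start, end, sequence) tuples using 0-based,
    end-exclusive column coordinates. Columns containing a gap never count
    as invariant.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    reference = aligned_seqs[0]
    footprints = []
    run_start = None
    # Iterate one position past the end so that a trailing run is closed.
    for col in range(length + 1):
        invariant = (
            col < length
            and reference[col] != "-"
            and all(seq[col] == reference[col] for seq in aligned_seqs)
        )
        if invariant and run_start is None:
            run_start = col
        elif not invariant and run_start is not None:
            if col - run_start >= min_length:
                footprints.append((run_start, col, reference[run_start:col]))
            run_start = None
    return footprints


# Made-up toy alignment of three sequences, for illustration only.
toy_alignment = [
    "ACGTACGTTTGCA",
    "ACGTAAGTTTGCA",
    "ACGTA-GTTTGCA",
]
```

Adding more species to the alignment can only shorten or split these runs, which is why a broad phylogenetic profile helps narrow down the core invariant segments.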

e) We also host a comprehensive web site covering all aspects of the Pufferfish (Fugu rubripes) genome, including the latest genome assembly and comprehensive annotation and search facilities (http://fugu.biology.qmul.ac.uk/).