Phylogenetic disorder of genetic system

Slide 2

Phylogenetic disorder of genetic system
Genes with common profiles of presence and absence in disparate genomes tend to function in the same pathway. By mapping all human genes into about 1000 clusters of genes with similar patterns of conservation across eukaryotic phylogeny, we determined that sets of genes associated with particular diseases have similar phylogenetic profiles.
Slide 3

The hundreds of eukaryotic genomes now sequenced allow the tracking of the evolution of human genes, and the analysis of patterns of their conservation across eukaryotic clades. Phylogenetic profiling describes the relative sequence conservation or divergence of orthologous proteins across a set of reference genomes.
Slide 4

Different classes of functional gene groups have distinct coevolution patterns
The TCA cycle is an extreme example of a well‐studied, highly annotated molecular pathway that overlaps significantly with the phylogenetic profile classification of human genes. To systematically query the overlap between our phylogenetic profiling of human genes and many other analyses of human molecular pathways,
Slide 5

Systematic identification of genes that coevolve with known pathways and diseases
In the mapping of genes classified by HPO groups or by MSigDB groups to phylogenetic clusters, we noted that some of the same genes correlated with distinct diseases and distinct molecular signature gene groups. For example, a set of 4–6 nuclear‐encoded mitochondrial proteins constitutes the overlap with MSigDB groups such as KEGG oxidative phosphorylation and HPO terms such as abnormal cerebrospinal fluid,
Slide 6

Many molecular pathways map to the same phylogenetic clusters as genes associated with specific human diseases.
Focusing on proteins coevolved with the microphthalmia‐associated transcription factor (MITF), we identified the Notch pathway suppressor of hairless (RBP‐Jk/SuH) transcription factor, and showed that RBP‐Jk functions as an MITF cofactor.
Slide 7

Phylogenetic profiling identifies a new MITF‐associated factor
While phylogenetic profiling could be used to seek the particular diseases with the strongest phylogenetic profile overlap, we could also ask whether particular known disease components have phylogenetic profiles similar to those of any other genes. Proteins with the same profile are much more likely to act in the same pathway. As an example, we used phylogenetic profiling to investigate the role of MITF, the master regulator
Slide 8

By analyzing the conservation of human proteins across 87 species, we sorted proteins into clusters of coevolution. Some clusters are enriched for genes assigned to particular human diseases or molecular pathways; the other genes in the same cluster may function in related pathways and diseases.
Slide 9

Phylogenetic profile analysis of gene sets with similar disease phenotypes
Phylogenetic profile analysis has previously been a powerful tool for the study of human Bardet‐Biedl syndrome and mitochondrial diseases. Just as phylogenetic profiling could detect significant overlap with about 20% of the molecular signatures gene groups, we sought to detect a similar fraction of the smaller set of genes annotated at present to be variant in human genetic diseases. Even though only a subset of human disease loci have been identified at this intermediate stage in human genetic analysis,
Slide 10

Phylogenetic profiling identifies ccdc105 as a meiosis‐specific chromatin localization gene
Proteins that constitute components of specialized multiprotein complexes are also expected to have similar phylogenetic profiles. As a test for the use of phylogenetic profiles to generate candidate components of such protein complexes, we analyzed proteins of the synaptonemal complex.
Slide 11

Many genes that were thought to map to different diseases have actually coevolved and map to the same phylogenetic clusters.
Slide 12

Materials and methods: Species database generation
Protein‐coding sequences for human genes were downloaded using BioMart version 0.7 from the Ensembl project (release 60). Ensembl includes both automatic annotation, in which transcripts are determined and annotated genome‐wide by automated bioinformatic methods, and manual curation.
Slide 13

Calculation of the list of most correlated genes
Pearson correlation coefficient (R) was calculated using the NPP matrix to generate a correlation matrix. High correlation can result from coevolution or be a by‐product of homology between gene sequences; in the latter case it reflects only that the genes are paralogous. To remove phylogenetic profile correlation scores that resulted from homology between the sequences of two human genes Gi and Gj, we assigned
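The correlation step described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `profile_correlation`, the toy NPP values, and the boolean `homolog_mask` are all hypothetical. The slide text breaks off before stating what value is assigned to homologous pairs, so the sketch marks them as NaN purely as a placeholder.

```python
import numpy as np

def profile_correlation(npp, homolog_mask=None):
    """Pearson R between every pair of gene rows of an NPP matrix.

    npp: (n_genes, n_species) array of profile values.
    homolog_mask: optional (n_genes, n_genes) boolean array, True where two
        genes are sequence homologs (how homology is decided is not shown).
    """
    r = np.corrcoef(npp)           # each row (gene) is treated as a variable
    np.fill_diagonal(r, np.nan)    # a gene is not compared with itself
    if homolog_mask is not None:
        r[homolog_mask] = np.nan   # placeholder only, NOT the authors' value
    return r

# Toy data: 4 genes x 5 species (made-up numbers).
npp = np.array([
    [1.0, 0.9, 0.1, 0.0, 0.8],
    [0.9, 1.0, 0.2, 0.1, 0.9],
    [0.1, 0.0, 1.0, 0.9, 0.2],
    [1.0, 0.9, 0.1, 0.0, 0.8],     # same profile as gene 0, e.g. a paralog
])
homologs = np.zeros((4, 4), dtype=bool)
homologs[0, 3] = homologs[3, 0] = True

r = profile_correlation(npp, homologs)
```

Masking rather than deleting the homologous pairs keeps the matrix square, so gene indices stay aligned for any later neighbor search.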
Slide 14

Calculation of Co10 scores
To test whether sets of functionally annotated genes are significantly coevolved, we calculated a Coevolution (Co10) score. We determined for each gene the 10 non‐homologous genes (the 10 nearest neighbors) that are most phylogenetically correlated with it (List10; see Materials and methods). We also tested 20, 50, and 100 nearest neighbors, and this analysis yielded similar results (data not shown).
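Building List10 for each gene might look like the sketch below. It covers only the nearest-neighbor step; the Co10 score itself is not defined on this slide and is not attempted. The function name `nearest_neighbors`, the random toy profiles, and the convention that NaN marks self pairs and homologous pairs are assumptions made for illustration.

```python
import numpy as np

def nearest_neighbors(r, k=10):
    """For each gene (row of r), indices of its k most correlated genes.

    r: (n, n) correlation matrix; NaN marks pairs to skip (self pairs and,
       by assumption, homologous pairs).
    Returns a list of index arrays, best neighbor first.
    """
    scores = np.where(np.isnan(r), -np.inf, r)
    order = np.argsort(-scores, axis=1, kind="stable")  # descending by R
    neighbors = []
    for i, row in enumerate(order):
        valid = row[np.isfinite(scores[i, row])]        # drop skipped pairs
        neighbors.append(valid[:k])
    return neighbors

rng = np.random.default_rng(0)
npp = rng.random((30, 12))        # 30 hypothetical genes x 12 species
r = np.corrcoef(npp)
np.fill_diagonal(r, np.nan)
r[0, 1] = r[1, 0] = np.nan        # pretend genes 0 and 1 are homologs

list10 = nearest_neighbors(r, k=10)
```

Changing `k` to 20, 50, or 100 gives the alternative neighborhood sizes mentioned on the slide.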
Slide 15

Generation of binary phylogenetic profile and NPP with different organism sets
To test for the effect of different numbers of species on the performance of phylogenetic profiling, we resampled our data using 75, 50, or 25% of our original species list. To keep similar phylogenetic representation of the organisms that were used, we chose organisms from the entire eukaryotic tree.
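One way to resample species while still drawing from the whole tree is to sample within each clade. The slide does not say how the organisms were chosen, so the per-clade stratification, the clade labels, and the function name `subsample_species` below are assumptions for illustration only.

```python
import numpy as np

def subsample_species(clades, fraction, rng):
    """Pick roughly `fraction` of the species, drawing from every clade.

    clades: sequence of clade labels, one per species column.
    Returns sorted column indices to keep.
    """
    clades = np.asarray(clades)
    keep = []
    for clade in np.unique(clades):
        idx = np.flatnonzero(clades == clade)
        n = max(1, round(fraction * len(idx)))   # at least one per clade
        keep.extend(rng.choice(idx, size=n, replace=False))
    return np.sort(np.array(keep))

# Hypothetical species list: 24 columns grouped into four clades.
clades = ["animals"] * 8 + ["fungi"] * 8 + ["plants"] * 4 + ["protists"] * 4
rng = np.random.default_rng(0)
npp = rng.random((5, len(clades)))               # 5 toy genes

subsets = {f: subsample_species(clades, f, rng) for f in (0.75, 0.50, 0.25)}
resampled = {f: npp[:, cols] for f, cols in subsets.items()}
```

Each entry of `resampled` is a reduced profile matrix on which the correlation and clustering steps could be rerun and compared with the full species set.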
Slide 16

Generation of coevolved gene clusters
For each protein A, we ranked the 50 genes most correlated with it, using Pearson's correlation coefficient (R) on the NPP matrix. The protein most correlated with A received a rank score of 50, and the following proteins received scores of 49, 48, …, down to 1 for the 50th protein. All other genes received a rank score of zero. Since the rankings are asymmetric (i.e., the rank of A to B is not necessarily identical to the rank of B to A), a ranking score between two genes (rankscoreAB) was calculated.
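The rank-score assignment can be sketched as below. The function name `rank_scores` and the random toy profiles are hypothetical. The sketch builds only the asymmetric scores; the slide does not give the formula that combines the two directions into rankscoreAB, so that step is left out.

```python
import numpy as np

def rank_scores(r, top=50):
    """Asymmetric rank-score matrix: S[a, b] is the score gene a gives gene b.

    The gene most correlated with a scores `top`, the next `top - 1`, down
    to 1 for the top-th gene; every other gene scores 0.
    """
    n = r.shape[0]
    scores = np.zeros((n, n), dtype=int)
    masked = r.astype(float).copy()
    np.fill_diagonal(masked, -np.inf)        # never rank a gene against itself
    order = np.argsort(-masked, axis=1, kind="stable")
    m = min(top, n - 1)
    for a in range(n):
        scores[a, order[a, :m]] = np.arange(top, top - m, -1)
    return scores

rng = np.random.default_rng(1)
npp = rng.random((200, 20))                  # 200 hypothetical genes
r = np.corrcoef(npp)
s = rank_scores(r, top=50)
```

Comparing `s` with `s.T` shows where the score A gives B differs from the score B gives A, which is the asymmetry the combined ranking score is meant to resolve.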
Slide 17

High load can lead to a small population size, which in turn increases the accumulation of mutation load, culminating in extinction via mutational meltdown.
Slide 18

MSigDB and HPO database
The Molecular Signatures Database (MSigDB v3.0) contains 6800 gene sets collected from various sources such as online pathway databases (KEGG, BioCarta), Gene Ontology (GO groups), publications in PubMed, and genes that share cis‐regulatory motifs or are coexpressed. We used the 6594 sets with fewer than 500 genes.
Slide 19

Plasmids
pcDNA3‐MITF and PGL4.11‐TRPM1 promoter luciferase were described in previous publications. pSG5‐RBP‐Jk was kindly provided by Dr E Manet (INSERM U758, Unité de Virologie humaine, Lyon, France).