FGFR paralogon phylogenomic analysis

The exact nature of the evolutionary events that created the ancient (>450 million years old) paralogy regions in the vertebrate genome is extremely difficult to trace through inter-genomic and intra-genomic map comparisons, because these regions have since undergone multiple chromosomal breakages and rearrangements that altered the karyotype and disrupted gene order on the chromosomes. A more convincing way to determine how the ancient vertebrate paralogons originated is phylogenetic analysis of multigene families (Abbasi, 2010; Abbasi and Grzeschik, 2007; Asrar et al., 2013; Hughes, 1998; Hughes et al., 2001a).


To perform a thorough phylogenetic analysis of 80 gene families with threefold or fourfold representation on the human FGFR-bearing chromosomes, the chromosomal locations of their members were obtained from the Ensembl genome browser (Hubbard et al., 2002). Scanning these locations showed that 17 of the families have members on each of the human FGFR-bearing chromosomes, while 63 have members on at least three of those chromosomes. BLASTP (Altschul et al., 1990) searches through the Ensembl genome browser, using a bidirectional best-hit strategy, were used to retrieve the closest putative orthologous sequences of the human proteins in other species (Hubbard et al., 2002). For organisms whose sequences were not available at Ensembl (Paul et al., 2014), BLASTP (Altschul et al., 1990) searches were carried out against the protein databases available at the National Center for Biotechnology Information (Johnson et al., 2008) and the Joint Genome Institute (Nordberg et al., 2014).
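The bidirectional best-hit criterion can be illustrated with a minimal Python sketch. It assumes BLASTP results have already been parsed into (query, subject, bit score) tuples; the function names, sequence identifiers, and scores below are hypothetical and are not taken from the actual analysis.

```python
def best_hits(hits):
    """Map each query to its single highest-scoring subject.

    hits: iterable of (query_id, subject_id, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(forward_hits, reverse_hits):
    """Return (query, subject) pairs that are each other's best hit."""
    forward = best_hits(forward_hits)   # e.g. human -> other species
    reverse = best_hits(reverse_hits)   # e.g. other species -> human
    return sorted(
        (query, subject)
        for query, subject in forward.items()
        if reverse.get(subject) == query
    )


# Toy data with made-up identifiers and scores (illustration only).
forward = [
    ("HsFGFR1", "DrFgfr1a", 950.0),
    ("HsFGFR1", "DrFgfr2", 700.0),
    ("HsFGFR2", "DrFgfr2", 980.0),
    ("HsFGFR3", "DrFgfr2", 650.0),
]
reverse = [
    ("DrFgfr1a", "HsFGFR1", 940.0),
    ("DrFgfr2", "HsFGFR2", 975.0),
    ("DrFgfr2", "HsFGFR3", 640.0),
]

pairs = bidirectional_best_hits(forward, reverse)
print(pairs)
```

In this toy case HsFGFR3 is not paired, because its best hit (DrFgfr2) has HsFGFR2 as its own best hit in the reverse search; only reciprocal pairs are kept as putative orthologs.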


Amino acid sequences were aligned using CLUSTAL W (Thompson et al., 1994) under default parameters. Phylogenetic trees were constructed using the Neighbor-Joining (NJ) method (Russo et al., 1996; Saitou and Nei, 1987) with p-distance as the amino acid substitution model, as implemented in MEGA version 5 (Kumar et al., 2008; Tamura et al., 2011). The complete deletion option was selected to eliminate every alignment site that contains a gap in any sequence. Support for the tree topologies was assessed by bootstrap analysis (Felsenstein, 1985) with 1000 replicates. To check and validate the trees with a different reconstruction method, Maximum Likelihood (ML) trees were also inferred under the Whelan and Goldman (WAG) model (Whelan and Goldman, 2001) using MEGA 5 (Tamura et al., 2011). Results from both methods are presented in this database.
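The combination of complete deletion and p-distance can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MEGA's implementation: the alignment is a dict of equal-length strings, "-" marks a gap, and the sequence names and residues are made up.

```python
def complete_deletion(alignment, gap_chars="-"):
    """Drop every alignment column in which any sequence has a gap."""
    names = list(alignment)
    lengths = {len(alignment[name]) for name in names}
    if len(lengths) > 1:
        raise ValueError("aligned sequences must have equal length")
    columns = zip(*(alignment[name] for name in names))
    kept = [col for col in columns if not any(c in gap_chars for c in col)]
    return {
        name: "".join(col[i] for col in kept)
        for i, name in enumerate(names)
    }


def p_distance(seq_a, seq_b):
    """Proportion of sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    if not seq_a:
        raise ValueError("no sites left to compare")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


# Toy alignment with made-up sequences (illustration only).
alignment = {
    "A": "MKT-LLVAG",
    "B": "MKTALLIAS",
    "C": "MRTALL-AS",
}

trimmed = complete_deletion(alignment)
d_ab = p_distance(trimmed["A"], trimmed["B"])
d_ac = p_distance(trimmed["A"], trimmed["C"])
print(trimmed, d_ab, d_ac)
```

Two of the nine columns contain a gap and are removed for all sequences, so every pairwise distance is computed over the same seven sites. A matrix of such distances is the input that a Neighbor-Joining procedure clusters into a tree.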



Figure 1(a): Neighbor-Joining (NJ) tree of the FGFR family
Figure 1(b): Maximum Likelihood (ML) tree of the FGFR family


Useful references:


Abbasi, A. A. (2010). Unraveling ancient segmental duplication events in human genome by phylogenetic analysis of multigene families residing on HOX-cluster paralogons. Molecular phylogenetics and evolution, 57(2), 836-848.

Abbasi, A. A., & Grzeschik, K. H. (2007). An insight into the phylogenetic history of HOX linked gene families in vertebrates. BMC evolutionary biology, 7(1), 239.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410.

Asrar, Z., Haq, F., & Abbasi, A. A. (2013). Fourfold paralogy regions on human HOX-bearing chromosomes: Role of ancient segmental duplications in the evolution of vertebrate genome. Molecular phylogenetics and evolution, 66(3), 737-747.

Felsenstein, J., (1985). Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.

Hubbard, T., Barker, D., Birney, E., Cameron, G., Chen, Y., Clark, L., Cox, T., Cuff, J., Curwen, V., Down, T., Durbin, R., Eyras, E., Gilbert, J., Hammond, M., Huminiecki, L., Kasprzyk, A., Lehvaslaiho, H., Lijnzaad, P., Melsopp, C., Mongin, E., Pettett, R., Pocock, M., Potter, S., Rust, A., Schmidt, E., Searle, S., Slater, G., Smith, J., Spooner, W., Stabenau, A., Stalker, J., Stupka, E., Ureta-Vidal, A., Vastrik, I., Clamp, M., (2002). The Ensembl genome database project. Nucleic Acids Res. 30, 38–41.

Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., & Kumar, S. (2011). MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular biology and evolution, 28(10), 2731-2739.

Hughes, A. L. (1998). Phylogenetic tests of the hypothesis of block duplication of homologous genes on human chromosomes 6, 9, and 1. Molecular biology and evolution, 15(7), 854-870.

Hughes, A. L., da Silva, J., & Friedman, R. (2001). Ancient genome duplications did not structure the human Hox-bearing chromosomes. Genome research, 11(5), 771-780.

Johnson, M., Zaretskaya, I., Raytselis, Y., Merezhuk, Y., McGinnis, S., Madden, T.L., (2008). NCBI BLAST: a better web interface. Nucleic Acids Res, 36, W5–W9.


Kumar, S., Nei, M., Dudley, J., Tamura, K., (2008). MEGA: a biologist-centric software for evolutionary analysis of DNA and protein sequences. Brief Bioinform, 9, 299–306.

Russo, C.A., Takezaki, N., Nei, M., (1996). Efficiencies of different genes and different tree-building methods in recovering a known vertebrate phylogeny. Mol. Biol. Evol. 13, 525–536.

Saitou, N., Nei, M., (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol, 4, 406–425.

Thompson, J.D., Higgins, D.G., Gibson, T.J., (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22, 4673–4680.

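Nordberg, H., Cantor, M., Dusheyko, S., Hua, S., Poliakov, A., Shabalov, I., Smirnova, T., Grigoriev, I.V., Dubchak, I., (2014). The genome portal of the Department of Energy Joint Genome Institute: 2014 updates. Nucleic Acids Res, 42, D26–D31.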

Whelan, S., Goldman, N., (2001). A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol. Biol. Evol, 18, 691–699.