Data generation in modern biology is rapidly outpacing data analysis, and this is especially true of phylogenetic analysis. The general problem is that phylogenetic trees have become too large to analyze quickly or with sophisticated methods. The purpose of this project is to speed up phylogenetic analysis and sequence alignment for identifying protein subfamilies by using ancestral sequences. The basic steps of the project are clustering protein families, aligning the sequences in each cluster, finding an ancestral sequence for each cluster, and determining how these steps affect the speed and performance of phylogenetic analysis.
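
The steps listed above can be illustrated with a toy sketch. It is not the project's actual method. It assumes the input sequences are already aligned and of equal length, so the alignment step is skipped. It uses a greedy identity-threshold clustering and a column-wise majority consensus as a simple stand-in for ancestral sequence reconstruction, which in practice is usually done with parsimony or likelihood methods on a tree. The function names, the example sequences, and the 0.7 threshold are all hypothetical.

```python
from collections import Counter


def identity(a, b):
    """Fraction of positions at which two sequences match (over the longer length)."""
    if not a and not b:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def greedy_cluster(seqs, threshold=0.7):
    """Assign each sequence to the first cluster whose representative
    (its first member) is at least `threshold` identical; else start a new cluster."""
    clusters = []
    for s in seqs:
        for cluster in clusters:
            if identity(s, cluster[0]) >= threshold:
                cluster.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def consensus(aligned):
    """Column-wise majority residue: a crude stand-in for an ancestral sequence.
    Assumes all sequences in `aligned` have the same length."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


# Hypothetical, pre-aligned toy sequences from two made-up subfamilies.
seqs = ["MKVLA", "MKVLG", "MKILA", "TTPQR", "TTPQK", "TSPQR"]

clusters = greedy_cluster(seqs, threshold=0.7)
representatives = [consensus(c) for c in clusters]

print(len(seqs), "sequences reduced to", len(representatives), "representatives")
print(representatives)
```

The point of the sketch is the reduction in problem size: a downstream tree would be built on one representative per cluster rather than on every sequence, which is where the hoped-for speedup would come from. Whether accuracy is preserved is exactly what the final step of the project is meant to measure.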