Description
This thesis incorporates two projects: one assessing the availability and application of software for detecting SNPs in next-generation sequencing data, and the other in software engineering of a social networking environment for use in biomedical informatics.

SNP Detection: The study of variations in DNA sequences has helped scientists understand human responses to diseases, drugs, and vaccines, and relate some diseases to single nucleotide polymorphisms (SNPs). SNP calling has evolved significantly in recent years, from extremely expensive and time-consuming methods to automated and efficient ones. This evolution has helped advance biomedical, pharmacological, and genetic research. Given the variety of reasons for detecting SNPs and the growing number of sequenced genomes, there is an urgent need to detect SNPs in genomes more efficiently and accurately. The project presented here is preliminary work toward that goal: a survey of free and commercially available applications for automated SNP detection. I present some of the most popular and widely used applications, with a brief evaluation of the strengths and weaknesses of each one. The outcome can be used either as a guide for choosing the most appropriate application for the SNP detection project at hand, or as a resource for developing a new SNP detection algorithm. A summary table of software packages and their attributes is presented as the outcome of this project.

Reference Miner for Gene Wiki: This work is a subproject of the Gene Wiki initiative. Gene Wiki is a project that creates seed articles by collecting reviewed information for each human gene and protein. According to the Wiki's report, approximately 10,271 articles containing Gene Wiki project content had been created as of this writing. Reference Miner is the application that identifies and extracts all online citations to PubMed for insertion into Gene Wiki pages.
The results are then reviewed in context by curators for new gene annotations. My contribution to this project was to improve the application by automatically extracting, from the Gene Wiki page for a given article name (that of a gene or protein), the sentences that contain a citation. Working with Google App Engine as the programming environment and Python as the programming language, we successfully extracted full sentences with inline citations. The application takes as input a single Wiki article name (the name of a gene or protein) and produces a plain-text output file with specific information on the article, including the sentences in which the article was cited and the specific position of the citation in each sentence. An improved HTML display is proposed at the end.
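
The sentence-extraction step described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Reference Miner code itself: it assumes the page text is already available as wikitext, that inline citations appear as `<ref>` tags, and that sentences end at `.`, `!`, or `?` (so abbreviations and decimal numbers would be split incorrectly). The function name `extract_cited_sentences` and the output fields are hypothetical, and fetching the page and the Google App Engine setup are omitted.

```python
import re

# Assumption: inline citations are <ref ...>...</ref> or self-closing <ref ... /> tags.
REF_PATTERN = re.compile(r"<ref\b[^>/]*/>|<ref\b[^>]*>.*?</ref>",
                         re.DOTALL | re.IGNORECASE)
# Assumption: a PubMed ID, when present, is written as "pmid = 12345" inside the tag.
PMID_PATTERN = re.compile(r"pmid\s*=\s*(\d+)", re.IGNORECASE)
# A sentence: text, optional end punctuation, then any citations placed right after it.
SENTENCE_PATTERN = re.compile(r"[^.!?]+[.!?]*\x00*")
MARKER = "\x00"  # one-character placeholder standing in for a citation


def extract_cited_sentences(wikitext):
    """Return one record per inline citation: its sentence, offset, and PMID."""
    refs = []

    def stash(match):
        refs.append(match.group(0))
        return MARKER

    # Mask citations first so punctuation inside them cannot split a sentence.
    masked = REF_PATTERN.sub(stash, wikitext)

    results = []
    ref_index = 0
    for match in SENTENCE_PATTERN.finditer(masked):
        chunk = match.group(0).strip()
        if MARKER not in chunk:
            continue  # sentence without a citation
        sentence = chunk.replace(MARKER, "")
        removed = 0
        for offset, char in enumerate(chunk):
            if char == MARKER:
                pmid = PMID_PATTERN.search(refs[ref_index])
                results.append({
                    "sentence": sentence,
                    # Character offset of the citation in the cleaned sentence.
                    "position": offset - removed,
                    "pmid": pmid.group(1) if pmid else None,
                })
                ref_index += 1
                removed += 1
    return results


if __name__ == "__main__":
    sample = ('BRCA1 is a human gene.<ref>{{cite journal | pmid = 12345 }}</ref> '
              'It repairs DNA<ref name="x" /> in cells. No citation here.')
    for record in extract_cited_sentences(sample):
        print(record)
```

Masking each citation with a single placeholder character before splitting keeps punctuation inside the citation from breaking a sentence, and it lets the citation's position be reported as an offset into the cleaned sentence text.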