Description
Dengue virus (DenV) and Zika virus (ZIKV) are two of the most important human viral pathogens today, and no vaccines or antivirals are available against them. Understanding how these viruses interact with the host is therefore critical. Part of their life cycle involves well-characterized proteolytic cleavage events that process the viral proteome before maturation. Consequently, the viral proteases have been targeted in antiviral research and development. The activity of these proteases is relatively well understood, but little is known about their substrate recognition beyond the viral proteome. These proteases are also able to cleave host substrates during the viral life cycle. We therefore took an in silico, bioinformatics-based approach. We created a practical search using regular expressions (Regex) and a statistical search using a position weight matrix (PWM), which together allowed us to retrieve and analyze possible host substrates that met specified parameters. Substrates found in silico are candidate viral protease substrates to be tested later in biological assays. In the search for novel host substrates of the DenV and ZIKV proteases, substrates already known to be cleaved by the viral proteases will serve as controls. The long-term goal of identifying novel host substrate cleavage events is to better understand how the virus interacts with the host at the molecular level. A better understanding of virus/host interactions could reveal novel target sites and, in turn, lead to the development of novel therapeutic targets against DenV and ZIKV. The Regex search returned between tens and thousands of hits, depending on parameters such as the offset and the total number of amino acids. The PWM search returned thousands of potential target sites by scoring each of twelve amino acid positions and summing the scores. Results from the Regex and PWM searches were then filtered and compared, yielding a list of hits found by both bioinformatics approaches. 
Among those, we focused on an initial list of the hits most likely to be biologically significant, which included five hits for the NS4A/2K site and ten for the capsid, NS2A/B, NS2B/3, NS3/4A, and NS4B/5 sites. These lists will provide the basis for biological analysis in the context of viral infection.
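
The two searches and their comparison can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the motif `[KR]R[GSA]`, the three 12-residue training sites, the toy protein, the uniform background, the pseudocount, the score threshold, and every function name are assumptions chosen only for illustration.

```python
import math
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def regex_hits(sequence, pattern):
    """Return (start, matched_text) for every match, overlapping ones included."""
    # A lookahead consumes no characters, so overlapping motifs are all reported.
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]


def build_pwm(known_sites, pseudocount=1.0):
    """Build a log2-odds matrix from aligned, equal-length cleavage sites.

    Assumes a uniform amino acid background; a real analysis would more
    likely use proteome-wide residue frequencies.
    """
    width = len(known_sites[0])
    if any(len(site) != width for site in known_sites):
        raise ValueError("all training sites must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(known_sites) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(width):
        column = [site[pos] for site in known_sites]
        pwm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def pwm_hits(sequence, pwm, threshold):
    """Slide a window over the sequence and sum the per-position scores."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(aa not in AMINO_ACIDS for aa in window):
            continue  # skip windows with non-standard residues such as X
        score = sum(pwm[i][aa] for i, aa in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


def shared_hits(regex_matches, pwm_matches, offset):
    """Keep PWM windows whose position `offset` is the start of a regex match."""
    regex_starts = {start for start, _ in regex_matches}
    return [hit for hit in pwm_matches if hit[0] + offset in regex_starts]


# Hypothetical training sites (made up, not real cleavage sequences).
training_sites = ["GLVKRRSGALWD", "TAEKKRGAVLMH", "SLQTRRGTGNIG"]
# Hypothetical host protein fragment containing the first training site.
toy_protein = "MM" + training_sites[0] + "MM"

pwm = build_pwm(training_sites)
from_regex = regex_hits(toy_protein, "[KR]R[GSA]")
from_pwm = pwm_hits(toy_protein, pwm, threshold=5.0)
in_both = shared_hits(from_regex, from_pwm, offset=4)

for start, window, score in in_both:
    print(f"window at {start}: {window} (score {score:.2f})")
```

Here `offset` is the position inside the twelve-residue window at which the short motif is expected to begin, so a candidate is kept only when both searches agree on the same site. In practice the training set would be the known viral cleavage sites used as controls, and the threshold would be calibrated against them.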