Tuberculosis is one of the most vicious infectious diseases, and its etiological agent, Mycobacterium tuberculosis, is a highly successful pathogen that claims millions of lives annually. Despite a broad panel of available drugs, the pathogen remains unconquered because of its persistence in the host and its recalcitrance to drugs. Rising drug resistance is proving to be a major therapeutic challenge. Extensive genomic studies have been conducted in order to contain pathogenesis and eradicate resistant strains of M. tuberculosis. Despite the considerable research and insight accumulated, drug-resistant strains are spreading at an alarming rate, which calls for novel approaches to deciphering the pathogen genome. So far, genomic research has focused on the pathogen's genes for virulence, metabolism, information pathways and regulatory proteins. More than a quarter (about 30%) of the genome of the reference strain M. tuberculosis H37Rv consists of so-called hypothetical genes, which have been only minimally characterized. This lacuna impedes a holistic understanding of the pathogen. To contribute to this sparsely investigated area, the hypothetical gene set and its predicted proteins were analyzed in this study. The reference strain M. tuberculosis H37Rv, its avirulent counterpart M. tuberculosis H37Ra, and 51 clinical isolates of diverse lineages and drug susceptibility profiles, procured from 5 countries, were compared. The reference strain genomes and the de novo-assembled genomes were computationally analyzed, clustered and interpreted, and hypotheses were formulated on the basis of the findings. Collectively, this work is expected to contribute to the understanding of M. tuberculosis genome plasticity and its role in drug resistance.