Abstract
DNA sequences of protein-coding genes seem to be more effective than 16S rRNA gene sequences for classifying the ecological diversity of bacteria. This difference may be a consequence of the low evolutionary rate of 16S rRNA genes. While 16S rRNA gene sequence data are useful for placing moderately divergent populations into separate sequence clusters, protein-coding genes provide a more effective tool for distinguishing very closely related, recently evolved ecological populations. Preliminary investigations of microbial communities have used metabolic genes as genetic markers, such as the genes coding for malate dehydrogenase or 6-phosphogluconate dehydrogenase. The sequencing of metabolic genes could enable the study of specific biochemical activities in bacteria, the identification of clones within bacterial populations, and the characterisation of specific clones within communities of pathogenic microorganisms. Multilocus sequence typing has been used successfully, for example to identify the currently circulating hypervirulent meningococcal lineages in epidemiological studies. It has also been suggested that sequencing of protein-coding genes would provide finer phylogenetic resolution when finding geovars possibly belonging to unknown species. It is expected, therefore, that sequencing of protein-coding loci, which usually show wider sequence variation and evolve more rapidly than the more conserved 16S rRNA genes, will disclose many previously unknown ecological populations of bacteria in future population surveys.
Keywords: protein-coding genes, bioinformatics, Neisseria, meningitidis, Campylobacter jejuni, Pediococcus acidilactici, Lactobacillus delbrueckii, delbrueckii subsp. bulgaricus, lactis, Streptococcus