Detecting programming language from source code using Bayesian learning techniques

Abstract

With dozens of popular programming languages in use worldwide, the number of program source files available online for public use is massive. However, most blogs, forums and online Q&A websites offer poor searchability for source code in a specific programming language. Naive rules of thumb based on the file extension, if any, are invariably used by programming language editors for syntax highlighting, indentation and other ways of improving code readability. A more systematic way to identify the language in which a given source file was written would be of immense value. We believe that simple Bayesian models are adequate for this task, given the intrinsic syntactic structure of any programming language. In this paper, we present Bayesian learning models that identify, with high probability, the programming language in which a given piece of source code was written. We used 20,000 source code files across 10 programming languages to train and test three Bayesian classifier models: Naive Bayes, Bayesian Network and Multinomial Naive Bayes. Lastly, we compare the three models in terms of classification accuracy on the test data. © 2014 Springer International Publishing Switzerland.
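To make the idea concrete, the following is a minimal sketch of one of the three model families named in the abstract, Multinomial Naive Bayes, applied to language identification. It is not the authors' implementation: the tokenizer, the Laplace smoothing constant `alpha`, the class name `MultinomialNB`, and the four-snippet training set `TRAIN` are all assumptions chosen for illustration, and the abstract does not say which features or tools the paper used.

```python
import math
import re
from collections import Counter, defaultdict

# Assumed tokenizer: identifiers/keywords, plus single punctuation characters.
# Numeric literals are ignored. The paper's actual features are not stated here.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|[^\sA-Za-z_0-9]")


def tokenize(code):
    """Split source text into word-like tokens and punctuation characters."""
    return TOKEN_RE.findall(code)


class MultinomialNB:
    """Illustrative Multinomial Naive Bayes over token counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing (assumed value)
        self.token_counts = defaultdict(Counter)  # language -> token -> count
        self.doc_counts = Counter()             # language -> number of files
        self.vocab = set()

    def fit(self, samples):
        """Accumulate per-language token counts from (code, language) pairs."""
        for code, lang in samples:
            tokens = tokenize(code)
            self.token_counts[lang].update(tokens)
            self.doc_counts[lang] += 1
            self.vocab.update(tokens)
        return self

    def log_scores(self, code):
        """Return log P(lang) + sum of log P(token | lang) for each language."""
        tokens = tokenize(code)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for lang, n_docs in self.doc_counts.items():
            counts = self.token_counts[lang]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                if tok in self.vocab:           # skip tokens never seen in training
                    score += math.log((counts[tok] + self.alpha) / denom)
            scores[lang] = score
        return scores

    def predict(self, code):
        """Pick the language with the highest posterior log score."""
        scores = self.log_scores(code)
        return max(scores, key=scores.get)


# Hypothetical toy training set; the paper used 20,000 files in 10 languages.
TRAIN = [
    ("def add(a, b):\n    return a + b\n", "python"),
    ("import sys\nfor x in sys.argv:\n    print(x)\n", "python"),
    ('#include <stdio.h>\nint main(void) { printf("hi"); return 0; }\n', "c"),
    ("int add(int a, int b) { return a + b; }\n", "c"),
]

model = MultinomialNB().fit(TRAIN)
print(model.predict("def f(x):\n    print(x)\n"))
print(model.predict("int main(void) { return 0; }\n"))
```

Working in log space avoids numeric underflow when a file contains many tokens, and the smoothing term keeps a single unseen token from driving a language's probability to zero. A realistic experiment would need far more training data and a held-out test set, as the abstract describes.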

Khasnabish, J. N., Sodhi, M., Deshmukh, J., & Srinivasaraghavan, G. (2014). Detecting programming language from source code using bayesian learning techniques. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8556 LNAI, pp. 513–522). Springer Verlag. https://doi.org/10.1007/978-3-319-08979-9_39
