In this work, we propose a method for identifying text lines in historical Kannada documents. The method consists of three stages: first, the image is preprocessed using Sauvola’s binarization method; next, connected component analysis and the projection profile method are applied to detect the position of each text line; finally, each text line is segmented based on the projection points. The proposed method is evaluated on seventeen historical Kannada documents containing 217 text lines in total. We tried several trial-and-error methods to identify the lines in the document images. The first method detected 140 lines, but multiple spurious lines appeared between the text lines; its accuracy was 64.51%. The second method detected 107 lines, an accuracy of 49.30%. The third method clearly detected 178 lines, with fewer spurious lines between the text lines, an accuracy of 82.02%. We conclude that the third method detects most of the lines precisely and gives encouraging results.
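The projection profile stage and the accuracy figures can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the page has already been binarized (the Sauvola and connected component steps are omitted), and the names `segment_lines`, `detection_rate`, `min_ink`, and `min_height` are hypothetical, chosen only for illustration.

```python
import numpy as np

def segment_lines(binary, min_ink=1, min_height=2):
    """Return (top, bottom) row ranges of text lines in a binary image.

    binary: 2-D array in which nonzero pixels are ink (text).
    min_ink, min_height: illustrative thresholds, not values from the paper.
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = (binary > 0).sum(axis=1)
    is_text = profile >= min_ink

    lines = []
    start = None
    for row, flag in enumerate(is_text):
        if flag and start is None:
            start = row                      # a run of text rows begins
        elif not flag and start is not None:
            if row - start >= min_height:    # drop runs too thin to be a line
                lines.append((start, row))
            start = None
    # Close a run that reaches the bottom edge of the image.
    if start is not None and len(is_text) - start >= min_height:
        lines.append((start, len(is_text)))
    return lines

def detection_rate(detected, total):
    """Percentage of ground-truth lines that were detected."""
    return 100.0 * detected / total

# Toy page: two horizontal bands of ink separated by blank rows.
page = np.zeros((20, 30), dtype=np.uint8)
page[2:6, 3:25] = 1
page[10:15, 2:28] = 1

print(segment_lines(page))
for detected in (140, 107, 178):
    print(detected, detection_rate(detected, 217))
```

The reported accuracies correspond to detected lines divided by the 217 ground-truth lines; the abstract's percentages appear to be truncated to two decimals rather than rounded. On real historical pages, touching or skewed lines leave no blank rows in the profile, which is presumably why the abstract combines the profile with connected components.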
Ravi, P., Naveena, C., Sharath Kumar, Y. H., & Manjunath Aradhya, V. N. (2020). Text-Line Extraction from Historical Kannada Document. In Advances in Intelligent Systems and Computing (Vol. 1014, pp. 276–285). Springer. https://doi.org/10.1007/978-981-13-9920-6_28