The Building a Patent Landscape for Technological Forecasting Tasks

Abstract

Methods and technologies for visualizing a patent landscape through cluster analysis of a patent array are reviewed and applied. Algorithms have been developed for downloading patent archives, parsing patent documents, clustering patents, and visualizing the patent landscape. Software has been implemented that clusters patent documents using the Latent Dirichlet Allocation (LDA) model and visualizes the patent landscape from the clustering data, using the gensim, PySpark, and sklearn libraries. The software has been tested on patents issued by the US Patent and Trademark Office, where it classified patents by category with an accuracy of 84%. Possible ways to improve the system: (a) add the ability to save an LDA model trained on certain categories of patents so that it can predict the category of a new patent document; (b) improve the quality of preprocessing of the patent description text; (c) deploy the software on the customer's resources (computing cluster) in order to fully test the Apache Spark framework; (d) store large amounts of data on the customer's resources (computing cluster) using HDFS (Hadoop Distributed File System).
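To make the clustering step concrete, here is a minimal sketch of LDA-based document clustering. It is not the authors' implementation: the abstract names gensim, PySpark, and sklearn, whereas this sketch uses a small collapsed Gibbs sampler in plain Python so that it runs without dependencies. The function name `lda_gibbs`, the toy token lists, the number of topics, and the hyperparameters `alpha` and `beta` are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns one topic distribution per document.
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # total tokens in topic k
    assignments = []                                      # topic of each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]


# Hypothetical, already tokenized patent descriptions (not real patent data).
docs = [
    "battery cell electrode lithium battery".split(),
    "lithium electrode anode battery charge".split(),
    "antenna signal wireless receiver antenna".split(),
    "wireless signal transmitter antenna network".split(),
]

theta = lda_gibbs(docs, num_topics=2)

# Treat the most probable topic of each document as its cluster label.
clusters = [max(range(2), key=lambda t: row[t]) for row in theta]
print(clusters)
```

Taking the dominant topic as the cluster label is one simple way to turn topic proportions into clusters; the abstract does not say how the authors derive clusters or categories from the LDA output.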

Cite

Korobkin, D., Saveliev, M., Vereschak, G., & Fomenkov, S. (2023). The Building a Patent Landscape for Technological Forecasting Tasks. In Lecture Notes in Electrical Engineering (Vol. 986 LNEE, pp. 314–324). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-22311-2_31
