An information entropy and latent Dirichlet allocation approach to noise patent filtering
Published in: | Advanced engineering informatics 2021-01, Vol.47, p.101243, Article 101243 |
---|---|
Format: | Article |
Language: | English |
Abstract: | • Proposing a semi-automated approach to noise patent filtering. • Recommending noise patent seeds via information entropy. • Measuring text similarities among patents via latent Dirichlet allocation. • Identifying noise patent clusters with respect to each of the noise patent seeds.
Defining valid patents in a particular technological field is an indispensable step in patent analysis. To minimise the risk of missing valid patents, domain experts manually exclude irrelevant patents, known as noise patents, from an initial patent set derived using a loose retrieval query. However, this task has become time-consuming and labour-intensive due to the increasing number of patents and the rising complexity of technological knowledge. This study proposes a semi-automated approach to noise patent filtering based on information entropy theory and latent Dirichlet allocation. The proposed approach comprises four discrete steps: (1) structuring patents using a term-weighting method; (2) recommending noise patent seeds based on the information quantity of patents in terms of focal keyword groups; (3) measuring text similarities for patent clustering using latent Dirichlet allocation; and (4) identifying potential noise patent clusters with respect to the noise patent seeds. Our case study confirms that the proposed approach is valuable as a complementary noise patent filtering tool that will enable domain experts to focus more on their own knowledge-intensive tasks such as prior art analysis and research and development (R&D) strategy formulation. |
---|---|
ISSN: | 1474-0346 1873-5320 |
DOI: | 10.1016/j.aei.2020.101243 |
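
Step (2) of the abstract, recommending noise patent seeds from the information quantity of patents with respect to focal keyword groups, can be illustrated with a small sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: the function names `shannon_entropy` and `rank_seed_candidates`, the toy patents and keyword groups, and the heuristic that a patent with low coverage of the focal keyword groups is a likely seed. It is not the authors' method.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy, in bits, of a distribution given as non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def rank_seed_candidates(patents, keyword_groups):
    """Rank patents so the weakest matches to the focal keyword groups come first.

    patents: dict mapping a patent id to its list of tokens.
    keyword_groups: dict mapping a group name to a set of focal keywords.
    Returns a list of (patent_id, coverage, entropy) tuples.
    """
    ranked = []
    for patent_id, tokens in patents.items():
        term_counts = Counter(tokens)
        # Occurrences of each focal keyword group in this patent.
        group_hits = [
            sum(term_counts[term] for term in group)
            for group in keyword_groups.values()
        ]
        # Share of the patent's tokens that belong to any focal keyword group.
        coverage = sum(group_hits) / len(tokens) if tokens else 0.0
        ranked.append((patent_id, coverage, shannon_entropy(group_hits)))
    # Assumption: low coverage, then low entropy, marks a likely noise seed.
    return sorted(ranked, key=lambda row: (row[1], row[2]))


# Hypothetical toy data: two tokenised patents and two focal keyword groups.
patents = {
    "P1": "battery electrode lithium cell battery".split(),
    "P2": "garden chair wooden frame battery".split(),
}
keyword_groups = {
    "materials": {"lithium", "electrode"},
    "devices": {"battery", "cell"},
}
ranking = rank_seed_candidates(patents, keyword_groups)
```

On this toy data, `P2` is ranked first because only one of its five tokens falls in a focal keyword group, so under the stated assumption it would be the first candidate shown to a domain expert for confirmation as a noise patent seed.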