Research on Identification of Core Patents Based on SMOTE-LOF-Adaboost Model


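Author affiliations: 1. China Jiliang University; 2. College of Optical and Electronic Technology, China Jiliang University
CLC number: G306
Funding: Zhejiang Province Key R&D Program project "Research and Development of Key Internet-Based Testing Technologies for the New-Material Luminescence Industry Chain" (2021C01027); Zhejiang Provincial Natural Science Foundation project "Research on the Mechanism of Collective Intelligence Emergence in Crowd-Innovation Communities Based on Knowledge Openness" (LY20G01008)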




Abstract:

To address the low accuracy of core patent identification, the indicator system is reconstructed; to address the poor performance of traditional core patent identification methods on imbalanced data, a combined model of resampling and ensemble learning is proposed. First, inventor-related indicators are added to the traditional indicator system. Second, the Synthetic Minority Over-sampling Technique (SMOTE) is used to increase the number of minority-class samples and mitigate the data imbalance, the Local Outlier Factor (LOF) algorithm is used to denoise the newly generated samples, and the result is combined with the Adaptive Boosting (Adaboost) ensemble algorithm to form the SMOTE-LOF-Adaboost model. Finally, taking 22,077 photovoltaic patent records from 2012 to 2016 in the Patsnap patent database as an example, SVM, Adaboost, SMOTE-Adaboost, and SMOTE-LOF-Adaboost are compared empirically. The SMOTE-LOF-Adaboost model achieves a mean AUC of 0.9776, a mean Recall of 0.9860, and a mean F1 of 0.9607, all better than the other three models, and its standard deviation on each metric is smaller. This indicates that the SMOTE-LOF-Adaboost model not only improves the accuracy of core patent prediction but is also more stable.
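The abstract describes the SMOTE-LOF-Adaboost pipeline only at a high level, so the following Python sketch is an illustration under stated assumptions rather than the authors' implementation. It assumes scikit-learn is available, uses synthetic data from `make_classification` in place of the patent indicators, replaces a library SMOTE with a simplified hand-written `smote` function, and reads "denoising the newly generated samples" as fitting LOF on the original minority samples and discarding synthetic points it flags as outliers. The function names `smote` and `smote_lof_resample` and all hyperparameter values are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import f1_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors


def smote(X_min, n_new, k=5, seed=0):
    """Simplified SMOTE: interpolate between a minority sample and one of
    its k nearest minority-class neighbours (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    # Column 0 is normally the query point itself, so it is dropped.
    neigh = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neigh[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation weight in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])


def smote_lof_resample(X, y, k=5, lof_neighbors=20, seed=0):
    """Oversample class 1 with SMOTE, then drop synthetic points that LOF
    flags as outliers relative to the original minority samples.
    (Assumption: the source does not specify how LOF is applied.)"""
    X_min, X_maj = X[y == 1], X[y == 0]
    n_new = len(X_maj) - len(X_min)  # aim for a balanced training set
    X_syn = smote(X_min, n_new, k=k, seed=seed)
    lof = LocalOutlierFactor(n_neighbors=lof_neighbors, novelty=True).fit(X_min)
    X_syn = X_syn[lof.predict(X_syn) == 1]  # +1 = inlier, -1 = outlier
    X_res = np.vstack([X, X_syn])
    y_res = np.concatenate([y, np.ones(len(X_syn), dtype=y.dtype)])
    return X_res, y_res


# Synthetic imbalanced data standing in for the patent indicator matrix.
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=0)

# Resample the training split only, so the test split stays untouched.
X_res, y_res = smote_lof_resample(X_tr, y_tr)
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_res, y_res)

proba = clf.predict_proba(X_te)[:, 1]
pred = clf.predict(X_te)
print("AUC:   ", roc_auc_score(y_te, proba))
print("Recall:", recall_score(y_te, pred))
print("F1:    ", f1_score(y_te, pred))
```

Resampling is applied to the training split only, so the reported AUC, Recall, and F1 are computed on unmodified data. Repeating the split with different seeds and averaging would give mean and standard deviation figures comparable in form to those quoted in the abstract.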


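Cite this article

Li Ying (李颖), Wu Zengyuan (吴增源), Chen Liang (陈亮). Research on Identification of Core Patents Based on SMOTE-LOF-Adaboost Model[J]., 2023, (21).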


History

Received: 2023-05-26
Last revised: 2023-06-08
Accepted: 2023-07-21
Published online: 2024-02-27


