[Keywords]
[Abstract]
Government agencies at all levels in China invest substantial funds in scientific and technological innovation through research program projects. To use these funds more efficiently, agencies need to predict the future performance of candidate projects effectively when deciding which ones to fund. This paper proposes a project performance prediction method based on ensemble learning. A multi-class supervised ensemble learning algorithm mines the performance-related information implicit in data on completed projects, collected with a web crawler, to build a project performance prediction model. Using project data from the National Natural Science Foundation of China, the model is evaluated on several metrics, and its performance predictions are compared with expert evaluation results. The results demonstrate the effectiveness of the proposed model.
[CLC number]
TP393
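[Funding]
National Natural Science Foundation of China project "Data-Driven Integrated Optimization of Resource Leveling and Robust Scheduling in Software Development Projects" (Grant No. 71602106); Shanghai "Science and Technology Innovation Action Plan" Soft Science Key Project "Research on Data-Driven Budget Allocation Methods for Scientific Research Program Projects" (Grant No. 20692192400)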