[Keywords]
[Abstract]
Contract management for construction projects involves a heavy workload and has long relied on manual review and subjective experience. To reduce this reliance and move toward intelligent identification and review of contract clauses, this paper introduces natural language processing (NLP) based on deep learning (DL) and proposes an intelligent classification method for construction project contracts. First, a contract classification scheme is designed, and computational experiments are run separately with a convolutional neural network (CNN), a long short-term memory (LSTM) network, and a fusion model. Second, the CNN and LSTM models are applied to contract text feature extraction and contract text classification, and the two networks are fused to improve the performance metrics of the classification model. The results show that the fusion model achieves the highest F1 score (the harmonic mean of precision and recall), followed by the LSTM model, with the CNN model lowest. Compared with similar studies based on machine learning, the NLP- and DL-based fusion model for construction contract classification improves performance on the contract text classification task and enables accurate identification, rapid classification, and archiving of large volumes of construction, survey, supervision, and other types of contracts.
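The F1 score used to rank the three models can be illustrated with a minimal Python sketch. The abstract does not say how per-class scores are aggregated, so macro averaging is assumed here; the function names `per_class_f1` and `macro_f1` and the sample labels are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gets a false positive
            fn[truth] += 1  # true label gets a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / pr if pr else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    if not scores:
        return 0.0
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical contract-type labels, for illustration only.
y_true = ["construction", "construction", "survey",
          "survey", "supervision", "supervision"]
y_pred = ["construction", "survey", "survey",
          "survey", "supervision", "construction"]

for label, (p, r, f1) in per_class_f1(y_true, y_pred).items():
    print(f"{label}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model scores well on F1 only when precision and recall are both high, which is why it is a common single-number summary for multi-class text classification.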
[CLC number]
TU723.1; G301
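[Funding]
General Program of the National Natural Science Foundation of China, "Intelligent Risk Identification and Avoidance for Construction Contract Clauses Integrating Ontology and Deep Learning" (72271209); enterprise-commissioned project, "Whole-Process Consulting, Integrated Design and Construction, and Risk Control for Industrial and Civil Projects in a Big Data Environment" (XDHT2021142A).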