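[Keywords]
[Abstract]
This study uses natural language processing techniques such as text classification and sentiment analysis to develop a method for evaluating regional environmental image from Internet text. To meet the analysis needs of ecological and environmental big data, environmental image categories are defined, and regional environmental image is evaluated from three perspectives: stylistic source, sentiment polarity, and environmental element. A corpus of environmental texts was manually annotated, three algorithms (support vector machine, naive Bayes, and convolutional neural network) were compared, and a regional environmental image evaluation model was built with the convolutional neural network as its core algorithm. The method classifies well, and the F1 values for all three classification tasks meet the analysis needs: between 0.8 and 0.9 for environmental elements, above 0.8 for sentiment analysis, and about 0.9 for stylistic source. Applied to cities in the Yangtze River Delta, the method can process trending regional environmental public opinion in real time, analyze regional environmental image, and provide precise, intuitive evaluation results, offering basic information support for regional environmental management.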
[Keywords]
[Abstract]
Using natural language processing methods such as text classification and sentiment analysis, this article develops a regional environmental image evaluation method based on Internet news. To meet the analysis needs of ecological and environmental big data, environmental image categories are defined, and regional environmental image is evaluated from three perspectives: stylistic source, sentiment polarity, and environmental element. The study constructs a manually annotated, environment-specific corpus, compares three algorithms (support vector machine, naive Bayes, and convolutional neural network), and builds a regional environmental image evaluation model with the convolutional neural network as its core algorithm. The F1 values of all three classification tasks meet the analysis needs: the F1 value for environmental elements is between 0.8 and 0.9, the F1 value for sentiment analysis is above 0.8, and the F1 value for stylistic source is about 0.9. Applied to cities in the Yangtze River Delta region, the method can process regional environmental public opinion in real time, analyze the environmental image of the region, and provide intuitive evaluation results and basic information support for regional environmental management.
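As a reminder of what the F1 values reported above measure, the following is a minimal sketch of per-class and macro-averaged F1 (the harmonic mean of precision and recall) in plain Python. It is not the study's evaluation code: the function names `f1_per_class` and `macro_f1`, the label set, and the toy predictions are all hypothetical, and the abstract does not say whether the reported F1 values are macro-averaged.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic
        # mean of precision and recall.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 values (0.0 if no labels)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical environmental-element labels, for illustration only.
y_true = ["water", "air", "air", "soil", "water", "air"]
y_pred = ["water", "air", "soil", "soil", "air", "air"]

print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

The same function could be applied separately to each of the three classification tasks (stylistic source, sentiment polarity, environmental element), since each is scored against its own label set.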
[CLC number]
X321
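[Funding]
Research project of the Chinese Academy of Environmental Planning, Ministry of Ecology and Environment: "Urban Environmental Planning and Management" (No. 2020A049)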