Abstract: Redundant research investment is increasingly prominent in survey and design enterprises. It wastes funds and human resources, damages reputation and even the spirit of scientific research, and hinders the incubation of cutting-edge technologies. It is therefore imperative to identify redundant research topics automatically and to maximize the reuse of research outcomes through intelligent means. This paper proposes a method for detecting redundant scientific research projects within an enterprise. The method builds on the basic theory of the BM25 algorithm and incorporates data attributes specific to survey and design enterprises, using characteristic values such as domain, specialty, and project leader. It comprises four steps: text preprocessing; establishing a matching library; calculating the similarity between the input topic and the topics in the matching library using the TF-IDF algorithm and the BM25 algorithm, respectively; and analyzing the calculation results. Applied to research topics in new energy, engineering digitalization, and informatization, the proposed method differentiates topics more clearly than the TF-IDF algorithm, with a computation time of less than 0.1 seconds, which meets commercial needs. It supports redundancy checks at research topic initiation and the determination of overlap in outcomes. The accuracy of the calculation results has been verified by technical research and development personnel; the method meets the needs of business management and has promotional value in the survey and design industry.
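To make the comparison step concrete, the following is a minimal sketch of scoring an input topic against a small matching library with both standard Okapi BM25 and TF-IDF cosine similarity. It is illustrative only and is not the paper's implementation: the function names, the sample topics, whitespace tokenization, the smoothed IDF variants, and the parameter values `k1=1.5` and `b=0.75` (common defaults) are all assumptions, and the enterprise-specific attributes (domain, specialty, project leader) are not modeled here.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization; real topic titles would need
    # proper word segmentation and stop-word removal in preprocessing.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in docs against query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF variant that is always positive.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def tfidf_cosine_scores(query, docs):
    """Score each document by cosine similarity of TF-IDF vectors."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for d in tokenized for term in set(d))

    def vectorize(tokens):
        tf = Counter(tokens)
        # Smoothed IDF so that unseen query terms do not divide by zero.
        return {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
                for t, c in tf.items()}

    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        nu = math.sqrt(sum(w * w for w in u.values()))
        nv = math.sqrt(sum(w * w for w in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    q = vectorize(tokenize(query))
    return [cosine(q, vectorize(d)) for d in tokenized]


# Hypothetical matching library and input topic, for illustration only.
library = [
    "wind power foundation design optimization",
    "bim based digital delivery for highway engineering",
    "enterprise information system security assessment",
]
topic = "digital delivery for highway engineering based on bim"

print("BM25:  ", bm25_scores(topic, library))
print("TF-IDF:", tfidf_cosine_scores(topic, library))
```

In this sketch, both scorers return one value per library entry, so the analysis step can rank the entries and flag the highest-scoring ones for manual review. TF-IDF cosine scores are bounded by 1, whereas BM25 scores are unbounded, so any redundancy threshold would have to be chosen separately for each scorer.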