This study analyzed keyword extraction from English text. First, two commonly used algorithms, the term frequency–inverse document frequency (TF–IDF) algorithm and the keyphrase extraction algorithm (KEA), were introduced. Then, an improved TF–IDF algorithm was designed, which modified the calculation of word frequency and combined it with a position weight to improve keyword extraction performance. Finally, 100 English texts were selected from the British Academic Written English Corpus for the experiment. The results showed that the improved TF–IDF algorithm had the shortest running time, taking only 4.93 s to process 100 texts, and that the precision of the algorithms decreased as the number of extracted keywords increased. The comparison between the algorithms demonstrated that the improved TF–IDF algorithm performed best, with a precision of 71.2%, a recall of 52.98%, and an F1 score of 60.75% when five keywords were extracted from each article. These results show that the improved TF–IDF algorithm is effective at extracting keywords from English text and can be further promoted and applied in practice.

With the development of society, there are more and more ways to express information; natural language text is the most important of them and one of the largest information sources. With the rapid growth in the amount of text, how to process and retrieve it has become an increasingly important problem. Information retrieval, text classification, sentiment analysis, and topic identification have attracted wide attention from researchers. In information retrieval, users find web pages by entering keywords. However, if the keywords entered are inaccurate or do not appear on the corresponding page, retrieval quality suffers greatly. Keywords are therefore very important in text processing.

One study proposed an unsupervised graph-based method for extracting keywords from Twitter data, which ranked candidates by collective node weight; experiments on five data sets found that it outperformed other methods. Another reviewed supervised and unsupervised keyword extraction methods, analyzed and compared various graph-based methods, and encouraged the development of new graph-based methods for keyword extraction.
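
To make the idea of TF–IDF scoring combined with a position weight more concrete, the following is a minimal sketch and not the algorithm evaluated in this study. The text does not give the improved word-frequency calculation or the position-weight formula, so everything specific here is an assumption: the function names, the smoothed IDF variant, and the hypothetical `lead_tokens` and `lead_boost` parameters, which simply boost terms that first appear near the start of a document. Stop-word filtering is omitted for brevity. The small `f1_score` helper shows how an F1 score follows from precision and recall.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of ASCII letters; a deliberately simple tokenizer."""
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(docs, doc_index, top_k=5, lead_tokens=20, lead_boost=1.5):
    """Rank the terms of docs[doc_index] by TF-IDF times a position weight.

    lead_tokens and lead_boost are hypothetical parameters (assumptions):
    a term whose first occurrence lies within the first `lead_tokens`
    tokens has its score multiplied by `lead_boost`.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for doc_tokens in tokenized:
        df.update(set(doc_tokens))

    tokens = tokenized[doc_index]
    if not tokens:
        return []

    counts = Counter(tokens)
    first_pos = {}
    for i, tok in enumerate(tokens):
        first_pos.setdefault(tok, i)  # remember only the first occurrence

    scores = {}
    for term, count in counts.items():
        tf = count / len(tokens)
        # Smoothed IDF (an assumed variant) so no term gets a zero weight.
        idf = math.log((1 + n_docs) / (1 + df[term])) + 1
        weight = lead_boost if first_pos[term] < lead_tokens else 1.0
        scores[term] = tf * idf * weight

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Calling `extract_keywords(docs, 0)` returns up to five candidate keywords for the first document in `docs`, and `f1_score(0.712, 0.5298)` gives roughly 0.6075, which matches the precision, recall, and F1 figures reported above.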