In today’s information age, the intelligent sharing and scheduling of college English translation corpus resources needs improvement. To this end, a method based on fuzzy autocorrelation statistical feature analysis is proposed. First, a model is constructed to detect the semantically relevant dimensional features of college English translation corpus resources in an informatized setting, and the essential attributes of translation activities are analyzed using a hierarchical parameter detection method for translated texts in the narrative structure. Next, a quantitative difference coverage model for word clusters of different lengths is established, and the lexical features of these resources are extracted and statistically examined using a similarity feature extraction technique for high-frequency word clusters. A semantic dynamic feature analysis model is then developed to derive the statistical features of the corpus resources. Finally, based on the extracted features, the fuzzy autocorrelation statistical feature analysis method is used to cluster the large dataset, and an intelligent particle swarm optimization algorithm is applied to extract and share the lexical features, thereby optimizing the corpus resources. Simulation results indicate that the method extracts and shares lexical features of translated texts accurately and discriminates between features well, improving the ability to extract, share, and detect lexical features of translated texts in college English translation corpus resources.
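The abstract does not specify how high-frequency word clusters are extracted or how coverage is measured. As a rough illustration only, the sketch below treats a "word cluster" as a contiguous n-gram, keeps clusters that occur at least `min_count` times, and reports the fraction of tokens covered by such clusters for each length. The function names, the `min_count` threshold, and the coverage definition are assumptions made for this example, not the authors' model.

```python
from collections import Counter


def word_clusters(tokens, n):
    """Return all contiguous n-word clusters (n-grams) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def high_frequency_clusters(tokens, lengths=(2, 3), min_count=2):
    """Map each cluster length to the clusters seen at least min_count times.

    min_count is an illustrative threshold, not a value from the paper.
    """
    result = {}
    for n in lengths:
        counts = Counter(word_clusters(tokens, n))
        result[n] = {c: k for c, k in counts.items() if k >= min_count}
    return result


def coverage(tokens, clusters, n):
    """Fraction of tokens lying inside at least one frequent n-word cluster."""
    covered = [False] * len(tokens)
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) in clusters:
            for j in range(i, i + n):
                covered[j] = True
    return sum(covered) / len(tokens) if tokens else 0.0


# Toy input standing in for a translated text.
tokens = "the cat sat on the mat and the cat sat on the rug".split()
freq = high_frequency_clusters(tokens)
for n, clusters in freq.items():
    print(n, len(clusters), round(coverage(tokens, clusters, n), 3))
```

Comparing the per-length coverage values gives one simple way to quantify how differently clusters of different lengths cover a text; a real system would add proper tokenization and normalization.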
Although there are several corpora with protein annotation, incompatibility between the annotations in different corpora remains a problem that hinders the progress of automatic recognition of protein names in biomedical literature. Here, we report on our efforts to find a solution to the incompatibility issue, and to improve the compatibility between two representative protein-annotated corpora: the GENIA corpus and the GENETAG corpus. In a comparative study, we improve our insight into the two corpora, and a series of experimental results show that most of the incompatibility can be removed.