Yi Language Information Processing Technology Based on Character Matching Algorithm

Authors

  • Chengping Wang, Qingya Zeng, Dongyan Sun, Xuanxuan Tian

Abstract

A Yi character recognition system comprises four main modules: character segmentation, feature extraction, feature compression, and dictionary matching. Traditional recognition systems segment text poorly, which leads to a high recognition error rate. This paper studies Yi language information processing technology based on a character matching algorithm. The system improves the accuracy of character segmentation by applying the combination and anti-combination rules of Yi characters. It uses the 1024-dimensional peripheral direction contribution as the statistical feature of Yi characters, which discriminates well among the many similar characters in the Yi script. The system also uses a feature compression algorithm based on the Karhunen-Loève (KL) transform and a three-level fast dictionary matching algorithm, and the resulting Yi language recognition system is implemented on the Python platform. Experimental results show that the recognition rate of the platform exceeds 98.7%. The system provides a reference for research on Yi language information processing technology.
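The abstract does not spell out what the three dictionary levels are. As an illustration only, the sketch below assumes one common reading: a coarse-to-fine search in which the first two levels compare a short prefix of the compressed feature vector to shortlist candidates, and the third level compares full vectors on the survivors. The function name `three_level_match`, the `dims` and `keep` parameters, and the toy dictionary are all hypothetical and are not taken from the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def three_level_match(query, dictionary, dims=(2, 4), keep=(3, 2)):
    """Return the label of the best match for `query`.

    dictionary: list of (label, feature_vector) pairs.
    dims: prefix lengths compared at levels 1 and 2 (hypothetical values).
    keep: number of candidates surviving levels 1 and 2 (hypothetical values).
    """
    candidates = dictionary
    # Levels 1 and 2: rank on a short feature prefix, keep the closest few.
    for d, k in zip(dims, keep):
        candidates = sorted(
            candidates, key=lambda entry: dist(query[:d], entry[1][:d])
        )[:k]
    # Level 3: full-length comparison on the remaining candidates only.
    return min(candidates, key=lambda entry: dist(query, entry[1]))[0]


# Toy dictionary with 6-dimensional features (placeholder labels, not Yi data).
toy_dictionary = [
    ("char_a", [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
    ("char_b", [0.1, 0.0, 0.9, 0.9, 0.0, 0.0]),
    ("char_c", [0.1, 0.1, 0.1, 0.0, 1.0, 1.0]),
    ("char_d", [5.0, 5.0, 5.0, 5.0, 5.0, 5.0]),
]

best = three_level_match([0.1, 0.1, 0.0, 0.1, 0.9, 1.0], toy_dictionary)
print(best)
```

Prefix comparison is only a reasonable shortlist if the leading dimensions carry most of the variance, which is what a KL-transform compression step would typically provide when components are ordered by eigenvalue. A shortlist that is too small can discard the true match before the final level, so `keep` trades speed against accuracy.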

Published

2020-12-31

Section

Articles