LE QUY DON
Technical University

Identifying coordinated compound words for Vietnamese word segmentation

Tran, N.A. and Dao, T.T. and Nguyen, P.T. (2013) Identifying coordinated compound words for Vietnamese word segmentation. In: 2013 International Conference on Soft Computing and Pattern Recognition, SoCPaR 2013, 15 December 2013 through 18 December 2013.

Abstract

This paper proposes a dictionary-based method for identifying coordinated compound words in Vietnamese. Whether two contiguous simple words in a text form a coordinated compound word is decided from their properties, their parts of speech, and the similarity between their definitions in the dictionary of the Vietnamese Computational Lexicon (VCL). We also rely on sets of synonyms and antonyms to identify coordinated compound words (coordinated di-syllable phrases) and to compile a list of them. A number of rules then identify 3- or 4-syllable phrases and idioms based on the relations between coordinated di-syllable phrases. We carried out two major experiments: one for identifying and creating a list of coordinated compounds, the other for improving the accuracy of Vietnamese word segmentation. The second experiment showed that the word segmentation F-score increases by 0.11% to 0.41% (the error rate decreases by 3.32% to 12.6%). This is a new approach with high practical value. © 2013 IEEE.
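
The decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the English glosses, the `Entry` fields, the Jaccard token overlap used as the definition-similarity measure, and the 0.5 threshold are all assumptions made for illustration; the real method draws on the VCL dictionary and its own similarity measure and rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical lexicon entry; the fields are illustrative, not the VCL schema."""
    pos: str                      # part of speech, e.g. "N" or "V"
    definition: str               # gloss (invented English glosses here)
    synonyms: frozenset = frozenset()
    antonyms: frozenset = frozenset()


# Toy stand-in for a dictionary such as the VCL (assumed data).
LEXICON = {
    "quần": Entry("N", "garment worn on the lower body"),
    "áo": Entry("N", "garment worn on the upper body"),
    "bàn": Entry("N", "furniture with a flat top"),
    "được": Entry("V", "to gain something", antonyms=frozenset({"mất"})),
    "mất": Entry("V", "to lose something", antonyms=frozenset({"được"})),
    "ăn": Entry("V", "to take in food"),
}


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two definitions (an assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def is_coordinated(w1: str, w2: str, lexicon=LEXICON, threshold: float = 0.5) -> bool:
    """Guess whether two adjacent simple words form a coordinated compound."""
    e1, e2 = lexicon.get(w1), lexicon.get(w2)
    if e1 is None or e2 is None:
        return False              # unknown word: no evidence either way
    if e1.pos != e2.pos:
        return False              # both parts must share a part of speech
    # A synonym or antonym relation counts as direct evidence.
    if w2 in e1.synonyms | e1.antonyms or w1 in e2.synonyms | e2.antonyms:
        return True
    # Otherwise fall back to similarity between the two definitions.
    return jaccard(e1.definition, e2.definition) >= threshold


if __name__ == "__main__":
    for pair in [("quần", "áo"), ("được", "mất"), ("quần", "bàn"), ("áo", "ăn")]:
        print(pair, is_coordinated(*pair))
```

In this sketch, "quần áo" is accepted through definition similarity and "được mất" through the antonym relation, while "quần bàn" and "áo ăn" are rejected. A segmenter could use such a check to merge two adjacent syllables into one word before further processing.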

Item Type: Conference or Workshop Item (Paper)
Divisions: Faculties > Faculty of Information Technology
Identification Number: 10.1109/SOCPAR.2013.7054145
Uncontrolled Keywords: Pattern recognition; Semantics; Soft computing; Coordinated compounds; new word; similarity; Vietnamese; Word segmentation; Computational linguistics
Additional Information: Conference code: 111424. Language of original document: English.
URI: http://eprints.lqdtu.edu.vn/id/eprint/10073
