LE QUY DON
Technical University

Identifying reduplicative words for Vietnamese word segmentation

Ngoc Anh, T. and Phuong Thai, N. and Thanh Tinh, D. and Hong Quan, N. (2015) Identifying reduplicative words for Vietnamese word segmentation. In: 2015 International Conference on Computing and Communication Technologies: Research, Innovation, and Vision for Future, IEEE RIVF 2015, 25 January 2015 through 28 January 2015.

Abstract

This paper proposes a method based on linguistic word-formation rules and dictionaries for identifying reduplicative words in Vietnamese. The key idea is to decide whether adjacent syllables in a text can form a reduplicative word according to the rules by which such words are formed. For 2-syllable reduplicative words, the paper uses rules that describe repetition and opposition between pairs of initial consonants, rhymes, and tones. The method is then extended, starting from the 2-syllable case, to identify reduplicative words of 3 or 4 syllables for the Vietnamese word segmentation task. Experimental results showed that the F1-score improved to 98.61% and that word segmentation errors were reduced significantly (1.26%). © 2015 IEEE.
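
To make the 2-syllable idea concrete, the following is a minimal sketch of how two adjacent syllables might be compared by initial consonant, rhyme, and tone. It is not the paper's rule set: the function names `decompose` and `classify_pair`, the list of initials, and the three pattern labels are assumptions made for illustration, and the dictionary lookup and tone-opposition rules mentioned in the abstract are left out.

```python
import unicodedata

# Combining marks that carry Vietnamese tone (the level tone has no mark).
TONE_MARKS = {
    "\u0300": "huyen",  # grave
    "\u0301": "sac",    # acute
    "\u0303": "nga",    # tilde
    "\u0309": "hoi",    # hook above
    "\u0323": "nang",   # dot below
}

# Assumed inventory of written initial consonants, longest first so that
# "ngh" is tried before "ng" and "n".
INITIALS = sorted(
    ["ngh", "ng", "nh", "ch", "gh", "gi", "kh", "ph", "qu", "th", "tr",
     "b", "c", "d", "đ", "g", "h", "k", "l", "m", "n", "p", "r", "s",
     "t", "v", "x"],
    key=len, reverse=True,
)

def decompose(syllable):
    """Split a written syllable into (initial, rhyme, tone)."""
    tone = "ngang"
    kept = []
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # Recompose so vowel-quality marks (â, ê, ô, ơ, ư, ă) stay on their letters.
    base = unicodedata.normalize("NFC", "".join(kept))
    for initial in INITIALS:
        if base.startswith(initial) and len(base) > len(initial):
            return initial, base[len(initial):], tone
    return "", base, tone  # syllable with no written initial

def classify_pair(first, second):
    """Label a syllable pair as 'full', 'initial', 'rhyme', or None."""
    i1, r1, _ = decompose(first)
    i2, r2, _ = decompose(second)
    if i1 == i2 and r1 == r2:
        return "full"      # same initial and rhyme; tone may differ
    if i1 == i2:
        return "initial"   # repeated initial, opposed rhyme
    if r1 == r2:
        return "rhyme"     # repeated rhyme, opposed initial
    return None

for pair in [("lung", "linh"), ("lúng", "túng"), ("đo", "đỏ"), ("học", "sinh")]:
    print(pair, classify_pair(*pair))
```

A pattern match on its own over-generates, since ordinary word pairs can share an initial or a rhyme by chance; this is presumably why the abstract combines the rules with dictionaries.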

Item Type: Conference or Workshop Item (Paper)
Divisions: Faculties > Faculty of Information Technology
Identification Number: 10.1109/RIVF.2015.7049878
Uncontrolled Keywords: Computer programming; F1 scores; reduplicative rules; reduplicative word; Vietnamese; Word formations; Word segmentation; Computational linguistics
Additional Information: Conference code: 111206. Language of original document: English.
URI: http://eprints.lqdtu.edu.vn/id/eprint/9960
