Learning phrase representations based on word and character embeddings

Abstract

Most phrase embedding methods treat a phrase as a basic unit and learn its embedding from the phrase's external contexts, ignoring the internal structure of its words and characters. In some languages such as Chinese, however, a phrase is usually composed of several words or characters and carries rich internal information, and the semantic meaning of a phrase is closely related to the meanings of its component words or characters. Taking Chinese as an example, we therefore propose a joint word and character embedding model for learning phrase representations. To disambiguate words and characters and to handle non-compositional phrases, we present multiple-prototype word and character embeddings together with an effective phrase selection method. We evaluate the proposed model on phrase similarity computation and analogical reasoning. The empirical results show that our model outperforms baseline methods that ignore internal word and character information.
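The core compositional idea described in the abstract can be sketched in a few lines. The snippet below is a toy illustration only, not the paper's trained model: the vocabularies and 2-dimensional vectors are made up, the equal word/character weighting is an assumption, and the paper's multiple-prototype embeddings and phrase selection step are not reproduced.

```python
import numpy as np

# Hypothetical toy embeddings (in practice these would be learned
# from a corpus with a skip-gram-style objective).
word_vecs = {
    "智能": np.array([0.2, 0.5]),  # "smart"
    "手机": np.array([0.4, 0.1]),  # "phone"
}
char_vecs = {
    "智": np.array([0.1, 0.3]),
    "能": np.array([0.3, 0.7]),
    "手": np.array([0.2, 0.0]),
    "机": np.array([0.6, 0.2]),
}

def phrase_embedding(words, compositional=True):
    """Compose a phrase vector from word and character embeddings.

    For a compositional phrase, mix each word's vector with the mean of
    its character vectors (equal weights here, an arbitrary choice),
    then average over the words. A non-compositional phrase (e.g. an
    idiom) would instead get its own learned vector, as selected by the
    paper's phrase selection method.
    """
    if not compositional:
        raise ValueError("non-compositional phrases need a dedicated learned vector")
    parts = []
    for w in words:
        char_part = np.mean([char_vecs[c] for c in w], axis=0)
        parts.append(0.5 * word_vecs[w] + 0.5 * char_part)
    return np.mean(parts, axis=0)

vec = phrase_embedding(["智能", "手机"])  # "smartphone" as a two-word phrase
```

The key design point the abstract argues for is visible even in this sketch: the phrase vector draws on character-level information (`char_vecs`) in addition to word-level contexts, which purely external-context methods discard.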

Citation (APA)

Huang, J., Ji, D., Yao, S., Huang, W., & Chen, B. (2016). Learning phrase representations based on word and character embeddings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9950 LNCS, pp. 547–554). Springer Verlag. https://doi.org/10.1007/978-3-319-46681-1_65
