Dependency parsing is an important subtask of natural language processing. In this paper, we propose an embedding feature transforming method for graph-based parsing, transform-based parsing, which directly utilizes the inner similarity of the features to extract information from all feature strings including the un-indexed strings and alleviate the feature sparse problem. The model transforms the extracted features to transformed features via applying a feature weight matrix, which consists of similarities between the feature strings. Since the matrix is usually rank-deficient because of similar feature strings, it would influence the strength of constraints. However, it is proven that the duplicate transformed features do not degrade the optimization algorithm: the margin infused relaxed algorithm. Moreover, this problem can be alleviated by reducing the number of the nearest transformed features of a feature. In addition, to further improve the parsing accuracy, a fusion parser is introduced to integrate transformed and original features. Our experiments verify that both transform-based and fusion parser improve the parsing accuracy compared to the corresponding feature-based parser.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.