Social media and health-related forums, including the expression of customer reviews, have recently provided data sources for adverse drug reaction (ADR) identification research. However, in the existing methods, the neglect of noise data and the need for manually labeled data reduce the accuracy of the prediction results and greatly increase manual labor. We propose a novel architecture named the weakly supervised mechanism (WSM) convolutional neural network (CNN) long-short-term memory (WSM-CNN-LSTM), which combines the strength of CNN and bi-directional long short-term memory (Bi-LSTM). The WSM applies the weakly labeled data to pre-train the parameters of the model and then uses the labeled data to fine-tune the initialized network parameters. The CNN employs a convolutional layer to study the characteristics of the drug reviews and active features at different scales, and then the feed-forward and feed-back neural networks of the Bi-LSTM utilize these salient features to output the regression results. The experimental results effectively demonstrate that our model marginally outperforms the comparison models in ADR identification and that a small quantity of labeled samples results in an optimal performance, which decreases the influence of noise and reduces the manual data-labeling requirements.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited