Document Re-Ranking Model for Machine-Reading and Comprehension
Abstract
Featured Application
1. Introduction
2. Previous Studies
3. Re-Ranking Model Based on Artificial Neural Network
4. Evaluation
4.1. Datasets and Experimental Settings
4.2. Experimental Results
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
1. Lee, H.-G.; Kim, H. GF-Net: Improving machine reading comprehension with feature gates. Pattern Recognit. Lett. 2020, 129, 8–15.
2. Wang, W.; Yang, N.; Wei, F.; Chang, B.; Zhou, M. Gated Self-Matching Networks for Reading Comprehension and Question Answering. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, 30 July–4 August 2017; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA; pp. 189–198.
3. Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. In Proceedings of the 8th International Conference on Learning Representations 2020 (ICLR), Addis Ababa, Ethiopia, 27–30 April 2020.
4. Chen, D.; Fisch, A.; Weston, J.; Bordes, A. Reading Wikipedia to Answer Open-Domain Questions. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, 30 July–4 August 2017; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA; pp. 1870–1879.
5. Lee, J.; Yun, S.; Kim, H.; Ko, M.; Kang, J. Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA; pp. 565–569.
6. Kratzwald, B.; Feuerriegel, S. Adaptive Document Retrieval for Deep Question Answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA; pp. 576–587.
7. Xiong, C.; Dai, Z.; Callan, J.; Liu, Z.; Power, R. End-to-End Neural Ad-hoc Ranking with Kernel Pooling. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Tokyo, Japan, 7–11 August 2017; Association for Computing Machinery (ACM): New York, NY, USA; pp. 55–64.
8. Guo, J.; Fan, Y.; Ai, Q.; Croft, W.B. A Deep Relevance Matching Model for Ad-hoc Retrieval. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, Indianapolis, IN, USA, 24–28 October 2016; Association for Computing Machinery (ACM): New York, NY, USA; pp. 55–64.
9. Dai, Z.; Xiong, C.; Callan, J.; Liu, Z. Convolutional Neural Networks for Soft-Matching N-Grams in Ad-hoc Search. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (WSDM), Marina Del Rey, CA, USA, 5–9 February 2018; Association for Computing Machinery (ACM): New York, NY, USA; pp. 126–134.
10. Mitra, B.; Craswell, N. An Updated Duet Model for Passage Re-Ranking. arXiv 2019, arXiv:1903.07666v1.
11. Alaparthi, C. Microsoft AI Challenge India 2018: Learning to Rank Passages for Web Question Answering with Deep Attention Networks. In Proceedings of the 2nd Workshop on Humanizing AI (HAI) at IJCAI’19, Macao, China, 12 August 2019.
12. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient Estimation of Word Representations in Vector Space. arXiv 2013, arXiv:1301.3781.
13. Pennington, J.; Socher, R.; Manning, C. GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA; pp. 1532–1543.
14. Bojanowski, P.; Grave, E.; Joulin, A.; Mikolov, T. Enriching Word Vectors with Subword Information. Trans. Assoc. Comput. Linguist. 2017, 5, 135–146.
15. Cover, T.M. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 1999.
16. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. In Proceedings of the NIPS 2014 Workshop on Deep Learning NeurIPS, Montreal, QC, Canada, 8–13 December 2014.
17. Kalchbrenner, N.; Grefenstette, E.; Blunsom, P. A Convolutional Neural Network for Modelling Sentences. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, MD, USA, 23–25 June 2014; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA; pp. 655–665.
18. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–7 December 2017; pp. 5998–6008.
19. Zhou, X.; Wan, X.; Xiao, J. Attention-based LSTM Network for Cross-Lingual Sentiment Classification. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA; pp. 247–256.
20. Conneau, A.; Kiela, D.; Schwenk, H.; Barrault, L.; Bordes, A. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 7–11 September 2017; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA; pp. 670–680.
21. Bajaj, P.; Campos, D.; Craswell, N.; Deng, L.; Gao, J.; Liu, X.; Majumder, R.; McNamara, A.; Mitra, B.; Nguyen, T.; et al. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. Choice 2016, 2640, 660.
22. Yang, P.; Fang, H.; Lin, J. Anserini: Reproducible ranking baselines using Lucene. ACM J. Data Inf. Qual. 2018, 10, 16.
23. Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the International Conference on Learning Representations (ICLR), Toulon, France, 7–9 May 2015.
24. Craswell, N. Mean Reciprocal Rank. In Encyclopedia of Database Systems; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2009; p. 1703.
25. Nogueira, R.; Cho, K. Passage Re-Ranking with BERT. arXiv 2019, arXiv:1901.04085.
26. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; Volume 1 (Long and Short Papers), pp. 4171–4186.






| Model | MRR@10 | 
|---|---|
| Proposed model | 0.303 | 
| w/o additional input embeddings | 0.223 | 
| w/o phrase modeling layer | 0.295 | 

| Model | MRR@10 (Development Set) | MRR@10 (Test Set) |
|---|---|---|
| BM25 [22] | 0.167 | 0.167 |
| KNRM [7] | 0.218 | 0.218 |
| Duet v2 [10] (official baseline) | 0.243 | 0.245 |
| Duet v2 [10] (ensembled) | 0.252 | 0.252 |
| Conv-KNRM [9] | 0.247 | 0.247 |
| Conv-KNRM [9] (ensembled) | 0.271 | 0.290 |
| Alaparthi et al., 2019 [11] | 0.298 | 0.291 |
| Proposed model | 0.303 | 0.299 |
| BERT-Base [25] | 0.347 | 0.347 |
| BERT-Large [25] | 0.365 | 0.365 |

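
The tables above report MRR@10, the mean reciprocal rank [24] truncated at the top 10 results. As a rough illustration of how such a score is conventionally computed, the sketch below averages, over all queries, the reciprocal of the rank of the first relevant passage within the top k, counting zero when none appears. It is not the authors' evaluation script: the function name `mrr_at_k`, the dictionary layout, and the toy query and passage IDs are all hypothetical.

```python
from typing import Dict, List, Set


def mrr_at_k(
    rankings: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Mean reciprocal rank of the first relevant passage within the top k.

    rankings: query ID -> passage IDs ordered from best to worst (assumed layout).
    relevant: query ID -> set of passage IDs judged relevant (assumed layout).
    """
    if not rankings:
        return 0.0
    total = 0.0
    for query_id, ranked in rankings.items():
        gold = relevant.get(query_id, set())
        # Only the first relevant hit inside the top k contributes.
        for rank, passage_id in enumerate(ranked[:k], start=1):
            if passage_id in gold:
                total += 1.0 / rank
                break
    # Queries with no relevant passage in the top k contribute 0.
    return total / len(rankings)


# Toy data (hypothetical): q1 hits at rank 2, q2 at rank 1, q3 never.
toy_rankings = {
    "q1": ["p3", "p1", "p7"],
    "q2": ["p2", "p9"],
    "q3": ["p5", "p6"],
}
toy_relevant = {"q1": {"p1"}, "q2": {"p2"}, "q3": {"p8"}}

score = mrr_at_k(toy_rankings, toy_relevant, k=10)
print(f"MRR@10 = {score:.3f}")
```

For the toy data the three per-query contributions are 1/2, 1, and 0, so the mean is 0.5. A re-ranking model raises this score by moving the first relevant passage closer to the top of each list.
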
| Model | Memory Usage (MB) | Training Parameters (MB) | Response Time (ms) | 
|---|---|---|---|
| BERT-Base [25] | 1289 | 110 | 1100 | 
| Proposed model | 161 | 3.5 | 300 | 

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Jang, Y.; Kim, H. Document Re-Ranking Model for Machine-Reading and Comprehension. Appl. Sci. 2020, 10, 7547. https://doi.org/10.3390/app10217547