Textual Data Science for Logistics and Supply Chain Management
Abstract
1. Introduction
2. Methodology
3. Results
3.1. Word Cloud
3.2. Sentiment Analysis
- The polarity function scans for positive and negative words within a subjectivity lexicon.
- Once a polarity word is found, the function creates a cluster of terms, including the four preceding and the two following words.
- Within each cluster, the polarity function looks for valence shifters, which give a weight to the polar word (amplifiers add 0.8, negators subtract 0.8). Words with positive polarity are counted as 1, whereas words with negative polarity are counted as −1.
- The grand total of the weighted positive, negative, amplifying, and negating words is then divided by the square root of the number of words in the respective passage. This normalization measures the density of key words and makes passages of different lengths more comparable.
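The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the tiny lexicons are hypothetical placeholders, and because the description does not say how shifters combine with negative words, the sketch assumes that each shifter scales the polar word in its own direction (an amplifier moves the score 0.8 further from zero, a negator moves it 0.8 back toward zero).

```python
import math
import re

# Hypothetical toy lexicons; a real analysis would use a full subjectivity lexicon.
POSITIVE = {"good", "reliable", "accurate"}
NEGATIVE = {"bad", "late", "wrong"}
AMPLIFIERS = {"very", "extremely"}
NEGATORS = {"not", "never"}

SHIFT_WEIGHT = 0.8  # weight of a valence shifter, as stated in the text


def polarity(passage: str) -> float:
    """Score one passage following the four steps described above."""
    words = re.findall(r"[a-z']+", passage.lower())
    if not words:
        return 0.0
    total = 0.0
    for i, word in enumerate(words):
        # Step 1: look for polarity words (+1 for positive, -1 for negative).
        if word in POSITIVE:
            base = 1.0
        elif word in NEGATIVE:
            base = -1.0
        else:
            continue
        # Step 2: context cluster of four preceding and two following words.
        cluster = words[max(0, i - 4):i] + words[i + 1:i + 3]
        # Step 3: weight the polar word by the valence shifters in the cluster.
        # Assumption: shifters scale the polar word in its own direction.
        n_amp = sum(1 for w in cluster if w in AMPLIFIERS)
        n_neg = sum(1 for w in cluster if w in NEGATORS)
        total += base * (1.0 + SHIFT_WEIGHT * n_amp - SHIFT_WEIGHT * n_neg)
    # Step 4: divide by the square root of the passage length.
    return total / math.sqrt(len(words))


if __name__ == "__main__":
    print(polarity("the forecast is very good"))
    print(polarity("delivery was not late"))
```

Under these assumptions, an amplified positive word contributes 1.8 and a negated negative word contributes −0.2 before the length normalization is applied.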
3.3. Topic Modeling
- Company A: 2929 terms (sparsity of 77%)
- Company B: 2158 terms (sparsity of 71%)
- Company C: 2592 terms (sparsity of 68%)
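To make the two figures reported per company concrete, the following sketch builds a small document-term matrix and computes its number of terms and its sparsity. It assumes that sparsity denotes the share of zero cells in the matrix, which is the usual meaning for document-term matrices; the three toy documents and the function names are hypothetical and unrelated to the interview data.

```python
from collections import Counter


def document_term_matrix(docs):
    """Return the sorted vocabulary and one row of term counts per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def sparsity(matrix):
    """Share of zero entries among all cells of the matrix."""
    cells = sum(len(row) for row in matrix)
    if cells == 0:
        return 0.0
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / cells


if __name__ == "__main__":
    # Hypothetical toy documents standing in for interview transcripts.
    toy_docs = ["stock forecast stock", "truck forecast", "orders"]
    vocab, dtm = document_term_matrix(toy_docs)
    print(len(vocab), "terms, sparsity of", round(100 * sparsity(dtm)), "%")
```

A high sparsity value indicates that most terms occur in only a few documents, which is typical for interview transcripts.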
3.4. Correspondence Analysis
3.5. Multidimensional Scaling
4. Conclusions and Further Research
- Word clouds: Our analysis shows which topics (words) were most frequently mentioned. This allows managers and researchers to identify core topics in textual data.
- Sentiment analysis: We computed polarity scores based on managers’ sentiments. In our example, we were especially interested in whether the average polarity scores differ across companies. The results illustrate how sentiment analysis can be used to uncover the emotional tone of textual data, which reveals how certain topics are perceived in general.
- Topic models: Using the document-term matrix as a starting point, we clustered the texts by their dominant topics by means of LDA. Each resulting topic lists its terms in order of importance. This analysis reveals the topics that exist within a specific company (or document) and allows for an easy comparison of what is important across companies (or documents).
- Correspondence analysis: We computed associations between the words being used and the respective managers. A visual representation allows for an easy interpretation of how managers and topics are related. This helps to detect which interests and priorities a specific manager has and can easily be extended to reveal any kind of relationship between and within two groups of variables.
- Multidimensional scaling (MDS): We were interested in representing similarities between the interviews. Clusters of managers emerged that highlighted the differences between companies and revealed underlying dimensions. This technique is frequently used in marketing research (e.g., for market segmentation, product positioning, brand image measurement, and brand similarity studies) but, as we have shown in our study, it might be equally useful to position respondents from other functional areas according to their respective preferences and perceptions.
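MDS starts from a matrix of pairwise dissimilarities between the interviews. The sketch below shows one way such a matrix could be derived from raw text. The choice of cosine dissimilarity on term counts is an assumption made for illustration, since this section does not state which measure was used, and the function names and toy texts are hypothetical.

```python
import math
from collections import Counter


def cosine_dissimilarity(a: Counter, b: Counter) -> float:
    """Return 1 minus the cosine similarity of two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty text shares nothing with any other text
    return 1.0 - dot / (norm_a * norm_b)


def dissimilarity_matrix(texts):
    """Symmetric matrix of pairwise dissimilarities with a zero diagonal."""
    counts = [Counter(text.lower().split()) for text in texts]
    n = len(counts)
    return [
        [0.0 if i == j else cosine_dissimilarity(counts[i], counts[j])
         for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical toy texts standing in for interview transcripts.
    toy_texts = ["stock forecast data", "forecast data system", "truck tours"]
    for row in dissimilarity_matrix(toy_texts):
        print([round(value, 2) for value in row])
```

The resulting matrix can be passed to any MDS routine, which then places interviews with small dissimilarities close to each other in the low-dimensional map.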
Author Contributions
Funding
Conflicts of Interest
Appendix A. R Source Code
Manager | Company A | Company B | Company C |
---|---|---|---|
1 | 0.49 | 0.33 | 0.20 |
2 | 0.65 | 0.06 | −0.04 |
3 | −0.17 | 0.65 | 1.25 |
4 | −0.30 | −0.58 | 0.08 |
5 | 0.82 | −0.06 | 0.68 |
6 | 0.46 | 0.45 | |
7 | 0.99 | | |
8 | 0.22 | | |
avg | 0.40 | 0.14 | 0.43 |
Rank | Topic 1 | Topic 2 | Topic 3 | Topic 4 | Topic 5 |
---|---|---|---|---|---|
1 | logistics | data | time | frequency | logistics |
2 | forecast | system | orders | branch | deliver |
3 | positions | forecast | business | pickup | stock |
4 | term | time | procurement | business | truck |
5 | tours | product | stock | channels | extreme |
6 | stock | business | items | forecast | logistic |
7 | leave | laughs | ordered | data | ordered |
8 | talks | volume | account | positions | distribution |
9 | thing | development | internal | quality | major |
10 | truck | evaluate | professional | extrapolation | sale |
Rank | Topic 1 | Topic 2 | Topic 3 | Topic 4 | Topic 5 |
---|---|---|---|---|---|
1 | budget | overdue | bottle | mio | bottle |
2 | stand | forecast precision | count | effect | march |
3 | market | days | million | affected | mio |
4 | board | form | deviation | danger | rolling |
5 | budgeting | moment | takes | quarter | giant |
6 | contract | past | pcs | advantage | classic |
7 | directors | campaign | budget | carton | increases |
8 | rolling | claims | case | controlling | middle |
9 | theoretical | figure | hand | goods | rebuilt |
10 | contracts | any | larger | input | drip |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Treiblmaier, H.; Mair, P. Textual Data Science for Logistics and Supply Chain Management. Logistics 2021, 5, 56. https://doi.org/10.3390/logistics5030056