A Sustainability-Oriented NLP Framework for Early Detection of Economic, Operational, and Environmental Risks in Global Shipping
Abstract
1. Introduction
1.1. Research Background and Purpose
1.2. Research Scope and Methods
2. Review of Prior Research on Shipping NLP
2.1. Literature Review on Shipping Sentiment Analysis and Sustainability-Related Risks
2.2. Limitations of Prior Research from a Sustainability Perspective
2.3. Contributions and Research Direction of the Present Study
3. Research Design and Empirical Analysis
3.1. Overview of the Sustainability-Oriented NLP Framework
- Shipping Market Text Collection and Pre-Processing: Shipping- and logistics-related news articles and weekly market reports are systematically collected and pre-processed to ensure textual consistency, temporal alignment, and analytical reliability. The focus on expert-curated weekly reports allows the framework to capture slow-moving, structural sustainability signals rather than high-frequency market noise. This stage establishes the foundation for capturing qualitative signals related to market instability, operational disruptions, regulatory transitions, and environmental pressures;
- Domain-Tuned Topic and Sentiment Extraction: Sustainability-relevant issue themes are identified using topic modeling, while sentiment polarity is measured using a shipping-specific sentiment lexicon aligned with CTQ dimensions. The sentiment lexicon is explicitly designed to capture sustainability-relevant stress expressions (e.g., congestion persistence, regulatory burden, capacity imbalance) rather than generic market optimism or pessimism. This step enables the detection of narratives associated with sustainability stress—such as congestion, rerouting risk, regulatory compliance, and decarbonization—that are embedded in shipping market discourse;
- Topic–Sentiment Network Construction: A topic–sentiment network is constructed to capture the structural relationships between extracted topics and sentiment signals. Network centrality, edge weight, and co-occurrence persistence are interpreted as indicators of systemic relevance rather than causal dominance. This step allows for the identification of which sustainability-related issues function as stress hubs and how narrative emphasis propagates across interconnected topics, providing structural context beyond isolated sentiment scores;
- Estimation of Topic-to-CTQ Impact Weights: The influence of sustainability-related topics and sentiment signals on sustainability-critical CTQ dimensions is quantified using ElasticNet regression, vector autoregression (VAR), and Bayesian estimation techniques. ElasticNet is employed to ensure model interpretability and robustness under multicollinearity, while VAR-based diagnostics are used to assess temporal precedence rather than causal determinism. This stage produces a weight matrix formalizing how qualitative narratives are statistically associated with changes in economic stability, operational reliability, and environmental performance;
- Construction of Composite Sustainability Stress Indices (CTQScores): Topic and sentiment information, weighted by their estimated impacts, are aggregated into composite CTQScores for each sustainability-critical dimension—freight rate stability, schedule reliability, lead time, vessel utilization, equipment availability, and eco-efficiency. CTQScores are explicitly interpreted as indicators of sustainability stress intensity and persistence, not as direct performance forecasts. Positive CTQScore values indicate stabilizing narrative conditions, whereas negative values reflect elevated sustainability stress;
- Validation through KPI Linkage and Early-Warning Assessment: The constructed CTQScores are empirically validated by linking them to observed key performance indicators (KPIs), such as the Shanghai Containerized Freight Index (SCFI) and operational performance metrics. Validation focuses on directional accuracy, temporal lead–lag relationships, and stress-detection capability rather than point-forecast precision. VAR-based diagnostics and robustness checks are used to evaluate whether sustainability stress signals extracted from textual data systematically precede or coincide with observable market and operational deterioration;
- Sustainability Risk Signal Generation and Monitoring: Based on validated CTQScore trajectories, sustainability risk signals are generated when stress levels exceed predefined thresholds or exhibit abnormal persistence. These thresholds are calibrated to balance sensitivity and false-alarm risk and are interpreted as early warning alerts rather than deterministic predictions. The resulting signals support proactive managerial and policy responses aimed at mitigating long-term sustainability risks.
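The aggregation step in the list above can be illustrated with a minimal sketch. It assumes that a CTQScore is a weighted sum of signed topic sentiment. The weight values, the `WEIGHTS` layout, and the function name `ctq_scores` are hypothetical placeholders rather than the estimates or code of the study; the topic IDs follow the topic table (T1 port congestion, T2 service irregularity, T5 capacity imbalance, T6 rate fluctuation).

```python
from typing import Dict

# Hypothetical topic-to-CTQ weights (assumption for illustration only).
# In the framework these would come from the ElasticNet/VAR/Bayesian stage.
WEIGHTS: Dict[str, Dict[str, float]] = {
    "freight_rate_stability": {"T6": 0.7, "T5": 0.3},
    "schedule_reliability": {"T2": 0.6, "T1": 0.4},
}


def ctq_scores(topic_sentiment: Dict[str, float],
               weights: Dict[str, Dict[str, float]] = WEIGHTS) -> Dict[str, float]:
    """Return one weighted sum of signed topic sentiment per CTQ dimension.

    Positive values indicate stabilizing narratives; negative values
    indicate elevated stress. Topics missing from the input count as 0.
    """
    return {
        ctq: sum(w * topic_sentiment.get(topic, 0.0) for topic, w in topic_w.items())
        for ctq, topic_w in weights.items()
    }


# Made-up signed sentiment per topic for a single week.
week = {"T1": -0.5, "T2": -0.2, "T5": 0.1, "T6": 0.4}
scores = ctq_scores(week)
for name, value in scores.items():
    print(f"{name}: {value:+.2f}")
```

Keeping the weights in a plain mapping makes each CTQScore traceable to the topics that drive it, which matches the interpretability goal stated for the ElasticNet stage.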
3.2. Empirical Analysis by Research Stage
3.2.1. Data Collection and Pre-Processing
3.2.2. Identification of Sustainability-Critical Issue Domains Through Topic Modeling
3.2.3. Domain-Tuned Shipping-Specific Sentiment Lexicon: MTL-Based Sustainability Perception Modeling
Construction of the Sustainability-Oriented Shipping Sentiment Lexicon
- Base dictionary extraction: Positive and negative terms (approximately 6000 entries) were initially extracted from the Loughran–McDonald financial sentiment dictionary, providing a standardized and widely validated foundation for polarity classification;
- Domain corpus integration: Approximately 2.5 million tokens were collected from the shipping market corpus spanning 155 weekly KMI reports (2022–2025). Using TF–IDF analysis, the most salient shipping- and sustainability-relevant terms were identified as candidate vocabulary for lexicon expansion;
- Sustainability-critical CTQ labeling: Candidate terms were manually reviewed and labeled according to six sustainability-critical CTQ dimensions—freight rate stability, schedule reliability, vessel utilization, lead time, equipment availability, and eco-efficiency. Sentiment polarity was assigned according to each term’s implications for sustainable shipping performance rather than short-term profitability. For example, expressions indicating stabilized freight rates or improved utilization were classified as positive, whereas terms related to congestion, oversupply, or service disruption were classified as negative. Environmental terms were evaluated based on their contribution to long-term sustainability trajectories rather than immediate cost implications;
- PMI-based lexicon expansion: Pointwise Mutual Information (PMI) was calculated to identify statistically significant co-occurrence relationships between sentiment terms and CTQ-related keywords. Terms exceeding a predefined PMI threshold were incorporated into the lexicon to enhance contextual relevance;
- Coherence validation: The coherence of the expanded lexicon was evaluated using Word2Vec-based similarity measures and topic–sentiment association strengths. The resulting coherence score (c_v = 0.58) indicates that the lexicon reliably captures sustainability-relevant semantic structures in shipping discourse.
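The PMI-based expansion step can be sketched as follows. This is a minimal illustration under stated assumptions: co-occurrence is counted at document level, PMI is computed as log2(p(a,b) / (p(a) p(b))), and the threshold of 0.5 is arbitrary. The source does not specify the co-occurrence window or the threshold, and `pmi_table` is a hypothetical helper name.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Document-level PMI for every term pair that co-occurs at least once."""
    n_docs = len(documents)
    term_df = Counter()   # number of documents containing each term
    pair_df = Counter()   # number of documents containing both terms
    for tokens in documents:
        vocab = sorted(set(tokens))
        term_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    scores = {}
    for (a, b), n_ab in pair_df.items():
        p_ab = n_ab / n_docs
        p_a = term_df[a] / n_docs
        p_b = term_df[b] / n_docs
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy corpus (made up); each inner list is one tokenized report.
docs = [
    ["congestion", "delay", "port"],
    ["congestion", "delay"],
    ["recovery", "stability"],
    ["port", "stability"],
]
pmi = pmi_table(docs)

THRESHOLD = 0.5  # assumed cut-off, not the value used in the study
selected = sorted(pair for pair, score in pmi.items() if score > THRESHOLD)
print(selected)
```

On this toy corpus only the pairs (congestion, delay) and (recovery, stability) exceed the threshold; pairs that co-occur no more often than chance would predict receive a PMI of 0 and are dropped.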
MTL-Based Sustainability Perception Modeling
Lexicon Characteristics and CTQ Linkage
Distinguishing Features from a Sustainability Perspective
3.2.4. Topic–Sentiment Network Analysis: Systemic Sustainability Stress Propagation
Network Construction and Weighting Scheme
3.2.5. Estimation of Topic-to-CTQ Weights and Construction of Sustainability Stress Indices
Model Specification and Estimation of Topic-to-CTQ Weights
Construction and Interpretation of CTQScores
Illustrative Temporal Dynamics of Sustainability Stress
Robustness Checks, Sensitivity Analysis, and Model Scope
3.2.6. KPI Validation and Operationalization of the Sustainability Risk Radar
Validation Through KPI Linkage
Temporal Precedence and Dynamic Interaction Diagnostics
CTQ-Specific Heterogeneity of Sustainability Stress Relevance
Directional Accuracy and Sustainability Relevance
Sustainability Risk Signal Generation and Interpretation as a Sustainability Risk Radar
4. Conclusions and Implications
4.1. Managerial Implications
4.2. Policy and Regulatory Implications
4.3. Theoretical Implications
4.4. Limitations and Directions for Future Research
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| BDI | Baltic Dry Index |
| BERT | Bidirectional Encoder Representations from Transformers |
| CCFI | China Containerized Freight Index |
| CTQ | Critical-to-Quality |
| CV | Cross-Validation |
| DAPT | Domain-Adaptive Pre-training |
| FinBERT | Financial BERT |
| IMO | International Maritime Organization |
| KMI | Korea Maritime Institute |
| KOBC | Korea Ocean Business Corporation |
| KPI | Key Performance Indicator |
| LDA | Latent Dirichlet Allocation |
| LM | Language Model |
| LSTM | Long Short-Term Memory |
| MAPE | Mean Absolute Percentage Error |
| MLM | Masked Language Modeling |
| MTL | Multi-Task Learning |
| NAC | Named Account Contract |
| NLP | Natural Language Processing |
| PCA | Principal Component Analysis |
| PMI | Pointwise Mutual Information |
| ProdLDA | Product of Experts Latent Dirichlet Allocation |
| SCFI | Shanghai Containerized Freight Index |
| S/N | Sales and Negotiation |
| SSI | Shipping Sentiment Index |
| VAR | Vector Autoregression |
| W | Weights |
| WS | World Scale |
Appendix A. Comparative Evaluation of Topic Modeling Methods
Appendix A.1. Evaluation Criteria
- Topic coherence: Quantitative coherence scores were computed to assess semantic consistency within topics. Both c_v and u_mass coherence metrics were calculated using the Gensim framework;
- Stability across runs: Each model was estimated over multiple random initializations, and topic overlap consistency was examined;
- Interpretability: Topics were qualitatively assessed by domain experts in maritime economics based on semantic clarity and alignment with sustainability-critical issues;
- Suitability for sustainability analysis: Models were evaluated on their ability to capture persistent, structural issue domains rather than short-lived event-driven clusters.
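The "stability across runs" criterion can be made concrete with a small sketch. It assumes that stability is measured as the mean best-match Jaccard overlap between the top-word lists of two runs; the source does not state which overlap measure was used, so this is only one plausible operationalization. The function names and the toy topics are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of words."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def run_stability(run_a, run_b):
    """Mean best-match Jaccard overlap of topics in run_a against run_b.

    Each run is a list of topics; each topic is a list of its top words.
    Topics are matched greedily to their most similar counterpart.
    """
    if not run_a or not run_b:
        return 0.0
    best = [max(jaccard(topic, other) for other in run_b) for topic in run_a]
    return sum(best) / len(best)


# Top words from two random initializations (made-up example data).
run_1 = [["delay", "congestion", "port"], ["rate", "scfi", "surcharge"]]
run_2 = [["rate", "scfi", "volatility"], ["delay", "congestion", "rerouting"]]

stability = run_stability(run_1, run_2)
print(f"stability = {stability:.2f}")
```

A value near 1 means the same topics reappear under different seeds, while a value near 0 means the topics are largely initialization-dependent.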
Appendix A.2. Comparative Results
| Model | Avg. c_v | Avg. u_mass | Stability Across Runs | Interpretability | Sustainability Suitability |
|---|---|---|---|---|---|
| LDA | 0.41 | −8.72 | Moderate | Low–Moderate | Limited |
| BERTopic | 0.53 | −6.18 | Low | Moderate | Event-oriented |
| ProdLDA | 0.61 | −5.02 | High | High | Structural/Long-term |
Appendix A.3. Implications for Model Selection
Appendix B. CTQ-Linked Sentiment Lexicon
| Freight Rate Stability | |||||
|---|---|---|---|---|---|
| Term | Polarity | Term | Polarity | Term | Polarity |
| CBAM | Negative | freight rate | Neutral | improvement | Positive |
| decline | Negative | long-term contract | Neutral | increment | Positive |
| downward trend | Negative | rates | Neutral | normalization | Positive |
| hike | Negative | SCFI | Neutral | Poseidon Principles | Positive |
| instability | Negative | spot market | Neutral | PSS | Positive |
| overcapacity | Negative | WS | Neutral | recovery | Positive |
| plunge | Negative | BAF | Positive | resilience | Positive |
| sharp decline | Negative | balance | Positive | resilient | Positive |
| sharp increase | Negative | bunker hedging | Positive | slow steaming | Positive |
| surcharge | Negative | canal surcharge | Positive | stability | Positive |
| volatile | Negative | capacity crunch | Positive | stabilize | Positive |
| volatility | Negative | charter rate spike | Positive | stabilized | Positive |
| BDI | Neutral | green financing | Positive | stable | Positive |
| capacity adjustment | Neutral | green premium | Positive | tonnage tax | Positive |
| CCFI | Neutral | GRI | Positive | | |
| Schedule Reliability | |||||
| Term | Polarity | Term | Polarity | Term | Polarity |
| bottleneck | Negative | transshipment delay | Negative | on-time | Positive |
| cancellation | Negative | blank sailing | Negative | pro-forma schedule | Positive |
| congestion | Negative | sailing schedule | Neutral | reliability | Positive |
| delay | Negative | schedule | Neutral | schedule adherence | Positive |
| detour | Negative | shipment | Neutral | smooth flow | Positive |
| disruption | Negative | terminal congestion | Neutral | stable operation | Positive |
| port labor strike | Negative | trade lane | Neutral | | |
| rerouting | Negative | | | | |
| Vessel Utilization | |||||
| Term | Polarity | Term | Polarity | Term | Polarity |
| idle vessel | Negative | carrier | Neutral | autonomous navigation | Positive |
| port congestion | Negative | carriers | Neutral | backhaul | Positive |
| orderbook-to-fleet ratio | Negative | charter | Neutral | demolition value | Positive |
| underused | Negative | fleet | Neutral | digital twin | Positive |
| overcapacity | Negative | port | Neutral | EEXI compliance | Positive |
| bulk | Neutral | supply adjustment | Neutral | methanol-ready | Positive |
| bunkering | Neutral | tanker | Neutral | scrapping | Positive |
| capacity | Neutral | tonnage | Neutral | slow steaming | Positive |
| ton-mile | Neutral | | | | |
| Lead Time | |||||
| Term | Polarity | Term | Polarity | Term | Polarity |
| diversion | Negative | lead time | Neutral | transit time | Neutral |
| drought | Negative | logistics | Neutral | transshipment time | Neutral |
| port congestion | Negative | logistics flow | Neutral | efficiency | Positive |
| restriction | Negative | route | Neutral | slow steaming | Positive |
| waiting | Negative | route re-routing | Neutral | smart port automation | Positive |
| customs delay | Neutral | shipment | Neutral | speed | Positive |
| distance | Neutral | transit | Neutral | | |
| Equipment Availability | |||||
| Term | Polarity | Term | Polarity | Term | Polarity |
| chassis shortage | Negative | availability | Neutral | trucking | Neutral |
| deficiency | Negative | chassis | Neutral | normal supply | Positive |
| demurrage | Negative | container | Neutral | secured | Positive |
| detention | Negative | container box | Neutral | smooth availability | Positive |
| dwell time | Negative | containers | Neutral | | |
| empty repo | Negative | equipment | Neutral | | |
| imbalance | Negative | reefer | Neutral | | |
| shortage | Negative | truck | Neutral | | |
| Eco-efficiency | |||||
| Term | Polarity | Term | Polarity | Term | Polarity |
| EU ETS | Negative | retrofitting | Neutral | green financing | Positive |
| FuelEU Maritime | Negative | scrubber | Neutral | green premium | Positive |
| penalty | Negative | SEEMP | Neutral | green recycling | Positive |
| pollution | Negative | bio-fuel bunkering | Positive | LNG propulsion | Positive |
| regulation | Negative | bio-methanol bunkering | Positive | reduction | Positive |
| spill | Negative | carbon reduction | Positive | slow steaming | Positive |
| violation | Negative | CII rating | Positive | sustainability | Positive |
| carbon emissions | Negative | decarbonization | Positive | sustainable | Positive |
| CII | Neutral | decarbonize | Positive | transition | Positive |
| EEXI | Neutral | dual-fuel engine | Positive | WAPS | Positive |
| emissions | Neutral | eco-friendly | Positive | | |
| fuel | Neutral | ESG | Positive | | |
| IMO | Neutral | green corridor | Positive | | |
Appendix C
| Week | SCFI_t | Actual SCFI_{t+1} | Predicted SCFI_{t+1} | Actual Direction (SCFI_{t+1} vs. SCFI_t) | Predicted Direction (SCFI_{t+1} vs. SCFI_t) | Match |
|---|---|---|---|---|---|---|
| 4-3 | 1371 | 1348 | 1399.23 | Down (1348 < 1371) | Up (1399 > 1371) | No |
| 4-4 | 1348 | 1345 | 1430.43 | Down (1345 < 1348) | Up (1430 > 1348) | No |
| 5-1 | 1345 | 1479 | 1735.15 | Up (1479 > 1345) | Up (1735 > 1345) | Yes |
| 5-2 | 1479 | 1586 | 1652.51 | Up (1586 > 1479) | Up (1652 > 1479) | Yes |
| 5-3 | 1586 | 2073 | 2093.51 | Up (2073 > 1586) | Up (2093 > 1586) | Yes |
| 5-4 | 2073 | 2240 | 2087.82 | Up (2240 > 2073) | Up (2087 > 2073) | Yes |
| 6-1 | 2240 | 2088 | 1987.46 | Down (2088 < 2240) | Down (1987 < 2240) | Yes |
| 6-2 | 2088 | 1870 | 1738.97 | Down (1870 < 2088) | Down (1738 < 2088) | Yes |
| 6-3 | 1870 | 1862 | 1871.88 | Down (1862 < 1870) | Up (1871 > 1870) | No |
| 6-4 | 1862 | - | - | - | - | - |
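The directional comparison in the table above can be reproduced with a short sketch. The values are copied from the table; the helper names `direction` and `hit_ratio` are hypothetical, and treating an unchanged index as "Down" is an assumption that does not affect these rows.

```python
# (week, SCFI_t, actual SCFI_{t+1}, predicted SCFI_{t+1}), copied from the table.
ROWS = [
    ("4-3", 1371, 1348, 1399.23),
    ("4-4", 1348, 1345, 1430.43),
    ("5-1", 1345, 1479, 1735.15),
    ("5-2", 1479, 1586, 1652.51),
    ("5-3", 1586, 2073, 2093.51),
    ("5-4", 2073, 2240, 2087.82),
    ("6-1", 2240, 2088, 1987.46),
    ("6-2", 2088, 1870, 1738.97),
    ("6-3", 1870, 1862, 1871.88),
]


def direction(next_value, current_value):
    """Classify the move from the current week to the next week."""
    return "Up" if next_value > current_value else "Down"


def hit_ratio(rows):
    """Share of weeks in which predicted and actual directions agree."""
    hits = sum(
        direction(actual, current) == direction(predicted, current)
        for _, current, actual, predicted in rows
    )
    return hits / len(rows)


ratio = hit_ratio(ROWS)
print(f"directional accuracy: {ratio:.3f}")
```

For these nine weeks the predicted and actual directions agree in six cases, which corresponds to the six "Yes" entries in the Match column.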
References
- Baertlein, L. Red Sea Diversions, Tariff Risks Send Ocean Shipping Soaring. Reuters. 2024. Available online: https://www.reuters.com/markets/commodities/red-sea-diversions-tariff-risks-send-ocean-shipping-soaring-2024-05-31/ (accessed on 10 October 2025).
- United Nations Conference on Trade and Development (UNCTAD). Suez and Panama Canal Disruptions Threaten Global Trade and Development; UNCTAD: Geneva, Switzerland, 2024; Available online: https://unctad.org (accessed on 10 October 2025).
- Wolff, J. Red Sea Crisis Underlines the Need for Greater Data Transparency. Seatrade Maritime. 2024. Available online: https://www.seatrade-maritime.com/maritime-technology/red-sea-crisis-underlines-the-need-for-greater-data-transparency (accessed on 10 October 2025).
- Hoffart, F.M.; D’Orazio, P.; Holz, F.; Kemfert, C. Exploring the interdependence of climate, finance, energy, and geopolitics: A conceptual framework for systemic risks amidst multiple crises. Appl. Energy 2024, 361, 122885.
- Aven, T.; Renn, O. Risk Management and Governance: Concepts, Guidelines and Applications; Springer: Berlin, Germany, 2010.
- Jeon, J. SCFI Surge Causes and Improvement in Liner Bargaining Power. Cello-Square. 2024. Available online: https://www.cello-square.com/en/blog/view-1405.do (accessed on 10 October 2025).
- Labrut, M. Container Shipping Set for Short, Sharp, Hard Landing. Seatrade Maritime. 2022. Available online: https://www.seatrade-maritime.com/containers/container-shipping-set-for-short-sharp-hard-landing- (accessed on 10 October 2025).
- Chen, L.; Liu, J.; Pei, R.; Su, Z.; Liu, Z. Shanghai Containerised Freight Index Forecasting Based on Deep Learning Methods: Evidence from Chinese Futures Markets. East Asian Econ. Rev. 2024, 28, 359–388.
- Bai, X.; Lam, J.; Jakher, A. Shipping sentiment and the dry bulk shipping freight market: New evidence from newspaper coverage. Transp. Res. Part E 2021, 55, 102490.
- Gavriilidis, T.; Merika, A.; Merikas, A.; Sigalas, C. Development of a sentiment measure for dry bulk shipping. Marit. Policy Manag. 2023, 50, 58–80.
- Michail, N.A.; Melas, K.D. Sentiment-augmented supply and demand equations for the dry bulk shipping market. Economies 2021, 9, 171.
- Labrut, M. Boxship Blocks Suez Canal. Seatrade Maritime. 2021. Available online: https://www.seatrade-maritime.com/containers/boxship-blocks-suez-canal (accessed on 10 October 2025).
- How the Panama Canal Drought and Trade Shifts Are Redefining Supply Chain Resilience for 2025. Langley. 2025. Available online: https://www.langleysearch.com/blog/2025/01/how-the-panama-canal-drought-and-trade-shifts-are-redefining-supply-chain-resilience-for-2025?source=google.com (accessed on 15 October 2025).
- Labrut, M. Panama Canal Transits Drop 29% in FY2024. Seatrade Maritime. 2024. Available online: https://www.seatrade-maritime.com/containers/panama-canal-transits-drop-29-in-fy2024 (accessed on 10 October 2025).
- Savvides, N. US Trade Surge Could Trigger Pandemic Levels of Port Congestion. Seatrade Maritime. 2025. Available online: https://www.seatrade-maritime.com/ports-logistics/us-trade-surge-could-trigger-pandemic-levels-of-port-congestion.com (accessed on 10 October 2025).
- Pettit, T.J.; Fiksel, J.; Croxton, K.L. Ensuring supply chain resilience: Development of a conceptual framework. J. Bus. Logist. 2010, 31, 1–21.
- OECD. Risks and Resilience in Global Trade: Key Trends in 2023–2024; OECD Publishing: Paris, France, 2024.
- Bouman, E.A.; Lindstad, E.; Rialland, A.; Strømman, A.H. State-of-the-art technologies, measures, and potential for reducing GHG emissions from shipping – A review. Transp. Res. Part D 2017, 52, 408–421.
- World Bank. The Container Port Performance Index 2022: A Comparable Assessment of Performance Based on Vessel Time in Port; World Bank Group: Washington, DC, USA, 2023; Available online: http://hdl.handle.net/10986/39824 (accessed on 10 October 2025).
- Hirata, E.; Takahashi, T. Forecasting Shanghai Containerized Freight Index: A deep-learning-based model experiment. J. Mar. Sci. Eng. 2022, 10, 593.
- Kaprinis, K. The Institutional Structure of Macroprudential Policy in the UK; Springer: Cham, Switzerland, 2023.
- Ivanov, D.; Dolgui, A.; Sokolov, B. The impact of digital technology and Industry 4.0 on the ripple effect and supply chain risk analytics. Int. J. Prod. Res. 2019, 57, 829–846.
- Sui, C.; Wang, S.; Zheng, W. Sentiment as a shipping market predictor: Testing market-specific language models. Transp. Res. Part E 2024, 189, 103651.
- Gong, Y. Shipping news sentiment as a predictor of iron ore freight rates. Resour. Policy 2025, 89, 105682.
- Jeon, J.; Çağatay, I.; Hong, S.; Andrew, L. Box rates unveiled: Predictive analytics for ocean freight rates with news-based market sentiment. Int. J. Prod. Econ. 2025, 286, 109669.
- Ehlert, S.; Wilson, C.; Yawson, A. Industry investor sentiment in the global shipping industry. Marit. Policy Manag. 2024, 51, 74–79.
- Kim, J.; Lee, K.; Kwon, J.; Yeo, J. Maritime supply chain risk sentiment and Korea’s trade volume: A news big-data analysis perspective. Asian J. Shipp. Logist. 2024, 40, 42–51.
- Friede, G.; Busch, T.; Bassen, A. ESG and financial performance: Aggregated evidence from more than 2000 empirical studies. J. Sustain. Financ. Invest. 2015, 5, 210–233.
- George, G.; Merrill, R.K.; Schillebeeckx, S.J.D. Digital sustainability and entrepreneurship. J. Bus. Ventur. 2021, 36, 106153.
- Kölbel, J.F.; Heeb, F.; Paetzold, F.; Busch, T. Can sustainable investing save the world? Organ. Environ. 2020, 33, 554–574.
- Korea Maritime Institute (KMI). KMI Weekly Shipping Market Focus; Nos. 548–699; KMI: Busan, Republic of Korea, 2022–2025.
- Elkington, J. Towards the Sustainable Corporation: Win–Win–Win Business Strategies for Sustainable Development. Calif. Manag. Rev. 1994, 36, 90–100.
- Sachs, J.D. From Millennium Development Goals to Sustainable Development Goals. Lancet 2012, 379, 2206–2211.
- Rockström, J.; Steffen, W.; Noone, K.; Persson, Å.; Chapin, F.S.; Lambin, E.; Lenton, T.; Scheffer, M.; Folke, C.; Schellnhuber, H.J.; et al. A Safe Operating Space for Humanity. Nature 2009, 461, 472–475.



| Authors (Year) | Data Source | NLP/Modeling Approach | Research Focus | Main Findings | Sustainability Relevance/Limitation |
|---|---|---|---|---|---|
| Bai, Lam & Jakher (2021) [9] | Shipping news articles | Lexicon-based sentiment index | BDI prediction | Shipping sentiment shows short-term predictive power for BDI | Primarily focuses on freight rate dynamics; sustainability implications are indirect and limited to economic volatility. |
| Gavriilidis et al. (2023) [10] | News articles | Shipping-specific sentiment index | Market risk explanation | Captures investor psychology and perceived risk | Improves domain relevance, but sustainability is treated implicitly without linkage to resilience or environmental transition. |
| Sui, Wang & Zheng (2024) [23] | Multilingual shipping news | Domain-tuned language model (BERT-based) | Freight rate prediction | Outperforms general-purpose language models | Methodologically advanced, but primarily prediction-oriented with limited sustainability interpretation. |
| Gong (2025) [24] | Shipping news | Sentiment index + threshold autoregression | Regime-dependent freight rate prediction | Sentiment effects vary across market regimes | Interprets sentiment mainly as a price signal; sustainability stress is not explicitly conceptualized. |
| Jeon et al. (2025) [25] | Shipping news | Transformer-based deep learning | Container freight rate forecasting | Significantly reduces forecasting errors | High predictive accuracy, but limited transparency for sustainability governance or risk diagnosis. |
| This study | Weekly expert market reports (KMI) | Topic modeling + domain-tuned sentiment + ElasticNet | Sustainability risk detection | CTQScores capture early warning sustainability stress | Explicitly sustainability-oriented; interprets textual narratives as early indicators of economic, operational, and environmental stress rather than price prediction. |
| Period | Typical Corpora | Pre-Processing Methods | Modeling Approaches | Evaluation Metrics | Sustainability Orientation |
|---|---|---|---|---|---|
| 2018–2020 | Lloyd’s List, TradeWinds | TF-IDF, n-grams | LDA, linear regression | Accuracy, coherence | Predominantly price-centric; sustainability largely absent. |
| 2021–2022 | Shipping news | Word2Vec, FastText | LDA, VAR | RMSE, R² | Emerging focus on operational disruption, but sustainability framed narrowly as efficiency loss. |
| 2023–2024 | Multilingual news | Sentence-BERT, morphological analysis | ProdLDA, BERTopic | F1-score, RMSE | ESG-related topics appear, but links to performance remain weak. |
| 2025 | Integrated news & reports | Contextual embeddings | MTL, hybrid models | Forecast accuracy, robustness | Initial sustainability references, still largely forecasting-driven. |
| This study | Expert weekly reports (KMI) | Domain-tuned lexicon + topic–sentiment network | ProdLDA + ElasticNet + VAR | Directional accuracy, stress persistence | Explicit sustainability-risk orientation: resilience, transition, and governance focus. |
| Category | LDA | ProdLDA | BERTopic |
|---|---|---|---|
| Model type | Probabilistic generative model | Neural variational generative model | Embedding-based clustering model |
| Input representation | Bag-of-Words (BoW) | Bag-of-Words (BoW) | Contextual sentence embeddings (BERT) |
| Topic coherence (c_v) | 0.42–0.48 | 0.53–0.61 | 0.47–0.55 |
| Topic redundancy | Moderate | Low | Moderate to high |
| Topic stability across runs | Low | High | Moderate |
| Suitability for CTQ linkage | Moderate | High | Low |
| Interpretability | High | Low | Moderate |
| Computational complexity | Low | Moderate | High |
| Role in this study | Baseline comparison | Primary analytical model | Complementary robustness check |
| Topic ID | Representative Keywords | Interpretation | Related CTQ Factor |
|---|---|---|---|
| T1 | lead time, waiting, congestion, port bottleneck, rerouting | Port congestion | Lead Time |
| T2 | delay, blank sailing, schedule, reliability, disruption | Service irregularity | Schedule Reliability |
| T3 | regulation, decarbonization, CII, ESG, IMO | Environmental regulation | Eco-efficiency |
| T4 | container box, reefer, chassis, trucking | Equipment supply | Equipment Availability |
| T5 | capacity, utilization, fleet, supply adjustment | Capacity imbalance | Vessel Utilization |
| T6 | freight rate, SCFI, surcharge, volatility | Rate fluctuation | Freight Rate Stability |
| ESG Pillar | Sustainability-Critical CTQ Dimension | Positive Terms | Negative Terms | Domain-Expanded Terms | Total Terms |
|---|---|---|---|---|---|
| Economic (E) | Freight Rate Stability | 23 | 12 | 9 | 44 |
| Operational (S *) | Schedule Reliability | 6 | 10 | 5 | 21 |
| Operational (S *) | Vessel Utilization | 8 | 5 | 12 | 25 |
| Operational (S *) | Lead Time | 4 | 5 | 11 | 20 |
| Operational (S *) | Equipment Availability | 3 | 8 | 9 | 20 |
| Environmental (E) | Eco-efficiency | 20 | 8 | 8 | 36 |
| Total | | 64 | 48 | 54 | 166 |
| ESG Pillar | CTQ Dimension | Positive Terms | Negative Terms | Neutral/Structural Terms | Domain-Expanded Terms | Sustainability Interpretation |
|---|---|---|---|---|---|---|
| Economic (E) | Freight Rate Stability | stability, recovery, balance, improvement | surge, plunge, volatility, decline | contract rate, spot rate, index, market condition | SCFI, BDI, CCFI, capacity adjustment | Indicates economic sustainability through reduced volatility and stable freight markets |
| Operational (S *) | Schedule Reliability | on-time, stable operation, schedule maintained | delay, cancellation, congestion, rerouting | voyage plan, port call, berthing time | route, blank sailing, terminal congestion | Reflects reliability and resilience of shipping services |
| Operational (S *) | Vessel Utilization | higher utilization, full load, demand growth | overcapacity, idle vessels, service suspension | available capacity, load factor, deployment | fleet capacity, idle ships | Indicates efficiency and sustainability of capacity deployment |
| Operational (S *) | Lead Time | shortened, efficient, improvement | delay, bottleneck, congestion | transit time, customs clearance, logistics flow | transshipment time, transport duration | Measures stability of end-to-end transport duration |
| Operational (S *) | Equipment Availability | secured supply, smooth circulation | shortage, imbalance, bottleneck | container inventory, equipment turnover | container boxes, reefers, chassis | Evaluates robustness of equipment supply chains |
| Environmental (E) | Eco-efficiency | reduction, transition, decarbonization | violation, pollution, penalty | IMO regulations, emission standards | LNG propulsion, EEXI, CII | Assesses environmental sustainability by capturing compliance with emission regulations, progress toward decarbonization, and long-term energy-efficiency improvements |
| CTQ Factor | Representative Keywords | Key Sentiment Terms | Interpretation |
|---|---|---|---|
| Freight Rate Stability | freight rates, SCFI, rate index | Increase (+), Decrease (–) | Changes in freight rates are interpreted from an economic sustainability and market stability perspective, where reduced volatility and gradual adjustments indicate improved sustainability rather than short-term profitability. |
| Schedule Reliability | cancellation, delay, sailing | Normalization (+), Congestion (–) | Sentiment reflects the operational sustainability of shipping services, with reductions in delays and cancellations indicating enhanced service resilience and reliability. |
| Vessel Utilization | transit days, shipping duration | Secured (+), Shortage (–) | Sentiment captures the sustainability of capacity deployment, where balanced utilization signals efficient resource use and structural alignment between supply and demand. |
| Lead Time | transit days, shipping duration | Shortening (+), Delay (–) | Sentiment represents the stability of end-to-end transport duration, with persistent delays indicating heightened sustainability stress in supply chain operations. |
| Equipment Availability | containers, equipment, turnover | Smooth supply (+), Shortage (–) | Sentiment measures the smoothness and resilience of equipment circulation, where improved availability supports sustainable logistics flows. |
| Eco-efficiency | decarbonization, LNG, IMO rules | Strengthening (+), Insufficiency (–) | Sentiment reflects environmental sustainability performance, capturing perceptions of compliance with emission regulations, progress toward decarbonization, and long-term energy-efficiency improvements. |
| Sustainability-Critical CTQ Dimension | Centrality (Stress Concentration) | PMI (Stress Association) | Representative Sustainability-Related Keywords |
|---|---|---|---|
| Freight Rate Stability | 0.88 | 0.62 | surcharge, decline, recovery, stability, normalization, SCFI |
| Schedule Reliability | 0.82 | 0.56 | delay, congestion, on-time, sailing schedule, blank sailing |
| Lead Time | 0.74 | 0.49 | shortening, bottleneck, transit time, delay, congestion |
| Vessel Utilization | 0.79 | 0.51 | full load, idle vessels, overcapacity, fleet, capacity |
| Equipment Availability | 0.65 | 0.44 | shortage, reefer, supply, container boxes, equipment pool |
| Eco-efficiency | 0.77 | 0.53 | LNG propulsion, carbon reduction, EEXI, regulation |
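
For readers unfamiliar with the PMI column, the sketch below shows the standard pointwise mutual information formula and its normalized variant computed from co-occurrence counts. Whether the table reports raw or normalized PMI is not stated in this section, so both are shown. The function names and the example counts are hypothetical.

```python
import math


def pmi(joint_count, count_x, count_y, total, base=2.0):
    """Pointwise mutual information: log( p(x, y) / (p(x) * p(y)) )."""
    if joint_count == 0:
        return float("-inf")  # the pair never co-occurs
    p_xy = joint_count / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log(p_xy / (p_x * p_y), base)


def normalized_pmi(joint_count, count_x, count_y, total, base=2.0):
    """PMI divided by -log p(x, y), which bounds the result to [-1, 1]."""
    if joint_count == 0:
        return -1.0
    if joint_count == total:
        return 1.0  # convention: avoids dividing by -log(1) = 0
    p_xy = joint_count / total
    return pmi(joint_count, count_x, count_y, total, base) / -math.log(p_xy, base)


# Hypothetical counts: 200 documents, a keyword in 60 of them, negative
# sentiment in 80 of them, and both together in 40 of them.
print(pmi(40, 60, 80, 200))
print(normalized_pmi(40, 60, 80, 200))
```

A positive value means the keyword and the sentiment signal co-occur more often than independence would predict; zero means no association.
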

| Week | Positive Sentiment Contribution | Negative Sentiment Contribution | Net Sustainability Stress Index (CTQScore) |
|---|---|---|---|
| 4-3 | 0.29190201 | 0.41402586 | −1.7363935 |
| 4-4 | 0.22840234 | 0.21878620 | 0.1611671 |
| 5-1 | 0.31421675 | 0.21747399 | 1.4161243 |
| 5-2 | 0.29208879 | 0.37271511 | −1.1386700 |
| 5-3 | 0.32219636 | 0.27712198 | 0.6719016 |
| 5-4 | 0.26590635 | 0.24355297 | 0.3446320 |
| 6-1 | 0.29700736 | 0.29700736 | 0.0226577 |
| 6-2 | 0.35558284 | 0.34657358 | 0.1524255 |
| 6-3 | 0.49601597 | 0.40806562 | 1.2894797 |
| 6-4 | 0.14977733 | 0.23350383 | −1.1833245 |
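
The CTQScore column in the table above closely matches a population z-score of the weekly difference between positive and negative sentiment contributions. This is an inference from the tabulated numbers, not a formula quoted from the study, so it should be checked against the definition in the methodology. The sketch below recomputes the column under that assumption; `ROWS` and `net_stress_index` are illustrative names.

```python
from statistics import mean, pstdev

# (week, positive, negative, reported CTQScore), copied from the table above.
ROWS = [
    ("4-3", 0.29190201, 0.41402586, -1.7363935),
    ("4-4", 0.22840234, 0.21878620, 0.1611671),
    ("5-1", 0.31421675, 0.21747399, 1.4161243),
    ("5-2", 0.29208879, 0.37271511, -1.1386700),
    ("5-3", 0.32219636, 0.27712198, 0.6719016),
    ("5-4", 0.26590635, 0.24355297, 0.3446320),
    ("6-1", 0.29700736, 0.29700736, 0.0226577),
    ("6-2", 0.35558284, 0.34657358, 0.1524255),
    ("6-3", 0.49601597, 0.40806562, 1.2894797),
    ("6-4", 0.14977733, 0.23350383, -1.1833245),
]


def net_stress_index(rows):
    """Standardize (positive - negative) using the population standard deviation."""
    diffs = [pos - neg for _, pos, neg, _ in rows]
    mu = mean(diffs)
    sigma = pstdev(diffs)  # population (not sample) standard deviation
    return [(d - mu) / sigma for d in diffs]


recomputed = net_stress_index(ROWS)
for (week, _, _, reported), value in zip(ROWS, recomputed):
    print(f"{week}: recomputed={value:+.4f} reported={reported:+.4f}")
```

Under this reading, week 6-1 has identical positive and negative contributions yet a slightly positive score, because the mean weekly difference over the ten weeks is slightly negative.
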

| Model | Description | MAPE (%) | R² | Sustainability-Oriented Interpretation |
|---|---|---|---|---|
| Naïve Trend | Extension of previous week’s change | 14.2 | 0.41 | Limited sensitivity to emerging sustainability stress; reacts only after market movements |
| ARIMA | Traditional time-series forecasting model | 11.8 | 0.52 | Captures medium-term market dynamics but lacks sensitivity to narrative-driven sustainability stress |
| PCA-Based CTQ Score Model | Prediction using first principal component of CTQScore | 9.4 | 0.61 | Moderately captures aggregated sustainability stress across CTQ dimensions |
| ElasticNet (CTQ Score + KPI) | Integrated sentiment–topic–CTQ–KPI model | 6.7 | 0.68 | Highest responsiveness to emerging sustainability stress; effective early warning capability |
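
The two accuracy metrics in the table above follow standard definitions: MAPE is the mean absolute percentage error and R² is the coefficient of determination. A minimal sketch of both is given below. The function names and the sample values are hypothetical and are not the study's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical weekly index levels and forecasts, for illustration only.
actual_index = [1000.0, 1050.0, 980.0, 1020.0]
forecast_index = [990.0, 1030.0, 1000.0, 1010.0]
print(mape(actual_index, forecast_index))
print(r_squared(actual_index, forecast_index))
```

Lower MAPE and higher R² both indicate a closer fit, which is the sense in which the ElasticNet row outperforms the baselines.
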

| Sustainability-Critical CTQ Dimension | R² | Hit Ratio | Sustainability-Oriented Interpretation |
|---|---|---|---|
| Freight Rate Stability | 0.68 | 0.74 | Strong alignment with economic sustainability stress; functions as a primary stress hub |
| Schedule Reliability | 0.54 | 0.66 | Operational sustainability stress exhibits medium-term influence on market stability |
| Lead Time | 0.61 | 0.70 | Supply chain efficiency provides leading signals of systemic sustainability stress |
| Vessel Utilization | 0.72 | 0.78 | Capacity deployment strongly reflects sustainability stress embedded in freight markets |
| Equipment Availability | 0.49 | 0.63 | Operational stress exhibits delayed but persistent sustainability relevance |
| Eco-efficiency | 0.45 | 0.58 | Reflects long-term environmental sustainability stress driven by regulatory and transition dynamics |
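
The Hit Ratio column is not defined in this section. A common reading is directional accuracy: the share of periods in which the predicted change has the same sign as the realized change. The sketch below assumes that definition; the function name and sample values are hypothetical.

```python
def hit_ratio(actual_changes, predicted_changes):
    """Share of periods where predicted and actual changes have the same sign."""

    def sign(x):
        # Returns 1 for positive, -1 for negative, 0 for no change.
        return (x > 0) - (x < 0)

    hits = sum(
        1 for a, p in zip(actual_changes, predicted_changes) if sign(a) == sign(p)
    )
    return hits / len(actual_changes)


# Hypothetical week-over-week changes, for illustration only.
actual = [1.0, -2.0, 0.5, 3.0]
predicted = [0.4, 1.0, 0.2, -1.0]
print(hit_ratio(actual, predicted))
```

Under this definition, a value of 0.5 corresponds to guessing direction no better than a coin flip, so the tabulated values from 0.58 to 0.78 would all indicate some directional skill.
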

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Kim, D.; Kim, Y. A Sustainability-Oriented NLP Framework for Early Detection of Economic, Operational, and Environmental Risks in Global Shipping. Sustainability 2026, 18, 1814. https://doi.org/10.3390/su18041814

