1. Introduction
The new paradigm of development that combines growth with sustainability is theoretically challenging, yet desirable for attaining a sustainable future. In September 1997, the International Maritime Organization (IMO) adopted an International Convention Protocol to achieve “sustainable maritime development” [1], which is aligned with the notion of “sustainable development” that “meets the needs of the present without compromising the ability of future generations to meet their own needs” [2]. Sustainability requires all three dimensions (environmental, economic, and social) to be considered in business development. The environmental dimension requires a reduction in environmental impacts, while the economic dimension concerns minimization of costs or business continuity. The social dimension relates to community well-being, including the provision of human rights, better working conditions and improved labor regulations [3]. Sustainability thus has three main objectives: minimizing economic costs; mitigating negative environmental impacts, including air and water pollution; and enhancing social justice by protecting human rights and improving working conditions for labor.
In recent decades, the maritime industry has taken sustainability seriously by implementing policies and strategies to reduce shipping miles, promoting environment-friendly technology to reduce carbon emissions and energy consumption, and ensuring compliance with labor laws [3]. Companies in the shipping industry now face additional costs and penalties for non-compliance and unsustainable practices [4,5,6,7]. Environmental costs in shipping, which include damage due to vessel exhaust pollution and external air pollution, are reduced by sustainable operations and fuel-efficient management practices [7,8,9,10]. Sustainable maritime development has subsequently become a management framework for benchmarking maritime business practices, logistics operations and procedures [11]. The maritime industry, including ports, shipping and logistics, has begun to embrace environmental guidelines and regulations to reduce environmental impacts and carbon emissions. Pressures related to sustainability have also driven significant changes in markets, the business practices of enterprises, and government policies and regulations. In particular, sustainability issues have become a vital element of maritime logistics; these include greening of ports [12], zero carbon shipping and reducing emissions from vessels [13,14], climate change mitigation and adaptation [15], ship speed optimization for reducing emissions [16,17,18,19,20,21], climate risk assessment [22], use of renewable/sustainable energy, including LNG, as an alternative marine fuel [23,24,25], and regulations including MARPOL (a treaty to prevent maritime pollution) [26,27,28]. However, these frameworks, methodologies and practices are not well documented in a coherent and systematic manner.
Research interest in sustainability has grown significantly. Mansouri et al. [29] conducted a comprehensive review of sustainability focusing on environmental concerns in maritime shipping, highlighting the need for real-time monitoring of data on port and vessel conditions. Sislian et al. [30], in their review of the sustainability of port systems and ocean carrier networks, found that most ports widely apply sustainability principles in their maritime operations. Davarzani et al. [31] applied bibliometric and network analysis tools to review greening ports and maritime logistics and to identify key themes and research trends. Notwithstanding this work, the extant literature on maritime sustainability requires theoretical consolidation to provide clarity and to resolve conflicting views on sustainability for maritime operations and businesses. It is therefore important to conduct a comprehensive review of previous studies, identify key topical themes, and set out new research directions. Given the limited cognitive capacity of humans to handle and process large amounts of data, there is a strong need to deploy more sophisticated computing power and algorithms to text mine large databases and extract latent thematic patterns, clusters of ideas and concepts, and trends in sustainability research over time.
Although topic modeling is widely used to analyze various types of text data across multiple fields, including journals [32,33], newspapers [34], and reports [35], its application to maritime studies is rather limited. Typical application areas of topic modeling include statistics [36], aviation [35], hydropower [37], transportation [38], personal information privacy [39], marketing [32], and health [33]. A few recent studies [38,39] have employed topic modeling to analyze key characters used in academic papers and then to generate a group of words (“corpus” hereafter) that collectively represents the core subjects. An inference and parameter estimation algorithm is used to extract the latent structures and patterns in the text documents by identifying the most common topics. This technique is similar to a dimension reduction technique, such as factor or principal component analysis, that enables the extraction of underlying factors from a large number of variables [40]. The basic concept of topic modeling is that a single document is linked to various topics, each of which can contain many words. A topic can also appear in many documents at the same time [35]. Hence, “documents blend multiple topics” [36].
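The idea that a document blends several topics can be illustrated with a small numerical sketch. The topic names, words and probabilities below are hypothetical toy values chosen for illustration, not estimates from the reviewed papers. The probability of a word in a document is taken as the topic-share-weighted mixture of the per-topic word probabilities.

```python
# Hypothetical toy values: two topics over a three-word vocabulary.
topic_word = {
    "emissions": {"carbon": 0.5, "fuel": 0.3, "port": 0.2},
    "operations": {"carbon": 0.1, "fuel": 0.2, "port": 0.7},
}
# Hypothetical topic shares for a single document (they sum to 1).
doc_topic = {"emissions": 0.75, "operations": 0.25}


def word_probabilities(doc_topic, topic_word):
    """Mix the per-topic word distributions by the document's topic shares."""
    vocab = sorted({w for dist in topic_word.values() for w in dist})
    return {
        w: sum(doc_topic[t] * topic_word[t].get(w, 0.0) for t in doc_topic)
        for w in vocab
    }


probs = word_probabilities(doc_topic, topic_word)
for word, p in probs.items():
    print(f"{word}: {p:.3f}")
```

Because the document draws three quarters of its words from the first topic, its word probabilities sit closer to that topic's distribution while still reflecting the second one.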
Despite a few attempts, there is limited knowledge of the scope and scale of sustainability literature across the wider maritime industry, including shipping, port, and maritime logistics businesses. In addition, there have been few attempts to apply computational tools to mine the textual information embedded in journal databases. Furthermore, there is a lack of understanding of what sustainability means for maritime studies, with inconsistencies and conflicting views on the use of various concepts, models and measurements. Identification of key words and clusters based on a similarity/dissimilarity matrix would help formalize a framework that reflects the meaning, scope and application of the concept of sustainability in maritime studies.
This paper primarily aims to identify key concepts and terms applied to denote the notion of maritime sustainability. This study employs topic modeling techniques, such as latent Dirichlet allocation (LDA) and other computational approaches, to extract and visualize the underlying thematic groupings, trends and patterns in sustainability research in the field of maritime studies. We analyze 155 research papers published in peer-reviewed journals since 1993, text mining this research for relevance to maritime sustainability.
The rest of the paper is structured as follows. The next section describes the data utilized. Section 3 presents the methods employed to identify specific patterns and trends in textual data. Findings and results are presented in Section 4. Finally, Section 5 concludes this study by highlighting research limitations, implications and future research.
3. Method: Topic Modeling
Topic modeling belongs to the field of unsupervised machine learning [41], which uses algorithms to identify specific patterns of text, unlike supervised machine learning, which requires coding rules and data training. Topic modeling algorithms use statistical methods to analyze the words in documents and discover the topics contained in them.
Topic modeling automatically categorizes documents according to their underlying theme structures [42]. LDA is the most straightforward probabilistic topic approach to document modeling [36,42].
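Topic models of this kind typically start from word counts per document, often arranged as a document-term matrix (DTM). The following is a minimal sketch of that step, assuming whitespace tokenization and lowercase folding; the function name and the three toy documents are hypothetical and do not reproduce the preprocessing used in this study.

```python
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, rows), where rows[i][j] counts vocabulary[j] in docs[i]."""
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)  # missing terms count as 0
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


# Hypothetical toy documents standing in for paper abstracts.
docs = ["green port emission", "shipping emission emission", "port logistics"]
vocab, dtm = document_term_matrix(docs)
print(vocab)
for row in dtm:
    print(row)
```

Each row is one document and each column one term, so the matrix discards word order and keeps only frequencies, which is the information a bag-of-words topic model works from.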
Figure 3 presents the concept behind the LDA model (Blei [43] and Kuhn [35]). $K$ represents the number of topics. $K$ can be determined by researchers based on practical experience rather than scientific evidence. However, this study uses a statistical method for determining the value of $K$, following the suggestions of Newton and Raftery [44] and Griffiths and Steyvers [45]. It is noted that each topic consists of specific words by Dirichlet distributions and the topic model algorithm from a DTM. The suggestion for determining $K$, referring to a harmonic mean, allows the discovery of the optimal number of topics, as well as measuring the goodness-of-fit in the modeling. By using the calculation of the harmonic mean, which is one of the maximal log-likelihood methods, the optimum number of topics ($K$) can be ascertained, as shown in the Appendix, Figure A3. In the case of ports, the value of $K$ was determined to be 10; for shipping and maritime logistics, values were 14 and 9, respectively. In the case of papers on ports, the optimal number of topics using the harmonic mean method was 21, where the maximum log-likelihood is observed (Appendix, Figure A3). However, too many topics generate noise from undesirable or unwanted topic groups in the text data, because similar topic groups must be split apart to create a larger number of topics. Thus, we selected a value ($K = 10$) where the slope of the line in the maximum log-likelihood graph begins to flatten. Moreover, $K$ and $k$ represent different quantities: $K$ stands for the specified number of topics, while $k$ indicates the auxiliary index over topics.
$\alpha$ represents the Dirichlet parameter, indicating the parameter for determining $\theta_d$. The topic distribution per paper in topic modeling is represented by the Dirichlet distribution, which is given by the Dirichlet parameter $\alpha$ ($\mathrm{Dir}(\alpha)$). $\alpha$ is generally set as $50/K$, as suggested by Griffiths and Steyvers [45]. Thus, the parameter $\alpha$ is applied with different values depending on $K$ (i.e., in the case of ports, $\alpha$ should be 5, when the parameter $K$ is 10). $D$ represents individual documents. In the case of ports, $D$ has 72 documents (papers) concerning sustainability. The number of shipping papers concerning sustainability is 70, and maritime logistics has 13 papers concerning sustainability. In other words, $\theta_d$ is a random variable from $\mathrm{Dir}(\alpha)$, indicating the topic proportions for the individual $d$th document. As shown in Figure 3, the value $\alpha$ determines the topic proportions $\theta_d$. The relationship between $\alpha$ and $\theta_d$ can be shown as follows:
$$\theta_d \sim \mathrm{Dir}(\alpha), \quad d = 1, \dots, D$$
Then, $z_d$ is determined by $\theta_d$, indicating the topic assignments for the $d$th document. Thus, $z_{d,n}$ shows the topic assignment for the $n$th word ($N$ depends on each document) in each document, $d$. $\eta$ is the parameter for determining $\beta_k$. While $\theta_d$ represents topic proportions, each $\beta_k$ represents the word (term) distributions. The relationship between $\eta$ and $\beta_k$ can be shown in the same way, in that the value $\eta$ determines the word proportions $\beta_k$. The value $\eta$ is assumed to be 0.1, as suggested by Griffiths and Steyvers [45].
For each topic being indexed by $k$, $\beta_k$ is the model parameter, indicating word proportions within the topic. Therefore, $w_{d,n}$ is the observed variable in the document from the multinomial distribution ($\mathrm{Mult}(\beta_{z_{d,n}})$), indicating the $n$th word of document $d$.
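The generative story described above can be sketched in a few lines of Python. This is an illustrative simulation under stated assumptions, not the estimation procedure used in this study: the function names, the five-word vocabulary and the corpus size are hypothetical, the document-level Dirichlet parameter is set to 50 divided by the number of topics, and the word-level Dirichlet parameter is set to 0.1. Dirichlet draws are obtained by normalizing Gamma draws from the standard library.

```python
import random


def sample_dirichlet(concentration, size, rng):
    """Draw from a symmetric Dirichlet by normalizing Gamma(concentration, 1) draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(num_docs, words_per_doc, vocab, num_topics, eta=0.1, seed=0):
    """Simulate documents from the LDA generative process (hypothetical helper)."""
    rng = random.Random(seed)
    alpha = 50.0 / num_topics  # document-level Dirichlet parameter
    # One word distribution per topic, each drawn from Dir(eta).
    topics = [sample_dirichlet(eta, len(vocab), rng) for _ in range(num_topics)]
    corpus = []
    for _ in range(num_docs):
        theta = sample_dirichlet(alpha, num_topics, rng)  # topic proportions
        doc = []
        for _ in range(words_per_doc):
            z = rng.choices(range(num_topics), weights=theta)[0]  # topic assignment
            w = rng.choices(vocab, weights=topics[z])[0]  # observed word
            doc.append(w)
        corpus.append(doc)
    return alpha, topics, corpus


# Hypothetical vocabulary; 10 topics mirrors the value chosen for port papers.
vocab = ["port", "emission", "shipping", "logistics", "governance"]
alpha, topics, corpus = generate_corpus(
    num_docs=3, words_per_doc=8, vocab=vocab, num_topics=10
)
print("alpha =", alpha)
for doc in corpus:
    print(" ".join(doc))
```

Fitting LDA runs this process in reverse: only the words are observed, and the topic proportions, topic assignments and word distributions are inferred from them.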
This study conducts topic modeling using R statistical software version 3.4.4 (R Development Core Team, Vienna, Austria), which is available as free software under the terms of the Free Software Foundation’s GNU General Public License (www.r-project.org) and provides an attractive environment for computational methods of data science [41]. The next section presents the results from the R software using the topicmodels package.
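The harmonic-mean criterion used in this section to compare candidate numbers of topics can be sketched independently of R. The snippet below assumes that, for each candidate number of topics, a set of log-likelihood values from posterior samples is already available; the function names and the sample values are hypothetical. It computes the log of the harmonic mean of the likelihoods with a log-sum-exp step for numerical stability and reports the candidate with the highest score.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Log of the harmonic mean of likelihoods, computed from their logs."""
    neg = [-ll for ll in log_likelihoods]
    m = max(neg)
    # log(sum(exp(-ll))) evaluated stably by factoring out the largest term.
    log_sum = m + math.log(sum(math.exp(x - m) for x in neg))
    return math.log(len(neg)) - log_sum


def best_num_topics(samples_by_k):
    """Return the candidate with the highest score, plus all scores."""
    scores = {k: log_harmonic_mean(lls) for k, lls in samples_by_k.items()}
    return max(scores, key=scores.get), scores


# Hypothetical log-likelihood samples for three candidate topic counts.
samples_by_k = {
    5: [-1200.0, -1210.0],
    10: [-1100.0, -1105.0],
    15: [-1090.0, -1095.0],
}
best_k, scores = best_num_topics(samples_by_k)
print(best_k, scores)
```

The highest score is only a starting point. As described above for the port papers, a smaller value may be preferred where the curve begins to flatten, since the gain in fit from additional topics can come at the cost of noisier, overlapping topic groups.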
5. Discussion and Conclusions
This study conducted a comprehensive literature review on sustainability in the field of maritime studies. From SCI/SSCI journals, 155 academic papers related to port, shipping and maritime logistics were extracted. Sustainability issues were consolidated by applying latent Dirichlet allocation to discover latent themes and relationships in the text data. The landscape of sustainability research was illustrated using bibliometric analysis. In this study, a new intellectual structure of sustainability literature was created, current trends and key co-authorship patterns were mapped, and future research development trajectories in the field of maritime studies were projected.
The text mining results reveal common sustainability issues for ports and shipping businesses. Broadly, these issues are related to green ports/shipping, carbon emission/climate change and region-specific environmental regulation/management. The need to optimize shipping routes/networks to reduce carbon/greenhouse gas emissions, shorten distances, and reduce logistics costs is an additional sustainability issue for shipping.
For maritime logistics, sustainability issues are generally related to achieving optimal logistics systems, sustainable supply chain design, and service quality management. The co-occurrence analysis identified high-frequency keywords, including sustainability, management, port, emission, impact, performance, model, logistics, system and framework. More recently, the keywords have shifted to include governance, corporate social responsibility, and supply chain management. A shift in publication on sustainability from OECD (Organization for Economic Co-operation and Development)-dominated countries, with the exception of Singapore and Taiwan, to China and South Korea is noted in the visual illustration of mapped data.
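A keyword co-occurrence count of the kind referred to above can be sketched as follows. This is a minimal illustration rather than the bibliometric tool used in this study: the function name and the three keyword lists are hypothetical, and two keywords are treated as co-occurring when they are attached to the same paper.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(keyword_lists):
    """Count how many papers each unordered keyword pair appears in together."""
    pair_counts = Counter()
    for keywords in keyword_lists:
        unique = sorted({k.lower() for k in keywords})  # one vote per paper
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Hypothetical keyword lists for three papers.
papers = [
    ["sustainability", "port", "emission"],
    ["sustainability", "port", "governance"],
    ["emission", "logistics", "sustainability"],
]
counts = keyword_cooccurrence(papers)
for pair, n in counts.most_common(3):
    print(pair, n)
```

Pairs with high counts form the strongest links in a co-occurrence map, while a keyword's overall frequency determines how prominent it appears.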
A notable finding of this study is that most of the sustainability issues concerning ports, shipping and maritime logistics relate to the economic and environmental dimensions of sustainability. However, social aspects, such as labor laws and standards, working conditions, maritime employment, regional growth and development, and the livability of communities within port regions, have attracted relatively less academic investigation within the field of maritime studies. Arguably, sustainability of the maritime industry requires that all dimensions are equally addressed.
These findings have theoretical and policy implications. From a theoretical perspective, this study has developed a broader understanding of the major themes and conceptual models to help theorize the notion of sustainability in the context of maritime studies. It enables researchers to see the value of text-mining tools in synthesizing diverse results, understand the inconsistencies in the extant body of literature, and develop new theories and models of maritime sustainability. From a policy perspective, this study has created a grounded platform to help researchers develop maritime sustainability assessments and improvement frameworks that identify and evaluate the key sustainability principles, guidelines and measurements employed in the field of maritime studies.
Future research could pay equal attention to the social aspects of maritime management, including the effectiveness of community planning, labor laws and regulations, and strategic policy-making. From a supply chain perspective, single port-to-port chains have received limited attention and require further investigation. Competitive port supply chains could therefore be incorporated in the future research agenda. Sustainability of the broader region within which a port operates needs to be integrated in the evaluation of sustainability measures on port functions, shipping services and maritime logistics operations. In addition, new case studies from developing countries and emerging economies could be encouraged to provide wider insights into sustainability issues in a globalized marketplace. Furthermore, journals could better promote interdisciplinary research across institutions and nations to enhance cross-cultural learning across different business settings and industry practices. Such research would help preserve the natural environment and mitigate the deleterious impacts of maritime operations on the natural habitat, while generating growth and employment opportunities for local port communities.