Big Data-Driven Banking Operations: Opportunities, Challenges, and Data Security Perspectives

Abstract: At present, with the rise of the information technology revolution, including mobile internet, cloud computing, big data, machine learning, artificial intelligence, and the Internet of Things, the banking industry is ushering in new opportunities and encountering severe challenges. This inspired us to develop the following research concepts to study how data innovation impacts banking. We used qualitative research methods (systematic and bibliometric reviews) to examine research articles obtained from the Web of Science and SCOPUS databases to achieve our research goals. The findings show that data innovation creates opportunities for a well-developed banking supply chain, effective risk management and financial fraud detection, banking customer analytics, and bank decision-making. Also, data-driven banking faces some challenges, such as the availability of more data increasing the complexity of service management and creating fierce competition, the lack of professional data analysts, and data costs. This study also finds that banking security is one of the most important issues; thus, banks need to respond to external and internal cyberattacks and manage vulnerabilities.


Introduction
Banking operations have been influenced by the emergence of intelligent data-driven technology systems, which involve an enormous amount of data being captured and stored at unprecedented volume, velocity, and variety [1,2]. The explosion of data and the rapid development of new technologies are significantly transforming business strategies and management through artificial intelligence (AI), IoT, big data analytics (BDA), cloud computing, blockchain, and so on [3][4][5][6]. In earlier periods, data innovation was considered insignificant for most firms; however, the value of data is now commonly accepted [7]. Most companies worldwide have started exploiting the potential opportunities offered by big data. Although heavy investments have been made in capturing and storing big data, less than 0.5% of the collected data were analysed or used [1]. This situation forces financial institutions to consider transforming existing data into valuable knowledge for management and for creating as much profit as possible for enterprises [8][9][10][11].
Data innovation is mainly considered to be a concept in the revolution of big data, which has played an increasingly important role in the growth of the banking or financial services sector [2,12,13]. A large amount of customer information can help banking organisations learn about their customers' preferences to improve service quality, enhance business profit, and, finally, satisfy their customers [14][15][16]. Based on the big data provided, an intelligent system may provide bankers with valuable tools to support the decision-making process when dealing with complicated portfolios [7]. Additionally, the rapid development of data analytics has generated a new range of services for the banking sector and provided a remarkable capability to specialise and individualise its products [10,17,18]. As competition is relatively high in the banking industry, companies need to understand their existing clients very well by learning about their customers' behaviours from data acquired in their internal systems to offer better-tailored services [16,19]. The author of [1] mentioned that current clients' data could also attract similar new customers to enhance a company's market share. Also, big data innovation provides favourable conditions for expanding financial institutions' business scope and services for customers in the banking industry. However, navigating the challenges brought by big data is also a question that the banking industry needs to consider carefully [10,20,21,22]. The growing variety of data leads to an increasing challenge in decision-making [23]. Data collection and integration alone are not sufficient for banks to use these data effectively. The challenge is to transform the collected data into strategic levers to improve customer satisfaction and, thus, company performance by tuning information quality and using big data analytics [14,24,25].
Only deep mining can help explore hidden information that can be used to provide customers with better financial products and services. The utility of big data in finance is linked to several obstacles. How can fast and efficient processing of big datasets with multi-source data be realised? How can companies deal with the fragmented data generated by financial technologies and the risk caused by rapid response demand? How can they make full use of data analysis and mining to obtain more significant economic benefits [26][27][28]? Thus, a critical task for banking operations is to create solutions that unlock the vital information hidden in big data.
Research on big data and the banking industry is new. Researchers are still trying to figure out how banks and financial institutions can utilise big datasets to improve their overall financial performance. There is a lack of studies that show how banks face challenges and deal with both external and internal data and banking security. This is considered to be a research gap to be addressed in this paper.
Based on this research gap and the significance of the above issues, this study investigates certain research questions. The questions are as follows: (i) How does big data promote the overall performance of bank operations? (ii) What types of challenges do banks face when utilising enormous datasets? (iii) How do banks deal with banking data security?
Considering the research gap and questions, our objectives are to discover data-driven opportunities, threats, and security issues in the banking industry. In addition to exploring the key objectives, this study also aims to explore the present landscape of big data innovation in the banking industry using bibliometric data.
This study has significance as an investigation that shows how big data influence banking operations as a whole. The views of various researchers, fellows, and other stakeholders related to big data and banking activities were collected and analysed. We tested existing concepts and introduced a comprehensive understanding of the current research on big data using a qualitative method. The utilisation of big data was explored from different financing perspectives to bring a clear picture to readers. As big data is still a new and emerging research topic, where researchers are trying to establish a fundamental theorem, this study provides the crucial concepts regarding the essence of big data in the banking industry.
This study offers significant contributions to the present literature. First, this study shows how the presence of big data has been shaping the banking sector. More specifically, several revolutionary innovations, such as big data analytics, machine learning, artificial intelligence, and IoT, have positively changed the entire banking supply chain, including reducing financial fraud by managing and controlling fintech risks, understanding banking customers, and helping with better data-driven banking decisions. This study also extends existing research by specifying the significant challenges associated with data-driven banking operations.

Methodology
This study followed a qualitative research methodology. Commonly, a literature review helps to identify the gaps in current research and highlight the research boundary [29]. This study used deductive logic to examine emerging trends relating to big data from banking operation perspectives [30]. The structured processes of this qualitative research methodology are described in the sections below.

Study Selection
This study followed specific selection measures to identify relevant studies for our review. Inclusion criteria were set up; for example, selected studies had to be empirical and utilise an analytical and qualitative strategy [31]. These strategies helped us construct an initial framework with self-contained components, which finally led to the study selection process. Initially, we collected relevant literature during the study selection process. This collection process focused not only on specific areas but also on the use of big data for different business aspects. As research on big data in banking is a new endeavour, obtaining studies with more than 5 to 10 years of data was somewhat difficult. The relevant literature was included in this study if it was published by 28 February 2022.

Database Selection
The data for this research were collected from secondary sources. Initially, both the Scopus and Web of Science (WoS) databases were used as the main search engines [32,33]. These two databases are the most accepted and well-known databases worldwide, with more than 20,000 journals in Scopus and 12,000 journals in the Web of Science [30]. Not all articles make significant contributions with a high acceptability, and the literature relating to big data in banking is not yet well established. This is why we mainly focused on Elsevier, Taylor & Francis, Springer, Emerald, Wiley, Sage, Informs, IEEE, and other databases to collect relevant articles with a high acceptability.

Data Collection
Data collection emphasised banking aspects and industrial elements, particularly in the finance industry. A purposive sampling method was used because our research focused on particular fields [34]. For the data collection process, we followed [29], a well-known study on qualitative research. After collecting the data, initial screening was performed to sort out the relevant literature relating to the banking industry; in some cases, the literature relating to big data and finance, business, management, and other industries was prioritised. Additionally, some web-based contents were also collected and noted as important readings. As big data analytics is a concept that emerged only recently, webpage contents also helped extend our discussion from different viewpoints. After the initial screening, the relevant articles relating to the research field were selected for further reading. Then, the most important step was article coding. As this study focused on different views of big data and the banking industry, article coding made the study selection process markedly easier. Also, we followed the data analysis methods described in [35], a complete study analysing bibliometric data using RStudio (R version 4.2.1). The entire step-by-step biblioshiny process is described in Appendix A.

Keyword Searching
Keyword searching was one of the most important issues in the initial stage. We searched for articles whose title, abstract, keywords, and main text are highly related to the topic [36]. First, we searched titles with the terms "big data" AND "banking", and we found 62 articles using the Scopus database and 65 articles using the WoS database. In the second stage, "big data" AND "banking" were specified as the topic in the WoS, and we found 1460 articles. The corresponding search in the Scopus database, restricted to the article title, abstract, and keywords, returned 959 articles. After removing 767 duplicates from both databases, 1652 articles were finally included for the bibliometric analysis.
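The record counts above follow from simple arithmetic: 1460 WoS records plus 959 Scopus records give 2419, and removing 767 duplicates leaves 1652. The sketch below checks that arithmetic and illustrates one common way to de-duplicate merged database exports, matching on a normalised DOI and falling back to a normalised title. The record fields (`doi`, `title`), the helper names, and the matching rule are assumptions made for illustration; the study does not state how duplicates were identified.

```python
import re


def record_key(record):
    """Return a matching key: the lower-cased DOI if present, else a normalised title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Collapse punctuation and whitespace so trivially different titles match.
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return ("title", title)


def merge_and_deduplicate(*sources):
    """Merge several record lists, keeping the first record seen for each key."""
    seen = {}
    for source in sources:
        for record in source:
            seen.setdefault(record_key(record), record)
    return list(seen.values())


# Hypothetical toy records, not taken from the study's dataset.
wos = [
    {"doi": "10.1000/example.1", "title": "Big Data in Banking"},
    {"doi": "", "title": "Credit Scoring with Machine Learning"},
]
scopus = [
    {"doi": "10.1000/EXAMPLE.1", "title": "Big data in banking"},
    {"doi": None, "title": "Credit scoring with machine learning."},
    {"doi": "10.1000/example.2", "title": "Fintech and Risk"},
]
merged = merge_and_deduplicate(wos, scopus)
print(len(merged))

# Counts reported in the text.
wos_hits, scopus_hits, duplicates = 1460, 959, 767
final_count = wos_hits + scopus_hits - duplicates
print(final_count)
```

Matching on the DOI first avoids false merges between different papers with similar titles, while the title fallback catches records whose DOI is missing in one export.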

Data Inclusion and Exclusion
Data inclusion and exclusion processes are also crucial for a qualitative study. We followed the data inclusion and exclusion processes of [30]. First of all, the authors of [30] identified the literature inclusion criteria. Once articles were found to meet the inclusion criteria, they applied the exclusion criteria to eliminate the irrelevant literature considering the main theme of the research. We extended the data inclusion and exclusion process based on [36], which highlights definite inclusion and exclusion processes. The data inclusion process of [36] was limited to only 32 top-tier peer-reviewed journals; in contrast, we extended the inclusion process to journals with a good journal index, such as those indexed in the WoS and Scopus. After collecting the articles, we first divided them into two groups according to whether they met the research objectives or not. Afterwards, during the initial screening, related and even partially related articles were selected for further processing and planning of the search protocol; this was followed by screening of the abstracts and headings, which resulted in the selection of a good segment of relevant articles for further processing. Some articles were excluded, particularly those that do not focus on benefits, big data, and banking, as well as those publications that do not focus on real applications of big data in banking.

Research Framework
Researchers and graduate students often utilise a structured research framework in their research and stay up to date about new variations of qualitative research frameworks [37]. A research framework is essential to classify the structure of qualitative research based on data from a literature review [38]. It also helps with data coding and interpretation [39]. Based on these concepts, this qualitative study's methodology followed a structured research framework described by the authors of two previous studies, [34,40]. The framework is presented in Figure 1.

Big Data
The author of [41] introduced the most common definition of big data based on a paradigm of the three V's, known as volume, velocity, and variety. In terms of data volume, it was estimated that 1.7 megabytes of new information would be generated every second for every person across the world by 2020 [1]. In terms of velocity, big data involves fast data creation speed, which is even more critical than volume for many applications. The collection of tick-by-tick data or nearly real-time information allows companies to be much more agile than their competitors [42]. The third one is variety, which is the most interesting of the three V's; that is, big data comprises data of different natures [43]. For example, big data comes in the forms of messages, texts, images, updates, videos, web searches, financial transactions, emails, and posts on Facebook, Twitter, and other social networks.
FinTech 2023, 2 488
Several scholars claimed there should be the addition of two more V's in the definition of big data, that is, variability and virality. Variability refers to the contextualisation of big data, while virality requires the growth of big data to be exponential [44].

Big Data Analytics
Big data analytics (BDA) is considered a tool for extracting value from huge volumes of data and information to drive new market opportunities with maximum customer retention [45]. BDA is a complex process of examining large datasets to uncover important information, such as market trends, hidden patterns, customer preferences, and correlations, to make informed business decisions [46][47][48]. Top-performing organisations that use big data analytics perform five times more efficiently than organisations that do not utilise big data analytics [48].

Machine Learning
The authors of [49] stated that machine learning (ML) "...is an evolving branch of computational algorithms that are designed to emulate human intelligence by learning from the surrounding environment". ML is considered the workhorse of the so-called big data technologies. ML is also regarded as lying at the intersection of statistics and computer science and at the core of data science and AI [50,51]. In the banking industry, ML is widely used to make algorithm-based predictions [52,53].

Artificial Intelligence
The term artificial intelligence (AI) refers to the use of a computer to model intelligent behaviour with minimal human intervention [54]. The author of [55] described AI as intelligent human behaviour consisting of processes that can be formalised with an algorithm and reproduced in a machine. Also, Professor John McCarthy, regarded as the father of AI (http://jmc.stanford.edu/articles/whatisai.html (accessed on 16 February 2021)), described AI as the science of making intelligent machines, especially intelligent computer programs; it is related to the similar task of using computers to understand human intelligence. However, one important issue is that AI does not have to confine itself to biologically observable methods [56].

Internet of Things (IoT)
The Internet of Things (IoT) refers to the network of physical objects embedded with software, sensors, and other technologies to connect and exchange data between devices and specified systems over the internet [57,58]. The IoT is a smart environment that uses interconnected sensing and actuating devices to share information across platforms through a unified connecting framework [59]. IoT applications provide reliable device-to-device and human-to-device interaction services [60]. Smart home appliances are a good example of the IoT; for example, thermostats, lights, refrigerators, cars, and other appliances can all be interconnected through an IoT network.

Data-Driven Banking
Progressive regulations and technological developments, particularly information technology and big data, have led to digital banking operations and virtual banking systems worldwide [42,61]. Also, data-driven services offered by a service provider are always expected by customers to operate in real time [62]. This is why the market is moving fast to sustain market dynamics, which ultimately supports banking operations' digital growth for a more extended period and towards better banking systems. However, current regulations, transformation, technological advancements, and innovations will not work unless incumbent institutions introduce a data-driven approach with a high financial performance that can provide more significant profits [22,63,64,65,66]. Data innovation increases market competition in different aspects of the banking industry [67,68], promoting the entire banking system with diversified opportunities. Data-driven approaches help financial institutions to perform better in mergers and acquisitions [69], predict the success of banking telemarketing [70] and banking supply chain [71], and evaluate banking performance [14,72].

The Present Landscape of Big Data and Business
The rapid development of interconnected mobile networks, the IoT, and social networks has resulted in the exponential growth of diversified data. Semi-structured and unstructured data source channels have become more complicated, leading to a modern digital information era. As the demand for data innovation increases daily, the revenue of big data analytics software grows rapidly. In 2011, the revenue was USD 32.14 billion and had almost doubled by 2018 [73]. Also, the revenue of big data and business analytics is increasing worldwide from year to year. The global big data and business analytics market was valued at USD 168.8 billion in 2018, and it is predicted to grow to nearly USD 274.3 billion by 2022, with a five-year CAGR (compound annual growth rate) of 13.2 percent (https://www.statista.com/statistics/551501/worldwide-big-data-business-analytics-revenue/ (accessed on 16 February 2021)). Global information is also snowballing, generating hundreds of billions of data points daily [40]. These data contain information on past and present internal operations and external activities of different industries. Every day, trillions of data points provide numerous opportunities to people, such as effective communication at a lower cost, using global information systems to work together from different places, making decisions, monitoring transaction processes, and providing control measures. Global information systems also help overcome differences in distance, time, language, and culture for people to cooperate effectively. Cooperation can be improved through groupware software, group decision support systems, extranets, and electronic meeting facilities. For these reasons, the amount of information is increasing globally (see Figure 2).
As shown in Figure 2, the last decade was a booming decade for data innovation. The global volume of information was almost two zettabytes in 2010, which increased remarkably to 64.2 zettabytes in 2020. Information has been produced rapidly after 2019, and the amount of information will increase by more than six times in the following decades. Moreover, according to Analytics Insight, 2019 was an important year in the big data landscape. After the merging of Cloudera with Hortonworks at the beginning of the year, the use of big data has been on the rise globally, and organisations have begun to accept the importance of data operations and orchestration to their business success (https://www.talend.com/resources/what-is-cloudera/; https://www.pacificdataintegrators.com/insights/cloudera-hortonworks-merger; (accessed on 16 February 2021)). The value of the big data industry was USD 189 billion, an increase of USD 20 billion over 2018, and it continued to increase, reaching USD 247 billion in 2022. Big data trends will be the centre of attraction for data scientists, data officers, and managers; big data analytics will significantly impact investment and cloud-based operations; and ML will become the focus [75]. These trends are expected to capture significant market value; for example, the value of data innovation in cognitive computing is expected to reach nearly USD 18.6 billion. Data innovation in application infrastructure will reach almost USD 11.7 billion, and public safety and homeland security will reach about USD 7.5 billion. Real-time data will also be considered a fundamental value proposition in every case, segment, and solution.
Additionally, leading market companies are also rapidly integrating data-driven innovative technologies with IoT infrastructure (https://www.globenewswire.com/news-release/2020/03/18/2002786/0/en/Global-Big-Data-Market-Insights-2020-2025-Leading-Companies-Solutions-Use-Cases-Business-Cases-Infrastructure-Technology-Integration-Industry-Verticals-Regions-and-Countries.html (accessed on 16 February 2021)). Therefore, it is crucial to consider data management in any innovative decision-making and corporate activities.
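The growth figures quoted in this section can be cross-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below applies it to the market values (USD 168.8 billion in 2018 and USD 274.3 billion in 2022) and to the data volumes (about 2 zettabytes in 2010 and 64.2 zettabytes in 2020). The interval lengths of 4 and 10 years are assumptions read off the quoted years. The cited 13.2 percent is described as a five-year figure, so it presumably uses a different base year and need not match the rate implied here exactly.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market value quoted in the text: USD 168.8 billion (2018) to USD 274.3 billion (2022).
# Assumption: four compounding years between the two quoted values.
market_rate = cagr(168.8, 274.3, years=4)

# Data volume quoted in the text: about 2 ZB (2010) to 64.2 ZB (2020).
# Assumption: ten compounding years, with "almost two" taken as exactly 2.0.
data_rate = cagr(2.0, 64.2, years=10)

print(f"market: {market_rate:.1%}, data volume: {data_rate:.1%}")
```

Under these assumptions, the implied market growth is roughly 12.9% per year, close to the cited 13.2 percent, while the data volume grows by roughly 41% per year, which illustrates how much faster data is accumulating than the revenue generated from analysing it.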
As a single field, data innovation is mostly related to the banking industry, discrete manufacturing, professional services, process manufacturing, and federal government activities [27,76,77,78,79,80,81]. As shown in Figure 3, the banking industry is the biggest single sector from which big data and business analytics revenues are generated, accounting for nearly 14% of the total revenue. Discrete manufacturing is the second biggest sector, contributing almost 11.3% of the total revenue. Professional services and process manufacturing are at the same level, each accounting for 8.2% of the total revenue. As the banking industry is the biggest one, identifying the advantages and challenges of data innovation is very important for every phase.

Present Literature
A brief review of existing literature on big data and banking is also presented here. Figure 4 presents the highly discussed issues in this research area as a word cloud of big data and banking research based on the bibliometric data analysis. The highlighted words include big data, banking, machine learning, big data analytics, fintech, mining, credit risk, cloud computing, classification, credit scoring, data models, financial performance, risk, competition, feature selection, sentiment analysis, challenges, and so on. The systematic result is presented in the results sections based on the aforementioned keywords. Also, how big data is linked to banking research is important. Figure 5 shows how banking is connected with data innovations.
We used keyword plus terms, authors' keywords, and keywords from the title of the manuscripts for the bibliometric analysis. However, Figure 5 includes only our keywords. The detailed keyword analysis is also presented in Table A1 (see Appendix A). A thematic map was also produced and is presented in Figure 6. As shown in Figure 6, the highlighted thematic groups are as follows: The basic themes are (i) data development analysis, efficiency, and bank, and (ii) financial crisis, banking industry, and corporate social responsibility. The motor themes are liquidity, corporate governance, and data. There are two themes that fall under both basic and motor themes. These are (i) big data, machine learning, and banking, and (ii) bank, China, and financial performance. The niche themes are (i) bank size, banking sector, and content analysis; (ii) India, profitability, and customer satisfaction; and (iii) Islamic banks and Pakistan. The declining or emerging themes are (i) systemic risk, financial stability, and too-big-to-fail, and (ii) personality traits, financial inclusion, and microfinance.
Also, we present the growth and increasing trend of big data and banking research in Figure A1: cumulative word growth of 20 words from 1991 to 2021, and Figure A2: year-by-year cumulative word growth from 1991 to 2021 (see Appendix A).
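Word clouds such as the one in Figure 4 are typically driven by keyword frequencies. The study's own analysis was run with biblioshiny in R; purely as an illustration of the underlying idea, a frequency count over author keywords might be sketched as follows. The record layout, the `;` separator, the function name, and the toy keyword strings are all assumptions and do not come from the study's dataset.

```python
from collections import Counter


def keyword_frequencies(records, field="keywords", separator=";"):
    """Count in how many records each keyword appears (case-insensitive)."""
    counts = Counter()
    for record in records:
        raw = record.get(field, "")
        # Use a set so a keyword repeated within one record is counted once.
        unique = {kw.strip().lower() for kw in raw.split(separator) if kw.strip()}
        counts.update(unique)
    return counts


# Hypothetical toy records for illustration only.
records = [
    {"keywords": "Big Data; Banking; Machine Learning"},
    {"keywords": "big data; credit risk; banking"},
    {"keywords": "Fintech; Big Data"},
]
freq = keyword_frequencies(records)
top = freq.most_common(2)
print(top)
```

Counting each keyword once per article keeps a single paper with repeated terms from dominating the ranking; a word cloud then scales each term by its count.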
Many researchers discussed the concept of big data for industrial usage in different periods; however, the discussion mainly came to the fore from 2014 onwards. According to the bibliometric findings, most research studies were published after 2018; thus, we consider this to be a rapidly emerging research topic. Table 1 presents the most notable contributions to big data and banking. Table A2 (see Appendix A) includes the top 50 most cited studies in this research field. The countries with the most publications are also presented here (see Table A3 in Appendix A). As shown in Table A3

Data-Driven Opportunities in Banking Operations
Big data dramatically influences many aspects of financial services and the banking industry, including the financial market [82]; internet credit service companies [83]; internet finance [84]; management, analysis, and applications [80]; and credit banking risk analysis and risk management [85,86]. Building on these studies, the subsections below discuss the influence of big data on the banking sector.

Banking Supply Chain
Banks can extend their reach into a broader supply chain network by including affiliated or upstream/downstream companies along their related supply chains. Based on real-time upstream and downstream supply chain data, supply chain finance (SCF) risk can be calculated using different assessment models [87]. IoT-based risk management using big data analysis is a useful instrument for predicting different SCF risks [88,89]. Moreover, banks use big data analytics to explore internal business-to-business (B2B) data to improve SCF, identify potential corporate customers, or improve the business offerings to existing corporate customers along the business's supply chain [90]. The authors of [22] identified big data analytics as an important factor in banking customer marketing and risk management performance. Using big data, banks can track client behaviour continually in real time and monitor supply chain performance, which ultimately boosts overall profitability and performance [91].

Bank Risk Management
The rapid growth of massive data and the increasing maturity of big data technologies have made it possible to manage risks in the banking industry based on big data analysis. Building an advanced risk management system has become one of banks' core competitive capabilities [27,66,92]. Big data is also becoming more crucial for financial risk analysis. In particular, with the rise of cybercrime, big data analysis helps detect patterns that suggest a potential banking cybersecurity threat [93,94]. Banks may also obtain real-time insight into their risks and use it to derive a risk management strategy by applying data science technology, which integrates predictive algorithms to analyse large datasets in conjunction with risk assessment [48,95,96]. Finally, banks gain a wealth of insight into organisational risk by leveraging various sources of big data, allowing for threat assessment and mitigation [89,97].

Financial Fraud Detection
Scammers can quickly drain personal financial accounts or steal thousands of dollars from credit cards. Worse still, organised criminal groups can execute well-designed plans and illegally obtain millions of dollars. Financial fraud detection is therefore essential to minimise risk for the organisation. Big data assists in banking fraud detection in different ways; for example, (i) companies use improved fraud detection methods based on real-time big data analysis [26,98]; (ii) algorithms based on real-time transactions and high-speed data generation from diverse sources help banks detect fraud more accurately; (iii) modern systems can detect fraud faster and more consistently using ML algorithms [99]; and (iv) big data allows banks to prevent unauthorised transactions by providing banking at a safe and secure level, which raises the security standard of the banking industry.
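
As a rough illustration of item (ii), the sketch below flags a transaction whose amount deviates strongly from a customer's own history. It is a deliberately simple z-score rule, not one of the ML methods surveyed in [99]; the function name `flag_suspicious`, the threshold of three standard deviations, and the sample amounts are all assumptions made for this example.

```python
from statistics import mean, stdev


def flag_suspicious(history, new_amount, threshold=3.0):
    """Return True if new_amount is an upper outlier relative to history.

    history: past transaction amounts for one customer (hypothetical data).
    threshold: number of standard deviations treated as suspicious (assumed).
    """
    if len(history) < 2:
        # Too little history to estimate a spread; do not flag.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts identical: any larger amount is unusual.
        return new_amount > mu
    return (new_amount - mu) / sigma > threshold


past_amounts = [20.0, 25.0, 22.0, 30.0, 27.0]
print(flag_suspicious(past_amounts, 28.0))   # typical purchase
print(flag_suspicious(past_amounts, 500.0))  # far outside the usual range
```

In practice, a rule of this kind would more plausibly serve as one input feature among many for a trained model than as a stand-alone detector.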

Customer Insight and Marketing Analytics
Big data helps banks obtain a 360-degree view of their customers [100][101][102]. "A 360-degree view of the customer is the concept of being able to view and analyse all of the data you have about every single customer in isolation, in one location" (https://www.mycustomer.com/hrglossary/360-degree-view-of-the-customer (accessed on 16 February 2021)). Using a large dataset, the 360-degree view operates like a crystal ball by offering an inside look into past, present, and future customer data and customer-organisation relationships. It therefore supports diverse banking functions such as bank loan risk analysis, client segmentation, and client sentiment analysis [103,104]. In addition, big data enables banks to gain insight into customers' consumption habits and patterns, thereby simplifying the task of determining their needs and demands [105]. The study in [105] focused mainly on "big data customer segmentation" (https://bigdata-madesimple.com/role-big-data-banking-industry/ (accessed on 16 February 2021)). By tracking each customer's transactions, banks can categorise customers based on several parameters, such as standard services, preferred credit card spending, and even net worth. Customer segmentation allows banks to better target customers through relevant marketing activities tailored to customer needs.
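
The categorisation described above can be sketched as a small rule-based segmentation. This is only an illustrative example: the `Customer` fields, the segment names, and the cut-off values are hypothetical and are not taken from [105] or from any bank's actual practice.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    monthly_card_spend: float  # average credit card spending per month
    net_worth: float


def assign_segment(customer):
    """Map one customer to a segment using assumed, illustrative cut-offs."""
    if customer.net_worth >= 1_000_000:
        return "high_net_worth"
    if customer.monthly_card_spend >= 2_000:
        return "heavy_card_user"
    return "standard"


def group_by_segment(customers):
    """Return {segment name: [customer ids]} for marketing targeting."""
    groups = defaultdict(list)
    for customer in customers:
        groups[assign_segment(customer)].append(customer.customer_id)
    return dict(groups)


customers = [
    Customer("C001", 350.0, 40_000.0),
    Customer("C002", 2_800.0, 120_000.0),
    Customer("C003", 900.0, 2_500_000.0),
]
print(group_by_segment(customers))
```

Fixed thresholds keep the example readable; a data-driven alternative would derive the segments from the transaction data itself, for instance with a clustering method.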

Banking Decision
Big data management is considered a tool that allows a company or an institution to generate, manipulate, and manage massive datasets within a prespecified timeframe. Big data helps promote progress transparency, auditability, and executive oversight of any enterprise's risk [106], thus improving its decision-making ability. A digitalised big dataset is a valuable tool for improving business decision-making by offering diversified options. Comprehensive methods, analytical techniques, and standards for describing and managing decisions are essential for banking success [107,108]. The authors of [108] identified a big data value chain for decision science: it starts with data capture, followed by data curation, data analysis, and data visualisation, and ends with decision-making using the visualised data. Big data processing and big data analytics capabilities influence the quality of banks' decision-making; however, a lack of knowledge regarding big data analytics also affects that quality [109]. According to [110], big data analysis during the decision-making process adds value to the decision made.
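
To make the five stages of the value chain in [108] concrete, the following toy pipeline chains one function per stage. Everything in it is an assumption for illustration: the loan records, the default-rate metric, the text-based "chart", and the 50% decision limit are hypothetical and do not come from [108].

```python
def capture():
    """Data capture: raw loan records (hypothetical sample data)."""
    return [
        {"segment": "retail", "defaulted": True},
        {"segment": "retail", "defaulted": False},
        {"segment": "retail", "defaulted": False},
        {"segment": "sme", "defaulted": True},
        {"segment": "sme", "defaulted": True},
        {"segment": None, "defaulted": False},  # incomplete record
    ]


def curate(records):
    """Data curation: drop records with a missing segment."""
    return [r for r in records if r["segment"] is not None]


def analyse(records):
    """Data analysis: default rate per segment."""
    totals, defaults = {}, {}
    for r in records:
        seg = r["segment"]
        totals[seg] = totals.get(seg, 0) + 1
        defaults[seg] = defaults.get(seg, 0) + int(r["defaulted"])
    return {seg: defaults[seg] / totals[seg] for seg in totals}


def visualise(rates):
    """Data visualisation: one text bar per segment (10 marks = 100%)."""
    lines = []
    for seg, rate in sorted(rates.items()):
        lines.append(f"{seg:<8}{'#' * round(rate * 10)} {rate:.0%}")
    return lines


def decide(rates, limit=0.5):
    """Decision-making: segments whose default rate exceeds the limit."""
    return sorted(seg for seg, rate in rates.items() if rate > limit)


rates = analyse(curate(capture()))
for line in visualise(rates):
    print(line)
print("Review lending policy for:", decide(rates))
```

In this sketch the decision step reads the same per-segment rates that the visualisation displays; in a real setting, an analyst would interpret the chart before deciding.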

Challenges Faced by Banks in the Era of Big Data
Big data's 4V characteristics pose different challenges for management, analytics, finance, and various other applications. These challenges include effectively organising and managing banking sectors, finding novel business models, and handling traditional banking issues [80]. Data-driven development methods will also have a disruptive impact on the sector's future. For a long time, banks have focused on collecting and storing enormous amounts of data; however, they face significant challenges when attempting to make full use of such data. This section summarises the significant challenges, considering the banking sector's typical characteristics in the era of big data.

Changes in Banking Operation
Advances in information and Internet technology are changing the banking industry's operations and regulatory policies. They have lowered the industry's entry barriers, allowing non-financial institutions to enter the financial system and use their technological advantages and blind spots in the regulatory process to gain competitive advantages [111]. The relationship between data innovation and banking will affect management virtualisation and product virtualisation [112][113][114]. Under management virtualisation, the various documents and vouchers in the banking business will take the form of digital files within an electronic, data-based management model, which will continue to affect the traditional commercial banking operating model. Product virtualisation will also influence big data management, as different products will change their profit-earning behaviour and be exchanged at a faster rate as various types of data signals become available. These changes significantly affect traditional banking habits [40,115].

Complex Service Management
With the rapid development of the e-banking business, big data promotes customer solutions with more choices and more independent requirements than before, which can help sustain profitability and generate business value [22,27]. Commercial banks must promote individualised services through active marketing to new customers with diversified opportunities and expanded services, which makes everyday banking operations more complex than traditional banking services [42,44,116]. In addition, unstructured and heterogeneous datasets are extremely complex and require real-time or near-real-time analysis [96]. Big data also poses a challenge because data accidents are inevitable, and employers may face this challenge when regulating the use of big data in their organisations [117].

The Highly Competitive Market for Commercial Banks
The ability to obtain big data will determine the competitiveness of commercial banks. Commercial banks themselves hold large amounts of customer and transaction data; the more data they collect and process, the more they can gain from data innovation, leading to fierce competition among banks [118,119]. Therefore, competing with larger firms becomes more difficult than in the earlier era of traditional banking. The choice of big data technology also involves decision-making risks [120]. In addition, the analysis and processing of unstructured data are costly [121,122]. Unstructured and semi-structured data, which are widespread in social networks, e-commerce platforms, and other media, require more complex methods to handle their massive fragmentation, at higher cost [98,120,123]. The costs of big data management ultimately reduce overall profitability and, in turn, affect market competitiveness.

Changes in Banking Operation
Advances in information and Internet technology are changing the banking industry's operations and regulatory policies. These changes make banking operations more complex than in traditional banking [96,108]. Banks are also increasingly implementing technological operations that facilitate complex service scenarios [124]. In addition, utilising vast volumes of complex data complicates the business process workflow. Nevertheless, banks must utilise such complex data and analyses to improve their overall business operations and financial performance [125].

Lack of Professional Data Analysts, Experience, and Knowledge
Most commercial banks have not established a mature data analysis team with high sensitivity to the value of data, professional ability, and data analysis experience. In particular, a lack of knowledge and experience in handling big data and a lack of confidence are considered some of the biggest human barriers to the use of big data in the banking industry [114,126]. Many data analysts are good at using data analysis to find the causes of problems that have already occurred, but they lack the ability to discover unknown issues. A lack of top management support and skills is also regarded as an important issue [127,128]. There is also a shortage of the skills needed to predict the value of big data and future data trends precisely [128][129][130]. The authors of [131] identified a general shortage of data security experts in the banking industry.

The Costs of Data
To better adapt and control the big data model, the banking industry must establish its own commercial big data platform and algorithms [132,133]. This big data platform must collect structured business data, manage all kinds of unstructured data, and even compare the collected data with historical data, which requires banks to process data efficiently [114,134]. Achieving this goal usually requires cloud computing technology, distributed computing technology, and redundant configuration technology. From this perspective, although big data promises high benefits for the banking industry, its costs are substantial and unavoidable [120,135].

Banking and Data Security
Database vulnerabilities, privacy breaches, and leakage of users' information by internal employees occur frequently in companies [136], particularly in the banking industry. Thus, building sound data security and data privacy systems remains an urgent need in the banking industry [137,138]. As the banking industry accelerates into the industrial internet era, new information technologies and applications, such as big data, artificial intelligence, and cloud computing, continue to develop, and the process of digitisation and industrial upgrading is accelerating [61,76,139,140]. However, under the current wave of digitisation, the continuous expansion of business boundaries has also led to frequent cybersecurity issues in the banking industry [1,141]. Because it lacks an organisational structure, management system, and other arrangements compatible with data security, the banking industry faces relatively severe data security risks [145,146]. These risks involve both external and internal parties; the subsections below address external cyberattacks, internal security awareness, and the management of vulnerabilities.

Dealing with External Cyberattacks
Banks are lucrative targets: cybercriminals typically attempt to gain internal access to, damage, or alter targeted network systems in order to steal funds and fundamental financial data, such as bank account details and credit card information [147]. Attacks in the banking sector are currently intensifying. External hackers continuously target banks in different forms, such as ransomware, malicious outsourcing, compromised remote sites, Stuxnet, ICS insider, IT insider, Trojan horses, worms, and distributed denial-of-service (DDoS) attacks [142,143]. An example is the theft of USD 101 million from Bangladesh's central bank by unidentified hackers on 5 February 2016 [144,148]. The main purpose of cyberattacks by external parties is not to destroy the targeted system but to gain financial benefits [149]. Banks still cannot completely prevent network security incidents, and 100% security cannot be ensured [150]. However, banks are working to build robust defences against external cyberattacks [151].

Internal Security
Improving internal security is also one of the most critical issues for banking applications in the financial sector [127]. It involves educating internal parties on how to deal with financial terrorism and related threats, such as banking operation risks, cyberattacks, and phishing attacks [131,152,153]. It is also vital to raise customers' awareness of information security by educating them about online banking activities and the dangers of phishing, ACH and wire fraud, malware, and other risks through materials such as articles, posters, videos, email campaigns, and newsletters [154]. Additionally, training internal parties to adapt readily to technological devices is crucial [155,156]. The authors of [156] emphasised that employees' knowledge of their organisation's information security policies and procedures helps to increase information security awareness.

Management Vulnerabilities
Managing different types of vulnerabilities is also a crucial issue for banks, and how banks address vulnerabilities is becoming increasingly important at every level of banking operation. The author of [157] mentioned some critical methods to manage and mitigate vulnerabilities. These include (a) identifying and categorising information assets, such as confidential data, highly restricted data, data for internal use, and publicly used data, based on the data value and level of sensitivity; (b) implementing a cybersecurity risk assessment process; and (c) educating internal employees about cybersecurity issues. It is also essential to prepare an IT expert team to prioritise the most critical parts of the network and to adopt network segmentation as a strategic policy to reduce information threats [158]. The author of [159] added regular vulnerability assessment as an important tool to reduce information threats. However, it is strongly emphasised that vulnerabilities should be managed by a separate department whose expert team combines technical expertise with business intelligence.
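
One way to combine the asset categorisation in (a) with the results of a vulnerability assessment is to rank findings by the sensitivity of the affected asset. The sketch below is a minimal illustration under stated assumptions: the numeric ordering of the four categories, the 0-10 severity scale, the "sensitivity times severity" score, and the sample findings are all hypothetical and are not prescribed by [157] or [159].

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    # Ordering is an assumption for this sketch; [157] only names the categories.
    PUBLIC = 1
    INTERNAL_USE = 2
    CONFIDENTIAL = 3
    HIGHLY_RESTRICTED = 4


def prioritise(findings):
    """Sort findings so that the riskiest come first.

    findings: list of (asset_name, Sensitivity, severity) tuples, where
    severity is a hypothetical 0-10 score from a vulnerability assessment.
    Risk is approximated as sensitivity level times severity.
    """
    return sorted(findings, key=lambda f: f[1] * f[2], reverse=True)


findings = [
    ("marketing website", Sensitivity.PUBLIC, 9.0),
    ("staff intranet", Sensitivity.INTERNAL_USE, 6.0),
    ("card-holder database", Sensitivity.HIGHLY_RESTRICTED, 7.0),
    ("loan application archive", Sensitivity.CONFIDENTIAL, 4.0),
]
for asset, level, severity in prioritise(findings):
    print(f"{asset}: {level.name}, severity {severity}")
```

The point of the example is that a severe flaw on a public asset can rank below a moderate flaw on highly restricted data once sensitivity is taken into account.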

Discussion
Big data's proliferation provides favourable conditions for expanding financial institutions' business scope and customer services in the banking industry. How to navigate the challenges brought by big data is also a question that the banking industry needs to consider carefully [22,27,116,160]. This study therefore applied a qualitative research methodology to identify the different aspects of big data's implications for the banking industry.
First, the banking sector is undergoing changes because of big data, which gives banks multiple opportunities to improve their processes, reduce risks, and spur development. Big data makes consumer services more transparent, which fosters loyalty and trust, and makes it possible to provide more individualised suggestions to customers. Data-driven products and services create new income sources. Big data also improves risk management capabilities and supports banks' regulatory compliance. Overall, it has a significant influence on banking, giving banks the ability to control risks, optimise operations, and encourage corporate growth.
Second, the banking sector faces increasing challenges in managing and applying big data technology due to changes in banking operations, a more competitive market, risks in selecting big data technology for decision-making, a lack of professional data analysts, and high costs. Third, big data management plays a crucial role in controlling and reducing banks' vulnerabilities when dealing with external cyberattacks as well as in raising internal security awareness. Banking security has always attracted attention from banks. Although commercial banks have increased their investment in data security management in recent years, factors such as long business chains and complex software and hardware systems have further increased the hidden dangers of big data. Storing big data properly and using it safely and legally is particularly important in such cases [161][162][163][164][165].
In response to banking security issues, the banking industry should emphasise the core elements of its data security system, including transaction security, security compliance, network security technology, information security, and the entire life cycle of data. Among them, transaction security is an essential factor that distinguishes the banking industry from other industries and an essential object of future security protection. The other elements of banking security systems are common across the banking industry. With the subsequent promulgation of supporting laws, regulations, and industry regulatory standards for data security, future banking data security systems must be protected against all kinds of cyberattacks.

Conclusions
Big data has been transforming the way banks operate. It is also evident that this transformation is only in its infancy. The potential of using big data to improve banking operations is enormous. Banks can increase their profit substantially. Despite the significant prospect of big data in the banking and financial industry, there is a very limited number of in-depth systematic and bibliometric analyses available on the issue. To address this issue, this study stated its purposes and conducted an in-depth analysis to show how an enormous amount of data creates opportunities for financial institutions, particularly banking institutions. Also, this study discussed the challenges and different banking and data security issues. Finally, this study showed that big data significantly influences the banking supply chain, bank risk management, detection of financial fraud, analysis for customer insights, and major banking decisions. Although big data brings several benefits, some drawbacks are evident. This study showed that the major challenges are complex service management, increasing market competition, significant changes in banking operations from manual services to automation, the lack of professional data analysts, and finally, the costs of utilising big datasets. Regarding banking and data security, this study mentioned that banks should carefully exploit big data to improve their performance while minimising security incidents. Identifying the solutions for these issues requires the involvement and cooperation of data scientists, marketers, lawyers, managers, and regulators.

Future Research Direction
Although this research is detailed and covers relevant literature gathered from peer-reviewed journals, the current work still has some limitations. We used the Scopus and Web of Science databases, which are popular among researchers, to select relevant articles; articles not indexed in these databases might still be relevant to the study scope. This study highlights some challenges that banks encounter in the application of big data. However, a comprehensive framework to overcome these challenges has not been systematically addressed. Therefore, future theoretical and empirical research should address big data challenges in the banking industry. This study also suggests further research on improving efficiency using big data analytics in the banking industry.

Data Availability Statement:
Data will be available upon request.

Acknowledgments:
We are grateful to the anonymous reviewers who provided comments on this manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Step-by-Step Bibliometrix Package: The bibliometrix process was carried out using the programming software R (version 4.3.1 for Windows) and RStudio. Using this software, we followed these steps:
Step 1-install "bibliometrix" package. Use the command "install.packages("bibliometrix")".
Step 6-merge both datasets into one single dataset.
Step 7-analyse the data to fulfill the research purpose.