Systematic Review

Research Progress of Tumor Big Data Visualization

1
Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430070, China
2
College of Biology, Hunan University, Changsha 410082, China
*
Author to whom correspondence should be addressed.
Electronics 2023, 12(3), 743; https://doi.org/10.3390/electronics12030743
Submission received: 6 January 2023 / Revised: 26 January 2023 / Accepted: 31 January 2023 / Published: 2 February 2023
(This article belongs to the Section Computer Science & Engineering)

Abstract

Background: As the number of tumor cases increases significantly, so does the quantity of tumor data. The mining and application of large-scale data have promoted the development of tumor big data. Visualization methods present the key information in a large volume of data clearly and make that information easier for people to absorb, so they are a key part of the development of tumor big data. Process: This paper first summarizes the definition, sources, characteristics, and applications of tumor big data and reviews the current state of research on tumor big data visualization in China and abroad. It then focuses on four mainstream approaches to presenting tumor big data, namely the visualization of tumor spatiotemporal data, of tumor hierarchical and network data, of tumor text data, and of multidimensional tumor data, and gives specific application scenarios for each. Next, the paper describes the advantages, disadvantages, and scope of use of five readily available data visualization websites and software packages. Finally, the paper analyzes the open problems in tumor big data visualization, summarizes the visualization methods, and discusses the future of the field.

1. Introduction

According to the latest global cancer burden data released by the World Health Organization and the International Agency for Research on Cancer in 2020 [1,2], cancer is the first or second leading cause of death before the age of 70 in 112 countries. Meanwhile, because of population aging, the cancer burden is expected to increase by 50% by 2040 compared with 2020, and the number of new cancer cases will reach nearly 30 million. Given the large number of current cancer cases, big data research is of great social significance for cancer prevention and therapy. At the second academic conference of the Tumor Committee of the Chinese Society of Health Information and Healthcare Big Data, Hui Zhouguang, deputy secretary-general of the Tumor Committee and chief physician of the Cancer Hospital of the Chinese Academy of Medical Sciences, said that clinical big data on cancer can provide powerful evidence and support for early diagnosis and appropriate therapy. The importance and effectiveness of tumor big data are therefore increasingly prominent, and the study of tumor big data visualization technology is a key part of its development.
This paper first summarizes the definition, sources, characteristics, and applications of tumor big data and reviews the current state of research on tumor big data visualization in China and abroad. It then focuses on four mainstream approaches to presenting tumor big data, namely the visualization of tumor spatiotemporal data, of tumor hierarchical and network data, of tumor text data, and of multidimensional tumor data, and gives specific application scenarios for each. Next, the paper describes the advantages, disadvantages, and scope of use of five readily available data visualization websites and software packages. Finally, the paper analyzes the open problems in tumor big data visualization, summarizes the visualization methods, and discusses the future of the field.

2. Tumor Big Data

Although there is no unified definition of big data, McKinsey stated in the report “Big data: The next frontier for innovation, competition, and productivity” [3], released in June 2011, that “‘Big data’ refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. It is characterized by massive data scale, rapid data flow, diverse data types and low value density.” By extension, tumor big data refers to the large-scale datasets generated in the course of tumor prevention and therapy in the medical industry, which greatly exceed the capacity of traditional database software tools for data acquisition, storage, management, and analysis.

2.1. Sources of Tumor Big Data

The generation of big data relies on the Internet, and all data connected to the Internet can be the source of big data. Based on the sources and characteristics of big data, tumor big data mainly comes from three sources [4,5,6], which are explained in what follows.

2.1.1. Tumor Clinical Data

The clinical data of tumors mainly include the basic information of tumor patients, electronic medical records, biochemical test results, imaging data, pathological information, diagnosis and therapy methods, efficacy evaluation, follow-up outcomes, various omics information, etc. As the numbers of new cancer cases, deaths, and detection indicators increase year by year, the volume of cancer clinical data increases as well. In addition, clinical imaging data require a large amount of storage; for instance, a standard pathology image takes up close to 5 GB. As a result, a great deal of clinical diagnosis and therapy data on tumors is generated.

2.1.2. Biomedical Data

Biomedical data mainly refer to the data generated by pharmaceutical companies in the process of drug research. Biomedical research generates new data continuously, so the quantity of data produced is very large. For example, DNA sequencing produces data on the order of petabytes (PB) every year.

2.1.3. Network Tumor Data

Network tumor data include data about disease, health, medical therapy, and drug purchases from various websites and mobile phone applications, as well as data generated by health monitoring devices.

2.2. Characteristics and Applications of Tumor Big Data

In addition to volume, variety, velocity, and value, tumor big data is characterized by complexity, timing, diversity, redundancy, deficiency, and privacy [7]. Complexity arises from the growing number of tumor-related terms and measurement indicators that carry different meanings. Timing means that the data generated when patients visit the hospital for therapy are arranged in chronological order. Diversity refers to the range of people who generate tumor big data, including doctors, nurses, and other medical personnel. Redundancy refers to the large amount of duplicate or irrelevant information in medical data. Deficiency refers to biased or missing patient descriptions, or to incomplete data caused by the dispersion of tumor clinical data. Privacy refers to the sensitivity of user data, whose leakage can have serious consequences.
Because data mining can support the precise therapy of tumor diseases, it has been widely used in their prediction, diagnosis, therapy, and prognosis [8]. Big data also has broad application prospects in early warning, diagnosis and therapy, intelligent health management, gene testing, and drug research [9]. In addition, big data mining has been used to identify therapeutic strategies and potential lines of investigation for cancers such as chronic lymphocytic leukemia. These studies demonstrate the value of tumor big data in the study of tumors and other diseases.

3. Research Progress of Tumor Big Data Visualization

3.1. Overview of Tumor Big Data Visualization Research

Data visualization is an interdisciplinary field that draws on human–computer interaction, graphics, imaging, statistical analysis, geographic information, and other disciplines. It integrates knowledge and skills in data processing, algorithm design, software development, and human–computer interaction. It displays data through images, charts, animations, and other forms, reveals the relationships and trends among different kinds of data, and makes data faster to read and understand [10]. Tumor big data visualization refers to the use of big data technology to visually display tumor data and the relationships among various kinds of data.
With the arrival of the big data era, big data technology has become a focus in many sectors, and the research and application of tumor big data visualization are attracting more and more attention from researchers. At present, however, this research and application are still at an exploratory stage.

3.2. Research Status of Tumor Big Data Visualization Abroad

A search of the Web of Science for 2000 to 2022 returned 2838 academic papers on the theme of tumor big data and 12,332 on the theme of tumor visualization, but only 45 on the theme of tumor big data visualization (Figure 1). The number of articles related to tumor big data visualization has increased rapidly in the last three years. Among the highly cited papers, Li [11] proposed a method that provides users with the DNA methylation data of different cancer types together with different visual analysis methods. Galetsi [12] showed that using visual methods to analyze patient medical outcomes could improve the prediction of disease. Wang [13] proposed the use of bibliometrics and visualization methods for deep mining to reveal the landscape of emerging technologies in cancer. Bi [14] developed ClickGene, a cloud-based visualization platform, for do-it-yourself analysis of users’ private data or public genomic data. Singh [15] developed the MetaOmGraph software for the exploratory analysis of massive datasets and used it to identify new putative biomarker genes in diverse tumors.

3.3. Research Status of Tumor Big Data Visualization in China

From 2013 to 2022, a total of 101 academic papers on the topic of tumor big data visualization were found on CNKI (Figure 2). The number of such articles rose sharply in 2016 and has fluctuated since then. This does not mean that domestic research on tumor big data visualization has stagnated, because in recent years more domestic experts and scholars have chosen to publish in English-language journals. Among the highly cited papers, He Xiaolin [16] used an HTML5-based system for the visual analysis of tumor epidemiology data to reveal the main characteristics of tumor epidemiology and the multidimensional relationships within the data. Based on the visual analysis of data from multiple channels, He Xiaxia [17] concluded that diabetes had a certain probability of causing tumors and put forward reasonable preventive measures. Zhang Xiangyang [18] pointed out that precision medicine plays an important role in the prevention and therapy of tumors and explained the application of big data mining and analysis in health care.
In general, research on tumor big data visualization began earlier abroad than in China. Current research not only aims to make the presentation of results more concise, attractive, and easy to understand, but also seeks to use the results to support intelligent decision making. Given the size and complexity of tumor data and the increasing number of patients, tumor big data visualization should become an effective tool for addressing tumor-related problems. At the same time, tumor big data sits at the intersection of oncology and big data, and as an interdisciplinary research direction it needs the joint effort of multiple fields. For example, the contribution of artificial intelligence should be considered in the field of big data [19,20,21,22].

4. Tumor Big Data Visualization Methods

With the development of big data visualization technology, the methods and effects of visualization have gradually become key factors in judging the value of tumor data. Medical oncology imaging has evolved from a pure visualization tool into a visual analysis tool for characterizing disease in vivo. In this paper, the literature on medical tumor visualization was analyzed, summarized, and combined with Shneiderman’s classification of data. Sixteen common tumor big data visualization methods were identified and divided into four categories: the visualization of tumor spatiotemporal data, the visualization of tumor hierarchical and network data, the visualization of tumor text data, and the visualization of multidimensional tumor data (Figure 3).

4.1. Visualization of Tumor Spatiotemporal Data

Spatiotemporal data are datasets of geographic elements or phenomena tied to locations under a uniform spatiotemporal reference, with three basic features: a spatial dimension (S), an attribute dimension (D), and a time dimension (T) [23]. The spatial dimension is the measurable three-dimensional location (S-XYZ) or spatial distribution of geographic information. The attribute dimension is the multidimensional attribute or thematic information that can be attached to the spatial dimension. The time dimension captures how this information changes over time. Spatiotemporal data visualization aims to build visual representations of the temporal and spatial dimensions and the associated object attributes, and to reveal information and patterns closely tied to time and space. The following sections introduce the visualization of tumor time series data, tumor spatial dimension data, and tumor geographic information.

4.1.1. Visualization of Tumor Time Series Data

Time series data are a series of observations arranged in time order. Tumor time series visualization mainly studies how the attributes and states of tumor data change over time and when abrupt changes in state occur. Visualizations of time attributes fall into two types: the spiral diagram and the calendar view [24].
  • Spiral diagram: A spiral diagram is suitable for displaying trends and periodic patterns over long time spans or in large volumes of data. Hubschmann [25] used spiral diagrams to show the subtype classification of central nervous system tumors, distinguishing and identifying a large number of subtypes. Kurzhals [26] proposed a visualization of gaze spirals for moving-eye tracking that allows users to compare long-term recordings of common scenes without manual annotation (Figure 4). This method can support cancer prevention and therapy by recording a user's living environment and habits and comparing them with healthy reference patterns to identify harmful habits.
  • Calendar view: Time is conventionally divided into years, months, days, hours, and so on. A time series can be displayed at these granularities in what is called a calendar view [24]. Van [27] first proposed clustering similar daily data and visualizing the average of the corresponding dates on a calendar. This method can be used to analyze information such as the daily number of visits and the waiting time in an oncology department, which helps in scheduling daily medical therapy and improving its efficiency. Bartram [28] developed a calendar visualization tool that helps identify periodic patterns and outliers to better monitor the health of users. This method can help users detect physical abnormalities and receive therapy as early as possible to prevent tumors from worsening.
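The two layouts above can be illustrated with a minimal Python sketch. It is not taken from any of the cited works: the function names, the Archimedean-style spiral, and the ISO-week grid are assumptions chosen for illustration. The spiral places observations so that values one period apart share an angle, and the calendar grid arranges daily values by week and weekday.

```python
import math
from datetime import date

def spiral_positions(n_points, period):
    """Place successive observations on a spiral (illustrative layout).

    One full turn holds `period` observations, so observations that are
    exactly one period apart line up along the same angle.
    """
    positions = []
    for i in range(n_points):
        angle = 2 * math.pi * (i % period) / period
        radius = 1 + i / period  # radius grows by 1 per full turn
        positions.append((radius * math.cos(angle), radius * math.sin(angle)))
    return positions

def calendar_grid(daily_values):
    """Arrange {date: value} into rows of ISO weeks and columns of weekdays."""
    grid = {}
    for day, value in daily_values.items():
        iso_year, iso_week, iso_weekday = day.isocalendar()
        row = grid.setdefault((iso_year, iso_week), [None] * 7)
        row[iso_weekday - 1] = value  # Monday goes in column 0
    return grid

# Hypothetical daily visit counts, for illustration only.
visits = {date(2023, 1, 2): 5, date(2023, 1, 8): 9, date(2023, 1, 9): 3}
for week, row in sorted(calendar_grid(visits).items()):
    print(week, row)
```

A drawing library would then color each grid cell, or each point on the spiral, by its value; that step is left out here to keep the sketch dependency-free.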

4.1.2. Visualization of Tumor Spatial Dimension Data

Spatial dimension data are data with physical spatial coordinates, classified here by the number of dimensions in which they are presented. A scalar field is a data field that records a single scalar value at each spatial sampling location [7].
  • One-dimensional scalar field visualization: One-dimensional scalar field visualization [29] presents scalar field data sampled along a path as a line graph showing the distribution of the data. On a laboratory test report, blood item values are visualized as a one-dimensional scalar field (Figure 5). This method is suitable for displaying various biological test results, such as red blood cells (RBCs), white blood cells, platelet count (PLT), and other items in routine blood tests. The quantitative changes and morphological distribution in routine blood tests can be observed to evaluate disease symptoms and assist doctors in diagnosis.
  • Two-dimensional (2D) scalar field visualization: Two-dimensional scalar field visualization [30] presents the distribution of scalar data on a two-dimensional surface. Zhang [31] optimized the color map adjustment formula used in color mapping, proposed a data-driven 2D color map optimization method for scalar field visualization, and applied it to the radiotherapy dose data of head cancer patients (Figure 6). This method is suitable for CT images, where the colors and their locations can help doctors better assess the location of a tumor and identify its internal characteristics.
  • Three-dimensional (3D) scalar field visualization: Two-dimensional scalar fields arranged in a certain order form a three-dimensional scalar field. Three-dimensional scalar field visualization presents the distribution of scalar data in three-dimensional space. Marolt [32] proposed integrating virtual reality (VR) into a web-based medical visualization framework to support the visualization of volume data. Figure 7 shows a screenshot of the 3D surgery animation. This method can help with the reconstruction of CT images, MRI images, etc., and support doctors' decisions. In addition, 3D scalar field visualization can be used in medical teaching and surgical simulation to save personnel training costs and reduce surgical risks.
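As a rough illustration of color mapping for a 2D scalar field, the sketch below normalizes each sample and maps it onto a simple blue-to-red ramp. The linear two-color ramp and the function names are assumptions for illustration; this is not the data-driven optimization proposed by Zhang [31].

```python
def normalize(value, lo, hi):
    """Scale value into [0, 1]; a constant field maps to 0."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))

def blue_to_red(t):
    """Linearly interpolate from blue (t = 0) to red (t = 1) as 8-bit RGB."""
    return (round(255 * t), 0, round(255 * (1 - t)))

def colorize(field):
    """Map a 2D list of scalar samples to a 2D list of RGB tuples."""
    flat = [v for row in field for v in row]
    lo, hi = min(flat), max(flat)
    return [[blue_to_red(normalize(v, lo, hi)) for v in row] for row in field]

# Hypothetical 2x2 dose grid, for illustration only.
print(colorize([[0.0, 2.5], [7.5, 10.0]]))
```

A one-dimensional field can be handled the same way with a single row, and a 3D volume can be treated as an ordered stack of such 2D slices.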

4.1.3. Visualization of Tumor Geographic Information

In information visualization, points, lines, and regions are classical visual elements that are widely used in geographic information data visualization. Maps are an effective means of displaying and communicating tumor data such as incidence and mortality. Tumor geographic information visualization provides a visual analysis of the geographic distribution of tumor data, which can effectively promote the prevention and control of tumors. From the perspective of visual element mapping, this section gives an overview of how points, lines, regions, and other elements are applied in tumor geographic information data visualization [33].
  • Point data visualization: Location is the basic attribute of geographic information data and refers to the specific place where behaviors and events occur. In geographic information data visualization, the distribution and location of entity attribute data are usually depicted with points [34].
    Sherman [35] illustrated colorectal cancer mortality with maps that include Federally Qualified Health Centers (FQHCs) (Figure 8). This method can be used to identify key areas for intervention against various tumor diseases in different regions.
  • Line data visualization: Lines connect points and usually represent relationships between two or more visual elements. Much valuable information in geographic information data, such as paths, flows, and trends, can be described with lines [34]. Zhu [36] proposed a flow mapping method that extracts typical data flows from large-scale geographic mobility data and built a line data visualization to understand complex population mobility trends in the United States (Figure 9). This method can be used to analyze the source and spread of tumors and diseases, which helps in collecting information in a timely and accurate manner, making efficient response plans, understanding the possible causes of tumor diseases, and preventing tumors and diseases at the source.
  • Regional data visualization: A region is a geographical area whose locations are spatially adjacent or share similar attributes in geographic information data [34]. Mastellaro [37] found that the geographical distribution of carriers of the genetic TP53-R337H mutation was associated with the occurrence of adrenocortical tumors in the Brazilian population (Figure 10, which shows the frequency of TP53-R337H in newborns from 42 municipalities in the Seventh Regional Health Board of Sao Paulo State). This method can be used to analyze and predict regional tumor data by combining geographical location with other factors, to monitor neonatal mortality or tumor incidence, and to determine whether a hereditary tumor is concentrated in a region.
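Regional maps of incidence or mortality typically shade each region by a rate rather than a raw count. The following sketch shows one simple way to prepare such data: it computes a crude rate per 100,000 residents and assigns each region to a shading class. The region names, counts, and class breaks are hypothetical, and the cited studies may well use other statistics (for example, age-standardized rates).

```python
from bisect import bisect_right

def rate_per_100k(cases, population):
    """Crude rate: cases per 100,000 residents."""
    return cases * 100_000 / population

def classify(rate, breaks):
    """Return the index of the shading class that `rate` falls into.

    `breaks` must be sorted ascending; a rate equal to a break value
    is placed in the next (higher) class.
    """
    return bisect_right(breaks, rate)

def choropleth_classes(regions, breaks):
    """Map region name -> (rate, class index) for shading a regional map."""
    result = {}
    for name, (cases, population) in regions.items():
        rate = rate_per_100k(cases, population)
        result[name] = (rate, classify(rate, breaks))
    return result

# Hypothetical regions with (cases, population), for illustration only.
regions = {
    "Region A": (50, 200_000),
    "Region B": (90, 150_000),
    "Region C": (10, 100_000),
}
print(choropleth_classes(regions, breaks=[20, 40]))
```

Each class index would then be mapped to a fill color when the regions are drawn on a map.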

4.2. Visualization of Tumor Hierarchical and Network Data

Visualization of tumor hierarchical and network data [38] mainly addresses how to display large-scale tumor data networks, with many nodes and edges, in limited screen space.

4.2.1. Visualization of Tumor Hierarchical Data

A hierarchy [39], often simply called a tree, is a special case of a graph in which each item except the root has a link to its parent. A hierarchy is generally considered to store two kinds of information: structure and content. Depending on how the visualization structure is logically organized and spatially constructed, hierarchical data visualization can be divided into three categories: the node-link method, the space-filling method, and the hybrid method.
  • Node link: The node-link method is a hierarchical visualization technique that uses nodes of different shapes to represent data and lines between nodes to represent the relationships between the data. It can be divided into 2D node links and 3D node links. Representative 2D node-link techniques include space trees [40], hyperbolic trees [41,42], and radial trees [43]. Representative 3D node-link techniques include cone trees [44], magic eye views [45], and collapsible cylindrical trees [46]. In applications, the 2D node-link method has received the most attention. Techawut [42] used a data visualization framework based on a hyperbolic tree model to visually represent encyclopedia-like results, which facilitates knowledge exploration for users (Figure 11). This method can be used for keyword-based knowledge exploration in the field of tumors, for association analysis between various tumors, and for genome-wide association analysis. Yamada [43] designed an application called FuncTree2, which transforms hierarchical categorical data into an interactive and highly customizable radial tree, illustrated with a radial tree for classifying biological items. This approach can be used for the classification analysis of various tumor diseases.
  • Space filling: The space-filling method is a hierarchical visualization technique that uses bounding boxes of various shapes to represent the nodes of the hierarchy and the enclosure of lower nodes by upper nodes to represent the relationships between data. It can be divided into 2D and 3D methods. Representative 2D space-filling techniques include treemaps [47,48], circle packing [49], and radial filling [50]. Representative 3D space-filling techniques include information pyramids [51] and information cubes [52]. At present, there are few studies on the space-filling method, and they mainly focus on the research and application of treemaps. Aupetit [48] explored an interactive intelligent grouping treemap that aims to help doctors intelligently group and arrange the sleep conditions of wearers (Figure 12). This method is suitable for decision support: it can help doctors group and arrange tumor data and reduce their workload.
  • Hybrid: Hybrid methods are hierarchical visualization techniques that combine several visualization techniques and ideas to integrate their advantages and make the data easier to comprehend. Representative techniques include elastic hierarchies [53], space-optimized trees [54], and hierarchical nets [55]. Current research on hybrid methods concentrates on hierarchical nets. For example, Verrastro [56] used a hierarchical network to visualize the results of the statistical analysis of 30 non-alcoholic fatty liver disease activity scores and highlighted the relationships between the studied therapies. This method is suitable for association analysis and visualization between various tumor diseases or various test indicators.
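To make the space-filling idea concrete, the sketch below implements the basic slice-and-dice treemap layout, in which each node's rectangle is divided among its children in proportion to their sizes, alternating the split direction at each level. The dictionary-based tree format and the category names are assumptions for illustration; the works cited above use more elaborate treemap variants.

```python
def subtree_size(node):
    """Total size of a node: its own size for a leaf, else the sum of its children."""
    children = node.get("children")
    if not children:
        return node["size"]
    return sum(subtree_size(child) for child in children)

def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Return [(name, x, y, w, h)] rectangles for every leaf of the tree.

    Even depths split the rectangle left to right, odd depths top to bottom.
    """
    if out is None:
        out = []
    children = node.get("children")
    if not children:
        out.append((node["name"], x, y, w, h))
        return out
    total = subtree_size(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in children:
        share = subtree_size(child) / total
        if depth % 2 == 0:
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out

# Hypothetical hierarchy with made-up case counts, for illustration only.
tree = {"name": "all", "children": [
    {"name": "A", "size": 2},
    {"name": "B", "children": [
        {"name": "B1", "size": 1},
        {"name": "B2", "size": 1},
    ]},
]}
for rect in slice_and_dice(tree, 0, 0, 4, 2):
    print(rect)
```

Because area is proportional to size, large categories stand out immediately, which is the property that makes treemaps useful for grouping and triaging data.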

4.2.2. Visualization of Tumor Network Data

Network data, unlike hierarchical data, do not have a bottom-up or top-down hierarchy. In network data visualization, each node represents a subject, and the lines connecting nodes represent the associations between subjects [57]. Network layouts fall mainly into two types: the force-directed layout and the arc layout.
  • Force-directed layout: In a force-directed layout [58], forces are calculated from the relative positions of the nodes and the edges between them, and the nodes are moved until these forces balance. Pouryahya [59] found a specific role for pregnancy-specific glycoproteins in cancer through network structure analysis with a force-directed layout (Figure 13). This approach can be used to identify the root cause of a disease through genomic association analysis.
  • Arc layout: Arc layouts include the arc diagram and the circular layout [60] (Figure 14 and Figure 15). The nodes are arranged along a linear axis or a ring, and the lines indicate links between nodes. This method is suitable for analyzing medication and monitoring daily condition, and it can assist in detecting possible lesions and the risk of other diseases.
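A minimal sketch of both layout families follows. The circular layout places nodes evenly on a ring; the force-directed layout then starts from that ring and repeatedly applies Fruchterman–Reingold-style forces (repulsion k²/d between all node pairs, attraction d²/k along edges). The parameter values are assumptions, and the sketch omits the cooling schedule and displacement cap that practical implementations use, so it is only suitable for very small graphs.

```python
import math

def circular_layout(nodes, radius=1.0):
    """Place nodes evenly on a ring (also used as a deterministic start)."""
    n = len(nodes)
    return {
        node: (radius * math.cos(2 * math.pi * i / n),
               radius * math.sin(2 * math.pi * i / n))
        for i, node in enumerate(nodes)
    }

def force_directed_layout(nodes, edges, k=1.0, step=0.05, iterations=200):
    """Move nodes until pairwise repulsion and edge attraction roughly balance."""
    pos = circular_layout(nodes)
    for _ in range(iterations):
        disp = {node: [0.0, 0.0] for node in nodes}
        # Repulsion: every pair of nodes pushes apart with force k^2 / d.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[a][0] += dx / d * f
                disp[a][1] += dy / d * f
                disp[b][0] -= dx / d * f
                disp[b][1] -= dy / d * f
        # Attraction: each edge pulls its endpoints together with force d^2 / k.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[a][0] -= dx / d * f
            disp[a][1] -= dy / d * f
            disp[b][0] += dx / d * f
            disp[b][1] += dy / d * f
        pos = {node: (pos[node][0] + step * disp[node][0],
                      pos[node][1] + step * disp[node][1])
               for node in nodes}
    return pos

# Hypothetical gene-association graph, for illustration only.
nodes = ["g1", "g2", "g3", "g4"]
edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
print(force_directed_layout(nodes, edges))
```

With these force definitions, two connected nodes settle at a distance of about k, so k acts as the preferred edge length.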

4.3. Visualization of Tumor Text Data

Tumor text visualization refers to expressing the complex or difficult contents and rules in tumor text in the form of visual symbols. At the same time, it provides people with the function of rapid interaction with visual information, so that people can use the inherent parallel processing ability of visual perception to quickly obtain the key information contained in big data [61]. In this section, the visualization of tumor text data is divided into the visualization of tumor text content and tumor text relationship.

4.3.1. Visualization of Tumor Text Content

Text content visualization mainly focuses on how to quickly obtain the key points of text content and express them visually.
  • Visualization of tumor text data based on word frequency: Visualization based on word frequency treats tumor medical text as a collection of words, uses word frequency to express text features, and is mainly displayed as a word cloud (also known as a tag cloud). Gaidano [62] used word clouds to show the mutated genes in chronic lymphocytic leukemia, with font size proportional to the frequency of the molecular lesion. Christian [63] used word clouds to show which words were most associated with cancer in the selection algorithm (Figure 16). This method is suitable for visualizing all kinds of textual information in oncology, such as case information, clinical medical records, drug lists, and gene analyses. Word clouds surface the high-frequency words in the field of cancer, helping users screen important information and supporting decision making.
  • Visualization of tumor text data based on semantics: Semantic-based visualization treats the tumor medical text as a set of words, reflects the semantic hierarchy in the text through the layout of keywords, and is mainly displayed as a document scatter (also known as a sunburst diagram). Collins [64] used a document scatter to show the structure of text content; it expresses the semantic hierarchy of words through a radial layout, with the innermost layer giving the most important overview of the article content (Figure 17). This method is suitable for patients who monitor their own disease and share their cases and medical records through social networks. With big data processing technology, patients can gauge the progression of their own disease, prevent the tumor from worsening, and refer to the medication records of patients with the same disease when deciding on their own medication plan.
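The word-frequency approach can be sketched in a few lines: count the words in a text, drop common stopwords, and scale each word's font size linearly with its frequency. The stopword list, the font-size range, and the sample text are assumptions for illustration, not values from the cited studies.

```python
import re
from collections import Counter

# A deliberately tiny stopword list; real applications use a fuller one.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to", "with"}

def word_frequencies(text):
    """Count lowercase alphanumeric words, ignoring stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(word for word in words if word not in STOPWORDS)

def font_sizes(freqs, min_pt=10, max_pt=48):
    """Scale frequencies linearly to font sizes between min_pt and max_pt."""
    if not freqs:
        return {}
    lo, hi = min(freqs.values()), max(freqs.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {word: min_pt + (count - lo) * (max_pt - min_pt) / span
            for word, count in freqs.items()}

# Made-up text, for illustration only.
text = "Tumor biopsy and tumor imaging; the tumor biopsy report"
print(font_sizes(word_frequencies(text)))
```

A rendering step would then place the words on a canvas at these sizes; placement algorithms vary and are outside the scope of this sketch.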

4.3.2. Visualization of Tumor Text Relationship

Text relationship visualization focuses on the internal relationships in a text and depicts them visually to help people understand the content and discover patterns.
Wattenberg [65] proposed the word tree, which builds on the idea of a suffix tree and presents the contexts of a query word in a tree structure (Figure 18). This method is suitable for retrieving valuable fields from a large number of electronic medical records, analyzing patients' self-reported disease information through word trees, and quickly inferring a patient's disease.
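The core data structure behind a word tree can be sketched as a nested dictionary: for every occurrence of the query word, the words that follow it are inserted as a path, and shared prefixes are merged with a running count. This is a simplified illustration with assumed names and made-up sentences, not the published implementation.

```python
def build_word_tree(sentences, query):
    """Build a tree of the word sequences that follow `query`.

    Each node is {"count": int, "children": {word: node}}; count records
    how many times that continuation was seen.
    """
    tree = {}
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            if word != query:
                continue
            level = tree
            for following in words[i + 1:]:
                entry = level.setdefault(following, {"count": 0, "children": {}})
                entry["count"] += 1
                level = entry["children"]
    return tree

def print_tree(tree, indent=0):
    """Print the tree with indentation; counts would drive font size."""
    for word, entry in tree.items():
        print(" " * indent + f"{word} ({entry['count']})")
        print_tree(entry["children"], indent + 2)

# Made-up self-reported notes, for illustration only.
notes = [
    "patient reports pain in chest",
    "patient reports pain at night",
    "patient reports fatigue",
]
print_tree(build_word_tree(notes, "reports"))
```

In a rendered word tree, branches with higher counts are drawn in larger type, so the most common continuations of the query word are visible at a glance.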

4.4. Visualization of Multidimensional Tumor Data

Multidimensional data are data whose variables have multiple attribute dimensions. Multidimensional data visualization [66] studies how to present such data as 2D or 3D graphics and images that are easy for people to understand.

Visualization of Multidimensional Tumor Data Based on Geometry

The basic idea of geometry-based multidimensional data visualization is to map high-dimensional data to low-dimensional space by geometric drawing or geometric projection and to represent multidimensional information objects by points, curves, or polylines [67].
  • Parallel coordinate system: The basic idea of the parallel coordinate system [68] is to represent the n attributes of multidimensional data as n parallel coordinate axes with equal spacing, with each axis corresponding to one attribute dimension. Raidou et al. [69] used the parallel coordinate method to visualize the internal characteristics of tumors, helping doctors to diagnose more accurately and to target therapy better for patients. This method displays multidimensional tumor data on a two-dimensional plane and supports statistical analysis and correlation analysis.
  • Radar chart: Xu et al. [70] used the radar chart in the circular coordinate system to represent the correlation between the expression of the innate immune protein LCN2 and immunity in tumors (Figure 19). The top chart is the radar chart of the correlation between LCN2 expression and TMB, and the bottom chart is the radar chart of the correlation between LCN2 expression and MSI. This method can be used to visualize many-to-one tumor data relationships, the number of tumor patients, and physical examination indicators. It can also help doctors to analyze the health status of the physical examination population, determine the associations between their health indicators, and formulate a suitable tumor prevention and therapy plan for the population.
  • Scatter plot: A scatter plot is a visual method for describing the relationship between two variables in multidimensional data [67]. Using scatter plots, Hempel et al. [71] concluded that the marker voxels from MK and MD were located only in tumor tissue and were not associated with specific molecular glioma features (Figure 20). This method suits pairwise views of multidimensional data and facilitates correlation analysis.
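
The axis mapping behind a parallel coordinate plot, and the correlation that a scatter plot lets a reader judge by eye, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any cited study: the patient records, the attribute names, and the helper functions `parallel_coordinate_polylines` and `pearson_r` are all hypothetical, and drawing is left to whichever plotting tool is at hand.

```python
from math import sqrt

# Hypothetical records: one dict per patient, one key per attribute (made-up values).
records = [
    {"age": 52, "tumor_size_mm": 18.0, "marker_level": 3.2},
    {"age": 67, "tumor_size_mm": 35.0, "marker_level": 7.9},
    {"age": 45, "tumor_size_mm": 12.0, "marker_level": 2.1},
    {"age": 71, "tumor_size_mm": 41.0, "marker_level": 9.4},
]
attributes = ["age", "tumor_size_mm", "marker_level"]


def parallel_coordinate_polylines(rows, attrs):
    """Return one polyline per row as a list of (axis_index, scaled_value) vertices.

    Each attribute gets its own vertical axis at x = 0, 1, 2, ...; values are
    min-max scaled to [0, 1] so that axes with different units share one height.
    """
    lows = {a: min(r[a] for r in rows) for a in attrs}
    highs = {a: max(r[a] for r in rows) for a in attrs}
    polylines = []
    for r in rows:
        line = []
        for x, a in enumerate(attrs):
            span = highs[a] - lows[a]
            # A constant attribute has no spread; place it at mid-axis.
            y = 0.5 if span == 0 else (r[a] - lows[a]) / span
            line.append((x, y))
        polylines.append(line)
    return polylines


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# One polyline per patient, ready to be drawn across the parallel axes.
polylines = parallel_coordinate_polylines(records, attributes)

# The pair of columns a scatter plot would show, summarized by one number.
r_size_marker = pearson_r(
    [r["tumor_size_mm"] for r in records],
    [r["marker_level"] for r in records],
)
```

Scaling every axis to the same range is a design choice that keeps attributes with large units from dominating the picture; other scalings (for example, standard scores) would serve equally well.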

5. Tumor Big Data Visualization Application Example

In recent years, a number of websites and software programs with strong data visualization capabilities have emerged. Five that are generally applicable to tumor big data visualization were selected and are described below.

5.1. Tableau

Tableau is a platform with both data computing and visualization functions. Its visualization methods can be applied across industries, and it is very user-friendly because users do not need any programming background.

5.2. Echarts

Echarts is an open-source JavaScript data visualization library, but it is not well suited for data analysis. Its users need some programming ability. It can be used for various types of data visualization.
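
To give a sense of the programming involved, the sketch below assembles an Echarts-style option for a radar chart as a Python dictionary and serializes it to JSON, which a web page could then hand to the chart's `setOption` call. It is an illustration only: the indicator names, the values, and the helper `radar_option` are hypothetical and do not come from any study cited here.

```python
import json

# Hypothetical indicator names and values, for illustration only.
indicators = ["indicator_A", "indicator_B", "indicator_C", "indicator_D"]
values = [0.42, 0.15, 0.63, 0.30]


def radar_option(names, data, series_name, axis_max=1.0):
    """Build an Echarts-style option dictionary for a single radar series."""
    if len(names) != len(data):
        raise ValueError("one value is needed per radar axis")
    return {
        "title": {"text": series_name},
        # One radar axis per indicator, all sharing the same upper bound.
        "radar": {"indicator": [{"name": n, "max": axis_max} for n in names]},
        "series": [
            {"type": "radar", "data": [{"name": series_name, "value": list(data)}]}
        ],
    }


option = radar_option(indicators, values, "hypothetical sample")
option_json = json.dumps(option, indent=2)  # text to embed in a web page
```

Keeping the option as plain data makes it easy to generate from an analysis script and to inspect before it reaches the browser.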

5.3. Matlab

MATLAB is a software program that is used for data analysis, data processing, and data visualization. It requires a certain amount of coding ability from the user. It is mainly used in numerous scientific fields where numerical computations are required.

5.4. GraphPad Prism

GraphPad Prism is a software program that integrates data analysis and data visualization. It does not require programming skills. Its scope of use is relatively narrow, and it is mainly suited to charting in the medical field.

5.5. 3DMAX

3DMAX is PC-based software for 3D animation rendering and production. It does not require programming skills, but it still has a certain learning cost. It is mainly used for 3D animation production in the medical field.
Table 1 presents the types of data visualization that can be expressed by these five software programs.

6. Problems in the Development of Tumor Big Data Visualization

6.1. Information Security and Privacy Protection

Tumor patient data are submitted in a uniform way to a big data system, so every patient's information becomes visible to all parties within the scope of data sharing. Tumor big data therefore touches on both the privacy of patients and the security of medical institutions, and it may carry serious security risks [72].

6.2. Lack of Interdisciplinary Talent

Despite the rapid development of information technology, many countries lack interdisciplinary talent. In the past, most universities trained specialists in a single field, but the development of tumor big data visualization needs people who combine expertise in medicine and big data.

6.3. Theoretical Research and Practical Application

The literature surveyed in this paper shows that research on tumor big data visualization has emerged only in the past five years, and many studies are still at the theoretical stage. The practical application of tumor big data visualization in specific medical scenarios still has a long way to go.

6.4. Breakthroughs in Key Technologies

Different forms of tumor data can be obtained through hospital systems, but current technology struggles to support the real-time analysis and integration of a large volume of tumor data. Therefore, further breakthroughs in the relevant key technologies are needed to support the rapid development of related industries.

7. Summary

7.1. Conclusions

Table 2 presents the classification and summary of tumor big data visualization methods.

7.2. Outlook

Although the application of big data technology to tumor diseases will affect traditional therapy methods to some extent, it will bring both opportunities and challenges to tumor diagnosis, therapy, and prognosis assessment. At the same time, tumor big data visualization and visual analysis themselves face great challenges. With the rapid improvement of clinical medical research and the successful development of new drugs, cancer prevention and therapy will develop rapidly with the support of big data technology, and the resulting scientific and technological achievements will provide a basis for early screening, early diagnosis, and drug research. Big data technology can provide clinical guidance for medical staff and personalized diagnosis and therapy plans for patients, and it lays the foundation for precision tumor medicine. It is hoped that the application of big data will advance visual analysis in tumor prevention and therapy and open a new era of tumor big data visualization.

Author Contributions

Writing, X.C.; validation, B.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Natural Science Foundation of China (No. 82003931).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Sung, H.; Ferlay, J.; Siegel, R.L. Global cancer statistics 2020: Globocan estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021, 71, 209–249.
  2. Wild, C.; Weiderpass, E.; Stewart, B.W. World Cancer Report: Cancer Research for Cancer Prevention; IARC Press: Lyon, France, 2020.
  3. Manyika, J.; Chui, M.; Brown, B. Big Data: The Next Frontier for Innovation, Competition, and Productivity; McKinsey Global Institute: Washington, DC, USA, 2011.
  4. Ning, L.; Min, C. Applying themes and related data sources research of healthcare big data. China Digit. Med. 2016, 11, 6–9.
  5. Bo, S.; Yanli, Y.; Yunxia, F. Review of medical bigdata research. Transl. Med. J. 2016, 5, 298–300.
  6. Yufei, S. Thinking on the development and application of Big Data in health care. Wirel. Internet Technol. 2021, 18, 94–95.
  7. Yi, W.; Shuxia, R. Survey on visualization of medical big data. J. Front. Comput. Sci. Technol. 2017, 11, 681–699.
  8. Song, B.; Tiantian, Z.; Xu, Y. Research on the Application of Medical Big Data in Tumor Diseases. China Digit. Med. 2017, 12, 35–37+64.
  9. Lingling, T.; Li, L. Research and application of big data and artificial intelligence in gynecological malignant tumors. Chin. J. Pract. Gynecol. Obstet. 2019, 35, 720–723.
  10. Bin, L.; Zengjie, L.; Yu, L. Review of data visualization research. J. Hebei Univ. Sci. Technol. 2021, 42, 643–654.
  11. Li, Y.; Ge, D.; Lu, C. The SMART App: An interactive web application for comprehensive DNA methylation analysis and visualization. Epigenet. Chromatin 2019, 12, 1–9.
  12. Galetsi, P.; Katsaliaki, K. A review of the literature on big data analytics in healthcare. J. Oper. Res. Soc. 2020, 71, 1511–1529.
  13. Wang, X.; Guo, J.; Gu, D. Tracking knowledge evolution, hotspots and future directions of emerging technologies in cancers research: A bibliometrics review. J. Cancer 2019, 10, 2643.
  14. Bi, J.; Tong, Y.; Qiu, Z. ClickGene: An open cloud-based platform for big pan-cancer data genome-wide association study, visualization and exploration. BioData Min. 2019, 12, 1–15.
  15. Singh, U.; Hur, M.; Dorman, K. MetaOmGraph: A workbench for interactive exploratory data analysis of large expression datasets. Nucleic Acids Res. 2020, 48, e23.
  16. Xiaolin, H.; Qing, Q.; Ze, Z. Visualization-based analysis of tumor epidemical data. Chin. J. Med. Libr. Inf. Sci. 2016, 25, 73–80.
  17. Xiaxia, H.; Chengying, S.; Bai, C. Investigation big data and visualized relationship between cancer and diabetes. J. Xinjiang Med. Univ. 2017, 40, 229–232.
  18. Xiangyang, Z.; Ling, C.; Man, Z. Application of big data mining and analytics to healthcare. Med. J. Air Force 2017, 33, 359–361.
  19. Maahi, A.K.; Shivajirao, M.J.; Iyer, B.R. Brain Tumor Segmentation and Identification Using Particle Imperialist Deep Convolutional Neural Network in MRI Images. Int. J. Interact. Multimed. Artif. Intell. 2022, 7.
  20. Manuel, M.M.; Alfonso, J.L.R.; Vidal, A.; Marcelo, V.; Antonio, F. A Clustering Algorithm Based on an Ensemble of Dissimilarities: An Application in the Bioinformatics Domain. Int. J. Interact. Multimed. Artif. Intell. 2022, 7.
  21. Satheshkumar, K.; Arvid, L.; Alexander, S.L. Pulmonary Nodule Classification in Lung Cancer from 3D Thoracic CT Scans Using fastai and MONAI. Int. J. Interact. Multimed. Artif. Intell. 2021, 7.
  22. Loay, H.; Adel, S.; Mohamed, A.N.; Osama, A.O.; Domenec, P. Promising Deep Semantic Nuclei Segmentation Models for Multi-Institutional Histopathology Images of Different Organs. Int. J. Interact. Multimed. Artif. Intell. 2020, 7.
  23. Wang, J.; Wu, F.; Guo, J. Challenges and opportunities of spatio-temporal big data. Sci. Surv. Mapp. 2017, 42, 1–7.
  24. Fang, Y.; Xu, H.; Jiang, J. A survey of time series data visualization research. IOP Conf. Ser. Mater. Sci. Eng. 2020, 782, 022013.
  25. Gu, Z.; Hübschmann, D. spiralize: An R package for visualizing data on spirals. Bioinformatics 2022, 38, 1434–1436.
  26. Koch, M.; Weiskopf, D.; Kurzhals, K. A Spiral into the Mind: Gaze Spiral Visualization for Mobile Eye Tracking. arXiv 2022, arXiv:2204.13494.
  27. Van-Wijk, J.J.; Van-Selow, E.R. Cluster and calendar based visualization of time series data. In Proceedings of the 1999 IEEE Symposium on Information Visualization, San Francisco, CA, USA, 24–29 October 1999; pp. 4–9.
  28. Huang, D.; Tory, M.; Bartram, L. A field study of on-calendar visualizations. arXiv 2017, arXiv:1706.01123.
  29. Wei, Z. Data Visualization Technology and Its Application Software; Northwestern Polytechnical University: Xi’an, China, 1998.
  30. Yi, L.; Lei, Y.; Wei, G. Advanced combinatorial algorithm for 2D navigator scalar field modeling. J. Basic Sci. Eng. 2008, 16, 472–477.
  31. Zeng, Q.; Wang, Y.; Zhang, J. Data-driven colormap optimization for 2d scalar field visualization. In Proceedings of the 2019 IEEE Visualization Conference (VIS), Vancouver, BC, Canada, 20–25 October 2019; pp. 266–270.
  32. Kokelj, Ž.; Bohak, C.; Marolt, M. A web-based virtual reality environment for medical visualization. In Proceedings of the 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 21–25 May 2018; pp. 299–302.
  33. Chen, W.; Guo, F.; Wang, F. A survey of traffic data visualization. IEEE Trans. Intell. Transp. Syst. 2015, 16, 2970–2984.
  34. Zhou, Z.; Shi, C.; Shi, L. A Survey on the Visual Analytics of Geospatial Data. J. Comput. Aided Des. Comput. Graph. 2018, 30, 747–763.
  35. Sahar, L.; Foster, S.L.; Sherman, R.L. GIScience and cancer: State of the art and trends for cancer surveillance and epidemiology. Cancer 2019, 125, 2544–2560.
  36. Guo, D.; Zhu, X. Origin-destination flow data smoothing and mapping. IEEE Trans. Vis. Comput. Graph. 2014, 20, 2043–2052.
  37. Seidinger, A.L.; Caminha, I.P.; Mastellaro, M.J. TP53 p. Arg337His geographic distribution correlates with adrenocortical tumor occurrence. Mol. Genet. Genom. Med. 2020, 8, e1168.
  38. Herman, I.; Melancon, G.; Marshall, M.S. Graph visualization and navigation in information visualization: A survey. IEEE Trans. Vis. Comput. Graph. 2000, 6, 24–43.
  39. Weidong, X.; Yang, S.; Xiang, Z.; Cheng, Z.; Xiaosheng, F. Survey on the Research of Hierarchy Information Visualization. J. Chin. Comput. Syst. 2011, 32, 137–146.
  40. Düster, A.; Allix, O. Selective enrichment of moment fitting and application to cut finite elements and cells. Comput. Mech. 2020, 65, 429–450.
  41. Bou, B. Treebolic2 Webpage. Available online: http://treebolic.sourceforge.net/treebolic2/en/index.html (accessed on 6 June 2021).
  42. Kanjanakuha, N.; Janecek, P.; Techawut, C. The comprehensibility assessment of visualization of semantic data representation (vsdr) reflecting user capability of knowledge exploration and discovery. In Proceedings of the 2019 7th International Conference on Computer and Communications Management, Bangkok, Thailand, 27–29 July 2019; pp. 195–199.
  43. Darzi, Y.; Yamate, Y.; Yamada, T. FuncTree2: An interactive radial tree for functional hierarchies and omics data visualization. Bioinformatics 2019, 35, 4519–4521.
  44. Robertson, G.G.; Mackinlay, J.D.; Card, S.K. Cone trees: Animated 3D visualizations of hierarchical information. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, New Orleans, LA, USA, 27 April–2 May 1991; pp. 189–194.
  45. Kreuseler, M.; López, N.; Schumann, H. A scalable framework for information visualization. In Proceedings of the IEEE Symposium on Information Visualization 2000, Salt Lake City, UT, USA, 9–10 October 2000; pp. 27–36.
  46. Dachselt, R.; Ebert, J. Collapsible cylindrical trees: A fast hierarchical navigation technique. In Proceedings of the Information Visualization, IEEE Symposium on Information Visualization, San Diego, CA, USA, 22–23 October 2001; p. 79.
  47. Scheibel, W.; Weyand, C.; Döllner, J. EvoCells-A Treemap Layout Algorithm for Evolving Tree Data. In VISIGRAPP (3: IVAPP); University of Potsdam: Potsdam, Germany, 2018; pp. 273–280.
  48. Abuthawabeh, A.; Baggag, A.; Aupetit, M. Augmented Intelligence with Interactive Voronoi Treemap for Scalable Grouping: A Usage Scenario with Wearable Data. Eurograph. Assoc. 2022, 43–47.
  49. Weixin, W.; Chunying, M.; Hong-an, W. Visualization of hierarchical information based on venn diagrams. Chin. J. Comput. 2007, 30, 1632–1636.
  50. Andrewsk, H. Information slices: Visualising and exploring large hierarchies using cascading semi-circular discs. In Proceedings of the IEEE Symposium on Information Visualization, Research Triangle Park, NC, USA, 19–20 October 1998.
  51. Andrews, K.; Wolte, J.; Pichler, M. Information PyramidsTM: A new approach to visualizing large hierarchies. IEEE Vis. 1997, 97, 49–52.
  52. Rekimoto, J.; Green, M. The information cube: Using transparency in 3d information visualization. In Proceedings of the Third Annual Workshop on Information Technologies & Systems (WITS’93), Orlando, FL, USA, 5 December 1993; Volume 13, pp. 125–132.
  53. Zhao, S.; McGuffin, M.J.; Chignell, M.H. Elastic hierarchies: Combining treemaps and node-link diagrams. In Proceedings of the IEEE Symposium on Information Visualization, Minneapolis, MN, USA, 23–25 October 2005; pp. 57–64.
  54. Nguyen, Q.V.; Huang, M.L. A space-optimized tree visualization. In Proceedings of the IEEE Symposium on Information Visualization 2002, Boston, MA, USA, 28–29 October 2002; pp. 85–92.
  55. Balzer, M.; Deussen, O. Hierarchy based 3D visualization of large software structures. In Proceedings of the IEEE Visualization 2004, Austin, TX, USA, 10–15 October 2004; p. 4.
  56. Panunzi, S.; Maltese, S.; Verrastro, O. Pioglitazone and bariatric surgery are the most effective treatments for non-alcoholic steatohepatitis: A hierarchical network meta-analysis. Diabetes Obes. Metab. 2021, 23, 980–990.
  57. Yafeng, Z.; Yaning, Z.; Xue, B. Survey of big data visualization in education. J. Front. Comput. Sci. Technol. 2021, 15, 403.
  58. Fruchterman, T.M.J.; Reingold, E.M. Graph drawing by force directed placement. Softw. Pract. Exp. 1991, 21, 1129–1164.
  59. Mathews, J.C.; Nadeem, S.; Pouryahya, M. Functional network analysis reveals an immune tolerance mechanism in cancer. Proc. Natl. Acad. Sci. USA 2020, 117, 16339–16345.
  60. McGuffin, M.J. Simple algorithms for network visualization: A tutorial. Tsinghua Sci. Technol. 2012, 17, 383–398.
  61. Jiayu, T.; Zhiyuan, L.; Maosong, S. A Survey of Text Visualization. J. Comput. Aided Des. Comput. Graph. 2013, 25, 273–285.
  62. Gaidano, G.; Rossi, D. The Mutational Landscape of Chronic Lymphocytic Leukemia and Its Impact on Prognosis and Treatment. Hematol. Am. Soc. Hematol. Educ. Program 2017, 2017, 329–337.
  63. Dubey, A.K.; Hinkle, J.; Christian, J.B. Extraction of tumor site from cancer pathology reports using deep filters. In Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, Niagara Falls, NY, USA, 7–10 September 2019; pp. 320–327.
  64. Collins, C.; Carpendale, S.; Penn, G. Docuburst: Visualizing Document Content Using Language Structure. In Computer Graphics Forum; Blackwell Publishing Ltd.: Oxford, UK, 2009.
  65. Wattenberg, M.; Viégas, F.B. The word tree, an interactive visual concordance. IEEE Trans. Vis. Comput. Graph. 2008, 14, 1221–1228.
  66. Qi, S.; Du, J.; Qian, S. Research overview of multidimensional data visualization technology. Softw. Guide 2015, 14, 15–17.
  67. Sun, Y.; Feng, X.; Tang, J. Survey on the Research of Multidimensional and Multivariate Data Visualization. Comput. Sci. 2008, 35, 1–7.
  68. Inselberg, A. The plane with parallel coordinates. Vis. Comput. 1985, 1, 69–91.
  69. Raidou, R.G.; van-der-Heide, U.A.; Dinh, C.V. Visual analytics for the exploration of tumor tissue characterization. Comput. Graph. Forum 2015, 34, 11–20.
  70. Xu, W.; Zhang, J.; Hua, Y. An integrative pan-cancer analysis revealing LCN2 as an oncogenic immune protein in tumor microenvironment. Front. Oncol. 2020, 10, 605097.
  71. Hempel, J.-M.; Brendle, C.; Adib, S.D.; Behling, F.; Tabatabai, G.; Castaneda Vega, S.; Schittenhelm, J.; Ernemann, U.; Klose, U. Glioma-Specific Diffusion Signature in Diffusion Kurtosis Imaging. J. Clin. Med. 2021, 10, 2325.
  72. Linjing, S.; Tingting, S. Development and prospect of health medical big data application. Wirel. Internet Technol. 2018, 15, 143–144.
Figure 1. Number of publications and citation frequency of tumor big data visualization.
Figure 2. The number of publications of tumor big data visualization on CNKI from 2013 to 2022.
Figure 3. Mind map of tumor big data visualization classification.
Figure 4. Gaze spiral diagram [26]. The lower part of the picture shows the storage and presentation form of the gaze spiral, and the upper part shows the unfolded view of a selected part of the gaze spiral.
Figure 5. One-dimensional scalar field visualization [7]. The horizontal axis represents time and the vertical axis represents the number of RBC or PLT.
Figure 6. Two-dimensional scalar field visualization [31]. The redder the color, the higher the dose; the bluer the color, the lower the dose. (a) The color map in the rainbow-coded radiotherapy dose data. (b,c) The optimization results before and after applying the ROI exploration tool.
Figure 7. Three-dimensional scalar field visualization. The image is a screenshot of the suture after resection of the tumor during 3D tumor resection.
Figure 8. Point data visualization [35]. From blue to green, colorectal cancer mortality increases.
Figure 9. Line data visualization [36]. Blue colors represent places that had more out-migration than in-migration, while red areas were hot destinations.
Figure 10. Three-dimensional scalar field visualization of the frequency of newborn carriers of the TP53 p.Arg337His germline mutation in 42 cities from the São Paulo State’s VII Regional Health Board (RHB VII). The São Paulo State is represented in gray, and the RHB-VII region is indicated by an arrow in a map of Brazil at the left. The zoomed RHB-VII region is depicted in gradient colors representing the range of estimated mutation frequency in each city.
Figure 11. Hyperbolic tree [42]. VSDR displaying search results of keyword “David Beckham”.
Figure 12. A tree map [48]. The clinician categorized all images in meaningful groups. The number of visible images is identical for all groups.
Figure 13. Force-directed layout [59]. (B) is the global picture; (A,C) are local enlargements. Green means high gene expression and red means low gene expression. Related nodes are connected by line segments; unrelated nodes are not.
Figure 14. Circular arc layout [60]. Arc diagrams of a 43-node, 80-edge network: left: with a random ordering and 180-degree arcs; middle: after applying the barycenter heuristic to order the nodes; right: after changing the angles of the arcs to 100 degrees.
Figure 15. Circular layout [60]. Circular layout of a random 50-node, 200-edge graph, after barycenter ordering.
Figure 16. Word cloud map [63]. Larger tokens in the word cloud chart are ranked higher by the selection algorithm.
Figure 17. Rising sun diagram [64]. The document scatter shows a search for synonym sets related to “electricity”; the closer to the core, the stronger the relevance.
Figure 18. Word tree [65]. A word tree showing all occurrences of “I have a dream” in Martin Luther King’s historical speech.
Figure 19. Circular coordinate system [70]. Radar map of correlation between LCN2 expression and MSI.
Figure 20. Scatter plots: (a) Scatter plot with pathological MK and MD values distribution in the whole brain. The white arrow shows the separate voxels area with glioma-specific diffusion properties; the asterisked field shows the manual glioma segmentation’s voxel distribution; (b) corresponding batch of MD images with the overlaid labels of diffusion-based automatic segmentation (green) and the manual segmentation (red) of the tumor and their spatial overlap (pink) [71].
Table 1. Presentable forms of five software or website.
Tableau | Echarts | Matlab | GraphPad Prism | 3DMAX
1spiral diagram×××××
2calendar view×××
33D model××××
4map××
5tree map×××
6force-directed layout××××
7arc layout×××
8word cloud map××
9document scatter×××
10word tree××××
11parallel××
12radar chart×
13scatter plot×
143D graph×
Table 2. Classification and summary of tumor big data visualization methods.
Classification | Method of Visualization | Features | Application Scenarios | References
1* | Visualization of tumor time series data | Spiral diagram: It is used in long time or large amount of data change trend and periodic data. Calendar view: It displays the data in chronological order in the form of a calendar chart. | Spiral diagram: It is used in the prevention and therapy of cancer. Calendar view: It helps users to detect physical abnormalities. | [24,25,26,27,28]
1* | Visualization of tumor spatial dimension data | One-dimensional scalar field visualization: It uses line graph to show the distribution of the data. 2D scalar field visualization: The distribution is represented on a 2D surface. 3D scalar field visualization: It presents the distribution characteristics of scalar data in 3D space. | One-dimensional scalar field visualization: It is used to aid in crude disease judgment. 2D scalar field visualization: This method is suitable for CT images. 3D scalar field visualization: This method can be helpful for the reconstruction of CT images etc. and also be used in medical teaching and surgical simulation. | [29,30,31,32]
1* | Visualization of tumor geographic information | Point data visualization: It can display more information in a limited geographical space, but the overlap between points will affect the reading. Line data visualization: It can represent the flow of geographic data with directionality. Regional data visualization: It represents geographic area data consisting of length and width. | Point data visualization: It is used to accurately represent the distribution of data in the map. Line data visualization: This method can be used to analyze the source and spread trend of tumor. Regional data visualization: It is used to show the overall tumor data in an area. | [33,34,35,36,37]
2* | Visualization of tumor hierarchy data | Node link: It uses different shapes of nodes to represent data and lines between nodes to represent the relationship between data. Space filling: It represents the relationship between data in the form of an enclosing box. The level logic is clear and the specific gravity is obvious. Hybrid: It combines the advantages of many methods and discards the disadvantages. | Node link: This approach can be used for the classification analysis of various tumor diseases. Space filling: This method is suitable for decision support. Hybrid: This method is suitable for association analysis. | [38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55]
2* | Visualization of tumor network data | Force-directed layout: Forces are calculated based on the relative position of nodes and lines. Arc layout: Its nodes are arranged along a linear axis or ring, and the lines indicate that there is a link relationship between nodes. | Force-directed layout: It can be used for gene association analysis. Arc layout: This method is suitable for the analysis of medication and daily condition monitoring. | [56,57,58,59]
3* | Visualization of tumor text content | Word cloud map: It uses word frequency to express text features. Document scatter: It reflects the semantic hierarchy relationship in the text through the layout of keywords. | Word cloud map: This method is applicable to the visualization of high-frequency words of all text information in the tumor field. Document scatter: This method is suitable for self-detection of diseases. | [61,62,63]
3* | Visualization of tumor text relationship | Word tree: It focuses on the connotation relationship of the text. | Word tree: This method is suitable for retrieving valuable fields from a large number of electronic medical records. | [64]
4* | Parallel coordinate system | It reduces the dimension of multidimensional data to a coordinate representation on a two-dimensional plane. | This method can be used to visualize the many-to-one tumor data relationship, the number of tumor patients, and physical examination indicators. | [67,68,69]
4* | Scatter plot | It describes the relationship between two variables in a multidimensional data set. | It facilitates correlation analysis. | [66,70]
1* Visualization of tumor spatiotemporal data. 2* Visualization of tumor hierarchy and network data. 3* Visualization of tumor text data. 4* Visualization of multidimensional tumor data.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Chen, X.; Liu, B. Research Progress of Tumor Big Data Visualization. Electronics 2023, 12, 743. https://doi.org/10.3390/electronics12030743