# Using Visual Exploratory Data Analysis to Facilitate Collaboration and Hypothesis Generation in Cross-Disciplinary Research

## Abstract

## 1. Introduction

## 2. Datasets and Methods

## 3. Implementation and Case Studies

## 4. Discussion

## 5. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

- Cutcher-Gershenfeld, J.; Baker, K.S.; Berente, N.; Flint, C.; Gershenfeld, G.; Grant, B.; Haberman, M.; King, J.L.; Kirkpatrick, C.; Lawrence, B.; et al. Five ways consortia can catalyse open science. Nature **2017**, 543, 615–617.
- Kitchin, R. The Data Revolution: Big Data, Open Data, Data Infrastructures & Their Consequences; Sage: London, UK, 2014; 222p.
- Schutt, R.; O’Neil, C. Doing Data Science: Straight Talk from the Frontline; O’Reilly: New York, NY, USA, 2013; 406p.
- Phethean, C.; Simperl, E.; Tiropanis, T.; Tinati, R.; Hall, W. The role of data science in web science. IEEE Intell. Syst. **2016**, 31, 102–107.
- Dhar, V. Data science and prediction. Commun. ACM **2013**, 56, 64–73.
- Drineas, P.; Huo, X. NSF Workshop Report: Theoretical Foundations of Data Science (TFoDS); TFoDS workshop organizing committee: Arlington, VA, USA, 2016; 20p. Available online: http://www.cs.rpi.edu/TFoDS/TFoDS_v5.pdf (accessed on 26 September 2017).
- Brillinger, D.R. Exploratory Data Analysis. In International Encyclopedia of Political Science; Badie, B., Berg-Schlosser, D., Morlino, L., Eds.; SAGE Publications: Thousand Oaks, CA, USA, 2001; Volume 1, pp. 530–537.
- Cox, V. Translating Statistics to Make Decisions; Apress: New York, NY, USA, 2017; 324p.
- Tukey, J.W. Exploratory Data Analysis; Addison-Wesley: Reading, MA, USA, 1977; 688p.
- Ho, Y.C. Abduction? Deduction? Induction? Is there a logic of exploratory data analysis? In Proceedings of the Annual Meeting of the American Educational Research Association, New Orleans, LA, USA, 4–8 April 1994.
- Hazen, R.M. Data-driven abductive discovery in mineralogy. Am. Mineral. **2014**, 99, 2165–2170.
- Fox, P.; Hendler, J. The science of data science. Big Data **2014**, 2, 68–70.
- Magnani, L. Abduction, Reason and Science: Processes of Discovery and Explanation; Springer: New York, NY, USA, 2011; 205p.
- Miller, H.J. The data avalanche is here. Shouldn’t we be digging? J. Reg. Sci. **2010**, 50, 181–201.
- Kraak, M.-J. Exploratory visualization. In Encyclopedia of GIS; Shekhar, S., Xiong, H., Eds.; Springer: Berlin, Germany, 2008; pp. 301–307.
- Tufte, E.R. The Visual Display of Quantitative Information, 2nd ed.; Graphics Press: Cheshire, CT, USA, 2001; 197p.
- Fox, P.; Hendler, J. Changing the equation on scientific data visualization. Science **2011**, 331, 705–708.
- Buzan, T. Mind Map Handbook: The Ultimate Thinking Tool; HarperCollins: Toronto, ON, Canada, 2005; 431p.
- Novak, J.D.; Cañas, A.J. The Theory Underlying Concept Maps and How to Construct and Use Them; Technical Report IHMC CmapTools; Institute for Human and Machine Cognition: Pensacola, FL, USA, 2008. Available online: http://cmap.ihmc.us/docs/theory-of-concept-maps (accessed on 26 September 2017).
- Mou, X.; Jamil, H.; Ma, X. VisFlow: A visual database integration and workflow querying system. In Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE 2017), San Diego, CA, USA, 19–22 April 2017; pp. 1421–1422.
- Ma, X.; Chen, Y.; Wang, H.; Zheng, J.G.; Fu, L.; West, P.; Erickson, J.S.; Fox, P. Data visualization in the Semantic Web. In The Semantic Web in Earth and Space Science: Current Status and Future Directions; Narock, T., Fox, P., Eds.; IOS Press: Berlin, Germany, 2015; pp. 149–167.
- Ma, X. Linked Geoscience Data in practice: Where W3C standards meet domain knowledge, data visualization and OGC standards. Earth Sci. Inform. **2017**, 10, 429–441.
- Steed, C.A.; Ricciuto, D.M.; Shipman, G.; Smith, B.; Thornton, P.E.; Wang, D.; Shi, X.; Williams, D.N. Big data visual analytics for exploratory earth system simulation analysis. Comput. Geosci. **2013**, 61, 71–82.
- Ma, X. Geoinformatics in the Semantic Web. In Proceedings of the 17th Annual Conference of the International Association for Mathematical Geosciences (IAMG 2015), Freiberg, Germany, 5–13 September 2015; pp. 18–26.
- The Co-Evolution of the Geo- and Biospheres. An Integrated Program for Data-Driven, Abductive Discovery in the Earth Sciences. Available online: https://dtdi.carnegiescience.edu (accessed on 26 September 2017).
- Lafuente, B.; Downs, R.T.; Yang, H.; Stone, N. The power of databases: The RRUFF project. In Highlights in Mineralogical Crystallography; Armbruster, T., Danisi, R.M., Eds.; De Gruyter: Berlin, Germany, 2015; pp. 1–30.
- Database of Raman Spectroscopy, X-ray Diffraction and Chemistry of Minerals (RRUFF). Available online: http://rruff.info (accessed on 26 September 2017).
- Rakovan, J. Words to the Wise—More than 4,000 To Be Exact. Rocks Miner. **2007**, 82, 423–424.
- IMA Mineral List with Database of Mineral Properties. Available online: http://rruff.info/ima/ (accessed on 26 September 2017).
- Mineralogy Database—Mineral Collecting, Localities, Mineral Photos and Data (Mindat). Available online: https://www.mindat.org (accessed on 26 September 2017).
- Hazen, R.M.; Papineau, D.; Bleeker, W.; Downs, R.T.; Ferry, J.M.; McCoy, T.J.; Sverjensky, D.A.; Yang, H. Mineral evolution. Am. Mineral. **2008**, 93, 1693–1720.
- Hystad, G.; Downs, R.T.; Hazen, R.M. Mineral Species Frequency Distribution Conforms to a Large Number of Rare Events Model: Prediction of Earth’s Missing Minerals. Math. Geosci. **2015**, 47, 647–661.
- Hazen, R.M.; Grew, E.S.; Downs, R.T.; Golden, J.; Hystad, G. Mineral ecology: Chance and necessity in the mineral diversity of terrestrial planets. Can. Mineral. **2015**, 53, 295–324.
- Hystad, G.; Downs, R.T.; Hazen, R.M.; Golden, J.J. Relative Abundances of Mineral Species: A Statistical Measure to Characterize Earth-like Planets Based on Earth’s Mineralogy. Math. Geosci. **2017**, 49, 179–194.
- Morrison, S.M.; Liu, C.; Eleish, A.; Prabhu, A.; Li, C.; Ralph, J.; Downs, R.T.; Golden, J.J.; Fox, P.; Hummer, D.R.; et al. Network analysis of mineralogical systems. Am. Mineral. **2017**, 102, 1588–1596.
- Fox, P.; McGuinness, D.L. TWC Semantic Web Technology. Available online: http://tw.rpi.edu/web/doc/TWC_SemanticWebMethodology (accessed on 26 September 2017).
- Ma, X.; Zheng, J.G.; Goldstein, J.C.; Zednik, S.; Fu, L.; Duggan, B.; Aulenbach, S.M.; West, P.; Tilmes, C.; Fox, P. Ontology engineering in provenance enablement for the National Climate Assessment. Environ. Model. Softw. **2014**, 61, 191–205.
- Three.js—JavaScript 3D Library. Available online: https://threejs.org (accessed on 26 September 2017).
- Demo System for Exploring Co-relationships between Elements and Minerals. Available online: https://goo.gl/FAEepi (accessed on 10 November 2017).
- Cube Matrix to Show Element-Mineral Co-relationships. Available online: https://github.com/xgmachina/3dcube (accessed on 10 November 2017).
- Wickham, H.; Grolemund, G. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data; O’Reilly Media: Sebastopol, CA, USA, 2017; 522p.
- Bittner, K.; Spence, I. Use Case Modeling; Pearson Education, Inc.: Boston, MA, USA, 2003; 347p.
- Kandel, S.; Heer, J.; Plaisant, C.; Kennedy, J.; van Ham, F.; Riche, N.H.; Weaver, C.; Lee, B.; Brodbeck, D.; Buono, P. Research directions in data wrangling: Visualizations and transformations for usable and credible data. Inf. Vis. **2011**, 10, 271–288.

**Figure 1.** Key steps in a data science process (adapted from [3]).

**Figure 2.** Pilot system for the exploratory data analysis of co-relationships among elements and minerals. (**a**) An initial output visualizing the raw mineral counts; (**b**) output after taking the logarithm of the mineral counts in each cell; (**c**) changes in the opacity of each cell based on the value of the mineral counts. The cell filled with solid red (lower right) has oxygen on all three axes. It has the highest mineral count, 4138, in the whole matrix; (**d**) sliced-out two-dimensional planes to see the patterns. The plane shown here is for oxygen, i.e., oxygen is the element on the Z-axis; and (**e**) changing the distance between cells along one or more axes to see patterns in a two- or one-dimensional context.
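The transformations listed in panels (a) to (d) can be sketched in a few lines of Python. This is only an illustration, not the pilot system's code (which, per the reference list, builds on Three.js). It assumes that a cell at position (x, y, z) holds the number of minerals containing all three elements, which is how the caption's all-oxygen cell reads. The four-mineral input, the function names, the use of `log1p` for the logarithm, and the linear opacity mapping are all assumptions made for the example.

```python
import math
from itertools import product

# Illustrative input: four real minerals and the elements in their formulas.
MINERALS = {
    "quartz": {"Si", "O"},             # SiO2
    "calcite": {"Ca", "C", "O"},       # CaCO3
    "gypsum": {"Ca", "S", "O", "H"},   # CaSO4·2H2O
    "halite": {"Na", "Cl"},            # NaCl
}

def build_cube(minerals):
    """For every ordered element triple, count the minerals containing all three."""
    elements = sorted(set().union(*minerals.values()))
    cube = {}
    for triple in product(elements, repeat=3):
        cube[triple] = sum(1 for els in minerals.values() if set(triple) <= els)
    return elements, cube

def log_scale(count):
    """Compress large counts; log1p keeps empty cells at 0 (an assumed choice)."""
    return math.log1p(count)

def opacity(count, max_count):
    """Map a count linearly onto an opacity in [0, 1] (an assumed mapping)."""
    return count / max_count if max_count else 0.0

def z_plane(cube, elements, z):
    """Slice out the 2D plane whose Z-axis element is fixed to `z`."""
    return [[cube[(x, y, z)] for x in elements] for y in elements]

elements, cube = build_cube(MINERALS)
max_count = max(cube.values())
oxygen_plane = z_plane(cube, elements, "O")  # rows: Y element, columns: X element
```

With this toy input the all-oxygen cell has the largest count, mirroring the solid red cell described in panel (c), and `oxygen_plane` corresponds to the kind of slice shown in panel (d).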

**Figure 3.** Using a ‘mouse over’ operation to see the attributes of a matrix cell. The cell below the cursor is highlighted, and the attributes of the cell are shown at the top of the window. The value ‘0.297970034’ means that about 29.8% of minerals containing oxygen also contain calcium.
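The interpretation of the cell value in Figure 3 suggests a conditional share: the number of minerals containing both elements divided by the number containing the first. The sketch below computes that ratio under this reading; the function name `conditional_share` and the four-mineral input are assumptions for illustration and are not taken from the pilot system.

```python
# Illustrative input: four real minerals and the elements in their formulas.
MINERALS = {
    "quartz": {"Si", "O"},             # SiO2
    "calcite": {"Ca", "C", "O"},       # CaCO3
    "gypsum": {"Ca", "S", "O", "H"},   # CaSO4·2H2O
    "halite": {"Na", "Cl"},            # NaCl
}

def conditional_share(minerals, given, other):
    """Fraction of the minerals containing `given` that also contain `other`."""
    with_given = [els for els in minerals.values() if given in els]
    if not with_given:
        return 0.0  # no mineral contains `given`, so the share is defined as 0
    return sum(1 for els in with_given if other in els) / len(with_given)

# Two of the three oxygen-bearing minerals in the toy input also contain calcium.
share_ca_given_o = conditional_share(MINERALS, "O", "Ca")
# The measure is not symmetric: every calcium mineral here also contains oxygen.
share_o_given_ca = conditional_share(MINERALS, "Ca", "O")
```

The asymmetry matters when reading the matrix: the share of oxygen minerals that contain calcium generally differs from the share of calcium minerals that contain oxygen.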

**Figure 4.** Visualization of a 72 × 72 × 72 matrix in the developed pilot system. The rows of red and blue cells correspond to the O-H plane, and they highlight different elements’ association with hydrated (i.e., O–H bearing) minerals.

**Figure 5.** Visualization outputs showing co-relations between primary and secondary cobalt minerals. Minerals are arranged by their first occurrence time (old to young: left to right along the horizontal X-axis; top to bottom along the vertical Y-axis). The raw value in each cell represents the number of localities at which both the X and Y minerals occur. (**a**) Primary to primary (raw values); (**b**) primary (Y) to secondary (X), with logarithmic values; and (**c**) secondary to secondary, with logarithmic values.
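The matrix described in Figure 5 can be sketched as a locality co-occurrence count between two groups of minerals, each ordered from oldest to youngest. Everything in the example is hypothetical: the mineral names, ages, and locality identifiers are placeholders rather than data from the study, and the base-10 logarithm with `None` for empty cells is an assumed convention.

```python
import math

# Hypothetical records: (mineral name, first-occurrence age in Ma, locality ids).
PRIMARY = [
    ("mineral_B", 1200, {"loc1", "loc3"}),
    ("mineral_A", 2500, {"loc1", "loc2", "loc3"}),
]
SECONDARY = [
    ("mineral_C", 300, {"loc2", "loc4"}),
    ("mineral_D", 800, {"loc3"}),
]

def by_age(records):
    """Order records from oldest to youngest first occurrence."""
    return sorted(records, key=lambda r: r[1], reverse=True)

def cooccurrence_matrix(y_records, x_records):
    """Rows follow the Y minerals, columns the X minerals; each cell counts shared localities."""
    ys, xs = by_age(y_records), by_age(x_records)
    return [[len(y[2] & x[2]) for x in xs] for y in ys]

def log_matrix(matrix):
    """Base-10 logarithm of each cell; empty cells become None (an assumed convention)."""
    return [[math.log10(v) if v > 0 else None for v in row] for row in matrix]

primary_primary = cooccurrence_matrix(PRIMARY, PRIMARY)              # as in panel (a)
primary_secondary = log_matrix(cooccurrence_matrix(PRIMARY, SECONDARY))  # as in panel (b)
```

Passing the same group twice gives a symmetric matrix such as panels (a) and (c), whose diagonal holds each mineral's own locality count, while passing two different groups gives a rectangular matrix such as panel (b).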

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ma, X.; Hummer, D.; Golden, J.J.; Fox, P.A.; Hazen, R.M.; Morrison, S.M.; Downs, R.T.; Madhikarmi, B.L.; Wang, C.; Meyer, M.B.
Using Visual Exploratory Data Analysis to Facilitate Collaboration and Hypothesis Generation in Cross-Disciplinary Research. *ISPRS Int. J. Geo-Inf.* **2017**, *6*, 368.
https://doi.org/10.3390/ijgi6110368

**Chicago/Turabian Style**

Ma, Xiaogang, Daniel Hummer, Joshua J. Golden, Peter A. Fox, Robert M. Hazen, Shaunna M. Morrison, Robert T. Downs, Bhuwan L. Madhikarmi, Chengbin Wang, and Michael B. Meyer.
2017. "Using Visual Exploratory Data Analysis to Facilitate Collaboration and Hypothesis Generation in Cross-Disciplinary Research" *ISPRS International Journal of Geo-Information* 6, no. 11: 368.
https://doi.org/10.3390/ijgi6110368