Attitudes towards Autonomous Data Collection and Analysis in the Planetary Science Community

As missions are planned to targets further from Earth, it becomes all but required to increase the role of autonomy in the mission. An investigation of which aspects of mission operations and decision-making autonomy will be accepted by the planetary science community is thus required to aid in development planning. This paper presents a data set regarding attitudes towards autonomous data collection and analysis in the planetary science community, along with an initial analysis of this data. A survey, conducted at the 2013 Lunar and Planetary Science Conference, asked respondents to identify whether or not they would accept conclusions drawn by autonomous data collection techniques and what factors would impact this acceptance. It also examined the acceptance of computers and computer software in the data collection and analysis process.


Introduction
The confluence of several factors is creating a nearly "perfect storm" requirement for the use of extensive autonomy in planetary science missions. These factors include: (a) the identification of mission targets further from Earth, (b) an increase in communications "round-trip" time as the spacecraft moves further from Earth, (c) a corresponding decrease in power as the spacecraft moves further from the sun, (d) smaller spacecraft sizes (limiting the antenna size and gain levels possible), (e) the impact that (b) and (c) have on the level of communications possible and (f) the miniaturization of computers, making applications that require significant processing power possible in a suitably small form factor. Basically, this amounts to a simple decision: one can perform less mission work (due to delays) and receive less data value (due to transmission capability limitations) with a human teleoperating the craft, or hand control over to autonomous control software (which would be needed to some extent in the first case anyway, to handle emergency responses to situations that cannot wait for a round-trip communication to resolve, e.g., tipping or slipping).
Given the high stakes of planetary missions (i.e., they are generally high-budget and high-profile), it is understandable that human controllers are hesitant to release control to software. Similarly, it seems reasonable that many planetary scientists may have similar misgivings regarding handing over the task of data collection, analysis and value determination to software. While the root causes of these misgivings are beyond the scope of this paper, it does attempt to determine what can be done to overcome them. This paper presents data from a survey of 230 participants at the 2013 Lunar and Planetary Science Conference in The Woodlands, Texas. Participants were asked about their views on the use of autonomous control software in directing data collection, data analysis and its prioritization for transmission.

Background
Despite robotic exploration of space and artificial intelligence evolving during the same period (from the 1950s to present), the two did not meet in application until the early 1970s [1]. The consideration of the use of autonomous technologies for space exploration has been ongoing, with progressively greater numbers of missions incorporating autonomous control [2][3][4]; however, handing over control to a machine can, quite literally, create "terror" for human operators who would prefer to be teleoperating (remotely controlling) the spacecraft [5]. A brief overview of the use of autonomous control in space is now presented. This is followed by a discussion of an example autonomous control technology that, if its results are accepted by the planetary science community, could dramatically change how space missions are conducted.

Autonomy in Space
The Jet Propulsion Laboratory (JPL) was an early leader in the development of autonomous systems for space exploration. In 1972, JPL started work on artificial intelligence technologies for a Mars rover vehicle. In 1975, with a need for additional planning capabilities, work began on the DEVISER system to create and transmit commands to the Voyager spacecraft. The FAITH system was created to diagnose problems on Voyager and other spacecraft [6].
The Soviet Union, it would appear, embraced the use of autonomy even in situations where humans were present to prospectively perform the work. Their spacecraft docking technologies, IGLA (first demonstrated for in-space docking in 1986) and its replacement KURS (first used in 1989), were autonomous, radar-based systems that calculated relative position and orientation from differences in signal strength between multiple antennas. IGLA was used on the Soyuz T spacecraft; KURS performed docking maneuvers onboard the Progress M and Soyuz TM spacecraft [7].
The Jet Propulsion Laboratory's Sojourner Mars rover provided the first demonstration of semi-autonomous rover control for planetary science. Launched in 1996, the craft began Martian exploration in July of 1997. Planning for Sojourner was performed on Earth and uploaded to the craft. The craft was able to respond to emergencies and changing conditions; however, any planning changes required operator intervention [10,11].
Spirit and Opportunity, launched in 2003 and operating on Mars from January of 2004, initially followed this same approach. Software developed by Carnegie Mellon was tested on the rovers in 2006. This software (in its most autonomous mode) extended the rovers' driving capabilities by utilizing an onboard terrain map and allowing a rover to plan and, if needed, update its route based on human controller-provided activity and constraint rules [12][13][14].
Autonomous control of spacecraft has also been demonstrated. The Deep Space One (DS1) spacecraft, launched in 1998, demonstrated three autonomy technologies: the Remote Agent Experiment (RAX), the AutoNav system and beacon software. RAX was allowed to control DS1 for several days, making DS1 the first spacecraft to be largely controlled by an onboard agent. During this time, it demonstrated its ability to manage mission objectives and to perform planning, scheduling and health monitoring. Despite a failure during its first test, a second test operated without issue. The tests also demonstrated a fiscal benefit of autonomy: significantly fewer ground staff were required to operate DS1 than previous similar craft [15][16][17][18][19][20][21].
The AutoNav system automated the navigation control aspects of the spacecraft. It was progressively tested between 24 October 1998 and 20 April 1999 when it was placed in complete control of the spacecraft [22]. The beacon software demonstrated a new approach in craft-to-ground communications. With the beacon, the spacecraft determines when it needs assistance and conveys this to controllers, limiting the use of congested deep space communications resources for status messages and "status normal" telemetry [22].
In 2003, additional autonomy onboard a spacecraft was demonstrated with the Autonomous Sciencecraft Experiment (ASE) onboard Earth Observing-1. The onboard CASPER software performed planning and data prioritization, increasing the amount of important science data that could be transmitted over the fixed-capacity communications link. ASE also allowed the spacecraft to respond quickly and thus capture short-duration phenomena. Earth Observing-1 also demonstrated the Livingstone expert system, which performs onboard fault discovery and diagnosis [23,24].

Future Enabling Technologies
Previous work [25][26][27][28][29] demonstrates a possible future for autonomy in planetary science. This is, of course, only one of many prospective approaches to operating autonomous data collection and analysis missions; it is, however, illustrative of the importance of developing autonomous control technology that the planetary science community will embrace.
With Model-Based Transmission Reduction (MBTR) and its highest-level component, Model-Based Findings Transmission (MBFT), data is autonomously assessed onboard the spacecraft against an a priori model. If the data is found to support the model, a validation finding is sent; if the model is determined to be invalid, a correction is generated and transmitted. An MBFT model can be a very high-level scientific thesis (e.g., conditions on Mars were previously suitable for known life), which can be deconstructed into a set of supporting statements that must be true or false in order for the high-level thesis to be true or false. The spacecraft automatically seeks out additional data to assess the truth status of each of these supporting statements.
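The thesis-decomposition logic described above can be sketched in code. This is a minimal, hypothetical illustration of the MBFT concept (the class names, message formats and example statements are assumptions for illustration, not the published MBTR/MBFT implementation):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Statement:
    """One supporting statement whose truth status the craft must ascertain."""
    text: str
    supported: Optional[bool] = None  # None until assessed against onboard data

@dataclass
class Thesis:
    """A high-level scientific thesis decomposed into supporting statements."""
    text: str
    statements: List[Statement] = field(default_factory=list)

    def unassessed(self) -> List[Statement]:
        return [s for s in self.statements if s.supported is None]

    def finding(self) -> Optional[str]:
        # No finding is transmitted until every supporting statement is
        # assessed; the craft would instead seek out additional data.
        if self.unassessed():
            return None
        if all(s.supported for s in self.statements):
            return f"VALIDATED: {self.text}"
        failed = [s.text for s in self.statements if not s.supported]
        return f"CORRECTION: {self.text}; unsupported: {', '.join(failed)}"

thesis = Thesis(
    "conditions on Mars were previously suitable for known life",
    [Statement("liquid water was once present"),
     Statement("a usable energy source was available")],
)
print(thesis.finding())  # None: both statements still need assessment
thesis.statements[0].supported = True
thesis.statements[1].supported = True
print(thesis.finding())
```

The key property illustrated here is that only the compact finding (validation or correction), rather than the underlying raw data, needs to be transmitted.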
This approach can be utilized to control the operations of a network of heterogeneous craft [29], like those proposed in [30][31][32][33]. The combination of finding-driven goal generation and goal-based autonomous control could prospectively lead to lower-cost science missions with less human operational involvement. A technology of this sort is also required to enable further-from-Earth missions, where transmission delay constraints mean that most decisions must be made without human involvement.

Results and Discussion
A survey was administered to 230 participants at the 2013 Lunar and Planetary Science Conference (LPSC) in The Woodlands, Texas. Only individuals affiliated with the conference (i.e., wearing a conference badge) were surveyed. Other than ensuring that potential respondents were conference participants, no other technique was utilized to screen or limit participation in the survey. The survey was offered to individuals in all public areas of the conference facility during four days of the conference. While some individuals declined to participate, the response was largely positive, with most individuals that were asked to participate agreeing to take the survey.
Questions 1 and 2 collected the respondent's age and duration of experience in the field. Question 3 asked the respondent to indicate their field of study. Questions 4 through 15 solicited the respondent's opinions about aspects of the survey topic (the specific questions asked and aggregate response data are listed in Sections 3.3 to 3.14). For Questions 4 through 15, each question was presented to the respondent in the format shown in Figure 1 (illustrated using Question 4). In addition to the aforementioned questions, Question 16 allowed the respondent to share any additional thoughts on the topic not covered by the other questions or to expand upon other responses. The analysis of these comments will be the subject of future work.

Age Distribution of Respondents
A primary thesis of this work is that respondent age and number of years in the field (which are themselves correlated) are negatively correlated with acceptance of autonomous control of data collection and (particularly) analysis. The age distribution of the sample is presented in Table 1 and Figure 2. Note that approximately 45% of the survey takers are in the 20-30 years of age category and approximately 24% fall into the 31-40 category. This is likely due to a confluence of the actual demographic distribution of the population as well as a possible correlation between age and spending time in the public areas of the conference center.

Years in Field Distribution of Respondents
Age and the number of years in the field have a strong positive correlation. This is largely logical: a certain age is obviously required to have worked a given number of years in a field. While some may enter the field later than college graduation (or may graduate college later than is typical), a strong correlation between respondent age and the number of years worked in the field is present. Due to the triviality of this conclusion, it will not be analyzed further. This data is presented in Table 2 and Figure 3.

Table 2. Years in field distribution of respondents.

Question 4 Distribution of Respondents
Question 4 solicited respondents' general opinion about the topic before they considered the other questions that might frame this opinion. Most respondents did not read the entire survey before beginning to answer; however, a few did. A few also opted to change answers to earlier questions while answering later questions or at the end. In both cases, this was observed to be a very isolated phenomenon. The responses to this question were divided, with 19% of respondents indicating that they agreed, 16% indicating no preference, 23% indicating that they disagreed and 9% indicating that they strongly disagreed. The agree-no preference and disagree-no preference categories contained 10% and 13% of respondents, respectively.
In total, 50% of respondents indicated some form of disagreement (between disagree-no preference and strongly disagree), 33% indicated some form of agreement (between agree-no preference and strongly agree) and 16% indicated no preference. Note that the percentages do not add up to 100% due to rounding and one survey where this question was not responded to. This data is presented in Table 3 and Figure 4.

Question 5 Distribution of Respondents
In response to Question 5, participants evidenced a strong preference for personal analysis of their data. Of the respondents, 30% strongly agreed and 40% agreed, with 90% in total indicating some form of agreement (between agree-no preference and strongly agree). No preference was indicated by 7% of respondents and 3% indicated some form of disagreement (between disagree-no preference and strongly disagree). This data is presented in Table 4 and Figure 5.

Table 4. Responses to the statement "I prefer to analyze my own data (using or not using a computer)".

Question 6 Distribution of Respondents
Strong agreement with the use of a computer for trend identification was also evidenced, with 93% of respondents indicating some form of agreement (between agree-no preference and strongly agree). No preference was indicated by 5% of respondents and some form of disagreement (between disagree-no preference and strongly disagree) was indicated by 1% of respondents. Note that the percentages do not add up to 100% due to rounding and one survey where this question was not responded to. This data is presented in Table 5 and Figure 6.

Figure 6. Graph of responses to the statement "I use/would use a computer to identify trends in data".

Question 7 Distribution of Respondents
Question 7 begins a set of questions (Questions 7 to 10) asking respondents about whether they would trust computer analysis under various conditions. Respondent answers to these questions indicate methods that the autonomous software development community can utilize to validate their systems and how well received they would be in the planetary science community. Question 7 asks respondents whether they would trust computer data analysis if the researcher was able to access or evaluate the source code. This could be achieved, for example, via open-sourcing autonomous control software (and exposing the source code to community review) or making it available for review and analysis by the researcher's team (possibly under a non-disclosure agreement, if it was considered proprietary or export controlled).
Of the respondents, 74% indicated some form of agreement (between agree-no preference and strongly agree), 18% indicated no preference and 7% indicated some form of disagreement (between disagree-no preference and strongly disagree). Note that the percentages do not add up to 100% due to rounding and one survey where an invalid response (an indication of a response between two categories) was provided. This data is presented in Table 6 and Figure 7. Anecdotally, some respondents indicated a level of concern as to whether they personally could perform, or would be able to find someone with the requisite skills to perform, a source code evaluation. It is unclear how this factored into the response in the few instances where this sentiment was voiced.

Figure 7. Graph of responses to the statement "I would trust computer analysis of data if I had access to or could evaluate the source code".

Question 8 Distribution of Respondents
Question 8 may speak directly to the concerns of those who do not feel that they have the requisite experience (or who realize the complexity and time requirements of software evaluation). It asks respondents to identify whether review by a panel of trusted experts would increase their trust in computer data analysis. Agreement (between agree-no preference and strongly agree) was indicated by 76% of respondents, 12% indicated no preference and 12% indicated some form of disagreement (between disagree-no preference and strongly disagree). This data is presented in Table 7 and Figure 8.

Question 9 Distribution of Respondents
Question 9 attempts to determine whether testing on Earth and the opportunity to compare an autonomously controlled approach to a traditional manually operated (or human in situ) approach would increase confidence in software data analysis. In response to this question, 73% of respondents indicated some form of agreement (between agree-no preference and strongly agree), 15% indicated no preference and 10% indicated some form of disagreement (between disagree-no preference and strongly disagree). Note that the percentages do not add up to 100% due to rounding, four surveys where this question was not responded to and one survey where an invalid (an indication of a response between two categories) response was provided. This data is presented in Table 8 and Figure 9.

Table 8. Responses to the statement "I would trust computer analysis of data if the software had been tested on Earth and its analysis had been compared to traditional methods".

Figure 9. Graph of responses to the statement "I would trust computer analysis of data if the software had been tested on Earth and its analysis had been compared to traditional methods".

Question 10 Distribution of Respondents
Question 10 looks at whether a common form of quality control, the selection of an instance of a process at random for validation of correctness and accuracy, would increase confidence in autonomous software data analysis. In response to this question, 71% of respondents indicated some form of agreement (between agree-no preference and strongly agree), 18% indicated no preference and 9% indicated some form of disagreement (between disagree-no preference and strongly disagree). Note that the percentages do not add up to 100% due to rounding, five surveys where this question was not responded to and one survey where an invalid (an indication of a response between two categories) response was provided. This data is presented in Table 9 and Figure 10.

Table 9. Responses to the statement "I would trust computer analysis of data if data sets used for analysis were occasionally (at random time intervals) sent back with the analysis results for verification".

Figure 10. Graph of responses to the statement "I would trust computer analysis of data if data sets used for analysis were occasionally (at random time intervals) sent back with the analysis results for verification".

Question 11 Distribution of Respondents
In most cases, the quantity of data that a spacecraft is able to generate is greater than the amount of data that it is able to transmit. Given this, the results of any study could, potentially, be impacted by the data that the spacecraft decides to transmit. The prioritization of data is, thus, important. Question 11 asks respondents whether they would trust computer prioritization of data for transmission (as opposed to an approach, for example, where thumbnails or other summary data were sent back and a human determined what data to bring back at higher resolution and in what order). In response to this question, 50% of respondents indicated some form of agreement (between agree-no preference and strongly agree), 20% indicated no preference and 27% indicated some form of disagreement (between disagree-no preference and strongly disagree). Note that the percentages do not add up to 100% due to rounding, five surveys where this question was not responded to and one survey where an invalid (an indication of a response between two categories) response was provided. This data is presented in Table 10 and Figure 11.

Figure 11. Graph of responses to the statement "I would trust a computer to prioritize data for transmission".

Question 12 Distribution of Respondents
Question 12 is very similar to Question 11; however, it utilizes the word "discarded". Arguably, computer prioritization would effectively include identifying data that would never practically be sent back. Perhaps, however, respondents presumed that such data would be stored on the spacecraft for retrieval, if needed, at a later time. The responses to this question show much less confidence in computer control of what data is discarded (as opposed to prioritized), with 48% indicating some form of disagreement (between disagree-no preference and strongly disagree), 15% indicating no preference and 36% indicating some form of agreement (between agree-no preference and strongly agree). Note that the percentages do not add up to 100% due to rounding and three surveys where this question was not responded to. This data is presented in Table 11 and Figure 12.

Figure 12. Graph of responses to the statement "I would trust a computer to determine what data is sent back to Earth versus being discarded".

Question 13 Distribution of Respondents
Question 13 asks respondents whether they trust a computer to determine what data is relevant to a particular science objective. Prospectively, this could determine how it is utilized by the software for analysis and/or its prioritization for transmission back to Earth. In responding to this question, 52% of participants indicated some form of disagreement (between disagree-no preference and strongly disagree), 13% indicated no preference and 33% indicated some form of agreement (between agree-no preference and strongly agree). Note that the percentages do not add up to 100% due to rounding and four surveys where this question was not responded to. This data is presented in Table 12 and Figure 13.

Question 14 Distribution of Respondents
Question 14 drives to the heart of the concern with spacecraft that are a significant distance from Earth (e.g., Voyager 1 and 2) and future prospective interstellar missions. At a certain point, the spacecraft must transition from a data collection tool to a decision-making entity: it must make an assertion (or finding) and defend it. Failing to do this may result in data being sent back that is insufficient to draw any conclusion (causing an expensive, multi-decade mission to end in scientific failure). For these long-distance missions, there may be little choice. However, this same sort of technology could, prospectively, be used on Earth or on closer-proximity planetary science missions.
In responding to Question 14, 41% of participants indicated some form of agreement (between agree-no preference and strongly agree), 18% indicated no preference and 39% indicated some form of disagreement (between disagree-no preference and strongly disagree). Note that the percentages do not add up to 100% due to rounding, five surveys where this question was not responded to and one survey where an invalid (an indication of a response between two categories) response was provided. This data is presented in Table 13 and Figure 14.

Figure 14. Graph of responses to the statement "I would trust a scientific finding that was arrived at by a computer program".

Question 15 Distribution of Respondents
Question 15 is substantively similar to Question 4. Question 4 asks about trusting computer analysis (without human involvement); Question 15 asks about trusting the computer-generated findings of that analysis (enough to publish them as a journal editor). An agreeing response (between agree-no preference and strongly agree) was indicated by 46% of respondents, while 29% had no preference and 22% indicated a disagreeing response (between disagree-no preference and strongly disagree). Note that the percentages do not add up to 100% due to rounding, four surveys where this question was not responded to and two surveys where an invalid (an indication of a response between two categories) response was provided. This data is presented in Table 14 and Figure 15.

Figure 15. Graph of responses to the statement "As an editor for a journal, I would publish an article with computer-generated findings".

Age and Years in Field Correlation with Question 15 Response
Anecdotal data gathered prior to conducting the survey suggested that a strong correlation might exist between the (themselves likely strongly correlated) variables of age and number of years worked in the field and the acceptance of autonomous software control of data collection and analysis. A common notion is that individuals who gained significant experience with human teleoperation may preconceive this as the way space missions should be conducted. This would suggest that a correlation might exist between advanced age (and a longer time in the field) and a lack of support for the use of autonomous control software.
In this paper, the possibility of this correlation is examined with regards to only one of the key questions, Question 15. Table 15 and Figure 16 present the age/agreement data and Table 16 and Figure 17 present the years-in-field/agreement data. Ignoring the spike caused by the single response in the <20 category in Table 15 and Figure 16, there is no evidence that older or more experienced individuals would be less likely to "publish an article with computer-generated findings" than younger or less experienced individuals.

Figure 16. Graph of correlation between age and the statement "As an editor for a journal, I would publish an article with computer-generated findings".

Table 16. Correlation between years of experience and the statement "As an editor for a journal, I would publish an article with computer-generated findings".

Figure 17. Graph of correlation between years of experience and the statement "As an editor for a journal, I would publish an article with computer-generated findings".
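The per-bracket comparison underlying this null result can be sketched as follows. The response counts and the 1-9 numeric Likert coding below are hypothetical illustrations (not the survey's actual data): a near-flat profile of mean responses across brackets corresponds to the absence of an age effect.

```python
# Illustrative sketch of a bracket-level check: compute the mean
# Question 15 response per age bracket; a small spread between bracket
# means would match the null result reported in the text.

def bracket_means(responses):
    """responses: dict mapping age bracket -> list of numeric Likert codes
    (hypothetical coding; larger values indicate stronger agreement)."""
    return {bracket: sum(codes) / len(codes)
            for bracket, codes in responses.items() if codes}

# Hypothetical response counts, for illustration only.
hypothetical = {
    "20-30": [6, 5, 7, 5],
    "31-40": [5, 6, 6],
    "41-50": [6, 5, 6],
    "51-60": [5, 6, 7],
}

means = bracket_means(hypothetical)
spread = max(means.values()) - min(means.values())
for bracket, mean in means.items():
    print(f"{bracket}: {mean:.2f}")
print(f"max spread between brackets: {spread:.2f}")
```

A follow-up analysis could replace this simple spread check with a formal correlation or significance test once the per-respondent data is tabulated.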

Analysis
The data collected in response to Questions 7 through 10 can be utilized to compare different approaches to validating autonomous data collection and analysis software. From this, it appears (based on the average of the responses, a metric utilized here as a single-number indicator of aggregate preference) that access to or review of the source code would engender the most trust, followed by expert review, then testing and comparison on Earth and, finally, random verification of analysis while the work is ongoing. It is important to note that a combination of these methods might engender even greater trust than any single method alone; collection and analysis of data related to this is a topic for future work. The average value and agreement/no preference/disagreement percentages for these questions are summarized in Table 17.

Questions 11 through 14 look at areas of a planetary science mission where autonomous control could be incorporated. From the average values, it would appear that autonomy would be accepted most readily for data prioritization, followed by scientific finding generation, determination of whether data should be sent or discarded and, finally, the evaluation of relevance to scientific objectives. It is important to note that only the first, data prioritization for transmission, is above the value of 5 that serves as the demarcation between agreement and disagreement. Also, the order of acceptance would be different if the percentage of those expressing agreement were used instead of the average response value. This data is summarized in Table 18.

Reviewing respondents' answers to Questions 4 and 15, juxtaposed, suggests that consideration of the matter in greater detail may lead to greater acceptance of autonomous data collection and analysis.
Question 4 states "I would trust computer analysis (without human involvement) of planetary science data" while Question 15 states "as an editor for a journal, I would publish an article with computer-generated findings". These questions are, though not equivalent, clearly of the same vein (as analysis is, of course, used to generate a finding). The responses, indicated in Tables 19 and 20, show an increase in agreement (from 23% for Question 4 to 33% for Question 15) as well as a shift in the average response (from 4.52 to 5.5, notably crossing the no-preference threshold from disagreement to agreement). Further study is required to validate this prospectively interesting finding, which suggests that educating members of the planetary science community about autonomous control technology may be critical to increasing its acceptance.
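The average-based comparison used throughout this analysis can be sketched in code. The response counts below are hypothetical (not the survey's actual figures) and use a hypothetical 1-9 Likert coding in which 5 separates disagreement from agreement, matching the demarcation value mentioned above:

```python
# Sketch of ranking the Question 7-10 validation approaches by mean
# Likert response, the single-number indicator of aggregate preference
# described in the analysis. All counts here are invented for illustration.

def mean_response(counts):
    """counts: dict mapping Likert code -> number of respondents;
    returns the response-count-weighted mean."""
    total = sum(counts.values())
    return sum(code * n for code, n in counts.items()) / total

hypothetical_counts = {
    "source code access/review (Q7)": {9: 40, 7: 90, 5: 40, 3: 20},
    "expert panel review (Q8)": {9: 35, 7: 95, 5: 30, 3: 30},
    "Earth testing comparison (Q9)": {9: 30, 7: 90, 5: 40, 3: 25},
    "random spot-check verification (Q10)": {9: 25, 7: 90, 5: 45, 3: 25},
}

means = {method: mean_response(c) for method, c in hypothetical_counts.items()}
for rank, (method, mean) in enumerate(
        sorted(means.items(), key=lambda kv: kv[1], reverse=True), start=1):
    print(f"{rank}. {method}: mean = {mean:.2f}")
```

As the text notes, a ranking by the percentage of respondents expressing agreement, rather than by the mean, could produce a different ordering; the mean is simply one convenient aggregate.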

Conclusions and Future Work
This paper has presented the attitude of the planetary science community towards autonomous data collection and analysis. This attitude could, at present, be described as mixed. On one hand, there appears to be an acknowledgement that computers are an extraordinarily helpful tool; on the other, there seems to be a reluctance to "turn over the reins" of collection and analysis to software.
It has been shown that there are several things that respondents agree would increase their confidence in the software; however, respondents also indicated a reluctance to involve autonomous control in key mission areas.
Future work will need to further explore these areas in greater depth. It is also planned to conduct additional analysis on the data collected during the 2013 Lunar and Planetary Science Conference and to juxtapose this data with similar data that will be collected from other application areas of autonomous control.