Preliminary Perspectives on Information Passing in the Intelligence Community

Analyst sensemaking research typically focuses on individuals or small groups conducting intelligence tasks. This focus has helped researchers understand information retrieval tasks and how people communicate information. As a part of the grand challenge of the Summer Conference on Applied Data Science (SCADS) to build a system that can generate tailored daily reports (TLDR) for intelligence analysts, we conducted a qualitative interview study with analysts to increase understanding of information passing in the intelligence community. While our results are preliminary, we expect that this work will contribute to a better understanding of the information ecosystem of the intelligence community, how institutional dynamics affect information passing, and what implications this has for a TLDR system. This work describes our involvement in and work completed during SCADS. Although preliminary, our findings indicate that information passing is both a formal and an informal process and often follows professional networks, owing especially to the small population and the specialization of work. We call attention to the need for future analysis of information ecosystems to better support tailored information retrieval features.


Introduction
Intelligence analysts are tasked with a broad set of information-finding and sensemaking objectives to generate meaningful knowledge from large volumes of unstructured data [1,2]. Given the data-centric nature of analysis, computational support is essential to enable efficiency and quality in intelligence work. Although computation paired with big data promises to transform and improve the productivity of intelligence work, it also introduces new challenges, such as identifying knowledge gaps to infer a threat's intent or communicating shared understanding across community members [2]. The vision of human-machine teaming [3] has highlighted an opportunity for new technological solutions that take advantage of artificial intelligence (AI) and machine learning (ML). For example, recommender systems can help analysts find relevant information they might not know to look for [4], and natural language text summarization can greatly reduce the time needed for human interpretation of large collections of text [5]. Complementary advancements in capabilities for identifying patterns, anomalies, and relationships among entities offer the promise of revealing intelligence findings that might otherwise have been missed [6][7][8]. Given these developments, the grand challenge of the Summer Conference on Applied Data Science (SCADS), an 8-week program hosted by the Laboratory for Analytic Sciences (LAS) at North Carolina State University, is to use AI/ML to generate tailored daily reports (TLDR) for intelligence analysts to increase the efficiency and efficacy of their work [9].
Research in HCI has repeatedly demonstrated that despite impressive computational capabilities, the practical value of new tools and technologies depends on the ability to integrate them into appropriate environments [10]. Practical integration is not trivial. Intelligence analysts conduct analysis within a complex sociotechnical system [11,12]. Understanding the information ecosystem intelligence analysts work within is crucial for the successful development and implementation of new tools for their workflow [10,13,14]. This involves gaining an awareness of what kinds of information they work with, with whom they work, where their information comes from, how it is transformed, and where it is distributed [10,15]. Analysts work with numerous tools and process varying types of information, and analysis also requires human judgment and creativity on how to best utilize the available tools [16]. At the same time, analysts also frequently communicate with others, either synchronously or asynchronously [11,17,18]. Requests for information can come in different forms and be initiated by different types of customers [11]. Aspects of the workflow can also be highly collaborative, where communications with colleagues or subject matter experts are essential for filling in knowledge gaps [11,19]. The high variability and complexity of analysis operations make it difficult for researchers and software engineers to optimize the benefits of their contributions. One possible solution to this challenge is to develop a system-level model of intelligence analysts' information ecosystem and the processes of information passing. In this work, we conducted an interview study with analysts to contribute toward this goal.
In this paper, we describe a study we conducted at the 2022 SCADS to develop an understanding of intelligence analysts' information ecosystem. We aim to extend prior research that has studied individual sensemaking and decision-making processes [20,21] by focusing on characterizing information passing among multiple people, systems, and data sources within the context of intelligence analysis. Additionally, while prior work has often used participants without professional intelligence analysis experience as proxies [22], we had a unique opportunity to recruit participants with significant intelligence analyst experience in an unclassified environment, allowing us to validate and build upon prior findings. One of the main goals of the research is to understand the different types and units of information generated, processed, and shared through analysis operations. The research questions guiding our interview study are:
1. What kinds of information do intelligence analysts engage with in their analysis work?
2. How does information flow in the analysis work of intelligence analysts?
3. What factors influence how intelligence analysts engage with information in their analysis work?
Although still preliminary, this study contributes to the understanding and modeling of both the interplay between information inputs and outputs and the collaborative process of intelligence analysis. As this was the inaugural SCADS, a primary goal was to generate data for use by attendees at later editions of this summer conference. We understood going into the conference that 8 weeks would be too tight for a full interview study, from idea conception to full data analysis, but that any work we left at an "in progress" stage might be taken up by future attendees in pursuit of the multiyear grand challenge. We anticipate that the resulting model will be useful in ensuring that the TLDR developed at future SCADS is appropriate for its intended users. We also expect that this model will contribute value to the broader community of intelligence researchers in its depiction of intelligence analysis information flows, based on empirical work conducted with actual intelligence analysts. In addition to these anticipated future contributions, we also compiled an annotated survey of existing literature about intelligence analysts and analytic workflows through this work.

Literature Survey
To broaden our understanding of intelligence analysis/analysts' work environment, we conducted a survey of existing literature on these topics. This section describes the methodology that we used to identify papers, our system for organizing papers and annotations, a description of our annotations, our process for identifying and analyzing existing models of analysts' work, and the findings of common themes in the literature that emerged from this exercise. These findings informed the research questions that guided the interview study we conducted. Our selection process was intentionally broad to expose ourselves to what was known about analysis workflows in the intelligence community and sensemaking at large. The annotated bibliography we compiled is also a contribution in itself as we believe it will be useful for future SCADS participants to familiarize themselves with a wide scope of the literature on intelligence analysis/analysts.

Search Strategy
With the intention to broaden the information we found in our literature survey, we used the following approaches:
• Keyword search: We performed a brainstorming session and identified an initial set of keywords and terms to nonexhaustively represent the research space we wanted to explore. These terms were generated based on our existing knowledge of the research space as well as the terms we heard from the talks and discussions that occurred during the initial weeks of SCADS. These keywords were used as search terms on Google Scholar.
• Author and Publication search: We created a list of key authors and publication venues that covered the identified topics. The publication record of these authors was examined and relevant papers were added to our spreadsheet.
• Citation-chaining: Recorded papers were labeled as either relevant or not relevant by the researchers who read them, and papers of particular interest were flagged for the whole group to read. Papers that were identified as relevant were used to find further papers by exploring their references, the list of papers that cited them, and looking at other work the authors produced.
As this was an unstructured review, the relevance of each paper was determined on a subjective basis. In total, 143 papers were collected. Out of these, 82 were most relevant to the intelligence community and the themes about information passing we later uncovered.
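The citation-chaining approach can be read as a breadth-first traversal in which only papers judged relevant are expanded. The following is a minimal sketch of that reading, not the procedure we actually ran: the function name `chain_citations`, the `max_depth` cutoff, and the toy paper identifiers are all hypothetical, and the `is_relevant` callable merely stands in for a reader's subjective judgment.

```python
from collections import deque


def chain_citations(seeds, neighbors, is_relevant, max_depth=2):
    """Breadth-first snowballing from a set of seed papers.

    neighbors: dict mapping a paper id to related paper ids (its references,
        the papers that cite it, and other work by its authors).
    is_relevant: callable standing in for the reader's subjective label.
    max_depth: illustrative cutoff on how far chaining proceeds (assumption).
    """
    recorded = {}  # paper id -> relevance label
    queue = deque((paper, 0) for paper in seeds)
    while queue:
        paper, depth = queue.popleft()
        if paper in recorded:
            continue  # already read and labeled
        recorded[paper] = is_relevant(paper)
        # Only papers labeled relevant are used to find further papers.
        if recorded[paper] and depth < max_depth:
            for related in neighbors.get(paper, []):
                if related not in recorded:
                    queue.append((related, depth + 1))
    return recorded


# Hypothetical toy data, not drawn from the survey.
toy_neighbors = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
toy_relevant = {"A", "B", "D"}
labels = chain_citations(["A"], toy_neighbors, lambda p: p in toy_relevant)
```

In this toy run, paper "C" is recorded but labeled not relevant, so its neighbor "E" is never reached, which mirrors how a subjective relevance judgment shapes the final collection.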

Themes
The papers we collected ranged in publication year from 1991-2022 (mean 2012, median 2006). A spreadsheet was created to collate and analyze all the papers collected. A full description of as well as a link to the spreadsheet can be found in Table A2. We extracted the following citation data about each paper: Title, Author(s), Year published, Abstract, Citation, and URL. Before delving deeper into the synthesis of collected papers, researchers identified four key themes of individual interest to tag the papers with:
• Information: Addresses information (or proxy information) that analysts deal with, how they deal with it, and how the information flows between different entities.
• Process: Addresses workflows or processes that analysts use in their day-to-day work. Also includes workflows or processes that are specific to certain situations.
• Job: Addresses how analysts perceive their jobs and occupational job scope.
• Collaboration: Addresses collaboration among analysts, intelligence agencies, and other relevant individuals or groups.
The themes were developed after reviewing a subset of collected papers to identify common focal elements. The categories and their definitions were based on common features identified by comparing the properties of the papers (i.e., thematic analysis) and individual researcher interests. The goal was to identify what aspects of the intelligence ecosystem the literature was especially focused on. Each paper could be tagged with any number of themes as applicable. Any paper that we could not associate with one of the themes was deemed not applicable to the present research and was grayed out in our spreadsheet. The breakdown of the number of papers per theme is shown in Figure 1. Based on our definitions in Section 2.2, we classified the papers from our survey. While many papers from the survey were not categorized, the majority we labeled described a process in the intelligence ecosystem.
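Because each paper could carry any number of themes, or none, the per-theme breakdown is a multi-label count rather than a partition of the collection. A minimal sketch of such a tally follows; the function `papers_per_theme` and the toy paper identifiers are hypothetical and do not reproduce the actual spreadsheet or its counts.

```python
from collections import Counter

# The four themes defined above.
THEMES = ("Information", "Process", "Job", "Collaboration")


def papers_per_theme(tags_by_paper):
    """Count papers per theme when a paper may carry zero or more themes.

    tags_by_paper: dict mapping a paper id to a collection of theme names.
    Returns (counts, untagged), where untagged is the number of papers
    with no applicable theme (the ones grayed out in the spreadsheet).
    """
    counts = Counter({theme: 0 for theme in THEMES})
    untagged = 0
    for paper, tags in tags_by_paper.items():
        unknown = set(tags) - set(THEMES)
        if unknown:
            raise ValueError(f"{paper}: unknown theme(s) {sorted(unknown)}")
        if not tags:
            untagged += 1
        counts.update(set(tags))  # each theme counted once per paper
    return dict(counts), untagged


# Hypothetical toy data for illustration only.
toy_tags = {
    "paper_1": {"Process"},
    "paper_2": {"Process", "Collaboration"},
    "paper_3": set(),
    "paper_4": {"Information"},
}
theme_counts, untagged_count = papers_per_theme(toy_tags)
```

Note that the per-theme counts can sum to more than the number of tagged papers, since a single paper contributes to every theme it carries.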

Patterns in the Literature
We now describe the patterns we identified from the literature survey. In the survey, we paid special attention to the visual models proposed by various authors. Section 2.3.2 describes the patterns identified from these models.

Insights Gained
As noted in Section 2.2, four key themes were identified to categorize and sort collected papers. Having tagged the applicable papers with one or more of the four themes, researchers then analyzed the papers from each category. Below we describe some insights gained from this analysis. There was relatively little prior work focused on the Information that analysts deal with in their work. Prior work surrounding this theme included contributions related to tacit knowledge that intelligence analysts develop [23], approaches to information parsing [24][25][26], and the development of ontologies for intelligence analysis [11,27]. We found that there was a significant gap in the literature in identifying information types and the ways that information flows in analysts' work. Additionally, little prior work on the topic of information more broadly was conducted with real analysts. Therefore, we found that we could add significant value to the field by conducting empirical work with analyst participants on the topic of information types and flows. This aspect of intelligence analysts' work is also intertwined with the concept of the TLDR system, since understanding the types of information that would need to be summarized, and the sources of that information, is crucial for developing an information summarization tool.
Papers about Process were the most common and included a broad range of participants, contributions, and methodologies. For example, papers uncovering and defining analysis processes included participant studies with practicing analysts [15,16,28,29], analysts in training [22], and nonanalyst proxy participants [30,31]. Papers included literature reviews, empirical studies, and conceptual work, resulting in contributions such as models of analytic processes [32,33], evaluations of analysis tools [8], design implications [22,34], and theoretical contributions [35,36]. While there are still unexplored research topics regarding analyst processes and workflows, we felt that we could have a more novel contribution by focusing on other areas, considering how much prior work has been conducted on this topic.
There were fewer papers addressing analysts' Jobs, with only 13 papers significantly relating to this theme. We used this theme to identify papers that took a holistic approach to describe the broader work environment of intelligence analysts and the ways that they perceive themselves in the context of this larger environment [37][38][39][40]. We found that the scope of this research area was larger than our planned interview study's allotted time would allow. Two of the works associated with this research question were large-scale ethnographic studies conducted at intelligence agencies [11,12]. Therefore, we felt that existing gaps in addressing this research area could likely not be satisfied by the kind of study that we would be able to conduct in the limited timeframe of the summer conference.
There were a significant number of papers addressing Collaboration; however, we found it was often a supplemental theme to another research question area. For example, some papers focused more heavily on analytic processes, including collaborative processes, but were not primarily focused on collaboration [8,16]. Other papers focused on limited aspects of collaboration, such as the relationship between analysts and their managers, rather than focusing on collaboration more holistically [41]. We found collaboration to be a common theme in work about intelligence analysis due to perceptions of relatively poor collaboration within the intelligence community. Thus, collaboration often came up as a finding relating to intelligence failures, where information silos between agencies stymied effective information sharing [11,12,19,42]. This led us to include supplemental questions about collaboration as it relates to how analysts work with information and the way that information flows in analysis workspaces in our interview study.

Model Identification
From the literature survey, we gathered the visual models proposed by various authors and grouped them on a shared Google Jamboard. These visual models consisted of boxes and lines designed to communicate theoretical patterns. They convey simplified understandings of a phenomenon (e.g., the sensemaking loop by Pirolli and Card [20], or the analyst workflow proposed by Ahrend, Jirotka, and Jones [23]). Once the models were gathered, we began a typical qualitative review by affinity diagramming them, focusing on aspects of the intelligence process. Affinity diagramming is the process by which details of interest percolate out of the data by laying out data visually with the freedom to rearrange and compare data points until groups at an appropriate level of detail are formed [43].
One observation we made during our literature survey was the high number of models presented and the wide variety of foci in the intelligence process. We identified three major distinctions between the identified models: study population, individual vs. general focus, and analysis methodology. For the study population, the main difference between the models was whether the study was performed using intelligence analysts [15,23,44], or a more generalized population [45,46]. For individual vs. general focus, we found that some models represented the process of an individual analyst [8,45,46,47,48], while others were intended to describe generalizable processes [23,37,44,49]. For analysis methodology, some models were based on empirical research [15,23,47,48,50], while other models represented the author's synthesis of the space without performing empirical studies [8,42,45,51,52].
From this process, we found that nearly all of the models could be described as being about sensemaking, workflow, or decision-making. The distributions of these topics were roughly similar across the different category groups and led us to understand analyst work practices more holistically.

Study Design
This section describes how our interview study was conceptualized, designed, and implemented in the eight-week span of SCADS.

Study Conceptualization
After identifying information passing as our area of research for the interviews, we first drafted up a study concept document, which can be found in Appendix A. This document helped us formalize the research question, identify desired outcomes, and inform the interview study and its questions.
Initially, the questions were divided into two broad themes: analysis background and information flow. After reviewing these questions, certain themes began to emerge, which provided a clearer grouping of questions. Thus, in the second iteration, nine themes were identified, namely, background, topic area/experience, cooperation, team formation, inputs, information types, interruptions, outputs, and feedback. However, the number of themes and associated questions would have taken too long to go through sufficiently in an hour-long interview. Thus, in the third and final iteration, the themes were prioritized and coalesced into an hour-long interview guide. This guide retained all but the 'interruptions' theme from the previous iteration, with more focused questions and clear goals for each theme. An introduction and a debrief script were also added to the interview guide during this iteration to help cover important aspects of the approved consent form. The final interview guide can be found in Appendix D.

Interview Themes
The overarching goal of these interviews was to understand how information is passed along in an analyst's workflow. Additionally, we aimed to look into information passing within and outside of an intelligence organization. This interview focus would allow for comparisons of how information flows in the intelligence community and could inform templates to help "tailor" what analysts expect from a tool like the TLDR. For instance, without asking participants about expected features, we can collect data on what analysts work with regularly and build the TLDR to improve the current experience through iterative design.
Three key themes of this goal guided our interview, namely: information inputs, information outputs, and cooperation.

Inputs
One key aspect of understanding how information flows in analysts' work is understanding the sources of triggers that lead to an analysis task and the interactions with those sources. Analysts' "triggers", or the things that cue them to begin an analysis task, could be considered a starting point of their information flows. Regarding triggers, we asked questions about where triggers come from, from whom, and in what form they come. We also asked about how analysts understand the intentions behind a given trigger and what someone may want from a given trigger that is not explicitly stated. For example, we asked how much interaction analysts have with those who send triggers, and how much reframing, redirection, and/or discussion is involved to understand what someone really wants to know based on their request. These questions were intended, in part, to help us map out whether information flows are largely linear or are more cyclical (i.e., complicated and go beyond a simple input-process-output model). We also wanted to explore any other information "inputs" that analysts receive that may not necessarily cue them to begin an analysis task but still enter their workflows.

Outputs
Thinking about the other end of the information flow, we wanted to understand the information outputs that analysts produce. We wanted to understand both formal outputs, such as finished intelligence reporting, and informal outputs, such as conversations where information is communicated between coworkers. Several outputs fall somewhere between formal and informal, such as notes that analysts take that are recorded and serialized in databases. We also asked questions regarding how analysts might tailor their information outputs based on what they know about the intended recipients of the information. For example, if an analyst is preparing the same information for three different recipients, how might that information output vary for each recipient? These questions will help us better understand how information is modified throughout an analyst's information flow.

Cooperation
We wanted to investigate cooperation as its own theme, particularly how it pertains to an analyst's workflow. As we understood from prior work and conversations with other analysts, there were barriers to cooperation among intelligence analysts [12]. Factors such as the frequency of disseminated reports and the amount of completed work that carries an analyst's name affect individual promotions, which may lead to more hostile and competitive work practices. Others commented on how many individuals were involved in their individual analysis process or in the process of preparing raw intelligence for top-of-the-line products like the presidential daily brief. To more completely understand these anecdotal perspectives, we wanted to hear how analysts experienced working in cooperative ways in a formal setting.

Interview Methodology
All interviews were facilitated by an assigned pair of members from the research team. One member served as the interviewer while the other served as the notetaker. Since no audio or video recordings were captured, it was important to have a dedicated notetaker present who could document our participants' responses. To promote consistency in note-taking, a template (see Appendix E) was created using the questions and broad themes listed in the interview guide. This served as the starting point for the data collection in every interview. Additionally, we kept the same interviewer-notetaker pairs for the pilot and the actual interviews so that team members could gain familiarity with each other's interviewing and notetaking styles and preferences.
An interview was conducted only after explicit verbal consent from the participant. The participant also had the option of stopping the interview at any time with no consequences to them. The interviews were semistructured and were approximately an hour long. Our style of semistructured interviews leaned toward the more structured end of the spectrum as we had a detailed interview guide with prompts for each question (see Appendix D). Conducting our interview as semistructured allowed us to reword and reorder questions in response to the natural conversational flow of the interview, along with providing leeway to explore interesting tangents within the scope of the research question.

Pilot Interviews
To further calibrate the study design, we conducted six pilot interviews with SCADS participants who self-identified as nonanalysts but had worked adjacent to analysts and were relatively aware of an analyst's work. We decided to ask participants who were somewhat familiar with intelligence work to help us find the appropriate words, since the goal of these conversations was to ensure we had clear questions and sufficient time to ask about everything we wanted to ask about. From these pilot participants, we asked three to role-play as analysts during the interview based on their knowledge of analysts and to fabricate details and answers to our interview questions as necessary. We asked two participants to answer our questions as they related to their own jobs. These approaches allowed us to test our questions both in the context of analyst work and in realistic scenarios with proxy participants. We conducted one additional pilot interview with a participant who had previously worked as an analyst but was outside of our recruitment pool. Based on the pilot interviews, we discovered the benefits of inviting participants to draw out the components of their analysis process. Thus, we included a prompt to ask interviewees to visualize their answers on a blank sheet of paper where possible. We also learned the language of the analysts better, which was reflected in the final interview guide.

Participant Recruitment
This research study was approved by North Carolina State University's Institutional Review Board (IRB) and passed a review by the Department of Defense. We recruited participants by putting up advertisement flyers around the workspace and by directly reaching out to analysts at the LAS. Furthermore, participants at SCADS who had analyst experience were made aware of the study. We interviewed analysts who had worked in various roles, including participants with experience working stateside and those deployed internationally in the field. Descriptions of work roles included signals intelligence analyst, discovery analyst, language analyst, cyber intelligence analyst, and geospatial intelligence analyst. Participants' experience working as analysts ranged from 6 months to 26 years.

Analysis Plan
As described in Section 3.3, two interviewer-notetaker pairs conducted the interviews and collected data in the form of digital notes. Immediately following the interview, each pair sat down to clean up the digital notes and add context where necessary. The cleaned-up and contextualized notes were then used for qualitative analysis.
Given a team of six researchers and the time constraints of the conference, we took a collaborative approach to analyzing the interview data. To allow for diverse perspectives and a stronger check on contextual understanding, we introduced a third party into the qualitative coding process. This third party, although part of the research team, was not privy to the contents of the interview they were coding and saw the notes for the first time during the coding process. Thus, the six researchers were shuffled into pairs such that each pair had one person who had been part of the interview being coded (either interviewer or notetaker) and one person acting as the third party defined above. This also provided some control for differences in notetaking style between the two pairs that conducted the interviews.
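The pairing constraint described above (one coder who was present at the interview and one who was not) can be expressed as a simple assignment rule. The sketch below is illustrative only and does not describe how we actually scheduled coders: the function `assign_coders`, the load-balancing heuristic, and the researcher and interview identifiers are all hypothetical.

```python
def assign_coders(interviews, researchers):
    """Pair one insider with one third-party coder for each interview.

    interviews: dict mapping an interview id to (interviewer, notetaker).
    researchers: list of all research team members.
    The least-loaded eligible person is chosen for each role; this
    balancing heuristic is an assumption made for illustration.
    """
    load = {r: 0 for r in researchers}
    assignments = {}
    for interview_id, present in interviews.items():
        # Third-party candidates did not take part in this interview.
        outsiders = [r for r in researchers if r not in present]
        insider = min(present, key=lambda r: load[r])
        outsider = min(outsiders, key=lambda r: load[r])
        load[insider] += 1
        load[outsider] += 1
        assignments[interview_id] = (insider, outsider)
    return assignments


# Hypothetical identifiers; two interviewer-notetaker pairs, six researchers.
team = ["R1", "R2", "R3", "R4", "R5", "R6"]
toy_interviews = {
    "I1": ("R1", "R2"),
    "I2": ("R3", "R4"),
    "I3": ("R1", "R2"),
    "I4": ("R3", "R4"),
}
coding_pairs = assign_coders(toy_interviews, team)
```

Every resulting pair contains exactly one person who attended the interview, which is the property the coding process relied on to combine contextual knowledge with a fresh reading of the notes.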
The team followed an open coding approach. The codes applied to the notes took a descriptive form as an inductive approach was followed, meaning there was no preconceived notion of what the codes would be. Codes were allowed to emerge through the discussion between the two coders. During the period of SCADS, we were able to complete the first iteration of this coding process. Some themes identified through this iteration are reported in Section 4. However, these are preliminary findings. We expect to code the same data iteratively, as needed, to allow for more analytical rigor and the emergence of insightful themes. After completing the required iterations and merging codes, a common coding scheme will be developed, which we expect to be one of the meaningful contributions of this work.

Preliminary Findings
At the time of writing, analysis is still ongoing. Thus, we describe only a few anecdotes, each discussed in at least one interview, that may illustrate phenomena related to our research questions. Full findings will be published once a systematic analysis of the interview data is completed. While there is a desire to quote direct conversations and participants in a formal way, this section only describes some of the features and common threads from our interviews with analysts and our first iteration of descriptive codes.
Firstly, analysts discussed both formal and informal types of information that they referenced to do their work. For example, some analysts relied on emails and formal requests to know what to work on in a day. On the other hand, others experienced "hair on fire" walk-bys that reprioritized their work for the entire day. In other words, we captured a variety of "inputs" involved in analysis workflows. Supporting this diversity in workflows is a challenge for a tool like the TLDR.
We also heard repeatedly that there is too much information to consume. This was especially confirmed when considering how finished intelligence was delivered to the community. As described by our participants, formal reporting and tweetlike short summaries (called Squawks) are serialized and sometimes delayed in bureaucracy and secondary checks. Additionally, many reports "die" in the dissemination phase since they exist in a database and notifications do not go out to or are missed by interested parties. Some people never know that new, and, more importantly, relevant information has been released. On more than one occasion, we heard that finished intelligence was lost and often required a follow-up phone call or conversation to make sure the right people were aware when they needed to be.
Finally, this reinforces the importance of informal conversations within the intelligence community. We heard that information sharing appears to follow professional networks. For example, informal conversations (e.g., phone calls, discussions across cubicle walls) occur within the community, and individuals will verbally notify others when information changes or updates become available. In part because the intelligence community is so small, many analysts know each other and rely on this familiarity to know what others are interested in. The small community also helps with distributing information accurately because analysts will call interested parties, walk them through released reports, and help explain sections of documents to each other. Unfortunately, we also found that without these informal conversations, some reports are missed by the people they would be relevant to.
Throughout our involvement with SCADS, we were told that every analyst's activity is unique, often through the saying "If you speak to one analyst, you have spoken to one analyst". While this is intended as an off-hand witticism about the diverse and unique methods analysts employ to complete their work, our preliminary results appear to shed light on potential paradoxes established by this de facto expectation. For example, while many analysts insisted that they worked alone and had their own unique methods, our preliminary analysis reveals that the intelligence community relies on many informal conversations to disseminate information, and there are some common experiences around how analysts begin and end their days. We also confirm that analysts begin and adjust their days in a variety of ways. This emphasizes the need for a TLDR tool to be adaptable to many different work practices and especially tuned to augment or support important informal conversations between analysts.

Conclusions and Future Directions
In summary, this report detailed a two-part effort undertaken by the HCI team at SCADS, namely the annotated bibliography and the interview study. Synthesis of the literature survey led to the formation of the research questions, guiding the design of the interview study. The findings indicate possible valuable outcomes in understanding information flows and cooperation in the intelligence community. However, we acknowledge that these findings are preliminary: the interviews were conducted with a limited sample of intelligence analysts who may not represent the entire intelligence community. Furthermore, we did not investigate the entire intelligence analysis organization and the interplay between various parties, which limits the scope of our findings. Considering these limitations, there is still much to be learned about the intelligence community. There is a desire to understand how technologies, like the imagined TLDR, could influence user behaviors [53]. Future work would benefit from the examination of how new technology can impact user work and how job roles may shift and change when technologies are introduced. In the general population there are hesitations around the potential harm that Artificial Intelligence tools could have on human work [54]. Having a better understanding of information passing in the current intelligence ecosystem can serve as a marker against which to contrast later work, and could inform consideration of techniques and factors that can ameliorate concerns regarding future technology use cases.
Furthermore, we have identified potential future directions for the study. First, given the exploratory approach undertaken in this study, another iteration of qualitative data analysis needs to be conducted to draw substantiated findings. This will consist of thematic analysis to elicit more specific and attributable understandings of how intelligence operators create and share information in the intelligence ecosystem. Second, given the uniqueness of the coded interview notes dataset, other research questions can be developed. Future SCADS participants will have the data available for further analysis, and we are interested to see what additional questions can be answered from this data. Finally, there is potential to conduct data triangulation by using the interview data and other related datasets. Given the existence of visual analytic datasets such as the annual VAST challenge datasets (e.g., [55][56][57]) and the UKentucky [58] dataset, questions can be asked to bridge the gap between understanding how analysts behave in a controlled environment compared to their described workflow. We hope this report catalyzes further research in understanding the human side of intelligence analysts and leads to a significant contribution toward building the TLDR.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Restrictions apply to the availability of these data. Data were obtained by participating in the Summer Conference on Applied Data Science and are available to future researchers with the permission of the Laboratory for Analytic Sciences.

Acknowledgments:
We would like to thank all the participants who took the time to share their experiences in the intelligence community, including those who offered feedback during the pilot study phases. We are also grateful to Elizabeth, Jascha, Christine, Sean, Stephen, Emily, Brent, and those who advised on getting this work approved by North Carolina State University's Institutional Review Board and Human Resources within the Department of Defense. This material is based upon work done, in whole or in part, in coordination with the Department of Defense (DoD). Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the DoD and/or any agency or entity of the United States Government.

Conflicts of Interest:
One of the authors (R. Jordan Crouser) is guest editing this special issue and will recuse himself from any adjudication of this work. The funders had no role in the design of the study; in the analyses or interpretation of data; in the writing of the manuscript; or in the decision to publish the results. This unique opportunity for data collection was made possible by the funders (LAS).

Appendix A. Study Design Document
The study design document outlines the motivation for and results from an iterative discussion about the interview study. The specific goals of the study, the target population, the research questions, the sample size, the analysis plan, and the expected outcomes are all specified. This document can serve as grounds for the interview study and may be useful for future SCADS participants to gain some context about our work.

• Intelligence analysts (IAs) work with a plethora of information coming from various sources.
• This information informs the outcomes of their sensemaking process and may have actionable insights that get passed forward.
• IAs need to communicate different kinds of information to different kinds of audiences, formally and informally.
• Assembling and preparing information to be communicated to other parties requires nontrivial efforts from the IAs.
• It is thus critical to understand how the information flows from different sources in and out of an analyst's workflow.
• Understanding what types of information are acquired and passed on in the intelligence analysis community, and how (even in abstract terms fit for an unclassified scenario), may help to design systems that can ease the burden of information preparation for communication.
• For instance, identifying what format of input information leads to what format of output information can ultimately help recommend relevant information through the TLDR.
• In the literature, there has been a focus on collaborative practices, or the lack thereof; however, there is a gap in understanding how cooperative or asynchronous collaboration happens in the intricate hierarchy of an intelligence community.

Specific Study Goals:
Through this study, we aim to interview analysts to:
• Understand their perceptions of their work in relation to others,
• Generalize how information flows, both into their work and outward to others (a more holistic view),
• Understand the impacts of various information types, trustworthiness, completeness, etc. on analyst workflows (treating the analyst as the central node of the information flow).
We hope this helps us address the gap in the literature and enhance the stakeholders' understanding of cooperative work/information passing in the intelligence community.

Target Population:
1. This will need to be more specific when we find out more about the participants.
2. Details may include years of experience, how long since they've last worked as an analyst, area of focus (HumInt, SigInt, GeoInt, etc.)? Other common performance indicators of the individual? Personality factors?
3. What agency were they affiliated with (if we are allowed to get that information)?

Research Questions/Objective:

1. What are the critical dimensions of personalization for information passing requirements of intelligence analysts?
2. Situated context of analysis and how the people share information around the network.
3. Please add other peripheral areas of foci if appropriate.

Sample Size: N = 8-10

Analysis Plan:
How do you plan to answer each research question?
• Grounded theory approach to help us start analysis as and when we get data

Expected Outcomes:
1. Greater understanding of an analyst's workflow in the context of the information that they work with
2. A holistic representation of how information flows through the hierarchy of a (to be determined) intelligence community/organization
3. Add more if you think we can get anything more from the analysis; it might be good to let certain contributions surface on their own
4. Triangulation of findings across the literature of the common design considerations (thematic?) and with what we find out from the interviews. See if it contributes further to identifying a list of design specifications when we know to come up with the envisioned TLDR system.

Appendix B. Project Provenance
This section aims to explain the provenance of our research projects and the critical decision points that helped inform how we arrived at our final output. As with any large research project, plans are bound to change and adapt throughout the time frame. This section helps to tell the story of how the research was conducted. An overview of all the activities is listed in Table A1.

Appendix B.1. Week 1
In the summer of 2022, the authors were invited to the Summer Conference on Applied Data Science at North Carolina State University. During this week, we were introduced to the grand challenge: a Tailored Daily Report (also referred to in the program as TLDR for "too long; didn't read") for analysts to have the information they need, when they need it, to make their work more efficient and reliable. The early programming for the summer conference was filled with group meetings with key users (i.e., Day in the Life Briefings from Analysts) and descriptions of the datasets gathered for us, to better understand the work that analysts do and the challenges we were attempting to solve with the grand challenge solution. This week, a shared document was created and distributed to invite people to write biographies to describe their personal interests and experiences. By the end of this "introduction by fire-hose" week, we were more familiar with the intelligence community and the possible roles and ways information was being shared.

Appendix B.2. Week 2
In the second week, we started to do some team-building activities by reviewing the biographies individuals shared about what experiences they brought to the conference and what they wanted to work on. We also had design students give a presentation about their work uncovering challenges faced by analysts. These presentations were the culmination of a semester of paper prototype and artifact interviews with Laboratory for Analytic Sciences (LAS) analysts to arrive at high-fidelity prototypes and personas for a TLDR. These presentations were helpful in understanding what people had already asked analysts and led to more conversation about what we might want to do for the remaining six weeks of the conference. This week, we also formalized our group and started developing an IRB Request for running our proposed research based on our preliminary study design (see Appendix A). The IRB request was focused on running interviews with LAS analysts and helping to extract what kinds of work they worked with. We also used this week to hear from Deborah Littlejohn, an associate professor of Graphic & Experience Design at NCSU and the coordinator of the design students' projects, to better understand what design research has already been done with analysts at LAS.

Appendix B.3. Week 3
Continuing to identify what had been done, we started exploring the literature. By the third week, we had developed a spreadsheet structure to organize our annotations and the research we were reading.
We submitted the IRB for the interview study at the end of week 3. We met with researchers from LAS, SCADS attendees from academia, and LAS analysts and workshop coordinators throughout the week to finalize components of the IRB. For example, LAS workshop coordinators helped us craft example interview questions that we submitted for the IRB interview guide at the appropriate level of detail we should expect to discuss in a declassified environment.

Appendix B.4. Week 4
We identified and extracted visual models from the existing literature that we gathered (see Table A2). We compiled these models into a Google Jamboard, a virtual whiteboard tool, that allowed us to collaboratively arrange images of models and aid in the identification of common patterns across models. We also marked on the literature survey spreadsheet if we had found a model in the paper to maintain a link between the spreadsheet and Jamboard. The models we extracted included visual representations and tables that described different aspects of analysts' workflow, process, and working environment. We analyzed these models, tagging and categorizing them based on what they represented. For example, we found that many represented processes or categories related to analysts' work. We also differentiated which models were pulled from work that focused specifically on analysts, as compared to work that focused on analysis work in a more general sense. Within those categories, we subcategorized models as representative of either workflow, sensemaking, or decision-making. We also identified if the models were based on empirical or theoretical work. This work informed the discussion found in Section 2.3.2. This helped us to understand how saturated the existing field of literature was on these different aspects of analysts' work specifically, as we see significant value in our intended interview study being empirical work with actual analysts.

Appendix B.5. Week 5
We continued our review and synthesis of models found in the literature and, from this, developed new paths to follow for additional literature collection. For example, some of the additional search terms we decided warranted more exploration from this review were: "cooperative/collaborative analysis", "information/knowledge sharing", and "distributed/asynchronous collaboration". These additional literature synthesis and exploration activities helped us focus on the specific goals and intentions of our planned interview study. This led us to develop a more specific version of the interview guide, building upon the example guide prepared for IRB approval. We also met with other researchers and LAS affiliates with experience working with analysts to gain additional insight into what may be most valuable for us to explore in our interviews with analysts. Finally, during this week, we came up with three options for contingency plans to follow in case the IRB was not approved in time for us to run our planned interview study. These plans were:

1. A more systematic literature review
2. Analyst interaction log visualization and representation
3. A design study to develop design goals for the TLDR

A systematic literature review would involve formalizing the literature review process we had already begun by documenting search terms and inclusion/exclusion criteria. This would result in a comprehensive and systematic overview of existing literature on intelligence analysts/analysis.
A design study would involve conducting a study with LAS and SCADS participants with analyst experience wherein the subject of the study is an artifact they create, such as a design of a tool, which would not warrant IRB review.
We did not move forward with these contingency plans as our study was approved, but these plans may be interesting to pursue in future years of SCADS.

Appendix B.6. Week 6
We continued to refine our interview guide, and we ran six pilot interviews with SCADS participants and LAS affiliates who had experience working alongside analysts, but who were not eligible to participate in our study. These pilot interviews allowed us to rehearse our interviewing techniques and test out our interview questions to identify any that did not result in responses that related to our study goals. These six pilot interviews helped us to refine our interview guide into a final version, which ended up being our fifth iteration of the guide.
By the end of this week, we received approval to run the study and began scheduling interviews for week 7.

Appendix B.7. Week 7
We ran 14 interviews with LAS employees and SCADS attendees with experience working as analysts. Two to four interviews occurred each day of this week. A dedicated member of the research team coordinated scheduling with participants and interview pairs. Two interviewer/notetaker pairs from our team ran these interviews. Interviews were one hour long and were immediately followed by the interviewer/notetaker pair reviewing the notes for accuracy. The rapidity of getting approval for the study and coordinating schedules with experts was made possible by the academic and government collaborative opportunity, unique to a program like SCADS.
During this week, we also finalized our analysis process for our first round of qualitative coding on the interview notes. We decided to do open coding on the notes, conducted by one person (either interviewer or notetaker) who was present at the interview to provide context, and one outside person who was not present in the interview (any of the other four team members) to provide an outside perspective into the interview content. We began this open coding process in week 7 and completed the first round of analysis on approximately half of our data.

Appendix B.8. Week 8
We ran two additional interviews early in week 8, bringing our total participant count up to 16. By the end of this week, we finished the first round of coding the notes from all 16 interviews. We also had these notes deidentified to facilitate sharing with future SCADS researchers and laid the groundwork for how we could continue to work on the data after SCADS when we are no longer affiliated with NCSU (where the IRB was approved). Finally, during the last week of SCADS, we finalized our report and presentation.

We report on the status of the survey as completed by the group to open up the possibility for future SCADS conferences to build upon the work. At the end of SCADS 2022, 52 papers were completely read and analyzed, 34 were read and annotated but not fully analyzed, and 55 were left to be reviewed. The papers in the "To read" category have value for future SCADS participants as an already compiled list of relevant papers on the broad topic of intelligence analysis/analysts' work. The papers in the "Complete" and "To review" categories have additional value to future participants as they also include our annotations. We labeled 18 papers as recommended for all members of our team to read because they were deemed to be critical to informing our research. The list of papers can be accessed at https://docs.google.com/spreadsheets/d/1ri_JC0fu98o2Q3EYzVKOI_e5OjGZLBTW4h37yhIT1OM (accessed on 1 June 2023).

Table A2. The following table enumerates and describes the metadata extracted from the papers during the literature survey.

| Field | Description |
|---|---|
| Paper type | |
| Thrust/Thesis | A short summary (less than 100 characters) about the paper/what they did. Takeaway column is more about contributions. |
| Details | More relevant details from the paper. |
| Population | If empirical work, who and how big was the sampled population? Note: We added this column later on in our process of collecting these papers as it became apparent to us that it would be useful to distinguish which papers used an analyst population in their study and which did not. |
| Relevant to RQ | Which of our four research question areas (Info, Process, Job, Collab) this paper was most relevant to. |
| How Relevant to RQ | A brief description of how the paper is relevant to the selected RQ(s). |
| Contribution | What the authors claim to contribute with their work/Key takeaways. |
| Limitations | Areas where we believe the paper was limited in its contribution, methodology, scope, etc., particularly as compared to our intended work (not necessarily the limitations that a paper may list). |
| Takeaway | A short summary of the main lessons learned from this work. Thrust/Thesis column is more about motivation and method. |
| Unique ID | An index number to refer to citations within team discussions. |