Article
Peer-Review Record

Thermal and Multispectral Remote Sensing for the Detection and Analysis of Archaeologically Induced Crop Stress at a UK Site

by Katherine James 1, Caroline J. Nichol 1,*, Tom Wade 1, Dave Cowley 2, Simon Gibson Poole 3, Andrew Gray 4 and Jack Gillespie 4
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Reviewer 5: Anonymous
Submission received: 14 August 2020 / Revised: 21 September 2020 / Accepted: 22 September 2020 / Published: 24 September 2020

Round 1

Reviewer 1 Report

The paper describes an interesting case study of using drone-mounted sensors for detecting buried archaeological features. While the results might look disappointing, as the authors themselves acknowledge (since no clear evidence was obtained), I believe that it is an interesting example of the multiple factors that can affect the outcome of a well-designed research approach. I believe it has interesting practical implications that will be useful for many people interested in these practices.

The text is very well written and structured, the approach is well described and the results are clearly illustrated. The authors include an interesting discussion of the results that will appeal to a large audience.

I only found a couple of things to be corrected:

Figure 1: Green line is not very well visible

Acknowledgements section includes a sample text

Typos:

  • line 285, be be
  • line 293: out put

Author Response

We thank this reviewer for their attention to our paper and the comments. Please see our responses below:

Figure 1: We have amended this figure to make the green line more visible. 

Acknowledgements: We have removed this sample text.

Line 285: we have removed the second "be" from the text

Reviewer 2 Report

I recommend this paper for publication provided that the authors make their data available. This is the journal’s stated policy on data availability: “In order to maintain the integrity, transparency and reproducibility of research records, authors must make their experimental and research data openly available either by depositing into data repositories or by publishing the data and files as supplementary information in this journal.” One natural place to make the data available would be on the site record (https://canmore.org.uk/), which already has an archive of previous aerial images.

The reason that data availability is so important in this case is that the value of the study lies in its being an experiment with negative results. That is to be expected. We are currently in a phase where the technology involved (UAV with high resolution multi-spec, thermal) is being experimented with by archaeologists. In this study, the authors discuss a number of variables (environmental conditions, nearby trees, crop cover type) that may be contributing to the failure to detect and map features that are apparent in visible light images as cropmarks. However, experiments without accompanying data are no better than anecdotes.

I also recommend one minor revision. This is the first time I’ve come across the term “shoulder” and I found the brief description of them as “adjacent areas” insufficient. This part of the research design needs at least a few sentences to explain what they are and why they were used.

From context, I take it that shoulders are arbitrarily chosen non-feature areas close to known features that are used as a null dataset to determine if there is significant variation in feature relative to the null. I get why one would do that and I have no problem with that. But there is a whole load of unstated assumptions in there, most importantly, that these areas do not include buried features. Unless there has been some ground truthing, and by that I mean excavations, to say that they are non-feature areas, then this is a working hypothesis, not a demonstrated fact, and that needs to be said.

Typo: two full stops on Page 5.

I applaud the effort by the authors to bring this to publication. I wish more researchers would do the same with their negative results.

Author Response

We thank this reviewer for their attention to our paper and the comments, and the supportive words around our bringing this work to fruition and publication. Please see our responses below:

Data repository: we are more than happy to make our data available from our work though we are actively exploring the best way to do this, and what data sharing agreements would be needed. At this time this is an ongoing task, but previously (via different papers) we have been asked to share our data subject to data sharing agreements and we have always been happy to do so. We wholeheartedly agree that data sharing is key, where findings are both strong and weak.

Shoulder term query and clarification: we have added more detailed text describing the use of the "shoulder" and "adjacent areas" which can be found in lines 191-202.

Page 5: the two full stops have been corrected, and one removed. 

Reviewer 3 Report

The paper presents an experimental use of three different sensors for the identification of archaeological traces using UAV-acquired multispectral and thermal data. The paper is well written and offers a very accurate overview of the proposed method with a good technical part.

Unfortunately, as highlighted by the authors, the usefulness of the three sensors for archaeological purposes is affected by climatic and seasonal issues that the authors did not investigate in depth with further tests. At the moment, there are very limited results to support the publication of the paper.

In particular, in their discussion the authors write that "At face value these are perhaps disappointing results, but they highlight the difficulties of developing and testing survey approaches in challenging environments. Very specifically, however, this study demonstrates the cost-effectiveness of a UAV-based survey approach to such research, which would have been prohibitively expensive without such a platform." However, this is absolutely no novelty at all and cannot be considered an original result of this specific study (citations are therefore needed).

Considering the high potential interest of the research for readers, I suggest improving the study with at least one new acquisition in a period of the year more favourable for archaeologically induced cropmarking. This will improve the results of the study by providing a more complete overview of the potential and limits of the method. For this reason, I recommend major revisions.

Detailed revisions:

Page 3, figure 1: C: in the legend the ditch and the bank colors are inverted;

Page 4, rows 149-150: the citations (Thomas, 2017; FLIR, 2019) do not follow the editorial rules and Thomas 2017 is not present in the bibliography.

Page 5, Table 1: it would be useful to add the time interval between the individual acquisitions.

Page 15, row 285: the word "be" is repeated.

Author Response

We thank this reviewer for their attention to our paper and the comments. Please see our responses below:

The reviewer begins by commenting generally about increasing the data set size. This is a valid point, and we have added detail to our discussion to address it and to explain that, from an operational standpoint, multi-site and multi-year data collection is not always possible. This was a focused study on a well-known archaeological site, and we aimed to highlight that even though it can be relatively quick and easy to deploy an unmanned system such as this, such approaches can yield results that range from highly conclusive to much less so. We are in an era of rapid assessment in which unmanned systems are often sought, so in that regard we think it even more important to publish data collected within a single season and to highlight the drawbacks of this.

Detailed revisions

Page 3: Figure 1C has had the legend corrected and a revised figure inserted.

Page 4: Rows 149-150. This reference (Thomas) is already in the references list. Please see line 418.

Page 5: Table 1: We have added the times as requested. 

Page 15: The second "be" has been removed.

Reviewer 4 Report

The authors offer a critical study for the multispectral study of archaeological remains in a specific context. While the specificity reduces the broader applicability of their research, their choice also offers them better experimental control. We are all looking forward to reading similar studies so that we can eventually compile a “theory of multispectral archaeology.”

It is particularly important that the authors revealed why their study was not very successful. While I have my reservations on how they concluded their investigation, I also believe the community should be more open to sharing and publishing studies like this.


Inline comments:

188. I believe the authors should elaborate on the ‘shoulder’ methodology even though they provide a reference in their text. How are the shoulders created? Are they based on mirroring or buffer analysis? Or are they hand-drawn? What are the implications of shoulder overlaps in adjacent features?

192. The authors need to show that the assumptions for conducting t-tests, normality and homogeneity of variance, are not violated. Also, what was the sampling strategy for the shoulders and features? As far as I can tell from Figure 3, the shoulders have larger areas. So, did the authors sample shoulders less densely than the features?

235. I believe authors should also add the May table for ARI1 and NDRER.

241. In Table 3, the authors claim samples are significantly different for the one-tail test. However, there is no mention of a hypothesis in the text concerning a one-tailed test. Are the authors looking for significantly larger or significantly smaller differences? These should be included in their narrative and be discussed within the context of remote sensing and archaeology.

255. The authors should explain why they did not perform/attempt to estimate thermal inertia and only reported temperature differences between shoulders and features. Drones give us the flexibility to expand for thermal calculations, and unfortunately, the authors seem to fail to exploit the technology at hand.

284-288. While one should appreciate the authors’ straightforward discussion on current systems’ deficiency in tackling sensor electronics and heating, one should also ask why the authors used this sensor, almost knowing it would produce unreliable results.

This comes back to the comment I made at the beginning. We should have access to studies/data dealing with challenging case studies like the current one. These studies should also be available to researchers for building a better science. On the other hand, the present study does not propose an alternative workflow and fails to elaborate on the deficiencies of the sensors and techniques. For instance, there is no discussion of why some indices provided significant results in May, but not in June. Are there any commonalities among the indices that gave significant results? Is it because the authors employed different sensors in different seasons, or is it just because some “spectral bands” are not well suited for archaeological remote sensing? Or, how can the current thermal sensors, despite their deficiencies, be used in archaeological research? For instance, should we ask broader-scale questions rather than focusing on isolated features?

In summary, the current study is useful for offering empirical data in a specific context. Still, it falls short in suggesting alternative pathways for improving results and lacks a more comprehensive theoretical discussion on multispectral data analysis in archaeology.
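
To make the directionality question in comment 241 concrete, the following is a minimal sketch of how a one-sided and a two-sided p-value differ for the same feature/shoulder comparison. It uses a permutation test on the difference in means, which is a stand-in chosen for illustration and not the test used in the manuscript. The function name `permutation_p_values` and the index values are hypothetical.

```python
import random
from statistics import mean


def permutation_p_values(feature, shoulder, n_perm=5000, seed=0):
    """Return (one_sided_p, two_sided_p) for the difference in means.

    one_sided_p tests H1: mean(feature) > mean(shoulder).
    two_sided_p tests H1: the two means differ in either direction.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(feature) - mean(shoulder)
    pooled = list(feature) + list(shoulder)
    n = len(feature)
    greater = 0  # permuted differences at least as large as observed
    extreme = 0  # permuted differences at least as large in magnitude
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n]) - mean(pooled[n:])
        if diff >= observed:
            greater += 1
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps both estimates strictly above zero.
    return (greater + 1) / (n_perm + 1), (extreme + 1) / (n_perm + 1)


# Hypothetical vegetation-index samples (not data from the study).
feature = [0.62, 0.65, 0.61, 0.66, 0.64, 0.63]
shoulder = [0.55, 0.57, 0.54, 0.58, 0.56, 0.53]
one_sided, two_sided = permutation_p_values(feature, shoulder)
print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
```

When the observed difference lies in the hypothesised direction, the one-sided p-value can never exceed the two-sided one, which is why the direction of the hypothesis needs to be stated before the test is run.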

Author Response

We thank this reviewer for their attention to our paper and the comments, and the supportive words around our bringing this work to fruition and publication. Please see our responses below:

188. We agree and have added a lot more detail on this in lines 191-202.

192. We have added detail on the directionality in the text and also elaborate on the areas chosen, why and how they were selected (which is covered above in 191-202 and also via citation to some of our previous work where we adopted the same approach). 

235. We have added the ARI1 and NDRER to the results table (Table 3).

241. We have added text into the methods to highlight the fact we are looking for significance between means rather than only looking at a general trend (Section 2.4)

284-288 Taking this point on board we have expanded the discussion in lines 355-367 to discuss this point of multiple seasons rather than a single season study, and secondly we have added more text to describe why some differences were significant in May and not June, which also relates nicely to our earlier published work where differences were only seen in optical imagery at distinct points of the growing season. This has been added 

255. We have added a paragraph to the discussion covering this further, including what can and cannot be done with this kind of aerial/thermal data, as the reviewer points out. This can be found in lines 341-354. Relating to the point of differences at distinct points of the season, we have similarly added more to the discussion; this relates nicely to previous work we published, which found similar trends. This can be found in lines 321-330.

Reviewer 5 Report

Main comment and decision

The research is remarkable and denotes a unique potential in the latest remote sensing application in archaeology. However, the data presented do not support its publication at the present stage. 

The major point concerns the results. All the tests conducted on the site were unsuccessful in the perspective of discovering archaeological traces, which is the main purpose of this new methodological proposal. The experiment conducted only on one site, moreover in a non-ideal climatic condition, is not enough to prove the effectiveness of this approach. I wonder what the result would have been under ideal conditions. In case of negative feedback from the field experiment, one expects to repeat the test through time (next year/s) or considering site(s) in a different environmental condition (land use, soil, etc.). 

Further verification is therefore necessary to validate the potential offered by such a methodology. If the response from many sites is consistently positive or consistently negative, then we can say that this procedure is useful or useless respectively. In their Conclusions (r. 318-324), the authors themselves acknowledge that, although the applied method may be fit for purpose, the result is not satisfactory.

This is especially important for developing new strategies aimed at a better knowledge of the archaeological features buried underground and threatened by ongoing practices such as agriculture, infrastructure, etc. This point is rightly argued by the authors (r. 23); however, the approach they propose might lead to the opposite result: an archaeological area well known from previous investigation (traditional aerial photography) is not represented at all by the most advanced method and equipment proposed in this case.

I guess that the authors are well aware of this issue, as often remarked in the paper: (1) the study was compromised by "unusually damp conditions which reduced the potential for crop marking" (r. 31-32); (2) "timings of the thermal surveys were not optimal" (r. 256); (3) "The direct relevance of this study to the assessment of the effectiveness of the sensors used for detecting archaeologically induced crop responses proved limited" (r. 267-268); (4) "A secondary aim of this study to assess the effectiveness of VIs proved partially successful" (r. 301); (5) "less than ideal conditions for crop stress to manifest, compromised the archaeological outputs from the project" (r. 322-323).

In consideration of the benefits potentially derived from this research, my main suggestion to the authors is to validate this approach further by investigating more sites and/or different situations. New tests will provide the authors with more data to be compared and analyzed to finally prove the effectiveness of their research approach, and accordingly support its publication.

Additional comments

  • Specific comments and suggestions are yellow-marked in the attached file.
  • English language and style: minor typos are red-marked in the attached file.
  • Archaeological background: to be improved. It is common practice to give basic information about the investigated site, such as its extent and the surface covered by the present research. Moreover, given its peculiar shape, size, and layout, does this site belong to a specific regional archaeological/historical culture?

Comments for author File: Comments.pdf

Author Response

We thank this reviewer for their attention to our paper and the comments, and the supportive words around our bringing this work to fruition and publication. Please see our responses below:

This reviewer comments broadly across our study and indeed makes a number of relevant points. At the core of the review is our approach of using one site and studying it in one year, in which climate conditions are described as "non ideal". This is usually the case in all studies, as it is incredibly rare to have perfect conditions, and indeed the definition of this is very much open to discussion. We added the following text to highlight how difficult, and often impossible, it is to achieve perfection in sampling (we do not actually want to achieve this).

While undertaking such testing in ‘ideal conditions’ where variables can be tightly controlled is of course desirable, that is not realistic in the ‘real world’. This is especially true in Scotland where the year to year, and month to month or week to week, variability in climatic conditions is significant. For example, over the period from 1976 to 2014 the average summer (June–August) rainfall for eastern Scotland varied from about 100 mm to just over 400 mm, producing conditions for archaeological cropmark formation that ranged from the excellent to the entirely unproductive [12, 49]. This variability highlights the desirability to have the capacity for single-year, rapid acquisitions, when planning multi-year programmes is so dependent on the vagaries of the weather. In this respect, our study demonstrates the cost-effectiveness of a UAV-based survey approach to such research, which would have been prohibitively expensive without such a platform. While the study has demonstrated the utility of the FLIR Vue Pro-R on a UAV platform, it is recognised that further work is warranted and the benefits to archaeological research should be explored.

It is our aim that this work is read and followed up in future work, where it could be found that a differing site under similar conditions may yield similar findings, or findings of higher significance.

Round 2

Reviewer 3 Report

The authors clearly state their views on the proposed revisions and adequately argue their choices. In addition, they implement some important aspects of the text and partly revise the results.
In the light of this, minor revisions are recommended.
In detail:

Row 320: please replace "July" with "June".

Author Response

Many thanks indeed for the second review. I confirm we have made this change and edited July to June in row 320.

Reviewer 4 Report

Authors have significantly improved their paper. They have responded to all the questions and concerns I raised except one issue. We are all ready to employ statistical tests without checking if it is an appropriate test or not.


The two-sample t-test is not very robust, and the authors need to show whether their data set is suitable for treatment with the t-test. Is the data normal/close-to-normal? Do these datasets have similar variances? If yes, please show it. If not, please deploy an easy-to-implement nonparametric test. It will significantly improve the scientific quality of the paper.

Author Response

"They have responded to all the questions and concerns I raised except one issue. We are all ready to employ statistical tests without checking if it is an appropriate test or not. Two sample t-test is not very robust, and authors need to show if their data set is suitable for treatment with the t-test. Is the data normal/close-to-normal? Are these datasets have similar variances? If yes, please show it. If not please deploy an easy-to-implement nonparametric test. It will significantly improve the scientific quality of the paper"

We deeply apologise for this omission. I have run the Shapiro-Wilk normality test across all the data and they were not normally distributed, so I re-processed the indices using the Mann-Whitney U test. Text has been added to the "statistics" section of the methods to detail this. The analysis yielded NDRER as non-significant, rather than significant as it appeared when the t-test was incorrectly performed, so I have adjusted the table accordingly. The thermal data also remain non-significant, and I have inserted the p-values into that table.
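
As a rough illustration of the nonparametric comparison described above, the following is a minimal sketch of a two-sided Mann-Whitney U test written with the Python standard library only. It uses the normal approximation with tie and continuity corrections, so it is only indicative for small samples. The function name `mann_whitney_u` and the sample values are hypothetical, and this is not the code used for the manuscript; in practice an established routine such as `scipy.stats.mannwhitneyu` (with `scipy.stats.shapiro` for the normality check) would normally be preferred.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for sample x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(combined)
    tie_term = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t**3 - t  # contribution of this tie group
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    # Continuity correction, clipped at zero so p never exceeds 1.
    z = max((abs(u_x - mu) - 0.5) / sigma, 0.0)
    p = math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))
    return u_x, p


# Hypothetical index values for feature and shoulder pixels.
feature = [0.41, 0.44, 0.39, 0.47, 0.43, 0.45, 0.40]
shoulder = [0.38, 0.36, 0.42, 0.35, 0.37, 0.39, 0.34]
u, p = mann_whitney_u(feature, shoulder)
print(f"U = {u}, approximate two-sided p = {p:.4f}")
```

Because the test works on ranks and not on raw values, it does not assume normality, which is the property the reviewer asked for.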

Reviewer 5 Report

I wish to thank the Authors for their reply and the updates on the article.
In my opinion, the article still has a major issue which makes this research not acceptable for publication at the present stage. One test only cannot support the validity of the approach presented, especially if it has proved unsuccessful, as in this case. If the experiment is conducted over a significant number of sites (depending on the availability and the type of similar case studies), the scholars will obtain different results according to the feedback from archaeological evidence (if any) and environmental conditions. By doing so, even the lack of success of this investigation (i.e. the failure to detect archaeological evidence that is known to exist) can be considered a satisfactory result: the demonstration that the employed approach (method + equipment + processing + parameters) is not suitable for that purpose. Conversely, results showing anthropic traces can be compared to the settings used and the conditions at the time of the test. Moreover, in selecting the case studies, the authors are helped by the sites already known from previous research.
As a reply to the authors' comment on 'ideal conditions', I understand that they are difficult to obtain, especially during field tests, where weather, climate, soil, etc. change from day to day and cannot be predicted. As also remarked by the authors in the text, Scotland seems to have even more unpredictable weather conditions, which should rather be considered 'standard'. However, if the authors repeatedly stress the 'non-ideal conditions', it seems that they have strongly influenced the result of the test, either positively or negatively. In that sense, I therefore suggest deleting 'non-ideal' completely and just leaving some basic information on the environmental setting of the site/research area.
Finally, as a minor observation, the site under investigation requires a more detailed description:
1) The surface covered by the site and by the present investigation must be specified;
2) The chronological attribution to different phases must be explained: who hypothesized the different phases (Iron and Roman ages) and how (excavation, survey, comparison to similar patterns)? Without any bibliographic reference, it looks like a result of this research.
Despite the final response, I hope to have been helpful for the authors. Again, this research is really promising and the potential benefits from such an approach might contribute considerably to the UAV-based application in archaeology.

Author Response

Many thanks again for your comments. We have inserted a reference on line 126 'centuries AD [12]', and updated the reference to a better one in the bibliography. We have also updated the detail on the site history in so far as cover is concerned. (lines 115-128).
