Underwater Soundscape Monitoring and Fish Bioacoustics: A Review

Soundscape ecology is a rapidly growing field with approximately 93% of all scientific articles on this topic having been published since 2010 (total about 610 publications since 1985). Current acoustic technology is also advancing rapidly, enabling new devices with voluminous data storage and automatic signal detection to define sounds. Future uses of passive acoustic monitoring (PAM) include biodiversity assessments, monitoring habitat health, and locating spawning fishes. This paper provides a review of ambient sound and soundscape ecology, fish acoustic monitoring, current recording and sampling methods used in long-term PAM, and parameters/metrics used in acoustic data analysis.

Figure 1. Number of "soundscape ecology" scientific papers published per year. Data were derived by searching Google Scholar using only the keyword "soundscape ecology", and do not reflect earlier works describing underwater acoustics and marine animal sounds. The number above each bar is the number of publications for that year.
are the ones on coral reefs (in clear water) that produce loud sounds, which are easily heard by scuba divers, including: pomacentrids (damselfish), holocentrids (squirrelfish), sciaenids (drums), and batrachoidids (toadfish) [31]. Other fishes that are much quieter can only be detected using nearby hydrophones, including: carapids (pearlfishes), syngnathids (seahorses), gobiids (gobies), and scarids (parrotfish) [31]. For many species, there appears to be a circadian pattern to sound production that is related to territorial establishment or reproduction. The diel pattern of twilight period spawning by reef fishes was reviewed by Lobel and Lobel [35]. Nocturnal reef fishes frequently produce sounds when active, while most diurnal coral reef fishes tend to produce sounds primarily at dusk and/or dawn when these fishes are engaged in reproduction [31,36-38].
The spawning sounds of a variety of fishes consist of short pulses, grunts, or growls, each only lasting several milliseconds to a few seconds [25]. These brief sound bursts can be easily missed by the current long-term PAM methods of intermittent recording (see details below). Studies aiming to define when a certain fish species is spawning may need to record continuously for 24-48 h in order to determine that species' specific courtship and spawning diel periodicity (e.g. Lobel and Mann, Locascio and Mann, [26,39]). A listing of species that are acoustically active mainly during the hours around sunset and sunrise is catalogued in Lobel et al. [31]. Once the diel courtship and spawning periodicity is determined for a species, the recording intervals and duration can be redefined in order to best capture both the courtship and spawning sounds with regard to the technological limitations.

Ambient Sound
Ambient sound could potentially convey important information about habitat quality. Wenz summarized the sound levels of the primary biotic and abiotic components of ocean ambient sound [40]; these are referred to as the "Wenz curves", and can be accessed on the Discovery of Sound in the Sea website created by the University of Rhode Island Graduate School of Oceanography [41]. A few studies have applied soundscape measurements to estimates of biodiversity in terrestrial ecosystems [23,42]. Recently, this approach has also been extended to underwater habitats [21,43]. Based on preliminary studies suggesting that sound intensity is higher on healthy reefs than on degraded ones [44], soundscapes have the potential to serve as a monitoring tool for ecosystem health. In one study, regime shifts from structurally complex and diverse habitats to less complex habitats were directly correlated with a decrease in biological sounds [21]. In another study, sound levels were found to increase with increased coral cover, species diversity, and water-flow rates [18]. The ambient sounds of a coral reef have been proposed to attract certain reef fish larvae [44-52], thus functioning as an acoustic signpost.

Long-Term Passive Acoustic Monitoring Methods
Acoustic technology has advanced significantly in recent years. Omnidirectional hydrophones with high sensitivities, automatic acoustic recorders, and associated hardware and software are capable of collecting a wide range of acoustic data. This technology is currently limited by battery life and memory storage. Hydrophones are produced with a wide range of recording sensitivities and need to be calibrated to a sensitivity appropriate for the target sound source(s). In general, hydrophone sensitivities being used in the field range from −156 to −193 decibels relative to 1 volt per micropascal (dB re 1 V/µPa), a logarithmic measure of the voltage that a hydrophone produces per unit of sound pressure. Studies that did not calibrate their hydrophone systems have demonstrated the importance of doing so for long-term recording [53]. All of this technology allows researchers to collect data at regularly scheduled intervals, independent of previously limiting factors such as weather and study site depth.
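As a rough illustration of why sensitivity and calibration matter, the sketch below converts a recorded voltage into sound pressure using a hydrophone sensitivity expressed in dB re 1 V/µPa. The function name, the optional gain term, and the example values (1 mV at -160 dB) are assumptions chosen for illustration and are not taken from any study cited here.

```python
def volts_to_micropascals(voltage_v, sensitivity_db, gain_db=0.0):
    """Convert a voltage (V) to sound pressure (µPa).

    sensitivity_db: hydrophone sensitivity in dB re 1 V/µPa (a negative number).
    gain_db: any fixed recorder gain in dB (assumed to be 0 here).
    """
    # Linear sensitivity of the whole recording chain in volts per micropascal.
    volts_per_upa = 10 ** ((sensitivity_db + gain_db) / 20.0)
    return voltage_v / volts_per_upa


# Hypothetical example: 1 mV measured with a -160 dB re 1 V/µPa hydrophone.
pressure_upa = volts_to_micropascals(0.001, -160.0)
print(f"{pressure_upa:.0f} µPa")
```

The same voltage corresponds to a much larger pressure when the hydrophone is less sensitive (for example, -190 dB), which is one reason an unknown or drifting sensitivity makes absolute levels hard to interpret.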
Different sources of sounds are produced over a wide range of frequencies. In general, the sound of wind and breaking waves extends over a wide frequency band of 0.1-20 kHz, with a peak from 200-2000 Hz [53]. Shipping noise is generally in the 30-100 Hz band, and typically about 10 dB above other background noise [53]. The peaks in rainfall sound occur in the 15-20 kHz range and generally last over longer periods of time at a fairly steady rate [54]. In terms of biotic sound production, most studies find that the loudest contributors to the overall soundscape are snapping shrimp, which can sometimes drown out other biotic sounds in recordings [17]. Invertebrates (mainly snapping shrimp) dominate the higher frequencies (2.5-15 kHz) in tropical marine habitats [46]. Fishes and whales tend to dominate sound production at lower frequencies (<500 Hz), although whales have many orders of magnitude greater amplitude than fishes. Overall, these diverse biotic and abiotic sounds show high variability between times of day, month, year, and lunar cycle [38]. Such a wide range of spectral and temporal patterns demonstrates the difficulties in capturing and distinguishing between different sources of sounds in an overall soundscape.
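The overlap among these bands can be made concrete with a small lookup. The sketch below simply encodes the approximate frequency ranges quoted in this paragraph and lists which sources could plausibly contribute at a given frequency. The dictionary name, the function name, and the 0 Hz lower bound used for "fishes and whales" are assumptions for illustration; real source bands vary by site and species.

```python
# Approximate bands (Hz) as quoted in the text above; illustrative only.
SOURCE_BANDS_HZ = {
    "wind and breaking waves": (100.0, 20000.0),
    "shipping": (30.0, 100.0),
    "rainfall (peak)": (15000.0, 20000.0),
    "snapping shrimp and other invertebrates": (2500.0, 15000.0),
    "fishes and whales": (0.0, 500.0),  # "<500 Hz"; lower bound is an assumption
}


def candidate_sources(frequency_hz):
    """Return the names of all sources whose band contains frequency_hz."""
    return sorted(
        name
        for name, (low_hz, high_hz) in SOURCE_BANDS_HZ.items()
        if low_hz <= frequency_hz <= high_hz
    )


print(candidate_sources(50.0))    # low-frequency band
print(candidate_sources(5000.0))  # mid/high-frequency band
```

Most frequencies map to more than one candidate source, which reflects the difficulty of attributing energy in a single band to a single contributor.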
The long-term PAM projects reported to date share little common framework. Studies have used different recording schedules based on a compromise among several technical constraints, including the desired length of study, battery duration, data storage capacity, type of hydrophone, and the bandwidth used for the acoustic recording. Recording rates used in recent studies have been highly variable and are summarized in Table 1. The recording bandwidth has also been highly variable among studies, ranging from 2 kHz to 250 kHz (Table 2). The accuracy of non-continuous acoustic data from an underwater study was recently assessed to determine the best subsampling intervals [55]. Two recording schedules (30 s every 4 min, and 2 min every 10 min) most accurately depicted the soundscape derived from the entire 55-min continuous recording [55].
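The trade-off between recording schedule and storage can be sketched with simple arithmetic. The example below computes the duty cycle of the two schedules mentioned above and a rough uncompressed storage estimate. The function names, the 30-day deployment, the 48 kHz sampling rate, and the 16-bit mono format are assumptions for illustration, not values reported in the studies reviewed.

```python
def duty_cycle(record_s, period_s):
    """Fraction of time spent recording (e.g., 30 s every 240 s)."""
    return record_s / period_s


def storage_gb(days, record_s, period_s, sample_rate_hz, bit_depth=16, channels=1):
    """Rough uncompressed storage need in gigabytes (1 GB = 1e9 bytes)."""
    recorded_seconds = days * 86400 * duty_cycle(record_s, period_s)
    bytes_per_second = sample_rate_hz * (bit_depth / 8) * channels
    return recorded_seconds * bytes_per_second / 1e9


# The two schedules discussed above, for a hypothetical 30-day, 48 kHz deployment.
for label, record_s, period_s in [("30 s / 4 min", 30, 240), ("2 min / 10 min", 120, 600)]:
    dc = duty_cycle(record_s, period_s)
    gb = storage_gb(30, record_s, period_s, 48000)
    print(f"{label}: duty cycle {dc:.3f}, about {gb:.1f} GB")
```

Under these assumptions the shorter, more frequent schedule records for a smaller fraction of the time, which is the kind of saving that motivates subsampling in long deployments.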

Acoustic Parameters and Measurements
A wide variety of parameters have been used to analyze the data in reported studies. Across a total of 60 studies examined in this review, 34 different metrics and/or indices were selected and analyzed. Power spectral density (PSD) was the most commonly used [3,7,17-19,21,22,38,43,46,59,61,63,64,66-68]. Sound pressure level (SPL), or root mean square (RMS)-SPL, was the next most frequently used parameter [6,8,11,17,20,21,38,53,54,59,61,63,64,68,69]. SPL is a logarithmic measure of the ratio of the pressure of a sound to a reference value, and is expressed in decibels (dB). Generally, the reference value used in underwater acoustics is 1 µPa. Measuring SPL requires a hydrophone recording using the fixed gain setting and measurement of the distance from the hydrophone to the source of a sound (see e.g. Morisaka et al. [70]). The use of this metric may be difficult in studies that characterize soundscapes and ambient sound, because the distance to most of the sounds recorded in these types of acoustic studies is unknown.
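The SPL definition above can be written as SPL = 20 log10(p_rms / p_ref). The sketch below applies it to a calibrated pressure series in µPa with the 1 µPa underwater reference. The function names and the synthetic sine tone are assumptions for illustration; the result is the level at the hydrophone and says nothing about the level at the source.

```python
import math

P_REF_UPA = 1.0  # reference pressure commonly used underwater: 1 µPa


def rms(samples):
    """Root mean square of a sequence of pressure samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def spl_db(pressure_upa, p_ref_upa=P_REF_UPA):
    """RMS sound pressure level in dB re p_ref_upa."""
    return 20.0 * math.log10(rms(pressure_upa) / p_ref_upa)


# Hypothetical calibrated signal: 10 full cycles of a sine with 100,000 µPa peak.
n = 1000
tone = [1e5 * math.sin(2 * math.pi * 10 * i / n) for i in range(n)]
level = spl_db(tone)
print(f"{level:.2f} dB re 1 µPa")
```

Because the input must already be in pressure units, a recording made with automatic gain cannot be passed through this calculation in a meaningful way.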
Several measurements, including PSD and spectral entropy (Hf), can be quickly calculated using bioacoustics software, such as Raven Pro 1.5 (Bioacoustics Research Program, The Cornell Lab of Ornithology, Ithaca, NY, USA) and Avisoft SASLab Pro 5.2.12 (Avisoft Bioacoustics, Glienicke, Germany). These quick calculations can be especially useful when analyzing larger data sets. Power spectral density estimates the strength of the variations in energy as a function of frequency, instead of time, and is generally used to characterize broadband random signals. In Raven, average PSD is calculated by summing the square magnitudes of the Fourier coefficients across time and frequency, and dividing by the product of the selection duration and selection bandwidth, resulting in a measurement in decibels. PSD can be calculated independent of whether the hydrophone and acoustic recorder used have automatic or fixed gain. This allows PSD to be used more widely and can serve as a parameter to compare across studies, regardless of technological limitations.
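To make the idea of power spectral density concrete, the sketch below estimates a one-sided PSD from a short clip with a direct discrete Fourier transform and then averages it over a frequency band in decibels. This is a generic periodogram, not Raven's implementation; the function names, the scaling convention, and the synthetic 50 Hz tone are assumptions for illustration, and the direct transform is only practical for very short clips.

```python
import cmath
import math


def periodogram(samples, sample_rate_hz):
    """One-sided PSD estimate (units^2 per Hz) via a direct DFT; O(n^2)."""
    n = len(samples)
    psd = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        density = abs(coeff) ** 2 / (sample_rate_hz * n)
        if 0 < k < n / 2:  # double every bin except DC and Nyquist
            density *= 2.0
        psd.append(density)
    return psd


def band_average_db(psd, sample_rate_hz, n_samples, f_lo_hz, f_hi_hz):
    """Mean PSD within [f_lo_hz, f_hi_hz], expressed in dB (uncalibrated)."""
    df = sample_rate_hz / n_samples  # frequency resolution in Hz
    in_band = [p for k, p in enumerate(psd) if f_lo_hz <= k * df <= f_hi_hz]
    return 10.0 * math.log10(sum(in_band) / len(in_band))


# Hypothetical clip: a unit-amplitude 50 Hz tone sampled at 400 Hz for 0.5 s.
fs, n = 400.0, 200
clip = [math.sin(2 * math.pi * 50.0 * i / fs) for i in range(n)]
psd = periodogram(clip, fs)
peak_hz = psd.index(max(psd)) * fs / n
print(f"peak at {peak_hz:.0f} Hz")
print(f"average PSD, 40-60 Hz: {band_average_db(psd, fs, n, 40.0, 60.0):.1f} dB")
```

With this scaling, the PSD integrated over frequency equals the mean square of the signal, which gives a quick sanity check on any implementation.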
Several studies chose a small number of parameters to focus on during analysis [3,6-8,11,17-20,22,23,38,42,43,46,53,54,58-60,65,66,69,71-74], while some analyzed up to 10 different parameters [63]. There is still discussion about whether one number, or index, can fully describe a soundscape [75]. As the field continues to grow, it is recommended that studies continue to use multiple parameters, each of which provides details about different aspects of a soundscape [75]. Determining which indices provide the most accurate description of the acoustic data, and by extrapolation biological patterns, remains one of the major challenges in soundscape ecology.

Acoustic Indices
There are two main types of acoustic indices: within-group (α) and between-group (β) indices [75]. Within-group indices are useful in comparing all of the aspects in the same group, with a group being defined as "a sample unit as a site, a habitat, or a time event" [75]. Between-group indices are useful in determining how acoustically different multiple acoustic communities are. Both groups of indices contribute to quantifying the soundscape.
Several new indices are being tested to measure the evenness of an acoustic space (acoustic entropy index, H) [23], the dissimilarity between two communities (acoustic dissimilarity index, D) [23,60], the acoustic richness of a community (acoustic richness, AR) [23,60], and the degree of complexity (acoustic complexity index, ACI) [42] (Table 3). Indices such as AR, ACI, and H are considered α indices, and the D index is in the β group. Most of the studies testing the robustness of these indices were performed in terrestrial ecosystems, and their data are not directly comparable to data from underwater acoustics. Similar studies need to be conducted in marine ecosystems [19,22,43,61,76,77]. Each of these indices has different advantages and limitations. Acoustic entropy (H) is the product of both spectral and temporal entropies, and results are on a scale of 0 to 1, with 0 indicating more pure tones and 1 indicating random noise. The spectral entropy calculated in Raven software is affected by the signal, begin/end times, low/high frequencies, window size, discrete Fourier transform size, and overlap. This measure has a low value for signals with a similar type of distribution of energy over a spectral slice. The average entropy measurement computes the entropy of each spectral frame and averages those measurements, while the aggregate entropy corresponds to the overall disorder in a sound. The H index can provide interesting information regarding the species richness in a habitat. A demonstration of the use of this index was conducted in coastal Tanzania by comparing the sounds of a degraded forest to those of a healthy forest [23]. That study found that H values were significantly higher in the healthy forest than in the degraded forest [23]. However, if a few species dominate the habitat acoustically, then diversity will be shown to be low through this index alone.
There is also some error with this index in areas with an overall low number of species, because variability decreases in these communities. Abiotic and anthropogenic noise can also reduce the reliability of this index [23]. In order to account for the false high values generated from geophony and anthrophony, Depraetere et al. [60] elaborated upon the H index to create the acoustic richness (AR) index.
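The behaviour of the H index at its two extremes can be sketched with a normalized Shannon entropy. The example below treats H as the product of a temporal entropy (computed on an amplitude envelope) and a spectral entropy (computed on a mean spectrum), each scaled to the range 0 to 1. The function names are hypothetical, the envelope and spectrum are supplied directly as toy lists rather than derived from audio, and details such as how the envelope is obtained are assumptions that may differ from the formulation in [23].

```python
import math


def normalized_entropy(values):
    """Shannon entropy of non-negative values, scaled to lie between 0 and 1."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    total = sum(values)
    probabilities = [v / total for v in values if v > 0]
    entropy_bits = -sum(p * math.log2(p) for p in probabilities)
    return entropy_bits / math.log2(len(values))


def acoustic_entropy(envelope, spectrum):
    """H as the product of temporal and spectral normalized entropies."""
    return normalized_entropy(envelope) * normalized_entropy(spectrum)


# Toy inputs: a steady envelope paired with either a flat or a single-peak spectrum.
steady_envelope = [1.0] * 8
flat_spectrum = [1.0] * 8  # energy spread evenly, noise-like
tone_spectrum = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # one frequency only

h_noise = acoustic_entropy(steady_envelope, flat_spectrum)
h_tone = acoustic_entropy(steady_envelope, tone_spectrum)
print(h_noise, h_tone)
```

The flat spectrum gives the maximum value and the single-peak spectrum the minimum, which also shows why broadband abiotic noise can push the index toward high values.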
The acoustic dissimilarity index (D) was also used to compare the two Tanzanian forests [23]. The D index estimates the compositional dissimilarity between two communities, and takes into account both temporal and spectral acoustic data [23]. The acoustic dissimilarity index compares two signals of the same duration at the same frequency. The value increases as the number of unshared species between pairs of choruses increases, which suggests that this index could be used to infer differences between community compositions. The D values in this study showed differences between the healthy and degraded forests, based on the finding of a linear increase in D values with the number of unshared species between the two communities. As with the H index, if a couple of species are more widespread and dominate the area acoustically, then the D index will be low. Both the D and H indices can be used to infer differences between communities.
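One way to picture a dissimilarity index that combines temporal and spectral information is as the product of two half-L1 distances, one between normalized amplitude envelopes and one between normalized mean spectra. The sketch below follows that reading; it is an assumption about the general form of the index in [23], not a reproduction of any package's code, and the function names and toy inputs are hypothetical.

```python
def _normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def half_l1_distance(first, second):
    """Half the summed absolute difference of two normalized profiles (0 to 1)."""
    if len(first) != len(second):
        raise ValueError("profiles must have the same length")
    a, b = _normalize(first), _normalize(second)
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def acoustic_dissimilarity(envelope_1, envelope_2, spectrum_1, spectrum_2):
    """D as the product of temporal and spectral dissimilarities."""
    return half_l1_distance(envelope_1, envelope_2) * half_l1_distance(spectrum_1, spectrum_2)


# Toy inputs: two recordings with no overlap in time or in frequency.
d_disjoint = acoustic_dissimilarity([1.0, 0.0], [0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
# Toy inputs: two identical recordings.
d_identical = acoustic_dissimilarity([1.0, 2.0], [1.0, 2.0], [3.0, 1.0, 1.0], [3.0, 1.0, 1.0])
print(d_disjoint, d_identical)
```

Requiring equal-length profiles mirrors the condition in the text that the two signals being compared have the same duration.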
The most widely used of these newer indices is the acoustic complexity index (ACI) [1,19,22,38,42,43,61,71,75,78,79]. The ACI was developed with the goal of producing a fast and direct quantification of acoustic sounds by focusing on intensity [42]. The creation of this index was based on the observation that many animal sounds have varying intensities compared with the relatively constant intensity of human-generated noise [42]. The ACI calculates the absolute difference between adjacent intensity values within a single frequency bin, sums these differences over a temporal step of the recording, and divides the result by the total intensity in that step; the resulting values are then summed across all temporal steps and frequency bins [42]. Although this index was created for and tested in terrestrial habitats, several studies have extended these efforts to marine ecosystems [19,38,43,61,71].
Studies have concluded that this index is better suited for soundscapes with constant intensities, possibly such as those dominated by snapping shrimp. The calculations are also very time-consuming, and may not be well suited to monitoring repeated recording sessions [71]. As with the other indices mentioned, ACI may overlook finer details when there is one dominant, soniferous species, and should therefore be considered along with other parameters.
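The logic of the ACI can be sketched on a small spectrogram: within each frequency bin and temporal step, the absolute differences between adjacent intensity values are summed and divided by the total intensity of that step, and the results are accumulated over all steps and bins. The code below is a minimal reading of that procedure as described in [42]; the function name, the list-of-lists spectrogram layout, and the toy intensities are assumptions for illustration.

```python
def acoustic_complexity_index(spectrogram, frames_per_step):
    """ACI for a spectrogram given as rows of intensities, one row per frequency bin."""
    total = 0.0
    for bin_intensities in spectrogram:
        for start in range(0, len(bin_intensities), frames_per_step):
            step = bin_intensities[start:start + frames_per_step]
            step_sum = sum(step)
            if len(step) < 2 or step_sum == 0:
                continue  # nothing to compare, or an empty (silent) step
            differences = sum(abs(a - b) for a, b in zip(step, step[1:]))
            total += differences / step_sum
    return total


# Toy spectrogram: one bin with constant intensity, one with fluctuating intensity.
constant_bin = [2.0, 2.0, 2.0, 2.0]
fluctuating_bin = [1.0, 3.0, 1.0, 3.0]

print(acoustic_complexity_index([constant_bin], frames_per_step=4))
print(acoustic_complexity_index([fluctuating_bin], frames_per_step=4))
```

The constant bin contributes nothing while the fluctuating bin contributes a positive value, which is the property that lets the index discount steady noise.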

Acoustic Statistical Software
Several open-access statistical software routines are now available and enable the easy calculation of some of these newer indices. Notable routines are available in Matlab 9.4 (The Mathworks, Inc., Natick, MA, USA) and R 3.5.0 (R Foundation for Statistical Computing), including: PAMGuide [68], CHORUS [67], soundecology [81], and seewave [82]. Although none are yet fully integrated, each package includes code to calculate different indices, as described below.
PAMGuide includes code for both Matlab and R to calculate broadband sound pressure level (SPL), PSD, 1/3-octave band levels (TOL), and waveforms. The CHORUS package includes code to calculate PSD and compose long-term average (LTA) spectrograms, and has an automatic detection function that can currently detect two whale calls and allows for the easy addition of automatic detectors. The soundecology package was created in R with code to measure the ACI and D indices [81]. A plug-in soundscape meter for Wavesurfer (v.1.8) was also developed to calculate the ACI [42]. Both the H and D indices can be computed through R functions in the free package seewave, and can be used relatively easily by non-scientists for biodiversity estimation [82]. Depending on the aim of a study, multiple software packages may need to be used to calculate every desired metric.

Contrasting Soundscapes
Many studies also explore the spatial variation within and between habitats [18,19,22,38,83]. Currently, the majority of soundscape studies explore temporal variation at one habitat; these data can later be compared with those of other studies to explore the acoustic differences between habitat types (e.g., coral reefs versus sandy patches). Our review paper aimed to compile a summary table of acoustic measurements from various aquatic habitats to allow for an analysis of the spatial variability between the soundscapes of different underwater ecosystems. After surveying 22 studies that characterized the ambient soundscape of a particular habitat, or multiple habitats [3,6,17-19,21,22,38,43,53,54,61,64,65,69,71,83-88], only seven studies provided exact quantitative measurements either in the body of the paper or in a table/supplementary material [18,19,38,54,61,71,83]. However, the values provided in these seven papers were different metrics (PSD, SPL, sound intensity, and ACI), and therefore could not be directly compared. The other 15 papers that were surveyed did provide several figures to visually display the soundscape variation; however, exact values cannot be extracted from their figures. Clear graphical representation is important, but in order to compare among different soundscape studies, future authors should also include a table summarizing the soundscape measurements for their specific study site.
To demonstrate one approach, we present the following case study contrasting the soundscapes of two different marine habitats in Belize. The first recording is from a relatively quiet, sandy/mangrove habitat at Glovers Atoll; see Randall et al. [89] for a description of the study site. The other is from an acoustically complex, highly biodiverse coral reef at Tunicate Cove; see Lindseth [55] for a description of this habitat and recording methods. Each recording was visually and audibly inspected and cut to a 20-s clip that had minimal anthropogenic noise. Each 20-s clip was then analyzed in Raven Pro 1.4; see supplementary material for full details on the methods used to analyze the two recordings. The preliminary analyses of the 20-s clips from each recording display an acoustic difference between both the waveforms and spectrograms of the two different ecosystems (Figures 2 and 3). Spatial variation between the two habitats was tested (two-way analysis of variance (ANOVA) followed by a t test). The tests revealed a statistically significant difference between Tunicate Cove and Glovers Atoll for all of the parameters tested (n = 40, p < 0.0001) except for peak frequency (n = 40, F(1,38) = 0.32, p > 0.05). The more acoustically complex site with higher biodiversity (Tunicate Cove) had higher average (n = 40, F(1,38) = 1301.7, p < 0.0001) and peak PSD (n = 40, F(1,38) = 495.6, p < 0.0001), RMS amplitude (n = 40, F(1,38) = 765.4, p < 0.0001), and energy (n = 40, F(1,38) = 1309.9, p < 0.0001). However, both average entropy (n = 40, F(1,38) = 1524.2, p < 0.0001) and aggregate entropy (n = 40, F(1,38) = 324.4, p < 0.0001) were higher at Glovers Atoll, which is the sandy/mangrove habitat.
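For readers unfamiliar with the F(1,38) notation, the sketch below computes a one-way ANOVA F statistic and its degrees of freedom for a set of groups. It is a generic textbook calculation offered under the assumption that each site contributes one group of measurements; it may not match the exact procedure used in the case study, the function name is hypothetical, and the numbers are made-up toy values rather than data from either site.

```python
def one_way_anova(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# Made-up toy measurements for two hypothetical sites.
site_a = [1.0, 2.0, 3.0]
site_b = [2.0, 3.0, 4.0]
f_value, df1, df2 = one_way_anova([site_a, site_b])
print(f"F({df1},{df2}) = {f_value:.2f}")
```

With two groups of 20 measurements each, the same function returns degrees of freedom of 1 and 38, which is the form in which the results above are reported.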
It is important to note that in the very quiet recording of the sand habitat at Glovers Atoll, the camera's operation noise can be heard, and it is seen in the spectrogram as a dark band at about 1.1 kHz to 1.2 kHz (see Kovitvongsa and Lobel [90] for discussion of camera noise issues in acoustic recordings). The results of this preliminary case study are an example of how these two habitats can be acoustically differentiated. Table 4 itemizes recommended metrics and indices that can be reported when generally characterizing the soundscape of an area of study. It is recommended that future papers provide the same quantitative acoustic measurements, so that it will eventually be possible to directly compare results among studies and begin answering larger-scale questions on acoustic spatial and temporal variation.
Table 4 footnotes: 1 Sound pressure level (SPL) could not be computed in this example, because the equipment used automatic gain. 2 Acoustic complexity index (ACI) is still being tested for robustness in underwater acoustic studies and should be included, if possible; however, it could not be computed for this example due to software limitations. PSD: power spectral density, RMS: root mean square.

Discussion
As an emerging scientific topic, soundscape ecology has advanced greatly in recent years with the number of scientific publications increasing mainly within the past 10 years (Figure 1). However, there is still a great deal that is unknown about how best to quantify acoustic signals and quantitatively compare data, especially among studies. Across dozens of studies from the past 10-15 years, most researchers recognize a handful of recommendations as logical next steps; these are detailed below. As many of these limitations and issues are solved, soundscape monitoring may offer a viable method for some wide-scale applications such as monitoring the health of remote habitats and documenting fish spawning activities.
In order to accurately describe the biodiversity of any habitat through acoustics and monitor the impacts of anthropogenic noise, the detection and documentation of fish sounds is key [3,11,38,46,58,59,61]. Fully understanding the hearing sensitivities of soniferous fishes is also important to understanding fishes' communication in the context of their soundscape [21,47,52,74,81,91,92]. Currently, the acoustic signatures of marine mammals [67,93] and over 100 fish species have been well described [26,31,57,59,70,93-97]. The current best practice requires the use of hydrophones coupled with video recordings of the target species in order to confirm that a sound matches an individual fish behavior [90,98]. However, the use of video to capture the calling fish is not always practical in the field due to limitations such as poor water visibility. It may be necessary to couple field and lab studies to definitively describe a species' call. As the acoustic signatures of more species are better defined by clear recordings, the automatic detection features in bioacoustics software can be more finely tuned. The automatic detection of biological sounds will allow for simpler and faster analysis of large acoustic data files, which is needed to handle long-duration field recordings.
Directional hydrophones are useful in determining the source of a sound [43,99,100]. However, most studies currently use omnidirectional hydrophones, which pick up sounds from any direction. Terrestrial studies have used groups of directional microphones when examining population density and individual abundance, and when locating and tracking animal movements [99]. These methods of localizing sound sources reduce the number of differences detected in recordings and allow for more accurate counts of individual sound producers [71,99,101]. A number of recent studies generalize entire acoustic habitats from single-point recordings [3,17,18,21,22,38,43,46,54,56]. The question is whether such limited spatial sampling is adequate. Multiple hydrophones in geographically distributed arrays might better ground-truth patterns and help determine whether single-point recordings give an accurate representation of a broader-area soundscape [57].
An integral part of any emerging field is to establish a common framework so that data are comparable among studies. In acoustics, this includes the standardization of the sampling/recording methods, the metrics and indices used in data analysis, visualization tools, and sensor calibration, as well as the ground-truthing of current methods [1,23,38,54,57,58,61,65,68,99,102-104]. As more studies set long-term recording goals of several months or more, limitations of battery life and memory storage require them to forgo continuous recording. The resulting sampling schemes must not lose a significant amount of soundscape information. This issue was explored in tropical forests with the aim of determining how much acoustic information is lost as the gap in recording time increases [105]. Although the findings suggest that each location and soundscape may require a specialized recording schedule, overall, the loss of important information increases significantly with the gap between recording times [105]. These findings suggest that the best data come from using a more intense recording regime.
One future application for soundscape ecology is the use of long-term recordings to monitor the health of an ecosystem. However, more long-term studies that explore the link between the health of an ecosystem and the corresponding soundscape parameters are needed [20,23,42]. A handful of studies have begun to explore the acoustic differences between healthy habitats (e.g., forests, seagrass beds, and coral reefs) and degraded ones [19,23,44]. These studies have found suggestive differences in the acoustic signatures of healthy versus degraded habitats, but such differences may just as likely be based upon the variation in biological communities. The playback of healthy coral reef habitat sounds also results in greater attraction of the settlement stages of coral, mollusk, and coral reef fish larvae, which are migrating from offshore [5-8,11,106].

Conclusions
Underwater soundscapes and long-term PAM both have incredible potential in the fields of ecology, behavior, evolution, and conservation biology. Coupled with other conventional underwater survey methods and established physical oceanographic meters, PAM can be used to gain a more accurate understanding of the health, biodiversity, and structure of underwater habitats. Acoustics may enable researchers to monitor several different species simultaneously, which offers an integrative look at different habitats within and between ecosystems [99]. PAM provides a new technology to monitor remote underwater habitats over long durations. However, when beginning at any new site, it will probably be necessary to use synchronous audio and video recordings alongside conventional visual surveys in order to get an accurate assessment of a particular underwater habitat and verify sound-producing species.