Underwater Video as a Tool to Quantify Fish Density in Complex Coastal Habitats

Abstract: Habitat loss is a serious issue threatening biodiversity across the planet, including coastal habitats that support important fish populations. Many coastal areas have been extensively modified by the construction of infrastructure such as ports, seawalls, docks, and armored shorelines. In addition, habitat restoration and enhancement projects often include constructed breakwaters or reefs. Such infrastructure may have incidental or intended habitat values for fish, yet its physical complexity makes quantitatively sampling these habitats with traditional gears challenging. We used a fleet of unbaited underwater video cameras to quantify fish communities across a variety of constructed and natural habitats in Perdido and Pensacola Bays in the central northern Gulf of Mexico. Between 2019 and 2021, we collected almost 350 replicate 10 min point census videos from rock jetty, seawall, commercial, public, and private docks, artificial reef, restored oyster reef, seagrass, and shallow sandy habitats. We extracted standard metrics of Frequency of Occurrence and MaxN, as well as more recently developed MeanCount for each taxon observed. Using a simple method to measure the visibility range at each sampling site, we calculated the area of the field of view to convert MeanCount to density estimates. Our data revealed abundant fish assemblages on constructed habitats, dominated by important fisheries species, including grey snapper Lutjanus griseus and sheepshead Archosargus probatocephalus. Our analyses suggest that density estimates may be obtained for larger fisheries species under suitable conditions. Although video is limited in more turbid estuarine areas, where conditions allow, it offers a tool to quantify fish communities in structurally complex habitats inaccessible to other quantitative gears.


Introduction
Habitat loss is a serious issue threatening biodiversity across the planet [1,2], and coastal marine habitats are no exception. In addition to significant losses of critical natural habitats in coastal waters, such as oyster reefs [3], salt marshes [4], and seagrass [5], many coastal areas are being modified by human actions [6,7]. Some human actions cause direct habitat loss, while the construction of infrastructure at various scales, such as ports, seawalls, docks, and armored shorelines, replaces one habitat type with another [8,9]. Similarly, many coastal habitat restoration and enhancement projects deploy artificial structures that attempt to mimic natural features, thereby providing habitat values for fisheries and other species [10][11][12]. Although many modifications of coastal areas tend to degrade habitat quality for fish [9], the addition of structurally complex artificial habitats into systems where complex habitat is limiting may have incidental or intentional benefits for biodiversity and secondary production [11,13,14].
[...] restoration activities to maximize fish habitat values [31]. The aim of this study was to evaluate the use of simple, single-camera underwater video units as sampling tools to derive quantitative density estimates from a range of estuarine habitats. In particular, we aimed to compare fish assemblages across a variety of natural, restored, and constructed habitats, to develop a simple method to derive density estimates from MeanCount values from underwater video, and to compare these densities to the more widely used metrics of frequency of occurrence (FoO) and MaxN.

Study Sites
We quantified fish communities across a variety of constructed, restored, and natural habitats in Perdido and Pensacola Bays in the central northern Gulf of Mexico (Figure 1). The lower reaches of Perdido Bay where sampling occurred have highly developed shorelines including private residential lots, most with private docks, and commercial businesses, armored by bulkheads or riprap. The substrate is primarily sand, with small seagrass meadows around the lower Perdido Islands, and small isolated pockets of salt marsh occur on undeveloped shorelines. We sampled East Bay within the Pensacola Bay system. The southern shore is dominated by private house lots with armored shorelines. The northeast shoreline is mostly natural sandy beach backed by woodland, while the northwest shoreline is a mixture of private lots with armored shoreline, and natural shorelines backed by marsh or woodlands. The catchments feeding into Perdido and East Bays receive 1600-1800 mm of rainfall per year [32], and the bays experience diurnal tides with an average range of 0.38 m [33].
Sampling was conducted in lower Perdido Bay during October of 2019 and 2020, and on restored oyster reefs in East Bay, Pensacola, in April and May 2021. Our primary objective was to assess the use of video for deriving density estimates from a variety of complex habitats, not to comprehensively evaluate spatial and temporal patterns in fish assemblage structure. Sampling the different habitats and bays in different years and seasons would confound any spatial or temporal patterns, and as such, interpretations should be limited to comparisons among metrics, and not as reflecting robust spatial or temporal patterns among habitats.
In Perdido Bay, sampling was conducted in water depths from 0.6 to 5.5 m across multiple artificial and natural habitats, including a rock jetty, a vertical concrete seawall, a boat ramp dock, restaurant dock, and private docks along three sections of shoreline, an artificial reef constructed from concrete bridge pilings and rubble, and natural seagrass and bare sandy substrate (Figure 1). Together, these represent most of the major habitat types available in the lower bay. Sampling in the Pensacola Bay system focused on 13 restored subtidal oyster reefs in water depths between 1.7 and 3.1 m in East Bay (Figure 1). These reefs were restored in October 2016 by the addition of cultch, a mixture of oyster shell and limestone cobbles, to enhance the three-dimensional reef structure and promote oyster settlement [34].

Field Sampling
Underwater visual point census replicates were collected with unbaited waterproof video cameras following the protocols developed by Bradley et al. [18,26]. It was essential to use the sampling cameras unbaited to address our objectives. We wanted to quantify fish-habitat associations, and baited cameras would draw fish in from adjacent habitats, therefore biasing the results [26]. Similarly, bait aggregates fish within the sampling area, which would inflate and invalidate the density estimates obtained from baited cameras. We used Garmin VIRB XE cameras in 16:9 Zoom mode, 1080p resolution, and 30 frames per second. Individual cameras were mounted on each of 6 camera bases, each comprising a 30 × 30 cm fiberglass mesh base with a vertical aluminum pole with an acrylic side-arm holding the camera mount, such that the camera lens was positioned 31.5 cm above the substrate over the center of the fiberglass base (Figure 2A). The camera was angled such that the upper edge of the field of view (FoV) was slightly above horizontal, while the lower edge was 35 cm forward from the camera pole. The camera in the tilted position rested on a fixed vertical bolt, allowing for easy camera exchange while ensuring the camera position was constant (Figure 2B). A 6 m float line clipped to the top of the pole and fitted with a 15 cm diameter styrofoam float allowed for deployment and retrieval of the cameras. The fleet of 6 cameras was deployed with a minimum spacing of 20 m between cameras to minimize the probability of observing the same fish on multiple replicate videos, and left undisturbed for approximately 15 min before retrieval.
A seventh camera base was set up the same way as the sampling cameras with the addition of a 2.5 m PVC pole extending horizontally through the center of the FoV. The pole was marked with black electrical tape to create alternating 10 cm black and white bands for the length of the pole (Figure 2C). This visibility camera was deployed centrally within each group of 5-6 sampling cameras (e.g., on a patch of reef or seagrass), or at either end of a row of 5-6 sampling cameras (e.g., a series of docks along a shoreline) to estimate visibility for each set of point census videos. Visibility camera deployments lasted 2-3 min, long enough for the camera to settle and provide a clear view of the visibility pole unaffected by disturbed sediment.

Video Data Processing
In the laboratory, the visibility videos were analyzed to identify sets of replicate point census videos with acceptable visibility for data extraction (≥0.5 m) [26], and to allow calculation of the area within the FoV of those videos. For the 2019 and 2020 samples from Perdido Bay, all visibility distances were estimated by one observer, being the one most experienced with extracting data from point census videos. The radius of the FoV was estimated as the distance from the lens at which the observer was confident they would observe and identify fish under the prevailing conditions. The radius was measured to the nearest 0.1 m using the visibility pole. Since fish were usually not observed within the visibility videos, the radius was estimated based on visible features on the substrate such as animal burrows, rocks, or shell fragments, not the maximum range at which the black and white bands on the visibility pole could be seen. This is because the strongly contrasting black and white bands on the visibility pole are more visible than fish towards the outer edge of the FoV, and hence the maximum visible distance on the pole would overestimate the area within the FoV in which fish can be accurately identified and counted.
All 2021 visibility samples from Pensacola Bay were analyzed three times independently by each of three observers, with a minimum of two days between repeat analyses by individual observers. For further analyses, visibility in each of the Pensacola videos was defined as the mean of the nine estimates (three estimates from each of three observers). When ≥5 of the 9 independent scores of a single visibility video indicated <0.5 m visibility, the visibility was recorded as <0.5 m. When <5 scores were estimated as <0.5 m, the mean of visibilities ≥0.5 m was used as the final assigned visibility for that video. To assess precision in visibility estimates, the coefficient of variation (CV) was calculated as a measure of within- and between-observer variability in visibility radius estimates from the Pensacola videos. CVs were calculated for all videos with an estimated visibility ≥0.5 m, and for subsets of videos with different ranges of estimated visibility (0.5 to 1.0 m, >1.0 to 1.5 m, >1.5 m).
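The rule for combining the nine Pensacola scores into one assigned visibility can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and representing a score of "<0.5 m" as `None` is an assumption made here for convenience.

```python
from statistics import mean

CUTOFF_M = 0.5      # minimum acceptable visibility stated in the text
MAJORITY = 5        # ">=5 of the 9" scores, as stated in the text


def assign_visibility(scores):
    """Combine nine visibility scores (metres) into one assigned value.

    A score recorded only as "<0.5 m" is passed as None (an assumption of
    this sketch). Returns None when the video is classed as <0.5 m,
    otherwise the mean of the scores that are >= 0.5 m.
    """
    below = [s for s in scores if s is None or s < CUTOFF_M]
    if len(below) >= MAJORITY:
        return None  # recorded as <0.5 m, so not used for density
    usable = [s for s in scores if s is not None and s >= CUTOFF_M]
    return mean(usable)


# Hypothetical scores: three observers x three repeat estimates.
example = [0.8, 0.9, 0.8, None, 0.9, 0.8, 0.9, 0.8, 0.9]
assigned = assign_visibility(example)
```

Note that a video with one to four sub-cutoff scores is averaged over the remaining scores only, so the assigned value is never pulled below 0.5 m by the excluded scores.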
When a single visibility radius was sampled centrally to a set of 5-6 point census samples, that visibility radius was assigned to all cameras within that set. When a set of point census samples was bracketed by a pair of visibility videos, the visibility radii usually agreed within 0.1 m, and the mean of the bracketing visibility radii was assigned to all cameras in the set. When visibility radii varied by ≥0.2 m between bracketing visibility videos, point census videos were assigned the visibility radius of the nearest visibility video.
For each video with acceptable visibility, we analyzed a 10 min clip from the total 15 min recording. At least the first and last 1 min were disregarded from each video to minimize gear avoidance or attraction caused during deployment and retrieval of each camera. When sediment or debris was disturbed during deployment and limited the field of view for longer than 1 min (very rare), the starting time for the 10 min clip for analysis was further delayed until visibility cleared, up to a maximum of 4 min into the 15 min recording. From the resulting 10 min analysis clip, we extracted three metrics. Each taxon observed in each video was identified to the lowest practicable taxonomic level and recorded, allowing the calculation of the FoO of each taxon. MaxN was recorded as the maximum number of individuals of each taxon visible in a single frame of each video [27,28]. MeanCount [19] was recorded as the mean number of each taxon observed in 10 randomly selected frames from within each 10 min clip, with a minimum of 20 s between each random frame.
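The frame subsampling and the two count metrics can be illustrated with a short sketch. The text does not say how the random frames were drawn, so the offset approach below is only one assumed way to satisfy the 20 s minimum spacing. All function names and the example counts are hypothetical.

```python
import random


def pick_frame_times(clip_s=600.0, n_frames=10, min_gap_s=20.0, rng=None):
    """Draw n_frames random times (s) in a clip, at least min_gap_s apart.

    Assumed approach: draw sorted uniform times in a shortened interval,
    then shift the i-th time by i * min_gap_s so every gap is >= min_gap_s.
    """
    rng = rng or random.Random()
    slack = clip_s - (n_frames - 1) * min_gap_s
    if slack < 0:
        raise ValueError("clip too short for the requested spacing")
    base = sorted(rng.uniform(0.0, slack) for _ in range(n_frames))
    return [t + i * min_gap_s for i, t in enumerate(base)]


def mean_count(counts_in_sampled_frames):
    """MeanCount: mean number of individuals over the sampled frames."""
    return sum(counts_in_sampled_frames) / len(counts_in_sampled_frames)


def max_n(counts_in_all_frames):
    """MaxN: largest number of individuals visible in any single frame."""
    return max(counts_in_all_frames, default=0)


# Hypothetical counts of one taxon in the ten sampled frames of one video.
sampled = [0, 0, 2, 1, 0, 0, 0, 3, 0, 0]
times = pick_frame_times(rng=random.Random(1))
mc = mean_count(sampled)
```

Because MaxN scans the whole clip while MeanCount uses only ten frames, a taxon can have a non-zero MaxN and a MeanCount of zero in the same video.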
The horizontal FoV in air of our cameras under the settings used is 90.9°. Based on Snell's Law and a refractive index of seawater of 1.35, this corresponds to a horizontal FoV in seawater of 63.7°. To convert MeanCount values into density estimates, we calculated the area of the FoV based on the visibility radius, r, and the horizontal FoV of the camera in degrees, θ, using the equation: area = πr² × θ/360. To assess the effects of errors in r estimates on the estimated area sampled, we examined the distribution of maximum differences in visibility radius among the nine observations per East Bay visibility video (three independent observers × three replicate observations). We then used a conservative estimate of visibility error to propagate error in radius to error in area sampled.
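The conversion from the in-air FoV to the in-water FoV, and from MeanCount to density, can be reproduced with a few lines of arithmetic. The sketch below assumes refraction at a flat port applied to the half-angle and an air refractive index of 1, which recovers the 63.7° reported above. The function names are hypothetical.

```python
import math


def fov_in_water_deg(fov_air_deg=90.9, n_water=1.35):
    """Horizontal FoV in water from Snell's Law, applied to the half-angle.

    Assumes a flat port and a refractive index of 1.0 for air.
    """
    half_air = math.radians(fov_air_deg / 2.0)
    half_water = math.asin(math.sin(half_air) / n_water)
    return math.degrees(2.0 * half_water)


def fov_area_m2(radius_m, fov_deg):
    """Area of the circular sector seen by the camera: pi r^2 * theta/360."""
    return math.pi * radius_m ** 2 * fov_deg / 360.0


def density_per_m2(mean_count, radius_m, fov_deg):
    """Density (individuals per m^2) = MeanCount / area sampled."""
    return mean_count / fov_area_m2(radius_m, fov_deg)


theta = fov_in_water_deg()            # about 63.7 degrees
area_1m = fov_area_m2(1.0, 63.7)      # about 0.556 m^2 at 1 m visibility
```

Because the area scales with r², halving the visibility radius quarters the area sampled, so the same MeanCount implies four times the density.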

Data Analysis
We used non-metric multidimensional scaling (nMDS) to compare fish assemblages among habitats based on (1) Frequency of Occurrence, (2) MaxN, and (3) Density. Our aim was to compare patterns among metrics, and because the different bays were sampled in different years and seasons, interpretations should be limited to comparisons of metrics, rather than representing robust analyses of spatial patterns in fish community structure. Only the taxa present in ≥5% of non-empty videos were included in the analysis. nMDS analyses were performed in R using the vegan package function metaMDS [35] and were based on Jaccard (FoO) and Bray-Curtis dissimilarity indices (MaxN and Density). We then fit 95% confidence ellipses over habitat clusters. Assemblage-habitat relationships were examined by fitting vectors onto the nMDS ordination space using the vegan package function envfit.
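For readers unfamiliar with the two dissimilarity indices, the sketch below computes them for a pair of videos using their usual textbook definitions. It is written in plain Python for illustration only and is not a substitute for the vegan functions used in the analysis. The function names and example counts are hypothetical.

```python
def jaccard_dissimilarity(a, b):
    """Jaccard dissimilarity on presence/absence: 1 - shared / union.

    a and b are per-taxon values for two videos, in the same taxon order.
    """
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        raise ValueError("undefined when both samples are empty")
    return 1.0 - len(present_a & present_b) / len(union)


def bray_curtis_dissimilarity(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    total = sum(x + y for x, y in zip(a, b))
    if total == 0:
        raise ValueError("undefined when both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical MaxN values for three taxa in two videos.
video_1 = [2, 0, 1]
video_2 = [1, 1, 1]
bc = bray_curtis_dissimilarity(video_1, video_2)
jd = jaccard_dissimilarity(video_1, video_2)
```

Both indices are undefined for a pair of empty samples, which is one practical reason for restricting the ordination to non-empty videos.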
The relationships between density, FoO, and MaxN were assessed for the three most common fisheries species: two structure-oriented species (Lutjanus griseus, Archosargus probatocephalus) and one more mobile taxon (Mugil spp.), to determine the efficacy of each metric for fish exhibiting varying behaviors. Metrics were calculated as the mean for each habitat type for each species, and relationships between metrics were examined by Pearson correlations. All analyses were performed in RStudio with R language for statistical computing [36,37].
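As an illustration of the metric comparison, the sketch below computes a Pearson correlation between two sets of habitat-level means. The values are hypothetical and are not taken from the study. The analysis itself was run in R.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical habitat means for one species: FoO (%) and MaxN.
foo_by_habitat = [10.0, 40.0, 70.0, 90.0]
maxn_by_habitat = [0.2, 1.1, 2.4, 3.0]
r = pearson_r(foo_by_habitat, maxn_by_habitat)
```

A habitat where a taxon was seen but never captured in a MeanCount frame contributes a density of zero alongside a non-zero FoO, which weakens the FoO-density correlation for that taxon.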

Assemblage Composition
We collected a total of 333 videos with acceptable visibility from which we could estimate densities, and 341 videos that provided FoO and MaxN values (Supplementary Materials Table S1). Eight of the videos from which we extracted FoO and MaxN were excluded from the density dataset due to structures obstructing part of the FoV (e.g., dock pilings) or cameras falling over, but still provided a clear FoV to derive FoO and MaxN. The ten taxa present in ≥5% of non-empty videos were dominated by important fisheries species, including grey snapper Lutjanus griseus, mullet Mugil spp., and sheepshead Archosargus probatocephalus (Table 1). Although not all mullet observed could be positively identified to species, the majority of individuals were identified as M. cephalus and no other species were identified. Grey snapper and mullet were the most frequently observed taxa across a variety of habitats in Perdido Bay but were not observed on the restored reefs of East Bay, while hardhead catfish Ariopsis felis was the most frequently observed taxon on the East Bay reefs. The nMDS plots revealed similar patterns of fish assemblage composition among habitats regardless of which metric was considered (Figure 3). The restored oyster reefs in East Bay had the most distinct fish assemblage, characterized by the hardhead catfish Ariopsis felis, and to a lesser extent, Gobiidae/Blenniidae and the spotted seatrout Cynoscion nebulosus. Within the Perdido Bay habitats, the high-relief habitats, including the seawall, jetty, docks, and Bayou Saint John (BSJ) Reef, were characterized by grey snapper and sheepshead, while sand and seagrass were dominated by Clupeidae, Lagodon rhomboides, and Gerreidae (Figure 3). The relationship of Mugil spp. with the ordination space was more variable among metrics (Figure 3). Mugil spp. was widespread among habitats in Perdido Bay and absent from the restored reefs in East Bay (Table 1), and the FoO and other metrics for Mugil spp. were poorly correlated (Figure 4).

Figure 3 (caption, partially recovered). Symbols correspond to the sampled habitat. Only the taxa present in ≥5% of non-empty videos were included in the analysis. Community vectors were fit onto the ordination space and standard deviation ellipses show 95% confidence areas for the ordination space of each habitat. Note that for (C) Density, there is no ellipse for Seawall as n = 1. Symbol colors as per Figure 1 Legend.

Comparison of Metrics for Common Taxa
The three abundance metrics were strongly positively correlated for grey snapper (Figure 4). In all habitats where grey snapper (Lutjanus griseus) were observed (i.e., nonzero FoO and MaxN), we successfully derived density estimates, meaning that when a grey snapper was observed in a habitat, it was also observed in at least one of the subsampled MeanCount frames used to calculate density. Sheepshead (Archosargus probatocephalus) also showed strong correlations among metrics, but were present at lower FoO, MaxN, and density than grey snapper (Figure 4). Although sheepshead were observed in 4 of the 9 samples from the jetty at the entrance to Perdido Bay (FoO 44%, green symbols in Figure 4), they were not present in any of the subsampled MeanCount frames in videos from that habitat, hence returning zero density for the jetty habitat. Mugil spp. was the second most frequently seen taxon, and it showed a poor correlation between FoO and either MaxN or density, while MaxN and density were more strongly correlated (Figure 4). In several habitats in which Mugil spp. were observed at varying frequencies, no individuals were observed in any of the MeanCount subsample frames, resulting in zero density values for those habitats.

Precision in Estimating Area Sampled
Out of 127 East Bay visibility-camera videos, 66 had visibility classified as ≥0.5 m, with 61 videos below the cutoff value (<0.5 m). Using CV as a measure of dispersion for videos with ≥0.5 m visibility, we found little variation in repeated estimates of horizontal visibility either within or among repeated observations of three observers (within observers A, B, C: CV_A = 0.054, CV_B = 0.033, CV_C = 0.041; among observers CV = 0.045) (Supplementary Materials Table S2). We calculated the maximum discrepancies in visibility estimates (among 3 independent estimates by each of 3 observers = 9 observations per video) for all videos with visibility ≥0.5 m. The median value of visibility discrepancies was 0.1 m, and the median value did not change when considering subsets of videos at ranges of 0.5 to 1.0 m (n = 46), >1.0 to 1.5 m (n = 15), or greater than 1.5 m (n = 5). Therefore, we considered 0.1 m to be a conservative estimate of visibility error and used this value to calculate a magnitude of error in the resulting sampled area (Figure 5). At a visibility range of 1 m, our videos sampled an area of 0.556 m². Assuming a ±0.1 m error in our estimated visibility translates to the area sampled being between 0.450 and 0.673 m² (Figure 5). The magnitude of error in the estimated area sampled increases gradually at higher visibilities, though error is greater in proportion to the estimated area at lower visibilities. A majority of the East Bay visibility videos (n = 55) had estimated visibilities of less than 1.2 m (Figure 5), not including the 61 videos with visibility below the cutoff value.
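The precision measure and the propagation of a ±0.1 m radius error into the area sampled can be checked with a short sketch. It assumes the CV is the sample standard deviation divided by the mean, which the text does not state explicitly. The function names and the example scores are hypothetical.

```python
import math
from statistics import mean, stdev

FOV_WATER_DEG = 63.7  # horizontal FoV in seawater reported in the methods


def coefficient_of_variation(values):
    """CV = standard deviation / mean (sample SD assumed in this sketch)."""
    return stdev(values) / mean(values)


def sector_area_m2(radius_m, fov_deg=FOV_WATER_DEG):
    """Area sampled for a given visibility radius: pi r^2 * theta/360."""
    return math.pi * radius_m ** 2 * fov_deg / 360.0


def area_bounds(radius_m, err_m=0.1):
    """Return (lower, estimate, upper) area for radius +/- err_m."""
    return (sector_area_m2(radius_m - err_m),
            sector_area_m2(radius_m),
            sector_area_m2(radius_m + err_m))


lo_1m, mid_1m, hi_1m = area_bounds(1.0)   # about 0.450, 0.556, 0.673 m^2
lo_2m, mid_2m, hi_2m = area_bounds(2.0)

# Absolute width grows with radius while the relative width shrinks.
rel_width_1m = (hi_1m - lo_1m) / mid_1m   # about 0.40
rel_width_2m = (hi_2m - lo_2m) / mid_2m   # about 0.20

# Hypothetical repeat estimates (m) by one observer for one video.
cv_example = coefficient_of_variation([0.9, 1.0, 1.0])
```

The relative width works out to 4 × err / r, so a fixed 0.1 m error matters about twice as much at 1 m visibility as at 2 m.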

Comparison of Metrics for Quantifying Fish Assemblages
This study revealed a high abundance of important fisheries species across a variety of complex artificial and restored habitats in two modified estuaries. While the loss of productive habitats from coastal systems is an ongoing and serious problem for biodiversity and the support of fisheries [3][4][5]30], much of the infrastructure added to these waters has some level of habitat value. Quantifying fish assemblages across a variety of constructed coastal habitats can help identify which habitats or structural features are particularly attractive to fisheries species, and which are of little value and rarely occupied [9,11]. Such information can help guide the design of future infrastructure to maximize fish habitat values [38].
The spatial patterns in fish assemblage composition across a variety of complex coastal habitats were similar regardless of which metric was examined. MaxN is a widely used index of relative abundance in video sampling [27,29]. Our study found that patterns of fish distribution among habitats were similar when quantified by MaxN or FoO, and for individual structure-oriented species such as grey snapper and sheepshead, MaxN and FoO were highly correlated. Compared to structure-oriented grey snapper and sheepshead, mullet are more mobile fish, and were observed across a variety of constructed and natural habitats, both complex and unstructured bottom. The poor correlations between FoO and either MaxN or density reflect this distribution pattern, and are typical of mobile and schooling species where patterns of occurrence and relative abundance are not tightly correlated due to the infrequent occurrence of large numbers of individuals [39]. Regardless of gear type, the quantification of patchy and schooling species requires a higher level of replication than for site-attached or more sedentary species [15]. We suggest that for studies seeking to describe spatial and temporal patterns in fish communities across a variety of complex habitats, simply recording the presence of each positively identified taxon to derive FoO will provide a robust description of assemblage composition. Not only is FoO logistically simpler and cheaper to extract when processing videos, it may reduce the biases in the more quantitative metrics, introduced by variable visibility among replicates, the variable probability of detecting and identifying different taxa and size classes, and the effects of variable distribution and schooling behavior among species [26,39].
FoO and MaxN are logistically and conceptually simple metrics of relative abundance. However, they fall short of supporting the derivation of production enhancement estimates desired by many coastal resource managers and restoration practitioners for justifying and promoting management and restoration decisions [14]. Using our simple camera systems, we derived density estimates for the more abundant fisheries species from habitats that preclude the use of other quantitative sampling gears. The accuracy of our density estimates is primarily determined by our ability to accurately determine the area sampled in each replicate (see Section 4.2). The derivation of density from videos involved subsampling ten individual frames from each replicate 10 min clip, whereas FoO and MaxN involved the analysis of the entire 10 min of video data. Consequently, for all except the most abundant species observed during our study (grey snapper), there were habitats where a taxon was observed, providing non-zero FoO and MaxN, but was not observed in any of the subsampled MeanCount frames, therefore providing zero density values for that habitat. The method of subsampling each video to derive density estimates [19] will result in many more "empty" videos than when quantifying by FoO or MaxN, and therefore sampling to derive density values will require a larger sampling effort to achieve any given level of replication.

Challenges and Solutions
Accurately estimating the area sampled in each replicate video is the most critical component of deriving robust density estimates. Repeated independent estimates of visibility distance revealed a high level of precision within and among three observers, with most repeat estimates agreeing within 0.1 m. The high repeatability of visibility estimates gives us high confidence in our estimates of the area sampled, at least over the visibility ranges observed in the current study. We see potential for increased error in estimating the visibility range, and hence the area sampled, as visibility extends beyond 2 m. As the visibility range increases, a fixed number of vertical pixels on the screen represents a rapidly increasing horizontal visibility distance, so that a small variation in the vertical location of the visibility horizon will lead to a large variation in the measured visibility distance and hence area sampled. We recommend testing within- and between-observer precision in estimating visibility distance following our methods, and deriving the visibility distance used to calculate the area sampled from three independent observers.
For most of our study, pairs of visibility camera drops bracketing sets of census videos had visibility ranges within 0.1 m, giving confidence that visibility conditions were similar for all census videos in that set. However, in some instances, visibility varied by more than 0.2 m among videos at a single site or habitat, typically associated with depth differences among replicates. During our study, if visibility could not be confidently assigned to a census replicate, the replicate was excluded from analysis. Increasing the number of visibility camera drops when visibility conditions are variable would help improve the accuracy of visibility estimates and confidence in the estimated area sampled.
Under any given visibility condition, the effective area sampled will vary between species, and among size classes within species. For example, the distinct black and white bars and deep body shape of sheepshead make them effectively "swimming secchi discs" that can be confidently identified out to the extreme edge of visibility in any video. Similarly, grey snapper and mullet have distinctive body shapes making them identifiable from a silhouette at a much greater range than smaller or less distinct species. Smaller individuals and similar-shaped species such as Atlantic croaker Micropogonias undulatus and spot Leiostomus xanthurus can be difficult to identify even at shorter ranges than the more distinct species such as snapper and sheepshead. During our study, we were particularly interested in fisheries species, which all had distinctive body shapes and were present at larger sizes. Our visibility range was based on confidence in identifying these target species, and hence we were able to derive robust density values for these. As with any sampling gear, gear efficiency varies among species and size classes, and this should be carefully considered when selecting species for which to derive density estimates from video samples.
Another consideration is that elements of the complex habitats themselves can partially obstruct the field of view. In some videos, large fixed structures, such as rocks, dock pilings, or concrete structures used to form artificial reefs, blocked significant portions of the FoV, and those videos were excluded from density analyses. In many cases, however, smaller obstructions did not preclude analysis of the video. The MeanCount subsampling involved sampling ten individual random frames in each video, and analysis of each frame involved watching the video for around 10 s on either side of the target frame to maximize the probability of counting stationary or cryptic individuals present in the target frame. Consequently, if an individual of a target species was present in the FoV of the subsample frame but hidden behind an obstruction, it would only be missed if it remained hidden for the entire ~20 s viewed. The target fisheries species were never observed sitting stationary within the FoV for any length of time, and hence moderate obstructions from elements of the complex habitats being sampled did not preclude those samples from analysis. In dense seagrass meadows, the seagrass canopy forms an obstruction covering the substrate throughout the FoV. One of the key habitat values of seagrass is the provision of shelter for small nekton avoiding visual predators [40]. Therefore, it is unsurprising that a qualitative comparison of our video samples to seagrass trawl samples from the same bay systems (Heck, unpublished data) revealed that the video under-represented small seagrass-associated nekton, while better representing more mobile fish, including mullet and clupeids.
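The frame subsampling described above can be sketched as follows. This is a minimal illustration, assuming a 600 s (10 min) video, uniformly random target times, and review windows clipped at the ends of the video. The function names, the seeding, and the example counts are hypothetical and are not taken from the study.

```python
import random
from typing import List, Tuple


def sample_review_windows(duration_s: float = 600.0, n_frames: int = 10,
                          half_window_s: float = 10.0,
                          seed: int = 0) -> List[Tuple[float, float, float]]:
    """Pick random target times and the review window around each one.

    Returns (window_start, target_time, window_end) tuples in seconds,
    with windows clipped to the start and end of the video.
    """
    rng = random.Random(seed)  # seeded so the selection is reproducible
    targets = sorted(rng.uniform(0.0, duration_s) for _ in range(n_frames))
    return [(max(0.0, t - half_window_s), t, min(duration_s, t + half_window_s))
            for t in targets]


def mean_count(frame_counts: List[int]) -> float:
    """MeanCount: mean number of individuals across the subsampled frames."""
    return sum(frame_counts) / len(frame_counts)


windows = sample_review_windows()
# Illustrative counts for one taxon in ten frames, not data from the study.
example_counts = [0, 2, 1, 0, 0, 3, 0, 0, 1, 3]
example_mean = mean_count(example_counts)
```

Each window spans roughly 20 s around its target frame, which is the interval an individual would need to remain hidden to be missed from that subsample.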
Any error in estimating the area sampled will add error to our density values, and we have identified and discussed several such issues with the method above. However, many other quantitative sampling gears, such as seines and trawls, also have a variable sampling area, and have variable sampling efficiency within the area sampled due to disturbances during deployment and retrieval of the gear [15]. Even highly quantitative gears have some biases and selectivities [41]. Most studies either disregard gear selectivity or assume it to be constant among species [15] (but see [42]). Ultimately, it is difficult to assess the accuracy of our density estimates in complex habitats that cannot be sampled with other quantitative gears. However, we feel that the density values derived for fisheries species from samples where the FoV was largely unobstructed should be robust and reliable.
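How an error in visibility distance carries through to density can be shown with a short Python sketch. It assumes, for illustration only, that the area sampled is a circular sector whose radius is the visibility distance and whose angle is the camera's horizontal field of view. The sector model, the 90 degree angle, the function names, and all numbers below are assumptions made here rather than values from the study.

```python
import math


def sector_area_m2(visibility_m: float, fov_deg: float) -> float:
    """Area of a circular sector with radius visibility_m and angle fov_deg."""
    return (fov_deg / 360.0) * math.pi * visibility_m ** 2


def density_per_m2(mean_count: float, visibility_m: float, fov_deg: float) -> float:
    """Convert a MeanCount value to individuals per square metre."""
    return mean_count / sector_area_m2(visibility_m, fov_deg)


# Illustrative values only: MeanCount of 1.0, a hypothetical 90 degree
# horizontal field of view, and a visibility of 2.0 m read 0.1 m too high.
true_density = density_per_m2(1.0, 2.0, 90.0)
biased_density = density_per_m2(1.0, 2.1, 90.0)
relative_bias = biased_density / true_density - 1.0
```

Because the area scales with the square of the visibility distance under this model, a 5% overestimate of visibility lowers the density estimate by about 9%.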

Conclusions
As productive coastal ecosystems continue to be modified with infrastructure and restoration efforts, it is critical to understand the functional roles of constructed or modified habitats in supporting fisheries species [31]. Quantitative estimates of fish densities across a range of habitats can guide the design of future infrastructure to maximize benefits for biodiversity [38], and support the derivation of production enhancement estimates in support of decision making [14]. The use of simple underwater video cameras for surveying fish communities is becoming widespread [17], yet the majority of studies derive only indices of relative abundance [19,29]. Our simple method for estimating the area sampled in the field of view offers opportunities to derive density estimates for fisheries species occupying a variety of natural and constructed complex habitats. We have identified some limitations and potential sources of error with the technique, but all sampling methods have their limitations that must be considered when designing field surveys [15]. Many of the limitations can be overcome by increasing replication to allow for replicates that must be discarded due to unfavorable conditions, and a high level of replication is one of the many benefits of sampling with simple video units described here [17,26]. Despite the limitations, the simple method for deriving density values from underwater video offers