A New Method for Extracting Laver Culture Carriers Based on Inaccurate Supervised Classification with FCN-CRF

Abstract: Timely monitoring of marine aquaculture has considerable significance for marine ecological protection and maritime safety and security. Because supervised learning relies on a large number of training samples, and because laver aquaculture zones are densely and regularly distributed, this paper designs an inaccurate supervised classification model based on a fully convolutional neural network and a conditional random field (FCN-CRF) for the study of a laver aquaculture zone in Lianyungang, Jiangsu Province. The proposed model can extract the aquaculture zone and calculate the area and quantity of laver aquaculture nets simultaneously. The FCN extracts the laver aquaculture zone from roughly made training labels. The CRF then extracts the isolated laver aquaculture nets with high precision. The results show that the kappa coefficient of the proposed model is 0.984 and the F1 score is 0.99, indicating outstanding recognition performance. Label production has a high fault tolerance that does not affect the final classification accuracy, which saves labeling time. The findings provide a data basis for future aquaculture yield estimation and offshore resource planning as well as technical support for marine ecological supervision and marine traffic management.


Introduction
Laver aquaculture is an important part of the marine fishery economy and occupies a dominant position in aquaculture. However, with the development of the economy, the rapid growth of laver aquaculture zones has also brought about marine environmental problems, such as green tides [1]. In addition, the large-scale reproduction of Enteromorpha can cover the aquaculture boxes and suspended nets, thereby hiding the mariculture zone, which may affect marine traffic and port transportation. Therefore, monitoring the growth status and coverage of laver and other marine products in a timely manner is highly important [2].
Routine identification and management methods of the laver aquaculture zone mainly include the use of statistics on the sea area application and confirmation of the registration records of the farmers. This method can ensure accuracy of sea area usage statistics but involves a large workload and a long cycle, and thus, it should not be used as the mainstream method for the identification and monitoring of aquaculture areas [3]. Satellite remote sensing technology has become an important means of surface monitoring because it is not restricted by time and space and has a wide coverage area [4]. Various methods have been formulated based on remote sensing technology, including the visual interpretation method, traditional classification method based on spectral statistics, morphological color relationships of each pixel and all other pixels in the image, thereby providing a new idea for accurately extracting culture carriers in the aquaculture zone. Based on the characteristics of dense and regular distribution of seawater and nets in the laver aquaculture zone, an inaccurate supervised classification method based on FCN and CRF is proposed. The Lianyungang laver aquaculture area in Jiangsu Province is taken as the research area to design a classification model that can extract the aquaculture zone and count the area and quantity of the laver aquaculture net simultaneously.

Materials
This study selected the offshore area of Lianyungang City, Jiangsu Province, as the study area to verify the accuracy and application performance of the model. The study area lies between 33°59′-35°07′ north latitude and 118°24′-119°48′ east longitude in the eastern coastal area of China, with the Yellow Sea adjacent to the east. It has a temperate monsoon climate, cold and dry in winter and hot and rainy in summer. Due to its geographical advantages, it is less affected by waves, typhoons, and sea fog. Laver aquaculture zones are spread all over the sea area of Lianyungang, and laver aquaculture is the main economic component of marine aquaculture in Jiangsu Province. This area is therefore a representative choice for the extraction of laver culture carriers. A UAV (Unmanned Aerial Vehicle) image with a resolution of 0.1 m and a size of 75,540 × 75,399 pixels is used in this paper. The image was obtained by aerial photography from a fixed-wing aircraft carrying a 42-megapixel Sony A7R2 camera. A total of 2216 images were acquired over three flights at 500 m above the ground. These images were stitched into the complete experimental image shown in Figure 1.

Principle
Inaccurate supervised classification involves marking the labels roughly and extracting the ground objects finely. In the culture zone, the aquaculture nets and seawater are interleaved and densely arranged, making it cumbersome and time-consuming to mark labels for isolated single aquaculture nets. Therefore, rough labeling and fine classification are carried out in this paper to establish an inaccurate supervised classification model. An example is shown in Figure 2. A rough mark means that the label is an aquaculture area that includes laver aquaculture nets and part of the seawater. A detailed mark means that the label includes only laver aquaculture nets.
The classification model is divided into two parts: FCN and CRF. FCN is used to extract the aquaculture zone and CRF is used to extract the border of the laver aquaculture net accurately and obtain the distribution of the laver aquaculture net to count the quantity and area of the nets.

FCN
Compared with a traditional convolutional neural network, the FCN differs mainly in that it uses convolutional layers in place of the fully connected layers. It can therefore accept input images of any size and achieve pixel-level semantic segmentation through convolution and deconvolution structures [18].
The FCN model is composed mainly of two parts: a coding structure and a decoding structure. In the coding structure, each convolution layer extracts image features within a window through its local receptive field, and each pooling layer condenses the convolution features into higher-level features. Convolution and pooling layers alternate to extract the high-level features of the image. If the high-level features acquired by the coding structure were decoded directly to obtain the corresponding semantic information, boundary information would be lost and the classification result would be rough. Therefore, the decoding process adds a skip architecture, which decodes the high-level features and combines them with the low-level features in the coding structure to refine the output and obtain a finer semantic segmentation. The network architecture is shown in Figure 3.
The upper part of Figure 3 is the coding structure of the FCN model, which is mainly used to extract high-level features step-by-step. The lower part is the decoding structure, which uses the skip architecture (Skip Architecture in Figure 3) to perform label prediction. It up-samples the high-level features extracted by the coding structure and combines them with the low-level features from earlier in the coding structure to obtain the prediction label of the image. The argmax function then returns, for each pixel, the class with the highest probability in the network output to generate the prediction result.
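The skip fusion and argmax step described above can be illustrated with a minimal NumPy sketch. It is not the network used in this paper: nearest-neighbour up-sampling stands in for the learned deconvolution, and the function names `upsample2x` and `fuse_and_predict` are hypothetical.

```python
import numpy as np

def upsample2x(score):
    """Nearest-neighbour 2x up-sampling of an (H, W, C) score map.

    Assumption: a stand-in for the learned deconvolution layer of an FCN.
    """
    return score.repeat(2, axis=0).repeat(2, axis=1)

def fuse_and_predict(coarse, fine):
    """Fuse coarse high-level scores with finer low-level scores, then argmax.

    coarse: (H, W, C) class scores from the deep end of the encoder.
    fine:   (2H, 2W, C) class scores from an earlier encoder layer.
    Returns a (2H, 2W) array of predicted class indices.
    """
    fused = upsample2x(coarse) + fine  # skip connection: element-wise sum
    # Softmax over the class axis (shifted for numerical stability).
    e = np.exp(fused - fused.max(axis=-1, keepdims=True))
    prob = e / e.sum(axis=-1, keepdims=True)
    return prob.argmax(axis=-1)  # class with the highest probability

# One coarse cell favouring class 0; the fine map overrides one pixel.
coarse = np.array([[[2.0, 0.0]]])
fine = np.zeros((2, 2, 2))
fine[1, 1] = [0.0, 5.0]
prediction = fuse_and_predict(coarse, fine)
```

In this toy case the coarse map alone would label all four pixels as class 0, while the fine map supplies the detail needed to relabel one pixel, which is the role the skip architecture plays in recovering boundaries.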
The network model has many adjustable parameters, such as the learning rate, max-iteration, and batch-size. The learning rate is a hyperparameter that controls how strongly the network weights are adjusted along the gradient of the loss function. The lower the learning rate, the more slowly the loss function changes, and the more easily training settles in a local minimum. The max-iteration is the number of times that the network fits and optimizes itself on the training data. It is better to stop iterating after the loss has converged. The batch-size is the amount of data fed to the neural network in each batch, which can be adjusted according to the available memory.

CRF
CRF is a discriminative probabilistic undirected graphical model [26] with significant performance [27,28]. Because the FCN assigns a label to each pixel independently during training, its segmentation may be rough in the details. Therefore, the CRF is used in the network model to optimize the network results [29][30][31]. The input of the CRF contains the original image and the classification result of the FCN. The CRF aims to highlight the boundary information by minimizing an energy function built from the classification result, position information, and channel information of all pixels in the image. In the image, each pixel $i$ has a pixel value $X_i$ and a label value $Y_i$ ($Y_i$ belongs to the label set $L = \{L_1, L_2, L_3, \ldots, L_k\}$). Taking each pixel as a node and the relationship between pixels as an edge, a conditional random field is constructed. The conditional random field model conforms to the Gibbs distribution [32,33] shown in Equation (1):
$$P(Y \mid X) = \frac{1}{Z(X)} \exp\left(-E(Y \mid X)\right) \quad (1)$$
where $Z(X)$ is the partition function and $X$ is the fixed pixel point distribution. For convenience, the representation of $X$ is omitted, and the Gibbs energy of $y \in L$ is expressed as Equation (2):
$$E(y) = \sum_{i} \phi_u(y_i) + \sum_{i<j} \phi_p(y_i, y_j) \quad (2)$$
where $\sum_{i} \phi_u(y_i)$ represents the unary potential function derived from the FCN classification result. The equation is as follows:
$$\phi_u(y_i) = -\log P(y_i)$$
where $P(y_i)$ is the probability that the FCN assigns label $y_i$ to pixel $i$, and $\sum_{i<j} \phi_p(y_i, y_j)$ represents the pairwise potential function, which describes the relationship between pixels. Similar pixels are assigned the same label; otherwise, different labels are assigned. The equation is as follows:
$$\phi_p(y_i, y_j) = \mu(y_i, y_j) \sum_{m} \omega^{(m)} k_G^{(m)}(f_i, f_j)$$
where $\mu(y_i, y_j)$ is the label consistency function, $\omega^{(m)}$ is the weight of the corresponding Gaussian kernel, $k_G^{(m)}$ is the Gaussian kernel, and $f_i$, $f_j$ are the feature vectors of pixels $i$ and $j$.
The two-kernel potentials are defined in terms of the color vectors $I_i$ and $I_j$ and positions $p_i$ and $p_j$:
$$\sum_{m=1}^{2} \omega^{(m)} k_G^{(m)}(f_i, f_j) = \omega^{(1)} \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\theta_\alpha^2} - \frac{\lVert I_i - I_j \rVert^2}{2\theta_\beta^2}\right) + \omega^{(2)} \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\theta_\gamma^2}\right)$$
The first term is the appearance kernel: nearby pixels with similar color are likely to be in the same class. The second term is the smoothness kernel, which depends only on the pixel position and can remove small isolated regions. These two kernels are controlled by the spatial standard deviations $\theta_\alpha$ and $\theta_\gamma$ and the color standard deviation $\theta_\beta$.
The pairwise potential describes the relationship between all pixels in an image and encourages similar pixels to take the same label. Similarity is judged by both the pixel value and the actual relative distance. Therefore, the CRF can compare each pixel in the image with all other pixels and then obtain an accurate classification result under a global field of view based on the FCN result.
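To make the interaction between the unary term and the two Gaussian kernels concrete, the following is a naive mean-field sketch of a fully connected CRF in NumPy. It is an illustration under several assumptions, not the implementation used in this paper: the label consistency function is taken to be a Potts model, the kernel weights default to 1, the full pixel-by-pixel kernel matrix is built explicitly (feasible only for tiny images), and the name `dense_crf_mean_field` is hypothetical.

```python
import numpy as np

def dense_crf_mean_field(prob, image, theta_alpha, theta_beta, theta_gamma,
                         w_app=1.0, w_smooth=1.0, n_iter=5):
    """Refine per-pixel class probabilities with a fully connected CRF.

    prob:  (H, W, K) class probabilities, e.g. a softmax output.
    image: (H, W, C) color image aligned with prob.
    Returns an (H, W) array of refined labels.
    """
    h, w, k = prob.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # (N, 2)
    col = image.reshape(h * w, -1).astype(float)                    # (N, C)

    # Squared distances between every pair of pixels.
    d_pos = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    d_col = ((col[:, None, :] - col[None, :, :]) ** 2).sum(-1)

    # Appearance kernel (position and color) plus smoothness kernel (position).
    kernel = (w_app * np.exp(-d_pos / (2 * theta_alpha ** 2)
                             - d_col / (2 * theta_beta ** 2))
              + w_smooth * np.exp(-d_pos / (2 * theta_gamma ** 2)))
    np.fill_diagonal(kernel, 0.0)  # a pixel does not message itself

    p = np.clip(prob.reshape(h * w, k), 1e-8, 1.0)
    unary = -np.log(p)             # unary potential from the probabilities
    q = p / p.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # With a Potts model, support from similar pixels that already favor
        # a label lowers the energy of assigning that label.
        logits = -unary + kernel @ q
        logits -= logits.max(axis=1, keepdims=True)
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)

    return q.argmax(axis=1).reshape(h, w)

# Uniform image where one pixel weakly disagrees with its neighbors.
image = np.full((5, 5, 3), 128, dtype=np.uint8)
prob = np.zeros((5, 5, 2))
prob[..., 0], prob[..., 1] = 0.1, 0.9
prob[2, 2] = [0.6, 0.4]
labels = dense_crf_mean_field(prob, image, theta_alpha=3, theta_beta=10,
                              theta_gamma=3)
```

The isolated pixel is relabeled to match its surroundings because every other pixel has the same color and favors the other class. Practical implementations avoid the explicit kernel matrix through efficient high-dimensional filtering.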

Methods
The methods in this paper are divided mainly into four parts: image preprocessing, FCN training and classification, accuracy evaluation, and CRF post-processing. The overall process is shown in Figure 4.

Image Preprocessing
To fully train the network, the laver aquaculture zone and seawater in the study area image must be vectorized, that is, image label attributes must be assigned manually. To prevent training from failing because of insufficient GPU (Graphics Processing Unit) memory, the image and label are clipped on a regular grid into slices of 600 × 600 pixels. The clipped data are divided into a training dataset and a validation dataset. The training dataset is mainly used to optimize and adjust the network model parameters. The validation dataset is used to verify the accuracy and generality of the model and to adjust the network hyperparameters. In addition, the test dataset is used for the final experimental classification to complete the semantic segmentation of the image.
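The regular clipping into 600 × 600 slices can be sketched as a generator of read windows. This is an assumption-laden illustration: the paper does not state how partial tiles at the image border are handled, so this sketch simply emits smaller edge windows, and the name `tile_windows` is hypothetical.

```python
def tile_windows(width, height, tile=600):
    """Yield (x_off, y_off, x_size, y_size) windows that cover the image.

    Assumption: border windows are truncated rather than padded or dropped.
    """
    for y_off in range(0, height, tile):
        for x_off in range(0, width, tile):
            yield (x_off, y_off,
                   min(tile, width - x_off),
                   min(tile, height - y_off))

# Windows for an image of 19,988 x 23,949 pixels (the test image size
# reported in this paper).
windows = list(tile_windows(19_988, 23_949))
```

Each tuple has the offset-and-size form commonly used for windowed raster reads, so the same windows can be applied to the image and to its label raster to keep the slices aligned.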

FCN Training and Classification
After the experimental data are produced, the model learns the data features through training. The loss function gives the fitting error, and the back-propagation algorithm adjusts the network parameters until the model reaches its optimal state. Parameters such as the max-iteration and learning rate can be adjusted further according to the model validation accuracy and the final classification effect.

Accuracy Evaluation
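The FCN is compared with maximum likelihood classification (MLC), support vector machine (SVM), and neural network classification (NN) methods using the kappa coefficient, precision, recall, and F1 score. The equations are as follows:
$$\text{precision} = \frac{T_s}{T_s + F_s}$$
$$\text{recall} = \frac{T_s}{T_s + F_o}$$
$$F_1 = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$$
$$\text{kappa} = \frac{p_o - p_e}{1 - p_e}, \quad p_o = \frac{T_s + T_o}{N}, \quad p_e = \frac{(T_s + F_s)(T_s + F_o) + (T_o + F_o)(T_o + F_s)}{N^2}$$
where $T_s$ is the correctly classified aquaculture area, $T_o$ is the correctly classified seawater range, $F_s$ is the wrongly classified aquaculture area (seawater classified as aquaculture), $F_o$ is the wrongly classified seawater range (aquaculture classified as seawater), $N = T_s + T_o + F_s + F_o$ is the total number of pixels, $p_o$ is the observed agreement, and $p_e$ is the agreement expected by chance.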
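The four accuracy indices can be computed from the pixel counts with a short function. This is a sketch using the standard definitions of precision, recall, F1, and Cohen's kappa; it assumes that F_s counts seawater pixels classified as aquaculture and F_o counts aquaculture pixels classified as seawater. The counts in the example are made up for illustration and are not results from this paper.

```python
def classification_metrics(t_s, t_o, f_s, f_o):
    """Return precision, recall, F1, and kappa for the aquaculture class.

    t_s: aquaculture pixels classified correctly.
    t_o: seawater pixels classified correctly.
    f_s: seawater pixels wrongly classified as aquaculture (assumed meaning).
    f_o: aquaculture pixels wrongly classified as seawater (assumed meaning).
    """
    n = t_s + t_o + f_s + f_o
    precision = t_s / (t_s + f_s)
    recall = t_s / (t_s + f_o)
    f1 = 2 * precision * recall / (precision + recall)
    p_o = (t_s + t_o) / n  # observed agreement
    # Chance agreement from the row and column totals of the confusion matrix.
    p_e = ((t_s + f_s) * (t_s + f_o) + (t_o + f_o) * (t_o + f_s)) / n ** 2
    kappa = (p_o - p_e) / (1 - p_e)
    return {"precision": precision, "recall": recall, "f1": f1, "kappa": kappa}

# Illustrative counts only.
metrics = classification_metrics(t_s=90, t_o=95, f_s=5, f_o=10)
```

Kappa is lower than the raw agreement because it discounts the agreement that two random labelings with the same class proportions would reach by chance.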

CRF Post-Processing and the Number and Area Calculation of Aquaculture Nets
The accurate strip distribution of the laver aquaculture nets is obtained by applying CRF processing to the aquaculture zone exported by the FCN. The net area of the aquaculture zone is obtained from the image resolution and the pixel statistics of the results. Finally, raster-to-vector conversion is carried out on the results, and the quantity of nets in the aquaculture zone is obtained by counting the resulting features.
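The area and quantity statistics can be sketched on a small binary mask. The following is an illustration only: counting 4-connected components stands in for the raster-to-vector conversion, net pixels are marked with 1 purely for this example, and the name `count_nets_and_area` is hypothetical.

```python
from collections import deque

def count_nets_and_area(mask, resolution_m=0.1):
    """Count connected net regions and total net area in a binary mask.

    mask: list of rows, where 1 marks a net pixel (assumption for this sketch).
    resolution_m: ground size of one pixel edge in meters.
    Returns (number_of_nets, area_in_square_meters).
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    n_nets = 0
    n_pixels = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            n_nets += 1  # found an unvisited net region
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill over 4-neighbors
                y, x = queue.popleft()
                n_pixels += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return n_nets, n_pixels * resolution_m ** 2

# Two separate strips on a 3 x 4 grid.
strips = [
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 1],
]
n_nets, area = count_nets_and_area(strips)
```

Nets that remain connected after CRF processing would be counted as a single region, which is why the separation quality of the CRF result directly affects the quantity statistic.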

Preparation for the Experiment
To facilitate the experiment, the images are labeled manually and divided into two categories: seawater and laver aquaculture zones. Seawater is assigned attribute 1 and the laver aquaculture zone is assigned attribute 0. The training dataset contains 1800 slice images and the validation dataset contains 200. Some training data are shown in Figure 5. Figure 5 shows that the image size is 600 × 600 pixels. The green part represents the laver aquaculture zone, with label attribute 0, while the black part represents the sea area, with label attribute 1. The labeling process does not distinguish between the seawater and the laver nets inside the mariculture zone. The marked laver aquaculture zone therefore contains a considerable amount of seawater information, which makes it an inaccurate label. The test image size is 19,988 × 23,949 pixels. The experimental model is built in Python with TensorFlow and several libraries such as gdal, numpy, and os. TensorFlow is a typical library for deep learning. It has a complete data flow and processing mechanism and encapsulates a large number of efficient algorithms and functions for neural network construction, which makes it well suited to large-scale machine learning applications. The specific environment configuration is shown in Table 1.

Extracting Aquaculture Areas with FCN
As shown in Figure 6, FCN has three variants: FCN-8s, FCN-16s, and FCN-32s. Therefore, before FCN is used to extract mariculture zones, the different structures are compared and analyzed, and the best scheme is selected. The results are shown in Table 2 and Figure 7.
The comparison of the three FCN structures shows that FCN-8s has the highest classification accuracy and the best output, while the other two methods are more likely to misclassify the fishing boat in the image as laver. FCN-16s uses only one skip structure to assist the decoder in up-sampling and cannot make full use of the low-level features of the mariculture zones extracted by the encoder, so its output is not as good as that of FCN-8s. FCN-32s obtains its output directly by deconvolution, which is too rough to preserve detailed information, and its effect is the worst. Training FCN-8s, FCN-16s, and FCN-32s yields the cost function curves in Figure 8.
As shown in Figure 8, each point of a curve is the average of the loss over 300 iterations. FCN-8s has converged by the 20,000th iteration, and its convergence is the fastest and most stable. Therefore, the experimental model is based on FCN-8s (hereinafter referred to as FCN), and the produced training and validation sets are used to train the network. The learning rate of the model is set to 0.0001 and the batch-size is 8.
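The smoothing used for the loss curve, in which each plotted point is the average loss over 300 iterations, can be sketched as follows. The function name `block_means` is hypothetical, and averaging a shorter final block is an assumption of this sketch.

```python
def block_means(losses, block=300):
    """Average consecutive blocks of per-iteration losses.

    A final block shorter than `block` is averaged over its own length
    (assumption; the paper does not describe this case).
    """
    means = []
    for start in range(0, len(losses), block):
        chunk = losses[start:start + block]
        means.append(sum(chunk) / len(chunk))
    return means

# 30,000 synthetic loss values would give 100 plotted points.
synthetic = [1.0] * 300 + [0.5] * 300
curve = block_means(synthetic)
```

Averaging in blocks removes the batch-to-batch noise of the raw loss, which makes it easier to judge when a curve has flattened.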
The Adam optimizer and the cross-entropy loss function are adopted in the model. Many scholars have used them to optimize networks for identifying aquaculture zones and have achieved remarkable results [34,35]. Adam is an algorithm for first-order gradient-based optimization of stochastic objective functions. It designs independent adaptive learning rates for different parameters and iteratively updates the neural network weights based on the training data. The Adam optimizer is simple, efficient, and requires little memory. The cross-entropy loss function estimates the similarity between the prediction result and the sample label. The equation is as follows:
$$L = -\frac{1}{n} \sum_{x} \left[ y_{lab} \ln y_{pre} + \left(1 - y_{lab}\right) \ln\left(1 - y_{pre}\right) \right]$$
where $y_{lab}$ is the label (0 or 1), $y_{pre}$ is the predicted value, $x$ is the input value, and $n$ is the number of $x$. The smaller the cross-entropy loss, the higher the prediction accuracy. When $y_{pre}$ and $y_{lab}$ are both 0 or both 1, the value of the loss function is 0, indicating that the prediction is correct. This loss function can accelerate the convergence of the network. Finally, the test images are classified by the FCN after 30,000 iterations. The classification results of FCN are compared with those of MLC, SVM, and NN, as shown in Figure 9.
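The behavior of the cross-entropy loss described above can be checked with a small pure-Python sketch. It uses the standard binary cross-entropy averaged over the samples. The clipping constant `eps` is an assumption added to avoid taking the logarithm of zero, and the function name is hypothetical.

```python
import math

def binary_cross_entropy(y_lab, y_pre, eps=1e-12):
    """Mean binary cross-entropy between labels (0 or 1) and predictions.

    y_lab: sequence of labels, each 0 or 1.
    y_pre: sequence of predicted probabilities in [0, 1].
    eps:   clipping constant to keep log() finite (assumption).
    """
    total = 0.0
    for lab, pre in zip(y_lab, y_pre):
        pre = min(max(pre, eps), 1.0 - eps)
        total += lab * math.log(pre) + (1 - lab) * math.log(1.0 - pre)
    return -total / len(y_lab)

perfect = binary_cross_entropy([1, 0], [1.0, 0.0])   # prediction equals label
uncertain = binary_cross_entropy([1], [0.5])         # maximally unsure
confident = binary_cross_entropy([1], [0.9])
hesitant = binary_cross_entropy([1], [0.6])
```

The loss is essentially zero when the prediction matches the label and grows as the predicted probability moves away from it, so a smaller loss corresponds to a more accurate prediction.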
Figure 9 shows that MLC, NN, and SVM can roughly identify the aquaculture area, but the MLC and SVM classification results contain a lot of noise in the sea area. The large fishing boats and unclassified abandoned laver net piles are misclassified, which lowers the classification accuracy. Although the NN results are better in the sea area, the aquaculture zone contains a lot of noise, which destroys the integrity of the laver nets and misclassifies areas with blurred features. In contrast, the FCN identifies the laver aquaculture zones fully without misclassification, and the segmentation results are satisfactory, which makes them suitable for the subsequent extraction step of the inaccurate supervised classification model.
The above methods are evaluated quantitatively by kappa, precision, recall, and F1, as shown in Table 3. Table 3 shows that among the three classification methods MLC, SVM, and NN, NN has the highest precision at 96%, while SVM has the highest recall at 87%. FCN scores higher than the other methods on all accuracy indices. The confusion matrix of the FCN classification results is shown in Table 4. Table 4 shows that the label data of the mariculture zone have 103,255,518 pixels, of which 102,061,026 pixels are classified correctly. The seawater label data have 375,437,094 pixels and the error classification is only 1,194,492 pixels. The recognition effect is good. FCN has high recognition accuracy for the mariculture zone and is sufficient to extract it with high precision. Using the pixel statistics of the results and the 0.1 m resolution, the test image area is 4,786,926.12 m², of which the actual area of the mariculture zone is 1,032,555.18 m² and the predicted area of the mariculture zone is 1,033,634.47 m², a statistical error of 0.1%.
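The area figures follow directly from the pixel counts and the 0.1 m resolution, since each pixel covers 0.01 m². The arithmetic can be reproduced as a worked check; the variable names are illustrative, and all numbers are taken from the text above.

```python
RESOLUTION_M = 0.1                  # ground size of one pixel edge, in meters
PIXEL_AREA_M2 = RESOLUTION_M ** 2   # each pixel covers 0.01 square meters

def area_m2(n_pixels):
    """Convert a pixel count to an area in square meters."""
    return n_pixels * PIXEL_AREA_M2

test_image_area = area_m2(19_988 * 23_949)   # whole test image
label_area = area_m2(103_255_518)            # labeled mariculture pixels
predicted_area = 1_033_634.47                # predicted area reported in the text

# Relative difference between the predicted and labeled mariculture area.
relative_error = abs(predicted_area - label_area) / label_area
```

The test image pixel count equals the sum of the two label pixel counts (103,255,518 + 375,437,094), and the relative error rounds to the reported 0.1%.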

Extracting Laver Aquaculture Nets with CRF
The laver aquaculture nets are extracted by applying CRF processing to the FCN classification result. The FCN output is used to compute the unary potential of each pixel independently, and adjustable parameters such as $\theta_\alpha$, $\theta_\beta$, $\theta_\gamma$, and the number of epochs are set to generate the pairwise potentials of the conditional random field. We designed four experiments to extract the nets and compare their effects; the experimental parameters are shown in Table 5. Since $\theta_\gamma$ had little effect on the result in the experiments, it is fixed at 20. We cut out an image of 4183 × 4825 pixels from the experimental image as the data for this CRF experiment and extracted the nets with each parameter setting. The extraction results of all schemes are shown in Figure 10. Comparing CRF_1 and CRF_4 in the figure shows that the fewer the CRF epochs, the more connections remain between the laver nets and the lower the accuracy. As the number of epochs increases, the connection problem is clearly reduced. When $\theta_\alpha$ and $\theta_\beta$ are set to 500 and 50, the nets can be extracted precisely, and the internal structure of the aquaculture zone is further refined. Because the smoothness kernel is included in the pairwise potentials of the fully connected random field, the fishing boat in the purple box is not treated as laver and is correctly classified as seawater. That is, the CRF accurately separates the seawater and laver strips that were left undivided inside the aquaculture zone during labeling. The shape and size of the extracted nets are similar to those in the image, and the accuracy of the FCN classification results is not degraded. This realizes the inaccurate supervised classification process from the original image to the nets of the aquaculture zone.
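To illustrate how $\theta_\alpha$, $\theta_\beta$, and $\theta_\gamma$ shape the pairwise potentials, the following sketch evaluates the two-kernel pairwise term of the standard fully connected CRF formulation (an appearance kernel plus a smoothness kernel). Whether the paper uses exactly this form is an assumption. The kernel weights `w_appearance` and `w_smoothness` and the function name are hypothetical; only the values 500, 50, and 20 come from the text.

```python
import math

def pairwise_kernel(pos_i, pos_j, color_i, color_j,
                    theta_alpha=500.0, theta_beta=50.0, theta_gamma=20.0,
                    w_appearance=1.0, w_smoothness=1.0):
    """Pairwise kernel value between two pixels (assumed dense-CRF form).

    Appearance kernel: nearby pixels with similar colour get a large value.
    Smoothness kernel: nearby pixels get a large value regardless of colour.
    The weights are hypothetical placeholders, not values from the paper.
    """
    d_pos = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    d_col = sum((a - b) ** 2 for a, b in zip(color_i, color_j))
    appearance = w_appearance * math.exp(
        -d_pos / (2 * theta_alpha ** 2) - d_col / (2 * theta_beta ** 2)
    )
    smoothness = w_smoothness * math.exp(-d_pos / (2 * theta_gamma ** 2))
    return appearance + smoothness

# Two adjacent pixels: similar colours couple strongly, dissimilar ones weakly.
similar = pairwise_kernel((10, 10), (10, 11), (40, 60, 50), (42, 61, 52))
dissimilar = pairwise_kernel((10, 10), (10, 11), (40, 60, 50), (200, 210, 220))
# Two distant pixels with the same colour: the smoothness term has decayed.
distant = pairwise_kernel((10, 10), (10, 200), (40, 60, 50), (40, 60, 50))
print(similar, dissimilar, distant)
```

A larger kernel value means the CRF penalizes assigning different labels to the two pixels more heavily. In this form the coupling falls off with both distance and colour difference, which is consistent with neighbouring net and seawater pixels of different colour being separated during refinement.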
To examine the CRF results in detail, the classification results of the three red areas marked in Figure 10 are shown in Figure 11. Figure 11a,b show that under-segmentation or over-segmentation occurs in fuzzy regions mixed with other features, and that floating objects on the sea are misclassified. For the clear and distinct features in Figure 11c, however, the inaccurate supervised classification model with CRF post-processing extracts the individual nets of the laver aquaculture zone accurately.
The fine area of the laver net strips in the aquaculture area can be obtained from the results of the CRF processing in Figure 10. Finally, the FCN-CRF result can be converted into a vector file through raster-vector conversion, and the number of nets can be counted. The final experimental results are shown in Table 6. According to the experimental calculations, the total area of the image in Figure 10 is 201,829.75 m², of which the aquaculture zone covers 45,301.04 m², or 22.45% of the total study area. The area of the nets in the aquaculture zone is 25,220.45 m², or 55.67% of the aquaculture area, and seawater accounts for the remaining 44.33%. In other words, the labeled laver aquaculture zone contains more than 40% seawater, which is effectively label noise. This shows that the proposed inaccurate classification model has high fault tolerance. The model predicts a total of 1516 aquaculture nets, while visual interpretation gives 1501 nets in the aquaculture zone, a difference of 15 nets (about 1%). The two results are very close, so the predicted count has high reference value.
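The counting step and the percentages above can be sketched as follows. The paper counts nets after raster-vector conversion; this sketch swaps in a simpler technique, 4-connected component labeling on a binary mask, which gives the same count when each net forms one connected region. The toy mask and the function name `count_nets` are illustrative only.

```python
from collections import deque

def count_nets(mask, pixel_area_m2=0.01):
    """Count 4-connected foreground regions and their total area in m^2.

    mask is a list of rows of 0/1 values (1 = net). Connected-component
    labeling stands in here for the raster-vector conversion in the text.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    foreground = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                foreground += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions, foreground * pixel_area_m2

toy_mask = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
]
n_nets, net_area_m2 = count_nets(toy_mask)

# Percentages recomputed from the areas and counts reported in the text.
zone_share = 45_301.04 / 201_829.75
net_share = 25_220.45 / 45_301.04
count_error = (1516 - 1501) / 1501
print(n_nets, net_area_m2, f"{zone_share:.2%} {net_share:.2%} {count_error:.1%}")
```

Nets that remain connected after CRF processing would be merged into one region by this approach, which is one reason the connection problem discussed for Figure 10 matters for the final count.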

Discussion
In this paper, we designed an inaccurate supervised classification model for the extraction of laver aquaculture nets, based mainly on FCN and CRF. In the experiment, three FCN structures (FCN-8s, FCN-16s, and FCN-32s) were compared. Evaluating the accuracy of the extraction results of the three methods under different parameters, and comparing the results visually, shows that FCN-8s with three skip architectures gives the best results and had fully converged by 25,000 iterations. FCN-32s, which uses no skip architecture to combine the low-level features of the encoder, and FCN-16s, which uses only one skip architecture, do not work well. They mistake the fishing boat for laver, lose the boundary information of the aquaculture zones, and generate more noise inside the classified aquaculture zones. The experiments therefore indicate that FCN-8s is the best of the three structures and the most suitable for extracting aquaculture zones in areas with complex features. We then used MLC, SVM, and NN to extract the aquaculture zones. Compared with these methods, the output of FCN has clearer boundaries and less noise. For abandoned aquaculture zones in particular, FCN identifies them better, with almost no misclassification. After using FCN to extract the aquaculture areas, we set up CRFs with different parameters for experiments and comparisons. CRFs run for more epochs have higher recognition accuracy, and the result is mainly affected by two parameters, $\theta_\alpha$ and $\theta_\beta$. After CRF processing, laver strips can be extracted accurately without damaging the output of FCN, and the number and area of nets can be counted from the CRF results.
Previous studies extracted the aquaculture zones in offshore areas using supervised classification but did not separate the culture carriers inside the zones, so they achieved only a rough extraction. Segmenting the internal carriers would require more detailed labels to train the network model, which inevitably increases the workload and preprocessing time of the experiment. The inaccurate supervised classification model we propose can not only extract the laver aquaculture zones but also accurately obtain the area and quantity of culture carriers, which benefits the management of marine resources. However, our research still has many limitations. For example, floating objects near the aquaculture zones are easily mistaken for laver culture carriers, which affects the final area statistics. In addition, although the labels of the aquaculture zone used in this paper contain more than 40% seawater, the model tolerates this noise well. This is mainly because the features in our study area are relatively simple: they include only aquaculture zones and seawater, which is quite advantageous for our model. Much work remains to improve the generality of the model.

Conclusions
This paper designed an inaccurate supervised classification model that addresses two points: the intensive and regular distribution of laver aquaculture zones, and the large number of samples required by supervised learning. The proposed model can extract the aquaculture zone and count the area and quantity of laver aquaculture nets simultaneously. The study focused on the Lianyungang laver aquaculture zone in Jiangsu Province. The conclusions are as follows.
(1) The kappa coefficient of the classification results obtained by FCN-8s reaches 0.984 and the F1 is 0.99, which shows that the FCN network model can classify laver with high accuracy.
(2) With CRF post-processing, individual laver aquaculture nets can be separated accurately, and the overall effect is positive. This shows that the proposed model can extract the area and quantity of the laver cultivation nets well and has high reference value.
(3) The inaccurate supervised classification model can effectively identify the laver aquaculture zones and has high fault tolerance, which satisfies the aim of inaccurate supervised classification, namely fine classification from coarse labels. It saves considerable labeling time without affecting the final classification accuracy. The results provide a data basis for future laver yield estimation and offshore resource planning, as well as technical support for marine ecological regulation and maritime traffic management.
Although the model proposed in this paper has achieved certain results, much work remains for the future. For example, the recognition accuracy for the target object needs to be improved by adding post-processing operations that prevent floating objects at sea from being misclassified. In addition, building a more complete network model by adding experimental data from complex regions is an essential research direction. Such work would extend fine classification beyond scenes with a single feature type to more complex scenes.