Remote Sensing Image Scene Classification in Hybrid Classical–Quantum Transferring CNN with Small Samples

The scope of this research lies in the combination of pre-trained Convolutional Neural Networks (CNNs) and Quantum Convolutional Neural Networks (QCNN) in application to Remote Sensing Image Scene Classification(RSISC). Deep learning (RL) is improving by leaps and bounds pretrained CNNs in Remote Sensing Image (RSI) analysis, and pre-trained CNNs have shown remarkable performance in remote sensing image scene classification (RSISC). Nonetheless, CNNs training require massive, annotated data as samples. When labeled samples are not sufficient, the most common solution is using pre-trained CNNs with a great deal of natural image datasets (e.g., ImageNet). However, these pre-trained CNNs require a large quantity of labelled data for training, which is often not feasible in RSISC, especially when the target RSIs have different imaging mechanisms from RGB natural images. In this paper, we proposed an improved hybrid classical–quantum transfer learning CNNs composed of classical and quantum elements to classify open-source RSI dataset. The classical part of the model is made up of a ResNet network which extracts useful features from RSI datasets. To further refine the network performance, a tensor quantum circuit is subsequently employed by tuning parameters on near-term quantum processors. We tested our models on the open-source RSI dataset. In our comparative study, we have concluded that the hybrid classical–quantum transferring CNN has achieved better performance than other pre-trained CNNs based RSISC methods with small training samples. Moreover, it has been proven that the proposed algorithm improves the classification accuracy while greatly decreasing the amount of model parameters and the sum of training data.


Introduction
In recent times with the rocketing progress of the earth observation capability and the artificial intelligent technology, the progression and innovation of remote sensing (RS) are more important than ever.RSI data processing is now developing towards RSI the age of big data, which relies on data model by data-driven intelligent analysis [1] (refer to Table 1 for all acronyms that are used throughout the paper).Machine learning (ML) plays an essential role in extracting underlying features or a probability distribution of patterns in RSI datasets, such as damage prediction, target recognition or image classification in hitherto unknown domains [2].In particular, deep learning (DL) has proven to be a technological innovation and a historic milestone in many fields [3].It emphasizes neural networks (NNs) involving multiple hidden layers, which use feature representations learned exclusively from the data, instead of handcrafted features that are designed based mainly on domain-specific Various deep learning models have been developed with outstanding performance for image classification on RSI datasets in multiple applications.Recently, significant progress has been made in the methods of RSISC.Early studies involving RSISC in CNN were transplanted from natural image data, which led to the increasing of the difficulty of the RSISC task in complex RSIs [4].Since then, researchers havededicated to replace hand-engineered features with trainable multilayer networks, and several deep learning models have demonstrated impressive feature representation capability, applicable across a wide range of domains, including RSISC [5].Deep learning algorithms have opened up an entirely novel frontier of learning algorithms including CNN techniques that have been adopted in the RSISC [6].CNNs can extract hierarchical and insightful features automatically from the massive size of image data [7].
In recent years, there has been significant progress in RSISC methods using CNNs [2,3,8].Despite the extraordinary achievement that CNN-based methods have enabled in RSISC, the models of the network construction remain a huge challenge, which is significant for the efficiency of the CNN models [7].CNN architecture construction requires skill in both DL methodology and professional expertise [9].In practice, handmade architecture design is complicated and fallible; moreover, the requirement of number of expertise and professionals in both DL and the investigated domain is often unrealistic in practical applications.
In this context, most studies on CNN-based RSISC have utilized networks that are pre-trained on different networks, for instance AlexNet [10], VGGNet [11], GoogleNet [12], and ResNet [13], standing out among numerous DL-based methods due to closely related powerful features extraction performance.Since both assignments involving RSIs data and the pre-trained networks can easily be operated for transferring in RSISC.Although, RSIs differ significantly in terms of spectral resolution, spatial resolution, and radiometric resolution, etc. [14].Researchers have used pre-trained CNN models as features extractors to extract high semantic features.These methods have limitation: It has no advantage if significant domain differences exist between source and target datasets.
Fine-tuning CNN-based models is an effective tool for RSISC.However, such approaches still have limitations in three perspectives related to datasets, models, and labels.These limitations are discussed below.
Firstly, the dataset plays an important role in advancing RSISC.The Size-Similarity Matrix of dataset determines the choices of the pre-trained model.This matrix classifies the strategies based on the size of the dataset and its similarity to the dataset in which the pre-trained model was trained.Due to the matrix transformation conducted in the fully connected layers, the dataset must be transformed into a certain fixed size.The most common approach is to crop and interpolate the original image, which inevitably results in a loss of fine information from the original RSIs.Unfortunately, either of these approaches can be detrimental to the performance of RSISI.Therefore, the selection of the pre-trained models depends on the dataset's size and size-similarity.The RSIs dataset must be transformed into a special fixed size [15].
Secondly, from a modeling perspective, most pre-trained models can achieve excellent classification performance.However, these models also have some limitations.The learned feature may be not entirely satisfactory for the properties of target datasets, and pre-trained treatments applied between heterogeneous networks require manual manipulations of layer combinations, depending on type of task.The RSISC task, in particular, requires a massive tuning procedure to achieve optimal classification performance.
Thirdly, many scene images with identical exteriors have different but correlated scene contents, which can lead to the label paradox.Because an RSI contains various objects, such as water, mountains, bare land, buildings, etc., which may be covered in a residential scene as shown in Figure 1, the variance and complex spatial distributions of the scene contents cause the diversity of RSISC.Because of the different varieties in the exteriors of ground objects within the similar semantic information, traditional methods of RSISC often fail to achieve satisfactory classification accuracy.In addition, low scene category separability has been caused by the presence of the same scene contents within different scene categories or high semantic superimposition between different scene types.Another important difficulty of RSISC is the large variance of object scales caused by sensor imaging altitude variation.Furthermore, the label distribution of ground objects in datasets is irregular and often unavailable in the original training dataset, which is typically caused by resembling the sample dataset that has different but correlated labels, as can be seen in Figure 1 [16].Both RSISC and quantum computing are emergent techniques that have the potential to transform the research and application of RS.Quantum computing can provide significant advantages in terms of improving the performance of classification techniques.The great potential of quantum computing in ML has been actively investigated.In specific cases, traditional machine learning tasks can be improved with exponential acceleration when it operates on a quantum computer [18,19].The Harrow-Hassidim-Lloyd (HHL) algorithm is a quantum computing that has been successfully implanted in conventional ML theories and topics, such as data mining, artificial neural networks, computational learning theory, etc., as shown in [20][21][22][23][24]. HHL-based algorithms are based on the quantum phase estimation algorithm, which operates in a high-depth quantum circuit [25].To bypass this strict requirement of hardware, classical-quantum hybrid algorithms containing a low-depth quantum circuit, for instance, the variational quantum eigensolver (VQE) and the quantum approximate optimization algorithm (QAOA) have been suggested [26,27].The idea of a hybrid algorithm is to divide the problem into two components, either of which can be operated on a quantum and a classical computer separately.Cai verified the feasibility of a K-means algorithm on a quantum computer [28].Otgonbaatar proposed a parameterized quantum circuit (PQC) with only 17 quantum bits for classifying a two-label Sentinel RSI dataset.Quantum-based pseudo-labelling for hyperspectral imagery classification is demonstrated by Shaik [22].Noisy intermediate-scale quantum (NISQ) devices are an advanced quantum computing technology providing solutions for large-scale and complex practical quantum computation, such as solving high-complexity problems or supporting ML algorithms.An NISQ device is termed a variational quantum circuit (VQC) when it has parameterized quantum gates and parameterized quantum circuit.Sometimes, the VQC is a supervised learning algorithm in which quantum neural networks (QNNs) are trained to perform a classification mission.
A transformer-based model is one of the successful approaches of DL and more efficient than other traditional CNN models in a variety of downstream applications.Transformer-based concepts in in the context of quantum machine learning consist mainly of three variants: classical-quantum (CQ), quantum-classical (QC), and quantum-quantum (QQ) [23].The CQ concept can achieve better result in comparison with other transformerbased concept tests and classical-classical concept tests.Therefore, the CQ concept has potential with currently available quantum computers.
The purpose of this paper is to investigate the feasibility and the efficiency of the hybrid classical-quantum transferring CNN model for RSISC.Our study aims to demonstrate that hybrid classical-quantum transferring CNN models can significantly enhance the efficiency and performance of RSISC with small training samples.We focus on the hybrid models scenario, where tensor quantum circuits and CNNs can be jointly trained to achieve RSISC.With the advent of NISQ devices, CQ transfer learning has become particularly attractive, since, with the application of a classical-quantum neural network with large input images, it is possible to efficiently extract highly informative features with a VQC.It is advantageous, since it makes the application of the potential of quantum physics theory-superposition and entanglement, coordinated with the mature methods of classical ML-possible.Accordingly, tensor quantum circuits can be regarded as common quantum feature extractors, which can mimic famous classical networks that are often used as pre-trained networks.
The transfer learning approach in the quantum domain has been largely unexplored, with only very few applications, such as modeling quantum Many-Body Systems (MBSs), associated with classical autoencoders to a quantum Boltzmann machine, and initializing variational quantum networks.In this paper, an innovative, self-designed, and systematic theory, through a concept of tensor quantum circuits and pre-trained networks, is proposed.
We present a model-theoretic approach and provide proof-of-principle examples for practical implementation through numerical simulations.Additionally, we test our model experimentally on physical quantum processors, successfully conducting RSISC with a hybrid CQ system.The remaining sections of this work are introduced as follows: Section 2 introduces some notations about hybrid classical-quantum networks.Section 3 outlines the architecture of the hybrid classical-quantum transferring CNN.Section 4 reports the evaluation and results of our model.Finally, this paper is concluded in Section 5.

Quantum Encoding
The first step is the transformation of the classical image dataset into quantum state; the process denotes the quantum encoding (QE).Most QE methods can be seen as parameterized circuits acting on initial states; the parameters in the parameterized circuits are determined via classical image data.Generally, QE methods can be divided into basis encoding, amplitude encoding, angle encoding, instantaneous quantum polynomial(IQP) encoding, and Hamiltonian evolution ansatz encoding [29][30][31][32].IQP encoding can be achieved more easily than the other QE methods, so it is applied to our proposed algorithm.Then, IQP quantum encoding describes the following quantum state as Equation (1).

Variational Quantum Circuits
VQCs usually can define a subcircuit, which is a basic circuit architecture where complex VQCs can be constructed through repeating layers.Circuits consists of multiple rotating gates as well as CNOT gates that entangle each qubit with its neighboring qubit.We also need a circuit to encode the classical data onto the quantum state, so that the output of the measurement is related to the input.In this case, we encode the binary input onto the qubits of the corresponding order.
VQCs can define a quantum layer like the classical neural network.Furthermore, arbitrary VQCs can be demonstrated as below: where w denotes variational parameters, a unitary operation is achieved, acting on the input state |x> of n q quantum subsystems.
The depth d of a VQC is a superposition of different quantum circuits and matches the product of various parameterized unitary operations with different weights: On other hand, we can measure the expectation values of n q local observables ẑ = [ ẑ1 , ẑ2 , . . .ẑn ] for the extraction of a classical output y.The process is known as a measurement layer, which maps a quantum state to a classical dataset: The initial quantum layers and the final measurement state in the quantum circuits can be globally written as follows:

Tensor Quantum Circuits
The VQC algorithm is limited to parameters adjustment, using the quantum computer as feedback circuits to adjust the parameters in the parameterized circuits to optimal values.It can only optimize the structure of the fixed circuit; it cannot change its own structure, nor increase the number of entanglement operations, nor change the entanglement charac-Sensors 2023, 23, 8010 6 of 18 teristics expressed, so the algorithm still means the quantum chip entanglement operation.For complex Hamiltonian encoding, such as correlated systems, this common method does not performs well.
Tensor quantum circuits adopt the classical quantum hybrid algorithm.The classical part adopts the form of the tensor network to process the higher order tensor into the form of multiple lower order tensor compressors.The quantum part adopts VQC to adjust the parameters.Tensor quantum circuits consist of n qubits and consisting of k layers.Parameterized e iθX⊗X gates are between each neighboring qubit in each layer, followed by a series of single-qubit parameterized Z and X rotations, as can be seen in Figure 2.

Tensor Quantum Circuits
The VQC algorithm is limited to parameters adjustment, using the quantum computer as feedback circuits to adjust the parameters in the parameterized circuits to optimal values.It can only optimize the structure of the fixed circuit; it cannot change its own structure, nor increase the number of entanglement operations, nor change the entanglement characteristics expressed, so the algorithm still means the quantum chip entanglement operation.For complex Hamiltonian encoding, such as correlated systems, this common method does not performs well.
Tensor quantum circuits adopt the classical quantum hybrid algorithm.The classical part adopts the form of the tensor network to process the higher order tensor into the form of multiple lower order tensor compressors.The quantum part adopts VQC to adjust the parameters.Tensor quantum circuits consist of n qubits and consisting of k layers.Parameterized e iθX⊗X gates are between each neighboring qubit in each layer, followed by a series of single-qubit parameterized Z and X rotations, as can be seen in Figure 2.

Hybrid Classical-Quantum Transferring CNN
In the following section, we introduce the concept of transferring prior "knowledge" from classical to quantum.A hybrid classical-quantum transferring CNN for RSISC proposal on the basis of two networks A and B is defined in Figure 3.

Hybrid Classical-Quantum Transferring CNN
In the following section, we introduce the concept of transferring prior "knowledge" from classical to quantum.A hybrid classical-quantum transferring CNN for RSISC proposal on the basis of two networks A and B is defined in Figure 3.A hybrid classical-quantum transferring CNN for RSISC proposal (as shown in Figure 3).The pre-trained network A on a dataset DA for a task CA is CNN'.Then, the final layers are removed.The reduced network A' can be taken as feature extractor.A new trainable network B can integrate closely the pre-trained network A'.The parameters of network B can adjust finely on the specific dataset.DA = ImageNet: ImageNet Large Sclae Visual Recognition Challenge (ILSRC) with many classes [33].
A = ResNet34: a pre-trained residual neural network [13].In the context for transfer learning, the reduced pre-trained network A' is interpreted as a feature extractor, after removing the final layer of A. A' can produce features that are not problem-specific.In a hybrid structure, network A is classical, and the network B is quantum.
Nowadays, classical hybrid transfer learning is perhaps composed of the efficient and mature tools of DL in the current technological era of ML.It is commonly validated through the successful ML algorithms, especially for RSISC.
The quantum circuit model in Equation ( 6) demonstrates the concept.We assume quantum circuits of 4 qubits and utilize the following model: where L 2→4 implies residual networks using the activation function, tanh ϕ = tanh, Q is VQC, and L 4→2 denotes residual networks without activation function.The strategy of quantum encoding establishes links between the image dataset input x and its quantum state |x>.The chosen embedding map from the classical image dataset input vector can be written as Equation ( 8): cos(πx 2 ) sin(πx 2 ) cos(πx 3 ) sin(πx 3 ) cos(πx 4 ) sin(πx 4 ) where H denotes a single Hadamard gate.The quantum circuits consist of 4 variational circuits.And K indicates an entangling unitary operation of 3 CNOT gates.The model in the red frame demonstrates CNOT gates and 3 rotation gates R X , R Y , R Z .The CNOT gates force quantum entanglement between 2 quantum circuits, allowing the qubits from the circuits to be entangled.
Then, the measurement operators should be projected by measuring the expectation values of 4 observables, locally estimated for each qubit: In the final classification stage, the cost function was used as cross entropy and via a LogSoftMax layer.The flowchart of our proposed method is shown in Figure 4, and numerical simulations were conducted using the PyTorch interface and PennyLane software.
In the final classification stage, the cost function was used as cross entropy and via a LogSoftMax layer.The flowchart of our proposed method is shown in Figure 4, and numerical simulations were conducted using the PyTorch interface and PennyLane software.

Evaluation an Results
In this section, we assess the hybrid classical-quantum transferring CNN in terms of performance gains for RSISC.

Evaluation an Results
In this section, we assess the hybrid classical-quantum transferring CNN in terms of performance gains for RSISC.

Data Profile
Our proposed method was evaluated using two challenging RSI datasets: the EuroSat dataset and the Aerial Image dataset (AID) [17,34].The EuroSat dataset contains images taken from the Sentinel-2 satellite, categorizing the ground objects into 10 distinct land cover categories.The collection includes approximately 27,000 images divided across 10 classes, with patches measuring 64 × 64 pixels.The data were originally hyperspectral images captured with 13 spectral bands, but we used only RGB channels.The AID dataset includes 10,000 images of size 600 × 600 pixels, classified into 30 scene classes, with varying numbers of images for each label ranging from 220 to 420.The ground resolution also varies from approximately 8 m to 0.5 m per pixel.Figure 5 show examples of the EuroSAT dataset and Figure 6 shows examples of the AID dataset.cover categories.The collection includes approximately 27,000 images divided across 10 classes, with patches measuring 64 × 64 pixels.The data were originally hyperspectral images captured with 13 spectral bands, but we used only RGB channels.The AID dataset includes 10,000 images of size 600 × 600 pixels, classified into 30 scene classes, with varying numbers of images for each label ranging from 220 to 420.The ground resolution also varies from approximately 8 m to 0.5 m per pixel.Figure 5 show examples of the EuroSAT dataset and Figure 6 shows examples of the AID dataset.The image size has 64 × 64 pixels.Each category contains 2000 to 3000 images.In sum, the dataset has 27,000 geo-referenced images [17].
(  The image size has 64 × 64 pixels.Each category contains 2000 to 3000 images.In sum, the dataset has 27,000 geo-referenced images [17].
taken from the Sentinel-2 satellite, categorizing the ground objects into 10 distinct land cover categories.The collection includes approximately 27,000 images divided across 10 classes, with patches measuring 64 × 64 pixels.The data were originally hyperspectral images captured with 13 spectral bands, but we used only RGB channels.The AID dataset includes 10,000 images of size 600 × 600 pixels, classified into 30 scene classes, with varying numbers of images for each label ranging from 220 to 420.The ground resolution also varies from approximately 8 m to 0.5 m per pixel.Figure 5 show examples of the EuroSAT dataset and Figure 6 shows examples of the AID dataset.The image size has 64 × 64 pixels.Each category contains 2000 to 3000 images.In sum, the dataset has 27,000 geo-referenced images [17].

Evaluation Criteria
There are two extensively used evaluation criteria in RSISC: overall accuracy (OA) and confusion matrix.OA is an evaluating indicator of the classifier's performance on the whole test data set and is defined as the sum of accurately classified samples divided by the sum of tested samples.It is a commonly used to evaluate the performance of RSISC.The confusion matrix is an informative table used to analyze all the errors and confusions between different classes, generated by comparing the performance of correct and incorrect classification of each single classifier.
The definition of the calculation of OA can be expressed as follows: where P ij is the correct prediction of each single label, and n, k represents the sum of each label and the sum of labels.T is the sum of test dataset.

Experimental Setup
We split two different RSI datasets in different training/test ratios (10/90 ratio and 20/80 ratio) class-wise.The RGB images vast majority of all aerial and RSI datasets.For performance measurement, we compare with the performance of the proposed algorithm, the Bag-of-Visual-Words (BoVW) approach with scale-invariant feature transform (SIFT) features, a trained support vector machine (SVM) and other deep learning algorithms for accuracy evaluation.Furthermore, we trained a shallow CNN, a ResNet50, and a GoogleNet model on the training dataset [33,[35][36][37].In addition to the comparison with our proposed model, it is also compared with existing CQ models available in the literature, namely: CNN-QNN end-to-end model [24] and QCNN [38].
In our proposed method, we trained a ResNet34 model and the tensor quantum circuit with the model for 60 epochs over the training dataset, with a quantum depth 6 and an initial learning rate of η = 0.0006.The model was validated with respect to the test dataset after each epoch, and overall classification on the datasets was calculated.
We used many commonly used RSISC algorithms for instance, ResNet50 [13], DC-NNs [39], AlexNet [40], and VGG-VD16 [10], as criteria to assess the performance improvement of our proposed method.These models were pre-trained on the ImageNet dataset and then fine-tuned to adapt them to the RSIs.For the open-source AID dataset [34], the training:test ratios were set to 20%/80% and 50%/50%.

Performance Comparison
Furthermore, in order to further verify the efficiency and feasibility of the proposed algorithm in this paper, the proposed algorithm was also compared with other RSISC algorithms for experimental analysis and validated on the validation set.The results show that the proposed algorithm achieved the best classification result (%), as shown in following Algorithm 1 procedures.As can be seen in Table 2, we can analysis the performance of different approaches on the EuroSAT RGB dataset using various training:test splits.We evaluated four frequently used methods, including BovW, CNN, ResNet50, and GoogleNet, as baselines to assess the performance improvement of the hybrid classical-quantum transferring CNN.We used these pre-trained models on the ImageNet dataset and fine-tuned them on the EuroSAT dataset.For ResNet50 and GoogleNet, we replaced the flattened layer, maintaining the original image sizes.The proposed method exhibited better classification accuracy than other methods and significantly reduced model complexity.To quantitatively evaluate the performance of our proposed method, we adopted the OA and confusion matrix as the evaluation metrics.OA is recorded as the number of correctly classified samples divided by the total number of samples.In the confusion matrix, each column denotes the predicted results, and each row denotes the actual ground objects of the class data.It can display the distribution of each class and can be recommended for the analysis of misclassification results between different classes.
Figure 7 present the evaluation of the proposed method on the EuroSAT dataset in a training and a test set (10/90 ratio).The images of river, highway and herbaceous vegetation misclassified easily as others.As can be seen in Table 2, all CNN algorithms surpassed the BoVW method and the classification results of whole deep CNNs performed better than the classificatfigureion results of shallow CNNs.Nevertheless, the proposed method achieved a classification result of up to 95.81% in a training and a test set (10/90 ratio) for the EuroSAT RSISC.The optimization process is demonstrated in Figure 8.As can be seen in Figure 9, the images of two scene labels are alike, resulting in imperfect classification results compared with other labels.
Table 3 shows the resulting classification accuracies for the best performing DL models CNN models, GoogLeNet, ResNet50, and the proposed method.In experiments with the GoogLeNet, ResNet50, CNN models, and the proposed method, all models were pretrained on the EuroSAT dataset.For a better comparison of performance of all fine-tuning models, we trained the last layer with a learning rate between 0.01 and 0.0001.Using the proposed method, we achieved a classification accuracy about 17% better than the other pre-trained model, GoogleNet, which had been trained on the EuroSAT dataset in the same training:test dataset ratio setting.

98.82
We have implemented these networks using a neural network library named Pytorch and a cross-platform Python library for quantum computing named PennyLane [39,73].The experiments have been operated on a computer graphics workstation equipped with a single AMD Ryzen Threadripper PRO 3945WX 12-Cores CPU and a single NVIDIA GeForce NVIDIA RTX A5000 24GB GPU.
In this section, we detail various limitations and challenges of this study and how they could potentially be avoided.

1.
Lack of access to high-quality, standardized quantum datasets.More work is needed to develop such datasets to properly benchmark our proposed model.

2.
Limited types and amount of remote sensing image (RSI) samples currently used.More RSI data from different satellite sources should be collected.

3.
More methods for tensor network-based feature extraction should be explored to improve the interpretability and performance.4.
Further research into CQ transferring CNN architectures with sharp priors is needed, as these can help avoid issues like barren plateaus.Architectures like quantum graph neural networks and quantum recurrent neural networks show promise [74]. 5.
A theory of "quantum geometric deep learning" could help systematically design architectures suited for different quantum datasets, by encoding appropriate symmetries and physical principles.
In summary, the main limitations involve access to high-quality quantum data, limited diversity of current RSI datasets, and the need for better QCNN architectures.Key future work involves developing standardized quantum datasets, collecting more RSI data, exploring tensor network feature extraction, and establishing a theory of quantum geometric DL.

Conclusions
Over the past decade, rapid development has provided us with massive remote sensing datasets for intelligent earth observation using RSIs.Nevertheless, the lack of publicly available "big data" of RSIs seriously limits the development of innovative methods, especially traditional DL methods.This work first presents the background of VQC and tensor quantum circuits, and a hybrid classical-quantum transferring CNN applied to RSISC has been proposed.The main advantage of the method in this work is that it allows the input images to be of various shapes and sizes and dispense with the resizing of such images prior to processing.It can hold most key features in RSIs, which is mostly beneficial to conclusively achieving better classification results.In comparison to other deep learning methods with the same number of training samples, the proposed method has achieved amazing scene classification results.Data annotation must be done manually by skilled professionals in the area of RSISC.When a RSI dataset is massive, data annotation can become more complicated because of the huge diversities and variations in RSIs.Most of these models require a large-scale labeled dataset and numerous iterations to train their parameter sets.The experimental results show that our innovative method can achieve satisfied classification results with fewer training samples.This means that in the application of RSISC, the increased workload of data annotation can be reduced manually by skilled professionals.In order to improve classification accuracy, the sum of the CNNs layer has expanded from few layers to hundred layers.Usually, the vast majority of models have huge parameters; moreover, the operation of the CNN models requires an enormity of labeled datasets for model training and powerful equipment with high-performance GPUs for notable performance improvements in operation, which severely limits the development of RSISC methods.However, in comparison with huge CNNs, the proposed method is a compact, lightweight, and efficient RSISC model with fewer parameters.On the other side, the proposed method is extremely expensive and time-consuming, due to the limitations of cross-platform libraries for the differentiable programming of quantum computers.In the future, the method should be improved in quantum circuit simulators with support for automatic differentiation, just-in-time compiling, hardware acceleration, and vectorized parallelism.Especially when the quantum circuit size or the batch dimension is large, the new platform can enable acceleration in quantum circuit simulation.

Figure 1 .
Figure 1.Limitations of RSISC, which include (a) low type separability, (b) complex variance of scene scales, (c) coexistence of multiple objects.The sources are from Eurosat dataset [17].Both RSISC and quantum computing are emergent techniques that have the potential to transform the research and application of RS.Quantum computing can provide significant advantages in terms of improving the performance of classification techniques.The great potential of quantum computing in ML has been actively investigated.In

Figure 1 .
Figure 1.Limitations of RSISC, which include (a) low type separability, (b) complex variance of scene scales, (c) coexistence of multiple objects.The sources are from Eurosat dataset [17].

Figure 2 .
Figure 2.An illustration of tensor quantum circuits.

Figure 2 .
Figure 2.An illustration of tensor quantum circuits.

Figure 3 .
Figure 3.A hybrid classical-quantum transferring CNN for RSISC.ResNet network is utilized as features extraction and tensor quantum circuits are applied as a fully connected layer for RSISC.
CA = Image classification.A' = ResNet34 a residual neural network without the final layer.DB = RSI datasets B = Q = L4→2 × Q × L512→4: i.e., 4-qubit tensor quantum circuits and outputs.CB = RSISC.The strategy of feature extraction is much more general than what we needed in this work.In the context for transfer learning, the reduced pre-trained network A' is

Figure 3 .
Figure 3.A hybrid classical-quantum transferring CNN for RSISC.ResNet network is utilized as features extraction and tensor quantum circuits are applied as a fully connected layer for RSISC.A hybrid classical-quantum transferring CNN for RSISC proposal (as shown in Figure 3).The pre-trained network A on a dataset D A for a task C A is CNN'.Then, the final layers are removed.The reduced network A' can be taken as feature extractor.A new trainable network B can integrate closely the pre-trained network A'.The parameters of network B can adjust finely on the specific dataset.

Figure 4 .
Figure 4. Top: general scheme for the hybrid classical-quantum transferring CNN for RSISC.Middle: detailed scheme for classifying the scene image dataset.Bo om: architecture of the tensor quantum circuit with inputs of 4 qubits.Rx(•), Ry(•) and Rz(•) separately denote Pauli rotation X, Y, Z gates.

Figure 4 .
Figure 4. (Top): general scheme for the hybrid classical-quantum transferring CNN for RSISC.(Middle): detailed scheme for classifying the scene image dataset.(Bottom): architecture of the tensor quantum circuit with inputs of 4 qubits.R x (•), R y (•) and R z (•) separately denote Pauli rotation X, Y, Z gates.

Figure 5 .
Figure 5.This outline shows all sample images of all 10 categories covered in the EuroSAT dataset.

Figure 5 .
Figure 5.This outline shows all sample images of all 10 categories covered in the EuroSAT dataset.The image size has 64 × 64 pixels.Each category contains 2000 to 3000 images.In sum, the dataset has 27,000 geo-referenced images[17].

Figure 5 .
Figure 5.This outline shows all sample images of all 10 categories covered in the EuroSAT dataset.

Figure 6 .
Figure6.This outline shows all sample images of all 30 categories covered in the AID dataset.The image size has 600 × 600 pixels.Each category contains 220 to 420 images.In all, the dataset has 10,000 georefenced images[34].

Figure 7 .
Figure 7. Confusion matrix of the proposed method on the EuroSAT dataset in a training and a test set (10/90 ratio) using RSIs in the RGB color space.

Figure 8 .
Figure 8. Relation between the loss function and training steps of the proposed method.

Figure 7 . 18 Figure 7 .
Figure 7. Confusion matrix of the proposed method on the EuroSAT dataset in a training and a test set (10/90 ratio) using RSIs in the RGB color space.

Figure 8 .
Figure 8. Relation between the loss function and training steps of the proposed method.

Figure 8 .Figure 9 .
Figure 8. Relation between the loss function and training steps of the proposed method.

Figure 9 .
Figure 9.The images misclassified as others.(a,b) The images of highways misclassified as rivers.(c,d) The images of rivers misclassified as highways.

Algorithm 1. Comparation with other RSISC algorithms. 1. INPUT: RSI
dataset as training data, rest RSI dataset as test data.2. OUTPUT: Generate the predicted category labels.3. Prepare a ResNet 34 network with the autoencoder.4. Encode the RSI dataset. 5. Traning: a 4 qubit tensor quantum circuit.6. Feed the RSI dataset to the tensor quantum circuit.

Table 3 .
Contrast of the classification accuracies (%) of different training-test ratios on the AID dataset (training ratio = 20% and 50%).

Table 3 .
Contrast of the classification accuracies (%) of different training-test ratios on the AID dataset (training ratio = 20% and 50%).