An Online Hashing Algorithm for Image Retrieval Based on Optical-Sensor Network

Online hashing is an effective storage and online retrieval scheme that meets the rapid growth of data in optical-sensor networks and users' real-time processing needs in the era of big data. Existing online-hashing algorithms rely excessively on data labels to construct the hash function and ignore the structural features of the data itself, which causes a serious loss of image-streaming features and a reduction in retrieval accuracy. In this paper, an online hashing model that fuses global and local dual semantics is proposed. First, to preserve the local features of the streaming data, an anchor hash model based on the idea of manifold learning is constructed. Second, a global similarity matrix used to constrain the hash codes is built from the balanced similarity between newly arrived data and previous data, which makes the hash codes retain global data features as much as possible. Then, an online hash model that integrates global and local dual semantics is learned under a unified framework, and an effective discrete binary-optimization solution is proposed. Extensive experiments on three datasets, CIFAR10, MNIST, and Places205, show that the proposed algorithm effectively improves the efficiency of image retrieval compared with several state-of-the-art online-hashing algorithms.


Introduction
With the popularization of optical-sensor networks and the wide use of intelligent interconnected devices, data in various fields are increasing at an unprecedented speed. People realize that intelligent processing and analysis of data are necessary [1][2][3], and it is of great significance to store high-dimensional data effectively and retrieve data rapidly. The traditional indexing methods, involving text-based image retrieval (TBIR) and content-based image retrieval (CBIR) [4,5], encounter the curse of dimensionality in high-dimensional settings, where their query performance is even worse than a linear scan. An approximate nearest-neighbor query based on hashing is an efficient way to solve this issue [6,7]. Specifically, in image retrieval systems, image hashing means mapping a high-dimensional real-valued image to a compact binary code that preserves the relationships between different high-dimensional data in the Hamming space [8][9][10][11].
According to the dependency between the hash model and the sample data, hashing algorithms are divided into data-independent and data-dependent algorithms. Representative data-independent algorithms include locality-sensitive hashing (LSH) [12,13] and its variants such as p-stable hashing [14], min-hash [15], and kernel LSH (KLSH) [16]. Data-dependent hashing algorithms include unsupervised hashing algorithms [17][18][19] and supervised hashing algorithms [20][21][22][23]. Since methods that exploit the data distribution or class labels perform better in fast search, more effort is being put into data-dependent methods.
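As a concrete illustration of the data-independent family, random-hyperplane LSH can be sketched in a few lines. This is a hedged toy example: the dimensions, seed, and the helper name `lsh_hash` are our choices for illustration, not taken from any of the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_hash(x, W):
    """Random-hyperplane LSH: one bit per hyperplane, the sign of the projection."""
    return (x @ W >= 0).astype(np.int8)

d, r = 64, 16                       # feature dimension, code length
W = rng.standard_normal((d, r))     # data-independent random projections

x = rng.standard_normal(d)
x_near = x + 0.01 * rng.standard_normal(d)   # a close neighbor of x
x_far = rng.standard_normal(d)               # an unrelated point

c, c_near, c_far = lsh_hash(x, W), lsh_hash(x_near, W), lsh_hash(x_far, W)
# Nearby points should collide on more bits than unrelated ones
print((c != c_near).sum(), (c != c_far).sum())
```

Because the projections are random rather than learned, LSH needs long codes to be accurate, which is exactly the gap data-dependent methods target.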
In fact, data samples always arrive sequentially as time goes on, and thus previously existing data often become out of date [24,25]. When the discrepancy between newly arrived data and previously existing data is large, the hash function often loses effectiveness on the newly arrived data. Therefore, it is very important to present an online hashing model that is suitable for streaming data. Unlike offline hashing methods, which correct the training error on a fixed dataset through multiple training rounds, online algorithms use successive batches of streaming data to update the hash functions, which is more realistic and has a strong application background [26][27][28][29][30][31].
Most of the mentioned online-hashing algorithms consider the adaptability, pairwise similarity, or independence of hash codes to build a constrained hashing model, but their optimization requires relaxation learning, which introduces quantization errors to a certain extent and reduces the retrieval accuracy. In addition, there exist unsupervised online hashing methods, mostly based on the idea of the "matrix sketch", whose representative works include Online Sketching Hashing (SketchHash) [38] and Faster Online Sketching Hashing (FROSH) [39]. In the online-hashing setting, the global dataset grows dynamically as data arrive, so the currently arriving batch only represents local data at a certain stage. Unsupervised methods that rely only on the distribution of newly arrived data therefore lack a global description of the hashing model.
In image-retrieval application systems, labeling is carried out manually and the workload is huge. In addition, manual labeling is prone to errors, and wrong labels directly lead to retrieval failure. Therefore, online hashing methods that rely too heavily on data labels while ignoring the structural characteristics of the data itself are subject to many limitations in practical applications, which seriously affects retrieval accuracy [40,41].
High-dimensional image data lie on a low-dimensional manifold structure [42], and query data are often strongly correlated. Therefore, this paper proposes an online hashing model that fuses global and local dual semantics. First, an anchor hash model is built based on manifold learning to retain local features of the original data. Then, a global similarity matrix used to constrain the hash codes is constructed, employing the balanced similarity between newly arrived data and previous data [35]. Under a unified framework, an online hash model integrating global and local dual semantics is learned, and an effective discrete binary-optimization scheme is proposed. Compared with several classical and well-established online hashing algorithms, our proposed LSOH method has advantages in many performance indicators.
In summary, the main contributions of our work are as follows:
• Extract the manifold structure of high-dimensional data using Laplacian Eigenmaps, thus constructing an anchor hash model.
• Construct an asymmetric-graph regularization term to constrain the learning process of hash codes using the balanced similarity between currently arriving data and previous data sets.
• Integrate the anchor hash model and the asymmetric-graph regularization in a seamless formulation to learn global and local dual-semantic information, then use the alternating-iteration algorithm to solve the optimization issue; extensive experiments confirm high retrieval accuracy.
The remaining contents are arranged as follows. In Section 2, related work in this field is reviewed. In Section 3, we present our proposed online-hashing algorithm, including the optimization method. Section 4 presents the experimental results and analyses in detail. Finally, we give a conclusion of our work in Section 5.
Huang et al. [24] presented an online hashing algorithm using a kernel function, termed OKH. First, OKH employs a kernel-based hash function to process linearly inseparable data. Then, OKH formulates an objective function based on the inner product of binary codes. The authors exploit the equivalence between optimizing the inner product of binary codes and the Hamming distance, and use a greedy algorithm to solve for the hash function effectively, thereby handling the non-convex optimization problem of the Hamming distance. Experiments show that it can be widely applied to image-retrieval scenarios.
Similar to OKH's framework, AdaptHash [32] is a fast similarity-search algorithm that learns hash functions via stochastic gradient descent. Specifically, it defines a hinge-loss function to determine the number of hash functions that need to be updated and optimizes the model by SGD.
Cakir et al. [33] proposed an adaptive online-hashing method based on Error Correcting Output Codes (ECOC), named OSH. No prior assumptions about label space are made and it is the first supervised hashing algorithm suitable for the growth of label space. OSH presents a two-step hashing framework, first generating ECOC as codebooks, and then assigning codewords to each class label. Finally, the exponential loss is optimized and solved by SGD, to ensure that the learned hash function is suitable for binary ECOC.
Based on the knowledge of information theory, MIHash [34] takes mutual information as the learning objective and proposes a measure to eliminate the updates of the unnecessary hash tables. Thus, they optimize the mutual information objective by stochastic gradient descent. The computational complexity is effectively reduced, and the learning efficiency of the hash function is improved.
The authors of BSODH [35] argue that two unsolved problems, update imbalance and optimization inefficiency, lead to the unsatisfactory performance of online hashing in practical applications. In that work, two balance parameters are introduced to improve the asymmetric-graph regularization term. Theoretical analysis and extensive experiments verify the role of these parameters in alleviating the unbalanced update. It is also the first time discrete optimization was applied to online hashing, which improves online-hashing performance.
Lin et al. [36] argue that, because the number of categories is unknown in supervised learning, adding semantic information alone does not improve the efficiency of online hash retrieval. Therefore, they propose a robust supervised online-hashing scheme, termed HCOH. First, a high-dimensional orthogonal binary matrix, i.e., the Hadamard matrix, is generated. Every column or row of this matrix can be taken as a codebook that corresponds to a class label. Then, LSH is used to convert the codebook into a binary code adapted to the number of hash bits. In an improved version, HMOH [37], hash linear regression is treated as a binary-classification problem, and the multi-label case is considered as well.
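The Hadamard-codebook idea behind HCOH can be sketched as follows. This is a minimal illustration only: the class count and the one-column-per-label assignment are our assumptions, and the LSH step that adapts the codeword length to the number of hash bits is omitted.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix; n must be a power of two."""
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

# Illustrative assumption: 8 classes, each assigned one Hadamard column as its codeword
H = hadamard(8)
codebook = {label: H[:, label] for label in range(8)}

# Columns of a Hadamard matrix are mutually orthogonal, so codewords of
# different classes are maximally separated in Hamming distance
print(codebook[0] @ codebook[1])   # 0
```

The orthogonality is what makes the codebook attractive: no prior knowledge of the label space is needed beyond an upper bound on the number of classes.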
To address the problem that data arrive in a streaming way and that a huge dataset is difficult to load into memory for training, SketchHash [38] shrinks the dataset based on the idea of data sketches while retaining its main features to learn an effective hash function. This approach reduces both computational and space complexity. Compared with SketchHash, FROSH [39] leverages a fast transformation to sketch data more compactly: it applies a specific transform to different small data blocks, speeding up the sketching procedure at the same space cost.
There is also semi-supervised online hashing [43,44], which is relatively complicated because labels may come from either existing data or streaming data. In addition, deep-hashing methods [40,45,46] occupy a very important position among existing offline-hashing methods. However, deep learning involves large numbers of parameters to be trained, and few examples have been applied to online hashing at present. Among them, Online Self-Organizing Hashing [47] obtains hash codes via the Self-Organizing Map (SOM) algorithm, but SOMs with multi-layer structures have not yet been applied to image retrieval.

The Proposed Method
In this section, the variable symbols of the algorithm are first defined, and the modeling process that combines the local structural features and the similarity features of the global dataset is given. Finally, we obtain the objective function and solve it by the alternating-iteration method. The algorithm framework is shown in Figure 1.


Notations
Assume that n t training samples are poured into the retrieval application at stage t. They are denoted as X t = x t 1 , x t 2 , · · · , x t n t ∈ R d×n t , and their corresponding labels L t are defined as L t = l t 1 ; l t 2 ; · · · ; l t n t ∈ N d×n t . Each training sample x t i is d-dimensional. The goal of hashing is to learn r-dimensional hash codes, which are denoted as B t = b t 1 , b t 2 , · · · , b t n t ∈ {1, −1} r×n t ; meanwhile, r is much smaller than d. The linear hash mapping is widely used as a hash function [48], i.e., F(x) = sgn(W t T x), where F(·) stands for the hash function, W t ∈ R d×r is the projection matrix to be learned, W t T is the transpose of W t , and sgn(·) is the symbolic function, which outputs 1 for non-negative inputs and −1 otherwise. All symbol notations utilized in this study are presented in Table 1.
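The linear hash mapping above can be sketched directly. In this toy example a random matrix stands in for the learned projection W t; the dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, n = 32, 8, 5                # feature dimension d, code length r, batch size n

W = rng.standard_normal((d, r))   # stands in for the learned projection W^t
X = rng.standard_normal((d, n))   # a batch X^t, one sample per column

def F(X, W):
    """Linear hash function F(x) = sgn(W^T x); sgn maps non-negatives to +1."""
    return np.where(W.T @ X >= 0, 1, -1)

B = F(X, W)
print(B.shape)    # (8, 5): one r-bit code per sample
```

Learning then amounts to choosing W t so that the resulting codes respect the similarity structure of the data, which the following subsections formalize.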

Symbol Notations
B t : hash codes learned for X t
B t a : hash codes learned for X t a
W t : hashing projection matrix at the t stage
d : dimension of all input data
X t c : newly arrived data at the t stage
k : dimension of every hash code
L t c : labels of X t c
N : amount of input data
B t c : binary codes generated for X t c
n t : amount of input data at the t stage

Laplacian Eigenmaps
Laplacian Eigenmaps use the Laplacian operator to keep data that are similar in the original space as close as possible after being mapped to the low-dimensional space, so as to embed high-dimensional images in the low-dimensional space. Assume that the original data, denoted as X t = x t 1 , x t 2 , · · · , x t n t , are mapped to the low-dimensional space, in which the hash codes are expressed as B t = b t 1 ; b t 2 ; · · · ; b t n t . We construct a graph whose adjacency matrix is O t to maintain the relationships between different data. Then, the objective function to be optimized is min ∑ ij O ij t ∥b t i − b t j ∥ 2 s.t. B t B t T = I, where I represents the k-dimensional identity matrix and O ij t represents the adjacency between sample pairs. Mathematically, this objective can be transformed into min tr(B t (D − O)B t T ), where D is the diagonal degree matrix of the sample graph with D ii = ∑ j O ij , and D − O is the Laplacian matrix. By eigendecomposition of D − O, the eigenvectors corresponding to the k smallest non-zero eigenvalues are obtained as the required target hash codes.
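The eigendecomposition step can be sketched on a toy graph of six samples. The adjacency below, and treating the real-valued eigenvectors as the embedding (before any binarization), are our simplifications for illustration.

```python
import numpy as np

# Toy adjacency O for 6 samples: two 3-node clusters joined by one bridge edge
O = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    O[i, j] = O[j, i] = 1.0

D = np.diag(O.sum(axis=1))        # degree matrix, D_ii = sum_j O_ij
L = D - O                         # graph Laplacian

eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
k = 2
# discard the (near-)zero eigenvalue of the connected graph, keep the
# eigenvectors of the k smallest non-zero eigenvalues as the embedding
nonzero = eigvals > 1e-10
Y = eigvecs[:, nonzero][:, :k]
print(Y.shape)   # (6, 2)
```

Samples within the same cluster end up close in the embedding, which is the locality property the anchor hash model is built to preserve.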

Anchor Graph Hashing
It is time-consuming and memory-intensive to compute the adjacency matrix for large amounts of sample data. Calculating the adjacency matrix with an anchor set instead of the full dataset solves this problem: m anchor points, denoted as [u 1 , u 2 , . . . , u i , . . . , u m ] ∈ R d , are obtained through k-means clustering. When the number of anchors is less than the number of training samples, both storage cost and computation time are greatly reduced. The anchor graph is denoted as A t , and its elements are defined as A ij t = exp(−∥x t i − u j ∥ 2 /θ)/∑ j′∈{i} exp(−∥x t i − u j′ ∥ 2 /θ) for j ∈ {i}, and 0 otherwise, where θ is a defined parameter and {i} represents the index set of the k nearest anchor points. Replacing the traditional Laplacian matrix with the anchor graph A t , the objective function is obtained, where B t a represents the hash codes of the anchor points, W t T x t i represents the hash codes of the input images, and A ij t represents the anchor-graph matrix constructed from the input images and the anchor data.
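A minimal construction of the anchor graph A t might look like this. For brevity, random sampling replaces k-means when choosing anchors, and θ, the data sizes, and the row normalization are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((16, 100))             # d x n data matrix
U = X[:, rng.choice(100, 8, replace=False)]    # 8 "anchors" (k-means in the paper)

theta, s = 1.0, 3                              # kernel width, #nearest anchors kept
# n x m matrix of squared distances between samples and anchors
D2 = ((X[:, :, None] - U[:, None, :]) ** 2).sum(axis=0)
K = np.exp(-D2 / theta)                        # Gaussian kernel weights

A = np.zeros_like(K)
for i in range(K.shape[0]):
    idx = np.argsort(D2[i])[:s]                # s nearest anchors of sample i
    A[i, idx] = K[i, idx] / K[i, idx].sum()    # normalize so each row sums to 1
print(A.shape)   # (100, 8): sparse, with s non-zeros per row
```

Because A has only s non-zeros per row, operations on the anchor graph scale with n·m rather than n², which is the stated storage and time saving.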

Global-Balanced Similarity
Anchor hashing based on Laplacian Eigenmaps learns the hash function of newly arrived data independently, which ignores the correlation of the overall data and increases the redundancy of the hash codes. Thus, a global similarity constraint is introduced to build the online hashing model.

Similarity
Suppose that the newly arrived data at the t stage are denoted as X t c = X t c1 , X t c2 , . . . , X t cn t , while the existing data arriving before the t stage are denoted as X t a = X 1 a , X 2 a , . . . , X t−1 a . The similarity matrix, S t , is constructed from the relationships between the data labels of X t c and X t a : each element S ij t equals 1 if the labels of the corresponding samples are identical, and −1 otherwise [35]. Generally speaking, the more similar the data, the smaller the Hamming distance. We use the inner product of hash codes to estimate the distance between different vectors in the Hamming space, and constraints on the hash codes are constructed using the global similarity matrix defined above [35], as shown in Equation (8). In this way, global semantic information from any stage of the input data is retained in the Hamming space, and the loss function that preserves similarity is defined accordingly, where N t−1 = ∑ t−1 i=1 n i represents the total amount of input data arriving before the t stage, ∥·∥ F refers to the Frobenius norm, and k is the bit length of the hash codes.
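Building the global similarity matrix from labels is straightforward. The toy labels below are ours; the ±1 encoding follows the BSODH-style similarity used in this section.

```python
import numpy as np

labels_new = np.array([0, 1, 2, 1])        # labels of the newly arrived batch X_c
labels_old = np.array([0, 0, 1, 2, 2])     # labels of the existing data X_a

# S_ij = 1 when labels match, -1 otherwise
S = np.where(labels_new[:, None] == labels_old[None, :], 1.0, -1.0)
print(S.shape)        # (4, 5)
print((S == 1).sum()) # 6 similar pairs out of 20: dissimilar entries dominate
```

Even on this tiny example only 6 of the 20 pairs are similar, which previews the data-imbalance problem addressed next.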

Balanced Similarity
The introduction of the similarity matrix improves the hash codes generated by anchor hashing based on Laplacian Eigenmaps, so the global semantic information is better reflected. However, images with different labels account for the majority of massive data. According to the definition of the global similarity matrix, an element takes the value 1 only when the labels are identical; the global similarity matrix is therefore sparse. Data imbalance causes the loss of similar information, derails the optimization process, and eventually drags down retrieval performance. To solve this issue, we employ the balanced similarity matrix S̃ t , where µ s represents the equilibrium factor of similar pairs and µ d represents the equilibrium factor of dissimilar pairs. Usually, we take µ s < 1 and µ d < 1, which reduces the Hamming distance of similar vectors and increases the Hamming distance of dissimilar vectors. By adjusting the two equilibrium factors, the effect of data imbalance is eliminated. Replacing the global similarity matrix, S t , in Equation (8) with the balanced similarity matrix, S̃ t , yields the loss function that preserves the balanced similarity.
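The re-weighting can be sketched in one line; the µ values below are illustrative, not the tuned settings from the experiments.

```python
import numpy as np

mu_s, mu_d = 0.8, 0.2          # illustrative equilibrium factors, both < 1
S = np.array([[1., -1., -1.],
              [-1., 1., -1.]])  # toy global similarity matrix

# Scale similar (S = 1) and dissimilar (S = -1) entries by separate factors
S_bal = np.where(S > 0, mu_s * S, mu_d * S)
print(S_bal)
```

Taking µ d well below µ s shrinks the contribution of the abundant dissimilar pairs, so the scarce similar pairs are not drowned out during optimization.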

Overall Formulation
On the one hand, we construct the anchor asymmetric graph to replace the Laplacian graph, preserving the local structural features of the data and thus obtaining the objective function L 1 . On the other hand, we perform an inner-product operation on the hash codes of the existing data and the newly arrived data, and constrain the learning process of the hash codes with the global-balanced similarity matrix, obtaining the loss function L 2 that retains the semantic information of the global-balanced similarity. Under a unified framework, the online hashing preserves both local and global dual-semantic information, and the total loss function L = L 1 + L 2 is obtained. By adding a quantized loss function, L 3 , the quantization error between the hash function and the target hash codes is minimized.
The F norm of the projection matrix is used as the penalty term to prevent the model from overfitting. The final objective function is obtained as follows: where α t , β t , and γ t are parameters that control the weight of each module.

Alternating Optimization
Because of the binary constraints, Equation (12) is a non-convex objective function in terms of W t , B t a , B t c , and B t . We adopt an alternating optimization approach to deal with the overall formula L; i.e., when one variable is updated, the remaining variables are fixed as constants.
• W t -step: fix B t a , B t c , B t , then learn the hash weights W t . The second term in Equation (12) is eliminated, and the objective function becomes Equation (13). We transform and simplify Equation (13) by Equation (14), which reveals the relation between the F norm and the trace of a matrix; Equation (15) is then obtained, where I represents the d-dimensional identity matrix. Taking the partial derivative of Equation (15) with respect to W t and setting it to zero yields Equation (16); thus, we obtain Equation (17) to update W t .
• B t a -step: fix W t , B t c , B t . The second term in Equation (12) is retained and the formula becomes Equation (18). According to [49], the L1 norm replaces the F norm, and the result is Equation (19).
• B t c -step: fix W t , B t a , B t . The first and the fourth terms in Equation (12) are eliminated, and the corresponding sub-problem is Equation (20). Removing the irrelevant terms from Equation (20), the optimization problem becomes Equation (21), where tr(·) is the trace norm and P = kB t a Ŝ t T + β t W t T X t c . In light of supervised discrete hashing (SDH) [50] and BSODH [35], the solution of Equation (21) is NP-hard. Therefore, we transfer this issue to row-by-row updating, considering that the matrix is made up of row vectors. Thus, Equation (21) becomes Equation (22), where b t cr , b t ar and p t r are the rth rows of B t c , B t a and P t , respectively, and B̃ t c , B̃ t a and P̃ t stand for the remaining parts of B t c , B t a and P t except b t cr , b t ar and p t r , respectively. Expanding Equation (22) gives Equation (23); after simplification, it becomes Equation (24). Therefore, we solve the sub-problem by the updating rule in Equation (25).
• B t -step: fix W t , B t a , B t c . Only the first term remains in Equation (12), and it is transformed into Equation (26). Finally, we obtain the rule in Equation (27) to update the hash codes of the anchor data.
The proposed LSOH is presented in Algorithm 1.

Algorithm 1 (sketch): at each round, construct the balanced similarity matrix S̃ t from the labels, then update W t , B t a , and B t via Equation (17), Equation (19), and Equation (27), respectively.

Computational Complexity
The main computational cost of the iterative algorithm comes from the construction of the anchor graph A and the optimization of the variables. In total, it costs O(Tdn t m + dn t m) to construct the anchor graph, where T is the number of iterations needed to generate anchors by the clustering algorithm. The time complexities of updating W t , B t a , B t c and B t at the tth round are O(d 3 + n t d 2 + mn t r + n t dr), O(rn t m t ), O(kmn t + kdn t + krm + krn t ) and O(drn t + mrn t ), respectively. Since d, r, m, n t , and k are much smaller than m t , the time complexity of our proposed LSOH is linear in the size of the data. Obviously, it is scalable to large-scale data.

Datasets
The three datasets used in this paper are CIFAR-10, MNIST, and Places205. CIFAR-10 [51] is a widely recognized dataset made up of 60K samples across 10 classes. Every sample is represented by a 4096-dimensional CNN feature. Following [34], CIFAR-10 is divided into a retrieval set and a test set, where the retrieval set has 59K samples and the test set has 1K samples. The 50K samples within the retrieval set are used to learn the hash function. Twenty example images from each category of CIFAR-10 are shown in Figure 2.

MNIST is a set of handwritten digit images, with a total of 70K samples. Every sample is expressed as a 784-dimensional vector. A test set is constructed by sampling 100 samples from each category, and the remaining samples form the retrieval set. There are 60K instances in the retrieval set that are used to learn the hash function. We randomly select 27 example instances from each category, as shown in Figure 3.
Places205 has 2.5 million scene images among 205 categories. The instances are first processed by the fc7 layer of AlexNet and then reduced to 128-dimensional vectors by PCA. Twenty samples are randomly selected from each category to form the test set, and the rest form the retrieval set. We randomly select a subset of 30K instances from the retrieval set to learn the online hash function. The two hundred images shown in Figure 4 are sampled randomly from Places205.

Parameter Setting
Based on experience, the ranges of α t , β t , and γ t for LSOH are set in [0:0.05:1]. For CIFAR-10, the best setting of (α t , β t , γ t ) is empirically set to (0.05, 0.6, 0.3). For MNIST, we also set (0.05, 0.6, 0.3) as the configuration of (α t , β t , γ t ). For Places205, (0.05, 0.8, 0.3) corresponds to (α t , β t , γ t ). Table 2 shows the specific parameters of our proposed LSOH on the three datasets mentioned above. We conduct a large number of experiments, where the bit length is taken from the set {8, 16, 32, 48, 64, 128}. In addition, SketchHash [38] requires the batch size to be greater than the hash-code length; therefore, the experimental results of SketchHash are presented only when the hash codes are under 64 bits.

Evaluation Protocols
Several evaluation indicators are adopted to evaluate the proposed LSOH, such as the mean average precision (mAP), the precision within a Hamming sphere of radius 2 centered on every query point (Precision@H2), and the precision of the top-K retrieved neighbors (Precision@K). It is worth noting that we apply the average accuracy of the first 1000 retrieved samples (mAP@1000) for Places205, to save calculation time. We also adopt precision-recall (PR) curves on MNIST and CIFAR-10 to compare LSOH with several algorithms.
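For reference, Precision@K and average precision (the per-query quantity behind mAP) can be computed as follows. This is a generic sketch on a toy 0/1 relevance list, not the authors' evaluation code.

```python
import numpy as np

def precision_at_k(relevant, k):
    """Fraction of the top-k retrieved items that are relevant (0/1 ranked array)."""
    return relevant[:k].mean()

def average_precision(relevant):
    """AP over a ranked 0/1 relevance list; mAP averages this over all queries."""
    hits = np.cumsum(relevant)
    ranks = np.arange(1, len(relevant) + 1)
    if hits[-1] == 0:
        return 0.0
    # precision at each relevant rank, averaged over the relevant items
    return float(((hits / ranks) * relevant).sum() / relevant.sum())

ranked = np.array([1, 0, 1, 1, 0])   # toy relevance of one ranked retrieval list
print(precision_at_k(ranked, 3))     # 2/3
print(average_precision(ranked))     # mean of 1/1, 2/3, 3/4
```

Precision@H2 is computed the same way, except the "retrieved" set is defined by Hamming distance <= 2 rather than a fixed K.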

Compared Methods
To prove the effectiveness of LSOH, we perform abundant experiments and compare LSOH with several advanced online-hashing algorithms, including OKH [24], SketchHash [38], AdaptHash [32], OSH [33], BSODH [35], and DSBOH [52]. The codes of the above comparison methods are publicly available. All results are obtained on a single computer running MATLAB, equipped with a 3.0 GHz Intel Core i5-8500 CPU and 16 GB RAM. To reduce error, each experiment was run three times with random initialization, and the average is reported in this work.

Results and Discussion
The values of mAP and Precision@H2 on the CIFAR-10 dataset are shown in Table 3, which lists the results when generating 8-bit, 16-bit, 32-bit, 48-bit, 64-bit, and 128-bit hash codes under the different online methods. The results show that (1) mAP: the values of our proposed LSOH are the highest in all cases. LSOH improves the accuracy by 3.3%, 0.4%, 1.9%, 3.4%, 1.5%, and 1.5%, respectively, over the second-best algorithm, showing that it effectively improves the average accuracy. (2) Precision@H2: our proposed LSOH algorithm is 2.4% higher than the suboptimal algorithm in the 48-bit case. It is the second-best algorithm for 8-bit, 16-bit, 32-bit, 64-bit, and 128-bit codes, while no algorithm ranks first in all cases; LSOH therefore still performs well compared with the other algorithms. The best results are displayed in bold. Table 4 reveals the values of mAP and Precision@H2 on MNIST. The results show that (1) mAP: for 8-bit, 16-bit, 48-bit, 64-bit, and 128-bit codes, our proposed LSOH is 2.6%, 0.7%, 0.7%, 0.7%, and 0.8% higher than the suboptimal algorithm, respectively, and it is the second-best algorithm when generating 32-bit hash codes; the advantage of LSOH is verified. (2) Precision@H2: the Precision@H2 values of LSOH are the highest under 8-bit, and it ranks second for the other code lengths. As the bit length grows, the performance of LSOH is worse than that of BSODH under 48-bit, 64-bit, and 128-bit, but better than that of DSBOH. The best results are displayed in bold.
The outcomes of mAP and Precision@H2 on the Places205 dataset are expressed in Table 5. (1) mAP: our proposed LSOH is the best algorithm, 3.3% and 1.1% higher than the suboptimal algorithm under 16-bit and 32-bit, respectively. In other cases, the results of LSOH are not optimal. Due to the huge amount of data in Places205, the comparison algorithms do not always perform optimally either; in contrast, the LSOH algorithm has better stability and relatively higher retrieval accuracy. (2) Precision@H2: our proposed LSOH has optimal values under 16-bit and 48-bit, 1.1% and 0.8% better than the second-best algorithm, respectively, and it ranks second for the other code lengths. In conclusion, our proposed LSOH algorithm performs well and has high retrieval accuracy on Precision@H2.
For further testing of our proposed LSOH, Precision@K curves for 8-bit, 16-bit, 32-bit, 48-bit, 64-bit, and 128-bit codes are drawn on CIFAR-10 and MNIST, as displayed in Figures 5 and 6. Comparative experiments of these metrics are not conducted on the Places205 dataset, due to its large memory requirements.
From Figure 5, we can see the Precision@K curves on the CIFAR-10 dataset. It is obvious that the Precision@K curve of our proposed LSOH is higher than the comparison curves for 8-bit, 16-bit, 32-bit, 48-bit, 64-bit, and 128-bit codes; the performance of the 32-bit and 64-bit hash codes is particularly outstanding. As shown in Figure 6, LSOH consistently shows a higher Precision@K curve than the other algorithms on the MNIST dataset. Only when generating 8-bit hash codes does our proposed LSOH show a temporary fluctuation on the CIFAR-10 dataset. As the number of hash bits increases, the retrieval accuracy goes up slightly, which shows the robustness and superiority of the LSOH algorithm.
Figure 7 shows the precision-recall (PR) curves under 32-bit. By calculating the area under the curve (AUC) of the PR curves, we obtain values of 95.85% and 91.77%, respectively.

Parameter Sensitivity
In this subsection, we conduct ablation studies on the hyper-parameters α t , β t , and γ t , as defined in Equation (12). Without loss of generality, we conduct experiments with varying values of these hyper-parameters with respect to mAP (mAP@1000) in the 32-bit case, in Figure 8. (Detailed values used in this paper are outlined in Table 2.) Similar experimental results can be observed for other hash-bit lengths.
As shown in Equation (12), α t is used to reflect the importance of anchor graph hashing. Figure 8a plots the influence of different values of α t on the performance. Generally speaking, when α t = 0.05 on CIFAR-10 and MNIST, LSOH obtains the best mAP (0.739 on CIFAR-10 and 0.760 on MNIST). When α t = 0.05 on Places205, LSOH obtains the best mAP@1000, with 0.251. Moreover, when α t = 0, LSOH suffers a performance degradation, as can be seen in Figure 8a. More specifically, in this case, the mAP (mAP@1000) scores are 0.740, 0.731, and 0.200 on MNIST, CIFAR-10, and Places205, respectively. To analyze, when α t = 0, LSOH degenerates into a model similar to BSODH. In the experiments, we empirically set the value of α t to 0.05 on all three datasets. As shown in Equation (12), β t is used to reflect the importance of the quantized loss.

As shown in Equation (12), α_t reflects the importance of anchor-graph hashing. Figure 8a plots the influence of different values of α_t on performance. Generally speaking, when α_t = 0.05 on CIFAR-10 and MNIST, LSOH obtains its best mAP (0.739 on CIFAR-10 and 0.760 on MNIST); when α_t = 0.05 on Places205, LSOH obtains its best mAP@1000, 0.251. Moreover, when α_t = 0, LSOH suffers a performance degradation, as can be seen in Figure 8a: in this case, the mAP (mAP@1000) scores are 0.740, 0.731, and 0.200 on MNIST, CIFAR-10, and Places205, respectively. This is because, when α_t = 0, LSOH degenerates into a model similar to BSODH. In the experiments, we empirically set α_t to 0.05 on all three datasets.
As shown in Equation (12), β_t reflects the importance of the quantization-loss term. From Figure 8b, we can observe that when β_t = 0.6 on CIFAR-10 and MNIST, LSOH obtains its best mAP (0.768 on MNIST and 0.747 on CIFAR-10); when β_t = 0.8 on Places205, LSOH obtains its best mAP@1000, 0.251. Moreover, when β_t = 0, LSOH suffers a severe performance degradation, as can be seen in Figure 8b (0.278 on CIFAR-10, 0.244 on MNIST, and 0.13 on Places205). Figure 8b thus shows that properly applying the quantization-loss term in Equation (11) significantly boosts performance on all three datasets. In the experiments, we empirically set β_t to 0.6 on CIFAR-10 and MNIST, and to 0.8 on Places205.
From Figure 8c, we can observe that when γ_t = 0.3, LSOH obtains its best mAP on CIFAR-10 and MNIST (0.768 on MNIST and 0.745 on CIFAR-10) and its best mAP@1000 on Places205, 0.251. Moreover, when γ_t = 0, LSOH suffers a severe performance degradation, as can be seen in Figure 8c. Thus, a properly weighted penalty term is necessary to prevent the model from overfitting. In the experiments, we empirically set γ_t to 0.3 on all three datasets.

Limitations and Potential Improvements
By comparing the weights of the modules in Equation (12), it can be seen that the global-balanced similarity plays an important role in training hash codes. However, the introduction of anchor hashing requires additional matrix operations, which makes the training time of LSOH slightly longer than that of the BSODH algorithm. For example, LSOH takes several seconds longer than BSODH when generating 32-bit hash codes, although it is still faster than OSH. In addition, computing W_t requires a matrix inverse, whose time complexity is O(d^3); this becomes time-consuming when the feature dimension of the retrieval images is large. Therefore, a more effective and efficient method for performing this matrix operation is desirable and worthwhile.

Conclusions
In this paper, a novel hashing algorithm that preserves both local and global semantics for image retrieval, i.e., LSOH, was proposed. By extracting the local manifold structure of the data arriving at the same time, and constructing a global-balanced similarity matrix from data arriving at different times, we obtain a relatively comprehensive hash constraint that avoids over-reliance on labels and imbalanced data updates. An alternating-iteration algorithm is then used to solve the discrete binary optimization. Extensive experiments on the benchmark datasets verify that LSOH has significant advantages over other advanced algorithms. However, like other state-of-the-art online-hashing algorithms, LSOH loses some retrieval accuracy as the number of hash bits increases. Cross-modal retrieval has recently seen growing application demand, and our approach of mining the local structural features of the retrieval data and measuring the similarity of the global data is also worth applying there. Given the strong capability of deep learning networks for feature representation, online hashing with deep learning is also a valuable research topic for the future.
Author Contributions: Conceptualization, X.C. and Y.L.; methodology, X.C.; software, X.C. and C.C.; validation, C.C.; formal analysis, X.C.; investigation, X.C. and C.C.; resources, X.C. and Y.L.; data curation, C.C.; writing-original draft preparation, C.C. and X.C.; writing-review and editing, X.C.; visualization, X.C.; supervision, X.C.; project administration, Y.L.; funding acquisition, Y.L. All authors have read and agreed to the published version of the manuscript.