OHetTLAL: An Online Transfer Learning Method for Fingerprint-Based Indoor Positioning

In an indoor positioning system (IPS), transfer learning (TL) methods are commonly used to predict the location of mobile devices under the assumption that all training instances of the target domain are given in advance. However, this assumption has been criticized for its shortcomings in dealing with the problem of signal distribution variations, especially in a dynamic indoor environment. The reasons are: collecting a sufficient number of training instances is costly, the training instances may arrive online, the feature spaces of the target and source domains may be different, and negative knowledge may be transferred in the case of a redundant source domain. In this work, we proposed an online heterogeneous transfer learning (OHetTLAL) algorithm for IPS-based RSS fingerprinting to improve the positioning performance in the target domain by fusing both source and target domain knowledge. The source domain was refined based on the target domain to avoid negative knowledge transfer. The co-occurrence measure of the feature spaces (Cmip) was used to derive the homogeneous new feature spaces, and the features with higher weight values were selected for training the classifier because they could positively affect the location prediction of the target. Thus, the objective function was minimized over the new feature spaces. Extensive experiments were conducted on two real-world scenarios of datasets, and the predictive power of the different modeling techniques were evaluated for predicting the location of a mobile device. The results have revealed that the proposed algorithm outperforms the state-of-the-art methods for fingerprint-based indoor positioning and is found robust to changing environments. Moreover, the proposed algorithm is not only resilient to fluctuating environments but also mitigates the model’s overfitting problem.


Introduction
With the rapid proliferation of sensor-equipped mobile devices, intelligent applications, and the ubiquity of wireless communication networks, as well as the growing demand for location-based services (LBS), localization, tracking, and navigation systems are receiving much attention from researchers, engineers, practitioners, and the industry [1][2][3]. Location services have greatly facilitated people's daily activities because the services they need are mostly location-based. Although Global Positioning System (GPS) is the standard method for location determination and is known for its reliable and promising accuracy outdoors, it cannot provide accurate location determination indoors due to non-line-of-sight (NLOS) propagation, poor penetration of GPS signals in complex buildings, and the multipath effect caused by walls and obstacles [4][5][6]. On the other hand, rapid commercial growth and user demand for location-based services and social networking services (SNSs) have made Indoor Positioning Systems (IPSs) essential in all areas of life where mobility plays a role [1,7,8]. Moreover, unlike datasets and corresponding frameworks suitable to compare and evaluate specialized localization solutions.
However, Wi-Fi fingerprinting [33][34][35][36][37] is becoming one of the most commonly used methods for indoor localization due to the ubiquity of Wi-Fi infrastructure and the popularity of Wi-Fi-enabled mobile devices. An IPS for commercial buildings has been developed to provide fine-grained occupancy-based HVAC (Heating, Ventilation and Air Conditioning systems (HVAC)) actuation [23]. This system exploits the existing Wi-Fi network infrastructure and occupants' smartphones with the Wi-Fi enabled. They have also discussed about how using the existing infrastructure for occupancy detection greatly lowers the cost and labor of implementation and maintenance. Furthermore, noisy Wi-Fi signals and metadata regarding the building's occupants, access points, and HVAC zone have also been employed to alleviate the difficulties in occupancy sensing, which could make it a cost-effective technique. Another method of Wi-Fi indoor localization based on SVM is proposed [53] to identify the floor using the Wi-Fi signal of each floor, and the method achieved an accuracy rate of 99.09% for 3D positioning [53]. An indoor localization system-based Wi-Fi fingerprint using spatial multi-points matching has also been proposed to estimate the user's position [54]. An indoor location tracking technology-based Wi-Fi fingerprint technique [55] has been proposed to improve the user's location accuracy using mobile communication technology and location tracking technology. Additionally, for an indoor IoT application using a Bayesian network and a limited radio map, a reliable 3D indoor positioning system-based Wi-Fi RSS fingerprinting has been developed [56]. However, to counteract the impact of received signal variations caused by multipath and signal attenuations throughout various time periods, the radio map should be updated within short time frame.
This method does not use radio signal propagation geometry but requires data acquisition and a built-in radio map in the offline phase, although it has notable shortcomings in the dynamic indoor environment and requires enough labeled samples, making it laborintensive and costly [57]. Moreover, relatively satisfactory accuracy can be achieved by this method, but [33][34][35] the database fingerprint must be up to date, otherwise, the location performance will be severely degraded. Some research works have been conducted to reduce the effort and time required to create the radio map. Most notably, these include crowd-sourcing [58] and simultaneous localization and mapping over Wi-Fi [59]. However, crowd-sourcing requires the active participation of users or achieves low accuracy, and the Wi-Fi simultaneous localization and mapping approach also suffers from the high computational cost. In [60], it was also described how to speed up computation and enhance system accuracy for indoor environments by using a clustering approach based on fingerprinting. Compared to the non-clustering approach, which had an average distance error of 3.4 m, the suggested technique obtained an average distance error of 2.4 m. Various methods have been proposed in the literature to address the dynamic indoor environment, which results in low localization accuracy due to variations in fingerprint patterns over time. These methods can be classified into four groups: probabilistic methods [38,39], machine learning methods [40,41], using the quality of fingerprints of various signal features (such as Channel State Information (CSI) [3,42], Phase of Arrival (PoA) [43], Time of Flight (ToF) [44], etc.), and derived signal features (Signal Strength Difference (SSD) [15] and fused group of fingerprints [45,46]).
Although the proposed methods [15,[38][39][40][41][42][43][44][45][46] have improved the location accuracy, they also suffer from fluctuations in the signal distribution and are not robust to a dynamic indoor environment, mainly due to two limitations: (i) the measurements depend on a single fingerprint and cannot represent the dynamic scenario of the indoor environment; (ii) the need for a sufficient number of labeled samples is also very costly. Therefore, a hybrid positioning system (HPS) has been proposed for various seamless localization applications (Wi-Fi, Bluetooth, UWB, and ZigBee integrated into a hybrid base station (BS)) to solve the standalone positioning problem [8]. Other researchers in [47] have also proposed a hybrid indoor localization system based on the IMU sensor and smartphone camera, which has better positioning performance than the standalone system due to the possible combination of errors compensated by each technology. However, hybrid BS is not the subject of our work and is not economically feasible. On the other hand, computational complexity is a serious problem for hybrid indoor positioning systems. Research-based on machine learning algorithms (ML) have also been proposed to address the indoor location problem [40,41,61,62]. A limited number of researchers have focused on reducing the calibration overhead for indoor WLAN locations [63,64]. The impact of reducing the number of samples collected at each coverage point, and reducing the number of coverages points on the accuracy of location determination has been studied [63,64]. To compensate for the performance degradation with reduced calibration, both linear and kernel-based interpolation methods have been proposed [63,64] to patch an incomplete radio map. The hidden Markov model [64] and hierarchical Bayesian model [65] have also been proposed to exploit the information contained in the unlabeled user traces and improve the accuracy of location estimation. A label propagation method [66,67] has also been used to learn from labeled and unlabeled data under the assumption that similar data samples should have similar labels. However, this approach is not suitable for online prediction.

Online Transfer Learning Approach to Localization Problem
Several machine learning approaches [61,62] have achieved promising results assuming that the training and testing data have the same distribution. However, in practice, it is not possible to collect enough training data that belong to the same feature space as the test data due to resource and environment constraints. Transfer learning (TL) therefore addresses such cross-domain learning problems by extracting useful information from data in a related domain and transferring it for use in the target tasks. We believe that leveraging the knowledge from the labeled source data can drastically reduce the expensive data labeling effort. Depending on the feature spaces of the source and target domains, TL is divided into two categories: homogeneous transfer (HomTL) and heterogeneous transfer (HetTL). HomTL is used in many real-world applications where the source and target data follow the same distribution, such as text search [68], image classification [69], and face recognition [70]. Although the focus of this work is on IPS, a multiclassification problem that uses binary scores in a source domain has been proposed to alleviate the problem of sparse numerical scores in the target domain [71].
Another study highlighting HetTL for multiclass problems was performed based on the compressed sensing theory [72]. In addition, co-occurrence data was used to solve OHetTL problems [73][74][75][76]. A scheme was also explored to evaluate the relatedness between given source domains by transferred weights learned from co-occurrence data [77]. We also found that most existing studies on TL operate in offline batch learning environments, where it is assumed that all training instances of the target domain are given in advance [78,79]. However, this assumption does not hold in many real-world scenarios where collecting sufficient training data is very costly and training instances may arrive online. In the online transfer learning (OTL) problem, we aim to perform an online learning task in a target domain by leveraging knowledge from offline source data and sequentially updating a classifier based on feedback from a data sequence by processing each instance as it arrives [80,81].
To address the dynamic indoor environmental conditions, several transfer learning algorithms (TLALs) have been proposed. For example, one study assumed that the RSS value at each location can be computed as a linear aggregate of the RSS values of neighboring benchmarks for each Wi-Fi access point and that this linear relationship remains the same over different time periods [82]. It was found that a model tree was created for each access point at each site. Accordingly, it transmits the Hidden Markov Model (HMM) for a new time point [83]. It uses the neighborhood relation as a bridge to transfer the knowledge and treats the trace as a Markov chain, then the HMM is trained for the new time period. It also considers the problem of cross-device transfer learning and treats multiple devices as multiple learning tasks [84]. However, these algorithms consider either cross-time or cross-device transfer learning and none of them are applied in a real indoor wireless localization system (WILS). The OTL task has also been studied in the context of multi-armed bandits [85]. Moreover, OTL has been applied with online homogeneous source data [86,87]. For online homogeneous TL problems, an ensemble strategy was used, and for OHetTL learning problems, a multi-view approach was used. However, the studies [88,89] considered OHetTL under the assumption that the feature space of the source domain is a subset of the feature space of the target domain. Another study also used some offline data to support an online task in a target domain [88][89][90]. In this sense, the Wi-Fi-based indoor localization system aims to determine the location of the user or mobile device based on the signals received from different Wi-Fi access points (APs). In this work, the proposed model would learn based on the derived new feature spaces refined through the co-occurrence weight score of instances to predict the labeling in the target domain that could arrive in an online (OHetTL). Thus, the Wi-Fi APs send the signal and the mobile user receive the RSS values. The localization process conducted at the user's side connected to a server. Moreover, Figure 1 below explicitly describes the system operation used for RSS-based Fingerprinting of Indoor Positioning System (FBIPS).
knowledge and treats the trace as a Markov chain, then the HMM is trained for the n time period. It also considers the problem of cross-device transfer learning and treats m tiple devices as multiple learning tasks [84]. However, these algorithms consider eit cross-time or cross-device transfer learning and none of them are applied in a real ind wireless localization system (WILS). The OTL task has also been studied in the contex multi-armed bandits [85]. Moreover, OTL has been applied with online homogene source data [86,87]. For online homogeneous TL problems, an ensemble strategy w used, and for OHetTL learning problems, a multi-view approach was used. However, studies [88,89] considered OHetTL under the assumption that the feature space of source domain is a subset of the feature space of the target domain. Another study a used some offline data to support an online task in a target domain [88][89][90]. In this sen the Wi-Fi-based indoor localization system aims to determine the location of the user mobile device based on the signals received from different Wi-Fi access points (APs) this work, the proposed model would learn based on the derived new feature spaces fined through the co-occurrence weight score of instances to predict the labeling in target domain that could arrive in an online (OHetTL). Thus, the Wi-Fi APs send the sig and the mobile user receive the RSS values. The localization process conducted at user's side connected to a server. Moreover, Figure 1 below explicitly describes the syst operation used for RSS-based Fingerprinting of Indoor Positioning System (FBIPS).

Problem Formulation and Framework
Consider an indoor environment can be divided into R reference points (RP), which represent the target's location, and each RP is numbered with a label l, (l = 1, 2, . . . , L). Suppose there are P detectable Wi-Fi access points (j) in total, and the received signal strengths (RSS) of fingerprints at the lth reference point can be represented as a matrix: M l = r l J (i), r l J (i), . . . . . . , r l J (i) where r l J (i) = r l 1 (i), r l 2 (i), . . . . . . , r l P (i) T and represents the ith RSS values collected from the lth reference point of the j th access point (i = 1, 2, . . . , n; j = 1, 2, . . . , p). In the OHetTL problem, the target instances m=1 ∈ x arrive in an online manner, and the labeled source data i=1 ∈ x s × y s come from an offline source domain, where n T and n o f represent the numbers of target and source data, respectively. In the traditional machine learning (ML) techniques, both the training and testing instances be assumed to come from the same domain, and follow the same data distribution. However, in practice, training and testing data may come from different domains, which has either: • Different marginal distributions, or different feature spaces: X S = X T , or; P(X S ) = P(X T ) or • Different predictive distributions, or different label spaces: y S = y T , or, (P S (y|x) = P T (y|x)).
Several machine learning algorithms have been proposed for indoor positioning problems, but they do not consider the possible effects of signal variations that are likely to originate from the source domain to improve the model's positioning performance in the target domain. In practice, there may be multiple source domains associated with the target domain that could improve target positioning, although combining these domains is challenging (see Figure 2). The expensive data labeling effort can also be reduced by using the knowledge of the labeled source data. It is also much easier and cheaper to collect unlabeled samples in the field during the testing phase. Therefore, it is important to develop a method that reduces the effort required for offline calibration while still achieving relatively high accuracy in location determination. Figure 2 illustrates the approach of the proposed algorithm, which uses the knowledge from the source and target domains of heterogeneous feature spaces, considering the offline fingerprints of the radio map and the test fingerprints obtained during the online phase as the source and target domains, respectively. First, the proposed algorithm OHetTLAL would derive the new feature spaces based on the weighted co-occurrence of measurements of the offline and online fingerprints, and the newly derived feature spaces are called co-occurrence data. Hoping to improve the learning performance in the target domain, we then fused the offline data and the cooccurrence data to build the offline decision function. Finally, the positioning performance of the model in the target domain is improved by fusing both the offline and online decision functions. In addition, the loss function between the actual and estimated positions was evaluated to update the model. The ⊕ and ⊗ symbols used in Figure 2 represent the data fusion and the updating phase of the classifier based on the loss measurement, respectively. Figure 2 below illustrates the proposed framework of heterogenous transfer learning-based weight co-occurrence for the indoor positioning system.

Distribution Divergence Minimization
In real world scenarios, for the indoor positioning environment, the feature spaces of the sources domain and target domain are different for several practical reasons. Therefore, using the maximum mean discrepancy (MMD) criterion [91,92] distributions of the two domains was compared, and we worked to minimize the distance between the sample mean of the source and target data (reducing the marginal distribution distance of the two domains). To calculate the marginal distribution distance between the source domain and the target domain, the following formula was adopted.

Distribution Divergence Minimization
In real world scenarios, for the indoor positioning environment, the feature spaces of the sources domain and target domain are different for several practical reasons. Therefore, using the maximum mean discrepancy (MMD) criterion [91,92] distributions of the two domains was compared, and we worked to minimize the distance between the sample mean of the source and target data (reducing the marginal distribution distance of the two domains). To calculate the marginal distribution distance between the source domain and the target domain, the following formula was adopted.

Derivation of the New Source Feature Spaces
There are two major problems with OHetTL: (a) the feature spaces of the source and target domains differ. In this manner, the offline database could not be used directly to enhance learning performance, as it could result in a negative knowledge transfer that would degrade positioning prediction accuracy. The labeled training data of the target domain are limited, making it difficult to build a precise correspondence map linking the source and target domains. Thus, our first step in addressing these challenges is to develop new feature spaces that achieve positive knowledge transfer for the target's localization and are computationally efficient. In order to achieve this, we established a set of standards for constructing new feature spaces called the refined source domain (SD). Exclusion criteria were used to remove redundant knowledge from the offline database during the construction of the new source feature spaces. By excluding irrelevant RSS measurements from different Wi-Fi access points, extraneous features have no longer be included in the analysis. By analyzing the co-occurrence nature of the signal measurements or Wi-Fi access points, this procedure is not only about avoiding or minimizing all negative knowledge transfer to the target domain, but also about identifying the most significant predicting factors.

Derivation of the New Source Feature Spaces
There are two major problems with OHetTL: (a) the feature spaces of the source and target domains differ. In this manner, the offline database could not be used directly to enhance learning performance, as it could result in a negative knowledge transfer that would degrade positioning prediction accuracy. The labeled training data of the target domain are limited, making it difficult to build a precise correspondence map linking the source and target domains. Thus, our first step in addressing these challenges is to develop new feature spaces that achieve positive knowledge transfer for the target's localization and are computationally efficient. In order to achieve this, we established a set of standards for constructing new feature spaces called the refined source domain (SD). Exclusion criteria were used to remove redundant knowledge from the offline database during the construction of the new source feature spaces. By excluding irrelevant RSS measurements from different Wi-Fi access points, extraneous features have no longer be included in the analysis. By analyzing the co-occurrence nature of the signal measurements or Wi-Fi access points, this procedure is not only about avoiding or minimizing all negative knowledge transfer to the target domain, but also about identifying the most significant predicting factors.
However, we believe that the similarity between Wi-Fi access points, which is the co-occurrence of the cross-domain Wi-Fi access points, could be used in place of scrutinizing the individual RSS measurements, as the coverage area of a detected access point may approximate the target's position to some degree. The MMD criterion [91,92]  refined source domain, a metric was used to compute the co-occurrence between the offline sample and online sample, defined as: We can calculate the weight co-occurrence scores for all instances of both sides of the source and target domains as: C p mi = C 1 n T n S , C 2 n T n S , . . . , C p n T n S . The higher C(i, m) would imply that the more detected Wi-Fi APs are shared by these two domains, and the more likely this offline data is useful for the target prediction. The source domain has been refined based on the target domain to avoid negative knowledge transfer. The co-occurrence measures of feature spaces (C p mi ) were used to derive the homogeneous new feature spaces, and those features with higher weight scores were selected to construct the source domain and used as an input to train the classifier, as it could improve the target's location prediction. The pseudo code to construct the source domain is provided in Algorithm 1. for m = 1, 2, . . . , n T do 4.
end for 16. end for 17. end for 18. Train a classifier from X S_h and y S by adjusting weights of source samples W mi 19. Estimate y T on X T_h by applying the trained classifier f ((X s_h , y S ), W mi ) 20. return X S , X T , y T , W mi

The Objective Function
In OHetTL, the objective function could be affected by several factors, because the environment is inherently heterogeneous. The factors of time period and devices used to collect the signal measurements are not uniform due to the dynamic environment of indoor positioning, and if the data are unbalanced, especially in binary classification, the distribution may be different. Therefore, the proposed solution is to minimize the objective function in a new feature space so that the prior knowledge from the corresponding source domain can be transferred to the target domain. The problem is formulated as a constrained optimization, as can be observed in Equation (3) below. The transformation of the feature spaces in the source domain was based on the attributes in the target domain, where the performance of the classifier helps to improve the prediction of the position of the mobile device or user. Therefore, the new feature spaces were used to minimize the difference in the received signal strength so that the transfer weights can be estimated, as follows: The weights constraint is to better minimize the signal measurements of differences between the samples. The equality constraint mentioned in the objective function would assign higher weights to the most related source samples and the lessor weights would be assigned to the less related source samples. The weights are therefore updated as follows. To solve the above constrained optimization problem, Equation (3) can be converted into an unconstrained optimization using the Lagrangian multiplier method. Let us assume that we have obtained a location estimate z (t−1) at the (t−1)th iteration, and now we need to estimate the weights w(t) at the tth iteration. By applying the Lagrangian multiplier method, we obtain: where α is the Lagrangian multiplier. By letting the partial derivative of the Lagrangian with respect to w im and α to be zeros, we obtain: By adding up the first n s terms in Equation (5), we can obtain Substituting Equation (6) into Equation (5) gives the estimated transfer weight as:

Prediction Evaluation Metrics
In this section, effectiveness of the proposed algorithm was measured with different machine learning algorithms taken as baselines and verified through extensive experimentations based on real world datasets. The root mean square error (RMSE) was used to evaluate the effectiveness of the proposed algorithm and defined as: where [x m ,ŷ m ] T and [x, y] T are the predicted location estimate and the true location of a client of the 2-dimensional coordinates of the mth positioning sample, respectively. n is the total number of samples to be located in the target domain.

Experimental Results and Discussion
In this section, two real-world experiments were carried out to evaluate our proposed algorithm. Experimental settings and datasets were first presented, and the overall performance of the classifiers are analyzed. Both feature spaces of scenarios were considered (before and after transformation).

Dataset A
The experiment was conducted at UESTC (University of Electronic Science and Technology of China) as depicted in Figure 3 and the 21st floor of the Innovation building was taken as the indoor environment site consisting of 10 offices and one corridor covers an area of 1460 m 2 and partitioned into 210 reference points (RP), which represent the target's location, and each RP is numbered with a label l, (l = 1, 2, . . . , L). n o f and n ol are the number of offline and online instances collected during the offline training phases and the online testing phase, respectively. The RSS of fingerprints at the lth RP can be represented as a matrix: M l = r l J (i), r l J (i), . . . . . . , r l J (i) . Smartphones were used for the data collection at the deployment area.

Dataset B
Another real-office experiment at UESTC (University of Electronic Science and Technology of China) on the 21 st floor of the Innovation Building was also conducted in different environmental settings to validate our results, as shown in Figure 4. It covers an area of 1460 m 2 and partitioned into 175 reference points (RP). Nine Wi-Fi access points were sparsely deployed to guarantee that at least three access points are detectable at each RP. Moreover, Table 1 below provides the summary of the datasets used to evaluate the algorithms.

Dataset B
Another real-office experiment at UESTC (University of Electronic Science and Technology of China) on the 21 st floor of the Innovation Building was also conducted in different environmental settings to validate our results, as shown in Figure 4. It covers an area of 1460 m 2 and partitioned into 175 reference points (RP). Nine Wi-Fi access points were sparsely deployed to guarantee that at least three access points are detectable at each RP.
 Smartphones were used for the data collection at the deployment area.

Dataset B
Another real-office experiment at UESTC (University of Electronic Science and Technology of China) on the 21 st floor of the Innovation Building was also conducted in different environmental settings to validate our results, as shown in Figure 4. It covers an area of 1460 m 2 and partitioned into 175 reference points (RP). Nine Wi-Fi access points were sparsely deployed to guarantee that at least three access points are detectable at each RP. Moreover, Table 1 below provides the summary of the datasets used to evaluate the algorithms.  Moreover, Table 1 below provides the summary of the datasets used to evaluate the algorithms.  Tables 2 and 3 show the distributions of RSS measurements of the offline and online databases, respectively. There were 416 Wi-Fi access points and 15,660 instances were measured from each Wi-Fi access point. The mean RSS values for AP_0, AP_1, AP_2, AP_3, and AP_4 were −59.4, −64.9, −70.1, −74.2, and −73.3, respectively. A standard deviation of 16.8, 15.9, 14.5, 17.9, and 17.1 was also measured for the signal fluctuations. The mean RSS values for AP_412, AP_413, AP_414, and AP_415 are almost all −100, which means that the base stations are not detected at all ( Table 2). The unique pattern for these Wi-Fi access points with a score of −100 is that the measured signal variations have a standard deviation of 0.11, 0.43, 0.15, and 0.27, while no signal was detected in any of these specified base stations. The lowest values of standard deviation observed in the above scenario suggest a higher consistency of measurements, even though the standard deviation (sd) can be surprisingly zero for some Wi-Fi access points. However, these base stations (with the lowest sd) were not detected on any mobile device. Such a value (−100) means that no base station was detected during the training and testing phase and we excluded it from the training model. Otherwise, it would be a negative knowledge transfer that either falsely inflate (both false positive and false negative predictions could be high) or degrade the target's location prediction. Alternatively, the relationship between signal and location might not be characterized.     Tables 4 and 5 show the distributions of RSS measurements of the derived new features of the source domain and online databases, respectively. Based on the weighted co-occurrence factor, 41 feature spaces were selected to train the classifiers for further location prediction of the target. After creating the new feature spaces, a similar pattern was observed for the Wi-Fi Aps, which had a score of −100, and the signal fluctuations were also inflated to a standard deviation of 16.7, 15.8, 14.5, 17.9 and 14.5 for AP_0, AP_1, AP_2, AP_3, and AP_4, respectively. The higher values of standard deviation that were observed indicate less consistency in the measurements, although the standard deviation may be too large in relative terms for some Wi-Fi access points (with a higher weighting of co-occurrence). Nevertheless, the base stations with a higher weighting of co-occurrence are the ones that have active signal responses to mobile devices in both phases.     Table 6 shows that the number of measurements from each reference point is different and shows that the distribution is unbalanced for dataset A, so feature scaling was considered to avoid the effect of dominance of the larger occurrence of labels within the cluster. Otherwise, the larger features would dominate the others within the cluster. On the other side, the number of measurements collected from each RP demonstrates that the label distribution for dataset B is balanced. Thus, no feature scaling technique is required to avoid the dominance effect of the cluster's higher label occurrence.

Comparative Analysis of Methods
In this section, we present the comparative analysis of the positioning performance of the proposed indoor positioning-based RSS fingerprinting algorithm. Two real-world datasets were used to compare the proposed algorithm with the most commonly used machine learning algorithms (ML) that have achieved promising results in the literature. Our proposed model was compared with seven machine learning algorithms (Decision Tree (DT), K-nearest neighbor (KNN), Support Vector Machine (SVC), Logistic Regression (LR), Random Forest (RF), Gaussian Mixture Model (GMM), and Neural Network (MLP)). Data were preprocessed, and irrelevant feature spaces were excluded from further analysis based on their weighted co-occurrence scores to avoid negative knowledge transfer. Descriptive analysis was also performed for both datasets of A and B to enable us to provide an insight about the data distributions.

Dataset A
The original number of feature spaces was 416 for dataset A. However, based on the co-occurrence weight scores and exclusion criteria, it was determined that several feature spaces were irrelevant to the analysis because no signal was detected in any of the specified access points. In addition, the Wi-Fi-received signal strength measurements that had the lowest co-occurrence weights for the offline and online source instances were excluded from the prediction task because they did not affect the localization estimate. Figure 5 shows that the RSS measurements for the different Wi-Fi access points have a certain pattern and follow different distributions for the different labels of Wi-Fi AP1 (AP_1). It mainly shows that the relationship between signal and location has been characterized and follows the unique pattern for a particular label. This shows that the signal variations of the received Wi-Fi signal strengths at a particular reference point have been measured with different distribution values, such as the standard deviation and the average values of the received signals. The unique pattern for the measurements collected by the Wi-Fi AP seems to have different values for the signal-to-location relationships of the different target points. In addition to that, there are some noticeable outlier values (as observed in Figure 5 for target points 1, 76, and 191) that may affect the distribution of the values or have a negative impact on the location accuracy (both false positive and false negative predictions could be high) or degrade the location prediction of the target point. Therefore, the relationship between signal and location could not be characterized, as well as the outliers could affect the location performance. Moreover, to better visualize the pattern of the signal distributions in a reference point, some portions of the labels have been extracted, as shown in Figure 6. Descriptive analysis was also performed for both datasets of A and B to enable us to provide an insight about the data distributions.

Dataset A
The original number of feature spaces was 416 for dataset A. However, based on the co-occurrence weight scores and exclusion criteria, it was determined that several feature spaces were irrelevant to the analysis because no signal was detected in any of the specified access points. In addition, the Wi-Fi-received signal strength measurements that had the lowest co-occurrence weights for the offline and online source instances were excluded from the prediction task because they did not affect the localization estimate. Figure 5 shows that the RSS measurements for the different Wi-Fi access points have a certain pattern and follow different distributions for the different labels of Wi-Fi AP1 (AP_1). It mainly shows that the relationship between signal and location has been characterized and follows the unique pattern for a particular label. This shows that the signal variations of the received Wi-Fi signal strengths at a particular reference point have been measured with different distribution values, such as the standard deviation and the average values of the received signals. The unique pattern for the measurements collected by the Wi-Fi AP seems to have different values for the signal-to-location relationships of the different target points. In addition to that, there are some noticeable outlier values (as observed in Figure 5 for target points 1, 76, and 191) that may affect the distribution of the values or have a negative impact on the location accuracy (both false positive and false negative predictions could be high) or degrade the location prediction of the target point. Therefore, the relationship between signal and location could not be characterized, as well as the outliers could affect the location performance. Moreover, to better visualize the pattern of the signal distributions in a reference point, some portion of the labels have been extracted, as shown in Figure 6.    (AP_4). It shows that the 'signal-to-location' relationship has been characterized and follows the unique pattern for the specific label. In other words, the signal variations of the Wi-Fi received signal strengths at a reference point were measured to have different distribution values, including the standard deviation and the average values of the received signals. The unique pattern for the measurements collected by the Wi-Fi AP also appears to have different values for the signal-to-location relationship of the different target points. Moreover, there are some noticeable outlier values (as shown in Figure 8 for target 1) that seem to have extremely fluctuating values at the time of measurement, which could affect the location determination of the target. Therefore, the relationship between signal and location could not be characterized because the outliers could affect the position estimation.            Figure 9 illustrates whether the Wi-Fi signal strength received at a grid point from multiple APs was assumed to be independent such that the Wi-Fi signals transmitted from different APs are transmitted independently and do not interfere with each other. There fore, the distributions of the RSS measurements for pairs of Wi-Fi access points, especially after transformation techniques were being carried out, show a different pattern of finger prints for a mobile user at a specific reference point. It shows that the RSS measurements have specific pattern for the different Wi-Fi APs of AP_1 and AP_2 and follows a differen distribution of RSS values for the different labels of an access point (dissimilarity). More over, the 'signal-to-location' relationship was being characterized or studied. The distri butions of the RSS values measured at a particular grid point are so dynamic in nature due to the inherent heterogeneity of indoor environment such that there is no guarantee that the signal fluctuations are to be represented by a single value at a specific position.    Figure 10 shows that the RSS measurements have a specific pattern for the differen Wi-Fi APs of AP_7 and AP_8 and follows a different distribution of RSS values for th different labels of an access point. Moreover, the 'signal-to-location' relationship was be ing characterized, and the distributions of the RSS values measured at a particular gri point are dynamic in nature due to the inherent heterogeneity of the indoor environmen such that there is no guarantee that the signal fluctuations are represented by a singl value at a specific position.  Figure 11a,b show the distributions of RSS measurements of the offline and onlin databases for dataset A, respectively. There were 416 Wi-Fi access points and 15,660 in stances were measured from each Wi-Fi access point; the density distribution of the RS measurements clearly demonstrate that the mean RSS values were observed to be −10 which means that instances were not received, as the majority of the Wi-Fi APs are no detected for the associated target points. The unique pattern for these Wi-Fi access poin with an average score of −100 is that the measured signal variations have the lowest value of standard deviations, while no signal was detected in any of these specified APs. This what the number tells, nevertheless, it is untrue. However, these APs (with the lowe standard deviation) were not detected on any mobile device during the training and tes ing phase, and we excluded it from the training model to avoid a negative knowledg transfer that could either falsely inflate (both false positive and false negative prediction could be high) or degrade the target's position prediction. This is because we cannot tal about the signal distribution of reference points while the associated mobile devices coul not receive any signal. Moreover, Figure 11c,d show that the distributions of RSS mea urements of the derived new features of the source domain and online databases, respec tively. Based on the weighted co-occurrence factor, a new refined source domain was con structed to train the classifiers for the position estimation of the target. One can observe a the density distributions of the refined source domain and target domain that the derive feature spaces are from different distributions with different values of parameters, henc the instances from the offline and online phases are heterogeneous.  Figure 11a,b show the distributions of RSS measurements of the offline and online databases for dataset A, respectively. There were 416 Wi-Fi access points and 15,660 instances were measured from each Wi-Fi access point; the density distribution of the RSS measurements clearly demonstrate that the mean RSS values were observed to be −100, which means that instances were not received, as the majority of the Wi-Fi APs are not detected for the associated target points. The unique pattern for these Wi-Fi access points with an average score of −100 is that the measured signal variations have the lowest values of standard deviations, while no signal was detected in any of these specified APs. This is what the number tells, nevertheless, it is untrue. However, these APs (with the lowest standard deviation) were not detected on any mobile device during the training and testing phase, and we excluded it from the training model to avoid a negative knowledge transfer that could either falsely inflate (both false positive and false negative predictions could be high) or degrade the target's position prediction. This is because we cannot talk about the signal distribution of reference points while the associated mobile devices could not receive any signal. Moreover, Figure 11c,d show that the distributions of RSS measurements of the derived new features of the source domain and online databases, respectively. Based on the weighted co-occurrence factor, a new refined source domain was constructed to train the classifiers for the position estimation of the target. One can observe at the density distributions of the refined source domain and target domain that the derived feature spaces are from different distributions with different values of parameters, hence the instances from the offline and online phases are heterogeneous.  Table 7 illustrates the predictive power of different predictive modeling techniques that have been used to evaluate and predict the target location of a mobile device based on the received signal strengths from different Wi-Fi APs. As shown in Table 6 for dataset A, the number of new feature spaces used for the analysis is 41 after the feature transformation was being carried out. In this case, the measurements coming from the remaining Wi-Fi access points were not included in the analysis because they could negatively affect the positioning task. This is because the co-occurrence weights of these Wi-Fi access points from both domains of the instances are insignificant, so including them would falsely inflate our positioning accuracy. Our experimental results demonstrate that the proposed algorithm (OHetTLAL) is the best algorithm with the lowest root mean square error (RMSE). This indicates that our proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from the related source domain to the target domain. Not only can a positive knowledge transfer from the source domain to the prediction task be observed, but also the computational cost and memory usage were effectively improved by excluding irrelevant features from the analysis. On the other hand, the Gaussian mixture model has achieved the lowest positioning accuracy and consumes a high computational cost.   Table 7 illustrates the predictive power of different predictive modeling techniques that have been used to evaluate and predict the target location of a mobile device based on the received signal strengths from different Wi-Fi APs. As shown in Table 6 for dataset A, the number of new feature spaces used for the analysis is 41 after the feature transformation was being carried out. In this case, the measurements coming from the remaining Wi-Fi access points were not included in the analysis because they could negatively affect the positioning task. This is because the co-occurrence weights of these Wi-Fi access points from both domains of the instances are insignificant, so including them would falsely inflate our positioning accuracy. Our experimental results demonstrate that the proposed algorithm (OHetTLAL) is the best algorithm with the lowest root mean square error (RMSE). This indicates that our proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from the related source domain to the target domain. Not only can a positive knowledge transfer from the source domain to the prediction task be observed, but also the computational cost and memory usage were effectively improved by excluding irrelevant features from the analysis. On the other hand, the Gaussian mixture model has achieved the lowest positioning accuracy and consumes a high computational cost.  Table 8 shows that experiments were also conducted for the indoor practical positioning dataset (Dataset A), and the position estimation accuracy of the different predictive modeling techniques was evaluated to train and estimate the location of a mobile device when Wi-Fi signal strengths were received from different Wi-Fi access points, with the dimensions of the new feature spaces reduced to 15 Wi-Fi access points. The measurements that came from the 26 Wi-Fi access points were also excluded from the analysis because the co-occurrence weight scores of these access points from both domains of the instances were insignificant, so including them would spuriously increase our position accuracy. Similarly, the proposed model was compared with seven machine learning algorithms (decision tree (DT), K nearest-neighbor (KNN), support vector machine (SVC), logistic regression (LR), random forest (RF), Gaussian Mixture Model (GMM), and neural network (MLP)) to see the impact after excluding irrelevant feature domains from the analysis based on their weight scores, hoping to avoid negative knowledge transfer. In addition, those measurements of Wi-Fi received signal strength that had the lowest co-occurrence weight values for the instances of offline and online sources were excluded from the prediction task because they had no impact on the positioning estimation. Moreover, insignificant and redundant features would further affect the computational cost and memory requirements of the algorithm. As shown in Table 8 for dataset A, the number of new feature spaces used for the analysis is 15. Thus, our experimental results demonstrate that the proposed algorithm is the best algorithm with the lowest root mean square error. On the other hand, the Gaussian mixture model (GMM) has obtained the highest error in positioning and consumes a huge computational cost. This indicates that our proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from the source domain to the target domain, and that transferring knowledge from the source domain to the prediction task positively affects the positioning accuracy.  Figure 12 confirms that the proposed algorithm is the best algorithm with the lowest root mean squared error for dataset A with 41 feature spaces. In this case, the measurements collected from the remaining Wi-Fi access points were not included in the analysis because they could negatively affect the prediction task. The coincidence weights of these Wi-Fi access points from both domains of instances were found to be insignificant, so including them would spuriously increase the positioning accuracy. All classifiers achieved minimal RMSE even after applying transfer learning. This indicates that our proposed algorithm succeeded in improving the positioning accuracy by transferring knowledge from the corresponding source domain to the target domain. It can also be observed that positive knowledge from the source domain was transferred to the prediction task. On the other hand, the Gaussian mixture model also achieved the lowest positioning accuracy. including them would spuriously increase the positioning accuracy. All classifiers achieved minimal RMSE even after applying transfer learning. This indicates that our proposed algorithm succeeded in improving the positioning accuracy by transferring knowledge from the corresponding source domain to the target domain. It can also be observed that positive knowledge from the source domain was transferred to the prediction task. On the other hand, the Gaussian mixture model also achieved the lowest positioning accuracy.  Figure 13 shows that the proposed algorithm was the best algorithm with the lowest root mean squared error for dataset A with 15 feature spaces. In this case, the measurements that came from the 26 Wi-Fi access points were also excluded from the analysis because the co-occurrence weight scores of these access points from both domains of the instances were insignificant, so including them would falsely inflate our positioning accuracy. Similarly, all classifiers achieved a minimal RMSE after transfer learning was applied, although slight differences were observed, as some instances were able to falsely improve the RMSE. This indicates that our proposed algorithm succeeded in improving the positioning accuracy by transferring knowledge from related source domains to the target domain. It can be observed that positive knowledge from the source domain was transferred to the prediction task. On the other hand, the Gaussian mixture model also achieved the lowest positioning accuracy.  Figure 13 shows that the proposed algorithm was the best algorithm with the lowest root mean squared error for dataset A with 15 feature spaces. In this case, the measurements that came from the 26 Wi-Fi access points were also excluded from the analysis because the co-occurrence weight scores of these access points from both domains of the instances were insignificant, so including them would falsely inflate our positioning accuracy. Similarly, all classifiers achieved a minimal RMSE after transfer learning was applied, although slight differences were observed, as some instances were able to falsely improve the RMSE. This indicates that our proposed algorithm succeeded in improving the positioning accuracy by transferring knowledge from related source domains to the target domain. It can be observed that positive knowledge from the source domain was transferred to the prediction task. On the other hand, the Gaussian mixture model also achieved the lowest positioning accuracy. Figure 14 shows the comparison of the accuracy of indoor positioning classifiers applied to dataset A with new feature spaces with 46 and 15 dimensions. It was found that the proposed algorithm has the lowest root mean square error, i.e., 32% fewer mislocalizations for mobile devices using the 41 new derived features. This indicates that the proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from related source domains to the target domain. It can be observed that not only a positive knowledge from the source domain transferred to the prediction task but also the computational cost and memory consumption were effectively improved by excluding irrelevant features from the analysis. However, the Gaussian Mixture Model (GMM) was found to have the lowest accuracy in indoor fingerprint positioning. It leads to mislocalization of mobile devices about 133% of the time and consumes huge computational costs. On the other side, the accuracy of indoor positioning classifiers applied to dataset A with the reduced dimension of 15 features has also been evaluated. It was found that the proposed algorithm has the lowest root mean square error, i.e., predicts 25% fewer wrongly mobile devices. This indicates that the proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from related source domains to the target domain. It can be observed that not only positive knowledge from the source domain was transferred to the prediction task, but also the computational cost and memory usage were effectively improved by excluding irrelevant features from the analysis. One can observe the visible differences in positioning accuracy for both scenarios with different dimensions of 41 and 15 feature spaces. Those with 41 feature spaces seem to have better positioning accuracy; however, there are some features whose weight values are almost zero. Moreover, this confirms the claim that the measurements coming from the 26 access points that were excluded from the analysis due to the weighting values for the co-occurrence of these access points from both ranges of instances were insignificant, so the inclusion of these instances would falsely increase our positioning accuracy. On the other hand, the Gaussian mixture model (GMM) was found to have the lowest accuracy in locating indoor objects based on fingerprints. In about 137% of the cases, mobile devices are predicted incorrectly and high computational costs are incurred.  Figure 14 shows the comparison of the accuracy of indoor positioning classifiers applied to dataset A with new feature spaces with 46 and 15 dimensions. It was found that the proposed algorithm has the lowest root mean square error, i.e., 32% fewer mis-localizations for mobile devices using the 41 new derived features. This indicates that the proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from related source domains to the target domain. It can be observed that not only a positive knowledge from the source domain transferred to the prediction task but also the computational cost and memory consumption were effectively improved by excluding irrelevant features from the analysis. However, the Gaussian Mixture Model (GMM) was found to have the lowest accuracy in indoor fingerprint positioning. It leads  Figure 15 illustrates the impact of different feature spaces used to train indoor p tioning classifiers using dataset A with dimensions of 41 and 15, respectively. It was fo that the proposed algorithm has the lowest root mean square error, i.e., 32% and 25% error in positioning the mobile device for both scenarios. This indicates that the prop algorithm has succeeded in improving the positioning accuracy by transfer knowledge from the source domain to the target domain. It can be observed that not positive knowledge from the source domain was transferred to the prediction task also the computational cost and memory consumption were effectively improved by cluding irrelevant features from the analysis. The difference in positioning accuracy tween the two scenarios is about 7%. This difference led us to the important finding all classifiers achieved lower RMSE both before and after transfer learning for the feature spaces of dimension 41. However, the contributions of these features to target diction were not as significant (since their co-occurrence weights of those 26 features w almost negligible) as for the feature spaces with dimension 15. This could show u searchers that a model with a lower RMSE does not mean that it is always the most sistent model, but that we need to examine the parameters for their negative effect target prediction.  Figure 15 illustrates the impact of different feature spaces used to train indoor positioning classifiers using dataset A with dimensions of 41 and 15, respectively. It was found that the proposed algorithm has the lowest root mean square error, i.e., 32% and 25% less error in positioning the mobile device for both scenarios. This indicates that the proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from the source domain to the target domain. It can be observed that not only positive knowledge from the source domain was transferred to the prediction task, but also the computational cost and memory consumption were effectively improved by excluding irrelevant features from the analysis. The difference in positioning accuracy between the two scenarios is about 7%. This difference led us to the important finding that all classifiers achieved lower RMSE both before and after transfer learning for the new feature spaces of dimension 41. However, the contributions of these features to target prediction were not as significant (since their co-occurrence weights of those 26 features were almost negligible) as for the feature spaces with dimension 15. This could show us researchers that a model with a lower RMSE does not mean that it is always the most consistent model, but that we need to examine the parameters for their negative effects on target prediction.

Dataset B
The original number of feature spaces was nine for dataset B. Figure 16

Dataset B
The original number of feature spaces was nine for dataset B. Figure 16 illustrates the distributions of the received signal strength measurements for a given Wi-Fi access point with their corresponding target values. It shows that the RSS measurements have a specific pattern for the different Wi-Fi access points and follow a different distribution of RSS values for the different labels of Wi-Fi AP one (AP_1) of dataset B. This shows that the signal variations of the Wi-Fi-received signal strengths were measured at a reference point and have different distribution values, including the standard deviation and the average value of the received signals. The unique pattern for the measurements collected from Wi-Fi AP (AP_1) also appears to have different values for the signal-to-location relationships of the different target points. Beyond that, there are no noticeable outlier values (as can be observed in Figure 15, except for the RSS values of −90, which demonstrate that the corresponding access points were not detected by any mobile devices). Figure 17 shows that the RSS measurements have a specific pattern for the different Wi-Fi access points and follow a different distribution of RSS values for the different labels of Wi-Fi AP_3 of dataset B. This shows that the signal variations of the Wi-Fi-received signal strengths were measured at a reference point and have different distribution values including the standard deviation and the average values of the received signals. The unique pattern for the measurements collected by the Wi-Fi AP_3 also appears to have different values for the relationships between the signal locations of the different target points. In addition, there are no noticeable outlier values (as can be observed in Figure 16, except for the RSS values of −90, which demonstrate that the corresponding APs were not detected by any mobile devices).       Figure 18 shows that the RSS measurements for the different pair of access points have a certain pattern and follow the different distribution of RSS values for the different labels of an access point (dissimilarity). It is further described that the Wi-Fi signal strength received at a network point by multiple APs is assumed to be independent so that the Wi-Fi signals sent by different APs are transmitted independently and do not interfere with each other. Therefore, the distributions of RSS measurements for pairs of Wi-Fi access points demonstrate a different pattern of fingerprint for a mobile device at a given reference point even after performing transformation techniques. It can be observed that the RSS measurements for the different Wi-Fi access points of AP_3 and AP_7 have a specific pattern and follow the different distribution of RSS values for the different labels  Figure 18 shows that the RSS measurements for the different pair of access points have a certain pattern and follow the different distribution of RSS values for the different labels of an access point (dissimilarity). It is further described that the Wi-Fi signal strength received at a network point by multiple APs is assumed to be independent so that the Wi-Fi signals sent by different APs are transmitted independently and do not interfere with each other. Therefore, the distributions of RSS measurements for pairs of Wi-Fi access points demonstrate a different pattern of fingerprint for a mobile device at a given reference point even after performing transformation techniques. It can be observed that the RSS measurements for the different Wi-Fi access points of AP_3 and AP_7 have a specific pattern and follow the different distribution of RSS values for the different labels of Wi-Fi access points (dissimilarity). Thus, the relationship between signal and location has been characterized or studied and shows that the distributions of RSS values measured at a given network point are dynamic due to the inherent heterogeneity of the indoor environment. Thus, there is no guarantee that the signal variations at a particular location are represented by a single value. Figure 19 describes whether the Wi-Fi signal strength received at a grid point from multiple APs can be assumed to be independent. Thus, the Wi-Fi signals transmitted by different APs are transmitted independently and do not interfere with each other. Therefore, the distributions of RSS measurements for pairs of Wi-Fi APs still follow a different pattern of fingerprint distribution for a mobile user at a given reference point after applying transformation techniques. It also shows that the RSS measurements for the different Wi-Fi access points of AP_8 and AP_9 have distinct patterns and follow the different distribution of RSS values for the different labels of a Wi-Fi access point. The "signal-to-location" relationship was established and further conveys that the distributions of RSS values measured at a given grid point are dynamic, and therefore there is no guarantee that the signal variations are represented by a single value at a given location.
of Wi-Fi access points (dissimilarity). Thus, the relationship between signal and location has been characterized or studied and shows that the distributions of RSS values measured at a given network point are dynamic due to the inherent heterogeneity of the indoor environment. Thus, there is no guarantee that the signal variations at a particular location are represented by a single value.  Figure 19 describes whether the Wi-Fi signal strength received at a grid point from multiple APs can be assumed to be independent. Thus, the Wi-Fi signals transmitted by different APs are transmitted independently and do not interfere with each other. Therefore, the distributions of RSS measurements for pairs of Wi-Fi APs still follow a different pattern of fingerprint distribution for a mobile user at a given reference point after applying transformation techniques. It also shows that the RSS measurements for the different Wi-Fi access points of AP_8 and AP_9 have distinct patterns and follow the different distribution of RSS values for the different labels of a Wi-Fi access point. The "signal-to-location" relationship was established and further conveys that the distributions of RSS values measured at a given grid point are dynamic, and therefore there is no guarantee that the signal variations are represented by a single value at a given location.  Table 9 shows extensive experiments applied to real-world indoor positioning scenarios using dataset B. The original number of feature spaces was nine for dataset B. Nevertheless, experimental results confirmed that the proposed algorithm is the best algorithm with the minimum root mean square error for dataset B, regardless of which feature spaces were fully considered in the analysis. Since the number of feature spaces for dataset  Table 9 shows extensive experiments applied to real-world indoor positioning scenarios using dataset B. The original number of feature spaces was nine for dataset B. Nevertheless, experimental results confirmed that the proposed algorithm is the best algorithm with the minimum root mean square error for dataset B, regardless of which feature spaces were fully considered in the analysis. Since the number of feature spaces for dataset B is very small, we could not see or feel the actual effect of irrelevant features. However, based on our given scenarios, some redundant features need to be handled to minimize both computational cost and memory requirements. Experimental results have demonstrated that all classifiers achieved a minimum RMSE after applying transfer learning. This indicates that our proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from the source domain to the target domain. However, there is no clear evidence against the fact that negative knowledge was not transferred from the source domain to the target domain. In this case, the measurements that came from all Wi-Fi APs were not refined or reviewed to determine whether or not they should be used for location analysis, because they could negatively affect the location process. On the other hand, the Gaussian mixed model also achieved the lowest location accuracy and consumed higher computational costs.  Table 10 shows detailed experiments for real-world indoor positioning scenarios using dataset B. Here, we evaluated the positioning accuracy of the different predictive modeling techniques to estimate the target position of a mobile device based on the received signal strengths from different Wi-Fi APs. The proposed model was compared with seven machine learning algorithms. The data were preprocessed and irrelevant feature spaces were excluded from further analysis based on their weighted co-occurrence score to avoid negative knowledge transfer. The original number of feature spaces was nine for dataset B before feature transformation was being carried out. However, based on the co-occurrence weight scores and exclusion criteria, three feature spaces were deemed irrelevant for analysis because the co-occurrence weight scores were zero in each of the specified base stations. In addition, the Wi-Fi-received signal strength measurements that achieved the lowest co-occurrence weight scores for the offline and online source instances were excluded from the prediction task because they would have no effect on the position estimation. Accordingly, as shown in Table 9 for dataset B, only six Wi-Fi APs are used as new feature spaces to train the classifiers. In this case, the measurements that came from the remaining Wi-Fi APs were not included in the analysis because they could have a negative impact on the positioning process, since the co-occurrence weight scores of these APs from both domains of the instances were insignificant, and therefore including them would falsely inflate our positioning accuracy. The experimental results confirm that the proposed algorithm is also the best algorithm for dataset B with the lowest root mean square error. This indicates that the proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from related source domains to the target domain. It can be observed that not only a positive knowledge was transferred from the source domain to the prediction task, but also the computational cost and memory usage were effectively improved by excluding irrelevant features from the analysis. On the other hand, the Gaussian mixture model has achieved the lowest positioning accuracy and consumes a high computational cost.  Figure 20 shows that the proposed algorithm is the best algorithm with the lowest mean square error for dataset B, with nine Wi-Fi APs. On the other hand, the Gaussian mixture model has also achieved the lowest positioning accuracy. All classifiers have achieved minimum RMSE after applying transfer learning. This indicates that our proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from the source domain to the target domain. However, there is no strong evidence against the fact that a negative knowledge was not transferred from the source domain to the target domain. In this case, the measurements coming from all access points were not refined or analyzed to determine whether or not they should be used for location analysis because they could negatively impact the positioning process.  Figure 21 shows that the proposed algorithm was the best algorithm with the lowest mean square error for dataset B with six feature spaces. In this case, the measurements coming from the three Wi-Fi APs were not included in the analysis because they could negatively affect the prediction task. This is because the coincidence weights of these Wi-Fi access points from both domains are close to zero, so including them would affect the positioning performance. Similarly, all classifiers achieved minimum RMSE after applying transfer learning. This indicates that our proposed algorithm has succeeded in improving the positioning performance by transferring knowledge from the corresponding source domain to the target domain. It can be observed that positive knowledge from the source domain was transferred to the prediction task. On the other hand, the Gaussian mixture model also achieved the lowest positioning accuracy.  Figure 21 shows that the proposed algorithm was the best algorithm with the lowest mean square error for dataset B with six feature spaces. In this case, the measurements coming from the three Wi-Fi APs were not included in the analysis because they could negatively affect the prediction task. This is because the coincidence weights of these Wi-Fi access points from both domains are close to zero, so including them would affect the positioning performance. Similarly, all classifiers achieved minimum RMSE after applying transfer learning. This indicates that our proposed algorithm has succeeded in improving the positioning performance by transferring knowledge from the corresponding source domain to the target domain. It can be observed that positive knowledge from the source domain was transferred to the prediction task. On the other hand, the Gaussian mixture model also achieved the lowest positioning accuracy. Figure 21 shows that the proposed algorithm was the best algorithm with the lowest mean square error for dataset B with six feature spaces. In this case, the measurements coming from the three Wi-Fi APs were not included in the analysis because they could negatively affect the prediction task. This is because the coincidence weights of these Wi-Fi access points from both domains are close to zero, so including them would affect the positioning performance. Similarly, all classifiers achieved minimum RMSE after applying transfer learning. This indicates that our proposed algorithm has succeeded in improving the positioning performance by transferring knowledge from the corresponding source domain to the target domain. It can be observed that positive knowledge from the source domain was transferred to the prediction task. On the other hand, the Gaussian mixture model also achieved the lowest positioning accuracy.   Figure 22 describes the comparison of the accuracy of indoor positioning classifiers applied to dataset B with nine feature spaces. It was found that the proposed algorithm has the lowest root mean square error, i.e., 33% less mis-localized mobile objects. This indicates that the proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from related source spaces to the target area. However, there is no clear evidence against the fact that negative knowledge was not transferred from the source to the target domain. In this case, the measurements that came from all Wi-Fi access points were not refined or reviewed to determine whether or not they should be used for location analysis, because they could negatively affect the positioning process. The Gaussian mixture model (GMM), however, was found to have the lowest accuracy for indoor fingerprint-based location. In about 86% of the cases, mobile devices are incorrectly localized and high computational costs are incurred. Moreover, the comparison of the accuracy of indoor positioning classifiers applied to dataset B have also been evaluated with six feature spaces. It was found that the proposed algorithm has the lowest mean square error, i.e., 22% fewer mis-localizations for mobile objects. This indicates that the proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from related source domains to the target domain. It can be observed that not only a positive knowledge was transferred from the source domain to the prediction task, but also the computational cost and memory usage were effectively improved by excluding irrelevant features from the analysis. Similarly, the Gaussian mixture model (GMM) was found to have the lowest accuracy for indoor fingerprint-based location. In about 96% of the cases, it results in the incorrect positioning of mobile devices and high computational complexity. square error, i.e., 22% fewer mis-localizations for mobile objects. This indicates that the proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from related source domains to the target domain. It can be observed that not only a positive knowledge was transferred from the source domain to the prediction task, but also the computational cost and memory usage were effectively improved by excluding irrelevant features from the analysis. Similarly, the Gaussian mixture model (GMM) was found to have the lowest accuracy for indoor fingerprint-based location. In about 96% of the cases, it results in the incorrect positioning of mobile devices and high computational complexity.

Figure 22.
Comparison of performance of the classifiers for dataset B with 9 and 6 APs. Figure 23 illustrates the effects of different dimensions of feature spaces used for training indoor positioning classifiers, using dataset B with a dimension of 9 and 6, respectively. It was found that the proposed algorithm has the lowest root mean square error, i.e., 1.25 m and 1.42 m less mis-localizations of mobile devices for both scenarios. This indicates that the proposed algorithm has succeeded in improving the location accuracy by transferring knowledge from the source domain to the target domain, where the difference in location accuracy between the two scenarios is about 11%. This difference Figure 22. Comparison of performance of the classifiers for dataset B with 9 and 6 APs. Figure 23 illustrates the effects of different dimensions of feature spaces used for training indoor positioning classifiers, using dataset B with a dimension of 9 and 6, respectively. It was found that the proposed algorithm has the lowest root mean square error, i.e., 1.25 m and 1.42 m less mis-localizations of mobile devices for both scenarios. This indicates that the proposed algorithm has succeeded in improving the location accuracy by transferring knowledge from the source domain to the target domain, where the difference in location accuracy between the two scenarios is about 11%. This difference leads us to an important finding, namely that all classifiers achieved lower RMSE both before and after transfer learning for the new feature spaces of dimension 9. However, there is no strong evidence that negative knowledge was not transferred from the source to the target domain. In this case, especially for the scenario of dataset A with nine feature spaces, the measurements coming from all Wi-Fi access points were not refined or reviewed to determine whether or not they should be used for location analysis because they could negatively affect the positioning process. Moreover, the contributions of these features to target detection were not as significant (since their co-occurrence weight scores of those three features were almost negligible) as for the feature spaces with a dimension of 6. This could justify that a model with a lower RMSE does not mean that it is always the most consistent model, but that we need to examine the parameters for their negative effects on target prediction. leads us to an important finding, namely that all classifiers achieved lower RMSE both before and after transfer learning for the new feature spaces of dimension 9. However, there is no strong evidence that negative knowledge was not transferred from the source to the target domain. In this case, especially for the scenario of dataset A with nine feature spaces, the measurements coming from all Wi-Fi access points were not refined or reviewed to determine whether or not they should be used for location analysis because they could negatively affect the positioning process. Moreover, the contributions of these features to target detection were not as significant (since their co-occurrence weight scores of those three features were almost negligible) as for the feature spaces with a dimension of 6. This could justify that a model with a lower RMSE does not mean that it is always the most consistent model, but that we need to examine the parameters for their negative effects on target prediction.   Figure 25a,b illustrate the use of box and whisker (commonly known as a box plot) to compare the distributions of the RSS measurements received from an access point in a reference point of both training and testing instances, respectively. For both the training and testing instances, we have considered the measurements received from Wi-Fi AP_1 and Wi-Fi AP_48 to compare the nature of the signal distributions in terms of the box-plot parameters. One can observe that the signal distributions of reference point 1 for Wi-Fi AP_1 seem to have outlier values (both types of outliers: low and high), because the RSS values lies out of the whisker box such that the RSS measurements either fall more than 1.5 times the interquartile range above the third quartile or fall more than 1.5 times the interquartile range below the first quartile. Inter quartile range is the difference between the third and first quartile, which used to give the dispersion of a distribution. On the other side, Targets of 185, 191 and 210 have a lower outlier. One can also see that the respective medians of each box plot of a target lies outside of any box of a comparison boxplot of any target. Thus, the unique pattern for the measurements collected by the Wi-Fi APs seems to have different median values for the 'signal-to-location' relationships of the different target points. In other words, the received signals for a reference point from multiple APs are likely to be independent and it is what is really required in practice for reducing extra Wi-Fi Aps' deployment. Similarly, one can compare the distribution nature in terms of skewness (positive and negative), symmetry, potential outliers, and dispersion of measurements. For instance, the measurements for targets 5 and 10 of Wi-Fi AP_4 of both training and testing instances revealed that the destruction is highly dispersed and both distributions are negatively skewed. Moreover, the measurements' distribution received from Wi-Fi access points in both phases of instances are highly dispersed as the whisker box plot is longer. This also justifies the fact that signals were not received from this access point for their corresponding targets with the exception of a few measurements, while this automatically inflates the deviation and degrades the positioning performance. in terms of skewness (positive and negative), symmetry, potential outliers, and dispersion of measurements. For instance, the measurements for targets 5 and 10 of Wi-Fi AP_4 of both training and testing instances revealed that the destruction is highly dispersed and both distributions are negatively skewed. Moreover, the measurements' distribution received from Wi-Fi access points in both phases of instances are highly dispersed as the whisker box plot is longer. This also justifies the fact that signals were not received from this access point for their corresponding targets with the exception of a few measurements, while this automatically inflates the deviation and degrades the positioning performance.   However, compared to the other algorithms we used, the proposed algorithm's average elapsed execution testing time is substantially longer. Additionally, employing the two real-world dataset scenarios, Figure 26 shows the comparative study of computational testing time for FBIPS. Thus, the computational testing time for the proposed algorithm is higher, followed by the GMM, MLP, and the SVC.  Figure 26 presents the computational complexity of the proposed algorithm for IPSbased RSS fingerprinting, which fuses source and target domain knowledge to enhance positioning performance in the target domain. The algorithms presented in this paper were run on a laptop with an AMD Ryzen 3 3200U CPU running at 2.60 GHz and 16 GB of RAM. Two situations of complexity-time and space-are the main determinants of an algorithm's complexity. To assess the time complexity of the algorithms, we employed  ). O n However, compared to the other algorithms we used, the proposed algorithm's average elapsed execution testing time is substantially longer. Additionally, employing the two real-world dataset scenarios, Figure 26 shows the comparative study of computational testing time for FBIPS. Thus, the computational testing time for the proposed algorithm is higher, followed by the GMM, MLP, and the SVC.

Conclusions
In this paper, we propose OHetTLAL, a novel technique for indoor fingerprint-based positioning problems that can improve learning performance in the target domain through knowledge transfer from the source domain. To this end, we derived new feature spaces (on which the model is trained) based on the co-occurrence of RSS measurements from mobile devices in the two domains to capitalize on knowledge that could improve target location prediction. To derive the new feature spaces, the co-occurrence measure of the features ( p mi C ) was computed. The higher the value of co-occurrence between the two domains, the more detected Wi-Fi APs are shared by these two domains and the more likely these source domains are related to the target domain, which positively affects the prediction of the target. The proposed objective function was minimized across the new feature spaces so that the positive knowledge of the related source domain received a

Conclusions
In this paper, we propose OHetTLAL, a novel technique for indoor fingerprint-based positioning problems that can improve learning performance in the target domain through knowledge transfer from the source domain. To this end, we derived new feature spaces (on which the model is trained) based on the co-occurrence of RSS measurements from mobile devices in the two domains to capitalize on knowledge that could improve target location prediction. To derive the new feature spaces, the co-occurrence measure of the features (C p mi ) was computed. The higher the value of co-occurrence between the two domains, the more detected Wi-Fi APs are shared by these two domains and the more likely these source domains are related to the target domain, which positively affects the prediction of the target. The proposed objective function was minimized across the new feature spaces so that the positive knowledge of the related source domain received a higher weight and is transferred to the target domain for predicting a new mobile device. Experimental results demonstrate that the proposed algorithm was the best algorithm for fingerprint-based indoor positioning with the minimum root mean square error in all cases of the two real office scenarios. On the other hand, the Gaussian mixture model (GMM), had the highest mis-localization error rate. Our research shows that in OHetTL, the construction of new feature spaces is critical for extracting relevant information for effective target location prediction. It should be noted that these experimental results may differ in other test phase environments.
The effects of different dimensions of feature spaces used for training indoor positioning classifiers have also been evaluated using two real life datasets. Results demonstrate that the impact of different feature spaces used to train indoor positioning classifiers using dataset A with dimensions of 41 and 15, respectively. It was found that the proposed algorithm has the lowest root mean square error, i.e., 32% and 25% less error in positioning the mobile device for both scenarios. Similarly, the effects of different dimensions of feature spaces are used for training indoor positioning classifiers, using dataset B with a dimension of 9 and 6, respectively. It was found that the proposed algorithm has the lowest root mean square error, i.e., 33% and 22% less mis-localizations of mobile devices for both scenarios. This indicates that the proposed algorithm has succeeded in improving the positioning accuracy by transferring knowledge from the source domain to the target domain for both datasets after new feature spaces are derived based on the weight co-occurrence of both instances. It can be observed that not only positive knowledge from the source domain was transferred to the prediction task, but also the computational cost and memory consumption were effectively improved by excluding irrelevant features from the analysis. The difference in positioning accuracy obtained due to the feature spaces between the two scenarios is about 7% and 11% for both datasets of A and B, respectively. This difference leads us to an important finding, namely that all classifiers achieved lower RMSE both before and after transfer learning over the new feature spaces of dimension 9. However, there is no strong evidence that negative knowledge was not transferred from the source to the target domain. In this case, especially for the scenario of dataset B with nine feature spaces, the measurements coming from all Wi-Fi access points were not refined or scrutinized to determine whether or not they should be used for location analysis because they could negatively affect the positioning process. However, the contributions of these features to target detection were not as significant (since their co-occurrence weight scores of those three features were almost negligible) as for the feature spaces with a dimension of 6. This could justify that a model with a lower RMSE does not mean that it is always the most consistent model, but that we need to examine the parameters for their negative effects on target prediction. Our results demonstrate, however, that the proposed algorithm is more resilient to fluctuating environments and mitigates the problem of overfitting the model. Furthermore, the OHetTLAL-based fusion of both knowledge domains could significantly improve the positioning performance without any extra cost associated to offline calibrations effort. It is not necessarily the positioning performance to improve as the number of feature spaces increases, instead, how the features are important in capturing the maximum variation of the model is the main determinant.
In summary, the positioning performance, as per our results, revealed that it is highly important to extract the most significant predictors that could capture the maximum variance of the model using some metrics in order to evaluate the significance of each Wi-Fi AP deployment. Thus, we discovered that not all available feature spaces need be considered for positioning estimation for at least four reasons: (a) some features may be irrelevant and result in fingerprint duplication or pattern fingerprint mismatch; (b) large feature spaces necessitate a massive deployment of Wi-Fi access points, which is prohibitively expensive from a cost standpoint; (c) from a technical standpoint, computational cost and memory usage are quite expensive when dealing with higher dimensions of features, and (d) model over-fitting is also a serious issue for higher dimensions of feature spaces.

Data Availability Statement:
The dataset used for this study are available upon request to the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.