On the Detection of Fake Certificates via Attribute Correlation

Transport Layer Security (TLS) and its predecessor, SSL, are important cryptographic protocol suites on the Internet. They both implement public key certificates and rely on a group of trusted certificate authorities (i.e., CAs) for peer authentication. Unfortunately, the most recent research reveals that, if any one of the pre-trusted CAs is compromised, fake certificates can be issued to intercept the corresponding SSL/TLS connections. This security vulnerability leads to catastrophic impacts on SSL/TLS-based HTTPS, which is the underlying protocol to provide secure web services for e-commerce, e-mails, etc. To address this problem, we design an attribute dependency-based detection mechanism, called SSLight. SSLight can expose fake certificates by checking whether the certificates contain some attribute dependencies rarely occurring in legitimate samples. We conduct extensive experiments to evaluate SSLight and successfully confirm that SSLight can detect the vast majority of fake certificates issued from any trusted CAs if they are compromised. As a real-world example, we also implement SSLight as a Firefox add-on and examine its capability of exposing existent fake certificates from DigiNotar and Comodo, both of which have made a giant impact around the world.


Introduction
Secure Sockets Layer (SSL) and its successor, Transport Layer Security (TLS), are built upon an X.509 public key infrastructure [1] and serve as the basis of important secure protocols and applications on the Internet, such as HTTPS, VPN and SMTPS. Within an X.509 infrastructure, certificate authorities (CAs) are in charge of checking other entities' identities and issuing X.509 certificates to verified entities, each of which may be another CA or an end entity. The root CAs issue certificates to themselves, and the certificates of intermediate CAs are issued by other CAs. As a result, any end entity's certificate can be chained back to a root CA certificate through zero or more intermediate CA certificates, which form a certification path [1]. SSL/TLS employ end entity certificates to authenticate peer identities [2,3]. In particular, an end entity obtains a legitimate identity if its certificate can be chained back to a trusted CA along its certification path. The X.509 public key infrastructure also defines the necessary fields and syntax of X.509 certificates [2]. In this paper, we refer to these fields as attributes.
HTTPS uses SSL/TLS to encrypt HTTP connections, thus providing secure web services to a range of web applications, including online business, finance, healthcare, mailing services, and so on. In HTTPS connections, browsers authenticate web server identities based on a group of pre-agreed root and intermediate CAs. This validation process depends on two requirements. One is whether the web server's certificate is issued by one of the trusted CAs. The other is whether the certificate's common name (i.e., CN) is bound to the web server's domain name. If both requirements are fulfilled, browsers confirm the web server's identity as legitimate. Otherwise, an alert is displayed to make users aware that the certificate may be fake, and the web access is suspended immediately to prevent potential attacks.
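The two requirements can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not a browser implementation: the Cert class, the browser_accepts function and the trust store contents are hypothetical, signatures are not verified, and only exact CN matches are considered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject_cn: str  # common name (CN) of the certificate owner
    issuer_cn: str   # common name of the CA that issued the certificate


def browser_accepts(chain, trusted_ca_names, hostname):
    """Simplified check; chain[0] is the server certificate and chain[-1]
    is the top-most CA certificate. Signatures are not verified here."""
    if not chain:
        return False
    # Requirement 1: every certificate is issued by the next one in the
    # chain, and the chain ends at a CA that the browser trusts.
    for lower, upper in zip(chain, chain[1:]):
        if lower.issuer_cn != upper.subject_cn:
            return False
    if chain[-1].subject_cn not in trusted_ca_names:
        return False
    # Requirement 2: the server certificate's CN matches the requested
    # host (exact match only; wildcard handling is omitted).
    return chain[0].subject_cn == hostname


trusted = {"Verisign Class 3"}  # hypothetical trust store
legit = [
    Cert("mail.google.com", "Thawte SGC"),
    Cert("Thawte SGC", "Verisign Class 3"),
    Cert("Verisign Class 3", "Verisign Class 3"),
]
untrusted = [
    Cert("mail.google.com", "Unknown CA"),
    Cert("Unknown CA", "Unknown CA"),
]

print(browser_accepts(legit, trusted, "mail.google.com"))
print(browser_accepts(legit, trusted, "example.org"))
print(browser_accepts(untrusted, trusted, "mail.google.com"))
```

Note that the check asks only whether some trusted CA stands behind the certificate, not whether that CA is the expected issuer for the requested host.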
However, if one of the trusted CAs is compromised, fake certificates can be issued and used to hijack targeted HTTPS connections [4,5]. Browsers are not aware of the underlying attacks launched with this kind of fake certificate, because they trust the compromised CA by default and cannot tell which trusted CA is the legitimate issuer of which certificate. As reported by the SSL Observatory project [6], there are more than 600 certificate authorities that browsers trust by default [7]. As a result, attackers only need to compromise any one of these CAs that they are capable of breaking into. Such a threat has forced mainstream browsers to revoke their trust in the compromised CAs that have been discovered, such as DigiNotar [8][9][10] and Comodo [11]. However, this temporary countermeasure has the side effect that browsers no longer trust the legitimate certificates that were already issued by DigiNotar and Comodo. To make things worse, browsers have no chance to withdraw their trust in compromised CAs that have not been discovered.
To address these problems, online detection systems, such as Perspectives [12], HTTPS Everywhere [13] and Google Certificate Catalog [14], have been proposed. They conduct a direct bit-to-bit comparison between the examined certificate and its legitimate sample obtained from the Internet. As these systems check certificate identities on-line, attackers who can use fake certificates to hijack users' HTTPS tunnels are likely to be able to intercept or block the corresponding connections to these on-line services, as well. The Sovereign Keys Project [15], on the other hand, provides a systematic solution for this structural insecurity. However, the implementation of sovereign keys involves cooperation among different CAs and web/DNS servers, thus making it hard to deploy in practice.
In this paper, we propose SSLight, an attribute dependency-based detection mechanism, to help browsers identify fake certificates issued from compromised CAs. SSLight relies on a probabilistic model built on a set of legitimate samples. Fake certificates can thus be detected because some of the dependencies between their attributes rarely occur among the legitimate samples. For example, as Australian CAs have never signed any legitimate certificates for American servers, a certificate is more likely to be fake if it is issued by an Australian CA, but possessed by an American server. As a result, SSLight is capable of exposing fake certificates from compromised CAs, even if the compromise has not been discovered, and of mitigating false alarms on legitimate certificates, as well. SSLight does not require instant on-line checking, which helps it circumvent potential network interceptions. Moreover, SSLight is a lightweight solution that does not need cooperation from remote servers or CAs.
In sum, we have made three contributions in this paper.
1. We have designed SSLight, a novel attribute dependency-based detection mechanism, to enhance SSL/TLS's authentication. SSLight is capable of exposing fake certificates issued from trusted, but compromised, CAs.
2. SSLight is built on a training set with 830,306 legitimate certificate samples. We have conducted extensive experiments to evaluate SSLight's detection capability. The experimental results show that SSLight can detect the vast majority of fake certificates issued from any compromised CA with a relatively low false positive rate.
3. We have implemented SSLight as a Firefox add-on and used it to detect real-world fake certificates from DigiNotar and Comodo, both of which have had a catastrophic impact around the world. SSLight achieves a relatively high detection rate on these real-world examples.
The remainder of the paper is organized as follows. Section 2 explains the attributes and the attribute dependency in X.509 certificates. Section 3 presents the threat model. Section 4 elaborates on the design of SSLight. In Section 5, SSLight is thoroughly evaluated and implemented as a Firefox add-on to examine real-world examples. We discuss the limitations of our proposal in Section 6 and review related works in Section 7, before concluding the paper in Section 8.

Background
This section presents the details of the attributes in X.509 certificates and the concept of attribute dependency.

Attributes in X.509 Certificates
SSL/TLS employ the X.509 v3 certificate format to profile their X.509 certificates with necessary fields, called attributes, and corresponding usages [2]. These attributes can be classified into two groups, basic certificate attributes and certificate extension attributes [2], both of which are encoded following the ASN.1 distinguished encoding rules (DER) [16] in order to facilitate signature calculation.
Basic certificate attributes contain basic information related to the owner and its issuer. In particular, two basic certificate attributes, Subject and Issuer, include several sub-fields defined in the X.500 specification [17]. In this paper, we refer to these sub-fields as attributes, too. Certificate extension attributes, on the other hand, provide methods for associating additional information with the owner and for managing relationships between CAs [2].
To receive a valid certificate, an entity, which may be a CA or a web server in this paper, first uses its private key to generate a certificate signing request (CSR) [18]. This CSR is subsequently sent to a trustworthy organization, actually another CA, for validation. After checking the entity's identity, the organization issues a signed certificate back to the requesting entity. This issuing process always involves human interactions to fill in the certificate's attributes with the necessary information. For example, if the entity is a CA located in America, the attributes CA and Country should be set to TRUE and US, respectively. As the contents of every attribute are involved in the signature calculation, they cannot be changed once the certificate has been signed. Any certificate is located in a certification path, in which an end entity certificate can be traced back to a root CA certificate through zero or more intermediate CA certificates [1]. From the bottom to the top of each certification path, the upper certificate is owned by a CA, which uses its private key to sign the lower certificate, and the top-most CA signs its own certificate. With this signing chain, the trust assigned to the top-most certificate can be propagated to the bottom one. In this way, browsers need to install only hundreds of CA certificates in order to trust billions of web sites.
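The notion of a certification path can be made concrete with a short Python sketch that follows Issuer fields from an end entity certificate up to a self-issued root. The dictionary layout, the function name build_certification_path and the depth limit are assumptions made for illustration; real path building also verifies signatures, validity periods and extensions, which are omitted here.

```python
def build_certification_path(leaf, certs_by_subject, max_depth=10):
    """Return [end entity, intermediate CAs..., root CA] by following
    the 'issuer' field of each certificate.

    Certificates are plain dicts with 'subject' and 'issuer' keys; a root
    CA certificate is recognised by being self-issued (subject == issuer).
    """
    path = [leaf]
    current = leaf
    while current["issuer"] != current["subject"]:
        if len(path) >= max_depth:
            raise ValueError("certification path too long or cyclic")
        parent = certs_by_subject.get(current["issuer"])
        if parent is None:
            raise KeyError("issuer certificate not found: " + current["issuer"])
        path.append(parent)
        current = parent
    return path


# Illustrative labels only; these dicts are not real certificates.
ca_certs = {
    "Verisign Class 3": {"subject": "Verisign Class 3", "issuer": "Verisign Class 3"},
    "Thawte SGC": {"subject": "Thawte SGC", "issuer": "Verisign Class 3"},
}
google = {"subject": "Google", "issuer": "Thawte SGC"}

path = build_certification_path(google, ca_certs)
print([cert["subject"] for cert in path])
```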
The left side of Figure 1 shows an example of a Google certificate. It is an end entity certificate and includes 15 basic certificate attributes, in which five belong to Subject attributes, three are Issuer attributes and five are certificate extension attributes. We observe much information from these attributes, such as: the certificate's validity period runs from 18 December 2009 to 2011; the public key algorithm is RSA; the key length is 1024 bits; it is an American certificate, but issued by a South African CA, etc. In this paper, we assume that attributes that do not appear in a certificate contain an empty value by default. The right side of this figure shows the corresponding certification path in which the Google certificate is located. The root CA, Verisign Class 3, issues a CA certificate to an intermediate CA, Thawte SGC, which signs the end entity certificate for Google.

Attribute Dependency
We define attribute dependency as the conditional probability distribution for all of the possible values of an attribute given a certain value in another attribute (the formalized definition is presented in Equation (5) in Section 4.1). These conditional probabilities can be calculated based on a set of legitimate certificates. According to whether the two attributes are from the same certificate or from different certificates along a certification path, we group attribute dependencies into two types, certificate attribute dependency and certification path attribute dependency. As a certificate's Issuer attributes indicate its issuer's Subject attributes along a certification path, the dependency between an Issuer attribute and another attribute in the same certificate can be considered as a certification path attribute dependency. The certificate attribute dependency represents the relationship inside a certificate, while the certification path attribute dependency reflects the relationship between two different certificates from the same certification path. Figure 2a illustrates an example of the certificate attribute dependency. Two attributes, countryName and Public Key Length, in a server certificate are considered. The countryName is assumed to have the possible values US, CN and Empty, while the Public Key Length takes the values 1024, 2048 and Empty. Each arrow indicates a conditional probability for a value of the attribute Public Key Length given a certain value of countryName. We observe three certificate attribute dependencies, in which the given value of countryName is US, CN and Empty, respectively. Figure 2b, on the other hand, shows an instance of the certification path attribute dependency between the attribute countryName in a server certificate and in a CA certificate, both of which are located in the same certification path. The two countryName attributes are assumed to possess the possible values US, CN and Empty. As can be seen, there are three certification path attribute dependencies, one for each given countryName value.
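As a minimal sketch of this definition, the following Python function estimates the conditional probability distribution of one attribute given each value of another from a list of samples. The function name, the dictionary representation of a certificate and the toy samples are assumptions for illustration and are not taken from the dataset used in this paper.

```python
from collections import Counter, defaultdict


def attribute_dependency(samples, target, given):
    """Estimate P{target = v | given = g} for every observed pair (g, v).

    Each sample is a dict of attribute values; an attribute that does not
    appear in a sample is treated as "Empty".
    """
    counts = defaultdict(Counter)
    for sample in samples:
        g = sample.get(given, "Empty")
        v = sample.get(target, "Empty")
        counts[g][v] += 1
    return {
        g: {v: n / sum(values.values()) for v, n in values.items()}
        for g, values in counts.items()
    }


# Hypothetical toy samples (not taken from the paper's dataset).
samples = [
    {"countryName": "US", "publicKeyLength": 2048},
    {"countryName": "US", "publicKeyLength": 2048},
    {"countryName": "US", "publicKeyLength": 1024},
    {"countryName": "CN", "publicKeyLength": 1024},
    {"publicKeyLength": 2048},  # countryName absent, treated as "Empty"
]

dependency = attribute_dependency(samples, "publicKeyLength", "countryName")
print(dependency["US"])
```

The same function covers a certification path attribute dependency if each sample stores attributes of both certificates under distinct keys, for example a server-level and a CA-level countryName.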

Threat Model
Figure 3 demonstrates the security threat considered in this paper using the real-world compromised CAs, DigiNotar and Comodo. We use Firefox Version 5.0.1 for this demonstration, because Firefox has withdrawn its trust in DigiNotar and Comodo since Version 6.0.1 [11,19,20]. Since we cannot compromise the real DigiNotar and Comodo, we set up two private CAs in our laboratory to impersonate them instead. We then add the two private CA certificates to the trusted authorities list in Firefox; thus, they can be used as the real compromised DigiNotar and Comodo. To hijack HTTPS connections to the Google mail service, we deploy a man-in-the-middle SSL proxy [21] with fake Google certificates in our laboratory and configure Firefox to access HTTPS sessions through this proxy by default. As shown in Figure 3a, the legitimate Google certificate issued by Thawte SGC CA is accepted by Firefox. However, Figure 3b,c demonstrates that Firefox also accepts fake Google certificates from DigiNotar and Comodo by default. As a result, users are not aware of the underlying attacks when they access Gmail through HTTPS connections, and their account information will be leaked. Note that we obtain the same results in other major browsers, such as IE and Chrome.
In this paper, attackers are assumed to be able to intrude into any CA trusted by browsers and hijack any connections to and from the browsers. Note that attackers cannot modify the attributes in any trusted CA certificate, because the CA certificates are pre-installed in browsers. Although attackers can exploit the compromised CA to issue fake certificates with arbitrary attributes, SSLight, or human beings, can easily detect these naive fake certificates through certificate attribute dependency. Sophisticated attackers therefore duplicate attributes from the legitimate certificate to its corresponding fake one, thus circumventing this kind of detection. Moreover, as sophisticated attackers can use the compromised CA to issue any number of intermediate CAs with arbitrary attributes, detection based on the dependency between attributes from different CA certificates in the same certification path can be easily evaded. In this paper, SSLight focuses on the dependency between attributes from the server certificate and any of its CA certificates along the same certification path to defend against sophisticated attackers who:
• cannot make any modification to the trusted CA certificates;
• can duplicate attributes from legitimate certificates to the corresponding fake ones;
• can issue any number of intermediate CAs with arbitrary attributes using the trusted, but compromised, CA;
• can hijack or block any connections to and from the browsers.
The last item indicates that SSLight can work under the worst network conditions, in which any information from the Internet may be faked.

SSLight
In this section, we first build up a probabilistic model based on attribute dependencies among legitimate samples and then elaborate on the design of SSLight on top of this model. We then introduce two factors, attack range reduction and false positive, to evaluate SSLight's detection capability.

Probabilistic Model
Let a web server $q$'s legitimate certificate be $C^1_q$, which is associated with a certification path, defined as: where q is a server certificate, and , where $\Gamma(C^1_{q_1}) = \Gamma(C^1_{q_2})$, we may still have , because the same CA can issue certificates to different entities.
Based on Equation (1), we thus define a non-empty training set including legitimate certificate samples as: where $Q = ||C||$ is the size of the legitimate sample set $C$. In this paper, we assume that browsers pre-agree to trust be the set of considered attributes in the $i$-th level certificate, and is the set of values that attribute $A^i_j$ may take, and $||V^i_j||$ represents the number of these possible values. As a consequence, we define the subset $C(i, j, k) \subseteq C$ to include certification paths $\Gamma(C^1_q)$ in which the value of $A^i_j$, denoted as $A^{i,q}_j$, is set to $V^i_{j,k}$ as: where $V(A^{i,q}_j) \in V^i_j$ represents the value assigned to $A^{i,q}_j$. With the help of Equation (3), we calculate the probability of the certification paths Jx can be divided into two subsets as Iy,e Jy ), where: As a result, the single attribute dependency D(A 's attack range from Iy,e Jy )||. We thus use an attack range reduction power, R * (P th , C ) ≤ ∞. In particular,

False Positive
According to Equations (8) and (9), a single attribute dependency $D(A^1_{J_x} | V(A^{I_y,e}_{J_y}))$ can achieve a larger reduction factor with a larger $P_{th}$. However, this larger $P_{th}$ consequently causes a larger false positive, which is the ratio of legitimate certificates that are wrongly regarded as fake certificates in the legitimate sample set $C$. The false positive with respect to the single attribute dependency $D(A^1_{J_x} | V(A^{I_y,e}_{J_y}))$ can be calculated as: , where $V(A^{I_y,e}_{J_y})$ $C(1, J_x, k)$. $C^-_{J_x}$ includes the legitimate samples that are falsely regarded as fake in $C$ when the D(A . Note that neither of the two false positive definitions, Equations (11) and (12), takes legitimate certificates outside the legitimate sample set $C$, $C^1_e \notin C$, into consideration. According to Equations (8)-(12), we conclude Corollaries 4-6, which can guide SSLight to choose an appropriate $P_{th}$ to balance attack range reduction and false positives. Their proofs are detailed in Appendixes 1.7-1.12.
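The trade-off between the threshold and the false positive can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the exact definition in Equations (11) and (12): a certificate is flagged when the conditional probability of its server attribute value given the CA attribute value is no larger than a threshold p_th, and the false positive is taken as the fraction of all training samples that are flagged. The function names, attribute keys and toy samples are hypothetical.

```python
from collections import Counter, defaultdict


def conditional_probs(samples, target, given):
    """Estimate P{target = v | given = g} from legitimate samples."""
    counts = defaultdict(Counter)
    for s in samples:
        counts[s[given]][s[target]] += 1
    return {g: {v: n / sum(c.values()) for v, n in c.items()}
            for g, c in counts.items()}


def is_flagged(sample, probs, target, given, p_th):
    """Flag a certificate whose conditional probability is <= p_th.
    Value pairs never seen in the training set get probability 0."""
    p = probs.get(sample[given], {}).get(sample[target], 0.0)
    return p <= p_th


def false_positive(samples, target, given, p_th):
    """Fraction of legitimate training samples that would be flagged."""
    probs = conditional_probs(samples, target, given)
    flagged = sum(is_flagged(s, probs, target, given, p_th) for s in samples)
    return flagged / len(samples)


# Hypothetical toy training set: CA country vs. server country.
legit = (
    [{"ca_country": "AU", "server_country": "AU"}] * 6
    + [{"ca_country": "US", "server_country": "US"}] * 3
    + [{"ca_country": "US", "server_country": "DE"}]
)
probs = conditional_probs(legit, "server_country", "ca_country")
fake = {"ca_country": "AU", "server_country": "US"}

print(is_flagged(fake, probs, "server_country", "ca_country", 0.0))
print(false_positive(legit, "server_country", "ca_country", 0.0))
print(false_positive(legit, "server_country", "ca_country", 0.3))
```

With a threshold of zero, only value pairs never observed in the training set are flagged, so no training sample is misclassified; raising the threshold also flags rare but legitimate pairs.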

Theoretical Analysis
According to Equation (5), we calculate attribute dependencies between attributes as conditional probability distributions based on a training set of legitimate certificates. As a result, SSLight's detection capability, in terms of the attack range reduction and false positives, mainly depends on the prior distributions among legitimate samples in the training set. For example, assuming C

Evaluation
In this section, we first explain the experiment setup, which includes the legitimate sample set and feature set used in SSLight. Based on that, we evaluate each single attribute dependency's detection capability in terms of its attack range reduction factor and false positive under different $P_{th}$. SSLight then selects appropriate attribute dependencies in accordance with the feature evaluation results to achieve a large attack range reduction power with a low false positive. Finally, SSLight is implemented as a Firefox add-on and used to expose real-world fake certificates with 100% accuracy.

Experiment Setup
The SSL Observatory project [6] has conducted a thorough scan of all allocated IPv4 space on the default port of HTTPS (i.e., 443) and received 1,455,391 valid certificates in its dataset [22]. We further select the web server certificates whose attribute CA is FALSE and trace their corresponding certification paths. Finally, we obtain 830,306 such samples, which are employed by SSLight as the legitimate sample set $C$ in this paper. In this set, the depth of the longest certification path is limited to 5 (i.e., $I_y \le N_{max} = 5$), because we observe that less than 4% of certification paths are longer than 5 in legitimate samples. More precisely, 100%, 83.2%, 67.3% and 3.02% of the 830,306 samples have $I_y$ = 2, 3, 4 and 5, respectively.
According to RFC5280 [2] and X.520 [17], X.509 certificates have more than 120 attribute definitions, which include around 60 Subject attributes. However, many of these attributes may not be appropriate in the design of SSLight, because they cannot provide useful information for the detection. An example is the attribute CA. $\forall C^1_e$ have CA=FALSE, and $\forall C^{I_y>1}_e$ have CA=TRUE. As a result, the attribute CA has a deterministic, but undistinguished, value in both legitimate and fake certificates. As another example, the value of attribute Signature is unique in different certificates and will be changed even when the positives in $P_{th} = 1$ are no smaller than that in $P_{th} = 0$. In accordance with Corollary 4, the false positive remains 0 when $P_{th} = 0$ (i.e., $E(P_{th} = 0, D(A^1_{J_x} | V(A^{I_y,e}_{J_y}))) = 0$) in our experiments. When $P_{th} = 1$ and $A^{I_x=1}_5$ = Organization, all of the false positives are less than 0.2. These small false positives help increase the reduction factors in $P_{th} = 1$ to slightly larger than those in $P_{th} = 0$. The case $A^{I_x=1}_5$ = Organization with $P_{th} = 1$ introduces 4 false positives larger than 0.7, but the corresponding reduction factor shows nearly no increase in Figure 5e. Note that, although $R(P_{th} = 1, D(A^1_{J_x} | V(A^{I_y,e}_{J_y})))$ can reach $\infty$, as explained in Corollary 1, we have not observed $\infty$ when $P_{th} = 1$ in our experiments. Moreover, we show the reduction factor and false positive for the results over all available attributes in the certificates in Figures 5i and 6i. As can be seen, more than 50% of the reduction factors are larger than $10^4$, and less than 5% suffer from a false positive larger than 0.1. It is worth noting that, when we apply SSLight to real-world scenarios, any one abnormal attribute dependency can expose the fake certificates. As a result, the actual detection capability is much better than what we show through the mean value.
) remains 0. Moreover, we observe that at least can be detected. As a result, SSLight is shown to be able to expose the vast majority of fake certificates issued from trusted, but compromised, CAs with 0 false positives. ) = 1 is not acceptable, because SSLight will wrongly regard $\forall C^1_q \in C$ as fake certificates. However, as shown in Figure 6, more than 95% Jy ))) is 0. We thus observe that the combination of a small number of features with small false positives may cause a large false positive in SSLight. To mitigate this impact, SSLight uses a false positive threshold to filter out the features whose $E(P_{th} = 1, D(A^1_{J_x} | V(A^{I_y,e}_{J_y})))$ is larger than that threshold. As shown in Figure 7, when we exclude features when the threshold of E(P th = 1, D(A ) = 1 is dropped from more than 42% down to 0%, as well. In this case, the corresponding R * ) is also decreased to the same as $R^*(P_{th} = 0, C^{I_y>1}_e)$. This feature exclusion process shows how the single feature's false positive affects SSLight's false positive. Appendix 1.8 lists the excluded features whose E(P th = 1, D(A

Discussion
Although we have demonstrated the effectiveness of SSLight through a rich set of experiments with real-world datasets, we still acknowledge some limitations of SSLight in practice.
First, SSLight is a data-driven solution for fake certificate detection. Its detection capability largely relies on the quality of the dataset that is used to train SSLight. If the training set contains inaccurate or even incorrect information, the effectiveness of SSLight may not be ensured. To overcome this challenge and to obtain a high-quality dataset for SSLight training, we propose a globally-distributed certificate hunter. The basic idea is to deploy a number of machines around the world. Each machine will run ZMap [25], which can scan the entire IPv4 space within 49 minutes, to collect certificates from all of the potential HTTPS services on the Internet. We then follow the idea of Perspectives [12] and consider a certificate valid if it matches the majority of the collected copies. In this way, we can reduce the possibility of including fake certificates in the training set.
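The majority rule for the certificate hunter could be sketched as below. This is only one possible reading of the idea: certificates are compared by a fingerprint string, and a strict majority of the observations collected for one host is required. Both choices, as well as the function name, are assumptions for illustration.

```python
from collections import Counter


def majority_certificate(observations):
    """Return the certificate fingerprint seen by a strict majority of
    vantage points for one host, or None if no such majority exists."""
    if not observations:
        return None
    fingerprint, count = Counter(observations).most_common(1)[0]
    return fingerprint if count > len(observations) / 2 else None


# Hypothetical fingerprints reported by three scanning machines.
print(majority_certificate(["aa11", "aa11", "bb22"]))
print(majority_certificate(["aa11", "bb22"]))
```

A host without a majority copy would simply be left out of the training set under this reading.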
Second, despite SSLight being an off-line approach, it may involve on-line activities for downloading and updating the dataset. These on-line activities introduce the risk of the dataset being corrupted. To avoid this risk, we can deploy a trusted third party. SSLight would download and update its dataset only from such a third party after the necessary authentication. This approach requires us to ensure the security of the trusted third party. Otherwise, SSLight can be evaded.
Third, we show the effectiveness of SSLight using the reduction factor, rather than the detection rate. We do this because the reduction factor can show the detection capability in a complete manner. That is, if we use the detection rate directly (as in the results we show in Table 2), we must focus on a subset of fake certificates and legitimate ones. This subset cannot represent well how the fake and legitimate certificates are distributed and, therefore, can only show the effectiveness of SSLight for specific cases. In contrast, the reduction factor shows the detection capability in general. It is not affected by the specific cases we use and covers all of the possible fake certificates that SSLight can detect.
Fourth, in this paper, we focus on the evaluation of SSLight using the dataset [22], which is the first complete dataset released to the public and is likely to contain very few fake certificates, because it was crawled immediately after the hacker's behavior had been detected. In this dataset, the hacker's impact is restricted, and the amount of incorrect information is minimized. Therefore, this dataset is the most appropriate one to show the effectiveness of SSLight in a fair manner. Despite that, we will also investigate the effectiveness of SSLight using other datasets, such as https://wiki.mozilla.org/CA:Problematic_Practices, https://www.mozilla.org/en-US/about/governance/policies/security-group/certs/policy/ and https://www.linshunghuang.com/papers/mitm.pdf, in our future work. This further investigation can help to demonstrate the status of SSLight in worse cases.

Related Work
Trust is widely used to secure information networks in various research fields. Successful applications include mobile ad hoc networks [26], wireless sensor networks [27], social networks [28], multi-agent networks [29] and, most recently, anonymity networks [30,31]. These successful applications confirm the effectiveness and necessity of trust for network security. The scope of this paper is web systems. In the following, we survey the related works that use trust or trust-like methods to avoid fake certificates on the web.
Perspectives [12] is a pioneering work to identify fake certificates. To address the security vulnerability in the so-called trust-on-first-use authentication scheme, Perspectives proposed to deploy a distributed system around the world to help browsers obtain a legitimate sample of certificates. The basic idea of this project is adopted by HTTPS Everywhere [13] and Google Certificate Catalog [14], both of which provide on-line services to help users detect fake certificates issued from compromised CAs. However, these solutions need additional network communications, thus making them vulnerable to being blocked or hijacked. Certified Lies [4], on the other hand, focuses on the fake certificates issued by a special group of compromised CAs, the CAs that are compelled by governments. Although its solution is lightweight and does not need on-line checking, it operates in an ad hoc manner to address a limited number of attack scenarios and requires human interaction. The Sovereign Keys Project [15] provides a systematic solution to eliminate this security threat. However, the Sovereign Keys system introduces a totally different architecture, thus making it hard to replace the existing infrastructure in a short time.
The collection of legitimate HTTPS certificates has been done in several projects. However, some of them only focus on a specific target. For example, Lee et al. [32] collected legitimate samples to evaluate the certificates' cryptographic strength, and Yilek et al. [33] paid attention only to an OpenSSL vulnerability in the Debian system. The SSL Observatory project [6,34,35], to our knowledge, is the first thorough collection and analysis of legitimate certificates. This project scans all of the allocated IPv4 space on port 443. SSL Landscape [36], on the other hand, provides another thorough collection of legitimate samples, but it focuses on the survey of highly ranked HTTPS web servers. Both of the datasets from SSL Observatory and SSL Landscape can be used as a legitimate sample set in SSLight.
As browsers always allow users to make the final decision about whether the certificates are trustworthy or not, attacks targeted at the human interface are usually launched to compromise HTTPS connections. Many mechanisms have been proposed to mitigate this threat. For example, SSLock [37] intelligently makes the final decision on behalf of users. Adelsbach et al. [38] and Xia et al. [39] improved the human interface to make users clearly aware when suspicious certificates are detected by browsers.

Conclusions
In this paper, we have proposed SSLight, a novel fake certificate detection mechanism based on attribute dependency. SSLight is demonstrated to be able to detect fake certificates issued from trusted, but compromised, CAs with a relatively low false positive rate. In particular, SSLight shows its practicability by exposing the real-world fake certificates issued by DigiNotar and Comodo. Although SSLight is designed only for HTTPS applications, this attribute dependency-based detection method can be extended to other SSL-/TLS-based applications and protocols.

Figure 2. Two different types of attribute dependency. (a) Certificate attribute dependency; (b) certification path attribute dependency.

Figure 3. Both the legitimate and fake mail.google.com certificates have been accepted by Firefox (Version 5.0.1). (a) Legitimate certificate issued by Thawte; (b) fake certificate issued by DigiNotar; (c) fake certificate issued by Comodo.

measure the detection capability obtained by SSLight as: R * (P th , C Based on Equation (10), we have Corollaries 2 and 3, which are proven in Appendixes 1.10 and 1.11, respectively.

Corollary 2. The attack range reduction power satisfies $1 \le R^*(P_{th}, C^{I_y>1}_e) \le \infty$.

$0 \le E(P_{th}, D(A^1_{J_x} | V(A^{I_y,e}_{J_y}))) \le 1$. As defined in Equation (8), $\forall V^1_{J_x,k} \in V^{1-}_{J_x}(P_{th}, A^{I_y,e}_{J_y})$ will cause SSLight with the single attribute dependency $D(A^1_{J_x} | V(A^{I_y,e}_{J_y}))$ to regard the examined certificate as a fake one; thus, their corresponding probabilities contribute to the false positive. As a larger $P_{th}$ leads to a larger $||V^{1-}_{J_x}(P_{th}, A^{I_y,e}_{J_y})||$, the false positive $E(P_{th}, D(A^1_{J_x} | V(A^{I_y,e}_{J_y})))$ can be increased when $P_{th}$ grows. For SSLight with feature set $F$, its false positive caused by $C^{I_y>1}_e$ can be computed as follows. E * (P th , C

Figure 7. Reduction power and false positive when $P_{th} = 0$ and $P_{th} = 1$ with different feature exclusion in SSLight. (a) Attack range reduction power in SSLight; (b) false positive in SSLight.

Figure 8. SSLight works as a Firefox add-on to examine the legitimate and fake mail.google.com certificates. (a) SSLight accepts the legitimate mail.google.com certificate issued by Thawte SGC; (b) SSLight rejects the fake mail.google.com certificate issued by Comodo.
Jy )} $\le P_{th}$ (i.e., $||P^-_{th}|| > 0$), $C^1_e$ can be regarded as a fake certificate issued from C As a result, there are $\prod_{A^1_{J_x} \in A^1} ||V^1_{J_x}||$ possible value combinations that C In the case that if SSLight employs only one attribute dependency D(A 1 Jx |V (A e 's attributes. The number of possible value combinations, $\prod_{A^1_{J_x} \in A^1} ||V^1_{J_x}||$, is defined as $C^{I_y>1}_e$'s attack range, which reflects $C^{I_y>1}_e$'s capability for issuing fake certificates. SSLight can help limit $C^{I_y>1}_e$'s attack range, because a number of values in attribute $A^{1,e}_{J_x}$ may cause $P\{V(A^{1,e}_{J_x}) | V(A^{I_y,e}_{J_y})\} \le P_{th}$, thus making them unable to be assigned.
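The attack range described above can be illustrated with a small Python sketch. The attack range is computed as the product of the number of possible values over the considered server certificate attributes, and a value is kept only if its conditional probability exceeds the threshold. Expressing the reduction as the ratio of the two ranges is an assumption made here for illustration and may differ from the exact definitions in Equations (8)-(10); all names and numbers are hypothetical.

```python
from math import inf, prod


def attack_range(value_sets):
    """Number of value combinations a compromised CA could assign."""
    return prod(len(values) for values in value_sets.values())


def reduced_attack_range(value_sets, cond_probs, p_th):
    """Attack range after removing values whose conditional probability,
    given the compromised CA's attribute value, is <= p_th.

    cond_probs[attr][value] is that conditional probability; attributes
    without an entry are not constrained by any dependency.
    """
    sizes = []
    for attr, values in value_sets.items():
        if attr not in cond_probs:
            sizes.append(len(values))
            continue
        allowed = [v for v in values if cond_probs[attr].get(v, 0.0) > p_th]
        sizes.append(len(allowed))
    return prod(sizes)


def reduction_ratio(value_sets, cond_probs, p_th):
    """Original range divided by reduced range (inf if nothing remains)."""
    reduced = reduced_attack_range(value_sets, cond_probs, p_th)
    return inf if reduced == 0 else attack_range(value_sets) / reduced


# Hypothetical attribute value sets and conditional probabilities.
value_sets = {
    "countryName": ["US", "CN", "Empty"],
    "publicKeyLength": [1024, 2048, "Empty"],
}
cond_probs = {"countryName": {"US": 0.9, "CN": 0.1}}

print(attack_range(value_sets))
print(reduced_attack_range(value_sets, cond_probs, 0.0))
print(reduction_ratio(value_sets, cond_probs, 0.0))
```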

Jx |V (A When considering all of the attribute dependencies in $F$, SSLight abates C

Jx |V (A The operator over $A^1_{J_x} \in A^1$ is used to unify the samples that are wrongly detected, and the operator over $A^{I_y}_{J_y} \in A^{I_y}$ helps select samples that are issued from C Jy )) is used for the detection.

Jx |V (A attribute dependencies do not exist in the sample set. As a result, SSLight is encouraged to use a more comprehensive legitimate sample set to mitigate such an issue. Note that this result does not conflict with Corollary 4, because the legitimate certificate of *.skype.com is not included in our legitimate sample set, $C^1_7 \notin C$.