Evaluation of Quality and Readability of Online Health Information on High Blood Pressure Using DISCERN and Flesch-Kincaid Tools

Abstract: High Blood Pressure (BP) is a vital factor in the development of cardiovascular diseases worldwide. For more than a decade, patients have searched for high-quality and easy-to-read Online Health Information (OHI) on symptoms, prevention, therapy and other medical conditions. In this paper, we evaluate the quality and readability of OHI about high BP. To this end, the first 20 hits of three top-rated search engines are used to collect the pertinent data. After applying the exclusion criteria, 25 unique websites are selected for evaluation. The quality of all included links is evaluated through the DISCERN checklist, a questionnaire for assessing the quality of written information on a health problem. To enhance the reliability of the evaluation, all links are assessed separately by two different groups: a group of Health Professionals (HPs) and a group of Lay Subjects (LS). A readability test is performed using the Flesch-Kincaid tool. Fleiss' kappa is calculated before considering the average value of each group. After evaluation, the average DISCERN value of HPs is 49.43 ± 14.0 (fair quality), while for LS it is 48.7 ± 12.2; the mean Flesch Reading Ease Score (FRES) is 58.5 ± 11.1, which is fairly difficult to read, and the Average Grade Level (AGL) is 8.8 ± 1.9. None of the websites scored more than 73 (90%). In both groups, only 4 (16%) websites achieved a DISCERN score over 80%. The Mann-Whitney test and Cronbach's alpha are computed to check the statistical significance of the difference between the two groups and the internal consistency of the DISCERN checklist, respectively. Normality and homoscedasticity tests are performed to check the distribution of scores of both evaluating groups. In both groups, Information category websites achieved high DISCERN scores but their readability level is worse. The highest scoring websites have a clear aim, succinct sources and high-quality information on treatment options.
High BP is a pervasive disease, yet most of the websites did not provide precise or high-quality information on treatment options.


Motivation and Background
Blood pressure (BP) is the pressure exerted by blood against the artery walls during circulation. The normal level of BP in a healthy person is 140/90 mm Hg [1]. High BP is a prevalent chronic disease; when BP exceeds the normal level, it damages the body's essential organs [2]. High BP, also termed hypertension, is a major disease with a prevalence of 40.8 percent and a control rate of 32.3 percent worldwide [3]. High BP is considered the most significant factor in the development of cardiac diseases. Its high prevalence makes it a significant factor in mortality and morbidity in both developed and developing countries. Due to its high mortality rates and lack of early symptoms, it is also known as a 'silent killer' and a major chronic, non-communicable disease [4]. Research shows that more than nine million deaths are associated with complications of high BP; 45% of them are caused by coronary artery disease, while 51% are caused by stroke [5].
Presently, around one billion individuals suffer from high BP worldwide and the number is expected to surge to 1.56 billion by 2025 [6]. The use of Information and Communication Technologies (ICT) is growing rapidly in medical education [5]. Internet world statistics [7] show that the number of web users is increasing day by day. As of June 2014, there were 3.035 billion Internet users around the globe, and the number reached 4.208 billion in June 2018, representing 55.1% of the world's population. Every day, thousands of users visit the Internet to get health-related information [5]. Health clients use the Internet as the first choice to get instant information about their health issues [8]. According to the latest statistics [9], approximately 7% of Google's daily searches are related to health, which is equivalent to 70,000 per minute. Studies show that up to 80% of people use the Internet for treatment purposes [10]. More than 60% of people in Europe and 80% of people in the US [11] use online health resources for their health-related issues, and 90% of them are convinced that the Internet has improved their health-related knowledge [12]. Furthermore, physicians motivate patients to search for material related to their medical condition [13]. Studies show that the primary reasons for using online health resources are to verify information received from physicians, to get answers to questions and to find alternative treatment options [14]. A study by US physicians showed that 85% of patients visit the hospital because of information obtained through the Internet [15].
Similarly, the authors of Reference [16] found that people search for OHI and trust the information. However, to reduce the potential risks associated with online health content and to help health professionals, patients and health seekers access precise, complete and accurate information, various tools have been designed to assess the quality of OHI [17]. The DISCERN instrument is one of the tools widely used for the evaluation of OHI [18]. In addition to the quality of the information, readability also plays a key role in comprehending written information effectively. It is a measure of how easily a reader can understand a text [19]. Previous studies show that although some websites provide good-quality information, their content is too complex for most Internet users [19]. In addition, some web-based material for educating patients is written above the recommended grade level [20,21]. Some of it is written at grade level 15 [22], while some is written at grade level 17 [23]. There are many tools for readability assessment, such as Flesch-Kincaid, the Gunning fog index and the Simple Measure of Gobbledygook (SMOG). These tools measure the complexity level of written information.

Literature Review
Seventy percent of the research studies that evaluate websites providing health information conclude that the quality of such websites is a major concern [24]. Various studies have assessed OHI on different medical conditions such as lupus erythematosus [10], diverticulitis [25], breast cancer treatment [26], childhood epilepsy [27], the endoscopic retrograde cholangiopancreatography (ERCP) procedure [28], anxiety disorder [29], cataracts [30] and childhood depression [31]. In Reference [10], the authors search the term 'systemic lupus erythematosus' through Google, Bing and Yahoo, collect the first 25 links from each search engine and have evaluators assess only the 25 unique websites obtained through the exclusion criteria. All websites are rated (from No: does not fulfil criterion (1 point) to Yes: fulfils criterion (5 points)) through the 16 questions of the DISCERN checklist, which has a maximum score of 80. The authors find that none of the websites is categorized as an 'excellent website' because none achieves a DISCERN score above 66. Among the assessed websites, 6 are poor and the content of 2 is very poor. Similarly, the authors of Reference [27] investigate the term 'Childhood Epilepsy' through Google, and only 42 links are included in the study after removing advertisements, health forums, duplicated websites and personal experience websites. The assessment is carried out by two different groups through the DISCERN tool. The results show that only 9.6% of the websites are of good quality while 26.2% are marked unreliable. Furthermore, 42.8% of the websites are unable to explain treatment choices effectively, whereas only 7 out of 42 websites provide clear information and access to additional sources. The authors also conclude that source attribution in all websites is poor. The quality of 80% of websites has serious deficiencies [32].
In the study of ERCP, the authors examine the search term 'ERCP' in the top three search engines [28]. A total of 60 websites are retrieved and 24 unique websites are selected for the study (excluding websites with visual content, websites consisting entirely of advanced scientific articles and those which do not contain relevant information). The authors demonstrate that the websites are unable to provide proof of a reliable source of information or the date on which the information was produced. Furthermore, none of the websites achieves a rating of 5 out of 5 in all indicators of DISCERN. The included websites also fail to describe the effect for health consumers and whether additional treatment information is available. In Reference [29], the Google, Yahoo and MSN search engines yield a total of 540 websites for anxiety disorder, of which only 67 are found eligible for the study. All the websites are categorized according to their affiliation. The results show that 29 out of 67 websites provide poor quality of information, 15 websites are of good quality and only 3 websites provide excellent quality of information on anxiety disorder.
In References [10,25-29,33-35], it is found that inconsistent, unreliable and unauthentic information co-exists with accurate information. Since the Internet can have a great impact on the treatment decisions of patients, it is very important that patients access accurate and reliable OHI to reduce the risk of detrimental Internet-based clinical decisions [36]. In addition, studies have repeatedly shown that OHI is written at a level that is too complex for most web users [19].
Therefore, this paper focuses on evaluating the quality and readability of OHI related to high BP.

Paper Contributions
High BP is a common disease and a leading cause of premature death worldwide. According to the statistics issued by the World Health Organization (WHO) in 2015, every fourth man and every fifth woman suffers from hypertension [37]. Nowadays, people are more health conscious and usually use the Internet for health-related issues before and after consultation with health professionals. It is therefore very important to evaluate the quality of information on high BP available on the World Wide Web (WWW). In this paper, we evaluate the quality of OHI about high BP by using the DISCERN tool.
Good quality of information does not guarantee its readability. Some websites providing quality OHI are not easy to read or understand. The authors of Reference [33] found that Government websites providing OHI are more readable, whereas the content of the non-profit category is extremely technical (not easily understandable by a lay reader). Hence, in this paper, we assess readability via the Flesch Reading Ease and Flesch-Kincaid tools. We identify websites associated with high BP which provide high-quality information based on a standard criterion. In addition, we highlight those websites which describe the benefits of each treatment, the risks of different treatments and the sources on which the publication is based. Similarly, we mark those websites which provide low-quality written information about high BP. Furthermore, we examine the readability level of all included websites.

Paper Organization
The rest of the paper is organized as follows. Section 2 describes the methodology of our work. Section 3 presents the results and their discussion in terms of quality and readability. Section 4 discusses the strengths and shortcomings of our work. Section 5 concludes the paper.

Selection of Search Engines and Web Browser
In this study, we use the most widely used search engines, that is, Google, Yahoo and Bing [28]. At the time of our study, based on usage across all kinds of electronic devices, the top search engines around the globe were Google (77%), Baidu (14%) and Bing (4%). Since the second most commonly used search engine is available only in the Chinese language, it is not applicable for this study. As such, the fourth most commonly used search engine, Yahoo (2.48%), is selected. All searches were performed through Google Chrome version 69.0.3497.81 (for 64-bit OS). Language and location settings were left at their defaults and the browser was set to incognito mode to avoid personalized results.

Websites Selection for the Study
After repeated consultation with health professionals (HPs), lay subjects (LS) and Internet users, the keyword 'high blood pressure' is searched separately in the three search engines mentioned above to obtain relevant information on high BP. Most people view at most the first 20 hits of their search results [33]. One study shows that more than 91% of users do not move to the next page [38]. Similarly, a study on health-specific Internet search behaviour shows that 97.2% of clicks are on the first 10 links [39].
Therefore, we limited our search to the first 20 hits from each search engine, so a total of 60 search results were rigorously reviewed. During this process, advertisements and sponsored links were ignored. Similarly, websites with irrelevant or non-English content, those asking for payment, purely scientific articles and video links were excluded from the list. Furthermore, identical websites were filtered out. As a result of this filtering process, 25 unique websites were chosen for analysis (see Table A1 in Appendix A). The workflow of the methodology is shown in Figure 1.
In order to check the quality level of the websites, all 25 websites are split into their respective categories: Government, Information, Institution and Non-profit organization. Most of the websites belong to the Information category, as shown in Table 1.

Quality Assessment Instrument
The quality of all distinct websites is evaluated separately through the widely used DISCERN instrument. This tool is used for the assessment of written health care information about diseases [10,25-29,33]. It consists of a short questionnaire that can be used by HPs, patients and LS. The DISCERN checklist consists of sixteen questions with a total score ranging from 16 (minimum) to 80 (maximum), where each question is rated on an ordinal Likert scale of 1 (for No) to 5 (for Yes) [18].

This instrument is set up in three parts. The first part of the DISCERN questionnaire (Questions 1 to 8) concerns the reliability of the publication, which covers the aim of the publication, the source of information, the relevance of the publication, additional sources of information about the disease and so forth. The second part (Questions 9 to 15) concerns the description of treatment: for example, how a specific treatment works, what the risks and benefits of a particular treatment are and whether there are other treatment choices for patients. The third part (Question 16) rates the overall quality of the publication as a source of information on treatment options. The DISCERN instrument is used and validated in combination with a handbook [18]. A grading mechanism, as shown in Table 2, has been generated on the basis of the score achieved by each website [33]. The details of the DISCERN questionnaire are given in Table A2.
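As an illustration of the scoring scheme described above, the following minimal Python sketch sums one rater's sixteen DISCERN ratings into the three part scores and the total. The function name `discern_summary` and the "percent of maximum" field are illustrative assumptions, not part of the DISCERN handbook; in particular, expressing the total as a percentage of 80 is only one possible convention, and the grading thresholds of Table 2 are deliberately not reproduced here.

```python
from typing import Dict, Sequence

N_QUESTIONS = 16           # DISCERN has sixteen questions
MIN_RATING, MAX_RATING = 1, 5  # each rated from 1 (No) to 5 (Yes)


def discern_summary(ratings: Sequence[int]) -> Dict[str, float]:
    """Summarise one rater's DISCERN ratings for one website.

    `ratings` holds the answers to Questions 1..16 in order.
    The percentage convention (total / 80) is an assumption for illustration.
    """
    if len(ratings) != N_QUESTIONS:
        raise ValueError("DISCERN requires exactly 16 ratings")
    if any(not MIN_RATING <= r <= MAX_RATING for r in ratings):
        raise ValueError("each rating must lie between 1 and 5")

    total = sum(ratings)
    return {
        "part1_reliability": sum(ratings[0:8]),   # Questions 1-8
        "part2_treatment": sum(ratings[8:15]),    # Questions 9-15
        "part3_overall": ratings[15],             # Question 16
        "total": total,                           # ranges from 16 to 80
        "percent_of_max": 100.0 * total / (N_QUESTIONS * MAX_RATING),
    }
```

For example, a website rated 3 on every question receives a total of 48, which is 60% of the maximum of 80 under this convention.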

Readability Assessment
In this study, the readability of the studied websites is determined using the Flesch Reading Ease Score (FRES) [29,40]. A built-in function for this tool is available in Microsoft Word 2010. FRES gives a numeric value which does not indicate the grade level, so in order to get the grade level we use the Flesch-Kincaid Grade Level, reported here as the Average Grade Level (AGL).
All content of each web page is copied and pasted into Microsoft Word 2010 separately. Each webpage is then analysed thoroughly and irrelevant content is removed. Hyperlinks (such as webpage links), colons, semicolons, hyphens, decimals and abbreviations (such as 'U.S') are also removed [41]. Thus, we obtain the numeric values of FRES and AGL for each website. The standard Flesch formula for FRES is:

$$\mathrm{FRES} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)$$

The value of FRES ranges from 0 to 100 (from very difficult to read to very easy to read). Content with a high FRES value is easy to read [42].
The value of AGL is obtained from the standard Flesch-Kincaid formula:

$$\mathrm{AGL} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59$$

Both scores depend on the average sentence length (words per sentence) and the average word length (syllables per word). The results of the two tools are inversely related, meaning that a text with a high FRES should have a low AGL.
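The two readability scores can be sketched in Python as follows. The coefficients are those of the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas; that Microsoft Word 2010 applies exactly these coefficients and the same word, sentence and syllable counting rules is an assumption. The function names are illustrative, and the counts are taken as inputs because syllable counting itself is heuristic.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease Score from raw counts (higher = easier to read)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    avg_sentence_length = words / sentences      # words per sentence
    avg_syllables_per_word = syllables / words   # syllables per word
    return 206.835 - 1.015 * avg_sentence_length - 84.6 * avg_syllables_per_word


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw counts (higher = harder to read)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    avg_sentence_length = words / sentences
    avg_syllables_per_word = syllables / words
    return 0.39 * avg_sentence_length + 11.8 * avg_syllables_per_word - 15.59
```

For a hypothetical passage of 100 words, 5 sentences and 150 syllables, these functions give a FRES of about 59.6 and a grade level of about 9.9. Longer sentences or longer words lower the first value and raise the second, which is the inverse relation noted in the text.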

Data Collection
The data for this work were extracted on 10 September 2018. Screening of the websites against the inclusion and exclusion criteria was carried out between 10 and 15 September 2018, while the quality and readability assessments were performed between 15 September and 30 September 2018.

Websites Evaluation
For the evaluation of all included links, two groups are created. The first group is from the medical profession and consists of five HPs, each holding at least a Bachelor of Medicine, Bachelor of Surgery (MBBS) degree. All HPs come from different areas of Khyber Pakhtunkhwa (Peshawar, Malakand and Swat). The second group is also composed of five members, but with different educational backgrounds, that is, engineering, natural sciences and so forth. They are also from various regions of Khyber Pakhtunkhwa. All group members are computer literate. Two groups with different academic backgrounds are selected so that each member evaluates the included websites independently and the results of the medical group can then be compared and correlated with those of the second group.
All evaluators were fully trained on the DISCERN questionnaire. Each member of both groups was provided with all included websites along with a copy of the DISCERN manual, which explains all 16 questions with examples. Responses from the group members were collected through a Google sheet shared with all group members. To distinguish individual evaluation responses within a group, each member was assigned a unique ID. After fully understanding the DISCERN checklist and the online links, all group members uploaded their DISCERN ratings for each website separately. In this way, participants independently evaluated all 25 websites. The average (mean) value and standard deviation (SD) are computed for both groups. SD is a measure of dispersion which shows how far the data are spread out from the mean of each group; the farther the data spread, the greater the SD. Furthermore, all 25 distinct websites are classified into different quality levels based on their DISCERN scores according to Table 2.
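The averaging step can be sketched in Python as below. This is a minimal sketch under stated assumptions: each website's group score is taken as the mean of the five raters' total DISCERN scores, and the group-level SD is the sample standard deviation across websites. The text does not state whether the sample or population SD was used, and the function name and example numbers are hypothetical.

```python
import statistics
from typing import Dict, List, Tuple


def group_mean_and_sd(totals_by_site: Dict[str, List[float]]) -> Tuple[float, float]:
    """Return (mean, SD) of per-website average DISCERN totals for one group.

    `totals_by_site` maps a website label to the total scores given by each
    rater in the group. Sample SD (n - 1 denominator) is an assumption.
    """
    site_means = [statistics.mean(totals) for totals in totals_by_site.values()]
    return statistics.mean(site_means), statistics.stdev(site_means)


# Hypothetical ratings from five raters for three websites.
example = {
    "site_a": [40, 42, 44, 46, 48],
    "site_b": [60, 60, 60, 60, 60],
    "site_c": [50, 52, 54, 56, 58],
}
mean_score, sd_score = group_mean_and_sd(example)
print(f"{mean_score:.2f} ± {sd_score:.2f}")
```

Reporting the result in the "mean ± SD" form mirrors the way the group averages are presented in the results.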

Data Analysis
For statistical analysis, we used IBM SPSS Statistics version 20. Cronbach's α (alpha), a coefficient of reliability, is used to measure the internal consistency of the Likert-scale DISCERN questionnaire, that is, how closely related its set of items is. The coefficient of reliability (α) ranges from 0 to 1, and a value over 0.70 is the minimum acceptable level [43]. Similarly, before taking the average value of each group, Fleiss' kappa statistics are computed to assess the inter-rater agreement within both evaluating groups. The strength of agreement according to the Fleiss' kappa (k) statistic is given in Table 3 [44]. Further, a two-tailed Mann-Whitney test (also known as the Wilcoxon rank-sum test) is conducted to assess the statistical significance of the difference between the two groups for each website evaluated by both groups. Similarly, for two indicators of DISCERN (Question 4 and Question 7), a Chi-squared test is performed and compared for both groups.
Further, normality and homoscedasticity tests are performed to check the distribution of website scores for both groups. As the results of these tests were not statistically significant, we carried out an independent-samples t-test to assess the difference between the two groups of evaluators.
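For readers without access to SPSS, Cronbach's α can be sketched in a few lines of Python using the usual formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. This is a minimal sketch, not the SPSS procedure; the function name and the layout of the input (one row per rated observation, one column per questionnaire item) are assumptions.

```python
import statistics
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a table of item scores.

    Each row is one observation (e.g., one rating of one website) and each
    column is one questionnaire item. Uses sample variances throughout.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two observations")
    if any(len(row) != k for row in rows):
        raise ValueError("all rows must have the same number of items")

    item_variances = [statistics.variance(col) for col in zip(*rows)]
    total_variance = statistics.variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

When all items rise and fall together across observations, the variance of the totals dominates the sum of item variances and α approaches 1; values above the 0.70 threshold quoted in the text are read as acceptable internal consistency.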

Results
As discussed earlier, websites evaluation is accomplished by two groups. The responses from Group 1 (HPs) are compiled and results along with discussion are presented in Section 3.1, which is followed by the feedback from Group 2 (LS) and results are discussed in Section 3.2.

Group 1: Health Professional
Responses received from HPs are analysed in this section as follows:

High Scoring Websites
The evaluated websites are ranked based on the DISCERN score. Summary of the DISCERN score of the websites on the basis of responses obtained from HPs is given in Table A3 (Column: HPs). The DISCERN score ranges from 27.0 (being the lowest) to 72.2 (being the highest), with the average value 49.43 ± 14.0. The average DISCERN score indicates fair quality of information on high BP.

Quality of Included URLs
Fair to good quality is exhibited by the reviewed websites. Details of the evaluation of candidate websites by HPs are given in Table 4.

A. DISCERN's Part-I (Questions 1-8)
Two websites (Table A3: S.No. 3-4) attain 4.8 and 4.7 points out of 5, respectively, in terms of relevance of publication. Overall, 10 (40%) of the websites are rated 'fair' on reliability of publication (Part-I) and 4 (16%) websites are found to be unreliable during the assessment period, as is evident from Table 4. Six websites do not state the date of publication or the dates of the main sources of evidence, while 11 websites adequately state the date of publication and achieve 3-4 points in the DISCERN ratings.

B. DISCERN's Part-II (Questions 9-15)
It is observed from Table 4 (Part-II) that 5 (20%) websites have an excellent quality of information on treatment choices. Six websites thoroughly explain each treatment that is useful for high BP. According to HPs, these websites score 4 to 5 out of 5 points on DISCERN. On treatment options, the quality of 7 (28%) websites is substandard. Furthermore, six websites do not describe the risks or disadvantages of a particular treatment, although it is essential to be aware of the risks in order to understand what to expect from the treatment and to be able to make further decisions. The quality of 9 (36%) websites is fair. Eight websites do not properly describe what would happen if no treatment is used. According to HPs, this is the most important question, because a publication of good quality will always include a description of what would happen if the condition remains untreated.

C. DISCERN's Part-III (Question 16)
From Question 16, overall quality of the website is assessed. As is apparent from Table 4, only 5 (20%) websites (out of 25) have 'excellent' quality and have produced appropriate sources of information on high BP. The overall quality of 5 (20%) websites is poor. Similarly, 2 (8%) websites are deficient in content.

DISCERN Indicator: Clarity of Sources
One of the essential indicators of the reliability of health websites is clarity of sources, which means identification of the sources, other than the originator, used to compile the publication. A source may be an expert opinion, a reference or any other evidence. The DISCERN rating for clarity of sources is shown in Figure 2. According to the health experts, DISCERN cannot be used to tell whether an article is true or not, but this indicator lets users check the article against some other credible source. Through the DISCERN indicator 'clarity of sources,' it is found that 9 out of 25 websites provide no evidence for their publications. Only 3 (12%) websites provide clear sources of information about high BP and obtain the maximum score, that is, 5 out of 5, for this indicator. The remaining 13 out of 25 websites provide some references for this indicator.

DISCERN Indicator: Additional Source of Information
The indicator for additional sources of information looks for recommendations of other organizations that provide further information about treatment options, conditions or the disease in general. It can be seen from Figure 3 that 40% of the websites provide satisfactory additional sources of information on high BP, whereas 16% of the websites are unable to provide additional information on high BP. Amongst the 25 websites, only 4 (those with an overall high DISCERN score) provide details for the DISCERN indicator 'additional sources of information' on high BP. For further study, it is essential to have details of other supporting sources about treatment choices. Websites with a high DISCERN score provide adequate additional information on high BP.

Group 2: Lay Subjects
Responses received from LS are analysed in this section as follows:

High Scoring Websites
As per the assessment of Group 2 (Lay Subjects), all websites are marked according to their DISCERN score. The DISCERN scores range from 25.2 (the lowest score; Table A3: S.No. 25 (LS)) to 69.6 (the highest score; Table A3: S.No. 5 (LS)) with an average of 48.7 ± 12.2 (fair quality). The average score of the LS group is close to the average score of the HPs group (49.43). The average value indicates that the overall quality of the websites is 'fair.' The DISCERN scores of all websites are given in Table A3.

Quality of Included URLs
Fair to good quality is exhibited by the reviewed websites. A detailed analysis of the evaluation of candidate websites by Group 2 (LS) is given in Table 5.
A. DISCERN's Part-I (Questions 1-8)
After further dissecting the results of the DISCERN questionnaire by LS, it is found that only 5 out of 25 websites have produced reliable publications, as shown in Table 5. On analysing the first eight questions of DISCERN, it is found that seven websites (mentioned in [...] Table 5.
A good publication always describes the benefits of each treatment. It is found from the analysis of Questions 9-15 that 3 out of 25 websites (mentioned in [...]
Similarly, if we look at Question 16, which is related to the overall quality of the websites, we can see that only 6 websites provide excellent quality of information about high BP, as shown in Table 5. The analysis shows that overall 8 out of 25 websites provide poor quality of information about the disease.

DISCERN Indicator: Clarity of Sources
After analysis of the data assessed by LS, it is found that only one website (4%) lists references at the end of the publication, which shows that the author has provided clear evidence for the publication; it is thus highly rated for the DISCERN indicator 'clarity of sources.' Seven websites (28%) partially show their sources of information and 10 websites (40%) have a poor rating for this indicator because they do not show any links, sources or other references to confirm authenticity. The statistics are summarized in Figure 4.

DISCERN Indicator: Additional Source of Information
After the assessment of websites by Group 2 (LS), it was found that only two websites (out of 25) have not discussed additional sources of information about high BP and hence achieved a poor score in the DISCERN indicator 'Additional source of information.' The remaining websites have shown an additional source of information. The findings are summarized in Figure 5.

Results Comparison of both Groups (HPs & LS)
In this section we compare the analyses of Group 1 and Group 2. A section-wise comparison of the DISCERN questionnaire by HPs and LS is shown in Figure 6. It is clear from the figure that the assessments of Part-I and Part-II of the DISCERN questionnaire by both groups are closely related, whereas there is a slightly greater difference in Part-III of the questionnaire, where the average DISCERN scores are 60.4% for HPs (Group 1) and 64.4% for LS (Group 2).

Figure 6. Results comparison by both groups (HPs and LS).

Figure 7 shows the performance of each category of website as rated by both groups. It is evident from the figure that the content of the Information category websites is highly rated by both groups and thus achieves the highest DISCERN score. The number of websites in the Information category is also the highest, at 9. This suggests that the Information category websites have a high visibility rate due to their good quality of information on health issues. In contrast, Non-profit category websites fail to provide clear aims, citations, treatment options and additional information about high BP. Consequently, the content of the Non-profit category is rated the lowest by both groups. The performance of Government and Institution category websites is average.

Figure 7. Performance of categories of websites by both groups.

Statistical Analysis
Before considering the average value of all evaluated websites, we performed a Fleiss' kappa (k) analysis for each group to assess the agreement among its five raters' scores for all websites. The average values of Fleiss' kappa (k) for HPs and LS are 0.3912 and 0.3421, respectively. According to Table 3, the strength of agreement within both groups is 'fair.' Kappa (k) values for each website are shown in Table A1 in Appendix A (Column: HPs & LS). As there are five raters per group, the calculated values of kappa (k) show good consistency among the evaluators of each group. Further, to measure the internal consistency of the DISCERN questionnaire, we calculated Cronbach's α for both evaluating groups. Across all DISCERN questions scored by each evaluator in the HP group, the value of α is 0.876, which indicates high internal consistency. Similarly, the value of α for the LS group is 0.783, which also lies in a good range. Both values of α show that the tool is reliable for evaluation.
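The two reliability measures above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook formulas, not the authors' actual computation: the data layout (each DISCERN question treated as a subject, the 1-5 ratings as categories), the function names and the toy data are all assumptions made for the example.

```python
from statistics import variance


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa; ratings[i] lists the category each rater gave subject i."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j]: how many raters assigned subject i to category j
    counts = [[row.count(c) for c in categories] for row in ratings]
    # Observed agreement per subject, then averaged over subjects
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the overall share of each category
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)
    # Undefined (division by zero) if every rating falls in a single category
    return (p_bar - p_e) / (1 - p_e)


def cronbach_alpha(scores):
    """Cronbach's alpha; scores[r][q] is the score of observation r on item q."""
    n_items = len(scores[0])
    item_vars = [variance([row[q] for row in scores]) for q in range(n_items)]
    total_var = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 4 questions, each scored 1-5 by 5 raters
toy_ratings = [
    [4, 4, 5, 4, 4],
    [2, 3, 2, 2, 3],
    [5, 5, 5, 4, 5],
    [3, 3, 2, 3, 3],
]
print("kappa:", round(fleiss_kappa(toy_ratings, categories=[1, 2, 3, 4, 5]), 3))

# Hypothetical toy data: 3 observations scored on 2 items
toy_scores = [[1, 1], [2, 3], [3, 2]]
print("alpha:", round(cronbach_alpha(toy_scores), 3))
```

Kappa treats the five score levels as unordered categories, so two raters giving 4 and 5 count as disagreeing just as much as two raters giving 1 and 5; this is one reason kappa values on ordinal scales tend to look modest.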
Similarly, the Wilcoxon rank-sum test has been used to compute the value of p, at a significance level of p < 0.05, for each website evaluated by both groups. The values are shown in the last column of Table A3. Furthermore, the Chi-squared test has been calculated for the DISCERN indicator regarding clarity of sources (Question 4 of the DISCERN questionnaire given in Table A2). For HPs, the value of Chi-squared is 0.199148, whereas for LS it is 0.014612, at a significance level of p < 0.05. The same Chi-squared test has been carried out for additional sources of information (Question 7 of the DISCERN questionnaire given in Table A2). Here, the value of Chi-squared is 0.024406 for HPs, while for LS it is 0.005135, at p < 0.05.
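As an illustration of the rank-sum comparison (equivalent to the Mann-Whitney U test), the sketch below ranks the pooled scores of two groups and converts the U statistic to a two-sided p-value. It assumes the normal approximation without a tie correction, which is rough for samples as small as five raters per group; statistical packages typically use exact tables in that case. The function names and sample scores are hypothetical.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (U statistic of x, z score, two-sided p) via normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    r1 = sum(ranks[:n1])  # rank sum of the first sample
    u = r1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # assumes no tie correction
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u, z, p


# Hypothetical DISCERN totals for one website from each group of five raters
hp_scores = [52, 48, 55, 50, 47]
ls_scores = [49, 51, 46, 53, 45]
print(rank_sum_test(hp_scores, ls_scores))
```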
Normality and homoscedasticity tests were performed to check whether the scores of both groups of evaluators are normally distributed and have equal variance. The Shapiro-Wilk test (p > 0.05) [45,46] and visual inspection of the histograms and box plots indicate that the website scores are approximately normally distributed for both groups. For HPs, the skewness is −0.078 (Standard Error (SE) = 0.464) and the kurtosis is −1.448 (SE = 0.902), whereas for LS the skewness is 0.197 (SE = 0.481) and the kurtosis is −0.943 (SE = 0.935) [47]. Similarly, in the Q-Q plots of both groups, all the points lie along the line, which indicates that the data are approximately normally distributed. Further, the analysis showed that the two groups are homoscedastic, that is, they have the same level of variance.
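One common way to read the skewness and kurtosis figures above is to divide each statistic by its standard error and check that the ratio lies within ±1.96. The sketch below applies that rule of thumb to the reported values; the text does not state which criterion was actually used, so the threshold is an assumption. The standard-error formulas are the usual small-sample ones, and with n = 25 they reproduce the SE values reported for HPs.

```python
import math


def se_skewness(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Standard error of sample excess kurtosis for a sample of size n."""
    return 2 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


def z_ratio(statistic, standard_error):
    """Statistic divided by its standard error."""
    return statistic / standard_error


# (value, SE) pairs exactly as reported in the text
reported = {
    "HPs": {"skewness": (-0.078, 0.464), "kurtosis": (-1.448, 0.902)},
    "LS": {"skewness": (0.197, 0.481), "kurtosis": (-0.943, 0.935)},
}

for group, stats in reported.items():
    for name, (value, se) in stats.items():
        z = z_ratio(value, se)
        # Assumed rule of thumb: |z| < 1.96 is compatible with normality
        print(group, name, round(z, 2), "within +/-1.96:", abs(z) < 1.96)

print(round(se_skewness(25), 3), round(se_kurtosis(25), 3))
```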
After confirming normality and homoscedasticity, we performed an independent samples t-test to check for a statistical difference between the DISCERN scores given by the two groups. The independent samples t-test is a parametric test that compares the means of two independent groups to determine whether there is statistical evidence that the means of the associated populations differ significantly. The t-value is 0.09854 and the p-value is 0.921914, so the difference is not significant at p < 0.05. Lastly, we calculated t-values for the differences between the two groups' DISCERN scores for each category of websites. The results are shown in Table 6. The results show that the value of p is significant at p < 0.05 only for the Information category websites.
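The t statistic for two independent samples with equal variances can be sketched as follows. This is the standard pooled-variance (Student) form, which matches the homoscedasticity assumption stated above; the function name and sample data are hypothetical, and the p-value is left out because it needs the t distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t statistic, degrees of freedom) for two independent samples."""
    n1, n2 = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, n1 + n2 - 2


# Hypothetical per-website average DISCERN scores for the two groups
hp_avg = [62.0, 55.4, 48.2, 41.0, 35.6]
ls_avg = [60.2, 56.8, 47.0, 42.4, 36.0]
t_value, dof = pooled_t(hp_avg, ls_avg)
print(round(t_value, 4), dof)
```

A t-value close to zero, as reported in the text, means the two group means differ by much less than one standard error.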

Readability Level of All Included Websites
The mean FRES of the websites is 58.5 (fairly difficult to read) with a standard deviation (SD) of 11.1, and the mean AGL is 8.8 (SD 1.9). The overall readability level of all websites is therefore 'fairly difficult to read.' Certain websites (for example, Wikipedia) have a very low readability level according to the criteria of the Flesch-Kincaid tool. The FRES and AGL scores of all included websites are shown in Table A4. Moreover, the average readability level of each category of websites was calculated separately. The content of government websites is written in plain English that can be easily understood. Table 7 shows the average FRES and AGL of all categories of websites. The Information category websites, which achieved a high DISCERN score in both assessments, nevertheless performed worse in readability.
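For reference, both readability scores are simple functions of average sentence length and average syllables per word. The sketch below uses the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas together with Flesch's usual interpretation bands; it takes word, sentence and syllable counts as inputs, because syllable counting is heuristic and differs between tools. The function names and the example counts are assumptions for illustration.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease Score (higher means easier to read)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def fres_band(score):
    """Map a Reading Ease score to Flesch's descriptive band."""
    bands = [
        (90, "very easy"),
        (80, "easy"),
        (70, "fairly easy"),
        (60, "standard"),
        (50, "fairly difficult"),
        (30, "difficult"),
    ]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "very confusing"


# Hypothetical passage: 100 words, 5 sentences, 150 syllables
fres = flesch_reading_ease(100, 5, 150)
grade = flesch_kincaid_grade(100, 5, 150)
print(round(fres, 3), round(grade, 2), fres_band(fres))
print(fres_band(58.5))  # band for the mean FRES reported in the text
```

Because syllables per word carries a weight of 84.6 in the first formula, a passage dense with long medical terms loses many points even when its sentences are short.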

Strengths and Limitations
To the best of our knowledge, this is the first time the DISCERN instrument has been used for evaluating the content of the World Wide Web. From the study, we learned that websites appearing on the 3rd and 4th click of the search engine are not necessarily of good quality.
On the other hand, there are several constraints that must be kept in mind when applying the findings of this work.
Search engines continuously update and improve their search strategies, so the same query may return different items in the future, which may lead to different results. Also, because we chose the first 20 hits (first 2 pages) from each search engine, we may have missed high-quality websites simply because they were not among the first 20 search results. Although DISCERN is a reliable tool, it has its own limitations. For example, it says nothing about the display or presentation of information, or about browsing and locating information on a website. We also excluded scientific websites from the list of candidate websites due to the limitations of DISCERN.
Similarly, the Flesch-Kincaid tool has certain limitations. For example, many medical terms are long and polysyllabic, which raises the syllables-per-word count used by the Flesch formulas. Thus, the tool rates an article containing such terms as more complex, which may not fairly reflect its actual readability. Furthermore, a website with a large amount of information content may show a greater level of complexity than a smaller one. The complexity and readability of a document also depend on factors other than the length of words and sentences [17]. Previous studies show that readability formulas do not take graphic or pictorial material into account [40], which may also affect the overall readability of a website. Furthermore, this study is restricted to websites written in the English language; quality information may exist on non-English websites too. Search results may also differ by region. Our findings are based on search queries conducted in Malakand, in the Khyber Pakhtunkhwa region of Pakistan. All of these factors may produce variation in results.

Conclusions
In this study, we assessed the quality and readability of OHI on high BP, a common and widely known disease whose treatment is not as complex as that of diseases such as Recurrent Respiratory Papillomatosis, maxillofacial trauma, diverticulitis and Endoscopic Retrograde Cholangiopancreatography (ERCP).
The study considers two different groups for evaluation and each group consists of 5 evaluators. It was found from the analysis that the quality of OHI on high BP is moderate.
To check the internal consistency of the DISCERN questionnaire, the coefficient of reliability was computed. The overall average DISCERN score of all studied websites is 48.1 (60.1%), which is below the norm for a disease such as hypertension. The quality of three websites (Table A3: S.No. 23-25) is very poor, and five websites have high-quality information on hypertension. The assessments of both groups have one thing in common: the top ten websites in each group are the same, although they have been marked differently (see Table A3: S.No. 1-10). Overall, the quality of 11 out of 25 (44%) websites ranges from good to excellent. Both groups concluded that most of the websites do not mention references and supplementary sources.
Moreover, 11 out of 25 websites do not meet the recommended readability score. Among these 11 websites, 6 achieved an excellent DISCERN score, but their readability score is less than 50, which is difficult even for a college student to understand properly. Hence, to make these websites more useful, there is a dire need to improve their readability level. It was also observed that websites with a high visibility rate (appearing on the 4th and 5th hit of the search engine) do not necessarily have good quality of health information compared to subsequent websites.
Table A2. DISCERN Questionnaire.