Article

Building an Expert System through Machine Learning for Predicting the Quality of a Website Based on Its Completion

by Vishnu Priya Biyyapu 1, Sastry Kodanda Rama Jammalamadaka 1,*, Sasi Bhanu Jammalamadaka 2, Bhupati Chokara 1, Bala Krishna Kamesh Duvvuri 3 and Raja Rao Budaraju 1

1 Department of Computer Science and Engineering, K L Deemed to be University, Vaddeswaram, Guntur 522302, India
2 Department of Computer Science and Engineering, CMR College of Engineering & Technology, Hyderabad 501401, India
3 Department of Computer Science and Engineering, Malla Reddy Engineering College for Women, Secunderabad 500100, India
* Author to whom correspondence should be addressed.
Computers 2023, 12(9), 181; https://doi.org/10.3390/computers12090181
Submission received: 21 July 2023 / Revised: 25 August 2023 / Accepted: 31 August 2023 / Published: 11 September 2023

Abstract: The main channel for disseminating information is now the Internet. Users have different expectations of the calibre of websites regarding the posted and presented content. A website's quality is influenced by up to 120 factors, each represented by two to fifteen attributes. A major challenge is quantifying these features and evaluating the quality of a website from the feature counts. One of the aspects that determines a website's quality is its completeness, which focuses on the existence of all the required objects and their connections with one another. Building an expert model that evaluates website quality from feature counts is not easy, and this paper focuses on that challenge. Both a methodology for calculating a website's quality and a parser-based approach for measuring feature counts are offered. We provide a multi-layer perceptron model that serves as an expert model for forecasting website quality from the "completeness" perspective. The accuracy of the predictions is 98%, whilst the accuracy of the nearest model is 87%.

1. Introduction

Information is widely disseminated through websites. When it comes to providing users with high-quality material, the quality of websites is the most crucial concern. By considering every factor impacting the quality of a website, more precise and pertinent information may be made available to consumers. To assess the actual condition of a website, it is crucial to identify the variables that affect website quality, utilizing models that aid in quantitative website quality computation.
More than 42 factors are used to calculate website quality. A model must be developed to calculate each sub-factor's quality, and the sub-factor qualities must be added and averaged to obtain a factor's quality.
Everyone now engages in the exchange of information via websites. A website's content is available in various formats, including sound, photographs, movies, audio, graphics, etc., and different material types are posted in multiple forms. Numerous new websites are launched every day. The real issue is the calibre of these websites. People depend on high-quality content, and evaluating quality is the most challenging task. Customer happiness depends on how well designed the website is when it is searched or browsed for information. Numerous aspects must be considered to evaluate the quality of websites, including usability, connectivity, navigation, structure, safety, maintainability, dependability, functionality, privacy, portability, etc. Various tools, approaches, models, and procedures must be used to evaluate the calibre of a website. Most website businesses need many elements, such as a logo, animated graphics, colour schemes, mouse-over effects, graphics, art, database communication, and other requirements.
To assess a website’s quality, it is necessary to employ several characteristics, either directly connected or unrelated. Some techniques used to evaluate the quality of websites are subjective and rely on user input. Statistical methods, usually called objective measurements, can be used to measure quality factors.
Different stakeholders have various quality standards. Other factors can be used to evaluate the quality of a website from various perspectives, including those of designers, management, consumers, developers, etc. End users think about usability, efficiency, credibility, and other things, whereas programmers think about maintainability, security, functionality, etc. Only after determining a website’s demands from an actor’s viewpoint can the quality criteria best satisfy those expectations.
Websites are becoming more complicated for those concerned with e-commerce, electronics, museums, etc. Due to the availability of more components and the interactions between the parameters, choosing relevant quality factors and building quality evaluation procedures to quantify the variables is becoming more complex. The relationships between traits, attributes, and websites must be considered. Website evaluation must combine well-considered subjective judgement with impartial, objective measurement, and both kinds of procedure must be applied where the criteria call for them. Websites have become the most important mechanisms for disseminating information on whatever aspects the user focuses on. However, the reliability of the information posted on websites must be verified. The quality of a website is always crucial to users, as it indicates how far the information posted on the site can be depended upon.
Most assessments of the quality of a website are based on human–computer interaction through usability heuristics [1], design principles [2], and rules [3]. The quality of a website can be evaluated along different dimensions. A formal framework has yet to be established, even though several quality standards have been prescribed [4,5].
A website’s quality has been described in various ways by different people. Most recently, many metrics have been used to determine a website’s capacity to live up to its owners’ and visitors’ expectations [6]. There is no standard strategy, because every solution suggested in the literature uses distinct evaluation techniques [7]. Evaluating the quality of a website is different from doing so for other systems. A website’s quality is evaluated using multidimensional modelling [8].
The quality of a website is computed from three different perspectives: the functional [9], strategic [10], and experiential [11] perspectives. All the methods focus on some measures, attributes, characteristics, dimensions, etc.; these terms are largely synonymous, and each method explores distinctive features of the websites.
Different methodologies are used to compute a website’s quality. The experimental, quasi-experimental, descriptive, observational, associative, qualitative, and subjective–objective methods are either participative (surveys, checklists) or non-participative (web analytics). The participative methods focus on user preferences and psychological responses [12]. The testing technique is frequently employed to compute quality indicators, including usability tests, A/B tests, ethnographic tests, think-aloud tests, diary studies, questionnaires, surveys, checklists, and interviews, to name a few [13].
The most recent method uses an expert system involving heuristic evaluation [14]. Experts judge the quality of a website by considering a chosen set of website features. The expert methods are evaluated manually or through an automated software system. Some expert systems are also developed using artificial intelligence techniques and natural language processing [15].
According to a survey by Morales-Vargas et al. [16], one of three dimensions—strategic, functional, and experiential—is used to judge the quality of websites, and expert analysis is the method required for evaluating it. They examined a website's qualities and categorized them into variables used to calculate its quality. Different approaches have been proposed to aid in quantifying the quality of a website.
The research questions are as follows:
  • What are the features that together determine the completeness of a website?
  • How are the features computed?
  • How do the features relate to the website’s degree of excellence?
  • How do we predict the quality of a website, given the code?
The research outcomes are as follows:
  • This research presents a parser that computes the counts of different features of a website;
  • A model can be used to compute the quality of a website based on the feature counts;
  • An example set is developed considering the code related to 100 websites. Each website is represented as a set of features with associated counts, and the website quality is assessed through a quality model;
  • A multi-layer perceptron-based model is presented to learn the quality of a website based on the feature counts, which can be used to predict the quality of the website given the feature counts computed through a parser model.

2. Related Work

By considering several elements, including security, usability, sufficiency, and appearance, Fiaz Khawaja et al. [17] evaluated website quality. A good website is simple to use and offers learning opportunities. A website's quality increases when it is used more frequently. When consumers learn from high-quality websites, their experience might be rich and worthwhile. The "appearance" factor describes how a website looks and feels, including how appealing it is, how items and colours are arranged, how information is organized meaningfully, etc. A method for calculating website quality based on observations made while the website is being used has been presented by Kausar Fiaz Khawaja [17]. Flexibility, safety, and usability are just a few of the elements that Sastry et al. [18] and Vijay Kumar Mantri et al. [19] considered while determining website quality. "Usability" refers to the website's usefulness, enjoyment, and efficacy. The "safety" factor concerns the user's presence on, and connection to, the website while browsing; there should never be public access to the user's connection to the website. The "flexibility" aspect is connected to the capability included in a website's design that enables adjustments to the website even while it is being used. Users can assess a website's quality using a Portal Data Quality Assessment Tool (PoDQA), which uses pre-set criteria.
Vassilis S. Moustakis et al. [20] indicated that assessing the quality of a website requires several elements, including navigation, content, structure, multimedia, appearance, and originality. The term “content” describes the data published on a website and made accessible to visitors via an interface. The “content” quality factor describes how general and specialized a domain possibly is.
Navigation refers to the aid created and offered to the user to assist in navigating the website. The ease of navigating a website, the accessibility of associated links, and its simplicity all affect its navigational quality. The "structure" quality aspect has to do with things like accessibility, organization of the content, and speed. The appearance and application of various multimedia and graphics forms can impact a website's overall feel and aesthetic. A website could be developed with a variety of styles in mind. A website's "uniqueness" relates to how distinct it is from other comparable websites. A high-quality website must be unique, and users visit such websites frequently. Vassilis S. Moustakis et al. proposed a technique known as the Analytical Hierarchical Process (AHP) and utilized it to determine website quality. Numerous additional factors must also be considered to determine website quality; Andrina Granić et al. [21] considered this in their assessment of the "portability" of content—the capacity to move it from one site to another without requiring any adjustments on either end.
Tanya Singh et al. [22] have presented an evaluation system that considers a variety of variables, such as appearance, sufficiency, security, and privacy. They portrayed these elements in their literal sense. A website’s usability should be calculated to represent its quality in terms of how easy it is to use and how much one can learn from it. The usability of a website refers to how easily users can utilize it. To a concerned user, some information published on the website can be private. The relevant information was made available to the qualified website users.
The exactness with which a user’s privacy is preserved is the attribute “privacy”-related quality. Only those who have been verified and authorized should have access to the content. Users’ information communicated with websites must be protected to prevent loss or corruption while in transit. The security level used during the data exchange can be used to evaluate the website’s quality.
The "adequacy" factor, which deals with the correctness and completeness of the content hosted on the website, was also considered. Anusha et al. [23] evaluated similar traits while determining website quality. The most important factor they considered was "portability", which is when content and code may be transferred from one machine to another without being modified or prepared for the target machine. Another critical element of websites is the dependability of the content users see each time they launch a browser window. When a user clicks on a specific link on a website, it must always display the same content unless the content is continually changing. The dependability of a website is determined by the possibility that the desired page will not be accessible.
When improvements to the website are required, maintenance of the existing website is straightforward. The ease of website maintenance affects the quality of the website. When evaluating the quality of a website, taking the aspect of “maintainability” into account, several factors, such as changeability, testability, analysability, etc., must be considered. The capacity to modify the website while it is active is a crucial factor to consider regarding maintainability.
The website's capacity to be analysed is another crucial aspect that should be considered when evaluating its quality. The ability to read the information, relate the content, comprehend it, and locate and identify the website's navigational paths all fall under a website's analysability. A website can be called stable when there is no unfinished content, no unplanned change to the user interface, and no disorganized presentation of the display. A reliable website stays mostly the same. The problem of testing must be considered when designing a website, and it should be possible to make updates while the website is being used. Nowadays, most websites still do not provide this feature.
Filippo Ricca et al. [24] have considered numerous other parameters to calculate website quality. The website’s design, organization, user-friendliness, and organizational structure are among the elements considered. Web pages and their interlinking are part of the website’s organization. The practical accessibility of the web pages is directly impacted by how they are linked. When creating websites, it is essential to consider user preferences. It is necessary to render the anticipated content.
According to Saleh Alwahaishi et al. [25], the levels at which the content is created and the playfulness with which the content is accessible are the two most important factors to consider when evaluating a website’s quality. Although they have established the structure, more sophisticated computational methods are still required to evaluate the websites’ quality. They contend that a broad criterion is required to assess the value of any website that provides any services or material supported by the website. The many elements influencing a website’s quality have been discussed. In their presentation, Layla Hasan and Emad Abuelrub [26] stressed that web designers and developers should consider these factors in addition to quality indicators and checklists.
The amount of information being shared over the Internet is rising at an alarming rate, and websites and web apps have expanded quickly. Websites must be of a high standard if they are to be relied upon for obtaining the information required for various purposes. Kavindra Kumar Singh et al. [27] have created a methodology known as the Web Quality Evaluation Method (WebQEM) for computing the quality of websites based on objective evaluation. However, judging the quality of websites based on subjective evaluation might be more accurate. They quantitatively evaluated website quality based on an impartial review, and their method incorporates qualities, characteristics, and sub-characteristics.
People communicate with one another online, especially on social media sites. It has become essential that these social media platforms offer top-notch user experiences. Long-Sheng Chen et al. [28] attempted to define the standards for judging social media site quality. They used feature selection methodologies to determine the quality of social networking sites. Metric evolution is required for the calculation of website quality.
According to the metrics provided by Naw Lay Wah et al. [29], website usability was calculated using sixteen parameters, including the number of total pages, the proportion of words in body text, the number of text links, the overall website size in bytes, etc. Support vectors were used to predict the quality of web pages.
Sastry JKR et al. [30] assert that various factors determine a product's quality, including content, structure, navigation, multimedia, and usability. To provide new viewpoints for website evaluation, website quality is discussed from several angles [31,32,33,34,35,36,37,38,39,40,41,42,43,44]. However, none of this research has addressed how to comprehensively gauge the quality of a website's content.
A hierarchical framework of several elements that allow quality assessment of websites has been developed by Vassilis S. Moustakis et al. [20]. The framework considers the features and characteristics of an organization. The model does not account for computational measurements of either factors or sub-factors.
Boon-itt, S. [45] conducted a survey to find out from users the factors that they consider to reflect the quality of health-related websites. The users opined that trust in health websites increases when the quality of the content hosted on the website is perceived to be high, considering its correctness, simplicity, etc. Allison R. et al. [46] reviewed the existing methodologies and techniques for evaluating the quality of websites and then presented a framework of factors and attributes. No computational methods were covered in this presentation.
Barnes and Vidgen [47] conducted a questionnaire survey and developed a method of computing quality from t-scores computed over the participants' responses. Based on the t-score, the questions were classified, with each class representing the website quality.
Sasi Bhanu et al. [48] presented a manually defined expert model for assessing website quality. The factors related to the completeness of a website were also measured manually. The computations were demonstrated for a single website, and no example set was used. The accuracy of prediction varied from website to website.
Rim Rekik et al. [49] have used a text mining technique to extract the features of websites and a reduction technique to filter out the most relevant features of websites. An example set is constructed, considering different sites. They have applied the apriori algorithm to the example set to find associations between the criteria and find frequent measures. The quality of the website is computed considering the applicability of the association rule given by a website.
Several soft computing models have been used for computing the quality of a website, including Fuzzy-hierarchical [50], Fuzzy Linguistic [51,52], Fuzzy C-means [53], Bayesian [54], Fuzzy-neural [55], SVM [56], and Genetic Algorithm [57] models. Most of these techniques focused on filtering out the website features and creating the example sets mined to predict the overall website quality. No generalized model is learned that can indicate the quality of a new website when one is given.
Michal Kakol et al. [58] have presented a predictive model to find the credibility of content hosted on the website based on human evaluations, considering the comprehensive set of independent factors. The classification is limited to a binary classification of the content, but scaled prediction is the need of the hour.
The expert systems referred to in the literature depend on human beings who are considered experts and on experimentation, which is a complicated process. No models exist that attempt to compute a website's features given the web application code, and no model that learns website quality from measurements of website elements has been presented.

3. Preparing the Example Set

The quality of a website is computed considering 100 websites. The parameters considered include the number of missing images, videos, PDFs, fields in the tables, columns in the forms, and self-references, and the number of missing URLs. The website quality is computed on a scale of 0.0 to 1.0. A manually defined expert model has been used for computing the quality of a website: the number of missing or mismatched objects of a specific type, ranging from 0 to 4 or more, determines the quality assigned to that sub-factor. Table 1 shows the expert model, which can be changed dynamically by the users.
A parser program reads through the code of the websites and the resource files and computes the number of missing and mismatched objects, the algorithm of which is explained in Section 4.4. Using Table 1, the quality of each missing object is computed, and the quality of the website is computed by taking an average of the quality of all the sub-factors. A sample quality calculation for a selected website is shown in Table 2.
Empirical formulations have not been used to compute the quality of the objects considered from the point of view of the "completeness" of websites. Instead, the quality of the 100 websites is computed by subjecting each website to the parser, computing the missing and mismatched counts for each of the sub-factors (images, videos, tables, forms, web pages, self-hyperlinks, and PDFs), and using the expert model to compute the quality of each sub-factor. The quality of the entire factor is computed by taking the average of the quality of all the sub-factors.
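As a concrete illustration, the example set could be assembled along the following lines. This is a minimal sketch: `parse_site()` is a hypothetical helper standing in for the parser of Section 4.4, and the grade cut-offs are inferred from Tables 1 and 2 (an average of 0.57 grades "Average") rather than stated explicitly by the authors.

```python
import csv

# Expert model from Table 1: number of missing/mismatched objects -> assigned quality.
def object_quality(missing_count):
    scale = {0: 1.0, 1: 0.8, 2: 0.6, 3: 0.4}
    return scale.get(missing_count, 0.0)   # 4 or more missing objects -> 0.0 (Poor)

def grade(q):
    # Grade bands inferred from Tables 1 and 2; the exact cut-offs are an assumption.
    if q >= 1.0:
        return "excellent"
    if q >= 0.8:
        return "very good"
    if q >= 0.6:
        return "good"
    if q >= 0.4:
        return "average"
    return "poor"

def build_example_set(site_roots, parse_site, out_csv="examples.csv"):
    """parse_site(root) is a hypothetical helper returning the seven missing/mismatched
    counts in the order: URLs, images, videos, PDFs, tables, forms, self-hrefs."""
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["urls", "images", "videos", "pdfs",
                         "tables", "forms", "hrefs", "quality"])
        for root in site_roots:
            counts = parse_site(root)
            # Quality of the "completeness" factor = average of the sub-factor qualities.
            q = sum(object_quality(c) for c in counts) / len(counts)
            writer.writerow(list(counts) + [grade(q)])
```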

4. Methods and Techniques

4.1. Analysis of Sub-Factors Relating to the “Completeness” Factor

The information must be complete for consumers to understand the meaning of the content hosted on a website. The completeness of the content hosted on a website depends on several factors. Missing URLs or web pages, self-referential "hrefs" on the same page, tabular columns in tables, data items in input–output forms, missing and mismatched images, missing and mismatched videos, and missing PDFs are the characteristics of the "completeness" attribute that are taken into account and evaluated to determine the overall quality of the websites.
Connections, usability, structure, maintainability, navigation, safety, functionality, portability, privacy, etc., are just a few of the numerous aspects of evaluating a website's quality. Each element has several related qualities or factors. The important supporting variables and features that relate to completeness are the following:
  • Q_wp ⇒ Quality of URL/web pages
  • Q_hrefs ⇒ Quality of self-referential hrefs
  • Q_Tables ⇒ Quality of tables
  • Q_Forms ⇒ Quality of forms
  • Q_Images ⇒ Quality of images
  • Q_Videos ⇒ Quality of videos
  • Q_PDFs ⇒ Quality of PDFs
  • Q_WEB ⇒ Quality of the website
  • Quality of URL/web pages
When evaluating the effectiveness of online sites, the number of missing pages is considered. If an “href” is present, but the matching web page is not in line with the directory structure indicated by the URL, the URL is deemed missing. The quantity of references missing from all web pages affects how well this sub-factor performs.
  • Quality of self-referential hrefs
"hrefs" can provide navigation within an HTML page, especially if the page size is huge. Self-hrefs are used to implement navigation on the same page. Sometimes a self-href is coded on the page but programme control is never transferred to it; conversely, control may be transferred to a sub-href that is not defined in the code. The number of missing sub-hrefs (not coded) or sub-hrefs that are coded but never referred to is used to determine the quality of this sub-factor.
  • Quality of tables
Sometimes, content is delivered by displaying tables. A repository of the data items is created, and linkages of those data items to different tables are established. The tables in the web pages must correspond to the tables in the database, and the columns coded when defining an HTML table must correspond to the table attributes represented as columns. The number of discrepancies between HTML tables and RDBMS tables reveals the HTML tables' quality.
  • Quality of forms
Sometimes, data are collected from the user using forms within the HTML pages. The forms are designed using the attributes for which the data must be collected. Forms are poorly designed when they do not have the relevant fields or when no connection exists between the fields for which data are collected and the database. The number of mismatches reveals the quality of the forms coded into the website.
  • Missing and mismatched images
The content displayed on websites is occasionally enhanced by incorporating multimedia-based items, and the quality of the images matters. Some important considerations include an image's size and resolution. Every HTML page uses a URL to link to its images and normally specifies the size and resolution. Blank images are displayed when an "href" refers to images that do not exist. Second, the attributes of the actual image must correspond to the image specifications coded in the HTML page. The number of images on each page may be counted, and each image's details, including its dimensions, resolution, and URL, can then be found. The quantity of blank, incorrectly sized, or incorrectly resized images can be used to gauge the quality of the images. A sketch of such an image check is given after this list of sub-factors.
  • Missing and mismatched videos
Video is another form of content found on websites with interactive content. Users typically have a strong desire to learn from online videos. The availability of the videos and the calibre of the video playback are the two sub-factors that impact video quality the most. An HTML or script file that mentions the videos contains the URL and player that must be used locally to play the videos.
The position, width, number of frames displayed per second, and other details are used to identify the videos. There must be a match between the videos' coded size and their actual size stored at the URL location. Another crucial factor for determining the quality of the videos is whether they are accessible at the location mentioned in the code. The properties of a video, if it exists, can be checked; the HTML pages can be examined, and the video's existence at the supplied URL can be confirmed. The quantity of missing or mismatched videos reflects the quality of the websites.
  • Missing PDFs
Consolidated, important, and related content is typically rendered using PDFs. Most websites offer the option to download PDFs. The material kept in PDFs is occasionally created by referring to the PDFs using hrefs. The quantity of missing PDFs can be used to calculate the quality of the PDFs.
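As an illustration of the image checks described above, the sketch below (an assumption, not the authors' parser) uses BeautifulSoup and Pillow to count an image as missing when the referenced file does not exist, and as mismatched when its stored dimensions differ from those coded in the HTML page.

```python
from pathlib import Path
from bs4 import BeautifulSoup   # pip install beautifulsoup4
from PIL import Image           # pip install Pillow

def check_images(html_file: Path):
    """Return (total, missing, mismatched) image counts for one HTML page."""
    soup = BeautifulSoup(html_file.read_text(errors="ignore"), "html.parser")
    total = missing = mismatched = 0
    for tag in soup.find_all("img"):
        src = tag.get("src")
        if not src or src.startswith("http"):
            continue                                  # external images skipped in this sketch
        total += 1
        img_path = (html_file.parent / src).resolve()
        if not img_path.exists():
            missing += 1                              # a blank image would be rendered
            continue
        coded_w, coded_h = tag.get("width"), tag.get("height")
        if coded_w and coded_h and coded_w.isdigit() and coded_h.isdigit():
            actual_w, actual_h = Image.open(img_path).size
            if (int(coded_w), int(coded_h)) != (actual_w, actual_h):
                mismatched += 1                       # coded size differs from stored image
    return total, missing, mismatched
```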

4.2. Total Quality Considering the “Completeness” Factor

When assessing the quality of a website, the "completeness" component and its attributes, which are generated as an average of the quality of the individual criteria, are considered. The more deficiencies a characteristic exhibits, the more that feature's quality is diminished. When the "completeness" factor is considered, the corresponding sub-factor qualities define a website's overall quality. One sub-factor's quality could be top-notch, while another could be below average; the sub-factors are thus independent. Due to the equal importance of each sub-factor, no weights are assigned. As shown in Equation (1), a simple measure of the quality of a website is the average quality of all the sub-factors taken together.
Q_WEB = (Q_wp + Q_hrefs + Q_Tables + Q_Forms + Q_Images + Q_Videos + Q_PDFs) / 7   (1)
Equation (1) expresses the quality of a website from the perspective of completeness as the average of the qualities of the different sub-factors connected with the "completeness" factor.

4.3. Computing the Counts of Sub-Factors through a Parser

A parser is developed that scans through all the web pages of a website. The website, as such, is a hierarchical structure of files stored on a disk drive; an example of such a structure is shown in Figure 1. The URL of the root of the structure marks the beginning of the website and is taken as input by the parser.

4.4. An Algorithm for Computing Object Counts (Parser Code)

The following algorithm represents a parser designed to compute the counts related to the different objects described above. The counts include the total referred objects, the total objects in existence, the total number of missing objects, and the total number of mismatched objects. This algorithm is used both to build the example set and to compute the counts for a new website whose quality is to be predicted.
  • Input: WEB site structure
  • Outputs:
  • The list of all files
  • Number of missing WEB pages
  • Number of total, existing, and missing images
  • Number of existing and missing videos
  • Number of existing and missing PDFs
  • Number of existing and missing fields in the tables
  • Number of total, existing, and missing columns in the forms
  • Number of existing and missing self-references
  • Procedure
  • Scan through the files in the structure and find the URLs of the code files.
  • For each code file:
    • Check for referred URLs and enter them into a Webpage-URL array.
    • Check for URLs of PDFs; if any exist, enter them into a PDF-URL array.
    • Check for URLs of images; if any exist, enter them into an Image-URL array.
    • Check for URLs of videos; if any exist, enter them into a Video-URL array.
    • Check for URLs of inner pages; if any exist, enter them into an Inner-Pages-URL array.
    • Check for the existence of tables and enter the description of each table as a string into an array.
    • Check for the existence of forms and enter the description of each form as a string into an array.
  • For each URL in the Webpage-URL array:
    • Add to total WEB pages.
    • Check for the existence of the WEB page; if available, add to Existing WEB pages, else add to Missing WEB pages.
  • For each entry in the PDF array:
    • Add to Total-PDFs.
    • Check for the existence of the PDF file using the URL.
      • If available, add to Existing-PDFs, else add to Missing-PDFs.
  • For each entry in the Image array:
    • Add to Total-Images.
    • Check for the existence of the image file using the URL.
      • If available, add to Existing-Images, else add to Missing-Images.
  • For each entry in the Video array:
    • Add to Total-Videos.
    • Check for the existence of the video file using the URL.
      • If available, add to Existing-Videos, else add to Missing-Videos.
  • For each entry in the Inner-URL array:
    • Add to Total-Inner-URLs.
    • Check for the existence of the inner URL.
      • If available, add to Existing-Inner-URLs, else add to Missing-Inner-URLs.
  • For each entry in the Table-Desc array:
    • Fetch the column names in the entry.
    • Fetch the database tables having those column names as table fields.
    • If no table is found, add to Missing Tables; otherwise, the table is found.
    • If the table is found and the column and field names are the same, add to Matching Tables, else add to Mismatched Tables.
  • For each entry in the Field-Desc array:
    • Fetch the field names in the entry.
    • Fetch the database tables having those field names as column names.
    • If no table is found, add to Missing Forms; otherwise, the form-related table is found.
    • If the field names and column names of the table match, add to Matching Forms, else add to Mismatched Forms.
  • Write all seven counts into a CSV file, with each website as a separate example.
  • Write all names of the program files to a CSV file.
  • Write all image URLs with their associated properties to a CSV file.
  • Write all video URLs with their associated properties to a CSV file.
  • Write all names of the PDF files into a CSV file.
  • Write all names of the tables into a CSV file.
  • Write all names of the forms into a CSV file.
  • Write the inner hrefs into a CSV file.
  • Write the referred URLs into a CSV file.
The WEB sites are input to the parser, which computes the counts of different objects explained above.
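The following Python sketch shows how such a parser might be realised. The regular expressions, file-extension sets, and the `parse_site()` name are illustrative assumptions, and the table/form checks against the database described above are omitted for brevity.

```python
import re
from pathlib import Path

HREF_RE = re.compile(r'href="([^"#]+)"', re.IGNORECASE)   # page and PDF hrefs
SELF_RE = re.compile(r'href="#([^"]+)"', re.IGNORECASE)   # self-referential hrefs
SRC_RE  = re.compile(r'src="([^"]+)"', re.IGNORECASE)     # images and videos
ID_RE   = re.compile(r'id="([^"]+)"', re.IGNORECASE)      # anchor targets

def parse_site(root):
    """Scan a website's directory tree and count missing referenced objects."""
    root = Path(root)
    pages = [p for p in root.rglob("*") if p.suffix.lower() in {".html", ".htm"}]
    missing = {"urls": 0, "images": 0, "videos": 0, "pdfs": 0, "hrefs": 0}
    for page in pages:
        text = page.read_text(errors="ignore")
        anchors = set(ID_RE.findall(text))
        for ref in HREF_RE.findall(text):
            if ref.startswith(("http", "mailto:")):
                continue                              # only site-local references are checked
            kind = "pdfs" if ref.lower().endswith(".pdf") else "urls"
            if not (page.parent / ref).resolve().exists():
                missing[kind] += 1
        for ref in SRC_RE.findall(text):
            ext = Path(ref).suffix.lower()
            kind = ("images" if ext in {".png", ".jpg", ".jpeg", ".gif"}
                    else "videos" if ext in {".mp4", ".avi", ".webm"} else None)
            if kind and not ref.startswith("http") \
                    and not (page.parent / ref).resolve().exists():
                missing[kind] += 1
        for frag in SELF_RE.findall(text):
            if frag not in anchors:                   # self-href coded but its target is not defined
                missing["hrefs"] += 1
    return missing
```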

4.5. Designing the Model Parameters

A multi-layer perceptron neural network has been designed, which represents an expert model. The model is used to predict the quality of a website given the counts computed through the parser. Table 3 details the design parameters used for building the NN. For building the MLP model, two dense layers have been used, which take the seven inputs and produce five outputs. Cross-entropy has been used to compute the loss, and the Adam optimizer is used to backpropagate the errors and adjust the weights.
The inputs fed into the neural network are the missing and mismatched counts for each type of object, namely missing URLs, missing inner hrefs, missing and mismatched images, missing and mismatched videos, missing and mismatched tables, and missing and mismatched forms. The neural network produces five outputs, which correspond to the quality level of the website, classified as "Excellent", "Very Good", "Good", "Average", or "Poor".
The physical structure of the neural network is shown in Figure 2. There are three layers in the network, which include an input layer containing seven input neurons, a hidden layer containing seven hidden neurons, and an output layer containing five output neurons.
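A sketch of how this network could be declared with Keras, following the parameters in Table 3 (layer sizes, activations, a normal kernel initializer, cross-entropy loss, and the Adam optimizer); this is an approximation under those stated settings, since the authors' exact script is not given.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_mlp():
    model = keras.Sequential([
        layers.Input(shape=(7,)),                              # 7 sub-factor counts
        layers.Dense(7, activation="relu",
                     kernel_initializer="random_normal"),      # hidden layer, 7 neurons
        layers.Dense(5, activation="sigmoid",
                     kernel_initializer="random_normal"),      # 5 quality grades
    ])
    model.compile(loss="categorical_crossentropy",             # cross-entropy loss
                  optimizer="adam",                            # Adam optimizer
                  metrics=["accuracy"])
    return model
```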

4.6. Platform for Implementing the Model

Keras and TensorFlow have been used in a notebook interface running on the Microsoft Windows 11 operating system. The software runs on a DELL laptop built on a 12th-generation processor with eight cores and two Nvidia GPUs.

4.7. Inputting the Data into the Model

The example set is broken into folds of 10 examples each, and the records are shuffled in each epoch. Each fold is fed in batches of five for updating the weights and the biases. This is performed to reduce the possibility of either overfitting or underfitting.
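Under these settings, feeding the data could look as follows; this is a minimal sketch assuming `X` holds the 100 × 7 array of counts and `y` the one-hot encoded quality grades produced from the example set.

```python
model = build_mlp()                       # the sketch from Section 4.5
history = model.fit(
    X, y,
    epochs=300,                           # optimum reported in Section 5.3
    batch_size=5,                         # weights and biases updated every five examples
    shuffle=True,                         # records reshuffled in each epoch
    verbose=0,
)
```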

5. Results and Discussion

5.1. Sub-Factor Counts for Example Websites

Table 4 shows the sub-factor counts (missing and mismatched counts) obtained for 10 websites using the parser explained in the section above. The quality of each website is computed by subjecting these counts to the expert-defined model (Table 1), as illustrated in Table 2.

5.2. Training and Testing the MLP Models

Out of the 100 example sites, the missing and mismatched data related to 80 websites have been used for training the models, and the remaining have been used for testing the models' accuracy. To avoid overfitting, the data are split into 10 folds after reshuffling the records in the set.
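A sketch of the 80/20 split with random selection of the training examples; `X` and `y` are assumed to hold the parser counts and one-hot grades loaded from the CSV example set.

```python
from sklearn.model_selection import train_test_split

# 80 websites for training, 20 held out for testing, selected at random.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42)

model = build_mlp()
model.fit(X_train, y_train, epochs=300, batch_size=5, shuffle=True, verbose=0)
loss, acc = model.evaluate(X_test, y_test, verbose=0)
print(f"Test accuracy: {acc:.3f}")
```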

5.3. Weight Computations for the NN Model

The weights learnt for achieving optimum accuracy are shown in Table 5, with the biases fixed to zero for each output. The number of epochs has been varied from 100 to 1000 and the batch size from 2 to 10, and it has been found that optimum results are obtained with 300 epochs and a batch size of 5.
In Table 5, the weight values refer to the values assigned to the weights after completion of 300 epochs. In the weight code, the first digit refers to the layer (1 for input to hidden, 2 for hidden to output), the second digit refers to the neuron the connection comes from, and the third digit refers to the neuron the connection goes to. The codes thus identify a specific connection in the neural network.
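The exploration of epochs and batch sizes described above can be written as a small grid search; the grid bounds follow the text, while the rest is an illustrative sketch reusing the `build_mlp()` helper and the split from Section 5.2.

```python
best_acc, best_setting = 0.0, None
for epochs in range(100, 1001, 50):          # 100 to 1000 epochs, in steps of 50
    for batch_size in range(2, 11):          # batch sizes 2 to 10
        model = build_mlp()
        model.fit(X_train, y_train, epochs=epochs,
                  batch_size=batch_size, shuffle=True, verbose=0)
        _, acc = model.evaluate(X_test, y_test, verbose=0)
        if acc > best_acc:
            best_acc, best_setting = acc, (epochs, batch_size)
print(f"Best accuracy {best_acc:.3f} at epochs={best_setting[0]}, batch_size={best_setting[1]}")
```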

5.4. Predicting the Quality of the Websites Using the Learnt MLP Model

The example websites shown in Table 4 have been tested, and the quality is predicted. The results are shown in Table 6. The accuracy is 100% considering the example websites tested against the model.
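Predicting the grade of a new website then amounts to running the parser on its code and passing the seven counts through the trained model; the output-label ordering below is an assumption.

```python
import numpy as np

GRADES = ["excellent", "very good", "good", "average", "poor"]   # assumed output order

def predict_quality(model, counts):
    """counts: the seven missing/mismatched counts produced by the parser."""
    probs = model.predict(np.array([counts]), verbose=0)[0]
    return GRADES[int(np.argmax(probs))]

# Example: website ID 10 from Table 4 (all counts equal to 4) should be graded "poor".
print(predict_quality(model, [4, 4, 4, 4, 4, 4, 4]))
```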

5.5. Accuracy Comparisons

A total of 80% of the examples are used for training, and the remaining 20% are used for testing and estimating the accuracy of the NN model. The examples are selected randomly for training, and those not selected are used for testing. The average accuracy of the MLP model is 98.2%, with a standard deviation of 0.2%. Jayanthi et al. [59] have provided a comparison of the accuracy of their Extreme Learning Machine (ELM) method with the accuracy of SVM [60] and the Ivory method [61]. Table 7 compares all these methods.
The model presented can be used by individual organizations that host their own websites. Based on the assessment results, improvements to the websites can be made, and the effect on users can be felt by observing changes in the organization's business profile. Researchers can also assess the quality of websites considering the other 41 factors using the method presented in this paper.

5.6. Discussion

A deep learning model can compute the quality of a website using two dense layers on top of the input layer: a hidden layer and an output layer. The model takes seven inputs (the numbers of missing URLs, images, videos, PDFs, fields in the tables, columns in the forms, and self-references) and produces five outputs (excellent, very good, good, average, and poor). It has been observed that the accuracy of the model is optimum when 300 epochs with a batch size of 5 are used.
The example data are prepared in .csv format by considering 100 different websites and running the parser. The counts related to the “completeness” quality factor have been determined, and the same is fed as input to the multilayer perceptron model. The data are split into ten slots, each containing ten examples, and the examples in each are shuffled so that linearity, if any exists, is removed. The model is trained by iterating the number of epochs commencing from 100 and effecting an increment of 50 for each iteration. An optimum accuracy of 98.2% is achieved.
Training on websites is performed using 80 websites. The remaining 20 websites have been used for testing and finding the model’s accuracy.

6. Conclusions

A poor-quality website is never used; even if it is used, it will lead to adverse information dissemination. A hierarchy of factors, sub-factors, and sub-sub-factors can characterize a website. Knowledge based on a multi-stage expert system is required to compute the overall website quality, and the deficiencies of a website can be known through visualization of the quality of the different factors. Using an MLP-based NN model, an accuracy of 98.2% is achieved in predicting the quality of a website.
An example set is required to develop an NN model that helps predict the quality of any website. As such, no example set is in existence in this area. There is a need to develop an example set before any model can be learnt. Parsers can be used for computing the counts of sub-factors, and the quality of the website can be computed using the manually developed expert system or through the development of models based on surveys conducted to seek the opinions of users.
The learning model is highly dependent on the quality of the example set and on the variability of the data. Data shuffling and batching are required when selecting the examples provided as input for the learning model.
Individual organizations can test the quality of their websites using the model presented in this article. An organization can look at the missing or mismatched content and correct its website so that, as judged by "completeness", the website is fully complete.
The accuracy of the model gradually improves as the number of epochs increases. A maximum accuracy of 98.2% could be achieved using the MLP model presented in this paper when the number of epochs used is 300.

7. Future Scope

In the literature, nearly 42 factors have been reported that reflect the quality of a website. Out of the 42 factors, only the few that are relevant to a specific website need to be selected; it is not meaningful to consider all 42 factors for computing the quality of every website. In this study, only the completeness factor is considered, as it is a general factor that applies across all websites. A similar approach can be implemented with respect to other factors, such as the "structure of the website", "website navigation design", "look and feel", etc.

Author Contributions

Methodology, S.B.J.; Software, V.P.B.; Validation, B.K.K.D.; Formal analysis, B.C.; Investigation, S.K.R.J.; Resources, R.R.B.; Data curation, R.R.B.; Writing—original draft, S.K.R.J.; Writing—review and editing, S.K.R.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Nielsen, J.; Nielsen Norman Group. 10 Usability Heuristics for User Interface Design. 2020. Available online: https://www.nngroup.com/aiiicles/ux-research-cheat-sheet/ (accessed on 3 May 2021).
  2. Tognazzi, B. First Principles of Interaction Design (Revised and Expanded). 2014. Available online: https://asktog.com/atc/principles-of-interactiondesign/ (accessed on 5 September 2023).
  3. Shneiderman, B. The Eight Golden Rules of Interface Design; Department of Computer Science, University of Maryland: College Park, MD, USA, 2016. [Google Scholar]
  4. Law, R.; Qi, S.; Buhalis, D. Progress in tourism management: A review of website evaluation in tourism research. Tour. Manag. 2010, 31, 297–313. [Google Scholar] [CrossRef]
  5. Shneiderman, B.; Plaisant, C.; Cohen, M.S.; Jacobs, S.; Elmqvist, N.; Diakopoulos, N. Designing the User Interface: Strategies for Effective Human-Computer Interaction, 6th ed.; Pearson Higher Education: Essex, UK, 2016. [Google Scholar]
  6. Morales-Vargas, A.; Pedraza-Jimenez, R.; Codina, L. Website quality: An analysis of scientific production. Prof. Inf. 2020, 29, e290508. [Google Scholar] [CrossRef]
  7. Law, R. Evaluation of hotel websites: Progress and future developments. Int. J. Hosp. Manag. 2019, 76, 2–9. [Google Scholar] [CrossRef]
  8. Ecer, F. A hybrid banking websites quality evaluation model using AHP and COPRAS-G: A Turkey case. Technol. Econ. Dev. Econ. 2014, 20, 758–782. [Google Scholar] [CrossRef]
  9. Leung, D.; Law, R.; Lee, H.A. A modified model for hotel website functionality evaluation. J. Travel Tour. Mark. 2016, 33, 1268–1285. [Google Scholar] [CrossRef]
  10. Maia, C.L.B.; Furtado, E.S. A systematic review about user experience evaluation. In Design, User Experience, and Usability: Design Thinking and Methods; Marcus, A., Ed.; Springer International Publishing: Cham, Switzerland, 2016; pp. 445–455. [Google Scholar]
  11. Sanabre, C.; Pedraza-Jimenez, R.; Vinyals-Mirabent, S. Double-entry analysis system (DEAS) for comprehensive quality evaluation of websites: Case study in the tourism sector. Prof. Inf. 2020, 29, e290432. [Google Scholar] [CrossRef]
  12. Bevan, N.; Carter, J.; Harker, S. ISO 9241-11 Revised: What have we learnt about usability since 1998? In Human-Computer Interaction: Design and Evaluation; Kurosu, M., Ed.; Springer International Publishing: Cham, Switzerland, 2015; pp. 143–151. [Google Scholar]
  13. Rosala, M.; Krause, R. User Experience Careers: What a Career in UX Looks Like Today; Nielsen Norman: Fremont, CA, USA, 2020. [Google Scholar]
  14. Jainari, M.H.; Baharum, A.; Deris, F.D.; Mat Noor, N.A.; Ismail, R.; Mat Zain, N.H. A standard content for university websites using heuristic evaluation. In Intelligent Computing, Proceedings of the 2022 Computing Conference, London, UK, 14–15 July 2022; Arai, K., Ed.; Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2022; Volume 506. [Google Scholar] [CrossRef]
  15. Nikolic, N.; Grljevic, O.; Kovacevic, A. Aspect-based sentiment analysis of reviews in the domain of higher education. Electron. Libr. 2020, 38, 44–64. [Google Scholar] [CrossRef]
  16. Morales-Vargas, A.; Pedraza-Jimenez, R.; Codina, L. Website quality evaluation: A model for developing comprehensive assessment instruments based on key quality factors. J. Doc. 2023, 79, 95–114. [Google Scholar] [CrossRef]
  17. Khawaja, K.F.; Bokhari, R.H. Exploring the Factors Associated with Website Quality. Glob. J. Comput. Sci. Technol. 2010, 10, 37–45. [Google Scholar]
  18. Sastry, J.K.R.; Lalitha, T.S. A framework for assessing the quality of a WEB SITE, PONTE. Int. J. Sci. Res. 2017, 73. [Google Scholar]
  19. Mantri, V.K. An Introspection of Web Portals Quality Evaluation. Int. J. Adv. Inf. Sci. Technol. 2016, 5, 33–38. [Google Scholar]
  20. Moustakis, V.S.; Litos, C.; Dalivigas, A.; Tsironis, L. Website quality assessment criteria. In Proceedings of the Ninth International Conference on Information Quality (ICIQ-04), Cambridge, MA, USA, 5–7 November 2004; pp. 59–73. [Google Scholar]
  21. Granić, A.; Mitrović, I. Usability Evaluation of Web Portals. In Proceedings of the ITI 2008, 30th International Conference on Information Technology Interfaces, Dubrovnik, Croatia, 23–26 June 2008. [Google Scholar]
  22. Singh, T.; Malik, S.; Sarkar, D. E-Commerce Website Quality Assessment based on Usability. In Proceedings of the 2016 International Conference on Computing, Communication and Automation (ICCCA), Greater Noida, India, 29–30 April 2016; pp. 101–105. [Google Scholar]
  23. Anusha, R. A Study on Website Quality Models. Int. J. Sci. Res. Publ. 2014, 4, 1–5. [Google Scholar]
  24. Ricca, F.; Tonella, P. Analysis and Testing of Web Applications. In Proceedings of the 23rd International Conference on Software Engineering, ICSE 2001, Toronto, ON, Canada, 12–19 May 2001. [Google Scholar]
  25. Alwahaishi, S.; Snášel, V. Assessing the LCC Websites Quality. In Networked Digital Technologies, Proceedings of the Second International Conference, NDT 2010, Prague, Czech Republic, 7–9 July 2010; Zavoral, F., Yaghob, J., Pichappan, P., El-Qawasmeh, E., Eds.; Communications in Computer and Information Science; Springer: Berlin/Heidelberg, Germany, 2010; Volume 87. [Google Scholar] [CrossRef]
  26. Hasan, L.; Abuelrub, E. Assessing the Quality of Web Sites. Appl. Comput. Inform. 2011, 9, 11–29. [Google Scholar] [CrossRef]
  27. Singh, K.K.; Kumar, P.; Mathur, J. Implementation of a Model for Websites Quality Evaluation—DU Website. Int. J. Innov. Adv. Comput. Sci. 2014, 3, 27–37. [Google Scholar]
  28. Chen, L.S.; Chung, P. Identifying Crucial Website Quality Factors of Virtual Communities. In Proceedings of the International Multi-Conference of Engineers and Computer Scientists, IMECS, Hong Kong, China, 17–19 March 2010; Volume 1. [Google Scholar]
  29. Wah, N.L. An Improved Approach for Web Page Quality Assessment. In Proceedings of the 2011 IEEE Student Conference on Research and Development, Cyberjaya, Malaysia, 19–20 December 2011. [Google Scholar]
  30. Venkata Raghavarao, Y.; Sasidhar, K.; Sastry, J.K.R.; Chandra Prakash, V. Quantifying quality of WEB sites based on content. Int. J. Eng. Technol. 2018, 7, 138–141. [Google Scholar]
  31. Sastry, J.K.R.; Sreenidhi, N.; Sasidhar, K. Quantifying quality of websites based on usability. Int. J. Eng. Technol. 2018, 7, 320–322. [Google Scholar] [CrossRef]
  32. Sai Virajitha, V.; Sastry, J.K.R.; Chandra Prakash, V.; Srija, P.; Varun, M. Structure-based assessment of the quality of WEB sites. Int. J. Eng. Technol. 2018, 7, 980–983. [Google Scholar] [CrossRef]
  33. Sastry, J.K.R.; Prakash, V.C.; Sahana, G.; Manasa, S.T. Evaluating quality of navigation designed for a WEB site. Int. J. Eng. Technol. 2018, 7, 1004–1007. [Google Scholar]
  34. Kolla, N.P.; Sastry, J.K.R.; Chandra Prakash, V.; Onteru, S.K.; Pinninti, Y.S. Assessing the quality of WEB sites based on Multimedia content. Int. J. Eng. Technol. 2018, 7, 1040–1044. [Google Scholar]
  35. Babu, J.S.; Kumar, T.R.; Bano, S. Optimizing webpage relevancy using page ranking and content-based ranking. Int. J. Eng. Technol. (UAE) 2018, 7, 1025–1029. [Google Scholar] [CrossRef]
  36. Prasad, K.S.; Sekhar, K.R.; Rajarajeswari, P. An integrated approach towards vulnerability assessment & penetration testing for a web application. Int. J. Eng. Technol. (UAE) 2018, 7, 431–435. [Google Scholar]
  37. Krishna, M.V.; Kumar, K.K.; Sandiliya, C.H.; Krishna, K.V. A framework for assessing the quality of a website. Int. J. Eng. Technol. (UAE) 2018, 7, 82–85. [Google Scholar] [CrossRef]
  38. Babu, R.B.; Akhil Reddy, A.; Praveen Kumar, G. Analysis on visual design principles of a webpage. Int. J. Eng. Technol. (UAE) 2018, 7, 48–50. [Google Scholar] [CrossRef]
  39. Pawar, S.S.; Prasanth, Y. Multi-Objective Optimization Model for QoS-Enabled Web Service Selection in Service-Based Systems. New Rev. Inf. Netw. 2017, 22, 34–53. [Google Scholar] [CrossRef]
  40. Bhavani, B.; Sucharita, V.; Satyanarana, K.V.V. Review on techniques and applications involved in web usage mining. Int. J. Appl. Eng. Res. 2017, 12, 15994–15998. [Google Scholar]
  41. Durga, K.K.; Krishna, V.R. Automatic detection of illegitimate websites with mutual clustering. Int. J. Electr. Comput. Eng. 2016, 6, 995–1001. [Google Scholar]
  42. Satya, T.Y.; G, P. Harvesting deep web extractions based on hybrid classification procedures. Asian J. Inf. Technol. 2016, 15, 3551–3555. [Google Scholar]
  43. Jammalamadaka, S.B.; Babu, V.; Trimurthy, A. Implementing dynamically evolvable communication with embedded systems through WEB services. Int. J. Electr. Comput. Eng. 2016, 6, 381–398. [Google Scholar]
  44. Prasanna, L.; Babu, B.S.; Pratyusha, A.; Anusha, J.L.; Chand, A.R. Profile-based personalized web search using Greedy Algorithms. ARPN J. Eng. Appl. Sci. 2016, 11, 5921–5925. [Google Scholar]
  45. Boon-itt, S. Quality of health websites and their influence on perceived usefulness, trust and intention to use: An analysis from Thailand. J. Innov. Entrep. 2019, 8, 4. [Google Scholar] [CrossRef]
  46. Allison, R.; Hayes, C.; McNulty, C.A.; Young, V. A Comprehensive Framework to Evaluate Websites: Literature Review and Development of GoodWeb. JMIR Form. Res. 2019, 3, e14372. [Google Scholar] [CrossRef] [PubMed]
  47. Barnes, S.; Vidgen, R. WebQual: An Exploration of Website Quality. In Proceedings of the European Conference of Information Systems, Vienna, Austria, 3–5 July 2000; pp. 298–305. [Google Scholar]
  48. Bhanu, J.; Kamesh, D.B.K.; Sastry, J.K.R. Assessing Completeness of a WEB site from Quality Perspective. Int. J. Electr. Comput. Eng. (IJECE) 2019, 9, 5596–5603. [Google Scholar] [CrossRef]
  49. Rekik, R.; Kallel, I.; Casillas, J.; Alimi, A.M. Assessing web sites quality: A systematic literature review by text and association rules mining. Int. J. Inf. Manag. 2018, 38, 201–216. [Google Scholar] [CrossRef]
  50. Lin, H.-F. An application of fuzzy AHP for evaluating course website quality. Comput. Educ. 2010, 54, 877–888. [Google Scholar] [CrossRef]
  51. Heradio, R.; Cabrerizo, F.J.; Fernández-Amorós, D.; Herrera, M.; Herrera-Viedma, E. A fuzzy linguistic model to evaluate the Quality of Library. Int. J. Inf. Manag. 2013, 33, 642–654. [Google Scholar] [CrossRef]
  52. Esteban, B.; Tejeda-Lorente, Á.; Porcel, C.; Moral-Muñoz, J.A.; Herrera-Viedma, E. Aiding in the treatment of low back pain by a fuzzy linguistic Web system. In Rough Sets and Current Trends in Computing, Proceedings of the 9th International Conference, RSCTC 2014, Granada and Madrid, Spain, 9–13 July 2014; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2014. [Google Scholar]
  53. Cobos, C.; Mendoza, M.; Manic, M.; León, E.; Herrera-Viedma, E. Clustering of web search results based on an iterative fuzzy C-means algorithm and Bayesian information criterion. In Proceedings of the 2013 Joint IFSA World Congress and NAFIPS Annual Meeting (IFSA/NAFIPS), Edmonton, AB, Canada, 24–28 June 2013; pp. 507–512. [Google Scholar]
  54. Dhiman, P.; Anjali. Empirical validation of website quality using statistical and machine learning methods. In Proceedings of the 5th International Conference on Confluence 2014: The Next Generation Information Technology Summit, Noida, India, 25–26 September 2014; pp. 286–291. [Google Scholar]
  55. Liu, H.; Krasnoproshin, V.V. Quality evaluation of E-commerce sites based on adaptive neural fuzzy inference system. In Neural Networks and Artificial Intelligence, Proceedings of the 8th International Conference, ICNNAI 2014, Brest, Belarus, 3–6 June 2014; Communications in Computer and Information Science; Springer: Berlin/Heidelberg, Germany, 2014; pp. 87–97. [Google Scholar]
  56. Vosecky, J.; Leung, K.W.-T.; Ng, W. Searching for quality microblog posts: Filtering and ranking based on content analysis and implicit links. In Database Systems for Advanced Applications, Proceedings of the 17th International Conference, DASFAA 2012, Busan, Republic of Korea, 15–18 April 2012; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2012; pp. 397–413. [Google Scholar]
  57. Hu, Y.-C. Fuzzy multiple-criteria decision-making in the determination of critical criteria for assessing service quality of travel websites. Expert Syst. Appl. 2009, 36, 6439–6445. [Google Scholar] [CrossRef]
  58. Kakol, M.; Nielek, R.; Wierzbicki, A. Understanding and predicting Web content credibility using the Content Credibility Corpus. Inf. Process. Manag. 2017, 53, 1043–1061. [Google Scholar] [CrossRef]
  59. Jayanthi, B.; Krishnakumari, P. An intelligent method to assess webpage quality using extreme learning machine. Int. J. Comput. Sci. Netw. Secur. 2016, 16, 81–85. [Google Scholar]
  60. Huang, G.-B.; Ding, X.; Zhou, H. Optimization method based extreme learning machine for classification. Neurocomputing 2010, 74, 155–163. [Google Scholar] [CrossRef]
  61. Huang, G.B.; Zhou, H.; Ding, X.; Zhang, R. Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2012, 42, 513–529. [Google Scholar] [CrossRef]
Figure 1. An example structure of a website.
Figure 2. Physical network.
Table 1. Expert model for assessing the website quality based on its completeness.

Number of Missing Objects | Assigned Quality | Quality Grading
0 | 1.0 | Excellent
1 | 0.8 | Very Good
2 | 0.6 | Good
3 | 0.4 | Average
≥4 | 0.0 | Poor
Table 2. Sample quality computation based on the counts computed through the parser (quality value per missing count: 0 → 1.0, 1 → 0.8, 2 → 0.6, 3 → 0.4, ≥4 → 0).

Sub-Factor | Missing Count | Quality Value Assigned
Missing images | 4 | 0.00
Missing videos | 3 | 0.60
Missing PDFs | 9 | 0.00
Missing columns in tables | 1 | 0.80
Missing fields in the forms | 1 | 0.80
Missing self-references | 1 | 0.80
Missing URLs | 0 | 1.00
Total quality value assigned | | 4.00
Average quality value | | 0.57
Quality grade as per the grading table above | | Average
Table 3. Multi-layer perceptron model parameters.

Type of Layer | Number of Inputs | Number of Outputs | Activation Function | Kernel Initializer
Input Layer | 7 | 7 | RELU | Normal
Output Layer | 7 | 5 | SIGMOID | Normal

Loss Function | Optimizer | Metrics
Cross Entropy | Adam | Accuracy
Table 4. Counts and grading for example websites.

ID | # Missing URLs | # Missing Images | # Missing Videos | # Missing PDFs | # Missing Tables | # Missing Forms | # Missing Internal Hrefs | Quality of the Website
1 | 3 | 4 | 3 | 9 | 1 | 1 | 1 | average
2 | 0 | 1 | 2 | 0 | 1 | 0 | 0 | very good
3 | 1 | 2 | 4 | 1 | 2 | 1 | 1 | good
4 | 0 | 2 | 2 | 1 | 2 | 1 | 2 | very good
5 | 1 | 2 | 3 | 1 | 2 | 1 | 3 | good
6 | 3 | 1 | 2 | 1 | 2 | 1 | 4 | average
7 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | very good
8 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | good
9 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | average
10 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | poor
Table 5. Learned weights for optimum accuracy.

Weight Code | Weight Value | Weight Code | Weight Value
W111 | 0.0005345 | W121 | −0.03049852
W112 | −0.02260396 | W122 | −0.05772249
W113 | 0.10015791 | W123 | 0.0124933
W114 | −0.00957603 | W124 | 0.05205087
W115 | 0.0110722 | W125 | −0.02575279
W116 | −0.07497691 | W126 | 0.06270903
W131 | 0.03905119 | W141 | −0.02733616
W132 | 0.04710016 | W142 | 0.02808586
W133 | −0.01612358 | W143 | −0.03189966
W134 | −0.00248795 | W144 | 0.07678819
W135 | −0.06121466 | W145 | −0.05594458
W136 | −0.0188451 | W146 | −0.04489214
W151 | 0.000643 | W161 | 0.02465009
W152 | 0.0143626 | W162 | 0.02291734
W153 | −0.00590346 | W163 | 0.06510213
W154 | −0.05017151 | W164 | 0.0216038
W155 | 0.00431764 | W165 | 0.02364654
W156 | −0.04996827 | W166 | 0.04817796
W211 | 0.00150367 | W221 | 0.05486471
W212 | −0.02436988 | W222 | −0.0747726
W213 | −0.04478416 | W223 | −0.03751294
W214 | −0.0215895 | W224 | −0.00753696
W215 | −0.01126576 | W225 | −0.16550754
W231 | −0.02882431 | W241 | −0.0697467
W232 | 0.09704491 | W242 | −0.00334867
W233 | 0.00701219 | W243 | 0.00892285
W234 | 0.05021231 | W244 | 0.08749642
W235 | −0.12358224 | W245 | 0.08793346
W251 | −0.0181542 | W261 | −0.06405859
W252 | −0.09880255 | W262 | −0.07070417
W253 | −0.00041602 | W263 | 0.01609092
W254 | 0.02695975 | W264 | 0.00031056
W255 | −0.03195139 | W265 | −0.10547637
Table 6. Counts, grading, and quality prediction of example websites.

ID | # Missing URLs | # Missing Images | # Missing Videos | # Missing PDFs | # Missing Tables | # Missing Forms | # Missing Internal Hrefs | Quality of the Website | Predicted Quality through the MLP Model
1 | 3 | 4 | 3 | 9 | 1 | 1 | 1 | average | average
2 | 0 | 1 | 2 | 0 | 1 | 0 | 0 | very good | very good
3 | 1 | 2 | 4 | 1 | 2 | 1 | 1 | good | good
4 | 0 | 2 | 2 | 1 | 2 | 1 | 2 | very good | very good
5 | 1 | 2 | 3 | 1 | 2 | 1 | 3 | good | good
6 | 3 | 1 | 2 | 1 | 2 | 1 | 4 | average | average
7 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | very good | very good
8 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | good | good
9 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | average | average
10 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | poor | poor
Table 7. Comparison of accuracy of methods used to compute the quality of the websites.

Method | Reference | Average Accuracy | Standard Deviation
SVM | [60] | 78.00% | 0.3%
Ivory | [61] | 75.00% | 0.7%
ELM | [59] | 89.00% | 1.2%
MLP | This paper | 98.20% | 0.2%
