Application of FCM Clustering Algorithm in Digital Library Management System

Traditional library systems are gradually being replaced by digital libraries, which are developing from simple database-based storage and retrieval toward knowledge-based services. The fuzzy C-means (FCM) clustering algorithm is a data collection and processing technique that evaluates data and draws conclusions using mathematics, big data, and related technologies. To improve the digital library management system, this paper applies the FCM clustering algorithm to it. Based on an in-depth study of the FCM clustering algorithm, this paper builds a digital library management system in which the algorithm is used to cluster library borrowing records and reader information. The results provide technical support and suggestions for library collection construction and book purchasing, and help book management form a virtuous cycle. The experimental results from the evaluation phase show that the overall error rate of the proposed FCM clustering algorithm for information clustering is 3.66%, which is better than that of the existing comparative models. This shows that applying the FCM clustering algorithm to a digital library management system has practical significance.


Introduction
Libraries have a long history as cultural centers that provide information and knowledge. As society changes, libraries are also undergoing digital reform. As part of the digitization of library resources, digital library management systems are essential for systematically storing resources and ultimately providing services to online readers. In order to meet the information needs of teachers and students quickly, effectively, and completely, this paper uses the FCM clustering algorithm to study the library management system.
The purpose of digital libraries is to provide users with the information they need. At present, there are many studies on digital libraries. Muhammad examined the download behavior of readers of two well-known journals in education and found that downloads could be predicted more accurately by using cumulative download counts over a few months [1]. Mandalia H described experience in developing digital libraries in local languages and proposed new methods for designing relevant local content so as to improve the availability of localized material [2]. Gooding P outlined the design and implementation of digital library systems. The purpose was to define the stages and steps required to design a digital library system so that developers could use them as a guide for designing digital libraries in the most efficient way [3]. Das surveyed 1220 research papers on digital libraries published by BRIC countries between 2000 and 2019 and found that the volume of research on digital libraries in BRIC countries was increasing [4]. Aditya A provided an in-depth analysis of the quality of service of the digital library of the DKI Jakarta Provincial Library and Archives Office. The survey results showed that the service quality of this digital library was not considered entirely good, and there was still room for improvement [5]. However, current research on digital library management systems still needs to be improved, and new algorithms need to be introduced to study them.
The FCM clustering algorithm has received increasing attention because of its ability to analyze big data. Yang J developed a clear algorithm for C-means classification based on histograms (AFCMH). The experimental results showed that the computational time of this method was significantly shorter than that of the traditional segmentation algorithm based on mean shift [6]. Xiao M recommended an improved C algorithm that was not aligned with the spatial algorithm to solve the problem of C-means fusion [7]. Kiki M presented a parallel FCM cluster integration algorithm based on MapReduce. The experimental results showed that the parallel FCM clustering integration algorithm had good performance, fast speed, and scalability [8]. Wang H Y proposed a new validity function for the FCM clustering algorithm to verify the validity of clustering results. The results showed that the proposed validity function could obtain the optimal number of clusters for a dataset more accurately and had strong adaptability [9]. Huang D proposed a parallel fuzzy clustering method based on cut sets to solve the complexity and inefficiency of fuzzy number classification. The theoretical analysis and application showed that this method had good classification accuracy and efficiency for fuzzy data clustering [10]. In order to provide better services to readers, this paper uses the FCM clustering algorithm to improve the digital library management system. This paper first discusses the database creation process and the business process of the data management system of the book management system based on the FCM clustering algorithm. It then uses the FCM clustering algorithm to retrieve useful data from the system, which provides support and evaluation for data integration, production, and analysis. This study plays a positive role in understanding the information needs of readers in a timely manner, allocating library resources intelligently, and improving the efficiency of resource utilization, book quality, book management, and so on.

Methods of Digital Library Management Systems
Establishing a new library management system requires a comprehensive, detailed, and thorough study of the current system. This was done through direct interviews, ad hoc visits, and participation in business practices, working with staff through the most basic work processes. Information management in the library management and processing center was originally based mainly on manual processing, at most printing some lists with the computer's text system. With a high volume of information to process, this approach was prone to errors and lacked systematic and standardized means of information management. The library management system to be established implements computerized, unified management of the library's book management, reader management, book lending management, and other daily management work in order to improve efficiency and management levels [11]. This system analyzes requirements from several aspects, such as library collection construction, shelf management, collection utilization rate, and personalized construction, and the management system is designed accordingly.

Needs of Library Management Systems
The purpose of the library management system is to make the information management of the library efficient, large-scale, and automated in order to improve work efficiency. Library management systems can use clustering algorithms to analyze the types of books that are borrowed frequently, to increase or decrease the number of different types of books during book purchasing, and to help developers understand new books on the shelves [12]. The clustering algorithm flow is shown in Figure 1. The library management system is divided into five parts: control system, book management, borrowing management, book return management, and information list. The control system contains the system control interface, including user access, adding users, editing users, deleting users, and exiting the system. Book management includes adding, modifying, and deleting books. Borrowing management includes borrowing books and changing borrowing information. Book return management covers returning books and viewing return information. The information list includes a list of books, a list of borrowings, and a table of user details. The library management system hierarchy module is shown in Figure 2. Currently, libraries divide their collections into different sections according to regulations. Shelving similar books together may make some of them easier to find but may not interest the reader in other publications. At the same time, if the shelves are arranged according to specific numbers, each type of book needs enough reserved space for new books. If the space reserved is too large, it is wasted. If not enough space is reserved, it becomes difficult to store the books. As books are repeatedly sorted and reclassified, they quickly become worn and damaged.
The "80/20 effect" (the "28 effect") exists in many places, and libraries are no exception. There are thousands of books in the library, not all of which will be used by readers. Through data mining technology, books with high and low reading volumes can be identified. Popular collections can be expanded, and books that have lost their value can be removed, for example by donation, so as to improve the utilization rate of books.
When the same service model is applied to groups with different reading habits, uniform limits on the number and duration of loans constrain the reading interest of active readers. Inactive reading groups do not receive accurate guidance and publicity, and their attitudes towards reading cannot be easily changed. To address this situation, data mining can be used to obtain useful information about reader groups. For example, based on an analysis of their borrowing data, active readers can be given a higher maximum number of loans and fewer reading restrictions. For ordinary readers, these methods identify the most popular books to introduce and good books to recommend. In addition, according to the borrowing habits of readers, including their borrowing behavior and the types of books they borrow, the library can create dedicated spaces for displaying and reading books, which can attract readers to read in the library. Furthermore, according to the number of loans of different books and the time characteristics of borrowing, reading activities can be organized scientifically to improve the effective utilization of books and improve services.

Overall Design of Digital Library Management System
This system is a network application system using the B/S (browser/server) mode [13]. The advantage of the B/S mode is that the system is easy to develop, maintain, and upgrade, and the cost of control is low. The system presents a uniform user interface. It is easy to use and can be used by different users with different login settings. In general, the client does not need to install new software. The system can be used from a Web page in the browser, which makes it easy to maintain and manage. In addition, the application runs on the server side, so the management, updating, and upgrading of the system are very flexible, and the interaction between the server side and the client side is reduced. The B/S mode is shown in Figure 3. The whole library website system uses a three-tier structure: data access layer, business logic layer, and user interface layer.

The user interface layer is the part of the system that users interact with directly, and it consists of multiple interactive pages. Users can access the system directly through this layer to perform necessary tasks. Depending on the type of user, different users use different credentials to log on to the system. Taking a university library as an example, the system distinguishes three types of users: teachers, students, and administrators. Permissions are ranked from lowest to highest. In addition to reading and browsing all the resources on the website, teachers and students can read and borrow different textbooks, but they have different borrowing privileges. Administrators can update login information and manage all administrative issues and settings after login.
The business logic layer implements the business functions of the application. It sits between the user interface layer and the data access layer and is the core component of the layered architecture. The system has several modules that provide user interface functionality and access the database by calling the functionality provided by the data access layer.
The data access layer provides access to external databases and is the lowest layer of the system. It provides services to the business logic layer, including user information, book content details, and so on [14]. User information includes basic information for temporary users and administrators. Website content data mainly include book ordering information, borrowing information, and other information.
The library system has two components: the client system and the server. The library management system structure is shown in Figure 4 [15]. The main functions of the client system are registering user information, logging users in to the library system, browsing books and their basic information, querying books, borrowing books, and letting library staff query pre-purchase situations. The main functions on the server side are managing book resources, managing user information, managing book reservations and newly purchased books, and recommending books intelligently.

The business model of the library system has 23 business links. Users can browse books online anytime and anywhere [16]. These 23 business links are: mail application, mail form, application form, successful registration of the client system, adding user name and password, user information library, user login, successful login, book browsing, book borrowing, history information database, storage of books, adding to the shopping list, submitting, adding new book information to the new shopping library, submitting new book information, extracting related information, submitting recently purchased book information, submitting sample results and user names, reading user names and e-mail addresses, storing user data, assigning information to mail servers, and sending various subscription information to mail readers. The library business process is shown in Figure 5.

Algorithms of Digital Library Management System
Cluster analysis is an important function of data mining. Clustering is one of the most common methods in the data mining field and is used to discover groups of previously unknown objects in a database. Through the clustering process, objects that do not meet the same criteria are placed in different groups, each of which is called a class.
Given a dataset $V = \{v_i \mid i = 1, 2, \ldots, n\}$ of data objects, the data are grouped into subsets $C_i$ according to the similarity between the data objects $v_i$, such that the groups satisfy the conditions of a partition. This process is called clustering, and each $C_i$ ($i = 1, 2, \ldots, n$) is called a cluster (class).
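The conditions themselves do not survive in this text. A common hard-partition formulation, given here as an assumption rather than as the authors' exact statement, is:

```latex
\begin{aligned}
C_i &\neq \emptyset, \\
C_i \cap C_j &= \emptyset \quad (i \neq j), \\
\bigcup_{i} C_i &= V .
\end{aligned}
```

Fuzzy clustering such as FCM relaxes the disjointness requirement: each object $v_i$ receives a membership degree $u_{ij} \in [0, 1]$ in every cluster $j$, with $\sum_j u_{ij} = 1$.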

Cluster Analysis Principles
The input to cluster analysis is represented as a pair $(X, s)$ or $(X, d)$, where $X$ is the set of samples and $s$ and $d$ are measures of similarity or dissimilarity [17] between samples, respectively. The output of a clustering system is a partition $\{G_1, G_2, \ldots, G_K\}$ of $X$, where each $G_k$ ($k = 1, 2, \ldots, K$) is a subset of $X$. The members $G_1, G_2, \ldots, G_K$ are called classes, and each class has certain characteristics. In clustering, the results of cluster analysis are the classes and their attribute descriptions.

Measures of Similarity
Generally, clustering algorithms use spatial distance as a measure of the difference between two samples. The dissimilarity is denoted by $d(x, y)$ and is often referred to as distance.
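A. Manhattan Distance
$$d(i, j) = \sum_{k=1}^{m} |x_{ik} - x_{jk}| \tag{5}$$
In formula (5), $d(i, j)$ is the distance from data object $i$ to data object $j$; $(x_{i1}, x_{i2}, \ldots, x_{im})$ and $(x_{j1}, x_{j2}, \ldots, x_{jm})$ are the $m$ feature values [18,19] of data object $i$ and data object $j$, respectively.
B. Minkowski Distance
Suppose $x$ and $y$ are two samples with components $x_i$ and $y_i$, and $N$ is the dimension. The Minkowski distance between $x$ and $y$ takes the following form:
$$d(x, y) = \left( \sum_{i=1}^{N} |x_i - y_i|^{r} \right)^{1/r} \tag{6}$$
When $r = 1$, the formula is as follows:
$$d(x, y) = \sum_{i=1}^{N} |x_i - y_i| \tag{7}$$
When $r = 2$, the formula is as follows:
$$d(x, y) = \left( \sum_{i=1}^{N} |x_i - y_i|^{2} \right)^{1/2} \tag{8}$$
C. Euclidean Distance [20]
$$d(x, y) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^{2}} \tag{9}$$
D. Quadratic Distance
$$d(x, y) = \left[ (x - y)^{T} A (x - y) \right]^{1/2} \tag{10}$$
where $A$ is a nonnegative definite matrix [21].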
where q is a positive integer [22].
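The distance measures named in this section can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions; the function names and the sample vectors are hypothetical and do not come from the paper.

```python
import math

def manhattan(x, y):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def minkowski(x, y, r):
    """Minkowski distance of order r (r = 1 is Manhattan, r = 2 is Euclidean)."""
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)

def euclidean(x, y):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def quadratic(x, y, A):
    """Quadratic distance sqrt((x - y)^T A (x - y)); A is a list of rows."""
    d = [a - b for a, b in zip(x, y)]
    Ad = [sum(A[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return math.sqrt(sum(d[i] * Ad[i] for i in range(len(d))))

# Hypothetical feature vectors of two data objects.
x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(manhattan(x, y), euclidean(x, y), quadratic(x, y, identity))
```

With the identity matrix as $A$, the quadratic distance reduces to the Euclidean distance; a different nonnegative definite $A$ reweights and correlates the features.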

Cluster Criterion Function
A. Error Sum of Squares Criterion Function E
Given a dataset $X = \{x_1, x_2, \ldots, x_n\}$ containing $n$ objects, it is divided into $K$ separate partitions based on similarity, each partition being a cluster, and the clusters contain $n_1, n_2, \ldots, n_K$ elements. To evaluate the results of clustering, a squared error criterion function [23] is used.
B. Weighted Average Squared Distance Criterion J
The weighted average squared distance $J$ is defined in [24], where $P$ is the sum of the variables and $s_j^{*}$ is the orthogonal mean between the samples in the class. Substituting formulas (17) and (18) into formula (16) gives the expanded form of the criterion.
C. Between-Class Distance Criterion F
To describe the separation between the classes of a clustering result, the between-class distance $F_1$ and the weighted between-class distance $F_2$ can be used, as defined in [25]. In the formulas, $m_j$ is the mean vector of the samples in class $j$, $m$ is the mean vector of all samples, and $p_j$ is the probability of class $j$.
The between-class distance measures the degree of separation between the different classes of data. The larger the distance value, the better the separation between the groups in the clustering result, and the higher the quality of the grouping.
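The criterion formulas are not reproduced in this text, so the sketch below relies on commonly used textbook forms, which are assumptions rather than the authors' exact definitions: the squared error $E = \sum_j \sum_{x \in C_j} \lVert x - m_j \rVert^2$, the between-class distance $F_1 = \sum_j \lVert m_j - m \rVert^2$, and its weighted variant $F_2 = \sum_j p_j \lVert m_j - m \rVert^2$ with $p_j$ estimated as $n_j / n$. The sample partitions are hypothetical.

```python
def mean_vector(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[a] for p in points) / n for a in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def squared_error(clusters):
    """E: summed squared distance of every sample to its own cluster mean."""
    total = 0.0
    for cluster in clusters:
        m_j = mean_vector(cluster)
        total += sum(sq_dist(x, m_j) for x in cluster)
    return total

def between_class(clusters):
    """(F1, F2): unweighted and weighted distance of cluster means to the overall mean."""
    all_points = [x for cluster in clusters for x in cluster]
    m = mean_vector(all_points)
    n = len(all_points)
    f1 = sum(sq_dist(mean_vector(c), m) for c in clusters)
    # Assumption: p_j is estimated by the cluster's share of the samples.
    f2 = sum(len(c) / n * sq_dist(mean_vector(c), m) for c in clusters)
    return f1, f2

# Two hypothetical partitions of the same four points.
good = [[[0.0, 0.0], [2.0, 0.0]], [[10.0, 0.0], [12.0, 0.0]]]
bad = [[[0.0, 0.0], [10.0, 0.0]], [[2.0, 0.0], [12.0, 0.0]]]

print(squared_error(good), between_class(good))
print(squared_error(bad), between_class(bad))
```

The partition that keeps nearby points together has the smaller $E$ and the larger $F_1$ and $F_2$, which matches the reading that compact, well-separated clusters indicate a better grouping.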

Application Experiment of the FCM Clustering Algorithm in Digital Library Management System
This paper takes a university digital library as the research object, uses the FCM clustering algorithm to cluster its book borrowing information and reader information, and verifies the accuracy of the clustering results.

Clustering of Book Borrowing
The library system holds a large amount of book borrowing data, and the frequency of book use can be obtained by cluster analysis of these data. This paper takes last year's digital library borrowing information and removes the data that are unsuitable for cluster analysis. Three attributes (book number, borrowing date, and return date) are then selected to form the initial data. This paper uses the FCM clustering algorithm to classify books into three categories according to their popularity ("heat"), and the clustering results are shown in Table 1 and Figure 6. From the data, the average number of books borrowed per year ranges from 0 to 6000, indicating that the usage of different types of books varies greatly. The figures for the second and fourth quarters are significantly higher than those for the first and third quarters, which may be caused by final exams in the second and fourth quarters. As for borrowing days, books returned within 1-3 days do not reflect the reader's true borrowing needs, so the borrowing length should be set to 4-60 days. For popular books, the number of days of borrowing is much greater than for unpopular books, indicating that popular book types are more in line with the needs of readers. The analysis of the above results shows the readers' interests and preferences. Borrowing of books in the most popular categories is very high, so the library can increase the number of such books, replenish them promptly, and check their condition to ensure that readers can borrow them when needed. In addition, in terms of book placement, the library might consider placing such books in a prominent place for readers to borrow. This makes book purchasing and the allocation of purchasing funds more rational and avoids blind purchasing.
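The FCM update equations are not given in this text. The sketch below uses the standard fuzzy C-means formulation with fuzzifier $m = 2$, where memberships are $u_{ij} = 1 / \sum_l (d_{ij} / d_{il})^{2/(m-1)}$ and each center is the mean of the data weighted by $u_{ij}^m$. It is an illustration under stated assumptions, not the authors' implementation: the single feature (annual borrow count), the nine sample values, and the starting centers are all hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def fcm(data, init_centers, m=2.0, max_iter=100, tol=1e-6):
    """Standard fuzzy C-means; returns (centers, memberships of the last iteration)."""
    centers = [list(c) for c in init_centers]
    n, dim, k = len(data), len(data[0]), len(centers)
    exponent = 2.0 / (m - 1.0)
    u = []
    for _ in range(max_iter):
        # Membership update: u[i][j] = 1 / sum_l (d_ij / d_il)^(2/(m-1)).
        u = []
        for x in data:
            d = [max(dist(x, c), 1e-12) for c in centers]  # guard against d = 0
            u.append([1.0 / sum((d[j] / d[l]) ** exponent for l in range(k))
                      for j in range(k)])
        # Center update: mean of the data weighted by u[i][j]^m.
        new_centers = []
        for j in range(k):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            new_centers.append([sum(w[i] * data[i][a] for i in range(n)) / total
                                for a in range(dim)])
        shift = max(dist(old, new) for old, new in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once the centers no longer move
            break
    return centers, u

# Hypothetical annual borrow counts for nine books (not the paper's data).
borrow_counts = [[90.0], [120.0], [150.0], [1900.0], [2100.0], [2300.0],
                 [5600.0], [5800.0], [5900.0]]
# Hypothetical starting centers for low, medium, and high heat.
centers, u = fcm(borrow_counts, init_centers=[[0.0], [3000.0], [6000.0]])
# Hard label = cluster with the largest membership degree.
labels = [row.index(max(row)) for row in u]
print(labels)
print([round(c[0], 1) for c in centers])
```

Passing explicit starting centers keeps the example deterministic; random initialization is also common but can converge to different local optima. With features on different scales, such as borrow counts and loan days, the data would normally be normalized before clustering.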

Clustering of Reader Information
In this paper, a clustering algorithm is applied to the borrowing information of readers to classify the reader groups according to their needs, so as to serve readers better. This paper selects the reader information table and the reading history information table in order to analyze the historical information on books borrowed last year and to group readers according to their borrowing results. The results of reader information clustering are shown in Table 2 and Figure 7. As can be seen from Table 2 and Figure 7, readers can be classified into three categories according to their amount of reading: those who read frequently, those who read an average amount, and those who do not read regularly. Additionally, those who read frequently differ greatly from one another and have different requirements for reading. In actual library services, most libraries apply the same rules to all users. In fact, borrowing frequency often differs between readers. For readers who like to read, whose borrowing demand is large, the library can consider changing the number of books that can be borrowed and circulated, rather than treating readers with different needs in the same way. On the other hand, for those who read infrequently, their interest in borrowing can be further stimulated by providing them with effective publicity services.

Assessment of Cluster Analysis Results
In order to evaluate the accuracy of the cluster analysis results, this paper uses the error rate cross-estimation method [26] to verify the performance of cluster analysis. The basic principle of the cross-estimation method is to cluster each group of data and determine the group to which each sample is assigned. This paper then compares the classification results to obtain the percentage of erroneous data. The smaller the percentage, the better the clustering ability of the algorithm. This method is used to count the misclassified samples for each type of data under the FCM clustering algorithm used in this paper, giving Figure 8 and Table 3. The error rate of the system is obtained by dividing the number of errors for each type of data by the number of data items processed by the algorithm. From the data, the overall error rate of the proposed FCM clustering algorithm for information clustering is 3.66%, which is better than that of the existing comparative models. Therefore, the FCM clustering algorithm works well in the digital library management system.
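The error rate calculation described above can be sketched as follows. This is a minimal illustration that assumes each sample has a reference class and that cluster labels have already been matched to those classes; the label names and the ten samples are hypothetical and are not the data behind the 3.66% figure.

```python
def error_rate(reference, assigned):
    """Overall error rate: misassigned samples divided by all samples."""
    errors = sum(1 for r, a in zip(reference, assigned) if r != a)
    return errors / len(reference)

def per_class_error_rate(reference, assigned):
    """Error rate within each reference class."""
    counts = {}
    for r, a in zip(reference, assigned):
        total, wrong = counts.get(r, (0, 0))
        counts[r] = (total + 1, wrong + (1 if r != a else 0))
    return {label: wrong / total for label, (total, wrong) in counts.items()}

# Hypothetical reference classes and cluster assignments for ten books.
reference = ["hot"] * 4 + ["normal"] * 4 + ["cold"] * 2
assigned = ["hot", "hot", "hot", "normal"] + ["normal"] * 4 + ["cold"] * 2

print(error_rate(reference, assigned))
print(per_class_error_rate(reference, assigned))
```

Reporting the per-class rates alongside the overall rate shows whether errors are concentrated in one type of data.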

Discussion on the Application of the FCM Clustering Algorithm in Digital Library Management Systems
This paper improves the digital library management system with the FCM clustering algorithm and verifies the accuracy of the algorithm in the digital library management system through experiments. This paper mainly addresses the following two aspects: (1) It improves the structure of the library collection and the utilization of resources. The clustered data consist of three datasets: basic reader information records, reading information, and readers' return records. Readers are classified and grouped according to the information provided by the library.

Figure 2. Library management system level module.

Figure 7. Clustering data graph of reader information.

Table 1. Library classification clustering data.

Table 2. Cluster data table of reader information.