1. Introduction
Digital archives (DAs) aim to preserve human knowledge and artifacts by converting them into digital content such as text, images, audio, and video, and storing it in databases. During the last few decades, tremendous efforts have been made to develop advanced techniques for preserving this digital content, including modeling 3D or 4D objects [1,2], processing historical documents [3], and archiving geospatial data [4]. Since search techniques have been regarded as the core information access method in DAs, they have been developed and optimized extensively. Representative search techniques in DAs include keyword search, image search [5], and semantic search [6].
The recommender system (RecSys) is widely used as an information access method in many fields, including the research and applications of personalization in cultural heritage surveyed in [7]. In DAs, however, RecSys has received little attention and remains underdeveloped. We focus on building a RecSys for DAs, which has the potential to attract more users and support their usage of the archives.
Researchers have become increasingly aware of the importance of RecSys in DAs as well as in other digital galleries, libraries, archives, and museums (GLAMs). For example, Wilson-Barnao [8] emphasized that algorithmic cultural recommendation creates commercial value embodied in historical and cultural values, and that the significance of the Google Cultural Institute lies in bridging a commercial enclosure with the cultural collections of public institutions.
The main purpose of our proposed method is to use the queries formulated by high-level experts to help other users find the appropriate query (or queries) to narrow down the scope of the search results. The proposed model can be downloaded from GitHub (https://github.com/blueorris/M-CRBMs, accessed on 22 December 2022). According to [9], users access Europeana (https://www.europeana.eu/portal/en, accessed on 22 December 2022) [10], a web portal created by the European Union containing digitized museum collections from more than 3000 institutions across Europe, mainly to seek materials, which is often considered the main purpose of users coming to DAs. Europeana serves various types of users with various purposes. It is worth noting that the variety of user types (including cultural heritage enthusiasts, students, academics, teachers, cultural heritage professionals, and others) illustrates that users have different levels of understanding of the materials stored in DAs. Likewise, the fact that the purposes “create new work” and “personal interest” together account for a larger proportion than “professional activities” illustrates that seeking research-relevant materials is only part of the information needs in DAs. It is therefore necessary to rethink how the information access functions of DAs should be built to support better usage. Since users who lack a deep understanding of the materials may fail to formulate appropriate queries for their purposes, we propose a method that recommends candidate queries to assist them.
Query recommendation (or query suggestion) assists users in refining their queries so as to satisfy their information needs. Users often try different queries until they are satisfied with the results, but this trial-and-error process is hindered when they know little about the information they are searching for. As mentioned above, users vary in their level of expert knowledge of the contents of DAs, which makes it difficult for some of them to formulate appropriate queries for information seeking.
For example, such difficulties occur when a user searches ukiyo-e databases (ukiyo-e is a famous Japanese art of woodblock prints). Users with basic background knowledge of ukiyo-e would formulate a simple search query, such as “美人” (meaning “beauty”, one of the most famous themes of ukiyo-e), whereas experts tend to use more specific queries to limit the scope of the results, such as “三美人” (meaning “three beauties”), “当世美人揃” (meaning “beauties of the present age”), or “見返り美人図” (meaning “beauty looking back”). In this example, the expert queries related to “美人” are often the whole title or series name of an ukiyo-e print, or subwords thereof.
In this article, we utilize the query-item pairs extracted from the access log of the Ukiyo-e Portal Database [11] of the Art Research Center (ARC-UPD, Ritsumeikan University) to train the proposed model and conduct our experiments. The ARC-UPD is one of the largest DAs storing digitized ukiyo-e prints together with their extensive metadata (artist, title, genre, etc.). More than 200,000 public ukiyo-e prints could be browsed in the ARC-UPD as of August 2022.
The restricted Boltzmann machine (RBM) is a two-layer undirected graphical model with a visible layer of observable variables or features and a hidden layer of latent, representative units. It has been used for many machine learning tasks, including image generation [12], dimensionality reduction [13], and representation learning of human motion [14]. It was first introduced to the recommendation task by Salakhutdinov et al. [15]. Compared with RBMs, conditional RBMs (CRBMs) take extra information into account and are often applied to temporal sequences of data, such as the CRBM models proposed by Taylor et al. [14] and by Salakhutdinov et al. [16]. Both RBMs and CRBMs capture the dependencies between the visible layer variables by associating an energy with each configuration of those variables.
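For reference, the standard RBM energy over binary visible units $\mathbf{v}$ and hidden units $\mathbf{h}$, with weight matrix $W$ and biases $\mathbf{b}$, $\mathbf{c}$, takes the following well-known form (our notation, not reproduced from this paper's Section 3):

```latex
E(\mathbf{v}, \mathbf{h}) = -\sum_{i} b_i v_i - \sum_{j} c_j h_j - \sum_{i,j} v_i W_{ij} h_j
```

Low-energy configurations of $(\mathbf{v}, \mathbf{h})$ correspond to high-probability ones, which is why frequently co-occurring patterns of visible units form the "configurations" discussed below.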
Our idea for query recommendation is that similar queries should exist in one configuration. For example, if the queries “美人” (meaning “beauty”) and “三美人” (meaning “three beauties”) are often used to search for one particular ukiyo-e print, then these two queries should be in one configuration. The weight matrices in the energy model capture the dependencies between the units (which can represent queries or metadata values) and thereby determine the configurations. An RBM model only learns the dependencies among the queries, whereas a CRBM model utilizes extra information to help find potentially related queries in one configuration. In M-CRBMs, the weight matrix removed from conventional CRBMs is the one between the conditional layer and the visible layer (i.e., the direct dependencies between the queries and the metadata). The details of the methodology are explained in Section 3 and Section 4.
The novelty and contributions of this work are as follows:
We propose a method called modified conditional restricted Boltzmann machines (M-CRBMs) to recommend queries in DAs. Given an initial query, M-CRBMs help users with different levels of expert knowledge seek information in the database.
We modify conventional CRBMs and construct M-CRBMs by reducing the weight matrices in the model. This makes the model trainable on average-performing computers. In addition, we use free energy instead of energy to train the model efficiently.
The proposed M-CRBM model is able to predict the queries relevant to the user’s query and predict the relevance degree (ranking) simultaneously.
This paper is structured as follows. Section 2 reviews the work related to our research. Section 3 introduces the basic methodology of RBMs and CRBMs. Section 4 presents the methodology of our proposed M-CRBMs. Section 5 explains the DA dataset that we use in the experiments. Section 6 describes the experiments, and Section 7 concludes the paper.
4. Modified CRBMs (M-CRBMs)
We propose M-CRBMs, which incorporate three conditional layers that take extra information into account rather than only the observed visible variables, as shown in Figure 3. Conditional layers can easily be added or removed according to the number of types of extra information. In our experiment on a DA dataset, three types of extra information were used; therefore, we implemented a three-conditional-layer M-CRBM in this paper.
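To make the architecture concrete, the following is a minimal sketch (our illustration, not the authors' released code) of the parameter shapes and the hidden-layer inference of a three-conditional-layer M-CRBM in which conditional layers connect only to the hidden layer. All sizes and names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_vis, n_hid = 1000, 64        # hypothetical: number of distinct queries / latent features
n_cond = (50, 30, 20)          # hypothetical vocabulary sizes of three metadata fields

W = rng.normal(0, 0.01, (n_vis, n_hid))                 # visible-to-hidden weights
U = [rng.normal(0, 0.01, (m, n_hid)) for m in n_cond]   # one conditional-to-hidden matrix per metadata field
b = np.zeros(n_vis)            # visible biases
c = np.zeros(n_hid)            # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(v, conds):
    """P(h_j = 1 | v, conditionals): each conditional layer shifts the hidden pre-activations."""
    pre = c + v @ W
    for d, Uk in zip(conds, U):
        pre = pre + d @ Uk
    return sigmoid(pre)
```

Adding or removing a conditional layer then amounts to adding or removing one `(vocab_size, n_hid)` matrix from `U`, which matches the modularity described above.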
Our proposed M-CRBMs are much more scalable to large training data and better suited to DAs than the conventional CRBMs described in Section 3.3. First, in the query recommendation task, the training data can be extremely large, because the common embedding method represents an item (or a URL) as a bag-of-words (BoW) vector whose ones mark the queries that have been used to access that item. To represent all the items, the dimensionality of each item's vector must therefore equal the number of distinct queries in the access log. Second, the queries are difficult to embed with many natural language processing (NLP) models, even those often reported to reduce the dimensionality of representative vectors efficiently. The reason is that queries in DAs are special, often coming from a niche field, and thus tend to be embedded poorly by NLP models trained on general text corpora. Moreover, our proposed M-CRBM can incorporate additional information, such as the metadata that are abundant in DAs. Lastly, this extra information helps alleviate the difficulty of dealing with sparse query vectors, which is also one of the key advantages of M-CRBMs over RBMs.
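The BoW item encoding described above can be sketched as follows. This is our illustration of the described method, with a tiny hypothetical access log; real logs would contain far more query-item pairs.

```python
import numpy as np

# Hypothetical (query, item) pairs as they might appear in an access log.
log = [
    ("美人", "print_001"), ("三美人", "print_001"),
    ("美人", "print_002"), ("見返り美人図", "print_002"),
]

queries = sorted({q for q, _ in log})     # every distinct query in the log
items = sorted({i for _, i in log})
q_idx = {q: k for k, q in enumerate(queries)}

# Each item becomes a binary vector over ALL queries: a one marks a query
# that was used to access that item. Vector length = number of distinct queries.
bow = {i: np.zeros(len(queries)) for i in items}
for q, i in log:
    bow[i][q_idx[q]] = 1.0
```

Even in this toy log the vectors are mostly zeros; with the full ARC-UPD query vocabulary they become extremely high-dimensional and sparse, which is the scalability problem motivating the reduced weight matrices.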
Algorithm 3 summarizes our proposed method. The idea of M-CRBMs is to capture the configurations (or dependencies) of co-occurring queries formulated by users with high and low levels of expertise. Each M-CRBM represents an item (an ukiyo-e print), each visible unit represents a query, each hidden unit represents a latent feature of the queries, and each conditional unit represents a word from the corresponding metadata field. In this model, we regard queries that retrieved the same item as relevant, and these relevant queries are semantically similar. By learning the configurations of high- and low-expertise queries from the whole query log, the relevant queries can be detected.
Algorithm 3: CD in M-CRBMs with free energy
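As a rough illustration of what one contrastive divergence (CD-1) step with a free-energy objective might look like, the following sketch is our own reconstruction, not the authors' code; for brevity it assumes a single conditional layer, arbitrary sizes, and mean-field reconstruction of the visible units.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vis, n_hid, n_meta = 6, 4, 3            # hypothetical sizes
W = rng.normal(0, 0.1, (n_vis, n_hid))    # visible-to-hidden weights
U = rng.normal(0, 0.1, (n_meta, n_hid))   # conditional-to-hidden weights
b, c = np.zeros(n_vis), np.zeros(n_hid)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def free_energy(v, d):
    """F(v|d) = -b.v - sum_j log(1 + exp(pre_j)); log1p keeps small values stable."""
    pre = c + v @ W + d @ U
    return -(v @ b) - np.log1p(np.exp(pre)).sum()

def cd1_step(v0, d, lr=0.05):
    """One CD-1 update, approximately descending F(v_data) - F(v_recon)."""
    global W, b, c
    ph0 = sigmoid(c + v0 @ W + d @ U)                   # forward pass on data
    h0 = (rng.random(n_hid) < ph0).astype(float)        # sample hidden states
    pv1 = sigmoid(b + h0 @ W.T)                         # backward: reconstruct visibles
    ph1 = sigmoid(c + pv1 @ W + d @ U)                  # hidden activations on reconstruction
    W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))  # positive minus negative statistics
    b += lr * (v0 - pv1)
    c += lr * (ph0 - ph1)
    return free_energy(v0, d) - free_energy(pv1, d)
```

Note that the visible-layer reconstruction does not involve `U`, reflecting the removed conditional-to-visible weight matrix.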
The energy function of our proposed M-CRBMs is defined as follows:
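The equation itself is not reproduced here; based on the structure described above (visible layer $\mathbf{v}$, hidden layer $\mathbf{h}$, three conditional layers $\mathbf{d}^{(k)}$ connected only to the hidden layer, with the conditional-to-visible weights removed), it plausibly takes a form such as the following, which is our reconstruction rather than the original notation:

```latex
E\big(\mathbf{v}, \mathbf{h} \mid \mathbf{d}^{(1)}, \mathbf{d}^{(2)}, \mathbf{d}^{(3)}\big)
= -\sum_{i} b_i v_i - \sum_{j} c_j h_j
  - \sum_{i,j} v_i W_{ij} h_j
  - \sum_{k=1}^{3} \sum_{m,j} d^{(k)}_m U^{(k)}_{mj} h_j
```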
Thus, the free energy is computed as follows:
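Again, the equation is not reproduced here; marginalizing out the binary hidden units of an energy of the form above yields a free energy plausibly of the following shape (our reconstruction, using the same symbols as the reconstructed energy):

```latex
F\big(\mathbf{v} \mid \mathbf{d}^{(1)}, \mathbf{d}^{(2)}, \mathbf{d}^{(3)}\big)
= -\sum_{i} b_i v_i
  - \sum_{j} \log\!\Bigg(1 + \exp\!\Big(c_j + \sum_{i} v_i W_{ij}
  + \sum_{k=1}^{3} \sum_{m} d^{(k)}_m U^{(k)}_{mj}\Big)\Bigg)
```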
We use the loss function constructed from the free energy, as shown in Equation (10), because the logarithm operation in the free energy avoids extremely small energy values during the training phase. Extremely small energy values can arise when the model learns too many frequent configurations, which would prevent it from learning relatively rare but meaningful configurations.
In the prediction phase, given the input query vector and its extra metadata, the model applies one forward and one backward inference pass to reconstruct the input vector. This assigns each visible unit a probability representing how related the corresponding query is to the current input.
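This prediction step can be sketched as follows. This is our illustration under the same assumptions as before (a single conditional layer, conditional-to-hidden weights only, hypothetical sizes and names); the returned indices would be mapped back to query strings for recommendation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_vis, n_hid, n_meta = 8, 4, 3            # hypothetical sizes
W = rng.normal(0, 0.1, (n_vis, n_hid))
U = rng.normal(0, 0.1, (n_meta, n_hid))
b, c = np.zeros(n_vis), np.zeros(n_hid)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def recommend(v, d, top_k=3):
    """One forward and one backward pass; the reconstruction probabilities rank all queries."""
    ph = sigmoid(c + v @ W + d @ U)       # forward: hidden activations given query + metadata
    pv = sigmoid(b + ph @ W.T)            # backward: a relevance probability per visible unit
    return np.argsort(-pv)[:top_k]        # highest-probability queries first

v = np.zeros(n_vis); v[0] = 1.0           # one-hot vector for the user's initial query
d = np.zeros(n_meta); d[1] = 1.0          # metadata of the accessed item
candidates = recommend(v, d)
```

Because every visible unit receives a probability, the same pass yields both the set of relevant queries and their relevance ranking, as stated above.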