A Systematic Approach to Long-Term Storage of Documents Using Digital Twin Technologies

Abstract: The article discusses the modeling of the use of digital twin technologies (a digital twin) for the task of organizing long-term storage of various types of documents. A digital twin in this respect differs from a digital copy, which has no connection with the real original object. A review of the problem was carried out. We argue that, despite the active use of digital twin technology in various areas, its applications in the area of long-term storage of documents remain limited. At the same time, the task of increasing the durability of documents during long-term storage remains important and largely unresolved. The complexity of solving the problem of long-term storage of documents is considered, and a formal statement of the problem of long-term storage using digital twin technologies is given. Destructive factors that affect long-term storage documents and significantly reduce their durability have been identified. A system of indicators has been developed to assess the durability (preservation) of documents. The modeling of the use of digital twin technology in the organization of long-term preservation within the framework of Industry 4.0 was carried out. In the course of modeling, the goals and the strategies of long-term storage were established. Primary mathematical models for controlling destructive factors as well as technological solutions for digital twin long-term storage are proposed. It is assumed that the key part of this technology is the Industrial Internet of Things. The effectiveness of the use of digital twin technologies for solving the problem was assessed. The spheres of application and further ways of research are determined.


Introduction
The rapid digitalization of the economy and society has long been a global trend in the transformation of industry and public life. Such terms as augmented reality (AR) and virtual reality (VR), 3D/4D printing, artificial intelligence (AI), Industrial Internet of Things (IIoT), and digital twins are becoming more and more familiar to the general audience. Increasingly, one can come across statements in the literature that the fourth industrial revolution, or Industry 4.0, has already begun (see, for example, Refs. [1,2]). The main aspects of Industry 4.0 involve the generation of a large amount of data (within IIoT) and their intensive use for analysis in the transition to the concept of intelligent data processing and the use of AI [3].
In this regard, the issue of long-term storage of data within Industry 4.0 remains underdeveloped [4]. Traditionally, the greatest difficulties in organizing long-term data storage arise with archival documents in paper form. The information they contain is often extremely valuable for modernizing existing production and technological processes. At the same time, the periods of storage of archival documents are either not limited or are decades long [5]. This question is extremely important because the value of such documents is usually very high.
It should be noted that, according to [6], the large volume of stored documents permits only selective checks of the state of documents during long-term storage, while monitoring of the storage parameters, which according to [6] is carried out 1-2 times a week, does not allow timely identification of storage risks and possible damage to documents.
As a result, and also taking into account the ratio of stored volumes of documents to restored volumes per year of 10,000:1, it can be argued that, in cases of damage to documents, only a negligible part of the documents can be saved by means of restoration. This statement is confirmed by studies (see, for example, Refs. [7,8]). In addition, the ratio of documents to be restored to those actually restored is estimated at 200:1, which also indicates the impossibility of the timely detection of damaged documents and of their restoration in the required quantity.
Thus, this contradictory situation gives rise to the need to solve the extremely urgent scientific and practical problem of increasing the durability (preservation) of long-term storage documents by converting them into digital forms.
This article discusses the relevance and necessity of using digital twin technology for organizing the long-term storage of archival documents. A digital twin in this respect differs from a digital copy, which has no connection with the real original object. The effectiveness of the new approach to solving the problem of long-term storage is shown.
Section 2 is devoted to a review of the literature and the current state of the art in solving the problem of long-term storage. Section 3 formalizes the problem of creating digital twins. Section 4 describes a general methodological approach for developing a concept for organizing long-term storage using digital twin technology. Section 5 addresses a number of technology development issues. Section 6 is devoted to modeling the application of digital twin technology for long-term storage. Section 7 discusses comparison issues with the application and limitations of the presented methodology. Section 8 contains directions for the development of digital twin technologies in the future.

Related Works and Problem Review
Digital twin technologies have become very popular recently, and the relevance and necessity of their use are emphasized by the rapid growth of publications on these issues over the past 3-4 years.
A brief review of publications on digital twin technologies leads to the following conclusion: the majority of papers consider digital twin technologies as an integral part of industrial production during its digital transformation (see, for example, Refs. [9-13]).
The use of digital twins in industry, according to most contemporary papers, is extremely relevant because it allows one to simulate operations by creating a digital model of manufactured products (parts, technical devices, other devices, etc.). In this way, problems are identified, and thus, it is possible to improve the quality and competitiveness of products from the design stage to the stages of product testing and operation. Moreover, at the stage of operation, the use of the IIoT is directly assumed to control the operation of real products [9,11].
At the same time, modeling an object or creating a digital twin is possible not only for final products but also for production as a whole [14]. All stages of production and operation of the enterprise are simulated, from the design stage to virtual testing and certification. For this, in addition to the concept of a digital twin, as a highly adequate mathematical model of products and enterprises, the concept of a digital shadow is also introduced, as a model of the behavior of an object in all modes of operation, and the concept of digital certification, as a series of virtual tests in all modes of operation.
However, in addition to industrial production, the use of digital twin technology has spread to other areas. Here, we may note the sphere of energy (see, for example, Refs. [15,16]). In this area, the importance of the task of increasing the efficiency of providing energy to the population and industry, as well as the efficiency of managing the redistribution of these services on a regional scale using digital twin technology, is emphasized.
An important area of application of digital twin technology is the stage of designing industrial products (see, for example, Refs. [17,18]). These works highlight the importance of creating a digital twin during the product design phase and report an increase in the efficiency of product development with this approach.
Another important industry for which the use of digital twins is arguably pivotal is the management of cities, both within the framework of smart cities (see, for example, [19]) and within the framework of the management of city services and communications as part of their digitalization (see, for example, Refs.[20,21]).
In addition, an increase in efficiency is declared when using digital twin technology for the monitoring of the technical operation of cars (see [22]), for modeling social processes (see [23]), and for geochemical monitoring of the environment (see [24]).The relevance and effectiveness of the use of digital twins in the banking sector (see [25]), in the analysis of the geomagnetic activity of the Earth (see [26]), and even in industries such as Arctic tourism (see [27]) have also been described.
Of utmost interest is the study [28], which provides an elaborated reference architecture of a digital twin and applies it to the industrial case for the digital transformation of complex and unique wetlands. In addition, it is shown that the digital twin as a service (DTaaS) paradigm can effectively be utilized for digital transformation, including benefits such as intelligent planned maintenance, real-time monitoring, remote control, and predictive functions. Also, [28] provides results confirming that there is a significant relationship between DTaaS and mass personalization. In turn, the importance of mass personalization is seen as a common goal of recent industrial revolutions. This argument is supported by [29]. In [29], a reference architecture model for achieving mass personalization is also proposed, which contributes to understanding how Industry 5.0 improves Industry 4.0 through a human-centric approach to increase its sustainability.
In this regard, it should be noted that, despite the creation of electronic archives, the technology of digital twins is still not used in archiving. However, high risks of damage and loss of valuable items and especially valuable documents (see, for example, [7,8,30]) make the problem of increasing the efficiency of long-term storage extremely urgent.
Although the digitization of stored documents is an important and necessary measure, a digital copy cannot replace a document that has legal significance.
The widespread digitization of archival documents does not solve the problem of storage because, as a result, a digital image of a document at a certain point in time is obtained, which is not a digital twin and which also needs to be stored in an unstable digital software and hardware environment (see [30]).
Thus, a digital copy of a document, having no connection with a real object, does not model the aging processes occurring in a real object or the influence of destructive factors described below.
Cloud solutions can be used to solve problems with long-term storage. This is relevant, for example, when the placement of software and hardware components in archive premises is impossible due to non-compliance or violation of storage requirements, lack of premises, or insufficient energy supply. However, they are not a "panacea", because the same problems of unstable hardware and software environments described in [30] also affect the software and hardware on which specific cloud solutions run.

Problem Statement
What is a digital twin, then? A concise definition runs as follows: a digital twin of a long-term storage document is a digital copy of a real-life document that maintains a connection with this document, is implemented by digital means (software and hardware), and simulates the processes occurring in the real-life document.
Let us consider the main risks when storing documents in archives and the means used to control the safety of documents.
First of all, the risk of damage or loss of documents in the archive is associated with non-compliance with the storage conditions described in detail in [6,7].
These conditions should include the following:

• The presence of buildings that allow for long-term preservation;
• The availability of storage facilities that meet the requirements of long-term preservation.

Another problem lies with the durability of the information layer that forms the document on the corresponding medium. This refers to the problem with the durability of dye, magnetic, optical, film, photo, and other layers that actually represent the content of information on the corresponding media.
Here are averaged data for the most common media. Such media as parchment, papyrus, and potentially promising quartz digital media are not considered due to their low prevalence or complete absence at the present time.
Let us consider typical negative influences of destructive factors ($Des = \{dd_i\}$) that affect documents (carrier, dyes, etc.) during their storage (see [7,8,35] for more details). See Table 2. For more details on the problems of long-term preservation of digital data and electronic documents, see [30].
In addition to the problems and risks described above that arise when organizing long-term storage, there are also problems with implementing digital twin technology. Despite the relevance of the digitalization task outlined in the public development program "Digital Economy", sponsored by the Government of the Russian Federation [36], and the stated importance of such areas as AR, VR, and IIoT, the implementation of digital twin technology is hindered by the high cost of projects, the lack of experts in this subject area, and the lack of regulatory documents governing the use of the technology.
Nevertheless, when organizing the long-term storage of documents, particularly valuable ones, one must take into account the fact that their loss or damage can result in enormous costs, both political and economic, not to mention the blow to organizations. The complexity of solving the problem of the long-term storage of documents lies in the fact that, when creating a document, few people think that the document will be stored for a long time in an unfavorable storage environment. The storage medium, including electronic storage, is subject to change. Thus, the document is affected by a number of adverse factors that contribute to its destruction or damage.
In addition, before being transferred to long-term storage, the document may be stored in improper conditions and not retain its original state.

Table 2. Destructive factors ($Des = \{dd_i\}$) affecting documents during long-term storage.

| Factor | Destructive influence | Medium | Description |
|---|---|---|---|
| $dd_3$ | Humidity | Paper | From the point of view of safety, the limit of changes in relative humidity of 30-60−30% is considered the safest. Otherwise, the rate of destruction of documents increases by 10-15 times [7,8]. |
| $dd_4$ | Impact of the lack of ventilation | Paper | The ventilation of the air avoids the formation of closed stagnant zones of the inertial microclimate in the premises of the archive (such zones bring an increased risk of damage to the documents, the risk of the biological nature) [8]. |
| $dd_5$ | Impact of the "crowding" of documents | Paper | Even while maintaining the required light conditions, as well as conditions of humidity, temperature, ventilation in the premises, it is necessary to store documents with sufficient insulation to avoid the formation of closed stagnant zones of the inertial microclimate in the mass of documents [7]. |
| $dd_6$ | Mechanical exposure | Paper | The main mechanical effects, according to statistics (see studies [7,8]), occur even before transfer to long-term storage. There may be damage during the issuance of documents from the archive. |
| $dd_7$ | Biological impact (mold, insects, bacteria) | Paper | For example, mold infestation becomes possible at temperatures of 10-40 °C and relative humidity above 65-68%. The growth rate of the lesion is from 0.3 to 15 cm per day, depending on humidity and temperature [8]. |
| $dd_8$ | Chemical exposure in the storage environment | Paper | Depends on the chemical composition of the air. For example, the rate of acid corrosion of materials in urban conditions is 2-10 times higher than outside the city [8]. |
| $dd_9$ | Exposure to electromagnetic fields | Digital | [35] |
| $dd_{10}$ | Loss of authenticity | Digital | [35] |
| $dd_{11}$ | Non-interpretability | Digital | The inability to interpret the sequence of bits that make up an electronic document [35]. |
| $dd_{12}$ | Impact of storage firmware changes | Digital | [35] |

In this regard, the problem of organizing long-term storage may be stated as the problem of stabilizing a storage object (a document) in a storage environment subject to parametric disturbances that violate the stability of the storage object. Then, the problem of organizing long-term storage using digital twin technologies belongs to the class of optimal control problems.
By "digital twin", we mean a virtual copy of a storage object and a long-term storage archive as a whole, in which the processes that occur with real objects are simulated.
A digital twin in this respect differs from a digital copy, which has no connection with the real original object.
From this point of view, the problem can be classified as a problem of modeling the process of long-term storage, which uses AR, VR, and IIoT technologies.
The mathematical formulation of the problem can be given in the following form.
The following definitions are given:

• A set of documents subject to long-term storage: $Doc = \{d_i\}$;
• A set of types of long-term storage documents: $Typ = \{dt_i\}$;
• A set of requirements for the storage conditions of documents: $Rul = \{dr_i\}$;
• The set of real storage conditions for documents: $Con = \{dc_i\}$;
• A set of destructive factors impacting the documents: $Des = \{dd_i\}$.

Document types $Typ = \{dt_i\}$ and sets of requirements for storage conditions $Rul = \{dr_i\}$ are defined in more detail in [6]. The format of the article does not allow for them to be cited in full.

The following can be found:

• The designated period of document storage, defined as the calendar length of maintaining the safety of a document of type $dt_i$ under specified storage conditions $dc_i$ at the level $SV_{0i}$: $T_{dsi}$. Accordingly, for all types of documents ($Typ = \{dt_i\}$): $T_{ds} = \{T_{dsi}\}$.
• The appointed period of document restoration, defined as the calendar duration of the detection of a problem and the restoration of a document of type $dt_i$ under the specified storage conditions $dc_i$ to the preservation level $SV_{0i}$: $T_{dri}$. Accordingly, for all types of documents ($Typ = \{dt_i\}$): $T_{dr} = \{T_{dri}\}$.
• The probability of a document's persistence, defined as a statistical estimate of the probability that a document of type $dt_i$ under given storage conditions $dc_i$ will reach its intended storage period $T_{dsi}$: $P(T_{dsi}) = n_i / N_i$ ($n_i$ is the number of documents of type $dt_i$ remaining safe, i.e., in the state of initial preservation $SV_{0i}$ for a period of time $T_{dsi}$; $N_i$ is the total number of documents of type $dt_i$). Accordingly, for all document types ($Typ = \{dt_i\}$): $P(T_{ds}) = \{P(T_{dsi})\}$.
• The probability of a document's recovery, defined as a statistical estimate of the probability of detecting a problem and restoring a document of type $dt_i$ in a time not exceeding $T_{dri}$: $P(T_{dri}) = nv_i / Nv_i$ ($nv_i$ is the number of documents of type $dt_i$ that can be restored to the state of initial preservation $SV_{0i}$ over a period of time $T_{dri}$; $Nv_i$ is the total number of documents of type $dt_i$). Accordingly, for all document types ($Typ = \{dt_i\}$): $P(T_{dr}) = \{P(T_{dri})\}$.
• The average shelf life of a document, defined as the mathematical expectation of time during which a document of type $dt_i$ remains completely saved at the level $SV_{0i}$: $T_{dsvi} = \frac{1}{N_i} \sum_{j=1}^{N_i} P_j(T_{dsi}) \, T_{dsij}$ (where $P_j(T_{dsi})$ is the probability of persistence of the $j$-th document of type $dt_i$, and $T_{dsij}$ is the storage time of the $j$-th document of type $dt_i$). Accordingly, for all types of documents ($Typ = \{dt_i\}$): $T_{dsv} = \{T_{dsvi}\}$.
• The average document recovery time, defined as the mathematical expectation of the time of detecting a problem and restoring a document of type $dt_i$: $T_{drvi} = \frac{1}{N_i} \sum_{j=1}^{N_i} P_j(T_{dri}) \, T_{drij}$ (where $P_j(T_{dri})$ is the probability of recovering the $j$-th document of type $dt_i$, and $T_{drij}$ is the time of problem detection and recovery of the $j$-th document of type $dt_i$). Accordingly, for all document types ($Typ = \{dt_i\}$): $T_{drv} = \{T_{drvi}\}$.
• Gamma-percentage shelf life of a document, defined as the shelf life of a document of type $dt_i$ under given storage conditions $dc_i$ at the level $SV_{0i}$, achieved with a given probability "gamma" ($\gamma$).
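As an illustration of how the statistical indicators defined above might be estimated from archive records, the following minimal Python sketch computes the persistence probability $P(T_{dsi}) = n_i / N_i$ and the average shelf life $T_{dsvi}$ for one document type. The `DocumentRecord` structure and its field names are assumptions introduced for this example only; they are not part of the formal statement.

```python
from dataclasses import dataclass


@dataclass
class DocumentRecord:
    """One stored document of a given type dt_i (illustrative field names)."""
    storage_years: float     # T_dsij: time the document stayed at level SV_0i
    persistence_prob: float  # P_j(T_dsi): persistence probability of this document
    preserved: bool          # True if the document is still at level SV_0i


def persistence_probability(records):
    """Estimate P(T_dsi) = n_i / N_i for one document type."""
    if not records:
        raise ValueError("at least one document record is required")
    n_i = sum(1 for r in records if r.preserved)
    return n_i / len(records)


def average_shelf_life(records):
    """Estimate T_dsvi = (sum_j P_j(T_dsi) * T_dsij) / N_i."""
    if not records:
        raise ValueError("at least one document record is required")
    weighted = sum(r.persistence_prob * r.storage_years for r in records)
    return weighted / len(records)
```

The recovery indicators $P(T_{dri})$ and $T_{drvi}$ have the same form, so the same two functions could be reused with recovery times and recovery probabilities in place of the storage values.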
Now let us consider modeling the use of digital twin technologies to ensure the long-term preservation of documents.

Methodology Structure
This paper presents a systematic approach to developing the concept of organization of long-term storage using digital twin technology.
For this purpose, the following apply: The efficiency of digital twin technology is evaluated on the basis of the efficiency indicators of document recovery time and the probability of document recovery proposed in Section 3.

Objectives of Long-Term Storage
Of course, when organizing long-term storage, one should clearly understand the purpose of storing a particular document.
Such goals as the following may be set:

• Preservation of the original document in its original form. This goal, first of all, concerns documents that have legal significance. Such documents may contain original signatures (including electronic signatures) and seals.
• Preservation of the information that the document contains. The value is not the original document itself, but the information contained in it.
• Preservation of the original appearance of the original document or some other important physical property. The original appearance can be valuable from a historical point of view.
• Saving information about a document. In addition to the document itself, knowledge about when and where the document was created, what documents or events it was associated with, and information about the authors of the document can be of value.

Optimal Strategies for Long-Term Storage
Depending on the goals set in the organization of long-term storage, preservation efforts can be optimized and optimal storage strategies can be selected.
Such strategies for ensuring long-term storage ($MStg = \{mst_i\}$) include the following:

• Saving the original ($mst_1$);
• Digitization, i.e., creating an electronic copy ($mst_2$);

From the point of view of any archive, the preservation of the original is, of course, an important task. But, as was shown above, even if the storage conditions are observed, the storage time of the original is finite. The difficulty lies in creating a digital twin of such an original because, in a digital twin, the processes occurring in the original object must be reproduced exactly.
Digitization, or the creation of an electronic copy, is an important task and, in the context of digitalization, necessary.However, the creation of a digital copy is not the creation of a digital twin because the digital copy itself "does not know" anything about the original object.
The restoration itself also does nothing for the technology of the digital twin except to clarify the state of the original object.In addition, restoration, as shown above, cannot solve storage issues due to the limitations of the speed of restoration and the time required to check the status of stored documents.
The most interesting strategy in terms of the possibility of using digital twin technology is reproduction. The reason for this is the availability of new durable materials and the possibility of embedding microchips in them.
However, in addition to the actual document's twin, albeit equipped with various electronic means, it is necessary to control the processes that occur with the stored documents.

Mathematical Models of Digital Twins
In order to control not only storage objects but also the storage environment itself, the use of digital twins in the organization of long-term storage should include the following models ($MDTwn = \{md_i\}$):

Technological Solutions for Creating a Digital Twin of a Document
Creating a digital twin for stored documents ($md_2$) requires the use of the following technological solutions ($MTech = \{mth_i\}$):

• Creating new documents on a new durable medium ($mth_1$);
• Embedding digital microchips in traditional and new durable storage media ($mth_2$).
The authors, having experience in developing long-term storage solutions (see, e.g., [30]), believe that creating a digital twin and modeling the processes occurring in documents is necessary for each stored document, especially for long-term storage.
In terms of creating new documents on new durable media, polymer film and composite materials are of the greatest interest. According to modern research, such materials are durable, water-resistant, frost-resistant (down to −60 °C), and not subject to destruction due to mold or insects. However, they can melt or soften at temperatures above 60-70 °C. In addition, due to their airtightness, poor hygienic performance is inevitable. It may provoke the development of an unfavorable storage environment for paper-based documents (see, for example, Ref. [37]).
The topic of creating new materials for the reproduction and creation of long-term storage documents seems promising but requires a lot of interdisciplinary research.
However, this technology ($mth_1$) is preparatory for the "digitalization" of the document: the procedure necessary to create a digital twin.
By the "digitalization" of the document, we mean the embedding of various digital microchips into traditional and new durable materials of documents to control the storage conditions and to store information about the document. "Digitalization" can affect traditional media such as paper documents, document folders, document storage boxes, and the shelving on which documents are placed. Of course, "digitalization" should also affect the media of electronic documents, video, audio, etc. In fact, we are talking about creating long-term storage for documents in a cyber-physical system. In this system, there must be data exchange between the document digital twin ($md_2$) and the archive digital twin ($md_1$), i.e., IIoT technologies should be used.
From the point of view of specific technological solutions, it seems promising to use digital microdevices with the possibility of radio frequency identification, or RFID ($mth_3$). Such devices may include RFID tags. Depending on the targets and the information they contain, RFID tags can be RO (read only), WORM (write once read many), or RW (read and write). From the point of view of a durable document, the RO or WORM types can be used to record data about the document to preserve authenticity, and RW can be used to record data about the current state of the document.
The advantages of RFID tags (see [38] for details) include their compactness and relative cheapness, as well as the possibility of wireless contact with the digital twin of a long-term storage archive (distance up to 100 m) and high-speed processing (up to 200 tags per second).
The disadvantages are their relative fragility (they operate for a maximum of 10 years), small memory capacity (no more than 512 Kb), and loss of performance due to mechanical damage (see, for example, Ref. [39]).
Another problem with using this technology is that RFID tags are not applicable to all stored documents. For example, it is risky to digitalize worn-out and old documents because their loss in the process of such "digitalization" is quite possible.
However, one may reproduce such documents using 3D/4D printing of the new document. For example, in 4D printing, a printed object changes shape or properties in response to a certain stimulus, allowing the object to be printed under certain conditions along with an embedded RFID tag or microchip ($mth_4$). At the same time, the reproduction of especially valuable documents is possible once every 10 years or in case of failure of the RFID tag or microchip (the failure is detected by the lack of communication with the digital twin of the long-term storage archive ($md_1$)).
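The reproduction rule described above (reproduction once every 10 years, or earlier if the tag stops communicating with the archive digital twin) can be sketched as a simple check. This is only an illustrative sketch: the function name, the 24-hour silence threshold, and the use of timestamps are assumptions made for the example, since the text does not specify how long a loss of communication must last before it counts as a failure.

```python
from datetime import datetime, timedelta

# "Once every 10 years" comes from the text; leap days are ignored for simplicity.
REPRODUCTION_INTERVAL = timedelta(days=365 * 10)
# Assumed threshold: how long a tag may stay silent before it is treated as failed.
CONTACT_TIMEOUT = timedelta(hours=24)


def needs_reproduction(tag_installed, last_contact, now):
    """Return True if the document should be scheduled for reproduction.

    tag_installed: when the RFID tag or microchip was embedded
    last_contact:  last successful exchange with the archive digital twin
    now:           current time
    """
    age_exceeded = now - tag_installed >= REPRODUCTION_INTERVAL
    tag_silent = now - last_contact > CONTACT_TIMEOUT
    return age_exceeded or tag_silent
```

In an archive digital twin, such a check would presumably run periodically over all tagged documents, with the silence threshold tuned to the polling interval actually used.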

Technological Solutions for Creating a Digital Twin of the Archive
The creation of a digital twin of a long-term storage archive ($md_1$) will require the creation of a virtual environment in which AR and VR technologies will be used with the aid of software and hardware.
To organize such an environment, it will be necessary to create mathematical control models ($MCntl = \{mc_i\}$) of a real storage environment with real storage conditions $Con = \{dc_i\}$ and develop technological solutions ($MTech = \{mth_i\}$) to implement the actual virtual environment.
The following control models and technologies can be proposed for organizing a virtual environment:

1. Control of distance standards $m_d = \{m_{di}\}$ for the stacks (stationary and mobile) in comparison with real distances ($m = \{m_i\}$).
So, for example, the distance between the rows of stacks (main passage): $m_1 \ge m_{d1} = 120$ cm. The distance between the outer wall of the building and the stacks parallel to the wall: $m_2 \ge m_{d2} = 75$ cm. The distance between the wall and the end of the stack: $m_3 \ge m_{d3} = 45$ cm. The distance between the ceiling and the top shelf of the stack: $m_4 \ge m_{d4} = 50$ cm. The distance from heating devices: $m_5 \ge m_{d5} = 100$ cm. This is necessary to reduce the impact of the "crowding" of the documents ($dd_5$).
In general, the control model ($mc_1$) is $m_i \ge m_{di}$ for all $i$. The measurement of real distances ($m = \{m_i\}$) is possible with the help of laser rangefinders located on the stacks ($mth_5$).
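The distance control model can be illustrated with a short Python sketch that compares measured distances with the minimum values quoted above. The dictionary keys and the decision to treat a missing measurement as a violation are assumptions made for this example.

```python
# Minimum distances m_di in centimetres, as quoted in the text.
# The key names are illustrative and not taken from the source.
MIN_DISTANCES_CM = {
    "main_passage": 120,
    "outer_wall_to_parallel_stack": 75,
    "wall_to_stack_end": 45,
    "ceiling_to_top_shelf": 50,
    "heating_device": 100,
}


def distance_violations(measured_cm):
    """Return the names of distances for which m_i >= m_di does not hold.

    measured_cm maps a distance name to the value reported by a rangefinder.
    A missing measurement is reported as a violation (an assumption of this sketch).
    """
    violations = []
    for name, minimum in MIN_DISTANCES_CM.items():
        value = measured_cm.get(name)
        if value is None or value < minimum:
            violations.append(name)
    return violations
```

An empty result means that all five distance standards are satisfied for the measured stack layout.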

2. Control of ventilation in the archive room.
For example, ventilation ($dd_4$) can be measured simply by using an anemometer ($mth_6$) to measure the airflow velocity $V$ (m/s) from an air conditioning or air circulation system. Then, in the archive digital twin simulation environment ($md_1$), the volume of airflow in the $i$-th archive location can be calculated as ($mc_2$):
$L_i = 3600 \times F_i \times V_i$,
where $L_i$ is the air consumption (m³/h), 3600 is the conversion factor (number of seconds per hour), $V_i$ is the air flow velocity (m/s), and $F_i$ is the cross-sectional area of the exit of the air flow (m²). With a known volume of the $i$-th archive room $Q_i$ (m³), the condition $L_i \ge \frac{2}{3} Q_i$ (see requirements [6]) must hold for every archive room $i$.
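A worked example of the ventilation model may help. The following Python sketch evaluates the air consumption formula and the two-thirds condition for a single room; the function and parameter names are assumptions introduced for illustration.

```python
def air_flow_m3_per_hour(outlet_area_m2, velocity_m_s):
    """L_i = 3600 * F_i * V_i, air consumption in cubic metres per hour."""
    return 3600 * outlet_area_m2 * velocity_m_s


def ventilation_ok(outlet_area_m2, velocity_m_s, room_volume_m3):
    """Check the condition L_i >= (2/3) * Q_i for one archive room."""
    required = (2 / 3) * room_volume_m3
    return air_flow_m3_per_hour(outlet_area_m2, velocity_m_s) >= required
```

For instance, an outlet of 0.25 m² with a measured velocity of 2 m/s gives 1800 m³/h, which satisfies the condition for a 300 m³ room (200 m³/h required) but not for a 3000 m³ room (2000 m³/h required).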

3. Temperature control in the archive.
The requirements for the temperature regime ($dd_2$) are quite strict; for example, according to [6], the air temperature parameters for documents on paper should be 17-19 °C. For details on other types of documents (photo, film, video, audio, etc.), see Section 5 in [6].
To comply with such a strict regime, no less strict control of the entire temperature field on the premises of the archive must be ensured. To control and simulate (AR, VR) changes in the temperature field, temperature sensors ($mth_7$) are required; they must be protected from accidental contact and located on three levels (bottom-middle-top) for all storage racks (devices). In a more complex case, it is possible to use infrared radiation sensors or thermal imagers ($mth_8$), which will more accurately simulate the temperature field of the room, but such a solution will be more expensive.
Then, the control model in the simplest case can be as follows ($mc_3$): $t_{dimin} \le t_{ijk} \le t_{dimax}$, where $t_{ijk}$ is the temperature indicator measured by the $j$-th sensor for the $i$-th type of document ($dt_i$) at the control time $k$ (if there are no other considerations, then the interval between the control time points $k$ can be selected from the series 1, 2, 4, 8, 12, 24 h), and $t_{dimin}$ and $t_{dimax}$, respectively, are the minimum and maximum values of the temperature standard for the $i$-th type of documents ($dt_i$).
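The temperature control model can be illustrated by a minimal Python sketch that filters sensor readings falling outside the permitted range. The `Reading` structure, its field names, and the default limits of 17-19 °C (the values quoted for paper documents) are assumptions of this example; other document types would require their own limits from [6].

```python
from typing import NamedTuple


class Reading(NamedTuple):
    """One temperature measurement t_ijk (illustrative structure)."""
    sensor_id: str   # j-th sensor
    level: str       # "bottom", "middle" or "top" position on the rack
    hour: int        # control time k
    value_c: float   # measured temperature in degrees Celsius


def out_of_range(readings, t_min=17.0, t_max=19.0):
    """Return the readings that violate t_min <= t_ijk <= t_max."""
    return [r for r in readings if not (t_min <= r.value_c <= t_max)]
```

Keeping the rack level and the control time with each reading makes it possible to see where and when a deviation occurred, which is the information a simulated temperature field would need.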

4. Humidity control in archive rooms.
The requirements for humidity ($dd_3$) on the premises of the archive are also strict. For example, according to [6], the air humidity parameters for documents on paper should be 50-55%.
As for temperature control, to control and simulate (AR, VR) changes in humidity, sensors ($mth_9$) are required; they must be protected from accidental contact and located on three levels (bottom-middle-top) for all storage racks (devices).
Then, the control model in the simplest case can be as follows ($mc_4$): $f_{dimin} \le f_{ijk} \le f_{dimax}$, where $f_{ijk}$ is the humidity index measured by the $j$-th sensor for the $i$-th type of document ($dt_i$) at the control time $k$ (if there are no other considerations, then the interval between the control time points $k$ can be chosen from the series 1, 2, 4, 8, 12, 24 h), and $f_{dimin}$ and $f_{dimax}$, respectively, are the minimum and maximum values of the air humidity standard for storing the $i$-th type of documents ($dt_i$).

5. Control of illumination on the premises of the archive.
Lighting requirements (dd 1 ) are also very strict and varied. In the simplest case, according to [6], the illumination for paper documents must not exceed 20-50 lux on the vertical surface of the stack and 100 lux on the horizontal surface.
To control the illumination, at least two lux meters on each stack (one for vertical and one for horizontal surfaces) are sufficient. This technology (mth 10 ) will simulate the illumination of the room in real time.
The control model in the simplest case can be as follows (mc 5 ): $0 = e_{d_i,\min} \le e_{ijk} \le e_{d_i,\max}$, where $e_{ijk}$ is the illumination measured by the j-th sensor for the i-th type of documents (dt i ) at the control time k (if there are no other considerations, the interval between the control time points k can be chosen from the series 1, 2, 4, 8, 12, 24 h), and $e_{d_i,\min}$ (in the general case, it is taken as equal to 0, because the storage of all types of documents occurs in the dark) and $e_{d_i,\max}$ are, respectively, the minimum and maximum values of the illumination standard for storing the i-th type of documents (dt i ).
Of course, we have considered the simplest case. For more accurate control, it is necessary to monitor the level of ultraviolet (UV) radiation in the premises (W/m 2 ), which will be a more expensive solution but will allow more accurate modeling of processes in documents. In this case, it is necessary to use UV energy/radiation meters (mth 11 ) as control devices.
In principle, for modeling the main destructive factors (Des = {dd 1 , dd 2 , dd 3 , dd 4 , dd 5 }) when storing non-electronic documents, the technologies listed are quite sufficient.
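The range-type control models above (temperature, humidity, illumination) share one form: a reading is acceptable when it lies between the minimum and maximum of the standard for the given document type. A minimal Python sketch of such a check follows. The names `Band`, `Reading`, `LIMITS`, and `violations` are illustrative and not part of the article; the humidity and illumination limits for paper are the values quoted above from [6], while the temperature limits are placeholders assumed only for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Band:
    """Permitted range [low, high] for one factor and one document type."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


@dataclass(frozen=True)
class Reading:
    doc_type: str   # i: document type (dt_i)
    sensor: str     # j: sensor identifier
    hour: int       # k: control time point
    factor: str     # "temperature", "humidity" or "illumination"
    value: float


# Limits keyed by (document type, factor).
LIMITS: Dict[Tuple[str, str], Band] = {
    ("paper", "humidity"): Band(50.0, 55.0),      # %, as quoted from [6]
    ("paper", "illumination"): Band(0.0, 100.0),  # lux, horizontal surface, from [6]
    ("paper", "temperature"): Band(17.0, 19.0),   # deg C, ASSUMED placeholder values
}


def violations(readings: Iterable[Reading]) -> List[Reading]:
    """Return the readings that fall outside their permitted band."""
    return [r for r in readings
            if not LIMITS[(r.doc_type, r.factor)].contains(r.value)]


readings = [
    Reading("paper", "rack1-top", 0, "humidity", 52.0),
    Reading("paper", "rack1-bottom", 0, "humidity", 58.5),
    Reading("paper", "rack1-top", 0, "illumination", 0.0),
    Reading("paper", "rack1-middle", 0, "temperature", 21.0),
]
bad = violations(readings)
for r in bad:
    print(f"{r.factor} out of range at {r.sensor}: {r.value}")
```

In a digital twin of the archive, a list of this kind could be what the sensors report at each control time point; how the twin reacts to a violation is outside the scope of this sketch.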
The proposed technologies (MTech = {mth i }) should be used not only for direct measurements but also for transferring measurement results directly to the digital twin of a long-term storage archive (md 1 ), i.e., the digital twin of the document (md 2 ) should directly interact with the digital twin of the archive (md 1 ) according to the above models (MCntl = {mc i }), and the transmitted parameters should provide modeling and control of storage conditions.
Storage conditions must also be managed, i.e., the digital twin of the long-term archive (md 1 ) must be able to control the "climate" (storage conditions) in the archive premises. This will allow not only the monitoring but also the management of the conditions for storing documents. This procedure, in turn, can significantly improve the preservation of documents.
Thus, the proposed technologies and control models should serve as a basis for developing models of processes occurring in stored documents (MPrc = {mp i }). Such processes will be the subject of further research by the authors of this paper.
Figure 1 shows a conceptual diagram of the use of digital twin technology in solving the problem of organizing long-term storage of documents. This problem can be considered an optimal control problem in an unstable environment, where a document is an object of control.

Development of Digital Twin Technology in Long-Term Storage
Looking ahead, the further development of digital twin technologies and, more broadly, of artificial intelligence methods for organizing the long-term storage of documents can proceed as follows in terms of control models (MCntl = {mc i }) and technological solutions (MTech = {mth i }).

Control of mold, insects, and dust (dd7).
As for technological solutions for organizing control, the obvious choice is recognition tools (mth 2 ), i.e., cameras (including cameras of mobile devices) and software tools that recognize problems and transfer the data to the digital twin of the archive (md 1 ).
The big problem in this case is that the documents are stored in low light or even in the dark in a closed container. Low illumination makes optical recognition extremely difficult, whereas darkness makes recognition impossible. The only accurate way to detect the presence of dust would be to use AI tools, namely, recognition.
Options for solving the problem of low light can be as follows:
• Image recognition in the infrared range using infrared video cameras (thermal imagers) (mth 13 ). In this case, one can count on sufficiently high-quality problem detection.
• Modeling the possibility of a problem arising (mold, insects, bacteria) in the digital twin (md 2 ) using the previously considered technologies for controlling ventilation, humidity, and temperature (mth 6 , mth 7 , mth 8 , mth 9 ). In this case, the quality of problem detection will be significantly lower: instead of detection, a forecast of the presence of a problem zone will be issued. However, the cost of the solution will also be significantly lower.
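The second option replaces detection with a forecast. One very simple way to picture this, which is not taken from the article, is to flag a storage zone when its humidity stays above the permitted maximum for several consecutive control points. In the sketch below the function name `zones_at_risk` and the run length of three readings are assumptions made for illustration; the 55% limit is the upper humidity bound for paper quoted earlier from [6]. A real model of mold or insect risk would need calibrated parameters.

```python
from typing import Dict, List


def zones_at_risk(humidity_by_zone: Dict[str, List[float]],
                  max_humidity: float = 55.0,   # %, upper limit quoted from [6]
                  min_run: int = 3) -> List[str]:  # ASSUMED run length
    """Flag zones whose humidity exceeded the limit for min_run readings in a row."""
    flagged = []
    for zone, series in humidity_by_zone.items():
        run = 0
        for value in series:
            # Extend the current run of excess readings, or reset it.
            run = run + 1 if value > max_humidity else 0
            if run >= min_run:
                flagged.append(zone)
                break
    return flagged


history = {
    "rack-A": [52.0, 56.0, 54.0, 57.0, 53.0],   # isolated excursions only
    "rack-B": [54.0, 56.5, 58.0, 59.0, 57.0],   # sustained excess
}
print(zones_at_risk(history))
```

The output is a list of zones to inspect, not a list of damaged documents, which matches the lower accuracy and lower cost attributed to this option.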
to remotely monitor and control storage in an automated mode. The interacting objects are connected by computer networks. This, in turn, corresponds to the generally recognized definition of the IIoT: "IIoT is a system of interconnected computer networks and objects connected to them with embedded sensors and software for data collection and exchange, with the possibility of remote control and management in automated mode".

Efficiency of Using Digital Twin Technology in Long-Term Storage
Assessing efficiency is always difficult. Moreover, digital twin technologies are currently not fully implemented in long-term storage. For a possible assessment of efficiency, we can simulate how the indicators considered in Section 3, namely the average document recovery time (T drv ) and the probability of document recovery (P (T dri )), can change. As can be seen from the definition of T drv in Section 3, the average recovery time (T drv ) is the sum of the average time to detect a problem (let us denote it as T drv1 ) and the average time to restore a damaged document (let us denote it as T drv2 ), or T drv = T drv1 + T drv2 .
To estimate T drv1 without using digital twin technology, one may refer to the regulatory documents governing the rules for long-term storage. According to the regulations given in [6], measurements of the temperature and humidity parameters of the air environment are carried out at the same time of day: in a conditioned archive, once a week; in an unconditioned archive, twice a week; in the case of non-compliance of parameters with regulatory requirements, once per day. Thus, the mathematical expectation of the detection time of the problem area in the archive ranges from 1.75 to 3.5 days (or from 42 to 84 h).
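The 42 to 84 h range follows if a problem is assumed to be equally likely to arise at any moment between two consecutive measurements, so that the expected wait until the next measurement is half the measurement interval. This uniform-arrival assumption is made here to reproduce the figures and is not stated in the text; the function name is illustrative.

```python
def expected_detection_hours(checks_per_week: float) -> float:
    """Expected delay until the next scheduled check, assuming the problem
    starts at a uniformly random moment between two checks."""
    interval_hours = 7 * 24 / checks_per_week
    return interval_hours / 2


print(expected_detection_hours(1))  # conditioned archive: one check per week
print(expected_detection_hours(2))  # unconditioned archive: two checks per week
```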
When digital twin technology is used, the detection time of the problem area in the digital twin of the archive (md 1 ) is almost equal to 0. In this case, if T drv1 = T drv11 + T drv12 , where T drv11 is the time of detection of the problematic storage area and T drv12 is the time of checking the document, then T drv11 is excluded from the formula for calculating T drv1 . Even if we assume that the ratio T drv11 : T drv12 = 1:1 (usually, checking takes much less time than 84 or 42 h, i.e., T drv11 > T drv12 ) and the ratio T drv1 : T drv2 = 1:1 (usually, however, T drv1 < T drv2 ), T drv will decrease by about 25%, i.e., the efficiency in terms of T drv will increase by a factor of about 1.33 (1.3(3)).
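The arithmetic behind the 25% figure can be written out as follows, under the stated 1:1 assumptions. The normalization of the baseline T drv to 1 and the superscript DT for the value with a digital twin are notational conveniences introduced here, not the authors' notation.

```latex
\begin{align*}
T_{drv} &= T_{drv1} + T_{drv2}, \qquad T_{drv1} = T_{drv11} + T_{drv12},\\
T_{drv1} &= T_{drv2} = \tfrac{1}{2}, \qquad T_{drv11} = T_{drv12} = \tfrac{1}{4},\\
T_{drv}^{\mathrm{DT}} &= T_{drv12} + T_{drv2} = \tfrac{1}{4} + \tfrac{1}{2} = \tfrac{3}{4},\\
\frac{T_{drv}}{T_{drv}^{\mathrm{DT}}} &= \frac{1}{3/4} = \frac{4}{3} \approx 1.33 .
\end{align*}
```

The recovery time therefore falls by one quarter, which is the same statement as a 4/3-fold gain.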
If 3D/4D printing technology is used as a recovery measure to create a durable copy of the original document, then the efficiency will increase by about three times due to the practical exclusion of T drv11 and T drv2 from the calculation of T drv .
But the indicator T drv also depends on the indicator of the probability of document recovery (P (T dri )) according to the relation (see Section 3).
In order to evaluate the indicator P (T dri ), it is necessary to estimate the probability of recovering a document in a regular archive without using digital twin technologies.
According to [6], "archival documents (selectively, but not less than 0.01% of the total number of storage units) and the archive at least twice a year (at the beginning and end of the heating season) are subject to entomological and mycological examination." That is, it is assumed that no more than 1 in every 10,000 storage units can theoretically be damaged. But, with selective control, the probability of detection and recovery will lie in the range of 0 to 1 (if the selective control was so successful that all problematic documents were found), with a mathematical expectation of 0.5. That is, for example, if there is one damaged document for every 1000 storage units, then the probability of its detection and recovery will be no more than 0.1 (the mathematical expectation is 0.05).
In the digital twin of an archive (md 1 ), the problem zone detection time is practically equal to 0. Complete rather than selective control of the entire volume of storage units is achieved. Consequently, efficiency will increase at least two times and, most likely, significantly more because control is carried out not twice a year but online. Due to this, the pinpoint detection of problematic documents is possible (thus, the processes of destruction cannot go too far).
Due to the growth of the indicator P (T dri ), the final indicator T drv will improve by a factor of 2.67 (2.6(6)) to 6. Storage efficiency is thus estimated to increase by a factor of at least 2.67 and up to 6. In addition, the use of digital twins allows us to simulate possible processes in documents and predict the storage time depending on the actual storage conditions.
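The two bounds appear to be the time-based gains quoted earlier (4/3 from real-time monitoring and about 3 with 3D/4D printing), each multiplied by the at-least-twofold gain from complete rather than selective control. This reading is an interpretation and is not stated explicitly in the text; the sketch below only reproduces the arithmetic under that assumption, with illustrative variable names.

```python
from fractions import Fraction

time_gain_monitoring = Fraction(4, 3)    # excluding T_drv11 under the 1:1 assumptions
time_gain_with_printing = Fraction(3)    # "about three times", as stated in the text
probability_gain = Fraction(2)           # "at least two times" from complete control

low = time_gain_monitoring * probability_gain
high = time_gain_with_printing * probability_gain
print(float(low), float(high))
```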
As a result, it is possible to respond effectively to emerging destructive influences and damage. Restoration and verification work is carried out not selectively but on those storage objects for which traces of destruction or violations of storage conditions have been identified.
These preliminary calculations show a significant increase in the efficiency of document preservation, even when it is achieved only by monitoring the storage environment in real time.

Discussion of the Digital Twin Technology Application and Limitations
As was previously shown in Section 2, the task of introducing digital twin technology to solve the problem of the long-term storage of documents has not been discussed in the literature. Nevertheless, the approaches proposed here for modeling the environment for the functioning of digital twins [10,11] and the use of RFID technology for this [38] are considered by many researchers to be drivers for the implementation of digital twins [9,10,39]. The presented study is the first attempt to systematize the problems of creating digital twins for archival paper documents.
The experience of the authors in the development of large projects of electronic archives, including the electronic archive of personalized accounting for the Pension Fund of the Russian Federation, the electronic archive of Gazprombank JSC, Law Firm Gorodissky and Partners LLC, Cognitive Technologies LLC, and participation in the development of regulatory documents (such as the "Recommendations for the acquisition, accounting and organization of storage of electronic archival documents in the archives of organizations" for the All-Russian Scientific And Research Institute For Records And Archives Management (VNIIDAD)), confirms the high relevance of using digital twin technologies in organizing long-term storage.
In the development of the above archival solutions, technologies for monitoring the safety of electronic documents for a long period of storage, described in Section 5 and in the author's work [30], were partially used.
The following issues are intentionally not addressed in the paper:
1. The choice of a specific technology for digital data storage and digital twins. From the point of view of the presented concept, it does not matter what kind of storage system is used. Digital twins can be stored in a distributed system (if, for example, the archive is multilevel or territorially distributed), a centralized system (if the localization of the archive by geographical location is clearly defined), or a cloud-based system (if, for example, the placement of hardware and software in the archive premises is impossible due to non-compliance or violation of requirements for the placement of technical means, lack of premises, insufficient power supply, etc.).
2. Consideration of specific technologies of distributed storage (for example, such as HDFS, Chord, or HBase) or cloud storage.
3. Issues of the cost of developing a long-term archival co-preservation solution. It is initially assumed that the cost of creating a digital archive is always significant. However, if we are talking about the preservation of especially valuable documents, the loss of which may be in some sense irreplaceable, then the question of costs should be resolved when creating such an archive on the basis of an expert examination of the value of stored documents.

Conclusions and Future Work
In this study, the possibility of using digital twin technologies to solve the problem of organizing the long-term storage of documents is considered. A digital twin in this respect differs from a digital copy, which has no connection with the real original object.
The realization of digital twin technology requires interaction and data exchange between objects with built-in sensors and software, with the possibility of remote control and management of storage in an automated mode (see Section 5). In this case, the interacting objects are connected by computer networks. Consequently, according to the authors, the IIoT is the key technology for solving the task at hand. In addition, the information interaction described above actually corresponds to the generally recognized definition of the IIoT.
The main results of this study are as follows:
1. A brief review of the problem is made. It is stated that, despite the active use of digital twin technology in various fields of activity, its application to the long-term storage of documents remains undeveloped. At the same time, the task of increasing the durability of documents during long-term storage remains important and largely unsolved. This statement applies to both physical and especially electronic documents that are in an unstable digital storage environment (see [30] for more details).
2. The authors were among the first to show the complexity of solving the problem of long-term storage of documents and completed the formal formulation of the problem of long-term storage using digital twin technologies. Destructive factors that affect documents in long-term storage and significantly reduce their durability are systematized.
3. A system of indicators is proposed to assess the durability (preservation) of documents. The modeling of the use of digital twin technology in the organization of long-term preservation within the framework of Industry 4.0 was carried out. In the course of modeling, the goals of long-term storage and strategies for long-term storage were established. Primary mathematical models for controlling destructive factors have been developed, and technologies for organizing long-term storage have been proposed.
4. A preliminary assessment of the effectiveness of using digital twin technology for long-term storage has been performed and shows a significant increase in efficiency when using digital twins. Based on these estimates, the practical application of the presented approaches can significantly extend the storage life of important documents.
The formalized approach proposed in the article will help to more fully define the requirements and criteria for the success of the implementation of digital twins of documents in the archives and can also serve as a roadmap for the development of the industry in solving the problem of long-term storage.
In further research, it is planned to explore in more detail possible ways for the development of digital twin technologies. We plan to refine the mathematical models of safety indicators; to work out the possibilities of using each of the presented technologies; to develop mathematical models of control and refine those presented in the article; and to refine the estimates of the effectiveness of the use of digital twin technologies.
In the course of further research, it is also planned to formulate a general statement of the problem of using digital twin technology to solve the problem of organizing long-term storage of documents as an optimal control problem in an unstable environment.
In the future, given the public good of the preservation of the valuable documents of the past (electronic, paper, and others), the authors see the need for widespread use of the digital twin technologies described in the article in the organization of long-term storage.
• A set of mathematical models for monitoring storage conditions: MCntl = {mc i };
• A set of digital twin models: MDTwn = {md i };
• A set of ways and strategies to ensure long-term storage: MStg = {mst i };
• The set of models of processes occurring in stored documents: MPrc = {mp i };
• A set of technological solutions for long-term storage: MTech = {mth i };
• The set of performance indicators for long-term storage: MQual = {mq i }.
Now, we should consider in more detail the choice of indicators for the effectiveness of long-term storage. There are no special indicators for long-term storage, except for keeping the shelf life. Therefore, the authors propose a system of indicators similar to reliability indicators in technical systems according to the main international standards:
• IEC 60300-3-11: Dependability management. Part 3-11: Application guide: Reliability-centered maintenance.
• IEC 60319: Presentation and specification of reliability data for electronic components.
• ISO 2394: General principles on reliability for structures.
• GOST 27.002-2015: Interstate standard. Reliability in technology.
Such indicators may include the following:
• Initial safety of the document: the safety of the document at the time of transfer to long-term storage (SV 0i ). Value range [0, 1]. Characterizes the initial state of the document d i . Accordingly, for all types of documents (Typ = {dt i }):

− The complexity of solving the problem of long-term storage is considered;
− The formal statement of the problem of long-term storage using digital twin technology is performed;
− Destructive factors acting on documents and essentially reducing their durability are revealed;
− Modeling of the use of digital twin technology in the organization of long-term preservation is carried out;
− The goals of long-term preservation are established;
− Strategies for long-term preservation are established;
− Primary mathematical models of the control of destructive factors are proposed;
− Technological solutions for the organization of long-term storage for the realization of digital twins are offered;
− The efficiency of using digital twin technology for solving the problem of long-term preservation is evaluated;
− Spheres of application and further avenues of research are defined.

Figure 1. Conceptual diagram of the use of digital twin technology.


Author Contributions: Conceptualization, A.S.; methodology, A.S.; validation, I.T.; writing-original draft preparation, I.T.; writing-review and editing, I.T. All authors have read and agreed to the published version of the manuscript.
Funding: This article was prepared at the State Academic University of the Humanities within the framework of the state assignment of the Ministry of Science and Higher Education of the Russian Federation (topic No. FZNF-2023-0004).

Table 1. Durability of information carriers.

Table 2. Negative factors of document storage.