Atomicity Violation in Multithreaded Applications and Its Detection in Static Code Analysis Process

Abstract: This paper contributes to research on parallel computing as used in multithreaded applications. It discusses the characteristics of atomicity violation in multithreaded applications and develops a new definition of atomicity violation, based on previously defined relationships between operations, that can be used for atomicity violation detection. A method for detecting conflicts that cause atomicity violation was also developed, using a source code model of multithreaded applications that predicts errors in the software.


Introduction
Software errors have existed as long as software itself. The cause of most errors in multithreaded applications is the so-called resource conflict, and the result is undesirable phenomena such as race condition. Race condition and deadlock are by far the most common phenomena that occur in multithreaded applications. Compared to atomicity violation, race condition and deadlock are easy to locate: they result directly from the code structure of the application [1,2], as opposed to atomicity violation, which is detailed in Section 4. Studies on atomicity violation therefore had to start from less complicated phenomena, and race condition and deadlock were the best case study. Race condition and atomicity violation belong to the same type of phenomena, and incorrectly fixing either of them may lead to deadlock. In other words, resolving the problems of race condition and deadlock allows for a better understanding of how to fix data race errors without adding new ones.
Multithreading is one of the basic mechanisms for parallel computing, which is possible on one processor and on several processors [3,4]. It is used very often in utility applications (mentioned in the rest of the work), but also in scientific research [5]. Popular, commonly used open-source multithreaded applications are a very good source for research on undesirable phenomena in multithreaded applications. In the popular Apache server, atomicity violations have been located several times; examples concerning buffer handling are described in [6,7]. In the worst case, the effect of an incorrectly written function was the unexpected termination of the application. The situation was similar for the MySQL relational database management system: the state of the log file was stored in a global variable, which was used by several operations [6] constituting a logical whole.
The problem of atomicity violation concerns not only open-source programs. Various JDK packages were also tested, and in version 1.4.2 it was discovered that the implementation of five popular containers had errors that caused atomicity violation.

Description of the Phenomenon
The phenomenon of atomicity violation is a result of an inconsistent order of access to data [11], as shown in Figure 1. Most often it refers to a situation in which the programmer intended two instructions to execute in series, excluding other instructions with access to a shared memory area, and yet another instruction affects the memory area used by these two instructions [12]. In this work, atomicity violation is a phenomenon in which the execution of a pair of operations (forming a logical whole) on a shared resource is violated by an operation of another thread operating on that resource. These operations do not have to occur directly after each other in the source code of the program, i.e., there may be other operations between them that do not affect the shared resource in any way. An operation is to be understood as any C language instruction.
The phenomenon of atomicity violation can be described as a situation in which there is a scenario of two threads working, the course of which deviates from the scenario expected by the programmer.
The phenomenon of atomicity violation is closely related to the phenomenon of race condition, because in both cases the threads should be mutually exclusive when working on a shared resource. The exclusion of threads is achieved by placing the operations performed on the resource in a critical section, understood as the set of operations performed between locking and unlocking a mutex.
In [13], the atomicity violation phenomenon is also called "high-level data races". In the case of atomicity violation, however, the critical section should include the whole pair of operations carried out on a shared resource. In both cases, the order in which the operations are performed affects the final result. It can also be said that race condition is a special case of atomicity violation. This is because there are instructions in the C and C++ languages which are atomic operations at the programming language level, i.e., they consist of one language instruction (e.g., an incrementation operation). However, the situation changes when the code is compiled. The time needed to retrieve a word from memory to cache and then to a register can be very long in processor terms. At best, about 500 instructions can be executed before the value reaches the register, and it takes the same amount of time to pass the new value to its destination [15]. When atomicity violation occurs at the programming language level, the symptoms may include:
Incorrect application behavior.
In works on atomicity violation, a thread in which a pair of operations is performed (to form a logical whole) is called local, and a thread with a violating operation is called the remote thread.
Chew and Lie [16], describing the phenomenon of atomicity violation, stress that this phenomenon may be required, harmless, or undesirable. An example of a required violation of atomicity is a situation in which two threads communicate by passing information using a shared resource.
This example is a part of the scenario described in the first column of the second row of Table 1. However, this example is incorrect, because the pair of operations of the local thread requires a write operation in the remote thread to work correctly. To sum up, the example scenario described in [16] does not meet one of the conditions of atomicity violation, i.e., the remote operation does not interfere with the work of the local pair of operations. On the contrary, if the writes of the remote thread are not performed, the local thread will be affected. Therefore, it seems reasonable to question the claim that there are scenarios in which atomicity violation is required. If two operations on the indicated memory area are to be performed without any interference, the interference may be neutral or undesirable. If the interference is required, its non-occurrence causes the application to malfunction, but it is not an atomicity violation. It is worth adding, however, that the discussed example will be taken into account during static analysis of the application code, as a result of which a false-positive error will be reported.
Table 1. Scenarios of the atomicity violation. Source [16].
In the program described below, called AV1 (https://bit.ly/2QBlVOC), there is a resource conflict causing atomicity violation. The usleep operation present in the application, which suspends the thread for 100 milliseconds (the equivalent of request handling in a server-type application), is intended to increase the probability of observing atomicity violation: this probability grows with the execution time of the operations that lie between the pair of operations performed on the shared resource and forming a logical whole. According to the definition, the logically atomic pair of operations that should not be affected consists of incrementing the shared resource r1 and printing the value of this resource on the standard output. These operations are not in the same critical section, which leads directly to an atomicity violation. In this case, the unwanted behavior is that the user console displays the same value of the shared resource twice. This phenomenon is undesirable when the application user can view all values printed on the standard output, e.g., on the console. However, if these values were displayed, for example, on a seven-segment display, the user might not notice that the same value is displayed twice. Therefore, whether the phenomenon of atomicity violation is neutral or undesirable may depend on how a given application is used.
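The pattern described for AV1 can be sketched as follows. This is a hypothetical reconstruction for illustration, not the AV1 source: the names `worker`, `run_av1_sketch`, and `handle_request` are assumptions, `nanosleep` stands in for the `usleep` call, and the value that would be printed is stored in `seen` instead. With `atomic_pair` set to 0 the increment and the read sit in two separate critical sections, so both threads will typically observe the same value; with `atomic_pair` set to 1 the pair shares one critical section.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static int r1;                          /* shared resource */

struct worker_arg {
    int atomic_pair;                    /* 1: keep the pair in one critical section */
    int seen;                           /* value the thread would print */
};

/* Stand-in for request handling: sleep for 100 ms. */
static void handle_request(void)
{
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 100000000L };
    nanosleep(&delay, NULL);
}

static void *worker(void *p)
{
    struct worker_arg *arg = p;

    pthread_mutex_lock(&m);
    r1++;                               /* first operation of the pair */
    if (!arg->atomic_pair)
        pthread_mutex_unlock(&m);       /* pair split: first critical section ends */
    handle_request();
    if (!arg->atomic_pair)
        pthread_mutex_lock(&m);         /* pair split: second critical section begins */
    arg->seen = r1;                     /* second operation: value to print */
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Runs two workers and reports the value each observed. Returns 0 on success. */
int run_av1_sketch(int atomic_pair, int seen[2])
{
    pthread_t t[2];
    struct worker_arg args[2] = { { atomic_pair, 0 }, { atomic_pair, 0 } };
    int created = 0;

    r1 = 0;
    for (; created < 2; created++)
        if (pthread_create(&t[created], NULL, worker, &args[created]) != 0)
            break;
    for (int i = 0; i < created; i++)
        pthread_join(t[i], NULL);
    if (created < 2)
        return -1;
    seen[0] = args[0].seen;
    seen[1] = args[1].seen;
    return 0;
}
```

Calling `run_av1_sketch(1, seen)` always yields two distinct values, because no other thread can modify `r1` between the increment and the read. The split variant is not guaranteed to show the duplicate on every run, which is exactly why the delay is used to make it more likely.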

Existing Solutions
There are two different patents for detecting atomicity violations. The first one consists in the analysis of processor instructions and simulation of selected scenarios [17]. The second one describes a method of detecting the atomicity violation phenomenon called "access interleaving invariants" (AII) [12]. The method described there was originally implemented in AVIO [18] and SVD tools [19], and a few years later it was also used in the CTrigger framework [7].
Park et al. [7] also criticized the methodology of searching for atomicity violation through stress tests, indicating its determinism as a disadvantage. Although the non-determinism of multithreaded applications is only apparent, it is precisely this apparent non-determinism that makes it impossible to cover all possible states of a multithreaded application with deterministic tests.
Static code analysis for detecting atomicity violations has also been criticized, on the grounds that these methods report false-positive errors in many places [7,20]. As a solution to this problem, it has been suggested to perform tests or use separate tools to investigate false-positive errors [7]. It was also pointed out that there are scenarios in which CTrigger requires huge resources and has to work many hours to detect the phenomenon of atomicity violation. For example, 100 different 1-h (within CTrigger) input tests and 10 different configurations would take CTrigger 2 days to complete testing on 20 machines, while stress tests would take between 20 and even 2000 days to achieve a similar ability to expose atomicity violation, which is far too long to be acceptable [7]. Although two days of operation of 20 machines seems a better solution than the stress tests, it is still unacceptable in most projects. Static code analysis for detecting atomicity violation conflicts should be much faster and cheaper than long-term analysis of application behavior. Despite the imperfections of such a process and the many false-positive errors reported, a programmer should be able to verify a false report.
Flanagan and Freund [9] presented the Atomizer tool. Despite promising results, the authors of Atomizer decided to focus on hybrid methods that combine static code analysis with the "on-the-fly" analysis method they developed.
The type-system-based method was taken up by another team [21]. Sasturkar et al. [21], like the authors of Atomizer, bypassed one of the limitations of the developed method, so that it is no longer limited to the compilation process. The authors also mention that this method may miss some conflicts that cause atomicity violation, but in return the number of false-positive errors is marginal.
Wang et al. [22] developed a model called "trace-based symbolic predictive model", which was used to develop a mixed method based on tracking specific application execution paths and analyzing the source code of that application to detect an atomicity violation. The authors claim that their method does not report any false-positive errors, but they do not guarantee that all conflicts causing atomicity violation are detected.
Solutions based on analyzing the application during its operation, in order to avoid the phenomenon of atomicity violation, are applied regardless of the language in which the programs are written. With the growing popularity of JavaScript [23], the Node.js runtime environment came into use for writing server applications. In this environment, applications are created using the event-driven programming paradigm, in which many operations are performed in parallel in different threads [24]. However, this environment does not provide tools for marking operations as atomic [24]. Nor does Rust, which was developed for secure multithreading and published in 2010 [25]. The fact that the phenomenon of atomicity violation is language-independent and accounts for 70% of all multithreading errors [7] (excluding deadlock) shows how dangerous this phenomenon is.
The paper [16] describes the Kivati tool for Linux and the x86 platform, which aims to detect atomicity violations in applications written in the C language. This tool modifies the source code of the system kernel in order to use so-called watchpoints to observe resources that are used in atomic regions. These regions are obtained during the analysis of the program whose operation is observed by Kivati. Observing the operation of the program (as well as influencing it) classifies the method as a mixed one, which contradicts the authors' statement that the method belongs to static methods [16]. The grammar of Java allowed the authors of [26] to develop a method of static analysis of Java code that detects the phenomenon of atomicity violation. Their method [26] locates, in class definitions, methods declared with the keyword synchronized, whose structure is then analyzed for patterns that guarantee the phenomenon of atomicity violation [26]. The disadvantage of this method is that it cannot be transferred to languages that differ significantly from Java, including the C language.
The method called "Hybrid Atomicity Violation Explorer" (HAVE) [27] is based on the so-called "Static Summary Tree", created during static analysis of Java code. Then, based on the created trees, the work of selected fragments of the application is simulated. The effects of the simulation and the created trees are the basis for building so-called hybrid trees, which are then analyzed using the authors' own algorithm, yielding information about potential conflicts that cause atomicity violation and race condition. The authors of the method mention the occurrence of false-positive errors in the results of their method and further work on eliminating them. They also indicate further development of static code analysis as one of the most important elements of the developed method.
The methods described above can be divided into a few groups. The first group contains applications which, for various reasons, are not publicly available. The second group contains applications for other languages, or for applications which are out of scope because they use libraries other than pthread. The third group contains methods which were developed for the needs of specific languages and cannot easily be implemented in the C language. For these reasons, there is no other static analysis method which can be compared directly with the method proposed later in this paper. There is also no other known database of well-described bugs in which C applications using the pthread library can be found.

Agreement Relationships between Operations
Relationships between operations that can be read from the source code in any language are limited to the order in which the operations are performed. According to the definition, the phenomenon of atomicity violation is a result of a violation of a pair of operations on a specified memory area, i.e., on a shared resource occupying a specified area. It is not a disruption of the order of these operations, so one may suspect that there is a certain relationship between the operations of the pair whose violation makes the program work incorrectly. As a result of the analysis of the atomicity violation phenomenon, three relations have been developed, the violation of which leads to atomicity violation. Thus, if a program has two operations $a$ and $b$, where $a \prec b$ and these operations form a logical whole, the following relationships may occur between these operations:
• Forward: a relationship in which operation $a$ must always be followed by operation $b$;
• Backward: a relationship in which operation $a$ must always precede operation $b$;
• Symmetric: a relationship in which operation $a$ must always precede operation $b$, and operation $b$ must always follow operation $a$.
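One possible reading of the three relationships can be expressed as checks over the sequence of operations executed by a single thread. The sketch below is an assumption made for illustration, not part of the presented method: operations are encoded as integer identifiers, and the names `relation_kind`, `followed_by`, `preceded_by`, and `relation_holds` are hypothetical. Forward is read as "every occurrence of a has some later b", backward as "every occurrence of b has some earlier a", and symmetric as both at once.

```c
#include <stddef.h>

/* Hypothetical encoding of the three agreement relationships. */
enum relation_kind { REL_FORWARD, REL_BACKWARD, REL_SYMMETRIC };

/* Forward reading: every occurrence of `a` is followed by some later `b`.
 * Assumes a != b. */
static int followed_by(const int *trace, size_t n, int a, int b)
{
    int pending = 0;                    /* an `a` is still waiting for its `b` */
    for (size_t i = 0; i < n; i++) {
        if (trace[i] == a)
            pending = 1;
        else if (trace[i] == b)
            pending = 0;
    }
    return !pending;
}

/* Backward reading: every occurrence of `b` is preceded by some earlier `a`. */
static int preceded_by(const int *trace, size_t n, int a, int b)
{
    int seen_a = 0;
    for (size_t i = 0; i < n; i++) {
        if (trace[i] == a)
            seen_a = 1;
        else if (trace[i] == b && !seen_a)
            return 0;                   /* `b` ran before any `a` */
    }
    return 1;
}

/* Returns 1 if the single-thread trace respects the given relationship. */
int relation_holds(enum relation_kind kind, const int *trace, size_t n,
                   int a, int b)
{
    switch (kind) {
    case REL_FORWARD:
        return followed_by(trace, n, a, b);
    case REL_BACKWARD:
        return preceded_by(trace, n, a, b);
    case REL_SYMMETRIC:
        return followed_by(trace, n, a, b) && preceded_by(trace, n, a, b);
    }
    return 0;
}
```

Under this reading, a trace containing only `a` satisfies the backward relationship but not the forward one, and a trace containing only `b` does the opposite, which shows why the symmetric relationship is the strictest of the three.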
Despite a large number of studies on the phenomenon of atomicity violation, none of the works discusses the relationships between operations. The existing programming languages do not have any mechanisms that would allow for the declaration of such relations. However, the programmer may use a locking mechanism, which makes it possible to protect the desired operations from violations of the relations.
Complementing the source code with information about these relationships should allow for faster locating of the atomicity violation during the static code analysis process, or at least reduce the number of false-positive errors reported. An alternative solution is to enrich the process of converting the code into a model with the possibility of accepting such data as an input.
In the review of the literature and technical documentation, no mechanism has been found so far in programming languages by means of which it is possible to declare the relations presented in this section. Knowledge about the relations can be obtained from library documentation or by asking programmers, which is why they are referred to as agreement relationships.
Lack of knowledge of these relationships is the reason why programmers do not understand the phenomenon of atomicity violation. If these relationships could be defined by means of programming language mechanisms, it would be much easier to locate atomicity violation. It is therefore reasonable to draw up a new definition that includes information about the relationships whose disruption causes the phenomenon of atomicity violation. Pairs of operations that are in a relation with each other may be located by the Kivati program as components of atomic regions, which are located for dynamic analysis. In the proposed solution, the structure of the code is more important than how the code of the atomic regions works. This allows for checking only the related operation pairs and skipping other operations, which speeds up the analysis.
Taking into account the mentioned gap in the literature, it is worth presenting the following definition of the phenomenon of the atomicity violation.

Definition 1.
Atomicity violation is a phenomenon in which there is a relationship between two operations $(o_{i,j}, o_{a,b})$ of one thread, using a common resource $r_c$, whose disruption caused by the operation of another thread on the same resource (unexpected change of resource value) results in the undefined behavior of the algorithm using those operations.
The operations should be understood as instructions or functions called in the program code. Sometimes the relation between two operations may result from the context and may not be a rule, e.g., setting a new value of one shared resource involves resetting a shared counter, but only when this value is set in a selected thread. Taking such cases into account is associated with an increase in the number of false-positive errors reported.
The C language standard library has many functions that can be included in one of the three relationships mentioned above. Most of the relations presented in Table 2 result directly from the C language documentation. In the case of the pairs (calloc, free) and (malloc, free), the relation results from the way memory is managed in C.
The relations shown in the table are only a small subset of all possible relations that may occur in multithreaded applications. The C language standard library provides only basic mechanisms to allow the programmers to build their own, more complex mechanisms. For this reason, applications written using the C language are very extensive.
Therefore, in order to detect atomicity violation, programmers must determine for themselves which C language functions, functions of external libraries, and functions they have written are in a relation with each other. It should also be remembered that the relations presented in Table 2 can be violated, because not all violations of the relations causing atomicity violation are undesirable, i.e., the effect of violating a relation can be harmless.
To eliminate the phenomenon of atomicity violation, locks are used to create critical sections. No programming language has other mechanisms that would support defining sets of operations as a logical whole. This also means that neither the C language nor the pthread library provides mechanisms to declare a relationship between two functions, or between any pair of operations in a thread. The only basis for assuming a relationship may be a structure in which two operations of the same thread use the same shared resource. This assumption will cause a very large number of false-positive errors to be reported. To reduce the number of false-positive errors when converting the source code of a multithreaded application to a model, it is necessary to develop a relation table similar to Table 2, which will contain the relations between user-defined functions.
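A relation table for user-defined functions could take a form like the one sketched below. Everything in it is hypothetical: the function names (`queue_reserve`, `log_open`, and so on), the kinds assigned to them, and the `lookup_relation` helper are illustrative assumptions and do not come from Table 2 or from any real library. The point is only that a pair of operations is treated as related when the programmer has declared it, instead of whenever two operations touch the same resource.

```c
#include <stddef.h>
#include <string.h>

enum relation_kind { REL_NONE, REL_FORWARD, REL_BACKWARD, REL_SYMMETRIC };

struct relation_entry {
    const char *first;                  /* operation a */
    const char *second;                 /* operation b, where a precedes b */
    enum relation_kind kind;
};

/* Hypothetical project-specific table; names and kinds are illustrative only. */
static const struct relation_entry user_relations[] = {
    { "queue_reserve", "queue_commit", REL_FORWARD },
    { "log_open",      "log_write",    REL_BACKWARD },
    { "session_begin", "session_end",  REL_SYMMETRIC },
};

/* Returns the declared relationship for the ordered pair, or REL_NONE. */
enum relation_kind lookup_relation(const char *first, const char *second)
{
    size_t n = sizeof user_relations / sizeof user_relations[0];

    for (size_t i = 0; i < n; i++) {
        if (strcmp(user_relations[i].first, first) == 0 &&
            strcmp(user_relations[i].second, second) == 0)
            return user_relations[i].kind;
    }
    return REL_NONE;                    /* undeclared pairs are not reported */
}
```

Returning `REL_NONE` for undeclared pairs is the design choice that trades completeness for fewer false-positive reports: only pairs the programmer has listed are passed on to the later analysis.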
The scenarios presented in Table 1 concern the relations that can occur between a triple of operations of two threads. In fact, in any advanced multithreaded program there are many such relations.

Multithreaded Application Source Code Model
Locating an atomicity violation is possible using the source code model of a multithreaded application [2]; however, this model should be extended with a set that describes the relationships between pairs of operations. The model uses the following notation:
• $P$ is the application index;
• $T_P = \{t_i \mid i = 0 \ldots \alpha\}$, $(\alpha \in \mathbb{N})$ is a set of threads $t_i$ of the application $C_P$, where $t_0$ is the main thread and $|T_P| > 1$;
• $U_P = (u_b \mid b = 1 \ldots \beta)$, $(\beta \in \mathbb{N}^+)$ is a sequence of sets $u_b$, which are subsets of $T_P$ containing the threads working in the same period of time in the application $C_P$, where $|U_P| > 2$, $u_1 = \{t_0\}$ and $u_\beta = \{t_0\}$;
• $O_P = \{o_{i,j} \mid i = 1 \ldots \delta,\ j = 1, 2, \ldots\}$, $(\delta \in \mathbb{N}^+)$ is a set of all operations of the application $C_P$ which are atomic at a certain level of abstraction, i.e., dividing them into smaller operations is impossible (an operation should be understood as an instruction or function defined in a programming language); index $i$ indicates the number of the thread in which the operation is performed, and index $j$ is the ordinal number of the operation within that thread;
• $Q_P = \{q_s \mid s = 1 \ldots \kappa\}$, $q_s = (w_s, x_s)$, $(\kappa \in \mathbb{N}^+)$ is a set of all locks available in the program, each defined as a pair (variable, type of lock), where the type is one of the values (PMN, PME, PMR, PMD);
• $B_P$ collects the pairs of operations that are in an agreement relationship: a set of pairs of operations in a forward relationship, a set of pairs of operations in a backward relationship, and a family of pairs of operations related by a symmetrical relation.
The set $B_P$ introduced into the model is represented graphically as shown in Figure 2. If a pair of operations is in a forward relationship, the edge connecting the two operations has a filled ring at the operation after which the second operation is required. Similarly, in a backward relationship, the edge connecting the two operations has an empty circle next to the operation that needs to be preceded by another operation. In the case of a symmetrical relation, the edge combines the two above, with a ring at one end and a circle at the other.
Figure 2. Graphical representation of the relationships from set $B_P$ between operations $o_{1,1}$ and $o_{1,2}$.
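The parts of the model listed above can be pictured as plain C data structures. The sketch below is an assumption for illustration only and does not reflect how rdao_detector stores the model: the type and function names are hypothetical, a fixed-size array replaces the sets, and the lock types are kept as the abbreviations used in the text (presumably the four pthread mutex types, which is itself an assumption). Pairs whose operations belong to different threads are rejected, following Definition 1.

```c
#include <stddef.h>

/* Lock types as abbreviated in the text; presumably PTHREAD_MUTEX_NORMAL,
 * _ERRORCHECK, _RECURSIVE and _DEFAULT (an assumption). */
enum lock_type { PMN, PME, PMR, PMD };

enum relation_kind { REL_FORWARD, REL_BACKWARD, REL_SYMMETRIC };

struct operation { int thread; int index; };            /* o_{i,j} */
struct lock { const char *variable; enum lock_type type; }; /* q_s = (w_s, x_s) */

struct relation {                                        /* one element of B_P */
    struct operation first, second;
    enum relation_kind kind;
};

#define MAX_RELATIONS 16

struct model {
    int thread_count;                                    /* |T_P| */
    const struct lock *locks;                            /* Q_P */
    size_t lock_count;
    struct relation relations[MAX_RELATIONS];            /* B_P */
    size_t relation_count;
};

/* Adds a pair to B_P. Returns 0 on success, -1 if the table is full or the
 * two operations do not belong to the same thread. */
int model_add_relation(struct model *m, struct operation a,
                       struct operation b, enum relation_kind kind)
{
    if (m->relation_count >= MAX_RELATIONS || a.thread != b.thread)
        return -1;
    m->relations[m->relation_count++] = (struct relation){ a, b, kind };
    return 0;
}

/* Counts the pairs of a given kind, i.e. the size of one component of B_P. */
size_t model_count_relations(const struct model *m, enum relation_kind kind)
{
    size_t count = 0;
    for (size_t i = 0; i < m->relation_count; i++)
        if (m->relations[i].kind == kind)
            count++;
    return count;
}
```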

Relation Breakage
The order of execution of operations within one thread may cause the result of one operation to be overwritten by the result of another operation. As a result, a further operation that is in a relation with both operations will always receive the result only from the one that occurs later. Such a situation is called a relation breakage; it takes place in the code from Figure 5, whose graphical representation is shown in Figure 6. Based on the model from Section 5 and Table 2, on the previous paragraph, and on the definition of the subgraph $s_i G_P$ [2], a relation breakage is defined as follows. Broken relationships are a consequence of two operations, of which the second invalidates the result of the first, e.g., by overwriting the value of the shared resource. If a relationship is broken, it should not be present in the sets of sequences of $B_P$. Therefore, if there are relationships that are not broken or are only conditionally broken, they may be disrupted by an operation from another thread, leading to atomicity violation.
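A minimal sketch of a breakage check, under simplifying assumptions, is given below. It considers a straight-line list of operations of one thread, describes each operation only by the resource it touches and whether it reads or writes, and treats a relation as broken when an intermediate operation overwrites the resource written by the first operation of the pair. The names `struct op` and `relation_is_broken` are hypothetical, and conditionally broken relations (for example an overwrite inside a branch) are not modeled.

```c
#include <stddef.h>

enum access { ACC_READ, ACC_WRITE };

struct op {
    int resource;                       /* identifier of the resource used */
    enum access access;                 /* how the operation uses it */
};

/* `ops` is the operation list of one thread; `first` < `second` are the
 * positions of a related pair. Returns 1 if an operation strictly between
 * them overwrites the resource written by ops[first], so that ops[second]
 * can only ever observe the later result. */
int relation_is_broken(const struct op *ops, size_t first, size_t second)
{
    if (ops[first].access != ACC_WRITE)
        return 0;                       /* nothing to overwrite */
    for (size_t k = first + 1; k < second; k++) {
        if (ops[k].access == ACC_WRITE &&
            ops[k].resource == ops[first].resource)
            return 1;
    }
    return 0;
}
```

For a thread that writes a resource, writes it again, and then reads it, the relation between the first write and the read is reported as broken, while the relation between the second write and the read is intact and remains a candidate for the later analysis.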

Reversal of Relation
The reversal of a relationship between operations of one thread is not considered, for several reasons. In the case of the forward relationship, the second operation of the pair has the right to occur independently before the first operation. Likewise, in the backward relationship, the first operation of the pair can occur independently after the second. In the case of the symmetrical relation, reversing the relation will always result in an error, which will be revealed when such code is first run.
There may also be a situation where two operations are linked by one of the relations and each of them is in a separate thread. A scenario discussing such a case may lead to the phenomenon of order violation, which is not the subject of this work.

Problem Definition
A multithreaded application P is given, written in C using the pthread library, in which pairs of operations are marked as being in the relations described in Section 4. This application is affected by the phenomenon of atomicity violation. Is it possible to detect a conflict causing a violation of atomicity using the model from Section 5?

Sufficient Condition
A multithreaded application code is given as a model in which:

• There are no conflicts causing race condition and deadlock;
• In graph $G_P$ [2], there is a pair of threads $t_i$ and $t_j$ working in the same range $u_b$;
• In thread $t_i$, there is a pair of operations $(o_{i,\alpha}, o_{i,\beta}) \in B_P$, and these operations are in an agreement relationship which is not completely broken;
• In thread $t_j$, there is an operation $o_{j,\gamma}$;
• In the set of operations $\{o_{i,\alpha}, o_{i,\beta}, o_{j,\gamma}\}$, at least one is connected to a resource with a shared use edge (one of the scenarios listed in Table 1 is fulfilled).
The phenomenon of atomicity violation occurs when each of the operations of the pair $(o_{i,\alpha}, o_{i,\beta})$ is in a different critical section. Figure 7 shows the first scenario from Table 1 using the graphical representation of the source code model of a multithreaded application. All characteristics listed in the points above can be found there. So, assuming that there are two threads $t_i$ and $t_j$ and a set of operations $\{o_{i,\alpha}, o_{i,\beta}, o_{j,\gamma}\}$ protected by mutex $q_s$ and using resource $r_c$ in such a way that they fulfil one of the scenarios set out in Table 1, it can be concluded that:
Lemma 1. Operations $o_{i,\alpha}$ and $o_{i,\beta}$ are connected by one of the agreement relationships, which is not completely broken. If there is a cycle in graph $G_P$ that consists of mutex $q_s$ and only one operation from the set $\{o_{i,\alpha}, o_{i,\beta}\}$, then the phenomenon of atomicity violation occurs.
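The condition can be approximated by a simple predicate. The sketch below is a simplified stand-in for the cycle search in graph $G_P$, not the search itself: it assumes that every operation is already protected by the same mutex (the premise that race conditions are absent), that each lock/unlock region has been given a distinct identifier, and that the caller has already decided whether the relation is completely broken. The names `struct op_info` and `may_violate_atomicity` are hypothetical.

```c
struct op_info {
    int thread;                         /* thread the operation belongs to */
    int resource;                       /* shared resource it uses */
    int critical_section;               /* id of the enclosing lock/unlock region */
};

/* Returns 1 if the remote operation can run between the related pair (a, b),
 * i.e. if the pair is not enclosed in one critical section. */
int may_violate_atomicity(struct op_info a, struct op_info b,
                          struct op_info remote, int relation_intact)
{
    if (!relation_intact)
        return 0;                       /* completely broken relation */
    if (a.thread != b.thread || remote.thread == a.thread)
        return 0;                       /* need a local pair and a remote op */
    if (remote.resource != a.resource && remote.resource != b.resource)
        return 0;                       /* remote op does not touch the resource */
    if (a.critical_section == b.critical_section)
        return 0;                       /* pair executes as one unit */
    return 1;
}
```

In graph terms, two operations that share a region identifier lie on the same cycle through the mutex, while differing identifiers correspond to a cycle that contains only one operation of the pair.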

Theorem 1. The atomicity is violated if the above lemma is fulfilled.
Proof. This is a consequence of Definition 1. If there is a cycle in graph $G_P$ that consists of $q_s$ and only one operation from the set $\{o_{i,\alpha}, o_{i,\beta}\}$, then the operations $o_{i,\alpha}$ and $o_{i,\beta}$ do not belong to the same critical section. This means that an operation $o_{j,\gamma}$ of another thread ($t_j$) can be executed on the same resource between the operations $o_{i,\alpha}$ and $o_{i,\beta}$, i.e., in the order $o_{i,\alpha} \prec o_{j,\gamma} \prec o_{i,\beta}$. The execution of operation $o_{j,\gamma}$ can therefore violate the atomicity of the pair $o_{i,\alpha}$, $o_{i,\beta}$.
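The interleaving used in the proof can be replayed deterministically on a single thread. The sketch below assumes, for illustration, that the local pair is an increment followed by a read of the shared resource and that the remote operation is another increment; the names `enum step` and `local_read_after` are hypothetical. It shows only the effect of the order of operations and involves no real threads.

```c
#include <stddef.h>

/* The three operations from the proof, in an assumed concrete form. */
enum step {
    LOCAL_INC,      /* first operation of the local pair */
    LOCAL_READ,     /* second operation of the local pair */
    REMOTE_INC      /* operation of the other thread */
};

/* Replays the schedule on a fresh resource and returns the value observed
 * by the local read, or -1 if the read never ran. */
int local_read_after(const enum step *schedule, size_t n)
{
    int r = 0;                          /* shared resource r_c */
    int seen = -1;

    for (size_t i = 0; i < n; i++) {
        switch (schedule[i]) {
        case LOCAL_INC:
        case REMOTE_INC:
            r++;
            break;
        case LOCAL_READ:
            seen = r;
            break;
        }
    }
    return seen;
}
```

With the schedule LOCAL_INC, REMOTE_INC, LOCAL_READ the local thread reads 2 although it stored 1, which is the violated case. With LOCAL_INC, LOCAL_READ, REMOTE_INC it reads 1, as it would if the pair were in one critical section.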

Leading Example
The atomicity violation can be found in the sample AV1 application (source code: http://bit.ly/2P849ma). To do that, it is enough to use rdao_detector (https://github.com/PKPhdDG/rdao_detector), an application which uses MASCM, the multithreaded application source code model from Section 5, to detect, among other things, the phenomenon of atomicity violation. The detection discussed in the previous section is implemented as an algorithm in the author's tool and is used to optimize code. The code used is equivalent to a piece of a TCP/IP server application that was running on an embedded device. The server was run as a data collector in a commercial environment. Two running threads were responsible for handling requests from two different sensor groups. The presented example contains many simplifications in order to show an idea that is significant for multithreaded applications. After converting the application to the model described in Section 5, it looks as follows. In AV1, there is a backward relationship between the printf function and the incrementation operation. As a result, the user is likely to receive incorrect data on the standard output.
Locating the atomicity violation phenomenon consists of several steps, performed after making sure that the application is free of resource conflicts causing race condition or deadlock. In the first step, it is necessary to determine whether there are intervals in which more than one thread works, which is true for the AV1 application. Then, as a second step, one should determine which of the pairs belonging to the sets of sequences of $B_{AV1}$ can be violated by changing the shared resource. For the AV1 application, such pairs exist.
In step three, it should be determined whether the pairs coming from the sets of sequences of $B_{AV1}$ meet one of the scenarios in Table 1. If a pair meets any scenario, it should be determined whether there is a path in the operation graph that meets one of the lemmas described in Section 7. In the case of the AV1 application, there are two such paths that meet Lemma 1, and both of them start on mutex $q_1$. As a result of the fulfilment of Lemma 1, AV1 can therefore be classified as an incorrect application, i.e., one in which there are resource conflicts causing the phenomenon of atomicity violation.
On the basis of the leading example, it can therefore be concluded that using a model of the source code of a multithreaded application allows for the detection of the atomicity violation phenomenon. The condition for detecting conflicts causing this phenomenon is to determine which functions and instructions of the source code are in one of the three developed relationships with each other. The described example of atomicity violation is another phenomenon that can be detected with the proposed model of multithreaded applications. The work on the model [2] has shown that the proposed approach is 2350 times faster than alternatives based on a complete review. Because the proposed solution is based on static code analysis, it can be assumed that a similar result will be obtained when detecting atomicity violations. For example, locating the atomicity violation in the AV1 application using rdao_detector took 0.84 s, which is not long compared with the compilation time of 0.52 s. Parts of both processes have common steps, such as converting source code into abstract syntax trees. If this step were shared and its result reused, compilation and validation of the source code together could take approximately double the compilation time.

Conclusions
In order to locate the conflicts causing the phenomenon of atomicity violation, the developed source code model of multithreaded applications had to be extended with new relationships that may occur between two operations. The analysis of the literature showed that, in the context of atomicity violation, the developed relations had not been considered anywhere. Knowledge of these relationships may be used to improve programming language grammar or to create language extensions. Programmers aware of the existence of these relationships can create better application architectures and better unit tests. It is also possible that the developed relations may be used for research on various types of errors, not only in multithreaded applications. Another interesting idea is to use MASCM and to develop relationships between memory allocation and memory deallocation operations in order to search for memory leaks in C applications.
MASCM can also be used in mixed methods, e.g., for searching for atomic regions. Additionally, code converted to instances of MASCM can be used with neural networks for intelligent methods of detecting race condition, atomicity violation, etc. To avoid false-positive errors, more research on atomicity violation and static analysis methods is needed.
The developed method, based on a sufficient condition, made it possible to locate two resource conflicts causing the phenomenon of atomicity violation in the leading example, which improves code optimization in, for example, applications with parallel computing. Further research on the relations should also make it possible to predict the conflicts causing the phenomenon of order violation.
During the studies on atomicity violation, no false-positive errors occurred, but there is no proof that the developed detection rules exclude them. It is possible to find code which, as an instance of MASCM, meets the atomicity violation requirements, as is the case in most other methods using static code analysis.
Due to the fact that 70% of all multithreaded errors [7] are errors that are a violation of atomicity, the development of a method that allows for their rapid detection only through static analysis of the source code may turn out to be a significant contribution to the field of computer science.
Compilers that detect these errors during the compilation process may make it possible to avoid atomicity violation even before the testing process. This means that programmers will be able to work more efficiently, their programs will be more reliable, and the process of testing multithreaded applications will be less complicated, because fewer cases will need to be checked.