Converting a legal text document into a set of logical rules remains an important and difficult task [1] because of the complex structure of such documents. A legal text consists of sections, subsections, clauses, sub-clauses and sub-sub-clauses. Building a medical decision support system (MDSS) that supports the exchange of private health information according to the medical law of a given country is therefore a complex job for a software engineer who does not understand the exact meaning of these clauses. Providing the engineer with a set of logical rules makes the MDSS easier to design and build. Informatics-assisted verification of compliance with laws and regulations is, however, highly challenging [2]. Legal drafts often contain ambiguities and incomplete contingencies. There are frequent cross-references to different subsections of the same or other sections [4]. Furthermore, provisions carry implied cross-references tied to the intent of the law, along with possibly conflicting definitions and domain-specific terminology [5]. The implementation of laws is also refined gradually as the various provisions are tested in legal contests and courts provide case-specific clarifications [6]. In addition, laws and regulations undergo updates and amendments, which a software engineer must track and manage [7].
Therefore, the first step in building an MDSS is to formalize the medical law into logical rules. These rules can then be used to decide, in compliance with the law, whether to release or deny certain medical records when any entity of the medical system requests them. The users of the medical system are almost the same all over the world (doctors, nurses, hospitals, labs, researchers, etc.), but the medical laws for those entities may differ from country to country. In the U.S., the Health Insurance Portability and Accountability Act (HIPAA) was enacted in 1996, with its regulations issued by the Department of Health and Human Services (HHS), as a means of providing a mechanism to protect broad civil rights, including the privacy of medical information [8]. Failing to conform to HIPAA may result in a fine of up to $25,000 per year and one to five years of imprisonment [10]. HIPAA Administrative Simplification, Regulation Text: 45 CFR (Code of Federal Regulations) Parts 160, 162 and 164 specifically regulates the disclosure of personal health information [11].
The complexity of HIPAA, combined with potentially stiff penalties for violators, has led physicians and hospitals to withhold information from those who may have a right to it [12]. A review of the implementation of the HIPAA privacy rules by the U.S. Government Accountability Office found that healthcare providers worry “about their legal privacy responsibilities and often responded with an overly guarded approach to disclosing information than necessary to ensure compliance with the privacy rules” [13].
Several attempts have been made to convert legal text into logic rules [14], with a special focus on HIPAA, given its importance to healthcare. The authors of [16] attempted to generate Datalog rules corresponding to the sentences in the legal text. Their mechanism combines associated rules in the form of “permitted_by” and “forbidden_by”, where prohibition takes precedence over permission in the case of “conflicts”. Similarly, the authors of [17] attempted to convert legal text into Prolog statements using several steps based on Hohfeldian concepts for defining and classifying legal rights [17]. Each clause and its corresponding rules are categorized into one of four types: right, obligation, permission and definition. In similar studies [19], the authors proposed least fixed point (LFP) logic for assigning the particular semantic model and signature that specify the privacy regulations; all legal constraints are expressed as positive and negative norms, and the negative norms take precedence over the positive ones in the case of conflict. The authors of [21] proposed different approaches for the exchange of protected health information (PHI) among medical entities.
A common weakness of such approaches is their inability to capture the deep semantic connections between the various sections of the large HIPAA text. On close observation, one notices that these first-generation approaches (such as [16]) are attempts to translate the legal text, clause by clause, into corresponding logical expressions. The resulting rule set reflects the clausal organization and syntax of the legal text. It does not comprehend the overall HIPAA semantics beyond a part-by-part attempt to model the semantics of individual sentences. The inference process is expected to move from one “sentence” (presented in the form of a logical expression) to another and to find the connections between constraints defined in various parts of the act through shared predicates. In one sense, it is remarkable how much success these approaches have achieved in making many decisions correctly without explicit comprehension of HIPAA as a whole. However, the resulting logical systems still seem to miss many higher-level cognitive connections hidden in the semantics of the overarching HIPAA environment. Indeed, several of the cited cases of “conflict” are artifacts of missing deeper semantic or indirect cross-references. A human expert can connect the context and may not see these cases as conflicts. To resolve these conflict artifacts, all of these approaches have to make extensive use of some form of preference/precedence meta-rules, even though these are not part of HIPAA.
Supporting programmable legal compliance and real life actions requires a substantial understanding of the conceptual framework of medical law. For automatic verification of HIPAA privacy compliance, an understanding of the structure and interrelation of the privacy rules is needed. For example, why do we need a law to govern the release, exchange and use of information? How does one respond to a requester or entities in a medical system? What conditions are applied for accessing specific information? What kind of actions should be taken by the system in response to different events? It can be argued that every law has its reasons and purposes, and there are some conditions for these purposes. Each request has a response and action. In other words, in order to build an effective MDSS, all aspects beyond the legal text must be covered.
This paper mainly focuses on Section 164 of HIPAA, which is related to the security and privacy issues of healthcare. In the remainder of this paper, Section 2 briefly presents the methodology, the concept space of HIPAA, the tag set and the rule set formation from HIPAA. Section 3 explains the results using an example. Section 4 discusses the proposed solution and compares it with the approaches of other studies. Section 5 concludes the discussion on the medical relational model (MRM) for any medical law and the medical decision support system for electronic health records (EHR).
2. Materials and Methods
This research attempts to capture and accommodate the deeper underlying semantics of the complex aspects of medical law. A medical relational model (MRM) is used; it comprises the medical entities and the relationships between them that define the semantics of the domain on which the Act and its provisions are laid out and structured. Based on the MRM, a systematic tag-based approach is adopted to convert the corpus of legal texts into a set of logical constraints and actions. It requires the designer to explicitly comprehend and extract the connections (with the help of HIPAA experts) and to summarize the overall behavior as a set of independently constructed rules. The system pre-resolves the semantic connections and can thus generate very precise resolutions. The MRM and the generated rules can then provide a very transparent process, in the form of decision trees, to precisely resolve information requests. Besides the decisions, the resulting system can much better articulate other intents of HIPAA: it specifies how to release particular pieces of information and, if a request is denied, what the alternate options are, and it generates a logically coherent explanation to support and confirm the decisions. In conformance with the original expectations of HIPAA, the entire process can subsequently be automated.
2.1. Concept Space of HIPAA
Figure 1 shows eight (8) elementary concept classes: requester, purpose, patient record items, condition, action, fee and time, information procedure and, finally, record release. The internal structure of medical law is based on these concept classes.
Concept classes indeed have relations with each other. Some of these relations are conceptual, whereas others are compositional. However, making a relationship between tags in concept classes requires MRM to understand the semantics of these connections.
Figure 2 represents the proposed MRM, which governs the conceptual and compositional relations between classes. Our MRM is based on eight conceptual relations between concept classes and three compositional relations that form three classes: request flow, evaluation and release.
The first conceptual relation is between a requester (1) and purposes (5), where a requester could have different purposes for a request. Furthermore, remaining conceptual relations are explained as follows: a requester (1) requests records from a covered entity (2); the covered entity manages patient’s information (3), which contains record types (4) and has an information release procedure (10); the covered entity takes an action (6) based on the conditions (7) and takes the appropriate time and fee (9) for that action.
There are three compositional relation classes, each formed from several concept classes. The request flow class is composed of the requester (1), purpose (5) and patient record item (4) classes. The evaluation class is composed of the request flow, time and fee (9), action (6) and condition (7) classes. The release class consists of the record release (8), information procedure (10) and evaluation classes.
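The nesting of the three compositional classes can be sketched as plain data structures. This is only an illustrative sketch: the class and field names are assumptions chosen to mirror the description above, and the example values are placeholders rather than entries taken from HIPAA.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RequestFlow:
    """Requester (1), purpose (5) and patient record item (4)."""
    requester: str
    purpose: str
    record_items: List[str]


@dataclass
class Evaluation:
    """Request flow plus condition (7), action (6) and time and fee (9)."""
    request_flow: RequestFlow
    conditions: List[str]
    action: str
    # Time and fee entries are optional, so they default to an empty list.
    time_and_fee: List[str] = field(default_factory=list)


@dataclass
class Release:
    """Evaluation plus record release (8) and information procedure (10)."""
    evaluation: Evaluation
    record_release: str
    information_procedure: str


# Placeholder values, for illustration only.
flow = RequestFlow("researcher", "research", ["psychotherapy notes"])
evaluation = Evaluation(flow, conditions=["authorization"], action="disclose")
release = Release(evaluation, record_release="full record",
                  information_procedure="electronic copy")
```

Composing the classes this way means a release object always carries the evaluation, and through it the request flow, that produced it.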
In Figure 2, big ovals represent our concept classes; small ovals within the concept classes represent the tags associated with each class; and rectangles represent the processes that use those classes.
2.2. Tags’ Set
Related clauses are grouped under one of the concept classes, and tags are assigned to these clauses. For example, the legal text is searched for clauses containing requester information; each requester is then assigned a tag (ReqT1, ReqT2, etc.) and added under the requester concept class. Each tag refers to a clause that contains the information related to its concept class. Conditions that need to be satisfied are placed under the conditions concept class, and each condition rule is assigned a CT tag. The time and fee class indicates the time and fee needed to release PHI, and its tags are referred to as TFT. For example, the covered entity might charge a preparation fee for disclosing documents. Some medical records may require human verification to make sure that they have been de-identified; this requires processing time that might delay the release of these records. All purposes of requests, which are related to the requester and indicate the reasons for disclosing PHI, are listed under the purpose concept class, and each rule is assigned a PPT tag. An action needs to be taken to either deny or disclose PHI; we collect these actions under the actions concept class, and each action is assigned an AT tag. The information procedure concept class represents how information will be released, and the IPT tag is used in this class. The record release concept class is related to the rules that indicate what type of information will be released.
For example, the psychotherapy notes (RRT) tag is used to distinguish among rules in its class. All record items in each section need to be evaluated for dependencies. For example, psychotherapy notes cannot be disclosed without an authorization from the individual, as stated in clause 164.508.a.2 of HIPAA. This indicates that a condition needs to be satisfied for this type of record item. We mark all patient’s record items that have dependencies and put them in one PRI class.
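The tagging scheme described above can be sketched as a small registry that numbers tags per concept class and records which patient record items depend on which conditions. The `TagSet` class and its method names are hypothetical and serve only as an illustration; the tag prefixes (ReqT, CT, PRI) and the psychotherapy notes dependency come from the text.

```python
from collections import defaultdict


class TagSet:
    """Hypothetical registry of tags grouped by concept class prefix."""

    def __init__(self):
        self.by_class = defaultdict(dict)   # prefix -> {tag: clause text}
        self.dependencies = {}              # PRI tag -> list of CT tags
        self._counters = defaultdict(int)

    def add(self, prefix, clause):
        """Assign the next numbered tag of this class to a clause."""
        self._counters[prefix] += 1
        tag = f"{prefix}{self._counters[prefix]}"
        self.by_class[prefix][tag] = clause
        return tag

    def require(self, item_tag, condition_tag):
        """Record that a record item depends on a condition."""
        self.dependencies.setdefault(item_tag, []).append(condition_tag)


tags = TagSet()
requester = tags.add("ReqT", "researcher")
condition = tags.add("CT", "164.508.a.2: authorization from the individual")
item = tags.add("PRI", "psychotherapy notes")
tags.require(item, condition)
```

After these calls, `tags.dependencies` maps `PRI1` to `["CT1"]`, which is the kind of marking the PRI class is meant to hold.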
2.3. Rule Set Formation from HIPAA
For illustration in this article, we focus only on the HIPAA provisions specific to the use of PHI by researchers. HIPAA clauses in Sections 164.508, 164.512, 164.514 and 164.532 cover this usage. There are multiple ways of organizing the logical rule expressions. In this system, logical rules are organized by the possible combinations of requester (ReqT), purpose (PPT) and patient record item type (PRI). Each rule also has associated condition tags (CT). Each logical rule then specifies a unique decision action (AT) and a special instruction (IPT) on the release process and format choices (RRT); time and fee restrictions (TFT) are applied when applicable.
The resolution process reads a rule as: If (Requester = “Researcher” & Purpose = “Purpose Tags” & Items = “Patient Record Items”), then check conditions (Pre-Condition = “Pre-Conditions Tags” & ReqT Condition = “Requester Tags” & Purpose Condition = “Conditional Purpose Tags”), then (Action = “Action Tags” & Record Release = “Record Release Tags” & Information Procedure = “Information Procedure Tags” & Time Taken = “Time Tags” & Fee = “Fee Tags”). As a result, the information extracted from all HIPAA sections is distributed among the eight concept classes, and each decision is based on combinations of tags from these classes. Note: the fee and time classes might not be available in all privacy rule sections of HIPAA.
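The if/check/then reading of a rule can be sketched as a lookup followed by a condition check. This is a minimal sketch under stated assumptions: the `Rule` structure, the `resolve` function, the rule identifier and the tag values are all hypothetical, and the behavior when no rule matches (returning `None`) is a choice made for the example, not something the text specifies.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set


@dataclass
class Rule:
    rule_id: str
    requester: str               # ReqT tag
    purpose: str                 # PPT tag
    record_item: str             # PRI tag
    conditions: FrozenSet[str]   # CT tags that must all be satisfied
    outcome: Dict[str, str]      # AT, RRT, IPT and optional TFT tags


def resolve(rules: List[Rule], requester: str, purpose: str,
            record_item: str, satisfied: Set[str]) -> Optional[Dict[str, str]]:
    """Return the outcome of the first matching rule whose conditions hold."""
    for rule in rules:
        key = (rule.requester, rule.purpose, rule.record_item)
        if key != (requester, purpose, record_item):
            continue
        # Every condition tag of the rule must be among the satisfied ones.
        if rule.conditions <= satisfied:
            return rule.outcome
    return None  # assumption: no rule fired for this request


# Hypothetical rule with placeholder tags.
rules = [
    Rule("R1", "ReqT1", "PPT1", "PRI1", frozenset({"CT1"}),
         {"action": "AT1", "record_release": "RRT1",
          "information_procedure": "IPT1"}),
]

granted = resolve(rules, "ReqT1", "PPT1", "PRI1", {"CT1"})
unresolved = resolve(rules, "ReqT1", "PPT1", "PRI1", set())
```

Here `granted` holds the outcome tags of rule `R1`, while `unresolved` is `None` because the condition `CT1` was not satisfied.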
A three-step approach is undertaken in this article to address the challenge of HIPAA. As shown in Figure 3, the first part of the modeling process consists of parsing the HIPAA text to identify the compositional and functional model of the HIPAA regimen in terms of components and core processes. This involves, first, identifying and classifying the key concepts in terms of entities, actors, actions and conditions, including the factors that build the conditions and the action modifiers described in various parts of HIPAA. Second, once these concepts are extracted, they are organized into classes, types and their relations. Third, the core processes involving these entities (the functional relationships between the entities) are ascribed. This process leads to the MRM.
2.4. HIPAA Rules’ Filtration through the World Rule Model
In the first part of the WRM, we propose a partial formalization of legal text into logical rules. Instead of formalizing the comprehensive text of the privacy rules, only the rules related to a certain discipline are formalized. Logical rules from different sections of HIPAA are extracted using Algorithm 1, and the flow of the algorithm is shown in Figure 4.
Tag rules are compiled for each medical entity, as shown in Figure 5 for the researcher. After that, the tag rules are compiled into XML-based logical rules for each entity with Algorithm 1. A sample of the XML logical rules generated by the algorithm from HIPAA is shown below.
<rule ruleid = "164.512.b">
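As a minimal sketch of how a tag rule could be serialized into an XML rule, the snippet below uses Python's standard `xml.etree.ElementTree` module. Only the `rule` element and its `ruleid` attribute appear in the sample; the child element names (`requester`, `purpose`) and the tag values are hypothetical and are not taken from the output of Algorithm 1.

```python
import xml.etree.ElementTree as ET


def rule_to_xml(rule_id, tags):
    """Serialize one tag rule as a <rule> element (child names are assumed)."""
    rule = ET.Element("rule", {"ruleid": rule_id})
    for name, value in tags.items():
        child = ET.SubElement(rule, name)  # hypothetical child element
        child.text = value
    return ET.tostring(rule, encoding="unicode")


# Placeholder tags, for illustration only.
xml_rule = rule_to_xml("164.512.b", {"requester": "ReqT1", "purpose": "PPT1"})
parsed = ET.fromstring(xml_rule)
```

Parsing the string back, as done in the last line, is a simple way to confirm that the generated rule is well-formed XML.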
The second part of the modeling process is the derivation of the actual constraints expressed in the Act. The constraints can be of a logical, temporal or functional type. In this step, the logical constraints, conditions and exemptions of the information release process as expressed in HIPAA are derived. Each rule or clause in these concept classes is identified (in terms of tag) and is connected together based on how a request of disclosing PHI is processed in the privacy rules of HIPAA. Different sections of the privacy rules may consist of related information, which needs to be considered in this process. At the end, the HIPAA requirements are compiled together as a set of logical rules.
In the final part, requests for various administrative actions are processed. An administrative request originates from the requester. It includes, but is not limited to, identity, purpose, specifications of information requested and a set of associated credentials (such as authorization, waiver, etc.). The transponder (decision work flow specific to a particular type of administrative request) generates the “decision” or “conformance verification” as per HIPAA.
The transponder acts as the mediator between the administrative requests on one side and the medical databases with HIPAA tagging on the other side. The transponder runs the resolution process as per the specification in the HIPAA rule set. The transponder can also generate four other items, namely: (i) a list of deliverables; (ii) special instructions specifying format choices (raw record, summary, etc.), special processing (like de-identification required), fee and time constraints, etc., as specified in HIPAA; upon request, the system can also provide (iii) an explanation identifying rules triggered and (iv) an audit specifying who, when and what has been requested, created, released, etc. This paper focuses only on the first item “generation of decision”.
We compare our methodology to three other methods [16] using the previous example from Section 3. There are a few exceptions, like 164.508(a)(2), which indicates that a covered entity is obligated to obtain authorization for using or disclosing psychotherapy notes, except under certain conditions. In [16], a researcher without authorization (164.508.b.3.i) would not be able to obtain PHI or psychotherapy notes because of “forbidden_by” and the negative norm, even if the covered entity provided a waiver (164.512.i.1.i) allowing the disclosure of PHI to the researcher for research purposes. This stems from the way these methods use “forbidden_by”, “permitted_by”, negative norms and positive norms to resolve overlaps between clauses. In all of the above methods, the denying clause is treated as a “forbidden_by” or negative norm whose “only if” condition is not satisfied. On the one hand, information should be permitted if at least one of the “if” positive norms is satisfied. On the other hand, information should be permitted only if all of the “only if” negative norms are satisfied. We can conclude that if any negative norm is not satisfied, the information will not be released. The negative norm is given precedence over the positive norm: if any one of the negative norms is not satisfied, the request is treated as a denial, even if some positive norms allow the disclosure of PHI.
Recall that the discussed example is about a researcher who does not have an authorization but has a waiver. A negative norm applies because the researcher must have an authorization for PHI to be disclosed. As a result, the researcher will not be able to obtain PHI, because one of the negative norms has not been satisfied. Nevertheless, a waiver has precedence over authorization, and the researcher should be able to access PHI. Moreover, nothing in [16] has been discussed regarding how the output will be generated or what its format will be. The preferences of individuals and medical providers were not considered in [16]. Cross-referencing of clauses in different sections has been handled manually. Table 1 shows the comparison results with other approaches.
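The difference in outcome for the researcher example can be sketched in a few lines. Both functions are simplified illustrations, not the implementations of the compared methods or of the proposed system: `norm_based_decision` assumes the "at least one positive norm and all negative norms" reading described above, and `waiver_aware_decision` assumes that either an authorization or a waiver is sufficient for this request.

```python
def norm_based_decision(positive_norms, negative_norms):
    """Permit only if some 'if' norm holds and every 'only if' norm holds."""
    return any(positive_norms.values()) and all(negative_norms.values())


def waiver_aware_decision(has_authorization, has_waiver):
    """Simplified reading in which a waiver can stand in for authorization."""
    return has_authorization or has_waiver


# Researcher with a waiver (164.512.i.1.i) but no authorization (164.508.b.3.i).
positive = {"164.512.i.1.i waiver": True}
negative = {"164.508.b.3.i authorization": False}

norm_result = norm_based_decision(positive, negative)                    # False
waiver_result = waiver_aware_decision(has_authorization=False,
                                      has_waiver=True)                   # True
```

Under the norm-based reading the unsatisfied negative norm blocks the release, whereas resolving the connection between waiver and authorization beforehand lets the same request succeed.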