At one time, medical practitioners needed only a pleasing bedside manner to reassure their patients. Now they need something more: statistical documentation that evaluates the effectiveness of their treatments. This is because of a new sociopolitical and economic development in America: the escalating consolidation of services under managed care organizations.
The advent of managed health care has generated demands by payors, users, and managers for precise information on how different therapies affect healing time and time lost from work and, most significantly, on their cost-effectiveness. Health care services that cannot provide such information will almost certainly be dropped from consideration. Furthermore, the information must be gathered in a timely, objective, and user-friendly manner by reproducible means regardless of geographical location or variation in patient population.
The author believes that the most efficient way to approach this problem is to enter all relevant patient data, while on the clinic floor, into a computer database connected to a network. At intervals following the clinic visit, the patient should be polled to determine the effects of the clinic treatment. The polled responses should then be entered into the database. The database program must be specifically designed to rapidly search both clinical and outcomes data and, in response to structured query language queries, generate useful outcomes reports. The reports could be formatted as labeled grids, charts, and graphs. Such outcomes reports would serve as proof to payors that the profession is measuring outcomes data in practical ways at treatment sites. They could also form the basis of future “best practice” guidelines that will enhance patient well-being and the development of practical treatment methodology that addresses cost containment, a major concern in today’s health care reform movement [1].
A model database program that has been specifically designed to gather, extract, and analyze clinical data and generate outcomes reports on that data is presented. The reports generated by the small model program presented here are a subset of a complete range of graphical analytic reports on patient outcomes that will be generated by the final working version of the program.
The outcomes database presented differs from a conventional clinical database in one important respect: all information is entered in the form of integers. No text, financial, date, or time data are entered. The outcomes reports that the program generates, however, can present data in textual and graphic format. Although the small model used consists of one program, the actual working version will consist of two separate programs, one to enter data into the database from the clinic floor and another to query the database and generate outcomes reports.
For the purpose of demonstrating the model database presented here, records of 50 imaginary patients have been entered. Each patient has one of five grades of diabetic ulcer and has been treated with one of five common treatment modalities. The database analyzes post-treatment effects of these treatment modalities on three outcomes parameters: pain, time lost from work, and ability to walk.
In a real situation, patients would be polled, as stated above, by mail or telephone to determine the effects of the treatments; they can answer the poll questions by entering a number from one (best or least) to ten (worst or most) for each item on the questionnaire. Fictional integer responses have been supplied by the author so that outcomes reports can be generated by the program.
Hypothesis
The author agrees with Wray et al [2] that a conventional clinical database is inadequate for outcomes research. The requirements of an administrative database interfere with the computer’s ability to generate rapid, accurate charts and graphs and fill the database’s hard drive with useless patient identification, history text, and scheduling information.
It is the author’s hypothesis that an outcomes database would operate most efficiently if:
1) it were separate and distinct from an institution’s administrative database, and
2) it contained only integers in its fields and tables, with no identification of individual patients other than through an assigned patient number.
Efficiency would be best served by having two outcomes database management programs: an operational-level outcomes database management program for data entry in the clinic, and a strategic-level program to generate outcomes reports by the institution’s senior planning officers. Both outcomes database management programs would access the same database and would reside on client computers in a client-server system on the same local area network.
Regarding the second part of the hypothesis, the author believes that an outcomes database management program should have as part of its data engine an integer manipulation and conversion system. It is through the manipulation and conversion of integer numbers that any collection of chaotic or raw data can best be metamorphosed into an ordered and useful document of information. “Data” are what a computer stores in a database on a hard drive; “information” is the intelligent communication to the user generated by the database’s front-end program.
An integer manipulation and conversion system requires only a set of agreed-upon tables of correspondence (mapping tables) that can vary from system to system but which must remain constant within each system. The tables of correspondence in this article’s outcomes database management program consist of integer sets in the domain of 1 to 10 and their textual correspondences when these are necessary.
There is a trade-off here. The less textual presentation an outcomes database management program requires, the faster and more efficiently it will operate, but the less comprehensible it will become to the people operating it. The algorithms that assign set members to the tables of correspondence in the present outcomes database management program are contained in SELECT CASE statements embedded in the front-end code. These front-end set members provide for those reversible mapping operations that are necessary for the conversion of data to information, while at the same time, allowing the database tables to “see” only integers.
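The reversible mapping described above can be sketched briefly. The article's program uses SELECT CASE statements in Visual Basic; the sketch below swaps in Python dictionaries to show the same idea. The table contents and the function names `to_text` and `to_code` are assumptions made for illustration; only "local debridement" is named in the article, and the other labels are placeholders.

```python
# Hypothetical table of correspondence: integer code -> display text.
# Only "local debridement" comes from the article; the rest are placeholders.
TREATMENT_LABELS = {
    1: "local debridement",
    2: "treatment modality 2",
    3: "treatment modality 3",
    4: "treatment modality 4",
    5: "treatment modality 5",
}


def to_text(table, code):
    """Convert a stored integer code to its label (data -> information)."""
    try:
        return table[code]
    except KeyError:
        raise ValueError(f"code {code!r} is not in the table of correspondence")


def to_code(table, label):
    """Reverse mapping: convert a label back to the integer that is stored."""
    reverse = {text: code for code, text in table.items()}
    try:
        return reverse[label]
    except KeyError:
        raise ValueError(f"label {label!r} is not in the table of correspondence")
```

Because the table is fixed within one system, converting a code to text and back returns the original integer, so the database tables themselves never need to hold text.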
The heart of the integer manipulation and conversion system lies not in higher mathematics but in simple number theory. The author believes that integers are the number system of choice for several reasons. Natural numbers (where, if a − b = c, a must be ≥ b for c to exist) are unsuitable because, lacking negative values, they are closed under addition but not under subtraction, and they cannot represent points to the left of the y axis or below the x axis of a Cartesian graph. Real (floating point) numbers are more precise, but the number of subdivisions between any two values (eg, between 0.0 and 1.0) is infinite. For purposes of database storage, real numbers could be used only in carefully defined subsets. The integer family, on the other hand, contains in its virtually infinite domain the complete set of natural numbers (including the zero element) as separate, discrete data points and, in addition, admits negative values so that in a − b = n, n can be negative and can be plotted to the left of a Cartesian y axis. Integers form a commutative (abelian) group under addition, so that a + 0 = 0 + a = a and every subtraction a − b yields another integer.
Materials and Methods
User Interface (Front End)
An outcomes database management program has two major components: a front end, through which the user accesses and manipulates the underlying database, and a back end, the database itself. The model outcomes database management program was developed on a Gateway 2000® (Gateway 2000, North Sioux City, SD) 100-MHz Pentium personal computer using Microsoft Visual Basic 4.0® (Microsoft Corporation, Seattle) to create the front-end graphical user interface. The graphical user interface is designed to run in the Microsoft Windows 3.1® (Microsoft Corporation, Seattle) operating system and will also run in the new Windows 95® (Microsoft Corporation, Seattle) operating system in 16-bit mode. As Windows 95 becomes more widespread, the program can easily be recompiled into a 32-bit version that can use more sophisticated commands.
The graphical user interface incorporates three screens for data entry, data editing, and generation of outcomes reports. The screens are arranged in the form of three tabbed file cards. Users choose among the cards by clicking on the appropriate tab. They can then operate the program with command buttons, two-dimensional screen images that react to the presence of a mouse cursor. In the larger, final working version, the program will present users with drop-down menus and dialogue boxes that ask for text input or warn of impending errors. These boxes will be identical to the standard Windows dialogue boxes, containing no more than three command buttons (such as “ok” and “cancel”).
First Tabbed Screen: The Clinic File Entry Form. Data entry screens are called “forms.” The first form in this outcomes database management program contains a box that gives the identification number of the last patient entered in the database and asks the user for the next number. The other boxes on the form are labeled text-boxes that ask for the patient’s age, treatment modality, grade of ulceration, and type of ulcer. These boxes represent a small subset of data that, in a real-life situation, would be entered either from the clinic floor or from patient records.
As shown in Figure 1, this screen also contains a databound grid that gives a running read-out of the database table that contains the entered data. This allows the user to view individual patient records and edit them if necessary. The data are presented in a spreadsheet grid so the user can browse the database table and check individual entries for verification. This grid, which takes up much screen space, could either be placed on its own separate tabbed screen or even be eliminated in a working outcomes database management program.
Second Tabbed Screen: The Outcomes Entry Form. The second form is for the entry of data that would be obtained from a mail-out questionnaire called an “outcomes instrument.” Text boxes on the form ask for the patient’s response to questions concerning pain level, time lost from work, and ability to walk. These responses consist of numbers ranging from 1 to 10 in which 1 represents the best case and 10 represents the worst. A typical question on the outcomes instrument would take the following form: “How much pain have you had from your ulcer in a) the first 6 months after treatment, b) the second 6 months after treatment, and c) the second year following treatment?”
All data on both tabbed screens will be entered as integer numbers. Each text box will self-validate the data entered into it and generate error messages if the user attempts to enter incorrect data. Additional methods of ensuring data integrity are constraints and default masks, programming techniques that force the entry of data in a particular format.
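The self-validation of a text box can be illustrated with a minimal sketch. The function name `validate_score` and its error messages are hypothetical; the 1-to-10 range comes from the outcomes instrument described above, and the article's own validation code in Visual Basic is not shown in the text.

```python
def validate_score(raw, low=1, high=10):
    """Return the entry as an int if it is a whole number in [low, high].

    Raises ValueError with a user-readable message otherwise, which a form
    could display as its error message.
    """
    text = str(raw).strip()
    # Accept plain ASCII digits only: no signs, decimals, or letters.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"{raw!r} is not a whole number")
    value = int(text)
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the allowed range {low} to {high}")
    return value
```

A call such as `validate_score(" 7 ")` returns the integer 7, whereas `validate_score("11")` or `validate_score("seven")` raises an error before anything reaches the database.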
Third Tabbed Screen: Outcomes Report. Data are retrieved in structures called “reports.” The design of the report determines how the data are presented on-screen. The data are extracted from the database tables by means of structured query language queries. These reports present graphical and statistical analyses of the parameters that were entered into the forms on the first two screens. The results will be graphically illustrated in the form of line, bar, or pie charts.
Figure 3 shows an outcomes report in the form of a two-dimensional bar chart that has been generated by querying the database with a structured query language statement.
Figure 2.
The second tabbed screen. Information taken from the patient’s responses to the questionnaire (outcomes instrument) is entered into the database. As in the first tabbed screen, the patient is identified only by an assigned number and all data are entered into labeled text boxes as numbers (integers).
Figure 3.
An outcomes report generated by a structured query language query and presented on screen in the form of a two-dimensional bar chart. In this query, the user has asked to see the pain levels during the first 6 months, the second 6 months, and the second year following treatment of diabetic ulcers by local debridement in three different age groups of patients. This chart could be printed in color on an inkjet printer for a permanent record.
The outcomes database management program also accesses a printer to produce a hard copy of the graphic reports.
Database (Back End)
The database for this outcomes database management program was developed using Microsoft Access 2.0® (Microsoft Corporation, Seattle). Microsoft Access 2.0 is a three-dimensional relational database in which the data are stored in structures called “tables,” small spreadsheet structures with rows and columns. The columns are called “fields” and each row is a “record” of one or more fields. Each field in a record will be atomic in nature, that is, it will contain a unit of information so small that the unit cannot be divided any further. When every field in a table contains atomic data bits, the table is said to be in first normal form (1NF).
Having nonatomic data in a field is considered inefficient by database programmers. In the database’s design phase, compound data are divided between two smaller fields, and then divided again if necessary, until each field contains only one indivisible data item. This reduction division is referred to as decomposition or normalization. Normalization of data facilitates searching procedures and helps to maintain data integrity.
At least one field of each record will be selected as a “key” field for indexing purposes. Key fields in a record are designated as either primary or foreign. The primary key is the first searching landmark and must contain unique information (the patient’s assigned identification number, in this case).
In a conventional administrative database, data decomposition or normalization is usually taken to the third normal form (3NF). This level of normalization forces each table into a monothematic mode. That is, each table will store only a narrow segment of the entire database so that each table carries a small burden of information, and the database will then consist of many small tables joined to one another by key fields. Thus, a typical, conventional clinical database will be made up of many small monothematic tables whose records contain atomic data bits. In the outcomes database presented here, by contrast, all fields will be of the integer data type.
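An all-integer, two-table design with primary and foreign keys might look like the sketch below. It uses SQLite through Python's standard `sqlite3` module rather than Microsoft Access, and every table name, column name, and range constraint is an assumption made for illustration; the article does not list its actual field definitions.

```python
import sqlite3

# Hypothetical schema: every column is an integer, and patients are
# identified only by an assigned number (the primary key).
SCHEMA = """
CREATE TABLE clinic (
    patient_id INTEGER PRIMARY KEY,
    age        INTEGER NOT NULL,
    treatment  INTEGER NOT NULL CHECK (treatment BETWEEN 1 AND 5),
    grade      INTEGER NOT NULL CHECK (grade BETWEEN 1 AND 5),
    ulcer_type INTEGER NOT NULL
);
CREATE TABLE outcomes (
    patient_id INTEGER NOT NULL REFERENCES clinic (patient_id),
    period     INTEGER NOT NULL CHECK (period BETWEEN 1 AND 3),
    pain       INTEGER NOT NULL CHECK (pain BETWEEN 1 AND 10),
    work_lost  INTEGER NOT NULL CHECK (work_lost BETWEEN 1 AND 10),
    walking    INTEGER NOT NULL CHECK (walking BETWEEN 1 AND 10),
    PRIMARY KEY (patient_id, period)
);
"""


def create_database(path=":memory:"):
    """Create the two integer-only tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
    conn.executescript(SCHEMA)
    return conn
```

Here `period` is assumed to code the three follow-up intervals named on the questionnaire (1 for the first 6 months, 2 for the second 6 months, 3 for the second year). The foreign key ensures that every questionnaire response is matched to an existing patient number.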
Queries will be made using structured query language. By this method, information from several tables can be combined into one chart or graph. Each graph presents a dynaset or snapshot of any combination of parameters from the stored data. This can reveal relationships or effects that would otherwise remain hidden in the mass of digitized data. From these graphs and charts, reasonable and informed judgments can be made regarding the outcome of the treatment modalities under consideration.
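A query of the kind behind the bar chart in Figure 3 can be sketched as follows, again using SQLite in place of Access. The reduced table layout, the age-group cutoffs (under 50, 50 to 69, 70 and over), the function names, and the four demonstration records are all assumptions for illustration; the article does not give its query text or its age-group boundaries.

```python
import sqlite3


def pain_by_age_group(conn, treatment):
    """Mean pain score per age group and follow-up period for one treatment code."""
    sql = """
        SELECT CASE WHEN c.age < 50 THEN 1
                    WHEN c.age < 70 THEN 2
                    ELSE 3 END      AS age_group,
               o.period,
               AVG(o.pain)          AS mean_pain,
               COUNT(*)             AS n
        FROM clinic AS c
        JOIN outcomes AS o ON o.patient_id = c.patient_id
        WHERE c.treatment = ?
        GROUP BY age_group, o.period
        ORDER BY age_group, o.period
    """
    return conn.execute(sql, (treatment,)).fetchall()


def build_demo():
    """Build a tiny in-memory database holding fictional integer records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE clinic (patient_id INTEGER PRIMARY KEY, age INTEGER, treatment INTEGER);
        CREATE TABLE outcomes (patient_id INTEGER, period INTEGER, pain INTEGER);
    """)
    conn.executemany("INSERT INTO clinic VALUES (?, ?, ?)",
                     [(1, 45, 1), (2, 48, 1), (3, 66, 1), (4, 75, 2)])
    conn.executemany("INSERT INTO outcomes VALUES (?, ?, ?)",
                     [(1, 1, 6), (2, 1, 4), (3, 1, 8), (4, 1, 2)])
    return conn


rows = pain_by_age_group(build_demo(), treatment=1)
# rows == [(1, 1, 5.0, 2), (2, 1, 8.0, 1)]
```

Each returned row is an aggregate rather than an individual record, which is the form a charting component would need to draw one bar per age group and period.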
Outcomes Instrument
The development of a reliable, easy-to-understand outcomes instrument is of particular importance for the accuracy of an outcomes research project. Usually, this instrument takes the form of a questionnaire that is mailed out to the patient at specified times following treatment.
Several instruments have already been developed for various projects. In 1991, Wachtel et al [3] developed the Medical Outcomes Study Short Form Health Survey as an indicator for the quality of life in patients with human immunodeficiency virus. Factors assessed by this instrument included incidents of memory loss, seizures, chills, fevers, diaphoresis, weight loss, and dyspnea. Wilkerson et al [4] examined functional status and gain as predictors of efficient resource allocation at rehabilitation facilities.
Whiteneck et al [5] used the Craig Handicap Assessment and Reporting Technique to quantify the extent of handicap in individuals. This instrument used scaling and scoring procedures for determining dimensions of handicap identified and described by the World Health Organization.
The Minnesota Living with Heart Failure questionnaire was tested for validity as a primary outcome measure in conjunction with an exercise test by Rector and Cohen [6]. They used a double-blind, randomized, placebo-controlled 3-month trial of an investigational medication on 198 ambulatory heart patients with class III heart failure. In cases of joint or chest pain, elderly patients enrolled in HMOs were queried by a household telephone survey to compare their outcomes with those of fee-for-service patients having similar conditions [7]. An effective outcomes measuring instrument was used by the Center for Epidemiological Studies (Depression) in 1994 [8]. They queried geriatric patients on health status (including medications), functional ability (activities of daily life), and affect.
The data for this model database are fictional and consist of integers entered by the author. In a real-life situation, the data would be obtained from a mail-out questionnaire. Each questionnaire would be accompanied by a self-addressed, stamped envelope for easy reply. Patients who did not reply would be contacted by telephone. The various parameters on the questionnaire would be quantified by simple scales of 1 to 10. Information from the questionnaire would be entered directly into the second tabbed screen of the graphical user interface (Fig. 2).
Discussion
The purpose of a conventional database is to store data elements that have known relationships and to facilitate the retrieval of those elements by searching and sorting techniques. Great importance is usually attached to the detailed identification of individuals (patients, clients, and customers) and to their individual records in the data file.
The goal of an outcomes database, on the other hand, is to determine unknown or unseen relationships between 1) a compendium of patients’ clinical records and 2) a collection of information obtained from outcomes instruments filled out by those patients at various times following treatment. As long as each instrument is correctly matched with the appropriate patient, the detailed identification of each patient is unimportant. The results of an outcomes database will be generated not by the detailed examination of individual records but by viewing aggregations of data presented in statistical or graphical format. In this way, relationships that were not apparent when the data were entered should become visible.
An outcomes research database need not be large. While a conventional administrative database may need to contain 50,000 or more patient records complete with text and graphic images, an outcomes research database could return useful query information based on the outcomes responses of only a few hundred patients. If all table entries were in numerical (integer) format, then the overall database would be compact and query responses would be swift.
In a conventional database, both searching and sorting procedures are important. Sorting, usually by selection and exchange along a binary tree, is especially relevant to the management of textual or string data. At this stage of the author’s research, it does not appear that sorting is particularly relevant in an outcomes database, especially as no string data will be manipulated by the front-end program. This view may need to be modified with further experience in the field using much larger amounts of data.
Popular ideas of table normalization may also have to be modified in an outcomes database. At present, it appears that a few large monothematic tables with atomic data elements may be the most efficient storage model for the integer data used in this model, however unwieldy such a model may appear on an entity-relations diagram. This is because a single table of integers taken to 1NF has no need of further decomposition to enhance efficiency. The present model database contains only two tables, each at 1NF. This model may also prove untenable when large amounts of data need to be manipulated.
The statistical analysis of the outcomes results returned by the instruments presents a few problems. The chief problem is that while outcomes polls are not controlled clinical experiments, they are sometimes treated as if they were. It must be remembered that present-day outcomes research uses large quantities of anecdotal evidence of varying reliability. Orchard [9] suggests that adjusting reports for case mix may help reduce error in the quality and effectiveness of the reports. Ebert [10] points out that, if statistical analyses are to be implemented in everyday clinical practice, they must be formulated succinctly and be able to inform patients of the anticipated course of their disease. Keith [11] maintains that the utility of outcomes measures rests on their conceptual foundations.
In most patient outcomes studies, there is no control group. Therefore, strict measures must be taken to ensure statistical effectiveness. D’Agostino and Kwan [12] suggest that, in cases such as this, statistical adjustments such as matching or covariance analysis be used to adjust for inequalities or for biases between the treatment groups. Moses [13] states that, if a database approach is chosen, adjustment variables should be included in the database entries to give the effect of a randomized controlled trial. This is especially important when the factor of recollection error, particularly among elderly patients, is taken into account [14].
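To show why adjustment matters when treatment groups differ in case mix, the sketch below applies direct standardization by ulcer grade. This is a simpler stand-in for the matching and covariance analysis proposed by the cited authors, not a reproduction of their methods. The function name `adjusted_means` and the eight records are fictional and chosen only for illustration.

```python
from collections import defaultdict


def adjusted_means(records):
    """records: iterable of (treatment, grade, score) integer tuples.

    Returns {treatment: (crude_mean, grade_adjusted_mean)}. The adjusted mean
    weights each grade's mean score by that grade's share of all patients,
    using only the grades in which the treatment has at least one patient.
    """
    grade_counts = defaultdict(int)    # grade -> number of patients overall
    cells = defaultdict(list)          # (treatment, grade) -> scores
    by_treatment = defaultdict(list)   # treatment -> scores
    for treatment, grade, score in records:
        grade_counts[grade] += 1
        cells[(treatment, grade)].append(score)
        by_treatment[treatment].append(score)

    result = {}
    for treatment, scores in by_treatment.items():
        crude = sum(scores) / len(scores)
        weighted_sum, weight_total = 0.0, 0
        for grade, count in grade_counts.items():
            cell = cells.get((treatment, grade))
            if cell:
                weighted_sum += count * (sum(cell) / len(cell))
                weight_total += count
        result[treatment] = (crude, weighted_sum / weight_total)
    return result


# Fictional data: treatment 1 was given mostly to mild (grade 1) ulcers,
# treatment 2 mostly to more severe (grade 2) ulcers.
demo = [(1, 1, 2), (1, 1, 2), (1, 1, 2), (1, 2, 6),
        (2, 1, 3), (2, 2, 7), (2, 2, 7), (2, 2, 7)]
summary = adjusted_means(demo)
# summary == {1: (3.0, 4.0), 2: (6.0, 5.0)}
```

In this fictional example the crude mean pain scores differ by 3 points, but after weighting by the overall grade distribution the difference shrinks to 1 point; most of the apparent advantage of treatment 1 came from its being used on milder ulcers.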
Regarding the use of computer technology in outcomes research, Grigsby et al [15] devised a simulated neural network to sample retrospective data from 387 orthopedic physical rehabilitation patients between the ages of 60 and 89 years. They measured functional capacity as a factor of age. Their networks had a claimed accuracy of 86% to 91% in predicting outcomes, costs, and length of hospital stay.
Wray et al [2] investigated the feasibility of using the clinical data in an administrative database to evaluate outcomes and health care costs in a large hospital setting. They caution against the possible inadequacy of the clinical data contained in such a database and suggest, like Keith, that the major stumbling block in such a study is the lack of a conceptual framework to guide the analysis. Finally, Zielstorff [16] outlined information system requirements for accuracy in capturing clinical outcome data from large multipurpose databases.
Conclusion
Outcomes research is a branch of medical informatics (information management) that can become an adjunct to clinical decision-making and also a means of patient-physician communication [17, 18, 19]. The goal of outcomes research is to formulate logical connections between various methods of therapy and the overall health and well-being of the patient while maintaining reasonable fiscal responsibility to payors of the managed care delivery system.
In the author’s opinion, the backbone of effective patient outcomes research is a properly designed and executed database. The patterns revealed by a properly designed outcomes database should provide useful information that could be transmitted to third-party payors on the database’s health care delivery system. This information should allow the payors to make informed choices based on cost-effectiveness and quality care from among various services available to them within that delivery system. At the same time, the administrators of the health care system, by an assessment of outcomes, can determine the best practice guidelines based on success rates, real cost, and patient well-being and satisfaction.