Friday, September 14, 2007

The Blossoming of Standards


It recently came to my attention that the National Institute of Neurological Disorders and Stroke (NINDS) has been conducting a project to define a data element dictionary that can be used in trials it sponsors. The intent is both to facilitate data collection and storage and to promote data sharing. While there are many similarities in the data needed for different therapy areas, there are also many differences in both content and structure, which makes creating a single data dictionary something of a challenge.

Tackling this challenge would be a tall order within a single company, but there everyone (at least in theory) reports into the same organization, and, provided management has the intestinal fortitude, common approaches can be mandated. The NINDS project faces an even greater challenge: it has far less control over the structure of the studies it sponsors, and thus must offer carrots rather than sticks (i.e., give investigators incentives to cooperate rather than force them to do so).


This project is an example of an increasingly popular trend toward sharing data in the academic and government research environments, and to a lesser extent in industry (the latter driven mostly by the need to create consistently structured data for regulatory submissions). CDISC is the standards effort most familiar to those of us in industry, and it is aimed primarily at regulatory agencies. Outside of industry there is HL7, which is (among other things) a set of messaging standards for defining and exchanging data between healthcare providers. There is also the National Cancer Institute with its caBIG project, a set of data dictionaries and processes for standardizing the definition, collection and storage of data from NCI-sponsored oncology trials. There may well be others.


It is wonderful to see such awareness of and enthusiasm for standardization. This can only help to increase efficiency, reduce costs, and enhance the safety and efficacy of medical treatments. On the other hand, it is a concern that, although these organizations do have some awareness of each other's work, they are still working mostly independently. Not only is there some duplication of effort, but they will inevitably make different decisions and produce slightly different results. Anyone involved in such a project understandably has a tremendous sense of ownership and will be committed to his or her own decisions. At some point the various schemas will have to be harmonized, and this will require a great deal of work both to achieve the harmonization and then to apply the results back to the source systems. Because of this, it is critical that the organizations already developing standards be open, transparent and vocal about their work, so that any other organization contemplating a standards project will join an existing one and work toward ensuring its needs are met there. By the same token, it is incumbent upon the existing projects to be open to the needs of others, so that the number of projects stays limited and eventual integration is not even more daunting.

Photo: Phyllachne colensoi. Haast Pass, South Island, New Zealand. Flowers approx 3 mm diameter. c. 2006, Kit Howard.

Wednesday, May 30, 2007

The Washington Monument was Built by Aliens!

It has been said that figures don’t lie but liars figure. In his book Fads and Fallacies (1) Martin Gardner presented this cautionary tale as part of his examination of the numerical myths associated with the Great Pyramid in Egypt.

… If one looks up the facts about the Washington Monument in the World Almanac, he will find considerable fiveness. Its height is 555 feet and 5 inches. The base is 55 feet square, and the windows are set at 500 feet from the base. If the base is multiplied by 60 (or five times the number of months in a year) it gives 3,300, which is the exact weight of the capstone in pounds. Also, the word "Washington" has exactly ten letters (two times five). And if the weight of the capstone is multiplied by the base, the result is 181,500 —a fairly close approximation of the speed of light in miles per second. If the base is measured with a "Monument foot",(2) which is slightly smaller than the standard foot, its side comes to 56½ feet. This times 33,000 yields a figure even closer to the speed of light.

And is it not significant that the Monument is in the form of an obelisk—an ancient Egyptian structure? Or that a picture of the Great Pyramid appears on a dollar bill, on the side opposite Washington's portrait? Moreover, the decision to print the Pyramid (i.e., the reverse side of the United States seal) on dollar bills was announced by the Secretary of the Treasury on June 15, 1935—both date and year being multiples of five. And are there not exactly twenty-five letters (five times five) in the title, "The Secretary of the Treasury"?

It should take an average mathematician about fifty-five minutes to discover the above "truths," working only with the meager figures provided by the Almanac.
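Gardner's point is easy to demonstrate: the "fiveness" takes no mathematical skill at all to manufacture. A few lines of Python (using only the figures quoted above, not re-measured) reproduce the arithmetic:

```python
# Reproducing the "fiveness" arithmetic from Gardner's quotation.
base_ft = 55                      # base of the monument, in feet
factor = 5 * 12                   # "five times the number of months in a year" = 60
capstone_lb = base_ft * factor    # 55 * 60 = 3,300 pounds, the quoted capstone weight

assert capstone_lb == 3300
assert len("Washington") == 10                      # "exactly ten letters (two times five)"
assert capstone_lb * base_ft == 181_500             # "fairly close" to the speed of light
                                                    # only if you squint (it is ~186,282 mi/s)
assert sum(c.isalpha() for c in "The Secretary of the Treasury") == 25
```

That the assertions all pass says nothing about the monument and everything about how many combinations of a handful of numbers one is free to try.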

One definition of the quality of data is its “fitness for use”, which implies that the use has been defined. There is increasing interest at the FDA and within many companies in establishing data warehouses that will allow for meta-analyses orders of magnitude larger than any possible today. We must be careful that, in our haste to mine the data for interesting relationships, we do not use the data in ways that were never intended and so create our own mythologies.

(1) Martin Gardner, Fads & Fallacies in the Name of Science. Dover Publications, New York, c. 1957, p. 179.
(2) Derived by dividing the length of one building stone by 25, i.e., five times five.

Tuesday, April 24, 2007

Absence of Evidence...

In archaeology there is a popular phrase that says that “Absence of evidence is not evidence of absence.” In essence, it means that the fact that nothing was found does not mean that nothing exists. It merely means that it wasn’t found. For example, let’s assume that I conduct an archaeological dig in my back garden in Michigan and I find no indication that there was a Potawatomi settlement there. There are those who would then claim that these Native Americans did not live here, whereas all that can legitimately be stated is that I did not find any evidence of them. In fact, given the topography and location, it’s quite likely that there was at least a seasonal camp nearby, but based on my research I cannot make an accurate statement one way or the other.

A similar issue exists in drug development. There has been a long-standing dispute between Clinical Data Management and most of the other functional areas involved in clinical trials, centered on the use of “Not Done” and “None” data fields. A “None” box is checked on the Case Report Form (CRF) when there were no findings of a particular sort for a study subject; this is typically used to record that no adverse events were observed, or that no concomitant medications were taken. Similarly, the “Not Done” box is checked when a test or evaluation was not done. This avoids leaving a blank on the CRF, which would usually be queried by the monitor or the data manager.

There has been significant resistance to using these data fields. They are administrative fields, in that they do not contain data that are analyzed, and given that every additional field collected increases the costs of the trial, they would seem to be good candidates to drop. In addition, the FDA has typically said that they want to see only the data that represent that something happened, and not lists of “Not Done” and “None” fields. The CDISC submission standards do not define “Not Done” fields, so organizations that are modeling their entire data stream on CDISC have nowhere to put these fields. Finally, the fields create opportunities for superfluous data queries when they have been marked but there are also data values present.

All of these are good reasons for dropping “None” and “Not Done,” but they are all trumped by one fact: if “None” and “Not Done” are excluded, we cannot tell whether the tests were not performed, or whether they were performed but the site did not complete the CRF. These are obviously two very different situations. For example, an ECG CRF that records no findings will mean one thing if the test results were not transferred from the source documents, something very different if the test was not done, and something different again if the test was done and there were no abnormal findings. In the first and third cases, we can retrieve the results and know whether or not there were significant findings. In the second case, we can’t know one way or the other.

It is important to note that there is no reason why the presence or absence of “Not Done” and “None” should affect the structure of the submission, nor the data presented in tabulations and listings. Subjects who had no findings and those whose evaluations were not done can easily be suppressed from the display. The net effect is to increase the reliability of the displays, as reviewers can be certain that blanks and absent records indicate that the data are really not available, rather than just omitted. We can then be certain that, for the purposes of analysis and submission, the absence of data really is evidence of absence.
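As a small illustration of this principle (the field names here are invented for the example, not drawn from any actual CDISC domain), a “Not Done” flag lets cleaning logic distinguish genuinely missing data from an explained blank, while the administrative records are still suppressed from the listing:

```python
# Hypothetical ECG records: 'result' holds the finding, 'not_done' is the CRF checkbox.
# Field names are illustrative only, not taken from any real CDISC domain.
records = [
    {"subject": "001", "result": "NORMAL", "not_done": False},  # done, no abnormal findings
    {"subject": "002", "result": None,     "not_done": True},   # test not performed
    {"subject": "003", "result": None,     "not_done": False},  # data missing: query the site
]

def needs_query(rec):
    """A blank result is only a problem when the test was supposedly done."""
    return rec["result"] is None and not rec["not_done"]

def listing(recs):
    """Display listing: show only records with actual findings."""
    return [r for r in recs if r["result"] is not None]

queries = [r["subject"] for r in records if needs_query(r)]
# Only subject 003 is queried; subject 002's blank is explained by the Not Done flag,
# and neither administrative record appears in the listing.
```

The flag costs one field on the CRF but removes the ambiguity entirely; without it, subjects 002 and 003 are indistinguishable.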
Photo courtesy of Archaeology Magazine, Archaeological Institute of America

Friday, March 30, 2007

The Missing Link in Clinical Data Standards

The term “clinical data standards” means different things to different people. In the world of clinical trials, it has traditionally meant having case report forms and a database structure that are reusable from study to study. At the DIA CDM Annual Meeting last week in Orlando, Dr. Steve Wilson observed that, from the FDA’s perspective, the content, format and uses for data in regulatory submissions have evolved over the years. Initially, the push was to make the transition from paper to electronic submissions. With that, the need for structural standards became apparent, and hence the development of CDISC. We now recognize that the format of the data needs to be standardized, and CDISC is developing standard data collection modules (CDASH) and controlled terminology (e.g., code lists) to address this need. The FDA hopes to reap the benefit of this work in Janus, a data repository that will eventually house all submission data. This will allow them to monitor drug safety much more proactively.

These are all excellent developments, and bring us closer to an environment where standard data viewing and analysis tools are feasible and where databases and programs can be reused. I would argue, however, that there is one more step we need to take in order to be truly standardized. We are doing much to define the structure of data, but have very little in place to define its content. By this I don’t mean the terminology used to categorize data, but rather the processes and assumptions inherent in generating and collecting the data.

For example, suppose you want to analyze the emergence of adverse events in a particular drug class. You access the merged safety database, assign each AE to a time interval based on its start date, and then count the AEs and compare the intervals. It seems straightforward, until you realize that some studies started collecting AEs when the informed consent was signed, some started at first dose, and some started collecting serious AEs when the IC was signed and all AEs at first dose. From the point of view of structure and terminology, the data are standard, but they are not suitable for this analysis, and while the collection starting point would be defined in the protocol, that information would rarely accompany the data.
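A minimal sketch makes the trap concrete. The records and dates below are invented for illustration: Study A collected AEs from informed consent, Study B only from first dose, yet nothing in the pooled data says so:

```python
from datetime import date

# Hypothetical pooled AE records; dates are invented for illustration.
# Study A collected AEs from informed consent; Study B only from first dose.
aes = [
    {"study": "A", "start": date(2007, 1, 3),  "first_dose": date(2007, 1, 10)},
    {"study": "A", "start": date(2007, 1, 20), "first_dose": date(2007, 1, 10)},
    {"study": "B", "start": date(2007, 2, 15), "first_dose": date(2007, 2, 1)},
]

def interval(ae, days=14):
    """Assign an AE to a 14-day interval relative to first dose (negative = pre-dose)."""
    return (ae["start"] - ae["first_dose"]).days // days

counts = {}
for ae in aes:
    counts[interval(ae)] = counts.get(interval(ae), 0) + 1

# counts maps interval -> number of AEs. Interval -1 (pre-dose) can only ever
# contain Study A events, because Study B never collected them: the pooled
# pre-dose count is biased low, not merely sparse, and the data cannot tell you.
```

The structure and terminology are perfectly standard throughout; only the undocumented collection rules make the comparison invalid.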

This lack of definition of the processes and assumptions is, I believe, the greatest threat to data quality. In order for data to be truly comparable we must know not just their electronic characteristics but also the processes and assumptions involved in generating them. Only then can we know when the data we have are fit to answer the questions we want to ask.