## Regression: A richly textured method for comparison and classification of predictor variables

### Regression: A richly textured method for comparison and classification of predictor variables: The multivariable Case

Author: Larry H. Bernstein, MD

e-mail: plbern@yahoo.com.

Keywords:  bias correction, chi square, linear regression, logistic regression, loglinear analysis, multivariable regression, normal distribution, odds ratio, ordinal regression, regression methods

Abstract

Multivariate statistical analysis extends regression analysis to two or more predictors. In this case a multiple linear regression or a linear discriminant function is used to predict a dependent variable from two or more independent variables. If there is a linear association, dependency of the variables is assumed, and the test of hypotheses requires that the variances of the predictors are normally distributed. A log-linear model circumvents this distributional dependency in a method called ordinal regression. Analysis of variance, a method of examining differences between the means of two or more groups, is also related. Linear discriminant analysis is a method by which we examine the linear separation between groups rather than the linear association between groups. Finally, the neural network is a nonlinear, nonparametric model for classifying data with several variables into distinct classes. In this case we might imagine a curved line drawn around the groups to divide the classes. This discussion focuses on the use of linear regression and explores other methods for classification purposes.

#### Introduction

Multivariate statistical analysis extends regression analysis and introduces combinatorial analysis for two or more predictors. Multiple linear regression or a linear discriminant function is used to predict a dependent variable from two or more independent variables. If there is a linear association, dependency of the variables is assumed, and the test of hypotheses requires that the variances of the predictors are normally distributed. Linear discriminant analysis examines the linear separation between groups rather than the linear association between groups, and it also requires adherence to distributional assumptions. Analysis of variance, a method of examining differences between the means of two or more groups, is related as a special case of linear regression. A log-linear model circumvents the distributional dependency in a method called ordinal regression. Finally, the neural network is a nonlinear, nonparametric model for classifying data with several variables into distinct classes. In this case we might imagine a curved line drawn around the groups to divide the classes.

Regression analysis.

The use of linear regression, linear discriminant analysis and analysis of variance has to meet the following assumptions:

The variables compared are assumed to be independent measurements.

The correlation coefficient is a useful measure of the strength of the relationship between two variables only when the variables are linearly related.

The correlation coefficient is bounded in absolute value by 1.

All points on a straight line imply correlation of 1.

Correlation of 1 implies all points are on a straight line.

A high correlation coefficient between two variables does not imply a direct effect of one variable on another (both may be influenced by a hidden explanatory variable).

The correlation coefficient is invariant to linear transformations of either variable.

The correlation coefficient is also expressed as the covariance or product of the deviations of X and Y from their means standardized by dividing by the respective standard deviations.

These assumptions may be valid if the amount of data compared is very large, and if the data is parametric.  This is not necessarily the case.  There are also special applications in laboratory evaluations and crossover studies between methods and instruments that require correction for bias or for differences in the error variance term.

How do we measure the difference, if there is any? We use the t-test (19, 21). If t is small, then the null hypothesis is satisfied and no difference is detected in the means. The conclusion is that the null hypothesis is accepted and the means are essentially the same. However, the ability to accept or reject the null hypothesis is dependent on sample size, or power. If the null hypothesis is rejected, bias has to be suspected. This is useful when analyzing certain data where the results of ordinary least squares (OLS) are unsatisfactory. The test is here applied to linearly increasing values of Y on X measured by methods A and B. Of course the measurements are plotted and a line is fitted to the scatterplot. OLS gives the fit of the line based on the least squares error, where the slope of the line is given by the following (20,22):

$$B = \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

It is assumed that there are n pairs of values of x and y, and xi and yi denote the ith pair of values. The slope defines the regression line of y on x. An intercept that differs from zero is the bias. It is worthwhile to mention that there is a difference here between the correlation measurement and the least squares fit of y on x. We are measuring X by methods A and B. We can then determine that there is a linear association with r valued between 0 and 1 (negative values excluded). In the case of the regression model, we are predicting B from A by plotting B on y against A on x. Of course, experimentally, we are expecting the prediction to hold over a range of measurements, and the agreement drops off at some value of the coordinates (xi, yi).
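The slope formula above can be sketched in a few lines of Python. This is only an illustration: the function name `ols_fit` and the paired readings are made up, and the intercept is taken from the standard relation that the fitted line passes through the point of means.

```python
def ols_fit(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator and denominator of the slope formula in the text
    sxy = sum((xi - mean_x) * yi for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired readings: method A on x, method B on y.
# Method B reads 0.5 units higher than method A across the range.
method_a = [1.0, 2.0, 3.0, 4.0, 5.0]
method_b = [1.5, 2.5, 3.5, 4.5, 5.5]
slope, intercept = ols_fit(method_a, method_b)
print(f"slope = {slope:.3f}, intercept (bias) = {intercept:.3f}")
```

With these invented readings the slope is 1 and the nonzero intercept is the constant bias between the two methods.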

Multiple regression is an extension of linear regression where the dependent variable is predicted by several independent variables.

In this case, the extended equation is (23)

$$Y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_n x_n$$

The model assumes a linear relationship between many predictor variables and the dependent variable.  The model usually assumes that the independent variables are not correlated with each other, which may not be the case.  The model can be tested by stepwise removal of predictor variables to assess their contribution to the model.     The model is  considered to be parametric, and so it requires that the inputs are normally distributed.  The bs (or betas) are also partial correlation coefficients.  The partial F test is the measure of the contribution of each variable after all the variables are in the equation.
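As a rough sketch of how the coefficients of the extended equation can be estimated, the example below solves the normal equations B = (X'X)^-1 X'Y with plain Python lists. The helper names and the data are hypothetical, and the response is generated exactly from known coefficients so the result can be checked by eye. A statistical package would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve a*z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def multiple_regression(rows, y):
    """Return [b0, b1, ..., bn] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data generated exactly from Y = 2 + 3*x1 - 1*x2
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
response = [2 + 3 * x1 - x2 for x1, x2 in predictors]
coefficients = multiple_regression(predictors, response)
print([round(c, 6) for c in coefficients])
```

The singular-system check is where strongly correlated predictors, mentioned above, would cause trouble.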

Figures 1-3 are scatterplots of eGFR (glomerular filtration rate calculated by the MDRD equation) and of hemoglobin vs NT-proBNP, and a boxplot of NT-proBNP by WHO criteria for anemia. Figure 2 is a 3D plot of NT-proBNP spliced by eGFR and hemoglobin. The linear regression model is presented in Table I. The correlation coefficient (R) for the model is weak, but not insignificant. What do you think is the effect of the large variance in the dependent variable? Figure 3 is a 3D plot of eGFR and hemoglobin vs a transformed variable, age-normalized 1000*Log(NT-proBNP)/eGFR. The variance is reduced on the transformed variable. Table II is the regression model on the data. The correlation coefficient R is improved.

Analysis of variance (ANOVA) and Analysis of covariance (ANCOVA)

ANOVA is used if the dependent variable is continuous and all of the independent variables are categorical. One-way ANOVA is used for a single independent variable, and multi-way ANOVA is used for multiple independent variables. ANOVA is based on the general linear model. The F-test is used to compare the difference between the means of groups. The independent variable takes discrete values and is not used as a measure. The t-test can be used between each pair of groups. The goal of ANOVA is to explain the total variation found in the study. An example of this application is shown in Figure 4.

Figure 4.  BNP determined within ejection fraction above or within 40

Figure 5 shows the means and 95% confidence intervals for a comparison of D-dimer with positive or negative venous duplex scans. There are only two variables, so the corresponding ANOVA is one-way. The F-value is high and corresponds to a high t in the t-test. F is the same as t² and p = 0.0001 (Table III). Our interest here is in multiple variables, so we'll hold the discussion of difference testing between two variables.

Figure 5.

Table III.
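The statement that F is the same as t² for a two-group comparison can be checked numerically. The sketch below assumes the pooled-variance (equal variance) form of the t-test, for which the identity holds exactly. The D-dimer values are invented for illustration and are not the study data.

```python
from statistics import mean


def pooled_t(a, b):
    """Two-sample t statistic using the pooled variance estimate."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((v - ma) ** 2 for v in a)
    ss_b = sum((v - mb) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def one_way_f(groups):
    """One-way ANOVA F ratio: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical D-dimer values for negative and positive duplex scans
negative_scan = [210, 250, 300, 320, 280]
positive_scan = [900, 1100, 1250, 980, 1500, 1300]

t_value = pooled_t(negative_scan, positive_scan)
f_value = one_way_f([negative_scan, positive_scan])
print(f"t^2 = {t_value ** 2:.4f}, F = {f_value:.4f}")
```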

If some of the independent variables are categorical (nominal, ordinal or dichotomous) and some are continuous, ANCOVA is used. The ANCOVA procedure first adjusts the dependent variable on the basis of the continuous independent variable and then does ANOVA on the adjusted dependent variable.

Generalized linear and generalized additive models

Generalized linear models transform the response by assuming that a transformation of the expected response is a linear function of the predictor variables.   The variance of the response is a function of the mean response.   When the relationship between the parameters is not linear, a generalized linear model can’t be used.   A generalized additive model can be used to fit nonlinear data-dependent functions of the predictor.   Tree-based models are used for exploratory analysis and are related to clustering, which is a method for studying the structure of the data, creating clusters of data with similar characteristics.

Discriminant analysis

Discriminant analysis is a modification of the general linear regression model. The method is used to assign data to one of several distinct classes, which serve as the dependent variable. The linear regression model predicts based on a linear relationship between the dependent and the independent variables. They are codependent. In the discriminant function they are independent. The function determines a separation between the classes to which the data assigns patients. The goal is to assign a new incoming patient, based on the independent variables, to one of the different groups. The mathematical function can be linear, quadratic, or another function. Stepwise linear regression with removal or addition of variables is viewed in the same way. However, the discriminant function produces a separation between the classes rather than a line through them. The same qualifications about distributional assumptions that apply to multiple linear regression apply to the linear discriminant function, but the analysis of data on congestive heart failure, renal insufficiency and anemia partitioned with NT-proBNP, creatinine, age and hemoglobin concentration shown in Figure 6 and Table IV uses a quadratic equation. I re-classify the data using the transformed variable age-normalized 1000*Log(NT-proBNP)/eGFR presented in Figure 7 and Table V. The use of the logarithmic transform and removal of age and hemoglobin as predictors give impressive results.

Figure  6.

Table IV.

Figure 7.

Table V.

Mahalanobis D2

The Euclidean distance between two points with coordinates $(x_1, y_1)$ and $(x_2, y_2)$ is given by $D = \left([x_1 - x_2]^2 + [y_1 - y_2]^2\right)^{1/2}$. This is generalized for N-dimensional space, and the square of the distance is $D^2$. The two points are the centroids in a cloud of points in space separated by D, the Euclidean distance between the points in an N-dimensional space. The multiplication of a vector and the inverse of the variance-covariance matrix, $T^{-1}$, yields the linear discriminant functions. The Mahalanobis distance can be used to evaluate the distances of centroids and also the distances of objects towards the centroid of their class.
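A minimal two-variable sketch of the squared Mahalanobis distance follows, using the standard definition D² = (x − m)' T⁻¹ (x − m), where m is the centroid and T the variance-covariance matrix. The function name and the numbers are illustrative only. With an identity covariance matrix the result reduces to the squared Euclidean distance described above.

```python
def mahalanobis_sq_2d(point, centroid, cov):
    """Squared Mahalanobis distance of a 2-D point from a centroid.

    cov is the 2x2 variance-covariance matrix ((a, b), (c, d)).
    """
    dx = point[0] - centroid[0]
    dy = point[1] - centroid[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Inverse of a 2x2 matrix is (1/det) * ((d, -b), (-c, a));
    # the quadratic form (dx, dy) T^-1 (dx, dy)' is expanded by hand.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


# Identity covariance: the result equals the squared Euclidean distance.
euclidean_case = mahalanobis_sq_2d((3, 4), (0, 0), ((1, 0), (0, 1)))

# Larger variance on the first variable shrinks its contribution.
scaled_case = mahalanobis_sq_2d((3, 4), (0, 0), ((4, 0), (0, 1)))
print(euclidean_case, scaled_case)
```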

Logistic regression

The linear probability model (logistic regression) is the standard regression model applied to data for which the dependent variable is dichotomous (0,1). It fits a logistic function to the dependent variables valued at 0 or 1 and estimates the probabilities associated with each observation (24).  The predicted values from the model are interpreted as a probability that the response is a 1.  The test of significance of the model is the Maximum Likelihood Estimator (MLE).  The significance is determined by adjusting the parameters to maximize the likelihood of the observed data arising from the linear sum of the variables.

There are problems in using the linear probability model (49).

The residuals don’t have a constant variance so that estimates from regression are not best linear unbiased, therefore, not minimum variance.

Standard errors of regression coefficients can be erroneous giving invalid confidence intervals.

The predicted values from regression can range outside the interval [0,1], whereas probabilities are bounded by that interval

The linearity assumption inherently imposes constraints on the marginal effects of predictor variables that are not taken into account by the OLS estimation.

The linearity assumption implies that the marginal effect of a predictor is constant across its range.

The usual r squared measure is problematic.
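One of the problems listed above, predicted values falling outside [0,1], is easy to reproduce. The sketch below fits an ordinary least squares line to a made-up dichotomous outcome and, for contrast, evaluates the logistic function, which stays strictly between 0 and 1. The data and function names are hypothetical, and no logistic model is actually fitted here.

```python
import math


def fit_line(x, y):
    """Ordinary least squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * yi for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    return slope, mean_y - slope * mean_x


def logistic(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Made-up dichotomous outcome (0 or 1) against a single predictor
predictor = [1, 2, 3, 4, 5, 6]
outcome = [0, 0, 0, 1, 1, 1]

slope, intercept = fit_line(predictor, outcome)
linear_predictions = [intercept + slope * x for x in predictor]
logistic_values = [logistic(z) for z in (-10.0, 0.0, 10.0)]

print([round(p, 3) for p in linear_predictions])  # first < 0, last > 1
print([round(v, 5) for v in logistic_values])
```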

Ordinal regression

I now turn to the application of a special nonparametric regression program developed by Jay Magidson (GOLDmineR; Statistical Innovations Inc., Belmont, MA), referred to as ordinal regression, or universal regression (25-28). Let's look at the application of this tool, which makes outcomes analysis easy. This method brings a powerful tool to the analysis of laboratory data for clinical validation of diagnostic tests. It overcomes serious limitations of logistic analysis when there are more than two possible outcomes to consider. This has become more important as we introduce tests that have results that are affected by morbid conditions, so that a range of probabilities might be associated with scaled "dummy values" of the test (possibly because of hidden or unspecified variables).

Ordinal dependent variables are multivalued and have an ordered relationship to the predictor variable(s). Magidson (25-28), inspired by the work of Leo Goodman (29,30), suggests the existence of a single regression model that can accommodate dependent variables of any metric: dichotomous, ordinal, or continuous. This supermodel holds true under the assumption of bivariate normality and under other distributional assumptions, and it subsumes linear regression and logistic regression as special cases (25). It uses a log odds model fit, and the odds ratio is obtained from the log(odds ratio). In the linear probability model, the coefficients (bi) are partial correlation coefficients. In the logit model the coefficients are partial log(odds-ratio).

The monotonic regression of X on Y is described by:

$$E(Y \mid X = x) = \sum_{j=1}^{J} P_{j.x}\, y_j$$

Where Pj.x, the conditional probability of the occurrence of Y=yj (an ordinal dependent variable) given X=x (qualitative or quantitative predictor variables), is estimated from a sample of N observations using 2 steps.

1) Conditional logits $Y_{j.x}$ are predicted using the generalized logit model:

$$Y_{j.x} = a_j + (b_1 x_1^* + b_2 x_2^* + \dots + b_M x_M^*)\, y_j^*, \qquad j = 1, 2, \dots, J.$$

The Y-scores, which determine the ordering and relative spacing of the J outcomes, may be specified or, if unspecified, they are treated as model parameters and estimated with the other parameters. $y_j^*$, the relative Y-score, is the difference between $y_j$ and some Y-reference score $y_0$ defined as a weighted average of the original Y-scores.

2)      The predicted logits are transformed to predicted probabilities using the identity:

$$P_{j.x} \equiv \frac{\exp(Y_{j.x})}{\sum_{k=1}^{J} \exp(Y_{k.x})}$$

For a given X=x, the generalized logit is defined as

$$Y_{j.x} \equiv \ln\left(\frac{P_{j.x}}{P_{0.x}}\right)$$

where $P_{j.x}$ is the conditional probability of the jth outcome occurring when X=x, and

$$P_{0.x} = \prod_{j=1}^{J} (P_{j.x})^{e_j}$$
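The second step, turning predicted logits into probabilities, and the expected-value formula E(Y|X = x) can be sketched as follows. The logits and Y-scores are invented for illustration, and this is not GOLDmineR's implementation. Because the transform divides by the sum of exponentials, subtracting any common reference value from all of the logits leaves the probabilities unchanged.

```python
import math


def logits_to_probabilities(logits):
    """Transform J logits into J probabilities that sum to 1."""
    shift = max(logits)  # subtracting a constant avoids overflow, same result
    weights = [math.exp(v - shift) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(probabilities, y_scores):
    """E(Y | X = x): probability-weighted sum of the Y-scores."""
    return sum(p * y for p, y in zip(probabilities, y_scores))


# Hypothetical predicted logits for J = 3 ordered outcomes at one value of x
logits = [0.0, 0.5, 1.5]
y_scores = [0, 1, 2]  # assumed equally spaced Y-scores

probs = logits_to_probabilities(logits)
expected = expected_score(probs, y_scores)
print([round(p, 4) for p in probs], round(expected, 4))
```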

I performed a nonparametric regression using the universal regression program GOLDminer, developed by Jay Magidson (25-28) at Statistical Innovations, Belmont, MA. The universal regression program is a logistic regression if the dependent variable is a binary outcome, and it is a polytomous regression if there are more than two dependent variables, but it can accommodate a paired comparison of covariates. The measures of association are phi and R². The measure of fit is L² (chi square). The logarithmic form transforms into a probability model, which we aren't concerned with here.

Graphical Ordinal Logit Display (GOLDminer)

I have mentioned the nonparametric universal regression of Magidson (25-28), based on work with log-linear modeling with Prof. Leo Goodman (29,30).  The logistic regression and linear regression models can be viewed as special cases of this more general model.  This regression model has greatest use for examining structure in data where there are more than two dependent variables, and the independent variables are scaled to intervals (25-28).  The model is more general than the logistic regression and is not constrained by the conditions encountered with logistic regression identified above.

I cite a number of publications of its use in clinical laboratory outcomes analysis.

Example  1.  The association between predictors of nutrition risk and malnutrition risk

I use here data obtained by Linda Brugler and coworkers at St. Francis Hospital in Wilmington, DE (31) that examine the association between malnutrition assessed before intervention and three predictors of malnutrition risk. Poor oral intake and malnutrition-related diagnosis are categorical, and the laboratory-derived serum albumin is scaled to form an ordinal predictor. The strength of the predictors is given by Table VI:

Table VI.  Ordinal regression model for combined 3 predictors of malnutrition risk.

The model is defined by the following:  L2 = 267.68, R2 = 0.405, phi = 1.1134,

Df (3, 42), p = 9.7e-58.

Example 2:  Ordinal regression for thalassemia risk

Table VII shows the odds-ratios for the combinatorial scaled results of Mentzer score (ratio of MCV to red cell count), MCV, and Hgb A2(e) (Hgb A2 by electrophoresis, which is higher than by HPLC). The presence of only a single positive test gives an unlikely result for thalassemia, while two or more positive tests give a high likelihood of thalassemia. This is summarized as follows: 0,0,0-0,0,1-0,1,0-1,0,0 = 0; 1,1,1-1,0,1-0,1,1-1,1,0 = 1.

Table VIII.   Expected Odds Ratios – Diagnosis Thalassemia
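The summary rule for the three scaled tests (fewer than two positives gives 0, two or more gives 1) can be written out as a small sketch. The function name and the order of the arguments are assumptions made for illustration. The rule only counts positives, so the order does not affect the result.

```python
from itertools import product


def thalassemia_class(mentzer, mcv, hgb_a2):
    """Return 1 when two or more of the three scaled tests are positive."""
    return 1 if (mentzer + mcv + hgb_a2) >= 2 else 0


# Enumerate all eight combinations of the three dichotomized tests
classification = {
    combo: thalassemia_class(*combo) for combo in product((0, 1), repeat=3)
}
for combo, label in sorted(classification.items()):
    print(combo, "->", label)
```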

Example 3. Ordinal regression for risk of newborn respiratory distress syndrome

A study by Kaplan, Chapman and coworkers (32) extending work by Bernstein and Rundell (33) looked at the relationship between gestational age and RDS of the newborn and used the ordinal regression model to predict expected outcomes (33).  Table IX gives probabilities for the prediction of risk.

Table IX.   Probabilities of RDS given by gestational age and S/A ratio.

Example 4.  Prediction of myocardial infarction risk by EKG and troponin T at 0.1 ng/ml

Bernstein, Zarich and Qamar (34) carried out a study in which the physicians were blinded to the troponin T results. A randomized prospective study of over 800 patients followed (35-37). The chest pain characteristics, EKG findings and troponin T results were reviewed for consecutive patients entered into the study (34). EKG results were scaled as: negative, nonspecific, 0; ST depression or T wave inversion, 1; ST elevation or new Q-wave, 2. Troponin T was scaled as follows: 0-0.075 ng/ml, 0; 0.076-0.099, 1; > 0.1. The diagnoses were as follows: noncardiac, cardiac and nonischemic, 1; unstable angina with MI ruled out, 2; non-ST or ST elevation MI, 3. Table X is the table of odds ratios and probabilities.

Table X. Ordinal regression of EKG and troponin T on diagnoses

### Ovarian Cancer Survival

Rosman and Schwartz have reported a relationship between CA125 post-chemotherapy of ovarian carcinomatosis and serum half-life of CA125. We examined a published data set provided by Dr. Martin Rosman. Data were analyzed from 55 women who were treated at Yale University, had an evaluable CA125 half-life (t1/2), and were followed for disease recurrence for at least 3 years. We modeled survival or remission for ovarian cancer using operative findings, stage, and CA125 half-life (46). Figure 9 is a plot of the CA125 elimination half-life vs the Kullback-Leibler distance using the data provided by Dr. Martin Rosman. The K-L distance is the difference between the total entropy of the data in which association is removed and the observed entropy for each value of CA125. The t1/2 is 10 days. What Rudolph and Bernstein (43) have referred to as effective information is the K-L distance. This was done to determine the value of CA125 that best predicts survival.

Figure 9 CA125 halflife
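For readers unfamiliar with the Kullback-Leibler distance, the sketch below computes the textbook form D(P||Q) = Σ p·log2(p/q) between two discrete distributions, in bits. It is a generic illustration with made-up distributions. How it maps onto the entropy difference computed for the CA125 data is an assumption, since the original calculation is not shown here.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P||Q) in bits for discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of categories")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf  # P puts mass where Q has none
            total += pi * math.log2(pi / qi)
    return total


# Made-up example: an observed outcome distribution vs. a uniform reference
observed = [0.7, 0.2, 0.1]
reference = [1 / 3, 1 / 3, 1 / 3]

distance = kl_divergence(observed, reference)
same = kl_divergence(reference, reference)
print(round(distance, 4), same)
```

The divergence is zero when the two distributions are identical and grows as the observed distribution departs from the reference.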

The next step was to carry out a Kaplan Meier survival plot with Cox regression on the data vs the time to death or remission.  A survival of 30 months is considered a cure.  A survival less is considered a remission.  Some patients died only shortly into chemotherapy.   The study result is shown in Figure 11.

Figure 10.  Kaplan Meier plot

We also examined the associations between OPERATIVE FINDINGS and CA125 to REMISSION and NONREMISSION or RELAPSE using a universal regression model under bivariate normality with estimation of generalized odds-ratios developed by Jay Magidson (Statistical Innovations, Inc., Belmont, MA).  It uses a parallel log-odds model based on adjacent odds to describe the data.  The universal regression is carried out after scaling the continuous variables with intervals we determined as follows: halflife- 0-5, 6-10, 11-15, 16-20, >20.   A crosstabulation is constructed using the scaled variables as treatment vs. the effect (full, short remission or none), to obtain the frequency tabulation of treatment level vs remission, relapse or nonremission.
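The scaling and cross-tabulation step can be sketched as below. The interval upper bounds come from the text, but treating each bound as inclusive (so a half-life of 5.5 days falls in the 6-10 class) is an assumption, and the patient values and outcome labels are invented.

```python
from collections import Counter

# Upper bounds of the scaled classes 0-5, 6-10, 11-15, 16-20; above 20 is the last class
UPPER_BOUNDS = [5, 10, 15, 20]


def scale_half_life(days):
    """Map a CA125 half-life in days to an ordinal level 0..4."""
    for level, upper in enumerate(UPPER_BOUNDS):
        if days <= upper:
            return level
    return len(UPPER_BOUNDS)


def crosstab(half_lives, outcomes):
    """Count patients in each (scaled half-life, outcome) cell."""
    return Counter((scale_half_life(h), o) for h, o in zip(half_lives, outcomes))


# Hypothetical patients: half-life in days and outcome class
half_lives = [3.0, 8.5, 12.0, 25.0, 18.0, 4.0]
outcomes = ["rem", "rem", "short", "none", "none", "rem"]

table = crosstab(half_lives, outcomes)
for (level, outcome), count in sorted(table.items()):
    print(level, outcome, count)
```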

Table XI is a cross-tabulation of the observed and expected outcome frequencies in remission (rem), short remission (short,< 30 months) and non-remission (none) versus the scaled half-lives.   Relapse and failure to achieve remission were combined into one outcome class.  The means and standard error of the means (SEM) of half-life versus remission or non-remission/relapse are effectively separated (F=7.42, p < 0.01) as follows: Remission, 7.9, 2.8, [19];  Relapse/Non-remission, 17.4, 2.05, [36].

Table XII.  Observed and expected odds and odds-ratios of remission, relapse and no response by half-life

Perspective for the Future

Linear regression has been used extensively for methods comparison and for quality control, exclusively based on distributional assumptions and distance from the center of the population sample. This is essential to analytical chemistry principles, but it has reached a limit. The last 30 years have seen the development of very powerful regression tools that are not dependent on distributional assumptions and that move the method into classification and prediction. The development of the Akaike Information Criterion (38-40) brought together two major disciplines that had separate developments, information theory and statistics. The work by Bernstein et al. (41-42) in predicting myocardial infarction using bivariate density estimation, and with Kullback-Leibler distance (43, 44), an extension of work by Rypka (45), is closely related. The use of tables and the scaling of data has been the dominant approach to statistics that uses ordinal and categorical data in outcomes research. This has become a powerful method used in studies of placebo and drug effects. The approach is readily amenable to studies of laboratory tests and outcomes. Outcomes studies will be designed and carried out for laboratory tests that will ask questions appropriate for the clinical laboratory sciences, and that will not be subordinated to pharmaceutical evaluations, which currently have exclusion criteria that are inappropriate for laboratory investigations.

Summary

Regression has a long history in the development of modern science since the 18th century. Regression has had a role in the emergence of physics, anthropology, psychology, and chemistry. But its development was initially tied to linear association and the assumption of normal distribution. There are many associations that are tied to the frequency of discrete events. The use of chi-square as a measure of goodness of fit has such a tie to genetic analysis and to classification tables. The importance of outcomes management and the recognition of a multivariable data structure that needs to be explored lead us to a new domain of regression models, which includes an assumption that the dependent variable may not be known with certainty. This is the case with the emerging models known as mixture models, structural equation models and latent class models (LCM). This type of model is not traditionally a regression model and looks at defined variables and also unmeasured, hidden or latent variables (factors) in the model. However, there are factor analysis and regression forms of the LCM that are included in the LCM software releases of Statistical Innovations, Inc. (Latent Gold). This important subject is beyond the scope of this review, but Demidenko (47) has written an excellent text on the subject.

References:

19. Hoel PG. Elementary Statistics, Testing Hypotheses: The difference between two means. Chapter 3.3. pp133-117. 1960. Wiley, New York.

20. Hoel PG. Ibid. Regression. Chapter 9. pp141-153.

21. Norman GR, Streiner DL. Biostatistics: The Bare Essentials. Two repeated observations: The paired t-test and alternatives. Chapter 10. pp89-93. 2000, BC Decker, Hamilton, Ont., Canada.

22. Norman GR, Streiner DL. Ibid. Simple regression and correlation. Chapter 13. pp118-126.

23. Norman GR, Streiner DL. Ibid. Multiple regression. Chapter 14. pp127-137.

24. Norman GR, Streiner DL. Ibid. Logistic regression. Chapter 15. pp139-144.

25.  Magidson J.  “Multivariate Statistical Models for Categorical Data,” Chapters 3 & 4   in Bagozzi R, Advanced Methods of Marketing Research, Blackwell, 1994.

26.  Magidson J. Introducing a new graphical method for the analysis of an ordered categorical response – Part I. Journal of Targeting, Measurement and Analysis for Marketing (UK). 1995; IV(2):133-148.

27.  Magidson J.  Introducing a new graphical model for the analysis of  an ordered categorical response – Part II. Ibid. 1996;IV(3):214-227.

28.  Magidson J.  Maximum likelihood assessment of clinical trials based on an ordered categorical response. Drug information Journal. 1996;30:143-170.

29. Goodman LA. Simple models for the analysis of associations in cross-classifications having ordered categories. Journal of the American Statistical Association. 1979;74:537-552. Reprinted in The Analysis of Cross-Classified Data Having Ordered Categories. 1984, Harvard University Press.

30. Goodman LA.  Association models and the bivariate normal for contingency tables with ordered categories. Biometrika 1981;68:347-355.

31. Brugler L, Stankovic AK, Schlefer M, Bernstein L. A simplified nutrition screen for hospitalized patients using readily available laboratory and patient information. Nutrition 2005;21:650-658.

32. Kaplan LA, Chapman JF, Bock JL, Santa Maria E, Clejan S, et al. Prediction of respiratory distress syndrome using the Abbott FLM-II amniotic fluid assay. Clin Chim Acta 2002;326[1-2]:61-68.

33.  Bernstein LH, Stiller R, Menzies C, McKenzie M, Rundell C. Amniotic fluid    polarization of fluorescence and lecithin/sphingomyelin ratio decision criteria assessed. Yale J Biol Med 1995; 68(2):101-117.

34.  Bernstein LH, Qamar A, McPherson C, Zarich S.   Evaluating a new graphical   ordinal logit method (GOLDminer) in the diagnosis of myocardial infarction utilizing clinical features and laboratory data.   Yale J Biol Med 1999; 72:259-268.

35. Bernstein L, Bradley K, Zarich S. GOLDmineR: Improving Models for Classifying Patients with Chest Pain. Yale J Biol Med 2002;75: 183-198.

36. Zarich S, Bradley K, Seymour J, Ghali W, Traboulsi A, et al. Impact of troponin T determinations on hospital resources and costs in the evaluation of patients with suspected myocardial ischemia. Amer J Cardiol 2001;88:732-6.

37. Zarich SW, Qamar AU, Werdmann MJ, Lizak LS, McPherson CA, Bernstein LH. Value of a single troponin T at the time of presentation as compared to serial CK-MB determinations in patients with suspected myocardial ischemia. Clin Chim Acta 2002;326:185-192.

38. Akaike H. Information theory and an extension of the maximum likelihood principle. In B.N. Petrov and F. Csaki (eds.), Second International Symposium on Information Theory. 1973, Akademiai Kiado, pp 267-281, Budapest.

39. Akaike H. A new look at the statistical model identification. IEEE Transactions on Automatic Control, AC-19, 1974; 716-723.

40. Dayton CM. Information Criteria for the Paired-Comparisons Problem.  American Statistician. 1998;52: 144-151.

41. Bernstein LH, Good IJ, Holtzman GI, Deaton ML, Babb J:  Diagnosis of myocardial infarction from two enzyme measurements of creatine kinase isoenzyme MB with use of nonparametric probability estimation.  Clin Chem 1989;35:444-7.

42. Bernstein LH, Good IJ, Holtzman GI, Deaton ML, and Babb J. Diagnosis of heart attack from two enzyme measurements by means of bivariate probability density estimation: statistical details. J Statistical Computation and Simulation. 1989.

43. Rudolph RA, Bernstein LH, Babb J. Information-induction for the diagnosis of myocardial infarction. Clin Chem 1988;34:2031-8.

44. Kullback S, Leibler RA. On information and sufficiency. Ann Mathematical Statistics 1951;22:79-86.

45. Rypka EW. Methods to evaluate and develop the decision process in the selection of tests. Clinics in Laboratory Med 1992;12[2]: 351-385.

46. Bernstein LH. Outcomes-based Decision Support: How to Link Laboratory Utilization to Clinical Endpoints. Chapter 8. Pp91-128. In Bissell MG, ed. Laboratory-Related Measures of Patient Outcomes: An Introduction. 2000. AACC Press. Washington, DC.

47. Demidenko E.  Mixture models: Theory and applications.  2004.  Wiley-Interscience. Hoboken, NJ.

48. Martin RF. General Deming regression for estimating systematic bias and confidence interval in method-comparison studies. Clin Chem 2000;46:100-104.

49. Magidson J.  Opportunities grow on trees. A general alternative to linear regression. Monotonic regression of dichotomous, ordinal and grouped continuous dependent variables.  1998. Statistical Innovations, Inc. Belmont, MA.

Table I.  Regression of eGFR and hemoglobin to predict Nt-proBNP

Step number: 0, R: 0.376, R-square: 0.141

| In | Effect | Coefficient | Standard Error | Std. Coefficient | Tolerance | df | F-ratio | p-value |
|---|---|---|---|---|---|---|---|---|
| 1 | Constant | | | | | | | |
| 2 | eGFR | -83.499 | 14.063 | -0.297 | 0.951 | 1 | 35.256 | 0.000 |
| 3 | Hgb | -910.224 | 260.436 | -0.175 | 0.951 | 1 | 12.215 | 0.001 |

Information Criteria

| Criterion | Value |
|---|---|
| AIC | 7785.03 |
| AIC (Corrected) | 7785.14 |
| Schwarz's BIC | 7800.63 |

| Statistic | Value |
|---|---|
| Dependent Variable | NTproBNP (pg/ml) |
| N | 365 |
| Multiple R | 0.376 |
| Squared Multiple R | 0.141 |
| Adjusted Squared Multiple R | 0.137 |
| Standard Error of Estimate | 10287.156 |

Analysis of Variance

| Source | SS | df | Mean Squares | F-ratio | p-value |
|---|---|---|---|---|---|
| Regression | 6.309E+009 | 2 | 3.155E+009 | 29.809 | 0.000 |
| Residual | 3.831E+010 | 362 | 1.058E+008 | | |

Table II. Linear regression of NKLog(Nt-proBNP)/eGFR by eGFR and hemoglobin

Log transform flattens the high Nt-proBNP scale and eGFR and age are normalized

R: 0.597, R-square: 0.357

| In | Effect | Coefficient | Standard Error | Std. Coefficient | Tolerance | df | F-ratio | p-value |
|---|---|---|---|---|---|---|---|---|
| 1 | Constant | | | | | | | |
| 2 | eGFR | -1.873 | 0.144 | -0.573 | 0.933 | 1 | 170.011 | 0.000 |
| 3 | Hgb | -4.259 | 2.436 | -0.077 | 0.933 | 1 | 3.056 | 0.081 |

Information Criteria

| Criterion | Value |
|---|---|
| AIC | 4299.79 |
| AIC (Corrected) | 4299.9 |
| Schwarz's BIC | 4315.33 |

| Statistic | Value |
|---|---|
| Dependent Variable | NKLogNTGFR |
| N | 360 |
| Multiple R | 0.597 |
| Squared Multiple R | 0.357 |
| Adjusted Squared Multiple R | 0.353 |
| Standard Error of Estimate | 94.260 |

Regression Coefficients $B = (X'X)^{-1}X'Y$

| Effect | Coefficient | Standard Error | Std. Coefficient | Tolerance | t | p-value |
|---|---|---|---|---|---|---|
| CONSTANT | 256.151 | 27.745 | 0.000 | . | 9.232 | 0.000 |
| MDRD_GFR | -1.873 | 0.144 | -0.573 | 0.933 | -13.039 | 0.000 |
| Hgb | -4.259 | 2.436 | -0.077 | 0.933 | -1.748 | 0.081 |

Table III. One-way ANOVA of D-dimer for positive and negative scans

 Dependent Variable D_DIMER N 817

Analysis of Variance

| Source | Type III SS | df | Mean Squares | F-ratio | p-value |
|---|---|---|---|---|---|
| VENDUP | 43456570.851 | 1 | 43456570.851 | 68.278 | 0.000 |
| Error | 5.187E+008 | 815 | 636461.763 | | |
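The F-ratio in Table III is the mean square for the effect divided by the mean square for error. A minimal Python check of that arithmetic, using the values printed in the table (the function name `f_ratio` is an illustrative assumption):

```python
def f_ratio(ms_effect, ms_error):
    """F statistic for a one-way ANOVA: effect mean square over error mean square."""
    return ms_effect / ms_error

ms_effect = 43456570.851 / 1   # Type III SS divided by its df (1)
ms_error = 636461.763          # error mean square as printed in Table III

f = f_ratio(ms_effect, ms_error)
print(round(f, 3))  # 68.278
```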

Table IV.   Discriminant function for CHF, renal insufficiency and anemia by age, NT-proBNP, creatinine and hemoglobin

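| | Group 0 | Group 1 | Group 2 |
|---|---|---|---|
| Group Frequencies | 135 | 335 | 235 |

Group Means

| Variable | Group 0 | Group 1 | Group 2 |
|---|---|---|---|
| NTproBNP (pg/ml) | 1516.369 | 5964.054 | 12902.662 |
| Creatinine | 0.716 | 1.654 | 2.103 |
| Hgb | 11.972 | 11.533 | 11.305 |
| Age | 60.570 | 71.373 | 74.966 |

Between Groups F-matrix (df: 4, 699)

| | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0.000 | | |
| 1 | 23.445 | 0.000 | |
| 2 | 45.108 | 11.788 | 0.000 |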

Wilks’s Lambda

 Lambda : 0.778 df : (4,2,702) Approx. F-ratio : 23.337 df : (8,1398) p-value : 0.000

Classification Functions

| | 0 | 1 | 2 |
|---|---|---|---|
| CONSTANT | -32.018 | -35.196 | -37.394 |

| | Variable | F-to-remove | Tolerance |
|---|---|---|---|
| 5 | NTproBNP (pg/ml) | 13.489 | 0.801 |
| 6 | Creatinine | 21.368 | 0.799 |
| 7 | Hgb | 0.190 | 0.928 |
| 3 | Age | 38.632 | 0.948 |

| Test Statistic | Statistic Value | Approx. F-ratio | df | p-value |
|---|---|---|---|---|
| Wilks’s Lambda | 0.778 | 23.337 | 8, 1398 | 0.000 |
| Pillai’s Trace | 0.226 | 22.295 | 8, 1400 | 0.000 |
| Lawley-Hotelling Trace | 0.279 | 24.382 | 8, 1396 | 0.000 |

Table V.  The DFA calculations for Figure 9.

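| | Group 0 | Group 1 | Group 2 |
|---|---|---|---|
| Group Frequencies | 221 | 631 | 571 |

Means

| Variable | Group 0 | Group 1 | Group 2 |
|---|---|---|---|
| NKLgNTproGFRe | 15.589 | 55.971 | 81.159 |
| MDRD | 123.130 | 61.940 | 48.748 |

Group 0 Discriminant Function Coefficients

| | NormKLgNTproGFRe | MDRDest | Constant |
|---|---|---|---|
| NKLgNTproGFRe | -0.015 | | |
| MDRD | -0.001 | 0.000 | |
| Constant | 0.588 | 0.052 | -15.590 |

Group 1 Discriminant Function Coefficients

| | NormKLgNTproGFRe | MDRDest | Constant |
|---|---|---|---|
| NKLgNTproGFRe | 0.000 | | |
| MDRD | 0.000 | -0.001 | |
| Constant | 0.024 | 0.089 | -12.106 |

Group 2 Discriminant Function Coefficients

| | NormKLgNTproGFRe | MDRDest | Constant |
|---|---|---|---|
| NKLgNTproGFRe | 0.000 | | |
| MDRD | 0.000 | -0.001 | |
| Constant | 0.015 | 0.147 | -13.077 |

Between Groups F-matrix (df: 2, 1419)

| | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0.000 | | |
| 1 | 236.650 | 0.000 | |
| 2 | 335.228 | 21.342 | 0.000 |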

Wilks’s Lambda for the Hypothesis

 Lambda : 0.671 df : (2,2,1420) Approx. F-ratio : 156.542 df : (4,2838) p-value : 0.000

Classification Matrix (Cases in row categories classified into columns)

| | 0 | 1 | 2 | %correct |
|---|---|---|---|---|
| 0 | 206 | 15 | 0 | 93 |
| 1 | 237 | 363 | 31 | 58 |
| 2 | 69 | 459 | 43 | 8 |
| Total | 512 | 837 | 74 | 43 |

Jackknifed Classification Matrix

| | 0 | 1 | 2 | %correct |
|---|---|---|---|---|
| 0 | 205 | 16 | 0 | 93 |
| 1 | 237 | 363 | 31 | 58 |
| 2 | 69 | 462 | 40 | 7 |
| Total | 511 | 841 | 71 | 43 |

| Test Statistic | Statistic Value | Approx. F-ratio | df | p-value |
|---|---|---|---|---|
| Wilks’s Lambda | 0.671 | 156.542 | 4, 2838 | 0.000 |
| Pillai’s Trace | 0.330 | 140.347 | 4, 2840 | 0.000 |
| Lawley-Hotelling Trace | 0.488 | 173.026 | 4, 2836 | 0.000 |

Canonical Discriminant Functions

| | 1 | 2 |
|---|---|---|
| Constant | -1.912 | -1.075 |
| NKLgNTproGFRe | 0.001 | 0.009 |
| MDRD | 0.028 | 0.008 |

Canonical Discriminant Functions: Standardized by Within Variances

| | 1 | 2 |
|---|---|---|
| NKLgNTproGFRe | 0.085 | 1.061 |
| MDRD | 1.026 | 0.284 |

Canonical Scores of Group Means

| Group | 1 | 2 |
|---|---|---|
| 0 | 1.576 | 0.034 |
| 1 | -0.122 | -0.069 |
| 2 | -0.476 | 0.063 |
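The %correct column of the classification matrix in Table V follows directly from the counts: the diagonal entry of each row divided by the row total, and the sum of the diagonal divided by all cases. A short Python sketch of that calculation (the function name is an illustrative assumption; the counts are those printed in the table):

```python
def percent_correct(matrix):
    """Per-group and overall percent correctly classified.

    Rows are true groups, columns are predicted groups.
    """
    per_group = [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]
    diagonal = sum(row[i] for i, row in enumerate(matrix))
    total = sum(sum(row) for row in matrix)
    return per_group, 100.0 * diagonal / total

classification = [
    [206, 15, 0],     # true group 0
    [237, 363, 31],   # true group 1
    [69, 459, 43],    # true group 2
]

per_group, overall = percent_correct(classification)
print([round(p) for p in per_group], round(overall))  # [93, 58, 8] 43
```

The row totals (221, 631, 571) agree with the group frequencies in the table, and the result makes the weakness of the model for group 2 explicit.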

Table VI  Ordinal regression model for combined 3 predictors of malnutrition risk.

| Predictor | L² | p | exp(beta) |
|---|---|---|---|
| Poor oral intake | 60.29 | 8.2e-15 | 5.3 |
| No. of malnutrition related conditions | 46.29 | 1.0e-11 | 3.06 |
| Albumin | 152.01 | 6.3e-35 | 3.16 |

Table VII.   Expected Odds Ratios – Diagnosis Thalassemia

Odds-Ratios

Me,M,A2(e)                 Thalassemia

1,1,1                                 9713

1,1,0                                 1696

1,0,1                                   263

0,1,1                                   212

1,0,0                                     46

0,1,0                                     37

0,0,1                                       6

0,0,0                                       1

Table VIII.   Probabilities of RDS given by gestational age and S/A ratio.

Dependent variable: Respiratory outcome (Resp_Sca)

Predictors: Surfactant to albumin (S/A) Ratio_45: 0, > 45; 1, 21-44; 2, < 21;

Gestational age at delivery: 0, > 36; 1, 34-36; 2, < 34.

S/A Ratio_45               p = 8.7 × 10⁻²²

Gestational Age at Delivery Scaled        p = 4.2 × 10⁻⁹

Combined variables: ChiSq = 130.14,   p = 5.1 × 10⁻²⁸,   R² = 0.433,   phi = 0.8231,   exp(beta) = 2.16 (S/A),   1.88 (GA)

| Definition (S/A, GA) | Exp. Probabilities | Exp. Odds-Ratios |
|---|---|---|
| 0-20, < 34 | 0.84 | 4427 |
| 0-20, 34-36 | 0.64 | 668 |
| 21-44, < 34 | 0.57 | 441 |
| 0-20, > 36 | 0.31 | 101 |
| 21-44, 34-36 | 0.25 | 67 |
| > 45, < 34 | 0.19 | 44 |
| 21-44, > 36 | 0.06 | 10 |
| > 45, 34-36 | 0.04 | 7 |
| > 45, > 36 | 0.01 | 1 |

Table IX. Ordinal regression of EKG and troponin T on diagnoses

Association Summary               L²                     df         p-value             R²        phi

Explained by Model                  206.52             2          1.4e-45            0.686   1.3856

Residual                                         48.64               14        1.0e-5

Total                                               255.16             16        4.5e-45

Odds Ratios and probabilities for diagnoses

average                        1                2                              0          1          2

score                            0.00        0.00

2,3       2.87                             466.82   10086.03         0.01     0.11     0.88

2,2       2.67                             105.78    1087.95           0.04     0.20     0.75

1,3       2.64                             95.35          931.05            0.05     0.21     0.74

2,1       1.95                             23.97          117.35            0.26     0.27     0.47

1,2       1.87                             21.61           100.43           0.29     0.26     0.45

0,3       1.79                             19.48             85.95           0.32     0.26     0.42

1,1       0.67                             4.90                10.83          0.73     0.15     0.12

0,2       0.61                             4.41                   9.27          0.75     0.14     0.11

0,1       0.12                             1.00                 1.00            0.95     0.04     0.01

Table X.  Observed and expected odds and odds-ratios of remission, relapse and no response by half-life

Half-life          exp. odds     exp. odds-ratios

(range, days)      Rem    short    none     Rem   short  none

> 20                           1      4.16    17.11      1    12.49  56.07

16-20                         1      2.21     4.84      1     6.64  44.16

11-15                          1      1.18     1.37      1     3.53  12.49

6-10                            1     0.63     0.39      1     1.88   3.53

< 6                               1     0.33     0.11       1       1         1

HL-ref                         1    0.33      0.11       1       1         1

Figure 1.  log_NT-proBNP vs eGFR

Figure 2.   Boxplots of NT-proBNP and WHO criteria

Figure 3.  NT-proBNP vs Hb

Figure 4.   3D plot of NT-proBNP, MDRD eGFR, Hb

Figure 5.   3D plot of Normalized K*Log_NTproBNP/eGFR, eGFR, Hb

Figure  6. D-dimer Confidence Intervals vs Imaging

Figures 7 & 8.   Canonical Scores Plots

Figure 9.  Entropy Plot of CA125 half-life (x) vs Effective Information (Kullback Entropy), showing a sharp drop in entropy at 10 days (equivalent to information added to resolve uncertainty).  As developed by Rosser R Rudolph.

Figure 10.  Kaplan Meier Plot of CA125 half-life vs Survival in Ovarian Cancer

The Role of Informatics in The Laboratory

Larry H. Bernstein, M.D.

Introduction

The clinical laboratory industry, as part of a larger healthcare enterprise, is in the midst of large changes that can be traced to the mid-1980s and that have accelerated in the last decade.   These changes are associated with a host of dramatic events that require accelerated readjustments in the workforce, scientific endeavors, education, and the healthcare enterprise.   They are highlighted by the following (not unrelated) events:  globalization; a postindustrial information explosion driven by advances in computers and telecommunications networks; genomics and proteomics in drug discovery; and consolidation in retail, communication, transportation, healthcare and the pharmaceutical industry.   Let us consider some of these events.   Globalization is driven by the principle that a manufacturer may seek to purchase labor, parts or supplies from sources that cost less than those available at home.   The changes in the airline industry have been characterized by growth in travel, reductions in force, and the ability of customers to find the best fares.   The discoveries in genetics that evolved from asking questions about replication, transcription and translation of the genetic code have moved on to functional genomics and to the elucidation of cell signaling pathways.   None of these changes would have been possible without the information explosion.

## The Laboratory as a Production Environment

The clinical laboratory produces about 60 percent of the information used by nurses and physicians to make decisions about patient care.   In addition, the actual cost of the laboratory is only about 3 – 4 percent of the cost of the enterprise.   The result is that the requirements for the support of the laboratory don’t receive attention without a proactive argument of how it contributes to realizing the goals of the organization.   The key issues affecting laboratory performance are:  staffing requirement, instrument configuration, workflow, what to send out, what to move to point-of-care, how to reconfigure workstations, and how to manage the information generated by the laboratory.

Staffing requirements, instrument configuration and workflow are being addressed by industry automation.   The first attempt was based on connecting instruments by tracks.   This system proved unable to handle STAT specimens without a noticeable degradation of turnaround time.   The consequence of that failure was to drive the creation of a parallel system of point-of-care testing, with the devices connected in a network with a RAWLS.  Another adjustment was to build an infrastructure for pneumatic tube delivery of specimens and to redesign the laboratory. This had some success, but required capitalization.   The pneumatic tube system could be justified on the basis of its value to the organization in supporting services besides the laboratory. The industry is moving in the direction of connected modules that share an automated pipettor and reduce the amount of specimen splitting.   These are primarily preanalytical refinements.

There are other improvements that affect quality and cost that are not standard, and should be.   These are:  autoverification, embedded quality control rules and algorithms, and incorporation of the X-bar into standard quality monitoring.   This can be accomplished using middleware between the enterprise computer and the instruments designed to do more than just connect instruments with the medical information system.   The most common problem encountered when installing a medical repository is the repeated slowdown of the system as more users are connected with the system.   The laboratory has to be protected from this phenomenon, which can be relieved considerably by an open-architecture.   Another function of middleware will be to keep track of productivity by instrument, and to establish the cost per reportable result.

## The Laboratory and Informatics

A few informatics requirements for the processing of tests are:

1. Reject release of runs that fail Quality Control rules
2. Flag results that fail clinical rules for automatic review
3. Ability to construct a report that has correlated information for physician review, regardless of where the test is produced (RBC, MCV, reticulocytes and ferritin)
4. Ability to present critical information in a production environment without technologist intervention (platelet count or hemoglobin in preparation of transfusion request)
5. Ability to download 20,000 patients from an instrument for review of reference ranges
6. Ability to look at quality control of results on more than one test on more than one instrument at a time
7. Ability to present risks in a report for physicians for medical decisions as an alternative to a traditional cutoff value

I list essential steps of the workload processing sequence and identification of informatics enhancement of the process (bolded):

Prelaboratory (ER) 1:

Nurse draws specimens from patient (without specimen ID) and places tubes in bag labeled with name

Nurse prints labels after patient is entered.

Labels put on tubes

Orders entered into computer and labels put on tubes

Tubes sent to laboratory

### Lab test  is shown as PENDING

Prelaboratory 2:

Tubes in bags sent to lab (by pneumatic tube)

Time of arrival is not same as time of order entry (10 minutes later)

If order entry is not done prior to sending specimen – entry is done in front processing area –

Sent to lab area 10 minutes later after test is entered into computer

Preanalytical:

Centrifugation

Delivery to workareas (bins)

Aliquoting for serological testing

### Workstation assignment

Dating and amount of reagents

Blood gas or co-oximetry – no centrifugation

Hematology – CBC – no centrifugation
send specimen for Hgb A1c

Send specimen for Hgb electrophoresis and Hgb F/Hgb A2

Specimen to Aeroset and then to Centaur

Analytical:

### Use of bar code to encode information

Check alignment of bar code

Quality control and calibration at required interval – check before run

Run tests

Manual:

2 hrs per run

enter accession #

enter results 1 accession at a time

Post analytical:

### Verify results

Special  problems:

Calling results

Misaligned bar code label

Inability to find specimen

Coagulation

Manual differentials

Informatics and Information Technology

The traditional view of the laboratory environment has been that it is a manufacturing center, but the main product of the laboratory is information, and the environment is a knowledge business.   This will require changes in the education of clinical laboratory professionals.   Biomedical Informatics has been defined as the scientific field that deals with the storage, retrieval, sharing, and optimal use of biomedical information, data, and knowledge for problem solving and decision making. It touches on all basic and applied fields in biomedical science and is closely tied to modern information technologies, notably in the areas of computing and communication.   The services supported by an informatics architecture include operations and quality management, clinical monitoring, data acquisition and management, and statistics supported by information technology.

The importance of a network architecture is clear.   We are moving from computer-centric processing to a data-centric environment. We will soon manage a wide array of complex and inter-related decision-making resources. The resources, commonly referred to as objects and contents, can now include voice, video, text, data, images, 3D models, photos, drawings, graphics, audio and compound documents.  The architectural features required to achieve this are listed in Fig 1.

According to Coiera and Dowton (Coiera E and Dowton SB. Reinventing ourselves: How innovations such as on-line ‘just-in-time’ CME may help bring about a genuinely evidence-based clinical practice. Medical Journal of Australia 2000;173:343-344), echoing Lawrence Weed, “Clinicians in the past were trained to master clinical knowledge and become experts in knowing why and how. Today’s clinicians have no hope of mastering any substantial portion of the medical knowledge base.  Every time we make a clinical decision, we should stop to consider whether we need to access the clinical evidence-base. Sometimes that will be in the form of on-line guidelines, systematic reviews or the primary clinical literature.”

Fig 1

Interoperability across environments

Define representation for storage that is independent of  implementation

Define a representation of collection that is independent of the database – schema, table structures

Informatics and the Education of Laboratory Professionals

The increasing dependence on laboratory information and the incorporation of laboratory information into Evidence-Based Guidelines necessitates a significant component of education in informatics.   The public health service has mandated informatics as a component of competencies for health services professionals (“Core Competencies for Public Health Professionals” compendium developed by the Council on Linkages Between Academia and Public Health Practice.), and nursing informatics competencies have already been written.   Coiera (E. Coiera, Medical informatics meets medical education: There’s more to understanding information than technology, Medical Journal of Australia 1998; 168: 319-320) has suggested 10 essential informatics skills for physicians.

I have put together a list below with items taken from Coiera and the Public Health Service competencies for elaboration of competencies for Clinical Laboratory Sciences.

A.   Personal Maintenance

1. Understands the dynamic and uncertain nature of medical knowledge and knows how to keep personal knowledge and skills up-to-date
2. Searches for and assesses knowledge according to the statistical basis of scientific evidence
3. Understands some of the logical and statistical models of the diagnostic process
4. Interprets uncertain clinical data and deals with artefact and error
5. Evaluates clinical outcomes in terms of risks and benefits

B.   Effective Use of Information

Analytic Assessment Skills

6. Identifies and retrieves current relevant scientific evidence
7. Identifies the limitations of research
8. Determines appropriate uses and limitations of both quantitative and qualitative data
9. Evaluates the integrity and comparability of data and identifies gaps in data sources
10. Applies ethical principles to the collection, maintenance, use, and dissemination of data and information
11. Makes relevant inferences from quantitative and qualitative data
12. Applies data collection processes, information technology applications, and computer systems storage/retrieval strategies
13. Manages information systems for collection, retrieval, and use of data for decision-making
14. Conducts cost-effectiveness, cost-benefit, and cost utility analyses

C.   Effective Use of Information Technology

15. Selects and utilizes the most appropriate communication method for a given task (eg, face-to-face conversation, telephone, e-mail, video, voice-mail, letter)
16. Structures and communicates messages in a manner most suited to the recipient, task and chosen communication medium
17. Utilizes personal computers and other office information technologies for working with documents and other computerized files
18. Utilizes modern information technology tools for the full range of electronic communication appropriate to one’s duties and programmatic area
19. Utilizes information technology so as to ensure the integrity and protection of electronic files and computer systems
20. Applies all relevant procedures (policies) and technical means (security) to ensure that confidential information is appropriately protected

I expand on these recommended standards.   The first item is personal maintenance.   This requires continued education to meet the changing needs of the profession in expanding knowledge and access to knowledge that requires critical evaluation.   The laboratory profession has been paid for its technical, task-oriented contribution, but not for a contribution as a knowledge worker.   This can be changed, but it can’t be realized through the usual baccalaureate education requirement.   Most technologists want to get out in the workforce, but after they are out in the workforce – what next?   In many institutions, it falls back on the laboratory to provide the expertise to drive the organization’s computer and information restructuring, with staff taken from the transfusion service, microbiology, and elsewhere.   The laboratory is recognized for its information expertise, but there is still reason to do more.   The fact is that the mindset of the laboratory staff has been one of manufacturing productivity related to test production, but the data that the production represents is information.   We have quality control of the test process, but we are required to manage the total process, including the quality of the information we generate.   Another consideration is that the information we generate is used for clinical trials, and a huge variation in the way the information is used is problematic.

The first category for discussion is personal maintenance.  These items are keeping up with knowledge about advances in medical knowledge,  being critical about the quality of the evidence for current knowledge, and being aware of the statistical underpinnings for that thinking (1-5).   It is not enough to keep up with changes in medical thinking using only the professional laboratory literature. A systematic review of problem topics using PubMed as a guide is also essential.  This requires that the clinical laboratory scientist will have to know how to access the internet and search for key studies concerning the questions that are being asked.   The reading of abstracts and papers also requires an education in methods of statistical analysis, contingency tables, study design, and critical thinking.   The most common methods used in clinical laboratory evaluation are linear regression, linear regression, and yes, linear regression.  A discussion over distance learning among members of the American Statistical Association reveals that much of statistical education for the biologists, chemists, and engineers now comes from *software*.  Knowledge workers in drug development and in molecular diagnostics are increasingly challenged with larger, more complicated data sets, and there is a need to interpret and report results quickly. This need is not confined to basic research or the clinical setting, and it may have to be done without consulting with statisticians.  Category A slides into category B, effective use of information.

Effective use of information requires skills that support the design of evaluations of laboratory tests, methods of statistical analysis, and the critical assessment of published work (6-9), and the processes for collecting data, using information technology application, and interpreting the data (10-12).   Items 13 and 14 address management issues.

There is a vocabulary that has to be mastered and certain questions that have to be answered whenever a topic is being investigated.   I identify a number of these at this point in the discussion.

Contingency Table:  A table of frequencies, usually two-way, with event type in columns and test results as positive or negative in rows.   A multi-way table can be used for multivalued categorical analysis.   The conventional 2X2 contingency table is shown below –

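| | No disease | Disease | Total | Predictive value |
|---|---|---|---|---|
| Test negative | A (TN) | B (FN) | A+B | PVN = TN/(FN+TN) |
| Test positive | C (FP) | D (TP) | C+D | PVP = TP/(TP+FP) |
| Total | A+C | B+D | A+B+C+D | |
| | Specificity = TN/(FP+TN) | Sensitivity = TP/(TP+FN) | | |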

Type I error:  There is a finding when none exists (false positive error).

Type II error:  There is no finding when one actually exists (missed diagnosis) (false negative error).

Sensitivity:  Percentage of patients with disease who have a positive test (true positive rate).  D/(B + D)

Specificity:  Percentage of patients without disease who have a negative test (true negative rate). A/(A + C)

False positive error rate:  The percentage of results that are positive in the absence of disease (1 – specificity).  C/(A + C)

ROC curve:  The receiver operating characteristic curve is a plot of sensitivity vs 1 – specificity.  Two methods can be compared in ROC analysis by the area under the curve.   The optimum decision point can be identified as lying within a narrow range of coordinates on the curve.
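As a minimal sketch of how an ROC curve and its area can be computed, the Python below sweeps every observed test value as a cutoff, records (1 – specificity, sensitivity) at each, and integrates with the trapezoid rule. The function names and the four example results are illustrative assumptions, not data from any study discussed here.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cutoff.

    labels: 1 = disease present, 0 = disease absent.
    A result is called positive when score >= cutoff.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def area_under_curve(points):
    """Trapezoid-rule area under the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical test results and true disease status.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]

points = roc_points(scores, labels)
print(points)
print(area_under_curve(points))  # 0.75
```

An area of 0.5 corresponds to a test with no discrimination and 1.0 to perfect separation, which is why the area is a convenient single number for comparing two methods.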

Predictive value (+)(PVP):  Probability there is disease when a test is positive, D/(C + D), or percentage of patients with disease, given a positive test.   The observed and expected probability may be the same or different.

Predictive value (-)(PVN):  Probability of absence of disease given a negative test result, A/(A + B), or percentage of patients without disease given a negative test.  The observed and expected probability may be the same or different.
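The four measures defined above can be computed together from the cells of the 2X2 table, using the same lettering (A = TN, B = FN, C = FP, D = TP). The sketch below is in Python; the function name and the counts are illustrative assumptions.

```python
def diagnostic_measures(tn, fn, fp, tp):
    """Sensitivity, specificity and predictive values from a 2x2 table.

    tn, fn, fp, tp correspond to cells A, B, C, D of the contingency table.
    """
    return {
        "sensitivity": tp / (tp + fn),   # D / (B + D)
        "specificity": tn / (tn + fp),   # A / (A + C)
        "pvp": tp / (tp + fp),           # D / (C + D)
        "pvn": tn / (tn + fn),           # A / (A + B)
    }

# Hypothetical counts for illustration only.
m = diagnostic_measures(tn=93, fn=18, fp=12, tp=87)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and specificity are computed down the columns (by disease status), whereas the predictive values are computed across the rows (by test result), so the predictive values change with the prevalence of disease in the population tested.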

Power:   When a statement is made that there is no effect, or a test fails to predict the finding of disease, are there enough patients included in the study to see the effect if it exists.   This applies to randomized controlled drug studies as well as studies of tests. Power protects against the error of finding no effect when it exists.

Selection Bias:  It is common to find a high performance claimed for a test that is not later substantiated when it is introduced and widely used.   Why does this occur?   A common practice in experimental design is to define inclusion criteria and exclusion criteria so that the effect is very specific for the condition and to eliminate the interference by “confounders”, unanticipated effects that are not intended.   A common example of this is the removal of patients with acute renal failure and chronic renal insufficiency because of delayed clearance of analytes from the circulation.   The result is that the test is introduced into a population different from the trial population, with claims based on the performance in a limited population.   The error introduced could be prediction of disease in an individual in whom the effect is not true.   This error is reduced by elimination of selection bias, which may require multiple studies using patients who have the confounding conditions (renal insufficiency, myxedema).   Unanticipated effects often aren’t designed into a study.   In many studies about cardiac markers, the study design included only patients who had Acute Coronary Syndrome (ACS).   This is an example of selection bias.   Patients who have ACS have chest pain of anginal nature that lasts at least 30 minutes, and usually have more than a single episode in 24 hours.   That is not how a majority of patients who are suspected of having a myocardial infarct present to the emergency department.   How then is one to evaluate the effectiveness of a cardiac marker?

Randomization:   Randomization is the random assignment of study participants to either placebo (no treatment) or treatment.   The investigator and the participant enrolled in the study are blinded.   The analyst might also be blinded.   A potential problem is selection bias from dropouts who skew the characteristics of the population.

Critical questions:

What is the design of the study that you are reading?   Is there sufficient power or is there selection bias?  What are the conclusions of the authors?   Are the conclusions in line with the study design, or overstated?

Statistical tests and terms:

Normal distribution:  Symmetrical bell shaped curve (Gaussian distribution).   The limits at 2 standard deviations on either side of the mean enclose approximately 95% of the values.

Chi square test:  Has a chi square distribution.   Used for measuring probability from a contingency table.   Non-parametric test.

Student’s t-test:  Parametric measure of difference between two population means.

F-test:  An F-test (Snedecor and Cochran, 1983) is used to test whether the standard deviations of two populations are equal.  In comparing two independent samples of size $N_1$ and $N_2$, the F-test provides a measure of the probability that they have the same variance. The estimators of the variance are $s_1^2$ and $s_2^2$. We define as test statistic their ratio $T = s_1^2 / s_2^2$, which follows an F distribution with $f_1 = N_1 - 1$ and $f_2 = N_2 - 1$ degrees of freedom.

F Distribution: The F distribution is the distribution of the ratio of two independent chi-square variables with degrees of freedom $f_1$ and $f_2$, respectively, where each chi-square has first been divided by its degrees of freedom.
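The test statistic for the F-test is simple to compute; a sketch in Python using only the standard library follows. The function name and the two samples are illustrative assumptions. The p-value is deliberately left out because it requires the F distribution itself, which is available in statistical packages rather than in the standard library.

```python
from statistics import variance

def f_test_statistic(sample1, sample2):
    """Ratio of sample variances with its degrees of freedom (f1, f2)."""
    s1_sq = variance(sample1)   # unbiased estimator, divides by N1 - 1
    s2_sq = variance(sample2)   # unbiased estimator, divides by N2 - 1
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1

# Hypothetical replicate measurements from two methods.
method_a = [2, 4, 4, 4, 5, 5, 7, 9]
method_b = [1, 2, 3, 4, 5]

t, f1, f2 = f_test_statistic(method_a, method_b)
print(round(t, 4), f1, f2)
```

The resulting ratio is then referred to an F distribution with (f1, f2) degrees of freedom to obtain the probability.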

Z scores:  Z scores are sometimes called “standard scores”. The z score transformation is especially useful when seeking to compare the relative standings of items from distributions with different means and/or different standard deviations.
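A z score is the distance of a value from its distribution mean, expressed in standard deviations. The short Python sketch below compares two results measured on different scales; the reference means and standard deviations are hypothetical values chosen only for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical reference distributions for two analytes.
z_hgb = z_score(10.0, mean=14.0, sd=1.5)    # hemoglobin, g/dL
z_mcv = z_score(100.0, mean=90.0, sd=5.0)   # MCV, fL

print(round(z_hgb, 2), round(z_mcv, 2))
```

Although the two results are in different units, the z scores show that the hemoglobin lies further from its reference mean than the MCV does from its own.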

Analysis of variance:  Parametric comparison of two or more population means by the comparison of variances between the populations.   Probability is measured by the F-test.

Linear Regression:  A classic statistical problem is to try to determine the relationship between two random variables X and Y. For example, we might consider height and weight of a sample of adults.  Linear regression attempts to explain this relationship with a straight line fit to the data.  The simplest case, with one dependent and one independent variable, can be visualized in a scatterplot and is called simple linear regression (see below).   The linear regression model is the most commonly used model in Clinical Chemistry.

Multiple Regression:  The general purpose of multiple regression (the term was first used by Pearson, 1908) is to learn more about the relationship between several independent or predictor variables and a dependent or criterion variable.  The general computational problem that needs to be solved in multiple regression analysis is to fit a straight line to a number of points.  A multiple regression fits two or more predictors to the dependent variable by a model of the form $Y = a_1X_1 + a_2X_2 + b + e$, where $b$ is the constant and $e$ is the error term.

Discriminant function:  Discriminant analysis is a technique for classifying a set of observations into predefined classes. The purpose is to determine the class of an observation based on a set of variables known as predictors or input variables. The model is built based on a set of observations for which the classes are known. This set of observations is sometimes referred to as the training set. Based on the training set , the technique constructs a set of linear functions of the predictors, known as discriminant functions, such that

$L = b_1x_1 + b_2x_2 + \dots + b_nx_n + c$, where the b’s are discriminant coefficients, the x’s are the input variables or predictors and c is a constant.

These discriminant functions are used to predict the class of a new observation with unknown class. For a k class problem k discriminant functions are constructed. Given a new observation, all the k discriminant functions are evaluated and the observation is assigned to class i if the ith discriminant function has the highest value.
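The assignment rule just described (evaluate all k discriminant functions and pick the largest) can be sketched in Python as follows. The coefficients and constants below are made up for illustration; in practice they would be estimated from the training set.

```python
def discriminant_scores(x, coefficients, constants):
    """Evaluate L_i = b_i1*x_1 + ... + b_in*x_n + c_i for each class i."""
    return [
        sum(b * xj for b, xj in zip(row, x)) + c
        for row, c in zip(coefficients, constants)
    ]

def classify(x, coefficients, constants):
    """Assign x to the class whose discriminant function is highest."""
    scores = discriminant_scores(x, coefficients, constants)
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical coefficients for a 3-class, 2-predictor problem.
coefficients = [
    [1.0, 0.0],    # class 0
    [0.0, 1.0],    # class 1
    [-1.0, -1.0],  # class 2
]
constants = [0.0, 0.0, 0.0]

print(classify([2.0, 1.0], coefficients, constants))    # 0
print(classify([-1.0, -2.0], coefficients, constants))  # 2
```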

Nonparametric Methods:

Logistic Regression: Researchers often want to analyze whether some event occurred or not.  The outcome is binary.  Logistic regression is a type of regression analysis where the dependent variable is a dummy variable (coded 0, 1).   The linear probability model, expressed as Y = a + bX + e, is problematic because

1. The variance of the dependent variable is dependent on the values of the independent variables.
2. e, the error term, is not normally distributed.
3. The predicted probabilities can be greater than 1 or less than 0.

The “logit” model has the form:

$\ln[p/(1-p)] = a + BX + e$ or

$p/(1-p) = \exp(a)\,\exp(BX)\,\exp(e)$

where:

• ln is the natural logarithm, the logarithm to the base exp(1) = 2.71828…
• p is the probability that the event Y occurs, p(Y=1)
• p/(1-p) is the “odds”
• ln[p/(1-p)] is the log odds, or “logit”

The logistic regression model is simply a non-linear transformation of the linear regression. The logit distribution constrains the estimated probabilities to lie between 0 and 1.
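To make the transformation concrete, the Python sketch below converts between a probability and its logit and shows that the back-transformed value always lies between 0 and 1. The intercept `a` and slope `B` are hypothetical numbers, not fitted values.

```python
import math

def logit(p):
    """Log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Probability corresponding to a log odds value z."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical model: ln[p/(1-p)] = a + B*X
a, B = -2.0, 0.5
for x in (0.0, 4.0, 8.0):
    p = inverse_logit(a + B * x)
    print(x, round(p, 3))
```

However large or small the linear predictor a + BX becomes, the estimated probability stays inside the interval (0, 1), which is the property the linear probability model lacks.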

Graphical Ordinal Logit Regression:  The logistic regression fits a non-parametric solution to a two-valued event.   The outcome in question might have 3 or more values.

For example, scaled values of a test – low, normal, and high – might have different meanings.   This type of behavior occurs in certain classification problems.  For example, the model has to deal with anemia, normal, and polycythemia, or similarly, neutropenia, normal, and systemic inflammatory response (sepsis).   This model fits the data quite readily.

Clustering methods:  There are a number of methods to classify data when the dependent variable is not known, but is presumed to exist.   A commonly used method classifies data using geometric distance of the average point coordinates.   A very powerful method used is Latent Class Cluster analysis.

Data Extraction:

Data can be extracted from databases, but they have to be worked on in a flat file format.   The easiest and most commonly used methods are to collect data in a relational database, such as Access (if the format is predefined), or to convert the data into an Excel format.   A common problem is the inability to extract certain data because they are not in an extractable or usable format.

Let us examine how these methods are actually used in a clinical laboratory setting.

The first example is a test introduced almost 30 years ago into quality control in hematology by Brian Bull at Loma Linda University called the x-bar function (also the Bull algorithm).   The method looks at the means of runs of the population data on the assumption that the mean MCV of a stable population doesn’t vary from day to day.  This is a very useful method that can be applied to the evaluation of the laboratory.   The X-bar chart has been a standard quality control tool in industrial processes since the 1930s.

We next examine the Chi Square distribution.  Review the formula for calculating chi square and calculations of expected frequencies.  Take a two-by-two table of the type

| | Effect | No effect | Row sum |
|---|---|---|---|
| Predictor positive | 87 | 12 | 99 |
| Predictor negative | 18 | 93 | 111 |
| Column sum | 105 | 105 | 210 |

Experiment with recalculating the chi square by changing the frequencies in the columns for effect and no effect while keeping the total frequencies the same.  The chi square decreases as the counts in the predictor-negative/effect and predictor-positive/no-effect cells both increase.  The exercise can be carried out with any of the chi square calculators available online.   The chi square can also be used to test the contingency table that indicates the effectiveness of fetal fibronectin for assessing low risk of preterm delivery.
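The experiment can also be run directly. The sketch below computes the Pearson chi square (without continuity correction, which is an assumption about the calculator used) for the two-by-two table above and for a hypothetical variant in which 10 cases are moved from each agreeing cell into the disagreeing cells, leaving the row and column sums unchanged.

```python
from typing import List


def chi_square(table: List[List[float]]) -> float:
    """Pearson chi square: sum of (observed - expected)^2 / expected over all cells."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


original = [[87, 12], [18, 93]]   # table from the text
shifted = [[77, 22], [28, 83]]    # hypothetical: more disagreement, same margins

print(round(chi_square(original), 3))
print(round(chi_square(shifted), 3))
```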

For example,

| | No Preterm Labor | Yes Preterm Labor | Sum Row |
|---|---|---|---|
| FFN – neg | 99 | 1 | 100 |
| FFN – pos | 35 | 65 | 100 |
| Sum Column | 134 | 66 | 200 |

PVN = 100*(99/100)% = 99%

99% observed probability that there will not be preterm delivery with a negative test.

Chi square goodness of fit:

Degrees of freedom: 1
Chi-square = 92.6277702397105
p is less than or equal to 0.001.
The distribution is significant.
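As a check on the figures above, the predictive values and the chi square can be recomputed from the four cell counts of the fetal fibronectin table. This is a sketch; the variable names are illustrative, and the chi square is the uncorrected Pearson statistic.

```python
# Cell counts from the fetal fibronectin (FFN) table
neg_no_preterm, neg_preterm = 99, 1     # FFN negative row
pos_no_preterm, pos_preterm = 35, 65    # FFN positive row

# Predictive values are row proportions; sensitivity/specificity are column proportions.
npv = neg_no_preterm / (neg_no_preterm + neg_preterm)
ppv = pos_preterm / (pos_no_preterm + pos_preterm)
sensitivity = pos_preterm / (pos_preterm + neg_preterm)
specificity = neg_no_preterm / (neg_no_preterm + pos_no_preterm)

# Pearson chi square with 1 degree of freedom
table = [[neg_no_preterm, neg_preterm], [pos_no_preterm, pos_preterm]]
row_sums = [sum(row) for row in table]
col_sums = [sum(col) for col in zip(*table)]
total = sum(row_sums)
chi2 = sum(
    (table[i][j] - row_sums[i] * col_sums[j] / total) ** 2
    / (row_sums[i] * col_sums[j] / total)
    for i in range(2)
    for j in range(2)
)

print(f"NPV={npv:.2%}  PPV={ppv:.2%}")
print(f"sensitivity={sensitivity:.3f}  specificity={specificity:.3f}")
print(f"chi-square={chi2:.4f}")
```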

Examine the effects of scaling of continuous data from a heart attack study to obtain ordered intervals.  Look at the chi square test for the heart attack test using an N×2 table with the table columns as heart attack or no heart attack.  This allows us to determine the significance of the test in predicting heart attack.   Look at the Student T test for comparing the continuous values of the test between the heart attack and non-heart attack populations.  The T test is like the one-way analysis of variance with only two values for the factor variable.  The T test and ANOVA1 compare the means between two populations.  If the result is significant, then the null hypothesis that the data are taken from the same population is rejected.  The alternative hypothesis is that they are different.

One can visualize the difference by plotting the means and confidence intervals for the two groups.

We can plot a frequency distribution before we calculate the means and check the distribution around the means.   The simplest way to do this is the histogram.   The histogram for a large sample of potassium values is used to illustrate this.   The mean is 4.2.

We can use a method for quality control called the X-bar (Beckman Coulter has it on the hematology analyzer) to test the deviation from the means of runs.   I illustrate the validity of the X-bar by comparing the means of a series of runs.

Sample size                   =       958

Lowest value                  =        84.0000

Highest value                 =        90.7000

Arithmetic mean               =        87.8058

Median                        =        87.8000

Standard deviation            =         0.9362

————————————————————

Kolmogorov-Smirnov test

for Normal distribution       :   accept Normality (P=0.353)

If I compare the means by the T-test, I am testing whether the samples are taken from the same or different populations.   When we introduce a third group, we are asking whether all the samples are taken from a single population; rejecting that hypothesis means accepting the alternative that the samples differ.   The two-group case is illustrated below by comparing patients with normal cardiac status and patients with stable cardiac disease, neither group having acute myocardial infarction:

Two-sample t-test on CKMB grouped by OTHER against Alternative = ‘not equal’

| Group | N | Mean | SD |
|---|---|---|---|
| 0 | 660 | 1.396 | 3.085 |
| 1 | 90 | 4.366 | 4.976 |

Separate variance:

t                         =       -5.518

df                        =         98.5

p-value                   =        0.000

Pooled variance:

t                         =       -7.851

df                        =          748

p-value                   =        0.000
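Both versions of the test can be reproduced from the group sizes, means, and standard deviations alone. The sketch below assumes the "separate variance" result is the Welch test with Satterthwaite degrees of freedom; the function names are illustrative.

```python
import math
from typing import Tuple


def welch_t(n1, m1, s1, n2, m2, s2) -> Tuple[float, float]:
    """Separate-variance t statistic and Satterthwaite degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def pooled_t(n1, m1, s1, n2, m2, s2) -> Tuple[float, int]:
    """Pooled-variance t statistic with n1 + n2 - 2 degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    t = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# CKMB summary statistics from the table above (group 0, then group 1)
ckmb = (660, 1.396, 3.085, 90, 4.366, 4.976)

t_sep, df_sep = welch_t(*ckmb)
t_pool, df_pool = pooled_t(*ckmb)
print(f"separate: t={t_sep:.3f}, df={df_sep:.1f}")
print(f"pooled:   t={t_pool:.3f}, df={df_pool}")
```

The large gap between the two t values reflects the unequal standard deviations and group sizes, which is when the separate-variance form is usually preferred.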

Two-sample t-test on TROP grouped by OTHER against Alternative = ‘not equal’

| Group | N | Mean | SD |
|---|---|---|---|
| 0 | 661 | 0.065 | 0.444 |
| 1 | 90 | 1.072 | 3.833 |

Separate variance:

t                         =       -2.489

df                        =         89.3

p-value                   =        0.015

Pooled variance:

t                         =       -6.465

df                        =          749

p-value                   =        0.000

Another example illustrates the application of this significance test.   Beta thalassemia is characterized by an increase in hemoglobin A2.   Thalassemia gets more complicated when we consider delta beta deletion and alpha thalassemia.   Nevertheless, we measure the hemoglobin A2 by liquid chromatography on the Biorad Variant II.   The comparison of hemoglobin A2 in affected and unaffected is shown below (with random resampling):

Two-sample t-test on A2 grouped by THALASSEMIA DIAGNOSIS against Alternative = ‘not equal’

| Group | N | Mean | SD |
|---|---|---|---|
| 0 | 257 | 3.250 | 1.131 |
| 1 | 61 | 6.305 | 2.541 |

Separate variance:

t                         =       -9.177

df                        =         65.7

p-value                   =        0.000

Pooled variance:

t                         =      -14.263

df                        =          316

p-value                   =        0.000

When we do a paired comparison of the Variant hemoglobin A2 versus quantitation by Helena isoelectric focusing, the T-test shows no significant difference.

Paired samples t-test on A2 vs A2E with 130 cases

Alternative = ‘not equal’

Mean A2                   =        3.638

Mean A2E                  =        3.453

Mean difference           =        0.185

SD of difference          =        1.960

t                         =        1.074

df                        =          129

p-value                   =        0.285
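The paired statistic follows from the mean and standard deviation of the differences. A minimal sketch, using the rounded values printed above (so the result agrees with the reported t only to about two decimals):

```python
import math


def paired_t(mean_diff: float, sd_diff: float, n: int):
    """Paired t statistic: mean difference divided by its standard error."""
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


t_paired, df_paired = paired_t(mean_diff=0.185, sd_diff=1.960, n=130)
print(f"t={t_paired:.3f}, df={df_paired}")
```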

Consider overlay box plots of the troponin I means for normal, stable cardiac patients and AMI patients:

The means between two subgroups may be close and the confidence intervals around the means may be wide so that it is not clear whether to accept or reject the null hypothesis.  I illustrate this by taking for comparison the two groups that feature normal cardiac status and stable cardiac disease, neither having myocardial infarction.   I use the nonparametric Kruskal Wallis analysis of ranks between two groups, and I increase the sample size to 100,000 patients by a resampling algorithm.   The result for CKMB and for troponin I is:

Kruskal-Wallis One-Way Analysis of Variance for 93538 cases

Dependent variable is CKMB

Grouping variable is OTHER

Group       Count   Rank Sum

0                     83405            3.64937E+09

1                    10133             7.25351E+08

Mann-Whitney U test statistic =  1.71136E+08

Probability is        0.000

Chi-square approximation =     9619.624 with 1 df

Kruskal-Wallis One-Way Analysis of Variance for 93676 cases

Dependent variable is TROP

Grouping variable is OTHER

Group       Count   Rank Sum

0                    83543             3.59446E+09

1                    10133             7.93180E+08

Mann-Whitney U test statistic =  1.04705E+08

Probability is        0.000

Chi-square approximation =    21850.251 with 1 df
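The Mann-Whitney U statistics in both outputs can be recovered from the rank sum and size of group 0, using the standard identity U = R − n(n+1)/2. The sketch below uses the rank sums as printed (six significant figures), so agreement is only approximate.

```python
def mann_whitney_u(rank_sum: float, n: int) -> float:
    """U statistic for one group, from that group's rank sum and sample size."""
    return rank_sum - n * (n + 1) / 2


# Group 0 rank sums and counts from the two outputs above
ckmb_u = mann_whitney_u(3.64937e9, 83405)
trop_u = mann_whitney_u(3.59446e9, 83543)
print(f"CKMB U = {ckmb_u:.5e}")
print(f"TROP U = {trop_u:.5e}")

# Consistency check: the two rank sums must add up to N(N+1)/2.
n_total = 83405 + 10133
print(3.64937e9 + 7.25351e8, n_total * (n_total + 1) / 2)
```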

Examine a unique data set in which a test is done on amniotic fluid to determine whether there is adequate surfactant activity so that fetal lung compliance is good at delivery.  If there is inadequate surfactant activity there is risk of respiratory distress of the newborn soon after delivery.  The data include the measure of surfactant activity, gestational age, and fetal status at delivery.  This study emphasized the calculation of the odds-ratio and probability of RDS using surfactant measurement with, and without, gestational age for infants delivered within 72 hours of the test.  The statistical method (GOLDminer) has a graphical display with the factor variable as the abscissa and the scaled predictor and odds-ratio as the ordinate.  The data acquisition required a multicenter study of the National Academy of Clinical Biochemistry led by John Chapman (Chapel Hill, NC) and Lawrence Kaplan (Bellevue Hospital, NY, NY), published in Clinica Chimica Acta (2002).

The table generated is as follows:

Probability and Odds-Ratios for Regression of S/A on Respiratory Outcomes

| S/A interval | Probability of RDS | Odds Ratio |
|---|---|---|
| 0 – 10 | 0.87 | 713 |
| 11 – 20 | 0.69 | 239 |
| 21 – 34 | 0.43 | 80 |
| 35 – 44 | 0.20 | 27 |
| 45 – 54 | 0.08 | 9 |
| 55 – 70 | 0.03 | 3 |
| > 70 | 0.01 | 1 |
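The link between the two columns can be sketched as follows, under the assumption (not stated in the source) that each odds ratio is taken relative to the > 70 interval. Because the table's probabilities are rounded to two decimals, the recomputed ratios only approximate the reported ones.

```python
def odds(p: float) -> float:
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


# (S/A interval, probability of RDS, reported odds ratio) from the table above
rows = [
    ("0-10", 0.87, 713),
    ("11-20", 0.69, 239),
    ("21-34", 0.43, 80),
    ("35-44", 0.20, 27),
    ("45-54", 0.08, 9),
    ("55-70", 0.03, 3),
    (">70", 0.01, 1),
]

reference_odds = odds(rows[-1][1])  # assumed reference category: S/A > 70
ratios = [odds(p) / reference_odds for _, p, _ in rows]

for (label, p, reported), ratio in zip(rows, ratios):
    print(f"{label:>6}  p={p:.2f}  recomputed OR={ratio:7.1f}  reported OR={reported}")
```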

There is a plot corresponding to the table above.  It is patented as GOLDminer (graphical ordinal logit display).  As the risk increases, the odds-ratio (and probability of an event) increases.  The calculation is an advantage when there are more than two values of the factor variable, such as heart attack, not heart attack, and something else.  We next look at the use of the GOLDminer algorithm with the acute myocardial infarction and troponin T example.   The ECG finding is scaled so that the result is normal (0), NSSTT (1), ST depression or T-wave inversion, ST elevation.   The troponin T is scaled to: 0.03, 0.031-0.06, 0.061-0.085, 0.086-0.1, 0.11-0.2, > 0.20 ug/L.   The GOLDminer plot is shown below with troponin T as the 2nd predictor.

(Joint Y)                                   DXSCALE

average            0                      4

X-profile          score                1.00                 0.00

4,5                   3.64                 0.00                 0.68

4,4                   3.51                 0.00                 0.59

4,3                   3.35                 0.00                 0.48

3,5                   3.07                 0.01                 0.34

4,1                   2.87                 0.02                 0.27

3,4                   2.79                 0.02                 0.24

4,0                   2.54                 0.04                 0.17

3,3                   2.43                 0.06                 0.15

3,2                   2.00                 0.12                 0.08

2,5                   1.88                 0.15                 0.07

3,1                   1.55                 0.23                 0.04

2,4                   1.42                 0.26                 0.03

3,0                   1.12                 0.36                 0.01

2,3                   1.02                 0.40                 0.01

2,2                   0.70                 0.53                 0.00

2,1                   0.47                 0.65                 0.00

2,0                   0.32                 0.74                 0.00

1,3                   0.29                 0.77                 0.00

1,2                   0.20                 0.83                 0.00

1,1                   0.13                 0.88                 0.00

1,0                   0.09                 0.91                 0.00

The table is the table of probabilities from the Goldminer program.   The diagnosis scale 4 is MI.   Diagnosis 0 is baseline normal.

We return to a comparison of CKMB and troponin I.   CKMB may be used as a surrogate test for examining the use of troponin I.   We scale the CKMB to 3 and the troponin to 6 intervals.   We construct a 3-by-6 table shown below, with the chi square analysis.

Frequencies

TNISCALE (rows) by CKMBSCALE (columns)

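| TNISCALE \ CKMBSCALE | 0 | 1 | 2 | Total |
|---|---|---|---|---|
| 0 | 709 | 12 | 9 | 730 |
| 1 | 14 | 0 | 2 | 16 |
| 2 | 3 | 0 | 0 | 3 |
| 3 | 2 | 0 | 0 | 2 |
| 4 | 4 | 0 | 0 | 4 |
| 5 | 22 | 5 | 17 | 44 |
| Total | 754 | 17 | 28 | 799 |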

Expected values

TNISCALE (rows) by CKMBSCALE (columns)

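| TNISCALE \ CKMBSCALE | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 688.886 | 15.532 | 25.582 |
| 1 | 15.099 | 0.34 | 0.561 |
| 2 | 2.831 | 0.064 | 0.105 |
| 3 | 1.887 | 0.043 | 0.07 |
| 4 | 3.775 | 0.085 | 0.14 |
| 5 | 41.522 | 0.936 | 1.542 |

| Test statistic | Value | df | Prob |
|---|---|---|---|
| Pearson Chi-square | 198.580 | 10.000 | 0.000 |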
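The expected values and the Pearson chi square can be recomputed from the observed frequencies alone. This is a sketch with illustrative variable names; it uses only the counts given in the table.

```python
# Observed frequencies: TNISCALE (rows 0-5) by CKMBSCALE (columns 0-2)
observed = [
    [709, 12, 9],
    [14, 0, 2],
    [3, 0, 0],
    [2, 0, 0],
    [4, 0, 0],
    [22, 5, 17],
]

row_sums = [sum(row) for row in observed]
col_sums = [sum(col) for col in zip(*observed)]
total = sum(row_sums)

# Expected count under independence: row total * column total / grand total
expected = [[r * c / total for c in col_sums] for r in row_sums]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_sums) - 1) * (len(col_sums) - 1)

print(f"chi-square = {chi2:.3f} with {df} df")
small_cells = sum(1 for row in expected for e in row if e < 5)
print(f"{small_cells} of {len(row_sums) * len(col_sums)} expected counts are below 5")
```

Many of the expected counts are below 5, a situation in which the chi square approximation is usually treated with caution.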

How do we select the best value for a test?  The standard accepted method is a ROC plot.  We have seen how to calculate sensitivity, specificity, and error rates.  The false positive error is 1 – specificity.  The ROC curve plots sensitivity vs 1 – specificity.  The ROC plot requires determination of the “disease” variable by some means other than the test that is being evaluated.   What if the true diagnosis is not accurately known?   The question posed introduces the concept of Latent Class Models.
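The points of a ROC curve can be sketched as below. The test values and disease labels are hypothetical and serve only to show how each candidate cutoff yields one (1 – specificity, sensitivity) pair; a result at or above the cutoff is assumed to count as positive.

```python
from typing import List, Sequence, Tuple


def roc_points(
    values: Sequence[float], diseased: Sequence[bool], cutoffs: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (false positive rate, sensitivity) for each cutoff."""
    points = []
    for cutoff in cutoffs:
        tp = sum(1 for v, d in zip(values, diseased) if d and v >= cutoff)
        fn = sum(1 for v, d in zip(values, diseased) if d and v < cutoff)
        fp = sum(1 for v, d in zip(values, diseased) if not d and v >= cutoff)
        tn = sum(1 for v, d in zip(values, diseased) if not d and v < cutoff)
        sensitivity = tp / (tp + fn)
        false_positive_rate = fp / (fp + tn)   # equals 1 - specificity
        points.append((false_positive_rate, sensitivity))
    return points


# Hypothetical test results, with disease status established independently of the test
values = [0.01, 0.02, 0.03, 0.05, 0.08, 0.20, 0.50, 1.20]
diseased = [False, False, False, True, False, True, True, True]

for cutoff, (fpr, sens) in zip(
    [0.03, 0.05, 0.10], roc_points(values, diseased, [0.03, 0.05, 0.10])
):
    print(f"cutoff={cutoff:.2f}  1-specificity={fpr:.2f}  sensitivity={sens:.2f}")
```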

A special nutritional study set was used in which the definition of the effect is not as clear as that for heart attack.  The risk of malnutrition is assessed at the bedside by a dietitian using observed features (presence of wound, malnutrition related condition, and poor oral intake), and by laboratory tests, using serum albumin (protein), red cell hemoglobin, and lymphocyte count.  The composite score was a value of 1 to 4.  Data were collected by Linda Brugler, RD, MBA, at St. Francis Hospital (Wilmington, DE) on 62 patients to determine whether a better model could be developed using new predictors.

The new predictors were laboratory tests not used in the definition of the risk level, which could be problematic.  The tests albumin, lymphocyte count, and hemoglobin were expected to be highly correlated with the risk level because they were used in its definition.  The prealbumin, but not retinol binding protein or C reactive protein, was correlated with risk score and improved the prediction model.

The crosstable for risk level versus albumin is significant at p < 0.0001.

A GOLDminer plot showed scaled prealbumin versus levels 3 & 4.    A value less than 5 is severe malnutrition and over 19 is not malnourished.  Mild and moderate malnutrition are between these values.

A method called latent class cluster analysis is used to classify the data.   A latent class is identified when the classification isn’t accurately known.   The result of the analysis is shown in Table 4.   The proportions of each variable’s subclasses are shown within each class and total 1.00 (100%).

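| Variable | Level | Cluster1 | Cluster2 | Cluster3 |
|---|---|---|---|---|
| Cluster Size | | 0.5545 | 0.3304 | 0.1151 |
| PAB1COD | 1 | 0.6841 | 0.0383 | 0.0454 |
| | 2 | 0.3134 | 0.6346 | 0.6662 |
| | 3 | 0.0024 | 0.1781 | 0.1656 |
| | 4 | 0.0001 | 0.1490 | 0.1227 |
| ALB0COD | 1 | 0.9491 | 0.4865 | 0.1013 |
| | 2 | 0.0389 | 0.1445 | 0.0869 |
| | 3 | 0.0117 | 0.3167 | 0.5497 |
| | 4 | 0.0003 | 0.0523 | 0.2621 |
| LCCOD | 1 | 0.1229 | 0.0097 | 0.7600 |
| | 2 | 0.3680 | 0.0687 | 0.2381 |
| | 4 | 0.2297 | 0.2383 | 0.0016 |
| | 5 | 0.2793 | 0.6832 | 0.0002 |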
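A table of this kind can be used to assign a patient to a cluster. The sketch below applies Bayes' rule under the usual latent class assumption that the indicators are independent within a class; the function name `posterior` and the example profiles are illustrative, while the probabilities are copied from the table.

```python
from typing import Dict, List

cluster_size = [0.5545, 0.3304, 0.1151]

# P(level | cluster) for Cluster1, Cluster2, Cluster3
conditional: Dict[str, Dict[int, List[float]]] = {
    "PAB1COD": {
        1: [0.6841, 0.0383, 0.0454],
        2: [0.3134, 0.6346, 0.6662],
        3: [0.0024, 0.1781, 0.1656],
        4: [0.0001, 0.1490, 0.1227],
    },
    "ALB0COD": {
        1: [0.9491, 0.4865, 0.1013],
        2: [0.0389, 0.1445, 0.0869],
        3: [0.0117, 0.3167, 0.5497],
        4: [0.0003, 0.0523, 0.2621],
    },
    "LCCOD": {
        1: [0.1229, 0.0097, 0.7600],
        2: [0.3680, 0.0687, 0.2381],
        4: [0.2297, 0.2383, 0.0016],
        5: [0.2793, 0.6832, 0.0002],
    },
}


def posterior(profile: Dict[str, int]) -> List[float]:
    """Posterior cluster probabilities for one patient's coded values."""
    joint = list(cluster_size)  # start from the prior (cluster sizes)
    for variable, level in profile.items():
        joint = [j * p for j, p in zip(joint, conditional[variable][level])]
    total = sum(joint)
    return [j / total for j in joint]


profile_a = posterior({"PAB1COD": 1, "ALB0COD": 1, "LCCOD": 1})
profile_b = posterior({"PAB1COD": 2, "ALB0COD": 3, "LCCOD": 1})
print([round(p, 4) for p in profile_a])
print([round(p, 4) for p in profile_b])
```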

There are other aspects of informatics that are essential for educational design of the laboratory professional of the future.  These include preparation of PowerPoint presentations, use of the internet to obtain current information, quality control designed into the process of handling laboratory testing, evaluating data from different correlated workstations, and instrument integration.   The integrated open architecture will be essential for financial management of the laboratory as well. The continued improvement of the technology base of the laboratory will become routine over the next few years.   The education of the CLS for a professional career in medical technology will require an individual who is adaptive and well prepared for a changing technology environment.   The next section of this document will describe the information structure needed just to carry out the day-to-day operations of the laboratory.

Cost linkages important to define value

Traditional accounting methods do not take into account the cost relationships that are essential for economic survival in a competitive environment; the only items on the ledger are materials and supplies, labor and benefits, and indirect costs.   This is a description of the business as set forth by an NCCLS cost manual, but it is not sufficient to account for the dimensions of the business in relationship to its activities.   The emergence of spreadsheets and, just as importantly, the development of relational database structures has transformed and is transforming how we can look at the costing of organizations in relationship to how individuals and groups within the organization carry out the business plan and realize the mission set forth by the governing body.   In this sense, the traditional model was incomplete because it only accounted for the costs incurred by departments in a structure that allocates resources to each department based on the assessed use of resources in providing services.   The model has to account for the allocation of resources to product lines of services (as in a DRG model developed by Dr. Eleanor Travers).   A revised model has to take into account two new dimensions.   The first dimension is that of the allocation of resources to provide services that are distinct medical/clinical activities.   This means that in the laboratory service business there may be distinctive services as well as market sectors.   That is, health care organizations view their markets as defined by service Zip codes which delineate the lines drawn between their market and the competition (in the absence of clear overlap).

We have to keep in mind that there are service groups that were defined by John Thompson and Robert Fetter in the development of the DRGs (Diagnosis Related Groups) that have a real relationship to resource requirements for pediatric, geriatric, obstetrics, gynecology, hematology, oncology, cardiology, medical and surgical.   These groups are derived from bundles of ICD codes (International Classification of Diseases) that have comparable within group use of laboratory, radiology, nutrition, pharmacy and other resources.   There was an early concern that there was too much variability within DRGs, which was addressed by severity of illness adjustment (Susan Horn).   It is now clear that ICD codes don’t capture a significant content of the medical record. A method is being devised to correct this problem by Kaiser and Mayo using the SNOMED codes as a starting point.   The point is that it is essential that the activities, resources required, and payment be aligned for validity of the payment system.   Of some interest is the association of severity of illness with more than two comorbidities, and of an association with critical values of a few laboratory tests, e.g., albumin, sodium, potassium, hemoglobin, white cell count.   The actual linkage of these resources to the cost of the 10 or 20 most common diagnostic categories is only a recent event.   As a rule the top 25 categories account for a substantial volume of the costs that it is of great interest to control.   The improvement of database technology makes it conceivable that 100 categories of disease classification could be controlled without difficulty in the next ten years.

Quality cost synergism

What is traditionally described is only one dimension of the business of the operation.   It is the business of the organization, but it is only one-third of the description of the organization and the costs that drive it.   The second dimension of the organization’s cost profile is only obtained by cost accounting how the organization creates value.   Value is simply the ratio of outputs to inputs.   The traditional cost accounting model looks only at business value added.   The value generated by an organization is attributable to a service or good produced that a customer is willing to purchase.   We have to measure the value by measuring some variable that is highly correlated with the value created.   That measure is partly accounted for by transaction times.  We can borrow from the same model that is used in other industries.   The transportation business is an example.   A colleague has designed a surgical pathology information system on the premise that a report in the pathology office or a phone inquiry by a surgeon is a failure of the service.   This is analogous to the Southwest Airlines mission to have the lowest time on the ground in the industry.   The growing complexity of service needs, the capital requirements to support the needs, and the contractual requirements are driving redesign of services in a constantly changing environment.

Technology requirements

We have gone from predominantly batch and large scale production to predominantly random access and a growing point-of-care application with pneumatic tube delivery systems in the acute care setting in the last 15 years.   The emphasis on population-based health and increasing shift from acute care to ambulatory care has increased the pressure for point-of-care testing to reduce second visits for adjustment of medication.   The laboratory, radiology and imaging services, and pharmacy information have to be directed to a medical record that may be accessed in acute care or ambulatory setting.   We not only have the proposition that faster is better, but access is from anyplace and almost anytime – connectivity.

There has been a strategic discussion about configuration of information services that is resolving itself by the needs of the marketplace.   Large, self contained organizations are short-lived, and with the emergence of networked provider organizations there will be no compelling interest in having systems that are not tailored to the variety of applications and environments that are served.   The migration from minicomputer to microcomputer client-server networks will go rapidly to N-tiered systems with distributed object-oriented features.   The need for laboratory information systems as a separate application can be seriously challenged by the new paradigm.

Laboratory utilization has to be looked at from more than one perspective in relationship to costs and revenues.   The redefinition of panels cuts the marginal added cost to produce an additional test, but it doesn’t cut the largest cost in obtaining and processing the specimen.   Unfortunately, there is a fixed cost of the operations that has to be covered, which also drives the formation of laboratory consolidations to have sufficient volume.    If one looks at the capital requirements and labor to support a minimum volume of testing, the marginal cost of added tests decreases with large volume.   The problem with the consolidation argument is that one has to remove testing from the local site in order to increase the volume with an anticipated effect on cycle time for processing.   There is also a significant resource cost for courier service, specimen handling and reporting.   Let’s look at the reverse.   What is the effect of decreasing utilization?   One increases the marginal added cost per unit of testing on specimens or accessions.   There is the same basic fixed cost, and if the volume of testing needed to break even is met, the advantage of additional volume is lost.   Fixing the expected cost per patient or per accession becomes problematic if there is a requirement to reduce utilization.
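The volume argument can be illustrated with a simple fixed-plus-variable cost model. All figures below are hypothetical, and the quantity shown is the average cost per test (fixed cost spread over the volume, plus the variable cost), which is one way to read the cost per unit discussed above.

```python
def cost_per_test(fixed_cost: float, variable_cost: float, volume: int) -> float:
    """Average cost per test: fixed cost spread over the volume, plus variable cost."""
    return fixed_cost / volume + variable_cost


FIXED_COST = 500_000.0   # hypothetical annual fixed cost (capital, minimum staffing)
VARIABLE_COST = 2.0      # hypothetical reagent and consumable cost per test

for volume in (50_000, 100_000, 200_000):
    print(f"{volume:>7} tests: {cost_per_test(FIXED_COST, VARIABLE_COST, volume):.2f} per test")
```

Under these assumptions, halving the volume raises the cost per test even though nothing about the operation itself has changed, which is the difficulty with a requirement to reduce utilization.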

The key volume for processing in the service sense is the number of specimens processed, which has an enormous impact on the processing requirements (number of tests adds to reagent costs and turnaround time per accession).   The result is that one might consider the reduction of testing that is done to monitor critical patients’ status more frequently than is needed.   One can examine the frequency of the CBC, PT/APTT, panels, electrolytes, glucose, and blood gases in the ICUs.   The use of the laboratory is expected to be more intense, reflecting severity of illness, in this setting.   On the other hand, excess redundancy may reflect testing that makes no meaningful contribution to patient care.   This may be suggested by repeated testing with no significant variation in the lab results.

Intangible elements

Competitive advantage may have marginal costs with enormous value enhancement.   This is in the manner of reporting the results.   My colleagues have proposed the importance of a scale-free representation of the laboratory data for presentation to the provider and the patients.   This can be extended further by the scaling of the normalized data into intervals associated with expected risks for outcomes.   This would move the laboratory into the domain of assisting in the management of population adjusted health outcomes.

### The Automated Second Opinion Generator

Author: Larry H. Bernstein, MD, FCAP

Gil David and Larry Bernstein have developed a first generation software agent under the supervision of Prof. Ronald Coifman, in the Yale University Applied Mathematics Program, that is the equivalent of an intelligent EHR Dashboard that learns.  What is a Dashboard?   A Dashboard is a visual display of essential metrics. The primary purpose is to gather information and generate the metrics relatively quickly, and analyze it, meeting the highest standard of accuracy.  This invention is a leap across traditional boundaries of Health Information Technology in that it integrates and digests extractable information sources from the medical record using the laboratory, the extractable vital signs, EKG, for instance, and documented clinical descriptors to form one or more provisional diagnoses describing the patient status by inference from a nonparametric network algorithm.  This is the first generation of a “convergence” of medicine and information science.  The diagnoses are complete only after review of thousands of records to which diagnoses are first provided, and then training the algorithm, and validating the software by applying to a second set of data, and reviewing the accuracy of the diagnoses.

The only limitation of the algorithm is sparsity of data in some subsets, which doesn’t permit a probability calculation until sufficient data is obtained.  The limitation is not so serious because it does not disable the system from recognizing at least 95 percent of the information used in medical decision-making, and adequately covers the top 15 medical diagnoses.  An example of this exception would be the diagnosis of alpha or beta thalassemia, with a microcytic picture (MCV low) and a high RBC with a low Hgb.  The accuracy is very high because the anomaly detection used for classifying the data creates aggregates that have common features.  The aggregates themselves are consistent within separatory rules that pertain to any class.  As the model grows, however, there is unknown potential for there to be prognostic, as well as diagnostic information within classes (subclasses), and a further potential to uncover therapeutic differences within classes – which will be made coherent with new classes of drugs (personalized medicine) that are emerging from the “convergence” of genomics, metabolomics, and translational biology.

Such algorithms have already been used for limited data sets and unencumbered diagnoses, in many cases using the approach of studies with inclusions and exclusions common for clinical trials, but that approach has proved ever more costly when used outside the study environment.   The elephant in the room is age-related co-morbidities and co-existence of obesity, lipid derangements, renal function impairment, genetic and environmental factors that are hidden from view.  The approach envisioned is manageable, overcoming these obstacles, and handles both inputs and outputs with considerable ease.

We anticipate that the effect of implementing this artificial intelligence diagnostic amplifier would result in higher physician productivity at a time of great human resource limitation(s), safer prescribing practices, rapid identification of unusual patients, better assignment of patients to observation, inpatient beds, intensive care, or referral to clinic, and shortened length of patients’ ICU and bed days.  If the observation of systemic issues in “To err is human” is now 10 years old with marginal improvement at great cost, this should be a quantum leap forward for the patient, the physician, the caregiving team, and the society that adopts it.

## Scale‑Free Diagnosis of AMI from Clinical Laboratory Values

### Scale‑Free Diagnosis of AMI from Clinical Laboratory Values

Author: Larry H. Bernstein, MD, FCAP

Scale‑Free Diagnosis of AMI from Clinical Laboratory Values

William P. Fisher, Jr., Larry H. Bernstein, Thomas A. Naegele, Arden Forrey, Asadullah Qamar, Joseph Babb, Eugene W. Rypka, Donna Yasick

Objective. Clinicians are often challenged with interpreting myriads of laboratory test results with few resources for knowing which values are most relevant, when any given value indicates a need for action, or how urgent the need for action is. The arrival of the electronic health record creates a context in which computational resources for meeting these challenges will be readily available. The purpose of this study was to evaluate the feasibility of employing probabilistic conjoint (Rasch) measurement models for creating the needed scale‑free standard measures and data quality standards.

Methods. Pathology data from 144 clients suspected of suffering myocardial infarctions were obtained. Thirty indicators were converted from their original values to ratings indicating a worsening of condition. These conversions took advantage of the fact that serial measurement of creatine kinase (CK; EC 2.7.3.2) isoenzyme MB (CK‑MB) and lactic dehydrogenase (LD; EC 1.1.1.27) isoenzyme 1 (LD‑1) in serum have characteristic evolutions in acute myocardial infarction (AMI). CK‑MB concentration begins to rise within 4 to 8 hours, peaks at 12 to 24 hours, and returns to normal within 48 to 72 hours. LD‑1 becomes elevated as early as 8 to 24 hours after infarction, and reaches a peak in 48 to 72 hours. However, the ratio of serum activity of LD‑1/total LD may be more definitive than LD‑1 activity itself. While these are most important in ECG negative AMI, they are not by themselves a “gold standard” for diagnosis.

The additional information and functionality required for such standards, including probabilistic estimates of scale parameters whose values do not depend on the calibrating sample and the capacity to deal with missing data, were sought by fitting the data to a Rasch partial credit model. This model estimates separate rating step values for each group of items sharing a common rating structure, en route to testing the hypothesis that the items work together to delineate a unidimensional measurement continuum defined by the repetition of a single unit quantity.
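As an illustration of the model named above, the following minimal sketch computes rating-category probabilities under the standard Rasch partial credit formulation. The function name, the client measure `theta`, and the step values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math

def pcm_category_probabilities(theta, step_difficulties):
    """Probabilities of rating categories 0..m for one item under the
    Rasch partial credit model.

    theta: client measure in logits (hypothetical value in the example).
    step_difficulties: the m rating step values for the item, in logits.
    """
    # Category k has log-numerator sum_{j<=k} (theta - delta_j);
    # category 0 corresponds to the empty sum, which is 0.
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the largest term before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical item with three rating steps (four categories).
probs = pcm_category_probabilities(theta=0.5, step_difficulties=[-1.0, 0.0, 1.5])
```

Because each item group carries its own list of step values, indicators with different numbers of rating categories can be placed on the same logit continuum.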

Results. Twenty of the 30 items were identified as delineating a unidimensional continuum.  Client measurement reliability was 0.90, and item calibration reliability was 0.96. Overall model fit is indicated by the client information-weighted mean square fit (infit) statistic (mean = .94, SD = .34) and outlier-sensitive mean square fit (outfit) statistic (mean = 1.02, SD = .72), and by the item infit (mean = .99, SD = .41) and outfit (mean = 1.04, SD = .72). The data-to-model global fit is also indicated by the chi-square of 3094.5, with 164 maximum independent parameters, 2766 maximum degrees of freedom, and a probability (statistical significance) of less than .01 that this or a greater chi-square would be observed with perfect data-model fit.
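For readers unfamiliar with the two fit statistics reported above, the sketch below shows how infit and outfit mean squares are conventionally computed from observed responses, model-expected values, and model variances. The five responses and probabilities are invented for illustration (dichotomous case); they are not data from this study.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for one client or one item.

    outfit: unweighted mean of squared standardized residuals, so it is
            sensitive to outlying responses.
    infit:  squared residuals weighted by model variance (information).
    """
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    outfit = sum(r / v for r, v in zip(squared_residuals, variance)) / len(observed)
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit

# Hypothetical dichotomous responses and model probabilities.
observed = [1, 0, 1, 1, 0]
expected = [0.8, 0.3, 0.6, 0.9, 0.2]
variance = [p * (1.0 - p) for p in expected]  # binomial variance p(1 - p)

infit, outfit = fit_mean_squares(observed, expected, variance)
```

Values near 1.0 indicate that the observed variation matches what the model predicts, which is how the means of .94 to 1.04 above are read.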

Discussion. The analysis identified the 20 values most relevant to the diagnosis of AMI; these data may also support the construction of a unidimensional measure of AMI severity. If the construct supports both diagnostic and severity inferences, then the clinical action needed and its urgency will be indicated by the client’s measure. Similar analyses of data from other diagnostic groups will determine the extent to which lab value item relevance and hierarchies vary across diagnoses; such variation will be crucial to determining computer‑based decision support algorithms, which will match individual clients’ data with specific diagnostic profiles. Further analyses will also demonstrate the extent to which diagnosis is affected by missing data.

## Demonstration of a diagnostic clinical laboratory neural network agent applied to three laboratory data conditioning problems

Izaak Mayzlin, Principal Scientist, MayNet, Boston, MA

Larry Bernstein, MD, Technical Director, Methodist Hospital Laboratory, Brooklyn, NY

Our clinical chemistry section services a hospital emergency room seeing 15,000 patients with chest pain annually.  We have used a neural network agent, MayNet, for data conditioning.  The three applications are: troponin, CKMB, and EKG for chest pain; B-type natriuretic peptide (BNP) and EKG for congestive heart failure (CHF); and red cell count (RBC), mean corpuscular volume (MCV), and hemoglobin A2 (Hgb A2) for beta thalassemia.  The three data sets were extensively validated prior to neural network analysis using receiver operating characteristic (ROC) analysis, latent class analysis, and a multinomial regression approach.  Optimum decision points for classifying with these data were determined using ROC (SYSTAT, 11.0), LCM (Latent Gold), and ordinal regression (GOLDminer).   The ACS and CHF studies each had over 700 patients, and each used a validation sample different from the initial exploratory population.  MayNet incorporates prior clustering and sample extraction features in its application.   MayNet results are in agreement with the other methods.

Introduction: Our clinical laboratory services a hospital with an emergency room seeing 15,000 patients with chest pain annually, and produces over 2 million quality-controlled chemistry accessions annually.  We have used a neural network agent, MayNet, to tackle the quality control of the information product.  The agent first applies a statistical tool that clusters the input variables by Euclidean distances in multi-dimensional space. The artificial neural network is then trained on the clusters’ averages and their output variables, performing non-linear discrimination.  In applying this new agent system to the diagnosis of acute myocardial infarction (AMI), we demonstrated that at an optimum clustering distance the number of classes is minimized with efficient training of the neural network. The software agent also performs a random partitioning of the patients’ data into training and testing sets, one-time neural network training, and an accuracy estimate on the testing data set. Three examples illustrate this: troponin, CKMB, and EKG for acute coronary syndrome (ACS); B-type natriuretic peptide (BNP) and EKG for the estimation of ejection fraction in congestive heart failure (CHF); and red cell count (RBC), mean corpuscular volume (MCV), and hemoglobin A2 (Hgb A2) for identifying beta thalassemia.  We use three data sets that have been extensively validated prior to neural network analysis using receiver operating characteristic (ROC) analysis, latent class analysis, and a multinomial regression approach.

In previous studies (1,2), sampling CK-MB and LD1 at 12 and 18 hours postadmission was near-optimum for forming a classification by the analysis of information in the data set. The population consisted of 101 patients with and 41 patients without AMI based on review of the medical records, clinical presentation, electrocardiography, serial enzyme and isoenzyme assays, and other tests. The clinical or EKG data, and other enzymes or sampling times, were not used to form a classification but could be handled by the program developed. All diagnoses were established by cardiologist review. An important methodological problem is the assignment of a correct diagnosis by a “gold standard” that is independent of the method being tested, so that the method tested can be suitably validated. A single gold standard is not a satisfactory solution in the case of myocardial infarction, because the diagnosis depends on a constellation of observations with different sensitivities and specificities. We have argued that the accuracy of diagnosis is associated with the classes formed by combined features, and that uncertainty is greatest when the diagnosis rests on a single measure.

Methods:  Neural network analysis is by MayNet, developed by one of the authors.  Optimum decision points for classifying with these data were determined using ROC (SYSTAT, 11.0), LCM (Latent Gold) (3), and ordinal regression (GOLDminer) (4).   The ACS and CHF study sets each had over 700 patients, and all studies used a validation sample different from the initial exploratory population.  MayNet incorporates prior clustering and sample extraction features in its application.   We now report on a new classification method and its application to the diagnosis of acute myocardial infarction (AMI).  This method is based on the combination of clustering by Euclidean distances in multi-dimensional space and non-linear discrimination performed by an artificial neural network (ANN) trained on the clusters’ averages.   These studies indicate that at an optimum clustering distance the number of classes is minimized with efficient training of the ANN. This novel approach reduces the number of patterns used for ANN learning and also works as an effective tool for smoothing data, removing singularities, and increasing the accuracy of classification by the ANN. The studies conducted involve training and testing on separate clinical data sets, and achieve a high accuracy of diagnosis (97%).

Unlike classification, which assumes the prior definition of borders between classes (5,6), the clustering procedure establishes these borders as a result of processing statistical information, using a given criterion for the difference (distance) between classes.  We perform clustering using the geometrical (Euclidean) distance between two points in n-dimensional space, formed by n variables, including both input and output variables. Since this distance assumes compatibility of different variables, the values of all input variables are linearly transformed (scaled) to the range from 0 to 1.
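The scaling and distance calculation described above can be sketched as follows. The three rows are invented values standing in for (CK-MB, LD-1, LD-1/total LD) and are not patient data; the function names are ours.

```python
import math

def min_max_scale(rows):
    """Linearly rescale every column of `rows` to the range 0..1."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled

def euclidean(a, b):
    """Geometrical (Euclidean) distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical rows: CK-MB, LD-1, LD-1/total LD.
rows = [[12.0, 30.0, 0.25], [80.0, 95.0, 0.48], [5.0, 22.0, 0.20]]
scaled = min_max_scale(rows)
distance = euclidean(scaled[0], scaled[1])
```

Without the rescaling step, the variable with the largest numeric range would dominate the distance, which is the compatibility issue the text refers to.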

For readers accustomed to classical statistics, the ANN technique can be viewed as an extension of multivariate regression analysis with such new features as non-linearity and the ability to process categorical data. Categorical (not continuous) variables represent two or more levels, groups, or classes of the corresponding feature, and in our case this concept is used to signify patient condition, for example the presence or absence of AMI.

The ANN is an acyclic directed graph with input and output nodes corresponding respectively to input and output variables. There are also “intermediate” nodes, comprising so-called “hidden” layers.  Each node $n_j$ is assigned the value $x_j$ that has been evaluated by the node’s “processing” element as a non-linear function of the weighted sum of the values $x_i$ of the nodes $n_i$ connected with $n_j$ by directed edges $(n_i, n_j)$:

$$x_j = f\left(w_{i(1),j}\,x_{i(1)} + w_{i(2),j}\,x_{i(2)} + \dots + w_{i(l),j}\,x_{i(l)}\right),$$

where $x_k$ is the value in node $n_k$ and $w_{k,j}$ is the “weight” of the edge $(n_k, n_j)$.  In our research we used the standard “sigmoid” function, defined as $f(x) = 1/(1 + e^{-x})$.  This function is suitable for categorical output and allows the use of an efficient back-propagation algorithm (7) for calculating the optimal values of the weights, providing the best fit for the learning set of data, and eventually the most accurate classification.
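A minimal sketch of the node computation just defined is given below. The input values and weights are arbitrary illustrations, not trained values. The derivative helper shows the well-known identity f'(x) = f(x)(1 - f(x)), which is one reason back-propagation is inexpensive with this function.

```python
import math

def sigmoid(x):
    """Standard sigmoid f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    """f'(x) expressed through f(x) itself."""
    fx = sigmoid(x)
    return fx * (1.0 - fx)

def node_value(inputs, weights):
    """Value of one node: sigmoid of the weighted sum of its incoming nodes."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))

# Hypothetical node with three incoming edges.
value = node_value(inputs=[0.2, 0.7, 0.5], weights=[1.5, -0.8, 2.0])
```

The output always lies strictly between 0 and 1, so it can be read against a decision point for a two-class outcome.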

Process description:  We implemented the proposed algorithm for diagnosis of AMI. All the calculations were performed on a PC with a Pentium 3 processor using the authors’ software agent, MayNet. First, using the automatic random extraction procedure, the initial data set (139 patients) was partitioned into two sets: training and testing.  This randomization also determined the size of these sets (96 and 43, respectively), since the program was instructed to assign approximately 70% of the data to the training set.
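One way such a random extraction could work is sketched below, assuming each patient is assigned independently to the training set with probability 0.7, so that the set sizes are themselves an outcome of the randomization. This is our assumption about the procedure, not a description of MayNet's code; the seed and the integer patient identifiers are hypothetical.

```python
import random

def random_partition(records, train_fraction=0.7, seed=0):
    """Assign each record independently to training or testing.

    Because every record is drawn separately, the resulting sizes are only
    approximately train_fraction and (1 - train_fraction) of the total.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    training, testing = [], []
    for record in records:
        if rng.random() < train_fraction:
            training.append(record)
        else:
            testing.append(record)
    return training, testing

patients = list(range(139))  # stand-ins for the 139 patient records
training, testing = random_partition(patients)
```

A different seed gives a different split and slightly different sizes, in the same way that the study's split came out at 96 and 43.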

The main process consists of three successive steps: (1) clustering performed on training data set, (2) neural network’s training on clusters from previous step, and (3) classifier’s accuracy evaluation on testing data.

The classifier in this research is the ANN created in step 2, with output in the range [0,1], which provides a binary result (1 = AMI, 0 = not AMI) using a decision point of 0.5.
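The decision rule and the accuracy evaluation of step 3 can be sketched as follows. The network outputs here are invented, and the treatment of an output exactly equal to 0.5 as AMI is our assumption. The example is constructed so that one of 43 testing patterns is misrecognized, the situation reported as 2.3% in Table 1.

```python
def classify(output, decision_point=0.5):
    """Binary result from a network output in [0, 1]: 1 = AMI, 0 = not AMI."""
    return 1 if output >= decision_point else 0

def misrecognition(outputs, diagnoses, decision_point=0.5):
    """Count and percentage of patterns whose class differs from the diagnosis."""
    errors = sum(
        classify(o, decision_point) != d for o, d in zip(outputs, diagnoses)
    )
    return errors, 100.0 * errors / len(diagnoses)

# Hypothetical testing set of 43 patients with one missed AMI.
diagnoses = [1] * 30 + [0] * 13
outputs = [0.9] * 30 + [0.1] * 13
outputs[0] = 0.4  # an AMI patient scored below the decision point

errors, percent = misrecognition(outputs, diagnoses)
```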

In this demonstration we used the data of two previous studies (1,2) with three patients, potential outliers, removed (n = 139). The data contain three input variables (CK-MB, LD-1, and LD-1/total LD) and one output variable, diagnosis, coded as 1 (AMI) or 0 (non-AMI).

Results: The application of this software intelligent agent is first demonstrated here using the initial model. Figures 1-2 illustrate the history of the training process. The upper function is the maximum error (among training patterns) and the lower function shows the average error. The latter determines the duration of the training process: training terminates when the average error reaches 5%.
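The stopping rule can be illustrated with a deliberately small stand-in: a single sigmoid unit trained by gradient descent on four invented one-variable patterns, recording the maximum and average absolute error per pass and stopping when the average reaches 5%. This is not the authors' network or data; the learning rate and iteration cap are arbitrary assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_until_average_error(patterns, targets, tolerance=0.05,
                              learning_rate=0.5, max_iterations=20000):
    """Train one sigmoid unit; stop when the average absolute error over the
    training patterns is at or below `tolerance` (or the cap is reached)."""
    weights = [0.0] * len(patterns[0])
    bias = 0.0
    history = []  # one (maximum error, average error) pair per iteration
    for _ in range(max_iterations):
        errors = []
        for x, t in zip(patterns, targets):
            y = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            errors.append(abs(t - y))
            # Gradient of squared error passed back through the sigmoid.
            delta = (t - y) * y * (1.0 - y)
            bias += learning_rate * delta
            weights = [w + learning_rate * delta * v for w, v in zip(weights, x)]
        average_error = sum(errors) / len(errors)
        history.append((max(errors), average_error))
        if average_error <= tolerance:
            break
    return weights, bias, history

# Hypothetical scaled inputs (0..1) with a binary outcome.
patterns = [[0.1], [0.2], [0.8], [0.9]]
targets = [0, 0, 1, 1]
weights, bias, history = train_until_average_error(patterns, targets)
```

Plotting the two columns of `history` against the iteration number would give curves of the kind the figures describe, with the maximum error lying above the average error.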

Convergence of the back-propagation algorithm was slow when applied to the training set of 96 patients: 6800 iterations were needed to achieve a sufficiently small (5%) average error.

Figure 1 shows the process of training in step 2. It illustrates rapid convergence because we deal only with 9 patterns representing the 9 classes formed in step 1.

Table 1 illustrates the effect of selection of maximum distance on the number of classes formed and on the production of errors. The number of classes increased with decreasing distance, but the accuracy of classification did not decrease.
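The dependence of the number of classes on the maximum distance can be demonstrated with a simple "leader" clustering sketch. The text does not specify the clustering procedure used by MayNet, so the algorithm below is a swapped-in stand-in chosen for brevity, and the five points are invented.

```python
import math

def leader_clusters(points, max_distance):
    """Group points so that each joins the first cluster whose current
    centroid lies within max_distance; otherwise it starts a new cluster.
    Returns the cluster centroids (the averages used as training patterns)."""
    clusters = []  # each cluster is a list of its member points
    for p in points:
        for members in clusters:
            centroid = [sum(c) / len(members) for c in zip(*members)]
            if math.dist(p, centroid) <= max_distance:
                members.append(p)
                break
        else:
            clusters.append([p])  # no cluster was close enough
    return [[sum(c) / len(m) for c in zip(*m)] for m in clusters]

# Hypothetical scaled points in two dimensions.
points = [(0.0, 0.0), (0.1, 0.0), (0.5, 0.5), (0.55, 0.5), (1.0, 1.0)]

wide = leader_clusters(points, max_distance=2.0)
medium = leader_clusters(points, max_distance=0.2)
narrow = leader_clusters(points, max_distance=0.01)
```

Shrinking the allowed distance splits the same points into more clusters, and therefore into more patterns for the network to train on.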

The rate of learning is inversely related to the number of classes. Using back-propagation to train on the entire data set without prior clustering is slower than training on the cluster patterns.

Figure 2 is a two-dimensional projection of the three-dimensional space of input variables onto CKMB and LD1, with small dots corresponding to the patterns and rectangles marking the cluster centroids (black: AMI, white: not AMI).

We carried out a larger study using troponin I (instead of LD1) and CKMB for the diagnosis of myocardial infarction (MI).  The probabilities and odds-ratios for the TnI scaled into intervals near the entropy decision point are shown in Table 2 (N = 782).  The cross-table shows the frequencies for scaled TnI results versus the observed MI, the percent of values within MI, and the predicted probabilities and odds-ratios for MI within TnI intervals.  The optimum decision point is at or near 0.61 mg/L (the probability of MI at 0.46-0.6 mg/L is 3% and the odds ratio is at 13, while the probability of MI at 0.61-0.75 mg/L is 26% at an odds ratio of 174) by regressing the scaled values.
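The descriptive columns of Table 2 can be recomputed directly from the cell counts, as in the sketch below. The counts are copied from Table 2; the variable names are ours. The observed fraction of MI in each interval is a raw proportion and is not the same quantity as the regression-based "Prob by TnI" column, which comes from the fitted ordinal model.

```python
# (TnI interval, not MI, MI) counts copied from Table 2.
table2 = [
    ("< 0.45", 655, 2),
    ("0.46-0.6", 7, 0),
    ("0.61-0.75", 4, 0),
    ("0.76-0.9", 13, 59),
    ("> 0.9", 0, 42),
]

total_mi = sum(mi for _, _, mi in table2)
total_n = sum(not_mi + mi for _, not_mi, mi in table2)

rows = []
for interval, not_mi, mi in table2:
    n = not_mi + mi
    rows.append({
        "interval": interval,
        "n": n,
        "pct_of_all_mi": 100.0 * mi / total_mi,  # the "Pct in MI" column
        "observed_fraction_mi": mi / n,          # raw proportion in the interval
    })
```

The sharp change in the observed fraction between the intervals below and above 0.75 is the raw-count counterpart of the decision region discussed in the text.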

The RBC and MCV criteria were applied to a series of 40 patients different from the series used in deriving the cutoffs.  A latent class cluster analysis is shown in Table 3.  MayNet was run on all three data sets (MI, CHF, and beta thalassemia) for comparison, and the results will be shown.

Discussion:  CKMB has long been heavily used to detect heart attacks. It is used in conjunction with a troponin test and the EKG to identify MI, but it is not as sensitive as is needed. A joint committee of the American College of Cardiology and European Society of Cardiology (ACC/ESC) has established the criteria for acute, recent, or evolving AMI predicated on a typical increase in troponin in the clinical setting of myocardial ischemia (1), which includes the 99th percentile of a healthy normal population. The improper selection of a troponin decision value is, however, likely to increase overuse of hospital resources.  A study by Zarich (8) showed that using an MI cutoff concentration for TnT from a non-acute coronary syndrome (ACS) reference improves risk stratification, but fails to detect a positive TnT in 11.7% of subjects with an ACS (8). The specificity of the test increased from 88.4% to 96.7% with corresponding negative predictive values of 99.7% and 96.2%. Lin et al. (9) recently reported that the use of low reference cutoffs suggested by the new guidelines results in markedly increased TnI-positive cases overall. When a positive TnI is associated with a negative CKMB, these cases are most likely false positives for MI. MayNet addresses this and the following problem effectively.

Monitoring BNP levels is a new and highly efficient way of diagnosing CHF as well as excluding non-cardiac causes of shortness of breath. Listening to breath sounds is only accurate when the disease is advanced to the stage in which the pumping function of the heart is impaired. The pumping of the heart is impaired when the circulation pressure increases above the osmotic pressure of the blood proteins that keep fluid in the circulation, causing fluid to pass into the lung’s airspaces.  Our studies combine the BNP with the EKG measurement of QRS duration to predict whether a patient has a high or low ejection fraction, a measure to stage the severity of CHF.

We also had to integrate the information from the hemogram (RBC, MCV) with the hemoglobin A2 quantitation (BioRad Variant II) for the diagnosis of beta thalassemia.  We chose an approach to the data that requires no assumption about the distribution of test values or the variances.   Our detailed analyses validate an approach to thalassemia screening that has been widely used, the Mentzer index (10), and in addition use critical decision values for the tests that make up the Mentzer index. We also showed that Hgb S has an effect on both Hgb A2 and Hgb F.  This study is adequately powered to assess the usefulness of the Hgb A2 criteria but not adequately powered to assess thalassemias with elevated Hgb F.
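For reference, the Mentzer index is the ratio of MCV (in fL) to RBC count (in millions per microliter), with values below about 13 conventionally taken to suggest thalassemia trait rather than iron deficiency. The sketch below uses that conventional cutoff as an assumption; the study's own decision values are not stated in this text, and the two example hemograms are invented.

```python
def mentzer_index(mcv_fl, rbc_millions_per_ul):
    """Mentzer index: MCV (fL) divided by RBC count (millions per microliter)."""
    return mcv_fl / rbc_millions_per_ul

def suggests_thalassemia_trait(mcv_fl, rbc_millions_per_ul, cutoff=13.0):
    """True when the index falls below the cutoff (13 is the conventional
    value, used here as an assumption rather than the study's criterion)."""
    return mentzer_index(mcv_fl, rbc_millions_per_ul) < cutoff

# Hypothetical hemograms: microcytosis with a high RBC, and with a low RBC.
high_rbc = suggests_thalassemia_trait(mcv_fl=62.0, rbc_millions_per_ul=5.8)
low_rbc = suggests_thalassemia_trait(mcv_fl=70.0, rbc_millions_per_ul=4.0)
```

The index captures the pattern noted for thalassemia, a low MCV together with a high RBC, in a single number that can then be combined with the Hgb A2 result.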

References:

1. Adan J, Bernstein LH, Babb J. Lactate dehydrogenase isoenzyme-1/total ratio: accurate for determining the existence of myocardial infarction. Clin Chem 1986;32:624-8.

2. Rudolph RA, Bernstein LH, Babb J. Information induction for predicting acute myocardial infarction. Clin Chem 1988;34:2031-2038.

3. Magidson J. “Maximum Likelihood Assessment of Clinical Trials Based on an Ordered Categorical Response.” Drug Information Journal, Maple Glen, PA: Drug Information Association 1996;309[1]: 143-170.

4. Magidson J, Vermunt J. Latent Class Cluster Analysis. In: J. A. Hagenaars and A. L. McCutcheon (eds.), Applied Latent Class Analysis. Cambridge: Cambridge University Press, 2002, pp. 89-106.

5. Mkhitarian VS, Mayzlin IE, Troshin LI, Borisenko LV. Classification of the base objects upon integral parameters of the attached network. Applied Mathematics and Computers. Moscow, USSR: Statistika, 1976: 118-24.

6. Mayzlin IE, Mkhitarian VS. Determining the optimal bounds for objects of different classes. In: Dubrow AM, ed. Computational Mathematics and Applications. Moscow, USSR: Economics and Statistics Institute. 1976: 102-105.

7. Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL, eds. Parallel distributed processing. Cambridge, Mass: MIT Press, 1986; 1: 318-62.

8. Zarich SW, Bradley K, Mayall ID, Bernstein LH. Minor Elevations in Troponin T Values Enhance Risk Assessment in Emergency Department Patients with Suspected Myocardial Ischemia: Analysis of Novel Troponin T Cut-off Values. Clin Chim Acta 2004 (in press).

9. Lin JC, Apple FS, Murakami MM, Luepker RV. Rates of positive cardiac troponin I and creatine kinase MB mass among patients hospitalized for suspected acute coronary syndromes. Clin Chem 2004;50:333-338.

10. Makris PE. Utilization of a new index to distinguish heterozygous thalassemic syndromes: comparison of its specificity to five other discriminants. Blood Cells. 1989;15(3):497-506.

Acknowledgements:   Jerard Kneifati-Hayek and Madeleine Schlefer, Midwood High School, Brooklyn, and Salman Haq, Cardiology Fellow, Methodist Hospital.

Table 1. Effect of selection of maximum distance on the number of classes formed and on the accuracy of recognition by ANN

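Clustering distance factor F (D = F * R): 1, 0.9, 0.8, 0.7. Number of classes formed: 24, 14, 13, 5.

| Number of nodes in the hidden layers | Number of misrecognized patterns in the testing set of 43 | Percent misrecognized |
|---|---|---|
| 1, 0 | 1 | 2.3 |
| 2, 0 | 2 | 4.6 |
| 3, 0 | 1 | 2.3 |
| 1, 0 | 1 | 2.3 |
| 2, 0 | 2 | 4.6 |
| 3, 0 | 1 | 2.3 |
| 3, 2 | 1 | 2.3 |
| 3, 2 | 1 | 2.3 |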

Figure 1.

Figure 2.

Table 2.  Frequency cross-table, probabilities and odds-ratios for scaled TnI versus expected diagnosis

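| Range | Not MI | MI | N | Pct in MI | Prob by TnI | Odds Ratio |
|---|---|---|---|---|---|---|
| < 0.45 | 655 | 2 | 657 | 2 | 0 | 1 |
| 0.46-0.6 | 7 | 0 | 7 | 0 | 0.03 | 13 |
| 0.61-0.75 | 4 | 0 | 4 | 0 | 0.26 | 175 |
| 0.76-0.9 | 13 | 59 | 72 | 57.3 | 0.82 | 2307 |
| > 0.9 | 0 | 42 | 42 | 40.8 | 0.98 | 30482 |
| Total | 679 | 103 | 782 | 100 | | |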

## A Software Agent for Diagnosis of ACUTE MYOCARDIAL INFARCTION

### A Software Agent for Diagnosis of ACUTE MI

Authors: Isaac E. Mayzlin, Ph.D.¹, David Mayzlin¹, Larry H. Bernstein, M.D.²

¹MayNet, Carlsbad, CA; ²Department of Pathology and Laboratory Medicine, Bridgeport Hospital, Bridgeport, CT.

Agent-based decision support systems are designed to provide medical staff with information needed for making critical decisions. We describe a Software Agent for evaluating multiple tests based on a large database, which is especially efficient when the time for making the decision is critical for successful treatment of serious conditions, such as stroke or acute myocardial infarction (AMI).

Goldman and others (1) developed a screening algorithm based on characteristics of the chest pain, EKG changes, and key clinical findings to separate high-risk from low-risk patients at the time they present, using clinical features without a serum marker. The Goldman algorithm was not widely used because of a 7 percent misclassification error, mostly false positives. Nonetheless, a third of emergency room visits by patients presenting with symptoms for which AMI must be ruled out are not associated with chest pain. A related issue is the finding that a significant number of patients who are at high risk have to be identified using a cardiac marker. Cardiac isoenzymes have been used to classify patients meeting the high-risk criteria, many of whom are not subsequently found to have AMI.

Software Agent for Diagnosis based on the Knowledge incorporated in the Trained Artificial Neural Network and Data Clustering

This Software Agent is based on the combination of clustering by Euclidean distances in multi-dimensional space and non-linear discrimination performed by the Artificial Neural Network (ANN) trained on the clusters’ averages. Our studies indicate that at an optimum clustering distance the number of classes is minimized with efficient training of the ANN, while the accuracy of classification by the ANN is retained at 97%. The studies conducted involve training and testing on separate clinical data sets.  We perform clustering using the geometrical (Euclidean) distance between two points in n-dimensional space, formed by n variables, including both input and output variables. Since this distance assumes compatibility of different variables, the values of all input variables are linearly transformed (scaled) to the range from 0 to 1.

For readers accustomed to classical statistics, the ANN technique can be viewed as an extension of multivariate regression analysis with such new features as non-linearity and the ability to process categorical data. Categorical (not continuous) variables represent two or more levels, groups, or classes of the corresponding features, and in our case this concept is used to signify patient condition, for example the presence or absence of AMI.

Process description. We implemented the proposed algorithm for diagnosis of AMI. All the calculations were performed with the authors’ software agent, MayNet. First, using the automatic random extraction procedure, the initial data set (139 patients) was partitioned into two sets: training and testing.  This randomization also determined the size of these sets (96 and 43, respectively), since the program was instructed to assign approximately 70% of the data to the training set.

The main process consists of three successive steps:

(1) clustering performed on training data set,

(2) neural network’s training on clusters from previous step, and

(3) classifier’s accuracy evaluation on testing data.

The classifier in this research is the ANN created in step 2, with output in the range [0,1], which provides a binary result (1 = AMI, 0 = not AMI) using a decision point of 0.5.

In this paper we used the data of two previous studies (2,3) with three patients, potential outliers, removed (n = 139). The data contain three input variables (CK-MB, LD-1, and LD-1/total LD) and one output variable, diagnosis, coded as 1 (AMI) or 0 (non-AMI).

Table 1. Effect of selection of maximum distance on the number of classes formed and on the accuracy of recognition by ANN

Clustering distance factor F (D = F * R): 1, 0.9, 0.8, 0.7. Number of classes formed: 24, 14, 13, 5.

| Number of nodes in the hidden layers | Number of misrecognized patterns in the testing set of 43 | Percent misrecognized |
|---|---|---|
| 1, 0 | 1 | 2.3 |
| 2, 0 | 2 | 4.6 |
| 3, 0 | 1 | 2.3 |
| 1, 0 | 1 | 2.3 |
| 2, 0 | 2 | 4.6 |
| 3, 0 | 1 | 2.3 |
| 3, 2 | 1 | 2.3 |
| 3, 2 | 1 | 2.3 |

Abbreviations: creatine kinase MB isoenzyme: CK-MB; lactate dehydrogenase isoenzyme-1: LD1; LD1/total LD ratio: %LD1; acute myocardial infarction: AMI; artificial neural network: ANN