More on the Performance of High Sensitivity Troponin T and with Amino Terminal Pro BNP in Diabetes
Writer and Curator: Larry H. Bernstein, MD, FCAP
UPDATED on 9/1/2019
Risk-Based Thresholds for hs-Troponin I Safely Speed MI Rule-Out
HISTORIC suggests benefit to patients, clinicians
PARIS — Using different cutoffs for high-sensitivity cardiac troponin I (hs-cTnI) testing based on risk accurately ruled out MI and sent patients home from the emergency department sooner without missing adverse cardiac events, the HISTORIC trial found.
In the stepped-wedge trial of over 30,000 consecutive patients, introduction of the risk-based approach reduced length of stay at the emergency department by over 3 hours compared with standard care (6.8 vs 10.1 hours, P<0.001), reported Nicholas Mills, MD, PhD, of the University of Edinburgh in Scotland.
And 74% of patients under the new pathway were discharged without requiring hospital admission versus 53% under standard protocols (adjusted risk ratio 1.57, 95% CI 1.34-1.83, P<0.001).
For the primary safety endpoint, 2.5% of patients in the standard group died from cardiac causes or had an MI at 12 months post-discharge versus 1.8% of those in the early rule-out group (adjusted OR 1.02, 95% CI 0.74-1.40).
“Adoption of this approach will have major benefit for both patients and healthcare providers,” said Mills during a late-breaking press briefing at the 2019 European Society of Cardiology (ESC) congress.
For example, many patients will need only a single troponin test under the algorithm to lead to a decision on admission, he noted, which could have “absolutely enormous” cost savings.
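To make the pathway concrete, here is a minimal sketch of a risk-stratified, single-sample rule-out decision of the kind described above. The numeric thresholds are illustrative placeholders, not the validated HISTORIC cutoffs, and the sketch is not clinical guidance.

# A minimal, illustrative sketch of a risk-based early rule-out decision,
# loosely modeled on the pathway described above. The thresholds below are
# placeholders, NOT the trial's validated cutoffs.

LOW_RISK_NG_L = 5.0                          # hypothetical "very low" cutoff
SEX_SPECIFIC_99TH = {"F": 16.0, "M": 34.0}   # hypothetical 99th percentiles

def triage(hs_ctni_ng_l: float, sex: str, hours_since_onset: float) -> str:
    """Return a disposition suggestion for a single presentation sample."""
    if hs_ctni_ng_l < LOW_RISK_NG_L and hours_since_onset >= 3:
        return "rule out: consider discharge from the emergency department"
    if hs_ctni_ng_l >= SEX_SPECIFIC_99TH[sex]:
        return "rule in: admit for serial sampling and cardiology review"
    return "intermediate: retest troponin (e.g., at 3 h) before disposition"

print(triage(3.2, "F", 4.0))   # -> rule out
print(triage(40.0, "M", 2.0))  # -> rule in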
This is the final up-to-date review of the status of hs-troponin T (or I), with or without the combined use of brain-type natriuretic peptide (BNP) or its amino-terminal peptide precursor (NT-proBNP). In addition, a new identification of the role of atrial natriuretic peptide has been reported with respect to arrhythmogenic activity. On the one hand, the diagnostic value of NT-proBNP has been seen as disappointing, in part because of the question of what information is gained by the test in overt, known congestive heart failure, and in part because of uncertainty about following the test during a short hospital stay. At least, this is the view of this reviewer. However, in the last several years there has been an emphasis on the value this test adds to the prediction of adverse outcomes.
In addition, there has been a hidden variable that has much to do with the original reference values, which were established for age ranges without any consideration of pathophysiology that might affect the values within those ranges, leading one to consider values in an aging population as normal that might well be high. Why is this? Aging patients are more likely to have hypertension, and also the onset of type 2 diabetes mellitus, with cardiovascular disease consequences. Type 2 diabetes mellitus (T2DM), for instance, is associated with insulin resistance and with fat gain and generation of adipokines; but there is also hyalinization of the insulin-producing beta cells of the pancreas, as well as hyalinization of the glomeruli (glomerulosclerosis) and afferent arteriolonephrosclerosis, with an expected decline in glomerular filtration rate and with hypertension as well. Of course, this is also associated with hepatosteatosis. Nevertheless, a reference range is established that takes none of this pathophysiology into account. While a more reasonable approach has been pointed out, there has been no follow-up in the literature.
On the other hand, there has been much confusion over the restandardization of the high-sensitivity troponin I or T test (hs-TnI or hs-TnT). The reference range declines precipitously, and there is good identification of patients who are for the most part disease free, but there is no delineation of patients who are at high risk of acute coronary syndrome with plaque rupture versus a host of other cardiovascular conditions. These have no relationship to plaque rupture, but may be serious and require further evaluation. The question then becomes whether to admit for a hospital stay, to refer to clinic after an evaluation in the ICU without admission, or to do an extensive evaluation in the emergency department overnight before release for follow-up. There is still another dimension of this that has to do with prediction of outcomes using hs-Tn(s), with or without the natriuretic peptides. Another matter that is not for discussion in this article is the underutilization of hs-CRP. Originally used as a marker of sepsis in the 1970s, it has come to be tied in with identification of an ongoing inflammatory condition. Therefore, the existence of a known inflammatory condition in the family of autoimmune diseases, with one exception, might make it unnecessary.
The discussion is broken into three parts:
Part 1. New findings on the troponins.
Part 2. The use of combined hs-Tn with a natriuretic peptide (NT-proBNP)
Part 3. Atrial natriuretic peptide
Part 1. New findings on the troponins.
Troponin: more lessons to learn
C. Liebetrau, H.M. Nef, and C.W. Hamm*
Kerckhoff Heart and Thorax Center, Department of Cardiology, Bad Nauheim, Germany; (German Centre for Cardiovascular Research), partner site Rhein-Main, Bad Nauheim, Germany; and University of Giessen, Medizinische Klinik I, Kardiologie und Angiologie, Giessen, Germany
European Heart Journal. http://dx.doi.org/10.1093/eurheartj/eht357
This editorial refers to ‘Risk stratification in patients with acute chest pain using three high-sensitivity cardiac troponin assays’, by P. Haaf et al. http://dx.doi.org/10.1093/eurheartj/eht218
Cardiac troponin entered our diagnostic armamentarium 20 years ago and –
unlike any other biomarker –
is going through constant expansion in its application.
Troponin started out as a marker of risk in unstable angina, then was used
as the gold standard for risk stratification and therapy guidance in acute coronary syndrome,
served further to redefine myocardial infarction, and
has also become a risk factor in apparently healthy subjects.
The recently introduced high-sensitivity cardiac troponin (hs-cTn) assays
have not only expanded the potential of troponins, but
have also resulted in a certain amount of confusion
among unprepared users.
After many years troponins were accepted as the gold standard in
patients with chest pain by
classifying them into troponin-positive and
troponin-negative patients.
The new generation of hs-cTn assays has
improved the accuracy at the lower limit of detection and
provided incremental diagnostic information especially
in the early phase of myocardial infarction.
Moreover, low levels of measurable troponins
unrelated to ACS have been associated with
an adverse long-term outcome.
Several studies demonstrated that
these low levels of cardiac troponin, measurable
only by hs-Tn assays
are able to predict mortality in patients with ACS
as well as patients with assumed
stable coronary artery disease.
Furthermore, hs-cTn has the potential
to play a role in the care of patients
undergoing non-cardiac surgery.
The additional determination of hs-cTn
improves risk stratification beyond
established risk scores, for both diagnosis and
prognosis prediction in chest pain patients.
The daily clinical challenge in using the highly sensitive assays is to
interpret the troponin concentrations, especially
in patients with concomitant diseases that
influence cardiac troponin concentrations
independently of myocardial ischaemia (e.g. chronic kidney disease or stroke).
For many users, the troponin test has lost its simple ‘pregnancy test’ quality.
Different opinions exist on
how to interpret changes in hs-cTn levels, as compared with the simple ‘positive–negative’ reading,
and this makes reaching a diagnosis more complex than before.
This uncertainty has probably reinforced the paradigm that
serial measurements of troponins are necessary, and has also
boosted the number of diagnoses of ACS and
invasive diagnostic procedures in some locations.
This is more than understandable: in acute chest pain,
high-sensitivity cardiac troponin values must be compared with their respective baseline value
before the diagnosis of acute myocardial infarction (AMI) can be made.
What is a relevant change in concentrations compatible with acute myocardial necrosis and
what is only biological variation for the specific biomarker and assay?
Changes in serial measurements between 20% and 200% have been debated, and
the discussion is ongoing. Furthermore, it has been proposed that
absolute changes in cardiac troponin concentrations
have a higher diagnostic accuracy for AMI
compared with relative changes, and
it might be helpful in distinguishing AMI from other causes of cardiac troponin elevation.
Do we obtain any helpful directives from experts and guidelines for our daily practice?
Foreseeing this dilemma, the 2011 European Society of Cardiology (ESC) Guidelines
on non ST-elevation ACS acted.
Minor elevations of troponins were accepted as hs-cTn values in the ‘grey zone’.
This was and still is the rule, but
the ESC provided a general algorithm on how to manage patients with limited data.
The ‘Study Group on Biomarkers in Cardiology’ suggested
a rise of 50% from the baseline value at low concentrations.
However, this group of experts could also not find a substitute for the missing data
needed to validate the proposed recommendation.
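For the quantitatively inclined reader, the two competing delta criteria discussed above can be sketched as follows. This is a minimal illustration: the 50% relative rise mirrors the Study Group proposal quoted above, while the absolute delta of 7 ng/L is a purely hypothetical placeholder, since validated absolute cutoffs are assay specific.

# Illustrative sketch of the two serial-sampling criteria discussed above:
# a relative rise from baseline versus an absolute change. The 50% relative
# rise mirrors the Study Group proposal quoted in the text; the absolute
# delta (7 ng/L) is a purely hypothetical placeholder for a given assay.

def relative_change_positive(baseline: float, followup: float,
                             min_rise_fraction: float = 0.50) -> bool:
    """Flag a rise of at least `min_rise_fraction` above the baseline value."""
    return followup >= baseline * (1 + min_rise_fraction)

def absolute_change_positive(baseline: float, followup: float,
                             min_delta: float = 7.0) -> bool:
    """Flag an absolute increase of at least `min_delta` (assay units, e.g. ng/L)."""
    return (followup - baseline) >= min_delta

baseline, followup = 10.0, 16.0   # low-concentration example
print(relative_change_positive(baseline, followup))  # True  (60% rise)
print(absolute_change_positive(baseline, followup))  # False (delta of 6 < 7)

Note how the two criteria can disagree at low concentrations, which is exactly where the debate described above arises.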
The story is just too complex:
different troponin assays with
different epitope targets,
different patient populations,
different sampling protocols,
different follow-up lengths, and much more.
Therefore, any study that helps us to see better through the fog is welcome here.
Haaf et al. have now presented the results of their study of
different hs-cTn assays
(hs-cTnT, Roche Diagnostics; hs-cTnI, Beckman-Coulter; and hs-cTnI, Siemens)
with respect to the outcome of patients with acute chest pain.
The authors examined 1117 consecutive patients presenting with acute chest pain
[340 patients with ACS (30.5%)] from the Advantageous Predictors of Acute Coronary Syndrome
Evaluation (APACE) study. Blood was collected
directly on admission and
serially thereafter at 2, 3, and 6h.
Eighty-two patients (7.3%) died during the 2-year follow-up. The main finding of the study is that
hs-cTnT predicts mortality more accurately than the hs-cTnI assays, and
that a single measurement is sufficient.
These results of APACE stand in contrast to recent findings from a GUSTO IV cohort of 1335 patients with ACS (Table 1).
Table 1. Studies investigating high-sensitivity troponins for long-term prognosis
Contrasting results have also been reported in patients (n = 3,623)
with stable coronary artery disease and preserved systolic left ventricular function
from the PEACE trial (Table 1).
During a median follow-up period of 5.2 years,
there were 203 (5.6%) cardiovascular deaths or
first hospitalization for heart failure.
Concentrations of hs-cTnI (Abbott Diagnostics) at or above
the limit of detection of the assay were measured in 3567 patients (98.5%), but
concentrations of hs-cTnI at or above the gender-specific 99th percentile
were found in only 105 patients (2.9%).
This study revealed that
there was a strong and graded association
between increasing quartiles of hs-cTnI concentrations and
the risk for cardiovascular death or heart failure.
Hs-cTnI provided incremental prognostication information
over conventional risk markers and
other established cardiovascular biomarkers,
including hs-cTnT.
In contrast to the APACE results, only hs-cTnI, but
not hs-cTnT, was significantly
associated with the risk for AMI.
Is there a real difference between cardiac troponin T and cardiac troponin I
in predicting long term prognosis?
The question arises of whether there is a true clinically relevant
difference between cTnT and cTnI.
Given the biochemical and analytical differences, the two
troponins display rather similar serum profiles during AMI, and
minor biological differences between cTnT and cTnI are
apparently not relevant for diagnosis
and clinical management in the acute setting of ACS.
This is a provocative theory, but appears premature in our opinion.
Above all, the results of the current study appear
too inconsistent to allow such conclusions.
In the present study, hs-cTnT (Roche Diagnostics) outperformed
hs-cTnI (Siemens and Beckman-Coulter) in terms of
very long term prediction of cardiovascular death and
heart failure in stable patients.
We don’t know how hs-cTnI from Abbott Diagnostics
performs in the APACE cohort.
The number of patients and endpoints provided
by the APACE registry are rather low.
The results could, therefore, be a chance finding.
It is far too early to favour one high sensitivity assay over the other. The findings need confirmation.
Implications for clinical practice
There is no doubt that high-sensitivity assays
are the analytical method of choice
in terms of risk stratification in patients with ACS.
What is new?
A single measurement of hs-cTn seems to be adequate
for long-term risk stratification in patients without AMI.
However, the question of which troponin might be preferable
for long-term risk stratification remains unanswered.
Part 2. Ability of high-sensitivity cTnT and NT-proBNP to predict cardiovascular events and death in patients with T2DM
Hillis GS; Welsh P; Chalmers J; Perkovic V; Chow CK; Li Q; Jun M; Neal B; Zoungas S; Poulter N; Mancia G; Williams B; Sattar N; Woodward M Diabetes Care. 2014; 37(1):295-303 (ISSN: 1935-5548)
OBJECTIVE
Current methods of risk stratification in patients with
type 2 diabetes are suboptimal.
The current study assesses the ability of
N-terminal pro-B-type natriuretic peptide (NT-proBNP) and
high-sensitivity cardiac troponin T (hs-cTnT)
to improve the prediction of cardiovascular events and death in patients with type 2 diabetes.
RESEARCH DESIGN AND METHODS
A nested case-cohort study was performed in 3,862 patients who participated in the Action in Diabetes and Vascular Disease:
Preterax and Diamicron Modified Release Controlled Evaluation (ADVANCE) trial.
RESULTS
Seven hundred nine (18%) patients experienced a
major cardiovascular event
(composite of cardiovascular death, nonfatal myocardial infarction, or nonfatal stroke) and
706 (18%) died during a median of 5 years of follow-up.
In Cox regression models, adjusting for all established risk predictors,
the hazard ratio for cardiovascular events for NT-proBNP was 1.95 per 1 SD increase (95% CI 1.72, 2.20) and
the hazard ratio for hs-cTnT was 1.50 per 1 SD increase (95% CI 1.36, 1.65). The hazard ratios for death were
1.97 (95% CI 1.73, 2.24) and
1.52 (95% CI 1.37, 1.67), respectively.
The addition of either marker improved 5-year risk classification for cardiovascular events
(net reclassification index in continuous model,
39% for NT-proBNP and 46% for hs-cTnT).
Likewise, both markers greatly improved the accuracy with which the 5-year risk of death was predicted.
The combination of both markers provided optimal risk discrimination.
CONCLUSIONS
NT-proBNP and hs-cTnT appear to greatly improve the accuracy with which the
risk of cardiovascular events or death can be estimated in patients with type 2 diabetes.
PreMedline Identifier: 24089534
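For readers who want to see how a hazard ratio “per 1 SD increase” of the kind reported above is typically derived, the following is a minimal sketch using the lifelines package in Python. The data frame, its column names, and the simulated values are hypothetical and do not reproduce the ADVANCE analysis.

# Hedged sketch of a hazard ratio "per 1 SD increase": log-transform and
# standardize the biomarker, then fit a Cox model. Uses the lifelines
# package; all data below are simulated and hypothetical.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "nt_probnp": rng.lognormal(mean=5, sigma=1, size=n),  # hypothetical pg/mL
    "age": rng.normal(66, 6, size=n),
    "followup_years": rng.uniform(0.5, 5.0, size=n),
    "cv_event": rng.integers(0, 2, size=n),
})

# Standardize so the fitted coefficient is interpreted per 1 SD increase.
log_bnp = np.log(df["nt_probnp"])
df["nt_probnp_sd"] = (log_bnp - log_bnp.mean()) / log_bnp.std()

cph = CoxPHFitter()
cph.fit(df[["nt_probnp_sd", "age", "followup_years", "cv_event"]],
        duration_col="followup_years", event_col="cv_event")
print(np.exp(cph.params_["nt_probnp_sd"]))  # hazard ratio per 1 SD increase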
Part 3. M-Atrial Natriuretic Peptide
M-Atrial Natriuretic Peptide and Nitroglycerin in a Canine Model of Experimental Acute Hypertensive Heart Failure:
Differential Actions of 2 cGMP Activating Therapeutics.
Paul M McKie, Alessandro Cataliotti, Tomoko Ichiki, S Jeson Sangaralingham, Horng H Chen, John C Burnett
Journal of the American Heart Association 01/2014; 3(1):e000206. http://dx.doi.org/10.1161/JAHA.113.000206 Source: PubMed
ABSTRACT
Systemic hypertension is a common characteristic in
acute heart failure (HF).
This increasingly recognized phenotype
is commonly associated with renal dysfunction and
there is an unmet need for renal enhancing therapies.
In a canine model of HF and acute vasoconstrictive hypertension
we characterized and compared the cardiorenal actions of M-atrial natriuretic peptide (M-ANP),
a novel particulate guanylyl cyclase (pGC) activator, and
nitroglycerin, a soluble guanylyl cyclase (sGC) activator.
HF was induced by rapid RV pacing (180 beats per minute) for 10 days. On day 11, hypertension was induced by continuous angiotensin II infusion. We characterized the cardiorenal and humoral actions
prior to,
during, and
following intravenous infusions of
M-ANP (n=7),
nitroglycerin (n=7),
and vehicle (n=7).
Mean arterial pressure (MAP) was reduced by
M-ANP (139±4 to 118±3 mm Hg, P<0.05) and
nitroglycerin (137±3 to 116±4 mm Hg, P<0.05);
similar findings were recorded for
pulmonary wedge pressure (PCWP) with M-ANP (12±2 to 6±2 mm Hg, P<0.05)
and nitroglycerin (12±1 to 6±1 mm Hg, P<0.05).
M-ANP enhanced renal function with significant increases (P<0.05) in
glomerular filtration rate (38±4 to 53±5 mL/min),
renal blood flow (132±18 to 236±23 mL/min), and
natriuresis (11±4 to 689±37 mEq/min) and
also inhibited aldosterone activation (32±3 to 23±2 ng/dL, P<0.05), whereas
nitroglycerin had no significant (P>0.05) effects on these renal parameters or aldosterone activation.
Our results advance
the differential cardiorenal actions of
pGC (M-ANP) and sGC (nitroglycerin) mediated cGMP activation.
These distinct renal and aldosterone modulating actions make
M-ANP an attractive therapeutic for HF with concomitant hypertension.
Risk of bias in translational medicine may take one of three forms:
a systematic error of methodology as it pertains to measurement or sampling (e.g., selection bias),
a systematic defect of design that leads to estimates of experimental and control groups, and of effect sizes that substantially deviate from true values (e.g., information bias), and
a systematic distortion of the analytical process, which results in a misrepresentation of the data with consequential errors of inference (e.g., inferential bias).
Risk of bias can seriously adulterate the internal and the external validity of a clinical study, and, unless it is identified and systematically evaluated, can seriously hamper the process of comparative effectiveness and efficacy research and analysis for practice. The Cochrane Group and the Agency for Healthcare Research and Quality have independently developed instruments for assessing the meta-construct of risk of bias. The present article begins to discuss this dialectic.
Background
As recently discussed in this journal[1], translational medicine is a rapidly evolving field. In its most recent conceptualization, it consists of two primary domains:
translational research proper and
translational effectiveness.
This distinction arises from a cogent articulation of the fundamental construct of translational medicine in particular, and of translational health care in general.
The Institute of Medicine’s Clinical Research Roundtable conceptualized the field as being composed by two fundamental “blocks”:
one translational “block” (T1) was defined as “…the transfer of new understandings of disease mechanisms gained in the laboratory into the development of new methods for diagnosis, therapy, and prevention and their first testing in humans…”, and
the second translational “block” (T2) was described as “…the translation of results from clinical studies into everyday clinical practice and health decision making…”[2].
These are clearly two distinct facets of one meta-construct, as outlined in Figure 1. As signaled by others, “…Referring to T1 and T2 by the same name—translational research—has become a source of some confusion. The 2 spheres are alike in name only. Their goals, settings, study designs, and investigators differ…”[3].
Figure 1. Schematic representation of the meta-construct of translational health care in general, and translational medicine in particular, which consists of two fundamental constructs: the T1 “block” (as per Institute of Medicine’s Clinical Research Roundtable nomenclature), which represents the transfer of new understandings of disease mechanisms gained in the laboratory into the development of new methods for diagnosis, therapy, and prevention as well as their first testing in humans, and the T2 “block”, which pertains to translation of results from clinical studies into everyday clinical practice and health decision making [3]. The two “blocks” are inextricably intertwined because they jointly strive toward patient-centered research outcomes (PCOR) through the process of comparative effectiveness and efficacy research/review and analysis for clinical practice (CEERAP). The domain of each construct is distinct, since the “block” T1 is set in the context of a laboratory infrastructure within a nurturing academic institution, whereas the setting of “block” T2 is typically community-based (e.g., patient-centered medical/dental home/neighborhoods[4]; “communities of practice”[5]).
For the last five years at least, the Federal responsibilities for “block” T1 and T2 have been clearly delineated. The National Institutes of Health (NIH) predominantly concerns itself with translational research proper – the bench-to-bedside enterprise (T1); the Agency for Healthcare Research and Quality (AHRQ) focuses on the result-translation enterprise (T2). Specifically: “…the ultimate goal [of AHRQ] is research translation—that is, making sure that findings from AHRQ research are widely disseminated and ready to be used in everyday health care decision-making…”[6]. The terminology of translational effectiveness has emerged as a means of distinguishing the T2 block from T1.
Therefore, the bench-to-bedside enterprise pertains to translational research, and the result-translation enterprise describes translational effectiveness. The meta-construct of translational health care (viz., translational medicine) thus consists of these two fundamental constructs:
translational research and
translational effectiveness,
which have distinct purposes, protocols and products, while both converging on the same goal of new and improved means of
individualized patient-centered diagnostic and prognostic care.
It is important to note that the U.S. Patient Protection and Affordable Care Act (PPACA, 23 March 2010) has created an environment that facilitates the pursuit of translational health care because it emphasizes patient-centered outcomes research (PCOR). That is to say, it fosters the transaction between translational research (i.e., “block” T1)(TR) and translational effectiveness (i.e., “block” T2)(TE), and favors the establishment of communities of practice-research interaction. The latter, now recognized as practice-based research networks, incorporate three or more clinical practices in the community into
a community of practices network coordinated by an academic center of research.
Practice-based research networks may be a third “block” (T3)(PBTN) in translational health care, and they could be conceptualized as a stepping-stone, a go-between linking bench-to-bedside translational research and result-translation translational effectiveness[7]. Alternatively, practice-based research networks represent the practical entities where the transaction between
translational research and translational effectiveness can most optimally be undertaken.
It is within the context of the practice-based research network that the process of bench-to-bedside can best seamlessly proceed, and it is within the framework of the practice-based research network that
the best evidence of results can be most efficiently translated into practice and
be utilized in evidence-based clinical decision-making, viz. translational effectiveness.
Translational effectiveness
As noted, translational effectiveness represents the translation of the best available evidence in the clinical practice to ensure its utilization in clinical decisions. Translational effectiveness fosters evidence-based revisions of clinical practice guidelines. It also encourages
effectiveness-focused,
patient-centered and
evidence-based clinical decision-making.
Translational effectiveness rests not only on the expertise of the clinical staff and the empowerment of patients, caregivers and stakeholders, but also, and
most importantly on the best available evidence[8].
The pursuit of the best available evidence is the foundation of
translational effectiveness and more generally of
translational medicine in evidence-based health care.
The best available evidence is obtained through a systematic process driven by
a research question/hypothesis articulated around clearly stated criteria that pertain to the
patient (P), the intervention (I), the comparator (C), the sought clinical outcome (O), the timeline (T), and the clinical setting (S).
PICOTS is tested on the appropriate bibliometric sample, with tools of measurement designed to establish the level (e.g., CONSORT) and the quality of the evidence. Statistical and meta-analytical inferences, often enhanced by analyses of clinical relevance[9], converge into the formulation of the consensus of the best available evidence. Its dissemination to all stakeholders is key to increasing their health literacy in order to ensure their full participation
in the utilization of the best available evidence in clinical decisions, viz., translational effectiveness.
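A PICOTS-framed question can be recorded as a simple structured object, as in the minimal sketch below; the field names follow the acronym just described, and the example content is hypothetical.

# A minimal sketch of a PICOTS-framed research question as a data structure.
# The field names follow the acronym described above; the example content is
# hypothetical and only illustrates how such a question might be recorded.
from dataclasses import dataclass

@dataclass
class PICOTS:
    patient: str        # P: population / patient problem
    intervention: str   # I: intervention under study
    comparator: str     # C: comparison intervention or control
    outcome: str        # O: clinical outcome of interest
    timeline: str       # T: time frame
    setting: str        # S: clinical setting

question = PICOTS(
    patient="adults with type 2 diabetes and chest pain",
    intervention="risk stratification adding hs-cTnT and NT-proBNP",
    comparator="established risk factors alone",
    outcome="cardiovascular death, nonfatal MI, or nonfatal stroke",
    timeline="5-year follow-up",
    setting="hospital outpatient clinics",
)
print(question)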
To be clear, translational effectiveness – and, in the perspective discussed above, translational health care – is anchored on obtaining the best available evidence,
which emerges from the highest quality research,
which in turn is obtained when errors are minimized.
In an early conceptualization[10], errors in research were presented as
those situations that threaten the internal and the external validity of a research study –
that is, conditions that impede either the study’s reproducibility, or its generalization. In point of fact, threats to internal and external validity[10] represent specific aspects of systematic errors (i.e., bias) in the
research design,
methodology and
data analysis.
Thence emerged a branch of science that seeks to
understand,
control and
reduce risk of bias in research.
Risk of bias and the best available evidence
It follows that the best available evidence comes from research with the fewest threats to internal and to external validity – that is to say, the fewest systematic errors: the lowest risk of bias. Quality of research, as defined in the field of research synthesis[11], has become synonymous with low risk of bias.
Several years ago, the Cochrane group embarked on a new strategy for assessing the quality of research studies by examining potential sources of bias. Certain original areas of potential bias in research were identified, which pertain to
(a) the sampling and the sample allocation process, to measurement, and to other related sources of errors (reliability of testing),
(b) design issues, including blinding, selection and drop-out, and design-specific caveats, and
(c) analysis-related biases.
A Risk of Bias tool was created (Cochrane Risk of Bias), which covered six specific domains:
1. selection bias,
2. performance bias,
3. detection bias,
4. attrition bias,
5. reporting bias, and
6. other research protocol-related biases.
Assessments were made within each domain by one or more items specific for certain aspects of the domain. Each item was scored in two distinct steps:
1. the support for judgment was intended to provide a succinct free-text description of the domain being queried;
2. each item was scored high, low, or unclear risk of material bias (defined here as “…bias of sufficient magnitude to have a notable effect on the results or conclusions…”[16]).
It was advocated that assessments across items in the tool should be critically summarized for each outcome within each report. These critical summaries were to inform the investigator so that the primary meta-analysis could be performed either
only on studies at low risk of bias, or for
the studies stratified according to risk of bias[16].
This is a form of acceptable sampling analysis designed to yield increased homogeneity of meta-analytical outcomes[17]. Alternatively, the homogeneity of the meta-analysis can be further enhanced by means of the more direct quality-effects meta-analysis inferential model[18].
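As a minimal illustration of the workflow just described, the sketch below records Cochrane-style per-domain judgments for hypothetical studies and selects the low-risk subset for a primary meta-analysis. The aggregation rule (any “high” makes the study high risk, any “unclear” makes it unclear) is a simplifying assumption, not the Cochrane handbook’s exact algorithm.

# Sketch of recording Cochrane-style per-domain judgments and stratifying
# studies by risk of bias before meta-analysis, as outlined above. The
# "any high -> high; any unclear -> unclear; else low" rule is a simplifying
# assumption; the studies and judgments are hypothetical.
DOMAINS = ["selection", "performance", "detection",
           "attrition", "reporting", "other"]

studies = {
    "Study A": {"selection": "low", "performance": "low", "detection": "low",
                "attrition": "unclear", "reporting": "low", "other": "low"},
    "Study B": {"selection": "high", "performance": "low", "detection": "low",
                "attrition": "low", "reporting": "low", "other": "low"},
    "Study C": {"selection": "low", "performance": "low", "detection": "low",
                "attrition": "low", "reporting": "low", "other": "low"},
}

def overall_risk(judgments: dict) -> str:
    values = [judgments[d] for d in DOMAINS]
    if "high" in values:
        return "high"
    if "unclear" in values:
        return "unclear"
    return "low"

low_risk_only = [name for name, j in studies.items() if overall_risk(j) == "low"]
print(low_risk_only)  # candidate set for the primary (low risk of bias) meta-analysis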
Clearly, one among the major drawbacks of the Cochrane Risk of Bias tool is
the subjective nature of its assessment protocol.
In an effort to correct for this inherent weakness of the instrument, the Cochrane group produced
detailed criteria for making judgments about the risk of bias from each individual item[16], and
recommended that judgments be made independently by at least two people, with any discrepancies resolved by discussion[16].
This approach to increase the reliability of measurement in research synthesis protocols
is akin to that described by us[19,20] and by AHRQ[21].
In an effort to aid clinicians and patients in making effective health care related decisions, AHRQ developed an alternative Risk of Bias instrument for enabling systematical evaluation of evidence reporting[22]. The AHRQ Risk of Bias instrument was created to monitor four primary domains:
1. risk of bias: design, methodology, analysis (scored low, medium, or high)
2. consistency: extent of similarity in effect sizes across studies within a bibliome (scored consistent, inconsistent, or unknown)
3. directness: unidirectional link between the intervention of interest and the sought outcome, as opposed to multiple links in a causal chain (scored direct or indirect)
4. precision: extent of certainty of the estimate of effect with respect to the outcome (scored precise or imprecise)
In addition, four secondary domains were identified:
a. Dose-response association: pattern of a larger effect with greater exposure (present/not present/not applicable or not tested)
b. Confounders: consideration of confounding variables (present/absent)
c. Strength of association: likelihood that the observed effect is large enough that it cannot have occurred solely as a result of bias from potential confounding factors (strong/weak)
d. Publication bias
The AHRQ Risk of Bias instrument is also designed to yield an overall grade of the estimated risk of bias in quality reporting:
•Strength of Evidence Grades (scored as high – moderate – low – insufficient)
This global assessment, in addition to incorporating the assessments above, also rates:
–major benefit
–major harm
–jointly benefits and harms
–outcomes most relevant to patients, clinicians, and stakeholders
The AHRQ Risk of Bias instrument suffers from the same two major limitations as the Cochrane tool:
1. lack of formal psychometric validation, as with most other tools in the field[21], and
2. providing a subjective and not quantifiable assessment.
To begin the process of engaging in a systematic dialectic of the two instruments in terms of their respective construct and content validity, it is necessary
to validate each for reliability and validity, either by means of classic psychometric theory or by generalizability (G) theory, which allows
the simultaneous estimation of multiple sources of measurement error variance (i.e., facets)
while generalizing the main findings across the different study facets.
G theory is particularly useful in clinical care analysis of this type, because it permits the assessment of the reliability of clinical assessment protocols.
The reliability and minimal detectable changes across varied combinations of these facets are then simply calculated[23], but
it is recommended that G theory determination follow classic psychometric assessment.
Therefore, we have commenced a process of revising the AHRQ Risk of Bias instrument by rendering the questions in the primary domains quantifiable (scaled 1–4),
which established the intra-rater reliability (r = 0.94, p < 0.05), and
the criterion validity (r = 0.96, p < 0.05) for this instrument (Figure 2).
Figure 2. Proportion of shared variance in criterion validity (A) and inter-rater reliability (B) in the AHRQ Risk of Bias instrument revised as described. Two raters were trained and standardized[20] with the revised AHRQ Risk of Bias and with the R-Wong instrument, which has been previously validated[24]. Each rater independently produced ratings on a sample of research reports with both instruments on two separate occasions, 1–2 months apart. The Pearson correlation coefficient was used to compute the respective associations. The figure shows Venn diagrams to illustrate the intersection between each two data sets used in the correlations. The overlap between the sets in each panel represents the proportion of shared variance for that correlation. The percent of unexplained variance is given in the insert of each panel.
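The computations behind these coefficients are straightforward; the following minimal sketch shows how a Pearson correlation and the corresponding proportion of shared variance (r squared) of the kind shown in Figure 2 can be obtained. The rating vectors are hypothetical, not the study data.

# Sketch of the computations behind Figure 2: a Pearson correlation between
# two sets of ratings, and the proportion of shared variance (r squared).
# The rating vectors below are hypothetical.
import numpy as np
from scipy.stats import pearsonr

rater_scores_time1 = np.array([3.2, 2.8, 3.9, 1.7, 2.5, 3.6, 2.1, 3.0])
rater_scores_time2 = np.array([3.1, 2.9, 3.8, 1.9, 2.4, 3.5, 2.3, 3.1])

r, p_value = pearsonr(rater_scores_time1, rater_scores_time2)
shared_variance = r ** 2           # proportion of variance the two sets share
unexplained = 1 - shared_variance  # the residual shown in the figure inserts
print(f"r = {r:.2f}, p = {p_value:.3f}, shared variance = {shared_variance:.1%}")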
A similar revision of the Cochrane Risk of Bias tool may also yield promising validation data. G theory validation of both tools will follow. Together, these results will enable a critical and systematic dialectical comparison of the Cochrane and the AHRQ Risk of Bias measures.
Discussion
The critical evaluation of the best available evidence is essential to patient-centered care, because biased research findings are fundamentally invalid and potentially harmful to the patient. Depending upon the tool of measurement, the validity of an instrument in a study is commonly established as criterion validity, through correlation coefficients. Criterion validity refers to the extent to which one measures or predicts the value of another measure or quality based on a previously well-established criterion. There are other domains of validity, such as construct validity and content validity, that are rather more descriptive than quantitative. Reliability, however, is used to describe the consistency of a measure, the extent to which a measurement is repeatable. It is commonly assessed quantitatively by correlation coefficients. Inter-rater reliability is rendered as a Pearson correlation coefficient between two independent readers, and establishes equivalence of ratings produced by independent observers or readers. Intra-rater reliability is determined by repeated measurement performed by the same subject (rater/reader) at two different points in time to assess the correlation or strength of association of the two sets of scores.
To establish the reliability of research quality assessment tools it is necessary, as we previously noted[20]:
•a) to train multiple readers in sharing a common view for the cognitive interpretation of each item. Readers must possess declarative knowledge (a factual form of information, static in nature): a certain depth of knowledge and understanding of the facts about which they are reviewing the literature. They must also have procedural knowledge (also known as imperative knowledge, which can be directly applied to a task): in this case, a clear understanding of the fundamental concepts of research methodology, design, analysis, and inference.
•b) to train the readers to read and evaluate the quality of a set of papers independently and blindly. They must also be trained to self-monitor and self-assess their skills for the purpose of insuring quality control.
•c) to refine the process until the inter-rater correlation coefficient and the Cohen coefficient of agreement are about 0.9 (over 81% shared variance). This establishes that the degree of attained agreement among well-trained readers is beyond chance.
•d) to obtain independent and blind reading assessments from readers on reports under study.
•e) to compute means and standard deviations of scores for each question across the reports, and to repeat the process if the coefficients of variation are greater than 5% (i.e., to ensure less than 5% error among the readers across each question).
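Step (e) can be illustrated with a minimal sketch that computes the per-question coefficient of variation across readers and flags any question exceeding the 5% threshold; the score matrix is hypothetical.

# Sketch of step (e): per-question coefficient of variation (CV) across readers,
# flagging any question whose CV exceeds 5%. The score matrix is hypothetical
# (rows = readers, columns = questions on one report).
import numpy as np

scores = np.array([
    [3.8, 3.2, 2.9, 4.0],   # reader 1
    [3.9, 3.1, 3.0, 3.9],   # reader 2
    [3.7, 3.3, 2.4, 4.0],   # reader 3
])

means = scores.mean(axis=0)
sds = scores.std(axis=0, ddof=1)
cv = sds / means            # coefficient of variation per question

for q, value in enumerate(cv, start=1):
    status = "OK" if value <= 0.05 else "repeat process"
    print(f"question {q}: CV = {value:.1%} -> {status}")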
The quantification provided by instruments validated in such a manner to assess the quality and the relative lack of bias in the research evidence allows for the analysis of the scores by means of the acceptable sampling protocol. Acceptance sampling is a statistical procedure that uses statistical sampling to determine whether a given lot, in this case evidence gathered from an identified set of published reports, should be accepted or rejected[12,25]. Acceptable sampling of the best available evidence can be obtained by:
•convention: accept the top 10 percentile of papers based on the score of the quality of the evidence (e.g., low Risk of Bias);
•confidence interval (CI95): accept the papers whose scores fall at or beyond the upper confidence limit at 95%, obtained with the mean and variance of the scores of the entire bibliome;
•statistical analysis: accept the papers that sustain sequential repeated Friedman analysis.
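The confidence-interval criterion can be sketched as follows; using the normal approximation (mean plus 1.96 standard errors) for the upper confidence limit is an assumption on our part, and the quality scores below are hypothetical.

# Sketch of the CI95 acceptance-sampling criterion described above: accept
# papers whose quality score falls at or beyond the upper 95% confidence
# limit of the whole bibliome's scores. The normal approximation
# (mean + 1.96 * SEM) is an assumption; the scores are hypothetical.
import numpy as np

scores = np.array([6.1, 7.4, 5.8, 8.9, 7.0, 9.2, 6.5, 8.8, 7.7, 5.9])
papers = [f"paper_{i:02d}" for i in range(len(scores))]

mean = scores.mean()
sem = scores.std(ddof=1) / np.sqrt(len(scores))
upper_cl95 = mean + 1.96 * sem

accepted = [p for p, s in zip(papers, scores) if s >= upper_cl95]
print(f"upper 95% confidence limit = {upper_cl95:.2f}")
print("accepted:", accepted)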
To be clear, the Friedman test is a non-parametric equivalent of the analysis of variance for repeated-measures designs. It requires the 4-E process outlined below:
•establishing a significant Friedman outcome, which indicates significant differences in scores among the individual reports being tested for quality;
•examining marginal means and standard deviations to identify inconsistencies, and to identify the uniformly strong reports across all the domains tested by the quality instrument
•excluding those reports that show quality weakness or bias
•executing the Friedman analysis again, and repeating the 4-E process as many times as necessary, in a statistical process akin to hierarchical regression, to eliminate the evidence reports that exhibit egregious weakness, based on the analysis of the marginal values, and to retain only the group of reports that harbors homogeneously strong evidence.
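A minimal sketch of this iterative 4-E procedure is given below, using the Friedman test from scipy. Dropping the report with the lowest marginal mean is a simplifying stand-in for “quality weakness or bias”, and the domain scores are hypothetical.

# Sketch of the iterative "4-E" Friedman procedure described above: test for
# differences among reports, drop the weakest report (lowest marginal mean,
# a simplifying assumption), and repeat until the Friedman test is no longer
# significant. Scores (reports x quality domains) are hypothetical.
import numpy as np
from scipy.stats import friedmanchisquare

scores = {                      # columns: quality-instrument domains
    "report_A": np.array([3.8, 3.9, 3.7, 4.0, 3.6]),
    "report_B": np.array([2.1, 2.4, 1.9, 2.2, 2.0]),   # weak report
    "report_C": np.array([3.5, 3.6, 3.4, 3.8, 3.5]),
    "report_D": np.array([3.9, 3.7, 3.8, 3.9, 3.7]),
}

alpha = 0.05
while len(scores) > 2:
    stat, p = friedmanchisquare(*scores.values())                 # establish
    if p >= alpha:                       # homogeneously strong set reached
        break
    means = {name: vals.mean() for name, vals in scores.items()}  # examine
    weakest = min(means, key=means.get)
    scores.pop(weakest)                                           # exclude
    # loop continues: execute the Friedman analysis again

print("retained reports:", sorted(scores))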
Taken together, and considering the domain and the structure of both tools, expectations are that these analyses will confirm that these instruments are two related entities, each measuring distinct aspects of bias. We anticipate that future research will establish that both tools assess complementary sub-constructs of one and the same archetype meta-construct of research quality.
References
Jiang F, Zhang J, Wang X, Shen X: Important steps to improve translation from medical research to health policy.
Maida C: Building communities of practice in comparative effectiveness research. In Comparative effectiveness and efficacy research and analysis for practice (CEERAP): applications for treatment options in health care. Edited by Chiappelli F, Brant X, Cajulis C. Heidelberg: Springer-Verlag; 2012: Chapter 1.
Agency for Healthcare Research and Quality: Budget estimates for appropriations committees, fiscal year (FY) 2008: performance budget submission for congressional justification.
Chiappelli F, Brant X, Cajulis C: Comparative effectiveness and efficacy research and analysis for practice (CEERAP): applications for treatment options in health care. Heidelberg: Springer-Verlag; 2012.
Dousti M, Ramchandani MH, Chiappelli F: Evidence-based clinical significance in health care: toward an inferential analysis of clinical relevance.
Campbell D, Stanley J: Experimental and quasi-experimental designs for research. Chicago, IL: Rand-McNally; 1963.
Littell JH, Corcoran J, Pillai V: Research synthesis reports and meta-analysis. New York, NY: Oxford University Press; 2008.
Chiappelli F: The science of research synthesis: a manual of evidence-based research for the health sciences. Hauppauge, NY: NovaScience Publisher, Inc; 2008.
Higgins JPT, Green S: Cochrane handbook for systematic reviews of interventions, version 5.0.1. Chichester, West Sussex, UK: John Wiley & Sons, The Cochrane Collaboration; 2008.
CRD: Systematic reviews: CRD’s guidance for undertaking reviews in health care. National Institute for Health Research (NIHR). York, UK: Centre for Reviews and Dissemination, University of York; 2009.
McDonald KM, Chang C, Schultz E: Closing the quality gap: revisiting the state of the science. Summary report. U.S. Department of Health & Human Services, AHRQ, Rockville, MD: AHRQ publication No. 12(13)-E017; 2013.