The Logic and Reliability of Evaluations
of Competence to Stand Trial
Jennifer L. Skeem, Stephen L. Golding, and Nancy B. Cohn
University of Utah
and
Gerald Berge
Utah State Hospital
In Press, Law and Human Behavior
Author Note
Jennifer L. Skeem, Stephen L. Golding, Nancy B. Cohn, University of Utah; and Gerald Berge, Utah State Hospital.
This paper is based on a master's thesis completed by Jennifer Skeem. Portions of the paper were presented at the American Psychological Association's 103rd Annual Convention (August, 1995). The authors thank Randall Borum for reviewing an early version of the coding manual for this study and 4 reviewers for their helpful comments on an earlier version of the manuscript.
Correspondence concerning this article should be addressed to Jennifer L. Skeem at the University of Utah, Department of Psychology, 390 S. 1530 E., RM#502, Salt Lake City, Utah 84112-0251. Electronic mail may be sent to skeem@psych.utah.edu
Running Head: LOGIC & RELIABILITY OF CST EVALUATIONS
Submission Date: 28 May 97
Revision Date 1: 24 November 97
Revision Date 2: 05 January 98
(Ms no. 97-046)
Abstract
Because the trier of fact determines the weight to be assigned to an examiner's opinion by assessing the strength and persuasiveness of his or her analysis of the data, it is essential that forensic reports communicate the examiner's reasoning process. This study analyzes community examiners' reports on competence to stand trial (CST), emphasizing the nature of examiners' (1) expressed conceptualizations of CST, and (2) reasoning establishing a nexus between CST impairments and symptoms of psychopathology. Expert raters coded 100 randomly selected CST reports with respect to a variety of issues, including the examiners' description of the defendant's psycholegal deficits, provision of specific reasoning to link these deficits to psychopathology, and agreement with a paired examiner's global and specific opinions about the defendant's impairments. CST reports were found to: (1) reflect basic operationalizations of competence that fail to incorporate legally relevant facets such as a defendant's decisional capacities; and (2) adequately document clinical findings, but fail to describe the reasoning underlying psycholegal conclusions. Examiners demonstrated moderately high levels of agreement on defendants' global CST, but expressed radically divergent bases for this opinion. These findings are discussed in light of legal, ethical, and professional standards of practice.
The Logic and Reliability of Evaluations of Competence to Stand Trial
The assessment of competence to stand trial has been labeled "the most significant mental health inquiry pursued in criminal law" (Stone, 1975, p. 200) because of the critical role of CST evaluations in criminal adjudication, the number of defendants evaluated annually, and the cost(1) of their evaluation, adjudication and treatment (see Steadman, Monahan, Hartstone, Davis & Robbins, 1982; Winick, 1987). Although the ultimate determination of CST is a legal matter, studies have uniformly concluded that judges often defer to the opinions of examiners, with rates of judge-examiner agreement typically exceeding 90% (Hart & Hare, 1992; Reich & Tookey, 1986; Williams & Miller, 1981). Since judges typically rely solely on examiners' written reports (Melton, Petrila, Poythress & Slobogin, 1987; Roesch & Golding, 1980; Steadman, 1979), the nature and reliability of the reasoning expressed in such reports becomes a critical part of the adjudication process.
Given the pivotal role of these reports, it is not surprising that commentators have directed a significant amount of criticism toward clinicians' CST evaluations. Critics have generally observed that examiners (1) fail to address CST or confuse the CST legal standard with that of civil commitment or insanity, (2) focus on diagnostic or treatment issues rather than psycholegal abilities, (3) rely on traditional clinical data (i.e., diagnoses, testing) to assess CST and fail to explain any relationship between psychopathology and psycholegal abilities, (4) fail to collect third party information to corroborate their opinions, and (5) issue conclusory opinions on CST devoid of explicit, data-based reasoning (see Bennett, 1985; Elwork, 1984; Grisso, 1986; Miller & Germain, 1986). Surprisingly, much of this criticism is based on a few early studies that included qualitative report surveys as adjunct issues to their primary focus on CST adjudication (Geller & Lister, 1978; Hess & Thomas, 1963; Roesch & Golding, 1977; Vann, 1965).
Few empirical studies have focused on the nature and quality of CST reports. A subset of these studies operationalize report quality as legal consumers' levels of satisfaction with reports, and have found acceptable "favorability" ratings with select samples of reports completed by forensically trained experts (Melton, Weithorn & Slobogin, 1985; Petrella & Poythress, 1980). Although these consumer ratings are an important index of report quality, they do not translate into report quality per se. Specifically, judges often defer to examiners' opinions and expect little of CST assessments, but nevertheless express a high degree of satisfaction with those they receive (Owens, Rosner & Harmon, 1987). Grisso (1987) has argued that legal professionals may expect and reward mediocrity in forensic assessment: judges are accustomed to following precedent and may require only that assessments offer no less than those they have received in the past. For these reasons, it is critical that report quality be measured not only from the perspective of legal consumers, but also in light of professional standards of practice. Thus, the degree to which reports reflect indices of quality defined by professional, legal, and ethical criteria should be assessed.
Several recent studies have begun to address this issue by investigating whether CST reports reflect basic assessment and documentation procedures delineated in ethical codes and expert commentary. Unlike early surveys, these studies almost uniformly found that examiners addressed the correct issue of CST. However, there was substantial variability in whether reports documented using third party information and assessment instruments, or cited provision of notice to the defendant about the purpose and confidentiality of the evaluation (Heilbrun & Collins, 1995; Heilbrun, Rosenfeld, Warren & Collins, 1994; Larkin & Collins, 1989; Nicholson, LaFortune, Norwood & Roach, 1995; Petrella & Poythress, 1980). While such studies provide critical information about CST assessments, they generally do not address the nature and reliability of any reasoning presented in support of psycholegal conclusions, nor what "operationalizations" or conceptualizations of competence the assessments reflect. Because the trier of fact determines the weight to be assigned to an examiner's opinion by assessing the strength and persuasiveness of his or her analysis of the data (American Bar Association, 1986; Bazelon, 1975), it is essential that CST reports communicate the examiner's reasoning and process of data interpretation.
Although Borum and Grisso (1996) have surveyed experts on the importance of addressing specific psycholegal abilities and documenting the reasoning underlying psycholegal conclusions in CST reports, only one study has assessed whether these issues are actually addressed in reports. Nicholson and his colleagues (1995) found that, of 7 CST abilities, only the defendant's understanding of the charge was described in the majority of reports. These reports reflected a basic "operationalization" of the CST construct in which foundational psycholegal abilities such as defendants' appreciation of their charges were addressed almost to the exclusion of contextually relevant, decisional abilities such as defendants' appreciation of plea bargaining (Bonnie, 1992; see discussion below for details). More importantly, however, only half of the reports in this study provided an example or rationale to support their conclusions about the defendant's CST abilities.
The central purpose of this study is to go beyond previous dichotomous (e.g., criteria present or absent) characterizations to analyze the nature and reliability of the logic examiners present in their reports and the degree to which their practices comport with legal, ethical, and professional standards (American Psychological Association, 1992; Borum & Grisso, 1996; Committee on Ethical Guidelines for Forensic Psychologists, 1991; Golding, 1993; Grisso, 1986, 1988; Heilbrun, 1992; Melton et al., 1985, 1987). At the time of this study, Utah's CST standard (Utah Code Annotated, § 77-15-2, 1992) was a relatively unelaborated variant of the federal standard (Dusky v. United States, 1960) which defined a defendant as incompetent if he(2) suffered from mental disease or defect that resulted in his inability to factually or rationally understand the proceedings or consult with counsel. Given this and virtually all legal standards for CST, a critical report evaluation question concerns the degree to which examiners assess and substantiate any nexus between symptoms of psychopathology and deficits in competence (Grisso, 1986; Golding, 1993; Melton et al., 1985, 1987; Nicholson & Kugler, 1991; Nicholson et al., 1995). This issue formed a central focus of this study, as did clarifying the nature of any relationships examiners described between particular symptoms of psychopathology and specific CST deficits. 
To address these issues, the logic and structure of CST reports were examined with respect to examiners' (1) description and substantiation of the defendant's CST abilities (e.g., "operationalization" of the CST construct), (2) description and substantiation of the defendant's symptoms of psychopathology, (3) provision of data and reasoning to link these symptoms to psycholegal deficits in support of CST conclusions, (4) corroboration of opinions with third party sources of information and assessment instrument data, and (5) agreement with another examiner's opinions on the defendant's clinical status and CST and on the bases for these opinions.
Method
Subjects
Fifty orders for initial CST evaluations were drawn from the files of Utah's Third District Court, which is responsible for over half of the state's forensic referrals (Utah State Division of Mental Health, 1995). Half of the orders were for all CST evaluations during a 13-month period prior to a state-wide forensic training initiative (1/91-2/92), and half were randomly drawn from a pool of orders filed over a 9-month period postdating the initiative (5/93-1/94).(3) Four of the orders initially drawn were replaced by randomly drawing a new order (3 because an assigned examiner failed to attend training and 1 due to missing data). Of the 50 defendants in the sample, most were single (74%), White (76%), unemployed (78%), males (86%) with an average age of 32 years (SD=11) and educational level of grade 11 (SD=2).(4)
Given that two clinicians were ordered to examine each defendant, 100 CST reports that referenced these defendants were obtained from legal or medical files. All of the reports were based on orders to address CST, and approximately half (46%) included an order to address criminal responsibility. According to information contained in the reports, only 9% of the evaluations were performed while defendants were committed at the state hospital. Of the evaluations, 80% were performed by Ph.D. psychologists, 14% by psychiatrists, and 6% social workers, all of whom were employed in the community. Only 2 examiners (responsible for 8% of the reports) held diplomate status with the American Boards of Forensic Psychology or Psychiatry. Thus, using Grisso's (1987) categorization, these examiners were "occasional experts" with highly variable degrees of forensic training.
Although 18 examiners (62% of all those state-approved) were represented in the report sample, over half (59%) of the evaluations were completed by 5 examiners. However, comparisons of the reports completed by these 5 examiners with those completed by the remaining 13 examiners revealed virtually no differences. First, χ² analyses were used to compare the full sample of reports completed by the 5 most frequent examiners (N=59) to those of the remaining 13 examiners (N=41). At a liberal familywise error rate of p=.10 (associated with a per-comparison error rate of p<.01), these 14 comparisons revealed no significant differences in: (1) the frequencies with which the 11 CST domains were addressed [e.g., Capacity for Reasoned Choice, χ²(1, N=100)=0.69](5), (2) whether or not examiners related CST impairments to psychopathology [χ²(1, N=173)=0.50], (3) the number of defendants deemed competent or incompetent [χ²(1, N=100)=1.76], or (4) global ratings of report quality [χ²(3, N=100)=2.63].
Second, because the above analyses violate the assumption of independent observations (in that most examiners contributed multiple reports and data points), we (1) computed the average proportion with which each examiner addressed each variable across reports, (2) completed an arcsine transformation on these average proportions to approximate a normal distribution (Cohen & Cohen, 1975), and (3) completed t-tests to compare the transformed average proportions of the 5 examiners to those of the remaining 13. At a liberal familywise error rate of p=.10 (per comparison, p<.01), these 13 comparisons revealed no significant differences in (1) the frequency with which the 11 CST domains were addressed [e.g., Capacity for Reasoned Choice, t(16) = 1.17, p=.26](6), (2) whether or not examiners related CST impairments to psychopathology, t(16) = -0.63, p=.54, or (3) the proportion of defendants deemed incompetent, t(16) = -1.58, p=.13. However, because the total sample size was only 18, these results could reflect insufficient power. Thus, we calculated 90% confidence intervals for the difference between the means for each test. Even assuming the largest effect included in the confidence interval for each variable, only 1 of 13 comparisons would be significant at the p<.01 level.(7)
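The examiner-level procedure can be sketched as follows. Two details are assumptions rather than statements from the text: the transformation is taken to be 2 * arcsin(sqrt(p)), and the t statistic is taken to be the pooled-variance form, which yields the reported 16 degrees of freedom for groups of 5 and 13. The per-examiner proportions are hypothetical.

```python
import math
from statistics import mean, variance


def arcsine_transform(proportion):
    """Variance-stabilizing transform 2 * asin(sqrt(p)) for a proportion in [0, 1]."""
    return 2.0 * math.asin(math.sqrt(proportion))


def pooled_t(group_a, group_b):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical average proportions of reports in which each examiner
# addressed one CST domain (one value per examiner, not study data).
frequent_examiners = [0.40, 0.55, 0.30, 0.60, 0.45]
other_examiners = [0.50, 0.20, 0.35, 0.70, 0.25, 0.40, 0.60,
                   0.30, 0.45, 0.55, 0.15, 0.65, 0.50]

t_value, df = pooled_t(
    [arcsine_transform(p) for p in frequent_examiners],
    [arcsine_transform(p) for p in other_examiners],
)
print(f"t({df}) = {t_value:.2f}")
```

Averaging within examiner before testing gives each examiner a single data point, which is what restores independence among the observations being compared.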
In sum, although analyses are complicated by the problems of using a "real-world," representative sample of reports, the results do not appear unduly influenced by the 5 examiners who completed a majority of the reports. The results appear to represent the full sample of 18 examiners.
Because a sample of CST reports representative of those submitted to Utah courts was desired, the sampling strategy was not designed to avoid duplication of examiners. Although examiners are supposed to be court appointed in Utah, in practice, examiners are selected and agreed upon by opposing attorneys. Thus, examiners with whom attorneys are familiar and/or prefer are likely to be selected to conduct evaluations repeatedly. In our experience, this selection process is representative of most jurisdictions.
Training Initiative
A secondary interest was in the effect of minimal training on report characteristics. Over a 2-year period, all examiners attended two 2-day state-wide forensic workshops, which included a total of 8 hours of training on CST legal standards and evaluations. Participants received a chapter on CST adjudication (Golding & Roesch, 1988) and both forms of the Interdisciplinary Fitness Interview (IFI, Golding & Roesch, 1983; IFI-R, Golding, 1993).
Report Coding Procedure
Based upon a comprehensive review of the literature, critiques and standards regarding CST assessment and report quality were distilled in a coding manual,(8) which underwent several iterations as a function of feedback from practicing professionals and nationally recognized experts. The final version of this manual codes the logic and structure of CST reports with respect to the issues listed below.
Description and substantiation of CST abilities. Substantial attention has been devoted to understanding the psycholegal abilities encompassed by the vague language of the federal Dusky standard. To add content to the language, several professional and legal bodies have created sets of functional abilities defendants must possess to stand trial (Grisso, 1986, 1988; Utah Code Annotated §77-15-5.4, 1994). Given these numerous sets of abilities, examiners could conceptualize and assess CST in a number of ways, ideally focusing on the issues most relevant to the defendant's case. Thus, to capture various conceptualizations of CST, raters coded whether each report described the defendant's capacities with respect to 11 global psycholegal domains and 31 nested subdomains (see Table 1 for a list). These domains and subdomains were based on psycholegal abilities reviewed in the IFI-R (Golding, 1993) and by Grisso (1988) in his manual for examiners. Across domains and subdomains, raters coded whether the report addressed the CST ability, and, if so, whether the defendant was described as impaired. Raters also coded whether examiners cited statutory CST criteria and considered the context of the defendant's case (i.e., the probable demands of trial) in reaching an opinion.
Description and substantiation of psychopathology. Each report's description of defendant psychopathology was coded based on the IFI-R (Golding, 1993) organization of symptoms into nine carefully defined categories (see Table 2). Each category was coded as present (symptom positive) or absent (symptom negative or not mentioned).(9) Raters also noted the primary DSM-III-R diagnosis described in the report and judged the degree to which examiners (1) presented symptoms to support their diagnosis and (2) described the specific nature of these symptoms. They also indicated whether examiners reported any medication the defendant was prescribed.
Raters coded whether examiners described assessing the defendant's potential for malingering, and if so, whether they believed the defendant was malingering. If malingering was assessed, raters coded whether the author used verifying information (i.e., records) to form an opinion. If malingering was not assessed, raters judged the degree to which malingering was an issue that should have been addressed.
Provision of reasoning for psycholegal conclusions. When a defendant was described as impaired(10) with respect to any CST domain, raters judged the degree of relationship specified by the examiner between that CST impairment and symptomatology, using the following scale: (0) No description of a relationship is provided (e.g., "The defendant is unable to provide information to assist in her defense."); (1) A relationship is merely implied by examples or defendant quotes (e.g., "When asked to describe her version of the alleged events, the defendant said she did not know what happened."); (2) A relationship is asserted, but no specific supporting data are provided (e.g., "The defendant has problems with memory which preclude her from providing information to assist in her defense."); and (3) A relationship is substantiated in that the author specifies how the defendant's CST impairment is linked to psychopathology, providing specific supporting data (e.g., "As noted, the accused has problems with memory, and could not relate what she was doing at the time of the alleged assault. She may have difficulty providing information to assist in her defense."). This rating scale reflects combinations of the following considerations: (a) the extent to which the examiner includes a statement that the defendant's CST impairment is based on psychopathology; (b) the extent to which the examiner provides specific data relevant to a link between the CST impairment and psychopathology; and (c) the extent to which the examiner specifically describes how the defendant's impairment is based on psychopathology. (The latter two considerations overlap heavily.) The rating criteria for "substantiated" do not assess whether examiners supported their CST impairment-psychopathology links with empirical validity studies. Examiners, to be fair, are writing CST reports rather than journal articles. Also, raters' judgments were not based on whether they agreed with examiners' reasoning.
For any non-zero relationship, raters also coded the symptom(s) to which the author linked the CST impairment, using the nine IFI-R categories noted above.
When a defendant was described as competent, raters judged how frequently (rarely, sometimes, or often) the assertion was supported with specific examples or defendant quotes.
The nature and relevance of assessment methods. Raters coded any CST assessment instruments, mental status examinations, diagnostic interviews, or psychological tests (PTs) that examiners described using. PTs were organized into intelligence, neuropsychological, objective, projective, and mood categories. Raters noted whether PT results were corroborated (with observations or external information) or used to rule out malingering. For each PT category and for the battery as a whole, raters judged the extent to which test results were related to the defendant's CST based on the following 3-point scale: (0) no relationship is described; (1) a vague relationship is asserted (e.g., "The defendant's verbal learning and memory abilities are impaired and may compromise his ability to assist counsel."); (2) a concrete relationship is specified (e.g., "The defendant's verbal learning and memory abilities are impaired and may affect his ability to recall the events in his trial as they unfold."). Finally, raters coded whether examiners described requesting and/or reviewing records or making any third party contacts (sources were specified).
Documentation of notice and general rating of quality. Finally, raters (1) coded whether examiners described having provided the defendant with a warning about the purpose and confidentiality of the evaluation, and (2) rendered a subjective rating of report quality based on four ordinal categories (poor to excellent).
Report Raters: Training and Reliability
Two experienced forensic psychologists(11) and the first author served as report raters. Raters completed a 2-day training in which they discussed and clarified rating criteria, coded 4 reports that were not included in the sample, and received feedback about coding problems. After all reports were "sanitized" by removing identifying examiner and defendant information, 2 raters were assigned to independently code each report. To avoid "drift," raters were given 6 detailed feedback letters about disagreements at equal intervals during coding. Significant coding problems were resolved by conference between the first two authors, and the coding manual and prior cases were altered when new rules became necessary.
Interrater reliability. To estimate the reliability of the raters' coding processes, reliability analyses were completed on the study variables that involved the most rater judgment. These included the ratings of CST impairment-psychopathology relationships,(12) test result-CST relationships, symptom and diagnosis substantiation, and general report quality. Because of their centrality to the study, reliability analyses were also completed for the 11 CST domains.
For the rating variables, two forms of interrater reliability were computed: weighted kappa (Cohen, 1968)(13) and percent agreement. Generally, kappa values of .75 and greater are considered to reflect excellent agreement; .60-.74, good agreement; .40-.59, fair agreement; and below .40, poor agreement (Cicchetti & Sparrow, 1981). Using these categorizations, the reliability data reflect good chance-corrected weighted agreement for ratings of CST impairment-psychopathology relationships (M Weighted Kappa= .74, SD= .11) and ratings of diagnosis and symptom substantiation and general report quality (M Weighted Kappa = .61, SD= .02), and excellent agreement for test result-CST relationships (M Weighted Kappa= .77, SD=.08). The average percent agreement for these groups of variables is 77%, 68%, and 93%, respectively.
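For readers unfamiliar with these indices, the following is a minimal sketch of how weighted kappa and percent agreement can be computed for two raters on an ordinal scale, together with the descriptive bands cited above. Linear disagreement weights are an assumption (the text does not state which weighting scheme was used), and the ratings shown are hypothetical values on the 0-3 relationship scale.

```python
def weighted_kappa(ratings_a, ratings_b, categories):
    """Cohen's (1968) weighted kappa using linear disagreement weights (assumed)."""
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(ratings_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n
    row_marginals = [sum(row) for row in observed]
    col_marginals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row_marginals[i] * col_marginals[j]
    if expected_disagreement == 0.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed_disagreement / expected_disagreement


def percent_agreement(ratings_a, ratings_b):
    """Percentage of cases on which the two raters gave the identical rating."""
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return 100.0 * matches / len(ratings_a)


def agreement_label(kappa):
    """Descriptive bands as cited in the text (Cicchetti & Sparrow, 1981)."""
    if kappa >= 0.75:
        return "excellent"
    if kappa >= 0.60:
        return "good"
    if kappa >= 0.40:
        return "fair"
    return "poor"


# Hypothetical ratings by two raters on the 0-3 relationship scale.
rater_1 = [0, 0, 1, 2, 2, 3, 0, 2, 1, 3]
rater_2 = [0, 1, 1, 2, 3, 3, 0, 2, 0, 3]

kappa = weighted_kappa(rater_1, rater_2, categories=[0, 1, 2, 3])
print(f"weighted kappa = {kappa:.2f} ({agreement_label(kappa)})")
print(f"percent agreement = {percent_agreement(rater_1, rater_2):.0f}%")
```

Because weighted kappa gives partial credit for near misses on an ordinal scale while percent agreement counts only exact matches, the two indices need not rank sets of variables in the same order.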
For the 11 CST domains, kappa and percent agreement were computed. The kappa figures reflect excellent chance-corrected rates of agreement for the CST domains (M Kappa= .83, SD=.12), which are corroborated by percent agreement data (M=90%, SD=7). Overall, the reliability analyses indicate respectable rates of interrater agreement in accord with those reported in other CST report coding studies (Heilbrun & Collins, 1995; Nicholson et al., 1995).
Resolution of rating disagreements. A single set of data for each of the 100 reports was formed after any rating disagreements were resolved by conference between the first two authors (with the second author independently resolving any rating disagreements involving the first author). This set of data was used in the main analyses.
Training Effects
Comparison of the pre-training and post-training samples of reports revealed virtually no differences indicative of training effects. First, χ² tests were used to compare the full pre-training sample of reports (N=50) to the post-training sample (N=50). Given that 13 comparisons were made, a liberal familywise error rate of p=.10 (per-comparison p<.01) was used for these tests. There were no significant differences between the samples in whether the reports related CST impairments to psychopathology, χ²(1, N=165) = 2.04. Similarly, only one CST domain (Appreciation of Appropriate Courtroom Behavior) was addressed with significantly different frequency in the samples, χ²(1, N=100) = 13.22, p<.001.(14) There were no significant differences between samples in global ratings of report quality, χ²(3, N=100) = 6.49.
Second, because the above analyses may violate the assumption of independent observations, we computed the average proportion, across pre-training and post-training reports, with which each examiner addressed each variable. Because 7 examiners did not have at least one pre-training and one post-training report, a total of 11 examiners were included in these analyses. We completed an arcsine transformation on the average proportions to approximate a normal distribution (Cohen & Cohen, 1975). We then completed paired t-tests on these data to compare examiners' pre-training with post-training report scores. At a liberal familywise error rate of p=.10 (per comparison, p<.01), 11 comparisons revealed only one significant difference in the frequency with which the 11 CST domains were addressed.(15) Examiners addressed the defendant's Basic Knowledge of Legal Strategies and Options more often following the training (M=1.67, SD=.77) than they did prior to the training (M=0.97, SD=1.28), t(10) = 3.01, p=.002. Unfortunately, there were insufficient data to calculate differences in the extent to which examiners linked CST impairments to psychopathology (for some examiners' averages, these links were irrelevant as only 1-2 reports were included and no CST impairments were noted in those reports). Because the total sample size for these analyses was only 11, the results could reflect insufficient power. Thus, we calculated 90% confidence intervals for the difference between the means for each test. Even assuming the largest effect included in the confidence interval for each variable, none of the comparisons would be significant at the p<.01 level. Given the above findings, the report samples were pooled for the main analyses.
Results
The Nature of Data and Reasoning Provided in CST Reports
Psycholegal factors considered in reports. Almost all reports contained at least one specific reference to a defendant's CST abilities (95%), and the majority cited Utah's CST standard (71%). The majority of reports (76%) included an ultimate opinion about the defendant's CST, and CST opinions were easily inferred from all reports. Surprisingly, in a majority of reports (53%), examiners opined that the defendant was incompetent to stand trial.
Table 1 lists the CST domains and subdomains in order of the frequency with which they were addressed in the reports and displays (a) the percent of reports that addressed each domain (column 1), (b) the proportion of reports that described the defendant as unimpaired or impaired on the domain, given that the domain was mentioned (columns 2-3), (c) the percent of reports that mention each subdomain, given that the domain is addressed (column 4), and (d) the proportion of reports that describe the defendant as unimpaired or impaired on the subdomain, given that the domain is mentioned (columns 5-6). Using Bonnie's categorization (1992; see discussion below for details), these data reveal that most reports addressed foundational CST abilities including the defendant's appreciation of charges, potential penalties, and roles of courtroom personnel, and capacity to disclose information to counsel. Higher-order, contextualized decisional abilities such as the defendant's capacity for reasoned choice of legal options were addressed relatively infrequently.
__________________________________
Insert Table 1 about here
__________________________________
Although most of the reports (67-85%) described defendants as unimpaired across the CST domains most often addressed, a majority of the reports (53%) concluded that defendants were incompetent to stand trial. In fact, the average rate of impairment across all CST domains (M= 36%) is substantially lower than the rate of opined incompetence (M=53%). Moreover, of the reports that concluded that defendants were incompetent, 15% noted 0 CST impairments, and 36% noted only 1 or 2 impairments.
It may be argued that incompetence is not quantifiable based on the rate of CST impairment, but involves weighing a defendant's impairments against the anticipated demands of his case. However, only 12% of the reports noted that the specific demands and/or context of the defendant's case were considered. Hence, most defendants were apparently not deemed incompetent because they failed to meet difficult trial demands.
Reasoning presented to support clinical and psycholegal conclusions. The nature and extent of the reasoning that examiners presented in support of their conclusions was assessed with respect to the examiners' opinions on defendants' CST impairments, CST abilities, and psychopathology.
To determine the degree to which examiners related CST impairments to psychopathology, the percent of absent, implied, asserted, and substantiated relationships was computed for each CST domain, given that the domain was described as impaired. These data reveal that reports generally provide little data to support their conclusions about defendants' CST impairments. Specifically, if examiners noted CST impairments, they typically provided no description of a relationship between the impairment and symptoms of psychopathology (M= 34%, SD=14), or merely asserted (M= 36%, SD=13) or implied (M= 19%, SD=7) that there was a relationship. Very few reports provided data or reasoning to specifically describe how a defendant's psychopathology compromised CST abilities (M= 10%, SD=5).
To determine the nature of the links between specific CST impairments and symptoms of psychopathology, the percent of reports describing specific symptoms as linked to the domain was computed for each domain where a non-zero CST impairment-symptom relationship was described. Because (1) sample sizes are modest, given that few reports described such relationships, and (2) the data generally reflect unsubstantiated examiner assertions about CST impairment-psychopathology relationships, these data will not be presented in detail. However, many of the asserted relationships make logical and clinical sense. For instance, a defendant's ability to "track" trial proceedings may be compromised by impaired memory (44%) or attention (39%), his capacity for appropriate courtroom behavior may be impaired by poor impulse control (60%), and his understanding of the nature of legal proceedings may be compromised by impaired cognition (80%). However, it is fundamentally unclear whether these relationships reflect "true" relationships or examiner inference about defendants' psycholegal abilities based on their psychopathology.
In the reports that concluded that defendants were competent, there was substantial variability in the degree to which examiners provided specific descriptions of defendants' psycholegal abilities. Of these reports, 41% "rarely," 28% "sometimes," and 31% "often" described defendants' CST abilities to corroborate their conclusions.
Relative to the variable and poor substantiation provided for conclusions about CST abilities and impairments, the reports usually provided adequate reasoning for their conclusions about defendant psychopathology. Most (87%) reports included opinions on the defendant's diagnosis. For analyses, DSM-III-R diagnoses were pooled into organic, psychotic, mood, developmental, and "other" disorder categories.(16) Most of the reports diagnosed defendants with psychotic disorders (41%) or organic disorders (22%). Mood disorders (15%), developmental disorders (11%), and "other" disorders (10%) were diagnosed less frequently. Of the reports in which diagnoses were issued, examiners often presented "multiple" or "most" (67%) of the symptoms that substantiated their diagnoses.
Virtually all (94%) reports presented defendant symptomatology. Table 3 displays the distribution of opined symptomatology across reports, and reveals that impaired mood (15%) and cognition (14%) were noted most frequently in the reports. Of the reports in which symptoms were noted, most "sometimes" or "often" (71%) provided specific examples of that symptomatology.
Although most (88%) of the reports did not describe ruling out malingering, raters judged that malingering was "probably not" or "definitely not" an issue in most (82%) of these reports. Of the reports that addressed malingering, most (58%) concluded that the defendant was malingering and supported this finding with data from records or testing (75%).
Examiner Agreement on Global Opinions and Bases for Opinions
Reliability analyses were completed to assess the degree to which pairs of examiners agreed about whether defendants were CST and/or suffered from particular forms of psychopathology, and whether they presented similar bases for their opinions on these issues. Specifically, analyses were completed on CST and diagnostic opinions, and the 11 CST domains and 9 symptom categories.
Table 2 displays three forms of interexaminer reliability data for these variables: percent agreement, kappa, and the Anderberg Joint Probability (Anderberg, 1973) statistic, which reflects the average percent of agreement given that one of the examiners made a judgment. For the CST domains, the Anderberg statistic is reported for examiner judgments about the defendant's impairment on domains (impaired/unimpaired) and about the salience of the domain in the defendant's case (not mentioned).(17) For the symptom categories, the statistic is reported only for examiner judgments about the defendant's impairment (impaired/unimpaired). Because many of the CST domains and symptom categories had disproportionate marginals, the Anderberg statistic may represent a less biased measure of agreement than kappa.
__________________________________
Insert Table 2 about here
__________________________________
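To make the three reliability indices concrete, the sketch below computes them for a single binary judgment (1 = impaired, 0 = unimpaired). Percent agreement and Cohen's kappa follow their standard definitions. The two conditional agreement figures are only an assumed reading of the Anderberg statistic as described above (agreement given that at least one examiner made the judgment); the exact formula used in the study is not given in the text, and the ratings are hypothetical.

```python
def agreement_stats(rater_a, rater_b):
    """Agreement indices for paired binary ratings (1 = impaired, 0 = unimpaired)."""
    pairs = list(zip(rater_a, rater_b))
    n = len(pairs)
    both_yes = sum(1 for a, b in pairs if a == 1 and b == 1)
    both_no = sum(1 for a, b in pairs if a == 0 and b == 0)

    # Raw percent agreement.
    observed = (both_yes + both_no) / n

    # Cohen's kappa: agreement corrected for chance, using each rater's marginals.
    p_a = sum(rater_a) / n
    p_b = sum(rater_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else float("nan")

    # Assumed Anderberg-style conditional agreement: among cases where at least
    # one examiner made the judgment, how often did both make it?
    either_yes = n - both_no
    either_no = n - both_yes
    return {
        "percent_agreement": observed,
        "kappa": kappa,
        "agree_given_either_impaired": both_yes / either_yes if either_yes else float("nan"),
        "agree_given_either_unimpaired": both_no / either_no if either_no else float("nan"),
    }


# Hypothetical ratings for 20 defendants on one rarely impaired CST domain.
examiner_a = [1, 1, 0] + [0] * 17
examiner_b = [1, 0, 1] + [0] * 17
stats = agreement_stats(examiner_a, examiner_b)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

With these made-up ratings, raw agreement is 90% even though the examiners agree on impairment in only one of the three cases where either noted it. This is the kind of skewed-marginal situation in which the separate conditional figures are more informative than a single summary index.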
Reliability of global opinions and CST domain opinions. Inspection of Table 2 reveals good chance-corrected rates of interexaminer agreement for global opinions on CST and broad diagnostic category. However, the data indicate poor interexaminer agreement for the specific CST impairments that form the bases for these opinions. Where both examiners addressed a CST domain, they agreed in an average of only 25% of cases about whether the defendant was impaired on that domain. For almost half of the domains, rates of agreement fell below 10%. These data are corroborated by the Anderberg figures. For example, given that one examiner reported defendant impairment on a CST domain, the other examiner rarely reported such impairment (M= .25).
Reliability of opinions on symptomatology. Relative to the CST domains, the reports demonstrated better agreement on defendant symptomatology. The average rate of agreement on symptom presence was 75%. The Anderberg data reveal that the reports agreed more on the absence of symptoms (M= .82) than on their presence (M= .55). However, for major symptoms such as hallucinations and cognitive impairment, examiner agreement on symptom presence was respectable.
Relationship Between CST Opinion and Psycholegal Variables
Because of the variability with which examiners addressed particular CST domains and symptom categories, it was impossible to derive a generalized decisional model for examiners' CST judgments via multivariate analyses. Therefore, univariate measures of association between examiners' opinions on defendant CST and the CST domains and symptom categories were computed.
Table 3 presents two forms of association data: (1) phi, depicting the degree of association between each CST domain (impaired/unimpaired) or symptom category (impaired/unimpaired or not mentioned) and examiners' CST opinions (competent/incompetent);(18) and (2) conditional probability figures.
__________________________________
Insert Table 3 about here
__________________________________
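The two forms of association data can be illustrated with a 2 x 2 table crossing domain impairment with the examiner's CST opinion. The sketch below uses the standard formula for phi and a simple conditional probability; the cell counts are hypothetical and are not taken from Table 3.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table laid out as:

                      incompetent   competent
        impaired           a            b
        unimpaired         c            d
    """
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else float("nan")


def p_incompetent_given_impaired(a, b):
    """Conditional probability of an incompetent opinion given domain impairment."""
    return a / (a + b) if (a + b) else float("nan")


# Hypothetical counts for one "critical" domain.
a, b, c, d = 19, 1, 11, 19

phi = phi_coefficient(a, b, c, d)
cond = p_incompetent_given_impaired(a, b)
print(f"phi = {phi:.2f}, P(incompetent | impaired) = {cond:.2f}")
```

In this made-up table, impairment almost always accompanies an incompetent opinion (19 of 20 cases), yet phi is only moderate because many defendants without that impairment are also found incompetent. The two indices therefore answer different questions and are worth reading together.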
Association between CST domain impairment and CST opinion. The CST domains in Table 3 are listed in order of the frequency with which they were addressed in the reports (the last 3 were addressed in fewer than 25% of the reports and should be interpreted cautiously). CST domain impairment was moderately related to examiners' final CST opinions. The nature of this relationship is best depicted in column 2 of Table 3: when defendants were described as impaired with respect to some domains such as appreciation of the charges or capacity for reasoned choice, they were virtually always (95-100%) deemed incompetent to stand trial.
For interpretive purposes, it is important to consider whether impairments on these "critical" domains may be related to other domain impairments. Although complicated by the variability of CST domain base rates, the majority of reports that found defendants incompetent mentioned 2 or fewer CST impairments.
Characterization of defendants found incompetent. The conditional probabilities displayed in rows 4-6 in Table 3 were computed to determine how the reports characterized incompetent defendants. Defendants found incompetent were reported to have approximately as many CST abilities (M=.21) as impairments (M= .25) across CST domains. Inspection of the more frequently addressed domains in the reports deeming defendants incompetent reveals no clear impairment pattern.
Association of symptomatology and diagnosis with CST opinion. Relative to reported psycholegal impairments, the degree of association between reported symptomatology and CST opinions was weak. Thought disorder and impaired attention, however, were moderately associated with CST opinions. Given the wide-ranging impact of these symptoms, this is not unexpected.
Diagnosis was not significantly associated with CST opinion (phi = .28, p = .14). Although defendants diagnosed with psychotic disorders were likely to be deemed incompetent, examiners did not appear to equate mental illness with incompetence (i.e., 31% of psychotic defendants were found competent).
Assessment Methods and Provision of Notice
Although relatively few reports described administering a mental status examination (45%) or formal diagnostic interview (3%), most of the reports described administering at least one PT (69%). Of the latter reports, 83% described using intellectual tests, most (75%) of which included the WAIS-R, and 75% cited personality tests, most (68%) of which included the MMPI or MMPI-2. Approximately half of the reports cited neuropsychological (64%) or projective tests (55%), but few reports cited mood inventories (12%).
The reports rarely corroborated the results of mood (0%), projective (16%) and neuropsychological (29%) tests based on observations or third party information, but sometimes substantiated personality (52%) and intellectual (47%) test results. More importantly, in most of the reports (70%), examiners failed to relate the results of the PT battery to the defendant's CST. In the remaining reports, examiners simply asserted that symptoms revealed by testing globally impaired the defendants' CST. An analysis of these links at the test category level reveals that mood and projective tests are least often related to CST (0-3%) while intellectual tests are most often related to CST (33%). Although an alternate purpose for using PTs might be to rule out malingering, only 21% of the reports described using PTs to do so.
In contrast with PT use, very few (25%) reports described using CST assessment instruments in the evaluations. The CST tool used most often was the Competency Assessment Instrument (10%).
With respect to including corroborating assessment data, a majority of the reports described reviewing police reports on defendants' alleged crimes (65%), and many cited the defendant's mental health records (37%). Very few reports described contacting the defense attorney (9%). Based on information contained in the reports, the reasons for failure to incorporate third-party information do not appear to be attributable to a lack of availability. Specifically, examiners ruled out the existence of particular types of records in a maximum of 9% of the reports. Only 5% of the reports described records as requested, but unavailable, and no reports cited failed attempts to contact third parties. Although third party information was apparently available, we cannot determine how easily accessible it was because the information that examiners were "routinely" provided with in Utah at the time of the study varied considerably by case and defense attorney.
Many of the reports included no indication that the defendant had been provided with a warning about the purpose of the evaluation (63%) or the limits of confidentiality (47%).
Discussion
The primary results of the study may be organized around three central points. First, examiners' reports reflect a very basic operationalization of competence and often fail to incorporate or address critical issues such as a defendant's decisional competence. Second, while reports adequately support clinical findings, they generally fail to provide any link between psycholegal deficits and symptoms of psychopathology. Third, while examiners generally agree on a defendant's global competence, they express substantially different bases for these opinions, particularly with respect to whether a defendant is impaired across specific psycholegal abilities.
Examiner "Operationalization" and Assessment of CST
Our data are consistent with modern findings (Heilbrun & Collins, 1995; Nicholson et al., 1995) that examiners are aware of the correct referral issue. However, the nature of the psycholegal abilities addressed suggests that examiners operationalize CST in terms of rudimentary competence abilities which correspond to Bonnie's conception of foundational competence (Bonnie, 1992; Nicholson et al., 1995). Bonnie's theory (1992, 1993) builds upon earlier analyses of the CST construct and the importance of considering the defendant's specific case context in assessing his competence (Burt & Morris, 1972; Roesch & Golding, 1980; Winick, 1987). According to Bonnie, CST may be understood as two separable constructs: a foundational construct of competence to assist counsel and a contextualized construct of decisional competence. Foundational abilities are the minimal abilities defendants must possess to participate in their defense. Decisional abilities reference the cognitive tasks of actively understanding and rationally choosing among legal options, and are contextual in that the content and rigor of the decisional "test" is determined based upon the context of the case (e.g., the complexity and nature of available defenses; whether the defendant is refusing the advice of counsel).
We found that examiners place heavy emphasis on foundational domains such as the defendant's appreciation of the charges, understanding of the roles of court personnel, and capacity to disclose information to counsel, while relatively little attention is devoted to higher-order decisional capacities referenced by the "rational" language in the Dusky standard, such as the defendant's capacity for reasoned choice of legal options. For example, reports rarely (12%) address the defendant's understanding of the implications of a guilty plea, despite the fact that (a) all defendants in this sample who returned to court engaged in some form of plea bargain, and (b) over 90% of cases are resolved via plea bargain nationally (Bonnie, 1992). Because most defendants will be called upon to fully understand and voluntarily waive the specific rights involved in pleading guilty, scholars and forensic experts argue that relevant decisional capacities should be routinely considered in assessing CST (see Borum & Grisso, 1995; Bonnie, 1992; Golding, 1993).
The Supreme Court was recently given an opportunity to clarify whether the "rational" language of the Dusky standard applied to a defendant's decision making capacities. In Godinez v. Moran (1993), the Court was faced with the question of whether the competence standard for pleading guilty or waiving the right to counsel was "higher" or different than the standard for CST. The Court held that the standard of competence for pleading guilty and proceeding pro se was the same as the Dusky standard for CST, but underscored that waivers of such rights must be made voluntarily and knowingly. After noting that "capacity for reasoned choice among available alternatives" appeared synonymous with Dusky's "rational understanding of proceedings," the Court reasoned that
...all criminal defendants- not merely those who plead guilty- may be required to make important decisions once criminal proceedings have been initiated. And while the decision to plead guilty is undeniably a profound one, it is no more complicated than the sum total of decisions that a defendant may be called upon to make during the course of a trial (at 2686).
Unfortunately, Godinez has spawned more confusion than clarification because the case involved two separable issues. Moran's CST evaluation deemed him competent while he was represented by counsel, but never addressed the issues that arose three months later when he dismissed counsel and decided to plead guilty and present no defense. Thus, the issues raised include (1) what competence standard should be applied to pleading guilty and proceeding pro se, and (2) if a defendant is found competent with respect to foundational issues, does that generalize to decisional competencies? When read from one perspective, the Godinez opinion implies that defendants must meet only foundational Dusky requirements that can be generalized across all CST issues. Alternatively, the opinion can be read to imply that the Dusky standard applies to all CST issues, but that decisional capacities must be assessed because defendants routinely make difficult choices.
Given the first interpretation, Godinez risks being understood to mean that every defendant found CST is competent to waive the constitutional rights involved in pleading guilty or proceeding pro se, even if such "decisional" issues were never addressed. Because defendants invariably face difficult decisions and may be deemed competent to make them if found CST, it is critical that CST evaluations include an assessment of the defendant's capacity to make decisions. Providing the court with such information may avoid inappropriate generalizations of "foundational competence" to decisional contexts in which the defendant is functionally incompetent. Research on competence to consent to treatment indicates that individuals' foundational abilities (e.g., understanding treatment information) do not necessarily predict their status on related decisional abilities (e.g., thinking rationally about treatment) (Grisso, Appelbaum, Mulvey & Fletcher, 1995).
As noted above, relevant areas of inquiry with respect to decisional capacities are often suggested by considering the context of the defendant's case. In fact, in order to make an informed assessment of CST, examiners should consider the trial demands and range of decisions that a defendant might face and assess his ability to meet those demands (Bonnie, 1992; Golding, 1993; Grisso, 1986; Roesch & Golding, 1980). Unfortunately, our data indicate that examiners almost never describe assessing the congruence between defendants' abilities and their particular case contexts. Examiners also virtually never describe assessing the impact of defendants' medication on their competence to stand trial (see Riggins v. Nevada, 1990).
Reasoning Presented in Support of Psycholegal Conclusions
Examiners typically presented sufficient reasoning to substantiate their clinical conclusions, but provided almost no reasoning to support their psycholegal conclusions. A typical report contained little or no reasoning addressing the nexus between clinical descriptions of symptomatology and impairments in CST abilities. This is a critically important report evaluation issue. Because legal criteria uniformly link mental disorder with competence deficits, examiners' reports should specifically inform the court about the nature of the relationship between the defendant's psychopathology and deficits in psycholegal abilities (Grisso, 1986; Roesch & Golding, 1980). Although a significant minority of the reports at least contained assertions that defendants' symptoms impaired their CST abilities, it may be argued that such assertions preempt the court from making an independent assessment by failing to provide the data that purportedly underlie such conclusions. Failure to specifically substantiate CST deficits precludes not only effective judicial scrutiny but also a detailed analysis of the nature of examiners' CST deficit-symptom links.
In parallel fashion, psychological testing should be used in forensic evaluation only when it can be specifically related to legal constructs (Grisso, 1987; Heilbrun, 1992; Nicholson & Kugler, 1991). The testing described in these reports does not appear to meet such a relevance criterion. Although most reports routinely included measures of personality and intellect, fewer than 30% related the results of testing to the defendant's competence, and only 21% described using testing to rule out malingering. The reports rarely described using assessment instruments specifically designed to assess CST.
Reliability of Opinions and the Bases for Opinions
Global agreement. The examiners generally agreed as to defendants' global diagnostic category (79%) and competence (82%). The latter finding is significant, given that defendants in this sample are deemed competent and incompetent with nearly equal frequency.
However, the rate with which examiners agree on the defendant's competence is substantially lower than the excellent rates of agreement (e.g., 90-97%) found when forensically trained examiners use structured CST assessment instruments in joint or independent interviews (see Golding, Roesch & Schreiber, 1984; Roesch & Golding, 1980; Schreiber, Roesch & Golding, 1987). Unfortunately, it is difficult to determine whether these excellent rates of agreement are primarily attributable to the use of the instruments or the training received to use the instruments. This is underscored by the fact that Poythress and Stock (1980) found perfect rates of agreement between extensively trained examiners who conducted joint interviews without the aid of CST assessment instruments.
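A rough worked example may help show why 82% agreement on competence is informative when competent and incompetent findings are nearly equally frequent. The calculation below assumes, purely for illustration, that both examiners in a pair judge defendants incompetent at the 53% rate reported for this sample and that chance agreement follows the usual kappa formula. The result is illustrative and is not the kappa reported in Table 2.

```latex
\begin{align*}
p_e &= (0.53)^2 + (0.47)^2 = 0.2809 + 0.2209 = 0.5018 \\
\kappa &= \frac{p_o - p_e}{1 - p_e} = \frac{0.82 - 0.5018}{1 - 0.5018} \approx 0.64
\end{align*}
```

By contrast, under a hypothetical base rate of 90% incompetent, chance agreement would be (0.90)^2 + (0.10)^2 = 0.82, so the same 82% raw agreement would be no better than chance.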
Specific agreement. Relative to their agreement on global opinions, examiners rarely present similar bases for their opinions, particularly with respect to specific CST deficits. While examiners agree in an average of 75% of cases about whether defendants experience particular symptoms of psychopathology, they agree in an average of only 25% of cases as to whether defendants are impaired on particular psycholegal domains. The latter rates of specific psycholegal agreement are much poorer than those that have been reported for trained examiners using structured CST instruments in joint interviews (Roesch & Golding, 1980; Golding et al., 1984). For instance, Roesch and Golding (1980) found that trained examiners using the CAI agreed in a median of 81% of cases (range=69-98%) as to whether defendants were impaired across 13 psycholegal abilities.
These poor rates of specific psycholegal agreement do not seem attributable to temporal instability in defendants' interview behavior. First, the intervals between the examinations were short: most pairs of examiners assessed and reported on defendants within about one week (MDN = 8 days) of each other. Second, psychiatric research indicates that a very small proportion (10%) of interexaminer diagnostic disagreements results from temporal change in the patient (Ward, Beck, Mendelson & Robinson, 1962). A much larger proportion is attributable to differences between examiners in the aspects of psychopathology they elicit (Beck, Ward, Mendelson, Mock & Erbaugh, 1962; Rosenzweig, Vandenberg, Moore & Dukay, 1961). Because there is little reason to believe that psycholegal deficits are grossly less stable than psychopathology, examiner disagreement with respect to CST abilities does not appear largely attributable to changes in defendants' presentation.
Practical effects of disagreement. Given these data, it is apparent that judges often receive reports that present similar ultimate opinions, but present dissimilar or conflicting data with respect to specific psycholegal abilities and deficits. Because most of the reports fail to provide data and reasoning to substantiate the defendant's specific psycholegal impairments, and many fail to do so with respect to the defendant's psycholegal abilities, judges are prevented from comparatively evaluating the reasoning expressed in the reports to render an independent judgment. These factors arguably make the courts dependent on the conclusions expressed in the reports, preempting their informed decision-making role (see Melton et al., 1987; Miller & Germain, 1986; Morse, 1978).
Additional Report Characteristics: Incorporation of Third Party Information and Provision of Notice
While a majority of examiners described reviewing the arrest report (which may be considered the "backbone" of conducting an appropriate CST evaluation), most reports did not describe contacting the defendant's attorney, a contact which could clarify relevant areas of inquiry by characterizing the reason for the referral and the likely demands of the defendant's case. Additionally, only a minority of examiners reviewed a defendant's mental health records, which may assist in detecting malingered deficits. Based on the examiners' reports, failure to review critical sources of third party information was not attributable to a lack of availability of the information.
These poor rates of incorporation of third party information are consistent with prior studies (Heilbrun & Collins, 1995; Heilbrun et al., 1994; but see Nicholson et al., 1995). This is disconcerting because consulting such sources of information is a fundamental part of completing relevant, informed, and valid forensic assessments (Golding, 1993; Melton et al., 1987). Moreover, examiners who fail to review and incorporate "outside" evidence leave themselves vulnerable to adversarial attack. Attorneys can easily assail uninformed examiners on the witness stand with evidence that contradicts their reports or conclusions. Clearly, there appears to be much "room for improvement" in the degree to which examiners review third party information (Heilbrun & Collins, 1995).
In keeping with prior studies of community examiners' reports (Heilbrun & Collins, 1995), only about half of these reports indicated that the defendant was given notice about the limits of confidentiality and purpose of the evaluation prior to the assessment. Provision and documentation of notice is a fundamental tenet of forensic assessment (Committee on Ethical Guidelines for Forensic Psychologists, 1991).
Base Rate Considerations
Most published studies have investigated CST assessments completed by either (a) examiners trained to use CST assessment tools, (b) inpatient forensic facility staff, or (c) community examiners with extensive forensic training (see Grisso, 1991). Few studies have investigated CST assessments in the more common ecological context that involves community-based "occasional experts" (Grisso, 1987) with little systematic forensic training.(19) Thus, it is difficult to contextualize our finding that such experts find many defendants incompetent. Specifically, although examiners in published studies find an average of 25-30% of referred defendants incompetent (Nicholson & Kugler, 1991; Roesch & Golding, 1980), examiners in this study found defendants incompetent in 53% of reports. Defendants in this sample were not more severely disordered than defendants in other studies (Nicholson & Kugler, 1991). Additionally, these results are not attributable to a few "extreme" examiners who completed multiple reports: virtually all the examiners deemed 40-50% of defendants incompetent.
Significant variability in base rates of incompetence has been reported across jurisdictions, with ranges as wide as 4-77% (Grisso, 1986). Several systems factors may account for this variability. First, there may be systematic differences in defendants referred for CST evaluation across jurisdictions, based on the nature of the CST referral system and the availability of (1) civil commitment and adequate mental health services, (2) pretrial mental health services, and (3) realistic mental state defenses (see Appelbaum, Fisher, Nestelbaum & Batemen, 1992; Golding, 1992; Miller, 1992; Steadman et al., 1993). Second, examiners in outpatient settings are more likely to find defendants incompetent than those in inpatient settings (Goldstein, 1973; Nicholson et al., 1995; Williams & Miller, 1981). Similarly, examiners with little forensic training may be more likely to find defendants incompetent than those with systematic training (Golding, 1993). Clearly, further research across different CST evaluation systems (Grisso, Cocozza, Steadman, Fisher & Greer, 1994) is necessary to clarify the reasons for base rate differences.
With specific reference to the elevated base rate observed in this study, our data suggest that these examiners set low thresholds for findings of incompetence. They typically assessed minimally demanding foundational CST abilities, described very few CST impairments, and yet deemed many defendants incompetent. These examiners apparently relied upon their clinical training to focus on detecting psychopathology. Then, given their lack of familiarity with the competence construct and their very basic operationalization of CST, they apparently surmised that any impairment in CST abilities proved a mentally ill defendant incompetent.
Implications for Examiner Training and Monitoring Programs
Anecdotal and empirical data have fueled criticism of examiners' forensic reports for several decades. Although this study suggests a reduction in gross evaluation errors (e.g., failure to address the issue of competence), many serious problems with report quality continue. These problems persist despite the development of ethical guidelines, professional handbooks, and some evaluator training programs meant to address them (Committee on Ethical Guidelines for Forensic Psychologists, 1991; Grisso, 1986; Melton et al., 1987). We found no improvement in report quality following examiners' participation in a forensic training initiative which consisted of two annual, 2-day workshops. This training initiative reflects the modal form of training in jurisdictions in which it is provided (Farkas, DeLeon & Newman, 1997). A comparison of our results with prior research suggests that the training provided in most jurisdictions may be insufficient.
Specifically, Melton and his colleagues (1985) found that examiners who completed a comprehensive training program obtained higher scores on tests of forensic knowledge and produced reports that were rated more favorably by legal personnel than examiners who did not. Unlike our training initiative, this training program consisted of 50 hours of lecture, demonstration, and perhaps most importantly, supervised evaluation. Although research is needed to identify the most effective components of forensic training (e.g., length, frequency, supervised experience), it appears that more extensive training is needed to improve examiners' assessments and reports. Evaluations may also be improved if forensic assessment systems (1) institute more stringent training and/or examination requirements for examiner certification (Farkas et al., 1997), and (2) develop methods for monitoring the quality of examiners' reports.
With the institution of more extensive training programs, more stringent certification requirements, and systems for monitoring report quality, examiners' CST assessments and reports are likely to improve. Failure to take such action will lend greater credence to some experts' proposals that we reform the competence evaluation process to assign defense attorneys, rather than clinicians, primary responsibility for determining competence based upon the nature of the attorney-client relationship and the specific demands of the defendant's case (Winick, 1995).
Potential Limitations and Future Directions
This study addressed how CST reports and assessment practices comport with legal criteria, ethical guidelines, and expert interpretation, commentary, and models for practice. Future research might address the comparability between these criteria and judicial ratings of "consumer satisfaction" with reports to clarify what factors influence judges' perception of the quality and usefulness of reports. In a related sense, research is needed to clarify whether judges would make more independent, informed determinations of competence if the quality of CST reports were improved to the extent that such determinations were possible.
Because these data are among the first to address the nature of CST reports, the generalizability of these findings requires further study. First, the random sampling strategy of this study was designed to obtain reports that were representative of those submitted to Utah courts. Although examiners are supposed to be court appointed in Utah, in practice, attorneys select examiners. Because examiners with whom attorneys are familiar and/or prefer are likely to be selected to complete evaluations repeatedly, we did not avoid duplication of examiners. Nevertheless, the fact that only 18 professionals completed these reports might limit the generalizability of this study. In addition, the results may not generalize well to the few jurisdictions where the court genuinely participates in the active selection of examiners, independent of attorney preference. Second, given the differences that have been demonstrated in hospital-based and community-based evaluations of CST (Heilbrun & Collins, 1995; Nicholson et al., 1995), the results may better generalize to community-based than inpatient CST evaluations. The results are also most generalizable to jurisdictions that provide examiners with little systematic forensic training.
One might argue that this study characterizes the nature and logic of CST reports, rather than the "inherent" nature of examiners' conceptualizations and evaluations of CST. While this is epistemologically true, the reports nevertheless reflect the examiners' assessments and conceptualizations as viewed by the courts. As noted, CST reports are the tangible product of the assessment and form the chief basis for judicial determinations of CST (Melton et al., 1985). As in other studies, judges in this sample almost invariably (89%) agreed with the conclusions expressed by examiners in their reports. Thus, the quality of the reasoning that examiners use to form their opinions and express in their reports is pivotal. Moreover, psychologists who perform forensic evaluations are under ethical obligations to provide adequate substantiation of their conclusions in their forensic assessments and reports (American Psychological Association, 1992), and to provide documentation of the data and factual bases for their conclusions in their forensic communications (Committee on Ethical Guidelines for Forensic Psychologists, 1991).
References
American Bar Association (1986). Justice Mental Health Standards, §7.3.14(b). Washington, DC: A.B.A.
American Psychological Association (1992). Ethical principles of psychologists and code of conduct. American Psychologist, 47, 1597-1611.
Anderberg, M. (1973). Cluster Analysis for Applications. New York: Academic Press.
Appelbaum, P., Fisher, W., Nestelbaum, Z., & Batemen, A. (1992). Are pretrial commitments for forensic evaluation used to control nuisance behavior? Hospital and Community Psychiatry, 43, 603-608.
Bazelon, D. (1975). A jurist's view of psychiatry. Journal of Psychiatry & Law, 3, 175-190.
Beck, A., Ward, C., Mendelson, M., Mock, J., & Erbaugh, J. (1962). Reliability of psychiatric diagnoses II: A study of consistency of clinical judgments and ratings. American Journal of Psychiatry, 119, 351-357.
Bennett, G. (1985). A guided tour through selected ABA standards relating to incompetence to stand trial. Georgetown Law Review, 53, 375-413.
Bonnie, R. (1992). The competency of criminal defendants: A theoretical reformulation. Behavioral Sciences & the Law, 10, 291-316.
Bonnie, R. (1993). The competence of criminal defendants: Beyond Dusky and Drope. University of Miami Law Review, 47, 539-601.
Borum, R., & Grisso, T. (1996). Establishing standards for criminal forensic reports: An empirical analysis. Bulletin of the American Academy of Psychiatry and Law, 24, 297-317.
Bureau of Labor Statistics (1996, February). Consumer Price Index: All Urban Consumers (#CUUR0000SA0). Available Internet: URL: http://stats.bls.gov/cgi-bin/surveymost?cu
Bureau of Labor Statistics (1996, February). How to Use the Consumer Price Index for Escalation. Available Internet: URL: http://stats.bls.gov/cpifact3.htm
Burt, R., & Morris, N. (1972). A proposal for the abolition of the incompetency plea. University of Chicago Law Review, 40, 66-95.
Cicchetti, D., & Sparrow, S. (1981). Developing criteria for establishing interrater reliability of specific items: Applications to assessment of adaptive behavior. American Journal of Mental Deficiency, 86, 127-137.
Cohen, J. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70, 213-220.
Cohen, J., & Cohen, P. (1975). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. New Jersey: Lawrence Erlbaum.
Committee on Ethical Guidelines for Forensic Psychologists (1991). Specialty Guidelines for forensic psychologists. Law and Human Behavior, 15, 655-665.
Dusky v. United States. 362 U.S. 402 (1960).
Elwork, A. (1984). Psychological assessments, diagnosis and testimony: A new beginning. Law and Human Behavior, 8, 197-203.
Farkas, G., DeLeon, P., & Newman, R. (1997). Sanity examiners certification: An evolving national agenda. Professional Psychology: Research & Practice, 28, 73-76.
Geller, J., & Lister, E. (1978). The process of criminal commitment for pre-trial psychiatric examination: An evaluation. American Journal of Psychiatry, 135, 53-63.
Godinez v. Moran. 113 S.Ct. 2680 (1993).
Golding, S. (1992). Studies of incompetent defendants: Research and social policy implications. Forensic Reports, 5, 77-83.
Golding, S. (1993). Interdisciplinary Fitness Interview-Revised: Training Manual. Unpublished manuscript.
Golding, S., & Roesch, R. (1983). Interdisciplinary Fitness Interview Training Manual. Unpublished manuscript.
Golding, S., & Roesch, R. (1988). Competency for adjudication: An international analysis. In D. Weisstub (Ed.), Law and Mental Health: International Perspectives. New York, NY: Pergamon.
Golding, S., Roesch, R., & Schreiber, J. (1984). Assessment and conceptualization of competency to stand trial: Preliminary data on the Interdisciplinary Fitness Interview. Law and Human Behavior, 9, 321-334.
Goldstein, R. (1973). The fitness factory, Part I: The psychiatrist's role in determining competency. American Journal of Psychiatry, 130, 1144-1147.
Grisso, T. (1986). Evaluating competencies: Forensic assessments and instruments. New York: Pergamon.
Grisso, T. (1987). The economic and scientific future of forensic psychological assessment. American Psychologist, 42, 831-839.
Grisso, T. (1988). Competency to stand trial evaluations: A manual for practice. Sarasota, FL: Professional Resource Exchange.
Grisso, T. (1991). Five-year research update (1986-1990): Evaluations for competence to stand trial. Behavioral Sciences and the Law, 10, 353-369.
Grisso, T., Appelbaum, P., Mulvey, E., & Fletcher, K. (1995). The MacArthur treatment competence study II: Measures of abilities related to competence to consent to treatment. Law & Human Behavior, 19, 127-148.
Grisso, T., Cocozza, J., Steadman, H., Fisher, W., & Greer, A. (1994). The organization of pretrial forensic evaluation services: A national profile. Law & Human Behavior, 18, 377-393.
Hart, S., & Hare, R. (1992). Predicting fitness for trial: The relative power of demographic, criminal and clinical variables. Forensic Reports, 5, 53-54.
Heilbrun, K. (1992). The role of psychological testing in forensic assessment. Law & Human Behavior, 16, 257-272.
Heilbrun, K., & Collins, S. (1995). Evaluations of trial competency and mental state at time of offense: Report characteristics. Professional Psychology: Research and Practice, 26, 61-67.
Heilbrun, K., Rosenfeld, B., Warren, J., & Collins, S. (1994). The use of third-party information in forensic assessments: A two-state comparison.
Hess, J., & Thomas, H. (1963). Incompetence to stand trial: Procedures, results and problems. American Journal of Psychiatry, 119, 713-720.
Larkin, E., & Collins, P. (1989). Fitness to plead and psychiatric reports. Medicine, Science and the Law, 29, 26-32.
Maguire, K., & Pastore, A. (Eds.). (1994). Sourcebook of Criminal Justice Statistics 1993. U.S. Department of Justice, Bureau of Justice Statistics. Washington, DC: USGPO.
Melton, G., Petrila, J., Poythress, N., & Slobogin, C. (1987). Psychological evaluations for the courts: A handbook for mental health professionals and lawyers. New York: Guilford Press.
Melton, G., Weithorn, L., & Slobogin, C. (1985). Community mental health centers and the courts: An evaluation of community-based forensic services. Lincoln NE: University of Nebraska Press.
Miller, R. (1992). Economic factors leading to diversion of the mentally disordered from the civil to the criminal commitment systems. International Journal of Law and Psychiatry, 15, 1-12.
Miller, R., & Germain, E. (1986). The specificity of evaluations of competency to proceed. Journal of Psychiatry and Law, 14, 333-347.
Morse, S. (1978). Crazy behavior, morals and science: An analysis of mental health law. Southern California Law Review, 51, 527-654.
Nicholson, R., & Kugler, K. (1991). Competent and incompetent defendants: A quantitative review of comparative research. Psychological Bulletin, 109, 355-370.
Nicholson, R., LaFortune, K., Norwood, S., & Roach, R. (1995). Pretrial competency evaluations in Oklahoma: Report characteristics and consumer satisfaction. Paper presented at the American Psychological Association's 103rd Annual Convention, New York, N.Y. August, 1995.
Owens, H., Rosner, R., & Harmon, R. (1987). The judge's view of competency evaluations II. Bulletin of the American Academy of Psychiatry and the Law, 15, 381-389.
Petrella, R., & Poythress, N. (1983). The quality of forensic evaluations: An interdisciplinary study. Journal of Consulting and Clinical Psychology, 51, 76-85.
Poythress, N., & Stock, H. (1980). Competency to stand trial: A historical review and some new data. Journal of Psychiatry and Law, 8, 131-146.
Reich, J., & Tookey, L. (1986). Disagreements between court and psychiatrist on competency to stand trial. Journal of Clinical Psychiatry, 47, 616-623.
Riggins v. Nevada. 112 S.Ct 1810 (1992).
Roesch, R., & Golding, S. (1977). A systems analysis of competency to stand trial procedures: Implications for forensic services in North Carolina. Urbana: University of Illinois.
Roesch, R., & Golding, S. (1980). Competency to stand trial. Urbana-Champaign, IL: University of Illinois Press.
Roesch, R., & Golding, S. (1987). Defining and assessing competence to stand trial. In I. Weiner & A. Hess (Eds.), Handbook of Forensic Psychology. NY: John Wiley.
Rosenzweig, N., Vandenberg, S., Moore, K., & Dukay, A. (1961). A study of the reliability of the mental status examination. American Journal of Psychiatry, 117, 1102-1108.
Schreiber, J., Roesch, R., & Golding, S. (1987). An evaluation of procedures for assessing competency to stand trial. Bulletin of the American Academy of Psychiatry and the Law, 15, 187-203.
Steadman, H.J. (1979). Beating a Rap? Defendants Found Incompetent to Stand Trial. Chicago: University of Chicago Press.
Steadman, H.J., McGreevy, M., Morrissey, J., Callahan, L., Robbins, P., & Cirincione, C. (1995). Before and After Hinckley: Evaluating Insanity Defense Reform. NY: Guilford Press.
Steadman, H.J., Monahan, J., Hartstone, E., Davis, S., & Robbins, P. (1982). Mentally disordered offenders: A national survey of patients and facilities. Law and Human Behavior, 6, 31-38.
Stone, A. (1975). Mental health and law: A system in transition. (DHEW Publication No. ADM-75-176). Rockville, MD: National Institute of Mental Health.
Utah State Division of Mental Health (1995). Distribution of Forensic Referrals by District, Referral Type, and Diagnosis. Unpublished report.
Utah Code Annotated §77-15-2 (1992); §77-15-5.4 (1994).
Vann, C. (1965). Pre-trial determination and judicial decision-making: An analysis of the use of psychiatric information in the administration of criminal justice. University of Detroit Law Journal, 43, 13-33.
Ward, C., Beck, A., Mendelson, M., & Robinson, J. (1962). The psychiatric nomenclature. Archives of General Psychiatry, 28, 198-205.
Williams, W., & Miller, K. (1981). The processing and disposition of incompetent mentally ill offenders. Law and Human Behavior, 5, 245-261.
Winick, B. (1987). Incompetency to stand trial: An assessment of costs and benefits, and a proposal for reform. Rutgers Law Review, 39, 243-287.
Winick, B. (1993). New directions in the right to refuse mental health treatment: The implications of Riggins v. Nevada. William and Mary Bill of Rights Journal, 2, 205-238.
Winick, B. (1995). Reforming incompetency to stand trial and plead guilty: A restated proposal and a response to Professor Bonnie. Journal of Criminal Law and Criminology, 85, 571-624.
Footnotes
1. Based on the U.S. Total Crime Index (Maguire & Pastore, 1994) and a conservative estimate that 2% of defendants are referred for evaluations (Bonnie, 1992), approximately 49,611 defendants were referred for CST evaluations in 1993. About 25% of these defendants were likely to have been found incompetent (Nicholson & Kugler, 1991; Roesch & Golding, 1980). After correcting Winick's (1987) costs for CST evaluations and treatment for inflation (see Bureau of Labor Statistics, 1996), the 1993 cost of an outpatient CST evaluation was approximately $2,953, and the cost of evaluating and treating an incompetent defendant was $28,776. Hence, in 1993, the nation spent approximately 467 million dollars on CST evaluation and treatment.
2. Where other pronouns are awkward, male pronouns are used in this paper to reflect the study's predominantly male sample.
3. Given an increase in the rate of CST referrals, random sampling was used for the latter half of the sample to obtain reports spanning an interval comparable to the first half.
4. Although there is a moderately lower percentage (approximately 13%) of minority defendants in this sample than average (see Nicholson & Kugler, 1991), this does not compromise generalizability because the "subjects" of this study are CST reports rather than defendants.
5. Remaining results included: Capacity to Appreciate Charges, χ²(1, N=100)=2.63, Appreciate Penalties, χ²(1, N=100)=0.57, Understand the Nature of Proceedings, χ²(1, N=100)=1.81, and Disclose Relevant Information, χ²(1, N=100)=4.35; Basic Knowledge of Legal Options, χ²(1, N=100)=0.09; Relationship with Counsel, χ²(1, N=100)=0.54; and Capacity to Behave Appropriately, χ²(1, N=100)=4.28, Participate in Trial, χ²(1, N=100)=6.07, and Testify, χ²(1, N=100)=3.34; and Medication Effects on CST, χ²(1, N=100)=0.04.
6. Remaining results included: Capacity to Appreciate Charges, t (16)= 0.40, p=.69, Appreciate Penalties, t (16)=0.28, p=.78, Understand the Nature of Proceedings t (16)= 1.06, p=.31, and Disclose Relevant Information, t (16)=0.30, p=.77; Basic Knowledge of Legal Options t (16)=-.14, p=.89; Relationship with Counsel t (16) = -.32, p=.75; and Capacity to Behave Appropriately t (16)=0.68, p=.51, Participate in Trial t (16)=2.17, p=.05, and Testify t (16)= 0.41, p=.69; and Medication Effects on CST t(16)=0.63, p=.54.
7. Specifically, for the CST domain, Capacity to Participate in Trial, the 90% confidence interval for the observed difference between the means (0.79) = 0.33-1.25. Assuming a large effect (1.25) with the same standard error of the difference, t (16) = 3.47, p <.01. The 5 "most frequent" examiners addressed this domain more often than did the remaining 13.
8. The coding manual is available from the first author upon request. More detailed analyses and additional data are also available upon request.
9. Symptoms not mentioned were coded with symptoms noted as absent because pilot coding revealed that (1) very few reports noted absent symptoms, and (2) it was time consuming to code the two categories separately. We decided that time would be more productively spent focusing on the study's primary goals (e.g., coding CST abilities and psycholegal reasoning).
10. For reports that described a CST "impairment" as attributable to a factor other than psychopathology (i.e., ignorance), raters coded the CST domain as "unimpaired" to avoid penalization for failure to link the "impairment" to mental disorder.
11. Both psychologists had over 10 years of experience working as licensed psychologists and 7 years of experience working chiefly in forensic psychology.
12. The coding manual was structured such that CST impairment-psychopathology relationships were not rated when a CST domain was coded as not mentioned or unimpaired. Thus, when raters disagreed as to whether a CST domain was mentioned or impaired, the domain was not included in the "relationship" reliability analyses.
13. Rather than treating all disagreements equally, weighted kappa provides weights such that more serious disagreements are weighted more heavily. For these analyses, weights of 0, 1, 2 and 3 were assigned.
14. This variable was addressed with greater frequency in the post-training report sample. Remaining results included: Capacity to Appreciate Charges, χ²(1, N=100)=1.96, Appreciate Penalties, χ²(1, N=100)=3.02, Understand the Nature of Proceedings, χ²(1, N=100)=4.1, and Disclose Relevant Information, χ²(1, N=100)=0.19; Basic Knowledge of Legal Options, χ²(1, N=100)=6.78; Relationship with Counsel, χ²(1, N=100)=8.99; and Capacity for Reasoned Choice, χ²(1, N=100)=1.70, to Participate in Trial, χ²(1, N=100)=.054, and to Testify, χ²(1, N=100)=7.89; and Medication Effects on CST, χ²(1, N=100)=4.34.
15. Remaining results included: Capacity to Appreciate Charges, t (10)= 1.29, p=.23, Appreciate Penalties, t (10)=1.79, p=.10, Understand the Nature of Proceedings t (10)= 1.59, p=.14, and Disclose Relevant Information, t (10)=0.56, p=.59; Relationship with Counsel t (10) = 0.27, p=.79; Capacity for Reasoned Choice t (10)=0.60, p=.56; Capacity to Behave Appropriately t (10)=1.31, p=.22, Participate in Trial t (10)=-0.79, p=.45, and Testify t (10)= -0.29, p=.78; and Medication Effects on CST t(10)=1.43, p=.18.
16. "Organic" includes dementia, organic personality, and organic NOS; "Psychotic" includes schizophrenia, schizoaffective, delusional, and psychotic NOS; "Mood" includes depression, bipolar, and adjustment; "Developmental" includes mental retardation, autism, and pervasive developmental; and "Other" includes substance, personality, attention deficit, post-traumatic, and multiple personality.
17. Because CST is often construed as a context-dependent or case-specific construct, the not mentioned category arguably reflects examiners' agreement on the relevant features of the defendant's CST.
18. Given that two examiners assessed each defendant, separate analyses were conducted to determine whether the correlations observed between psycholegal variables and examiners' opinions reflected data dependence. Specifically, all correlational analyses were repeated after randomly dropping one examiner from each defendant case. The results reflected little change (M change in phi= .05, range= .00-.14), suggesting that the relationships observed were not largely case-dependent.
19. In their national survey of pretrial evaluation services, Grisso, Cocozza, Steadman, Fisher and Greer (1994) found that, of 11 states classified as community-based, only 1 had an extensive forensic certification process and only 3 provided and required attendance at annual continuing education conferences. Of the 24 states the authors classified as traditional, modified-traditional, and private practitioner, only 1 reported engaging in any form of examiner education or quality control. Thus, most examiners receive little or no systematic state-supported forensic training. Although research is needed on examiners' levels of "non-state-based" forensic training, Grisso ventures that a large, diverse group of "occasional experts" includes many members who "enter into forensic assessment with little or no specialized forensic knowledge" (1987 at 833).
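The national cost estimate in footnote 1 can be checked with a short calculation. The sketch below is illustrative only: it uses the figures reported in that footnote and assumes that defendants found competent incur only the outpatient evaluation cost, while defendants found incompetent incur the combined evaluation-and-treatment cost. The footnote does not spell out this allocation, and the function and variable names are hypothetical.

```python
# Figures taken from footnote 1 (1993 dollars).
REFERRALS_1993 = 49_611            # defendants referred for CST evaluation
INCOMPETENT_RATE = 0.25            # proportion likely found incompetent
COST_EVALUATION = 2_953            # outpatient CST evaluation
COST_EVAL_AND_TREATMENT = 28_776   # evaluating and treating an incompetent defendant


def estimate_national_cost(referrals, incompetent_rate, cost_eval, cost_eval_treat):
    """Return the estimated total cost of CST evaluation and treatment.

    Assumption (not stated in the footnote): competent defendants incur
    only the evaluation cost; incompetent defendants incur the combined
    evaluation-and-treatment cost.
    """
    incompetent = referrals * incompetent_rate
    competent = referrals - incompetent
    return competent * cost_eval + incompetent * cost_eval_treat


total_cost = estimate_national_cost(
    REFERRALS_1993, INCOMPETENT_RATE, COST_EVALUATION, COST_EVAL_AND_TREATMENT
)
print(f"Estimated 1993 cost: {total_cost / 1e6:.0f} million dollars")
```

Under this allocation the total rounds to 467 million dollars, which matches the figure reported in footnote 1.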