
Reliability and validity of methods for measuring the duration of untreated psychosis: A quantitative review and meta-analysis

Schizophrenia Research, Volume 160, Issues 1-3, pages 20-26



The duration of untreated psychosis (DUP) has been associated with a wide range of clinical outcomes, and is considered to be one of the key parameters in managing clinical high risk and first episode psychosis patients. However, considerable discrepancies exist in the way that DUP is estimated in different studies. There is no standard or consensus on which method is most reliable and valid for assessing DUP.


This review aimed to quantitatively assess different DUP measurement instruments and definitions by comparing their inter-rater reliability, and their strength of validity in predicting biological and clinical outcomes.


Nine instruments designed for measuring DUP were found. Their inter-rater reliabilities were found to be adequate to excellent, although quite varied. This analysis did not show that any instrument was clearly outstanding compared to the others, although the limited available data do not exclude this possibility. DUP was also significantly associated with a range of outcomes, although mostly with small effect sizes. However, non-instrument based, ad hoc clinical interviews remained the most common way of measuring DUP. Definitions of onset of psychosis and onset of treatment were inconsistent among studies.


This review did not find quantitative evidence to support the use of one instrument over another. DUP remains a promising modifiable risk factor for a range of long-term clinical outcomes. Future research should quantify and improve the reliability and validity of the structured instruments for DUP measurement.

Keywords: Duration of untreated psychosis, Schizophrenia, Meta-analysis.

1. Introduction

The duration of untreated psychosis (DUP), or the time between the onset of psychosis and initiation of treatment, is a growing research and clinical focus in early psychosis. Research first identified DUP as a prognostic factor in the 1980s (McGlashan, 1999 and Singh, 2007). More recent systematic reviews and meta-analyses (Marshall et al, 2005, Norman et al, 2005, Perkins et al, 2005, Large et al, 2008, MacBeth and Gumley, 2008, Farooq et al, 2009, Large and Nielssen, 2011, and Boonstra et al, 2012) have examined the relationship of DUP to outcomes including positive and negative symptom severity at follow-up, likelihood of remission of psychosis, overall functioning, and quality of life. These efforts generally found that patients with shorter DUP fare better even after adjustment for premorbid functioning. Additional studies have generally found that numerous outcomes including neuropsychological test performance, changes on brain imaging, risk of violence, and risk of suicide are also associated with DUP (Large and Nielssen, 2011, Chang et al, 2013, and Chou et al, 2014), although these findings were not consistent. For example, a recent study reported that neuropsychological test performance is not related to DUP at initial hospitalization (Broussard et al., 2013). These findings support the increasing research and public health efforts aimed at reducing DUP by identifying and addressing barriers to the early identification and treatment of psychosis (Killackey and Yung, 2007, McGorry et al, 2007, Bird et al, 2010, Lloyd-Evans et al, 2011, and Broussard et al, 2013).

Despite these positive results, considerable discrepancies exist in the way that DUP is estimated in different studies. Measuring DUP requires establishing both the date on which psychosis first presented and the date on which treatment commenced. However, as has been discussed extensively in the literature, these measurements are fraught with practical difficulties, making reliable assessment difficult (Norman and Malla, 2001). The majority of patients with psychotic disorders as defined by established diagnostic criteria experience a prodromal period, which can include attenuated psychotic symptoms and/or brief limited intermittent psychotic episodes (Fusar-Poli et al., 2013). Even with a valid and reliable measure of symptom severity, the point at which attenuated or brief symptoms cross the diagnostic duration and severity thresholds is often unclear, making the dating of psychosis onset necessarily somewhat arbitrary (Norman and Malla, 2001, Compton et al, 2007, Compton et al, 2011, and Singh, 2007). In addition, psychosis encompasses multiple symptom categories, but studies are inconsistent as to which categories are used to define onset (Larsen et al, 2001, Norman and Malla, 2001, and Compton et al, 2007).

Dating the onset of treatment is quite inconsistent across studies as well, as multiple definitions of “treatment” have been used for both theoretical and practical reasons. The date on which the patient first sought mental health care for psychotic symptoms, and the date on which the first antipsychotic prescription was written, are appealingly concrete but do not necessarily imply a patient has received effective treatment (Breitborde et al., 2009). The date of the first hospitalization is similarly practical, but may falsely lengthen DUP by disregarding any prior period of outpatient treatment (Large et al., 2008), or falsely shorten DUP by disregarding periods of ineffective inpatient treatment caused for example by nonadherence or treatment resistance. Attempting to discern the first clinically effective antipsychotic trial, although intuitively logical, introduces the problem of reliably defining “effective” (Norman and Malla, 2001, Compton et al, 2007, Breitborde et al, 2009, and Dell'Osso et al, 2013). Determining the first “effective” psychosocial treatment, like supported employment or cognitive-behavioral therapy for psychosis, has not, to our knowledge, been attempted in the literature, but would likely encounter similar methodological difficulties.
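To make the effect of these definitional choices concrete, the following is a minimal Python sketch. The patient timeline, the dates, and the function name `dup_days` are hypothetical and serve only to illustrate how a single history can yield different DUP values depending on which definition of treatment onset is applied; none of the numbers come from the reviewed studies.

```python
from datetime import date


def dup_days(psychosis_onset: date, treatment_onset: date) -> int:
    """Return DUP in days: time from psychosis onset to treatment onset."""
    if treatment_onset < psychosis_onset:
        raise ValueError("treatment onset precedes psychosis onset")
    return (treatment_onset - psychosis_onset).days


# Hypothetical timeline for a single patient (illustrative dates only).
psychosis_onset = date(2012, 1, 15)
treatment_onsets = {
    "first treatment for psychotic symptoms": date(2012, 6, 1),
    "first antipsychotic prescription": date(2012, 6, 20),
    "first psychiatric hospitalization": date(2012, 9, 3),
}

# The same psychosis onset date gives a different DUP under each definition.
dup_by_definition = {
    definition: dup_days(psychosis_onset, onset)
    for definition, onset in treatment_onsets.items()
}

for definition, days in dup_by_definition.items():
    print(f"{definition}: {days} days")
```

In this invented example the three definitions differ by roughly three months, which is the kind of between-study variation in treatment onset dating that the text describes.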

Further complicating these measures is the fact that patients typically enroll in DUP studies after they first seek treatment for their psychosis. The dating of both symptom onset and treatment seeking is therefore usually retrospective and potentially done at a time when the patient is still experiencing psychotic and/or cognitive symptoms (Maurer and Hafner, 1995 and Norman and Malla, 2001). Most studies gather collateral data from family, mental health providers, or medical records to address this difficulty; researchers must methodically reconcile these sources when they disagree (Norman et al., 2005).

Because no previous studies that we are aware of have quantitatively compared specific methods for measuring DUP in the same patient populations, the method that achieves the best reliability and validity is still unclear. At least five excellent reviews have qualitatively discussed the above issues in measuring DUP (Norman and Malla, 2001, Compton et al, 2007, Singh, 2007, Breitborde et al, 2009, and Dell'Osso et al, 2013), and at least seven have systematically examined DUP as a predictor of clinical outcomes (Marshall et al, 2005, Norman et al, 2005, Perkins et al, 2005, MacBeth and Gumley, 2008, Farooq et al, 2009, Large and Nielssen, 2011, and Boonstra et al, 2012). However, these reviews did not aim to establish quantitative comparisons of the reliability and validity of DUP assessment tools.

The aim of the present paper is to summarize and, to the extent possible, quantitatively evaluate the quality of the methods that have been used in the peer-reviewed published literature for measuring DUP. Here, we consider the quality of DUP assessments in the context of reliability and validity. For reliability, because very few included studies have reported test-retest reliability, our evaluation is based on inter-rater reliability. For validity, although construct and face validity are seemingly simple constructs, how to quantify the validity of DUP is not straightforward, especially since construct and criterion validity were not determined during the development of many of the DUP instruments, as one would expect had they been derived from a strict empirical measurement perspective. Therefore, we were required to operationally define what the field intends to achieve by measuring DUP. DUP is an important research concept in part because it may predict outcomes of disease and treatment. In this review we evaluated predictive validity by exploring which DUP measurement method best predicts outcomes of disease and treatment. Specifically, we used two categories of quantitative data to assess and compare DUP measurement tools: 1) their inter-rater reliability for psychosis onset, treatment onset, and/or DUP measurements; and 2) their effect size for the relationship between DUP and biological and clinical outcomes.

2. Methods

PubMed was searched for the keywords “duration of untreated psychosis” OR “duration” AND “untreated” AND “psychosis” on December 5, 2013. The PRISMA flow diagram detailing the reviewed citations is in Fig. 1. Inclusion criteria were publication in a peer-reviewed journal, measurement of DUP, and assessment of the relationship of DUP to one or more patient outcome measurements. A total of 141 studies were included. The following information was extracted from these papers: number of included individuals with psychosis, instrument or method used to measure DUP, definition of onset of psychosis, definition of onset of treatment, definition of outcome, reported association of DUP with this outcome, and inter-rater reliability for the measurement of psychosis onset, treatment onset and/or DUP. Authors were contacted with requests for copies of the instruments that were not available online.


Fig. 1 PRISMA Flow Diagram, after Moher et al., 2009.

The reported associations of DUP with biological and clinical outcomes were entered into the software package Comprehensive Meta Analysis version 2.2 (CMA, Biostat, Inc., Englewood, NJ, 2011). Nine papers were excluded from the analysis because they lacked data in a form compatible with this meta-analysis package, bringing the total included papers to 132. Results based on univariate analysis were available from 121 papers, and only results based on multivariate analysis were available from the remaining 11; results were almost identical when these 11 papers were excluded, so they were included in the final analysis. When multiple statistical results were presented in the same paper, we used the result with the lowest p value; this choice was based on the assumption that papers reporting just one p value may have had other non-significant results they did not report. For three papers, the p value was given as “not significant” rather than as a specific number, so a p value of 0.45 was used. This was arbitrary, to avoid representing “nonsignificant” with p = 0.06 or p = 0.99. Following Perkins et al., 2005, for studies with results from assessments done at multiple time points, the result from the longest time point was used. Although the relative importance of DUP throughout the course of illness has not been established for most outcomes, in their meta-analysis of the relationship of DUP to negative symptoms Boonstra et al. (2012) found no evidence for attenuation of the strength of association at long term (5-8 year) versus short term (1-2 year) follow-up or baseline. Meta-analyses were performed using random effects models, synthesizing outcome data with Fisher’s Z transformation into a single Fisher’s Z statistic with 95% confidence interval. The heterogeneity of the sample populations was evaluated with the I² statistic calculated in CMA. Following Perkins et al., 2005, clinical outcomes were grouped as positive symptoms, negative symptoms, overall functioning, and relapse risk. In addition, categories were created for “outcomes” in treatment adherence, neuroimaging, neurocognitive changes, and suicidality and violence.
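The pooling step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the CMA implementation: it assumes each study contributes a correlation coefficient and a sample size, and it assumes the DerSimonian-Laird estimator for the between-study variance, which the text does not specify. The function names and the four example studies are hypothetical.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's Z transformation of a correlation coefficient."""
    return math.atanh(r)


def random_effects_pool(correlations, sample_sizes):
    """Pool correlations on the Fisher's Z scale with a random-effects model.

    Assumption: DerSimonian-Laird estimate of the between-study variance.
    Returns (pooled_z, ci_low, ci_high, i_squared_percent).
    """
    z = [fisher_z(r) for r in correlations]
    v = [1.0 / (n - 3) for n in sample_sizes]   # within-study variance of Z
    w = [1.0 / vi for vi in v]                  # fixed-effect weights
    df = len(z) - 1

    z_fixed = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, z))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    w_re = [1.0 / (vi + tau2) for vi in v]      # random-effects weights
    pooled = sum(wi * zi for wi, zi in zip(w_re, z)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, i2


# Hypothetical study-level correlations between DUP and one outcome.
rs = [0.10, 0.25, 0.30, 0.15]
ns = [120, 80, 60, 200]
pooled_z, ci_low, ci_high, i2 = random_effects_pool(rs, ns)
print(f"pooled Z = {pooled_z:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}], I2 = {i2:.1f}%")
```

When the studies agree closely, the between-study variance estimate is zero and the result coincides with a fixed-effect pooled estimate; larger disagreement widens the confidence interval and raises I².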

3. Results

3.1. Reliability of DUP assessments

The 132 studies were from 94 research groups, accounting for a total of 17 135 subjects. A summary of the reliability of DUP, psychosis onset, and/or treatment onset measurements is in Table 1, categorized by type of instrument. More comprehensive information on each of the 132 studies is tabulated in the Online Supplement Table S1.

Table 1 Reported reliabilities for DUP, psychosis onset, and treatment onset by DUP measurement method.

| Instrument | Estimated time to administer and score | Reported reliability (a): DUP | Reported reliability (a): psychosis onset | Reported reliability (a): treatment onset | Number of studies (b) | Number of research groups (c) | Total number of subjects |
|---|---|---|---|---|---|---|---|
| Basel Interview for Psychosis | | NI | NI | NI | 1 (1 + 0) | 1 | 60 |
| Beiser Scale | 0.5 hr | ICC = 0.79-0.98 | ICC = 0.94-0.98 | 0.95 | 11 (9 + 2) | 7 | 786 |
| Comprehensive Assessment of Symptoms and History | 2 hr | ICC = 0.87-1.00 | ICC = 0.96 | ICC = 0.96-1.00 | 4 (3 + 1) | 2 | 337 |
| Circumstances of Onset and Relapse Schedule | 1.5 hr (d) | ICC = 0.71-0.98 | NI | NI | 7 (7 + 0) | 1 | 259 |
| Interview for the Retrospective Assessment of the Onset of Schizophrenia | 1.5-2 hr | κ = 0.6-0.95 | PA = 77% | PA = 80-100% | 11 (8 + 3) | 8 | 1089 |
| Nottingham Onset Schedule | 0.25-0.75 hr | ICC = 0.95-0.99 | PA = 70% | NI | 2 (2 + 0) | 2 | 174 |
| Positive and Negative Syndrome Scale for Schizophrenia (modified) | 0.5 hr | ICC = 0.9-0.99 | NI | NI | 18 (17 + 1) | 8 | 1969 |
| Psychiatric and Personal History Schedule | 0.5-1 hr | ICC = 0.90 | NI | NI | 4 (4 + 0) | 2 | 277 |
| Royal Park Multidiagnostic Instrument for Psychosis | 4-7 hr (d) | κ = 0.79 | κ = 0.79 | NI | 6 (6 + 0) | 1 | 661 |
| Symptom Onset in Schizophrenia Inventory | 0.5 hr | ICC = 0.99 | ICC = 1.0 | NI | 7 (5 + 2) | 5 | 937 |
| Clinical Interview | | ICC = 0.7-1.0 | NI | NI | 55 (55 + 0) | 52 | 10 089 |
| Chart Review | | ICC = 0.73 | NI | NI | 6 (6 + 0) | 5 | 497 |
| Totals | | | | | 132 | 94 | 17 135 |

a Some reports included multiple reliability calculations. For those cases, we included the best reliability reported.

b Total number of included studies. Some studies may not use the full, original version. Numbers in parentheses indicate the number of studies using the unmodified instrument plus the number of studies using the modified versions of the instrument.

c Numbers of research group may not match the numbers of studies when the instrument was used in more than one report by a research group. The numbers of research groups were an approximation based on authorship and reported affiliations.

d Time required to administer the full instrument; an abbreviated version targeting DUP measurement is shorter.

DUP, duration of untreated psychosis. ICC, intraclass correlation. NI, none identified from the published report. PA, pairwise agreement.

Twenty-seven of these research groups used eight instruments developed specifically to assess DUP. The most commonly used instrument was the Interview for the Retrospective Assessment of the Onset of Schizophrenia (IRAOS), used by eight groups. The inter-rater reliability of the IRAOS was 73 to 97% by pairwise agreement (Hafner et al, 1992 and Hafner et al, 1994). The Beiser scale (Beiser et al., 1993) was used by seven groups, with a reported ICC of 0.79 to 0.98 (Clarke et al., 2006). The Symptom Onset in Schizophrenia Inventory (SOS; Perkins et al., 2000) was used by five groups, with κ = 0.49 to 1.0 for individual items, and κ = 0.8-0.98 overall (Cuesta et al., 2011). The Comprehensive Assessment of Symptoms and History (CASH) and Nottingham Onset Schedule (NOS) were each used by two research groups, and the Circumstances of Onset and Relapse Schedule (CORS), Royal Park Multidiagnostic Instrument for Psychosis (RPMIP), and Basel Interview for Psychosis each by one research group (McGorry et al, 1990, Andreasen et al, 1992, Ho et al, 2004, Norman et al, 2004, Singh et al, 2005, and Fridgen et al, 2013). Overall, these instruments were reported to have good to excellent reliability in their original publications.
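Two of the reliability statistics cited above, pairwise agreement (PA) and Cohen's κ, can be illustrated with a short Python sketch. The ratings are invented for illustration (two raters judging whether psychosis onset fell within a given window for ten patients), and the function names are hypothetical; the original studies may have computed these statistics differently, for example with weighted κ or more than two raters.

```python
from collections import Counter


def pairwise_agreement(rater_a, rater_b):
    """Proportion of cases on which two raters give the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = pairwise_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    p_expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(counts_a) | set(counts_b)
    )
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings for ten patients (not data from any reviewed study).
rater_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
rater_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

pa = pairwise_agreement(rater_1, rater_2)
kappa = cohens_kappa(rater_1, rater_2)
print(f"pairwise agreement = {pa:.2f}, kappa = {kappa:.2f}")
```

Because κ subtracts the agreement expected by chance, it is lower than the raw pairwise agreement for the same ratings, which is one reason PA, κ, and ICC values reported for different instruments are not directly comparable.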

Ten research groups used more generic psychosis assessment instruments to measure DUP. The Positive and Negative Syndrome Scale for Schizophrenia (PANSS) was used by eight research groups (total N = 1969), who typically defined psychosis onset by a score of 4 or higher on the PANSS positive subscale with a duration criterion (Larsen et al., 1996). Of these eight, four groups (total N = 740) reported the inter-rater reliability for DUP measurement (ICC = 0.73-0.99). The Psychiatric and Personal History Schedule was used by two research groups (total N = 277), with a maximum reported ICC of 0.90 (Janca and Chandrashekar, 1993).

The most common way of determining DUP, however, was through clinical interviews, which were used by 52 research groups with a total N of 10 089 participants. Only 8% of the clinical interview studies (N = 1094 participants) reported the inter-rater reliability of their technique for measuring DUP, with ICC ranging from 0.7 to 1.0 (Drake et al, 2000, Takahashi et al, 2007, Chang et al, 2012, and Lihong et al, 2012). However, because 92% of these studies did not report reliability, this ICC range may be biased. Some studies incorporated instruments including the Structured Clinical Interview for DSM-IV Axis 1 Disorders (Craig et al., 2000), Present State Examination (Madsen et al., 1999), Scale for the Assessment of Positive Symptoms (González-Blanch et al., 2008), Brief Psychiatric Rating Scale (Alvarez-Jimenez et al., 2009), and Association for Methodology and Documentation in Psychiatry (Bottlender et al., 2003) into their clinical interviews for determining DUP. However, the reliability and validity of using many of these instruments to retrospectively measure the onset of psychosis are unclear.

The definitions of treatment onset used in measuring DUP fell into six main categories (Table 2). The most common definition (total N = 6141 subjects) was the first time psychosis was “adequately” treated. Few studies attempted to set quantitative criteria for this definition; among those that did, the required medication trial duration ranged from 2 to 6 weeks (Malla et al, 2002, de Haan et al, 2003, Diaz et al, 2013, Lopez-Morinigo et al, 2013, and Winsper et al, 2013), with most groups requiring around 4 weeks. Lopez-Morinigo et al. (2013) required 75% adherence for one month, and Larsen et al. (1996 and 2000) required an “antipsychotic given in sufficient time and amount that it would lead to clinical response in the average non-chronic schizophrenia patient.” The first psychiatric hospitalization (total N = 4616) and first time antipsychotic medication was prescribed, regardless of trial duration, dose, or compliance (total N = 2948) were also commonly used definitions. About 10% of studies used more than one definition, for example using either the first adequate treatment or the first psychiatric hospitalization.

Table 2 Frequencies with which different definitions of treatment onset were used, ranked by the number of studies. Note that some studies used more than one definition and are included in more than one row.

| Definition of Onset of Treatment | Number of Publications (%) | Number of Research Groups | Total Number of Subjects (%) |
|---|---|---|---|
| First Psychiatric Hospitalization | 37 (28) | 22 | 4616 (27) |
| First Antipsychotic Treatment | 37 (28) | 28 | 2948 (17) |
| First Adequate Treatment | 36 (27) | 23 | 6141 (36) |
| Enrollment in Study | 14 (11) | 7 | 1997 (12) |
| First Treatment for Psychotic Symptoms | 15 (11) | 15 | 2256 (13) |
| Undefined | 14 (11) | 14 | 2037 (12) |
| Total | 132 | 94 | 17 135 |

3.2. Predictive validity

The effect sizes of DUP for predicting outcomes are summarized in Table 3. “Predictive validity” here refers to either the correlation with other meaningful clinical and biological measures, or predictive value for treatment response and clinical and functional improvement or deterioration over time. Fisher’s Z statistics were generally in the 0.1 to 0.3 range, corresponding to small effect sizes. I² for individual analyses was 43.2 to 86.7, indicating moderate to high heterogeneity, and supporting the use of the random effects model. A funnel plot of the overall sample was roughly symmetrical, and Rosenthal’s classic fail-safe N was 21724, indicating lack of significant publication bias.
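Rosenthal's fail-safe N estimates how many unpublished null-result studies would be needed to render the combined result non-significant. The Python sketch below shows the standard formula, N = (Σz)² / z_α² − k, where each z is the standard normal deviate of a study's one-tailed p value (not Fisher's Z) and k is the number of studies. The function name and the example p values are hypothetical, and the exact inputs used in CMA for the reported value of 21724 are not given in the text.

```python
from statistics import NormalDist


def failsafe_n(one_tailed_p_values, alpha=0.05):
    """Rosenthal's fail-safe N from one-tailed p values.

    Each p value is converted to a standard normal deviate z; the result is
    the number of additional studies averaging z = 0 that would bring the
    combined (Stouffer) test to the one-tailed significance level alpha.
    """
    norm = NormalDist()
    z_scores = [norm.inv_cdf(1.0 - p) for p in one_tailed_p_values]
    z_alpha = norm.inv_cdf(1.0 - alpha)
    k = len(z_scores)
    return (sum(z_scores) ** 2) / (z_alpha ** 2) - k


# Hypothetical one-tailed p values from four studies (illustrative only).
example_p_values = [0.01, 0.04, 0.20, 0.03]
example_failsafe = failsafe_n(example_p_values)
print(f"fail-safe N = {example_failsafe:.1f}")
```

A fail-safe N that is large relative to the number of included studies is conventionally read as evidence that the pooled result is robust to unpublished null findings, which is the interpretation applied in the text.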

Table 3 A meta-analysis comparing the predictive values of different DUP measurement methods for clinical and biological outcomes.

| Instrument | Treatment Adherence | Overall Function | Imaging | Negative Symptoms | Neuro-cognition | Positive Symptoms | Relapse Risk | Suicidality/Violence | Overall |
|---|---|---|---|---|---|---|---|---|---|
| Basel Interview | | | | | 0.19 (1) | | | | 0.19 (1) |
| Beiser Scale | 0.05 (2) | 0.30* (5) | | 0.33** (2) | 0.44* (1) | 0.28** (3) | 0.21 (2) | 0.16* (2) | 0.20** (11) |
| CASH | | 0.14* (2) | 0.14* (2) | 0.02 (1) | 0.15 (1) | 0.12 (1) | | | 0.12* (4) |
| CORS | | 0.19** (2) | 0.22* (1) | 0.18* (3) | 0.19 (2) | 0.22** (2) | 0.11 (2) | | 0.006* (7) |
| IRAOS | | 0.17** (5) | | 0.10* (5) | 0.26* (2) | 0.15* (4) | 0.14 (4) | | 0.17** (11) |
| NOS | | | 0.68** (1) | | | | | -0.02 (1) | 0.12 (2) |
| PANSS (modified) | 0.19* (1) | 0.20** (10) | | 0.20* (7) | 0.10 (1) | 0.11 (6) | 0.15* (5) | 0.19** (3) | 0.16** (18) |
| PPHS | | 0.45 (2) | 0.35* (1) | | 0.03 (1) | | | | 0.23* (4) |
| RPMIP | | 0.33** (3) | | 0.29** (2) | 0.38* (1) | 0.31** (2) | 0.33** (1) | 0.02 (1) | 0.20** (6) |
| SOS | | 0.16 (2) | | 0.27** (1) | -0.09 (1) | | 0.28** (1) | 0.01 (1) | 0.16** (7) |
| Clinical Interview | 0.15* (3) | 0.21** (18) | 0.32** (8) | 0.19** (11) | 0.29** (8) | 0.24** (9) | 0.22** (12) | 0.07 (5) | 0.17** (55) |
| Chart Review | | 0.14 (3) | 0.27 (1) | | | | 0.37* (2) | 0.03 (2) | 0.32** (6) |
| Overall | 0.14** (6) | 0.22** (49) | 0.25** (14) | 0.21** (32) | 0.20** (19) | 0.22** (27) | 0.21** (29) | 0.084** (15) | 0.18** (132) |

Each cell shows Z with N in parentheses. * p < 0.05, ** p < 0.001. Z, Fisher’s Z. N, number of studies. Overall: weighted average of Fisher’s Z, weighted by the number of studies. CASH, Comprehensive Assessment of Symptoms and History. CORS, Circumstances of Onset and Relapse Schedule. IRAOS, Interview for the Retrospective Assessment of the Onset of Schizophrenia. NOS, Nottingham Onset Schedule. PANSS, Positive and Negative Syndrome Scale for Schizophrenia. PPHS, Psychiatric and Personal History Schedule. RPMIP, Royal Park Multi-diagnostic Instrument for Psychosis. SOS, Symptom Onset in Schizophrenia Inventory. Studies may report more than one outcome. Blank cells indicate no studies meeting inclusion criteria were found.

Overall, without considering DUP measurement method, DUP numerically had the highest predictive validity for imaging outcomes in terms of Z score (Z = 0.25, p < 0.001) as compared with the Z scores for other outcome measures. Note that this is a summary score of all imaging data because the current review does not analyze more granular data on specific imaging modality, anatomic regions, or technical differences in imaging. DUP had less robust predictive validity for treatment adherence (Z = 0.14) and suicidality/violence prediction (Z = 0.084), although these were still statistically significant in the meta-analysis.
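To give a sense of scale for these values, the short Python sketch below converts the overall Fisher's Z values quoted in the paragraph above back to correlation coefficients with the inverse transformation r = tanh(Z). Reading r² as a proportion of shared variance is an interpretive assumption added here for illustration; it presumes the pooled Z values derive from correlation-type effect sizes. The function name is hypothetical.

```python
import math


def z_to_r(z: float) -> float:
    """Invert Fisher's Z transformation to recover a correlation coefficient."""
    return math.tanh(z)


# Overall Fisher's Z values quoted in the text for three outcome categories.
reported_z = {
    "imaging": 0.25,
    "treatment adherence": 0.14,
    "suicidality/violence": 0.084,
}

equivalent_r = {outcome: z_to_r(z) for outcome, z in reported_z.items()}

for outcome, r in equivalent_r.items():
    z = reported_z[outcome]
    print(f"{outcome}: Z = {z:.3f}, r = {r:.3f}, r^2 = {r * r:.3f}")
```

For values this small, Z and r are nearly identical, so the reported Z statistics can be read approximately as correlations, all of which fall in the range conventionally described as small.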

The effect sizes for how well psychosis onset, as measured by different instruments, can independently predict outcomes (regardless of how treatment onset was determined) are also summarized in Table 3. There were some variations in how each method performed for different outcomes: the Beiser Scale performed best for overall functioning, the PANSS and clinical interviews for negative symptoms, and clinical interviews for positive symptoms and relapse risk. However, no instrument clearly had larger effect sizes across different categories of outcomes or when all outcomes were grouped together. In fact, when all studies using specialized DUP instruments were grouped together and compared to all studies using clinical interviews, effect sizes were roughly equivalent (Table 4).

Table 4 A meta-analysis comparing the predictive value of DUP for clinical and biological outcomes for studies in which DUP is measured by specialized instruments versus by generic clinical interviews. Studies may report more than one outcome.

| DUP Measurement Method | Treatment Adherence | Overall Function | Imaging | Negative Symptoms | Neuro-cognition | Positive Symptoms | Relapse Risk | Suicidality/Violence | Overall |
|---|---|---|---|---|---|---|---|---|---|
| DUP instrument | 0.13* (3) | 0.21** (31) | 0.25** (5) | 0.20** (21) | 0.15* (9) | 0.19** (18) | 0.16** (15) | 0.06 (7) | 0.17** (71) |
| Clinical interview | 0.15* (3) | 0.22** (16) | 0.32** (8) | 0.19** (10) | 0.27** (7) | 0.24** (9) | 0.19** (11) | -0.05 (3) | 0.17** (55) |

Each cell shows Z with N in parentheses. * p < 0.05, ** p < 0.001. Z, Fisher’s Z. N, number of studies.

Finally, the effect sizes for how well treatment onset definitions can independently predict outcomes (regardless of how psychosis onset was determined) are summarized in Table 5. First antipsychotic medication treatment and first hospitalization trended toward having greater, statistically significant effect sizes in predicting outcomes (Table 5).

Table 5 A meta-analysis comparing different definitions of treatment onset for DUP determination and their impact on the predictive value for clinical outcomes. For blank cells, no studies meeting inclusion criteria were found. Studies may report more than one outcome and/or definition.

| Definition of onset of treatment | Treatment adherence | Overall function | Imaging | Negative symptoms | Neurocognition | Positive symptoms | Relapse risk | Suicidality/violence | Overall |
|---|---|---|---|---|---|---|---|---|---|
| First treatment for psychotic symptoms | 0.19* (1) | 0.41** (5) | 0.27* (2) | 0.10 (4) | 0.19 (2) | 0.18** (3) | 0.32** (2) | | -0.003 (15) |
| Enrollment in study | 0.15* (1) | 0.23** (8) | 0.46 (2) | 0.13 (3) | 0.27* (2) | 0.22** (4) | 0.33** (1) | 0.02 (1) | 0.20** (14) |
| First antipsychotic treatment | 0.21 (2) | 0.27** (14) | 0.28* (6) | 0.22** (10) | 0.27* (2) | 0.27** (9) | 0.25* (5) | 0.23** (4) | 0.25** (37) |
| First adequate treatment | 0.10 (1) | 0.17** (14) | 0.24** (3) | 0.23** (9) | 0.12* (3) | 0.10* (8) | 0.17* (6) | 0.09 (4) | 0.028** (36) |
| First psychiatric hospitalization | 0.42* (1) | 0.23** (18) | 0.38* (2) | 0.24** (11) | 0.20** (2) | 0.21** (9) | 0.23** (8) | 0.12 (4) | 0.24** (37) |
| Undefined | 0.05 (1) | 0.18* (5) | | 0.27** (1) | 0.24 (5) | | 0.14** (1) | -0.17 (1) | 0.14** (14) |

Each cell shows Z with N in parentheses. * p < 0.05, ** p < 0.001. Z, Fisher’s Z. N, number of studies.

4. Discussion

This quantitative review showed some clear and surprising characteristics of current DUP assessment methods. We found that the inter-rater reliabilities of the instruments used in DUP assessment were generally good, and that they did not differ substantially from one another. Surprisingly, based on reliability, we did not find clear quantitative evidence to support the use of one DUP measurement instrument over another, or even the use of instruments rather than ad hoc clinical interviews. However, given the lack of studies directly comparing DUP measurement methods, we cannot exclude the possibility that some in fact have greater reliability and validity. Furthermore, reliability was not reported for many of these instruments; it is possible that some of the instruments are better than others, but this cannot be demonstrated with the existing limited data. This review confirmed the significant associations of DUP with clinical and biological measures as has been reported previously in greater detail (Marshall et al, 2005, Norman et al, 2005, Perkins et al, 2005, Boonstra et al, 2012, and Cascio et al, 2012). Regardless of measurement method, DUP was a statistically significant predictor, but with generally small effect sizes in the 0.2 to 0.3 range. We also found that the majority of instruments have been used by just one or two research groups in published studies, and that instruments and individual studies had varying definitions of the onset of psychosis and the onset of treatment. Our analysis did, however, find that the two most clearly operationalized definitions of treatment onset, the first-ever antipsychotic prescription and the first-ever psychiatric hospitalization for psychosis, trended toward having a greater magnitude of association with some outcome measures, and therefore may have greater validity.

Our findings are consistent with those of previous reviews of the measurement of DUP. In a systematic qualitative review, Compton et al. (2007) found that the definition of DUP as time from the onset of psychosis to the onset of treatment was quite consistent; however, the translation and quantification of these time points varied significantly across studies. The meta-analysis of Large et al. (2008) found that method of measuring DUP did not significantly influence mean or median DUP. Multiple other authors have noted the conceptual and practical difficulties with operationalizing DUP (Norman and Malla, 2001, Norman et al, 2005, Perkins et al, 2005, Singh, 2007, Breitborde et al, 2009, Farooq et al, 2009, and Dell'Osso et al, 2013). Our quantitatively-based review largely confirmed these challenges, suggesting that future research should focus on validating one or several DUP measurement methods.

Based on meta-analysis, imaging outcomes have the strongest relationship with DUP compared with other categories of clinical outcome measures. This finding could be interpreted as supporting a biological correlate of the DUP concept. The included imaging papers here mostly utilized structural imaging, and may have had higher predictive power in part because structural imaging typically has higher test-retest reliability than clinical measures. Similarly, patients may be able to recall the date of their first hospitalization or first antipsychotic medication with some precision, and this may have contributed to these definitions’ greater predictive power.

To our knowledge, this is the first review aiming to quantitatively compare the methods used to measure DUP in the published literature. However, there are several important limitations to this review. Given that some outcomes are likely more closely associated with DUP than others, and that many DUP measurement instruments have so far been used with a limited number of outcomes, conclusions about the reliability and validity of individual instruments should be drawn with caution. The time at which the outcomes were assessed varied widely in included studies; if associations with DUP vary systematically with length of follow-up, this may be a source of bias given the small number of studies for many DUP instruments. Our effort to include multiple outcomes likely has led to some oversimplification during our meta-analysis, as the underlying mechanisms for these associations likely vary and include such disparate factors as family dynamics and biological differences. These variations likely contributed to the moderate to high heterogeneity found in our meta-analyses. Also, the patients recruited into instrument-based studies may have systematically differed from those recruited into clinical interview based studies. For example, patients in research clinics versus community clinics may differ in DUP length, diagnosis, or selection bias (Friis et al., 2004). We did not focus on such potential confounders as this paper intended to compare the reliability and validity of different DUP measurement methods. Finally, the small number of studies for several DUP measurement instruments limited the power of the meta-analysis. Although we have used weighted scores during meta-analysis (Table 3), these limitations should be considered when reviewing the comparative results.

Improving the overall reliability and validity of DUP measurement in future research will be facilitated by separately addressing the distinct challenges of measuring psychosis onset and treatment onset. DUP remains a promising modifiable risk factor for a range of long-term clinical and biological outcomes. To improve the ability to compare and interpret results, consistent inclusion of reliability and validity assessment in the DUP research methodology should be a priority.

Role of funding source

Support was received from NIH grants MH085646 and MH103222.


Drs. Hong and Register-Brown designed the study. Dr. Register-Brown managed the literature searches and analyses, and wrote the first draft of the manuscript. Both authors have approved the final manuscript.

Conflict of interest

Both authors declare that they have no conflicts of interest.



Appendix A. Supplementary data


Supplementary material Table S1. A summary of the data from individual reports used in our review and meta-analysis.


  • Alvarez-Jimenez et al., 2009 M. Alvarez-Jimenez, J.F. Gleeson, S. Cotton, D. Wade, D. Gee, T. Pearce, K. Crisp, D. Spiliotacopoulos, B. Newman, P.D. McGorry. Predictors of adherence to cognitive-behavioural therapy in first-episode psychosis. Can. J. Psychiatry. 2009;54(10):710-718
  • Andreasen et al., 1992 N.C. Andreasen, M. Flaum, S. Arndt. The Comprehensive Assessment of Symptoms and History (CASH). An instrument for assessing diagnosis and psychopathology. Arch. Gen. Psychiatry. 1992;49(8):615-623
  • Beiser et al., 1993 M. Beiser, D. Erickson, J.A. Fleming, W.G. Iacono. Establishing the onset of psychotic illness. Am. J. Psychiatry. 1993;150(9):1349-1354
  • Bird et al., 2010 V. Bird, P. Premkumar, T. Kendall, C. Whittington, J. Mitchell, E. Kuipers. Early intervention services, cognitive-behavioural therapy and family intervention in early psychosis: systematic review. Br. J. Psychiatry. 2010;197(5):350-356
  • Boonstra et al., 2012 N. Boonstra, R. Klaassen, S. Sytema, M. Marshall, L. De Haan, L. Wunderink, D. Wiersma. Duration of untreated psychosis and negative symptoms–a systematic review and meta-analysis of individual patient data. Schizophr. Res. 2012;142(1–3):12-19
  • Bottlender et al., 2003 R. Bottlender, T. Sato, M. Jager, U. Wegener, J. Wittmann, A. Strauss, H.J. Moller. The impact of the duration of untreated psychosis prior to first psychiatric admission on the 15-year outcome in schizophrenia. Schizophr. Res. 2003;62(1–2):37-44
  • Breitborde et al., 2009 N.J. Breitborde, V.H. Srihari, S.W. Woods. Review of the operational definition for first-episode psychosis. Early Interv. Psychiatry. 2009;3(4):259-265
  • Broussard et al., 2013 B. Broussard, M.E. Kelley, C.R. Wan, S.L. Cristofaro, A. Crisafio, P.J. Haggard, N.L. Myers, T. Reed, M.T. Compton. Demographic, socio-environmental, and substance-related predictors of duration of untreated psychosis (DUP). Schizophr. Res. 2013;148(1–3):93-98
  • Cascio et al., 2012 M.T. Cascio, M. Cella, A. Preti, A. Meneghelli, A. Cocchi. Gender and duration of untreated psychosis: a systematic review and meta-analysis. Early Interv. Psychiatry. 2012;6(2):115-127
  • Chang et al., 2012 W.C. Chang, J.Y. Tang, C.L. Hui, M.M. Lam, S.K. Chan, G.H. Wong, C.P. Chiu, E.Y. Chen. Prediction of remission and recovery in young people presenting with first-episode psychosis in Hong Kong: a 3-year follow-up study. Aust. N. Z. J. Psychiatry. 2012;46(2):100-108
  • Chang et al., 2013 W.C. Chang, C.L. Hui, J.Y. Tang, G.H. Wong, S.K. Chan, E.H. Lee, E.Y. Chen. Impacts of duration of untreated psychosis on cognition and negative symptoms in first-episode schizophrenia: a 3-year prospective follow-up study. Psychol. Med. 2013;43(9):1883-1893
  • Chou et al., 2014 P.H. Chou, S. Koike, Y. Nishimura, S. Kawasaki, Y. Satomura, A. Kinoshita, R. Takizawa, K. Kasai. Distinct effects of duration of untreated psychosis on brain cortical activities in different treatment phases of schizophrenia: a multi-channel near-infrared spectroscopy study. Prog. Neuropsychopharmacol. Biol. Psychiatry. 2014;49:63-69
  • Clarke et al., 2006 M. Clarke, P. Whitty, S. Browne, O. Mc Tigue, A. Kinsella, J.L. Waddington, C. Larkin, E. O'Callaghan. Suicidality in first episode psychosis. Schizophr. Res. 2006;86(1–3):221-225
  • Compton et al., 2007 M.T. Compton, T. Carter, E. Bergner, L. Franz, T. Stewart, H. Trotman, T.H. McGlashan, P. McGorry. Defining, operationalizing, and measuring the duration of untreated psychosis: advances, limitations and future directions. Early Interv. Psychiatry. 2007;1:236-250
  • Compton et al., 2011 M.T. Compton, T.L. Gordon, P.S. Weiss, E.F. Walker. The "doses" of initial, untreated hallucinations and delusions: a proof-of-concept study of enhanced predictors of first-episode symptomatology and functioning relative to duration of untreated psychosis. J. Clin. Psychiatry. 2011;72(11):1487-1493
  • Craig et al., 2000 T.J. Craig, E.J. Bromet, S. Fennig, M. Tanenberg-Karant, J. Lavelle, N. Galambos. Is there an association between duration of untreated psychosis and 24-month clinical outcome in a first-admission series?. Am. J. Psychiatry. 2000;157(1):60-66
  • Cuesta et al., 2011 M.J. Cuesta, V. Peralta, M.S. Campos, E. Garcia-Jalon. Can insight be predicted in first-episode psychosis patients? A longitudinal and hierarchical analysis of predictors in a drug-naive sample. Schizophr. Res. 2011;130(1–3):148-156
  • de Haan et al., 2003 L. de Haan, D.H. Linszen, M.E. Lenior, E.D. de Win, R. Gorsira. Duration of untreated psychosis and outcome of schizophrenia: delay in intensive psychosocial treatment versus delay in treatment with antipsychotic medication. Schizophr. Bull. 2003;29(2):341-348
  • Dell'Osso et al., 2013 B. Dell'Osso, I.D. Glick, D.S. Baldwin, A.C. Altamura. Can long-term outcomes be improved by shortening the duration of untreated illness in psychiatric disorders? A conceptual framework. Psychopathology. 2013;46(1):14-21
  • Diaz et al., 2013 I. Diaz, J.M. Pelayo-Teran, R. Perez-Iglesias, I. Mata, R. Tabares-Seisdedos, P. Suarez-Pinilla, J.L. Vazquez-Barquero, B. Crespo-Facorro. Predictors of clinical remission following a first episode of non-affective psychosis: sociodemographics, premorbid and clinical variables. Psychiatry Res. 2013;206(2–3):181-187
  • Drake et al., 2000 R.J. Drake, C.J. Haley, S. Akhtar, S.W. Lewis. Causes and consequences of duration of untreated psychosis in schizophrenia. Br. J. Psychiatry. 2000;177:511-515
  • Farooq et al., 2009 S. Farooq, M. Large, O. Nielssen, W. Waheed. The relationship between the duration of untreated psychosis and outcome in low-and-middle income countries: a systematic review and meta analysis. Schizophr. Res. 2009;109(1–3):15-23
  • Fridgen et al., 2013 G.J. Fridgen, J. Aston, U. Gschwandtner, M. Pflueger, R. Zimmermann, E. Studerus, R.D. Stieglitz, A. Riecher-Rössler. Help-seeking and pathways to care in the early stages of psychosis. Soc. Psychiatry Psychiatr. Epidemiol. 2013;48(7):1033-1043
  • Friis et al., 2004 S. Friis, I. Melle, T.K. Larsen, U. Haahr, J.O. Johannessen, E. Simonsen, S. Opjordsmoen, P. Vaglum, T.H. McGlashan. Does duration of untreated psychosis bias study samples of first-episode psychosis?. Acta Psychiatr. Scand. 2004;110(4):286-291
  • Fusar-Poli et al., 2013 P. Fusar-Poli, S. Borgwardt, A. Bechdolf, J. Addington, A. Riecher-Rossler, F. Schultze-Lutter, M. Keshavan, S. Wood, S. Ruhrmann, L.J. Seidman, L. Valmaggia, T. Cannon, E. Velthorst, L. De Haan, B. Cornblatt, I. Bonoldi, M. Birchwood, T. McGlashan, W. Carpenter, P. McGorry, J. Klosterkotter, P. McGuire, A. Yung. The psychosis high-risk state: a comprehensive state-of-the-art review. JAMA Psychiatry. 2013;70(1):107-120
  • Gonzalez-Blanch et al., 2008 C. Gonzalez-Blanch, B. Crespo-Facorro, M. Alvarez-Jimenez, J.M. Rodriguez-Sanchez, J.M. Pelayo-Teran, R. Perez-Iglesias, J.L. Vazquez-Barquero. Pretreatment predictors of cognitive deficits in early psychosis. Psychol. Med. 2008;38(5):737-746
  • Hafner et al., 1992 H. Hafner, A. Riecher-Rossler, M. Hambrecht, K. Maurer, S. Meissner, A. Schmidtke, B. Fatkenheuer, W. Loffler, W. van der Heiden. IRAOS: an instrument for the assessment of onset and early course of schizophrenia. Schizophr. Res. 1992;6(3):209-223
  • Hafner et al., 1994 H. Hafner, K. Maurer, W. Loffler, B. Fatkenheuer, W. an der Heiden, A. Riecher-Rossler, S. Behrens, W.F. Gattaz. The epidemiology of early schizophrenia. Influence of age and gender on onset and early course. Br. J. Psychiatry. 1994;(23):29-38
  • Ho et al., 2004 B.C. Ho, M. Flaum, W. Hubbard, S. Arndt, N.C. Andreasen. Validity of symptom assessment in psychotic disorders: information variance across different sources of history. Schizophr. Res. 2004;68(2–3):299-307
  • Janca and Chandrashekar, 1993 A. Janca, C. Chandrashekar. Catalogue of Assessment Instruments Used in the Studies Coordinated by the WHO Mental Health Programme, Division of Mental Health. World Health Organization, Geneva (1993)
  • Killackey and Yung, 2007 E. Killackey, A.R. Yung. Effectiveness of early intervention in psychosis. Curr. Opin. Psychiatry. 2007;20(2):121-125
  • Large and Nielssen, 2011 M.M. Large, O. Nielssen. Violence in first-episode psychosis: a systematic review and meta-analysis. Schizophr. Res. 2011;125(2–3):209-220
  • Large et al., 2008 M. Large, O. Nielssen, T. Slade, A. Harris. Measurement and reporting of the duration of untreated psychosis. Early Interv. Psychiatry. 2008;2(4):201-211
  • Larsen et al., 1996 T.K. Larsen, T.H. McGlashan, L.C. Moe. First-episode schizophrenia: I. Early course parameters. Schizophr. Bull. 1996;22(2):241-256
  • Larsen et al., 2001 T.K. Larsen, S. Friis, U. Haahr, I. Joa, J.O. Johannessen, I. Melle, S. Opjordsmoen, E. Simonsen, P. Vaglum. Early detection and intervention in first-episode schizophrenia: a critical review. Acta Psychiatr. Scand. 2001;103(5):323-334
  • Lihong et al., 2012 Q. Lihong, S. Shimodera, H. Fujita, I. Morokuma, A. Nishida, N. Kamimura, M. Mizuno, T.A. Furukawa, S. Inoue. Duration of untreated psychosis in a rural/suburban region of Japan. Early Interv. Psychiatry. 2012;6(3):239-246
  • Lloyd-Evans et al., 2011 B. Lloyd-Evans, M. Crosby, S. Stockton, S. Pilling, L. Hobbs, M. Hinton, S. Johnson. Initiatives to shorten duration of untreated psychosis: systematic review. Br. J. Psychiatry. 2011;198(4):256-263
  • Lopez-Morinigo et al., 2013 J.D. Lopez-Morinigo, B. Wiffen, J. O'Connor, R. Dutta, M. Di Forti, R.M. Murray, A.S. David. Insight and suicidality in first-episode psychosis: understanding the influence of suicidal history on insight dimensions at first presentation. Early Interv. Psychiatry. 2013;8:113-121
  • MacBeth and Gumley, 2008 A. MacBeth, A. Gumley. Premorbid adjustment, symptom development and quality of life in first episode psychosis: a systematic review and critical reappraisal. Acta Psychiatr. Scand. 2008;117(2):85-99
  • Madsen et al., 1999 A.L. Madsen, A. Karle, P. Rubin, M. Cortsen, H.S. Andersen, R. Hemmingsen. Progressive atrophy of the frontal lobes in first-episode schizophrenia: interaction with clinical course and neuroleptic treatment. Acta Psychiatr. Scand. 1999;100(5):367-374
  • Malla et al., 2002 A.K. Malla, R.M. Norman, R. Manchanda, M.R. Ahmed, D. Scholten, R. Harricharan, L. Cortese, J. Takhar. One year outcome in first episode psychosis: influence of DUP and other predictors. Schizophr. Res. 2002;54(3):231-242
  • Marshall et al., 2005 M. Marshall, S. Lewis, A. Lockwood, R. Drake, P. Jones, T. Croudace. Association between duration of untreated psychosis and outcome in cohorts of first-episode patients: a systematic review. Arch. Gen. Psychiatry. 2005;62(9):975-983
  • Maurer and Hafner, 1995 K. Maurer, H. Hafner. Methodological aspects of onset assessment in schizophrenia. Schizophr. Res. 1995;15(3):265-276
  • McGlashan, 1999 T.H. McGlashan. Duration of untreated psychosis in first-episode schizophrenia: marker or determinant of course?. Biol. Psychiatry. 1999;46(7):899-907
  • McGorry et al., 1990 P.D. McGorry, D.L. Copolov, B.S. Singh. Royal Park Multidiagnostic Instrument for Psychosis: Part I. Rationale and review. Schizophr. Bull. 1990;16(3):501-515
  • McGorry et al., 2007 P.D. McGorry, E. Killackey, A.R. Yung. Early intervention in psychotic disorders: detection and treatment of the first episode and the critical early stages. Med. J. Aust. 2007;187(7 Suppl.):S8-S10
  • Moher et al., 2009 D. Moher, A. Liberati, J. Tetzlaff, D.G. Altman, The PRISMA Group. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Med. 2009;6(6):e1000097
  • Norman and Malla, 2001 R. Norman, A. Malla. Duration of untreated psychosis: a critical examination of the concept and its importance. Psychol. Med. 2001;31(3):381-400
  • Norman et al., 2004 R.M. Norman, A.K. Malla, M.B. Verdi, L.D. Hassall, C. Fazekas. Understanding delay in treatment for first-episode psychosis. Psychol. Med. 2004;34(2):255-266
  • Norman et al., 2005 R.M. Norman, S.W. Lewis, M. Marshall. Duration of untreated psychosis and its relationship to clinical outcome. Br. J. Psychiatry Suppl. 2005;48:s19-s23
  • Perkins et al., 2000 D.O. Perkins, J. Leserman, L.F. Jarskog, K. Graham, J. Kazmer, J.A. Lieberman. Characterizing and dating the onset of symptoms in psychotic illness: the Symptom Onset in Schizophrenia (SOS) inventory. Schizophr. Res. 2000;44(1):1-10
  • Perkins et al., 2005 D.O. Perkins, H. Gu, K. Boteva, J.A. Lieberman. Relationship between duration of untreated psychosis and outcome in first-episode schizophrenia: a critical review and meta-analysis. Am. J. Psychiatry. 2005;162(10):1785-1804
  • Singh, 2007 S.P. Singh. Outcome measures in early psychosis; relevance of duration of untreated psychosis. Br. J. Psychiatry Suppl. 2007;50:s58-s63
  • Singh et al., 2005 S.P. Singh, J.E. Cooper, H.L. Fisher, C.J. Tarrant, T. Lloyd, J. Banjo, S. Corfe, P. Jones. Determining the chronology and components of psychosis onset: The Nottingham Onset Schedule (NOS). Schizophr. Res. 2005;80(1):117-130
  • Takahashi et al., 2007 T. Takahashi, M. Suzuki, R. Tanino, S.Y. Zhou, H. Hagino, L. Niu, Y. Kawasaki, H. Seto, M. Kurachi. Volume reduction of the left planum temporale gray matter associated with long duration of untreated psychosis in schizophrenia: a preliminary report. Psychiatry Res. 2007;154(3):209-219
  • Winsper et al., 2013 C. Winsper, S.P. Singh, S. Marwaha, T. Amos, H. Lester, L. Everard, P. Jones, D. Fowler, M. Marshall, S. Lewis, V. Sharma, N. Freemantle, M. Birchwood. Pathways to violent behavior during first-episode psychosis: a report from the UK National EDEN Study. JAMA Psychiatry. 2013;70(12):1287-1293


a University of Maryland/Sheppard Pratt Psychiatry Residency Training Program, University of Maryland. 701 W. Pratt St., 4th Floor, Baltimore, MD 21201, USA

b Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD. Tawes Ct., Catonsville, MD 21228, USA

∗ Corresponding author. Tel.: +1 410 328 6325; fax: +1 410 328 1212.