SAMHSA's National Mental Health Information Center

This Web site is a component of the SAMHSA Health Information Network


Community Support

Integrating Process and Outcome Evaluations

VI. Preparing for Analytic Uses of the Database

A. Cross-site descriptive analysis of multi-site demonstrations

Before a PDB is used in cross-site analysis, efforts should be made to ensure that a complete, cleaned, and mergeable file is presented to the analysts who will use it in conjunction with other data files containing individual service delivery or outcome data. If the project-level coding identified while implementation proceeds has been completed by the time outcome data are available for analysis, it should be possible to add new variables, including recodes and scales formed from existing elements, as the cross-site analysts generate additional hypotheses. If the data management operation is a true relational database, and developers of the PDB retain update access, as on a LAN, PDB data updating can continue over the full course of the demonstration, as long as outcome analysts are notified.
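As a minimal sketch of what a "mergeable" file implies, the following Python fragment attaches project-level variables to participant-level records through a shared site identifier. All names and values here (the `pdb` dictionary, `sponsor_type`, `turnover_scale`, `days_stable_housing`, the site codes) are hypothetical illustrations, not items taken from the actual PDB.

```python
# Hypothetical project-level database (PDB): one record per project/site.
pdb = {
    "A": {"sponsor_type": "public", "turnover_scale": 6},
    "B": {"sponsor_type": "nonprofit", "turnover_scale": 11},
}

# Hypothetical participant-level outcome records, each tagged with a site ID.
participants = [
    {"pid": 1, "site": "A", "days_stable_housing": 30},
    {"pid": 2, "site": "B", "days_stable_housing": 12},
    {"pid": 3, "site": "C", "days_stable_housing": 45},  # site not yet in the PDB
]


def merge_pdb(participants, pdb):
    """Attach project-level variables to each participant record.

    Returns (merged_records, unmatched_site_ids) so that sites missing
    from the PDB are reported rather than silently dropped.
    """
    merged, unmatched = [], set()
    for rec in participants:
        site_vars = pdb.get(rec["site"])
        if site_vars is None:
            unmatched.add(rec["site"])
            continue
        merged.append({**rec, **site_vars})
    return merged, sorted(unmatched)


merged, unmatched = merge_pdb(participants, pdb)
print(len(merged), "records merged; sites missing from PDB:", unmatched)
```

Reporting unmatched sites is one simple way to check that the file is complete before it is handed to outcome analysts.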

Apart from their utility for supporting cross-site analyses of outcomes, project-level databases will also be useful as practical tools for managing the numerous characterizations of programs, coding them, and permitting researchers to examine and cross-tabulate characteristics in a variety of ways. Some evaluation questions anticipate the use of process data for non-outcome purposes, such as characterizing the distinctions between interventions, and determining replicability of program models.

Preliminary descriptive analyses can take a variety of forms. Basic tables of characteristics in a description of a multi-site demonstration can, with supporting narrative, guide a reader of a report to an appreciation of the patterns of variation in background characteristics and the complexities that constrain interpretation. The basic technique ("cross-case displays") is considered to be worth an entire chapter in the second edition of Miles and Huberman's Qualitative Data Analysis (1994, Ch. 7). Some of the tables generated from the PDB created for the NIAAA homeless demonstrations were, for example, used to guide interpretation of projects' implementation experiences in the construction of standard narrative implementation case studies. They were then used to prepare the way for interpreting patterns of observed outcomes. These cross-case displays are contained in Appendix C.

Both for descriptions in reports and for developing analytical responses to hypotheses or tentative interpretations offered by analysts and decision makers, somewhat more complex displays of characteristics are useful. These can include:

Cross-tabs of context, program, consumer characteristics and implementation measures
Mappings of programs onto grids of selected characteristics
Plots and regression lines

Cross-tabulations of characteristics can facilitate multiple program description and can be used in generating and checking exploratory hypotheses about, for example, an aspect of implementation such as the establishment of interagency linkages and its relationship to coded features such as the type of organization sponsoring the project. Several graphic representations of program characteristics have been suggested. Perhaps the most straightforward locates sites on two-dimensional grids of selected characteristics.
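The kind of cross-tabulation described above can be sketched in a few lines of Python. The records below are invented for illustration; the variable names (`sponsor`, `linkages`) and their codes are assumptions, not fields from the actual PDB.

```python
from collections import Counter

# Hypothetical coded PDB records: one per project.
projects = [
    {"site": "A", "sponsor": "public agency", "linkages": "established"},
    {"site": "B", "sponsor": "nonprofit", "linkages": "established"},
    {"site": "C", "sponsor": "nonprofit", "linkages": "established"},
    {"site": "D", "sponsor": "public agency", "linkages": "not established"},
    {"site": "E", "sponsor": "nonprofit", "linkages": "not established"},
]


def crosstab(records, row_key, col_key):
    """Count projects in each (row, column) cell of two coded characteristics."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    rows = sorted({r[row_key] for r in records})
    cols = sorted({r[col_key] for r in records})
    return rows, cols, counts


rows, cols, counts = crosstab(projects, "sponsor", "linkages")

# Print a simple plain-text table of cell counts.
print(f"{'':16s}" + "".join(f"{c:>18s}" for c in cols))
for r in rows:
    print(f"{r:16s}" + "".join(f"{counts[(r, c)]:18d}" for c in cols))
```

With only a handful of projects, a table like this is a descriptive aid for generating hypotheses rather than a basis for statistical inference.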

It is also possible to conduct informal or statistical cluster analysis of sites using multiple characteristics. Miles and Huberman (1994, Box 7.6) show how clustering tactics could be applied to array 12 sites on three dimensions, based on such characteristics as "level of assistance" provided and "smoothness of early implementation." The array enabled an analyst to see that neither predicted later practice stabilization, but to draw boundaries around what seemed like reasonable "families" of program experience and formulate hypotheses about the operation of other intervening variables, such as the size of the innovation being attempted. A statistical cluster analysis of case-management characteristics (Teague, 1996) has also recently been used to relate the ACCESS programs to recognized strong implementations of the assertive community treatment (ACT) model of case management, showing that they were more similar to strong implementations of the ACT model than another set of (VA) programs, but differed in their limited case management duration.

B. Multi-level analyses of project and participant-level data

One of the most central motives for undertaking the additional effort required to encode process evaluation data into a PDB was to support and facilitate its systematic integration into analyses of individual participant-level service delivery, and especially outcome data. Over the last few years, developments in statistical analysis techniques and statistical software have provided new technology for integrated multi-level analyses.

Multi-level, hierarchical linear statistical models (Bryk and Raudenbush, 1992) address many of the cross-site analysis issues that are typically raised and are among the available techniques ideally suited to integrating the qualitative and quantitative data now being collected in many community-based programs. These models also address a number of long-standing statistical issues associated with these multi-level designs and biases caused by the aggregation of data.

The application of multi-level analysis (or "hierarchical linear modeling") to multi-site evaluations has been demonstrated, producing successful and revealing analyses even where there is very considerable diversity among the implemented interventions. Notable examples of the use of the technique include Seltzer's (1994) systematic exploration of multi-level modeling techniques in an evaluation of the Transition Mathematics program. This study was able to demonstrate the influence of site characteristics on average outcomes with as few as 12 sites. Other examples are Osgood and Smith's (1995) use of hierarchical linear models in a long-term follow-up study of the Boys Town program and, in prevention, McNeal and Hansen's (1995) analysis of individual students' D.A.R.E. outcomes within a hierarchical model that explicitly modeled the variation between schools (a macro variable) before assessing the program's effects on individual students. Murray and Wolfinger (1994) have outlined appropriate analysis plans, and Hser (1995) has also outlined plans for the use of the framework to analyze the effect of drug treatment counselors' practices on treatment effectiveness. Yin et al. (1997) have applied the methods in an evaluation of community-based drug prevention studies. These successful uses of and prospects for multi-level modeling in multi-site evaluation demonstrate that it is no longer necessary to choose between levels of analysis, each of which has its disadvantages.

A simple form of a two-stage hierarchical linear model that could be used in analyzing services or outcomes data from a multi-site study would consist of two linear models: a model of intervention effects within sites (level 1) and a between-site model (level 2). With recent developments in analytic software (e.g., HLM version 4; SAS PROC MIXED), it is also possible to build complex three-level hierarchical models of services or outcomes which would simultaneously take account of the nesting of sites within communities, the nesting of groups within sites and the nesting of participants within groups. This would be the most appropriate way to handle the contextual or program "environment" variables (level 3) that are often viewed as necessary to understanding between-site and between-group differences, as well as individual differences among participants.
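The two-stage model described above can be written out in the conventional notation of the hierarchical linear modeling literature. The symbols below are assumptions introduced for illustration and do not appear in the source: Y_ij is the outcome of participant i in site j, GROUP_ij is the intervention indicator, and W_j is a site-level characteristic from the PDB.

```latex
% Level 1 (within site j): outcome of participant i as a function of group assignment
Y_{ij} = \beta_{0j} + \beta_{1j}\,\mathrm{GROUP}_{ij} + r_{ij},
\qquad r_{ij} \sim N(0, \sigma^{2})

% Level 2 (between sites): site intercepts and intervention effects
% modeled as functions of a project-level characteristic W_j
\beta_{0j} = \gamma_{00} + \gamma_{01} W_{j} + u_{0j}
\beta_{1j} = \gamma_{10} + \gamma_{11} W_{j} + u_{1j}
```

In this sketch, gamma_11 expresses how the within-site intervention effect varies with the site characteristic, which is the kind of cross-site question a project-level database is meant to support. A three-level model would add a further set of equations in which the level-2 coefficients vary across community settings.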

Our current inventory of project-level databases suggests the appropriateness of a three-level model. For example, our evaluation for NIAAA of the effect of interventions on homeless substance abusers (see Appendix A) had approximately 5,000 participants in 36 groups in 13 cities. The basic within-site hypothesis was that participants assigned to an intervention group would reduce alcohol and drug abuse, increase economic security, and achieve greater residential stability than participants in a comparison group ("usual care") or, in some sites, a less intensive intervention. Within-site assignment to groups was usually random, assuring equivalence between groups. The basic cross-site hypothesis is that between-group differences vary, and that this variation will be at least partly explainable through the examination of implementation levels, service configurations, theoretical models, and so forth. Finally, the cross-setting hypothesis is that community context factors (e.g., service "richness" of the city) also have an influence on both level 1 and level 2 findings. The hierarchical levels and their respective sample sizes for the NIAAA program and for the multi-site demonstration programs that form the basis for the two other project-level databases being developed are shown in the following table.

Table 1. Levels of information in three multi-site demonstrations

Sponsor | Study | Level 1 | Level 2 | Level 3
NIAAA | Effectiveness of Community Interventions to Improve Quality of Life for Homeless Substance Abusers | ~5,000 | 36 | 13
CMHS | Impact of Systems Integration on Services and Outcomes for Homeless Mentally Ill | ~7,200 | 18 | 9
NIDA | Effectiveness of Community Interventions to Reduce HIV Transmission Risk Among Injection and Non-Injection Drug Users | ~25,000 | 51 | 23


The ACCESS Demonstration

The purpose of the national evaluation of the Access to Community Care and Effective Services and Supports (ACCESS) demonstration program is to understand the implementation of different approaches to service systems integration for homeless adults with severe mental illness, particularly those with co-occurring alcohol and/or other substance use disorders, and to link services integration strategies with client outcomes. The national evaluation of ACCESS is being conducted at both system and client levels to determine the extent to which services integration takes place, its impact on access to services and--to the extent possible--client outcomes that can be associated with services integration. The conceptual framework for the ACCESS systems-level evaluation being conducted by R.O.W. Sciences is organized around two major goals. The first is to identify different service integration approaches as well as to document how they are implemented and how they change over time (process evaluation). The second is to contribute to the success of each project by providing feedback on progress made that can provide "assistance with overcoming problems" (formative evaluation). The systems-level evaluation is organized around four major evaluation questions:

Service System and Community Context. What are the political, organizational, historical, and service-system characteristics, processes, and program inputs that form the environment of the projects and that will affect their implementation?

Nature of ACCESS Projects and Their Implementation. What is the nature of the ACCESS service integration and comparison projects, as planned and as implemented?

Services Integration and Other Systems-Level Outcomes. What services integration outcomes are achieved by the projects at the systems level, and does the extent of services integration differ at comparison sites?

Service Utilization and Other Intermediate Client-Level Outcomes. What short-term or intermediate client outcomes are achieved, in terms of recruiting, retaining, and providing appropriate services at the client level, and in what ways are these outcomes related to the projects' models of services integration and to the processes underlying implementation?

The overall data collection strategy for the systems-level evaluation is to provide data sources to answer all of CMHS's specific evaluation questions, to cover additional items implied by the process evaluation goals, and to provide some cross-checks and alternate sources for key information. The national evaluation team is utilizing a variety of data collection methodologies that include logic models, comprehensive organizational interviews, site visits, staff logs, focus groups, implementation case studies, and quarterly telephone contacts with ACCESS grantees. Results from this evaluation will be reported through case study analyses, interorganizational network analyses, and implementation analyses. By linking the client- and system-level data, the project-level database will permit analysts to address the question that is at the heart of the ACCESS demonstration: Do improvements in the level of service system integration lead to improvements in system and client outcomes?

The NIDA Cooperative Agreement Program

The NIDA Cooperative Agreement Program is a national collaborative research effort between the National Institute on Drug Abuse (NIDA) and 23 community-based AIDS outreach/intervention program sites. The national evaluation includes process and outcome components and other data analyses related to the efficacy, effectiveness, and efficiency of interventions for out-of-treatment injection drug users, crack smokers, and sex partners of injection drug users. Participants' risk behavior and HIV status are assessed at baseline and at 3- or 6-month follow-ups. The analyses contrast one or more enhanced interventions with a standard HIV prevention intervention in each community setting, and attempt to account for variation in characteristics of individuals, programs, and communities. They are intended to address the following evaluation questions related to intervention effectiveness:

Which interventions work? Various tests are being conducted of theoretical models of specific hypotheses about behavioral change, describing what interventions work, under what circumstances or contexts they work, and how long their effects last.

Which behaviors and/or health variables change? Strategies are being reviewed that focus on and emphasize favorable and/or unfavorable behavioral and health outcomes, measurement issues, and methods to validate self-reported data.

For which populations? Individual or comparative studies for HIV prevention projects targeted at specific subgroups whose behaviors place them at high risk for HIV infection are also being evaluated.

The NIAAA Cooperative Agreement Program

In the NIAAA Cooperative Agreement Program, innovative service approaches that provided outreach, shelter, alcohol and other drug treatment, case management, mental health services, housing, and other services to homeless persons were compared with usual care or other services in 14 cities. (The program is described in detail in Appendix A.) In these projects, the highest level of the analytic hierarchy was a community setting. This will typically be true of multi-site demonstrations, although the definition of setting will vary across programs. For the NIAAA projects, the setting was the city in which the program was housed, and characteristics of the city's service system were targeted for inclusion in analyses. (In contrast, in the CMHS ACCESS program the setting is most often the state, in that the states are the grantees, who typically proposed an integration site and a comparison site in different cities. In some ACCESS projects (3 of 9), the two sites are in different geographical areas of the same city. In these cases the level-3 variable set will include city variables and state variables, because state activities (e.g., Medicaid reform) can also affect level 1 and level 2 findings. In the NIDA HIV prevention project, settings ranged from neighborhoods (e.g., East Harlem) to cities to counties. Analyses of the ACCESS and NIDA data are in the planning or early implementation phases.) Within each NIAAA project, one or more treatments were compared to usual-services conditions, and individuals were assessed at baseline, discharge, and six-month follow-up. Multi-level analyses of the NIAAA data making use of the project-level database are in progress, as part of a program of alternative analyses of the data.

C. A Simple Illustration of Multi-Level Analysis: Implementation and Outcomes of NIAAA Homeless Projects

A simplified example of the use of project-level data in multi-level analysis can be constructed using the NIAAA homeless demonstration data--both the coded PDB and participant outcome data. (As mentioned earlier, the models used in these analyses are also known as hierarchical linear models, random-regression models or mixed-effects linear models.) This example examines the relationship between a constructed measure of implementation problems and several key individual outcomes, across the 9 projects that included a contrast between an innovative-service group and a usual-services group of homeless persons with alcohol and other drug problems. The implementation-problems measure was constructed from a set of 4 ratings by analysts who followed the implementation of sites through site visits, bi-monthly conference calls, and review of quarterly reports and other qualitative process material. Based on these materials, the analysts coded the extent of staff turnover among clinical directors and supervisors, and among counselors, therapists, and case managers, using a simple "low," "medium," and "high" scale. They also rated the impact of this turnover on program operations as "no impact," "some," or "severe." Turnover has often been found to be related to other implementation problems, and it may stand as a proxy for them. The reliability of the turnover ratings was promoted by discussion of the rating criteria by evaluators and implementation analysts. Agreement on these items as assessed in the reliability study was high. Because the items were correlated, they were combined in a simple Turnover scale (TOSCALE). This project-level implementation measure was included as a random effect in models of participants' alcohol and drug composite scores from the Addiction Severity Index, and the number of days they spent literally homeless (on the street or in shelters) or in "stable housing" in the 60 days prior to follow-up.
The mean differences between the scores of individuals in enhanced or intensive treatment groups and usual-care groups were assessed at level one and the relationship between outcomes and implementation problems as assessed by the Turnover Scale was assessed at level two. Data from a SAS file containing PDB items merged with selected participant characteristics and outcomes was output to an ASCII file for use in the MIXREG program in the Hedeker and Gibbons (1995) Toolkit in this series.
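The scale construction and file preparation described above might be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the numeric codes (1 to 3), the identity of the four rated items, the simple summation, the sample records, and the space-delimited layout are all hypothetical, and the layout is not MIXREG's documented input format.

```python
import io

# Assumed numeric codes for the ratings (not given in the source).
TURNOVER_CODES = {"low": 1, "medium": 2, "high": 3}
IMPACT_CODES = {"no impact": 1, "some": 2, "severe": 3}


def turnover_scale(r):
    """Sum four coded ratings into one score (4 to 12 under the assumed codes)."""
    return (
        TURNOVER_CODES[r["mgmt_turnover"]]
        + TURNOVER_CODES[r["staff_turnover"]]
        + IMPACT_CODES[r["mgmt_impact"]]
        + IMPACT_CODES[r["staff_impact"]]
    )


# Hypothetical project-level ratings, keyed by site number.
site_ratings = {
    1: {"mgmt_turnover": "low", "staff_turnover": "medium",
        "mgmt_impact": "no impact", "staff_impact": "some"},
    2: {"mgmt_turnover": "high", "staff_turnover": "high",
        "mgmt_impact": "severe", "staff_impact": "severe"},
}

# Hypothetical participant records: (site, group, days in stable housing).
participants = [(1, 1, 45), (1, 2, 10), (2, 1, 60), (2, 2, 0)]


def write_ascii(participants, site_ratings, out):
    """Write one space-delimited line per participant: site, outcome, scale, group."""
    for site, group, days in participants:
        toscale = turnover_scale(site_ratings[site])
        out.write(f"{site:3d} {days:6.1f} {toscale:4d} {group:2d}\n")


buf = io.StringIO()
write_ascii(participants, site_ratings, buf)
lines = buf.getvalue().splitlines()
print(buf.getvalue(), end="")
```

Every participant in a site carries the same project-level score, which is what allows the site-level measure to enter a participant-level model.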

To illustrate these analyses, on the Residential Stability score (RSTABLEF) (the number of days participants spent in stable housing in the 60 days prior to follow-up) the MIXREG configuration was as follows:

[Graphic: C. A Simple Illustration of Multi-Level Analysis (image not reproduced)]

The output from this run is as follows:

MIXREG - The program for mixed-effects linear regression analysis

MIXREG on NIAAA RSTABLEF (DAYS IN STABLE HOUSING) FOR 9 SITE

With TOSCALE (Turnover scale) as a random effect (14 Dec 1997)

Numbers of observations
Level 2 observations = 9
Level 1 observations = 1828

The number of level 1 observations per level 2 unit are:
208  109  244  285  210  103  346  250  73

Descriptive statistics for all variables

Variable       Minimum     Maximum        Mean    Stand. Dev.
RSTABLEF       0.00000    60.00000    20.53063       26.60598
TOSCALE        5.00000    12.00000     6.44475        1.90057
GROUP          1.00000     2.00000     1.48632        0.49995

Descriptives for response variable RSTABLEF by the variable GROUP

RSTABLEF

Group        Mean    Stand. Dev.      N
1.000    22.52609       27.19126    939
2.000    18.42295       25.82165    889

Starting values

mean 2.5108
covariates 2.3014
var. terms 144.4930
residual 722.4648


MIXREG - The program for mixed-effects linear regression analysis

MIXREG on NIAAA RSTABLEF (DAYS IN STABLE HOUSING) FOR 9 SITE

With TOSCALE (Turnover scale) as a random effect (14 Dec 1997)

* Final Results - Maximum Marginal Likelihood (MML) Estimates *

EM Iterations = 10
Fisher Iterations = 5
Total Iterations = 15
Log Likelihood = -8580.246

Variable                Estimate    Stand. Error           Z     p-value
TOSCALE                  3.90282         0.38446    10.15152     0.00000
Group                   -3.24532         1.18994    -2.72730     0.00639

Random-effect variance & covariance term(s)
TOSCALE                  0.59297         0.32087     1.84801     0.03230
Residual variance      690.90345        22.90934    30.15815     0.00000

note: p-values are 2-tailed except for variances which are 1-tailed

Correlation of the MML estimates of the fixed terms

               1          2
               TOSCALE    GROUP
1  TOSCALE     1.0000
2  GROUP      -0.6982     1.0000

Correlation of the MML estimates of variance-related terms

               1          2
               Var-Cov1   Residual
1  Var-Cov1    1.0000
2  Residual   -0.0089     1.0000

The results indicate a statistically significant effect of the Group variable, even allowing for the nesting of the contrast between groups within the 9 sites. Although the number of days both groups of participants spent in stable housing improved greatly between baseline and follow-up, those in the enhanced or intensive groups spent an average of 22.5 days in stable housing in the 60 days prior to follow-up, compared to 18.4 days in the usual-care groups. In addition, the significant effect of the turnover scale would indicate that, in the absence of other co-varying explanatory factors, there would be reason to conclude that implementation problems as reflected in turnover played a role in producing the housing outcome. Similar results were obtained for the number of days spent literally homeless, and for the alcohol composite score of the Addiction Severity Index (ASI). For residential stability, however, unlike the other outcomes, the turnover effect was positive. Because the project-level database was available, it was possible to determine that the relationship between the turnover scale and the residential stability outcome was driven almost entirely by the fact that the highest turnover occurred in the one site that served homeless women exclusively. While these women's other outcomes were less favorable, at follow-up they were seldom literally homeless or in housing arrangements that were not classified as stable. The relationships between the implementation measure and the ASI scores were as expected.
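Readers who want to check the reported output for internal consistency can reproduce the Z statistics and p-values from the estimates and standard errors. The sketch below uses only figures printed in the MIXREG output above and assumes the usual normal approximation for Z tests (two-tailed for fixed effects, one-tailed for variance terms, as the output's note states). The function name `z_and_p` is introduced here for illustration.

```python
import math


def z_and_p(estimate, se, two_tailed=True):
    """Return the Z statistic and its normal-approximation p-value."""
    z = estimate / se
    tail = 0.5 * math.erfc(abs(z) / math.sqrt(2))  # upper-tail probability
    return z, (2 * tail if two_tailed else tail)


# Figures copied from the output above.
site_ns = [208, 109, 244, 285, 210, 103, 346, 250, 73]
group_stats = {1: (22.52609, 939), 2: (18.42295, 889)}  # group: (mean, N)

# Level-1 observations per site should sum to the reported total.
total_n = sum(site_ns)

# The N-weighted average of the group means should match the overall mean.
pooled_mean = sum(m * n for m, n in group_stats.values()) / sum(
    n for _, n in group_stats.values()
)

z_group, p_group = z_and_p(-3.24532, 1.18994)               # fixed effect
z_var, p_var = z_and_p(0.59297, 0.32087, two_tailed=False)  # variance term

print(f"total N = {total_n}, pooled mean = {pooled_mean:.5f}")
print(f"Group:   Z = {z_group:.5f}, p = {p_group:.5f}")
print(f"TOSCALE variance: Z = {z_var:.5f}, p = {p_var:.5f}")
```

The unadjusted difference between the two group means (about 4.1 days) is somewhat larger than the model's Group coefficient (about 3.2 days), which is estimated with TOSCALE in the model.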

Beginning with a finding of significant contrasts within sites and significant variation in effects from site to site associated with variation in a project-level variable, the project-level database, along with random-effects models, gives analysts the ability to examine a large number of additional cross-site factors, such as variation in housing availability and program focus, that may also explain the observed outcome variation associated with program implementation measures. Ongoing analyses of the NIAAA data are examining the apparent impact of staff turnover on outcomes in relation to other project-level factors such as the availability of appropriate housing and characteristics of case management, in addition to making statistical adjustments for factors such as differential attrition on characteristics of individual participants. By permitting the two levels of data to be combined in analyses, the project-level database supports the examination of a large number of alternative explanatory hypotheses.

D. Extensions of PDB use to meta-analysis of implementation and outcome evaluations:

Comparison bases for individual projects

As program process and outcome information accumulates across multiple demonstrations dealing with the same program areas (for example, the earlier round of Community Demonstration projects for homeless persons and the later Cooperative Agreements sponsored by NIAAA) and these results are supplemented by other studies from the published research literature, the accumulating body of coded information can become an increasingly reliable basis for assessing the outcomes of new programs and interpreting their findings. If the PDB approach to integrating process and outcome evaluation data is adopted in a standardized way, for example, and the characteristics of all projects or sites in the several similar multi-site demonstration programs can be coded into a PDB record, the overlap with meta-analytic goals would be clear and the distinction between project-level database and study-level meta-analytic coding would be eliminated. The inclusion of strong and appropriate program-theory coding should make it possible to interpret an observed change in behavior or program-implementation outcome against the varied background of programs similar in clients, interventions, settings, etc. This extension is a link with recent developments in meta-analysis (Cook et al., 1992; GAO, 1992). A similar approach focused on systematically assessing programmatic "natural experiments" has been advocated by Rog et al. (1994). Analysts who are expert in the programmatic issues of a particular area of services research would therefore do well to begin developing or collaborating on their own project-level databases.
