Evaluation Partnerships and the Systems Evaluation Protocol: The Importance of Lifecycle Analysis
In the push to figure out “what works,” evaluation designs (e.g., randomized controlled trials) are often poorly matched to the maturity of the programs they are meant to evaluate. In the third post in their series on the Systems Evaluation Protocol, Monica Hargraves and Jennifer Brown Urban present a tool they call “Lifecycle Analysis” for matching programs with evaluations.
Strange as it might first sound, programs are like people. They are “born” (initiated); they develop and change; they mature; and they may end up taking on larger or smaller roles in the world. To staff responsible for managing programs – nurturing them through early rough patches, navigating setbacks, and making critical decisions about their development and future – there can certainly be a “parental” feeling to the work!
Balancing the felt commitments to a program with disciplined decision-making about program design and support can be difficult, and good evaluation is invaluable. There are those who would argue that “more” is always better, and that the highest standard of rigor is to conduct randomized controlled trials (RCTs) of a program’s effectiveness. But are those appropriate for a fledgling program still working the bugs out? What’s right for programs that are being expanded or disseminated more widely? How do you decide “how much” evaluation is enough?
The Systems Evaluation Protocol (SEP) offers a way of thinking about programs that clarifies these choices and offers guidance about what constitutes “rigor” at various developmental stages. Today’s post will focus on the Lifecycle Analysis step of the SEP (see the first post in this series for background on the SEP). All of the Protocol steps are laid out in Figure 1.
Systems evaluation recognizes that program development is an evolutionary process, and that programs have a lifecycle. To operationalize this view, program lifecycles are divided into four phases:
- Phase I – Initiation: program is brand new or substantially adapted from previous versions, and is going through lots of adjustment to work out early problems or shortfalls
- Phase II – Development: most of the program elements have been settled, although some changes may still be happening
- Phase III – Stability: the program is being implemented consistently, may have formal written procedures, and can be run reliably by newly trained facilitators
- Phase IV – Dissemination: the program is being implemented in multiple sites or may be widely distributed.
A lifecycle analysis is essential to understanding what stage a program is in, and what kind of evaluation is needed. The critical point is this: for any given program lifecycle stage, there is an appropriate type of evaluation work to be done – that is, a corresponding evaluation lifecycle stage. “Appropriate” means an evaluation that provides the kind of information that will be most useful and relevant to the decisions that arise for the program in its current lifecycle phase. Lifecycle analysis in the SEP therefore offers considerable guidance when planning evaluation.
Evaluation lifecycles are also divided into four phases:
- Phase I – Process and Response: evaluation provides rapid feedback, primarily about program implementation; it may involve examining the presence or absence of selected outcomes among participants
- Phase II – Change: examines the program’s association with change in outcomes of interest using pre-/post-tests, quantitative/qualitative assessments of change
- Phase III – Comparison and Control: assesses program effectiveness using control groups or statistical controls, such as controlled experiments or quasi-experiments
- Phase IV – Generalizability: examines effectiveness across a wider range of contexts, multi-site analysis of large data sets, etc.
An essential step in the SEP is to assess what lifecycle stage the program is in currently, and the evaluation lifecycle stage of prior evaluation efforts. If the lifecycles are in alignment, then the evaluation planning focuses on what kind of information will be needed in order to help the program move forward with its development.
We can illustrate alignment and misalignment with Figure 2, where program lifecycle phases are on the horizontal axis and evaluation lifecycle phases are on the vertical axis. If the program and evaluation phases are perfectly aligned, the program would fall somewhere along the diagonal red line (e.g., Program A).
In practice, there is often misalignment between program and evaluation lifecycles. In that case, evaluation planning should work toward closing that gap, either by “filling in” information that has not been properly established yet (reining in the evaluation) or by pushing for a higher level of evidence about the program (see Figure 2; Urban, Hargraves, & Trochim, 2014).
Programs above the red line (e.g., Program C) are doing evaluations that are more advanced than their program lifecycle phase calls for. Program C is in the “Initiation” program lifecycle phase but it is being evaluated using a “Comparison & Control” evaluation such as an RCT design. What Program C really needs right now is rapid feedback on program implementation.
Programs below the red line (e.g., Program B) have evaluations that are “lagging behind” the program lifecycle phases. Program B is in the “Stability” phase of its program lifecycle, but is doing “Process & Response” evaluation. This program should be using a more sophisticated evaluation design.
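The diagonal logic of Figure 2 can be made concrete with a small sketch. This is purely illustrative and not part of the SEP itself: it encodes the four program and evaluation lifecycle phases as ordered lists and classifies a program as aligned, ahead, or lagging, mirroring Programs A, B, and C above. The function name and structure are our own invention for illustration.

```python
# Illustrative sketch (not part of the SEP): the four program and
# evaluation lifecycle phases, in order, and a simple alignment check.

PROGRAM_PHASES = ["Initiation", "Development", "Stability", "Dissemination"]
EVALUATION_PHASES = ["Process and Response", "Change",
                     "Comparison and Control", "Generalizability"]

def alignment(program_phase: str, evaluation_phase: str) -> str:
    """Compare a program's lifecycle phase with its evaluation phase."""
    p = PROGRAM_PHASES.index(program_phase)        # position 0..3
    e = EVALUATION_PHASES.index(evaluation_phase)  # position 0..3
    if e == p:
        return "aligned (on the red line)"
    if e > p:
        return "evaluation ahead of program (above the red line)"
    return "evaluation lagging behind program (below the red line)"

# Program B: a stable program assessed only with rapid feedback.
print(alignment("Stability", "Process and Response"))
# Program C: a brand-new program evaluated with an RCT-style design.
print(alignment("Initiation", "Comparison and Control"))
```

Matching phase for phase keeps a program on the diagonal; any gap between the two indices flags the kind of misalignment discussed next.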
What are the costs of misalignment?
For a program like C, it’s a case of using too big (and costly) a tool for the task. Not only is this a waste of resources, but it introduces the risk of unfairly assessing a program that is still changing a lot. It may be that this program has a great deal of potential but is still struggling with some “bugs”. An overblown evaluation might lead to the elimination of something that actually could be quite valuable. Conversely, early results might happen to be overly positive (for example, they might have been due to a particularly good facilitator and not to the design of the program itself). Relying too much on such an evaluation might tend to over-promote a program that really is not as strong as it seemed in that early test.
For a program like B, which is being implemented in a very stable fashion but has only been assessed through simple smile sheets, for example, there is a chance that the program is actually either more or less effective than it is currently understood to be because the evaluation is not appropriate for assessing effectiveness. This could be a very strong program that ought to be disseminated much more widely but is constrained because the evidence is not strong enough to support dissemination. Conversely, it could be a rather weak program that continues to be implemented despite its limitations. Again, the evaluation is not serving the decision-making process well.
What to do if your program and evaluation are out of alignment?
Misalignment happens. But it’s essential to understand the inherent risks and weaknesses when making program decisions on the basis of misaligned evaluations. And it’s important to move toward alignment over successive evaluation cycles. Promoting the healthy evolution of programs over time is essential. For a program whose program and evaluation phases are not currently aligned, the move toward alignment does not necessarily occur within one evaluation cycle. Rather, the focus is on building evidence over successive evaluation cycles while simultaneously striving for phase alignment.
References: Trochim et al., 2012; Urban, Hargraves, & Trochim, 2014.