Evaluating What Works
The SIF is a federal tiered innovation and evidence initiative that prioritizes evaluation and building evidence of effectiveness. The focus on evidence in SIF goes beyond confirming that the funded programs achieved their intended outcomes and impacts; it includes understanding how they are successful and how they can be improved.
By developing a strong evaluation process, putting resources towards supporting quality evaluations, and encouraging our grantees and subgrantees to do the same, the SIF increases the evaluation capacity of all stakeholders and offers best practices and lessons learned to the social innovation field.
A demonstrated track record of using evaluation to make programmatic decisions is a requirement for SIF grantees. In addition, all funded programs being implemented in communities must complete a rigorous evaluation to strengthen their base of evidence and to document and assess:
A key goal of the SIF is to build the evaluation capacity of nonprofit organizations so they can successfully assess whether their programs are truly creating impact. The Social Innovation Fund Evaluation Plan (SEP) Guidance provides a common framework and shared understanding of what rigorous evaluation means, the elements and criteria against which SIF grantees' and subgrantees' plans are assessed, and suggestions other organizations can use as they develop their own evaluations.
What are Evidence Tiers?
The SIF relies on a framework that organizes evidence levels into three categories: preliminary, moderate, and strong. This tiered-evidence framework enables more dollars to be directed towards programs that have demonstrated success and are ready to be scaled for wider impact, while also directing lesser amounts of funding toward interventions that need to be tested and proven. As a result, tiered-evidence grant programs have the goal of identifying evidence-based models that can be replicated.
Preliminary evidence means the model has evidence based on a reasonable hypothesis and supported by credible research findings. Examples of research that meet the standards include: (1) outcome studies that track participants through a program and measure participants’ responses at the end of the program; and (2) third-party pre- and post-test research that determines whether participants have improved on an intended outcome.
Moderate evidence means evidence from previous studies on the program, the designs of which can support causal conclusions (i.e., studies with high internal validity) but have limited generalizability (i.e., moderate external validity) or, vice versa, studies that support only moderate causal conclusions but have broad general applicability. Examples of studies that would constitute moderate evidence include: (1) at least one well-designed and well-implemented experimental or quasi-experimental study supporting the effectiveness of the practice, strategy, or program, with small sample sizes or other conditions of implementation or analysis that limit generalizability; or (2) correlational research with strong statistical controls for selection bias and for discerning the influence of internal factors.
Strong evidence means evidence from previous studies on the program, the designs of which can support causal conclusions (i.e., studies with high internal validity), and that, in total, include enough of the range of participants and settings to support scaling up to the state, regional, or national level (i.e., studies with high external validity). The following are examples of strong evidence: (1) more than one well-designed and well-implemented experimental study or well-designed and well-implemented quasi-experimental study that supports the effectiveness of the practice, strategy, or program; or (2) one large, well-designed and well-implemented randomized controlled, multisite trial that supports the effectiveness of the practice, strategy, or program.
By employing three different evaluation designs, SIF grantees aim to move to a higher level of evidence. If successful, three-quarters of these studies will allow programs to demonstrate a moderate or strong level of evidence. All SIF interventions are required to reach at least a moderate level of evidence by the end of their grant term.
How Does the SIF Apply these Evidence Tiers?
STEP 1: All programs funded by SIF must demonstrate a minimum of preliminary evidence of effectiveness. SIF grantees must ensure that the programs they fund also meet this minimum evidence requirement.
STEP 2: Once funded, programs must build on their level of evidence. A program must partner with an independent evaluation team to conduct a rigorous evaluation that will help build evidence supporting its effectiveness and potentially move it to a higher tier of evidence.
STEP 3: Programs with higher levels of evidence are prioritized for greater expansion. The SIF expects programs with stronger levels of evidence will receive more financial support so that they can scale up their programs.
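The three steps above can be sketched as a small data model. This is an illustrative sketch only, not the SIF's actual process: the names `EvidenceTier`, `meets_funding_minimum`, and `rank_for_expansion`, and the example programs, are hypothetical, and ranking by tier alone is a simplifying assumption.

```python
from enum import IntEnum


class EvidenceTier(IntEnum):
    """The three evidence tiers, ordered so that higher values mean stronger evidence."""
    PRELIMINARY = 1
    MODERATE = 2
    STRONG = 3


def meets_funding_minimum(tier):
    """STEP 1: a program needs at least preliminary evidence (None means no evidence yet)."""
    return tier is not None and tier >= EvidenceTier.PRELIMINARY


def rank_for_expansion(programs):
    """STEP 3 (assumed rule): list programs from strongest to weakest evidence tier."""
    return sorted(programs, key=lambda program: program["tier"], reverse=True)


# Hypothetical programs used only to exercise the functions above.
programs = [
    {"name": "Program A", "tier": EvidenceTier.PRELIMINARY},
    {"name": "Program B", "tier": EvidenceTier.STRONG},
    {"name": "Program C", "tier": EvidenceTier.MODERATE},
]

eligible = [p for p in programs if meets_funding_minimum(p["tier"])]
ranked_names = [p["name"] for p in rank_for_expansion(eligible)]
```

STEP 2, the independent evaluation that may move a program to a higher tier, is a judgment made by evaluators and is deliberately not modeled here.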
How Does the SIF Support Evaluations?
The SIF provides grantees and their subgrantees with access to a team of evaluation experts who work with grantees to review and provide feedback on their evaluation design strategies, ensuring these designs meet the program’s evidence goals and lead to the development and implementation of quality evaluations.
SIF and its evaluation contractors provide grantees and subgrantees with guidance documents and ongoing feedback as they develop their evaluation plans, implement them, and report on results, promoting a culture of learning and support among its grantees and sharing learnings with the field.
What Does SIF Evaluation Investment Look Like?
SIF evaluation designs can be placed into three categories: experimental, quasi-experimental, and non-experimental.
Non-experimental design
The term "non-experimental" is a catch-all category that refers to a range of research and evaluation studies that do not fall under the experimental or quasi-experimental research designs. These studies include process and outcome evaluations, case studies, cost-effectiveness or cost-benefit analyses, feasibility studies, rapid assessments, situational and contribution analyses, developmental evaluations, strategic learning, systems change studies, and others.
Quasi-experimental design (QED)
A design that forms a counterfactual group by means other than random assignment. This approach is used for conducting impact evaluations in which observed changes in the treatment group are compared with those in a comparison group (a counterfactual representing the absence of the intervention) to estimate the impact of the program on participants. However, groups formed in these designs typically differ for reasons other than chance, and these differences may influence the impact estimate. Quasi-experimental designs use a variety of approaches, including Propensity Score Matching (PSM), Regression Discontinuity, Interrupted Time Series (ITS), and others.
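As a rough illustration of how a comparison group can be formed without random assignment, the sketch below pairs each treated participant with the non-participant whose baseline score is closest. This is a simplified stand-in for matching methods such as PSM, not PSM itself: it matches on a single observed score rather than an estimated propensity score, and all names and data are hypothetical.

```python
from statistics import mean


def match_nearest(treated, pool):
    """Pair each treated unit with the closest unused pool unit by baseline score."""
    available = list(pool)
    pairs = []
    for unit in treated:
        best = min(available, key=lambda c: abs(c["baseline"] - unit["baseline"]))
        available.remove(best)  # match without replacement
        pairs.append((unit, best))
    return pairs


def matched_impact(pairs):
    """Average outcome difference between treated units and their matches."""
    return mean(t["outcome"] - c["outcome"] for t, c in pairs)


# Hypothetical data: baseline and outcome scores for participants and non-participants.
treated = [
    {"id": "T1", "baseline": 40, "outcome": 52},
    {"id": "T2", "baseline": 55, "outcome": 66},
]
pool = [
    {"id": "C1", "baseline": 41, "outcome": 45},
    {"id": "C2", "baseline": 70, "outcome": 72},
    {"id": "C3", "baseline": 54, "outcome": 58},
]

pairs = match_nearest(treated, pool)
impact = matched_impact(pairs)
```

Because the match uses only one observed characteristic, any unobserved differences between the groups would still bias the estimate, which is the limitation the paragraph above describes.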
Experimental design
Experimental design studies, or randomized controlled trials (RCTs), randomly assign program participants to two distinct groups: the treatment group, which receives program services, and the control group, which does not. The control group is called the “counterfactual,” representing the condition in which the program or intervention is absent. Random assignment helps ensure that the treatment and control groups are initially similar on average and do not differ systematically on background characteristics or other factors. Random assignment thus creates an evaluation design in which any observed differences between the two groups after the program intervention takes place can be attributed to the intervention with a high degree of confidence.
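The logic of random assignment can be sketched in a few lines. The example below is a minimal illustration, assuming an even split into two groups and a simple difference in mean outcomes as the impact estimate; the function names and the outcome data are hypothetical, and a real evaluation would also report statistical uncertainty.

```python
import random
from statistics import mean


def randomly_assign(participant_ids, seed=0):
    """Shuffle participants and split them into treatment and control halves."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


def difference_in_means(outcomes, treatment, control):
    """Estimate impact as mean treatment outcome minus mean control outcome."""
    return mean(outcomes[p] for p in treatment) - mean(outcomes[p] for p in control)


participants = list(range(1, 21))
treatment, control = randomly_assign(participants, seed=42)

# Hypothetical outcomes: everyone scores 50, and the program adds 5 points.
outcomes = {p: 50 + (5 if p in treatment else 0) for p in participants}
estimate = difference_in_means(outcomes, treatment, control)
```

In this artificial data every participant has the same baseline, so the estimate recovers the 5-point effect exactly. With real participants the baseline varies, and random assignment is what keeps that variation from favoring one group systematically.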
Note: As of 2011, all SIF interventions are required to reach at least a moderate level of evidence by the end of their grant term.