Effect Categories

MUSE classifies the results of each analysis into five intuitive categories so that users can quickly understand what an intervention actually produced. When browsing Evidence Cards or building a Logic Model, each piece of evidence is tagged with one of these categories, giving you an at-a-glance summary of whether an intervention worked, for whom, and whether it caused unintended consequences.

Quick Reference

Icon color | Category    | ID  | Meaning
-----------|-------------|-----|--------------------------------------
Green      | Positive    | +   | Expected effect was found
Red        | No Effect   | -   | Expected effect was not observed
Gray       | Mixed       | +-  | Results vary depending on conditions
Orange     | Side Effect | !   | Unintended effects were observed
Gray       | Unclear     | N/A | Evidence is insufficient to classify
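For teams working with exported evidence data, the mapping above can be represented as a small enum. This is an illustrative sketch only; the class and member names are assumptions, not identifiers from MUSE itself:

```python
from enum import Enum

class EffectCategory(Enum):
    """One tag per piece of evidence; values mirror the short IDs above."""
    POSITIVE = "+"       # expected effect was found
    NO_EFFECT = "-"      # expected effect was not observed
    MIXED = "+-"         # results vary depending on conditions
    SIDE_EFFECT = "!"    # unintended effects were observed
    UNCLEAR = "N/A"      # evidence is insufficient to classify

# Look up a category from its short ID, e.g. when parsing exported data.
print(EffectCategory("+-"))  # EffectCategory.MIXED
```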

Category Descriptions

Positive Effect

Icon: Plus sign (+) on a green background

Indicates that the expected effect was found: the result is typically statistically significant, and the effect is large enough to be practically meaningful.

When this applies:

  • The study detected the anticipated change in the target outcome.
  • The result is statistically significant (i.e., unlikely to be due to chance).
  • The effect size is large enough to matter in practice, not merely a detectable but negligible difference.

Example: A job-skills training program is evaluated on employment rates six months after completion. Participants show a statistically significant increase in employment compared to a matched control group, and the magnitude of the difference is large enough to justify the program cost.


No Effect

Icon: Minus sign (-) on a red background

Indicates that the expected effect was not observed. Typically the sample size was sufficient, yet the effect was not statistically significant. Results from very large samples that are statistically significant but practically meaningless also fall into this category.

When this applies:

  • The study had adequate statistical power, but no meaningful change in the target outcome was detected.
  • Alternatively, in very large samples a tiny statistically significant effect is found that has no practical value.

Example: A financial literacy workshop is evaluated using a large administrative dataset. Participants show a small, statistically significant increase in savings, but the dollar amount is so marginal that it would not meaningfully change recipients' financial security. The result is classified as No Effect.


Mixed Effect

Icon: Blend icon on a gray background

Intervention effects are often heterogeneous. Results are classified as Mixed when outcomes differ by condition: for example, effects were found for men but not for women, or for young people but not for elderly people.

When this applies:

  • Sub-group analyses reveal that the intervention benefits some populations while not benefiting (or even harming) others.
  • The overall average effect is statistically null, but meaningful positive effects exist in specific segments.
  • The direction or size of the effect changes based on context, geography, or implementation fidelity.

Example: A literacy intervention shows large improvements in reading scores for students in urban schools, but no measurable improvement for students in rural schools where classroom sizes and teacher training differed. The evidence is tagged as Mixed.


Side Effect

Icon: Alert triangle (!) on an orange background

Indicates that effects outside the intervention's intended outcomes were observed. Typically these are statistically significant and represent practically meaningful harm.

When this applies:

  • The intervention produced outcomes that were not part of the original theory of change.
  • These unintended outcomes are adverse — that is, they represent harm or deterioration in some dimension.
  • The unintended effects are large enough to be practically meaningful, not just statistically detectable noise.

Example: A conditional cash transfer program successfully increases school enrollment (a Positive Effect on its primary indicator), but also correlates with increased household debt as families borrow against anticipated payments. The debt increase is logged as a Side Effect on a separate evidence card.


Unclear

Icon: Question mark on a gray background

Evidence is classified as Unclear when the sample size is insufficient or the analytical methods are inadequate to support reliable conclusions. Interventions judged Unclear require additional testing before they can be acted on with confidence.

When this applies:

  • The study is underpowered — too few participants to detect a true effect even if one exists.
  • Methodological limitations (poor comparison group, confounding, selection bias) make the estimates unreliable.
  • The evidence base is too thin to support a definitive conclusion in either direction.

Example: A pilot community health outreach program enrolled only 30 participants. The follow-up survey showed a modest improvement in reported health behaviors, but the confidence intervals are extremely wide and the result could plausibly reflect random variation. The evidence is tagged as Unclear pending a larger replication study.
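Taken together, the "When this applies" criteria above amount to a simple decision procedure. The sketch below is illustrative only: the field names, the order of precedence, and the reduction of each judgment to a boolean are simplifying assumptions for exposition, not MUSE's actual classification logic:

```python
from dataclasses import dataclass

@dataclass
class StudyResult:
    adequately_powered: bool      # enough participants and sound methods for a reliable estimate
    significant: bool             # statistically significant change in the target outcome
    practically_meaningful: bool  # effect size large enough to matter in practice
    heterogeneous: bool           # effect direction or size differs across subgroups or contexts
    adverse_unintended: bool      # harmful outcomes outside the theory of change were observed

def classify(r: StudyResult) -> str:
    """Return the short category ID: '+', '-', '+-', '!', or 'N/A'."""
    if not r.adequately_powered:
        return "N/A"  # Unclear: the evidence cannot support a conclusion either way
    if r.adverse_unintended:
        return "!"    # Side Effect (in practice, logged on a separate evidence card)
    if r.heterogeneous:
        return "+-"   # Mixed: works for some populations or contexts but not others
    if r.significant and r.practically_meaningful:
        return "+"    # Positive: expected, significant, and practically meaningful
    return "-"        # No Effect: null result, or significant but practically negligible

# The job-skills training example: well powered, significant, meaningful effect.
print(classify(StudyResult(True, True, True, False, False)))   # +
# The financial literacy example: significant but practically negligible.
print(classify(StudyResult(True, True, False, False, False)))  # -
```

The precedence here (Unclear first, then Side Effect, then Mixed) is one defensible ordering, chosen so that weak evidence is never over-claimed; a real evidence card could of course carry more than one tag across separate cards, as the cash transfer example shows.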


Why These Categories Matter

Categorizing evidence by its effect type helps logic model builders and program evaluators make faster, better-informed decisions:

  • Positive evidence strengthens the causal links in a Logic Model.
  • No Effect evidence prompts teams to reconsider whether a given activity truly drives the expected outcome.
  • Mixed evidence surfaces equity considerations — who benefits and who does not.
  • Side Effect evidence flags risks that may need mitigation in program design.
  • Unclear evidence signals where future research investment is most needed.

Together, these five categories give the MUSE canvas a shared language for evidence quality that is accessible to non-specialists while remaining rigorous enough for researchers.