Complete guide
Use the calculator above to paste two columns of paired data and instantly calculate Pearson’s correlation coefficient (r), with a live scatter plot, regression line, R-squared and a plain-English interpretation of strength and direction. Correlation is a common starting point for a data-driven hypothesis about cause and effect.
What it is
What is correlation?
Correlation measures the strength and direction of a linear relationship between two variables. Pearson’s r ranges from −1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive). It does not prove causation — only that the two variables move together in a predictable way.
Calculation logic
How the calculation works
Pearson’s r = Σ((x−x̄)(y−ȳ)) ÷ √(Σ(x−x̄)² × Σ(y−ȳ)²). The calculator standardises both variables, then computes the average product of the paired standardised values, which is algebraically the same as this formula. R-squared (r²) is the proportion of variation in one variable explained by a straight-line fit to the other.
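The formula above can be sketched in a few lines of Python. This is a minimal illustration and not the calculator's actual source; the function name pearson_r and the sample columns hours and score are made up for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the deviation formula: Sxy / sqrt(Sxx * Syy)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length columns with at least 2 rows")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each value from its column mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))  # sum of products of deviations
    sxx = sum(a * a for a in dx)              # sum of squared x deviations
    syy = sum(b * b for b in dy)              # sum of squared y deviations
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a column has no variation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired data, for illustration only.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)
print(round(r, 4), round(r * r, 4))  # prints: 0.7746 0.6
```

Here Σ((x−x̄)(y−ȳ)) = 6, Σ(x−x̄)² = 10 and Σ(y−ȳ)² = 6, so r = 6 ÷ √60 ≈ 0.77 and r² = 0.6.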
Common mistakes
Watch-outs before using correlation
- Confusing correlation with causation — r tells you variables move together, not that one causes the other.
- Reporting r without R² — R² is the more useful number for explaining how much variation is shared.
- Using Pearson’s r on non-linear relationships — it only detects linear association.
- Drawing conclusions from very small samples (n < 20) — small samples produce unstable r values.
- Ignoring outliers, which can either inflate or hide a real correlation.
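The outlier watch-out is easy to see with a small experiment. The sketch below uses invented numbers: six points with almost no linear relationship, then the same six plus one extreme point. The helper pearson_r is a hypothetical name and simply implements the standard deviation formula for r.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the standard deviation formula."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented data: no obvious trend in the first six points.
x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 4, 2, 4, 1]
r_without = pearson_r(x, y)

# Add a single extreme point far from the rest.
r_with = pearson_r(x + [20], y + [20])

print(round(r_without, 2))  # prints: -0.12
print(round(r_with, 2))     # prints: 0.94
```

One point moves r from roughly −0.12 to roughly 0.94, which is why the scatter plot should be checked before trusting the number. An outlier placed against an existing trend can have the opposite effect and hide a real correlation.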
What to do next
Turn the result into action
When r is strong, follow up with a designed experiment to confirm causation. When r is weak and the scatter plot shows no other pattern, save the effort and screen other variables. Always plot the scatter: the number can lie, the picture rarely does.
What does the correlation coefficient mean?
It measures the strength and direction of a linear relationship between two variables. Values range from −1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive).
What is a strong correlation?
A common rule of thumb: |r| above 0.7 is strong, 0.4 to 0.7 is moderate, below 0.4 is weak. Context matters: in a tightly controlled physics measurement 0.95 may count as weak, while in social science 0.4 can be notable.
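As a rough sketch, the rule of thumb can be written as a small function. The name describe_strength and the exact wording of the labels are assumptions for this example, and the cut-offs are only the rule-of-thumb values quoted here, not a universal standard.

```python
def describe_strength(r):
    """Label r using the rule-of-thumb cut-offs (0.4 and 0.7)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no linear relationship"
    size = abs(r)
    if size > 0.7:
        strength = "strong"
    elif size >= 0.4:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


print(describe_strength(-0.82))  # prints: strong negative
print(describe_strength(0.55))   # prints: moderate positive
print(describe_strength(0.12))   # prints: weak positive
```

The sign gives the direction and the absolute value gives the strength, so −0.82 is just as strong as +0.82.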
Does correlation imply causation?
No. Correlation only tells you two variables move together. Causation requires further evidence — typically a designed experiment that manipulates one variable and measures the effect on the other.
What is R-squared?
R² is the square of the correlation coefficient. It represents the proportion of variation in one variable that is explained by the other. R² = 0.61 means 61% of the variation is shared.
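To see why R² can be read as "variation explained", the sketch below fits a least-squares line and computes R² as 1 minus the ratio of leftover variation to total variation. The function names fit_line and r_squared and the sample data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """R-squared as 1 - (residual variation / total variation)."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    # Variation left over after the line has been fitted.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of y around its own mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Invented paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(round(r_squared(x, y), 4))  # prints: 0.6
```

For this data the fitted line leaves 2.4 units of squared variation out of a total of 6, so R² = 1 − 2.4 ÷ 6 = 0.6. That matches the square of Pearson's r for the same data (r ≈ 0.77).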
When should correlation not be used?
When the relationship is non-linear (use Spearman’s rank if the relationship is still monotonic, or transform the data), when the data is ordinal or categorical (use rank methods for ordinal data and chi-square for categorical data), or when the sample is too small (n < 20).
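A short sketch of the Spearman alternative: rank each column, then apply Pearson's formula to the ranks. The helper names average_ranks, pearson_r and spearman_rho are hypothetical, and the data is an invented curve (y = x³) that always rises but is not a straight line.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the standard deviation formula."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Invented monotonic but non-linear data: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(round(pearson_r(x, y), 2))     # prints: 0.94
print(round(spearman_rho(x, y), 2))  # prints: 1.0
```

Pearson's r understates this relationship because the points curve away from a straight line, while Spearman's rank scores it as perfect because y always increases with x. Neither measure helps with a relationship that rises and then falls, such as a U-shape.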