Correlation Calculator

Paste two columns of paired data to instantly calculate Pearson's correlation coefficient (r) — with a live scatter plot, regression line, R-squared, and a plain-English interpretation of strength and direction.

Enter your values

Enter your X and Y values, comma-separated, one pair per line. Minimum 3 pairs.
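The input format described above can be read with a few lines of Python. This is a minimal sketch, not the calculator's own code: the function name `parse_pairs`, the `min_pairs` parameter and the sample values are assumptions made for illustration.

```python
def parse_pairs(text, min_pairs=3):
    """Parse 'x,y' lines into two lists of floats, ignoring blank lines.

    Hypothetical helper: mirrors the stated input rules (comma-separated,
    one pair per line, at least three pairs).
    """
    xs, ys = [], []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines between pairs
        parts = line.split(",")
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected 'x,y', got {line!r}")
        try:
            x, y = float(parts[0]), float(parts[1])
        except ValueError:
            raise ValueError(f"line {line_no}: non-numeric value in {line!r}") from None
        xs.append(x)
        ys.append(y)
    if len(xs) < min_pairs:
        raise ValueError(f"enter at least {min_pairs} valid x,y pairs")
    return xs, ys


# Made-up sample input: temperature, defect rate
sample = """180, 4.1
190, 3.2
200, 2.5"""
xs, ys = parse_pairs(sample)
print(xs, ys)
```

Rejecting malformed lines, rather than silently dropping them, keeps the X and Y values paired correctly.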

Simulation Lab

Correlation Simulation

Eight days of temperature vs defect rate data. Enter the lab and find out whether the relationship is real — and how strong it is.

Complete guide

Correlation Calculator Guide

The calculator above takes two columns of paired data and reports Pearson’s correlation coefficient (r), together with a scatter plot, regression line, R-squared and a plain-English interpretation of strength and direction. Correlation is often the starting point for a data-driven hypothesis about cause and effect.

What it is

What is correlation?

Correlation measures the strength and direction of a linear relationship between two variables. Pearson’s r ranges from −1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive). It does not prove causation — only that the two variables move together in a predictable way.

Calculation logic

How the calculation works

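Pearson’s r = Σ((x−x̄)(y−ȳ)) ÷ √(Σ(x−x̄)² × Σ(y−ȳ)²), where x̄ and ȳ are the sample means. The calculator standardises both variables (subtracts each mean and divides by each standard deviation), then computes the average product of the paired standardised values, which gives the same number. R-squared (r²) is the proportion of variation in one variable that a straight-line fit on the other accounts for.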
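The formula can be sketched in plain Python as follows. The function names and the five data points are made up for illustration, and this is not presented as the calculator's implementation. The second function shows the standardise-then-average route, assuming the population standard deviation (dividing by n) so that the two routes agree.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from sums of deviations, following the formula directly."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two equal-length lists with at least 3 pairs")
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


def pearson_r_from_z(xs, ys):
    """Same value as the average product of z-scores (population SD, divide by n)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sd_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
    return sum(((x - x_bar) / sd_x) * ((y - y_bar) / sd_y)
               for x, y in zip(xs, ys)) / n


# Illustrative data only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))       # 0.7746
print(round(r ** 2, 2))  # 0.6
```

For this data the sums are Σ(x−x̄)(y−ȳ) = 6, Σ(x−x̄)² = 10 and Σ(y−ȳ)² = 6, so r = 6 ÷ √60 ≈ 0.7746.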

Worked example

Worked example: oven temperature vs defect rate

A QA team plots oven temperature against defect rate across 30 batches. Pearson’s r comes back at −0.78 — a strong negative correlation. R² = 0.61, meaning 61% of the variation in defects is associated with temperature variation.
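The arithmetic behind those two figures is short enough to check directly. The snippet below only re-derives R² from the r quoted in the example; the variable names are illustrative.

```python
r = -0.78                     # correlation reported in the worked example
r_squared = r ** 2            # the sign drops out when squaring

print(f"{r_squared:.4f}")     # 0.6084
print(f"{r_squared:.0%}")     # 61%
print(f"{1 - r_squared:.0%}") # 39% of the variation is not associated with temperature
```

The sign of r carries the direction (here, defects fall as temperature rises), while R² carries only the share of variation.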

The team has a credible hypothesis: because r is negative, batches run hotter tended to have fewer defects, so holding temperature at the upper end of the observed range should reduce defects. They follow up with a designed experiment (DOE) to confirm causation. Correlation pointed them at the right variable; the DOE showed the relationship was causal, not coincidental.

Why it matters

Operational impact

Correlation tells improvement teams which variables to investigate further. A strong r between a process input and an output justifies a designed experiment; a weak r, once the scatter plot has ruled out a curved pattern, suggests the effort is better spent on other variables.

Decision making

When to use it

Use correlation in the Analyse phase of DMAIC to screen candidate root causes before launching a full designed experiment. It is also a common first step in Measurement System Analysis and regression studies.
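Screening can be sketched as computing r for each candidate input against the output and ranking by |r|. Everything in this example is hypothetical: the variable names, the six data points per variable and the helper function are assumptions for illustration, and six pairs is far too few for a real study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from sums of deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical process output and candidate inputs (made-up numbers)
defect_rate = [4.1, 3.8, 3.2, 2.9, 2.5, 2.2]
candidates = {
    "oven_temp_c": [180, 184, 190, 195, 200, 205],
    "line_speed": [12, 15, 11, 14, 13, 12],
    "humidity_pct": [40, 42, 39, 45, 41, 44],
}

# Rank candidates by the absolute size of r, strongest first
ranked = sorted(
    ((name, pearson_r(values, defect_rate)) for name, values in candidates.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name:>14}  r = {r:+.2f}")
```

Ranking by |r| rather than r keeps strong negative relationships at the top of the list alongside strong positive ones.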

Lean Six Sigma

Link to Six Sigma

Correlation, regression, hypothesis testing and DOE form the analytical core of Six Sigma. Correlation screens; regression quantifies; hypothesis testing validates; DOE optimises.

Industry examples

Where correlation is useful

Manufacturing: Test process parameters against defect rates to choose which factors to control tightly.
Sales analytics: Correlate marketing spend with revenue per channel to focus the marketing mix.
Healthcare research: Screen candidate risk factors against outcomes before launching a controlled study.
Software performance: Correlate request volume with response time to expose capacity constraints.

Common mistakes

Watch-outs before using correlation

  • Confusing correlation with causation — r tells you variables move together, not that one causes the other.
  • Reporting r without R² — R² is the more useful number for explaining how much variation is shared.
  • Using Pearson’s r on non-linear relationships — it only detects linear association.
  • Drawing conclusions from very small samples (n < 20) — small samples produce unstable r values.
  • Ignoring outliers, which can either inflate or hide a real correlation.
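Two of these watch-outs can be reproduced with tiny made-up datasets. The sketch below uses a hypothetical `pearson_r` helper and invented numbers: first a perfectly curved relationship that Pearson's r misses, then a single outlier that creates a strong r where the remaining points show none.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from sums of deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Non-linear: y depends entirely on x, but the pattern is a symmetric curve
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x * x for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)  # zero linear association

# Outlier: five points with no linear trend, then one extreme point added
clean_x, clean_y = [1, 2, 3, 4, 5], [3, 1, 4, 1, 3]
r_clean = pearson_r(clean_x, clean_y)
r_outlier = pearson_r(clean_x + [20], clean_y + [20])

print(f"curved relationship: r = {r_curve:.2f}")
print(f"without outlier:     r = {r_clean:.2f}")
print(f"with one outlier:    r = {r_outlier:.2f}")
```

In both cases a scatter plot would show the problem immediately, which is why the plot should be read alongside the number.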

What to do next

Turn the result into action

When r is strong, follow up with a Designed Experiment to confirm causation. When r is weak, save the effort and screen other variables. Always plot the scatter — the number can lie, the picture rarely does.

Resources

Templates, videos and learning

Combine correlation with regression, hypothesis testing and DOE. The resources below help turn an exploratory r into a defensible improvement action.

Frequently asked questions

What does the correlation coefficient mean?

It measures the strength and direction of a linear relationship between two variables. Values range from −1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive).

What is a strong correlation?

A common rule of thumb: |r| above 0.7 is strong, 0.4 to 0.7 is moderate, and below 0.4 is weak. Context matters: in a tightly controlled physics measurement an r of 0.95 can be considered weak, while in social science 0.4 is notable.
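The rule of thumb can be written as a small lookup. This is a sketch of the thresholds quoted above only; the function name is hypothetical, and the cut-offs should be adjusted to the field the data comes from.

```python
def describe_correlation(r):
    """Return (strength, direction) using the 0.4 / 0.7 rule of thumb."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size > 0.7:
        strength = "strong"
    elif size >= 0.4:
        strength = "moderate"
    else:
        strength = "weak"
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction


print(describe_correlation(-0.78))  # ('strong', 'negative')
print(describe_correlation(0.45))   # ('moderate', 'positive')
print(describe_correlation(0.1))    # ('weak', 'positive')
```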

Does correlation imply causation?

No. Correlation only tells you two variables move together. Causation requires further evidence — typically a designed experiment that manipulates one variable and measures the effect on the other.

What is R-squared?

R² is the square of the correlation coefficient. It represents the proportion of variation in one variable that is explained by a straight-line fit on the other. R² = 0.61 means 61% of the variation is shared.
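The link between R² and the regression line can be checked numerically. The sketch below fits a least-squares line to five made-up points and compares 1 − SS_res ÷ SS_tot with r². The function names and data are assumptions for illustration, and the equality holds for a straight-line fit with a single predictor.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared_from_fit(xs, ys):
    """Share of the variation in y accounted for by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def pearson_r(xs, ys):
    """Pearson's r from sums of deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Illustrative data only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)
print(round(slope, 2), round(intercept, 2))   # 0.6 2.2
print(round(r_squared_from_fit(xs, ys), 4))   # 0.6
print(round(pearson_r(xs, ys) ** 2, 4))       # 0.6
```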

When should correlation not be used?

When the relationship is non-linear (use Spearman’s rank if it is monotonic, or transform the data), when the data is ordinal or categorical (use rank methods or chi-square), or when the sample is too small to give a stable r (as a rule of thumb, n < 20).
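Spearman's rank correlation is Pearson's r applied to the ranks of the data, with tied values sharing their average rank. The sketch below uses hypothetical helper names and made-up data; it compares the two coefficients on a relationship that always increases but is not a straight line.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from sums of deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j across a run of equal values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Monotonic but curved: y = x cubed (illustrative data)
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(round(pearson_r(xs, ys), 3))     # below 1: the points are not on a line
print(round(spearman_rho(xs, ys), 3))  # 1.0: the ordering agrees perfectly
```

Spearman's coefficient only helps when the relationship is monotonic; a U-shaped pattern defeats both measures.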
