How Accurate Is an IQ Score? What Measurement Error Actually Means

Professional IQ tests have reliability of 0.95+ but every score has a ±5 point margin. Here's how to read your score with the confidence interval applied, why scores change between tests, and how online tests compare.

Short answer: Professionally administered IQ tests are remarkably accurate — they have reliability coefficients of 0.95 or higher, which is better than most medical tests. But every score carries about ±5 points of measurement uncertainty. A measured IQ of 118 should be read as “somewhere between 113 and 123” rather than a precise number. Understanding that window is the key to interpreting your score correctly.

Want to see this in practice? You can take our 50-question assessment and see your score with a percentile breakdown across four cognitive domains.

Reliability vs. accuracy: two different questions

When people ask whether IQ scores are “accurate,” they're usually asking one of two related but distinct questions:

  • Reliability: If I take the test again next week, will I get the same score?
  • Validity: Does the number actually measure what it claims to measure (general cognitive ability)?

These have different answers. Reliability is a question of consistency; validity is a question of meaning. A scale that always reports your weight as 5kg heavier than it really is would be highly reliable but not very valid. A scale whose readings bounce around by 20kg each time would be neither. Professional IQ tests perform well on both dimensions, but this article focuses on reliability — because that's what most people mean when they ask “how accurate is my IQ score?”

How reliable are the best IQ tests?

Reliability is measured on a scale of 0 to 1, where 1 is perfect consistency. The Wechsler Adult Intelligence Scale (WAIS-IV, and the newer WAIS-V released in 2024) is the gold-standard clinical test for adults. It has reported reliability coefficients of:

Measurement                   | Reliability | Interpretation
WAIS-IV Full Scale IQ         | 0.96 - 0.98 | Excellent
WAIS-IV Verbal Comprehension  | 0.94 - 0.96 | Excellent
WAIS-IV Perceptual Reasoning  | 0.94 - 0.95 | Excellent
WAIS-IV Working Memory        | 0.93 - 0.95 | Excellent
WAIS-IV Processing Speed      | 0.90 - 0.92 | Very good

For context, here are some reliability figures for comparison:

  • A standard bathroom scale: about 0.99
  • Routine blood pressure measurement: about 0.80
  • A typical university exam: about 0.70 to 0.85
  • A professional IQ test: about 0.95 to 0.98
  • Most online personality quizzes: 0.60 to 0.80
  • A five-question online IQ quiz: unknown, probably 0.50 to 0.70

IQ tests are among the most reliable psychological measurements ever created. That's not marketing — it's what a century of psychometric research has consistently found.

The standard error of measurement (SEM)

Even with reliability of 0.96, no test is perfectly precise. The key concept for understanding the uncertainty in your score is the standard error of measurement (SEM). This is a single number that tells you how much your measured score would typically vary if you took the same test many times.

The formula is straightforward:

SEM = SD × √(1 − reliability)

For the WAIS-IV Full Scale IQ, with SD = 15 and reliability ≈ 0.96:

SEM = 15 × √(1 − 0.96) = 15 × 0.2 = 3 points

That 3-point figure is the key number. It means: if you took the WAIS-IV many times, about 68% of your scores would fall within 3 points of your true latent IQ, and 95% of them would fall within about 6 points.

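In practice, psychologists and clinical test publishers typically report SEM values between 2.16 and 2.5 points for the WAIS-IV Full Scale IQ across different age groups. These are smaller than the 3 points in the worked example because they correspond to reliabilities of roughly 0.97 to 0.98 rather than 0.96. At 95% confidence (approximately ±2 SEM), this gives an error band of roughly ±5 points.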
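The SEM arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a clinical scoring procedure: the function names `standard_error` and `score_interval` are invented for this example, and the 1.96 multiplier assumes measurement errors are roughly normally distributed.

```python
import math


def standard_error(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def score_interval(score: float, sem: float, z: float = 1.96) -> tuple[float, float]:
    """Return (low, high) for score +/- z * SEM; z = 1.96 gives roughly 95% coverage."""
    margin = z * sem
    return score - margin, score + margin


# Worked example from the text: SD = 15, reliability = 0.96.
sem_example = standard_error(15, 0.96)        # about 3 points
low, high = score_interval(118, sem_example)  # about 112.1 to 123.9

# With an SEM of 2.5 (the upper end of the published range quoted above),
# the band narrows to about 113.1 to 122.9, i.e. roughly +/-5 points.
low_pub, high_pub = score_interval(118, 2.5)

print(f"SEM = {sem_example:.2f}, 95% band = {low:.1f} to {high:.1f}")
```

Passing the SEM in separately makes it easy to compare the textbook formula with whatever SEM a test manual actually reports.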

What this means for your score

Here's how to read your score with the error band applied:

Measured score | True score (95% confidence) | What this tells you
100            | 95 - 105                    | Solidly in the Average range
115            | 110 - 120                   | High Average, possibly touching Superior
125            | 120 - 130                   | Superior, possibly touching Very Superior
132            | 127 - 137                   | Very Superior; likely over the Mensa threshold

The honest way to report an IQ score, if you're being technically precise, is “score plus confidence interval.” Clinical reports typically write it as “118 (95% CI: 113-123).” That's not fudging — it's acknowledging what every measurement tool can and can't do.

The practical implication: if your score is near a meaningful boundary (e.g. 128 near the 130 gifted threshold, or 119 near the 120 Superior threshold), don't treat the classification as fixed. Your true ability is genuinely uncertain within that window, and a retest could plausibly land you on either side.
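To make the boundary point concrete, here is a small sketch that lists every classification band a ±5 window touches. The 120 and 130 cut-offs come from the text; the 110 and 90 cut-offs, the "Below Average" catch-all, and the function names are assumptions added for illustration, and scores are treated as whole numbers.

```python
# Lower bounds for each label. 120 and 130 are named in the article;
# 110 and 90 are assumed here for the sake of the example.
BANDS = [
    (130, "Very Superior"),
    (120, "Superior"),
    (110, "High Average"),
    (90, "Average"),
]


def band_for(score: int) -> str:
    """Return the label of the highest band whose lower bound the score meets."""
    for lower, label in BANDS:
        if score >= lower:
            return label
    return "Below Average"  # assumed catch-all for everything under 90


def bands_spanned(measured: int, margin: int = 5) -> list[str]:
    """List each distinct band touched by measured - margin .. measured + margin."""
    labels: list[str] = []
    for score in range(measured - margin, measured + margin + 1):
        label = band_for(score)
        if label not in labels:
            labels.append(label)
    return labels


print(bands_spanned(100))  # a single band
print(bands_spanned(128))  # straddles the 130 threshold
```

A score whose window returns more than one label is exactly the case where the classification should not be treated as fixed.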

Why your score can change between tests

If you've taken two IQ tests and gotten different numbers, you're not imagining things. Three sources of variation are at play:

1. Pure measurement error (≈±3 to ±5 points)

The SEM we just discussed. This is inherent to any test and can't be eliminated. Even if you were identical on both days, the questions on the test happen to sample from a slightly different corner of your ability. Some days the items favour your strengths; other days they don't.

2. State factors (≈±5 to ±15 points)

These are temporary conditions that affect your test-day performance:

  • Sleep — sleep deprivation can depress scores by 5-10 points
  • Anxiety — research on stereotype threat shows effects of up to 15 points
  • Caffeine, timing of food, hydration — small but real effects
  • Motivation — effort is genuinely optional on most tests
  • Distraction and environment — noise, temperature, screen size

None of these change your underlying cognitive ability. They're factors that shift your performance on one specific day. A rested, focused, motivated test-taker will typically score 5-10 points higher than a tired, distracted, indifferent one with the same true ability.

3. Practice effects (≈+3 to +8 points)

If you take the same test twice, or even different IQ tests close together, you typically score higher on the second. Research suggests a bump of 3-8 points on retest within a year. This is why clinicians wait at least a year before re-administering the same test — the practice effect contaminates the score.

Practice effects decay over longer intervals but don't fully disappear. Your second, third, and fourth lifetime IQ tests will typically show this effect, smaller each time.

How stable is IQ over a lifetime?

Extremely stable, after early adolescence. This is one of the most robust findings in psychology. The classic demonstration comes from the Scottish Mental Survey — a 1932 study that tested nearly every 11-year-old in Scotland. In 1998, researchers tracked down as many of the survivors as they could and gave them the same test at age 77. The correlation between the 11-year-old score and the 77-year-old score was 0.66.

Extended follow-ups at age 90 still produced correlations around 0.54. Put differently: a correlation of 0.66 means the childhood score accounts for roughly 44% of the variance in scores at age 77 (0.66² ≈ 0.44), across more than six decades of life events. IQ is one of the most stable psychological traits ever measured.

Some individual variation still happens. Serious illness, head injury, severe chronic stress, or meaningful educational interventions can shift scores somewhat. Ritchie and Tucker-Drob (2018) estimate that each additional year of formal education adds roughly 1-5 points to measured IQ. Dementia can reduce scores substantially in later life. But in the absence of major life events, your IQ at 25 is a pretty good predictor of your IQ at 75.

How do online tests compare?

Online tests fall into three categories, and their accuracy varies dramatically across them:

Type                                              | Reliability             | What it tells you
Clinical WAIS / Stanford-Binet (in-person)        | 0.95 - 0.98             | Your true IQ within ±5 points at 95% confidence
Well-designed online assessment (30-50 questions) | 0.85 - 0.92             | Your IQ within ±8-10 points at 95% confidence
Short online quiz (5-15 questions)                | 0.50 - 0.75 (estimated) | Very noisy estimate; could be off by 20+ points

The key variable is test length. Longer tests sample more items and therefore produce more stable estimates. This is pure statistics — if you roll a die 5 times your average might be anywhere from 1 to 6, but if you roll it 50 times your average will almost certainly be close to 3.5. Same principle applies to cognitive tasks.
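The die-rolling analogy is easy to check with a quick simulation. The sketch below repeats the "roll and average" experiment many times and measures how much the averages scatter; the function name, trial count, and fixed seed are arbitrary choices for illustration.

```python
import random
import statistics


def spread_of_averages(rolls_per_trial: int, trials: int = 10_000, seed: int = 0) -> float:
    """Standard deviation of the per-trial average over many simulated trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    averages = [
        statistics.fmean([rng.randint(1, 6) for _ in range(rolls_per_trial)])
        for _ in range(trials)
    ]
    return statistics.pstdev(averages)


short = spread_of_averages(5)   # averages of 5 rolls scatter widely
long = spread_of_averages(50)   # averages of 50 rolls cluster near 3.5

print(f"spread with 5 rolls:  {short:.2f}")
print(f"spread with 50 rolls: {long:.2f}")
```

With ten times as many rolls, the scatter should shrink by a factor of about √10 ≈ 3.2, which is the statistical reason longer tests give steadier scores.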

This is why clinical tests take 60-90 minutes: they need enough items to minimise the measurement error. A 3-minute quiz can't produce a reliable score regardless of how the questions are designed.
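Classical test theory has a standard formula for this relationship, the Spearman-Brown prediction formula, which the article does not name but which fits the argument. The sketch below applies it under its usual assumption that the added questions are of the same quality as the existing ones; the 10-item quiz with reliability 0.60 is a made-up starting point.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is made length_factor times longer.

    Assumes the added items behave like the existing ones (parallel items).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a 10-item quiz with reliability 0.60,
# extended to 50 comparable items (5 times as long).
predicted = spearman_brown(0.60, 5)
print(f"predicted reliability at 50 items: {predicted:.2f}")
```

The formula works in reverse too: a `length_factor` below 1 shows how quickly reliability drops when a long test is cut down to a short quiz.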

For context: our assessment uses 50 questions across verbal, numerical, spatial, and memory reasoning — enough items to deliver a substantially more reliable estimate than a short quiz, while staying completable in one sitting. You can take it here and see your score with a percentile breakdown across the four domains.

When to take a score seriously (and when not to)

Given everything above, here's a practical framework for interpreting any IQ result:

  • Trust most: professionally administered WAIS-IV / WAIS-V / Stanford-Binet, with a qualified examiner, no major state factors. Score is within ±5 points of your true IQ.
  • Trust moderately: a substantial online assessment (30+ questions, clear methodology, reasonable time commitment). Score is within ±8-10 points of your true IQ.
  • Trust with caution: multiple online tests returning roughly similar scores. Averaging several moderate-quality results can produce a reasonably accurate estimate even when any individual one isn't great.
  • Trust very little: any test that took less than 15 minutes, any test that charged you to see your score, any test advertising a specific high score as its outcome (“Your IQ is 149!”). These produce noise, not signal.
  • Don't trust at all: Facebook quizzes, apps on your phone, any test that also asks you to give your email or pay upfront to see the number. These are engagement products, not measurements.
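The "trust with caution" point about averaging several results can be sketched numerically. The calculation below assumes each test has the same SEM and that their errors are independent, which practice effects and shared test-day conditions will violate to some degree; the SEM of 5 points and the function name are illustrative, not figures from any particular test.

```python
import math


def sem_of_average(single_test_sem: float, n_tests: int) -> float:
    """SEM of the mean of n_tests scores, assuming equal and independent errors."""
    return single_test_sem / math.sqrt(n_tests)


# Hypothetical: four moderate-quality tests, each with an SEM of 5 points.
single_sem = 5.0
combined_sem = sem_of_average(single_sem, 4)

print(f"one test:   +/-{1.96 * single_sem:.1f} points at 95%")
print(f"four tests: +/-{1.96 * combined_sem:.1f} points at 95%")
```

Because real retests are not fully independent, the actual gain from averaging will be smaller than this idealised figure.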

The practical takeaway

IQ tests are genuinely among the most accurate psychological measurements that exist. But no measurement is perfect, and a single score always comes with a confidence interval around it. The honest read of any IQ number is:

  • Plus or minus 3-5 points for measurement error
  • Plus or minus 5-15 points for state factors on test day
  • Plus 3-8 points for practice effects on retest
  • Much larger uncertainty if the test was short, free, or poorly designed

So a measured score of 120 is best read as “my true underlying IQ is probably somewhere between the high 110s and the mid-120s: most likely in Superior territory, possibly High Average.” That's more accurate than the false precision of “my IQ is 120.”

If your measured score matters — for career planning, gifted programme eligibility, Mensa qualification, or personal curiosity — take a substantial test and understand the confidence interval. A short online quiz followed by strong life decisions is the worst combination; a proper assessment treated as an estimate rather than a fingerprint is the best.

See your own score with the error band

Our 50-question assessment covers verbal, numerical, spatial, and memory reasoning, four domains that overlap with the abilities measured by clinical tests like the WAIS. You get your score, percentile, and a domain-by-domain breakdown so you can see where you're strongest.

Frequently asked questions

How accurate is an IQ score?

Professionally administered IQ tests have reliability of 0.95 or higher — better than most medical tests. Every score carries about ±5 points of uncertainty at 95% confidence, so a score of 120 means “true IQ probably 115-125.”

What is the standard error of measurement on the WAIS?

The WAIS-IV Full Scale IQ has an SEM of approximately 2.16 to 2.5 points depending on age group. That gives a 95% confidence interval of about ±5 points around any observed score.

Why did my IQ score change between tests?

Three reasons: measurement error (±3-5 points), test-day factors like sleep and anxiety (±5-15 points), and practice effects on retest (+3-8 points). A 10-point difference between two IQ scores is completely normal and doesn't indicate your IQ changed.

How reliable are online IQ tests?

Well-designed online tests with 30+ questions can achieve reliability of 0.85-0.92 — good but slightly less than a full clinical test. Short online quizzes (under 15 questions) have poor reliability and shouldn't be used for any meaningful decision.

Does IQ change over a lifetime?

Very little after early adolescence. The classic Scottish Mental Survey found a correlation of 0.66 between IQ measured at age 11 and the same test at age 77 — one of the most stable psychological findings in the literature.

Sources

  • Wechsler, D. (2008). Wechsler Adult Intelligence Scale–Fourth Edition (WAIS-IV) Technical and Interpretive Manual. San Antonio, TX: Pearson.
  • Deary, I. J., Whiteman, M. C., Starr, J. M., Whalley, L. J., & Fox, H. C. (2004). The impact of childhood intelligence on later life: Following up the Scottish Mental Surveys of 1932 and 1947. Journal of Personality and Social Psychology, 86(1), 130-147.
  • Ritchie, S. J., & Tucker-Drob, E. M. (2018). How much does education improve intelligence? A meta-analysis. Psychological Science, 29(8), 1358-1369.
  • Crocker, L., & Algina, J. (1986). Introduction to Classical and Modern Test Theory. New York: Holt, Rinehart and Winston.
  • Kaufman, A. S. (2009). IQ Testing 101. New York: Springer Publishing.
  • Steele, C. M., & Aronson, J. (1995). Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69(5), 797-811.