Power & Sample Size Calculator
Use this advanced tool to calculate the sample size required for a one-sample statistic or for differences between two proportions or means (two independent samples). Multiple test groups are supported for binomial data. Calculate power given sample size, alpha, and the minimum detectable effect (MDE).
- Using the power & sample size calculator
- The importance of sample size determination
- Statistical power explained
- Sample size formula
- Types of null and alternative hypotheses in significance tests
- Absolute versus relative difference and why it matters in power analysis
Using the power & sample size calculator
This tool allows the evaluation of different statistical designs when planning an experiment (trial, test) which utilizes a Null-Hypothesis Statistical Test to make inferences. It can be used both as a sample size calculator and as a statistical power calculator. Usually one would determine the sample size required given a particular power requirement. In cases where there is a predetermined sample size one can instead calculate power for a given effect size of interest (a.k.a. perform a power analysis).
Parameters for sample size and power calculation
1. Number of test groups. The power and sample size calculator supports experiments in which one is gathering data on a single sample in order to compare it to a general population or known reference value (one-sample), as well as ones where a control group is compared to one or more treatment groups (two-sample, k-sample) in order to detect differences between them. For comparing more than one treatment group to a control group sample size adjustments based on the Dunnett's correction are applied. These are only approximately accurate and subject to the assumption of approximately equal effect size in all k groups and only support equal sample sizes in all groups. Power calculations are not currently supported for more than one treatment group.
2. Type of outcome. The outcome of interest can be the absolute difference of two proportions (binomial data, e.g. conversion rate or event rate), the absolute difference of two means (continuous data, e.g. height, weight, speed, time, revenue, etc.), or the relative difference between two proportions or two means (percent difference, percent change, etc.). See Absolute versus relative difference for more.
3. Baseline The mean under H0 is the parameter value one would expect to see if all experiment participants were assigned to the control group. It is the mean one expects to observe if the treatment has no effect whatsoever. If entering means data, specify the mean under the null hypothesis and the standard deviation of the data either from a known population or estimated from a sample.
4. Minimum Detectable Effect. The minimum effect of interest, which is often called the minimum detectable effect (MDE, but more accurately: MRDE, minimum reliably detectable effect) should be a difference one would not like to miss, if it existed. Enter it as a proportion (e.g. 0.10) or as a percentage (e.g. 10%). It is always relative to the mean/proportion under H0 ± the superiority/non-inferiority or equivalence margin. For example, if the mean under H0 is 10 and there is a superiority alternative hypothesis with a superiority margin of 1 and the minimum effect of interest relative to the baseline is 3, then enter an MDE of 2, since the MDE plus the superiority margin will equal exactly 3. In this case the MDE is calculated relative to the mean under H0 plus the superiority margin.
5. Type of alternative hypothesis. The sample size calculator supports superiority, non-inferiority and equivalence alternative hypotheses. The equivalence margin cannot be zero. See Types of null and alternative hypothesis for an in-depth explanation.
6. Maximum error rates. The type I error rate, α, should always be provided. It is the cut-off point for p-value calcutions and also equals 1 - confidence level of a corresponding confidence interval. Power, calculated as 1 - β, where β is the type II error rate, is only required when determining a sample size. For an in-depth explanation of power see What is statistical power below.
Calculator output
The output consists of the sample size of a single test group, as well as the total sample size required across all groups. If used in power analysis mode it will output the power as a proportion and as a percentage.
The importance of sample size determination
Estimating the required sample size before running an experiment that will be judged by a statistical test (a test of significance, confidence interval, etc.) allows one to:
- determine the sample size needed to detect an effect of a given size with a given probability
- estimate the magnitude of the effect that can be reliably detected with a certain sample size
- calculate statistical power for a given sample size and effect size of interest
This is crucial information with regards to making a test cost-efficient. Having a proper sample size can sometimes be the difference between conducting the experiment or postponing it for when one can afford a sample large enough to ensure high probability to detect an effect of practical significance.
For example, if a medical trial has low statistical power of say less than 80% (β = 0.2) for a given minimum effect of interest, then it might be unethical to conduct it due to the low probability of rejecting the null hypothesis and establishing the effectiveness of the treatment, even if it is effective. The same holds for experiments in physics, psychology, economics, marketing, conversion rate optimization, etc. A good power analysis is just a part of the process of balancing the risks and rewards and securing the viability of an experiment for stakeholders.
Statistical power explained
Statistical power is the probability of rejecting a false null hypothesis with a given level of statistical significance, against a particular alternative hypothesis. Alternatively, it can be said to be the probability to detect a true effect of a certain magnitude with a given level of significance. This is the correct way to interpret the output of the tool in "power calculator" mode. Power is inversely related to the type II error rate of a test: β since it is equal to (1 - β). In probability notation the Type II error of a test at a given point alternative μ1 is expressed as the following equation [1]:
β(Tα; μ1) = P(d(X) ≤ cα; μ = μ1)
It makes it clear that the type II error rate is calculated at a given point of the parameter space, and does not cover the entire range of possible parameter values. Similarly, is present in the formula for statistical power since POW = 1 - β [1]:
POW(Tα; μ1) = P(d(X) > cα; μ = μ1)
In the equations above cα represents the critical value for rejecting the null (significance threshold), d(X) is a statistical function of the parameter of interest - usually a transformation to a standardized score, and μ1 is a specific value of interest from the space of the alternative hypothesis.
One can also calculate and plot the whole power function by doing a power calculation at many different points under the alternative hypotheses. Due to the S-shape of the function, power quickly rises to nearly 100% for effect sizes larger than the MDE, while it decreases more gradually to zero for effect sizes smaller than the MDE. Such a power function plot is not yet supported by our statistical software, but one can calculate the power at a few key points (e.g. 10%, 20% ... 90%, 100%) and connect them for a rough approximation.
Statistical power is directly and inversely related to the significance threshold. At the zero effect point for a simple superiority alternative hypothesis power is exactly 1 - α as can be easily demonstrated with our power calculator. At the same time power is positively related to the number of observations, so increasing the sample size will increase the power for a given effect size, assuming all other parameters remain the same.
Post-hoc power (Observed power)
A power calculation can be useful even after a test has been completed since failing to reject the null can be used as an argument for the null and against particular alternative hypotheses to the extent to which the test had power to reject them. This is more explicitly defined in the severe testing concept proposed by Mayo & Spanos (2006) [1].
Computing observed power is only useful if there was no rejection of the null hypothesis and one is interested in estimating how probative the test was towards the null. It is absolutely useless to compute post-hoc power for a test which resulted in a statistically significant effect being found [5]. If the effect is significant, then the test had enough power to detect it. In fact, there is a 1 to 1 inverse relationship between observed power and statistical significance, so one gains nothing from calculating post-hoc power, e.g. a test planned for α = 0.05 that passed with a p-value of just 0.0499 will have exactly 50% observed power (observed β = 0.5).
It is strongly encouraged to use this power analysis calculator to compute observed power in the former case, but strongly discouraged in the latter.
Sample size formula
The formula for calculating the sample size of a test group in a one-sided test of absolute difference is:

where Z1-α is the Z-score corresponding to the selected statistical significance threshold α, Z1-β is the Z-score corresponding to the selected statistical power 1-β, σ is the known or estimated standard deviation, and δ is the minimum effect size of interest. The formula makes use of the Z-distribution (normal distribution). The standard deviation is estimated analytically in calculations for proportions, and empirically from the raw data for other types of means.
The formula applies to single-sample tests as well as to tests of absolute difference between two means. A proprietary modification is employed in our sample size calculator to calculate the sample size for a test of relative difference. This modification has been extensively tested through simulations in a variety of scenarios.
Types of null and alternative hypotheses in significance tests
When doing sample size calculations, it is important that the null hypothesis (H0, the hypothesis being tested) and the alternative hypothesis is (H1) are well thought out. The test can reject the null or it can fail to reject it. Strictly logically speaking it cannot lead to acceptance of the null or to acceptance of the alternative hypothesis. A null hypothesis can be a point hypothesis or a composite hypothesis covering many possible values, usually from -∞ to some value or from some value to +∞. The alternative hypothesis can also be a point one or a composite one.
In a Neyman-Pearson framework of NHST (Null-Hypothesis Statistical Test) the alternative should exhaust all values that do not belong to the null, so it is usually composite. Below is an illustration of common combinations of null and alternative statistical hypotheses: superiority, non-inferiority, strong superiority (margin > 0), equivalence, all of which are supported in our power analysis calculator.

Careful consideration has to be made when deciding on a non-inferiority margin, superiority margin or an equivalence margin. Equivalence trials are sometimes used in clinical trials where a drug can be performing equally (within some bounds) to an existing drug but can still be preferred due to less or less severe side effects, cheaper manufacturing, or other benefits, however, non-inferiority designs are more common. Similar cases exist in disciplines such as conversion rate optimization [2] and other business applications where benefits not measured by the primary outcome of interest can influence the adoption of a given solution. For equivalence tests it is assumed that they will be evaluated using two one-sided t-tests (TOST) or z-tests, or confidence intervals.
Note that our calculator does not support the schoolbook case of a point null and a point alternative, nor a point null and an alternative that covers all the remaining values. This is since such cases are non-existent in experimental practice [3][4]. The only two-sided calculation is for the equivalence alternative hypothesis, all other calculations are one-sided (one-tailed).
Absolute versus relative difference and why it matters in power analysis
When doing a sample size calculation it is important to know what kind of inference one is looking to make: about the absolute or about the relative difference, often called percent effect, percentage effect, relative change, percent lift, etc. Where the fist is μ1 - μ the second is μ1-μ / μ or μ1-μ / μ x 100 (%). The division by μ is what adds more variance to such an estimate, since μ is just another variable with random error, therefore a test for relative difference will require larger sample size than a test for absolute difference. Consequently, if sample size is fixed, there will be less power for the relative change equivalent to any given absolute change.
For the above reason it is important to know and state beforehand if one is going to be interested in percentage change or if absolute change is of primary interest. Then it is just a matter of flipping a radio button.
References
1Mayo D.G., Spanos A. (2010). "Error Statistics". in P. S. Bandyopadhyay & M. R. Forster (Eds.), Philosophy of Statistics, (7, 152–198). Handbook of the Philosophy of Science. The Netherlands: Elsevier.
6Spanos A. (2019). "Probability Theory and Statistical Inference". Cambridge University Press. pp. 396-404, doi: 10.1017/9781316882825
Cite this calculator & page
Cite results from this online calculator or information on this page by choosing a citation format:
Georgiev, G.Z. (n.d.). Sample Size Calculator. GIGAcalculator.com. Retrieved Jun 27, 2026, from https://www.gigacalculator.com/calculators/power-sample-size-calculator.php
Our statistical calculators are featured in 400+ scientific papers (Google Scholar ) published in high-profile science journals by:













Georgi Georgiev is an applied statistician with background in statistical analysis of online controlled experiments, including developing statistical software, writing over one hundred articles and papers, as well as the influential book "Statistical Methods in Online A/B Testing".