Summer Research Fellowship Programme of India's Science Academies

R Code for Distributions and Quantiles of Sample Correlation Coefficients in Bivariate Normal

Souvik Nath

Pondicherry University, Kalapet, Puducherry 605014 

Dr Rituparna Sen

Indian Statistical Institute, Chennai Centre, Chennai 600029


In 1915, Fisher derived the distribution of the sample correlation coefficient ‘r’, building on the foundations laid by Sir Francis Galton and Karl Pearson. In 1931, Pearson suggested that Ms. F. N. David take up the task of tabulating the areas and ordinates of the distribution of r for any sample size N and any population correlation coefficient ρ. In this work, we build a package in the programming language R that incorporates the tables designed by Ms. David, so that these tabulated values are easily accessible to everyone through R. To this end, we carry out a comparative study of two approaches to computing the PDF and CDF of the distribution of ‘r’: simulation from the bivariate normal distribution, and direct evaluation of Fisher’s formula and Hotelling’s formula. For the latter, we use two numerical integration techniques, Riemann approximation and quadrature formulas, and examine which of them agrees more closely with the values tabulated by Ms. David while taking less execution time. Finally, we incorporate all of these methods into the R package. The package also returns the critical region of the distribution of ‘r’, used for testing a one-sided or two-sided hypothesis for a given population correlation coefficient ρ₀ and sample size N. Note that all our results are exact for the given sample size N and are not asymptotic approximations.
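As a rough illustration of the Riemann-approximation route described above, the following minimal sketch evaluates the exact density of r in the hypergeometric form commonly attributed to Hotelling and integrates it with a midpoint Riemann sum to obtain the CDF. It is written in Python using only the standard library and is not the package's R code; the function names (hyp2f1, r_density, r_cdf), the number of steps, and the series tolerance are assumptions made for this example. It assumes N ≥ 4 and |ρ| < 1.

```python
import math


def hyp2f1(a, b, c, z, tol=1e-15, max_terms=10_000):
    """Gauss hypergeometric series 2F1(a, b; c; z); used here only for 0 <= z < 1."""
    term, total = 1.0, 1.0
    for k in range(max_terms):
        # Ratio of consecutive series terms: (a+k)(b+k) / ((c+k)(k+1)) * z
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def r_density(r, n, rho):
    """Exact density of the sample correlation r for sample size n (hypothetical helper)."""
    if abs(r) >= 1.0:
        return 0.0
    # Work on the log scale so that the gamma functions do not overflow for large n.
    log_const = (math.log(n - 2) + math.lgamma(n - 1) - math.lgamma(n - 0.5)
                 - 0.5 * math.log(2 * math.pi))
    log_kernel = (0.5 * (n - 1) * math.log(1 - rho ** 2)
                  + 0.5 * (n - 4) * math.log(1 - r ** 2)
                  - (n - 1.5) * math.log(1 - rho * r))
    series = hyp2f1(0.5, 0.5, n - 0.5, 0.5 * (1 + rho * r))
    return math.exp(log_const + log_kernel) * series


def r_cdf(x, n, rho, steps=2_000):
    """P(r <= x), approximated by a midpoint Riemann sum over [-1, x]."""
    x = max(-1.0, min(1.0, x))
    h = (x + 1.0) / steps
    return h * sum(r_density(-1.0 + (i + 0.5) * h, n, rho) for i in range(steps))


if __name__ == "__main__":
    # Ordinate and area at r = 0.3 for N = 10 and rho = 0.5
    print(r_density(0.3, 10, 0.5), r_cdf(0.3, 10, 0.5))
```

A critical value for a one-sided test could then be found by searching for the x at which r_cdf(x, N, ρ₀) reaches the desired level, for instance by bisection. A quadrature rule would replace only the summation inside r_cdf.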

Keywords: Samples, Correlation Coefficient, Normal Distribution, Hypothesis Testing.
