Class: Base
Source Location: /PHPExcel/Shared/JAMA/examples/Stats.php

 A class to calculate descriptive statistics from a data set. 
 
 
 Class Details
[line 119 ] 
A class to calculate descriptive statistics from a data set.

Data sets can be simple arrays of data, or a cumulative hash. The second form is useful when passing large data sets. For example, the data set:

 $data1 = array (1,2,1,1,1,1,3,3,4.1,3,2,2,4.1,1,1,2,3,3,2,2,1,1,2,2);

can be expressed more compactly as:

 $data2 = array('1'=>9, '2'=>8, '3'=>5, '4.1'=>2);

Example of use:

 include_once 'Math/Stats.php';
 $s = new Math_Stats();
 $s->setData($data1);
 // or
 // $s->setData($data2, STATS_DATA_CUMMULATIVE);
 $stats = $s->calcBasic();
 echo 'Mean: '.$stats['mean'].' StDev: '.$stats['stdev']."\n";
 // using data with nulls
 // first ignoring them:
 $data3 = array(1.2, 'foo', 2.4, 3.1, 4.2, 3.2, null, 5.1, 6.2);
 $s->setNullOption(STATS_IGNORE_NULL);
 $s->setData($data3);
 $stats3 = $s->calcFull();
 // and then assuming nulls == 0
 $s->setNullOption(STATS_USE_NULL_AS_ZERO);
 $s->setData($data3);
 $stats3 = $s->calcFull();

 Originally this class was part of NumPHP (Numeric PHP package).
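
The relationship between the two data forms described above can be sketched as follows. This is a minimal illustration, not the class's own code: the helper names expand_cumulative() and cumulative_mean() are hypothetical and do not appear in the documented API.

```php
<?php
// Hypothetical helper (not part of the documented class): expands a
// cumulative hash (value => count) into a simple list of values.
function expand_cumulative(array $hash): array
{
    $out = [];
    foreach ($hash as $value => $count) {
        for ($i = 0; $i < $count; $i++) {
            // Array keys such as '4.1' are strings, so cast to float.
            $out[] = (float) $value;
        }
    }
    return $out;
}

// Hypothetical helper: mean computed directly from the cumulative form,
// weighting each value by its count instead of expanding the data.
function cumulative_mean(array $hash): float
{
    $sum = 0.0;
    $n = 0;
    foreach ($hash as $value => $count) {
        $sum += (float) $value * $count;
        $n += $count;
    }
    if ($n === 0) {
        throw new InvalidArgumentException('empty data set');
    }
    return $sum / $n;
}

$data2 = array('1' => 9, '2' => 8, '3' => 5, '4.1' => 2);
$expanded = expand_cumulative($data2);
echo count($expanded), "\n";        // prints 24, the size of $data1
echo cumulative_mean($data2), "\n"; // same mean as the expanded list
```

Working on the cumulative form directly avoids building the expanded list, which is the point of the compact representation for large data sets.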
 
 
 
 Class Variables
 
 Class Methods
  
 method absDev [line 750]
    
	
		Calculates the absolute deviation of the data points in the set. Handles cumulative data sets correctly.
 method absDevWithMean [line 773]
    mixed absDevWithMean(numeric $mean)

		Calculates the absolute deviation of the data points in the set given a fixed mean (average) value. Not used in calcBasic(), calcFull() or calc(). Handles cumulative data sets correctly.
 method calc [line 326]
    mixed calc(int $mode, [boolean $returnErrorObject = true])

		Calculates the basic or full statistics for the data set.
 method calcBasic [line 349]
    mixed calcBasic([boolean $returnErrorObject = true])

		Calculates a basic set of statistics.
 method calcFull [line 373]
    mixed calcFull([boolean $returnErrorObject = true])

		Calculates a full set of statistics.
 method center [line 295]
    
	
		Transforms the data by subtracting the mean from each entry. This will reset all pre-calculated values to their original (unset) defaults.
 method coeffOfVariation [line 1109]
    mixed coeffOfVariation()

		Calculates the coefficient of variation of a data set. The coefficient of variation measures the spread of a set of data as a proportion of its mean. It is often expressed as a percentage. Handles cumulative data sets correctly.
 method count [line 599]
    
	
		Calculates the number of data points in the set. Handles cumulative data sets correctly.
 method frequency [line 1171]
    
	
		Calculates the value frequency table of a data set. Handles cumulative data sets correctly.
 method geometricMean [line 965]
    
	
		Calculates the geometric mean of the data points in the set. Handles cumulative data sets correctly.
 method getData [line 217]
    mixed getData([boolean $expanded = false])

		Returns the data, which might have been modified according to the current null handling options.
 method harmonicMean [line 995]
    
	
		Calculates the harmonic mean of the data points in the set. Handles cumulative data sets correctly.
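
The page does not state the formulas behind geometricMean() and harmonicMean(). The sketch below uses the standard textbook definitions and assumes strictly positive values. The function names are hypothetical and the class's own implementation may differ.

```php
<?php
// Hypothetical helpers for illustration; not the documented class methods.

// Geometric mean: the nth root of the product of n positive values,
// computed through logarithms to avoid overflow on long data sets.
function sketch_geometric_mean(array $values): float
{
    $n = count($values);
    if ($n === 0) {
        throw new InvalidArgumentException('empty data set');
    }
    $logSum = 0.0;
    foreach ($values as $x) {
        if ($x <= 0) {
            throw new InvalidArgumentException('values must be positive');
        }
        $logSum += log($x);
    }
    return exp($logSum / $n);
}

// Harmonic mean: n divided by the sum of the reciprocals.
function sketch_harmonic_mean(array $values): float
{
    $n = count($values);
    if ($n === 0) {
        throw new InvalidArgumentException('empty data set');
    }
    $reciprocalSum = 0.0;
    foreach ($values as $x) {
        if ($x <= 0) {
            throw new InvalidArgumentException('values must be positive');
        }
        $reciprocalSum += 1 / $x;
    }
    return $n / $reciprocalSum;
}

echo sketch_geometric_mean([1, 4, 16]), "\n"; // approximately 4
echo sketch_harmonic_mean([1, 2, 4]), "\n";   // 12/7, approximately 1.714
```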
 method interquartileMean [line 1235]
    mixed interquartileMean()

		The interquartile mean is defined as the mean of the values left after discarding the lower 25% and top 25% ranked values, i.e.:

		   interquart mean = mean(<P(25),P(75)>)

		   where: P = percentile
 method interquartileRange [line 1273]
    mixed interquartileRange()

		The interquartile range is the distance between the 75th and 25th percentiles. It is the range of the middle 50% of the data set, and thus is not affected by outliers or extreme values.

		   interquart range = P(75) - P(25)

		   where: P = percentile
 method kurtosis [line 832]
    
	
		Calculates the kurtosis of the data distribution in the set. The kurtosis measures the degree of peakedness of a distribution. It is also called the "excess" or "excess coefficient", and is a normalized form of the fourth central moment of a distribution. A normal distribution has a kurtosis = 0. A narrow and peaked (leptokurtic) distribution has a kurtosis > 0. A flat and wide (platykurtic) distribution has a kurtosis < 0. Handles cumulative data sets correctly.
 method Math_Stats [line 176]
    object Math_Stats Math_Stats([optional $nullOption = STATS_REJECT_NULL])

		Constructor for the class.
 method max [line 451]
    
	
		Calculates the maximum of a data set. Handles cumulative data sets correctly.
 method mean [line 624]
    
	
		Calculates the mean (average) of the data points in the set. Handles cumulative data sets correctly.
 method median [line 864]
    
	
		Calculates the median of a data set. The median is the value such that half of the points are below it in a sorted data set. If the number of values is odd, it is the middle item. If the number of values is even, it is the average of the two middle items. Handles cumulative data sets correctly.
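
The odd and even cases described for the median can be sketched as below, for a simple (non-cumulative) array only. The function name sketch_median() is hypothetical, and this is not the class's own code.

```php
<?php
// Hypothetical helper: median of a simple (non-cumulative) numeric array.
function sketch_median(array $values): float
{
    $n = count($values);
    if ($n === 0) {
        throw new InvalidArgumentException('empty data set');
    }
    sort($values);              // ascending order, re-indexed from 0
    $mid = intdiv($n, 2);
    if ($n % 2 === 1) {
        // Odd count: the middle item.
        return (float) $values[$mid];
    }
    // Even count: the average of the two middle items.
    return ($values[$mid - 1] + $values[$mid]) / 2;
}

echo sketch_median([3, 1, 2]), "\n";    // prints 2
echo sketch_median([4, 1, 3, 2]), "\n"; // prints 2.5
```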
 method midrange [line 940]
    
	
		Calculates the midrange of a data set. The midrange is the average of the minimum and maximum of the data set. Handles cumulative data sets correctly.
 method min [line 427]
    
	
		Calculates the minimum of a data set. Handles cumulative data sets correctly.
 method mode [line 900]
    
	
		Calculates the mode of a data set. The mode is the value with the highest frequency in the data set. There can be more than one mode. Handles cumulative data sets correctly.
 method percentile [line 1389]
    mixed percentile(numeric $p)

		The pth percentile is the value such that p% of a sorted data set is smaller than it, and (100 - p)% of the data is larger. A quick algorithm to pick the appropriate value from a sorted data set is as follows:

		   - Count the number of values: n
		   - Calculate the position of the value in the data list: i = p * (n + 1)
		   - If i is an integer, return the data at that position
		   - If i < 1, return the minimum of the data set
		   - If i > n, return the maximum of the data set
		   - Otherwise, average the entries at adjacent positions to i

		The median is the 50th percentile value.
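
The percentile steps can be sketched as follows for a simple array. Two assumptions are made here that the page does not confirm: p is passed as a percentage (0 to 100), so the position is scaled as i = p * (n + 1) / 100, and positions are counted from 1. The range checks run before the integer check so that an out-of-range integer position (for example p = 100) still returns the maximum. The name sketch_percentile() is hypothetical.

```php
<?php
// Hypothetical helper: percentile of a simple numeric array, following
// the documented steps. Assumes $p is a percentage between 0 and 100.
function sketch_percentile(array $values, float $p): float
{
    $n = count($values);
    if ($n === 0) {
        throw new InvalidArgumentException('empty data set');
    }
    sort($values);

    // 1-based position in the sorted list (assumed scaling by 100).
    $i = $p * ($n + 1) / 100;

    if ($i < 1) {
        return (float) $values[0];        // below the list: minimum
    }
    if ($i > $n) {
        return (float) $values[$n - 1];   // beyond the list: maximum
    }
    $lower = (int) floor($i);
    if ($i == $lower) {
        return (float) $values[$lower - 1]; // exact position
    }
    // Between two positions: average the adjacent entries.
    return ($values[$lower - 1] + $values[$lower]) / 2;
}

echo sketch_percentile([1, 2, 3, 4, 5], 50), "\n"; // prints 3
echo sketch_percentile([1, 2, 3, 4, 5], 25), "\n"; // prints 1.5
```

This sketch compares the position with its floor exactly, so it does not guard against floating-point rounding for unusual values of p.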
 method product [line 546]
    
	
		Calculates PROD { (xi) }, the product of all observations. Handles cumulative data sets correctly.
 method productN [line 567]
    mixed productN(numeric $n)

		Calculates PROD { (xi)^n }, which is the product of all observations, each raised to the nth power. Handles cumulative data sets correctly.
 method quartileDeviation [line 1298]
    mixed quartileDeviation()

		The quartile deviation is half of the interquartile range value.

		   quart dev = (P(75) - P(25)) / 2

		   where: P = percentile
 method quartiles [line 1199]
    
	
		The quartiles are defined as the values that divide a sorted data set into four equal-sized subsets, and correspond to the 25th, 50th, and 75th percentiles.
 method quartileSkewnessCoefficient [line 1349]
    mixed quartileSkewnessCoefficient()

		The quartile skewness coefficient (also known as Bowley Skewness) is defined as follows:

		   quart skewness coeff = (P(25) - 2*P(50) + P(75)) / (P(75) - P(25))

		   where: P = percentile
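
As a worked illustration of the quartile skewness formula, the sketch below takes the three quartile values as plain arguments. The function name is hypothetical, and the zero-denominator guard is an added assumption rather than documented behaviour.

```php
<?php
// Hypothetical helper: Bowley skewness from the 25th, 50th and 75th
// percentile values, using the formula given in the documentation.
function sketch_quartile_skewness(float $p25, float $p50, float $p75): float
{
    $spread = $p75 - $p25;
    if ($spread == 0.0) {
        // Assumption: treat a zero interquartile range as an error.
        throw new InvalidArgumentException('P(75) and P(25) are equal');
    }
    return ($p25 - 2 * $p50 + $p75) / $spread;
}

echo sketch_quartile_skewness(1, 2, 3), "\n"; // prints 0 (symmetric)
echo sketch_quartile_skewness(1, 2, 5), "\n"; // prints 0.5 (upper tail)
echo sketch_quartile_skewness(1, 4, 5), "\n"; // prints -0.5 (lower tail)
```

The result always lies between -1 and 1, because the median sits between the two outer quartiles.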
 method quartileVariationCoefficient [line 1321]
    mixed quartileVariationCoefficient()

		The quartile variation coefficient is defined as follows:

		   quart var coeff = 100 * (P(75) - P(25)) / (P(75) + P(25))

		   where: P = percentile
 method range [line 645]
    
	
		Calculates the range of the data set = max - min.
 method sampleCentralMoment [line 1040]
    mixed sampleCentralMoment(integer $n)

		Calculates the nth central moment (m{n}) of a data set. The definition of a sample central moment is:

		   m{n} = 1/N * SUM { (xi - avg)^n }

		   where: N = sample size, avg = sample mean.
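
The central moment definition translates directly into code. The sketch below handles a simple array only, and the name sketch_central_moment() is hypothetical.

```php
<?php
// Hypothetical helper: nth sample central moment of a simple array,
// m{n} = 1/N * SUM { (xi - avg)^n }.
function sketch_central_moment(array $values, int $n): float
{
    $count = count($values);
    if ($count === 0) {
        throw new InvalidArgumentException('empty data set');
    }
    $avg = array_sum($values) / $count;
    $sum = 0.0;
    foreach ($values as $x) {
        $sum += pow($x - $avg, $n);
    }
    return $sum / $count;
}

echo sketch_central_moment([1, 2, 3, 4, 5], 2), "\n"; // prints 2
echo sketch_central_moment([1, 2, 3, 4, 5], 3), "\n"; // prints 0 (symmetric data)
```

Note that the second central moment divides by N, so it is the biased variance rather than the unbiased one reported by variance().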
 method sampleRawMoment [line 1076]
    mixed sampleRawMoment(integer $n)

		Calculates the nth raw moment (m{n}) of a data set. The definition of a sample raw moment is:

		   m{n} = 1/N * SUM { xi^n }

		   where: N = sample size.
 method setData [line 189]
    mixed setData(array $arr, [optional $opt = STATS_DATA_SIMPLE])

		Sets and verifies the data, checking for nulls and using the current null handling option.
 method setNullOption [line 236]
    mixed setNullOption($nullOption)

		Sets the null handling option. Must be called before assigning a new data set containing null values.
 method skewness [line 795]
    
	
		Calculates the skewness of the data distribution in the set. The skewness measures the degree of asymmetry of a distribution, and is related to the third central moment of a distribution. A normal distribution has a skewness = 0. A distribution with a tail off towards the high end of the scale (positive skew) has a skewness > 0. A distribution with a tail off towards the low end of the scale (negative skew) has a skewness < 0. Handles cumulative data sets correctly.
 method stdErrorOfMean [line 1146]
    
	
		Calculates the standard error of the mean. It is the standard deviation of the sampling distribution of the mean. The formula is:

		   S.E. Mean = SD / (N)^(1/2)

		This formula does not assume a normal distribution, and shows that the size of the standard error of the mean is inversely proportional to the square root of the sample size.
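
A minimal sketch of the standard error formula follows. It assumes SD is the unbiased (N - 1) standard deviation that stDev() is documented to return, and it handles a simple array only. The function name is hypothetical.

```php
<?php
// Hypothetical helper: standard error of the mean, SD / sqrt(N),
// with SD taken as the unbiased (N - 1) standard deviation (assumption).
function sketch_std_error_of_mean(array $values): float
{
    $n = count($values);
    if ($n < 2) {
        throw new InvalidArgumentException('need at least two values');
    }
    $mean = array_sum($values) / $n;
    $sumSquares = 0.0;
    foreach ($values as $x) {
        $sumSquares += ($x - $mean) ** 2;
    }
    $sd = sqrt($sumSquares / ($n - 1));
    return $sd / sqrt($n);
}

// Mean 5, sum of squared deviations 32, so SE = sqrt(32/7) / sqrt(8).
echo sketch_std_error_of_mean([2, 4, 4, 4, 5, 5, 7, 9]), "\n"; // about 0.756
```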
 method stDev [line 691]
    
	
		Calculates the standard deviation (unbiased) of the data points in the set. Handles cumulative data sets correctly.
 method stDevWithMean [line 731]
    mixed stDevWithMean(numeric $mean)

		Calculates the standard deviation (unbiased) of the data points in the set given a fixed mean (average) value. Not used in calcBasic(), calcFull() or calc(). Handles cumulative data sets correctly.
 method studentize [line 259]
    
	
		Transforms the data by subtracting the mean from each entry and dividing by the standard deviation. This will reset all pre-calculated values to their original (unset) defaults.
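
The transformation can be sketched as below for a simple array. Two assumptions are made that the page does not spell out: each entry becomes (x - mean) / SD, and SD is the unbiased (N - 1) standard deviation. The function name is hypothetical, and unlike the class method it returns a new array instead of modifying stored data.

```php
<?php
// Hypothetical helper: returns the studentized copy of a simple array,
// (x - mean) / SD for each entry, with SD unbiased (assumption).
function sketch_studentize(array $values): array
{
    $n = count($values);
    if ($n < 2) {
        throw new InvalidArgumentException('need at least two values');
    }
    $mean = array_sum($values) / $n;
    $sumSquares = 0.0;
    foreach ($values as $x) {
        $sumSquares += ($x - $mean) ** 2;
    }
    $sd = sqrt($sumSquares / ($n - 1));
    if ($sd == 0.0) {
        throw new InvalidArgumentException('all values are identical');
    }
    return array_map(
        function ($x) use ($mean, $sd) {
            return ($x - $mean) / $sd;
        },
        $values
    );
}

// Mean 2 and SD 1, so the result is -1, 0, 1.
echo implode(', ', sketch_studentize([1, 2, 3])), "\n";
```

After the transformation the data has a mean of 0 and a standard deviation of 1, which is why any previously calculated statistics no longer apply.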
 method sum [line 476]
    
	
		Calculates SUM { xi }. Handles cumulative data sets correctly.
 method sum2 [line 498]
    
	
		Calculates SUM { (xi)^2 }. Handles cumulative data sets correctly.
 method sumN [line 521]
    
	
		Calculates SUM { (xi)^n }. Handles cumulative data sets correctly.
 method variance [line 671]
    
	
		Calculates the variance (unbiased) of the data points in the set. Handles cumulative data sets correctly.
 method varianceWithMean [line 715]
    mixed varianceWithMean(numeric $mean)

		Calculates the variance (unbiased) of the data points in the set given a fixed mean (average) value. Not used in calcBasic(), calcFull() or calc(). Handles cumulative data sets correctly.