A.16. Statistics (mrv_stats.h)
Listing A.24 Constants
const uInt DEFAULT_BOOT = 1000;
As bootstrapping is implemented, a default number of resamples is defined.
Listing A.25 Data structures
struct Statistic
{
    Real value;
    Real error; // the standard error of the value
    Statistic();
};

struct VectorStatistics
{
    Real minimum, maximum;
    Statistic mean, variance, stdev;
    Statistic median, skewness, excess_kurtosis;

    VectorStatistics();
    VectorStatistics(RealVector v, uInt B = DEFAULT_BOOT);
};

struct GraphStatistics
{
    uInt nV, nE, nT;
    Real total_length, diameter, compactness;
    VectorStatistics edge_stat;
    RealVector edges, total_lengths, diameters, compactnesses;

    GraphStatistics();
    GraphStatistics(DisconnectedGraph& G);
    GraphStatistics(WeightedGraph& G, uInt B = DEFAULT_BOOT);
    GraphStatistics(MST& mst, uInt B = DEFAULT_BOOT);
    GraphStatistics(MSF& msf, uInt B = DEFAULT_BOOT);
};
Statistic groups a value with its standard error. The default constructor initializes both to NaN so that a forgotten initialization is noticed: NaN propagates to the result of any arithmetic operation involving it.
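The NaN convention can be illustrated with a minimal sketch. It assumes that Real is a typedef for double; the helpers IsSet and RelativeError are hypothetical and are not part of mrv_stats.h.

```cpp
#include <cmath>
#include <limits>

typedef double Real;  // assumption: Real is a floating-point type such as double

// Sketch of a value / standard-error pair whose members start out as NaN.
struct Statistic
{
    Real value;
    Real error;  // the standard error of the value
    Statistic()
        : value(std::numeric_limits<Real>::quiet_NaN()),
          error(std::numeric_limits<Real>::quiet_NaN()) {}
};

// Hypothetical helper: true once both members have been assigned.
bool IsSet(const Statistic& s)
{
    return !std::isnan(s.value) && !std::isnan(s.error);
}

// Hypothetical helper: relative standard error.
// The result is NaN whenever either member was never initialized.
Real RelativeError(const Statistic& s)
{
    return s.error / s.value;
}
```

With a default-constructed Statistic, RelativeError returns NaN, so the missing initialization surfaces in the result instead of silently producing a plausible number.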
VectorStatistics groups commonly used descriptive statistics for one-dimensional data. The empty constructor assigns NaN to all values, while the other computes the statistics of the elements in the given vector. The second parameter is the number of bootstrap samples taken to compute the standard errors of the statistics. If it is omitted, the default (DEFAULT_BOOT) is used.
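As an illustration of how a bootstrap standard error can be obtained, the following sketch estimates the standard error of the median from B resamples. It is not the MoravaPack implementation: the typedefs, the function names (SortedMedian, Resample, BootstrapMedianSESketch), the explicit std::mt19937 generator and the even-size median convention are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

typedef double Real;                   // assumption
typedef unsigned int uInt;             // assumption
typedef std::vector<Real> RealVector;  // assumption

// Median of a sorted copy of v (the argument is taken by value).
Real SortedMedian(RealVector v)
{
    assert(!v.empty());
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();
    return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// One bootstrap resample: v.size() elements drawn from v with replacement.
RealVector Resample(const RealVector& v, std::mt19937& rng)
{
    std::uniform_int_distribution<std::size_t> pick(0, v.size() - 1);
    RealVector out(v.size());
    for (std::size_t i = 0; i < out.size(); ++i) out[i] = v[pick(rng)];
    return out;
}

// Bootstrap standard error of the median: the sample standard deviation
// of the medians of B resamples (needs B >= 2).
Real BootstrapMedianSESketch(const RealVector& v, uInt B, std::mt19937& rng)
{
    assert(!v.empty() && B >= 2);
    RealVector medians(B);
    for (uInt b = 0; b < B; ++b) medians[b] = SortedMedian(Resample(v, rng));

    Real mean = 0.0;
    for (uInt b = 0; b < B; ++b) mean += medians[b];
    mean /= B;

    Real ss = 0.0;
    for (uInt b = 0; b < B; ++b) ss += (medians[b] - mean) * (medians[b] - mean);
    return std::sqrt(ss / (B - 1));
}
```

A larger B reduces the Monte Carlo noise of the estimate at a proportional cost in run time, which is why a default such as DEFAULT_BOOT is convenient.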
APPENDIX A. THE MORAVA PACK
GraphStatistics defines the statistics we chose to measure in graphs, trees, forests, etc.:
• nV, nE, nT: number of vertices, edges and trees (nT = 1 in graphs, nT > 1 in forests).
• total_length, diameter, compactness: measures that depend on the vertices
• edge_stat: statistics on the distribution of the edge lengths
• edges: all edge lengths are stored here so that the user or another program can compute other statistics, histograms, etc.
• total_lengths, diameters, compactnesses: when referring to a forest, these measures are computed for each tree for post-processing (e.g. statistics, histograms).
Aside from the empty constructor, which sets reals to NaN and integers to 0, the constructors compute the statistics of graphs, MSTs and MSFs. The second parameter is the number of bootstrap samples; if it is omitted, the default value (DEFAULT_BOOT) is used. The DisconnectedGraph constructor has no bootstrap parameter, as there are no edges.
Listing A.26 Functions
// descriptive statistics
Real Sum(RealVector&);
Real Mean(RealVector&);
Real Variance(RealVector&);
Real Variance(RealVector&, Real m);
Real Stdev(RealVector&);
Real Stdev(RealVector&, Real m);
Real Skewness(RealVector&, Real m, Real s);
Real Skewness(RealVector&, Real m);
Real Skewness(RealVector&);
Real ExcessKurtosis(RealVector&, Real m, Real s);
Real ExcessKurtosis(RealVector&, Real m);
Real ExcessKurtosis(RealVector&);
Real Median(RealVector&, bool is_sorted = false);
Real MedianWithSort(RealVector& v, bool is_sorted = false);
Real MedianWithQSelect(RealVector& v, bool is_sorted = false);

// bootstrap-related
RealVector Bootstrap(RealVector&);
Real BootstrapMedianSE(RealVector&, uInt B, bool is_sorted = false);
VectorStatistics GetStatisticsOn(RealVector&, uInt B);

// diameter - total weight - compactness
Real Diameter(Vertices&, MetricFunc);
Real Diameter(Vertices&);
Real Diameter(WeightedGraph&, MetricFunc);
Real Diameter(WeightedGraph&);
Real Diameter(MST&, MetricFunc);
Real Diameter(MST&);
Real Diameter(MSF&, MetricFunc);
Real Diameter(MSF&);
Real TotalWeight(Edges&);
Real TotalWeight(WeightedGraph&);
Real TotalWeight(MST&);
Real TotalWeight(MSF&);
Real Compactness(Real diameter, Real total_length);
The following functions operate on the elements of a RealVector object:
Sum computes the sum using the compensated summation technique (reduced round-off errors, see §4.7).
Mean returns the sample mean. If the vector is empty, NaN is returned.
Variance returns the sample variance using compensated summation (§4.7). For an empty vector NaN is returned, and for a single element INF.
If the sample mean is already known (e.g. precomputed), it can be passed as the second argument to avoid recalculation.
Stdev returns the sample standard deviation: NaN when the vector is empty, INF when it has only one element. Again, the sample mean can be provided as the second argument to avoid redundant calculations.
Skewness, ExcessKurtosis return the sample skewness and excess kurtosis. For vectors of size 0, 1 or 2, NaN, INF and INF are returned respectively.
If the sample mean is known, it can be passed as the second argument. If the sample standard deviation is also known, it can be passed as the third argument.
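To make the compensated summation and the edge-case conventions concrete, here is a minimal sketch. It uses Kahan's algorithm as one common form of compensated summation; whether MoravaPack uses exactly this variant is an assumption, as are the typedefs, the function names (KahanSum, SampleVariance) and the n - 1 divisor of the sample variance.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

typedef double Real;                   // assumption
typedef std::vector<Real> RealVector;  // assumption

// Kahan (compensated) summation: c accumulates the low-order bits that
// are lost when a small term is added to a large running sum.
Real KahanSum(const RealVector& v)
{
    Real sum = 0.0, c = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        const Real y = v[i] - c;  // term corrected by the previous error
        const Real t = sum + y;   // low-order bits of y may be lost here
        c = (t - sum) - y;        // recover what was lost
        sum = t;
    }
    return sum;
}

// Sample variance around a known mean m, with the edge cases described
// above: NaN for an empty vector, INF for a single element.
Real SampleVariance(const RealVector& v, Real m)
{
    if (v.empty()) return std::numeric_limits<Real>::quiet_NaN();
    if (v.size() == 1) return std::numeric_limits<Real>::infinity();
    RealVector sq(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) sq[i] = (v[i] - m) * (v[i] - m);
    return KahanSum(sq) / (v.size() - 1);
}

// Overload that computes the mean first; passing a precomputed mean to
// the two-argument version avoids this extra pass.
Real SampleVariance(const RealVector& v)
{
    if (v.empty()) return std::numeric_limits<Real>::quiet_NaN();
    return SampleVariance(v, KahanSum(v) / v.size());
}
```

Note that aggressive floating-point optimizations (for example a fast-math compiler flag) may remove the compensation term, so code of this kind is normally compiled with strict floating-point semantics.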
For the median functions, note that the second argument signifies whether the input real vector is known to be sorted. If it is not given, the default value false is used and the function takes the appropriate action (which depends on the function) and thereby alters the input vector. If the vector is to remain unaltered, pass a copy to these functions.
Median computes the median of the elements of a real vector using the faster method supplied in MoravaPack. In the current version, this is MedianWithQSelect, which uses the QuickSelect algorithm. Thus, if the vector is not sorted, a partial reordering of the elements will take place. If the second argument is omitted, the default value false is used.
MedianWithSort returns the median of a real vector. The second argument (false is implied if not given) states whether the vector is already sorted; if it is not, the function sorts the real vector using std::sort of the C++ STL and then returns the median.
MedianWithQSelect