
A.16. Statistics (mrv_stats.h)

From the document "A user-friendly Minimum Spanning Tree" (pages 171-175)

Listing A.24 Constants

    const uInt DEFAULT_BOOT = 1000;

Because bootstrapping is implemented, a default number of resamples is defined.

Listing A.25 Data structures

    struct Statistic
    {
        Real value;
        Real error; // the standard error of the value
        Statistic();
    };

    struct VectorStatistics
    {
        Real minimum, maximum;
        Statistic mean, variance, stdev;
        Statistic median, skewness, excess_kurtosis;

        VectorStatistics();
        VectorStatistics(RealVector v, uInt B = DEFAULT_BOOT);
    };

    struct GraphStatistics
    {
        uInt nV, nE, nT;
        Real total_length, diameter, compactness;
        VectorStatistics edge_stat;
        RealVector edges, total_lengths, diameters, compactnesses;

        GraphStatistics();
        GraphStatistics(DisconnectedGraph& G);
        GraphStatistics(WeightedGraph& G, uInt B = DEFAULT_BOOT);
        GraphStatistics(MST& mst, uInt B = DEFAULT_BOOT);
        GraphStatistics(MSF& msf, uInt B = DEFAULT_BOOT);
    };

Statistic groups a value with its standard error. The default constructor initializes both to NaN, so that a forgotten initialization of either one draws the user's attention (a NaN propagates to the outcome of any arithmetic operation involving it).
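The NaN-by-default convention described above can be sketched as follows. This is a minimal illustration, not the MoravaPack source: it assumes Real is an alias for double, and UpperBound is a hypothetical helper added only to show the propagation.

```cpp
#include <cmath>
#include <limits>

using Real = double;  // assumption: Real is a floating-point alias

// Sketch of a value / standard-error pair whose default state is NaN.
struct Statistic
{
    Real value;
    Real error;  // the standard error of the value

    Statistic()
        : value(std::numeric_limits<Real>::quiet_NaN()),
          error(std::numeric_limits<Real>::quiet_NaN())
    {
    }
};

// Hypothetical helper: arithmetic on an unset Statistic yields NaN,
// so a forgotten initialization shows up in the final result.
inline Real UpperBound(const Statistic& s)
{
    return s.value + s.error;
}
```

With a default-constructed Statistic, std::isnan(UpperBound(s)) holds; once both fields are assigned, the helper returns an ordinary number.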

VectorStatistics groups commonly used descriptive statistics for one-dimensional data. The empty constructor assigns NaN to all values, while the other constructor computes the statistics of the elements of the given vector. Its second parameter is the number of bootstrap samples taken to compute the standard error of each value. If it is missing, the default (DEFAULT_BOOT) is used.
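As an illustration of how a bootstrap standard error can be obtained from B resamples, here is a minimal sketch for the mean. It is not the library's implementation: the type aliases, the helper names (SampleMean, Resample, BootstrapMeanSE), the fixed seed and the choice of std::mt19937 are all assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

using Real = double;                   // assumption
using RealVector = std::vector<Real>;  // assumption
using uInt = unsigned int;             // assumption

// Arithmetic mean; NaN for an empty vector.
Real SampleMean(const RealVector& v)
{
    if (v.empty()) return std::numeric_limits<Real>::quiet_NaN();
    Real s = 0.0;
    for (Real x : v) s += x;
    return s / static_cast<Real>(v.size());
}

// One bootstrap resample: v.size() elements drawn with replacement.
// Precondition: v is not empty.
RealVector Resample(const RealVector& v, std::mt19937& rng)
{
    std::uniform_int_distribution<std::size_t> pick(0, v.size() - 1);
    RealVector out(v.size());
    for (Real& x : out) x = v[pick(rng)];
    return out;
}

// Bootstrap standard error of the mean: the sample standard
// deviation of the means of B resamples.
Real BootstrapMeanSE(const RealVector& v, uInt B, unsigned seed = 42)
{
    if (v.empty() || B < 2) return std::numeric_limits<Real>::quiet_NaN();
    std::mt19937 rng(seed);
    RealVector means(B);
    for (Real& m : means) m = SampleMean(Resample(v, rng));
    const Real grand = SampleMean(means);
    Real ss = 0.0;
    for (Real m : means) ss += (m - grand) * (m - grand);
    return std::sqrt(ss / static_cast<Real>(B - 1));
}
```

The same pattern applies to any other statistic: replace SampleMean inside the loop by the statistic whose standard error is wanted. A larger B reduces the Monte Carlo noise of the estimate at a proportional cost in time.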

GraphStatistics defines the statistics we chose to measure in graphs, trees, forests, etc.:

nV, nE, nT: number of vertices, edges and trees (= 1 in graphs, > 1 in forests).

total_length, diameter, compactness: measures that depend on the vertices.

edge_stat: statistics on the distribution of the edge lengths.

edges: all edge lengths are stored here so that the user or another program can compute other statistics, histograms, etc.

total_lengths, diameters, compactnesses: if referring to a forest, these measures are computed for each tree for post-processing (e.g. statistics, histograms).

Aside from the empty constructor, which sets reals to NaN and integers to 0, all other constructors compute the statistics on graphs, MSTs and MSFs. The second parameter is the number of bootstrap samples; if it is missing, the default value (DEFAULT_BOOT) is used. The DisconnectedGraph constructor has no bootstrap parameter, as there are no edges.

Listing A.26 Functions

    // descriptive statistics
    Real Sum(RealVector&);
    Real Mean(RealVector&);
    Real Variance(RealVector&);
    Real Variance(RealVector&, Real m);
    Real Stdev(RealVector&);
    Real Stdev(RealVector&, Real m);
    Real Skewness(RealVector&, Real m, Real s);
    Real Skewness(RealVector&, Real m);
    Real Skewness(RealVector&);
    Real ExcessKurtosis(RealVector&, Real m, Real s);
    Real ExcessKurtosis(RealVector&, Real m);
    Real ExcessKurtosis(RealVector&);
    Real Median(RealVector&, bool is_sorted = false);
    Real MedianWithSort(RealVector& v, bool is_sorted = false);
    Real MedianWithQSelect(RealVector& v, bool is_sorted = false);

    // bootstrap-related
    RealVector Bootstrap(RealVector&);
    Real BootstrapMedianSE(RealVector&, uInt B, bool is_sorted = false);
    VectorStatistics GetStatisticsOn(RealVector&, uInt B);

    // diameter - total weight - compactness
    Real Diameter(Vertices&, MetricFunc);
    Real Diameter(Vertices&);
    Real Diameter(WeightedGraph&, MetricFunc);
    Real Diameter(WeightedGraph&);
    Real Diameter(MST&, MetricFunc);
    Real Diameter(MST&);
    Real Diameter(MSF&, MetricFunc);
    Real Diameter(MSF&);
    Real TotalWeight(Edges&);
    Real TotalWeight(WeightedGraph&);
    Real TotalWeight(MST&);
    Real TotalWeight(MSF&);
    Real Compactness(Real diameter, Real total_length);

The following functions operate on the elements of a RealVector object:

Sum: sum using the compensated summation technique (reduced round-off errors, see §4.7).

Mean: sample mean. If the vector is empty, NaN is returned.

Variance: sample variance using compensated summation (§4.7). For a vector with no elements or one element, NaN and INF are returned respectively. If the sample mean is already known (e.g. precomputed), it can be passed as the second argument to avoid recalculation.

Stdev: sample standard deviation. NaN when the vector is empty, INF when it has only one element. Again, the sample mean can be provided as the second argument to avoid redundant calculations.

Skewness, ExcessKurtosis: sample skewness and excess kurtosis. For vectors of size 0, 1 or 2, NaN, INF and INF are returned respectively. If the sample mean is known, it can be passed as the second argument. If the sample standard deviation is also known, it can be passed as the third argument.
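The compensated summation and the edge-case conventions listed above can be sketched as follows. This is an illustrative sketch only: it uses the Kahan variant of compensated summation (the text does not say which variant §4.7 describes), assumes Real is double, and uses the hypothetical names KahanSum, SampleMean and SampleVariance so as not to be mistaken for the library's own functions.

```cpp
#include <cmath>
#include <limits>
#include <vector>

using Real = double;                   // assumption
using RealVector = std::vector<Real>;  // assumption

// Kahan compensated summation: c carries the low-order bits
// that are lost when a small term is added to a large sum.
Real KahanSum(const RealVector& v)
{
    Real sum = 0.0, c = 0.0;
    for (Real x : v)
    {
        const Real y = x - c;    // apply the running correction
        const Real t = sum + y;  // low-order bits of y may be lost here
        c = (t - sum) - y;       // recover what was lost
        sum = t;
    }
    return sum;
}

// Sample mean; NaN for an empty vector.
Real SampleMean(const RealVector& v)
{
    if (v.empty()) return std::numeric_limits<Real>::quiet_NaN();
    return KahanSum(v) / static_cast<Real>(v.size());
}

// Sample variance with a precomputed mean m.
// Returns NaN for 0 elements and INF for 1 element.
Real SampleVariance(const RealVector& v, Real m)
{
    if (v.empty()) return std::numeric_limits<Real>::quiet_NaN();
    if (v.size() == 1) return std::numeric_limits<Real>::infinity();
    RealVector sq;
    sq.reserve(v.size());
    for (Real x : v) sq.push_back((x - m) * (x - m));
    return KahanSum(sq) / static_cast<Real>(v.size() - 1);
}

// Convenience overload that computes the mean itself.
Real SampleVariance(const RealVector& v)
{
    return SampleVariance(v, SampleMean(v));
}
```

Passing the mean explicitly, as in SampleVariance(v, m), mirrors the overloads in Listing A.26 and saves one pass over the data when the mean is already at hand. Note that aggressive floating-point optimization flags (such as -ffast-math in GCC and Clang) may remove the compensation term.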

For the median functions, note that the second argument signifies whether the input real vector is known to be sorted. If it is not given, the default value false is used and the function takes whatever action it needs, which will alter the input vector. If the vector is to remain unaltered, pass a copy to the function instead.

Median: computes the median of the elements of a real vector using the faster of the methods supplied in MoravaPack. In the current version this is MedianWithQSelect. Thus, if the vector is not sorted, the elements will be partially reordered.

MedianWithSort: returns the median of a real vector. The second argument (false is implied if it is not given) states whether the vector is already sorted; if it is not, the function sorts the real vector using std::sort of the C++ STL and then returns the median.

MedianWithQSelect: uses the QuickSelect algorithm. If the second argument is omitted, the default value false is used.
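The two median strategies can be contrasted with a short sketch. It is not the MoravaPack code: std::nth_element stands in for QuickSelect (it is a selection algorithm, but the standard does not prescribe which one), the names MedianBySort and MedianBySelect are hypothetical, and both the NaN result for an empty vector and the averaging of the two middle elements for even sizes are assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Real = double;                   // assumption
using RealVector = std::vector<Real>;  // assumption

// Median by full sort: v is sorted in place unless is_sorted is true.
Real MedianBySort(RealVector& v, bool is_sorted = false)
{
    if (v.empty()) return std::numeric_limits<Real>::quiet_NaN();
    if (!is_sorted) std::sort(v.begin(), v.end());
    const std::size_t n = v.size(), mid = n / 2;
    return (n % 2 == 1) ? v[mid] : 0.5 * (v[mid - 1] + v[mid]);
}

// Median by selection: v is only partially reordered.
Real MedianBySelect(RealVector& v, bool is_sorted = false)
{
    if (v.empty()) return std::numeric_limits<Real>::quiet_NaN();
    const std::size_t n = v.size(), mid = n / 2;
    if (!is_sorted) std::nth_element(v.begin(), v.begin() + mid, v.end());
    const Real upper = v[mid];
    if (n % 2 == 1) return upper;
    // Even size: every element left of mid is <= v[mid], so the
    // lower middle element is the maximum of the left part.
    const Real lower = *std::max_element(v.begin(), v.begin() + mid);
    return 0.5 * (lower + upper);
}
```

Both functions take the vector by non-const reference and may reorder it, so a caller who needs the original order would pass a copy, for example `RealVector tmp = v; MedianBySelect(tmp);`. Selection is linear on average, while a full sort costs O(n log n), which is consistent with the text choosing the selection-based variant as the default.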
