
Experience never errs; it is only your judgments that err by promising themselves effects such as are not caused by your experiments.

Leonardo da Vinci, from Jean Paul Richter (1888), “The Notebooks of Leonardo da Vinci”

5

Numerical experiments

Outline of chapter

5.1 Experimental time complexity
5.2 Fast Bootstrap Median
5.3 Distance Calculator
5.4 DEUSS MSTs


5.1 Experimental time complexity

Finding the MST of dense graphs can be demanding in computational resources. The complete graph, as a worst–case scenario, is ideal for verifying time and space complexity: both algorithms as implemented by us can work on complete graphs, while non–complete graphs cannot be handled by CompletePrim (§4.4).

By setting the metric to be Euclidean for the numerical experiment, we can also study the Delaunay tetrahedralization optimization, while the Kruskal and Prim algorithms remain unaffected.

Summarizing, MoravaPack provides three methods for producing MSTs of complete graphs; their average time complexity will be measured:

Complete Kruskal: compute all edges and use Kruskal’s algorithm

DT+Kruskal: compute the Delaunay tetrahedralization and use the output’s edges as input for Kruskal’s algorithm

Complete Prim: use Prim’s algorithm to operate on the graph’s vertices, with the Euclidean norm as the metric.

Accurate timing

The Tic and Toc functions defined in §A.6 were employed to measure execution times. The timing accuracy on the system that ran the experiments was 0.001 s. The MST of a small graph can be found in a time less than or comparable to that value.

To overcome this obstacle, any MST run is repeated as many times as needed so that the total duration exceeds a minimum duration, set to 1 second, which is sufficiently larger than the timing accuracy.
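A minimal sketch of this repetition scheme, shown here with std::chrono for illustration rather than the Tic/Toc utilities of §A.6; runOnce is a placeholder standing for one MST computation:

#include <chrono>
#include <cstddef>

// Repeat a measured operation until the total elapsed time exceeds a minimum
// duration (1 s here), then report the average time per run.
template <class Callable>
double averageRunTime(Callable runOnce, double minDuration = 1.0)
{
    using clock = std::chrono::steady_clock;
    const clock::time_point start = clock::now();
    std::size_t runs = 0;
    double elapsed = 0.0;
    do {
        runOnce();
        ++runs;
        elapsed = std::chrono::duration<double>(clock::now() - start).count();
    } while (elapsed < minDuration);
    return elapsed / static_cast<double>(runs);
}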

Reducing input dependency effects

Aside from its length, the input itself can have a measurable effect on the execution time of an algorithm. For example, finding a number in a list sorted in ascending (or descending) order takes O(log n) time using the classic binary search method. If an implementation verifies that the requested key is greater than or equal to the first element and less than or equal to the last one before proceeding to the search, it can reject many values, or return the minimum and maximum values, in O(1).
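As an illustration only (not MoravaPack code), such a guarded search might look like this:

#include <cstddef>
#include <vector>

// Binary search over an ascending list that first rejects out-of-range keys
// in O(1); only keys inside [front, back] trigger the O(log n) search.
bool containsWithBoundsCheck(const std::vector<int>& sorted, int key)
{
    if (sorted.empty() || key < sorted.front() || key > sorted.back())
        return false;                      // rejected without searching
    std::size_t lo = 0, hi = sorted.size() - 1;
    while (lo <= hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] == key) return true;
        if (sorted[mid] < key) lo = mid + 1;
        else {
            if (mid == 0) return false;    // avoid unsigned underflow
            hi = mid - 1;
        }
    }
    return false;
}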

In practice, most complex algorithms present this effect. In order to reduce it, we can run the MST algorithms for different graphs of the same length. The list of durations can be used to estimate the average time complexity and the standard error.

To address this issue, we set the minimum number of different graphs to be used for any algorithm and any input length equal to 10. Larger values were avoided to reduce the duration of the numerical experiment.
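A minimal sketch of the averaging step for one algorithm and one input length, assuming the per-graph durations have already been measured (illustrative, not MoravaPack code):

#include <cmath>
#include <cstddef>
#include <vector>

// Mean and standard error of the mean of the measured durations
// (at least 10 graphs per input length, as stated above).
void meanAndStandardError(const std::vector<double>& t, double& mean, double& sem)
{
    const std::size_t n = t.size();
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += t[i];
    mean = sum / static_cast<double>(n);

    double squaredDeviations = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        squaredDeviations += (t[i] - mean) * (t[i] - mean);
    const double sampleVariance = squaredDeviations / static_cast<double>(n - 1);
    sem = std::sqrt(sampleVariance / static_cast<double>(n));
}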

OS–induced delays

The above solutions also remedy another problem. The actual execution time may vary from time to time due to background operations of the operating system.

These effects are averaged out by the many repetitions and by interleaving calls to the algorithms: e.g. we did not finish studying Kruskal before moving on to Prim.

Proposed complexities

As explained in §2.3.1, the heaviest operation of Kruskal’s algorithm is sorting the graph’s edges in non–descending order. A complete graph of N vertices has N(N−1)/2 = O(N²) edges, so the sorting step, and hence Complete Kruskal, is expected to run in O(N² log N) time. The Delaunay tetrahedralization reduces the edge set, in practice, to O(N) edges, giving an expected O(N log N) for DT+Kruskal, while Complete Prim is expected to run in O(N²).
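In summary, writing E for the number of edges fed to Kruskal’s algorithm, the proposed models used in the fits below follow from the edge count:

\[
E_{\text{complete}} = \binom{N}{2} = \frac{N(N-1)}{2} = O(N^{2}),
\qquad
T_{\text{sort}} = O(E \log E) =
\begin{cases}
O(N^{2}\log N) & \text{(Complete Kruskal)}\\[2pt]
O(N\log N) & \text{(DT+Kruskal, } E = O(N)\text{ in practice)}
\end{cases}
\]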

Comparison of algorithms

Table 5.1 lists the results that are plotted in Figure 5.1. Figure 5.2, using logarithmic axes, provides better inspection and confirms that a power law approximates most time complexities (power laws appear as line segments in log–log plots).

Input length   Average execution time (s)
N              Complete Kruskal   DT+Kruskal   Complete Prim
128            0.0033             0.0016       0.0007
192            0.0078             0.0024       0.0015
256            0.0145             0.0033       0.0027
384            0.0343             0.0052       0.0057
512            0.0629             0.0070       0.0100
768            0.1579             0.0112       0.0229
1024           0.2801             0.0149       0.0390
1536           0.6448             0.0238       0.0878
2048           1.1891             0.0330       0.1607
3072           2.7619             0.0516       0.3559
4096           5.0798             0.0717       0.6446
6144           12.0630            0.1244       1.4725
8192           21.7384            0.1717       2.6364
12288          50.0073            0.2712       5.8723

Table 5.1 Average execution times of MST algorithms

[Plot: execution time t (s) versus input length N for Complete Kruskal, Complete Prim and DT+Kruskal.]

Figure 5.1 Average execution times of the three algorithms. To be visible, the error bars were magnified 50 times.


[Log–log plot: execution time t (s) versus input length N for Complete Kruskal, DT+Kruskal and Complete Prim.]

Figure 5.2 Execution time scaling in log–log. To be visible, the error bars were magnified 10 times.

Comparison of proposed formulas against power law

We decided to plot the relative difference \((\tilde f_i - f_i)/f_i\) between «fitted values» \(\tilde f_i\) and «observed values» \(f_i\) instead of fit residuals because (i) the fits are very close to each other and (ii) the deviations seem to depend on scale. The formulas are stated with the values and the 95% confidence intervals of the parameters.

Complete Kruskal

[Plot: fit vs observed relative difference versus input length N; curves: Proposed, Power.]

Figure 5.3 Fitting the power law T = aN^b and the proposed T = aN² log N to the execution time of Complete Kruskal to confirm the expected average time complexity.

Power law fit:

\[ T = (1.836 \pm 0.166)\times 10^{-7}\, N^{2.063 \pm 0.009} \]

R²adj = 0.999982, RMSE = 0.0575

Proposed fit:

\[ T = (2.451 \pm 0.017)\times 10^{-8}\, N^{2}\log_{2} N \]

R²adj = 0.999984, RMSE = 0.0547

DT+Kruskal

[Plot: fit vs observed relative difference versus input length N; curves: Proposed, Power.]

Figure 5.4 Fitting the power law T = aN^b and the proposed T = aN log N to the execution time of DT+Kruskal to confirm the expected average time complexity.

Power law fit:

\[ T = (4.388 \pm 1.013)\times 10^{-6}\, N^{1.172 \pm 0.025} \]

R²adj = 0.9995, RMSE = 0.0018

Proposed fit:

\[ T = (1.604 \pm 0.028)\times 10^{-6}\, N\log_{2} N \]

R²adj = 0.9986, RMSE = 0.0029

Complete Prim

[Plot: fit vs observed relative difference versus input length N; curves: Proposed, Power.]

Figure 5.5 Fitting the power law T = aN^b and the proposed T = aN² to the execution time of Complete Prim to confirm the expected average time complexity.

Power law fit:

\[ T = (4.073 \pm 0.418)\times 10^{-8}\, N^{1.995 \pm 0.011} \]

R²adj = 0.9995, RMSE = 0.0018

Proposed fit:

\[ T = (3.895 \pm 0.010)\times 10^{-8}\, N^{2} \]

R²adj = 0.999974, RMSE = 0.0083

5.2 Fast Bootstrap Median

5.2.1 Procedures

Timing and dependency on input

To avoid pathological cases of input and to increase the time measurement resolution (for more details, see §5.1.1), we measured the time needed to process 30 different samples for each input length. The average and the standard error of the mean were computed.

Input lengths

A typical choice would be powers of two, e.g.:

128, 256, 512, 1024, …, 65536

though this would tell only half the story: there are extra instructions for even–length inputs (especially for QuickSelect). By also adding the previous values decreased by one, we can detect the difference due to the even–odd effect, as the effect of input length itself is negligible for such close lengths.

The final choice was:

127, 128, 255, 256, 511, 512, 1023, 1024, …, 65535, 65536

The number of resamples B was 50 and 1000. For brevity, we do not provide all the plots for B = 50, as the results resembled those for B = 1000, except for slightly larger errors.
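For reference, a minimal sketch of the two baseline median computations compared here, full sorting versus QuickSelect via std::nth_element, applied to a single resample; FBM itself is described earlier in the thesis and is not reproduced here:

#include <algorithm>
#include <cstddef>
#include <vector>

// Median by full sorting: O(n log n) per resample.
double medianBySort(std::vector<double> s)
{
    std::sort(s.begin(), s.end());
    const std::size_t n = s.size();
    return (n % 2 != 0) ? s[n / 2] : 0.5 * (s[n / 2 - 1] + s[n / 2]);
}

// Median by QuickSelect: O(n) on average per resample. Even lengths need the
// extra step noted in the text: the second middle value is the maximum of the
// left partition produced by nth_element.
double medianByQuickSelect(std::vector<double> s)
{
    const std::size_t n = s.size();
    std::nth_element(s.begin(), s.begin() + n / 2, s.end());
    const double upper = s[n / 2];
    if (n % 2 != 0) return upper;
    const double lower = *std::max_element(s.begin(), s.begin() + n / 2);
    return 0.5 * (lower + upper);
}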

5.2.2 Results


[Plot: execution time (s) versus input length for Sort, QuickSelect and FBM, B = 1000.]

Figure 5.6 Execution times for the naïve approach (sort), the QuickSelect algorithm and FBM for 1000 resamples. Error bars were magnified 10 times and appear in pairs because of the neighbouring input lengths.

[Log–log plot: execution time (s) versus input length for Sort, QuickSelect and FBM, B = 1000.]

Figure 5.7 Log–log version of Figure 5.6. Error bars were magnified 10 times.


[Plot: speed up versus input length for QuickSelect vs Sort, FBM vs Sort and FBM vs QuickSelect, B = 1000.]

Figure 5.8 Speed up of all pairs of algorithms, for 1000 resamples. The close x–values show the effect of even–odd input lengths.

[Plot: speed up versus input length for QuickSelect vs Sort, FBM vs Sort and FBM vs QuickSelect, B = 50.]

Figure 5.9 As in Figure 5.8, for 50 resamples.

5.3 Distance Calculator

[Plot: comoving distance d_C and luminosity distance d_L (Mpc) versus redshift z.]

Figure 5.10 The cosmological distances computed by our algorithm for h = 0.74 and Ω_m = 0.26.

Comparison with CosmoCalc

The following cosmological parameters were used for both calculators: h = 0.72, Ω_m = 0.23, Ω_Λ = 0.77,

and the interpolation upper limit and tolerance level were also set for DistanceCalc: z_max = 10, tolerance = 10⁻⁶. Then, for redshifts

z = 0, 0.1, 0.2, …, 4.0,


we requested the comoving distances

• d_C from CosmoCalc,

• d_L from DistanceCalc using linear interpolation,

• d_Q from DistanceCalc using quadratic interpolation,

and computed the relative differences r_CL and r_CQ between CosmoCalc and each DistanceCalc interpolation (C = CosmoCalc, L = linear, Q = quadratic). The formula of our choice treats no data set as reference:

\[
r_{AB} = 2\,\frac{|d_A - d_B|}{|d_A| + |d_B|}. \tag{5.1}
\]
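A direct translation of Eq. (5.1), shown only as an illustration:

#include <cmath>

// Symmetric relative difference of Eq. (5.1): neither value is treated
// as the reference.
double relativeDifference(double dA, double dB)
{
    return 2.0 * std::fabs(dA - dB) / (std::fabs(dA) + std::fabs(dB));
}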

Comparison with DEUSS catalogs

The agreement between DistanceCalc and the DEUS–LCDM distance data was verified by producing the comoving distance function from the halo objects’ redshift vs. distance data of the ΛCDM catalog.

We used two approaches:

Method A: Root Mean Square Error (RMSE) of the two datasets, according to the formula

\[
\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N}\left(D_i - \tilde{D}_i\right)^{2}}{N}}
\]

where D_i and D̃_i are the i-th halo object’s distance from the catalog and the estimate from DistanceCalc, respectively (a code sketch follows the description of Method B below).

Method B: Visualization of the relative difference versus redshift, by

• deriving an interpolant of the distances from the DEUSS ΛCDM catalog using cubic interpolation

• setting DistanceCalc to use the DEUSS cosmological parameters and redshift range, requesting a relative accuracy of 10⁻⁸

• evaluating the interpolants at redshifts

z = 0.02, 0.04, …, 0.62


• plotting the relative difference dr(z) of the DEUSS interpolant D(z) with the DistanceCalc function D̃(z):

\[
dr(z) = 2\,\frac{D(z) - \tilde{D}(z)}{D(z) + \tilde{D}(z)}
\]
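Returning to Method A, a minimal sketch of the RMSE computation, assuming two vectors aligned by halo object (placeholder names, not the DistanceCalc interface):

#include <cmath>
#include <cstddef>
#include <vector>

// RMSE between catalog distances D_i and DistanceCalc estimates D~_i.
double rootMeanSquareError(const std::vector<double>& catalogD,
                           const std::vector<double>& estimatedD)
{
    const std::size_t n = catalogD.size();
    double sumOfSquares = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double diff = catalogD[i] - estimatedD[i];
        sumOfSquares += diff * diff;
    }
    return std::sqrt(sumOfSquares / static_cast<double>(n));
}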


Table 5.2 (continued from previous page); columns: redshift z, comoving distances (Mpc):

3.4   7063.9664   7064.1924   7062.2
3.6   7242.8354   7242.8968   7240.7
3.8   7410.4973   7410.6684   7408.4
4.0   7568.4591   7568.5692   7566.2

[Plot: relative difference versus redshift z for linear and quadratic interpolation.]

Figure 5.11 The relative difference of the comoving distance computed by our algorithm, using either linear or quadratic interpolation, and the one from Wright (2006).

Comparison with DEUSS catalogs

Method A

3110107 halo objects were extracted from the DEUSS ΛCDM catalog.

DistanceCalc produced 1025 interpolation points. The following root mean square error was found:

RMSE = 1.1818 Mpc

DistanceCalc vs CosmoCalc

DistanceCalc’s linearly interpolated values and the results from CosmoCalc displayed a relative difference of less than 2×10⁻³. With quadratic interpolation, the differences were even smaller, below 3×10⁻⁴.


In Figure 5.11 we observe two trends: the relative difference increases with redshift in both sets, while in the case of linear interpolation a significant increase is also observed for small redshifts.

The first trend is clearly explained by the fact that the distances compared were 3 to 4 orders of magnitude larger than the precision (0.1 Mpc) of CosmoCalc (Table 5.2): the higher the redshift/distance, the larger the error in the relative difference.

The second trend is explained by the fact that linear interpolation always underestimates the comoving distance (explained in detail in §4.8.3), especially in the steep region close to z = 0 (Figure 5.10). This result justifies the inclusion of quadratic interpolation in DistanceCalc.

DistanceCalc vs DEUS catalog

The RMSE found is three to four orders of magnitude less than the distances studied.

Our calculator was asked to work at a tolerance level of 10⁻⁶. The discrepancy cannot be explained by the accuracy of the comoving distances in the DEUSS catalog, but rather by the accuracy of the redshifts, which was 10⁻⁵.

As we mentioned in the description of the experiment, the DEUSS catalog presented many objects with the same redshift but different distances. When we averaged the distances to create the z–d_C interpolation, the «round–off» error in the catalog propagated to the RMSE.

To confirm this, we looked at the catalog again and found that

• the 3110107 objects in the ΛCDM catalog are classified into 60426 redshift groups

• the groups with a single member were only 1850: the coincidence of redshifts was the rule, with minor exceptions

• groups with up to 142 objects were found (the histogram in Figure 5.13 gives the full picture)

• to estimate the propagated error, we took each one of the 58576 groups with more than one object and computed the in–group standard deviations of the distance

• the histogram of the SDs (Figure 5.14) shows that the standard error in assigning a distance to a redshift from the DEUS catalog was typically about 1 Mpc, very close to the RMSE we calculated.


[Histogram: number of redshift groups versus number of objects in a redshift group.]

Figure 5.13 Frequency of redshift groups by the number of objects they contain.

[Histogram: number of redshift groups versus the SD of the comoving distance in a redshift group (Mpc).]

Figure 5.14 Histogram of the standard deviations of the comoving distances of objects in the same redshift groups.

5.4 DEUSS MSTs

The results from the Kolmogorov–Smirnov tests are listed in Table 5.4.

Note that D_{n1,n2} is the test statistic for samples of lengths n1 and n2 with cumulative probability functions F and G respectively:

\[
D_{n_1,n_2} = \sqrt{\frac{n_1 n_2}{n_1 + n_2}}\ \sup_{x}\left|F(x) - G(x)\right|
\]

Samples                 Outcome   p–value   D_{n1,n2}
LCDM-1,   RPCDM-1       H1        0.000     0.0274
LCDM-3,   RPCDM-3       H1        0.000     0.0517
LCDM-5,   RPCDM-5       H0        0.543     0.0370
LCDM-1-5, RPCDM-1-5     H1        0.000     0.0259

Table 5.4 Two–sample Kolmogorov–Smirnov tests between the LCDM and RPCDM catalogs of the same M_FoF thresholds. The zero p–values were less than 10⁻⁶.
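For reference, a minimal sketch of the statistic D_{n1,n2} defined above, computed from two samples of edge lengths (illustrative only, not the routine used to produce Table 5.4):

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Two-sample Kolmogorov-Smirnov statistic: the supremum distance between the
// two empirical CDFs, scaled by sqrt(n1*n2/(n1+n2)) as in the definition above.
double ksStatistic(std::vector<double> a, std::vector<double> b)
{
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    const std::size_t n1 = a.size(), n2 = b.size();
    std::size_t i = 0, j = 0;
    double supDiff = 0.0;
    while (i < n1 && j < n2) {
        const double x = std::min(a[i], b[j]);
        while (i < n1 && a[i] <= x) ++i;   // advance past x (and ties) in sample a
        while (j < n2 && b[j] <= x) ++j;   // advance past x (and ties) in sample b
        const double F = static_cast<double>(i) / static_cast<double>(n1);
        const double G = static_cast<double>(j) / static_cast<double>(n2);
        supDiff = std::max(supDiff, std::fabs(F - G));
    }
    const double scale = std::sqrt(static_cast<double>(n1) * static_cast<double>(n2)
                                   / static_cast<double>(n1 + n2));
    return scale * supDiff;
}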


[Figure with two panels: (a) LCDM-3, (b) RPCDM-3.]

Figure 5.15 The minimum spanning trees of the LCDM-3 and RPCDM-3 catalogs for M_FoF ≥ 3×10¹⁴ M⊙


[Histogram: empirical probability density versus normalized edge length l/⟨l⟩ for LCDM and RPCDM, M_FoF ≥ 10¹⁴ M⊙.]

Figure 5.16 Distribution of edge lengths for M_FoF ≥ 10¹⁴ M⊙

Statistic                  LCDM-1             RPCDM-1
Graph attributes
# of vertices              198425             62649
Total length (h⁻¹Mpc)      4.61464×10⁶        2.17967×10⁶
Diameter (h⁻¹Mpc)          3309.64            3234.32
Compactness                0.999283           0.998516
Edge statistics
Minimum (h⁻¹Mpc)           0.132              0.444
Maximum (h⁻¹Mpc)           88.328             123.170
Mean (h⁻¹Mpc)              23.257 ± 0.026     34.792 ± 0.081
Median (h⁻¹Mpc)            22.347 ± 0.041     34.201 ± 0.118
Variance (h⁻²Mpc²)         168.981 ± 0.418    372.222 ± 1.700
St. dev. (h⁻¹Mpc)          12.999 ± 0.016     19.293 ± 0.044
Skewness                   0.417 ± 0.005      0.299 ± 0.009
Ex. kurtosis               −0.440 ± 0.016     −0.504 ± 0.020

Table 5.5 Comparison of the MST statistics for the LCDM-1 and RPCDM-1 catalogs


[Histogram: empirical probability density versus normalized edge length l/⟨l⟩ for LCDM and RPCDM, M_FoF ≥ 3×10¹⁴ M⊙.]

Figure 5.17 Distribution of edge lengths for M_FoF ≥ 3×10¹⁴ M⊙

Statistic                  LCDM-3             RPCDM-3
Graph attributes
# of vertices              17882              3215
Total length (h⁻¹Mpc)      1.02919×10⁶        0.330359×10⁶
Diameter (h⁻¹Mpc)          3289.93            3202.09
Compactness                0.996803           0.990307
Edge statistics
Minimum (h⁻¹Mpc)           1.228              2.066
Maximum (h⁻¹Mpc)           173.284            346.437
Mean (h⁻¹Mpc)              57.558 ± 0.215     102.787 ± 0.846
Median (h⁻¹Mpc)            57.855 ± 0.303     101.900 ± 0.944
Variance (h⁻²Mpc²)         817.203 ± 7.605    2423.550 ± 63.131
St. dev. (h⁻¹Mpc)          28.587 ± 0.133     49.230 ± 0.642
Skewness                   0.159 ± 0.014      0.310 ± 0.047
Ex. kurtosis               −0.430 ± 0.031     0.232 ± 0.160

Table 5.6 Comparison of the MST statistics for the LCDM-3 and RPCDM-3 catalogs


[Histogram: empirical probability density versus normalized edge length l/⟨l⟩ for LCDM and RPCDM, M_FoF ≥ 5×10¹⁴ M⊙.]

Figure 5.18 Distribution of edge lengths for M_FoF ≥ 5×10¹⁴ M⊙

Statistic                  LCDM-5             RPCDM-5
Graph attributes
# of vertices              4071               522
Total length (h⁻¹Mpc)      0.400533×10⁶       0.101034×10⁶
Diameter (h⁻¹Mpc)          3283.46            3183.05
Compactness                0.991802           0.968495
Edge statistics
Minimum (h⁻¹Mpc)           1.749              7.716
Maximum (h⁻¹Mpc)           291.903            457.675
Mean (h⁻¹Mpc)              98.411 ± 0.692     193.923 ± 3.634
Median (h⁻¹Mpc)            97.254 ± 0.861     192.799 ± 3.788
Variance (h⁻²Mpc²)         2083.160 ± 48.318  6899.760 ± 429.721
St. dev. (h⁻¹Mpc)          45.642 ± 0.529     83.065 ± 2.588
Skewness                   0.278 ± 0.039      0.237 ± 0.092
Ex. kurtosis               0.164 ± 0.109      0.034 ± 0.160

Table 5.7 Comparison of the MST statistics for the LCDM-5 and RPCDM-5 catalogs


[Histogram: empirical probability density versus normalized edge length l/⟨l⟩ for LCDM and RPCDM, 10¹⁴ M⊙ < M_FoF < 5×10¹⁴ M⊙.]

Figure 5.19 Distribution of edge lengths for 10¹⁴ M⊙ < M_FoF < 5×10¹⁴ M⊙

Statistic                  LCDM-1-5           RPCDM-1-5
Graph attributes
# of vertices              194354             62127
Total length (h⁻¹Mpc)      4.58341×10⁶        2.17383×10⁶
Diameter (h⁻¹Mpc)          3309.64            3234.32
Compactness                0.999278           0.998512
Edge statistics
Minimum (h⁻¹Mpc)           0.132              0.444
Maximum (h⁻¹Mpc)           88.328             123.170
Mean (h⁻¹Mpc)              23.583 ± 0.025     34.991 ± 0.070
Median (h⁻¹Mpc)            22.777 ± 0.043     34.470 ± 0.126
Variance (h⁻²Mpc²)         169.523 ± 0.468    371.857 ± 1.704
St. dev. (h⁻¹Mpc)          13.020 ± 0.018     19.284 ± 0.044
Skewness                   0.395 ± 0.005      0.290 ± 0.008
Ex. kurtosis               0.447 ± 0.014      −0.505 ± 0.023

Table 5.8 Comparison of the MST statistics for the LCDM-1-5 and RPCDM-1-5 catalogs



A new idea must not be judged by its immediate results.

Nikola Tesla, “My Inventions: V. The Magnifying Transmitter”

6

Conclusions

Outline of chapter

6.1 Conclusions
6.2 Future work


In this chapter we will tie together the main points of the thesis and the discussion sections of the numerical experiments (§5.4.3).

Additionally, we will briefly review the software from a user’s point of view and express our expectations for future work in correcting and expanding the library.

6.1 Conclusions

We repeat the objectives stated in the Introduction (§1.2) and comment on how they were met.

Development of a library for finding MSTs and applying operators and statistics on them.

MoravaPack, as a C++ library conforming to the 1998 standard of the language, presents high compatibility with compilers and operating systems while offering

• flexible representations of graph theoretical objects

• two MST algorithms and another two operators

• optimizations and automation of common operations

• statistical, random number, timing and other tools.

While it succeeded in providing the intended functionality, it also presents itself as a complete, independent framework for 3D minimum spanning trees in the making.

Creation of an add–on enabling the application to cosmological data sets.

Because of the limited time frame of the study, the application to cosmological data was not as broad as we wished. Thus, only two main features were implemented: (i) an I/O add-on for DEUSS data import, manipulation and conversion to graphs, and (ii) a cosmic distance calculator to be used for catalog completion and verification, Monte Carlo simulations, statistics, etc.

The application of the add-ons to the DEUSS catalogs proved their functionality while showing the way for future «wrappers» and «plug-ins» to expand the library.

Development of a multi–platform graphical user interface.

The MoravaGUI application is a C++ program using most of the features of MoravaPack. Designed with the aid of the Qt API for multi–platform applications, it is able to run on Linux, Windows or OS X.

The GUI proved to be very useful in debugging MoravaPack interactively and quickly. As a client program of the library, it inspired additional algorithms and class methods. It also aided in conducting numerical experiments and creating figures for this document.

Conduct of numerical experiments to verify the implementation and the expected time complexity of crucial or newly proposed algorithms.

1. The numerical experiments in §5.1 indicate that

• Kruskal’s algorithm works at the expected O(N² log N) time complexity

• Prim’s version for complete graphs presented the O(N²) complexity (also proven using the «accounting method»)

• the Delaunay triangulation successfully optimized the speed of finding the MST via Kruskal’s algorithm, down to O(N log N).

Extrapolating from the fits, as inaccurate as that may be, we predict that the MST of 10⁶ points, on the personal computer used for the experiments, would need about 4 days using CompleteKruskal, 11 hours via CompletePrim, and a few minutes using the DT optimization!

2. DistanceCalc was found to be consistent with CosmoCalc and the DEUS catalogs (§5.3) at the level of precision of the latter. Simultaneously, we demonstrated the accuracy and effectiveness of the integrate once & interpolate strategy for big data.

3. The MSTs of the 8 subcatalogs from the DEUS simulation were produced and plotted with MoravaGUI (§5.4). Subsequent distribution equality tests (Kolmogorov–Smirnov) led us to the following conclusions about the MST technique:

• it is very effective at distinguishing large–scale structural differences in cosmological data sets,

6.2 Future work

6.2.1 MoravaPack

1. More MST algorithms should be implemented. E.g. the original Prim’s algorithm should be available (currently only complete graphs are supported). Also, a choice of data structure should be added, to match the user’s needs and the availability of computational resources.

2. Parallelization of the library where possible and efficient, using multiple cores or GPUs.

3. Our implementation of Kruskal’s algorithm presents poor space complexity. Better memory management or adaptive data structure selection should be added.

4. More statistics on graphs could be offered. E.g. distributions of branch lengths, vertex degrees, angles and direction of edges, etc.

5. Ability to import and export to known formats like comma–separated values (CSV) or spreadsheet files from Office suites.

6.2.2 MoravaGUI

1. OpenGL acceleration for faster 3D plots.

2. Manipulation of the various elements of the plots, histograms etc.

3. Interactiveness: e.g. selection of trees in plotted forests or branches in trees.

4. Better help system.


The best programs are written so that computing machines can perform them quickly and so that human beings can understand them clearly. A programmer is ideally an essayist who works with traditional aesthetic and literary forms as well as mathematical concepts, to communicate the way that an algorithm works and to convince a reader that the results will be correct.

Donald Ervin Knuth, “Selected Papers on Computer Science”

A

The Morava Pack


A.1.2 TetGen

It is a library for finding the Delaunay Triangulation (DT) of a set of points in 3D Euclidean space. Its ability to work as a header file included in our code, instead of a precompiled library, was crucial in its selection. Besides that, it is one of the fastest such libraries.

A.2 Morava files

Morava files (proposed extension: .mrv) can store geometric graphs (weighted or not), trees and forests. Each graph, tree or forest can be fully represented by a set of vertices, edges and the weights.

A header with two numbers could indicate how many of the following lines define the vertices and how many define the edges. The edges can track their endpoints by their indices in the vertex list. The weight of each edge follows.

Between the two numbers, we add another one, defining the dimension of the space where the vertices reside. In this way, an input routine knows how many coordinates there are to read. The inclusion of the dimension in the format was chosen for possible future development of MoravaPack into an N-D library and for other types of graphs like HaloGraph, where D = 5 (see §B.3).

While this arrangement lets us represent a forest of trees as a combined collection of vertices and edges, the information of which vertex or edge belongs to the i-th tree is lost. To accommodate this extra information, a Morava file can begin with a line containing only one number: the number of graph objects stored.

Then, reading the file is easy in all programming languages or mathematical suites like Mathematica or Matlab:

1. Read the first number N = how many graph objects are stored

2. Repeat N times:

(a) Read the next three numbers: V, D, E

(b) Repeat V times:

• Read D numbers, x1, x2, …, xD

• Store the vertex vk = (x1, x2, …, xD)

(c) Repeat E times:
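A minimal sketch of this procedure in C++, for a hypothetical file example.mrv; the layout of each edge line (two vertex indices followed by the edge weight) is inferred from the format description above rather than taken from MoravaPack itself:

#include <cstddef>
#include <fstream>
#include <iostream>
#include <vector>

struct MrvEdge { std::size_t a, b; double weight; };

int main()
{
    std::ifstream in("example.mrv");
    std::size_t nGraphs = 0;
    in >> nGraphs;                               // 1. number of graph objects
    for (std::size_t g = 0; g < nGraphs; ++g) {  // 2. repeat N times
        std::size_t V, D, E;
        in >> V >> D >> E;                       // (a) vertex count, dimension, edge count
        std::vector<std::vector<double> > vertices(V, std::vector<double>(D));
        for (std::size_t k = 0; k < V; ++k)      // (b) read V vertices of D coordinates
            for (std::size_t d = 0; d < D; ++d)
                in >> vertices[k][d];
        std::vector<MrvEdge> edges(E);
        for (std::size_t e = 0; e < E; ++e)      // (c) read E edges: endpoint indices + weight
            in >> edges[e].a >> edges[e].b >> edges[e].weight;
        std::cout << "graph " << g << ": " << V << " vertices, "
                  << E << " edges" << std::endl;
    }
    return 0;
}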
