UNIVERSIDADE DE LISBOA
FACULDADE DE CIÊNCIAS
DEPARTAMENTO DE FÍSICA

Testing the cosmological principle with AI reconstruction methods

Maria Alexandra Martins Gonçalves

Master's Degree in Physics

Specialization in Astrophysics and Cosmology

Dissertation supervised by:

António da Silva

José Pedro Mimoso


To two great women of my life, Ti Bei and Ti Seixa, my grandmothers


Resumo

Keywords: Cosmology, Cosmological Principle, Artificial Intelligence, Genetic Programming, Lemaître–Tolman–Bondi models

The idea that we live in a homogeneous and isotropic Universe is a hypothesis called the cosmological principle (CP)¹. The CP is at the heart of modern cosmology. Combined with the Einstein field equations, it leads to the popular homogeneous and isotropic Friedmann–Lemaître–Robertson–Walker (FLRW) models, which have become the current baseline paradigm for analysing and predicting large-scale cosmological observations. Although it has been the subject of several criticisms, the CP is not usually considered a cause of the cosmological tensions that arise in the context of FLRW models. Nevertheless, this principle fully deserves our attention, since we may be basing almost all of cosmology on a hypothesis that may not be valid at cosmological scales.

To test the CP, one can consider inhomogeneous models. In particular, in this thesis we consider the Lemaître–Tolman–Bondi (LTB) cosmological models. LTB models are isotropic, spherically symmetric and inhomogeneous, and they are usually studied as an alternative to dark energy. In general, LTB models posit that we live inside a region of low matter density, and that this is the cause of the accelerated expansion we observe today. Note that, according to LTB models, the accelerated expansion of the Universe could also arise if we lived in a region of high matter density; however, observations suggest that the most viable LTB models are those that predict low-density inhomogeneities.

Although LTB models are an interesting alternative to FLRW models, some of them are not in full agreement with large-scale observations, which led to the creation of the ΛLTB models. These models are entirely analogous to LTB models, except that they consider an additional component of the Universe, dark energy. However, the quality of current cosmological data is not sufficient to distinguish between FLRW and ΛLTB models. On the other hand, there is a wide variety of LTB models that have not yet been tested against observational data. In particular, in this thesis we focus on two types of LTB models, the Clifton-Ferreira-Land (CFL) models and the Garcia-Bellido-Haugbølle (GBH) models, and on one type of FLRW model, called ΛCDM, which is known as the standard model of cosmology.

As the precision of cosmological data improves, we will be able to place better constraints on both FLRW and LTB models and possibly rule out at least one of the two. However, we still need to understand the best way to analyse cosmological data. Traditionally, cosmology uses parametric methods that fit the data with functions that depend on the model under study; for this reason, our results are naturally biased by the model we use to analyse them. This approach obviously limits scientific knowledge and may be a possible cause of the tensions present in FLRW models, since different data from different cosmological probes can yield different parameters for the same function.

¹Note that in this thesis we use the English-language acronyms.

One of the main goals of this thesis is to understand the best way to analyse cosmological data. Parametric methods are dependent on the model under study, because they fit the parameters of functions already known from those same models. However, as large volumes of high-precision data and computational power increase, one should consider other methods, called non-parametric, which allow the reconstruction of cosmological functions without assuming a priori the form of those functions.

Unlike parametric methods, non-parametric methods are completely model independent and therefore do not bias the results. Instead of giving us the best parameters for a given cosmological function, these methods give us the function that best describes the data together with its parameters. This makes non-parametric methods seem like the best approach for analysing cosmological data. Even so, we need to understand whether or not these methods are already a better alternative to parametric methods given the current precision of the data.

In this work, we use a well-known parametric method that consists of combining Markov Chain Monte Carlo (MCMC) methods with Bayesian statistics. MCMC methods are used to draw samples from a probability distribution, while Bayesian statistics is a theory for interpreting observed data. Combining these two topics, we have a powerful parametric method capable of constraining the parameters of any cosmological function in the context of a given model.

Furthermore, to better understand non-parametric methods and whether they can currently replace parametric methods, we use artificial intelligence algorithms to reconstruct cosmological functions, using both simulated and real data, applied to FLRW and LTB models. More specifically, the artificial intelligence (AI) algorithm we use is known as genetic programming (GP).

The GP algorithm was inspired by natural selection and solves symbolic regression problems. Symbolic regression is a machine learning technique used to find the mathematical expression that best describes a relationship between quantities in a dataset. In this way, we do not need to choose a model beforehand; instead, we let the data tell us which model best describes it. The main mechanism of GP consists of a population in which each individual is a possible candidate solution to the problem. The individuals evolve over several generations through operations characteristic of these algorithms, such as mutation and crossover between two individuals.

In this thesis, besides testing ways of distinguishing FLRW models from LTB models, we begin by trying to understand whether non-parametric methods can currently be a good alternative to parametric methods. To that end, we first propose to compare parametric and non-parametric methods, in order to see whether the latter have comparable precision and can therefore be an alternative to the traditional methods. Following this study, we conclude that non-parametric methods provide results analogous to those of parametric methods in the reconstruction of several cosmological functions.

The first step of this study was to try to understand the stochastic nature of GP. We do this by trying to replicate the results obtained by another GP code for the same cosmological dataset. As expected, we were unable to replicate exactly the same function using different codes. Note that different results do not necessarily mean bad results; instead, they show us how stochastic a GP algorithm can be. It is important to stress that this happens not only when we use different GP codes, but also when we run the same code several times.

The second step was to understand to what extent the noise level in cosmological data can be a problem for the reconstruction of cosmological functions. To that end, we applied several noise levels and tried to reconstruct the same theoretical function. This is of great importance because cosmological data always comes with noise. In this way, we were able to conclude that the current precision of cosmological data is not sufficient to exactly reconstruct the theoretical functions used.

Finally, we propose a redshift drift observation that will help distinguish FLRW models from LTB models and may rule out one of the two. This will provide an observational test of the CP and will be an important step towards making the CP more than an untested principle.


Abstract

Keywords: Cosmology, Cosmological Principle, Artificial Intelligence, Genetic Programming, Lemaître–Tolman–Bondi models

The cosmological principle (CP) is at the heart of modern cosmology. Combined with the Einstein field equations, it leads to the popular homogeneous and isotropic Friedmann–Lemaître–Robertson–Walker (FLRW) models that have become the present baseline paradigm to analyse and predict large-scale cosmological observations. Despite having been the subject of several criticisms, the CP remains a fundamental and untested assumption, and it may be one of the possible causes of the cosmological tensions that arise in the context of FLRW models.

In order to test the CP, one can consider inhomogeneous models. In particular, in this thesis, we consider the Lemaître–Tolman–Bondi (LTB) cosmological models. Advances in the quality of cosmological data will allow us to place better constraints on both FLRW and LTB models.

One of the main goals of this thesis is to understand different ways of analysing cosmological data. The traditional methods used in cosmology are the so-called parametric methods; however, as the quality and volume of data and the computational power increase, one should consider whether non-parametric methods can be a better alternative. Unlike parametric methods, non-parametric methods do not assume a priori functions, based on a theory model, to fit the data and, therefore, they have the advantage of being theory independent. However, one needs to understand whether these methods can already be considered a reliable alternative to the parametric methodology, given the quality of present cosmological datasets.

We address this issue in this thesis using artificial intelligence (AI) / genetic programming (GP) algorithms in the context of FLRW and LTB mock data. We assess the quality of GP reconstructions of cosmological functions for different mock-data noise levels, apply our GP implementation to real data, and compare our findings with the results of the traditional parametric methods.


Acknowledgements

First of all, I would like to thank my two supervisors, António da Silva and José Pedro Mimoso. I thank them for all their work and dedication over this past year; they were both excellent mentors who taught me a great deal. We shared many good moments and, in particular, I will dearly miss our meetings, in which, besides learning a lot of cosmology, I learned many jokes.

Besides my supervisors, I could not fail to thank my family and friends. In particular, I want to thank my parents for all their support, and my sister, who had and still has someone in her care, for being a Bibi d'Oiro. Among my friends, I especially thank those who shared my pains, my Junkie$: Larita, Constancinha, Duartito and Eduardo.

I could not fail to thank and dedicate this thesis to my grandmothers who, although they never understood what I was doing, always believed I would be capable. Wherever they may be, they will always be present.


Index

List of Figures

List of Tables

1 Introduction

2 Theoretical Introduction
2.1 Friedmann–Lemaître–Robertson–Walker models
2.1.1 ΛCDM model
2.2 Lemaître–Tolman–Bondi models
2.2.1 The LTB solution
2.2.1.1 Propagation of light in LTB models
2.2.1.2 Shear Estimator
2.2.2 ΛLTB
2.2.3 Clifton-Ferreira-Land model
2.2.4 Garcia-Bellido-Haugbølle model
2.3 Redshift Drift
2.3.1 Redshift Drift in Friedmann–Lemaître–Robertson–Walker models
2.3.2 Redshift Drift in Lemaître–Tolman–Bondi models

3 Methodology
3.1 Parametric methods
3.1.1 Bayes' Theorem
3.1.2 Markov Chain Monte Carlo methods
3.1.3 LMFIT and emcee
3.2 Non-parametric methods
3.2.1 Genetic Programming
3.2.2 Error Analysis
3.2.3 gplearn
3.3 Bubble
3.4 How Does Our Code Work?

4 Results
4.1 Parametric vs Non-Parametric Methods
4.2 Stochastic Nature of Genetic Programming
4.2.1 Hubble Parameter
4.2.2 Om(z) statistic
4.3 Noise Levels of Data
4.3.1 Luminosity Distance
4.3.2 Shear Estimator
4.4 Redshift Drift: Distinguishing Between ΛCDM and LTB Models
4.4.1 ΛCDM model
4.4.2 Lemaître–Tolman–Bondi models
4.4.2.1 Clifton-Ferreira-Land model
4.4.2.2 Garcia-Bellido-Haugbølle model

5 Conclusion

Appendices
A Hubble Parameter Code

List of Figures

2.1 Spatial distribution of energy density of the three void curvature profiles.
2.2 The matter content, Ωm(r), of our fiducial GBH model as a function of the radius, r.
3.1 Syntax tree of the shear estimator best-fit function.
3.2 Crossover operation represented by syntax trees.
3.3 Mutation operation represented by syntax trees.
4.1 The posterior probability distribution of the ΛCDM cosmological parameters.
4.2 The distance modulus, µ, as a function of redshift, z.
4.3 The Hubble parameter, H(z), as a function of redshift, z.
4.4 The Om(z) statistic as a function of redshift, z.
4.5 The luminosity distance, dL, GP reconstruction as a function of redshift, z.
4.6 GP reconstruction of the DESI mock data for the angular diameter distance errors.
4.7 The luminosity distance, dL, GP reconstruction as a function of redshift, z, with noisy data.
4.8 The shear estimator, Σs, GP reconstruction as a function of redshift, z.
4.9 GP reconstruction of the Hubble parameter errors of Table 4.1.
4.10 The shear estimator, Σs, GP reconstruction as a function of redshift, z, with noisy data.
4.11 The velocity shift for the ΛCDM model after a period of 10 years.
4.12 The redshift drift per year for the fiducial CFL and the eight toy models.
4.13 The redshift drift per year for the fiducial CFL model and the ΛCDM model.
4.14 The velocity shift for the ΛCDM model after a period of 10 years.
4.15 The redshift drift per year for the fiducial GBH and the eight toy models.
4.16 The redshift drift per year for the fiducial GBH model and the ΛCDM model.
4.17 The velocity shift for the ΛCDM model after a period of 10 years.


List of Tables

4.1 Dataset used for the reconstruction of the Hubble parameter.
4.2 The coefficient of determination, R², the standard deviations, σdA(z), and the luminosity distance function, dL(z), for each run of the GP algorithm.
4.3 The coefficient of determination, R², the standard deviations, σH, and the shear estimator, Σs, for each run of the GP algorithm.
4.4 Values for the three initial parameters of the CFL models in study.
4.5 Values for the four initial parameters of the GBH models in study.


Chapter 1

Introduction

In cosmology, the most used models are derived from the Friedmann–Lemaître–Robertson–Walker (FLRW) metric. These models describe a homogeneous and isotropic Universe in expansion. It was through FLRW models and observations of the accelerated expansion of the Universe that one concluded that there is a dark energy component in the Universe. However, the accelerated expansion that we observe today may not necessarily be due to dark energy if we consider the possibility of inhomogeneous models.

The idea that we live in a homogeneous and isotropic Universe is a hypothesis called the cosmological principle (CP). This principle has become one of the biggest assumptions of cosmology, and it is rarely considered as a possible cause of the cosmological tensions that arise from FLRW models. It is a generalization of the Copernican principle, which states that we do not live in a privileged position in the Universe. However, although the CP is not compatible with an inhomogeneous Universe, the Copernican principle can be [1]. So, in order to interpret tests of the CP, we need to consider inhomogeneous models.

Inhomogeneous models have been studied as an alternative to dark energy [2]. Dark energy is supposed to be the most abundant component of the Universe, and it was "discovered" as a consequence of imposing FLRW models on cosmological data. However, this mysterious component has never been observed directly and its physics remains unknown. So, it leaves us wondering whether it in fact exists, or whether we are just not using the right physical assumptions and modelling approaches to analyse the cosmological data.

The most common inhomogeneous models used as an alternative to FLRW models are the Lemaître–Tolman–Bondi (LTB) models [3–5]. These models opened up the possibility that we live in a large void, which mimics the Universe's accelerated expansion without the need for dark energy.

Nevertheless, these models were not in total agreement with some large-scale observations [6], and this led to a new type of models, the ΛLTB models [7], which are LTB models but with a dark energy component. An important feature of LTB/ΛLTB models is that they can explain, for example, the Hubble tension [8]. This tension arises in ΛCDM FLRW models when different cosmological datasets, namely Supernovae Type Ia (SN Ia) and CMB anisotropy observations, provide statistically inconsistent values for the Hubble constant if the datasets are treated separately. In the realm of LTB/ΛLTB models, the mean local density changes with the radial coordinate of the large-scale inhomogeneity, inducing different radial and transverse expansion rates. In other words, tensions may be solved if the Universe is inhomogeneous at the low redshifts where SN Ia are observed. This idea has been investigated in several studies based on the KBC Void model, the ΛLTB model, and extensions of the FLRW models considering large inhomogeneities [8–13]. The emerging picture is that the Hubble tension is hard to avoid unless the large-scale inhomogeneity is confined to low redshifts (z < 0.15 for ΛLTB), whereas popular dark energy FLRW-based alternatives still fail to alleviate the tensions. Moreover, these attempts have only probed a limited range of LTB models. Future observations may also help validate or disprove the explanation of the Hubble tension via inhomogeneous models.

However, the quality of cosmological data today is not sufficient to distinguish ΛLTB from FLRW models. On the other hand, the study of LTB models is still justified given the present state of observations and the wide range of possible LTB models. The advent of a new generation of galaxy surveys, such as Euclid [14], DESI [15] and Pan-STARRS [16], will allow us to further constrain ΛLTB and LTB inhomogeneous models.

Since cosmological data is becoming more precise, we must rethink the approaches we use to analyse it. One approach is to use the traditional methods, known as parametric methods. These methods fit cosmological functions to data. For that reason, this approach is model dependent, because the cosmological functions always depend on the model. We can always make a model fit the data by adjusting its parameters. Thus, if we can do this for several models, and each model has different parameters and different results, how can we know which is the right model to describe the data? Of course, this is a limitation of these methods, since the constraints we obtain for the parameters are only valid for the class of models represented by that function, which biases our results and limits the progress of scientific knowledge.

Luckily, there are alternative methods, known as non-parametric methods. These methods offer an agnostic approach, i.e., they do not require model assumptions to analyse cosmological data. They do not give us the parameters for a given cosmological function; instead, they give us the function that best describes the data along with its parameters.

One of the non-parametric methods is symbolic regression. Symbolic regression is a machine learning technique used to find the mathematical expression that best describes a relationship in the data. In this way, we do not need to know a model a priori; instead, we let the data tell us the model that best describes it. This method can be implemented through genetic programming (GP).

One of the main goals of this thesis is to understand whether non-parametric methods can be a good alternative to parametric methods. On the one hand, we know that parametric methods can easily lead to tensions when using data from different surveys, while non-parametric methods can solve this problem since they do not require a model to fit the data. On the other hand, non-parametric methods have their limitations: for example, in order to have really precise reconstructions we need very precise data. Here, we want to investigate how precise the data should be to obtain a good reconstruction. Do we already have access to data with such precision?

In addition, by developing artificial intelligence (AI) reconstruction methods, we intend to look for observational signatures of LTB models in order to confront them with mock data. To implement the AI algorithms we use Python (version 3.10.4); in particular, we use a Python package called gplearn (version 0.4.2). This package implements genetic programming (GP) in Python in order to solve symbolic regression problems.

This thesis is organized as follows. In Chapter 2, we give a theoretical introduction to FLRW, LTB and ΛLTB models. In Chapter 3, we present the methodology of this work, where we introduce parametric and non-parametric methods, along with details about the codes used. In Chapter 4, we present the results. Finally, in Chapter 5, we present the conclusions of this work.


Chapter 2

Theoretical Introduction

Cosmology, like many other branches of science, has evolved so much during the past decades that we now have access to a large number of models designed to describe the Universe. Usually, FLRW models are the most used in cosmology; however, alternative models started to be considered out of the need to address the flaws of FLRW models. Among them are LTB models.

In this Chapter, there is a review of FLRW models, in particular the ΛCDM model, which is the standard model of cosmology. In addition, there is also a review of LTB models, with a special emphasis on the ΛLTB model, the Clifton-Ferreira-Land (CFL) model and the Garcia-Bellido-Haugbølle (GBH) model. Moreover, there is a theoretical introduction to the redshift drift, for both FLRW and LTB models.

2.1 Friedmann–Lemaître–Robertson–Walker models

FLRW models are in agreement with the CP. For that reason, FLRW models state that the Universe is homogeneous and isotropic at large scales (>100 Mpc). In addition, these models also describe an expanding Universe. The metric of such models is given by

$$ds^2 = -dt^2 + a(t)^2\left[\frac{dr^2}{1-kr^2} + r^2 d\Omega^2\right], \qquad (2.1)$$

where $a(t)$ is the scale factor, $k$ gives us the curvature of space ($k=1$: closed, $k=-1$: open, $k=0$: flat) and $d\Omega^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$.

Using equation (2.1) and the stress-energy tensor in the form of a perfect fluid,

$$T_{\mu\nu} = (\rho + p)\,u_\mu u_\nu + p\, g_{\mu\nu}, \qquad (2.2)$$

where $\rho$ and $p$ are the density and the pressure of the fluid, respectively, and $u_\mu$ is the four-velocity of the fluid, we can apply it to Einstein's equations, which are given by

$$G_{\mu\nu} + g_{\mu\nu}\Lambda = 8\pi G\, T_{\mu\nu}. \qquad (2.3)$$

By doing so, we get a set of equations known as the Friedmann equations:

$$\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\rho - \frac{k}{a^2} + \frac{\Lambda}{3}, \qquad (2.4)$$

$$\frac{\ddot a}{a} = -\frac{4\pi G}{3}(\rho + 3p) + \frac{\Lambda}{3}, \qquad (2.5)$$


where $\dot a/a$ is the Hubble parameter, $H$, and $\Lambda$ is known as the cosmological constant. These equations describe the evolution of the background of the Universe.

From the conservation of the stress-energy tensor for a perfect fluid,

$$\nabla_\mu T^{\mu\nu} = 0, \qquad (2.6)$$

we can obtain the continuity equation for any fluid that constitutes the Universe:

$$\dot\rho + 3H(\rho + p) = 0. \qquad (2.7)$$

Moreover, as the equation of state relates the pressure and the density of a fluid, we can also define an equation of state for those fluids:

$$p = w\rho. \qquad (2.8)$$

In the case of matter $w_m = 0$, and in the case of radiation $w_r = 1/3$.

2.1.1 ΛCDM model

The ΛCDM model stems from the FLRW metric and is the most widely accepted model to describe the accelerated expansion of the Universe, where the acceleration is modelled with the cosmological constant. The cosmological constant, Λ, corresponds to a dark fluid that we know as dark energy. CDM stands for Cold Dark Matter, which is "cold" because of its small velocity dispersion when compared to the speed of light. Being "cold" allowed these particles to accrete and form structure in the early Universe. If, otherwise, they were "hot", it would be more difficult to form structure because of their high pressures [17].

As dark energy is characterized as a fluid, it obeys an equation of state, which relates its pressure and density. In the case of the ΛCDM model, one gets the following form:

$$w = \frac{p}{\rho} \;\Rightarrow\; w_\Lambda = \frac{p_\Lambda}{\rho_\Lambda} \;\Rightarrow\; w_\Lambda = -1. \qquad (2.9)$$

When talking about a fluid component of the Universe, it is useful to define a density parameter

$$\Omega_i = \frac{\rho_i}{\rho_c}, \qquad (2.10)$$

where $\rho_i$ is the density of a given fluid and $\rho_c$ is the critical density of the Universe. By taking the last expression into account, we can rewrite Eq. (2.4) as

$$H^2(a) \equiv \left(\frac{\dot a}{a}\right)^2 = H_0^2\left[\Omega_{m,0}\,a^{-3} + \Omega_{r,0}\,a^{-4} + \Omega_{k,0}\,a^{-2} + \Omega_{\Lambda,0}\right], \qquad (2.11)$$

where $H_0$ is the value of the Hubble parameter today, all the density parameters are evaluated at the present time, and their subscripts stand for matter, radiation, curvature and dark energy, respectively. The Friedmann equation is usually presented in this way for the ΛCDM model.
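As a quick numerical illustration of Eq. (2.11), the sketch below evaluates $H(z)$ for a flat ΛCDM model; this is a minimal example written for this section (not code from the thesis), and the parameter values are illustrative placeholders.

```python
import numpy as np

def hubble_lcdm(z, H0=67.5, Om=0.315, Or=0.0, Ok=0.0):
    """Eq. (2.11) rewritten with a = 1/(1+z); H0 in km/s/Mpc."""
    OL = 1.0 - Om - Or - Ok            # closure: density parameters sum to 1
    a = 1.0 / (1.0 + z)
    return H0 * np.sqrt(Om * a**-3 + Or * a**-4 + Ok * a**-2 + OL)

# Example: H(z) at a few redshifts for an illustrative flat LCDM model
print(hubble_lcdm(np.array([0.0, 0.5, 1.0, 2.0])))
```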

Although ΛCDM is the most widely accepted model, there are some problems associated with it. One of the biggest is the cosmological constant problem: the discrepancy between the observed value, $\Lambda_{\rm obs} \sim 10^{-120}\,M_{\rm PL}^4$, and the theoretical value put forward by field theory, $\Lambda_{\rm th} \sim 10^{-60}\,M_{\rm PL}^4$, is of 60 orders of magnitude [18]. There are also tensions, such as the Hubble tension and the $\sigma_8$ tension. These tensions appear when one uses different data sets, within ΛCDM, that give different values for the same parameter. In the case of the Hubble tension, the value of the Hubble parameter today differs when one uses early-time observations, such as CMB observations, and late-time observations, such as SN Ia observations [19]. Regarding the $\sigma_8$ tension, it comes from the fact that the r.m.s. fluctuation of density perturbations at the $8h^{-1}$ Mpc scale, inferred from CMB data and from large-scale structure (LSS) observations, do not agree [20]. Taking all of these problems into account may lead us to think that ΛCDM is the problem and to try alternative models, such as LTB models.

2.2 Lemaître–Tolman–Bondi models

LTB models are isotropic, spherically symmetric and inhomogeneous models. These models have been used as an alternative to dark energy to explain the late acceleration of the Universe. They suggest that the SN Ia observations of dimmer luminosities are a local effect due to the presence of inhomogeneities at small/intermediate scales.

In these types of models, both the Hubble and the density parameters depend not just on time but also on the radial coordinate. This is a big difference compared to the ΛCDM model, where those parameters depend only on time.

2.2.1 The LTB solution

The general metric that satisfies the conditions of local isotropy and inhomogeneity of LTB models is

$$ds^2 = -dt^2 + X^2(r,t)\,dr^2 + A^2(r,t)\,d\Omega^2. \qquad (2.12)$$

The functions $X(r,t)$ and $A(r,t)$ have temporal and radial dependences. From the $(0,r)$ component of the Einstein equations, one finds that [21]

$$X(r,t) = \frac{A'(r,t)}{\sqrt{1-k(r)}}, \qquad (2.13)$$

where the prime denotes the partial radial derivative and $k(r)$ is a function associated with the spatial curvature. The FLRW metric is recovered when we impose extra homogeneity conditions: $k(r) = kr^2$ and $A(r,t) = a(t)\,r$, where $k$ is the curvature constant and $a(t)$ the FLRW scale factor.

When compared to FLRW models, LTB models distinguish themselves by having two scale factors instead of one, and this, of course, implies two Hubble parameters instead of one, defined as

$$H_\parallel(r,t) = \frac{\dot A'(r,t)}{A'(r,t)} = \frac{\dot a_\parallel}{a_\parallel}, \qquad H_\perp(r,t) = \frac{\dot A(r,t)}{A(r,t)} = \frac{\dot a_\perp}{a_\perp}, \qquad (2.14)$$

where $a_\parallel$ and $a_\perp$ are the radial and angular scale factors, respectively. The dot denotes the partial derivative with respect to time.

For a spherically symmetric matter source without pressure, the energy-momentum tensor is given by

$$T^\mu_{\ \nu} = -\rho_m(r,t)\,\delta^\mu_0\,\delta^0_\nu, \qquad (2.15)$$

where $\rho_m$ is the matter density and $\delta^\mu_0 = u^\mu$ represents the components of the four-velocity of the fluid.

By applying Eqs. (2.12) and (2.15) to the Einstein field equations, $G^\mu_{\ \nu} = 8\pi G\,T^\mu_{\ \nu}$, one finds two independent differential equations [21]:

$$\frac{\dot A^2 + k(r)}{A^2} + \frac{2\dot A'\dot A + k'(r)}{A'A} = 8\pi G\,\rho_m, \qquad (2.16)$$

$$\dot A^2 + 2A\ddot A + k(r) = 0. \qquad (2.17)$$

By multiplying each term of Eq. (2.17) by $\dot A$, considering that $\dot A \neq 0$, and then integrating the same equation, we get

$$\frac{\dot A^2}{A^2} = \frac{M(r)}{A^3} - \frac{k(r)}{A^2}, \qquad (2.18)$$

where $M(r)$ is a non-negative function that can be seen as the effective matter content (see below). Substituting Eq. (2.18) into Eq. (2.16), we get

$$\frac{M'(r)}{A'A^2} = 8\pi G\,\rho_m. \qquad (2.19)$$

Combining Eqs. (2.16) and (2.17) gives

$$\frac{2}{3}\frac{\ddot A}{A} + \frac{1}{3}\frac{\ddot A'}{A'} = -\frac{4\pi G}{3}\rho_m, \qquad (2.20)$$

which is the generalized acceleration equation. Note that the notion of acceleration becomes ambiguous in the presence of inhomogeneities [22], because it depends on averaging over different directions.

Both $M(r)$ and $k(r)$ are given, taking into account the nature of the inhomogeneity, in the following way [21]:

$$M(r) = H_0^2(r)\,\Omega_m(r)\,A_0^3(r), \qquad (2.21)$$

$$k(r) = H_0^2(r)\,\big(\Omega_m(r) - 1\big)\,A_0^2(r), \qquad (2.22)$$

where $H_0(r) \equiv H_\perp(r,t_0)$ and $A_0(r) \equiv A(r,t_0)$. Also, by recognising a certain similarity between Eq. (2.18) and the Friedmann equation (Eq. (2.4)), but without considering a cosmological constant, we can define

$$H_\perp(r,t) = \frac{\dot A(r,t)}{A(r,t)}. \qquad (2.23)$$

Using the previous equations to substitute into Eq. (2.18), we find

$$H_\perp^2(r,t) = H_0^2(r)\left[\Omega_m(r)\left(\frac{A_0(r)}{A(r,t)}\right)^3 + \big(1-\Omega_m(r)\big)\left(\frac{A_0(r)}{A(r,t)}\right)^2\right]. \qquad (2.24)$$

The big difference between the Friedmann equation (Eq. (2.11)) and this one is that in the latter all quantities depend on the spatial coordinate.
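For reference, Eq. (2.24) translates directly into a one-line function. The following minimal sketch is illustrative only; the argument names (A_ratio, H0_r, Om_r) are ours.

```python
import numpy as np

def H_perp(A_ratio, H0_r, Om_r):
    """Eq. (2.24): transverse expansion rate of a matter-only LTB model.
    A_ratio = A0(r)/A(r,t); H0_r = H0(r); Om_r = Omega_m(r)."""
    return H0_r * np.sqrt(Om_r * A_ratio**3 + (1.0 - Om_r) * A_ratio**2)
```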


2.2.1.1 Propagation of light in LTB models

It is very important to study the propagation of light in LTB models if we want to be able to compare them with observations. One should keep in mind that the following deductions are for observers located at the inhomogeneity's center, $r = 0$¹.

For incoming light travelling along radial null geodesics we know that $ds^2 = d\Omega^2 = 0$. Using these conditions in Eq. (2.12), we obtain the constraint equation for light rays:

$$\frac{dt}{dr} = -\frac{A'(r,t)}{\sqrt{1-k(r)}}, \qquad (2.25)$$

where the minus sign means that we are considering radially incoming light rays.

Consider now two light rays emitted in the same direction, where the second one is emitted after a small time interval, $\tau$. So, the first light ray is emitted at $t_1 = t(r)$ and the second at $t_2 = t(r) + \tau(r)$. Both light rays obey Eq. (2.25), so we get

$$\frac{dt_1}{dr} \equiv \frac{dt}{dr} = -\frac{A'(r,t)}{\sqrt{1-k(r)}}, \qquad \frac{dt_2}{dr} \equiv \frac{d(t+\tau)}{dr} = -\frac{A'(r,t+\tau)}{\sqrt{1-k(r)}}. \qquad (2.26)$$

Since $\tau$ is a small time interval, we can use a Taylor expansion:

$$A'(r,t+\tau) = A'(r,t) + \tau\,\dot A'(r,t). \qquad (2.27)$$

So, using Eqs. (2.26) and (2.27) we conclude that

$$\frac{d\tau}{dr} = -\tau\,\frac{\dot A'(r,t)}{\sqrt{1-k(r)}}. \qquad (2.28)$$

If we consider $\tau$ at emission to be the period of the wave, then we can compare it with the period at the point of observation using the redshift equation, given by

$$1 + z(r_{em}) = \frac{\tau(r_{obs})}{\tau(r_{em})}. \qquad (2.29)$$

Differentiating this equation we obtain

$$\frac{dz}{dr} = -\frac{d\tau(r_{em})}{dr}\,\frac{\tau(r_{obs})}{\tau^2(r_{em})}, \qquad (2.30)$$

where the observer is considered to be at a fixed position. Using Eq. (2.28) and substituting into Eq. (2.30), we get

$$\frac{1}{1+z}\frac{dz}{dr} = \frac{\dot A'(r,t)}{\sqrt{1-k(r)}} \;\Leftrightarrow\; \frac{d\ln(1+z)}{dr} = \frac{\dot A'(r,t)}{\sqrt{1-k(r)}}. \qquad (2.31)$$

Now, using $N = \ln(1+z)$, which is the effective number of e-folds² before the present time, allows us to re-write the equations for the light rays as a parametric set of differential equations:

$$\frac{dt}{dN} = -\frac{A'(r,t)}{\dot A'(r,t)}, \qquad (2.32)$$

$$\frac{dr}{dN} = \frac{\sqrt{1-k(r)}}{\dot A'(r,t)}, \qquad (2.33)$$

¹These deductions also hold for observers off-center, but in only one direction, the radial direction of the inhomogeneity.

²The number of e-folds gives the time needed for a given quantity to increase by a factor of $e$.

from which we can obtain the functions $t(z)$ and $r(z)$. In addition, we can also use Eq. (2.25) and substitute it into Eq. (2.31), which leads to the following result:

$$\frac{1}{1+z}\frac{dz}{dr} = \frac{\dot A'(r,t)}{\sqrt{1-k(r)}} \;\Leftrightarrow\; d\ln(1+z) = -\,d\ln A'(r,t). \qquad (2.34)$$

With those results, one can obtain the luminosity distance, the comoving distance and the angular diameter distance as a function of redshift [21], respectively:

$$d_L(z) = (1+z)^2\,A[r(z),t(z)], \qquad (2.35)$$

$$d_c(z) = (1+z)\,A[r(z),t(z)], \qquad (2.36)$$

$$d_A(z) = A[r(z),t(z)]. \qquad (2.37)$$

2.2.1.2 Shear Estimator

The shear estimator, $\Sigma_s$, is a way to quantify kinematical distortions in an expanding Universe, taking into account the two Hubble parameters of LTB models. This estimator is present in inhomogeneous models and can be used to distinguish between FLRW models and LTB models.

In FLRW models, the shear of the background geometry is zero; in LTB models, however, it can be different from zero and reach a maximal value at a certain redshift related to the size of the inhomogeneity. So, a measured non-vanishing shear should be interpreted as a departure from FLRW behaviour, and one can in principle estimate the size and the depth of the inhomogeneity [23].

The shear estimator for LTB models is given by the ratio between the spatial shear and the expansion rate. The spatial shear is given by

$$\sigma_{ij} = (H_\perp - H_\parallel)\,M_{ij}, \qquad (2.38)$$

where $M_{ij} = \mathrm{diag}(-2/3,\,1/3,\,1/3)$ is a traceless symmetric matrix. The global expansion rate is written as follows:

$$\theta = 2H_\perp + H_\parallel. \qquad (2.39)$$

So, the dimensionless shear estimator is given by

$$\Sigma_s = \sqrt{\frac{3}{2}}\,\frac{\sigma}{\theta} = \pm\,\frac{H_\perp - H_\parallel}{2H_\perp + H_\parallel}, \qquad (2.40)$$

where $\sigma^2 \equiv \sigma_{ij}\sigma^{ij} = \frac{2}{3}(H_\perp - H_\parallel)^2$. The problem with writing the shear estimator in this way is that the angular Hubble parameter can only be measured indirectly. So, when using real data, one should re-write the shear estimator in terms of more directly observable quantities.

For example, the shear estimator can be written in terms of observables that are easier to measure, in the following way [23]:

$$\Sigma_s \approx \pm\,\frac{1 - H(z)\,\partial_z\!\left[(1+z)\,d_A(z)\right]}{3H(z)\,d_A(z) + 2 - 2H(z)\,\partial_z\!\left[(1+z)\,d_A(z)\right]}, \qquad (2.41)$$

where $d_A(z)$ is the angular diameter distance and $\partial_z$ is the partial derivative with respect to $z$. One should note that this expression does not depend on $H_0$, whose value is still statistically inconsistent across different probes. Further, note that the shear estimator written as in Eq. (2.41) assumes zero curvature, reflecting the understanding that at small scales the spatial curvature is very small.

2.2.2 ΛLTB

LTB models have been studied for several years. However, a few years ago some observations suggested that LTB models cannot be an alternative to dark energy. This led to a new type of LTB models, the ΛLTB models. In summary, ΛLTB models are LTB models with a dark energy component, Λ. However, one should keep in mind that, although these models were proposed due to the inconsistencies between observations and LTB models, LTB models are still worth studying; moreover, we do not yet have data with the necessary precision to test against ΛLTB models.

To be in the ΛLTB regime, the changes one should make to the analysis carried out in the last section are at the level of the energy-momentum tensor. For these models, the energy-momentum tensor is given by

$$T^\mu_{\ \nu} = -\rho_m(r,t)\,\delta^\mu_0\,\delta^0_\nu - \rho_\Lambda\,\delta^\mu_\nu, \qquad (2.42)$$

where $\rho_\Lambda$ represents the vacuum energy. The rest of the analysis is analogous to LTB models. Now, instead of Eq. (2.18), the dynamics of this model is given by

$$\frac{\dot A^2}{A^2} = \frac{M(r)}{A^3} - \frac{k(r)}{A^2} + \frac{8\pi G}{3}\rho_\Lambda, \qquad (2.43)$$

where there is an additional dark energy term. Therefore, this means that Eq. (2.24) becomes

$$H_\perp^2(r,t) = H_0^2(r)\left[\Omega_m(r)\left(\frac{A_0(r)}{A(r,t)}\right)^3 + \Omega_k(r)\left(\frac{A_0(r)}{A(r,t)}\right)^2 + \Omega_\Lambda(r)\right]. \qquad (2.44)$$

Thus, the density parameters for the ΛLTB model are the following:

$$\Omega_m(r) \equiv \frac{M(r)}{H_0^2(r)\,A_0^3(r)}, \qquad (2.45)$$

$$\Omega_\Lambda(r) \equiv \frac{8\pi G}{3}\,\frac{\rho_\Lambda}{H_0^2(r)}, \qquad (2.46)$$

$$\Omega_k(r) \equiv 1 - \Omega_m(r) - \Omega_\Lambda(r) = -\frac{k(r)}{H_0^2(r)\,A_0^2(r)}. \qquad (2.47)$$

2.2.3 Clifton-Ferreira-Land model

Clifton, Ferreira and Land proposed three void curvature profiles for LTB models [24], with initial conditions such that the curvature is asymptotically flat, with a negative perturbation near the origin, and the gravitational mass is evenly distributed. As time goes by, the Universe evolves, the energy density in the proximity of the curvature perturbation is dispersed, and a void is created.

Figure 2.1 represents the three void curvature profiles as well as their distance modulus, $\Delta d_m$. The distance modulus is defined as the observable magnitude of an astrophysical source, $m$, minus the magnitude of the same source in an empty, homogeneous Milne Universe at the same redshift, $M$, and it is given by the following expression:

$$\Delta d_m = m - M = 5\log_{10}\!\left(\frac{d}{10\,\mathrm{pc}}\right), \qquad (2.48)$$

where $d$ is the distance to the source in parsecs. The three profiles are characterized as follows:

(a) Gaussian in $r$ with full width at half maximum (FWHM) $r_0$;

(b) $k \propto \exp\{-c\,|r|^3\}$;

(c) $k \propto \{1 - |\tanh(r)|\}$.

All the curvature profiles are normalised to the curvature minimum $k_0$ in Figure 2.1 (bottom-left panel).

Figure 2.1: On top, the spatial distribution of energy density of the three void curvature profiles. In the bottom plots, the solid line corresponds to void (a), the dashed line to void (b) and the dotted line to void (c). In the bottom-right plot, the ascending thick solid line corresponds to a de Sitter Universe and the descending thick solid line to an Einstein-de Sitter Universe. Credits: [24]

2.2.4 Garcia-Bellido-Haugbølle model

Another type of LTB model was introduced by Garcia-Bellido and Haugbølle [21]. These models are described by the matter content $\Omega_m(r)$ and the expansion rate $H_0(r)$, which are given by the following expressions, respectively:

$$\Omega_m(r) = \Omega_{out} + \left(\Omega_{in} - \Omega_{out}\right)\frac{1 - \tanh\!\left[(r - r_0)/2\Delta r\right]}{1 + \tanh\!\left[r_0/2\Delta r\right]}, \qquad (2.49)$$

$$H_0(r) = H_{out} + \left(H_{in} - H_{out}\right)\frac{1 - \tanh\!\left[(r - r_0)/2\Delta r\right]}{1 + \tanh\!\left[r_0/2\Delta r\right]}, \qquad (2.50)$$


where $\Omega_{out}$ is determined by asymptotic flatness, $\Omega_{in}$ is determined by LSS observations, $H_{out}$ is determined by CMB observations, $H_{in}$ is determined by HST observations, $r_0$ characterises the size of the void and $\Delta r$ characterises the transition to uniformity.

Figure 2.2 shows the behaviour of $\Omega_m(r)$ for what will be considered our fiducial GBH model (see Sec. 4.4.2.2).

Figure 2.2: The matter content, $\Omega_m(r)$, of our fiducial GBH model as a function of the radius, $r$. The dimensionless Hubble constant, $h$, is fixed at 0.675. The underdensity at the void center, $\Omega_{in}$, is fixed at 0.3. The size of the void, $r_0$, is fixed at 1.5. The transition width of the void profile, $\Delta r$, is fixed at 0.5. Note that $r$, $r_0$ and $\Delta r$ are in units of Gpc.
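The profiles of Eqs. (2.49) and (2.50) are simple to implement. The sketch below is illustrative only: it reuses the fiducial values quoted in the caption of Figure 2.2, assumes $\Omega_{out} = 1$ as a stand-in for the asymptotic-flatness condition, and uses placeholder values for $H_{in}$ and $H_{out}$ that are not the thesis's fiducial choices.

```python
import numpy as np

def gbh_profile(r, f_in, f_out, r0=1.5, dr=0.5):
    """Tanh transition of Eqs. (2.49)-(2.50), from f_in at the void
    centre to f_out far from it; r, r0 and dr are in Gpc."""
    shape = (1.0 - np.tanh((r - r0) / (2.0 * dr))) / (1.0 + np.tanh(r0 / (2.0 * dr)))
    return f_out + (f_in - f_out) * shape

r = np.linspace(0.0, 6.0, 200)
Om_r = gbh_profile(r, f_in=0.3, f_out=1.0)     # Omega_in = 0.3; Omega_out = 1 (assumed)
H0_r = gbh_profile(r, f_in=0.70, f_out=0.675)  # placeholder Hin/Hout, units of 100 km/s/Mpc
```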

2.3 Redshift Drift

The observed redshift of an astrophysical source is expected to change over time. This change can be measured if observations of an object are separated by sufficiently long time intervals. This redshift variation over time is often called the redshift drift, and it was first proposed by Sandage in 1962 [25]. At that time, there was not enough precision to measure redshift drifts, and the idea was abandoned for decades. In 1998, Loeb drew attention to the importance of this effect [26], but the idea was again abandoned for the same reason. However, nowadays, the precision of observations has improved remarkably, and redshift drifts may be measured in the coming years using high-resolution spectroscopy observations. For example, the authors in [27] have investigated the potential of redshift drift observations to constrain cosmology in the context of the CODEX (ANDES) high-resolution spectrograph proposal for the ELT.

The idea is that by measuring a time variation of the redshift of sources, for sufficiently long time intervals, one should be able to distinguish between cosmological models and obtain invaluable infor- mation about the physical mechanism behind the accelerated expansion of the Universe. In other words, one should be able to conclude whether the acceleration is due to a dark energy fluid or to the presence of a large-scale void that causes an apparent acceleration.


The impressive precision required by redshift drift observations is the basis of the remarkable potential of this probe for cosmology. Different cosmological models leave well-defined imprints (model signatures) in the redshift drift observable, which not only allow better constraints to be imposed on different types of models but can also discard many of them. In Section 4.4, we will show that LTB and FLRW models present very different redshift drift signatures. This is a clear indication that this observable has the potential of becoming a very strong test of the CP.

This method has a great advantage when compared to other methods: it depends only on the identification of stable spectral lines, and therefore reduces the uncertainties from systematic or evolutionary effects [27]. Nevertheless, one should keep in mind that, to measure a time variation of the redshift, one needs to consider observations over a period of several years (typically more than a decade).

2.3.1 Redshift Drift in Friedmann–Lemaître–Robertson–Walker models

For FLRW models, the expression for the redshift drift is quite simple. The observed redshift of an astrophysical source is given by

$$z(t_{obs}) = \frac{a(t_{obs})}{a(t_{em})} - 1, \qquad (2.51)$$

which, after a time interval, becomes

$$z(t_{obs} + \Delta t_{obs}) = \frac{a(t_{obs} + \Delta t_{obs})}{a(t_{em} + \Delta t_{em})} - 1. \qquad (2.52)$$

Therefore, the redshift variation of the source is

$$\Delta z = \frac{a(t_{obs} + \Delta t_{obs})}{a(t_{em} + \Delta t_{em})} - \frac{a(t_{obs})}{a(t_{em})}. \qquad (2.53)$$

By Taylor expanding the former expression to first order in $\Delta t/t$, we get

$$\Delta z \simeq \Delta t_{obs}\left[\frac{\dot a(t_{obs}) - \dot a(t_{em})}{a(t_{em})}\right]. \qquad (2.54)$$

By rewriting the last expression in terms of the Hubble parameter, we get an expression for the redshift drift in terms of observables:

$$\Delta z = H_0\,\Delta t\left[1 + z - \frac{H(z)}{H_0}\right], \qquad (2.55)$$

where the subscripts obs and em were dropped for simplicity. Now, it is clear that the redshift drift is directly proportional to the expansion rate of the Universe, which means that it is a direct probe of its dynamics.

In addition, the redshift drift can be expressed in terms of the apparent velocity shift of the source, which is given by

$$\Delta v = \frac{c\,\Delta z}{1+z}. \qquad (2.56)$$
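To get a feel for the magnitudes involved, the sketch below evaluates Eqs. (2.55) and (2.56) for an illustrative flat ΛCDM model over a 10-year baseline; the parameter values and the unit conversion are approximate placeholders, not results from this thesis.

```python
import numpy as np

KMS_MPC_TO_PER_YR = 1.0227e-12   # 1 km/s/Mpc expressed in 1/yr (approximate)
C_KMS = 299792.458               # speed of light in km/s

def delta_z(z, dt_yr=10.0, H0=67.5, Om=0.315):
    """Eq. (2.55) for flat LCDM: redshift drift accumulated over dt_yr years."""
    Ez = np.sqrt(Om * (1.0 + z)**3 + (1.0 - Om))   # H(z)/H0
    return H0 * KMS_MPC_TO_PER_YR * dt_yr * (1.0 + z - Ez)

def delta_v(z, dt_yr=10.0):
    """Eq. (2.56): apparent velocity shift, in km/s."""
    return C_KMS * delta_z(z, dt_yr) / (1.0 + z)

z = np.linspace(0.1, 4.0, 40)
print(delta_v(z).max() * 1e5, "cm/s")   # of order a few cm/s after 10 years
```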

2.3.2 Redshift Drift in Lemaître–Tolman–Bondi models

Compared with FLRW models, the redshift drift for LTB models gets a little more complicated: now the redshift depends not only on time but also on the radial coordinate. Thus, consider the following formula for the redshift [28]:

$$1 + z = \frac{X(0, t_{obs})}{X(r(t), t)}\; e^{\int_0^r \frac{X'}{X}\,dr}, \qquad (2.57)$$

where $r$ and $t$ represent the radial and time coordinates of the emission source and $X(r(t), t)$ is defined in Eq. (2.13). From this expression one easily concludes that the redshift has three dependencies, i.e., $z = z(r(t), t, t_{obs})$.

By considering an additional time interval ($\Delta t$), the redshift variation is given by

$$\Delta z = z\big(r(t) + \Delta r(t),\; t + \Delta t,\; t_{obs} + \Delta t_{obs}\big) - z\big(r(t), t, t_{obs}\big). \qquad (2.58)$$

Using a first-order approximation [28], we get

$$\Delta z = \frac{\partial z}{\partial t}\,\Delta t + \frac{\partial z}{\partial t_{obs}}\,\Delta t_{obs} + (1+z)\int_t^{t_{obs}} dt\;\Delta r(t)\,\frac{\Delta z}{\Delta r(t)}, \qquad (2.59)$$

where $\Delta z/\Delta r(t)$ is a functional derivative of Eq. (2.57), which is given by

$$\frac{\Delta z}{\Delta r(t)} = (1+z)\,\partial_r H(r(t), t). \qquad (2.60)$$

where∆z/∆r(t)is a functional derivative of Eq. (2.57), which is given by

∆z

∆r(t) = (1+z)∂rH(r(t),t). (2.60) By considering the definition of time dilation that can be found in Ref. [28],

∆tobs

∆t = (1+z), (2.61)

and using it, alongside with Eq. (2.57), into Eq. (2.59), we find

∆z

∆tobs

= (1+z) Z tobs

t

rH(r(t),t)dt

α(r(t),t)(1+z(r(t),t))+H(0,tobs)(1+z)−H(r(z),t(z)), (2.62) which is the expression of the redshift drift for LTB models. When comparing with FLRW models, LTB models have an extra term in the redshift drift expression, which is the first term of Eq. (2.62), this means that when the radial inhomogeneities vanish this equation reduces to the FLRW expression.


Chapter 3

Methodology

In recent years, the amount of precise data in cosmology has increased rapidly. However, there is no model capable of describing the data in a completely satisfactory way. The model that best describes the Universe so far is the ΛCDM model, but it is far from being able to describe all cosmological observables in a consistent way. Some observations do not match the predictions of the model, and tensions between different datasets arise. One of them is the Hubble tension, which remains an intriguing problem in observational cosmology today. So, one could conclude that there are problems with the cosmological data, with the theoretical model, or with the methodology used to compare data with theory.

In the light of cosmological data, one can follow two approaches to analyse it. One can assume a cosmological model and, using the available data, fit its parameters. This approach is, of course, model dependent, and it is known as the parametric approach. This means that a parametric approach requires the assumption that the data is described by a known function with specific model parameters, which, by itself, is a limitation that may bias the results. On the other hand, there is a completely agnostic approach that only requires data, known as the non-parametric approach. Using it, one does not need to make assumptions about the functional form preferred by the data; instead, the data will determine the cosmological function (and its parameters) that best describes it.

In this chapter, there is a review of both parametric and non-parametric methods, as well as technical details about the codes and methods used in this dissertation.

3.1 Parametric methods

As stated before, parametric methods are model dependent. This is because, in a parametric approach, we already have a model and want to find the best parameter values that fit the data. Therefore, the results one finds using this approach are only valid for that particular model and cannot be used to draw conclusions about other models.

One famous parametric approach is to combine Markov Chain Monte Carlo (MCMC) methods with Bayesian statistics [29]. MCMC and Bayesian statistics are two different disciplines: MCMC methods are used to sample from a probability distribution, while Bayesian statistics is a theory to interpret observed data.

When analysing cosmological data we are faced with an important question: given the data, what underlying physical process could possibly cause those results? Or, when comparing causes, which one is the most likely to reproduce the data? The most common way to answer such questions is by using Bayesian theory. However, a particular problem appears when we want to compute the probability distribution function (PDF) of a set of parameters. Usually, this type of function cannot be solved analytically; therefore, numerical methods need to be deployed. This is why we need MCMC methods. These methods are able to sample points from the PDF and, combined with Bayes' theorem (see the next subsection), find the most likely values for a set of parameters.

3.1.1 Bayes’ Theorem

Behind all of Bayesian statistics is Bayes' Theorem, which is a very simple yet very powerful result. This theorem describes the conditional probability of a hypothesis given data and prior information about the hypothesis. This result is expressed as follows [29]:

$$p(H|D,I) = \frac{p(D|H,I)\,p(H|I)}{p(D|I)}, \qquad \mathrm{Posterior} = \frac{\mathrm{Likelihood} \times \mathrm{Prior}}{\mathrm{Evidence}}, \qquad (3.1)$$

where $H$ is the hypothesis, $D$ is the data and $I$ is the prior information about the hypothesis. $p(H|D,I)$ is the probability of the hypothesis after taking into account the data and the prior information, and is known as the posterior. $p(D|H,I)$ is the probability of the data assuming that both the hypothesis and the prior information are true, and is known as the likelihood. $p(H|I)$ expresses our prior knowledge of $H$ being true and is therefore known as the prior. Finally, $p(D|I)$ is the evidence, i.e., the probability of the data given that the prior information about the hypothesis is true.

3.1.2 Markov Chain Monte Carlo methods

MCMC methods, as the name indicates, combine both Markov Chains and the Monte Carlo method. A Markov Chain is a model stating that the probability of future states in a chain of states depends only on the current state; it therefore does not rely on knowledge of all the previous states. The Monte Carlo method, in turn, is a method for sampling from the PDF; using those samples one can estimate a given parameter.

Here, we choose to use a particular MCMC method, called the affine invariant ensemble sampler. This method was proposed by Goodman and Weare in 2010 [30], and it basically consists of multiple chains running in parallel, where the chains are able to interact so they can adapt their PDFs.

3.1.3 LMFIT and emcee

In order to implement the affine invariant ensemble sampler for MCMC in Python (version 3.9.12), two packages were used: the Non-Linear Least-Squares Minimization and Curve-Fitting package (LMFIT)¹ (version 1.0.3) and emcee² (version 3.1.2).

LMFIT was initially created, as the name suggests, for solving non-linear least-squares problems using the Levenberg-Marquardt method [31, 32]. LMFIT now provides several tools for non-linear optimization and curve-fitting problems, one of them being an interface to emcee, which implements the affine invariant ensemble sampler for MCMC.

¹https://lmfit.github.io/lmfit-py/
²https://emcee.readthedocs.io/en/stable/
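As an illustration of this setup, the sketch below uses lmfit's emcee interface to sample the posterior of two ΛCDM parameters from a generic H(z) dataset. This is a hypothetical example: the data file, parameter ranges and sampler settings are placeholders, not the configuration actually used in this work.

```python
import numpy as np
import lmfit

def residual(params, z, hz, err):
    """Weighted residuals of a flat LCDM H(z) model, Eq. (2.11)."""
    H0, Om = params["H0"].value, params["Om"].value
    model = H0 * np.sqrt(Om * (1.0 + z)**3 + (1.0 - Om))
    return (hz - model) / err

params = lmfit.Parameters()
params.add("H0", value=70.0, min=50.0, max=100.0)   # flat priors via bounds
params.add("Om", value=0.3, min=0.0, max=1.0)

z, hz, err = np.loadtxt("hz_data.txt", unpack=True)  # placeholder dataset
result = lmfit.minimize(residual, params, args=(z, hz, err),
                        method="emcee", nwalkers=50, steps=2000, burn=500)
print(lmfit.fit_report(result))
```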

3.2 Non-parametric methods

A major problem with the modelling of cosmological data is the fact that the data is often biased by the theoretical modelling that we choose to interpret it. For example, one may consider the value of the matter density parameter, $\Omega_m$. According to data from the Planck mission, this parameter has a value of $\Omega_m = 0.3153 \pm 0.0073$ [33], but this is exclusive to the ΛCDM model. If we instead assume a dynamical dark energy model, the value of $\Omega_m$ is different.

So, what one can do to avoid this type of bias is to use AI algorithms based on statistical learning theory [34]. A very good feature of AI algorithms is that we can remove model-assumption biases present in model-dependent approaches. AI has been an indispensable tool in several fields, such as finance [35], biology [36] and physics [37]. AI methods have also been very useful in cosmology [38].

In particular, we used GP, which is a specific AI algorithm. With the use of GP and symbolic regression algorithms, the data is not interpreted in the light of pre-assumed theoretical functions. These methods reconstruct observational, model-independent functions without a priori cosmological assumptions.

3.2.1 Genetic Programming

GP is a type of Evolutionary Algorithm that was inspired by natural selection [39]. In GP, the main mechanism consists of a population in which the individuals are possible candidates for the problem's solution. These individuals change over time, i.e., over generations, through operations such as mutation, crossover and selection.

In GP, it is common to use syntax trees to represent the solution of the problem under study. The nodes of a tree can correspond to arithmetic operations, variables or constants. The constants and variables are called terminals and the arithmetic operations are called functions. Figure 3.1 shows an example of a syntax tree that we found for the best-fit function of the shear estimator (see Section 4.3.2).

Figure 3.1: Example of a syntax tree of the shear estimator best-fit function. The div, sub and add nodes represent the operations. The H-para and H-perp nodes represent the radial and angular Hubble parameters, respectively, which are variables.

In GP, an individual is a function, so a population is nothing but a set of functions. The first step, when we run a GP code, is to randomly generate the initial population. The most common methods used to do this are the full, the grow and the ramped half-and-half methods.

The full method consists of randomly filling the nodes with operations from the function set until the maximum tree depth is reached. Once that depth is reached, only terminals can be chosen.

The grow method consists of randomly filling the nodes with operations from the function set and also from the terminal set until the maximum tree depth is reached. So, the grow method will generate trees with a greater variety of sizes and shapes.


The ramped half-and-half method is a mix of the other two, i.e., half of the population is initialized using the full method and the other half is initialized using the grow method. Note that, in this work, we used the ramped half-and-half method.

Then, the individuals must compete in order to find the best individuals in the population. Here, we used tournament selection, and the individuals who win a tournament become parents by producing offspring. Individuals win tournaments according to their fitness. In GP, the fitness can be seen as a measure of the error: the lower the fitness, the better the individual, because it is closer to the problem's solution.

One important aspect to take into account is the selection pressure. If one chooses a large tournament size, the solution will be reached more quickly. On the other hand, if one chooses a small tournament size, more individuals of a generation will survive; therefore, we get a more diverse population, which comes at the cost of more computational time to reach the solution. So, one has to take these factors into account and apply what is best for a specific problem.

After considering all of these factors, we have to take into account that the best-fit individuals of each generation may or may not be modified by crossover and mutation operations. Figure 3.2 shows an example of a crossover transformation involving two individuals, the parent functions $(X_8/X_1)\times(X_9-0.5)$ and $0.01+(X_2-X_7)$, represented by the trees on the left. This operation involves selecting a part of one individual and replacing it with a part of the other individual. In the example, the selected parts of the parent functions are represented in blue. The resulting individual, the function $(X_2-X_7)\times(X_9-0.5)$, is then given by the tree on the right of Figure 3.2. Regarding mutation, Figure 3.3 shows an example of this operation. Mutation consists of randomly replacing parts of an individual with new random function parts. There are several ways of performing mutations, but in the example presented in Figure 3.3 the function $(X_8/X_1)\times(X_9-0.5)$ (the tree on the left) is changed by mutating the tree nodes represented in grey. The resulting new individual, the function $(X_8-X_1)\times(X_9-0.265)$ (tree on the right), is then considered for the next generation.

Figure 3.2: Crossover operation represented by syntax trees. From: https://gplearn.readthedocs.io/en/stable/intro.html

Figure 3.3: Mutation operation represented by syntax trees. From: https://gplearn.readthedocs.io/en/stable/intro.html

It is very important to note that, if the sum of the mutation and crossover probabilities is less than one, reproduction takes place to compensate. This means that the winner of a tournament is cloned to enter the next generation.

There is an interesting phenomenon that sometimes happens in GP, called bloat. Bloat occurs when the size of the individuals grows larger and larger while the fitness does not improve. This spends more and more computational time and brings little benefit to the solution. Note that this phenomenon is not exclusive to GP.
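To make the above concrete, the sketch below runs gplearn's SymbolicRegressor on mock H(z) data with settings that mirror the choices described in this section (ramped half-and-half initialization, tournament selection, crossover and mutation probabilities summing to less than one so that reproduction fills the remainder, and a parsimony penalty against bloat). All numbers are illustrative placeholders, not the hyperparameters used for the thesis results.

```python
import numpy as np
from gplearn.genetic import SymbolicRegressor

# Mock dataset standing in for cosmological measurements of H(z)
rng = np.random.default_rng(0)
z = np.linspace(0.05, 2.0, 50).reshape(-1, 1)
y = 70.0 * np.sqrt(0.3 * (1.0 + z.ravel())**3 + 0.7)  # noiseless "truth"
y += rng.normal(0.0, 5.0, y.size)                     # observational noise

est = SymbolicRegressor(population_size=1000,
                        generations=30,
                        tournament_size=20,           # selection pressure
                        init_method='half and half',  # ramped half-and-half
                        function_set=('add', 'sub', 'mul', 'div'),
                        p_crossover=0.7,
                        p_subtree_mutation=0.1,
                        p_point_mutation=0.1,         # remainder -> reproduction
                        parsimony_coefficient=0.01,   # discourages bloat
                        random_state=0)
est.fit(z, y)
print(est._program)   # best-fit expression, i.e. the winning syntax tree
```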

3.2.2 Error Analysis

There is no direct method to estimate the error of the best-fit function given by the GP code. Nesseris and Garcia-Bellido proposed an analytic way to overcome this issue, for both correlated and uncorrelated data [40]. Note that, compared with correlated data, uncorrelated data is simpler and easier to deal with. To consider correlated data, one has to account for the covariance matrix, which is not a trivial matrix to estimate: it is obtained by performing several simulations of Universes with different parameters, a procedure beyond the scope of this dissertation. Therefore, for the purpose of this thesis, we will only present here their error analysis for uncorrelated data.

The method in Ref. [40] for functional error estimation with GP works as follows. Let us consider a normal distribution with zero mean,

$$f(x,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\,\exp\!\left(-\frac{x^2}{2\sigma^2}\right), \qquad (3.2)$$

where $\sigma^2$ is the variance. Then, further consider a confidence interval (CI) of $1\sigma$, which in the case of only one variable is given by

$$CI(1\sigma) = \int_{-1\sigma}^{+1\sigma} dx\, f(x,\sigma) = \int_{-1\sigma}^{+1\sigma} dx\,\frac{1}{\sqrt{2\pi}\,\sigma}\,\exp\!\left(-\frac{x^2}{2\sigma^2}\right) = \mathrm{erf}\!\left(1/\sqrt{2}\right), \qquad (3.3)$$

where erf is the error function. Note that this result can be generalized to $n\sigma$ by simply doing

$$CI(n\sigma) = \mathrm{erf}\!\left(n/\sqrt{2}\right). \qquad (3.4)$$

Now, the likelihood function is given by

$$\mathcal{L} = N\,\exp\!\left(-\chi^2(f)/2\right), \qquad (3.5)$$

where $N$ is the normalization constant, $f$ is the best-fit function given by the GP code and $\chi^2(f)$ is the chi-squared of that same function, defined as follows:

$$\chi^2(f) \equiv \sum_{i=1}^{N}\left(\frac{y_i - f(x_i)}{\sigma_i}\right)^2, \qquad (3.6)$$

where a set of $N$ data points $(x_i, y_i, \sigma_i)$ is considered. To determine the normalization constant, it is necessary to integrate over all possible functions given by the GP code, no matter how bad their fit is. Although bad-fit functions will later be discarded, they still contribute to the total likelihood; thus, they have to be accounted for in the error estimation of the best-fit function. Therefore, we have

$$\int Df\,\mathcal{L} = \int Df\,N\,\exp\!\left(-\chi^2(f)/2\right) = 1, \qquad (3.7)$$

where $Df$ represents the integration over all possible functions. $Df$ can be written as $Df = \prod_{i=1}^{N} df_i$, where $df_i$ and $f_i$ represent $df(x_i)$ and $f(x_i)$, respectively. Thus, Eq. (3.7) becomes

$$\int Df\,\mathcal{L} = \int_{-\infty}^{+\infty}\prod_{i=1}^{N} df_i\;N\,\exp\!\left(-\frac{1}{2}\sum_{i=1}^{N}\left(\frac{y_i - f_i}{\sigma_i}\right)^2\right) = N\prod_{i=1}^{N}\int_{-\infty}^{+\infty} df_i\,\exp\!\left(-\frac{1}{2}\left(\frac{y_i - f_i}{\sigma_i}\right)^2\right) = N\,(2\pi)^{N/2}\prod_{i=1}^{N}\sigma_i = 1, \qquad (3.8)$$

which implies that $N = \left[(2\pi)^{N/2}\prod_{i=1}^{N}\sigma_i\right]^{-1}$. Replacing this in Eq. (3.5), we get the likelihood written as

$$\mathcal{L} = \frac{1}{(2\pi)^{N/2}\prod_{i=1}^{N}\sigma_i}\,\exp\!\left(-\chi^2(f)/2\right). \qquad (3.9)$$

Remember that we are considering uncorrelated data; therefore, $f$ evaluated at a point $x_i$ is independent of the same function evaluated at a point $x_j$. For that reason, we can write

$$\mathcal{L} = \prod_{i=1}^{N}\mathcal{L}_i = \prod_{i=1}^{N}\frac{1}{(2\pi)^{1/2}\sigma_i}\,\exp\!\left(-\frac{1}{2}\left(\frac{y_i - f_i}{\sigma_i}\right)^2\right), \qquad (3.10)$$

which means that

$$\mathcal{L}_i \equiv \frac{1}{(2\pi)^{1/2}\sigma_i}\,\exp\!\left(-\frac{1}{2}\left(\frac{y_i - f_i}{\sigma_i}\right)^2\right). \qquad (3.11)$$

With these results, we are able to calculate the error, $\delta f_i$, around the best fit, $f_{bf}(x)$, at a point $x_i$, in the following way:

$$CI(x_i, \delta f_i) = \int_{f_{bf}(x_i)-\delta f_i}^{f_{bf}(x_i)+\delta f_i} df_i\,\frac{1}{(2\pi)^{1/2}\sigma_i}\,\exp\!\left(-\frac{1}{2}\left(\frac{y_i - f_i}{\sigma_i}\right)^2\right) = \frac{1}{2}\left[\mathrm{erf}\!\left(\frac{\delta f_i + f_{bf}(x_i) - y_i}{\sqrt{2}\,\sigma_i}\right) + \mathrm{erf}\!\left(\frac{\delta f_i - f_{bf}(x_i) + y_i}{\sqrt{2}\,\sigma_i}\right)\right]. \qquad (3.12)$$

Now, admitting that $\delta f_i$ corresponds to the $1\sigma$ error of a normal distribution, we have

$$CI(x_i, \delta f_i) = \mathrm{erf}\!\left(1/\sqrt{2}\right). \qquad (3.13)$$

Equating Eq. (3.12) with Eq. (3.13), we can solve numerically for $\delta f$ at each point of the best-fit function.
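Numerically, equating Eqs. (3.12) and (3.13) is a one-dimensional root-finding problem at each data point. The sketch below solves it with a standard bracketing method; it is a minimal illustration of the procedure, not the thesis implementation.

```python
import numpy as np
from scipy.special import erf
from scipy.optimize import brentq

def gp_error(y_i, sigma_i, fbf_i):
    """Solve CI(x_i, dfi) = erf(1/sqrt(2)) (Eqs. (3.12)-(3.13)) for dfi."""
    target = erf(1.0 / np.sqrt(2.0))
    def ci_minus_target(dfi):
        a = (dfi + fbf_i - y_i) / (np.sqrt(2.0) * sigma_i)
        b = (dfi - fbf_i + y_i) / (np.sqrt(2.0) * sigma_i)
        return 0.5 * (erf(a) + erf(b)) - target
    # CI grows monotonically from 0 at dfi = 0 towards 1, so the root
    # is bracketed between 0 and a generous multiple of sigma_i
    return brentq(ci_minus_target, 0.0, 100.0 * sigma_i)

# Example: data point y_i = 1.0 +/- 0.1 with the best fit passing at 1.05
print(gp_error(1.0, 0.1, 1.05))
```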
