2. THE PREDICTION PROBLEM

2.2 Methods

2.2.1 Regressions

As mentioned in the previous section, many statistical techniques have been proposed in the literature for both linear and nonlinear regression. Linear regression is a model that aims to determine the line that best fits a set of data points.

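Given a vector of inputs $X = (X_1, X_2, \dots, X_p)$ in the $p$-dimensional input space, the single output $Y$ is predicted as follows [46]:

$$\hat{Y} = \hat{\beta}_0 + \sum_{j=1}^{p} X_j \hat{\beta}_j \qquad (3)$$

where the term $\hat{\beta}_0$ is the intercept, also known as the bias in machine learning. We can write (3) in vector form as an inner product by including the constant variable 1 in $X$, and $\hat{\beta}_0$ in the vector of coefficients $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_p)$, as follows:

$$\hat{Y} = X^T \hat{\beta} \qquad (4)$$

In order to estimate the unknown coefficients over the $p$-dimensional input space, we use the least squares method, which is the most popular estimation approach in linear regression. In this approach, we select the coefficients $\beta$ to minimize the residual sum of squares (RSS), a quadratic function of the parameters [46]:

$$\mathrm{RSS}(\beta) = \sum_{i=1}^{N} \left( y_i - x_i^T \beta \right)^2 \qquad (5)$$

where $(x_i, y_i)$, $i = 1, \dots, N$, denote the training observations.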
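As an illustration of Eqs. (3)-(5), the following minimal sketch estimates the coefficients by solving the normal equations of the least squares problem in plain Python. The function names (`solve_linear_system`, `fit_least_squares`, `predict`, `rss`) and the noise-free toy data are assumptions made for this example only; they are not taken from [46] or from this thesis.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular: the inputs are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_least_squares(inputs, outputs):
    """Return the coefficients minimizing the RSS; index 0 is the intercept."""
    # Prepend the constant 1 to every input vector, as done for Eq. (4).
    rows = [[1.0] + list(x) for x in inputs]
    size = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(size)]
           for i in range(size)]
    xty = [sum(r[i] * y for r, y in zip(rows, outputs)) for i in range(size)]
    return solve_linear_system(xtx, xty)


def predict(beta_hat, x):
    """Eq. (3): intercept plus the weighted sum of the inputs."""
    return beta_hat[0] + sum(b * xj for b, xj in zip(beta_hat[1:], x))


def rss(beta_hat, inputs, outputs):
    """Eq. (5): residual sum of squares over the training observations."""
    return sum((y - predict(beta_hat, x)) ** 2
               for x, y in zip(inputs, outputs))


# Hypothetical noise-free toy data generated from y = 1 + 2*x1 - x2.
toy_inputs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
toy_outputs = [1.0 + 2.0 * x1 - x2 for x1, x2 in toy_inputs]
beta_hat = fit_least_squares(toy_inputs, toy_outputs)
```

Because the toy data contain no noise, the fitted coefficients should recover the generating values up to rounding error and the RSS should be essentially zero. With noisy data the same code returns the coefficients that minimize (5), and the RSS is then strictly positive.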

2.2.2 Time-series forecasting

In addition, there exist several methods dedicated to time series forecasting, i.e., the use of a model to predict future values of a variable based on previously observed values of the same variable [47], [48]. The analysis and forecasting of time series are of fundamental importance in many practical domains. Examples can be found in very different fields of application: the sales of a particular product in successive months, wind power generation and electricity consumption at a particular location over successive 1-hour periods, hourly observations of the yield of a chemical process, etc. [49].

A time series is a sequential set of data points, typically measured over successive times. It is mathematically defined as a set of vectors $x(t)$, $t = 0, 1, 2, \dots$, where $t$ represents the time elapsed [48]. Each $x(t)$ is a random vector, and the measurements in a time series are arranged in chronological order. The historical observations are carefully studied to build a proper model, which is then used to forecast unseen future values.

Over the years, various stationary and non-stationary models have been developed and used in the literature for time series forecasting [47], [48], [50]. Traditional statistical models, including exponential smoothing (ES), Moving Average (MA), Autoregressive Moving Average (ARMA) and Autoregressive Integrated Moving Average (ARIMA), are linear models in which future values are constrained to be linear functions of past observations [51]. ARMA models are successfully used to represent the behavior of stationary time series. For non-stationary time series, however, differencing is necessary to achieve stationarity. To this aim, an ARIMA(p, d, q) model can be used, where the parameters p, d and q are non-negative integers that refer to the orders of the autoregressive, integrated and moving average parts of the model, respectively. More precisely, p is the order of the autoregressive process (highest number of significant lags), d is the order of differencing required to make the series stationary, and q is the order of the moving average process [50], [52]. If the series is stationary, then d is equal to 0 and the ARIMA(p, 0, q) model is equivalent to an ARMA(p, q) model. Interested readers can find a more extensive review of time series methods in [47], [48].
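The role of the differencing order d can be illustrated with a minimal sketch in plain Python. It differences the series once, fits an AR(1) model to the differences by ordinary least squares, and integrates the predicted differences back to the original scale, which corresponds to an ARIMA(1, 1, 0)-style forecast. The function names and the toy series are assumptions made for this example, the moving average part (q) is deliberately omitted, and dedicated statistical packages usually estimate ARIMA parameters by maximum likelihood rather than by this simple regression.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def fit_ar1(series):
    """Least squares fit of z_t = c + phi * z_(t-1); returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var_prev = sum((a - mean_prev) ** 2 for a in prev)
    if var_prev == 0.0:
        # Constant series: no dependence on the previous value can be estimated.
        return mean_curr, 0.0
    cov = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    phi = cov / var_prev
    return mean_curr - phi * mean_prev, phi


def forecast_arima_110(series, steps):
    """Forecast with an AR(1) model fitted on the first differences (d = 1)."""
    diffs = difference(series, d=1)
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # predict the next difference
        level = level + last_diff        # integrate back to the original scale
        forecasts.append(level)
    return forecasts


# Hypothetical toy series: a linear trend is non-stationary,
# but its first difference is constant.
trend = [3.0 + 2.0 * t for t in range(10)]
trend_forecast = forecast_arima_110(trend, steps=3)
```

For this toy trend, one round of differencing yields a constant series, so the forecast simply continues the trend. Setting d = 0 would skip the differencing and integration steps, which mirrors the equivalence between ARIMA(p, 0, q) and ARMA(p, q) stated above.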

2.2.3 Machine learning methods

Data-driven machine learning methods such as neural networks (NNs) [53], [54], support vector machines (SVM) [55] and Extreme Learning Machines (ELM) [56], [57] have recently been used successfully in various prediction problems, including time series forecasting. The SVM method was developed by Vapnik [55] and has gained popularity thanks to its many attractive analytic and computational features and its promising performance. SVM is motivated by the geometric interpretation of maximizing the margin of discrimination, and it is characterized by the use of a kernel function. NNs have attracted increasing attention in the domains of forecasting, pattern recognition, clustering, diagnosis, etc., as they offer a very powerful and very general framework for representing non-linear mappings from multiple input variables to multiple output variables, where the form of the mapping is governed by a number of adjustable parameters [58]. NNs do not require any assumption about the statistical distribution followed by the observations; the appropriate model is formed adaptively from the given data.

2.2.4 Methods for wind speed/power and load forecasting

Much research has been carried out on the modeling and forecasting of wind speed/power over different time scales and horizons, e.g. very short-term (seconds to minutes), short-term (hours up to two days), medium-term (days up to one week) and long-term (weeks to months or more ahead) [31]. General overviews of existing methodologies can be found in [31], [59]-[61]. The methods used in the literature can be classified as follows [31], [59]: i) physical approaches, e.g. numerical weather prediction (NWP); ii) statistical approaches, e.g. time series models such as ES and ARIMA; iii) artificial intelligence methods (heuristics), e.g. NNs, fuzzy logic systems and expert systems; iv) hybrid approaches, which combine physical and statistical methods, in particular using weather forecasts and time series analysis.

Load forecasting has attracted the attention of researchers since the 1990s. As in wind speed/power forecasting, the various approaches involve different time scales and horizons (short, medium and long-term). It is critical to develop accurate short-term load forecasting (STLF) methods, since forecasted load values are used by market operators to determine day-ahead market prices and by market participants to prepare bids. In addition, accurate load estimates are necessary inputs for electricity price forecasting in the electric power markets. Some works make use of meteorological variables (e.g. temperature, humidity, cloud coverage, etc.) to forecast load, whereas others treat the load pattern as a time series signal and predict the future load by using various time series analysis techniques [62]. In addition, recent models also consider socio-demographic and economic characteristics of consumers (occupants), which substantially influence energy consumption, particularly in residential buildings [63], [64]. As in the case of wind speed/power forecasting, artificial intelligence methods [65], time series models [66] and hybrid models combining both techniques [67] have also been used extensively for load forecasting.

A drawback of traditional data-driven machine learning methods is that they do not associate their predictions with confidence information; they output only simple (point) predictions.

Methods that provide only point predictions cannot properly handle either the uncertainty in the model parameters or the noise in the input data. To quantify the potential uncertainties associated with forecasts, several studies have been conducted in recent years to estimate prediction intervals (PIs) for the target of interest. Among them, NN-based PI construction approaches have become popular, and this area of research is now well established and accepted owing to the superiority of these approaches over classical regression models for complex prediction problems [35], [36], [38], [39], [68]. In addition to NN-based PI models, there exist probabilistic approaches (both parametric and non-parametric) based on quantile regression, which can perform forecasting while taking the associated uncertainty into account [37], [69], [70].
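The difference between a point prediction and a PI can be made concrete with a small sketch in plain Python. It builds a symmetric interval around each point forecast and computes two commonly used descriptive measures, the empirical coverage and the mean interval width. The function names, the constant half-width and the toy numbers are all hypothetical and serve only as an illustration; they do not reproduce the PI construction methods of the cited works or of this thesis.

```python
def interval_coverage(lower, upper, targets):
    """Fraction of targets that fall inside their interval [lower_i, upper_i]."""
    hits = sum(1 for lo, hi, y in zip(lower, upper, targets) if lo <= y <= hi)
    return hits / len(targets)


def mean_interval_width(lower, upper):
    """Average width of the intervals; narrower is more informative."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Hypothetical point forecasts and observed values.
point_forecast = [10.0, 12.0, 11.0, 13.0]
observed = [10.5, 14.0, 11.0, 12.0]

# Assumed constant half-width, chosen only for illustration.
half_width = 1.5
lower_bounds = [f - half_width for f in point_forecast]
upper_bounds = [f + half_width for f in point_forecast]

coverage = interval_coverage(lower_bounds, upper_bounds, observed)
width = mean_interval_width(lower_bounds, upper_bounds)
```

In this toy case three of the four observations fall inside their intervals. The two measures pull in opposite directions: widening the intervals raises the coverage but makes each interval less informative, which is why PI construction is usually treated as a trade-off between them.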

It is worth mentioning that probabilistic forecasting is more informative and useful than point forecasting.

In this thesis, an NN-based regression model for the construction of PIs is considered. Thorough details about NN regression models are not reported here for the sake of brevity: interested readers may refer to the cited references, to the copious literature in the field and to Chapter 3 of this thesis (which is dedicated to the methodology of the multi-perceptron neural networks). Techniques for estimating PIs for NN model outputs are discussed in Section 3.4 and in Paper II of Part II. Moreover, the details of the comparison with some existing algorithms and methods are also given in Part II.
