
How retail investors behave in robinhoods most popular stocks with the performance of the market during Covid 19


Academic year: 2023


A Work Project, presented as part of the requirements for the Award of a master’s degree in Finance from Nova School of Business and Economics.

HOW RETAIL INVESTORS BEHAVE IN ROBINHOOD’S MOST POPULAR STOCKS WITH THE PERFORMANCE OF THE MARKET DURING COVID-19.

ANDRES FELIPE GUERRERO CAMPOS 47515

Work project carried out under the supervision of:

Virginia Gianinazzi.

20-05-2022


Abstract

This work project studies retail investor activity, measured by aggregate changes in Robinhood user holdings, under key market conditions such as volatility, market and stock returns, the herding effect, and popularity based on Google searches during the COVID-19 pandemic. The empirical methodology is based on a multiple linear regression model using official Robintrack data along with additional data from Bloomberg, CBOE, and Google Trends. The results show that Robinhood retail investors are sensitive to retail volume, market returns, stock volatility, and the herding effect, which can help explain retail investor behavior.

Keywords: Fintech, Trading, Robinhood, Retail investor

This work used infrastructure and resources funded by Fundação para a Ciência e a Tecnologia (UID/ECO/00124/2013, UID/ECO/00124/2019 and Social Sciences DataLab, Project 22209), POR Lisboa (LISBOA-01-0145-FEDER-007722 and Social Sciences DataLab, Project 22209) and POR Norte (Social Sciences DataLab, Project 22209).


1. Introduction.

When Covid-19 hit the world, financial markets reacted with uncertainty and shock.

However, although retail trading in financial markets had been growing slowly compared with the previous two years, the pandemic gave the fintech industry the chance to innovate by using digital channels to its advantage. Payments surged to become one of the most important and fastest-growing sectors, along with regtech (regulatory technology), cybersecurity, wealthtech, and proptech (property technology), which also shows potential for growth. In fact,

“Fintech will likely speed up as winners continue to solidify their market share. Global expansion of FinTech’s and global investments are also expected to continue in the foreseeable future”

(Mariño, Neves and Gouveia 2020). As a consequence, the entry of retail investors into the market increased. In contrast to institutional investors, retail investors are non-professional individuals who trade securities or funds through brokers, mutual funds, or financial institutions. Critics of retail investors suggest that their investment expertise is quite low, that they are sensitive to behavioral biases, and that they thus

“may underestimate the power of the masses that drive the market” (Hayes 2021). Several components can explain this behavior. The first is overconfidence in the quality of the information available in the market (market returns, stock returns) and in the ability to react in time to specific information in order to generate positive returns. The second is regret, which investors try to avoid: “when the market booms, bearish investors tend to exit their position if the prices rise to a high level, however, these investors will return and rush to buy if the price rises to a higher level, and vice versa” (Qin 2015). The third is limited attention span, meaning that people base their decisions on the limited knowledge they can accumulate. For instance, retail investors consider assets based on popularity (through websites, finance news, family, or friends).


The fourth is trend chasing. When retail investors detect a pattern in the movements of stock prices, they tend to rely on past information to anticipate future prices, which is not a reliable guide and leads to poor investment decisions and future repercussions for their portfolios (Shuang and Jain 2000). For the purposes of this project, one platform is at the core of the discussion. Founded in Silicon Valley in April 2013, Robinhood Markets Inc. is a fintech company that offers financial technology services with a wide range of investment products such as stocks, exchange-traded funds (ETFs), options, cryptocurrencies, and American Depositary Receipts (ADRs). Robinhood earns revenue from diversified sources such as membership fees, order flow, interchange fees, and interest on uninvested cash, among others (Johnston 2022). Some trends around a major event such as Covid-19 can be observed in January and February of last year, when the company reported growth of 6 million crypto accounts as well as a clear diversification of its investor base: “98000 being Americans, 16% belong to customers of the Hispanic community, (even more than platforms such as E-trade, Schwab, Fidelity, TD Ameritrade, and Vanguard), 9% comes from African American (in contrast with 3% of other platforms). Gen Y and Z surround approximately 70% of users” (Boorstin 2021). However, as Boorstin also states, the Robinhood app currently faces a conflict of interest, since brokers have an incentive to generate as many trades as possible. This also led to controversy in the case of AMC and GameStop, when the platform limited retail investors' ability to trade these stocks, which was seen as giving an advantage to institutional investors.
As Figure 1 shows, the number of Robinhood users has grown steadily since 2014. By 2021, around 22.4 million users worldwide were using the platform, and the app's revenue reached 91 million US dollars in the second quarter of 2021 (Norrestad 2021). However, according to Siddarth Shrikanth of the Financial Times, the company has attracted controversy because of its gamified structure and interface, which offers investors ‘gamified investing’ and has led young investors to make impulsive investment decisions


following trends set by the masses through social media, news, or events instead of their own analysis of the performance of a specific asset over time. To better understand investor behavior, this paper introduces the term herding behavior, which describes how investors act by imitating the actions of other individuals. Devenow and Welch (2016) note that investor herding can be based on three main factors. The first is payoff externalities, where the payoff an individual obtains from an action influences other individuals to mimic it.

The second is principal-agent conflict, which can be seen, for instance, when a manager's behavior is based on a benchmark of other managers who act with reference to a specific market or industry index such as the S&P 500. This leads the individual to rely on the relative behavior of other managers. The third is cascades, which happen when agents infer information from the previous decisions of others and ignore their own private information, which ultimately leads to bad decisions, especially when investing in the market (Devenow and Welch 2016). In this context, the theory of efficient markets asks whether prices reflect the available information, taking into account that market equilibrium is stated in terms of expected returns. There is statistically significant evidence of dependence in price changes that affects investor sentiment, which in turn affects the rationality of these investors' actions, in a behavior that represents arbitrage in the market followed by fluctuations in stock holdings (Fama 2011). Most of the reviewed studies rely on only one of the concepts briefly presented above and focus mostly on explaining the behavior of institutional investors. Thus, the motivation of this directed research work project is to bring these concepts together and explain how, during a major event like Covid-19, changes in aggregate Robinhood stock holdings are influenced by factors such as market return, the herding effect, investor sentiment, and popularity trends, considering the most popular Robinhood stocks, which are also those most likely to be traded in financial markets.


For the development of this work project, data from Robintrack.net, Bloomberg, Google Trends, and the Chicago Board Options Exchange historical database is extracted and used to construct the model. The thesis is structured as follows. First, the work project gives the reader an overview of the theoretical foundations in behavioral finance, shows how these theories and practices can be applied to address the research question, and states the hypotheses to be empirically verified. Second, building on this literature, it examines studies with a similar methodological approach whose results, predictions, or conclusions bear on the project's research question and inspire the development of the empirical model. Third, the empirical analysis is divided into four sub-points: (i) a data description covering the relevant variables, how they are calculated, and the sources from which the data was extracted; (ii) the sample used in the model, documenting how the sample size changes relative to the raw data; (iii) the descriptive statistics and their relevance; and (iv) the methodological approach, which explains the estimation strategy with time series and regression analysis and the choice of the dependent and explanatory variables. Finally, the results are summarized with a critical view of possible limitations and problems that occurred. In addition, the results and findings are compared with previous literature, and the contribution of the work project and the lessons learned from the methodological process are discussed.

2. Theoretical Foundations

This section introduces the terminology needed to contextualize this research.


Behavioral Finance

Behavioral finance is essential for describing the systematic implications for financial markets of the decision processes that drive investors (retail and institutional) and their consequences for the stock investment process. To begin with, Olsen (1998) highlights elements that have clear repercussions for investment decisions: stock volatility, following the leader or herding behavior among investors, selling winning investments too early and losing investments too late, irrational confusion of good companies with good investments, retail investors holding undiversified stock portfolios, and popularity-driven investing that results in holding stocks with poorer than desired returns. Similarly, Barber and Odean (2013) introduce behavioral finance by distinguishing real individual investors, who hold diversified portfolios, sell winning investments while holding losing ones (the so-called “disposition effect”), and follow analysis based on market sentiment and current market performance, from uninformed investors, who trade actively, speculatively, and to their own disadvantage (their buying and selling is systematic, not random). In effect, devoting increased attention to current information can cause overreaction in investment decisions as well as distress in trading behavior, which leads retail investors to act irrationally when holding a set of stocks. Indeed, Barber and Odean (2006) show that retail investors face a huge search problem when choosing which stocks to hold. Instead of searching systematically for good-quality stocks, they are drawn to stocks that catch their attention (those made popular by large price moves or trending retail volume). Consequently, individual investors tend to hold a small number of stocks and sell those that promise no short-term financial benefit. The empirical work of Barber and Odean suggests that abnormal trading volume, previous-day returns, and macro events are good proxies for attention based on popularity. Likewise, Engelberg and Parsons (2011) show that retail investors are more likely to trade an S&P 500 index stock, finding that the absence of local media coverage is strongly related to the probability and magnitude of retail trading, which makes it a good proxy for the current analysis of Robinhood user holding behavior.

Herding Behavior

In terms of herding behavior in financial markets, theoretical findings suggest that retail investor activity can nowadays be explained by the use of Google search, which provides investors with information such as financial conditions, innovation projects, and market insight, and ultimately with the capital market information needed to make an investment decision (Hsieh, Chan and Wang 2020). According to the authors, retail investors' herding rests on two key elements. The first is information-based herding, where investors act with the masses because they share information and have comparable professional and educational backgrounds (correlated information environments). A clear example is the common reaction to asset price movements and market fluctuations. The second is behavior-driven herding, which stems from psychological biases (irrational decision-making) and the attention-grabbing effect (the tendency of an individual to be attracted by salient outcomes or goal-seeking externalities). Consistent with the view of Hsieh, Chan and Wang on herding behavior, previous work by Merli and Roger finds that the herding effect is more evident among retail investors than among institutional investors, suggesting a similar tendency to mimic the masses (Merli and Roger 2013).

For the purposes of this project, the cross-sectional absolute deviation (CSAD) developed by Chang and Cheng (2000), expressed as a function of the aggregate market return (with the S&P 500 Index as a proxy), is used to examine whether Robinhood user holdings change between 2018 and 2020 depending on the difference between stock returns and the market return. It is important to recall that, according to the authors, “in presence of herding behavior, security returns will not deviate too far from the overall market return (SPY Index). This would lead to an increase in dispersion at a decreasing rate, and if the herding is severe, it may lead to a decrease in dispersion” (Chang and Cheng 2000). In addition, according to Barber, Huang and Schwarz (2020), additional herding effects can be captured through unusual volumes of user holdings in periods of heightened retail trading. In this project, that period is the COVID-19 window from March 13th, 2020, until the end of the sample on August 13th, 2020, and herding is identified when the number of Robinhood users holding a set of stocks increases on a specific date within this range.
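The quoted statement about dispersion is usually tested with a quadratic regression of CSAD on the market return. The specification below is a sketch of that standard form, given here as an assumption, since the thesis's own estimating equation does not appear in this part of the text; $R_{m,t}$ denotes the market (S&P 500) return on day $t$.

```latex
CSAD_t = \alpha + \gamma_1 \left| R_{m,t} \right| + \gamma_2 R_{m,t}^{2} + \varepsilon_t
```

Without herding, dispersion should grow roughly linearly with $|R_{m,t}|$, so $\gamma_2$ would be close to zero. A significantly negative $\gamma_2$ corresponds to the "increase in dispersion at a decreasing rate" (or an outright decrease) that the quotation associates with herding.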

Volatility and Investor Sentiment

Volatility plays an important role in this research. When investing, both institutional and retail investors look at historical volatility to understand market movements. The market is a source of uncertainty, and the standard deviation is the most widely used metric to measure this kind of uncertainty. Volatility is therefore a good measure because it gives a useful view of the extreme values of a stock's returns and of the level of uncertainty that influences the investment activity of any kind of investor. The VIX index gives individuals insight into market sentiment and expectations about future volatility (Moran and Liu 2020). Market volatility is driven by the flow of new information (clear examples include the stock run-ups of the game retailer GameStop and of AMC in 2021). In particular, Zou and Lei (2012) assert that the VIX is a valid sentiment proxy and analyze its ability to capture irrational investor behavior, suggesting that in periods of high sentiment it can explain the percentage change in the trading volume of stock holdings. Alternatively, empirical evidence suggests that the VIX has a potential negative effect on attention when it comes to trading. For example, when there are extreme changes (i.e., 20-40%) in the VIX index, retail investors intensify their attention on trading platforms to get more information about their current stock portfolios. In addition, the attention of this type of individual is driven more by trading days than by non-trading days (Aharon and Qadan 2020). Taking this into account, and given that the market only operates on weekdays (Monday to Friday), the current analysis is restricted to these days. This literature suggests that volatility affects how much attention retail investors pay to investment decisions on trading platforms, and that periods of high uncertainty such as COVID-19 tend to increase attention to trading accounts.

3. Existing empirical literature.

Among the empirical work that informs this analysis, the study by Zou and Lei (2012) on the effect of the VIX on trading volume was taken into consideration. According to the authors, trading behavior can be understood through changes in the CBOE VIX, which are associated with changes in stock holdings in the S&P 500. In brief, their analysis suggests that changes in the VIX can explain the percentage change in trading volume in periods of high sentiment. Similarly, Moran and Liu (2020) not only identify the VIX as a good proxy for sentiment but also report a strong correlation with the S&P 500. The link between the S&P 500 index and changes in user holdings is studied by Welch (2020), who suggests that significant fluctuations in the S&P 500 are related to considerable increases in the total number of user holdings on the Robinhood app. Further evaluation of Robinhood behavior with respect to the S&P 500 index shows that user holdings tend to change after the stock market fluctuates, indicating that some retail investors are sensitive to market movements in the days after they occur. Regarding herding behavior, two approaches stand out. The first is that of Barber, Huang and Schwarz (2020), who identify a herding day when the number of Robinhood user holdings increases at a dramatic rate (the threshold used is the top 0.5% of stocks during the pandemic period starting on March 13th, 2020). The second is the proposal of Chang and Cheng (2000), which uses the CSAD (cross-sectional absolute deviation) to detect herding based on the relationship between the overall market return and the portfolio of popular stocks. This methodology identifies herding through a negative relationship, that is, when return dispersion tends to decrease as the market return increases. In turn, Hsieh, Chan and Wang (2020) characterize retail investors as individuals who use Google search tools to obtain information about the popularity of listed companies in order to make investment decisions. An important index used in that paper is the ASVI (abnormal search volume index), a proxy for the information demand of retail investors. In effect, a high SVI (Google Search Volume Index) means increased interest in holding a stock because of its popularity. Da, Engelberg and Gao (2013) also use the SVI to explain the relationship between a given search term and the market. Their results show that the relationship is negative and significant (individual investors pay more attention to the abnormal search volume index than institutional investors) when the index is used to explain investor trading behavior, so it carries important information about retail investor demand for the stocks they are interested in. Finally, Barber, Lin and Odean (2021) introduce measures based on retail trading volume (abnormal retail volume and standardized retail volume), similar to the study by Barber, Huang and Schwarz (2020), which portrays the attention these individuals devote to buying stocks (positively skewed behavior: increased volume activity around the COVID-19 event), an important element to take into account in this research.
For this work project, the aforementioned methodologies and proxies are used to analyze what retail investors take into account when holding a set of aggregated stocks selected by popularity.

4. Empirical Analysis.


In this chapter, we describe the empirical analysis and the methodological approach used to study the behavior of Robinhood retail investors in the market. First, we describe how we obtained the data relevant for this research. Second, we describe the sample, the number of observations used, and which observations were removed from the datasets and why. We also discuss how the sample selection can affect the representativeness, validity, and interpretation of the results, along with the descriptive analysis of the dataset. Finally, we use the final data to carry out the methodological approach and estimation strategy, explaining the selection of the dependent and explanatory variables along with the findings.

a) Data and used samples

Robintrack dataset of stock holding changes.

Robintrack.net provides a dataset that tracks the number of Robinhood users holding a specific stock from May 2nd, 2018, until August 13th, 2020. The data relates the price and user holdings of a specific stock at daily, hourly, or even minute frequency. This date range is the one analyzed in this project because the API was shut down and has provided no further data since then. Additional analyses were attempted using Google Trends data on interest in Robinhood; however, the data from August 13th, 2020, until the beginning of 2022 was not reliable for this purpose because it reflects general searches and not actual user holdings on the platform. First, 8597 securities were extracted from the Robintrack API. Then forty S&P 500 stocks were chosen based on their weight in the S&P 500 and on company market capitalization, which ranges from 70% to 80% of the total market capitalization of the United States. This portfolio of securities comprises: Apple, Microsoft Corporation, Amazon, Tesla, Alphabet Class A (Google), Alphabet Class C (Google), Berkshire Hathaway, NVIDIA Corporation, UnitedHealth Group, Meta Platforms, Johnson & Johnson, Procter & Gamble, JPMorgan Chase & Co., Exxon Mobil, Visa Inc, Chevron, Home Depot, Mastercard, Pfizer, AbbVie, Bank of America, Costco Wholesale, Coca-Cola, Eli Lilly, PepsiCo, Walt Disney, Broadcom, Verizon Communications, Thermo Fisher, Walmart, Merck & Co, Cisco Systems, Comcast Corporation, Abbott Laboratories, Accenture, Adobe Incorporated, McDonald's, Salesforce, Intel Corporation and Wells Fargo. Because the data is timestamped not only by day but also down to minutes and seconds, the timestamps were converted to a standard date format (dd/mm/yyyy). The data sets for each ticker were combined into one, yielding a total of 818 rows by day and by security. One limitation of this dataset is that it only records user holdings, without more detailed information about the trades made over the date range. Because some securities appeared several times on the same day at different minutes or seconds, these intraday time units were removed. The dataset was then transformed into a pivot table that averages the number of user holdings over those repeated days, in order to prevent problems when feeding the dataset into the regression analysis. This yields a set of 818 rows covering the time range mentioned above. Graphically, an increase in users can be seen at the beginning of the COVID-19 pandemic, as depicted in Figure 2. The focus of this research is therefore on aggregate changes in Robinhood user holdings, a variable defined as $\Delta uHoldings_t$ (see footnote 1).
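The cleaning steps described above (collapsing intraday snapshots to one value per ticker and day, summing across tickers, and taking log changes) can be sketched as follows. This is a minimal illustration in plain Python, not the author's pivot-table workflow; the function names, the tuple layout of `snapshots`, and the sample numbers are all hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def daily_aggregate_holdings(snapshots):
    """Average repeated intraday snapshots per (ticker, day), then sum over tickers.

    `snapshots` is an iterable of (ticker, "YYYY-MM-DD HH:MM:SS", users_holding).
    The layout is an assumption made for this sketch.
    """
    per_ticker_day = defaultdict(list)
    for ticker, timestamp, users in snapshots:
        day = timestamp[:10]  # drop hours, minutes and seconds
        per_ticker_day[(ticker, day)].append(users)

    totals = defaultdict(float)
    for (_ticker, day), values in per_ticker_day.items():
        totals[day] += mean(values)  # one averaged value per ticker and day
    # ISO dates sort chronologically as plain strings.
    return dict(sorted(totals.items()))


def log_changes(daily_totals):
    """Delta uHoldings_t = ln(uHold_t / uHold_{t-1}) for consecutive days."""
    days = list(daily_totals)
    return {
        cur: math.log(daily_totals[cur] / daily_totals[prev])
        for prev, cur in zip(days, days[1:])
    }


# Made-up numbers, for illustration only.
snapshots = [
    ("AAPL", "2020-03-12 10:00:00", 100),
    ("AAPL", "2020-03-12 15:00:00", 120),  # repeated day: averaged to 110
    ("MSFT", "2020-03-12 10:00:00", 90),
    ("AAPL", "2020-03-13 10:00:00", 150),
    ("MSFT", "2020-03-13 10:00:00", 70),
]
daily = daily_aggregate_holdings(snapshots)
changes = log_changes(daily)
```

In this toy input the aggregate moves from 200 to 220 holdings, so the log change for the second day is ln(1.1). The sketch assumes every ticker is observed on every day; a missing ticker would lower that day's total and distort the change.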

CBOE Volatility Index (VIX) and S&P 500 Index

To explain Robinhood user holding behavior, VIX data was obtained from the CBOE (Chicago Board Options Exchange) data web page (cboe.com/data/). This index captures how much prices in the S&P 500 index are expected to fluctuate, that is, the market's

1 Aggregate changes in Robinhood user holdings are defined as

$\Delta uHoldings_t = \ln\left(\frac{uHold_t}{uHold_{t-1}}\right)$

where $uHold_t$ is the aggregate amount of user holdings of the 40 most popular RH stocks on day $t$, and $uHold_{t-1}$ is the total amount of user holdings of the 40 most popular RH stocks at time $t-1$.


expectations of 30-day future volatility. According to Justin Kueper, “volatility, or how fast prices change, is often seen as a way to gauge market sentiment, and in particular the degree of fear among market participants” (Kueper 2022). The data is based on the VIX calculation used since 2003, an optimal benchmark for equity stocks in the market. In terms of interpretation, VIX values above 30 are associated with higher volatility due to increased uncertainty, risk, and investor fear, while values below 20 correspond to stable, stress-free periods in the markets (Kueper 2022). A key property of the VIX is that it is inversely correlated with the market.

The VIX can be a good proxy for market direction. For retail investors, a high VIX can signal selling stocks and a low VIX buying stocks. If the VIX rises, the S&P 500 tends to fall in price due to the increase in investor fear. Volatility is thus a useful element for understanding asset price movements in the market. Going long on the VIX reflects an expectation that volatility will increase, which could lead to episodes of financial instability and uncertainty in the market. Going short on the VIX can be motivated by low interest rates and predictable economic growth, and thus low volatility in the market. As Figure 3 shows, VIX volatility became more pronounced after the announcement of COVID-19 as a global pandemic on March 13th, 2020, and reached its highest peak at 82.69% on March 16th, 2020. The S&P 500 index was likewise extracted for the date range mentioned above.

This information was extracted from the Bloomberg terminal. The value used is the closing price, and 588 observations were analyzed. Like the VIX, these values do not include weekends, which affects the representativeness of the Robinhood user holdings data. In addition, it is evident that when S&P 500 prices go down, the VIX tends to be higher than 30, which points to investor fear and uncertainty, as at the beginning of the pandemic (March and April 2020), in contrast to the periods before the pandemic, when values of 20 or below indicated stable markets and stress-free periods (Kueper 2022).
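The interpretation thresholds quoted from Kueper (2022) can be written as a small classifier. This is only an illustrative sketch: the function name and the regime labels are hypothetical, and the text does not say how values between 20 and 30 are treated, so the "intermediate" label is an assumption.

```python
def vix_regime(vix_close, calm_below=20.0, fear_above=30.0):
    """Label a VIX closing value using the 20/30 thresholds cited in the text.

    The middle band (20 to 30) is not characterized in the source, so it is
    labelled "intermediate" here purely as an assumption.
    """
    if vix_close > fear_above:
        return "fear"          # heightened uncertainty, risk and investor fear
    if vix_close < calm_below:
        return "calm"          # stable, stress-free market
    return "intermediate"


# The peak reported in the text (82.69 on March 16th, 2020) falls in the fear regime.
peak_label = vix_regime(82.69)
```

Such a label could be used, for example, to split the sample into calm and stressed days before comparing changes in user holdings across the two groups.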

SVI (Search volume index)


Google is one of the leading search platforms, and through Google Trends, an API provided by the company, users can obtain trends by interest. It relates the number of searches for a term to the total volume of search queries around the world in a specified time range (from days to weeks, months, or years). For the current empirical analysis, data was extracted for the period from the beginning of May 2018 through the end of August 2020. The terms “cheapest,” “top stocks,” “selling stocks,” and “penny stocks,” together with the term “Robinhood,” were used as the keywords that represent this search volume index. Daily historical data was extracted to avoid gaps between days (four consecutive datasets were obtained from the API and compiled to cover the entire span of days needed for the analysis). To follow the data management approach used so far, search index values for weekends were not considered, and 854 observations were obtained. The values fall in a standardized range from 0 to 100, indicating the degree of interest measured through the number of clicks, locations, and platforms such as YouTube. SVI is chosen as a proxy for Robinhood user holding behavior because both Da, Engelberg and Gao (2013) and Hsieh, Chan and Wang (2020) agree that retail investors have a propensity to base investment decisions on information they search for on the web, specifically through Google search, whereas institutional investors tend to have easy access to platforms such as Bloomberg or Reuters when assessing investment opportunities in the market.

Notably, Figure 4 shows high interest in the platform from the beginning of 2019 until September of the same year. This behavior is the result of different news items, such as the “number of accounts with Boeing stock increase by 68% as the company dealt with the fallout of a second 737 Max crash in five months” (Bloomberg 2019), which sparked interest among Millennials in buying this company's stock through the Robinhood app. Similarly, the news included Tesla, which attracted the attention of young retail investors, with holdings of its shares increasing by more than 8% to 145000 after the company's earnings release and its plans to raise 2 billion USD, which sent shares up by approximately 5% (Bloomberg 2019). Other news included a Series E financing of $323 million at a $7.6 billion valuation and the announcement of AMD's new chip in May 2019. In the end, we obtain 854 observations that are later aligned with the dates of the Robinhood user holdings observations. Following the methodology proposed by Da, Engelberg and Gao (2013), we define $\Delta SVI_t$ (see footnote 2). These log changes show a trend similar to that in Figure 4; however, time series analysis must be applied to this data, as shown later in the methodology.
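As a small illustration of the log change defined in footnote 2, the sketch below computes $\ln(SVI_t) - \ln(SVI_{t-1})$ for a short series. The function name is hypothetical. Google Trends values can be 0, for which the logarithm is undefined; returning `None` in that case is an assumption of this sketch, since the text does not say how such days were handled.

```python
import math


def delta_svi(svi):
    """Log changes ln(SVI_t) - ln(SVI_{t-1}) for a list of daily SVI values.

    Returns None where either value is not positive (the log is undefined);
    this handling is an assumption, not a step described in the source.
    """
    changes = []
    for prev, cur in zip(svi, svi[1:]):
        if prev > 0 and cur > 0:
            changes.append(math.log(cur) - math.log(prev))
        else:
            changes.append(None)
    return changes


# Made-up values on the 0-100 Google Trends scale.
example = delta_svi([50, 75, 0, 40])
```

Here the first entry is ln(75) - ln(50) = ln(1.5), roughly 0.405, and the two entries that touch the zero observation are left undefined.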

Standard Absolute Retail Volume and CSAD index.

To analyze whether the retail volume of Robinhood's 40 most popular stocks influences changes in user holdings, a measure called abnormal volume is used, following the methodology of Barber, Lin and Odean (2021). To begin with, data on the volume of trades reflecting retail investor interest in the portfolio of popular stocks was retrieved from the Bloomberg terminal for the period from May 1st, 2018, until August 31st, 2020. This yielded 588 rows, which, like the CBOE VIX and S&P 500 index data, exclude weekends. The next step was to “compare retail trades to the mean of retail trading’s in a date range of 30 days and then scaled by the standard deviation of retail volume in the same time interval” (Barber, Lin, Odean, 2021). First we calculate the mean of retail trades $\bar{v}_{it}$ (footnote 3) and the standard deviation $\sigma^2(v_{it})$ (footnote 4), and then compute SARV (footnote 5) and ARV (footnote 6) for the portfolio of 40 stocks we are working with.
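The two volume measures can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the author's implementation: the window is taken to be the 30 trading days before day t (the text does not say whether day t itself is included), the standard deviation uses the T-1 denominator of Equation (2), and the function name and sample numbers are hypothetical.

```python
from statistics import mean, stdev


def abnormal_retail_volume(volumes, window=30):
    """Return (t, ARV_t, SARV_t) for each day with a full trailing window.

    ARV_t  = v_t / mean(window)                     (Equation 4)
    SARV_t = (v_t - mean(window)) / std(window)     (Equation 3)
    The window covers the `window` days before t; that choice is an assumption.
    """
    results = []
    for t in range(window, len(volumes)):
        history = volumes[t - window:t]
        v_bar = mean(history)
        sigma = stdev(history)  # sample standard deviation, T-1 denominator
        arv = volumes[t] / v_bar
        sarv = (volumes[t] - v_bar) / sigma if sigma > 0 else float("nan")
        results.append((t, arv, sarv))
    return results


# Tiny made-up example with a 3-day window instead of 30.
out = abnormal_retail_volume([10, 12, 14, 20], window=3)
```

With the toy input, the window mean is 12 and the sample standard deviation is 2, so day 3 has ARV = 20/12 and SARV = (20 - 12)/2 = 4. A positive SARV marks a day on which volume is unusually high relative to the recent past.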

2 Delta SVI: $\Delta SVI_t = \ln(SVI_t) - \ln(SVI_{t-1})$, where $SVI_t$ corresponds to the search volume index on day t, measured in %, and $SVI_{t-1}$ is the search volume index for the previous day.
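The log change in footnote 2 can be illustrated with a minimal Python sketch (the thesis itself worked in R). The function name `delta_svi` and the sample values are hypothetical, and the sketch assumes every SVI value is strictly positive, since the logarithm is undefined at zero.

```python
import math

def delta_svi(svi):
    """Return ln(SVI_t) - ln(SVI_{t-1}) for each consecutive pair of values."""
    return [math.log(cur) - math.log(prev) for prev, cur in zip(svi, svi[1:])]

# Hypothetical daily search volume index values (in %), for illustration only.
svi_example = [50.0, 55.0, 44.0, 44.0]
changes = delta_svi(svi_example)
```

The output has one element fewer than the input, because the first day has no previous value to compare against.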

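3 Equation (1). Mean volume of retail trades of aggregate stocks: $\bar{v}_{it} = \frac{\sum_{t=0}^{T} v_{it}}{T}$

4 Equation (2). Variance of trade volume, whose square root is the standard deviation $\sigma(v_{it})$: $\sigma^2(v_{it}) = \frac{\sum_{t=0}^{T} (v_{it} - \bar{v}_{it})^2}{T-1}$

5 Equation (3). Standardized abnormal retail volume: $SARV_t = \frac{v_{it} - \bar{v}_{it}}{\sigma(v_{it})}$

6 Equation (4). Abnormal retail volume: $ARV_t = \frac{v_{it}}{\bar{v}_{it}}$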
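A minimal Python sketch of Equations (1) to (4) follows. The function names and sample volumes are hypothetical, and one detail is an assumption rather than something the text states: the mean and sample standard deviation are taken over the days that precede day t, with day t itself excluded.

```python
from statistics import mean, stdev

def sarv(volumes, window=30):
    """Standardized abnormal retail volume for each day that has a full window.

    Assumption: the window covers the `window` days before day t.
    A window with constant volume would give a zero standard deviation.
    """
    out = []
    for t in range(window, len(volumes)):
        past = volumes[t - window:t]
        out.append((volumes[t] - mean(past)) / stdev(past))
    return out

def arv(volumes, window=30):
    """Abnormal retail volume: volume on day t over the trailing mean."""
    return [volumes[t] / mean(volumes[t - window:t])
            for t in range(window, len(volumes))]

# Hypothetical volumes with a short window so the arithmetic is easy to follow.
volumes_example = [10, 12, 11, 13, 9, 30]
sarv_example = sarv(volumes_example, window=5)
arv_example = arv(volumes_example, window=5)
```

In this example the trailing mean is 11 and the sample variance is 2.5, so the last day's volume of 30 sits roughly twelve standard deviations above the recent average.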

This data covers a total of 564 days (weekends are excluded to match the data set established by RH user holdings). Using (1), (2), (3) and (4) we obtain the trend of standardized abnormal retail volume shown in Figure 5. The abnormal retail volume is positively skewed, consistent with the empirical analysis of Barber, Huang and Schwarz (2020), which shows that the effect of buying is more evident for Robinhood investors than for the “general population of retail traders”; this is because this measure seeks to avoid posterior bias. Likewise, closing prices for the 40 popular RH stocks were obtained from the Bloomberg terminal in order to build an index that compares the portfolio of stocks with the returns of the S&P 500 index. To treat the data, returns were computed for each stock from May 1st, 2018 until August 28th, 2020. We then calculate the log returns for each day and each stock and combine them with the return of the S&P 500 to construct the CSAD index (footnote 7). Solving the equation shows how the average return of the stocks differs from the return of the S&P 500 over the days we are working on. Therefore, this index can signal additional herding in the market among retail investors and indicate whether the market is behaving in line with the EMH (Efficient Market Hypothesis). In effect, when the market fluctuates, the CSAD tends to change as well, leading retail investors to follow market signals influenced by industry trends, volatility or sentiment. For example, during the high market volatility of March and April 2020 (when the pandemic hit the world and everybody went into lockdown) there were notable losses, especially in energy and consumer discretionary stocks, and clear winners such as Amazon and Apple. Consequently, when the relationship is increasing in market volatility, the market is rational; however, if the relationship is inverse (Figure 3), there is an evident herding effect in the tails of the market return distribution in how individual investors hold a stock over time. We can relate this cross-sectional analysis to the right tail of the market return distribution, as Christie and Huang (1995) analyze.

7 Cross Section Absolute Deviation: $CSAD_t = \frac{\sum_{t=1}^{N} |R_{i,t} - R_{m,t}|}{N}$, where $R_{i,t}$ corresponds to the return of each stock $i$ on day $t$, $R_{m,t}$ is the return of the S&P 500 index on day $t$, and $N$ is the number of days used for the calculation, which corresponds to 564 days in total.
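As an illustration of the CSAD idea, the Python sketch below computes the measure for a single day. It assumes the usual cross-sectional form, in which the absolute deviations from the market return are averaged across the stocks on that day; the indexing actually used in the thesis should be checked against its footnote. The function name and the returns are hypothetical.

```python
def csad(stock_returns, market_return):
    """Average absolute deviation of stock returns from the market return.

    Assumption: the average runs over the stocks observed on one day.
    """
    deviations = [abs(r - market_return) for r in stock_returns]
    return sum(deviations) / len(deviations)

# Hypothetical log returns for three stocks and the market on one day.
csad_example = csad([0.01, -0.02, 0.03], 0.005)
# If every stock moves exactly with the market, the dispersion is zero.
csad_no_dispersion = csad([0.01, 0.01], 0.01)
```

A low value means the stocks moved closely with the market that day, which is the pattern the herding literature looks for during large market moves.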

Data not considered

Once the data was organized, it became evident that variables such as the CBOE VIX volatility, the S&P 500 index, the returns of the 40 stocks considered in this research and the retail trade volume have no data for Saturdays or Sundays, unlike the Robinhood user holdings data. The weekend data was removed because it caused breaks when building the data set for the analysis. Of 818 rows of data, 254 rows were removed, which represents 31,05% of the total. This limitation, together with the fact that data is only available from 2018 until 2020, can significantly affect the quality of the analysis.
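The weekend filter can be sketched in Python as follows. The helper name and the two-week sample are hypothetical; only the 254 and 818 row counts come from the text, and the sketch assumes that removed rows are identified purely by day of the week.

```python
from datetime import date, timedelta

def drop_weekends(rows):
    """Keep only (date, value) rows that fall on Monday to Friday."""
    return [(d, v) for d, v in rows if d.weekday() < 5]

# Two hypothetical calendar weeks starting on Monday, 9 March 2020.
start = date(2020, 3, 9)
rows = [(start + timedelta(days=i), i) for i in range(14)]
weekdays_only = drop_weekends(rows)
removed_share = 1 - len(weekdays_only) / len(rows)

# Share of rows removed according to the counts reported in the text.
thesis_share = 254 / 818
```

Removing only Saturdays and Sundays drops two days in seven, about 28.6%, so the reported 31,05% suggests that some additional non-trading days were removed as well.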

b) Descriptive Analysis

With the preliminary data obtained from Robintrack, CBOE, Google Trends and the Bloomberg terminal, it is now necessary to perform a descriptive analysis of the main variables to be included. The data set analyzed in this section has 564 rows (days), which correspond to days when the market is open (Monday to Friday). Table 1 depicts the main descriptive statistics for the raw data extracted from the sources. Nevertheless, it is necessary to check graphically the trend, stationarity and distribution of each variable in order to transform the raw data for a better representation, which can be seen in Table 4. Frequency histograms were produced in R Studio. Figure 6 shows that Robinhood user holdings have a positive skew, as do the CSAD and the CBOE VIX index, while the SARV variable has a negative skew, even though for this variable the negative skew is not that evident graphically. A density line was overlaid on each plot to make this easier to see.

In addition, it is essential to carry out a time series analysis in order to make the regression model more robust by ensuring that each series is stationary (the distribution of the error term and the correlation between the error terms are constant over the period analyzed). The methodology of Rupert and Matteson (2015) was followed for this preliminary analysis. First, plots of RH user holdings, the CBOE VIX index, the S&P 500 index, CSAD, SARV and SVI were made together with their autocorrelation functions. As is evident in Figures 7, 8, 9, 10, 11 and 12, the variables respectively display a trend signal, and the ACF correlogram shows visually that each variable is not stationary. To confirm this numerically, we perform a unit root test using the Dickey-Fuller test, where the null hypothesis H0 states that there is a unit root (non-stationary time series), whereas the alternative H1 states that the data follows a stationary time series. In R Studio, adf.test() was used. To interpret the results: if the p-value is smaller than the level of significance (in this case 5%), the time series is stationary. Table 2 shows the stationarity analysis for the variables of interest. In conclusion, it is evident that the variables containing the raw data are generally not stationary, so we transform the data with log returns and repeat the time series analysis in order to use this data in the regression analysis. For the variables with non-stationary behavior, the data was treated with the diff() and log() functions in R to transform it into a stationary time series. Table 3 shows the ADF statistic along with the p-value for the variables without the trend component, confirming stationarity at a significance level of 5%.
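To make the unit root logic concrete, here is a simplified Python sketch of a Dickey-Fuller statistic. It is not a reimplementation of R's adf.test(), which typically also includes lagged differences and a trend term; this version regresses the first difference on a constant and the lagged level only. The series are simulated and all names are hypothetical. The value -2.86 is the usual large-sample 5% critical value for the constant-only case.

```python
import math
import random

def dickey_fuller_stat(y):
    """t-statistic of gamma in: diff(y)_t = alpha + gamma * y_{t-1} + error."""
    x = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    n = len(x)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    alpha = dy_bar - gamma * x_bar
    ssr = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se_gamma = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / se_gamma

# Simulated random walk (has a unit root) and its first difference (stationary).
rng = random.Random(0)
level, total = [], 0.0
for _ in range(500):
    total += rng.gauss(0.0, 1.0)
    level.append(total)
diffed = [b - a for a, b in zip(level, level[1:])]

CRITICAL_5PCT = -2.86  # approximate large-sample value, constant-only case
stat_level = dickey_fuller_stat(level)
stat_diff = dickey_fuller_stat(diffed)
```

A statistic below the critical value rejects the unit root. The differenced series produces a strongly negative statistic, which mirrors the effect of applying diff() and log() before re-running the test.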

c) Methodological Approach


Recalling the research topic of this work project, multiple linear regression was used to test whether the standardized abnormal retail volume, market return, market volatility, stock returns compared to the market return and search volume based on popularity can significantly explain changes in Robinhood aggregated user holdings for the top 40 most popular stocks on the platform. To develop this idea we use regression analysis; Rupert and Matteson (2015) present multiple regression, which relates Y to the regressor variables. For this case we denote:

∆𝑢𝐻𝑜𝑙𝑑𝑖𝑛𝑔𝑠𝑡= 𝛽0 + 𝛽1𝑆𝐴𝑅𝑉𝑡+ 𝛽2𝑙𝑜𝑔∆𝑆𝑃500𝑡+ 𝛽3𝑙𝑜𝑔∆𝑉𝐼𝑋𝑡+ 𝛽4𝐶𝑆𝐴𝐷𝑡+ 𝛽5𝑙𝑜𝑔∆𝑆𝑉𝐼𝑡 + 𝛽6𝑑𝐶𝑜𝑣𝑖𝑑𝑡+ 𝛽7𝑑𝐻𝑒𝑟𝑑𝑡+ 𝜖𝑡 (1)

Where ∆𝑢𝐻𝑜𝑙𝑑𝑖𝑛𝑔𝑠𝑡 is the lagged log change of Robinhood user holdings for day t. 𝑆𝐴𝑅𝑉𝑡 is the standardized retail volume for a portfolio of 40 popular stocks on day t. 𝑙𝑜𝑔∆𝑆𝑃500𝑡 corresponds to the log return of the S&P 500 index for day t. 𝑙𝑜𝑔∆𝑉𝐼𝑋𝑡 is the log change of the CBOE VIX index for each day t. 𝐶𝑆𝐴𝐷𝑡 is the cross-sectional absolute deviation index, which compares the returns on closing prices of the 40 popular stocks with the return of the market for each day t and shows whether individual investors are sensitive to changes in stock returns. 𝑙𝑜𝑔∆𝑆𝑉𝐼𝑡 is the transformed value of the Google search volume index for each day t. 𝑑𝐶𝑜𝑣𝑖𝑑𝑡 is a dummy variable that takes the value 1 if the day falls between 13th March 2020 and 13th August 2020 (the official dates on which the COVID-19 pandemic took place), and 0 otherwise. For 𝑑𝐻𝑒𝑟𝑑𝑡, a dummy variable was first calculated for each stock, taking the value 1 if the “percentage change in Robinhood users is in the top 0,5% for stocks with positive user changes on day t and a minimum of 100 users on day t-1” (Barber, Huang and Schwarz, 2020), and 0 otherwise; this showed that 𝑑𝐻𝑒𝑟𝑑𝑡 was more evident after 13th March 2020. Because we are working with a set of 40 stocks, this dummy variable was set to 1 on the days when this condition was met, and it was found that on average these days match the days when COVID-19 was declared a global emergency. Finally, 𝜖𝑡 is the error term.
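The COVID dummy can be sketched in Python as below. The constant and function names are hypothetical, and the sketch assumes that both end dates are included in the COVID window, which the text does not state explicitly.

```python
from datetime import date

# Window taken from the text; treating both end dates as inclusive is an assumption.
COVID_START = date(2020, 3, 13)
COVID_END = date(2020, 8, 13)

def d_covid(day):
    """Return 1 if the day falls inside the COVID window, 0 otherwise."""
    return 1 if COVID_START <= day <= COVID_END else 0

dummy_before = d_covid(date(2020, 3, 12))
dummy_first_day = d_covid(date(2020, 3, 13))
dummy_last_day = d_covid(date(2020, 8, 13))
```

Applying this function to every date in the sample gives the 0/1 column that enters the regression as a level shift for the pandemic period.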

We now compute the correlation coefficients for each variable, with the goal of detecting multicollinearity problems. In R Studio we use the function cor(), which produces Pearson's correlation coefficients, rounded to two decimal places. Table 5 and Figure 13 depict numerically the relationship between the variables to be used in the model. We can see that the variables 𝑙𝑜𝑔∆𝑉𝐼𝑋𝑡 and 𝑙𝑜𝑔∆𝑆𝑃500𝑡 have a strong negative relationship, which indicates that the two variables tend to move in opposite directions. We can confirm this in Figure 3: in March 2020, when COVID-19 was declared a global emergency, the CBOE VIX index tended to increase while the return of the market decreased. It is important to note that R Studio also provides a measure to detect multicollinearity problems, the Variance Inflation Factor (VIF), which helps diagnose correlation among the independent variables. This methodology is applied after the regression analysis.
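For readers who want to see what cor() computes, the following Python sketch implements Pearson's correlation coefficient directly. The two series are hypothetical daily log changes chosen to move in opposite directions; they are not the thesis data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical log changes for the S&P 500 and the VIX over five days.
sp500_changes = [0.01, -0.02, 0.005, -0.03, 0.015]
vix_changes = [-0.05, 0.10, -0.02, 0.18, -0.07]
r_example = round(pearson(sp500_changes, vix_changes), 2)
```

A coefficient close to -1, as in this constructed example, is the kind of strong negative relationship described for the VIX and the S&P 500.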

Likewise, a scatterplot was generated for each independent variable against the dependent variable, with the COVID dummy variable as a marker on each plot to show the pattern within groups. Figure 14 gives a better look at the relationship between groups. At first glance, the relationships between the pairs of variables appear non-linear. However, we can see that behavior during COVID-19 has a relationship with the dependent variable (∆𝑢𝐻𝑜𝑙𝑑𝑖𝑛𝑔𝑠𝑡) and a moderate relationship with the retail volume of popular stocks and with trends based on the Google search volume index. The next step of the analysis is to estimate regression (1) in R Studio. For this, the lm() function was used to fit the model, and the summary() function was used to view the estimates, statistics and p-value of each regressor. We express the model as: ∆𝑢𝐻𝑜𝑙𝑑𝑡~𝑆𝐴𝑅𝑉𝑡+ 𝑙𝑜𝑔∆𝑆𝑃500𝑡+ 𝑙𝑜𝑔∆𝑉𝐼𝑋𝑡+ 𝐶𝑆𝐴𝐷𝑡+ 𝑙𝑜𝑔∆𝑆𝑉𝐼𝑡+ 𝑑𝐶𝑜𝑣𝑖𝑑𝑡+ 𝑑𝐻𝑒𝑟𝑑𝑡.
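As a rough illustration of what lm() estimates, the Python sketch below fits ordinary least squares by solving the normal equations. It is a bare-bones stand-in, not the routine used in the thesis: it returns coefficients only, without standard errors or p-values. The function names and the two made-up regressors are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(y, regressors):
    """Return [intercept, slope_1, ...] from the normal equations X'X b = X'y."""
    rows = [[1.0] + list(values) for values in zip(*regressors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data generated exactly from y = 0.5 + 2*x1 - 3*x2.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0]
y = [0.5 + 2 * a - 3 * b for a, b in zip(x1, x2)]
coefficients = ols(y, [x1, x2])
```

With the seven regressors of the model, the same function would be called with seven lists, one per variable, each aligned by day.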

Empirical Results

After running the model in R Studio, Table 6 displays the overall regression output. To begin with, multiple R squared is biased for large models; the bias can be reduced by using the adjustment that replaces occurrences of n with the correct degrees of freedom of the model. The overall regression is statistically significant, with an adjusted R squared of 0,6393, an F statistic (7, 556) = 143,6 and a p-value of p<2.2e-16. The coefficients of the fitted regression model are $\hat{\beta}_0 = 0.00233320$, $\hat{\beta}_1 = -0.00173011$, $\hat{\beta}_2 = 0.03271852$, $\hat{\beta}_3 = 0.0037326$, $\hat{\beta}_4 = 0.0009089$, $\hat{\beta}_5 = -0.0033407$, $\hat{\beta}_6 = 0.0030193$ and $\hat{\beta}_7 = 0.0119906$. For the regressors, at a significance level of p<0,001, it was found that the popularity demand of retail investors measured with the SVI (Da, Engelberg and Gao 2013), the herding behavior indicator of “the % change in Robinhood users is in the top 0,5% for stocks with positive user changes on day t and a minimum of 100 users on day t-1” (Barber, Huang and Schwarz, 2020) and the macro event of the COVID-19 pandemic, considered a global emergency since March 13th 2020, are statistically significant. At a significance level of p<0,01, market return and return fluctuations in relation to the market return (CSAD) (Welch 2020) are statistically significant in explaining retail holding changes in Robinhood's app. At a significance level of p<0,05, changes in investor sentiment (Zou and Lei 2012), proxied by the CBOE VIX index, and the abnormal retail volume (Barber, Odean, 2006) are statistically significant for changes in Robinhood user holdings. To see the relationship between each predictor variable and the response variable ∆𝑢𝐻𝑜𝑙𝑑𝑡, we produced added variable plots with the function avPlots(), available in the car package in R Studio. In Figure 15 we can clearly see that retail investors react to low abnormal return and low popularity according to Google search volume. In contrast, RH investors respond positively to market returns (S&P 500), the direction of change of VIX volatility and the difference between the returns of their portfolio of stocks and the return of the market, and the changes were more evident after COVID-19 started (13th March 2020).
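The reported figures can be cross-checked with a short Python sketch. It recovers R squared from the F statistic and its degrees of freedom and then applies the degrees-of-freedom adjustment. The function names are hypothetical; the inputs (F = 143,6 with 7 and 556 degrees of freedom, hence 564 observations) come from the text.

```python
def r_squared_from_f(f_stat, k, df_resid):
    """Invert F = (R2 / k) / ((1 - R2) / df_resid) to recover R squared."""
    return f_stat * k / (f_stat * k + df_resid)

def adjusted_r_squared(r2, n, k):
    """Adjusted R squared with n observations and k regressors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n_obs, n_regressors = 564, 7
df_resid = n_obs - n_regressors - 1
r2 = r_squared_from_f(143.6, n_regressors, df_resid)
adj_r2 = adjusted_r_squared(r2, n_obs, n_regressors)
```

The result agrees with the reported adjusted R squared of 0,6393 up to the rounding of the F statistic, so the two reported numbers are mutually consistent.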

Diagnostics and Limitations.

We begin by examining the linear relationship of the model, creating a scatterplot of each regressor variable against the response variable. This shows whether there is any relationship between each pair of variables. From Figure 16, which illustrates the partial residual plot of each variable, we can see that the standardized abnormal retail volume, the difference between the return of the portfolio of stocks and the market return (CSAD), the VIX and both dummy variables have a more linear relationship with the changes in RH user holdings.

As discussed at the beginning, collinearity should be analyzed after computing the model to check that none of the regressors are highly correlated with each other. For this we can use the variance inflation factor (VIF), which indicates “how much the squared standard error (variance of 𝛽̂) in increased by having other predictor variables in the model” (Rupert and Matteson 2015). Mathematically, the VIF is expressed as $VIF = \frac{1}{1 - R_j^2}$, where $R_j^2$ is the R squared obtained by regressing the j-th predictor on the other predictors. The rule of thumb in this case is for the VIF to be less than 5 (benchmark), which means that the predictors used are not strongly correlated when the model is computed. In R Studio we load the faraway library and use the function vif(). Computing this function, along with Figure 17, we can see graphically that there is no correlation between the SVI, SARV, the dummies for COVID (dCOVID) and herding (dHERD), and the CSAD. In the case of the VIX and the S&P 500, a VIF between 1 and 5 suggests moderate correlation (consistent with Figure 13), but both can be included in the current model.
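The VIF formula is simple enough to check by hand, as in the Python sketch below. The function name and the R squared values are hypothetical and are not taken from the thesis output.

```python
def vif(r_squared_j):
    """Variance inflation factor for a predictor whose auxiliary R squared is given."""
    return 1.0 / (1.0 - r_squared_j)

# Hypothetical auxiliary R squared values, for illustration only.
vif_uncorrelated = vif(0.0)
vif_moderate = vif(0.49)
vif_at_benchmark = vif(0.8)
```

An auxiliary R squared of 0.8 corresponds to the benchmark VIF of 5, so the rule of thumb amounts to requiring that the other predictors explain less than 80% of the variation in each regressor.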


Now, to look at the independence of observations, we use the Durbin-Watson test, a good tool to assess residual autocorrelation. We define the null and alternative hypotheses as:

H0: There is no correlation among residuals

Ha: Residuals are autocorrelated

In R Studio we can use the function durbinWatsonTest() found in the car package. Table 7 shows that the test statistic is 1,93 and the corresponding p-value is 0,332. At a 5% significance level we do not reject the null hypothesis, concluding that the residuals of this model are not autocorrelated.
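The Durbin-Watson statistic itself is easy to compute from the residuals, as the Python sketch below shows. It returns the statistic only, without the p-value that durbinWatsonTest() also reports. The function name and the residual series are hypothetical.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals: alternating signs, then a constant run.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
```

The statistic ranges from 0 (strong positive autocorrelation) to 4 (strong negative autocorrelation), with values near 2 indicating no first-order autocorrelation, which is why a value of 1,93 supports the null hypothesis.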

Now let us look at multivariate normality; here we aim to check the normality of the residuals. To analyze this assumption we first look at the QQ plot. This graph depicts quantile points that help determine whether the residuals of the model follow a normal distribution. As a rule of thumb, if the points form a diagonal line, the normality assumption is fulfilled. We can also plot a histogram of residuals using the stures() function in R Studio. As Figure 18 shows, the QQ plot displays a good distribution of points along the 45° diagonal line; however, R has flagged observations 419 and 448 as problematic. In the histogram of residuals with a density line, the distribution looks approximately normal but is not exactly normal, which could be due to outliers of the model.

Finally, it is important to check that the error term has equal variance for any values of the independent variables. To see this, we plot the absolute residuals against the predicted values. As a rule of thumb, if this plot shows a systematic trend, then non-constant variance (heteroscedasticity) is present. In this case we can see a problem that may be due to outliers or influential points: in the pattern shown in Figure 19, R Studio detected 4 main points of interest. In addition, the residuals look linear in the range between 0.000 and ~0.006. To address the points of interest we use the outlierTest() function in R Studio. This function detects outliers in the model through Bonferroni p-values evaluated for each observation (Table 8). The null hypothesis of this test is that the observation is not an outlier, while the alternative hypothesis states otherwise. Computing this test, we find that at a 5% significance level there are 5 outliers that can affect the model, and it is necessary to examine them to understand their impact on the model and on the interpretation of the results. We can additionally calculate Cook’s distance, which measures how much the fitted values change if the nth observation is removed. Figure 20 shows that the changes in RH user holdings on days before the COVID pandemic (27/07/2018, 16/01/2020, 28/02/2020) and after it (17/03/2020 and 24/03/2020) are outliers, in some cases extreme outliers, which is a concern given the non-normality. However, “in financial time series, outliers are often “good observations” due to excess volatility in the markets on certain days” (Rupert and Matteson 2015). On these days there were increases in VIX volatility as well as in the S&P 500 index, which led people to hold more stocks through the herding found throughout the model, as Barber, Huang and Schwarz (2020) identified with herding events for the top 0,5% of stocks acquired by Robinhood retail investors each day, along with the cross-sectional absolute deviation. Similarly, the change in user holdings on these days is due to high-sentiment periods, as Zou and Lei (2012) argue in their work, where the VIX positively affects trading volume during periods in which events such as COVID or market news take place.

Another limitation is the data itself. Because Robintrack stopped providing data after August 2020, analyzing retail holding behavior was a challenge, not only because of the number of observations available but also because this limits the analysis of other data such as market returns, stock returns, volatility and search popularity. In addition, because the financial market does not operate during weekends, we had to omit around 30 percent of the data, which can influence the interpretation of the results.

5. Conclusions.

Robinhood has been a much-discussed company over the last two years. Even before the COVID-19 pandemic it was popular, and its popularity became more pronounced once the pandemic took hold in 2020, which drew researchers' attention to understanding and analyzing investor behavior on this platform. The purpose of this study was to explore whether Robinhood retail investors rely on investor sentiment (VIX), market returns, stock returns, herding and popularity searches to make their investment decisions. We found that individual investors are indeed sensitive to high-sentiment periods, as Zou and Lei (2012) analyze. Similarly, herding is also found in periods when events such as the announcement of the pandemic hit the world, as well as around news on the performance of the market and stock returns, captured by the methodology developed by Barber and Odean (2006) and Christie and Huang (1995). However, past returns and abnormal trading volume are closely connected in stock holding, as Chang and Cheng (2000) include in their study. Turning to popularity searches, because retail investors can only rely on search engines such as Google, this proxy for market herding can be a good tool to understand how these individuals react, not only in the analysis of these apps but also in future studies.

However, because the data was restricted to a span of 2 years and 3 months, a further examination of more recent years (2021 and 2022) was not possible. The main limitation of the data is that the Robintrack API was shut down in late August 2020 and no longer provides data on user stock holdings for each security; this limitation also applies to the variables used to explain aggregated changes in Robinhood user holdings. Nevertheless, this research contributes not only theoretically but also practically by implementing data analytics in finance with computational methods such as time series analysis, data transformation and empirical regression analysis, and it could be used to understand behavior not only on platforms like Robinhood but also on other platforms that provide financial technology services.

References

Aharon, David, and Mahmoud Qadan. 2020. "When do Retail Investors Pay Attention to their Trading Platforms?" North American Journal of Economics and Finance 12.

Devenow, Andrea, and Ivo Welch. 2016. "Rational Herding in Financial Economics." European Economic Review 13. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.17.4883&rep=rep1&type=pdf.

Barber, Brad, and Terrance Odean. 2006. All that Glitters: The Effect of Attention and News on the Buying Behavior of Individual and Institutional Investors. Berkeley, 51.

Barber, Brad, and Terrance Odean. 2006. "All that Glitters: The Effect of Attention on the Buying Behavior of Individual and Institutional Investors." Berkeley 52.

Barber, Brad, and Terrance Odean. 2013. The Behavior of Individual Investors. Elsevier.

Barber, Brad, Shengle Lin, and Terrance Odean. 2021. "Retail Trades Positively Predict Returns but are Not Profitable." 45.

Barber, Brad, Xing Huang, and Christopher Schwarz. 2020. "Attention Induced Trading and Returns: Evidence from Robinhood Users." 73.

Bloomberg. 2019. Millennials are pouring into Tesla's stock following the electric car maker quarter. Business Insider.

Bloomberg. 2019. Millennials were scooping up Boeing's stock amid the 737 Max's turmoil. Business Insider.

Boorstin, Julia. 2021. Robinhood’s disruptive force: The good, the bad and the controversy. CNBC. May. https://www.cnbc.com/2021/05/25/robinhoods-disruptive-trade-the-good-the-bad-and-the-controversy.html.

Chang, Eric, and Joseph Cheng. 2000. "An examination of herd behavior in equity markets: An international perspective." Journal of Banking & Finance 29.

Christie, William, and Roger Huang. 1995. "Following the Pied Piper: Do Individual Returns Herd around the Market?" Taylor & Francis 7.

Da, Zhi, Joseph Engelberg, and Pengjie Gao. 2013. "The Sum of All FEARS: Investor Sentiment and Asset Prices." Journal of Finance 43.

Engelberg, Joseph, and Christopher Parsons. 2011. The Causal Impact of Media in Financial Markets. Wiley.

Fama, Eugene. 2011. "Efficient Capital Markets: A review of theory and empirical work." The Journal of Finance 35.

Hayes, Adam. 2021. Retail Investor. February. https://www.investopedia.com/terms/r/retailinvestor.asp.

Hsieh, Shu, Chia Chan, and Ming Wang. 2020. "Retail investor attention and herding behavior." Journal of Finance 24.

Hsieh, Shu, Chia Chan, and Ming Wang. 2020. "Retail investor attention and herding behavior." Journal of Empirical Finance 24.

Hussain, Sartaj. 2019. "A Theoretical Evaluation of the Models for Stock Market Volatility." Institute of Advanced Studies 15.

Johnston, Matthew. 2022. Investopedia. January 29. https://www.investopedia.com/articles/active-trading/020515/how-robinhood-makes-money.asp.

Kueper, Justin. 2022. CBOE Volatility Index. February 11. https://www.investopedia.com/terms/v/vix.asp.

Mariño, Javier, Mariana Neves, and Cinderela Gouveia. 2020. "Fintech Report." Repositorio Universidade Nova. January. http://hdl.handle.net/10362/107455.

Merli, Maxime, and Tristan Roger. 2013. "What Drives The Herding Behavior of Individual Investors." Cairin Matieres a Reflexion 39.

Moran, Matthew, and Berlinda Liu. 2020. "The VIX index and volatility based global indexes and trading instruments." CFA Institute Research Foundation 32.

Norrestad, F. 2021. Statista Financial Instruments & Investments. December. https://www.statista.com/statistics/822176/number-of-users-robinhood/.

Olsen, Robert. 1998. "Behavioral Finance and Its Implications for Stock-Price Volatility." Taylor & Francis Group (Taylor & Francis Ltd.) 9.

Qin, Jie. 2015. "A model of regret, investor behavior, and market turbulence." Journal of Economic Theory (Journal of Economic Theory) 14.

Rupert, David, and David Matteson. 2015. Statistics and Data Analysis for Financial Engineering in R. Springer.

Shrikanth, Siddarth. 2020. ‘Gamified’ investing leaves millennials playing with fire by Financial Times. May. https://www.ft.com/content/9336fd0f-2bf4-4842-995d-0bcbab27d97a.

Shuang, Joanna, and Prem Jain. 2000. "Truth in Mutual Fund Advertising: Evidence on Future Performance and Fund Flows." Journal of Finance 23.

Welch, Ivo. 2020. "The wisdom of Robinhood Crowd and Covid Crisis." National Bureau of Economic Research (Journal of Finance) 45.

Zou, Maggie, and Violet Lei. 2012. "Investor sentiment: Relationship between VIX and Trading Volume." University of Macau 23.

Appendix

Figure 1: Number of users of the Robinhood app. Launched in 2015, the app grew rapidly, reaching 22,4 million users worldwide in 2021 (Norrestad 2021).

Figure 2: Aggregate user holdings for the 40 most popular stocks in Robinhood's app, obtained from Robintrack.net from 2nd May 2018 until 13th August 2020.


Figure 3: CBOE VIX volatility and S&P 500 index trend. Datasets extracted from CBOE.com and Bloomberg terminal respectively. Values obtained from May 2nd 2018 until August 13th 2020.

Figure 4: Google Search Volume Index (SVI). Data obtained from Google trends historical data from May 1st of 2018 until 31 of August of 2020.

Figure 5: Frequency histogram of the abnormal retail volume of retail trades for the 40 most popular stocks in Robinhood's app. Data set obtained from the Bloomberg terminal.

Figure 6: Frequency histograms for the variables of interest. The graphs use the raw data obtained from the sources: Robintrack, Bloomberg terminal, Google Trends and CBOE. The number of observations used was 564.


Figure 7: Trend Signal and Stationary Signal for Robinhood user holdings.

Figure 8: Trend Signal and Stationary Signal for Standardized Abnormal Retail Volume (SARV).


Figure 9: Trend Signal and Stationary Signal for the S&P 500 index.

Figure 10: Trend Signal and Stationary Signal for CBOE VIX volatility index.


Figure 11: Trend Signal and Stationary Signal for the CSAD index.

Figure 12: Trend Signal and Stationary Signal for the Google Search Volume index.


Figure 13: Visual Correlation Coefficient plot. Depicts relationship between independent variables and the dependent variable.

Figure 14: Scatterplot matrix of the changes in Robinhood user stock holdings. The variable rhUsers is the response variable. The red points are markers identifying the COVID-19 period within the groups; the blue points indicate no COVID-19.

Figure 15: Added variable plots which shows influence analysis of the dependent variable for each of the regressor from the model.

Figure 16: Partial residual plot for dependent variable against each of the predictor variables, which help us analyze linear relationship.

Figure 17: Visualization of the Variance Inflation Factor, which measures the strength of correlation among the independent variables of the current model. Variables with a VIF below 5 are considered good to include in the regression model.

Figure 18: QQ plot and distribution of the residuals from the regression model to check normality. It is evident that there are some outliers, corresponding to observations 419 and 448.

Figure 19: Plot of residuals versus fitted values. This helps us check whether there is constant variance among these values.

Figure 20: Cook's Distance that shows the observations of interest, these observations are more common days before lockdown due to COVID 19 pandemic and the first week after the emergency of this event.
