Gabriele Ranco, Ilaria Bordino, Giacomo Bormetti, Guido Caldarelli, Fabrizio Lillo, Michele Treccani.
Abstract
The new digital revolution of big data is deeply changing our ability to understand society and to forecast the outcome of many social and economic systems. Unfortunately, information can be very heterogeneous in the importance, relevance, and surprise it conveys, which severely affects the predictive power of semantic and statistical methods. Here we show that the aggregated behavior of web users can be used to overcome this problem in a hard-to-predict complex system, namely the financial market. Specifically, our in-sample analysis shows that combining sentiment analysis of news with the browsing activity of Yahoo! Finance users greatly helps in forecasting intra-day and daily price changes of a set of 100 highly capitalized US stocks traded in the period 2012-2013. Sentiment analysis or browsing activity taken alone has very little or no predictive power. Conversely, when we consider a news signal that, in a given time interval, averages the sentiment of the clicked news weighted by the number of clicks, we show that for nearly 50% of the companies this signal Granger-causes hourly price returns. Our result indicates a "wisdom-of-the-crowd" effect that makes it possible to exploit users' activity to identify and properly weigh the relevant and surprising news, considerably enhancing the forecasting power of news sentiment.
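The click-weighted news signal described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout `(timestamp, sentiment, clicks)`, the function name `click_weighted_sentiment`, the hourly default bin width, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict

def click_weighted_sentiment(news, interval_seconds=3600):
    """Average sentiment of clicked news per time bin, weighted by clicks.

    `news` is an iterable of (timestamp_seconds, sentiment, clicks) tuples,
    where `clicks` is the number of clicks the item received in that interval
    (assumed layout). Returns {bin_index: weighted_average}; bins whose total
    click count is zero are omitted because the average is undefined there.
    """
    weighted_sum = defaultdict(float)
    total_clicks = defaultdict(int)
    for timestamp, sentiment, clicks in news:
        bin_index = int(timestamp // interval_seconds)
        weighted_sum[bin_index] += sentiment * clicks
        total_clicks[bin_index] += clicks
    return {
        b: weighted_sum[b] / total_clicks[b]
        for b in sorted(total_clicks)
        if total_clicks[b] > 0
    }

# Hypothetical records: (seconds since start, sentiment in [-1, 1], clicks)
sample = [
    (120, 1.0, 30),    # positive news, heavily clicked
    (1800, -1.0, 10),  # negative news, fewer clicks
    (4000, -0.5, 4),
    (5000, 0.0, 0),    # unclicked news contributes nothing
]
signal = click_weighted_sentiment(sample)
```

In the first hourly bin the positive item dominates because it attracted three times as many clicks, which is the weighting effect the abstract attributes to users' browsing activity.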
Year: 2016 PMID: 26808833 PMCID: PMC4726698 DOI: 10.1371/journal.pone.0146576
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Complementary cumulative distribution function of the number of clicks a news item receives, for the ten assets with the largest number of news items and for the aggregate portfolio of 100 stocks.
Both coordinates have been rescaled by a common factor preserving the power law scaling of the right tail and normalizing the maximum number of clicks to the value 10^10. The dotted line corresponds to a power law with tail exponent fitted from the portfolio time series. We provide details about the standard error and the complete list of tail exponents for all the companies in S1 Text.
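For readers who want to reproduce a plot of this kind, the sketch below computes an empirical complementary cumulative distribution function and a tail exponent. The caption does not say how the exponent was fitted; the Hill estimator used here is an assumption chosen for illustration, as are the function names and the cutoff parameter `k`.

```python
import math

def ccdf(values):
    """Empirical complementary CDF: (x, P(X >= x)) for each distinct x."""
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered):
        # Emit one point per distinct value, at its first occurrence.
        if i == 0 or x != ordered[i - 1]:
            points.append((x, (n - i) / n))
    return points

def hill_tail_exponent(values, k):
    """Hill estimate of the power-law tail exponent from the k largest values.

    Assumes strictly positive observations (e.g. click counts >= 1).
    """
    ordered = sorted(values, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < len(values)")
    threshold = ordered[k]  # the (k+1)-th largest observation
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess

# Hypothetical click counts
clicks = [1, 2, 2, 4]
points = ccdf(clicks)
```

The Hill estimate is sensitive to the choice of `k`, so in practice it is usually inspected over a range of cutoffs rather than at a single value.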
Fig 2. Spearman’s correlation coefficients for the de-seasonalized time series of all the 100 companies at hourly scale.
The x axis reports the list of companies, each identified by a unique number, as detailed in the main text. Among the several possible pairs of time series, we consider only three, and the color scale corresponds to the level of correlation. We plot the values for which we reject the null hypothesis of zero correlation at the 5% significance level and set non-significant values to zero (light green color).
Percentage of companies for which we reject the null hypothesis of zero Spearman correlation at the 5% significance level.
| Time interval (minutes) | | | |
|---|---|---|---|
| 1 | 7 | 86 | 95 |
| 10 | 3 | 72 | 90 |
| 30 | 5 | 54 | 85 |
| 65 | 4 | 36 | 79 |
| 130 | 4 | 26 | 76 |
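Each percentage in the table counts the companies for which a zero-correlation test is rejected. A minimal sketch of one such test follows. The source does not state how the p-value was obtained; the large-sample normal approximation used here (rho times the square root of n - 1 treated as standard normal under the null) is an assumption, and the function names and sample series are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman's rho and an approximate two-sided p-value.

    Assumes equal-length, non-constant series.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    rho = cov / math.sqrt(vx * vy)
    z = rho * math.sqrt(n - 1)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rho, p_value

# Hypothetical series: monotonically related, so rho is 1
rho, p_value = spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
reject_null = p_value < 0.05
```

For short series an exact or t-based test would be more appropriate than the normal approximation.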
Fig 3. Time evolution of the cumulative number of clicks per news item in a time interval of five hours after publication.
We normalize the cumulative count by a constant corresponding to the total number of clicks received by a single news item during the first week after publication. The news items are grouped in deciles according to the total number of clicks they received until October 2013, and the curves represent average values. Inset: estimated values and standard errors of the attention time scale obtained by an exponential fit of the decile curves.
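The exponential fit mentioned in the caption can be illustrated with a short sketch. The caption does not give the functional form; the saturating curve C(t) = 1 - exp(-t / tau) and the log-linearized least-squares fit through the origin are assumptions made here, and the function name and data are hypothetical.

```python
import math

def fit_attention_time_scale(times, fractions):
    """Estimate tau in C(t) = 1 - exp(-t / tau) (assumed model).

    Linearizes to -ln(1 - C) = t / tau and fits the slope by least squares
    through the origin. `fractions` must lie in [0, 1).
    """
    numerator = 0.0
    denominator = 0.0
    for t, c in zip(times, fractions):
        if not 0.0 <= c < 1.0:
            raise ValueError("normalized cumulative clicks must be in [0, 1)")
        y = -math.log(1.0 - c)
        numerator += t * y
        denominator += t * t
    slope = numerator / denominator  # estimate of 1 / tau
    return 1.0 / slope

# Hypothetical decile curve generated with tau = 2 hours
hours = [0.5, 1.0, 2.0, 3.0, 5.0]
curve = [1.0 - math.exp(-t / 2.0) for t in hours]
tau = fit_attention_time_scale(hours, curve)
```

A nonlinear fit on the original scale would weight the observations differently and would also yield standard errors of the kind shown in the inset.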
Fig 4. Granger causality tests at hourly scale between de-seasonalized time series (x axis as in Fig 2).
The white cells correspond to tests for which we do not reject the null hypothesis of no Granger causality at 5% significance level. A black cell corresponds to a statistically significant Granger causality.
Number of companies for which we reject the null hypothesis of no Granger causality at the 5% significance level.
| Causality relation | Hourly scale | Daily scale |
|---|---|---|
| | 4 | 18 |
| | 13 | 9 |
| | 19 | 11 |
| | 53 | 37 |
| | 100 | 97 |
| | 65 | 52 |
| | 69 | 52 |
| | 96 | 16 |
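Each count in the table comes from a Granger causality test between two series. The sketch below shows the idea for a single lag: it compares a regression of y on its own past with one that also includes the past of x. The lag order, the asymptotic chi-square p-value, the function names, and the simulated data are assumptions for illustration; the source does not specify the authors' test settings here.

```python
import math
import random

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def residual_sum_of_squares(rows, targets):
    """Fit targets on rows by OLS (normal equations) and return the RSS."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )

def granger_lag1(x, y):
    """Test whether x Granger-causes y using one lag.

    Returns (wald_statistic, p_value); the p-value uses the asymptotic
    chi-square distribution with one degree of freedom.
    """
    targets = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = residual_sum_of_squares(restricted, targets)
    rss_u = residual_sum_of_squares(unrestricted, targets)
    wald = len(targets) * (rss_r - rss_u) / rss_u
    # P(chi2_1 > w) = erfc(sqrt(w / 2)); guard against tiny negative w.
    p_value = math.erfc(math.sqrt(max(wald, 0.0) / 2.0))
    return wald, p_value

# Simulated data in which y follows the previous value of x plus noise
rng = random.Random(0)
x = [rng.random() - 0.5 for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * (rng.random() - 0.5) for t in range(1, 200)]
forward = granger_lag1(x, y)   # does x Granger-cause y?
backward = granger_lag1(y, x)  # does y Granger-cause x?
```

Rejecting the null at the 5% level corresponds to `p_value < 0.05`; repeating the test per company and per pair of series gives counts of the kind reported in the table.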