
Predicting Fluctuations in Cryptocurrency Transactions Based on User Comments and Replies.

Young Bin Kim1, Jun Gi Kim2, Wook Kim3, Jae Ho Im3, Tae Hyeong Kim1, Shin Jin Kang2, Chang Hun Kim3.   

Abstract

This paper proposes a method to predict fluctuations in the prices of cryptocurrencies, which are increasingly used for online transactions worldwide. Little research has been conducted on predicting fluctuations in the price and number of transactions of a variety of cryptocurrencies. Moreover, the few methods proposed to predict fluctuation in currency prices are inefficient because they fail to take into account the differences in attributes between real currencies and cryptocurrencies. This paper analyzes user comments in online cryptocurrency communities to predict fluctuations in the prices of cryptocurrencies and the number of transactions. By focusing on three cryptocurrencies, each with a large market size and user base, this paper attempts to predict such fluctuations by using a simple and efficient method.

Year:  2016        PMID: 27533113      PMCID: PMC4988639          DOI: 10.1371/journal.pone.0161197

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The ubiquity of Internet access has triggered the emergence of currencies distinct from those used in the prevalent monetary system. The advent of cryptocurrencies based on a unique method called “mining” has brought about significant changes in the online economic activities of users. Various cryptocurrencies have emerged since 2008, when Bitcoin was first introduced [1, 2]. Nowadays, cryptocurrencies are often used in online transactions, and their usage has increased every year since their introduction [3, 4]. Cryptocurrencies are primarily characterized by fluctuations in their price and number of transactions [2, 3]. For instance, the most famous cryptocurrency, Bitcoin, witnessed no significant fluctuation in its price and number of transactions until the end of 2013 [3], when it began to garner worldwide attention and its price and number of transactions rose and fluctuated significantly. Other cryptocurrencies, such as Ripple and Litecoin, have shown significantly unstable fluctuations since the end of December 2013 [5]. Such unstable fluctuations have served as an opportunity for speculation for some users while hindering most others from using cryptocurrencies [2, 6, 7]. Research on the attributes of cryptocurrencies has made steady progress but has a long way to go. Most researchers analyze user sentiments related to cryptocurrencies on social media, e.g., Twitter, or quantified Web search queries on search engines, such as Google, as well as fluctuations in price and trade volume to determine any relation [8-12]. Past studies have been limited to Bitcoin, because the large amount of data that it provides has removed the need to build a model to predict fluctuations in the price and number of transactions of diverse cryptocurrencies. Therefore, this paper proposes a method to predict fluctuations in the price and number of transactions of cryptocurrencies.
The proposed method analyzes user comments on online cryptocurrency communities, and conducts an association analysis between these comments and fluctuations in the price and number of transactions of cryptocurrencies to extract significant factors and formulate a prediction model. The method is intended to predict fluctuations in cryptocurrencies based on the attributes of online communities. Online communities serve as forums where people share opinions regarding topics of common interest [13-17]. Therefore, such communities mirror the responses of many users to certain cryptocurrencies on a daily basis. Cryptocurrencies are largely traded online, where many users rely on information on the Web to make decisions about selling or buying them [4, 18]. In this paper, daily topics and relevant comments/replies in cryptocurrency communities are analyzed to determine how the opinions of community users are associated with fluctuations in the price and number of transactions of cryptocurrencies on a daily basis. The proposed method is applicable to a range of cryptocurrencies, and can predict fluctuations in the prices of such cryptocurrencies as Bitcoin, Ripple, and Ethereum to a certain extent (approximately 74% weighted average precision). Moreover, the rise and fall in the number of transactions of Bitcoin and Ethereum can be predicted to some extent.

Methods

System Overview

For the proposed system, we crawled all comments and replies posted in online communities relevant to cryptocurrencies [19-21]. We then analyzed the data (comments and replies) and tagged the extent of positivity or negativity of each topic as well as that of each comment and reply. Following this, we tested the relation between the price and number of transactions of cryptocurrencies based on user comments and replies to select data (comments and replies) that showed significant relation. Finally, we created a prediction model via machine learning based on the selected data to predict fluctuations (Fig 1).
Fig 1

System overview.

Crawling user comment data

We crawled data needed to create the prediction model. Once the environment for cryptocurrency trading among users is established, transactions between users lead to fluctuations in price [4]. We hypothesized that user comments in certain online cryptocurrency communities may affect fluctuations in the price and trading volume of those cryptocurrencies. Thus, we crawled the relevant data. Approximately 670 types of cryptocurrencies existed as of February 2016 [22]. Of the available ones, we crawled online communities for the top three in terms of market cap, i.e., Bitcoin, Ethereum, and Ripple. We did not include Litecoin in this study because its online communities seemed not to be sufficiently active to be considered in this experiment, despite its large market cap and broad user base. Since Bitcoin was the first cryptocurrency, it has a large user community. In the Bitcoin community [19], data items were collected starting from December 2013, when the cryptocurrency became widely available. In the Ethereum community [20], data were collected from August 7, 2015, the date from which the community had stabilized to the extent that at least one topic was posted every day and transaction data were available. From the Ripple community [21], all data since the creation of the community were gathered. In all communities of interest, we collected data in a legitimate manner, in compliance with their terms and conditions. Moreover, the collected data did not involve any personally identifiable information. The cryptocurrencies of interest in this paper had online communities where users shared opinions on the relevant topics. The Bitcoin community [19] is divided into four sections, i.e., a “Bitcoin” section on Bitcoin-related topics, an “Economy” section on transactions, an “Alternate cryptocurrencies” section concerning other cryptocurrencies, and an “Other” section for other topics. Each section has three to five subsections.
The “Bitcoin” section consisted of “Bitcoin Discussion,” “Development & Technical Discussion,” “Mining,” “Technical Support,” and “Project Development.” The “Alternate cryptocurrencies” section had a similar structure. For this paper, we crawled the discussion sub-sections for topics related to each of the cryptocurrencies. Comments and relevant replies posted by users on bulletin boards in each community were crawled. Furthermore, the time when each comment and replies to it were posted, the number of replies to each comment, and the number of views were crawled as well. Replies quoting previous comments and replies were crawled excluding overlapping sentences. Each community’s HTML page was crawled using Python [23]. Using Python’s regex, we parsed the tags on HTML pages to extract the number of topics, the number of replies, the dates on which the topics and replies were posted, and the URL of each topic from the bulletin boards. Based on the URLs of extracted topics, their contents and replies to them were extracted. The extracted topics, the dates on which they were posted, topic contents, reply contents, and reply dates were saved in .json format, which was in turn converted into other formats (e.g. csv) appropriate for different purposes. The .json files of the communities crawled can be viewed in the supporting information. One researcher executed the crawling on a single PC for 48 ~ 72 hours, where the time varied with the size of the community. The Bitcoin and Ethereum forums were crawled on February 1 and 8, 2016, respectively, whereas the Ripple forum was crawled on January 21, 2016. Table 1 outlines the arrangement of the opinion data that were gathered.
Table 1

Summary of crawled opinion data.

Target Cryptocurrencies | Crawling Source (opinion topics) | Crawling Boundary | Data Volume (threads)
Bitcoin | Bitcoin Forum | Dec. 01, 2013 ~ Feb. 01, 2016 | 13,360
Ethereum | Ethereum Forum | Aug. 07, 2015 ~ Feb. 08, 2016 | 1,449
Ripple | Ripple Forum | Sept. 07, 2015 ~ Jan. 21, 2016 | 468
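The regex-based extraction described above can be illustrated with a minimal sketch. The HTML snippet, the tag and class names, and the `parse_board` helper are all hypothetical; the actual markup of the crawled forums is not reproduced in the paper and will differ, so this only shows the general shape of parsing a board listing and saving it as JSON.

```python
import json
import re

# Hypothetical markup for two rows of a forum board listing (an assumption,
# not the real forum HTML).
SAMPLE_BOARD_HTML = """
<tr class="topic"><a href="https://forum.example/topic=101">Is Bitcoin a bubble?</a>
  <td class="replies">12</td><td class="views">340</td><td class="date">2016-01-02</td></tr>
<tr class="topic"><a href="https://forum.example/topic=102">Mining difficulty</a>
  <td class="replies">3</td><td class="views">58</td><td class="date">2016-01-02</td></tr>
"""

# One pattern per topic row: URL, title, reply count, view count, posting date.
TOPIC_ROW = re.compile(
    r'<a href="(?P<url>[^"]+)">(?P<title>[^<]+)</a>\s*'
    r'<td class="replies">(?P<replies>\d+)</td>'
    r'<td class="views">(?P<views>\d+)</td>'
    r'<td class="date">(?P<date>[\d-]+)</td>'
)


def parse_board(html):
    """Extract one record per topic row found in the board HTML."""
    topics = []
    for match in TOPIC_ROW.finditer(html):
        topics.append({
            "url": match.group("url"),
            "title": match.group("title").strip(),
            "replies": int(match.group("replies")),
            "views": int(match.group("views")),
            "date": match.group("date"),
        })
    return topics


topics = parse_board(SAMPLE_BOARD_HTML)
# Serialize to JSON, the storage format mentioned in the text.
print(json.dumps(topics, indent=2))
```

In a real crawl, each extracted URL would then be fetched to collect the topic body and its replies.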
The crawled data included garbage, e.g., ads and meaninglessly repetitive postings or replies. Quite a few spam filtering techniques were investigated to remove such garbage data [15, 24–29]. Any posting of more than two sentences found more than five times a day was considered spam and treated as such.
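The spam rule stated above (a posting of more than two sentences that appears more than five times in a day) can be sketched as follows. The naive sentence splitter, the exact-match comparison after lowercasing, and the function names are assumptions made for illustration; the paper does not specify how repeated postings were matched.

```python
import re
from collections import Counter


def sentence_count(text):
    """Count sentences with a naive split on ., ! and ? (an assumption)."""
    return len([s for s in re.split(r"[.!?]+", text) if s.strip()])


def filter_spam(posts, sentence_limit=2, repeat_limit=5):
    """Drop postings longer than `sentence_limit` sentences that occur more
    than `repeat_limit` times on the same date.

    posts: list of (date, text) tuples.
    """
    counts = Counter((date, text.strip().lower()) for date, text in posts)
    kept = []
    for date, text in posts:
        repeats = counts[(date, text.strip().lower())]
        is_spam = sentence_count(text) > sentence_limit and repeats > repeat_limit
        if not is_spam:
            kept.append((date, text))
    return kept


ad = "Buy coins now. Visit our site. Get rich fast."
posts = [("2016-01-02", ad)] * 6 + [("2016-01-02", "Thanks!")] * 6
print(len(filter_spam(posts)))  # only the short replies remain
```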

Tagging user comments data

In this step, positive/negative replies to the crawled user comment data were tagged. Many past studies have dealt with classifying user sentiment or comment data [15, 30–35]. In this vein, user reviews have been used to create a classifier based on machine learning [36-40], and user comments on the Web have been statistically analyzed for sentiment tagging [41-43]. Past research has mostly focused on classifying user comments in particular fields. Comments on online communities involve considerable use of neologisms, slang, and emoticons that transcend grammatical usage. C.J. Hutto and Eric Gilbert introduced an algorithm called VADER [44] to parse such expressions, and proposed a method to analyze social media texts by drawing on a rule-based model. Online communities of interest in this paper paralleled social media texts. Thus, user comment data were tagged based on this algorithm. VADER normalizes positive and negative sentiments from -1 to 1. Based on the normalized figure, x, values of -1 ≤ x < -0.6, -0.6 ≤ x < -0.2, 0.2 ≤ x < 0.6, and 0.6 ≤ x ≤ 1.0 were tagged as very negative, negative, positive, and very positive, respectively. In this paper, each of the comments and replies was tagged (see the opinion analysis example in Table 2).
Table 2

Bitcoin Community Opinion Analysis Example.

Opinion Criteria | Example topic sentences
Very Positive | “I am selling for $100 a Starbucks Gift card with a loaded balance of $20 worth of BTC” / “Bitcoin is the global currency of the Earth” / “How can 1 BTC eventually be worth $11 M”
Positive | “We are in Bitcoin Heaven” / “Bitcoin to eventually replace Apps like Uber” / “Russians can Pay Internet and phone bills with Bitcoin without fees”
Neutral | “Do you think Bitcoin will disappear or sopt being used?” / “What you like the best about Bitcoin?” / “Can Bitcoin make banks disappear?”
Negative | “Bitcoin: Should you stay or should you go?” / “Is there a way to earn at least $1 in BTC per hour?” / “IMF fears cryptocurrencies may circumvent capital controls”
Very Negative | “Bitcoin used to be involved in money laundering - will it become a huge problem?” / “Bitcoin cold storage - Hacked easily” / “Russia's Finance Ministry wants to ban Bitcoin”
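The mapping from a normalized VADER score to the five opinion classes can be written as a small function. This is a sketch: the text gives the four non-neutral intervals explicitly, and treating the remaining band (-0.2 ≤ x < 0.2) as neutral is an assumption inferred from Table 2. The function name `tag_sentiment` is hypothetical.

```python
def tag_sentiment(x):
    """Map a normalized sentiment score in [-1, 1] to an opinion class."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    if x < -0.6:
        return "very negative"
    if x < -0.2:
        return "negative"
    if x < 0.2:
        # Assumption: the band not covered by the stated intervals is neutral.
        return "neutral"
    if x < 0.6:
        return "positive"
    return "very positive"


# The score itself could come from the vaderSentiment package, e.g.
# SentimentIntensityAnalyzer().polarity_scores(text)["compound"]; it is not
# imported here so that the sketch stays self-contained.
for score in (-0.9, -0.4, 0.0, 0.35, 0.8):
    print(score, tag_sentiment(score))
```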

Prediction modeling

The crawled user comment data were tagged to create a prediction model. To create the prediction model, data selection was performed again. All opinions, from very negative to very positive comments and replies, could have been used. Yet, we intended to improve the qualitative results and minimize operation cost. For data selection, we performed an association analysis between the results of opinion analysis and fluctuations in cryptocurrency prices. In this paper, the Granger causality test, which is widely used in research on the value of shares and currencies, was adopted [45]. As shown in Eq 1, the results of opinion analysis based on the topics and replies (VADER-based tagged values), the number of topics posted, the number of replies posted, and the total number of views of the topics posted on a certain day were transformed into z-scores for standardization against the previous 10 days. Likewise, the fluctuations in the price and number of transactions of cryptocurrencies were transformed into z-scores for standardization against the previous 10 days. On a certain date t (t = 10 in the paper), the z-score of a certain item $X_t$, denoted by $Z_{X_t}$, was defined as:
$$Z_{X_t} = \frac{X_t - \mu(X_t)}{\sigma(X_t)} \qquad (1)$$
where $\mu(X_t)$ and $\sigma(X_t)$ respectively represent the mean and standard deviation of each item over the standardization window, computed for every date. Fig 2 shows an example of test results comparing the fluctuations in cryptocurrency prices and results of opinion analysis z-scores.
Fig 2

Z-scores of fluctuations in cryptocurrency prices overlapping with results of opinion analysis.

Some opinions show a trend similar to that of fluctuations in cryptocurrency prices.
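The standardization against the previous 10 days can be sketched as a rolling z-score. Several details are assumptions not settled by the text: the window here excludes the current day, the population standard deviation is used, and a zero-variance window yields a z-score of 0. The function name `rolling_zscores` is hypothetical.

```python
from statistics import mean, pstdev


def rolling_zscores(values, window=10):
    """z-score each value against the `window` values that precede it.

    The first `window` values have no full history and are skipped.
    """
    zscores = []
    for t in range(window, len(values)):
        history = values[t - window:t]
        mu, sigma = mean(history), pstdev(history)
        # Assumption: a flat history (sigma == 0) maps to a z-score of 0.
        zscores.append((values[t] - mu) / sigma if sigma > 0 else 0.0)
    return zscores


daily_topic_counts = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13, 25, 9]
print(rolling_zscores(daily_topic_counts))
```

The same transformation would be applied to each opinion series and to the price and transaction series before the association analysis.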

The standardized z-scores underwent the Granger causality test to determine the significance of association. The Granger causality test relies on the assumption that if a variable X causes Y, then changes in X will systematically occur before changes in Y [46]. As demonstrated in previous studies, lagged values of X exhibit a statistically significant correlation with Y [15, 46]. Correlation does not prove causation, however. We are not testing actual causation, but only whether the time series of community opinions contained predictive information regarding the fluctuations in cryptocurrency prices. Our time series for the prices of cryptocurrencies and number of transactions, denoted by $S_t$, reflected daily changes in the prices of cryptocurrencies and the number of transactions. To test whether the community opinion time series can predict fluctuations in cryptocurrency prices, we compared the variance explained by two linear models, as shown in Eqs 2 and 3:
$$S_t = \alpha + \sum_{i=1}^{n} \gamma_i S_{t-i} + \varepsilon_t \qquad (2)$$
$$S_t = \alpha + \sum_{i=1}^{n} \gamma_i S_{t-i} + \sum_{i=1}^{n} \beta_i X_{t-i} + \varepsilon_t \qquad (3)$$
The first model uses only n lagged values of $S_t$ (i.e., $S_{t-1}, \cdots, S_{t-n}$) for prediction, whereas the second model uses the n lagged values of both $S_t$ and the community opinion time series, denoted by $X_{t-1}, \cdots, X_{t-n}$. We performed the Granger causality test according to the models in Eqs 2 and 3. Based on the results of the Granger causality test, we can reject the null hypothesis that the community opinion time series does not predict fluctuations in cryptocurrency prices (i.e., we conclude that $\beta_{\{1,2,\cdots,n\}} \neq 0$) with a high level of confidence. The community opinions with the highest Granger causality relation (p-value < 0.05) were extracted. The Granger causality test was performed on each currency for a time lag of 1 to 13 days. Experimentally, a time lag of 14 days and longer proved insignificant. Depending on the difference in each time lag measurement, elements showing significant associations were identified.
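The comparison of the two linear models can be illustrated with a small, dependency-free sketch that fits both by least squares and forms the usual F statistic from their residual sums of squares. This is not the authors' implementation: the helper names, the normal-equation solver, and the synthetic data are assumptions for illustration, and the p-value (obtainable from an F distribution with n and dof degrees of freedom, for example via `scipy.stats.f.sf`) is left out to keep the code standard-library only.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via normal equations."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [u - factor * v for u, v in zip(a[r], a[col])]
    coef = [a[i][k] / a[i][i] for i in range(k)]
    return sum((y - sum(c * v for c, v in zip(coef, r))) ** 2
               for r, y in zip(rows, targets))


def granger_f(s, x, n):
    """F statistic testing whether n lags of x improve the prediction of s."""
    t_idx = range(n, len(s))
    y = [s[t] for t in t_idx]
    # Restricted model: intercept plus n lags of s.
    restricted = [[1.0] + [s[t - i] for i in range(1, n + 1)] for t in t_idx]
    # Full model: additionally n lags of x.
    full = [row + [x[t - i] for i in range(1, n + 1)]
            for row, t in zip(restricted, t_idx)]
    rss_r, rss_f = ols_rss(restricted, y), ols_rss(full, y)
    dof = len(y) - (2 * n + 1)
    return ((rss_r - rss_f) / n) / (rss_f / dof)


# Synthetic data: s follows x with a one-day delay; `unrelated` does not.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
noise = [random.gauss(0, 1) for _ in range(200)]
s = [0.0] + [0.8 * x[t - 1] + 0.1 * noise[t] for t in range(1, 200)]
unrelated = [random.gauss(0, 1) for _ in range(200)]
print(granger_f(s, x, 2), granger_f(s, unrelated, 2))
```

A large F statistic for the first pair and a small one for the second mirrors the selection step in the text, where only opinion series with significant predictive content were kept.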
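For the prediction, the fluctuations in cryptocurrency prices were determined in a binary manner. We generated and validated the prediction model based on averaged one-dependence estimators (AODE) [47]. Based on AODE, we estimated the probability of a binary class y, given an item-related set of features $x_1, \cdots, x_n$, i.e., $P(y \mid x_1, \cdots, x_n)$. This probability was estimated as follows:
$$\hat{P}(y \mid x_1, \cdots, x_n) = \frac{\sum_{i: 1 \le i \le n \wedge F(x_i) \ge m} \hat{P}(y, x_i) \prod_{j=1}^{n} \hat{P}(x_j \mid y, x_i)}{\sum_{y' \in Y} \sum_{i: 1 \le i \le n \wedge F(x_i) \ge m} \hat{P}(y', x_i) \prod_{j=1}^{n} \hat{P}(x_j \mid y', x_i)} \qquad (4)$$
where $\hat{P}(\cdot)$ denotes an estimate of $P(\cdot)$, $F(\cdot)$ is the frequency, and m is the frequency limit set at 1 in this paper. In the next section, we discuss the results of the applied system.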
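A compact sketch of an AODE classifier for discrete features may help make the estimate concrete. It is an illustration under stated assumptions, not the authors' code: the class name and toy data are hypothetical, probabilities are Laplace-smoothed (the paper does not say which estimator was used), the product skips j = i because $P(x_i \mid y, x_i) = 1$, and a uniform distribution is returned when no feature value meets the frequency limit m.

```python
from collections import Counter


class AODE:
    """Averaged one-dependence estimators for discrete features (sketch)."""

    def __init__(self, m=1):
        self.m = m  # frequency limit for a feature value to act as parent

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.total = len(rows)
        self.values = [sorted({r[j] for r in rows}) for j in range(len(rows[0]))]
        self.f_value = Counter()   # (i, x_i) -> F(x_i)
        self.f_parent = Counter()  # (y, i, x_i) -> count
        self.f_pair = Counter()    # (y, i, x_i, j, x_j) -> count
        for r, y in zip(rows, labels):
            for i, xi in enumerate(r):
                self.f_value[(i, xi)] += 1
                self.f_parent[(y, i, xi)] += 1
                for j, xj in enumerate(r):
                    if j != i:
                        self.f_pair[(y, i, xi, j, xj)] += 1
        return self

    def predict_proba(self, row):
        scores = {}
        for y in self.classes:
            score = 0.0
            for i, xi in enumerate(row):
                if self.f_value[(i, xi)] < self.m:
                    continue  # parent value seen fewer than m times
                # Laplace-smoothed estimate of P(y, x_i) (an assumption).
                term = (self.f_parent[(y, i, xi)] + 1) / (
                    self.total + len(self.classes) * len(self.values[i]))
                for j, xj in enumerate(row):
                    if j != i:
                        # Laplace-smoothed estimate of P(x_j | y, x_i).
                        term *= (self.f_pair[(y, i, xi, j, xj)] + 1) / (
                            self.f_parent[(y, i, xi)] + len(self.values[j]))
                score += term
            scores[y] = score
        norm = sum(scores.values())
        if norm == 0:
            return {y: 1 / len(self.classes) for y in self.classes}
        return {y: s / norm for y, s in scores.items()}


# Toy data: (opinion level, topic activity) -> next-day price direction.
rows = [("pos", "high"), ("pos", "low"), ("pos", "high"),
        ("neg", "low"), ("neg", "high"), ("neg", "low")]
labels = ["up", "up", "up", "down", "down", "down"]
model = AODE(m=1).fit(rows, labels)
print(model.predict_proba(("pos", "high")))
```

In the paper's setting the features would be discretized z-scores of the selected opinion series, and the class would be the binary rise or fall of the price or transaction count.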

Experimental Results

Using our model, we made predictions regarding three cryptocurrencies (Bitcoin, Ethereum, and Ripple). In consonance with the days for which data were collected from these communities, each cryptocurrency’s daily price and number of transactions were crawled. Information concerning the price and number of transactions of Bitcoin was crawled via Coindesk [19], whereas price information for Ethereum was crawled via CoinMarketCap [22] and its transaction information was crawled via Etherscan [48]. Information regarding price for Ripple was crawled via rippleCharts [49], whereas its transaction information was not crawled. All data collected were in the public domain and excluded personal information. Table 3 outlines the arrangement of the market data that were gathered.
Table 3

Summary of crawled market data.

Target Cryptocurrencies | Prices: Crawling Source | Prices: Crawling Boundary | Prices: Data Volume (days) | Transactions: Crawling Source | Transactions: Crawling Boundary | Transactions: Data Volume (days)
Bitcoin | CoinDesk | Dec. 01, 2013 ~ Feb. 01, 2016 | 793 | CoinDesk | Dec. 01, 2013 ~ Feb. 01, 2016 | 793
Ethereum | CoinMarketCap | Aug. 07, 2015 ~ Feb. 08, 2016 | 187 | Etherscan | Aug. 07, 2015 ~ Feb. 08, 2016 | 187
Ripple | rippleCharts | Sept. 07, 2015 ~ Jan. 21, 2016 | 137 | - | - | -
The elements that exhibited significant associations in modeling for predictions were used for learning (Tables 4–8). The tables show p-values only for elements that reached a p-value of 0.05 or less.
Table 4

Statistical significance (p-values) of bivariate Granger causality correlation for Bitcoin price and community opinion.

Bitcoin Price
Time Lag | Very Positive | Positive | Neutral | Very Positive Reply | Positive Reply | Neutral Reply | Negative Reply | Very Negative Reply | Topic | Views | Reply
1 day | 0.2318 | 0.0007 | 0.0753 | 0.2555 | 0.0221 | 0.3269 | 0.1237 | 0.126 | 0.0406 | 0.0086 | 0.1107
2 days | 0.712 | 0.0099 | 0.0934 | 0.6289 | 0.0354 | 0.3436 | 0.312 | 0.1212 | 0.2158 | 0.4940 | 0.1709
3 days | 0.7725 | 0.0224 | 0.0173 | 0.6916 | 0.075 | 0.5384 | 0.0811 | 0.1995 | 0.0456 | 0.0652 | 0.2459
4 days | 0.0044 | 0.0009 | 0.0158 | 0.0051 | 0.00004 | 0.0325 | 0.0021 | 0.0122 | 0.0009 | 0.0077 | 0.0006
5 days | 0.00001 | 0.0021 | 0.0131 | 0.0034 | 0.00002 | 0.0081 | 0.0051 | 0.0196 | 0.00007 | 0.0029 | 0.0005
6 days | 0.0004 | 0.0080 | 0.0517 | 0.0385 | 0.0008 | 0.0771 | 0.0431 | 0.0884 | 0.0046 | 0.0169 | 0.0118
7 days | 0.0002 | 0.0117 | 0.0252 | 0.0605 | 0.0017 | 0.1235 | 0.0352 | 0.0286 | 0.0012 | 0.0229 | 0.0130
8 days | 0.0005 | 0.0255 | 0.0306 | 0.0901 | 0.0039 | 0.2943 | 0.0671 | 0.0508 | 0.0050 | 0.0292 | 0.0247
9 days | 0.0011 | 0.0065 | 0.0512 | 0.0885 | 0.0072 | 0.1983 | 0.0678 | 0.0695 | 0.0070 | 0.0328 | 0.0289
10 days | 0.0026 | 0.0200 | 0.0866 | 0.0793 | 0.0051 | 0.1862 | 0.0331 | 0.0363 | 0.0092 | 0.0141 | 0.0189
11 days | 0.0011 | 0.0337 | 0.1265 | 0.0882 | 0.0058 | 0.1964 | 0.0628 | 0.0317 | 0.0063 | 0.0316 | 0.0024
12 days | 0.0024 | 0.0412 | 0.1382 | 0.0378 | 0.0030 | 0.117 | 0.0412 | 0.0261 | 0.0073 | 0.0774 | 0.0109
13 days | 0.0019 | 0.0615 | 0.1093 | 0.0411 | 0.0058 | 0.1783 | 0.0328 | 0.0368 | 0.0061 | 0.0893 | 0.0149
Table 8

Statistical significance (p-values) of bivariate Granger causality correlation for Ripple’s price and community opinion.

Ripple Price
Time Lag | Negative | Very Negative | Negative Reply
1 day | 0.0781 | 0.0033 | 0.3903
2 days | 0.1951 | 0.0138 | 0.2366
3 days | 0.2649 | 0.0150 | 0.2033
4 days | 0.3413 | 0.0322 | 0.0659
5 days | 0.3228 | 0.0124 | 0.0374
6 days | 0.3841 | 0.0155 | 0.0539
7 days | 0.0450 | 0.0185 | 0.0380
8 days | 0.0677 | 0.0320 | 0.0339
9 days | 0.0826 | 0.0557 | 0.0051
10 days | 0.0699 | 0.0880 | 0.0064
11 days | 0.0985 | 0.0983 | 0.0068
12 days | 0.0272 | 0.1464 | 0.0106
13 days | 0.0091 | 0.1921 | 0.0112
An example of applicable input data is shown in Table 9. The results of the predicted fluctuations in the price and number of transactions of each cryptocurrency are discussed below.
Table 9

Example of a machine learning dataset.

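The z-score ($Z_{X_t}$) of data for the previous 10 days was used as the values A~J, which indicate the value of the sum of the opinion of each community at the given date. Here, X~Z indicate the topic data values (number of topics, sum of replies, sum of views) on the given date.

Columns Very Positive Topic through Very Negative Reply are opinion data; Number of Topics, Sum of Replies, and Sum of Views are topic data.
Data Class | Date | Very Positive Topic | Positive Topic | Neutral Topic | Negative Topic | Very Negative Topic | Very Positive Reply | Positive Reply | Neutral Reply | Negative Reply | Very Negative Reply | Number of Topics | Sum of Replies | Sum of Views
Crawled Raw Data | Jan 02, 2016 | A | B | C | D | E | F | G | H | I | J | X | Y | Z
Input Learning Data | Jan 02, 2016 | $Z_{A_t}$ | $Z_{B_t}$ | $Z_{C_t}$ | $Z_{D_t}$ | $Z_{E_t}$ | $Z_{F_t}$ | $Z_{G_t}$ | $Z_{H_t}$ | $Z_{I_t}$ | $Z_{J_t}$ | $Z_{X_t}$ | $Z_{Y_t}$ | $Z_{Z_t}$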

The accuracy rate, the F-measure, and the Matthews correlation coefficient (MCC) were used to evaluate the performance of the proposed models. The computation of these evaluation measures required estimating precision and recall, which are evaluated from True Positive (TP), False Positive (FP), True Negative (TN) and False Negative (FN). These parameters are defined in Eqs 5, 6, 7 and 8. The accuracy rate, the weighted average of the F-measure (F−Measure), and the MCC are defined in Eqs 9, 10, 11, 12 and 13.
Of the Bitcoin-related data for 793 days, the first 88% (for 697 days) and the remaining 12% (for 94 days) were used for learning and verification, respectively. Fluctuations in the price of Bitcoin proved to be significantly associated with the number of topics, positive/very positive comments, and positive replies. The prediction accuracy was highest when the time lag was six days, at 79.57% (Table 10). Moreover, fluctuations in the number of transactions proved to be significantly associated with the number of daily topics, very positive comments, and very positive replies. The prediction accuracy for fluctuations in the number of transactions was highest when the time lag was three days, at 77.895% (Table 10).
Table 10

Experimental result of predicted Bitcoin fluctuation.

Time Lag | Price: Accuracy (%) | Price: F1-Score | Price: MCC | Transaction: Accuracy (%) | Transaction: F1-Score | Transaction: MCC
1 day | 51.579 | 0.521 | 0.067 | 61.053 | 0.610 | 0.212
2 days | 54.737 | 0.547 | 0.096 | 64.211 | 0.638 | 0.233
3 days | 49.474 | 0.497 | 0.010 | 77.895 | 0.774 | 0.579
4 days | 55.319 | 0.552 | 0.102 | 72.340 | 0.719 | 0.486
5 days | 65.957 | 0.656 | 0.321 | 48.936 | 0.495 | -0.048
6 days | 79.570 | 0.796 | 0.606 | 42.553 | 0.426 | -0.162
7 days | 60.638 | 0.597 | 0.216 | 52.128 | 0.514 | 0.028
8 days | 55.319 | 0.552 | 0.105 | 63.830 | 0.634 | 0.283
9 days | 67.021 | 0.668 | 0.320 | 59.574 | 0.595 | 0.192
10 days | 51.064 | 0.512 | 0.024 | 56.383 | 0.565 | 0.121
11 days | 57.447 | 0.574 | 0.154 | 50.000 | 0.506 | -0.021
12 days | 49.462 | 0.495 | -0.011 | 45.161 | 0.449 | -0.121
13 days | 50.538 | 0.506 | 0.012 | 48.387 | 0.489 | -0.040
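The three measures reported in the result tables (accuracy, weighted average F-measure, and MCC) can be computed from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; the function names and the example counts are hypothetical, and weighting each class's F-measure by its number of true instances is an assumption about how the weighted average was formed.

```python
import math


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(tp, fp, tn, fn):
    """Return (accuracy, support-weighted F-measure, MCC)."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    f_pos = f_measure(tp, fp, fn)  # "rise" treated as the positive class
    f_neg = f_measure(tn, fn, fp)  # roles swapped for the "fall" class
    weighted_f = ((tp + fn) * f_pos + (tn + fp) * f_neg) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, weighted_f, mcc


# Hypothetical counts for a 94-day verification period.
print(evaluate(tp=40, fp=10, tn=35, fn=9))
```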
A 10-fold cross-validation was performed on Ethereum for the entire days (for 187 days). Unlike Bitcoin, Ethereum showed a significant association in the Granger causality test with the section where a number of negative/very negative comments were found. A significant association with a number of positive user replies was also found. The predicted result proved to be highest when the time lag was six days with an accuracy of 71.823% (Table 11). The fluctuation in the number of transactions showed insignificant associations with most sections, but was significantly associated with very negative replies when the time lag was 11~13 days. The predicted fluctuation in the number of transactions when the time lag was one day yielded an accuracy of 66.129% (Table 11).
Table 11

Experimental result of predicted Ethereum fluctuation.

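Time Lag | Price: Accuracy (%) | Price: F1-Score | Price: MCC | Transaction: Accuracy (%) | Transaction: F1-Score | Transaction: MCC
1 day | 53.763 | 0.533 | 0.058 | 66.129 | 0.661 | 0.315
2 days | 52.432 | 0.524 | 0.042 | - | - | -
3 days | 45.652 | 0.456 | -0.095 | - | - | -
4 days | 54.645 | 0.546 | 0.086 | - | - | -
5 days | 51.381 | 0.514 | 0.021 | - | - | -
6 days | 71.823 | 0.717 | 0.430 | - | - | -
7 days | 63.333 | 0.633 | 0.259 | - | - | -
8 days | 67.039 | 0.669 | 0.331 | - | - | -
9 days | 49.438 | 0.490 | -0.030 | - | - | -
10 days | 49.718 | 0.496 | -0.016 | - | - | -
11 days | 55.682 | 0.555 | 0.103 | 64.205 | 0.641 | 0.276
12 days | 50.286 | 0.501 | -0.006 | 54.286 | 0.543 | 0.079
13 days | 49.425 | 0.495 | -0.013 | 51.149 | 0.512 | 0.020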
Finally, Ripple underwent 10-fold cross-validation for the entire days (for 137 days). The predicted fluctuation in the price of Ripple proved to be highest when the time lag was seven days with an accuracy of 71.756% (Table 12).
Table 12

Experimental result of predicted Ripple price fluctuation.

Time Lag | Price: Accuracy (%) | Price: F1-Score | Price: MCC
1 day | 61.314 | 0.613 | 0.206
2 days | 50.735 | 0.510 | 0.013
3 days | 51.852 | 0.517 | 0.011
4 days | 52.593 | 0.528 | 0.055
5 days | 62.406 | 0.624 | 0.236
6 days | 42.424 | 0.426 | -0.153
7 days | 71.756 | 0.704 | 0.431
8 days | 53.077 | 0.530 | 0.049
9 days | 50.388 | 0.496 | -0.025
10 days | 60.938 | 0.610 | 0.210
11 days | 63.780 | 0.638 | 0.268
12 days | 53.157 | 0.527 | 0.040
13 days | 63.200 | 0.628 | 0.243
Like Ethereum, Ripple proved to be significantly associated with very negative comments, and with negative replies when the time lag was seven days and longer. The prediction of fluctuation in the number of transactions of Ripple could not be performed due to difficulties in acquiring relevant data. To determine the effectiveness of the proposed prediction model, we performed a simulated investment in Bitcoin, using the simulated investment technique generally used in past studies on stock price prediction [50]. We invested in Bitcoin when the model predicted the price would rise the following day, and did not invest when the price was expected to drop the following day according to the model. The simulated investment was based on the rule whereby we would gain or lose from the investment (m) by r, which indicates the increment or decrement in the Bitcoin price (m = m + m × r or m = m−m × r, respectively). The six-day time lag, which corresponded to the best result in this study, was used in the prediction model. The prediction model was created based on data for the period from December 1, 2013 to November 10, 2015. The 84-day or 12-week data for the period from November 11, 2015 to February 2, 2016 were used in the experiment. Fig 3 shows the results of the simulated investment program based on the above conditions. The random investment average refers to the mean of 10 simulated investments based on the random Bitcoin price prediction. Over 12 weeks, the Bitcoin price increased by 19.29% while the amount of investment grew by 35.09%. In random investment, the amount of investment increased by approximately 10.72%, which was lower than the increment in Bitcoin price.
Fig 3

Increment/decrement in the amount of simulated investment in Bitcoin.
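The simulated investment rule can be sketched in a few lines: the balance m is exposed to the daily price change r only on days for which a rise was predicted, and is left untouched otherwise. The function name and the sample numbers are hypothetical, and r is taken as a signed fraction so that the single update m + m × r covers both the gain and the loss case.

```python
def simulate_investment(initial, predictions, daily_changes):
    """Apply the invest-only-on-predicted-rise rule and return the balance.

    predictions:   True where the model predicts a rise for that day.
    daily_changes: signed fractional price change r realized on that day.
    """
    balance = initial
    for predicted_rise, r in zip(predictions, daily_changes):
        if predicted_rise:
            balance += balance * r  # gain if r > 0, loss if r < 0
    return balance


# Hypothetical three-day example: invest, stay out, invest.
final = simulate_investment(100.0, [True, False, True], [0.10, -0.20, -0.05])
print(round(final, 2))
```

A random baseline like the one in the text could be obtained by feeding randomly drawn predictions into the same function and averaging over several runs.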

Discussion and Conclusion

This paper analyzed user comments in online communities to predict the price and the number of transactions of cryptocurrencies. The proposed method predicted fluctuations in the price of cryptocurrencies at low cost. In terms of the prediction rates for Bitcoin and other cryptocurrencies based on the limited resources in online communities, the proposed method paralleled previous studies designed for similar purposes [15, 51]. Moreover, user comments and replies in online communities proved to affect the number of transactions among users. The proposed method proved applicable to buying and selling cryptocurrencies, and shed light on aspects influencing user opinions. Furthermore, the simulated investment demonstrated that the proposed method is applicable to cryptocurrency trading. Based on the learning data at the time of higher prediction rates, the types of comments that most significantly influenced fluctuations in the price and the number of transactions of each cryptocurrency were identified. Opinions affecting price fluctuations varied across cryptocurrencies. Positive user comments significantly affected price fluctuations of Bitcoin, whereas those of the other two currencies were significantly influenced by negative user comments and replies. Moreover, the association with the number of topics posted daily indicated that the variation in community activities could influence fluctuations in price. Further, unlike the price of cryptocurrencies, the number of transactions proved to be significantly associated with user replies rather than comments posted. Based on the prediction results, user opinions proved useful to predict the fluctuations in 6~7 days (Table 10). The predicted fluctuations in the price of each cryptocurrency showed approximately 8% accuracy gaps. 
The predicted result was most precise in Bitcoin, which seems attributable to the amount of accumulated data and animated community activities (16.91 comments, 473.81 user replies, and 27443.18 views on average daily), which exerted a direct effect on fluctuations in the price of the cryptocurrency. The predicted result was least precise in Ripple, which had the smallest community regardless of its market size (3.41 comments, 29.14 user replies, and 1661.99 views on average daily). Ripple’s online community started in September, 2015, with little data accumulated and few user activities. These findings suggest that the difference in community sizes may have direct effects on fluctuations in the price of cryptocurrencies. Improving the precision of prediction requires a few improvements. Despite the association analysis used to filter user comments and replies, more qualitative selection criteria are needed to build a prediction model. This paper focused on online communities to determine associations and predict fluctuations. Yet, as with past studies, using data on the Web [52, 53], analyzing social network data [46], and referring to search volumes on Google [10, 12] are conducive to more precise results. Moreover, partly adopting the stock market prediction technique used in previous studies [54] might help increase precision rate. In this paper, we acquired information from users in online communities as a viable source for research on cryptocurrencies. In the same vein, the sentiments expressed by user comments and replies in online communities seem applicable to further analysis and understanding of cryptocurrencies. Moreover, the propensities of online community users may help understand the attributes of the relevant cryptocurrency. In addition, the rich information in online communities can contribute to understanding cryptocurrencies from different perspectives. 
Cryptocurrencies are increasingly being used, and their usability has drawn attention from different perspectives [2-5]. Research on cryptocurrencies is insufficient, in that hardly any currency other than Bitcoin has been investigated. The proposed method of predicting fluctuations in the price and trading volume of cryptocurrencies based on user comments and replies in online communities is likely to increase the understanding and availability of cryptocurrencies if a range of improvements and applications are implemented. Furthermore, different approaches to user comments and replies in online communities are expected to bring more significant results in diverse fields.

Results of crawling Bitcoin forum, Ethereum forum, and Ripple forum (in .json format). (ZIP)

Python-based crawler source code for community data collection. (ZIP)

The result of implementing opinion analysis from user opinion data (topic) on the Bitcoin forum (https://bitcointalk.org). (CSV)

The result of implementing opinion analysis from user opinion data (topic) on the Ethereum forum (https://forum.ethereum.org/). (CSV)

The result of implementing opinion analysis from user opinion data (topic) on the Ripple forum (http://www.xrpchat.com/). (CSV)

The result of implementing opinion analysis from user opinion data (reply) on the Bitcoin forum (https://bitcointalk.org). (ZIP)

The result of implementing opinion analysis from user opinion data (reply) on the Ethereum forum (https://forum.ethereum.org/). (CSV)

The result of implementing opinion analysis from user opinion data (reply) on the Ripple forum (http://www.xrpchat.com/). (CSV)
Table 5

Statistical significance (p-values) of bivariate Granger causality correlation for the number of transactions and community opinion for Bitcoin.

Column group: Bitcoin Transaction

| Time Lag | Very Positive | Positive | Neutral | Negative | Very Negative | Very Positive Reply | Positive Reply | Neutral Reply | Topic | Views | Reply |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 day | 0.0003 | 0.0290 | 0.0025 | 0.1524 | 0.0177 | 0.198 | 0.6988 | 0.0775 | 0.0002 | 0.0036 | 0.647 |
| 2 days | 0.000003 | 0.0374 | 0.0001 | 0.0146 | 0.0177 | 0.2801 | 0.0494 | 0.0124 | 0.000009 | 0.0011 | 0.1362 |
| 3 days | 0.00007 | 0.0022 | 0.0008 | 0.0402 | 0.0641 | 0.3693 | 0.1508 | 0.0558 | 0.0001 | 0.0025 | 0.2696 |
| 4 days | 0.0015 | 0.0099 | 0.0067 | 0.0247 | 0.1808 | 0.6088 | 0.3392 | 0.217 | 0.0017 | 0.0153 | 0.5221 |
| 5 days | 0.0086 | 0.0363 | 0.0434 | 0.0815 | 0.4 | 0.3906 | 0.2921 | 0.7686 | 0.0048 | 0.0869 | 0.4328 |
| 6 days | 0.0072 | 0.1135 | 0.1654 | 0.0364 | 0.5244 | 0.0050 | 0.0145 | 0.0969 | 0.0023 | 0.3711 | 0.0398 |
| 7 days | 0.0073 | 0.0733 | 0.3251 | 0.071 | 0.524 | 0.0021 | 0.0283 | 0.1575 | 0.0072 | 0.6176 | 0.0711 |
| 8 days | 0.0161 | 0.2287 | 0.3298 | 0.0284 | 0.1864 | 0.0099 | 0.0613 | 0.3123 | 0.0014 | 0.4865 | 0.0965 |
| 9 days | 0.0245 | 0.1897 | 0.0971 | 0.0451 | 0.2364 | 0.0045 | 0.0412 | 0.2797 | 0.0019 | 0.4004 | 0.0848 |
| 10 days | 0.0209 | 0.1997 | 0.0882 | 0.0253 | 0.3111 | 0.0053 | 0.061 | 0.3635 | 0.0020 | 0.5301 | 0.111 |
| 11 days | 0.0288 | 0.0764 | 0.1129 | 0.0345 | 0.393 | 0.0043 | 0.0602 | 0.3847 | 0.0016 | 0.6303 | 0.0883 |
| 12 days | 0.0457 | 0.1615 | 0.1176 | 0.0531 | 0.4839 | 0.0107 | 0.0743 | 0.4382 | 0.004 | 0.735 | 0.1136 |
| 13 days | 0.0763 | 0.224 | 0.1533 | 0.0694 | 0.5463 | 0.0225 | 0.0984 | 0.405 | 0.0082 | 0.82 | 0.1376 |
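
The p-values in Table 5 come from bivariate Granger causality tests run at lags of 1 to 13 days. As a minimal sketch of the statistic behind such a test, the following computes the textbook F statistic that compares an autoregression of the target series with and without lagged values of a candidate series. It is a generic illustration under stated assumptions, not the authors' implementation: the function names `ols_rss` and `granger_f`, the synthetic series, and the absence of any preprocessing (such as differencing) are all assumptions made here. A p-value would follow from the F distribution with (lag, dof) degrees of freedom.

```python
import random


def ols_rss(X, y):
    """Residual sum of squares of the least-squares fit y ~ X (rows = observations)."""
    k = len(X[0])
    # Augmented normal equations: [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(X, y))


def granger_f(y, x, lag):
    """F statistic for H0: lags 1..lag of x add nothing to an AR(lag) model of y."""
    rows = range(lag, len(y))
    target = [y[t] for t in rows]
    # Restricted model: intercept plus own lags of y
    restricted = [[1.0] + [y[t - i] for i in range(1, lag + 1)] for t in rows]
    # Full model: restricted regressors plus lags of x
    full = [r + [x[t - i] for i in range(1, lag + 1)] for r, t in zip(restricted, rows)]
    rss_r = ols_rss(restricted, target)
    rss_f = ols_rss(full, target)
    dof = len(target) - (2 * lag + 1)
    return ((rss_r - rss_f) / lag) / (rss_f / dof)


# Synthetic data (hypothetical): y follows x with a one-day delay.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.1) for t in range(1, 200)]

f_xy = granger_f(y, x, lag=1)  # does x help predict y?
f_yx = granger_f(x, y, lag=1)  # does y help predict x?
```

On this synthetic series the statistic for "x helps predict y" is far larger than the one for the reverse direction, which is the asymmetry the test is designed to detect. Repeating the call for each lag gives one value per row of a table like the one above.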
Table 6

Statistical significance (p-values) of bivariate Granger causality correlation for Ethereum’s price and community opinion.

Column group: Ethereum Price

| Time Lag | Very Positive | Positive | Neutral | Negative | Very Negative | Very Positive Reply | Positive Reply | Neutral Reply | Negative Reply | Very Negative Reply | Topic | Views | Reply |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 day | 0.0106 | 0.5194 | 0.8892 | 0.0009 | 0.9790 | 0.0011 | 0.0232 | 0.0103 | 0.2911 | 0.0840 | 0.0974 | 0.0003 | 0.0085 |
| 2 days | 0.0799 | 0.9954 | 0.2773 | 0.0325 | 0.0558 | 0.1806 | 0.1727 | 0.2943 | 0.2195 | 0.2452 | 0.0769 | 0.6574 | 0.1837 |
| 3 days | 0.2131 | 0.7819 | 0.1604 | 0.1658 | 0.1154 | 0.4765 | 0.0620 | 0.3496 | 0.0179 | 0.3592 | 0.0619 | 0.6498 | 0.0578 |
| 4 days | 0.2928 | 0.5582 | 0.2006 | 0.0837 | 0.0210 | 0.3964 | 0.0010 | 0.3584 | 0.0014 | 0.4483 | 0.0934 | 0.3554 | 0.0139 |
| 5 days | 0.3940 | 0.4873 | 0.2616 | 0.0402 | 0.0012 | 0.1372 | 0.0002 | 0.1994 | 0.0031 | 0.4136 | 0.2316 | 0.2981 | 0.0051 |
| 6 days | 0.3688 | 0.3359 | 0.2039 | 0.0691 | 0.0016 | 0.0973 | 0.0004 | 0.2107 | 0.0064 | 0.1984 | 0.0809 | 0.1497 | 0.0086 |
| 7 days | 0.3222 | 0.0006 | 0.0931 | 0.0019 | 0.0011 | 0.0270 | 0.0002 | 0.0885 | 0.0002 | 0.0367 | 0.0640 | 0.0680 | 0.0043 |
| 8 days | 0.0169 | 0.0054 | 0.0079 | 0.0011 | 0.0026 | 0.0343 | 0.0060 | 0.0808 | 0.0052 | 0.0115 | 0.0272 | 0.0935 | 0.0144 |
| 9 days | 0.2228 | 0.0132 | 0.0653 | 0.0207 | 0.0166 | 0.1008 | 0.0041 | 0.4138 | 0.0332 | 0.0496 | 0.1582 | 0.4450 | 0.1692 |
| 10 days | 0.3766 | 0.0620 | 0.2518 | 0.0148 | 0.0121 | 0.1903 | 0.0058 | 0.2417 | 0.0692 | 0.2001 | 0.4131 | 0.7560 | 0.2621 |
| 11 days | 0.5807 | 0.1346 | 0.3290 | 0.0352 | 0.0177 | 0.2414 | 0.0118 | 0.3994 | 0.1257 | 0.3621 | 0.5574 | 0.8875 | 0.3475 |
| 12 days | 0.6158 | 0.1178 | 0.2648 | 0.0458 | 0.0190 | 0.1347 | 0.0120 | 0.3421 | 0.1488 | 0.2285 | 0.3906 | 0.4962 | 0.3025 |
| 13 days | 0.2783 | 0.1923 | 0.2048 | 0.0410 | 0.0157 | 0.2731 | 0.0070 | 0.3773 | 0.0585 | 0.0778 | 0.6500 | 0.4462 | 0.3243 |
Table 7

Statistical significance (p-values) of bivariate Granger causality correlation for the number of transactions and community opinion for Ethereum.

Column group: Ethereum Transaction

| Time Lag | Positive | Negative | Very Negative Reply |
|---|---|---|---|
| 1 day | 0.0460 | 0.0248 | 0.0567 |
| 11 days | 0.6142 | 0.9875 | 0.0179 |
| 12 days | 0.6358 | 0.9942 | 0.0292 |
| 13 days | 0.6814 | 0.9959 | 0.0385 |
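
One way to read a table of this kind is to list, for each opinion category, the lags at which the p-value falls below a chosen significance level. The sketch below does this for the Table 7 values. The 0.05 threshold and the names `table7`, `ALPHA`, and `significant_lags` are assumptions introduced for illustration; this section does not state which threshold the authors applied.

```python
# p-values transcribed from Table 7 (Ethereum transactions); keys are lags in days.
table7 = {
    "Positive": {1: 0.0460, 11: 0.6142, 12: 0.6358, 13: 0.6814},
    "Negative": {1: 0.0248, 11: 0.9875, 12: 0.9942, 13: 0.9959},
    "Very Negative Reply": {1: 0.0567, 11: 0.0179, 12: 0.0292, 13: 0.0385},
}

ALPHA = 0.05  # assumed significance threshold, not taken from the paper


def significant_lags(p_by_lag, alpha=ALPHA):
    """Return the lags (in days) whose p-value is below alpha, in ascending order."""
    return sorted(lag for lag, p in p_by_lag.items() if p < alpha)


summary = {name: significant_lags(p) for name, p in table7.items()}
for name, lags in summary.items():
    print(f"{name}: significant at lags {lags}")
```

Under this assumed threshold, positive and negative topics are significant only at a 1-day lag, while very negative replies are significant at lags of 11 to 13 days.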
