Recently I tried using the Google News Python package to gather news about US tech companies, and then used machine-learning text classification to predict whether a stock's price would rise or drop the next day. Here is the approach.
I focus on five big US tech companies: AAPL, FB, GOOG, AMZN, NFLX. I search the news with the keyword “ticker-code” + “price”. I gathered around 300 news titles, and then gathered the corresponding Open and Close prices of each stock for the next day.
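As a rough illustration of this collection step, here is a minimal sketch that builds one search string per ticker and finds the day whose prices a news date would be paired with. The names `build_query` and `next_weekday` are hypothetical helpers, not part of any package. Joining the two keyword parts with a space and treating "next day" as the next Monday-to-Friday date (market holidays ignored) are both assumptions. The actual news search call is left out.

```python
from datetime import date, timedelta

TICKERS = ["AAPL", "FB", "GOOG", "AMZN", "NFLX"]


def build_query(ticker: str) -> str:
    """Return the search string: ticker code plus the word 'price'.

    Joining with a single space is an assumption.
    """
    return f"{ticker} price"


def next_weekday(day: date) -> date:
    """Return the first Monday-to-Friday date after `day`.

    Assumption: market holidays are ignored in this sketch.
    """
    candidate = day + timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate


# One query per ticker, e.g. "AAPL price"
queries = {ticker: build_query(ticker) for ticker in TICKERS}
```

News published on a Friday would then be paired with the following Monday's Open and Close prices.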
The trading strategy is to buy the stock at the market open and sell it at the market close on the same day. If Close-price ≥ Open-price, it is a gain, so the label is 1. Otherwise it is a loss, so the label is 0.
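The labeling rule can be written as a one-line function. This is a minimal sketch; the name `label_day` is hypothetical, and the prices in the usage note are made up.

```python
def label_day(open_price: float, close_price: float) -> int:
    """Return 1 for a gain (Close >= Open), 0 for a loss (Close < Open)."""
    return 1 if close_price >= open_price else 0
```

For example, `label_day(100.0, 101.5)` gives 1 and `label_day(100.0, 99.0)` gives 0. A flat day, where Close equals Open, counts as a gain under this rule.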
I concatenate the news titles that belong to the same date and feed them into the text classification ML library Turi Create, which uses text sentiment analysis to classify them against the label.
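The concatenation step might look like the sketch below, which turns the titles for each date into one document. The function name `concat_titles_by_date`, the `(date, title)` row layout, and the sample titles are placeholders for illustration, not real data. Joining the titles with a space is an assumption.

```python
from collections import defaultdict


def concat_titles_by_date(rows):
    """Group (date_string, title) pairs by date and join each group's titles.

    Returns a dict mapping date_string -> one concatenated text document.
    """
    grouped = defaultdict(list)
    for day, title in rows:
        grouped[day].append(title.strip())
    # Joining with a space is an assumption of this sketch.
    return {day: " ".join(titles) for day, titles in grouped.items()}


# Placeholder rows, made up purely for illustration
sample_rows = [
    ("2020-01-02", "Placeholder title one"),
    ("2020-01-02", "Placeholder title two"),
    ("2020-01-03", "Placeholder title three"),
]
documents = concat_titles_by_date(sample_rows)
```

Each value in `documents` is then one training text, to be paired with that date's 0/1 label before it goes to the classifier.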
Evaluating on 20% of the data held out as unseen test data, the accuracy is about 71%, which is higher than that of time series prediction using historical prices.
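The evaluation setup can be sketched in plain Python: shuffle the examples, hold out 20%, and compute accuracy as the share of correct predictions. The helpers `holdout_split` and `accuracy` are hypothetical names, and the fixed seed is an assumption added for reproducibility. The classifier itself is not shown here.

```python
import random


def holdout_split(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and return (train, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

With a 20% hold-out the test set is small, so the measured accuracy can shift noticeably from one random split to another.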