Without much ado, let’s get started with the code. The complete project can be found on GitHub.
I started with loading all the libraries and dependencies required.
import json
import requests
from keras.models import Sequential
from keras.layers import Activation, Dense, Dropout, LSTM
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.metrics import mean_absolute_error
%matplotlib inline
I used the Canadian exchange rate (BTC/CAD) and stored the real-time data in a pandas DataFrame. I used the pd.to_datetime method to convert the string timestamps into Python datetime objects. This is necessary because datetimes in the file are read as strings, and performing operations like computing a time difference is much easier on a datetime object than on a string.
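A quick illustration of why the conversion matters, using toy dates rather than the API data:

```python
import pandas as pd

# Two timestamps read from a file arrive as plain strings
a, b = "2019-08-01", "2019-08-15"

# pd.to_datetime gives Timestamp objects that support date arithmetic,
# which plain strings do not
ts_a, ts_b = pd.to_datetime(a), pd.to_datetime(b)
delta = ts_b - ts_a  # a Timedelta
print(delta.days)  # 14
```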
endpoint = 'https://min-api.cryptocompare.com/data/histoday'
res = requests.get(endpoint + '?fsym=BTC&tsym=CAD&limit=500')
hist = pd.DataFrame(json.loads(res.content)['Data'])
hist = hist.set_index('time')
hist.index = pd.to_datetime(hist.index, unit='s')
target_col = 'close'
Let’s see what the dataset looks like, with all the trading features: price, volume, open, high, low.
Next, I split the data into two sets — training set and test set with 80% and 20% data respectively. The decision made here is just for the purpose of this tutorial. In real projects, you should always split your data into training, validation, testing (like 60%, 20%, 20%).
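A three-way chronological split of the kind described above could be sketched like this (`train_val_test_split` is a hypothetical helper for illustration; it is not used in the rest of this tutorial):

```python
import pandas as pd

def train_val_test_split(df, val_size=0.2, test_size=0.2):
    # Hypothetical 60/20/20 chronological split: the oldest rows become
    # the training set, the newest rows the test set
    n = len(df)
    test_row = n - int(test_size * n)
    val_row = test_row - int(val_size * n)
    return df.iloc[:val_row], df.iloc[val_row:test_row], df.iloc[test_row:]

df = pd.DataFrame({'close': range(100)})
train, val, test = train_val_test_split(df)
print(len(train), len(val), len(test))  # 60 20 20
```

Keeping the split chronological (rather than shuffling) matters for time series: the model should never be validated or tested on data that precedes its training window.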
def train_test_split(df, test_size=0.2):
    split_row = len(df) - int(test_size * len(df))
    train_data = df.iloc[:split_row]
    test_data = df.iloc[split_row:]
    return train_data, test_data

train, test = train_test_split(hist, test_size=0.2)
Now let’s plot the cryptocurrency prices in Canadian dollars as a function of time using the below code:
def line_plot(line1, line2, label1=None, label2=None, title='', lw=2):
    fig, ax = plt.subplots(1, figsize=(13, 7))
    ax.plot(line1, label=label1, linewidth=lw)
    ax.plot(line2, label=label2, linewidth=lw)
    ax.set_ylabel('price [CAD]', fontsize=14)
    ax.set_title(title, fontsize=16)
    ax.legend(loc='best', fontsize=16)

line_plot(train[target_col], test[target_col], 'training', 'test', title='')
We can observe a clear dip in prices between December 2018 and April 2019. Prices keep increasing from April 2019 to August 2019, with fluctuations in July and August. From September 2019 onward, prices decrease steadily. An interesting thing to note in these fluctuations is that prices are low in the winter and rise in the summer. This can’t be generalized, though, since the dataset under consideration is just a small sample spanning a single year; with cryptocurrency, it’s hard to generalize anything.
Next, I made a couple of functions to normalize the values. Normalization is a technique often applied as part of data preparation for machine learning. The goal of normalization is to change the values of numeric columns in the dataset to a common scale, without distorting differences in the ranges of values.
def normalise_zero_base(df):
    return df / df.iloc[0] - 1

def normalise_min_max(df):
    return (df - df.min()) / (df.max() - df.min())
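To see what zero-base normalisation does, here is a toy run (the function is restated so the snippet runs on its own):

```python
import pandas as pd

def normalise_zero_base(df):
    # Express every value relative to the first value of the window
    return df / df.iloc[0] - 1

window = pd.Series([200.0, 210.0, 190.0])
print(normalise_zero_base(window).tolist())  # approximately [0.0, 0.05, -0.05]
```

Every window starts at 0, and the remaining values become fractional changes relative to the window's first price, which puts windows from very different price regimes on a common scale.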
Next, I made a function to extract windows of data, each of size 5, as shown in the code below:
def extract_window_data(df, window_len=5, zero_base=True):
    window_data = []
    for idx in range(len(df) - window_len):
        tmp = df[idx: (idx + window_len)].copy()
        if zero_base:
            tmp = normalise_zero_base(tmp)
        window_data.append(tmp.values)
    return np.array(window_data)
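On a toy series, this kind of sliding-window extraction produces overlapping blocks like the following (a standalone restatement with the normalisation step omitted for clarity):

```python
import numpy as np
import pandas as pd

def extract_windows(df, window_len=3):
    # Simplified sliding-window extraction, without zero-base normalisation
    windows = [df[i:i + window_len].values for i in range(len(df) - window_len)]
    return np.array(windows)

s = pd.DataFrame({'close': [1, 2, 3, 4, 5]})
w = extract_windows(s, window_len=3)
print(w.shape)  # (2, 3, 1): 2 windows, 3 timesteps each, 1 feature
```

The resulting 3-D array (samples, timesteps, features) is exactly the input shape a Keras LSTM layer expects.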
I continued by making a function to prepare the data in a format that can later be fed into the neural network. It uses the same 80%/20% training/test split as before, as shown in the code below:
def prepare_data(df, target_col, window_len=10, zero_base=True, test_size=0.2):
    train_data, test_data = train_test_split(df, test_size=test_size)
    X_train = extract_window_data(train_data, window_len, zero_base)
    X_test = extract_window_data(test_data, window_len, zero_base)
    y_train = train_data[target_col][window_len:].values
    y_test = test_data[target_col][window_len:].values
    y_train = y_train / train_data[target_col][:-window_len].values - 1
    y_test = y_test / test_data[target_col][:-window_len].values - 1
    return train_data, test_data, X_train, X_test, y_train, y_test
It works by using special gates that allow each LSTM cell to combine information from the previous timestep with the current input. The data flows through multiple gates (the forget gate, input gate, and output gate) and various activation functions (such as tanh and sigmoid) as it passes through the LSTM cells. The main advantage of this is that it allows each LSTM cell to remember patterns for a certain amount of time: an LSTM can retain important information and, at the same time, forget irrelevant information. The LSTM architecture is shown below:
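The gate mechanics described above can be sketched in plain NumPy. This is a minimal single-timestep LSTM cell with random, made-up weights, meant only to show the gate arithmetic; it is not Keras's implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # One LSTM timestep. W, U, b stack the parameters of all four gates.
    z = W @ x + U @ h_prev + b
    n = len(c_prev)
    f = sigmoid(z[:n])          # forget gate: what to drop from the cell state
    i = sigmoid(z[n:2*n])       # input gate: what new information to store
    o = sigmoid(z[2*n:3*n])     # output gate: what to expose as the hidden state
    g = np.tanh(z[3*n:])        # candidate values for the cell state
    c = f * c_prev + i * g      # updated cell state (the long-term memory)
    h = o * np.tanh(c)          # updated hidden state (the short-term output)
    return h, c

rng = np.random.default_rng(0)
n, d = 4, 3  # hidden size and input size, chosen arbitrarily
W = rng.normal(size=(4 * n, d))
U = rng.normal(size=(4 * n, n))
b = np.zeros(4 * n)
h, c = lstm_step(rng.normal(size=d), np.zeros(n), np.zeros(n), W, U, b)
print(h.shape, c.shape)  # (4,) (4,)
```

The `f * c_prev` term is what lets the cell carry (or discard) information across many timesteps, which is the property the paragraph above refers to.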
Now let’s build the model. The Sequential model is used for stacking all the layers (input, hidden, and output). The neural network comprises an LSTM layer, followed by a 20% Dropout layer and a Dense layer with a linear activation function. I compiled the model using Adam as the optimizer and mean squared error as the loss function.
def build_lstm_model(input_data, output_size, neurons=100, activ_func='linear', dropout=0.2, loss='mse', optimizer='adam'):
    model = Sequential()
    model.add(LSTM(neurons, input_shape=(input_data.shape[1], input_data.shape[2])))
    model.add(Dropout(dropout))
    model.add(Dense(units=output_size, activation=activ_func))
    model.compile(loss=loss, optimizer=optimizer)
    return model
Next I set up some of the parameters to be used later. These parameters are — random number seed, length of the window, test set size, number of neurons in LSTM layer, epochs, batch size, loss, dropouts and optimizer.
window_len = 5
test_size = 0.2
zero_base = True
lstm_neurons = 100
epochs = 20
batch_size = 32
loss = 'mse'
dropout = 0.2
optimizer = 'adam'
Now let’s train the model using these inputs:
train, test, X_train, X_test, y_train, y_test = prepare_data(
    hist, target_col, window_len=window_len, zero_base=zero_base, test_size=test_size)

model = build_lstm_model(
    X_train, output_size=1, neurons=lstm_neurons, dropout=dropout, loss=loss,
    optimizer=optimizer)

history = model.fit(
    X_train, y_train, epochs=epochs, batch_size=batch_size, verbose=1, shuffle=True)
Let’s take a look at a snapshot of the model training over 20 epochs.
I used Mean Absolute Error as the evaluation metric.
It measures the average magnitude of the errors in a set of predictions, without considering their direction. It’s the average over the test sample of the absolute differences between actual and predicted observations where all individual differences have equal weight.
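On toy numbers, the definition above works out like this:

```python
import numpy as np

actual = np.array([100.0, 102.0, 98.0])
predicted = np.array([101.0, 100.0, 99.0])

# MAE: average of the absolute errors, ignoring their direction
mae = np.mean(np.abs(actual - predicted))
print(mae)  # (1 + 2 + 1) / 3 ≈ 1.333
```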
targets = test[target_col][window_len:]
preds = model.predict(X_test).squeeze()
mean_absolute_error(preds, y_test)
The MAE value obtained looks good. Finally, let’s plot the actual and predicted prices using the below code:
preds = test[target_col].values[:-window_len] * (preds + 1)
preds = pd.Series(index=targets.index, data=preds)
line_plot(targets, preds, 'actual', 'prediction', lw=3)
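The multiplication by (preds + 1) undoes the zero-base normalisation applied to the targets: since each target is y = p_t / p_(t-w) - 1, the price is recovered as p_(t-w) * (y + 1). A standalone round-trip on toy numbers:

```python
import numpy as np

window_len = 2
prices = np.array([100.0, 110.0, 121.0, 133.1])

# Targets are returns over the window, as in prepare_data
y = prices[window_len:] / prices[:-window_len] - 1

# Inverting: multiplying the base price by (return + 1) recovers the price
recovered = prices[:-window_len] * (y + 1)
print(np.allclose(recovered, prices[window_len:]))  # True
```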