To start with, I build an ANN with densely connected layers as a baseline against which to compare the other models.

    from keras.models import Sequential
    from keras import layers
    from keras.optimizers import RMSprop

    model_ann = Sequential()
    # Flatten each (lookback, n_features) window into a single vector
    model_ann.add(layers.Flatten(input_shape=(lookback, data_u.shape[-1])))
    model_ann.add(layers.Dense(32, activation='relu'))
    model_ann.add(layers.Dropout(0.3))
    # Single sigmoid unit for the binary outcome
    model_ann.add(layers.Dense(1, activation='sigmoid'))
    model_ann.summary()

Then, I compile the model and record the fitting process.

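    # Note: newer Keras releases rename `lr` to `learning_rate` and deprecate
    # `fit_generator` in favor of `fit`, which accepts generators directly.
    model_ann.compile(optimizer=RMSprop(lr=1e-2),
                      loss='binary_crossentropy',
                      metrics=['acc'])
    history = model_ann.fit_generator(train_generator,
                                      steps_per_epoch=steps_per_epc,
                                      epochs=20,
                                      validation_data=val_generator,
                                      validation_steps=val_steps)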

To check the performance on the validation dataset, I plot the loss curve.

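    import matplotlib.pyplot as plt

    # The History object returned by training stores per-epoch metrics in .history
    history_dic = history.history
    loss_ = history_dic['loss']
    val_loss_ = history_dic['val_loss']
    epochs = range(1, len(loss_) + 1)  # 20 epochs in this run

    #plt.clf()
    plt.plot(epochs, loss_, 'bo', label="training loss")
    plt.plot(epochs, val_loss_, 'r', label="validation loss")
    plt.xlabel('Epochs')
    plt.ylabel('loss')
    plt.legend()
    plt.show()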

As expected, the model starts to overfit after several epochs. To evaluate the model objectively, I apply it to the test set and obtain an accuracy of 60%.
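One rough way to pin down where overfitting sets in is to look for the epoch at which the validation loss is lowest. The helper below is only a sketch: `best_epoch` is a hypothetical name, and the loss values are made up for illustration rather than taken from the run above.

```python
def best_epoch(val_losses):
    """Return (epoch, loss) for the lowest validation loss; epochs are 1-indexed."""
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    # min() returns the first index on ties, i.e. the earliest best epoch
    best_index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_index + 1, val_losses[best_index]


# Hypothetical loss values, for illustration only (not the article's results)
example_val_loss = [0.69, 0.67, 0.66, 0.665, 0.68, 0.70]
epoch, loss = best_epoch(example_val_loss)
print(f"Validation loss bottoms out at epoch {epoch} (loss = {loss})")
```

With a real training run, the list to pass in would be `history.history['val_loss']`. Training for more epochs than the one returned here mostly fits noise in the training set.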

    # evaluate_generator returns [loss, accuracy] for the metrics compiled above
    scores = model_ann.evaluate_generator(test_generator, test_steps)
    print("Accuracy = ", scores[1], " Loss = ", scores[0])

Next, I implement an RNN by using one LSTM layer followed by two densely connected layers.

    model_rnn = Sequential()
    model_rnn.add(layers.LSTM(32,
                              dropout=0.2,
                              recurrent_dropout=0.2,
                              input_shape=(None, data_u.shape[-1])))
    model_rnn.add(layers.Dense(32, activation='relu'))
    model_rnn.add(layers.Dropout(0.3))
    model_rnn.add(layers.Dense(1, activation='sigmoid'))
    model_rnn.summary()

Training follows the same procedure as for the ANN above.

    model_rnn.compile(optimizer=RMSprop(lr=1e-2),
                      loss='binary_crossentropy',
                      metrics=['acc'])
    history = model_rnn.fit_generator(train_generator,
                                      steps_per_epoch=steps_per_epc,
                                      epochs=20,
                                      validation_data=val_generator,
                                      validation_steps=val_steps)

The training and validation set performance is as below.

The overfitting is not as severe as that of the ANN. I also evaluate the model on the test data, which yields an accuracy of 62.5%. Even though the performance on the test set is better than that of the ANN with densely connected layers, the improvement is tiny.

To gain better performance, I try to increase the complexity of the model by adding one more recurrent layer. To keep the computational cost down, however, I replace the LSTM layer with Gated Recurrent Unit (GRU) layers. The model is shown below.

    model_rnn = Sequential()
    # return_sequences=True so the second GRU receives the full sequence
    model_rnn.add(layers.GRU(32,
                             dropout=0.2,
                             recurrent_dropout=0.2,
                             return_sequences=True,
                             input_shape=(None, data_u.shape[-1])))
    model_rnn.add(layers.GRU(64,
                             activation='relu',
                             dropout=0.2,
                             recurrent_dropout=0.2))
    model_rnn.add(layers.Dense(32, activation='relu'))
    model_rnn.add(layers.Dropout(0.3))
    model_rnn.add(layers.Dense(1, activation='sigmoid'))
    model_rnn.summary()
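To make the cost argument for GRU over LSTM concrete, the sketch below counts trainable weights for a single recurrent layer. An LSTM has four gate blocks and a GRU has three, so for the same number of units the GRU is roughly a quarter smaller. The function names and `n_features = 10` are assumptions for illustration; the article's real feature count is `data_u.shape[-1]`, which is not stated.

```python
def lstm_params(units, input_dim):
    """Trainable weights in a standard LSTM layer: 4 gate blocks,
    each with an input kernel, a recurrent kernel and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def gru_params(units, input_dim, reset_after=True):
    """Trainable weights in a GRU layer: 3 gate blocks.

    reset_after=True models the variant with separate input and recurrent
    biases (two bias vectors per block); False is the classic formulation
    with a single bias vector per block.
    """
    biases = 2 * units if reset_after else units
    return 3 * (units * (input_dim + units) + biases)


n_features = 10  # hypothetical; the article uses data_u.shape[-1]
print("LSTM(32):", lstm_params(32, n_features))
print("GRU(32): ", gru_params(32, n_features))
```

Which bias variant applies depends on the Keras version and the layer's `reset_after` setting, so `model_rnn.summary()` remains the authoritative count.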

The training and validation set performance is as below.

No serious overfitting is visible in the plot. Even though the accuracy on the test data has increased to 64%, the improvement is still tiny. I begin to doubt whether an RNN can do the job.

Still, I make one last attempt by further increasing the complexity of the model. Specifically, I make the recurrent layers bidirectional.

    model_rnn = Sequential()
    model_rnn.add(layers.Bidirectional(
        layers.GRU(32,
                   dropout=0.2,
                   recurrent_dropout=0.2,
                   return_sequences=True),
        input_shape=(None, data_u.shape[-1])))
    model_rnn.add(layers.Bidirectional(
        layers.GRU(64,
                   activation='relu',
                   dropout=0.2,
                   recurrent_dropout=0.2)))
    model_rnn.add(layers.Dense(32, activation='relu'))
    model_rnn.add(layers.Dropout(0.3))
    model_rnn.add(layers.Dense(1, activation='sigmoid'))
    model_rnn.summary()
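As a toy illustration of what a bidirectional wrapper does, the sketch below runs a recurrence over a sequence in both directions and concatenates the two final states. Everything here is hypothetical: `decayed_sum` is a stand-in for a recurrent cell, not a GRU, and both directions share the same step function, whereas Keras trains a separate copy of the layer for each direction.

```python
def run_recurrent(sequence, step, initial_state=0.0):
    """Fold `step` over the sequence and return the final state."""
    state = initial_state
    for x in sequence:
        state = step(state, x)
    return state


def bidirectional(sequence, step):
    """Run the recurrence forward and backward, then concatenate both final states."""
    forward = run_recurrent(sequence, step)
    backward = run_recurrent(list(reversed(sequence)), step)
    return [forward, backward]


def decayed_sum(state, x):
    """Toy stand-in for a recurrent cell: later inputs weigh more than earlier ones."""
    return 0.5 * state + x


print(bidirectional([1.0, 2.0, 4.0], decayed_sum))
```

The output is twice as wide as that of a single direction, which mirrors the default concatenating merge mode of `layers.Bidirectional`. The forward pass emphasizes the most recent time steps and the backward pass the earliest ones.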

This time, the training and validation set performance is as below.

In fact, before the model starts overfitting, there is not much difference in validation loss between this model and the previous one. The accuracy on the test set is 64% as well.

After exploring all the models above, I suspect that an RNN may not be a good fit for the NBA game result prediction problem. There are indeed tens of hyperparameters that could still be tuned, but the gap between the ANN and the RNN variants (60% versus 64% test accuracy at best) is too small.
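One way to judge how small the gap really is would be to compare it with the sampling noise of the test set. The sketch below computes a two-proportion z-score for two accuracies. It rests on loud assumptions: the test-set size `n_test = 1000` is hypothetical (the article does not state it), and the two results are treated as independent samples even though both models are scored on the same games.

```python
import math


def accuracy_gap_z(acc_a, acc_b, n_test):
    """z-score for the difference between two accuracies, each measured on n_test games.

    Uses the pooled two-proportion standard error and assumes independent
    samples, which is a simplification for models sharing one test set.
    """
    pooled = (acc_a + acc_b) / 2
    std_err = math.sqrt(2 * pooled * (1 - pooled) / n_test)
    return (acc_b - acc_a) / std_err


n_test = 1000  # hypothetical test-set size; not given in the article
z = accuracy_gap_z(0.60, 0.64, n_test)
print(f"z = {z:.2f}")
```

With this assumed test-set size, the z-score for 60% versus 64% stays below the conventional 1.96 threshold, so the gap would not be clearly distinguishable from noise. A larger test set would change that conclusion, so the number is only as good as the assumed `n_test`.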
