AI - Building and Tuning Deep Neural Networks for Predicting Future Prices

https://i.imgur.com/tw2wJ6t.png

The following notebooks, script and modules are guides for building KerasRegressor models, which are deep neural networks (dnn), that try to predict a stock’s future closing price from a Trading History dataset. The tools use a Trading History dataset, which is created and automatically published to S3 after a trading algorithm’s backtest runs custom indicators over intraday minute-by-minute pricing data stored in redis.

If you do not have a Trading History you can create one with:

ae -t SPY -p s3://algohistory/algo_training_SPY.json

and run it distributed across the engine’s workers with -w

ae -w -t SPY -p s3://algohistory/algo_training_SPY.json

Here are examples of training a dnn using a Trading History from S3 (Minio or AWS):

AI - Building a Deep Neural Network Helper Module

This function is used as a Keras Scikit-Learn Builder Function for creating a Keras Sequential deep neural network model (dnn). This function is passed in as the build_fn argument to create a KerasRegressor (or KerasClassifier).

Build a deep neural network for regression predictions

analysis_engine.ai.build_regression_dnn.build_regression_dnn(num_features, compile_config, model_json=None, model_config=None)[source]
Parameters:
  • num_features – input_dim for the number of features in the data
  • compile_config – dictionary of compile options
  • model_json – keras model json to build the model
  • model_config – optional dictionary for model
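As a rough illustration of what a Keras builder function of this kind can look like, here is a minimal sketch. It is not the source of build_regression_dnn: the function name, the layer sizes and the assumption that compile_config holds keyword arguments for the model's compile call are all hypothetical, and model_json and model_config are left out.

```python
def build_regression_dnn_sketch(num_features, compile_config):
    """Hypothetical builder; not the library's build_regression_dnn."""
    # Imported lazily so this module still loads where TensorFlow is absent.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(num_features,)),
        keras.layers.Dense(8, activation="relu"),  # layer size is arbitrary
        keras.layers.Dense(1),  # single output: the predicted value
    ])
    # Assumption: compile_config holds keyword arguments for Model.compile.
    model.compile(**compile_config)
    return model


# Assumed shape of compile_config; the real dictionary keys may differ.
example_compile_config = {"loss": "mse", "optimizer": "adam"}
```

A function like this would be handed to a KerasRegressor as its build_fn, so the wrapper can create a fresh model each time it fits.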

AI - Training Dataset Helper Modules

These modules help build new training datasets. Read the Docs does not appear to support keras, sklearn or tensorflow when generating sphinx docs, so here are links to the repository’s source code:

Build scaler normalized train and test datasets from a pandas.DataFrame (like a Trading History stored in s3)

Note

This function creates multiple copies of the data, so it is a memory-intensive call that may exhaust the available memory on a machine if there are many rows.

analysis_engine.ai.build_datasets_using_scalers.build_datasets_using_scalers(train_features, test_feature, df, test_size, seed, min_feature=-1, max_feature=1)[source]

Build train and test datasets using a MinMaxScaler for normalizing a dataset before training a deep neural network.

Here’s the returned dictionary:

res = {
    'status': status,
    'scaled_train_df': scaled_train_df,
    'scaled_test_df': scaled_test_df,
    'scaler_train': scaler_train,
    'scaler_test': scaler_test,
    'x_train': x_train,
    'y_train': y_train,
    'x_test': x_test,
    'y_test': y_test,
}
Parameters:
  • train_features – list of strings with all columns (features) to train
  • test_feature – string name of the column to predict. This is a single column name in the df (which is a pandas.DataFrame).
  • df – dataframe to build scaler test and train datasets
  • test_size – percentage of rows to use for the test dataset, with the remaining rows used for training
  • min_feature – min scaler range with default -1
  • max_feature – max scaler range with default 1
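To make the min_feature and max_feature arguments concrete, here is a small pure-Python sketch of min-max scaling into a target range. It follows the standard min-max formula rather than the module's actual code. The helper name min_max_scale and the sample closing prices are made up for illustration.

```python
def min_max_scale(values, min_feature=-1.0, max_feature=1.0):
    """Scale a list of numbers into [min_feature, max_feature]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; this sketch maps it to min_feature.
        return [min_feature for _ in values]
    span = hi - lo
    return [
        (v - lo) / span * (max_feature - min_feature) + min_feature
        for v in values
    ]


# Made-up closing prices, scaled with the default range of -1 to 1.
closes = [270.0, 272.5, 275.0, 280.0]
scaled = min_max_scale(closes)
print(scaled)  # [-1.0, -0.5, 0.0, 1.0]
```

The smallest value lands on min_feature and the largest on max_feature, so every feature column ends up on the same scale before training.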

Build a scaler normalized pandas.DataFrame from an existing pandas.DataFrame

analysis_engine.ai.build_scaler_dataset_from_df.build_scaler_dataset_from_df(df, min_feature=-1, max_feature=1)[source]

Helper for building scaler datasets from an existing pandas.DataFrame

returns a dictionary:

return {
    'status': status,   # NOT_RUN | SUCCESS | ERR
    'scaler': scaler,   # MinMaxScaler
    'df': df  # scaled df from df arg
}
Parameters:
  • df – pandas.DataFrame to convert to scalers
  • min_feature – min feature range for scaler normalization with default -1
  • max_feature – max feature range for scaler normalization with default 1
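One common reason to keep the returned scaler is that a model trained on scaled data also predicts in the scaled range, so predictions have to be mapped back to prices. The pure-Python sketch below shows that inverse mapping under the min-max formula. The helper name inverse_min_max and the sample numbers are hypothetical, and in practice the fitted MinMaxScaler's own inverse_transform method would normally do this.

```python
def inverse_min_max(scaled, data_min, data_max,
                    min_feature=-1.0, max_feature=1.0):
    """Map values from [min_feature, max_feature] back to original units."""
    feature_span = max_feature - min_feature
    data_span = data_max - data_min
    return [
        (s - min_feature) / feature_span * data_span + data_min
        for s in scaled
    ]


# Made-up scaled predictions for a column whose original
# minimum and maximum were 270.0 and 280.0.
predictions = [-1.0, -0.5, 0.0, 1.0]
prices = inverse_min_max(predictions, data_min=270.0, data_max=280.0)
print(prices)  # [270.0, 272.5, 275.0, 280.0]
```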

AI - Plot Deep Neural Network Fit History

Plot a deep neural network’s history output after training

Please check out this blog post for more information on how this works

analysis_engine.ai.plot_dnn_fit_history.plot_dnn_fit_history(title, df, red, red_color=None, red_label=None, blue=None, blue_color=None, blue_label=None, green=None, green_color=None, green_label=None, orange=None, orange_color=None, orange_label=None, xlabel='Training Epochs', ylabel='Error Values', linestyle='-', width=9.0, height=9.0, date_format='%d\n%b', df_filter=None, start_date=None, footnote_text=None, footnote_xpos=0.7, footnote_ypos=0.01, footnote_color='#888888', footnote_fontsize=8, scale_y=False, show_plot=True, dropna_for_all=False, verbose=False, send_plots_to_slack=False)[source]

Plot a DNN’s fit history using Keras fit history object

Parameters:
  • title – title of the plot
  • df – dataset which is pandas.DataFrame
  • red – string - column name to plot in red_color (or default ae_consts.PLOT_COLORS['red']) where the column is in the df and accessible with: df[red]
  • red_color – hex color code to plot the data in the df[red] (default is ae_consts.PLOT_COLORS['red'])
  • red_label – optional - string for the label used to identify the red line in the legend
  • blue – string - column name to plot in blue_color (or default ae_consts.PLOT_COLORS['blue']) where the column is in the df and accessible with: df[blue]
  • blue_color – hex color code to plot the data in the df[blue] (default is ae_consts.PLOT_COLORS['blue'])
  • blue_label – optional - string for the label used to identify the blue line in the legend
  • green – string - column name to plot in green_color (or default ae_consts.PLOT_COLORS['darkgreen']) where the column is in the df and accessible with: df[green]
  • green_color – hex color code to plot the data in the df[green] (default is ae_consts.PLOT_COLORS['darkgreen'])
  • green_label – optional - string for the label used to identify the green line in the legend
  • orange – string - column name to plot in orange_color (or default ae_consts.PLOT_COLORS['orange']) where the column is in the df and accessible with: df[orange]
  • orange_color – hex color code to plot the data in the df[orange] (default is ae_consts.PLOT_COLORS['orange'])
  • orange_label – optional - string for the label used to identify the orange line in the legend
  • xlabel – x-axis label
  • ylabel – y-axis label
  • linestyle – style of the plot line
  • width – float - width of the image
  • height – float - height of the image
  • date_format – string - format for dates
  • df_filter – optional - initialized pandas.DataFrame query for reducing the df records before plotting. As an example, df_filter=(df['close'] > 0.01) would find only records in the df with a close value greater than 0.01
  • start_date – optional - string datetime for plotting only from a date formatted as YYYY-MM-DD HH:MM:SS
  • footnote_text – optional - string footnote text (default is algotraders <DATE>)
  • footnote_xpos – optional - float for footnote position on the x-axis (default is 0.7)
  • footnote_ypos – optional - float for footnote position on the y-axis (default is 0.01)
  • footnote_color – optional - string hex color code for the footnote text (default is #888888)
  • footnote_fontsize – optional - float footnote font size (default is 8)
  • scale_y – optional - bool to scale the y-axis with:

    use_ax.set_ylim(
        [0, use_ax.get_ylim()[1] * 3])
  • show_plot – bool to show the plot
  • dropna_for_all – optional - bool to toggle keeping None’s in the plot df (default is to drop them for display purposes)
  • verbose – optional - bool to show logs for debugging a dataset
  • send_plots_to_slack – optional - bool to send the dnn plot to slack
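The df argument is a table with one column per plotted line. A Keras fit history keeps its per-epoch values in a dictionary that maps each metric name to a list, so it can be reshaped into such a table. The sketch below does the reshaping in plain Python. The helper name history_to_rows and the loss values are invented, and with pandas the same table would usually come from pandas.DataFrame(history.history).

```python
def history_to_rows(history):
    """Turn {metric: [value per epoch]} into one record per epoch."""
    names = sorted(history)
    num_epochs = len(history[names[0]]) if names else 0
    rows = []
    for epoch in range(num_epochs):
        row = {"epoch": epoch + 1}  # epochs numbered from 1
        for name in names:
            row[name] = history[name][epoch]
        rows.append(row)
    return rows


# Invented values shaped like the dictionary on a Keras History object.
fit_history = {"loss": [0.9, 0.5, 0.3], "val_loss": [1.0, 0.7, 0.6]}
rows = history_to_rows(fit_history)
print(rows[0])  # {'epoch': 1, 'loss': 0.9, 'val_loss': 1.0}
```

With a table like this, a call might pass red='loss' and blue='val_loss' to compare training and validation error. Those column choices are only an example.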