AI - Building and Tuning Deep Neural Networks for Predicting Future Prices
The following notebooks, script and modules are guides for building KerasRegressor models, which are deep neural networks (dnn), that try to predict a stock's future closing price from a Trading History dataset. The tools use a Trading History dataset that was created and automatically published to S3 after a trading algorithm's backtest, in which custom indicators analyzed intraday minute-by-minute pricing data stored in redis.
If you do not have a Trading History you can create one with:
ae -t SPY -p s3://algohistory/algo_training_SPY.json
and run it distributed across the engine’s workers with -w
ae -w -t SPY -p s3://algohistory/algo_training_SPY.json
Here are examples of training a dnn using a Trading History from S3 (Minio or AWS):
AI - Building a Deep Neural Network Helper Module
This function is used as a Keras Scikit-Learn Builder Function for creating a Keras Sequential deep neural network model (dnn). It is passed in as the build_fn argument to create a KerasRegressor (or KerasClassifier).
Build a deep neural network for regression predictions

analysis_engine.ai.build_regression_dnn.build_regression_dnn(num_features, compile_config, model_json=None, model_config=None)
Parameters:
- num_features – input_dim for the number of features in the data
- compile_config – dictionary of compile options
- model_json – keras model json to build the model
- model_config – optional dictionary for model
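As a rough illustration of what a builder function with this shape can look like, the sketch below builds and compiles a small Keras Sequential regression model. It is not the module's implementation: the name build_example_dnn, the layer sizes, and the assumption that compile_config holds keyword arguments for model.compile (here loss and optimizer) are hypothetical, and model_json and model_config are left out.

```python
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential


def build_example_dnn(num_features, compile_config):
    """Hypothetical builder: return a compiled regression dnn."""
    model = Sequential([
        Input(shape=(num_features,)),  # one input per feature column
        Dense(8, activation="relu"),   # hidden layer (size chosen arbitrarily)
        Dense(1),                      # single linear output for the predicted value
    ])
    # Assumption: compile_config maps directly onto model.compile keyword arguments
    model.compile(**compile_config)
    return model


model = build_example_dnn(
    num_features=3,
    compile_config={"loss": "mse", "optimizer": "adam"},
)
```

A function like this could then be handed to a scikit-learn wrapper such as KerasRegressor as its build_fn; the wrapper's import path depends on the installed Keras or TensorFlow version.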
AI - Training Dataset Helper Modules
These modules are included to help build new training datasets. Read the Docs does not appear to support keras, sklearn or tensorflow when generating Sphinx docs, so here are links to the repository's source code:
Build scaler normalized train and test datasets from a pandas.DataFrame (like a Trading History stored in s3)
Note
This function creates multiple copies of the data, so it is a memory-intensive call that may exhaust the available memory on a machine if there are many rows.
analysis_engine.ai.build_datasets_using_scalers.build_datasets_using_scalers(train_features, test_feature, df, test_size, seed, min_feature=-1, max_feature=1)
Build train and test datasets using a MinMaxScaler for normalizing a dataset before training a deep neural network.
Here's the returned dictionary:
res = {
    'status': status,
    'scaled_train_df': scaled_train_df,
    'scaled_test_df': scaled_test_df,
    'scaler_train': scaler_train,
    'scaler_test': scaler_test,
    'x_train': x_train,
    'y_train': y_train,
    'x_test': x_test,
    'y_test': y_test,
}
Parameters:
- train_features – list of strings with all columns (features) to train
- test_feature – string name of the column to predict. This is a single column name in the df (which is a pandas.DataFrame).
- df – dataframe to build scaler test and train datasets
- test_size – percent of test to train rows
- min_feature – min scaler range with default -1
- max_feature – max scaler range with default 1
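The general technique described above (normalize with a MinMaxScaler, then split into train and test rows) can be sketched as follows. This is an assumption-laden illustration, not the module's source: the function name build_scaled_datasets, the choice to fit one scaler on the feature columns and another on the target column, and the toy column names and values are all hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler


def build_scaled_datasets(train_features, test_feature, df, test_size, seed,
                          min_feature=-1, max_feature=1):
    """Hypothetical sketch: scale features and target, then split the rows."""
    scaler_train = MinMaxScaler(feature_range=(min_feature, max_feature))
    scaler_test = MinMaxScaler(feature_range=(min_feature, max_feature))
    # Each fit_transform call returns a new array (a scaled copy of the data)
    scaled_train = scaler_train.fit_transform(df[train_features])
    scaled_test = scaler_test.fit_transform(df[[test_feature]])
    x_train, x_test, y_train, y_test = train_test_split(
        scaled_train, scaled_test, test_size=test_size, random_state=seed)
    return {
        "scaler_train": scaler_train,
        "scaler_test": scaler_test,
        "x_train": x_train,
        "y_train": y_train,
        "x_test": x_test,
        "y_test": y_test,
    }


# Toy rows standing in for a Trading History; the values are made up
df = pd.DataFrame({
    "high": [float(v) for v in range(101, 111)],
    "low": [float(v) for v in range(99, 109)],
    "close": [float(v) for v in range(100, 110)],
})
res = build_scaled_datasets(["high", "low"], "close", df, test_size=0.2, seed=42)
```

Because the scaling and splitting steps each return new arrays, a call like this holds several copies of the data at once, which is consistent with the memory note above.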
Build a scaler normalized pandas.DataFrame from an existing pandas.DataFrame
analysis_engine.ai.build_scaler_dataset_from_df.build_scaler_dataset_from_df(df, min_feature=-1, max_feature=1)
Helper for building scaler datasets from an existing pandas.DataFrame. Returns a dictionary:
return {
    'status': status,  # NOT_RUN | SUCCESS | ERR
    'scaler': scaler,  # MinMaxScaler
    'df': df  # scaled df from df arg
}
Parameters:
- df – pandas.DataFrame to convert to scalers
- min_feature – min feature range for scaler normalization with default -1
- max_feature – max feature range for scaler normalization with default 1
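One reason to return the fitted scaler along with the scaled data is that it can map scaled values back to the original units. The sketch below shows that round trip with scikit-learn's MinMaxScaler directly; it does not call the helper above, and the close column and its prices are made-up placeholders.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy closing prices; not real market data
df = pd.DataFrame({"close": [280.0, 282.5, 279.0, 285.0]})

# Scale every column into the [-1, 1] range (the documented defaults)
scaler = MinMaxScaler(feature_range=(-1, 1))
scaled_df = pd.DataFrame(
    scaler.fit_transform(df), columns=df.columns, index=df.index)

# Values in the scaled range (for example model predictions) can be
# converted back to prices with the same fitted scaler
restored = scaler.inverse_transform(scaled_df)
```

Here the smallest price maps to -1, the largest to 1, and inverse_transform recovers the original prices up to floating point error.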
AI - Plot Deep Neural Network Fit History
Plot a deep neural network’s history output after training
Please check out this blog post for more information on how this works
analysis_engine.ai.plot_dnn_fit_history.plot_dnn_fit_history(title, df, red, red_color=None, red_label=None, blue=None, blue_color=None, blue_label=None, green=None, green_color=None, green_label=None, orange=None, orange_color=None, orange_label=None, xlabel='Training Epochs', ylabel='Error Values', linestyle='-', width=9.0, height=9.0, date_format='%d\n%b', df_filter=None, start_date=None, footnote_text=None, footnote_xpos=0.7, footnote_ypos=0.01, footnote_color='#888888', footnote_fontsize=8, scale_y=False, show_plot=True, dropna_for_all=False, verbose=False, send_plots_to_slack=False)
Plot a DNN's fit history using the Keras fit history object
Parameters:
- title – title of the plot
- df – dataset which is a pandas.DataFrame
- red – string - column name to plot in red_color (or default ae_consts.PLOT_COLORS['red']) where the column is in the df and accessible with: df[red]
- red_color – hex color code to plot the data in the df[red] (default is ae_consts.PLOT_COLORS['red'])
- red_label – optional - string for the label used to identify the red line in the legend
- blue – string - column name to plot in blue_color (or default ae_consts.PLOT_COLORS['blue']) where the column is in the df and accessible with: df[blue]
- blue_color – hex color code to plot the data in the df[blue] (default is ae_consts.PLOT_COLORS['blue'])
- blue_label – optional - string for the label used to identify the blue line in the legend
- green – string - column name to plot in green_color (or default ae_consts.PLOT_COLORS['darkgreen']) where the column is in the df and accessible with: df[green]
- green_color – hex color code to plot the data in the df[green] (default is ae_consts.PLOT_COLORS['darkgreen'])
- green_label – optional - string for the label used to identify the green line in the legend
- orange – string - column name to plot in orange_color (or default ae_consts.PLOT_COLORS['orange']) where the column is in the df and accessible with: df[orange]
- orange_color – hex color code to plot the data in the df[orange] (default is ae_consts.PLOT_COLORS['orange'])
- orange_label – optional - string for the label used to identify the orange line in the legend
- xlabel – x-axis label
- ylabel – y-axis label
- linestyle – style of the plot line
- width – float - width of the image
- height – float - height of the image
- date_format – string - format for dates
- df_filter – optional - initialized pandas.DataFrame query for reducing the df records before plotting. As an example, df_filter=(df['close'] > 0.01) would find only records in the df with a close value greater than 0.01
- start_date – optional - string datetime for plotting only from a date formatted as YYYY-MM-DD HH:MM:SS
- footnote_text – optional - string footnote text (default is algotraders <DATE>)
- footnote_xpos – optional - float for footnote position on the x-axis (default is 0.7)
- footnote_ypos – optional - float for footnote position on the y-axis (default is 0.01)
- footnote_color – optional - string hex color code for the footnote text (default is #888888)
- footnote_fontsize – optional - float footnote font size (default is 8)
- scale_y – optional - bool to scale the y-axis with use_ax.set_ylim([0, use_ax.get_ylim()[1] * 3])
- show_plot – bool to show the plot
- dropna_for_all – optional - bool to toggle keeping None's in the plot df (default is to drop them for display purposes)
- verbose – optional - bool to show logs for debugging a dataset
- send_plots_to_slack – optional - bool to send the dnn plot to slack