AI - Building and Tuning Deep Neural Networks for Predicting Future Prices
The following notebooks, scripts and modules are guides for building KerasRegressor models, deep neural networks (dnn), that try to predict a stock's future closing price from a Trading History dataset. The tools use a Trading History dataset that was created and automatically published to S3 after processing a trading algorithm's backtest of custom indicators, which analyzed intraday minute-by-minute pricing data stored in redis.
If you do not have a Trading History, you can create one with:
ae -t SPY -p s3://algohistory/algo_training_SPY.json
To run it distributed across the engine's workers, add -w:
ae -w -t SPY -p s3://algohistory/algo_training_SPY.json
Here are examples of training a dnn using a Trading History from S3 (Minio or AWS):
AI - Building a Deep Neural Network Helper Module
This function is used as a Keras Scikit-Learn Builder Function for creating a Keras Sequential deep neural network model (dnn). This function is passed in as the build_fn argument to create a KerasRegressor (or KerasClassifier).
Build a deep neural network for regression predictions
analysis_engine.ai.build_regression_dnn.build_regression_dnn(num_features, compile_config, model_json=None, model_config=None)
Parameters:
- num_features – input_dim for the number of features in the data
- compile_config – dictionary of compile options
- model_json – keras model json to build the model
- model_config – optional dictionary for model
AI - Training Dataset Helper Modules
These modules are included to help build new training datasets. It looks like Read the Docs does not support keras, sklearn or tensorflow for generating sphinx docs, so here are links to the repository's source code:
Build scaler normalized train and test datasets from a pandas.DataFrame (like a Trading History stored in s3)
Note
This function creates multiple copies of the data, so it is a memory-intensive call that may exhaust the available memory on a machine if there are many rows.
analysis_engine.ai.build_datasets_using_scalers.build_datasets_using_scalers(train_features, test_feature, df, test_size, seed, min_feature=-1, max_feature=1)
Build train and test datasets using a MinMaxScaler for normalizing a dataset before training a deep neural network.
Here's the returned dictionary:
res = {
    'status': status,
    'scaled_train_df': scaled_train_df,
    'scaled_test_df': scaled_test_df,
    'scaler_train': scaler_train,
    'scaler_test': scaler_test,
    'x_train': x_train,
    'y_train': y_train,
    'x_test': x_test,
    'y_test': y_test,
}
Parameters:
- train_features – list of strings with all columns (features) to train
- test_feature – string name of the column to predict. This is a single column name in the df (which is a pandas.DataFrame).
- df – dataframe to build scaler test and train datasets
- test_size – percent of test to train rows
- min_feature – min scaler range with default -1
- max_feature – max scaler range with default 1
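As a rough illustration of what this kind of helper does, here is a pure-Python sketch of min-max scaling into the `[min_feature, max_feature]` range followed by a seeded train/test split. It is not the library's implementation: the helper names `min_max_scale` and `split_rows`, the sample prices, and the choice to scale the feature columns and the predicted column separately (suggested by the `scaler_train` and `scaler_test` keys) are all assumptions made for this example.

```python
import random


def min_max_scale(values, min_feature=-1.0, max_feature=1.0):
    """Scale one column of floats into [min_feature, max_feature]."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # A constant column carries no range; map it to the lower bound.
        return [float(min_feature)] * len(values)
    width = max_feature - min_feature
    return [(v - low) / span * width + min_feature for v in values]


def split_rows(x_rows, y_rows, test_size, seed):
    """Shuffle row indices with a fixed seed, then hold out test_size of them."""
    indices = list(range(len(x_rows)))
    random.Random(seed).shuffle(indices)
    num_test = int(round(len(indices) * test_size))
    test_idx, train_idx = indices[:num_test], indices[num_test:]
    return {
        'x_train': [x_rows[i] for i in train_idx],
        'y_train': [y_rows[i] for i in train_idx],
        'x_test': [x_rows[i] for i in test_idx],
        'y_test': [y_rows[i] for i in test_idx],
    }


# Hypothetical minute bars: two feature columns and the column to predict.
highs = [270.5, 272.0, 269.8, 272.4, 274.0]
lows = [269.1, 270.9, 268.5, 271.0, 272.8]
closes = [270.0, 271.5, 269.0, 272.0, 273.5]

# Features and target are scaled independently, column by column.
scaled_features = list(zip(min_max_scale(highs), min_max_scale(lows)))
scaled_target = min_max_scale(closes)

res = split_rows(scaled_features, scaled_target, test_size=0.2, seed=42)
print(len(res['x_train']), len(res['x_test']))
```

Every list comprehension above builds a new copy of the rows, which mirrors why the note warns about memory use on large datasets.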
Build a scaler normalized pandas.DataFrame from an existing pandas.DataFrame
analysis_engine.ai.build_scaler_dataset_from_df.build_scaler_dataset_from_df(df, min_feature=-1, max_feature=1)
Helper for building scaler datasets from an existing pandas.DataFrame
returns a dictionary:
return {
    'status': status,  # NOT_RUN | SUCCESS | ERR
    'scaler': scaler,  # MinMaxScaler
    'df': df  # scaled df from df arg
}
Parameters:
- df – pandas.DataFrame to convert to scalers
- min_feature – min feature range for scaler normalization with default -1
- max_feature – max feature range for scaler normalization with default 1
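One reason to keep the returned `scaler` is that a model trained on scaled values also predicts in the scaled range, so its output has to be mapped back to prices. With scikit-learn's MinMaxScaler that is done with `inverse_transform`. The sketch below reproduces the arithmetic in plain Python for a single column; `TinyMinMaxScaler` and the sample values are hypothetical stand-ins written for this example, not part of the analysis engine.

```python
class TinyMinMaxScaler:
    """Pure-Python stand-in for a fitted min-max scaler on one column."""

    def __init__(self, min_feature=-1.0, max_feature=1.0):
        self.min_feature = min_feature
        self.max_feature = max_feature
        self.data_min = None
        self.data_max = None

    def fit(self, values):
        # Remember the observed range so it can be reused later.
        self.data_min = min(values)
        self.data_max = max(values)
        return self

    def transform(self, values):
        span = self.data_max - self.data_min
        width = self.max_feature - self.min_feature
        if span == 0:
            return [float(self.min_feature)] * len(values)
        return [
            (v - self.data_min) / span * width + self.min_feature
            for v in values
        ]

    def inverse_transform(self, scaled):
        # Undo transform: scaled range back to the original units.
        span = self.data_max - self.data_min
        width = self.max_feature - self.min_feature
        return [
            (s - self.min_feature) / width * span + self.data_min
            for s in scaled
        ]


closes = [100.0, 110.0, 120.0]       # hypothetical closing prices
scaler = TinyMinMaxScaler().fit(closes)
scaled = scaler.transform(closes)

predicted_scaled = [0.5]             # hypothetical model output
predicted_price = scaler.inverse_transform(predicted_scaled)
print(scaled, predicted_price)
```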
AI - Plot Deep Neural Network Fit History
Plot a deep neural network’s history output after training
Please check out this blog post for more information on how this works
analysis_engine.ai.plot_dnn_fit_history.plot_dnn_fit_history(title, df, red, red_color=None, red_label=None, blue=None, blue_color=None, blue_label=None, green=None, green_color=None, green_label=None, orange=None, orange_color=None, orange_label=None, xlabel='Training Epochs', ylabel='Error Values', linestyle='-', width=9.0, height=9.0, date_format='%d\n%b', df_filter=None, start_date=None, footnote_text=None, footnote_xpos=0.7, footnote_ypos=0.01, footnote_color='#888888', footnote_fontsize=8, scale_y=False, show_plot=True, dropna_for_all=False, verbose=False, send_plots_to_slack=False)
Plot a DNN's fit history using the Keras fit history object
Parameters:
- title – title of the plot
- df – dataset which is pandas.DataFrame
- red – string - column name to plot in red_color (or default ae_consts.PLOT_COLORS['red']) where the column is in the df and accessible with: df[red]
- red_color – hex color code to plot the data in the df[red] (default is ae_consts.PLOT_COLORS['red'])
- red_label – optional - string for the label used to identify the red line in the legend
- blue – string - column name to plot in blue_color (or default ae_consts.PLOT_COLORS['blue']) where the column is in the df and accessible with: df[blue]
- blue_color – hex color code to plot the data in the df[blue] (default is ae_consts.PLOT_COLORS['blue'])
- blue_label – optional - string for the label used to identify the blue line in the legend
- green – string - column name to plot in green_color (or default ae_consts.PLOT_COLORS['darkgreen']) where the column is in the df and accessible with: df[green]
- green_color – hex color code to plot the data in the df[green] (default is ae_consts.PLOT_COLORS['darkgreen'])
- green_label – optional - string for the label used to identify the green line in the legend
- orange – string - column name to plot in orange_color (or default ae_consts.PLOT_COLORS['orange']) where the column is in the df and accessible with: df[orange]
- orange_color – hex color code to plot the data in the df[orange] (default is ae_consts.PLOT_COLORS['orange'])
- orange_label – optional - string for the label used to identify the orange line in the legend
- xlabel – x-axis label
- ylabel – y-axis label
- linestyle – style of the plot line
- width – float - width of the image
- height – float - height of the image
- date_format – string - format for dates
- df_filter – optional - initialized pandas.DataFrame query for reducing the df records before plotting. As an example, df_filter=(df['close'] > 0.01) would find only records in the df with a close value greater than 0.01
- start_date – optional - string datetime for plotting only from a date formatted as YYYY-MM-DD HH:MM:SS
- footnote_text – optional - string footnote text (default is algotraders <DATE>)
- footnote_xpos – optional - float for footnote position on the x-axis (default is 0.7)
- footnote_ypos – optional - float for footnote position on the y-axis (default is 0.01)
- footnote_color – optional - string hex color code for the footnote text (default is #888888)
- footnote_fontsize – optional - float footnote font size (default is 8)
- scale_y – optional - bool to scale the y-axis with:
    use_ax.set_ylim([0, use_ax.get_ylim()[1] * 3])
- show_plot – bool to show the plot
- dropna_for_all – optional - bool to toggle keeping None's in the plot df (default is drop them for display purposes)
- verbose – optional - bool to show logs for debugging a dataset
- send_plots_to_slack – optional - bool to send the dnn plot to slack
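To make the two date-related parameters more concrete, the following standard-library sketch shows how a `start_date` string in the `YYYY-MM-DD HH:MM:SS` layout can be parsed and how the default `date_format` of `'%d\n%b'` renders a tick label on two lines. The helper names, the sample rows, and the choice to keep rows that fall exactly on the start date are assumptions for illustration; the plotting function may handle these details differently.

```python
from datetime import datetime

# strptime layout matching the documented YYYY-MM-DD HH:MM:SS format.
START_DATE_FORMAT = '%Y-%m-%d %H:%M:%S'


def rows_on_or_after(rows, start_date):
    """Keep only rows whose 'date' is at or after the start_date string."""
    start = datetime.strptime(start_date, START_DATE_FORMAT)
    return [row for row in rows if row['date'] >= start]


def tick_label(date, date_format='%d\n%b'):
    """Render a date as day-of-month, newline, abbreviated month name."""
    return date.strftime(date_format)


# Hypothetical records standing in for rows of a DataFrame.
rows = [
    {'date': datetime(2019, 1, 2, 9, 30), 'close': 250.2},
    {'date': datetime(2019, 1, 3, 9, 30), 'close': 244.2},
    {'date': datetime(2019, 1, 4, 9, 30), 'close': 252.4},
]

kept = rows_on_or_after(rows, '2019-01-03 00:00:00')
labels = [tick_label(row['date']) for row in kept]
print(labels)
```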