Dataset Tools - Load Dataset

Load Algorithm Ready Dataset

analysis_engine.load_dataset.load_dataset loads an algorithm-ready dataset from a file, S3 or Redis.

Supported Datasets:

  • SA_DATASET_TYPE_ALGO_READY - Algorithm-ready datasets

Supported environment variables

# To show debug and trace logging, export ``SHARED_LOG_CFG``
# pointing at a debug logger json file. To turn on debugging for
# this library, export the variable to the repo's included file
# with the command:
export SHARED_LOG_CFG=/opt/sa/analysis_engine/log/debug-logging.json
analysis_engine.load_dataset.load_dataset(algo_dataset=None, dataset_type=20000, serialize_datasets=['daily', 'minute', 'quote', 'stats', 'peers', 'news1', 'financials', 'earnings', 'dividends', 'company', 'news', 'calls', 'puts', 'pricing', 'tdcalls', 'tdputs'], path_to_file=None, compress=False, encoding='utf-8', redis_enabled=True, redis_key=None, redis_address=None, redis_db=None, redis_password=None, redis_expire=None, redis_serializer='json', redis_encoding='utf-8', s3_enabled=True, s3_key=None, s3_address=None, s3_bucket=None, s3_access_key=None, s3_secret_key=None, s3_region_name=None, s3_secure=False, slack_enabled=False, slack_code_block=False, slack_full_width=False, verbose=False)[source]

Load an algorithm dataset from file, s3 or redis

Parameters:
  • algo_dataset – optional - already loaded algorithm-ready dataset
  • dataset_type – optional - dataset type (default is SA_DATASET_TYPE_ALGO_READY)
  • path_to_file – optional - path to an algorithm-ready dataset in a file
  • serialize_datasets – optional - list of dataset names to deserialize in the dataset
  • compress – optional - boolean flag for decompressing the contents of the path_to_file if necessary (default is False and algorithms use zlib for compression)
  • encoding – optional - string for data encoding

(Optional) Redis connectivity arguments

Parameters:
  • redis_enabled – bool - toggle for auto-caching all datasets in Redis (default is True)
  • redis_key – string - key to save the data in redis (default is None)
  • redis_address – Redis connection string format: host:port (default is localhost:6379)
  • redis_db – Redis db to use (default is 0)
  • redis_password – optional - Redis password (default is None)
  • redis_expire – optional - Redis expire value (default is None)
  • redis_serializer – not used yet - support for future pickle objects in redis
  • redis_encoding – format of the encoded key in redis

(Optional) Minio (S3) connectivity arguments

Parameters:
  • s3_enabled – bool - toggle for auto-archiving on Minio (S3) (default is True)
  • s3_key – string - key to save the data in S3 (default is None)
  • s3_address – Minio S3 connection string format: host:port (default is localhost:9000)
  • s3_bucket – S3 Bucket for storing the artifacts (default is dev) which should be viewable on a browser: http://localhost:9000/minio/dev/
  • s3_access_key – S3 Access key (default is trexaccesskey)
  • s3_secret_key – S3 Secret key (default is trex123321)
  • s3_region_name – S3 region name (default is us-east-1)
  • s3_secure – Transmit using tls encryption (default is False)

(Optional) Slack arguments

Parameters:
  • slack_enabled – optional - boolean for publishing to slack
  • slack_code_block – optional - boolean for publishing as a code block in slack
  • slack_full_width – optional - boolean for publishing to slack using the full width allowed

Additional arguments

Parameters:
  • verbose – optional - bool for increasing logging
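
For example, a minimal sketch (the file path below is a hypothetical placeholder) that loads an algorithm-ready dataset from a local file while skipping the Redis and Minio (S3) caches:

import analysis_engine.load_dataset as load_ds

# hypothetical path to a previously saved algorithm-ready dataset
algo_ready = load_ds.load_dataset(
    path_to_file='/tmp/SPY-latest.json',
    compress=False,        # set True if the file was saved with zlib compression
    encoding='utf-8',
    redis_enabled=False,   # skip the Redis cache
    s3_enabled=False,      # skip Minio (S3)
    verbose=True)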

Load Trading History Dataset

Load a Trading History dataset from file or S3 (Redis support is coming soon)

Supported Datasets:

  • SA_DATASET_TYPE_TRADING_HISTORY - trading history datasets
analysis_engine.load_history_dataset.load_history_dataset(history_dataset=None, dataset_type=None, serialize_datasets=None, path_to_file=None, compress=None, encoding='utf-8', convert_to_dict=False, redis_enabled=None, redis_key=None, redis_address=None, redis_db=None, redis_password=None, redis_expire=None, redis_serializer='json', redis_encoding='utf-8', s3_enabled=None, s3_key=None, s3_address=None, s3_bucket=None, s3_access_key=None, s3_secret_key=None, s3_region_name=None, s3_secure=None, slack_enabled=False, slack_code_block=False, slack_full_width=False, verbose=False)[source]

Load a Trading History Dataset from file or S3 (note: Redis is not supported yet)

Parameters:
  • history_dataset – optional - already loaded history dataset
  • dataset_type – optional - dataset type (default is analysis_engine.consts.SA_DATASET_TYPE_TRADING_HISTORY)
  • path_to_file – optional - path to a trading history dataset in a file
  • serialize_datasets – optional - list of dataset names to deserialize in the dataset
  • compress – optional - boolean flag for decompressing the contents of the path_to_file if necessary (default is True and uses zlib for compression)
  • encoding – optional - string for data encoding
  • convert_to_dict – optional - boolean flag for decoding as a dictionary during prepare

(Optional) Redis connectivity arguments

Parameters:
  • redis_enabled – bool - toggle for auto-caching all datasets in Redis (default is analysis_engine.consts.ENABLED_REDIS_PUBLISH)
  • redis_key – string - key to save the data in redis (default is None)
  • redis_address – Redis connection string format: host:port (default is analysis_engine.consts.REDIS_ADDRESS)
  • redis_db – Redis db to use (default is analysis_engine.consts.REDIS_DB)
  • redis_password – optional - Redis password (default is analysis_engine.consts.REDIS_PASSWORD)
  • redis_expire – optional - Redis expire value (default is None)
  • redis_serializer – not used yet - support for future pickle objects in redis (default is json)
  • redis_encoding – format of the encoded key in redis (default is utf-8)

(Optional) Minio (S3) connectivity arguments

Parameters:
  • s3_enabled – bool - toggle for auto-archiving on Minio (S3) (default is analysis_engine.consts.ENABLED_S3_UPLOAD)
  • s3_key – string - key to save the data in S3 (default is None)
  • s3_address – Minio S3 connection string format: host:port (default is analysis_engine.consts.S3_ADDRESS)
  • s3_bucket – S3 Bucket for storing the artifacts (default is analysis_engine.consts.S3_BUCKET) which should be viewable on a browser: http://localhost:9000/minio/
  • s3_access_key – S3 Access key (default is analysis_engine.consts.S3_ACCESS_KEY)
  • s3_secret_key – S3 Secret key (default is analysis_engine.consts.S3_SECRET_KEY)
  • s3_region_name – S3 region name (default is analysis_engine.consts.S3_REGION_NAME)
  • s3_secure – Transmit using tls encryption (default is analysis_engine.consts.S3_SECURE)

(Optional) Slack arguments

Parameters:
  • slack_enabled – optional - boolean for publishing to slack
  • slack_code_block – optional - boolean for publishing as a code block in slack
  • slack_full_width – optional - boolean for publishing to slack using the full width allowed

Additional arguments

Parameters:
  • verbose – optional - bool for increasing logging
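
A minimal sketch (again with a hypothetical file path) that loads a trading history dataset from a local file with Redis and S3 disabled:

import analysis_engine.load_history_dataset as load_history

# hypothetical path to a previously published trading history dataset
history = load_history.load_history_dataset(
    path_to_file='/tmp/SPY-trading-history.json',
    compress=False,
    encoding='utf-8',
    redis_enabled=False,
    s3_enabled=False,
    verbose=True)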

Load Trading History Dataset from S3

Helper for loading Trading History datasets from s3

analysis_engine.load_history_dataset_from_s3.load_history_dataset_from_s3(s3_key, s3_address, s3_bucket, s3_access_key, s3_secret_key, s3_region_name, s3_secure, serialize_datasets=['daily', 'minute', 'quote', 'stats', 'peers', 'news1', 'financials', 'earnings', 'dividends', 'company', 'news', 'calls', 'puts', 'pricing', 'tdcalls', 'tdputs'], convert_as_json=True, convert_to_dict=False, compress=False, encoding='utf-8')[source]

Load a Trading History dataset for backtesting analysis from S3

Parameters:
  • serialize_datasets – optional - list of dataset names to deserialize in the dataset
  • convert_as_json – optional - boolean flag for decoding as a dictionary
  • convert_to_dict – optional - boolean flag for decoding as a dictionary during prepare
  • compress – optional - boolean flag for decompressing the contents if necessary (default is False and algorithms use zlib for compression)
  • encoding – optional - string for data encoding

Minio (S3) connectivity arguments

Parameters:
  • s3_key – string - key to load the data from in S3
  • s3_address – Minio S3 connection string format: host:port (default is localhost:9000)
  • s3_bucket – S3 Bucket for storing the artifacts (default is dev) which should be viewable on a browser: http://localhost:9000/minio/dev/
  • s3_access_key – S3 Access key (default is trexaccesskey)
  • s3_secret_key – S3 Secret key (default is trex123321)
  • s3_region_name – S3 region name (default is us-east-1)
  • s3_secure – Transmit using tls encryption (default is False)
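
A minimal sketch using the local Minio development defaults listed above (the object key is a hypothetical placeholder; substitute your own endpoint and credentials):

import analysis_engine.load_history_dataset_from_s3 as s3_loader

# the bucket, endpoint and credentials below assume the local Minio
# defaults documented above; replace them with your deployment values
history = s3_loader.load_history_dataset_from_s3(
    s3_key='SPY-trading-history.json',   # hypothetical object key
    s3_address='localhost:9000',
    s3_bucket='dev',
    s3_access_key='trexaccesskey',
    s3_secret_key='trex123321',
    s3_region_name='us-east-1',
    s3_secure=False)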

Load Trading History Dataset from a Local File

Helper for loading a Trading History dataset from a local file

Supported environment variables

# To show debug and trace logging, export ``SHARED_LOG_CFG``
# pointing at a debug logger json file. To turn on debugging for
# this library, export the variable to the repo's included file
# with the command:
export SHARED_LOG_CFG=/opt/sa/analysis_engine/log/debug-logging.json
analysis_engine.load_history_dataset_from_file.load_history_dataset_from_file(path_to_file, compress=False, encoding='utf-8')[source]

Load a Trading History dataset from a local file

Parameters:
  • path_to_file – string - path to a file holding a Trading History dataset
  • compress – optional - boolean flag for decompressing the contents of the path_to_file if necessary (default is False and algorithms use zlib for compression)
  • encoding – optional - string for data encoding
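
A minimal sketch (the file path is a hypothetical placeholder):

import analysis_engine.load_history_dataset_from_file as file_loader

# hypothetical path to a previously saved trading history dataset
history = file_loader.load_history_dataset_from_file(
    path_to_file='/tmp/SPY-trading-history.json',
    compress=False,
    encoding='utf-8')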