Dataset Tools - Show Dataset

analysis_engine.show_dataset.show_dataset loads a dataset from a file, S3 or Redis and produces an overview of its structure for debugging dataset mapping and serialization issues.

Show an algorithm dataset from file, s3 or redis

Supported Datasets:

  • SA_DATASET_TYPE_ALGO_READY - Algorithm-ready datasets

Supported environment variables

# To enable debug and trace logging, export ``SHARED_LOG_CFG``
# with the path to a debug logger JSON file. To turn on debugging
# for this library, point the variable at the file included in
# the repo with the command:
export SHARED_LOG_CFG=/opt/sa/analysis_engine/log/debug-logging.json

analysis_engine.show_dataset.show_dataset(algo_dataset=None, dataset_type=20000, serialize_datasets=['daily', 'minute', 'quote', 'stats', 'peers', 'news1', 'financials', 'earnings', 'dividends', 'company', 'news', 'calls', 'puts', 'pricing', 'tdcalls', 'tdputs'], path_to_file=None, compress=False, encoding='utf-8', redis_enabled=True, redis_key=None, redis_address=None, redis_db=None, redis_password=None, redis_expire=None, redis_serializer='json', redis_encoding='utf-8', s3_enabled=True, s3_key=None, s3_address=None, s3_bucket=None, s3_access_key=None, s3_secret_key=None, s3_region_name=None, s3_secure=False, slack_enabled=False, slack_code_block=False, slack_full_width=False, verbose=False)

Show a supported dataset’s internal structure and preview some of its values to debug mapping and serialization issues

  • algo_dataset – optional - already loaded algorithm-ready dataset
  • dataset_type – optional - dataset type (default is SA_DATASET_TYPE_ALGO_READY)
  • serialize_datasets – optional - list of dataset names to deserialize in the dataset
  • path_to_file – optional - path to an algorithm-ready dataset in a file
  • compress – optional - boolean flag for decompressing the contents of path_to_file if necessary (default is False; algorithms use zlib for compression)
  • encoding – optional - string for data encoding
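
The compress and encoding arguments above imply that a compressed file holds zlib-compressed bytes rather than plain text. The sketch below is not the library's loader. It only illustrates the round trip, assuming the dataset is JSON text encoded with encoding and then compressed with zlib. The helper names write_dataset and read_dataset and the sample content are hypothetical.

```python
import json
import os
import tempfile
import zlib


def write_dataset(path, data, compress=False, encoding="utf-8"):
    """Serialize data to JSON and optionally zlib-compress it (hypothetical helper)."""
    raw = json.dumps(data).encode(encoding)
    if compress:
        raw = zlib.compress(raw)
    with open(path, "wb") as handle:
        handle.write(raw)


def read_dataset(path, compress=False, encoding="utf-8"):
    """Read the file back, decompressing first when compress is True."""
    with open(path, "rb") as handle:
        raw = handle.read()
    if compress:
        raw = zlib.decompress(raw)
    return json.loads(raw.decode(encoding))


# Round trip through a temporary file; the dataset content is made up.
sample = {"SPY": [{"id": "demo", "data": {"daily": [1, 2, 3]}}]}
tmp_path = os.path.join(tempfile.mkdtemp(), "demo-algo-ready.json")
write_dataset(tmp_path, sample, compress=True)
restored = read_dataset(tmp_path, compress=True)
```

In this sketch, reading a compressed file with compress=False fails at the decode or JSON parsing step, which is a quick way to tell which setting a given file needs.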

(Optional) Redis connectivity arguments

  • redis_enabled – bool - toggle for auto-caching all datasets in Redis (default is True)
  • redis_key – string - key to save the data in redis (default is None)
  • redis_address – Redis connection string format: host:port (default is localhost:6379)
  • redis_db – Redis db to use (default is 0)
  • redis_password – optional - Redis password (default is None)
  • redis_expire – optional - Redis expire value (default is None)
  • redis_serializer – not used yet - support for future pickle objects in redis
  • redis_encoding – format of the encoded key in redis
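
The Redis arguments above can be collected into one keyword dictionary before calling the function. This is a minimal sketch, assuming the documented defaults (localhost:6379 and db 0). The helper names split_address and redis_kwargs and the key SPY-demo are hypothetical and not part of the library.

```python
def split_address(address):
    """Split a host:port string into (host, port), raising on bad input."""
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {address!r}")
    return host, int(port)


def redis_kwargs(redis_key, redis_address="localhost:6379", redis_db=0,
                 redis_password=None):
    """Build the Redis keyword arguments, validating the address first."""
    split_address(redis_address)  # raises ValueError on a malformed address
    return {
        "redis_enabled": True,
        "redis_key": redis_key,
        "redis_address": redis_address,
        "redis_db": redis_db,
        "redis_password": redis_password,
    }


# "SPY-demo" is a made-up key used only for illustration.
kwargs = redis_kwargs("SPY-demo")
```

With analysis_engine installed, the dictionary could then be passed along as show_dataset(**kwargs).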

(Optional) Minio (S3) connectivity arguments

  • s3_enabled – bool - toggle for auto-archiving on Minio (S3) (default is True)
  • s3_key – string - key to save the data in S3 (default is None)
  • s3_address – Minio S3 connection string format: host:port (default is localhost:9000)
  • s3_bucket – S3 Bucket for storing the artifacts (default is dev) which should be viewable on a browser: http://localhost:9000/minio/dev/
  • s3_access_key – S3 Access key (default is trexaccesskey)
  • s3_secret_key – S3 Secret key (default is trex123321)
  • s3_region_name – S3 region name (default is us-east-1)
  • s3_secure – Transmit using tls encryption (default is False)
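
The s3_bucket description above mentions a URL for viewing the bucket in a browser. The sketch below shows how that URL follows from s3_address, s3_bucket and s3_secure, assuming the /minio/<bucket>/ path layout seen in the example URL. The helper name minio_browser_url is hypothetical.

```python
def minio_browser_url(s3_address="localhost:9000", s3_bucket="dev",
                      s3_secure=False):
    """Build the Minio browser URL for a bucket (hypothetical helper)."""
    # s3_secure selects TLS, so it also decides the URL scheme.
    scheme = "https" if s3_secure else "http"
    return f"{scheme}://{s3_address}/minio/{s3_bucket}/"


# With the documented defaults this reproduces the URL from the bucket note.
url = minio_browser_url()
```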

(Optional) Slack arguments

  • slack_enabled – optional - boolean for publishing to slack
  • slack_code_block – optional - boolean for publishing as a code block in slack
  • slack_full_width – optional - boolean for publishing to slack using the full width allowed

Additional arguments

  • verbose – optional - bool for increasing logging