Fetch - Stock Datasets

Fetch populates the Redis cache as a stock data pipeline. Cached data can be pulled back out at any time using: analysis_engine.extract.extract

Dataset Fetch API

analysis_engine.fetch.fetch(ticker=None, tickers=None, fetch_mode=None, iex_datasets=None, redis_enabled=True, redis_address=None, redis_db=None, redis_password=None, redis_expire=None, s3_enabled=True, s3_address=None, s3_bucket=None, s3_access_key=None, s3_secret_key=None, s3_region_name=None, s3_secure=False, celery_disabled=True, broker_url=None, result_backend=None, label=None, verbose=False)[source]

Fetch all supported datasets for a stock ticker or a list of tickers and return a dictionary. Once run, the datasets will all be cached in Redis and archived in Minio (S3) by default.

Python example:

from analysis_engine.fetch import fetch
d = fetch(ticker='NFLX')
for k in d['NFLX']:
    print(f'dataset key: {k}')

By default, it synchronously automates:

  • fetching all datasets
  • caching all datasets in Redis
  • archiving all datasets in Minio (S3)
  • returning all datasets in a single dictionary

This was created to reduce the amount of typing in Jupyter notebooks. It can also be set up for use with a distributed engine, using the optional arguments below depending on your connectivity requirements.


Please ensure Redis and Minio are running before trying to fetch tickers.

Stock tickers to fetch

  • ticker – single stock ticker/symbol/ETF to fetch
  • tickers – optional - list of tickers to fetch

(Optional) Data sources, datafeeds and datasets to gather

  • fetch_mode – data sources to fetch from - default is all (both IEX and Yahoo), iex for only IEX, yahoo for only Yahoo
  • iex_datasets – list of strings for gathering specific IEX datasets which are set as consts: analysis_engine.iex.consts.FETCH_*.
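As an illustration only (this is not the library's internal code), the fetch_mode toggle can be thought of as selecting which data feeds run; the helper name below is hypothetical, and only the documented values all, iex, and yahoo come from this reference:

```python
# Hypothetical sketch (not analysis_engine internals) of how the
# documented fetch_mode values map onto the data feeds that run.
def feeds_for_mode(fetch_mode='all'):
    """Return the list of data feeds implied by fetch_mode."""
    mode = (fetch_mode or 'all').lower()
    if mode == 'iex':
        return ['iex']      # only IEX datasets
    if mode == 'yahoo':
        return ['yahoo']    # only Yahoo datasets
    return ['iex', 'yahoo']  # default 'all' pulls from both sources
```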

(Optional) Redis connectivity arguments

  • redis_enabled – bool - toggle for auto-caching all datasets in Redis (default is True)
  • redis_address – Redis connection string format: host:port (default is localhost:6379)
  • redis_db – Redis db to use (default is 0)
  • redis_password – optional - Redis password (default is None)
  • redis_expire – optional - Redis expire value (default is None)
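To make the host:port format concrete, here is an illustrative helper (not part of analysis_engine) showing how a redis_address string and redis_db value map onto typical Redis client keyword arguments:

```python
# Illustrative only - splits the documented 'host:port' connection
# string format into the pieces a Redis client would expect.
def parse_redis_address(redis_address='localhost:6379', redis_db=0):
    host, port = redis_address.split(':')
    return {'host': host, 'port': int(port), 'db': redis_db}
```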

(Optional) Minio (S3) connectivity arguments

  • s3_enabled – bool - toggle for auto-archiving on Minio (S3) (default is True)
  • s3_address – Minio S3 connection string format: host:port (default is localhost:9000)
  • s3_bucket – S3 Bucket for storing the artifacts (default is dev) which should be viewable on a browser: http://localhost:9000/minio/dev/
  • s3_access_key – S3 Access key (default is trexaccesskey)
  • s3_secret_key – S3 Secret key (default is trex123321)
  • s3_region_name – S3 region name (default is us-east-1)
  • s3_secure – transmit using TLS encryption (default is False)

(Optional) Celery worker broker connectivity arguments

  • celery_disabled – bool - toggle synchronous mode or publish to an engine connected to the Celery broker and backend (default is True - synchronous mode without an engine or need for a broker or backend for Celery)
  • broker_url – Celery broker url (default is redis://)
  • result_backend – Celery result backend url (default is redis://)
  • label – tracking log label
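For a distributed run, the keyword arguments might be assembled as below and then passed with fetch(**distributed_kwargs). This is a hedged sketch: the broker and backend URLs (including the Redis db numbers) are placeholders for your own Celery setup, not values from this reference:

```python
# Hypothetical kwargs for publishing fetch work to an engine
# connected to a Celery broker and backend, instead of running
# synchronously. The redis:// URLs are placeholders - point them
# at your own broker/backend.
distributed_kwargs = {
    'ticker': 'NFLX',
    'celery_disabled': False,  # False = publish to the engine
    'broker_url': 'redis://localhost:6379/11',
    'result_backend': 'redis://localhost:6379/12',
    'label': 'fetch-nflx',     # tracking log label
}
```

With these in hand, the call would be fetch(**distributed_kwargs).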

(Optional) Debugging

  • verbose – bool - show fetch warnings and other debug logging (default is False)

Supported environment variables

export REDIS_ADDRESS="localhost:6379"
export REDIS_DB="0"
export S3_ADDRESS="localhost:9000"
export S3_BUCKET="dev"
export AWS_ACCESS_KEY_ID="trexaccesskey"
export AWS_SECRET_ACCESS_KEY="trex123321"
export AWS_DEFAULT_REGION="us-east-1"
export S3_SECURE="0"
export WORKER_BROKER_URL="redis://"
export WORKER_BACKEND_URL="redis://"
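As an illustrative mapping (not library code), the environment variables above line up with fetch() arguments roughly as follows; the fallback values mirror the documented defaults:

```python
import os

# Illustrative only - reads the supported environment variables
# into the matching fetch() keyword arguments, falling back to
# the defaults documented above when a variable is unset.
fetch_kwargs = {
    'redis_address': os.getenv('REDIS_ADDRESS', 'localhost:6379'),
    'redis_db': int(os.getenv('REDIS_DB', '0')),
    's3_address': os.getenv('S3_ADDRESS', 'localhost:9000'),
    's3_bucket': os.getenv('S3_BUCKET', 'dev'),
    's3_access_key': os.getenv('AWS_ACCESS_KEY_ID', 'trexaccesskey'),
    's3_secret_key': os.getenv('AWS_SECRET_ACCESS_KEY', 'trex123321'),
    's3_region_name': os.getenv('AWS_DEFAULT_REGION', 'us-east-1'),
    's3_secure': os.getenv('S3_SECURE', '0') == '1',
    'broker_url': os.getenv('WORKER_BROKER_URL', 'redis://'),
    'result_backend': os.getenv('WORKER_BACKEND_URL', 'redis://'),
}
```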