Fetch - Stock Datasets

Fetch populates Redis caches as a stock data pipeline. Cached data can be pulled back out at any time using: analysis_engine.extract.extract

Dataset Fetch API

analysis_engine.fetch.fetch(ticker=None, tickers=None, fetch_mode=None, iex_datasets=None, redis_enabled=True, redis_address=None, redis_db=None, redis_password=None, redis_expire=None, s3_enabled=True, s3_address=None, s3_bucket=None, s3_access_key=None, s3_secret_key=None, s3_region_name=None, s3_secure=False, celery_disabled=True, broker_url=None, result_backend=None, label=None, verbose=False)[source]

Fetch all supported datasets for a stock ticker or a list of tickers and return them in a dictionary. Once run, all of the datasets are cached in Redis and archived in Minio (S3) by default.

Python example:

from analysis_engine.fetch import fetch

# fetch all supported datasets for NFLX, then list the dataset keys
d = fetch(ticker='NFLX')
print(d)
for k in d['NFLX']:
    print(f'dataset key: {k}')

By default, it synchronously automates:

  • fetching all datasets
  • caching all datasets in Redis
  • archiving all datasets in Minio (S3)
  • returning all datasets in a single dictionary

This was created to reduce the amount of typing in Jupyter notebooks. It can also be set up for use with a distributed engine via the optional arguments, depending on your connectivity requirements.

Note

Please ensure Redis and Minio are running before trying to fetch tickers.

Stock tickers to fetch

Parameters:
  • ticker – single stock ticker/symbol/ETF to fetch
  • tickers – optional - list of tickers to fetch

(Optional) Data sources, datafeeds and datasets to gather

Parameters:
  • fetch_mode – data sources to fetch from - default is all (both IEX and Yahoo); use iex for IEX only or yahoo for Yahoo only.
  • iex_datasets – list of strings for gathering specific IEX datasets which are set as consts: analysis_engine.iex.consts.FETCH_*.
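A minimal sketch of guarding the fetch_mode value before calling fetch(). The helper below is illustrative and not part of analysis_engine; the mode names come from the parameter description above:

```python
# Illustrative helper (not part of analysis_engine): normalize and
# validate a fetch_mode value before passing it to fetch().
VALID_FETCH_MODES = ('all', 'iex', 'yahoo')


def normalize_fetch_mode(fetch_mode=None):
    """Return a lowercase fetch_mode, defaulting to 'all'."""
    mode = (fetch_mode or 'all').lower()
    if mode not in VALID_FETCH_MODES:
        raise ValueError(f'unsupported fetch_mode: {fetch_mode}')
    return mode


# example: d = fetch(ticker='SPY', fetch_mode=normalize_fetch_mode('iex'))
```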

(Optional) Redis connectivity arguments

Parameters:
  • redis_enabled – bool - toggle for auto-caching all datasets in Redis (default is True)
  • redis_address – Redis connection string format: host:port (default is localhost:6379)
  • redis_db – Redis db to use (default is 0)
  • redis_password – optional - Redis password (default is None)
  • redis_expire – optional - Redis expire value (default is None)

(Optional) Minio (S3) connectivity arguments

Parameters:
  • s3_enabled – bool - toggle for auto-archiving on Minio (S3) (default is True)
  • s3_address – Minio S3 connection string format: host:port (default is localhost:9000)
  • s3_bucket – S3 Bucket for storing the artifacts (default is dev) which should be viewable on a browser: http://localhost:9000/minio/dev/
  • s3_access_key – S3 Access key (default is trexaccesskey)
  • s3_secret_key – S3 Secret key (default is trex123321)
  • s3_region_name – S3 region name (default is us-east-1)
  • s3_secure – Transmit using tls encryption (default is False)
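Both redis_address and s3_address use the same host:port connection string format. A minimal sketch of splitting such a string (the helper is hypothetical and shown only to illustrate the format):

```python
def split_address(address, default_port):
    """Split a 'host:port' connection string into (host, port as int).

    Falls back to default_port when no port is present.
    """
    host, _, port = address.partition(':')
    return host, int(port) if port else default_port


# split_address('localhost:9000', 9000) -> ('localhost', 9000)
```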

(Optional) Celery worker broker connectivity arguments

Parameters:
  • celery_disabled – bool - toggle synchronous mode or publish to an engine connected to the Celery broker and backend (default is True - synchronous mode without an engine or need for a broker or backend for Celery)
  • broker_url – Celery broker url (default is redis://0.0.0.0:6379/13)
  • result_backend – Celery backend url (default is redis://0.0.0.0:6379/14)
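The broker and backend defaults are Redis URLs whose trailing path segment selects the Redis db. Celery consumes these URLs directly; the standard-library parse below only illustrates the format:

```python
from urllib.parse import urlparse

# Illustrative only - Celery consumes these URLs as-is.
broker = urlparse('redis://0.0.0.0:6379/13')
host = broker.hostname                    # '0.0.0.0'
port = broker.port                        # 6379
redis_db = int(broker.path.lstrip('/'))   # 13
```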
  • label – tracking log label

(Optional) Debugging

Parameters:
  • verbose – bool - show fetch warnings and other debug logging (default is False)

Supported environment variables

export REDIS_ADDRESS="localhost:6379"
export REDIS_DB="0"
export S3_ADDRESS="localhost:9000"
export S3_BUCKET="dev"
export AWS_ACCESS_KEY_ID="trexaccesskey"
export AWS_SECRET_ACCESS_KEY="trex123321"
export AWS_DEFAULT_REGION="us-east-1"
export S3_SECURE="0"
export WORKER_BROKER_URL="redis://0.0.0.0:6379/13"
export WORKER_BACKEND_URL="redis://0.0.0.0:6379/14"
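A hedged sketch of mapping these environment variables onto the corresponding fetch() keyword arguments. The mapping mirrors the parameter list above; the helper itself is illustrative, not part of analysis_engine:

```python
import os


def build_fetch_kwargs(ticker):
    """Build fetch() keyword arguments from the environment,
    using the documented defaults when a variable is unset."""
    return {
        'ticker': ticker,
        'redis_address': os.getenv('REDIS_ADDRESS', 'localhost:6379'),
        'redis_db': int(os.getenv('REDIS_DB', '0')),
        's3_address': os.getenv('S3_ADDRESS', 'localhost:9000'),
        's3_bucket': os.getenv('S3_BUCKET', 'dev'),
        's3_access_key': os.getenv('AWS_ACCESS_KEY_ID', 'trexaccesskey'),
        's3_secret_key': os.getenv('AWS_SECRET_ACCESS_KEY', 'trex123321'),
        's3_region_name': os.getenv('AWS_DEFAULT_REGION', 'us-east-1'),
        's3_secure': os.getenv('S3_SECURE', '0') == '1',
        'broker_url': os.getenv(
            'WORKER_BROKER_URL', 'redis://0.0.0.0:6379/13'),
        'result_backend': os.getenv(
            'WORKER_BACKEND_URL', 'redis://0.0.0.0:6379/14'),
    }


# example: d = fetch(**build_fetch_kwargs('NFLX'))
```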