Yahoo Market Data with yfinance in Python

I recently came across a useful way to get Yahoo market data in Python, and it still works! In the past I have tried a number of methods to get market data, but often the API or free data service would be discontinued and the method would stop working. Obtaining Yahoo market data with yfinance in Python, however, still works and the result can easily be used with Pandas.

Downloading Yahoo Market Data with yfinance

The high-level approach I use here is to download data with the yfinance Python module. You can download market data for a set of tickers into a Pandas DataFrame. Once downloaded, you slice it up and do what you want with it. It’s really that straightforward. The example below shows how to download the data and extract the adjusted closing prices as a Pandas DataFrame.

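from datetime import datetime

import yfinance as yf

# Facebook's ticker changed from FB to META in June 2022
ticker_list = ['AAPL','GOOG','META','AMZN','COP']

start_date = datetime(2020,6,8)
end_date = datetime(2020,12,11)

# auto_adjust=False keeps the separate "Adj Close" column, which
# recent yfinance releases otherwise drop by default
mkt_data = yf.download(ticker_list, start=start_date, end=end_date,
                       auto_adjust=False)
prices = mkt_data["Adj Close"]
print(prices.head())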

The mkt_data variable here will be a Pandas DataFrame with various info such as open price, closing price, adjusted closing price and volume. It will have this for each ticker in the list provided. The index is the set of trading days between the start and end dates provided (i.e. we have daily data).

We slice this up to give us a Pandas DataFrame with just the adjusted closing prices. This is exactly what the second-to-last line of the above code listing does. Below is a screenshot of the output from printing prices.head().

prices.head() output
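
For anyone who wants to see the column layout without hitting the network, here is a small sketch that builds a DataFrame shaped like the one I got back from yf.download for a list of tickers: a two-level column index with the field on the top level and the ticker underneath. The numbers and the fake_mkt_data name are made up for illustration, and the exact layout may differ between yfinance versions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for yf.download output: made-up numbers,
# two-level columns of (field, ticker), one row per business day
dates = pd.bdate_range("2020-06-08", periods=3)
fields = ["Adj Close", "Volume"]
tickers = ["AAPL", "GOOG"]
columns = pd.MultiIndex.from_product([fields, tickers])

fake_mkt_data = pd.DataFrame(
    np.arange(12, dtype=float).reshape(3, 4) + 100.0,
    index=dates,
    columns=columns,
)

# Selecting the top-level field leaves one column per ticker
prices = fake_mkt_data["Adj Close"]
print(prices)

# A single ticker is then an ordinary Series
print(prices["GOOG"])
```

Selecting a top-level label on a two-level column index is what makes the one-line mkt_data["Adj Close"] slice possible.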

Working With the Prices DataFrame

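Now let’s say you want to calculate daily returns (simple percentage returns in the code below) and annualised volatility from the historical data we just downloaded. You can use the standard deviation of the daily returns as the volatility measure, scale it by the square root of the number of trading days in a year, and then package everything up into small functions. The revised listing becomes the one below.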

from datetime import datetime

import pandas as pd
import yfinance as yf

def calculate_returns(close: pd.DataFrame) -> pd.DataFrame:
    # simple daily returns: (P_t - P_{t-1}) / P_{t-1}
    return (close - close.shift(1))/close.shift(1)

def get_annualised_vols(prices: pd.DataFrame,
                        days_per_year: float = 252) -> pd.Series:
    # scale the daily standard deviation by sqrt(trading days per year)
    return calculate_returns(prices).std() * days_per_year ** 0.5

# Facebook's ticker changed from FB to META in June 2022
ticker_list = ['AAPL','GOOG','META','AMZN','COP']

start_date = datetime(2020,6,8)
end_date = datetime(2020,12,11)

# auto_adjust=False keeps the separate "Adj Close" column, which
# recent yfinance releases otherwise drop by default
mkt_data = yf.download(ticker_list, start=start_date, end=end_date,
                       auto_adjust=False)
prices = mkt_data["Adj Close"]

annualised_vols = get_annualised_vols(prices)
print(annualised_vols)
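
Note that calculate_returns above computes simple (arithmetic) returns. If log returns are what you are after, a variant could look like the sketch below. The log_returns and simple_returns names, the XYZ ticker and the toy prices are my own for illustration and are not part of yfinance.

```python
import numpy as np
import pandas as pd

def simple_returns(close: pd.DataFrame) -> pd.DataFrame:
    # same arithmetic as calculate_returns in the listing above
    return close / close.shift(1) - 1.0

def log_returns(close: pd.DataFrame) -> pd.DataFrame:
    # log return: ln(P_t / P_{t-1})
    return np.log(close / close.shift(1))

# Made-up prices purely for illustration
toy_prices = pd.DataFrame({"XYZ": [100.0, 101.0, 99.0, 102.0]})

simple = simple_returns(toy_prices)
logs = log_returns(toy_prices)

# The two are linked by log_return = ln(1 + simple_return)
print(pd.concat([simple, logs], axis=1, keys=["simple", "log"]))

# Annualise the daily standard deviation with the square-root-of-time rule
days_per_year = 252  # assumed trading days per year, as in the listing
print(simple.std() * days_per_year ** 0.5)
print(logs.std() * days_per_year ** 0.5)
```

For small daily moves the two volatility figures should be close, so which one you use is mostly a matter of convention.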

Applying Best Practice and Unit Testing

So far so good. We obtained Yahoo market data with yfinance in Python, and the data arrived in a Pandas DataFrame that we could easily manipulate and work with. But the overall program is far from what I would call perfect, so let’s improve on it a little (without going over the top).

Consider the listing below. Here, we have restructured our code to include a simple unit test on the annualised volatility function. The idea is to test get_annualised_vols by downloading market data with yfinance and checking the result for one ticker. This is essentially where I got to when I was playing around with it.

from datetime import datetime

import pandas as pd
import yfinance as yf
import unittest as ut

def calculate_returns(close: pd.DataFrame) -> pd.DataFrame:
    return (close - close.shift(1))/close.shift(1)

def get_annualised_vols(prices: pd.DataFrame,
                        days_per_year: float = 252) -> pd.Series:
    return calculate_returns(prices).std() * days_per_year ** 0.5

class TestRetsVols(ut.TestCase):

    # setUpClass runs once for the whole class, so the data is
    # downloaded a single time rather than before each test
    @classmethod
    def setUpClass(cls) -> None:
        # Facebook's ticker changed from FB to META in June 2022
        cls._ticker_list = ['AAPL','GOOG','META','AMZN','COP']
        # auto_adjust=False keeps the separate "Adj Close" column
        cls._mkt_data = yf.download(cls._ticker_list,
                                    start=datetime(2020,6,8),
                                    end=datetime(2020,12,11),
                                    auto_adjust=False)
        cls._prices = cls._mkt_data["Adj Close"]
        cls._annualised_vols = get_annualised_vols(cls._prices)

    def __str__(self) -> str:
        ret_str = """--- Setting up unit testing ---
            prices pd.DataFrame: \n {prices_df} \n
            annualised volatilities pd.Series: \n {vols_df} \n"""

        return ret_str.format(prices_df = self._prices.head(),
                              vols_df = self._annualised_vols)

    def test_vols(self) -> None:
        print(self)
        print("--- Testing the vol of GOOG is correct ---")
        # expected value first, then actual, compared to 3 decimal places
        self.assertAlmostEqual(0.3, self._annualised_vols["GOOG"], places=3)

def main() -> None:
    ut.main()
    
if __name__ == "__main__":
    main()
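
One drawback of the test above is that it needs a network connection and depends on whatever Yahoo returns on the day. As a complement, here is a rough sketch of an offline test that feeds get_annualised_vols made-up prices, so the expected volatility can be worked out independently with NumPy. The TestVolsOffline class, the XYZ ticker and the return figures are my own inventions for illustration.

```python
import unittest as ut

import numpy as np
import pandas as pd

def calculate_returns(close: pd.DataFrame) -> pd.DataFrame:
    return (close - close.shift(1)) / close.shift(1)

def get_annualised_vols(prices: pd.DataFrame,
                        days_per_year: float = 252) -> pd.Series:
    return calculate_returns(prices).std() * days_per_year ** 0.5

class TestVolsOffline(ut.TestCase):

    def setUp(self) -> None:
        # Made-up daily returns; prices are rebuilt from them so the
        # expected volatility can be computed independently
        self.daily_rets = np.array([0.01, -0.02, 0.015, -0.005, 0.02])
        growth = np.concatenate([[1.0], 1.0 + self.daily_rets])
        self.prices = pd.DataFrame({"XYZ": 100.0 * np.cumprod(growth)})

    def test_vol_matches_hand_calculation(self) -> None:
        # pandas .std() uses the sample standard deviation (ddof=1)
        expected = np.std(self.daily_rets, ddof=1) * np.sqrt(252)
        vols = get_annualised_vols(self.prices)
        self.assertAlmostEqual(expected, vols["XYZ"], places=10)

    def test_constant_prices_have_zero_vol(self) -> None:
        flat = pd.DataFrame({"XYZ": [50.0] * 10})
        self.assertAlmostEqual(0.0, get_annualised_vols(flat)["XYZ"], places=12)
```

Saved to a file, this can be run with python -m unittest followed by the module name, and it should pass without any download.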

Conclusions

Hopefully, the above technique for obtaining Yahoo market data with yfinance in Python will be useful to you. The method is certainly something I wanted to put down in writing so I can refer to it later.