How to get PyPI download statistics

Kiran Koduru • May 20, 2017 • 1 minute to read • Last Updated: May 21, 2017

This is a short post on how to get download statistics for any package on PyPI. There have been efforts in that direction from sites like pypi ranking, but this post describes a better solution.

Google has been generous enough to donate its BigQuery capacity to the Python Software Foundation. You can access the PyPI downloads table through the BigQuery console. I ran a sample query to find out how my personal package, arachne, has been doing on PyPI.

SELECT COUNT(*) as download_count
FROM TABLE_DATE_RANGE(
  [the-psf:pypi.downloads],
  TIMESTAMP("2015-05-01"),
  CURRENT_TIMESTAMP()
)
WHERE file.project="arachne"

The above query returns the number of downloads since 1st May 2015.
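To run the same count for other packages or start dates, the query text can be generated rather than edited by hand. The following is a minimal sketch, not part of any library: `build_download_query` is a hypothetical helper name, and the name check is a rough approximation of valid PyPI project names, added only to avoid pasting arbitrary text into the query.

```python
import re
from datetime import date

# Rough approximation of a valid PyPI project name (an assumption for this
# sketch): letters, digits, '.', '_' and '-', starting and ending alphanumeric.
_NAME_RE = re.compile(r"^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$", re.IGNORECASE)


def build_download_query(project: str, since: date) -> str:
    """Return the legacy-SQL download-count query for one project.

    Hypothetical helper: it only builds the query string shown in this post
    and does not talk to BigQuery itself.
    """
    if not _NAME_RE.match(project):
        raise ValueError(f"unexpected project name: {project!r}")
    return (
        "SELECT COUNT(*) as download_count\n"
        "FROM TABLE_DATE_RANGE(\n"
        "  [the-psf:pypi.downloads],\n"
        f'  TIMESTAMP("{since.isoformat()}"),\n'
        "  CURRENT_TIMESTAMP()\n"
        ")\n"
        f'WHERE file.project="{project}"'
    )


# Reproduces the query from this post; paste the output into the console.
print(build_download_query("arachne", date(2015, 5, 1)))
```

The printed text can be pasted into the BigQuery console as is.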

What might have caught your attention is the FROM clause.

FROM TABLE_DATE_RANGE(
  [the-psf:pypi.downloads],
  TIMESTAMP("2015-05-01"),
  CURRENT_TIMESTAMP()
)

If you aren’t familiar with BigQuery, the short version is that the download data is stored as a separate table for each day. TABLE_DATE_RANGE selects every daily table that falls between the two timestamps, so the query above reads the downloads tables from 2015-05-01 up to the current date.
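To make the per-day layout concrete, here is a small sketch that lists the table names a date range would cover. It assumes the daily tables are named with the prefix followed by a `YYYYMMDD` suffix (for example `downloads20150501`), which is the convention TABLE_DATE_RANGE matches on in legacy SQL; `daily_table_names` is a hypothetical helper for illustration only.

```python
from datetime import date, timedelta


def daily_table_names(prefix: str, start: date, end: date) -> list:
    """List one table name per day from start to end, inclusive.

    Assumes the <prefix><YYYYMMDD> naming convention for daily tables.
    Returns an empty list when end is before start.
    """
    day_count = (end - start).days + 1
    names = []
    for offset in range(day_count):
        day = start + timedelta(days=offset)
        names.append(prefix + day.strftime("%Y%m%d"))
    return names


# The first three daily tables covered by the query in this post.
print(daily_table_names("downloads", date(2015, 5, 1), date(2015, 5, 3)))
# ['downloads20150501', 'downloads20150502', 'downloads20150503']
```

A longer range means more daily tables scanned, so narrowing the start date is the simplest way to reduce the amount of data a query reads.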

Since BigQuery accepts most SQL-like statements, you can play around with your own queries in the BigQuery UI. Also, thanks to Ofek Lev, whose package pypinfo inspired me to write this post.