Welcome to Python Data Science

Python Data Science is an open source, collaborative project aiming to document best practice approaches to data science tasks using Python. At present there are two main classes of resources:

  • A collection of Python Task Views
  • The Jupyter Overview that compares Python functionality against the R and Julia data science frameworks

Python Task Views

Task Views are a collection of documents in Markdown format that provide guidance on which Python packages are relevant to which data science tasks.

Task views aim to outline which packages could be included (or excluded) in a certain project to achieve certain functionality. They are not meant to endorse the "best" package for any given task.

The initial proposed list of Python task views mirrors the CRAN set of corresponding Task Views for the R system. Over time this may develop to reflect more accurately the grouping of tasks in the Python universe.

Python Datascience Taskview List

Jupyter (Python versus R versus Julia) overview

While Task Views are dedicated exclusively to Python data science tools, the Jupyter overview project is inspired by the R ecosystem CRAN views. It offers a side-by-side comparison with the R and Julia packages available for data science, which helps identify important sub-domains where Python may currently lag. The overview is available in two formats: