The Python Data Science Taxonomy

Python Data Science (PDS) tasks are classified and organized in various ways.

Level 1 Groups

An important overall scheme distinguishes two groups: a horizontal, cross-cutting set of tasks that applies across many domains, and a vertical, sector- or domain-oriented set of tasks that mostly make sense within a narrow application context.

The horizontal/vertical separation is not rigid: wider availability of data science opportunities, and accelerated education around them, mean that specialized data science methods increasingly diffuse across verticals.

Level 2 Groups - Cross-Cutting Pillar

In turn, the Level 1 Horizontal group of tasks can be split into sub-categories. The first three concern the nature and type of insights offered by the various tools; the fourth covers the supporting work around them:

* Statistics and Probability tools enable a range of core data science analyses (from basic regression to far more complex studies).
* Models, Algorithms and Simulation tools are either more elaborate versions of the previous group, in the complexity of the algorithms or of the data structures involved (e.g., Machine Learning, Graph Systems), or go beyond strictly data-oriented analysis to involve optimization, simulation, etc.
* Data-Adapted tools offer specialized functionality tuned to the nature of the data, e.g., graph or geospatial data.
* The Workflow and Support group involves aspects of data science work that precede, augment or follow the algorithm-oriented toolkit of the previous groups. This includes database APIs, data quality, visualization, documentation and deployment. While the development of these tools will typically be outside the scope of data science, their effective use is not.
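As a rough illustration, the two levels described so far could be held in a nested mapping. This is only a sketch: the group names come from the text above, while the dictionary layout, the example entries chosen for each group, and the helper `find_groups` are hypothetical and not part of the taxonomy itself.

```python
# Hypothetical sketch: Level 1 -> Level 2 -> example topics.
# Group names follow the text; the layout and example lists are assumptions.
TAXONOMY = {
    "Horizontal": {
        "Statistics and Probability": ["regression"],
        "Models, Algorithms and Simulation": [
            "machine learning",
            "graph systems",
            "optimization",
            "simulation",
        ],
        "Data-Adapted": ["graph data", "geospatial data"],
        "Workflow and Support": [
            "database APIs",
            "data quality",
            "visualization",
            "documentation",
            "deployment",
        ],
    },
    # Sector groups are not enumerated in this section, so left empty here.
    "Vertical": {},
}


def find_groups(keyword):
    """Return (level1, level2) pairs whose example topics mention keyword."""
    keyword = keyword.lower()
    return [
        (level1, level2)
        for level1, groups in TAXONOMY.items()
        for level2, examples in groups.items()
        if any(keyword in example.lower() for example in examples)
    ]
```

A lookup such as `find_groups("graph")` matches both the Models, Algorithms and Simulation group and the Data-Adapted group, which reflects the point that the boundaries between groups are not rigid.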

Level 2 Groups - Sectoral Pillar

The Sectoral Pillar is naturally split into broad sectors that also define the use context.