Jupyter
Overview-of-the-Julia-Python-R-Universe
Contribute to the side-by-side overview of the three leading open source data science ecosystems of today! Overview of the Julia-Python-R Universe is a project to develop a side-by-side review of the main open source ecosystems supporting the Data Science domain: Julia, Python and R, a trio sometimes abbreviated as "Jupyter".
Motivation
In recent years, open source software targeting Data Science has found increasing adoption in diverse applications. The Overview of the Julia-Python-R Universe project is a side-by-side comparison of a wide range of aspects of the Python, Julia and R language ecosystems.
The comparison of the three ecosystems aims:
- To be useful for people who are somewhat familiar with programming and want to inspect the options and use the most appropriate tool
- To promote interoperability, cross-validation and overall best practices
- To be as factual as possible, without drifting into judgement or opinion
- To cover use cases relevant for the implementation of quantitative risk models
The comparison does not aim:
- To be a detailed / comprehensive catalog of all available libraries (which number in the many thousands!)
- To cover use cases far removed from quantitative risk models
- To be totally exhaustive (e.g., to identify all the possible computer systems one can run a Python interpreter on, or count all the possible ways one can perform linear regression in R)
Result
The Overview of the Julia-Python-R Universe article is a side-by-side comparison of a wide range of aspects of the Python, Julia and R language ecosystems (sometimes affectionately named "jupyter"). At a high level, the comparison of the three ecosystems aims:
- To be useful for people who are somewhat familiar with programming and want to inspect the options and use the most appropriate tool
- To promote interoperability, cross-validation and overall best practices across these ecosystems
- To be as factual as possible, without drifting into judgement or opinion
How to use this repo
This repo facilitates, and properly documents, contributions to and discussions of the three-way overview of the Julia-Python-R universe, sometimes affectionately known as the "Jupyter" trio.
Individual Packages / Libraries / Frameworks
If you want to add, remove or change something at the package level:
- Raise an issue
- Indicate the relevant category / subcategory
- Provide the link to the package repo
Overall Categorization / Segmentation
We also welcome suggestions on the overall structure (the different categories / subcategories).
Disclaimers
The comparison does not provide any assessment of which system is "better". The proper way to use the comparison is to start with one's objectives, knowledge level and use case.
The comparison attempted here is not entirely appropriate, as the three systems have quite different origins and architectural design choices. For example, strictly speaking R is not a general programming language. R is a system for statistical computation and graphics. It consists of a sufficiently general language plus a run-time environment with graphics, a debugger, access to certain system functions, and the ability to run programs stored in script files. Despite this disclaimer, a comparison is justified because in a very large domain of applications and use cases the three frameworks can be used interchangeably (or nearly so).
Structure
The comparison data are provided in tabular format in several distinct tables. Each table documents a relevant language or ecosystem subdomain. The number and focus areas of the different tables are somewhat arbitrary and may expand in the future. The order runs roughly from more generic aspects towards more specialized / advanced areas, concluding with interoperability.
Each table entry (row) highlights key functionality within the subdomain. The language columns point to information or packages and (where applicable) there is commentary. Reference links are included when useful.
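For contributors who prefer to think about the row layout in code, the sketch below models one table row as described above. It is only an illustration under stated assumptions: the names `OverviewRow`, `missing_languages` and `to_markdown`, the field names, and the Markdown output format are hypothetical and are not taken from this repo, and the sample row is an example rather than an entry copied from the overview.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# The three ecosystems compared in the overview.
LANGUAGES = ("Julia", "Python", "R")


@dataclass
class OverviewRow:
    """Hypothetical model of one table row (not the repo's actual schema)."""

    functionality: str  # key functionality within the subdomain
    entries: Dict[str, str] = field(default_factory=dict)  # language -> package / info
    commentary: Optional[str] = None  # optional commentary
    references: List[str] = field(default_factory=list)  # optional reference links

    def missing_languages(self) -> List[str]:
        """Return the languages that have no entry yet in this row."""
        return [lang for lang in LANGUAGES if not self.entries.get(lang)]


def to_markdown(rows: List[OverviewRow]) -> str:
    """Render rows as a Markdown table with one column per language."""
    header = "| Functionality | " + " | ".join(LANGUAGES) + " | Commentary |"
    divider = "|" + " --- |" * (len(LANGUAGES) + 2)
    lines = [header, divider]
    for row in rows:
        cells = [row.functionality]
        cells += [row.entries.get(lang, "") for lang in LANGUAGES]
        cells.append(row.commentary or "")
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative row only; the R column is left empty on purpose.
    row = OverviewRow(
        functionality="Data frames",
        entries={"Julia": "DataFrames.jl", "Python": "pandas"},
    )
    print(row.missing_languages())
    print(to_markdown([row]))
```

A helper such as `missing_languages` could make it easier to spot rows where one ecosystem still lacks an entry before raising an issue.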
At the bottom of some tables there is a row labeled Package Review. This row has a collection of links to the CRAN Task Views, which aim to summarize the large number of R packages available for some data science tasks. There are also links to Python Task Views (this content is still WIP; contributors are welcome, see below).
Getting Involved
You can provide simple and anonymous feedback on the wiki version of the overview using the feedback button at the bottom of the page.
People interested in developing the Python Task Views can do so via the GitLab repo.