The goal of the project is to improve the Python multiprocessing backend of joblib, which is used extensively by scikit-learn. The technical challenge is that, to avoid locks, the parallel-computing strategy of the multiprocessing module is to spawn multiple processes, and error management and nested parallelism are difficult in such a setting. The project is still ramping up, but we have already identified and fixed many failure modes of the Python multiprocessing module that occur when a computation crashes in a worker. The fixes will first be integrated into joblib and later contributed upstream to the Python standard library.
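
To make the "computation crashed in workers" failure mode concrete, here is a minimal sketch using only the standard library's concurrent.futures module. It is not joblib's code and not the project's actual fix. The function names crash and run_with_crash_detection are hypothetical, and the choice of the "fork" start method when available is an assumption made so the example behaves the same however the script is launched.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def crash(_):
    # Hypothetical task: simulate a hard crash (such as a segfault in a
    # C extension). The worker exits at once, raising no Python exception.
    os._exit(1)


def run_with_crash_detection():
    """Submit a crashing task and report what the parent process observes."""
    # Assumption: prefer "fork" where the platform offers it; otherwise
    # fall back to the platform's default start method.
    methods = multiprocessing.get_all_start_methods()
    context = multiprocessing.get_context("fork" if "fork" in methods else None)
    with ProcessPoolExecutor(max_workers=2, mp_context=context) as executor:
        future = executor.submit(crash, None)
        try:
            future.result(timeout=30)
        except BrokenProcessPool:
            # The executor noticed that a worker died abruptly and failed
            # the pending future instead of waiting forever.
            return "worker crash detected"
    return "no crash"


if __name__ == "__main__":
    print(run_with_crash_detection())
```

The point of the sketch is the contract seen by the parent process: a worker that dies abruptly should surface as an explicit error rather than leave the caller waiting indefinitely.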