Incorporating multivariate adaptive regression splines into Scikit-Learn

The purpose of the project was to incorporate MARS (Multivariate Adaptive Regression Splines) into scikit-learn by adapting and improving an existing implementation named py-earth. MARS is a non-parametric regression algorithm, and py-earth is a Python implementation of it. During this mission doctorale, a number of contributions were made to the existing py-earth code. Part of the work was to clean up the code, adapt it to the scikit-learn coding guidelines (http://scikit-learn.org/stable/developers/), enhance the documentation, and add more unit tests. New features were also added. py-earth lacked support for multiple outputs, so the first contribution was to add it, with the possibility of weighting each output variable. The second contribution was to add three different ways of estimating input variable importances, in order to assess the predictive power of each input variable. The final contribution was to implement FastMARS, a way to speed up the original MARS algorithm.
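To make the model form concrete, here is a minimal sketch of how a fitted MARS model produces a prediction: an intercept plus a weighted sum of products of hinge functions. It is an illustration only and does not reproduce py-earth's code or API; the names hinge, mars_predict, and example_terms, as well as the coefficients and knots, are hypothetical.

```python
def hinge(x, knot, sign=+1):
    """Hinge basis function.

    sign=+1 gives max(0, x - knot); sign=-1 gives max(0, knot - x).
    """
    return max(0.0, sign * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a MARS-style model at a single sample x.

    terms is a list of (coefficient, factors) pairs, where factors is a
    list of (feature_index, knot, sign) triples. Each term contributes
    coefficient * product of its hinge factors.
    """
    total = intercept
    for coefficient, factors in terms:
        product = 1.0
        for feature_index, knot, sign in factors:
            product *= hinge(x[feature_index], knot, sign)
        total += coefficient * product
    return total


# Hypothetical model with two additive terms and one interaction term.
example_terms = [
    (2.0, [(0, 1.0, +1)]),                 # 2.0 * max(0, x0 - 1)
    (-1.0, [(0, 1.0, -1)]),                # -1.0 * max(0, 1 - x0)
    (0.5, [(0, 1.0, +1), (1, 0.0, +1)]),   # 0.5 * max(0, x0 - 1) * max(0, x1)
]

prediction = mars_predict([3.0, 2.0], intercept=1.0, terms=example_terms)
```

In the real algorithm, the knots and the terms are selected from the data by a forward pass followed by a pruning pass, and the coefficients are fitted by least squares; the sketch above only covers evaluation of an already fitted model.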

The first contribution of this mission doctorale was to help bring py-earth into scikit-learn contrib, a new project run by core developers of scikit-learn to host scikit-learn compatible projects like py-earth. To that end, a number of modifications were made to the code so that it respects the guidelines.

Most contributions made so far have been about improving code quality: fixing bugs, improving the documentation, and making installation easier (packaging).
