Mustapha Lebbah edited this page Nov 13, 2020 · 56 revisions

Welcome to the C4E wiki!

It is a Big Data clustering library API that gathers clustering algorithms and quality indices in Scala and Spark/Scala (our team has used Spark/Scala since 2012).

Don't hesitate to ask questions or make recommendations on our Gitter. C4E is also available on SparkPackages.

Contributors team

  • Gaël Beck. C4E maintainer. PhD in Computer Science from the Department (LIPN, CNRS(UMR 7030)) of Sorbonne Paris Nord University

  • Mustapha LEBBAH. PI. Computer Science Department (LIPN, CNRS(UMR 7030)) of Sorbonne Paris Nord University

  • Hanane Azzag. Computer Science Department (LIPN, CNRS(UMR 7030)) of Sorbonne Paris Nord University

  • Anthony Coutant. PhD in Computer Science, Department (LIPN, CNRS(UMR 7030)) of Sorbonne Paris Nord University

  • Florent Forest. PhD student, Computer Science Department (SAFRAN, LIPN, CNRS(UMR 7030)) of the University of Paris 13

  • Etienne Goffinet. PhD student, Computer Science Department (RENAULT, LIPN, CNRS(UMR 7030)) of Sorbonne Paris Nord University

  • Tarn Duong. PhD, Lead Data Scientist

  • Waris Radji. Machine Learning 🤖 & Scala ♦️ Enthusiast - Blogger 📝 - Road to PhD 👨‍🎓 - Engineering Student at ENSEIRB

  • Mohamed Walid ATTAOUI. PhD student, Computer Science Department, ESI and Sorbonne Paris Nord University

  • Dina Faneva ANDRIANTSIORY (AIMS-Sénégal (Institut africain des sciences mathématiques), internship, Master 2).

  • Tugdual Sarazin. PhD, Lead Data Engineer

  • Mohammed Ghesmoune. PhD, Data Scientist

  • M Quan Cao Anh (USTH, internship L3, Master 1).

  • Zaineb Chelly Dagdia. Marie Sklodowska Curie Research Fellow. Collaboration to integrate Rough Set feature selection.

This story began in 2012, when our team decided to implement its algorithms in Scala and Spark (an initiative of Tugdual Sarazin). Over the years this proved to be a good choice, thanks to the rapid pace at which Spark gained popularity. Our objective is to provide an open-source library for testing and comparing a wide range of algorithms, with a focus on clustering, for research purposes as well as industrial ones.
