I'm a computer engineering student in my final year of studies. I am majoring in machine learning and data science (SCIA) at EPITA.
During my internship at enioka Haute Couture, I was the sole intern working on ComDaAn. ComDaAn previously used git repositories to visualize the corresponding community’s activity, size, network and member centrality.
My work* consisted of adding mailing lists and GitLab issues as additional data sources, along with an optional issue response analysis. I also parallelized the time-consuming operations with multiprocessing and integrated LOWESS regressions for the curves in the project. Finally, I reworked the code to follow an object-oriented design, renovated the project’s API to better suit the needs of its target users and added a test suite.
Being the only intern working on the project, I benefited from a fair amount of autonomy regarding project management and decision making.
Technologies used: Python (pandas, numpy, networkx, bokeh, etc.)
*More information on my work on the project can be found in my blog post, Tooling For Community Data Analytics, which describes the state of the project after my changes.
EPITA, or École Pour l’Informatique et les Techniques Avancées, is a French computer engineering school. I joined it in 2018 and am set to graduate in 2021. I’m an active member of multiple associations that promote acceptance, openness and multiculturalism.
Bac+3: ESIB, or École Supérieure d’Ingénieurs de Beyrouth, is an engineering school in Lebanon. I joined it in 2015 for my “classes préparatoires”, which are mandatory pre-engineering classes in the French system that span two years. I also completed a year of computer engineering at ESIB before deciding to move to France and transfer to EPITA.
From kindergarten to High School: Soeurs des Saints-Coeurs Sioufi, or SSCC Sioufi for short, is a French school in Lebanon. It is where I received my primary, complementary and secondary education. I completed a Baccalauréat Scientifique with a specialty in mathematics, and obtained a Mention Bien in the official exams.
ComDaAn is a suite of tools for analyzing the data produced by FOSS communities. It relies mainly on the pandas, numpy, networkx and bokeh libraries.
In a group of five, for our end-of-studies project, we worked with French GP practices to create a solution that clusters French medical reports and extracts the theme of each cluster. Using their data, we were to come up with different approaches to solve the problem and evaluate them; we were to present the practices with either an efficient solution or the conclusion that an unsupervised approach isn’t suited to the data and issue at hand.
To determine the most efficient approach, we used a multitude of NLP tools, including but not limited to:
Technologies: Python, pandas, numpy, nltk, gensim, scikit-learn, etc.
In a group of five, we were to handle the analysis and storage of millions of NYPD parking tickets. We also simulated drones sending alerts, which we processed in real time as a stream and displayed on a website we created.
Technologies: Scala, Spark, Kafka, SQL.
In a group of five, we were to use deep learning approaches to segment White Matter Hyperintensities. We implemented the 2D U-Net model from the paper “Fully Convolutional Network Ensembles for White Matter Hyperintensities Segmentation in MR Images” (ranked second) and evaluated its performance with and without data preprocessing. We also adapted it to a 3D input using convolutions at the beginning of the network to see how it would affect results.
Technologies: Python, TensorFlow/Keras.
For a Text Mining and Natural Language class, we implemented, in a group of three, a spell checker based on a Trie data structure and using the Damerau–Levenshtein distance. The search for the results had to happen under time and memory constraints which meant memory mapping and management had to be handled deftly.
Technologies: C++, Python for validation testing, Memory profiling tools, etc.
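To illustrate the idea behind this project, here is a minimal Python sketch of a Trie search bounded by an edit distance. It is not our C++ implementation: the names `TrieNode`, `build_trie` and `search` are hypothetical, memory mapping is left out entirely, and the distance computed is the restricted Damerau–Levenshtein variant (optimal string alignment), which counts a swap of two adjacent characters as a single edit.

```python
class TrieNode:
    """One node of the dictionary Trie (hypothetical helper for this sketch)."""
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}
        self.is_word = False


def build_trie(words):
    """Insert every word into a fresh Trie and return its root."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def search(root, query, max_dist):
    """Return (word, distance) pairs within max_dist of query, closest first."""
    results = []
    first_row = list(range(len(query) + 1))  # distance from the empty prefix
    for ch, child in root.children.items():
        _walk(child, ch, None, query, first_row, None, ch, max_dist, results)
    return sorted(results, key=lambda pair: (pair[1], pair[0]))


def _walk(node, ch, prev_ch, query, prev_row, prev_prev_row, prefix,
          max_dist, results):
    # Compute one row of the edit-distance table for the prefix ending in ch.
    row = [prev_row[0] + 1]
    for j in range(1, len(query) + 1):
        cost = 0 if query[j - 1] == ch else 1
        best = min(row[j - 1] + 1,          # skip a query character
                   prev_row[j] + 1,         # skip a Trie character
                   prev_row[j - 1] + cost)  # match or substitution
        # Transposition of two adjacent characters.
        if (prev_prev_row is not None and j > 1
                and ch == query[j - 2] and prev_ch == query[j - 1]):
            best = min(best, prev_prev_row[j - 2] + 1)
        row.append(best)

    if node.is_word and row[-1] <= max_dist:
        results.append((prefix, row[-1]))

    # Prune: if every cell exceeds the budget, no longer word can match.
    if min(row) <= max_dist:
        for nxt, child in node.children.items():
            _walk(child, nxt, ch, query, row, prev_row, prefix + nxt,
                  max_dist, results)


trie = build_trie(["hello", "help", "held", "world"])
print(search(trie, "hlelo", 1))  # "hlelo" is "hello" with two letters swapped
```

Sharing prefixes in the Trie means each table row is computed once per prefix rather than once per word, and the pruning step is what keeps the search within a time budget.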
Using the various Cognitive Services on Azure, in a group of five we built a conversational chatbot with built-in Speech-to-Text and Text-to-Speech features. It used the NASA FAQs to build a knowledge base to answer questions about space and the climate. When the bot couldn’t find an answer to a question in its knowledge base, it would search for it on the internet.
Tools: The Azure cloud platform with a focus on Azure Cognitive Services (QnAMaker, Azure Bot Service, Azure Speech Service, Azure Search, etc.)
TC is a project done in a group of four students. It is a full compiler front end performing lexical, syntactic and semantic analysis of the Tiger language (a purely pedagogical programming language). The lexer and parser were written using Flex and Bison, and the AST, binder and type-checker in C++.
Technologies: C++, Flex/Bison.