Fellipe Silva

Personalized Medicine
Redefining Cancer Treatment

This project applies personalized medicine to the identification and treatment of cancerous mutations in humans.
It is based on a knowledge base annotated by expert pathologists and provided as a dataset by Memorial Sloan Kettering Cancer Center (MSKCC).

The project covers Natural Language Processing (NLP) with R packages; pre-processing of texts, phrases, words and the most relevant terms; quantitative analysis, with graphs illustrating each stage of the process; selection of the best sparsity level for a term-document matrix; and the construction and evaluation of Machine Learning models such as Artificial Neural Networks, Decision Tree, K-Nearest Neighbors, Random Forest and Extreme Gradient Boosting.

The first stage of the project is a detailed analysis of the clinical evidence: which words are most frequent, how terms correlate with one another, and how phrases interconnect, among other analyses that yielded valuable insights for the project.
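
As an illustration of this kind of exploratory analysis, the sketch below counts word frequencies and measures how two terms correlate across documents using the Pearson coefficient over their per-document counts. It is plain Python, not the project's R code; the sentences and helper names are hypothetical, and the project may have measured term association differently.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    # Pearson correlation between two equal-length count vectors.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / sqrt(vx * vy)


# Hypothetical sentences, not taken from the MSKCC dataset.
docs = [
    "kinase mutation activates kinase signaling",
    "kinase inhibitor blocks signaling",
    "truncating mutation causes loss",
    "loss of function mutation",
]

per_doc = [Counter(d.lower().split()) for d in docs]

# Overall word frequencies across the whole corpus.
total = Counter()
for c in per_doc:
    total.update(c)


def term_vector(term):
    # Counts of one term in each document, in document order.
    return [c[term] for c in per_doc]


r_related = pearson(term_vector("kinase"), term_vector("signaling"))
r_opposed = pearson(term_vector("signaling"), term_vector("loss"))
print(total.most_common(3), round(r_related, 2), round(r_opposed, 2))
```

Terms that tend to appear in the same documents score close to 1, and terms that never co-occur score close to -1.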

The second stage focused on building predictive models that, once trained and tested, can predict which class a new, unseen piece of clinical evidence belongs to. This step also added value to the project, as it made it possible to exploit the full capacity of the Machine Learning models by extracting from them the most relevant words for predicting each class of cancer-triggering mutation.
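
The idea of assigning new clinical evidence to a class can be sketched with K-Nearest Neighbors, one of the models listed above. The example below is plain Python rather than the project's R code, and it assumes bag-of-words vectors compared by cosine similarity; the training sentences, the class labels "gain" and "loss", and the choice of k are hypothetical and do not reflect the project's data or settings.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    # Bag-of-words vector: term -> count.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_predict(train, text, k=3):
    # Rank training texts by similarity to the query and take a majority vote among the top k.
    query = vectorize(text)
    ranked = sorted(train, key=lambda item: cosine(vectorize(item[0]), query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled evidence, not taken from the MSKCC dataset.
train = [
    ("kinase mutation activates signaling", "gain"),
    ("activating mutation increases kinase activity", "gain"),
    ("truncating mutation causes loss of function", "loss"),
    ("deletion causes loss of protein function", "loss"),
]

predicted = knn_predict(train, "mutation increases kinase signaling", k=3)
print(predicted)
```

Tree-based models such as Random Forest and Extreme Gradient Boosting additionally report feature importances, which is one way the most relevant words per class can be obtained; this sketch does not cover that part.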
