PAVITHRA C P
DATA SCIENTIST
I'm a machine learning and deep learning enthusiast, interested in Natural Language Processing and Python and keen to work on cutting-edge technology.
CONTACT
CAREER OBJECTIVE
To stay competitive in my field and work in a learning-oriented environment
that develops my professional as well as personal skills.
--
Trivandrum, Kerala
@pavithra
pavithra-c-p
May 2019 - Present
Mirafra Technologies Pvt. Ltd, Bangalore
Role: Data Scientist
SKILLS
Programming
Python
C
C++
MySQL
Postgres
LaTeX
Operating Systems
Linux
Windows
Software & Tools
Data handling/analysis
(e.g. NumPy, SciPy, pandas, scikit-learn, NLTK)
Machine Learning
(e.g. classification algorithms, clustering algorithms)
Deep Learning
(e.g. RNN, LSTM, CNN)
Visualisation
(e.g. matplotlib, boxplot, ...)
IDE
(e.g. Anaconda, Jupyter, Spyder)
Languages
English
Malayalam
Tamil
WORK HISTORY
CERTIFICATES
Automated software testing with Selenium
Work involves applying machine learning algorithms and Natural Language
Processing to problems such as machine translation, recommendation systems,
and healthcare.
EDUCATION
2017 - 2019
KTU, Kerala
M.Tech (Computational Linguistics)
CGPA: 8.6
2012 - 2016
Calicut University, Kerala
B.Tech (Computer Science)
CGPA: 8.01
AREAS OF INTEREST
Natural Language Processing
Machine learning
Deep Learning
Data Mining
PROFESSIONAL SUMMARY
• 2+ years of hands-on experience in Machine Learning and Natural
Language Processing.
• 2+ years of hands-on experience with tools such as SciPy, NLTK, pandas, NumPy, and spaCy.
• 2+ years of experience in data analysis and exploratory data analysis (EDA).
• 2+ years of hands-on experience in Python (3.x).
• 2+ years of experience in Deep Learning (RNN, LSTM).
• Beginner-level experience with Spark.
PROJECTS
• Summarization and Root Cause Analysis of Customer Logs using
ML
Domain : Natural Language Processing and Machine Learning
Description : Analysing customer log files to generate a
summary of each large log file, then predicting the root cause and
the detailed root cause of the reported problem using machine learning.
Language : Python 3.8
Algorithm used : T5 summarizer, SVM, ELMo embeddings,
cosine similarity
Platform : Anaconda, Jupyter
Responsibility : Coding, Debugging
• Abnormal heartbeat detection to prevent heart problems.
Domain : Medical
Description : Identifying suitable datasets for ECG analysis
and extracting heartbeat features.
Language : Python 3.6
Platform : Anaconda, Jupyter
Responsibility : Coding, Debugging
• LTV prediction model
Domain : Data creation and Machine Learning
Description : Built a dataset for predicting a user's in-app
purchases (IAP) and trained a multinomial logistic regression
model on it.
Language : Python 3.6
Algorithm used : Multinomial logistic regression
Platform : Anaconda, Jupyter
Responsibility : Coding, Debugging
• Recommendation system using Spark
Domain : Machine Learning
Description : Built recommendation systems on product
and music datasets using Python and Spark.
Language : Python 3.6
Tools Used : Spark, SQL, git actions
Platform : Anaconda, Jupyter
Responsibility : Coding, Debugging
• Word Sense Disambiguation using word embedding and neural network.
Domain : Natural Language Processing
Description : Explored two techniques for disambiguating the
sense of a word in a sentence: one based on word embeddings
and the other on a neural network.
Language : Python 3.6
Tools Used : NLTK
Platform : Anaconda, Jupyter
Responsibility : Coding, Debugging
• Sense Identification for Language Translation Assistant System.
Client : Storyweaver (Pratham Books)
Domain : Natural Language Processing
Description : Identified the sense of words in a sentence to disambiguate their meaning based on context. Implemented the
Lesk algorithm and explored further techniques.
Language : Python 3.6
Tools Used : NLTK
Responsibility : Coding, Debugging
• Corpus Creation for Language Translation Assistant System.
Client : Storyweaver (Pratham Books)
Domain : Stories (Natural Language Processing)
Description : To support translating stories from English into other languages, identified the unique words and phrases in the stories and
built a corpus of both words and phrases.
– spaCy was used for all data preprocessing.
– Explored the pytextrank algorithm for identifying phrases in sentences.
Language : Python 3.6
Tools Used : NLTK, spaCy, pytextrank, textacy, pandas, Enchant
Platform : Anaconda, Jupyter
Responsibility : Coding, Debugging
• Gender Identification From SMS Text Messages (M.Tech Mini Project)
Language: Python
Description: The project addresses gender identification from short messages. A comparative study was
performed on two methods: the first is based on n-gram features and the second on a feature extraction
technique. Both techniques were evaluated with two classification algorithms: Naive Bayes and Support Vector Machine.
• Suspicious Activity Detection Using Machine Learning. (B.Tech Project)
Language: C++
Description: The project was in the field of image processing. Its goal was to detect suspicious activity taking place inside
an ATM, with the primary focus on detecting the use of guns. A study of classifiers, mainly Haar and LBP classifiers,
was also carried out.
PUBLICATIONS
Level Sensitive Context Aware Translation Suggestions
Pavithra C P, Supriya Mandal
2021
IEEE 18th India Council International Conference (INDICON)
arXiv
An Overview of Relevant Literature on Different Approaches to Word Sense Disambiguation
Pavithra C P, Supriya Mandal
2021
Fourth International Conference on Electrical, Computer and Communication Technologies, IEEE-ICECCT
arXiv
Detection and Verification of Rumour in Social Media : A Survey
Pavithra C P, Shibily Joseph
2019
National Conference on "Innovations in Engineering Technology", IJARCCE
arXiv
Deep Learning Approach For Rumour Detection In Twitter : A Comparative Analysis
Pavithra C P, Shibily Joseph
2019
International Conference on Systems, Energy and Environment
arXiv
DECLARATION
The above-mentioned information is true to the best of my knowledge, and I believe that I shall serve your esteemed organization with sincerity.
Trivandrum
April 4, 2022
Pavithra C P