Samuel Kouatcheu

Certified GCP Data Engineer and Data Analyst
$50/hr
Availability: Hourly ($/hour)
Location: Beltsville, Maryland, United States
Experience: 7 years
Summary:
Certified GCP Data Engineer and Data Analyst with more than 8 years of professional experience in the financial, retail, and healthcare domains. Learns unfamiliar material and techniques quickly, exhibits great attention to detail, executes under aggressive deadlines, and adapts quickly to changing circumstances and systems while linking the organization's strategic vision to everyday work. Passionate about projects and able to complete time-sensitive tasks efficiently to support the growth and success of the organization. A team player who values collaboration and communication, brings a strong work ethic, and works effectively with diverse groups.

Education:
Bachelor of Science, Computer Science and Molecular Biology, May 2018
Florida Agricultural and Mechanical University - Tallahassee, FL

Certifications:
- Google Cloud Platform Data Engineer
- Power BI Data Analyst
- Tableau Desktop

Technical Skills:
- Languages and scripting: Python, R, Stata, Java, SQL, Bash
- Libraries: NumPy, SciPy, Pandas, Scikit-learn, Matplotlib, Seaborn, ggplot2, SQLAlchemy, PySpark, Spark
- Machine learning: PCA, K-Means, Linear Regression, Non-Negative Matrix Factorization, Decision Tree, Logistic Regression, Naïve Bayes, Random Forest
- Statistics: statistical modelling, predictive analysis, hypothesis testing, probability distributions
- Data practices: data extraction, screening, cleaning, exploration, and visualization of structured and unstructured datasets; data modeling
- BI and visualization: Tableau, Power BI, Looker, SAS
- Databases: MS SQL Server 2005/2008/2012, MySQL 5.2/5.7, Oracle 9i/10g/11g, DB2
- Operating systems: Windows, Unix, Linux
- Collaboration: SharePoint 365

Professional Experience:

Dunamis InfoTech, Beltsville, MD (March 2021 - Present)
Data Analyst/BI Developer
- Deploy Tableau dashboards and worksheets in a clustered environment.
- Document the full process of working with Tableau Desktop and evaluating business requirements.
- Interact daily with several different groups to achieve a common deliverable.
- Create Tableau scorecards and dashboards using stacked bars, bar graphs, scatter plots, geographical maps, heat maps, bullet charts, and Gantt charts that present key information for decision making.
- Identify and document limitations in data quality that jeopardize the work of internal and external data analysts.
- Create complex SQL queries and scripts to extract and aggregate data and validate its accuracy.
- Provide high-level expertise in applicable public health disciplines to collect, abstract, code, analyze, and interpret scientific data held in public health information systems and databases.
- Lead the data analytics team with the help of the data scientist.
- Wrote a Python script to download files from the partner system and check that volume counts match between the data file and the control file (see the sketch after this section).
- Connect BigQuery tables to Looker and create visuals.
- Create and optimize LookML models, explores, and dashboards that provide actionable insights and support data-driven decision making.
- Troubleshoot and resolve Looker-related issues, including performance tuning, debugging, and data inconsistencies.
- Develop and implement Looker-based data solutions, including data modelling, exploration, and visualization.
- Design and develop advanced R/Python (NumPy, Pandas, SciPy) programs that transform and harmonize data sets in preparation for modeling.
- Analyze program data using statistical and relational database software.
- Added functionality to the Epic electronic medical record (EHR/EMR) system; supported risk management/patient safety improvement and data analysis.
- Research the contents, limitations, and potential uses of all available data sets relevant to assigned task areas.
- Develop documentation for all analyses, including the storage location of all files (program code, data sets, output); thoroughly document program code, monitor disk storage use, and archive data sets when not in use.
- Develop measures to assess contract performance in assigned task areas and revise them as required by contract modifications.
- Worked with flat files and SQL Server.
- Implemented end-to-end solutions for batch and real-time algorithms, along with the requisite tooling for monitoring, logging, automated testing, performance testing, and A/B testing.
- Created various views in Tableau (tree maps, heat maps, scatter plots); created quick filters, table calculations, calculated parameters, and measures.
- Inspected reporting requirements and created various dashboards.
Environment: ETL, BI, T-SQL, SQL, SAS, MS Excel, Windows.
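Below is a minimal sketch of the volume-count check mentioned above, assuming the partner system delivers a pipe-delimited data file alongside a control file whose first line holds the expected record count; the file names and layout are hypothetical:

    import sys
    import pandas as pd

    DATA_FILE = "partner_feed.dat"     # hypothetical partner data file (with header row)
    CONTROL_FILE = "partner_feed.ctl"  # hypothetical control file; first line = expected count

    # Expected record count, as reported by the partner's control file.
    with open(CONTROL_FILE) as f:
        expected = int(f.readline().strip())

    # Records actually received in the data file.
    actual = len(pd.read_csv(DATA_FILE, sep="|"))

    if actual != expected:
        sys.exit(f"Volume mismatch: control file says {expected}, data file has {actual}")
    print(f"Volume check passed: {actual} records")

In practice the download step (for example, over SFTP) would run before this check; only the reconciliation is shown here.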
Equifax (Contractor), Alpharetta, GA (Aug 2020 - March 2021)
Data Wrangler
- Led and executed activities supporting all aspects of onboarding new data sources.
- Collaborated extensively with Technology to design data ingestion, data models, and automated operational metrics for consistently high-quality data loading.
- Developed strategies, standards, and standard methodologies for data wrangling, data visualization, and data integration, and led their adoption.
- Performed data query and exploration on BigQuery (see the sketch after this section).
- Created LookML models and visuals; optimized LookML run times.
- Ensured the quality, coverage, and accuracy of data in the Equifax Big Data environment by performing analysis to measure data quality.
- Validated data quality content with the respective data stewards by developing reports and tools to monitor and visualize data.
- Developed data preprocessing pipelines using Python (NumPy, Pandas, SciPy), R, and Linux scripts on an on-premises high-performance cluster and on AWS and GCP cloud VMs.
- Designed, built, and deployed a set of Python modeling APIs for customer analytics, integrating multiple machine learning techniques for user behavior prediction.
- Analyzed new sources of data to quantify their quality, uniqueness, coverage, and overlap with Equifax sources.
- Performed analysis and integrated Equifax, customer, and third-party data to solve unique analytical problems.
- Collaborated extensively with Data, Analytics, and Technology leads to ensure seamless consumption of the insights generated from Equifax's new big data analytical platform.
- Kept pace with the rapidly developing Big Data technology landscape and shared knowledge with internal and external customers.
- Prototyped and integrated tools into the Equifax Big Data environment and led their adoption across Data and Analytics.
Environment: AWS, GCP, BigQuery, Kibana, Looker, SQL, T-SQL.
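Below is a minimal sketch of the kind of BigQuery data quality exploration described above, using the google-cloud-bigquery client to profile one column's completeness; the project, dataset, table, and column names are hypothetical:

    from google.cloud import bigquery

    client = bigquery.Client()  # uses the default GCP credentials

    # Profile a single column: total rows, distinct values, and null rate.
    sql = """
        SELECT
          COUNT(*) AS total_rows,
          COUNT(DISTINCT customer_id) AS distinct_ids,
          COUNTIF(customer_id IS NULL) / COUNT(*) AS null_rate
        FROM `my-project.analytics.customers`
    """
    row = list(client.query(sql).result())[0]
    print(f"rows={row.total_rows}, distinct={row.distinct_ids}, null_rate={row.null_rate:.2%}")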
National Cancer Institute (Contractor), Bethesda, MD (Sep 2017 - Aug 2020)
Data Scientist
- Identified and integrated disparate data sources, both internal and external, including raw data from medical researchers, unstructured data from clinical experts, and well-established, publicly available databases.
- Developed and deployed machine learning algorithms, predictive models, and classification methods to advance cancer research and inform clinical decision making.
- Delivered novel, data-driven insights to improve outcomes in the treatment of cancer.
- Identified novel molecules with activity against a cancer-causing protein target.
- Developed a predictive model from clinical features to identify patients at risk of DNA repair diseases.
- Developed ordinary differential equation models of molecular pathways that are perturbed in cancer to identify potential drug targets.
- Used Jira to write epics and user stories with acceptance criteria or use cases; used Confluence for document uploads.
- Collaborated with the product team to assess and revise features through A/B tests (sample size 40M), evaluating whether information transformations succeeded by applying Z-tests and chi-squared tests (see the sketch after this section).
- Devised statistical models, algorithms, and software systems to better serve research needs.
- Researched and compiled statistical data to support cost control and care improvement initiatives.
- Worked closely with molecular biologists during assay development and optimization.
- Analyzed archival data such as birth, death, and disease records for cancer research.
- Monitored clinical trials and experiments to verify adherence to established procedures and the quality of collected data.
- Drew conclusions and made predictions based on data summaries and statistical analyses.
- Calculated sample size requirements for clinical studies.
- Used R and Python (NumPy, Pandas, SciPy) to prepare tables and graphs presenting clinical data and results.
- Developed hierarchical and k-means clustering algorithms to group patients, features, and cells that are similar to each other.
Environment: ER Studio, AWS, MDM, Unix, SharePoint 365, Python, MLlib, SAS, Stata, Regression, Logistic Regression, Hadoop, OLTP, OLAP, HDFS, NLTK, SVM, JSON, PySpark, Spark.
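Below is a minimal sketch of the A/B evaluation described above, applying a two-proportion Z-test and a chi-squared test with SciPy; the conversion counts are invented for illustration:

    import numpy as np
    from scipy import stats

    # Hypothetical conversions and impressions for control (A) and variant (B).
    conv = np.array([5120, 5390])
    n = np.array([100_000, 100_000])

    # Two-proportion Z-test with a pooled standard error.
    p_pool = conv.sum() / n.sum()
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n[0] + 1 / n[1]))
    z = (conv[1] / n[1] - conv[0] / n[0]) / se
    p_z = 2 * stats.norm.sf(abs(z))

    # Chi-squared test on the 2x2 table of converted vs. not converted.
    table = np.array([conv, n - conv]).T
    chi2, p_chi2, _, _ = stats.chi2_contingency(table)

    print(f"Z = {z:.3f} (p = {p_z:.4f}), chi2 = {chi2:.3f} (p = {p_chi2:.4f})")

For a 2x2 table the two tests are closely related; both are shown because the profile names both.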
Office Depot, Tallahassee, Florida (Sep 2015 - Aug 2017)
Jr. Data Scientist
- Involved in defining source-to-target business rules, data mappings, and data definitions.
- Performed data validation and data reconciliation between disparate source and target systems for various projects.
- Identified the customer and account attributes required for MDM implementation from disparate sources and prepared detailed documentation.
- Prepared data visualization reports for management using R.
- Used R machine learning libraries to build and evaluate different models.
- Utilized a broad variety of statistical packages, including R, MLlib, and Python.
- Performed data cleaning in R and filtered input variables using the correlation matrix, step-wise regression, and Random Forest.
- Applied multinomial logistic regression, Random Forest, decision trees, and SVM to classify whether a package would be delivered on time on a new route.
- Provided input and recommendations on technical issues to business and data analysts, BI engineers, and data scientists.
- Segmented customers based on demographics using K-means clustering (see the sketch at the end of this profile).
- Used T-SQL queries to pull data from disparate systems and the data warehouse across different environments.
- Used Tableau extensively for data validation.
- Generated weekly and monthly reports for various business users according to business requirements.
Environment: Python, R, ETL, SharePoint 365, BI, Stata, T-SQL, SQL, machine learning, MS Excel, Windows, PySpark, Spark.

Tallahassee Memorial Healthcare, Tallahassee, Florida (Aug 2014 - Sep 2015)
Data Analyst/Tableau Developer
- Identified and documented limitations in data quality that jeopardize the work of internal and external data analysts.
- Created complex SQL queries and scripts to extract and aggregate data and validate its accuracy.
- Performed daily integration and ETL tasks, extracting, transforming, and loading data to and from different RDBMSs.
- Utilized SSIS for ETL data modeling, data migration, and analysis.
- Gathered business requirements and translated them into clear, concise specifications and queries.
- Prepared high-level analysis reports with Excel and Tableau.
- Provided feedback on data quality, including identification of billing patterns and outliers.
- Developed database triggers, views, stored procedures, and functions.
- Performed ad-hoc reporting and analysis and manipulated complex data on MS SQL Server.
- Wrote complex SQL queries using advanced SQL concepts such as aggregate functions.
- Reviewed duplicate data and data errors to provide appropriate communication within the department and accurate monthly reports.
- Experienced in data mapping and data transformation between source and target data models.
- Involved in the extraction, transformation, and loading of data directly from different source systems such as Excel, Oracle, flat files, and SQL Server.
- Created various views in Tableau (tree maps, heat maps, scatter plots); created quick filters, table calculations, calculated parameters, and measures.
- Inspected reporting requirements and created various dashboards.
Environment: ETL, BI, T-SQL, SQL, MS Excel, Windows.
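Below is a minimal sketch of the K-means demographic segmentation described in the Office Depot section, using scikit-learn; the customer attributes and the choice of k = 3 are illustrative:

    import pandas as pd
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Hypothetical demographic attributes per customer.
    customers = pd.DataFrame({
        "age":            [23, 35, 52, 41, 67, 29, 58, 44],
        "annual_income":  [38_000, 62_000, 91_000, 70_000, 48_000, 55_000, 83_000, 66_000],
        "household_size": [1, 3, 2, 4, 2, 1, 2, 3],
    })

    # Standardize so each attribute contributes comparably to the distance metric.
    X = StandardScaler().fit_transform(customers)

    # Fit K-means; in practice k would be chosen via the elbow method or silhouette scores.
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
    customers["segment"] = kmeans.fit_predict(X)

    # Profile the resulting segments by their average demographics.
    print(customers.groupby("segment").mean())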