Chidiebere Onwuchekwa

$50/hr
Data Science/Machine Learning Engineering/Generative AI
Reply rate:
-
Availability:
Hourly ($/hour)
Location:
Houston, Tx, United States
Experience:
12 years
Chidiebere Onwuchekwa
MACHINE LEARNING ENGINEER
Phone: - Email: -

EDUCATION
● Master's in Business Analytics, Merrimack College, North Andover, Massachusetts, USA
● Bachelor's in Petroleum Technology, University of Port Harcourt, Rivers State, Nigeria
● Certificate in Petroleum Data Technology, Lone Star College, Cypress, Texas, USA

SUMMARY
● 9 years in Data Science/Machine Learning.
● 12 years in Information Technology.
● Expertise in machine learning, deep learning, convolutional neural networks, and generative AI.
● Projects involving NLP, NLU, LLMs, text mining, predictive analytics, and artificial intelligence techniques on structured and unstructured big data.
● Extensive exposure to the analytics project life cycle under CRISP-DM (Cross-Industry Standard Process for Data Mining) and to web applications developed using Scrum methodologies.
● Uses machine learning to advance systems such as product recommendations, search ranking and relevance, image attribution, demand routing, fit recommendations, inventory forecasting, and threat modeling.
● Works across business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
● Experienced in the practical application of data science to business problems to produce actionable results.
● Experience in Natural Language Processing (NLP), Machine Learning, and Artificial Intelligence.
● Extensive experience building chatbots using ChatGPT, LangChain, OpenAI, and LlamaIndex frameworks.
● Experience with AWS cloud computing, Spark (especially on AWS EMR), Kibana, Node.js, Tableau, and Looker.
● Able to incorporate visual analytics dashboards.
● Experience with a variety of NLP methods for information extraction, topic modeling, parsing, and relationship extraction.
● Knowledge of Apache Spark and of developing data processing and analysis algorithms in Python.
● Programming in Java and Python, and writing SQL queries.
● Use of machine learning libraries and frameworks such as NumPy, SciPy, Pandas, Theano, Caffe, scikit-learn, Matplotlib, Seaborn, TensorFlow, Keras, NLTK, PyTorch, Gensim, urllib, and Beautiful Soup.

TECHNICAL SKILLS
COMMUNICATION SKILLS: verbal, written, presentations
LEADERSHIP: supports project goals and business use cases, and mentors the team
QUALITY: continuous improvement in project processes, workflows, and automation; ongoing learning and achievements
CLOUD: Analytics in cloud-based platforms (AWS, MS Azure, Google Cloud), Docker, Kubernetes, CI/CD
ANALYTICS: Data Analysis, Data Mining, Statistical Analysis, Multivariate Analysis, Stochastic Optimization, Linear Regression, ANOVA, Hypothesis Testing, Forecasting, ARIMA, Sentiment Analysis, Prompt Engineering, ChatGPT, Predictive Analysis, Pattern Recognition, Classification, Behavioral Modeling, LLaMA, Prompt Tuning, Few-shot Learning, Retrieval-Augmented Generation (RAG), MCP, Chain-of-Thought Reasoning
PROGRAMMING LANGUAGES: Python, R, SQL, Scala, Java, MATLAB, C, SAS, F#, ArcGIS Pro, MapInfo
LIBRARIES: NumPy, SciPy, Pandas, Theano, Caffe, scikit-learn, Matplotlib, Seaborn, TensorFlow, Keras, NLTK, PyTorch, Gensim, urllib, BeautifulSoup4, MXNet, Deeplearning4j, EJML, dplyr, ggplot2, reshape2, tidyr, purrr, readr
DEVELOPMENT: Git, GitHub, Bitbucket, SVN, Mercurial, PyCharm, Sublime, JIRA, TFS, Trello, Linux, Unix
DATA EXTRACTION AND MANIPULATION: Hadoop HDFS, Hortonworks Hadoop, MapR, Snowflake, Cloudera Hadoop, Cloudera Impala, Google Cloud Platform, MS Azure Cloud, both SQL and NoSQL, Data Warehouse, Data Lake, SQL, HiveQL, AWS (Redshift, Kinesis, EMR, EC2)
MACHINE LEARNING: Supervised Machine Learning Algorithms (Linear Regression, Logistic Regression, Support Vector Machines, Decision Trees and Random Forests, Naive Bayes Classifiers, K Nearest Neighbors), Unsupervised Machine Learning Algorithms (K-Means Clustering, Gaussian Mixtures, Hidden Markov Models, D), Imbalanced Learning (SMOTE, ADASYN, NearMiss), Deep Learning Artificial Neural Networks, Machine Perception
APPLICATIONS: Recommender Systems, Predictive Maintenance, Forecasting, Fraud Prevention and Detection, Targeting Systems, Ranking Systems, Deep Learning, Strategic Planning, Digital Intelligence, Microservices and Containerization (Docker, Kubernetes)

WORK EXPERIENCE

COGNIZANT TECHNOLOGY SOLUTIONS USA Nov 2024 – June 2025
Machine Learning Engineer | Teaneck, NJ
Worked as a contract Machine Learning Engineer developing data-driven solutions to mitigate bank fraud and improve operational efficiency for Optum Bank (UnitedHealth Group). Detected fraudulent bank account activity with supervised and unsupervised learning, using models such as XGBoost and neural networks together with deep learning, generative AI, predictive analytics, anomaly detection, and big data processing. Built a vector-based retrieval system using the LangChain framework and ChromaDB to enable real-time semantic search across large datasets, integrated with OpenAI embeddings for document relevance scoring of bank transactions. Developed real-time detection strategies, leveraged behavioral analytics, and integrated machine learning models to reduce fraud losses.
● Designed a multi-agent system using LangChain and GPT-4 for hybrid retrieval (vector + keyword) to pull relevant compliance rules and behavioral patterns.
● Led development of bank fraud detection models leveraging LLMs deployed via Azure OpenAI to extract semantic patterns from transaction notes and flag suspicious behaviors.
● Conducted exploratory data analysis (EDA) in Databricks, transformed raw data into actionable insights using Power BI, DAX, and advanced visualization techniques, and presented those insights to help mitigate bank fraud trends.
● Built, trained, and fine-tuned large generative models (LLMs, GPT, BERT), and used prompt engineering to guide GPT-4 for various applications within healthcare and finance.
● Developed AI agents and implemented RAG techniques for data analysis and to enhance generative model capabilities. Worked with LangGraph, LangChain, and similar frameworks for advanced AI applications.
● Analyzed large, complex datasets and integrated relevant data into generative models for improved outcomes and performance. Developed scalable AI solutions using Python, TensorFlow, PyTorch, and Hugging Face.
● Evaluated and optimized the performance of generative models and traditional machine learning models, ensuring accuracy, stability, and efficiency in production environments while keeping false positives low.
● Optimized data pipelines for handling large datasets and improved model efficiency by 25%.
● Designed and deployed machine learning models and fraud detection rules that reduced fraudulent bank transactions by more than 30% across various banking products, and gained behavioral analysis insights from account owners and merchants.
● Conducted root cause analysis and presented findings to senior leadership.
● Implemented orchestration layers to unify fraud controls across systems, and partnered with compliance teams to ensure fraud detection solutions aligned with industry regulations.
● Developed scalable web applications for the model using Python frameworks such as Django and Flask. Automated testing and deployment processes using CI/CD pipelines and Docker. Integrated RESTful APIs and third-party services into existing Python applications.
● Mentored junior data scientists and machine learning engineers by providing technical guidance, code reviews, and project planning support, fostering skill development and collaboration within the team.
Technologies Used: Python, Azure SQL, scikit-learn, TensorFlow, PyTorch, GPT, BERT, OpenAI, AI Agents, Retrieval-Augmented Generation (RAG), LangGraph, LangChain, cloud platforms (AWS, Google Cloud, Azure), Power BI, Databricks, GitHub Copilot, MLflow, ChromaDB, FastAPI, OpenAI API, Vector Search, LLM Integration, FAISS (for benchmarking), Hugging Face Transformers, Prompt Engineering, MCP Servers

TCS for MICROSOFT CORPORATION June 2023 – April 2024
Machine Learning Engineer | Redmond, WA
Worked as a vendor Machine Learning Engineer with a cross-functional MLOps team to design, implement, and deploy machine learning models for financial predictive analytics.
● Analyzed large data sets to develop multiple custom models and algorithms that drive innovative business solutions.
● Interpreted business problems and provided solutions using data analysis, data mining, optimization tools, Power BI dashboards, ChatGPT and OpenAI frameworks, machine learning techniques, and statistics.
● Conducted data pre-processing, feature engineering, and exploratory data analysis to identify patterns and trends in the data.
● Built a LangChain-based agent that retrieves and summarizes legal documents using semantic search and RAG-enhanced LLMs.
● Used VS Code as the primary integrated development environment (IDE) for data science tasks.
● Integrated Azure OpenAI for natural language processing tasks such as sentiment analysis and text classification.
● Implemented CI/CD pipelines using Azure DevOps for seamless deployment of data science solutions.
● Developed and maintained ADF pipelines for data ingestion, transformation, and loading into Azure cloud storage.
● Optimized deep learning architectures by fine-tuning hyperparameters, resulting in a 15% improvement in model accuracy.
● Leveraged Copilot and MS Teams for effective collaboration and communication within the team.
● Employed MLflow to streamline internal workflows and automate data science processes.
● Created and managed Azure Workspaces for secure and collaborative data science projects.
● Used SSMS and LLMS for database management, query optimization, and performance tuning.
● Integrated Prompt Flow into the data science workflow to accelerate model development and experimentation.
Technologies Used: Python, Azure SQL, Azure OpenAI, CI/CD, ADF pipeline, Databricks, LangChain, OpenAI, GitHub Copilot, Teams, Outlook, MLflow, Azure Workspace, SSMS, LLMS, Prompt Flow, SharePoint, DevOps, VS Code, Azure, Anaconda Prompt, Prompt Engineering

TCS February 2022 – May 2023
Machine Learning Engineer | Edison, NJ
Worked on the TCS Velocity project. Worked closely with customers, cross-functional teams, software developers, and business teams in an Agile/Scrum environment to put data models and algorithms into practice. Specialized in developing and fine-tuning large language models, optimizing model performance, and integrating AI workflows on AWS.
● Led LLM fine-tuning on AWS using Hugging Face Transformers on domain-specific datasets, resulting in a 30% improvement in downstream NLP tasks.
● Analyzed large data sets to develop multiple custom models and algorithms that drive innovative business solutions.
● Interpreted business problems and provided solutions using data analysis, data mining, optimization tools, machine learning techniques, and statistics.
● Carried out data processing and statistical techniques such as sampling, estimation, hypothesis testing, time series, correlation, and regression analysis using Python and R.
● Developed estimation models for various bundled product and service offerings to predict and optimize gross margin.
● Worked with business owners and stakeholders to assess risk impact and provided solutions to business owners.
● Partnered with the sales and marketing teams and collaborated with a cross-functional team to frame and answer important data questions.
● Designed and developed predictive models and learning algorithms to track customer lifetime value, lead scoring, retention, attribution, and propensity. Segmented customers by demographics using K-means clustering and DBSCAN.
● Performed data analysis and data profiling using complex SQL queries on various source systems, including SQL Server 2012.
● Created deep learning algorithms using LSTM and RNN.
● Designed and implemented end-to-end systems for data analytics and automation, integrating custom visualization tools using R, Python, Tableau, Power BI, and GCP.
● Analyzed data and existing operational processes in order to develop and recommend actionable solutions.
● Developed a Sales Performance Dashboard using Power BI and DAX.
● Performed advanced qualitative and quantitative analysis of high-volume databases to identify developing trends, patterns, and correlations that could improve overall business performance.
● Developed interactive dashboards and created various ad hoc reports for users in Tableau and QlikView by connecting various data sources.
Technologies Used: Python, R, SQL, Scala, Tableau, QlikView, ggplot2, GCP, AWS, Azure, Databricks, SQL Server, SageMaker, TensorFlow, Power BI

CARDINAL HEALTH November 2021 – January 2022
Data Scientist | Dublin, OH
Worked with the Data Engineering and Opportunity Analysis teams to leverage advanced analytics to extract and transform data, predict demand and supply, and visualize and communicate results for the Supply Chain team and for use across the organization.
● Performed complex custom analytics as needed for supply chain management.
● Designed and developed specific databases for the collection, tracking, and reporting of data.
● Designed, coded, tested, and debugged custom queries using Microsoft T-SQL and SQL Reporting Services in BigQuery.
● Developed quality code that met project and departmental standards.
● Analyzed and processed complex data sets using advanced querying, visualization, and analytics tools.
● Identified, measured, and recommended improvement strategies for KPIs across all business areas.
● Processed billions of database records: extracting, cleansing, transforming, integrating, and loading them into the data warehouse using SSIS packages.
● Supported data analysis services through traditional ROLAP, cube, and operational data store environments, and created SSRS reports for business use.
● Identified methods to promote efficiency within existing processes and create a best-practices environment.
● Performed data manipulation, data preparation, normalization, and predictive modeling. Improved efficiency and accuracy by evaluating models in Python.
● Performed exploratory data analysis in Python, including calculation of descriptive statistics, detection of outliers, assumptions testing, and factor analysis.
● Performed data profiling to assess data quality using SQL across a complex internal database.
● Improved sales and logistics data quality through data cleaning with NumPy, SciPy, and Pandas in Python.
● Communicated complex concepts in easy-to-understand terminology.
Technologies Used: Python, R, GCP, SQL, Tableau, Teradata, Agile, Linear Optimization

BANK OF NY MELLON November 2018 – August 2021
Data Scientist | Florham Park, NJ
Worked with the Identity Management Team within the Information Security Division to develop self-service tools for internal employees. Worked to establish cloud controls for identity governance, assess risks associated with cloud service providers, and support fraud risk assessment and mitigation. Developed and deployed fraud detection rules that reduced false positives by 20%.
● Programmed solutions using Python libraries such as NumPy and Pandas.
● Used machine learning and statistical modeling techniques to develop and evaluate algorithms that improve performance, quality, data management, and accuracy.
● Managed version control setup for the Phantom platform using Git.
● Set up a playbook for events and classification containers.
● Developed an app called risk hub and moved it to production deployment using Django.
● Contributed to the design and prototyping of medium- to high-complexity machine learning systems.
● Was in charge of cleaning and debugging datasets and the code base before applications reached QA.
● Used NLP and RASA to build a proof-of-concept web app demonstrating chatbot usage.
● Used text mining and NLP techniques to find the sentiment about the organization.
● Performed analysis of user profiles and current application entitlements based on user profiles, organization, departments, and groups.
● Used graph-based algorithms to detect anomalies and prevent financial fraud through connected data analysis.
● Built a recommender system to auto-provision applications and platform access for new employees and contractors so they are productive as soon as they are onboarded.
● Ensured a holistic approach to Enterprise Data Governance in projects, addressing the people and process components of Data Quality, Data Stewardship, Data Privacy & Protection, Data Policy & Standards, and Metadata Management.
● Performed data analysis and reporting using MS Excel, Tableau, Power BI, SQL, and Alteryx.
● Built MS Excel applications using VBA to help with process automation and control.
● Evaluated on-prem and cloud controls and presented findings to stakeholders.
● Ran SQL code to assess, clean, validate, and analyze large datasets to support email campaigns.
● Familiar with machine learning modeling using Python and frameworks such as TensorFlow.
● Implemented and cleaned datasets for network accesses based on user profiles.
● Demonstrated core leadership capabilities as a project lead, and led interactions with Sales and Product teams to spread understanding, incorporate feedback, and refine team analytics.
● Used AWS SageMaker to quickly build, train, and deploy machine learning models.
● Built RESTful web services that use JSON.
● Good knowledge of data structures, algorithms, and object-oriented programming.

OMNICARE Inc September 2017 – October 2018
Data Scientist/NLP Engineer | Stafford, TX
Worked with NLP to classify text with data drawn from a big data system. The text categorization involved labeling natural language texts with relevant categories from a predefined set. One goal was to target users by automated classification, which made it possible to create cohorts to improve marketing. The NLP text analysis monitored, tracked, and classified user discussion about products and services in online discussions.
The machine learning classifier was trained to identify whether a cohort was a promoter or a detractor. Overall, the project improved marketing ROI and customer satisfaction.
● Oversaw the entire production cycle to extract and display metadata from various assets, developing a report display that is easy to grasp and draw insights from.
● Performed NLP preprocessing in Python using libraries such as NLTK.
● Collaborated with both the Research and Engineering teams to productionize the application.
● Assisted various teams in bringing prototyped assets into production.
● Applied data mining and optimization techniques in B2B and B2C industries; proficient in Machine Learning, Data/Text Mining, Statistical Analysis, and Predictive Modeling.
● Used MapReduce/PySpark Python modules for machine learning and predictive analytics on AWS.
● Implemented assets and scripts for various projects using R, Java, and Python.
● Built sustainable rapport with senior leaders.
● Developed and maintained a Data Dictionary to create metadata reports for technical and business purposes.
● Built and maintained dashboards and reporting based on the statistical models to identify and track key metrics and risk indicators.
● Kept up to date with the latest NLP methodologies by reading 10 to 15 articles and whitepapers per week.
● Extracted source data from Oracle tables, MS SQL Server, sequential files, and Excel sheets.
● Parsed and manipulated raw, complex data streams to prepare them for loading into an analytical tool.
● Helped define the source-to-target data mappings, business rules, and data definitions.
● Project environment was AWS and Linux.
Technologies Used: Python, R, Java, Kubernetes, Docker, ELK Stack (Elasticsearch, Logstash, Kibana), AWS Comprehend

Randalls Food and Stores August 2016 – July 2017
Data Scientist | Houston, TX
Worked on a sales forecasting project using an artificial neural network developed in PyTorch along with Facebook's Prophet model. Performed data cleaning in Python on a large dataset covering several years of data across different departments in dozens of stores, and produced highly accurate forecasts for each store and department.
● Created a model using Facebook Prophet to produce highly accurate predictions of weekly sales.
● Evaluated model performance on a large dataset (multiple years of daily data for dozens of departments per store and dozens of stores).
● The deployed model produced highly accurate forecasts up to 6 months in advance for every store and department.
● Worked in a Cloudera Hadoop environment using Python, SQL, and Tableau.
● HDFS (Cloudera): pulled data from the Hadoop cluster.
● Worked within the Enterprise Applications team as a Data Scientist.
● Used Python, Pandas, NumPy, and SciPy for exploratory data analysis, data wrangling, and feature engineering.
● Used Tableau and TabPy for visualization of analyses.
● Worked alongside Business Analysts, Data Analysts, and Data Engineers.
● Consulted with various departments within the company, including SIU and Safety.
● Managed and matched claim numbers to fraud cases.
● Cleaned fraud data to be joined with the claims data (~73k observations).
● Researched and assessed the fraud predictive analytics scenario in terms of predicting final outcomes for new claims.
● Created a Tableau dashboard to help SIU present their annual report.
● Tried kernel density estimation in a lower-dimensional space as a feature to predict fraud.
● Tested anomaly detection models such as Expectation Maximization, Elliptic Envelope, and Isolation Forest.
● Performed multivariate analysis of safety programs from the last 10 years.
● Used regression to determine the correlation between participation in the safety program and the outcome of claims.
● Performed hypothesis testing and statistical analysis to determine statistically significant changes in claims after participation in the safety program.
● Presented findings of impact testing.
● Workers' compensation fraud detection.
● Prepared data for exploratory analysis.
● Engineered actuarial formulas.
● Collaborated with other Data Scientists on use cases that included workplace accident prediction and sentiment analysis.
Technologies: Cloudera Hadoop, Python, SQL, Tableau, Hadoop HDFS, Pandas, NumPy, SciPy, TabPy, Data Modeling, Multivariate Analysis, Regression Analysis, Hypothesis Testing, Exploratory Analysis, Sentiment Analysis, Predictive Analytics

TGS-Nopec February 2013 – June 2016
Data Scientist | Houston, TX
TGS-Nopec is a publicly traded company listed on the Norwegian Stock Exchange with global headquarters in Houston. The company's primary business is performing exploration studies for the oil and gas industry. Its principal products are data and insights for the oil and energy industries, including multi-client geophysical data, multi-client geological data, imaging services and reservoir solutions, data and analytics, machine-learning solutions, and well performance insights. At TGS I performed several statistical studies, including well performance and drilling optimization using deep neural nets.
● Applied data mining and optimization techniques in B2B and B2C industries, along with Machine Learning, Data/Text Mining, Statistical Analysis, and Predictive Modeling.
● Used PySpark Python modules for machine learning and predictive analytics in Hadoop on AWS.
● Performed predictive modeling using state-of-the-art methods.
● Implemented advanced machine learning algorithms using Caffe, TensorFlow, Scala, Spark, MLlib, R, and other tools and languages as needed.
● Programmed and scripted in R, Java, and Python.
● Developed a Data Dictionary to create metadata reports for technical and business purposes.
● Built a reporting dashboard on the statistical models to identify and track key metrics and risk indicators.
● Applied boosting to the predictive model to improve its efficiency.
● Extracted source data from Amazon Redshift on the AWS cloud platform.
● Parsed and manipulated raw, complex data streams to prepare them for loading into an analytical tool.
● Explored different regression and ensemble models in machine learning to perform forecasting.
● Developed new financial models and forecasts.
● Improved efficiency and accuracy by evaluating models in R.
● Helped define the source-to-target data mappings, business rules, and data definitions.
● Performed end-to-end Informatica ETL testing for these custom tables by writing complex SQL queries on the source database and comparing the results against the target database.
Technologies: PostgreSQL/PostGIS, Oracle Spatial, Python, SQL, Tableau, Hadoop HDFS, ArcGIS Pro, QGIS, MapInfo, Pandas, NumPy, SciPy, TabPy, Data Modeling, Multivariate Analysis, Regression Analysis, Hypothesis Testing, Predictive Analytics

TMD Staffing June 2012 – January 2013
Data Analyst | Katy, TX
● Applied Machine Learning, Data/Text Mining, Statistical Analysis, and Predictive Modeling.
● Implemented Event Task for executing an application automatically.
● Helped define the source-to-target data mappings, business rules, and data definitions.
● Assisted in continual monitoring, analysis, and improvement of the AWS Hadoop Data Lake environment.
● Built and maintained dashboards and reporting based on the statistical models to identify and track key metrics and performance indicators.
● Fixed bugs and made minor enhancements to the front-end modules.
● Performed data mining and developed statistical models using Python to provide tactical recommendations to business executives.
● Integrated R into MicroStrategy to expose metrics determined by more sophisticated and detailed models than those natively available in the tool.
● Participated in feature engineering such as feature intersection generation, feature normalization, and label encoding with scikit-learn preprocessing.
● Worked on outlier identification with Gaussian Mixture Models using Pandas, NumPy, and Matplotlib.
● Applied feature engineering techniques to 200+ predictors in order to find the most important features for the models. Tested the models with classification methods such as Random Forest, Logistic Regression, and Gradient Boosting Machine, and performed hyperparameter tuning to optimize the models.

University of Port Harcourt Sept 2009 – January 2011
IT Associate | Rivers State, Nigeria
● Assisted users in initiating services.
● Experience with Microsoft Exchange migration.
● Open Directory and .NET Framework.
● Implemented Event Task for executing an application automatically.
● Helped define the source-to-target data mappings, business rules, and data definitions.
● Assisted in basic computer repair and reformatting.