James Li | Freelancer Resume

James Li
Data Scientist & Machine Learning Engineer
https://www.linkedin.com/in/james-li-6a7b95c
Vancouver, BC

PROFESSIONAL SUMMARY
Seasoned Data Science Professional with 6+ years of strong experience in machine learning engineering, data engineering, and operations. Proficient in predictive modeling, data processing, data mining, and statistical modeling to address complex business challenges.
Strong expertise in NLP and implementation of deep learning models, such as BERT, GPT, and T5. Familiar with state-of-the-art NLP techniques, including Transformer architectures, and libraries like spaCy and Hugging Face. Proficient in integrating vector stores such as ChromaDB and Pinecone into NLP or machine learning pipelines, enhancing tasks like document clustering, recommendation systems, and content-based retrieval. Familiar with the APIs and functionality that vector stores provide.
Excellent knowledge of SQL and RDBMS implementation, including data analysis techniques, complex SQL profiling on SQL Server and Teradata, and optimization of PL/SQL stored procedures and queries. Experience in data extraction, data modeling, data wrangling, and data analysis using IDEs such as Jupyter Notebook and PyCharm. Familiar with data warehousing principles and normalization techniques for OLAP systems. Expertise in data mining, text mining, data cleansing, transformation, and integration, leveraging MySQL, SQL Server, and SQL.
Skilled in designing and building batch and stream-based Spark pipelines, semantic search systems, and question-answering systems, and in deploying them to production environments with high efficiency. Experienced in data visualization, adept at creating dynamic dashboards, reports, and data stories using Tableau and Power BI, and proficient in numerous Python and R packages like Pandas, NumPy, SciPy, Matplotlib, Seaborn, ggplot2, TensorFlow, and Scikit-Learn.
Solid understanding of and background in the mathematical foundations behind machine learning algorithms, probability theory, random processes, statistics, and optimization techniques. Adept in linear algebra and convex optimization techniques. Proficient with the full Data Science project life cycle, actively involved in all phases including data acquisition, data cleaning, data engineering, feature scaling/engineering, statistical modeling, testing and validation, and data visualization.
Highly experienced in working with cloud-based infrastructure, containerization, and CI/CD pipelines within a DevOps environment. Proven capability to deliver enterprise-grade, scalable machine learning and deep learning-based applications and services. Strong proficiency in Agile methodology and the Scrum process, with experience in tracking defects using tools like Jira and Git.
Excellent at collaborating with cross-functional teams, stakeholders, application engineering, quality engineers, and product management. Outstanding communication, interpersonal, analytical, and leadership skills.

WORKING EXPERIENCE

Data Scientist | Sisense | 07/2020 - 07/2023 | New York, United States
Sisense is a global business intelligence software company, headquartered in New York City, that offers Sisense Fusion, a cloud-based BI platform enabling users to connect, analyze, and visualize data from various sources.
• Designed, implemented, and validated over 40 machine learning models across diverse projects, employing real-world program data to predict future trends accurately.
• Spearheaded Extract, Transform, Load (ETL) processes to streamline data cleaning, modeling, and mining initiatives, enabling report generation through Power BI, resulting in a 90% reduction in turnaround time to adhere to SLA standards.
• Utilized SQL to detect and minimize data duplication across various team projects, enhancing coordination and accountability and streamlining project management.
• Leveraged SAS and Excel Macros to clean and prepare client data, assisting the marketing team in constructing effective marketing mix models. This strategy yielded an ROI lift of 25 basis points.
• Collaborated with the product team to design and deploy a Python-based product recommendation engine, boosting on-page user engagement and driving $150K in incremental annual revenue.
• Partnered with product and marketing teams to identify pre-client interactions positively correlated with conversion rates, facilitating strategies that led to a 25% uplift in conversions.
• Developed operational reports in Tableau to optimize contractor scheduling, leading to an annual budget saving of $85,000.

Data Scientist | Ada Support | 10/2017 - 05/2020 | Toronto, Canada
Ada Support is a customer service automation platform that leverages LLMs to assist businesses in addressing customer inquiries across various languages and channels.
• Leveraged Google Cloud Natural Language API and Python for efficient entity extraction, content classification, and sentiment analysis.
• Developed predictive models using a combination of machine learning, natural language processing, and statistical analysis methods.
• Utilized Spark for distributed data processing on large streaming data sets, improving data ingestion speed by 40%.
• Crafted an automated linear regression model refinement program in SAS, targeted at specific customer base segments, reducing manual work by 20 hours per month.
• Built ETL infrastructures for data delivery to Redshift, enhancing stakeholder decision-making capabilities by 36%.
• Designed a SQL Server Integration Services (SSIS) package for seamless transfer and loading of files from 150 diverse sources into numerous SQL database tables.
• Extracted valuable datasets from Salesforce CRM and loaded them into a SQL relational database for analytic processing, utilizing SQL syntax to retrieve, manipulate, and extract significant results.
• Served in an advisory role in crafting marketing strategies based on marketing media mix channel insights, resulting in reduced marketing expenses, increased clickthrough rate, and boosted customer acquisition rate.

Data Analyst | GeoViz | 06/2016 - 09/2017 | Oakville, Canada
GeoViz Inc. is a digital commerce services company helping businesses transform digitally and improve operations for growth and cost reduction.
• Interpreted and analyzed business requirements and processes for clients, leveraging interviews, document analysis, and workflow assessment strategies, increasing business process understanding by 40%.
• Initiated and executed data extraction from Oracle, SQL, and Teradata for over 20 projects, converting data into SAS data sets using Proc SQL. This process improvement saved 20% of the time spent on data management.
• Participated in the evaluation of dataset complexities for 60 datasets using SAS, leading to a 30% improvement in data quality.
• Skillfully applied strategic logic in formatting and extracting necessary information for over 50 projects, adhering to regulatory standards and reducing data-related discrepancies by 20%.
• Created impactful visualizations using Tableau for over 100 presentations, improving stakeholder understanding by 50%.
• Demonstrated deep understanding of digital commerce services, focusing on customer requirements to develop over 100 solutions, resulting in a satisfaction increase of 50%.
EDUCATION

Bachelor of Computer Science | York University | 04/2012 - 04/2016 | Toronto, Canada
GPA: 3.86 / 4.0

SKILLS

Programming & Tools: Python, Seaborn, R, Scala, Plotly, SQL, ggplot2, JavaScript, TensorFlow, Microsoft Excel, Tableau, PyTorch, Keras, Power BI, Git, Pandas, Jira, spaCy, Scikit-learn, Jupyter Notebook, Matplotlib, PyCharm
Machine Learning and AI: Predictive Modeling, Random Forests, Deep Learning, Natural Language Processing, Naive Bayes Classifier, Discriminant Analysis, Support Vector Machine, AWS Architecture, OpenAI models, Semantic Search, Ensemble Models, Regression Models, PCA, Decision Trees, Factor Analysis, Cluster Analysis, Hugging Face models
Data Engineering: Data Acquisition, Data Cleaning, Data Warehousing, Data Engineering, Agile Methodology, Feature Scaling, Data Visualization, Data Mining, Statistical Modeling, Scrum Process
Cloud and DevOps: CI/CD Pipelines, Containerization, AWS, Azure, Google Cloud Platform, Firebase, Docker
Database and Big Data Technologies: SQL Server, HBase, Microsoft Access, MapReduce, MySQL, PostgreSQL, Data Warehousing Principles, MongoDB, Teradata, Apache Spark, OLAP Systems, Hadoop, HDFS, Hive
Business and Soft Skills: Cross-Functional Teamwork, Communication, Interpersonal Skills, Leadership, Problem-Solving

LANGUAGES

English: Proficient
Chinese: Native