M. Ismail Hafeez
F-6 Islamabad | E-mail | LinkedIn | GitHub
Education
National University of Computer and Emerging Sciences, Islamabad Campus
2022 – Present
Bachelor of Science in Data Science
Beaconhouse School System, Margalla Campus
2019 – 2021
Cambridge GCE A Levels
Experience
Data Research Intern, Dakota
10/2025 – 01/2026
• Helped enrich Dakota Marketplace by collecting data on private companies in the US market.
• Maintained records in Google Sheets.
AI Intern, Automotive Artificial Intelligence (AAI-GmbH)
06/2025 – 07/2025
• Built a RAG-based chatbot that answers questions from ISO standard documents.
• Scraped over 8,000 documents using Playwright and Beautiful Soup.
• Used LangChain for LLM integration, LangGraph for memory, and Vertex AI Vector Store for semantic search.
• Deployed using Streamlit and FastAPI on Google Cloud Run.
Projects
Final Year Project - UpClout (ongoing)
• Developing a platform to streamline content-creation collaborations.
• Automated an ETL pipeline that scrapes Instagram profiles with Apify, cleans the data, and stores it in PostgreSQL.
• Integrated an AI-based recommendation model and a chatbot to answer user queries.
Near-Real-Time Data Warehouse
• Designed a stream-based hybrid join algorithm for processing Walmart sales transactions.
• Automated the ETL pipeline, enabling near-real-time data integration from CSV sources.
• Developed 20 OLAP queries to support business intelligence and analytics.
Music Recommendation Model
• Developed a real-time recommendation engine using Apache Kafka, Spark, and MongoDB on a 100 GB dataset.
• Deployed a user-friendly web application using Flask, supporting live data streaming and recommendations.
Technical Skills
Languages: Python, C++, JavaScript, SQL
Libraries & Frameworks: LangChain, TensorFlow, Scikit-learn, Pandas, Playwright, Beautiful Soup
Big Data Tools: Databricks, Apache Kafka, Apache Spark, Dask, Apache Airflow
Web & Visualization: HTML/CSS, Flask, D3.js, Node.js, Power BI, Tableau
Databases: MySQL, MongoDB, PostgreSQL
Tools & Platforms: Docker, Git, GCP, n8n
Certificates
ETL and Data Pipelines with Shell, Airflow and Kafka