- Languages: Python (Advanced), SQL (SQLite/MySQL), HTML/CSS.
- Data Science: Pandas, NumPy, Scikit-Learn, TensorFlow/Keras, NLTK, OpenCV.
- Data Engineering: Web Scraping (BeautifulSoup/Selenium), OCR (Tesseract), Regex, PDF Automation (Tabula/pdfplumber).
- Visualization: Matplotlib, Seaborn, Power BI, Tableau.
- Developer Tools: VS Code, Jupyter Notebook, Git/GitHub, Eel (GUI).
Auratech Services | Data Analyst Intern | Jaipur, Rajasthan | Dec 2023 – July 2024
- Database Management: Developed and maintained company database systems, ensuring 100% data integrity and accessibility for stakeholders.
- Data Cleaning & Quality: Engineered automated cleaning pipelines using Python (Pandas) to rectify inconsistencies, reducing data errors by 40%.
- Reporting: Utilized SQL for complex querying and Python for data manipulation to generate automated business reports and visual insights.
Omni-Extract: Offline Data Extraction Engine | Python, Eel, OCR
- Developed a desktop application to automate data extraction from PDFs and images of handwritten documents.
- Integrated Tesseract OCR and pdfplumber to digitize messy financial records securely offline.
Blackcoffer Data Extraction & NLP | Python, BeautifulSoup, NLTK
- Engineered a pipeline to scrape 100+ URLs and perform sentiment analysis on the extracted text.
- Manually calculated the Fog Index and polarity scores to produce readability and sentiment metrics.
Email Spam Classification System | Machine Learning, Scikit-Learn
- Built a text classification model using Random Forest and CountVectorizer.
- Achieved 95% accuracy in identifying and filtering unsolicited communications.
JustDial Business Lead Scraper | Python, Requests, Regex
- Automated the extraction of B2B leads from public directories.
- Used Regex to sanitize HTML and format data for direct Excel/SQL export.