Priyanka Kumari | Freelancer Resume

Priyanka Kumari
Curriculum Vitae

Basic Info.

Name: Priyanka Kumari
E-Mail: -
Research Interest:
(1) Rare Category Analysis
(2) Video Analytics
(3) Active Learning
(4) Semi-supervised Learning
(5) Transfer Learning
(6) Spam Filtering
(7) Big Data Analytics

Education

2016 - 2019  Master of Data Science and Analytics | Massachusetts Institute of Technology
2005 - 2007  Master of Computer Science (MSCS), Computer Science | Carnegie Mellon University
2000 - 2004  Bachelor of Technology (B.Tech.), Computer Engineering | Indian Institute of Technology (IIT) Bombay

Work Experience

August 2019 - Present  Senior Software Engineer | Google
August 2014 - June 2016  Project Manager | Microsoft
February 2010 - July 2014  Senior Software Engineer | Amazon.com
September 2007 - December 2009  Software Engineer | IBM

Research Experience

◆ Machine Learning

1) Develop a new method for detecting instances from the minority classes via an unsupervised local-density-differential sampling strategy. Essentially, a variable-scale nearest neighbor process is used to optimize the probability of sampling tightly grouped minority classes, subject to a local smoothness assumption on the majority class. The effectiveness of the proposed method is supported both by theoretical analysis and by preliminary experiments.

2) Design a prior-free rare category detection method named SEDER. It implicitly performs semiparametric density estimation using specially designed exponential families, and then picks the examples for labeling where the neighborhood density changes the most. Experimental results show that its performance is comparable to state-of-the-art techniques that need much more prior information about the data set.

3) Propose graph-based rare category detection methods named GRADE and GRADE-LI for detecting minority classes on graphs.
They first calculate the global similarity between two nodes on the graph, and then implicitly map the nodes to a feature space according to that similarity. By sampling in the regions with high density, they have a high probability of finding examples from the minority class with only a few label requests. Given the same amount of information, GRADE performs much better than state-of-the-art techniques. Given much less information, GRADE-LI performs as well as state-of-the-art techniques.

4) Propose a new graph-based transfer learning method. It is based on an objective function that takes into account the label smoothness on an example-feature-example tripartite graph and an example-example bipartite graph, as well as the consistency with the label information. To address the computational cost, the method uses an iterative algorithm, which is shown to converge to the optimal value. Experimental results on several data sets demonstrate the superiority of the proposed method over state-of-the-art techniques.

5) In the field of spam filtering, propose a new asymmetric boosting method, Boosting with Different Costs. Traditional boosting methods assume the same cost for misclassified instances from different classes. This method is more generic and is better suited to problems where the major concern is a low false positive (or false negative) rate. Experimental results on a large-scale email spam data set demonstrate the superiority of the method over state-of-the-art techniques.

6) Propose a new graph-based semi-supervised learning method. It differs from existing graph-based methods in that it estimates both the class conditional probabilities and the class priors. It is therefore a generative model in nature, while existing methods are all discriminative models.
Experimental results on three datasets show the superiority of the method over existing methods, especially when the class proportions in the labeled set differ from the class priors.

7) Propose a new variant of the boosting algorithm, named W-Boost, which mitigates, to some extent, over-fitting when training data is insufficient. It is based on a novel weight update scheme and uses a variable number of bins to estimate marginal distributions in the weak learner design.

8) Study and compare existing active learning methods used in content-based image retrieval, and propose a novel method named mean version space active learning. The criterion of the proposed method incorporates both posterior probabilities and the size of the version space, while existing methods are based on only one of them.

◆ Image Related Topics

1) Propose a novel transductive learning framework named manifold-ranking based image retrieval (MRBIR). The framework incorporates several schemes for handling negative feedback images and for selecting images in each round of relevance feedback. In systematic experiments, MRBIR outperforms state-of-the-art techniques.

2) Evaluate the performance of different classification algorithms (e.g. SVM, AdaBoost, Real-AdaBoost) on an image classification task (photo vs. graphic), and incorporate the best one (Real-AdaBoost) into a web image search engine developed by Microsoft Research Asia.

3) Propose an optimization-based approach for automatic peak number detection in repeated pattern analysis. Apply the theory of wallpaper groups to natural images and extract a novel feature to depict the symmetry property of natural images. The proposed symmetry feature outperforms several other texture features in image retrieval.

Selected Project Experience

1.
Human Detection in Video Stream

Technologies used:
• Faster R-CNN, Mask R-CNN
• Deep learning, machine learning, convolutional neural networks
• Python, PyTorch, JavaScript, ReactJS, NodeJS
• Classical machine learning methods:
  - an SVM classifier with an RBF kernel was trained on hand-crafted LBP (local binary pattern) features;
  - a convolutional neural network was used for one of the attributes.

The key application features:
- Face recognition and tracking;
- People aggregation;
- Storing all the information in the database;
- Quick and advanced view modes.

The product may be used in different video surveillance settings, such as:
- streets;
- shopping and business centers;
- sporting events;
- concerts.

The system performs real-time human detection and tracking in a live video stream from a fisheye camera. It can also flag illegal activity, such as molestation or carrying weapons, and report it to the police.

The solution was architected with the following features in mind:
- Background and foreground segmentation.
- Resolving ambiguities of crossing tracks.
- Re-identification of re-entering humans.
- Server-side solution with HTTP API.
- Integration with an age and gender classifier from a side-view camera.

This project had many modules. One complex module was the assessment of a person's features:
- gender;
- age;
- emotions;
- race.
The system determines the matching percentage for each feature and shows it to the user.

2. Protection Detection for Engineering Industry

Technologies used:
• Faster R-CNN, Mask R-CNN
• Deep learning, machine learning, convolutional neural networks
• Python, PyTorch, JavaScript, ReactJS, NodeJS

The project is designed to assess protective clothing and detect:
- a safety helmet;
- safety glasses;
- a reflective vest.
Intuitive YES/NO indicators show the presence or absence of each clothing item on the person.

The product may be used in different fields, such as construction, engineering, renovations, and manufacturing.

3.
Address Parsing Using libpostal (retraining libpostal on custom data)

Technologies used:
• Naïve classifier, recurrent neural network, unsupervised learning
• Deep learning, machine learning, RBF
• Python, C++, JavaScript, ReactJS, NodeJS, microservices, API

This project parsed address components such as country, zip code, city, building number, and street name from an address string.

4. Candidates' Resume Parsing and Matching

Technologies used:
• Natural language processing, machine learning, data science
• Deep learning, machine learning, convolutional neural networks
• Python, PyTorch, JavaScript, ReactJS, NodeJS

This project built a system that parses resumes and jobs and performs resume/job matching, to find resumes that match a specific job or jobs that match a specific resume. The system takes into account gender, age, and relevant background, including past jobs and educational experience.

From the technical point of view, the system parses resumes and jobs with a complex solution based on Apache Tika, ontologies (for skills, cities, universities and so on), and NLP-based techniques. For matching, the system uses machine learning algorithms that draw on a range of information extracted from resumes and jobs. The trained model ranks resumes and jobs, finds the top N jobs or resumes with the most matches, and provides a score for each. The system provides an interface for retraining the matching models on new or updated resumes and jobs. It also provides a RESTful API that lets external systems use its functionality.

The results of this project are:
- a sub-system built into the HR system;
- an innovative way to parse resumes and CVs based on natural language processing and machine learning;
- a 25% increase in customer retention.

5.
Retail Shop Security Analytics

Technologies used:
• Faster R-CNN, Mask R-CNN, RBF
• Deep learning, machine learning, convolutional neural networks
• Python, PyTorch, JavaScript, ReactJS, NodeJS

The goal of this project is to build a retail analytics system. The project covers the following tasks:
1) determining the characteristics of overall visitor traffic, including the volume and the entry and exit times of visitors on the trading floor;
2) determining the personal characteristics of visitors: maximum, minimum, and average length of stay on the trading floor;
3) determining the individual characteristics of visitors: identifying regular customers (using unique features of the person) and their preferences (client traffic maps through the trading floor area).

From the technical point of view, we use two video cameras installed inside the store and a server equipped with a GeForce GTX 1080. The cameras' view angle covers the entrance and exit zones, and their resolution is sufficient for personal identification (Full HD or higher). The server is fast enough to process the video stream in real time. We use Ubuntu as the main OS and Python with frameworks such as Caffe, PyTorch, NumPy, scikit-learn, SciPy, and OpenCV. All information about visitors generated during processing is stored in the specified formats (.mp4, .csv, .txt). The values of the main parameters and the input and output paths can be configured through the console user interface.

As a result of the project, the customer gained a better understanding of its marketing and increased its revenue by 16%. The customer also plans to extend this solution to its whole network of shops.

6.
Drone-Based Agricultural Intelligence

Technologies used:
• Faster R-CNN, Mask R-CNN, RBF
• Deep learning, machine learning, convolutional neural networks
• Python, PyTorch, JavaScript, ReactJS, NodeJS

Our client provides grape growers with specialized aerial data solutions developed specifically for the complexities of vineyards. The business goal of this project was to give the company's existing and prospective customers an easy-to-use, AI-powered application. With it, customers can fetch and analyze all the necessary information about their vineyards on their own computer, using computer vision methods in a cross-platform app.

The technical goal was to analyze drone images of grape fields in order to detect grape rows and estimate the start and end points, length, and width of each row. The application also supports image processing functions such as color, brightness, and contrast manipulation, drawing primitives (lines, polygons), zoom, and saving results as shapefiles, and it supports geolocation information.

From the technical point of view, the application has a cross-platform design written in C++/Qt for Mac, Windows, and Linux. Its backend relies on computer vision algorithms to detect the rows.

As a result of this project, our client increased its revenue by 5% and was able to demonstrate the latest computer vision advancements in the field of agriculture.

7. Android Face App to Predict a Photo in Old Age

Technologies used:
• CNN, GAN, Faster R-CNN
• Deep learning, machine learning, convolutional neural networks
• Java, Android application development

This Android app, built with the TensorFlow Lite framework, predicts the user's future face using GANs.
Honors and Awards (Selected)

2009  IBM Fellowship
2008  IBM Fellowship
2004  IIT Samsung Fellowship for excellent student (top 1%)
2004  Best Presentation Award in WSM Group, Microsoft Research Asia
2002  Microsoft bitwise challenge winner
2001  Three Good student (top 1%), IIT, Mumbai