Anuragh Barua | Freelancer Portfolio Item #446251

PK: Implement a Simple Machine Learning Model in Python
SK: Machine learning Python code example Machine learning algorithms in Python code Machine learning with Python machine learning model in Python

Implementing a Simple Machine Learning Model in Python

Introduction

So, you’ve decided to build a machine learning model in Python, whether for your final-year project or out of curiosity about Python and Artificial Intelligence. The next step in your journey is to learn enough about Python and Machine Learning to implement a simple yet useful model. To get familiar with these concepts, follow this guide, which covers machine learning models, Google Colab and much more.

What is Machine Learning?

Before building an ML model, we need to be clear about what Machine Learning is and why we are building one. So, let’s answer the burning question: what is machine learning?

Machine Learning tries to mimic the way humans learn: computers use statistics to find patterns in data and then recognize those patterns in completely new data, learning along the way. This field of AI keeps growing, with new ML algorithms making tedious calculations easier and freeing humans to think about harder problems.

So what is this ML model we’ve been talking about? A machine learning model is a program that, based on the training provided to it, recognizes patterns and makes decisions about data it has not seen before.

In this guide we are going to implement a simple machine learning model using Python, one of the most widely used languages for the job. But why? Let’s find out.

Why use Python for Machine Learning?

We almost always associate machine learning with Python and say that we will build a machine learning model using Python, but why is that? Because Python is simple and readable. It comes with a ton of useful ML libraries that make building ML models a piece of cake.
Moreover, developers around the world find it easy to express their ideas in Python because the code is readable and easy to maintain. This comes in handy for beginners entering the field of AI: when they start studying ML models that are written in Python, the models are easier to learn and implement. Furthermore, they can add features or fix bugs early in their journey as ML engineers, because learning Python’s ML libraries gives them the confidence they need to create something new.

Implementing a Simple Machine Learning Model in Python

To implement a machine learning model in Python, we need to follow a series of steps that help ensure our model works and is reasonably accurate. Let’s start by importing a few libraries that will make our work easier.

Importing the necessary libraries

We’ll import pandas to read the dataset, matplotlib to plot graphs that show how our data is distributed, and scikit-learn to use prebuilt implementations of models whose source code would be tricky to write ourselves. Let’s run the code below to import all the necessary libraries.

    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LinearRegression

Loading the dataset

The next step is to learn as much as you can about the data, so that you can feed the right data to your machine learning model for training. For that to happen, we have to load the data first. Run the code below to load the House Rent dataset, which reveals some interesting trends.

    df = pd.read_csv('House_Rent_Dataset.csv')

In the above code we use the read_csv function of pandas to read our dataset file. Let’s look at the first few rows and columns using the code below:

    df.head()

In the output you’ll find various factors that influence house rent. Today we’ll study some of those factors and train our model to predict the Rent based upon them.
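If you don’t have the dataset file at hand yet, the loading step can be tried on a tiny stand-in. The following is a minimal sketch, assuming an in-memory CSV: the column names are the ones mentioned in this guide, but the three rows and the name sample_df are invented purely for illustration.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for House_Rent_Dataset.csv.
# The values below are invented for illustration only.
sample_csv = io.StringIO(
    "Rent,Furnishing Status,Tenant Preferred,Area Type\n"
    "10000,Unfurnished,Family,Super Area\n"
    "20000,Semi-Furnished,Bachelors,Carpet Area\n"
    "35000,Furnished,Family,Carpet Area\n"
)

# read_csv accepts a file path or any file-like object
sample_df = pd.read_csv(sample_csv)

print(sample_df.head(2))  # first two rows
print(sample_df.shape)    # (number of rows, number of columns)
```

Swapping sample_csv for the path to the real file gives the same kind of data frame, only much larger.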

Understanding the dataset

This is a crucial step when building a simple machine learning model in Python. The better you understand the dataset yourself, the better you can prepare it for your model to learn from. Let’s see how many rows and columns the dataset contains using the shape attribute.

    df.shape

As you can see, there are 4746 rows and 12 columns of data. To get more information about the dataset, we’ll use the info method.

    df.info()

There are no missing values in the dataset. To be sure of it, you can check for null values using the isnull() method.

    df.isnull()

Here, every place where a data value is present shows False because it is not null. If some data were missing, you could get the exact count of missing values in each column with the following code:

    df.isnull().sum()

Since it’s confirmed that there’s no missing data, we can now focus on what type of data is present in each column, which matters a great deal for training an ML model effectively. We are going to use the dtypes attribute of pandas to do that.

    df.dtypes

Data preprocessing

To implement machine learning models in Python effectively, we have to make the data simpler for the model to learn from. For that, we convert the object-type columns we care about to the category type. In this model, we are going to focus on how house rent is influenced by three factors, namely Furnishing Status, Tenant Preferred and Area Type, so we convert only those columns.

    df['Furnishing Status'] = df['Furnishing Status'].astype('category')
    df['Tenant Preferred'] = df['Tenant Preferred'].astype('category')
    df['Area Type'] = df['Area Type'].astype('category')

Let’s verify that the types actually changed by using dtypes again.

    df.dtypes

As you can see, we’ve successfully converted the three significant factors to the category type.
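To see what the category conversion actually does, here is a minimal sketch on a toy data frame. The four rows and the name toy are invented for illustration; only the column names come from the dataset described above.

```python
import pandas as pd

# Toy data invented for illustration; only the column names match the guide
toy = pd.DataFrame({
    "Furnishing Status": ["Furnished", "Unfurnished", "Furnished", "Semi-Furnished"],
    "Rent": [30000, 9000, 28000, 15000],
})

# Convert the text column to the category type
toy["Furnishing Status"] = toy["Furnishing Status"].astype("category")

# A categorical column stores each distinct label once...
categories = toy["Furnishing Status"].cat.categories.tolist()
# ...and represents every row as an integer code pointing at a label
codes = toy["Furnishing Status"].cat.codes.tolist()

print(toy.dtypes)
print(categories)
print(codes)
```

The labels are kept in sorted order, and each row holds the position of its label in that list, which is why the category type is a compact way to hold columns with only a few distinct values.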
Moreover, we don’t need the other object-type columns, as they might interfere with the training of our model, so let’s remove all such irrelevant columns.

    df.pop('Posted On')
    df.pop('Floor')
    df.pop('Area Locality')
    df.pop('City')
    df.pop('Point of Contact')

Now let’s look at some summary statistics of the dataset using the describe method of pandas.

    df.describe().T

We transpose the result so that each column’s statistics are easier to read.

Data visualization

Visualizing the data helps us notice underlying patterns and understand how the data is distributed. We are going to plot a graph using the Python library matplotlib.

    plt.figure(figsize = (8, 5))
    colors = ['#FF1E00', '#A66CFF', '#EAE509', '#D61C4E', '#3CCF4E', '#3AB4F2']
    df["Furnishing Status"].value_counts().plot(kind = 'bar', color = colors, rot = 0)

We have visualized how the data is distributed across the Furnishing Status column, and we can see a clear difference between the counts of the three furnishing statuses: semi-furnished, unfurnished and furnished.

Before moving on to train our machine learning model in Python, we need to perform one last step: one hot encoding. This step is crucial because it turns categorical values into numeric columns that the model can learn from.

One Hot Encoding

We are going to apply one hot encoding to the categorical variables in the dataset using the get_dummies function in pandas.

    df = pd.get_dummies(df)

Using the columns attribute, we can see that each categorical column has been split into one column per category.

    df.columns

Building a regression model

We have finally reached the step where we can build our simple machine learning model in Python. We are going to build a Linear Regression model, but first we need to understand a little about input and output variables. When an ML model is trained, it takes some features (columns), called input variables, and learns underlying patterns in them in order to predict other features, called output variables.
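To make one hot encoding and the input/output distinction concrete, here is a minimal sketch on a toy data frame. The rows, the Size column and the names toy, encoded, inputs and output are assumptions made for illustration. Passing dtype=int is also a choice made here (so the indicator columns print as 0 and 1), not something the steps above require.

```python
import pandas as pd

# Toy data invented for illustration; "Size" is a hypothetical numeric column
toy = pd.DataFrame({
    "Size": [500, 800, 1200],
    "Rent": [8000, 15000, 30000],
    "Furnishing Status": pd.Categorical(
        ["Unfurnished", "Semi-Furnished", "Furnished"]
    ),
})

# One hot encoding: each category becomes its own 0/1 indicator column
encoded = pd.get_dummies(toy, dtype=int)
print(list(encoded.columns))

# Output variable: the column the model should predict
output = encoded["Rent"]
# Input variables: every remaining column
inputs = encoded.drop("Rent", axis=1)
print(inputs)
```

Numeric columns pass through unchanged, while the single Furnishing Status column is replaced by three indicator columns, one per category.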
So here we treat Rent as the output variable and the other columns as input variables. We assign the Rent column to the variable y as the output variable.

    y = df["Rent"]

To create the input variable X, we have to remove the Rent column from the main data frame. To do that we’ll use the drop method of pandas.

    X = df.drop("Rent", axis = 1)

We also have to split the dataset in a certain ratio into training data, on which the model trains, and testing data, which we use to evaluate the model. To do that we’ll use the train_test_split function provided by the scikit-learn library.

    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size = 0.6, random_state = 1)

Using the above code, we have split the dataset into 60 percent training data and 40 percent testing data. Now you can create the Linear Regression model by creating an instance of the LinearRegression class.

    lr = LinearRegression()

Now, let’s train the model using the training data.

    lr.fit(X_train, y_train)

With the completion of this step, we have successfully built our first simple machine learning model in Python.

Model evaluation

It’s time to look at the performance of our first machine learning model. Aren’t you excited? Let’s check the results using the following code.

    lr.score(X_test, y_test)

For a regression model, score returns the coefficient of determination (R²) rather than an accuracy. Our model scores about 0.4152, meaning it explains roughly 41.52% of the variation in Rent on the test data. That is not very promising, but it is a good start towards building an accurate machine learning model in Python.

Model prediction

Now let’s use our model to make a prediction. We are going to find out how well our model predicts the Rent for the first row of the training data. To do that, we create a new data frame containing only that first row.

    df_new = X_train[:1]

Now let’s see if our model can predict the Rent of our new data frame.
    lr.predict(df_new)

Let’s compare it with the actual value using the code below.

    y_train[:1]

As you can see, our prediction is fairly close to the actual value, although it is not exact. Moving forward, we can build on this knowledge and create more accurate models using other ML algorithms that may suit this case better, such as the Random Forest algorithm.

Role of online learning platforms

Implementing simple machine learning models in Python is easy, as we saw above, but to build accurate real-world models that can make human lives better, we need a deeper knowledge of Machine Learning. This is where Great Learning’s Machine Learning course comes in. If you want a thorough knowledge of how machine learning algorithms work and how accurate real-life models are built, this course is for you. You can opt for the free version, called Basics of Machine Learning, and then build upon that knowledge with their paid course, called PGP in Machine Learning.

In summary

We implemented a simple machine learning model in Python from start to finish and learnt a lot about how machine learning works and how models are built. Machine learning is a great field to build your career in. With the right guidance, you can become an ML Engineer and work towards a brighter future in which humans and AI thrive together.
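As a compact recap, the whole workflow (encode, split, train, evaluate, predict) can be sketched on synthetic data. Everything about the data here is an assumption made for illustration: the Size column, the invented rule that rent grows with size, the noise level and the 200 rows. The score it prints says nothing about the real House Rent dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; none of these numbers come from the real dataset
rng = np.random.default_rng(1)
n = 200
toy = pd.DataFrame({
    "Size": rng.integers(300, 2000, size=n),
    "Furnishing Status": rng.choice(
        ["Furnished", "Semi-Furnished", "Unfurnished"], size=n
    ),
})
# Invented relationship: rent grows with size, plus random noise
toy["Rent"] = 20 * toy["Size"] + rng.normal(0, 3000, size=n)

# One hot encode the text column (dtype=int keeps the indicators as 0/1)
encoded = pd.get_dummies(toy, dtype=int)

# Separate input variables (X) from the output variable (y)
X = encoded.drop("Rent", axis=1)
y = encoded["Rent"]

# 60 percent of the rows for training, 40 percent for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.6, random_state=1
)

# Train on the training data only
lr = LinearRegression()
lr.fit(X_train, y_train)

# score returns R^2, the share of variation in y that the model explains
r2 = lr.score(X_test, y_test)
print(f"R^2 on test data: {r2:.3f}")

# Predict a single row and compare with its actual value
print(lr.predict(X_test.iloc[:1]), y_test.iloc[:1].to_numpy())
```

Because the synthetic rent is almost a straight-line function of size, linear regression fits it well. Real rent data is far messier, which is one reason a linear model can score much lower on it.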