Titanic dataset

Step 1 - Data preprocessing and data cleaning in the Titanic dataset


In this blog, I am going to demonstrate how to do a simple machine learning project. I assume you already know what machine learning is, along with the basic theory behind it.

Let's jump into a practical exploration. For this tutorial, I have taken the Titanic dataset from Kaggle; the link is below.

Kaggle link for the dataset: https://www.kaggle.com/c/titanic/data

GitHub link: Project link

To do an ML project, we separate our data into two parts: train and test. We apply all our analysis and algorithms to the train data, and then check the output on the test data.
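The Kaggle download already ships separate train.csv and test.csv files, so no manual split is needed there. For a dataset that comes as a single table, the split could be sketched like this. The toy DataFrame, the 80/20 ratio, and the random_state value are all assumptions for illustration, not part of the original project.

```python
import pandas as pd

# Hypothetical toy table standing in for a single, unsplit dataset.
df = pd.DataFrame({
    "PassengerId": range(1, 11),
    "Survived": [0, 1, 1, 1, 0, 0, 0, 0, 1, 1],
})

# Randomly pick 80% of the rows for training (assumed ratio);
# random_state makes the split repeatable.
train = df.sample(frac=0.8, random_state=42)

# Everything that was not sampled becomes the test set.
test = df.drop(train.index)

print(len(train), len(test))
```

Because the test rows are whatever is left after dropping the sampled index, the two parts never overlap.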

I have used Python and Jupyter Notebook. The main packages I have used for these steps are pandas and numpy. These steps are very basic and apply to most ML and data visualisation projects. I have attached my GitHub link and YouTube link for better understanding.

The things that are covered are:

1) How to read a CSV file in Python (I have used pandas)

2) How to see the columns, head, and tail of the data

3) How to find the null values in each column

4) How to clean the null values by using the fillna method

5) The shape of the data (number of rows and columns)

6) Information about the data, such as column types and non-null counts

7) The describe function. This gives summary statistics for the numeric columns: count, mean, standard deviation, minimum, maximum, and percentiles such as the 25th and the 50th (the median)

8) dtypes. This is used to find the type of each column
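The steps above can be sketched in a few lines of pandas. This is a minimal illustration, not the notebook from the GitHub project: the inline CSV is a made-up five-row sample that borrows a few column names from the Kaggle train.csv, and filling Age with the median and Embarked with the most common value is an assumed choice of fill strategy.

```python
import io

import pandas as pd

# Made-up sample rows; only the column names come from the Kaggle file.
csv_text = """PassengerId,Survived,Pclass,Sex,Age,Fare,Embarked
1,0,3,male,22,7.25,S
2,1,1,female,38,71.28,C
3,1,3,female,,7.92,S
4,1,1,female,35,53.10,
5,0,3,male,,8.05,S
"""

# 1) Read the CSV. With the real file this would be pd.read_csv("train.csv").
df = pd.read_csv(io.StringIO(csv_text))

# 2) Column names, first rows, last rows.
print(df.columns.tolist())
print(df.head(2))
print(df.tail(2))

# 3) Number of null values in each column.
print(df.isnull().sum())

# 4) Fill the nulls with fillna (assumed strategy: median for the
#    numeric column, most frequent value for the categorical one).
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# 5) Shape as (rows, columns).
print(df.shape)

# 6) Column types and non-null counts.
df.info()

# 7) Summary statistics for the numeric columns.
print(df.describe())

# 8) Type of each column.
print(df.dtypes)
```

Running isnull().sum() again after step 4 is a quick way to confirm that no missing values remain.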

These are very basic steps that can be followed in most machine learning and data visualisation projects.

Check out my YouTube video and GitHub to understand this first step of the analysis.

Happy coding!!!

