Showing posts from February, 2021

Machine Learning - Titanic

Problem: Titanic - Machine Learning from Disaster. The following link has the detailed problem description: https://www.kaggle.com/c/titanic/overview/description

Observation: First, I checked for missing values in the columns of the train_data and test_data datasets by comparing the count of values present in each column. The Age column has missing values in both train_data and test_data, and the Fare column has missing values in test_data. Second, the Cabin and Name columns are not used in predicting passenger survival.

Analysis: Based on these observations, I filled the missing Age and Fare values with the median of the respective column, and I dropped the Cabin and Name columns from train_data and test_data. A boxplot also showed outliers in the Age column of train_data, so I removed the rows containing those outlier values from that dataset.

Code:

Outcome: Above analyzed cha...
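The preprocessing steps described under Observation and Analysis can be sketched roughly as follows. This is a minimal illustration using pandas, not the post's original code: the helper names `missing_counts` and `preprocess` are hypothetical, the medians are assumed to come from train_data and be reused for test_data, and the outlier rule is assumed to be the usual 1.5 × IQR boxplot whisker convention, since the post does not state the exact threshold.

```python
import pandas as pd


def missing_counts(df):
    """Number of missing values per column (hypothetical helper)."""
    return df.isna().sum()


def preprocess(train_data, test_data):
    """Median-fill Age/Fare, drop Cabin/Name, remove Age outliers from train."""
    train_data = train_data.copy()
    test_data = test_data.copy()

    # Assumption: medians are taken from the training set and reused for
    # the test set; the post only says that the median was applied.
    age_median = train_data["Age"].median()
    fare_median = train_data["Fare"].median()
    train_data["Age"] = train_data["Age"].fillna(age_median)
    test_data["Age"] = test_data["Age"].fillna(age_median)
    test_data["Fare"] = test_data["Fare"].fillna(fare_median)

    # Cabin and Name are not used for the survival prediction.
    train_data = train_data.drop(columns=["Cabin", "Name"])
    test_data = test_data.drop(columns=["Cabin", "Name"])

    # Assumption: an outlier is an Age outside the 1.5 * IQR whiskers,
    # which is what a default boxplot draws.
    q1 = train_data["Age"].quantile(0.25)
    q3 = train_data["Age"].quantile(0.75)
    iqr = q3 - q1
    in_range = train_data["Age"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    train_data = train_data[in_range].reset_index(drop=True)

    return train_data, test_data
```

With the competition files loaded into DataFrames (for example via `pd.read_csv`), `missing_counts(train_data)` would show which columns need filling, and `preprocess(train_data, test_data)` returns cleaned copies without modifying the inputs. Outlier rows are removed only from train_data, as in the post, so test_data keeps every row.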