Week 7: 3rd October 2020
Material for Week 7 of the ML boot camp.
Prerequisites: you should be familiar with, and at least able to follow, the topics below.
- NumPy
- Pandas
- Matplotlib
- Using sklearn to split a dataset into train and test sets
- Using sklearn to pre-process data, and using sklearn pipelines
- Being clear on the theory of Linear Regression and able to use Linear Regression
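As a quick self-check on the prerequisites above, here is a minimal sketch that splits a dataset, pre-processes it inside a pipeline, and fits a Linear Regression. The synthetic data, the variable names, and the choice of StandardScaler are assumptions made for illustration; they are not taken from the course notebooks.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (illustrative assumption): y = 3*x0 - 2*x1 + small noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Hold out 25% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scaling and regression live in one pipeline, so the scaler is
# fitted on the training rows only
model = Pipeline([
    ("scale", StandardScaler()),
    ("regress", LinearRegression()),
])
model.fit(X_train, y_train)

r2 = model.score(X_test, y_test)  # R^2 on the held-out rows
print(f"R^2 on held-out data: {r2:.3f}")
```

If every line here makes sense to you, you are probably ready for the session.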
Ticked items are highly recommended to complete before the session; unticked ones are for the more adventurous among you :)
- Day 1: Understand how to work with missing data, then practice handling missing data.
- Day 2: Work with categorical data, then try the three practice datasets.
- Day 3: Use sklearn Pipelines to handle missing values and categorical encoding in one go. I am not adding separate practice datasets, since you can practice these concepts on the same notebooks I provided for Linear Regression and Logistic Regression.
- Day 4: Linear Regression. First go through this notebook, then go through the two notebooks I provided in Week 5. You can use the Ames housing and Boston housing datasets (from the Day 1 missing-values link) to predict prices, and the Automobile import dataset (from the Day 2 link) to predict automobile prices.
- Day 5: Logistic Regression. Revise Week 6, then practice your skills on the Titanic dataset to predict survival and on the Pima Indians dataset to predict diabetes.
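To show how the Day 1 to Day 3 ideas fit together with the Day 5 model, here is a minimal sketch of one pipeline that imputes missing values, encodes a categorical column, and fits a Logistic Regression. The toy table is made up: its column names only loosely echo the Titanic exercise, and the values are invented for illustration. The imputation strategies are assumptions too, not the ones used in the course notebooks.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data with made-up values; NaN marks a missing entry
df = pd.DataFrame({
    "age": [22.0, 38.0, np.nan, 35.0, 54.0, np.nan, 27.0, 14.0],
    "fare": [7.25, 71.28, 7.92, 53.10, 51.86, 8.46, 11.13, 30.07],
    "embarked": ["S", "C", "S", np.nan, "S", "Q", "S", "C"],
    "survived": [0, 1, 1, 1, 0, 0, 1, 1],
})
X = df.drop(columns="survived")
y = df["survived"]

# Numeric columns: fill gaps with the median, then standardise
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with the most frequent label, then one-hot encode
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# Route each group of columns to its own sub-pipeline
preprocess = ColumnTransformer([
    ("num", numeric, ["age", "fare"]),
    ("cat", categorical, ["embarked"]),
])

clf = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression()),
])
clf.fit(X, y)
predictions = clf.predict(X)
print(predictions)
```

On a real dataset you would split into train and test sets first and call `fit` on the training part only; the whole table is fitted here just to keep the sketch short.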