DATA SCIENCE

Project 1

Predicting the Sale Price of Bulldozers

Objectives:

Steps:

  1. Data pre-processing
    • Import data and parse dates
    • Sort dataframe by saledate
    • Add datetime features (year, month, day, etc.) derived from the saledate column
    • Convert strings to categories
    • Fill missing values
    • Turn categorical variables into numbers
    • Split data into train/validation sets
  2. Building a model
    • Build a machine learning model using random forest regressor
  3. Training a model
    • Train the model on the training dataset
  4. Model evaluation
    • Evaluate the model with the root mean squared log error (RMSLE) metric
  5. Tuning hyperparameters
    • Tune hyperparameters with RandomizedSearchCV
    • Train a model with the best hyperparameters
  6. Deployment
    • Make predictions on test data
    • Extract Feature Importance
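
The steps above can be sketched end to end as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the columns `state` and `hours`, the 2013 validation cut-off, and the small search grid are all assumptions made for the example. Only `saledate`, the random forest regressor, RMSLE, and RandomizedSearchCV come from the outline.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_log_error
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the bulldozer data (hypothetical columns and values).
# With a real CSV, dates could be parsed on import via
# pd.read_csv(path, parse_dates=["saledate"]).
rng = np.random.default_rng(42)
n = 200
state = rng.choice(["Texas", "Ohio", "Utah"], size=n).astype(object)
state[rng.random(n) < 0.1] = None          # inject missing categories
hours = rng.uniform(500, 5000, size=n)
hours[rng.random(n) < 0.2] = np.nan        # inject missing numbers
df = pd.DataFrame({
    "saledate": pd.date_range("2010-01-01", periods=n, freq="7D"),
    "state": state,
    "hours": hours,
    "SalePrice": rng.uniform(10_000, 80_000, size=n),
})

# Sort by date, then replace the raw date with numeric datetime features.
df = df.sort_values("saledate").reset_index(drop=True)
df["saleYear"] = df["saledate"].dt.year
df["saleMonth"] = df["saledate"].dt.month
df["saleDayOfWeek"] = df["saledate"].dt.dayofweek
df = df.drop(columns="saledate")

# Fill missing numbers with the median (computed over all rows for brevity)
# and keep a flag recording which rows were filled.
df["hours_is_missing"] = df["hours"].isna()
df["hours"] = df["hours"].fillna(df["hours"].median())

# Strings -> categories -> integer codes; missing values (code -1) become 0.
df["state_is_missing"] = df["state"].isna()
df["state"] = df["state"].astype("category").cat.codes + 1

# Time-based split: earlier years for training, the last year for validation.
train = df[df["saleYear"] < 2013]
valid = df[df["saleYear"] == 2013]
X_train, y_train = train.drop(columns="SalePrice"), train["SalePrice"]
X_valid, y_valid = valid.drop(columns="SalePrice"), valid["SalePrice"]

# Randomized search over a small, illustrative hyperparameter space.
param_distributions = {
    "n_estimators": [10, 20, 40],
    "max_depth": [None, 3, 5],
    "min_samples_leaf": [1, 3, 5],
}
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=42),
    param_distributions=param_distributions,
    n_iter=3,
    cv=3,
    random_state=42,
)
search.fit(X_train, y_train)
best_model = search.best_estimator_

# RMSLE: square root of the mean squared log error on the validation set.
preds = best_model.predict(X_valid)
rmsle = np.sqrt(mean_squared_log_error(y_valid, preds))

# Feature importances, largest first.
importances = pd.Series(
    best_model.feature_importances_, index=X_train.columns
).sort_values(ascending=False)
print(f"Validation RMSLE: {rmsle:.3f}")
print(importances.head())
```

Because the target here is random noise, the printed RMSLE and importances carry no meaning; the sketch only shows how the pieces fit together. Splitting by sale year rather than at random keeps the validation set strictly later than the training data, which matches the sort-by-saledate step.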

Outcomes:

Predicted sale price of bulldozers


Top feature importances


Project 2

Heart Disease Classification

Objectives:

Steps:

  1. Data pre-processing
    • Load data
    • Data exploration (Exploratory Data Analysis: EDA)
  2. Modelling
    • Model comparison
    • Hyperparameter tuning
  3. Hyperparameter tuning with RandomizedSearchCV
  4. Hyperparameter tuning with GridSearchCV
  5. Model evaluation
    • Calculate evaluation metrics using cross-validation
    • Feature importance
  6. Experimentation
    • Evaluation metric
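
A compact sketch of this workflow is shown below. It is illustrative only: the data comes from `make_classification` rather than the heart disease dataset, and the three candidate models, the search grids, and the choice of logistic regression for the final evaluation are assumptions. The outline itself only names RandomizedSearchCV, GridSearchCV, cross-validated metrics, feature importance, and a confusion matrix.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import (
    GridSearchCV,
    RandomizedSearchCV,
    cross_val_score,
    train_test_split,
)
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data standing in for the patient records.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Model comparison: fit each candidate and record its test accuracy.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(),
    "random_forest": RandomForestClassifier(random_state=42),
}
baseline_scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}

# Randomized search samples a few combinations from a larger space.
rs_forest = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={
        "n_estimators": [10, 50],
        "max_depth": [None, 3, 5],
        "min_samples_leaf": [1, 3, 5],
    },
    n_iter=3,
    cv=3,
    random_state=42,
)
rs_forest.fit(X_train, y_train)

# Grid search tries every value of the regularization strength C.
gs_log_reg = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": np.logspace(-4, 4, 10)},
    cv=5,
)
gs_log_reg.fit(X_train, y_train)
best_clf = gs_log_reg.best_estimator_

# Confusion matrix on the held-out test set (rows: true, columns: predicted).
cm = confusion_matrix(y_test, best_clf.predict(X_test))

# Cross-validated metrics computed over the whole dataset.
cv_metrics = {
    metric: cross_val_score(best_clf, X, y, cv=5, scoring=metric).mean()
    for metric in ["accuracy", "precision", "recall", "f1"]
}

# For logistic regression, the coefficients act as a feature importance proxy.
feature_coefs = best_clf.coef_[0]

print(baseline_scores)
print(cm)
print(cv_metrics)
```

Randomized search is cheap enough to narrow down a wide space first; grid search then checks a small range exhaustively. Reporting cross-validated metrics alongside the single test split gives a less split-dependent picture of the model.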

Outcomes:

Confusion matrix for heart disease prediction


Feature importance


Project 3

Multi-class Dog Breed Classification

Objectives:

Steps:

  1. Data pre-processing
    • Get data ready: images and labels
    • Create our own validation set
    • Preprocess images: turn them into tensors
    • Turn our data into batches
  2. Building a model
    • Build a model using Keras sequential model and MobileNetV2
  3. Training a model
    • Train the model on the training dataset
  4. Evaluating a model
    • Make predictions on validation data using a trained model
  5. Training a model with full data
  6. Deployment
    • Make predictions on test dataset and custom images
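
The data-handling parts of these steps can be sketched without a deep learning framework. The example below uses only NumPy to show label encoding, a validation split, batching, and turning a vector of prediction probabilities back into a breed name. The file names, breed list, batch size, and the helper names `make_batches` and `decode_prediction` are hypothetical. Image decoding and the Keras/MobileNetV2 model itself are deliberately left out, and the probability vector stands in for a real model output.

```python
import numpy as np

# Hypothetical image paths and their breed labels.
labels = np.array(
    ["beagle", "pug", "boxer", "pug", "beagle", "boxer", "pug", "beagle"]
)
filenames = np.array([f"train/img_{i}.jpg" for i in range(len(labels))])

# Encode each label as a boolean one-hot row over the sorted unique breeds.
unique_breeds = np.unique(labels)
one_hot = labels[:, None] == unique_breeds[None, :]

# Create a validation set by shuffling indices and holding out 25%.
rng = np.random.default_rng(42)
order = rng.permutation(len(labels))
split = int(0.75 * len(labels))
train_idx, valid_idx = order[:split], order[split:]


def make_batches(items, targets, batch_size=4):
    """Yield (items, targets) slices of at most batch_size elements."""
    for start in range(0, len(items), batch_size):
        stop = start + batch_size
        yield items[start:stop], targets[start:stop]


def decode_prediction(probabilities):
    """Return the most probable breed and its probability."""
    index = int(np.argmax(probabilities))
    return str(unique_breeds[index]), float(probabilities[index])


train_batches = list(make_batches(filenames[train_idx], one_hot[train_idx]))

# A made-up probability vector, one entry per breed in unique_breeds order.
fake_probabilities = np.array([0.1, 0.2, 0.7])
breed, confidence = decode_prediction(fake_probabilities)
print(breed, confidence)
```

In a real pipeline the batches would hold image tensors rather than file paths, and the probability vector would come from the model's softmax output, with one entry per breed in the same order as `unique_breeds`.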

Outcomes:

Predicted dog breed with prediction probabilities