Diffstat (limited to 'Data Prediction/Tele Churn/.ipynb_checkpoints')
-rw-r--r-- Data Prediction/Tele Churn/.ipynb_checkpoints/Customer-Churn-Prediction-checkpoint.ipynb | 3283
-rw-r--r-- Data Prediction/Tele Churn/.ipynb_checkpoints/tele_churn-checkpoint.ipynb | 5535
2 files changed, 8818 insertions, 0 deletions
diff --git a/Data Prediction/Tele Churn/.ipynb_checkpoints/Customer-Churn-Prediction-checkpoint.ipynb b/Data Prediction/Tele Churn/.ipynb_checkpoints/Customer-Churn-Prediction-checkpoint.ipynb
new file mode 100644
index 0000000..b601aff
--- /dev/null
+++ b/Data Prediction/Tele Churn/.ipynb_checkpoints/Customer-Churn-Prediction-checkpoint.ipynb
@@ -0,0 +1,3283 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Introduction \n",
+ "## Customer Churn Prediction"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Customer attrition, or churn, occurs when customers stop doing business with a company. Churn can significantly reduce revenue, so it is crucial for a business to understand why customers leave and to take steps to retain them. One approach is to identify customer segments that are at risk of leaving and target them with retention strategies. With data and machine learning techniques, a company can also predict which customers are likely to leave and act before they do.\n",
+ "\n",
+ "We are going to build a basic model for predicting customer churn using the [Telco Customer Churn dataset](https://www.kaggle.com/blastchar/telco-customer-churn). We model churned customers with several classification algorithms (logistic regression, a support vector classifier, a decision tree, and KNN), using Python tools such as pandas for data manipulation and matplotlib for visualization.\n",
+ "\n",
+ "\n",
+ "Let's get started."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Steps Involved to Predict Customer Churn\n",
+ "- Importing Libraries\n",
+ "- Loading Dataset\n",
+ "- Exploratory Data Analysis\n",
+ "- Outliers using IQR method\n",
+ "- Cleaning and Transforming Data\n",
+ " - One-hot Encoding\n",
+ " - Rearranging Columns\n",
+ " - Feature Scaling\n",
+ " - Feature Selection\n",
+ "- Prediction using Logistic Regression\n",
+ "- Prediction using Support Vector Classifier\n",
+ "- Prediction using Decision Tree Classifier\n",
+ "- Prediction using KNN Classifier"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Importing Libraries\n",
+ "\n",
+ "First, we import the necessary libraries."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#import platform\n",
+ "import pandas as pd\n",
+ "import sklearn\n",
+ "import numpy as np\n",
+ "#import graphviz\n",
+ "import seaborn as sns\n",
+ "import matplotlib\n",
+ "import matplotlib.pyplot as plt\n",
+ "# import plotly.express as px\n",
+ "# import plotly.graph_objects as go\n",
+ "\n",
+ "%matplotlib inline"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Loading Dataset\n",
+ "We use pandas to read the dataset and preprocess it."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "(7043, 21)"
+ ]
+ },
+ "execution_count": 2,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df = pd.read_csv('WA_Fn-UseC_-Telco-Customer-Churn.csv')\n",
+ "df.shape"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Exploratory Data Analysis"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>customerID</th>\n",
+ " <th>gender</th>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <th>Partner</th>\n",
+ " <th>Dependents</th>\n",
+ " <th>tenure</th>\n",
+ " <th>PhoneService</th>\n",
+ " <th>MultipleLines</th>\n",
+ " <th>InternetService</th>\n",
+ " <th>OnlineSecurity</th>\n",
+ " <th>...</th>\n",
+ " <th>DeviceProtection</th>\n",
+ " <th>TechSupport</th>\n",
+ " <th>StreamingTV</th>\n",
+ " <th>StreamingMovies</th>\n",
+ " <th>Contract</th>\n",
+ " <th>PaperlessBilling</th>\n",
+ " <th>PaymentMethod</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " <th>Churn</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>7590-VHVEG</td>\n",
+ " <td>Female</td>\n",
+ " <td>0</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>1</td>\n",
+ " <td>No</td>\n",
+ " <td>No phone service</td>\n",
+ " <td>DSL</td>\n",
+ " <td>No</td>\n",
+ " <td>...</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Electronic check</td>\n",
+ " <td>29.85</td>\n",
+ " <td>29.85</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>1</th>\n",
+ " <td>5575-GNVDE</td>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>34</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>...</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>One year</td>\n",
+ " <td>No</td>\n",
+ " <td>Mailed check</td>\n",
+ " <td>56.95</td>\n",
+ " <td>1889.5</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>2</th>\n",
+ " <td>3668-QPYBK</td>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>2</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>...</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Mailed check</td>\n",
+ " <td>53.85</td>\n",
+ " <td>108.15</td>\n",
+ " <td>Yes</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>3</th>\n",
+ " <td>7795-CFOCW</td>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>45</td>\n",
+ " <td>No</td>\n",
+ " <td>No phone service</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>...</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>One year</td>\n",
+ " <td>No</td>\n",
+ " <td>Bank transfer (automatic)</td>\n",
+ " <td>42.30</td>\n",
+ " <td>1840.75</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>4</th>\n",
+ " <td>9237-HQITU</td>\n",
+ " <td>Female</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>2</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Fiber optic</td>\n",
+ " <td>No</td>\n",
+ " <td>...</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Electronic check</td>\n",
+ " <td>70.70</td>\n",
+ " <td>151.65</td>\n",
+ " <td>Yes</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "<p>5 rows × 21 columns</p>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " customerID gender SeniorCitizen Partner Dependents tenure PhoneService \\\n",
+ "0 7590-VHVEG Female 0 Yes No 1 No \n",
+ "1 5575-GNVDE Male 0 No No 34 Yes \n",
+ "2 3668-QPYBK Male 0 No No 2 Yes \n",
+ "3 7795-CFOCW Male 0 No No 45 No \n",
+ "4 9237-HQITU Female 0 No No 2 Yes \n",
+ "\n",
+ " MultipleLines InternetService OnlineSecurity ... DeviceProtection \\\n",
+ "0 No phone service DSL No ... No \n",
+ "1 No DSL Yes ... Yes \n",
+ "2 No DSL Yes ... No \n",
+ "3 No phone service DSL Yes ... Yes \n",
+ "4 No Fiber optic No ... No \n",
+ "\n",
+ " TechSupport StreamingTV StreamingMovies Contract PaperlessBilling \\\n",
+ "0 No No No Month-to-month Yes \n",
+ "1 No No No One year No \n",
+ "2 No No No Month-to-month Yes \n",
+ "3 Yes No No One year No \n",
+ "4 No No No Month-to-month Yes \n",
+ "\n",
+ " PaymentMethod MonthlyCharges TotalCharges Churn \n",
+ "0 Electronic check 29.85 29.85 No \n",
+ "1 Mailed check 56.95 1889.5 No \n",
+ "2 Mailed check 53.85 108.15 Yes \n",
+ "3 Bank transfer (automatic) 42.30 1840.75 No \n",
+ "4 Electronic check 70.70 151.65 Yes \n",
+ "\n",
+ "[5 rows x 21 columns]"
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.018231Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.017819Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.052282Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.051336Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.018175Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>customerID</th>\n",
+ " <th>gender</th>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <th>Partner</th>\n",
+ " <th>Dependents</th>\n",
+ " <th>tenure</th>\n",
+ " <th>PhoneService</th>\n",
+ " <th>MultipleLines</th>\n",
+ " <th>InternetService</th>\n",
+ " <th>OnlineSecurity</th>\n",
+ " <th>...</th>\n",
+ " <th>DeviceProtection</th>\n",
+ " <th>TechSupport</th>\n",
+ " <th>StreamingTV</th>\n",
+ " <th>StreamingMovies</th>\n",
+ " <th>Contract</th>\n",
+ " <th>PaperlessBilling</th>\n",
+ " <th>PaymentMethod</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " <th>Churn</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>7038</th>\n",
+ " <td>6840-RESVB</td>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>24</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>...</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>One year</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Mailed check</td>\n",
+ " <td>84.80</td>\n",
+ " <td>1990.5</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>7039</th>\n",
+ " <td>2234-XADUH</td>\n",
+ " <td>Female</td>\n",
+ " <td>0</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>72</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Fiber optic</td>\n",
+ " <td>No</td>\n",
+ " <td>...</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>One year</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Credit card (automatic)</td>\n",
+ " <td>103.20</td>\n",
+ " <td>7362.9</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>7040</th>\n",
+ " <td>4801-JZAZL</td>\n",
+ " <td>Female</td>\n",
+ " <td>0</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>11</td>\n",
+ " <td>No</td>\n",
+ " <td>No phone service</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>...</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Electronic check</td>\n",
+ " <td>29.60</td>\n",
+ " <td>346.45</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>7041</th>\n",
+ " <td>8361-LTMKD</td>\n",
+ " <td>Male</td>\n",
+ " <td>1</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>4</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Fiber optic</td>\n",
+ " <td>No</td>\n",
+ " <td>...</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Mailed check</td>\n",
+ " <td>74.40</td>\n",
+ " <td>306.6</td>\n",
+ " <td>Yes</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>7042</th>\n",
+ " <td>3186-AJIEK</td>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>66</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Fiber optic</td>\n",
+ " <td>Yes</td>\n",
+ " <td>...</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Two year</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Bank transfer (automatic)</td>\n",
+ " <td>105.65</td>\n",
+ " <td>6844.5</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "<p>5 rows × 21 columns</p>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " customerID gender SeniorCitizen Partner Dependents tenure \\\n",
+ "7038 6840-RESVB Male 0 Yes Yes 24 \n",
+ "7039 2234-XADUH Female 0 Yes Yes 72 \n",
+ "7040 4801-JZAZL Female 0 Yes Yes 11 \n",
+ "7041 8361-LTMKD Male 1 Yes No 4 \n",
+ "7042 3186-AJIEK Male 0 No No 66 \n",
+ "\n",
+ " PhoneService MultipleLines InternetService OnlineSecurity ... \\\n",
+ "7038 Yes Yes DSL Yes ... \n",
+ "7039 Yes Yes Fiber optic No ... \n",
+ "7040 No No phone service DSL Yes ... \n",
+ "7041 Yes Yes Fiber optic No ... \n",
+ "7042 Yes No Fiber optic Yes ... \n",
+ "\n",
+ " DeviceProtection TechSupport StreamingTV StreamingMovies Contract \\\n",
+ "7038 Yes Yes Yes Yes One year \n",
+ "7039 Yes No Yes Yes One year \n",
+ "7040 No No No No Month-to-month \n",
+ "7041 No No No No Month-to-month \n",
+ "7042 Yes Yes Yes Yes Two year \n",
+ "\n",
+ " PaperlessBilling PaymentMethod MonthlyCharges TotalCharges \\\n",
+ "7038 Yes Mailed check 84.80 1990.5 \n",
+ "7039 Yes Credit card (automatic) 103.20 7362.9 \n",
+ "7040 Yes Electronic check 29.60 346.45 \n",
+ "7041 Yes Mailed check 74.40 306.6 \n",
+ "7042 Yes Bank transfer (automatic) 105.65 6844.5 \n",
+ "\n",
+ " Churn \n",
+ "7038 No \n",
+ "7039 No \n",
+ "7040 No \n",
+ "7041 Yes \n",
+ "7042 No \n",
+ "\n",
+ "[5 rows x 21 columns]"
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.tail()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.079833Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.078995Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.090558Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.089462Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.079771Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "(7043, 21)"
+ ]
+ },
+ "execution_count": 5,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.shape"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The dataset has two types of features: categorical (two or more values with no inherent order) and numerical. Most of the feature names are self-explanatory, except for:\n",
+ " - Partner: whether the customer has a partner or not (Yes, No),\n",
+ " - Dependents: whether the customer has dependents or not (Yes, No),\n",
+ " - OnlineBackup: whether the customer has online backup or not (Yes, No, No internet service),\n",
+ " - tenure: number of months the customer has stayed with the company,\n",
+ " - MonthlyCharges: the amount charged to the customer monthly,\n",
+ " - TotalCharges: the total amount charged to the customer.\n",
+ " \n",
+ "There are 7043 customers in the dataset and 19 features, excluding customerID (non-informative) and the Churn column (target variable). Most of the categorical features have 4 or fewer unique values."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.093002Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.092646Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.101858Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.100608Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.092944Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "147903"
+ ]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.size"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.055811Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.055339Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.065207Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.064137Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.055751Z"
+ },
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "customerID object\n",
+ "gender object\n",
+ "SeniorCitizen int64\n",
+ "Partner object\n",
+ "Dependents object\n",
+ "tenure int64\n",
+ "PhoneService object\n",
+ "MultipleLines object\n",
+ "InternetService object\n",
+ "OnlineSecurity object\n",
+ "OnlineBackup object\n",
+ "DeviceProtection object\n",
+ "TechSupport object\n",
+ "StreamingTV object\n",
+ "StreamingMovies object\n",
+ "Contract object\n",
+ "PaperlessBilling object\n",
+ "PaymentMethod object\n",
+ "MonthlyCharges float64\n",
+ "TotalCharges object\n",
+ "Churn object\n",
+ "dtype: object"
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.dtypes"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "TotalCharges is stored with the object dtype, but its values are numeric, so it should be a float column."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.067769Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.067117Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.076918Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.075769Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.067723Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Index(['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents',\n",
+ " 'tenure', 'PhoneService', 'MultipleLines', 'InternetService',\n",
+ " 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport',\n",
+ " 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling',\n",
+ " 'PaymentMethod', 'MonthlyCharges', 'TotalCharges', 'Churn'],\n",
+ " dtype='object')"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.columns"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.105839Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.104115Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.143193Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.142163Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.105792Z"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "<class 'pandas.core.frame.DataFrame'>\n",
+ "RangeIndex: 7043 entries, 0 to 7042\n",
+ "Data columns (total 21 columns):\n",
+ " # Column Non-Null Count Dtype \n",
+ "--- ------ -------------- ----- \n",
+ " 0 customerID 7043 non-null object \n",
+ " 1 gender 7043 non-null object \n",
+ " 2 SeniorCitizen 7043 non-null int64 \n",
+ " 3 Partner 7043 non-null object \n",
+ " 4 Dependents 7043 non-null object \n",
+ " 5 tenure 7043 non-null int64 \n",
+ " 6 PhoneService 7043 non-null object \n",
+ " 7 MultipleLines 7043 non-null object \n",
+ " 8 InternetService 7043 non-null object \n",
+ " 9 OnlineSecurity 7043 non-null object \n",
+ " 10 OnlineBackup 7043 non-null object \n",
+ " 11 DeviceProtection 7043 non-null object \n",
+ " 12 TechSupport 7043 non-null object \n",
+ " 13 StreamingTV 7043 non-null object \n",
+ " 14 StreamingMovies 7043 non-null object \n",
+ " 15 Contract 7043 non-null object \n",
+ " 16 PaperlessBilling 7043 non-null object \n",
+ " 17 PaymentMethod 7043 non-null object \n",
+ " 18 MonthlyCharges 7043 non-null float64\n",
+ " 19 TotalCharges 7043 non-null object \n",
+ " 20 Churn 7043 non-null object \n",
+ "dtypes: float64(1), int64(2), object(18)\n",
+ "memory usage: 1.1+ MB\n"
+ ]
+ }
+ ],
+ "source": [
+ "df.info()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.176933Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.176295Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.202429Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.201454Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.176874Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "customerID 0\n",
+ "gender 0\n",
+ "SeniorCitizen 0\n",
+ "Partner 0\n",
+ "Dependents 0\n",
+ "tenure 0\n",
+ "PhoneService 0\n",
+ "MultipleLines 0\n",
+ "InternetService 0\n",
+ "OnlineSecurity 0\n",
+ "OnlineBackup 0\n",
+ "DeviceProtection 0\n",
+ "TechSupport 0\n",
+ "StreamingTV 0\n",
+ "StreamingMovies 0\n",
+ "Contract 0\n",
+ "PaperlessBilling 0\n",
+ "PaymentMethod 0\n",
+ "MonthlyCharges 0\n",
+ "TotalCharges 0\n",
+ "Churn 0\n",
+ "dtype: int64"
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.isnull().sum()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.205070Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.203846Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.233001Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.231899Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.205022Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0"
+ ]
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.duplicated().sum()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Basic Data Cleaning: \n",
+ "As observed above, TotalCharges is stored with the object dtype even though its values are numeric. We convert it to float here."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "dtype('O')"
+ ]
+ },
+ "execution_count": 12,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df['TotalCharges'].dtype"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.290044Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.289662Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.301523Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.300033Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.289998Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "df['TotalCharges'] = pd.to_numeric(df['TotalCharges'],errors = 'coerce')"
+ ]
+ },
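+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "With `errors='coerce'`, any entry that cannot be parsed as a number becomes `NaN` instead of raising an error, so missing values can appear even though `df.isnull().sum()` reported none before the conversion. The check below is a minimal sketch, not part of the original analysis, that counts the coerced entries and lists the affected rows; the choice of columns to display is only illustrative."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Illustrative check: how many TotalCharges values were coerced to NaN?\n",
+ "n_missing = df['TotalCharges'].isna().sum()\n",
+ "print(f\"TotalCharges values coerced to NaN: {n_missing}\")\n",
+ "\n",
+ "# Inspect the affected rows before deciding whether to drop or impute them.\n",
+ "df.loc[df['TotalCharges'].isna(), ['customerID', 'tenure', 'MonthlyCharges', 'TotalCharges']]"
+ ]
+ },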
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "dtype('float64')"
+ ]
+ },
+ "execution_count": 14,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df['TotalCharges'].dtype"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "categorical_features = [\n",
+ " \"gender\",\n",
+ " \"SeniorCitizen\",\n",
+ " \"Partner\",\n",
+ " \"Dependents\",\n",
+ " \"PhoneService\",\n",
+ " \"MultipleLines\",\n",
+ " \"InternetService\",\n",
+ " \"OnlineSecurity\",\n",
+ " \"OnlineBackup\",\n",
+ " \"DeviceProtection\",\n",
+ " \"TechSupport\",\n",
+ " \"StreamingTV\",\n",
+ " \"StreamingMovies\",\n",
+ " \"Contract\",\n",
+ " \"PaperlessBilling\",\n",
+ " \"PaymentMethod\",\n",
+ "]\n",
+ "numerical_features = [\"tenure\", \"MonthlyCharges\", \"TotalCharges\"]\n",
+ "target = \"Churn\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.235534Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.234920Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.262979Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.261969Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.235471Z"
+ },
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "SeniorCitizen 1.833633\n",
+ "tenure 0.239540\n",
+ "MonthlyCharges -0.220524\n",
+ "TotalCharges 0.961642\n",
+ "dtype: float64"
+ ]
+ },
+ "execution_count": 16,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.skew(numeric_only= True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:52:43.269333Z",
+ "iopub.status.busy": "2021-11-09T03:52:43.268524Z",
+ "iopub.status.idle": "2021-11-09T03:52:43.287626Z",
+ "shell.execute_reply": "2021-11-09T03:52:43.286653Z",
+ "shell.execute_reply.started": "2021-11-09T03:52:43.269284Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <th>tenure</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <td>1.000000</td>\n",
+ " <td>0.016567</td>\n",
+ " <td>0.220173</td>\n",
+ " <td>0.102411</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>tenure</th>\n",
+ " <td>0.016567</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>0.247900</td>\n",
+ " <td>0.825880</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <td>0.220173</td>\n",
+ " <td>0.247900</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>0.651065</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>TotalCharges</th>\n",
+ " <td>0.102411</td>\n",
+ " <td>0.825880</td>\n",
+ " <td>0.651065</td>\n",
+ " <td>1.000000</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " SeniorCitizen tenure MonthlyCharges TotalCharges\n",
+ "SeniorCitizen 1.000000 0.016567 0.220173 0.102411\n",
+ "tenure 0.016567 1.000000 0.247900 0.825880\n",
+ "MonthlyCharges 0.220173 0.247900 1.000000 0.651065\n",
+ "TotalCharges 0.102411 0.825880 0.651065 1.000000"
+ ]
+ },
+ "execution_count": 17,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.corr(numeric_only= True)"
+ ]
+ },
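+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The correlation matrix can be easier to scan as a heatmap. The cell below is an illustrative sketch, not part of the original analysis, that reuses the `sns` and `plt` imports from the top of the notebook; the figure size and colormap are arbitrary choices. Keep in mind that `SeniorCitizen` is a 0/1 flag, so its Pearson correlations are only a rough indicator."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Illustrative sketch: visualize the pairwise Pearson correlations computed above.\n",
+ "corr = df.corr(numeric_only=True)\n",
+ "plt.figure(figsize=(6, 5))\n",
+ "sns.heatmap(corr, annot=True, fmt='.2f', cmap='coolwarm', vmin=-1, vmax=1)\n",
+ "plt.title('Correlation between numeric columns')\n",
+ "plt.show()"
+ ]
+ },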
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Feature distribution"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We plot distributions of the numerical and categorical features to check for outliers and to compare feature distributions against the target variable."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Numerical features distribution\n",
+ "\n",
+ "Numeric summaries (mean, standard deviation, etc.) do not reveal spikes or the shape of a distribution, and they make outliers hard to spot. That is why we use histograms."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>tenure</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>count</th>\n",
+ " <td>7043.000000</td>\n",
+ " <td>7043.000000</td>\n",
+ " <td>7032.000000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>mean</th>\n",
+ " <td>32.371149</td>\n",
+ " <td>64.761692</td>\n",
+ " <td>2283.300441</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>std</th>\n",
+ " <td>24.559481</td>\n",
+ " <td>30.090047</td>\n",
+ " <td>2266.771362</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>min</th>\n",
+ " <td>0.000000</td>\n",
+ " <td>18.250000</td>\n",
+ " <td>18.800000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>25%</th>\n",
+ " <td>9.000000</td>\n",
+ " <td>35.500000</td>\n",
+ " <td>401.450000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>50%</th>\n",
+ " <td>29.000000</td>\n",
+ " <td>70.350000</td>\n",
+ " <td>1397.475000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>75%</th>\n",
+ " <td>55.000000</td>\n",
+ " <td>89.850000</td>\n",
+ " <td>3794.737500</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>max</th>\n",
+ " <td>72.000000</td>\n",
+ " <td>118.750000</td>\n",
+ " <td>8684.800000</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " tenure MonthlyCharges TotalCharges\n",
+ "count 7043.000000 7043.000000 7032.000000\n",
+ "mean 32.371149 64.761692 2283.300441\n",
+ "std 24.559481 30.090047 2266.771362\n",
+ "min 0.000000 18.250000 18.800000\n",
+ "25% 9.000000 35.500000 401.450000\n",
+ "50% 29.000000 70.350000 1397.475000\n",
+ "75% 55.000000 89.850000 3794.737500\n",
+ "max 72.000000 118.750000 8684.800000"
+ ]
+ },
+ "execution_count": 18,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df[numerical_features].describe()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "array([[<AxesSubplot: title={'center': 'tenure'}>,\n",
+ " <AxesSubplot: title={'center': 'MonthlyCharges'}>],\n",
+ " [<AxesSubplot: title={'center': 'TotalCharges'}>, <AxesSubplot: >]],\n",
+ " dtype=object)"
+ ]
+ },
+ "execution_count": 19,
+ "metadata": {},
+ "output_type": "execute_result"
+ },
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "<Figure size 1000x700 with 4 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "df[numerical_features].hist(bins=30, figsize=(10, 7))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We look at the distributions of numerical features in relation to the target variable. We can observe that the greater TotalCharges and tenure are, the lower the probability of churn."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "array([<AxesSubplot: title={'center': 'tenure'}>,\n",
+ " <AxesSubplot: title={'center': 'MonthlyCharges'}>,\n",
+ " <AxesSubplot: title={'center': 'TotalCharges'}>], dtype=object)"
+ ]
+ },
+ "execution_count": 20,
+ "metadata": {},
+ "output_type": "execute_result"
+ },
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "<Figure size 1400x400 with 3 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "fig, ax = plt.subplots(1, 3, figsize=(14, 4))\n",
+ "df[df.Churn == \"No\"][numerical_features].hist(bins=30, color=\"blue\", alpha=0.5, ax=ax)\n",
+ "df[df.Churn == \"Yes\"][numerical_features].hist(bins=30, color=\"red\", alpha=0.5, ax=ax)"
+ ]
+ },
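+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The overlaid histograms can be backed up with per-class summary statistics. The cell below is a minimal sketch, not part of the original analysis, that compares the median of each numerical feature for retained and churned customers. The median is used here only because it is less sensitive to skew than the mean."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Illustrative sketch: median of each numerical feature per churn class.\n",
+ "# NaN values in TotalCharges are skipped by default.\n",
+ "df.groupby('Churn')[numerical_features].median()"
+ ]
+ },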
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Categorical feature distribution\n",
+ "\n",
+ "To analyze categorical features, we use bar charts. We observe that senior citizens and customers without phone service are less represented in the data."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "<Figure size 1900x1900 with 16 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "ROWS, COLS = 4, 4\n",
+ "fig, ax = plt.subplots(ROWS, COLS, figsize=(19, 19))\n",
+ "for i, categorical_feature in enumerate(categorical_features):\n",
+ "    # Map the running index to a (row, col) position in the grid\n",
+ "    row, col = divmod(i, COLS)\n",
+ "    df[categorical_feature].value_counts().plot(kind='bar', ax=ax[row, col]).set_title(categorical_feature)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The next step is to look at categorical features in relation to the target variable. We do this only for the Contract feature. Users who have a month-to-month contract are more likely to churn than users with long-term contracts."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 22,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Text(0.5, 1.0, 'churned')"
+ ]
+ },
+ "execution_count": 22,
+ "metadata": {},
+ "output_type": "execute_result"
+ },
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "<Figure size 1200x400 with 2 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "feature = 'Contract'\n",
+ "fig, ax = plt.subplots(1, 2, figsize=(12, 4))\n",
+ "df[df.Churn == \"No\"][feature].value_counts().plot(kind='bar', ax=ax[0]).set_title('not churned')\n",
+ "df[df.Churn == \"Yes\"][feature].value_counts().plot(kind='bar', ax=ax[1]).set_title('churned')"
+ ]
+ },
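+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The two bar charts above show raw counts, so the churn rate per contract type has to be read off by eye. As a rough sketch (not part of the original analysis), the same comparison could be expressed as row-normalized shares with `pd.crosstab`. The `toy` frame below is a made-up stand-in that only reuses the `Contract` and `Churn` column names; on the real data one would presumably pass `df['Contract']` and `df['Churn']` instead.\n",
+ "\n",
+ "```python\n",
+ "import pandas as pd\n",
+ "\n",
+ "# Hypothetical stand-in for df: column names follow the notebook, values are invented.\n",
+ "toy = pd.DataFrame({\n",
+ "    'Contract': ['Month-to-month', 'Month-to-month', 'One year', 'Two year'],\n",
+ "    'Churn': ['Yes', 'No', 'No', 'No'],\n",
+ "})\n",
+ "\n",
+ "# Share of churned and retained customers within each contract type (each row sums to 1).\n",
+ "churn_rate = pd.crosstab(toy['Contract'], toy['Churn'], normalize='index')\n",
+ "print(churn_rate)\n",
+ "```\n",
+ "\n",
+ "Normalizing by row makes contract types of very different sizes directly comparable, which the count-based bar charts do not."
+ ]
+ },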
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Target variable distribution"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 23,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Text(0.5, 1.0, 'churned')"
+ ]
+ },
+ "execution_count": 23,
+ "metadata": {},
+ "output_type": "execute_result"
+ },
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "df[target].value_counts().plot(kind='bar').set_title('churned')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The target variable distribution shows that we are dealing with an imbalanced problem: there are many more non-churned than churned users. A model could therefore achieve high accuracy simply by predicting the majority class most of the time (users who didn't churn, in our case).\n",
+ "\n",
+ "A few things we can do to minimize the influence of the imbalanced dataset:\n",
+ "- resample the data,\n",
+ "- collect more samples,\n",
+ "- use precision and recall, rather than accuracy alone, as evaluation metrics."
+ ]
+ },
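+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To illustrate why accuracy alone can mislead on imbalanced data, the sketch below scores a hypothetical classifier that always predicts the majority class. The labels are invented for illustration (1 = churned, 0 = stayed), and defining precision as 0 when no positive predictions are made is a convention assumed here, not something taken from this notebook.\n",
+ "\n",
+ "```python\n",
+ "# Invented labels: 7 customers stayed (0), 3 churned (1).\n",
+ "y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]\n",
+ "# A degenerate classifier that always predicts the majority class.\n",
+ "y_pred = [0] * len(y_true)\n",
+ "\n",
+ "pairs = list(zip(y_true, y_pred))\n",
+ "tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives\n",
+ "fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives\n",
+ "fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives\n",
+ "\n",
+ "accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)\n",
+ "precision = tp / (tp + fp) if (tp + fp) else 0.0  # assumed convention for 0/0\n",
+ "recall = tp / (tp + fn) if (tp + fn) else 0.0\n",
+ "\n",
+ "print('accuracy:', accuracy, 'precision:', precision, 'recall:', recall)\n",
+ "```\n",
+ "\n",
+ "The classifier looks reasonable by accuracy yet finds none of the churned customers, which is exactly what recall exposes."
+ ]
+ },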
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Outliers Analysis with IQR Method"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:14.876626Z",
+ "iopub.status.busy": "2021-11-09T03:53:14.875430Z",
+ "iopub.status.idle": "2021-11-09T03:53:14.900303Z",
+ "shell.execute_reply": "2021-11-09T03:53:14.899071Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:14.876576Z"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "No outliers in tenure\n",
+ "No outliers in MonthlyCharges\n"
+ ]
+ }
+ ],
+ "source": [
+ "x = ['tenure','MonthlyCharges']\n",
+ "a = []  # columns in which outliers were found\n",
+ "def count_outliers(data, col):\n",
+ "    # IQR rule: values outside [q1 - 1.5*IQR, q3 + 1.5*IQR] count as outliers\n",
+ "    q1 = data[col].quantile(0.25, interpolation='nearest')\n",
+ "    q3 = data[col].quantile(0.75, interpolation='nearest')\n",
+ "    IQR = q3 - q1\n",
+ "    global LLP, ULP  # kept global so the limits can be inspected afterwards\n",
+ "    LLP = q1 - 1.5*IQR\n",
+ "    ULP = q3 + 1.5*IQR\n",
+ "    n_low = (data[col] < LLP).sum()   # values below the lower limit\n",
+ "    n_high = (data[col] > ULP).sum()  # values above the upper limit\n",
+ "    if n_low + n_high == 0:\n",
+ "        print(\"No outliers in\", col)\n",
+ "    else:\n",
+ "        print(\"There are outliers in\", col)\n",
+ "        a.append(col)\n",
+ "        print('Number of outliers:', n_low + n_high)\n",
+ "for i in x:\n",
+ "    count_outliers(df, i)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Cleaning and Transforming Data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:14.902614Z",
+ "iopub.status.busy": "2021-11-09T03:53:14.902166Z",
+ "iopub.status.idle": "2021-11-09T03:53:14.911726Z",
+ "shell.execute_reply": "2021-11-09T03:53:14.910394Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:14.902565Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "df.drop(['customerID'],axis = 1,inplace = True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:14.914366Z",
+ "iopub.status.busy": "2021-11-09T03:53:14.914012Z",
+ "iopub.status.idle": "2021-11-09T03:53:14.952158Z",
+ "shell.execute_reply": "2021-11-09T03:53:14.951160Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:14.914319Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>gender</th>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <th>Partner</th>\n",
+ " <th>Dependents</th>\n",
+ " <th>tenure</th>\n",
+ " <th>PhoneService</th>\n",
+ " <th>MultipleLines</th>\n",
+ " <th>InternetService</th>\n",
+ " <th>OnlineSecurity</th>\n",
+ " <th>OnlineBackup</th>\n",
+ " <th>DeviceProtection</th>\n",
+ " <th>TechSupport</th>\n",
+ " <th>StreamingTV</th>\n",
+ " <th>StreamingMovies</th>\n",
+ " <th>Contract</th>\n",
+ " <th>PaperlessBilling</th>\n",
+ " <th>PaymentMethod</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " <th>Churn</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>Female</td>\n",
+ " <td>0</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>1</td>\n",
+ " <td>No</td>\n",
+ " <td>No phone service</td>\n",
+ " <td>DSL</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Electronic check</td>\n",
+ " <td>29.85</td>\n",
+ " <td>29.85</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>1</th>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>34</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>One year</td>\n",
+ " <td>No</td>\n",
+ " <td>Mailed check</td>\n",
+ " <td>56.95</td>\n",
+ " <td>1889.50</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>2</th>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>2</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Mailed check</td>\n",
+ " <td>53.85</td>\n",
+ " <td>108.15</td>\n",
+ " <td>Yes</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>3</th>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>45</td>\n",
+ " <td>No</td>\n",
+ " <td>No phone service</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>One year</td>\n",
+ " <td>No</td>\n",
+ " <td>Bank transfer (automatic)</td>\n",
+ " <td>42.30</td>\n",
+ " <td>1840.75</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>4</th>\n",
+ " <td>Female</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>2</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Fiber optic</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Electronic check</td>\n",
+ " <td>70.70</td>\n",
+ " <td>151.65</td>\n",
+ " <td>Yes</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " gender SeniorCitizen Partner Dependents tenure PhoneService \\\n",
+ "0 Female 0 Yes No 1 No \n",
+ "1 Male 0 No No 34 Yes \n",
+ "2 Male 0 No No 2 Yes \n",
+ "3 Male 0 No No 45 No \n",
+ "4 Female 0 No No 2 Yes \n",
+ "\n",
+ " MultipleLines InternetService OnlineSecurity OnlineBackup \\\n",
+ "0 No phone service DSL No Yes \n",
+ "1 No DSL Yes No \n",
+ "2 No DSL Yes Yes \n",
+ "3 No phone service DSL Yes No \n",
+ "4 No Fiber optic No No \n",
+ "\n",
+ " DeviceProtection TechSupport StreamingTV StreamingMovies Contract \\\n",
+ "0 No No No No Month-to-month \n",
+ "1 Yes No No No One year \n",
+ "2 No No No No Month-to-month \n",
+ "3 Yes Yes No No One year \n",
+ "4 No No No No Month-to-month \n",
+ "\n",
+ " PaperlessBilling PaymentMethod MonthlyCharges TotalCharges \\\n",
+ "0 Yes Electronic check 29.85 29.85 \n",
+ "1 No Mailed check 56.95 1889.50 \n",
+ "2 Yes Mailed check 53.85 108.15 \n",
+ "3 No Bank transfer (automatic) 42.30 1840.75 \n",
+ "4 Yes Electronic check 70.70 151.65 \n",
+ "\n",
+ " Churn \n",
+ "0 No \n",
+ "1 No \n",
+ "2 Yes \n",
+ "3 No \n",
+ "4 Yes "
+ ]
+ },
+ "execution_count": 26,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.head()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Dropped customerID because it is a unique identifier and carries no predictive information"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### One-Hot Encoding"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:14.954613Z",
+ "iopub.status.busy": "2021-11-09T03:53:14.953998Z",
+ "iopub.status.idle": "2021-11-09T03:53:15.014837Z",
+ "shell.execute_reply": "2021-11-09T03:53:15.013920Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:14.954564Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "df1=pd.get_dummies(data=df,columns=['gender', 'Partner', 'Dependents', \n",
+ " 'PhoneService', 'MultipleLines', 'InternetService', 'OnlineSecurity',\n",
+ " 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV',\n",
+ " 'StreamingMovies', 'Contract', 'PaperlessBilling', 'PaymentMethod', 'Churn'], drop_first=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <th>tenure</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " <th>gender_Male</th>\n",
+ " <th>Partner_Yes</th>\n",
+ " <th>Dependents_Yes</th>\n",
+ " <th>PhoneService_Yes</th>\n",
+ " <th>MultipleLines_No phone service</th>\n",
+ " <th>MultipleLines_Yes</th>\n",
+ " <th>...</th>\n",
+ " <th>StreamingTV_Yes</th>\n",
+ " <th>StreamingMovies_No internet service</th>\n",
+ " <th>StreamingMovies_Yes</th>\n",
+ " <th>Contract_One year</th>\n",
+ " <th>Contract_Two year</th>\n",
+ " <th>PaperlessBilling_Yes</th>\n",
+ " <th>PaymentMethod_Credit card (automatic)</th>\n",
+ " <th>PaymentMethod_Electronic check</th>\n",
+ " <th>PaymentMethod_Mailed check</th>\n",
+ " <th>Churn_Yes</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>29.85</td>\n",
+ " <td>29.85</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>...</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>1</th>\n",
+ " <td>0</td>\n",
+ " <td>34</td>\n",
+ " <td>56.95</td>\n",
+ " <td>1889.50</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>...</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>2</th>\n",
+ " <td>0</td>\n",
+ " <td>2</td>\n",
+ " <td>53.85</td>\n",
+ " <td>108.15</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>...</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>1</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>3</th>\n",
+ " <td>0</td>\n",
+ " <td>45</td>\n",
+ " <td>42.30</td>\n",
+ " <td>1840.75</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>...</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>4</th>\n",
+ " <td>0</td>\n",
+ " <td>2</td>\n",
+ " <td>70.70</td>\n",
+ " <td>151.65</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>...</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "<p>5 rows × 31 columns</p>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " SeniorCitizen tenure MonthlyCharges TotalCharges gender_Male \\\n",
+ "0 0 1 29.85 29.85 0 \n",
+ "1 0 34 56.95 1889.50 1 \n",
+ "2 0 2 53.85 108.15 1 \n",
+ "3 0 45 42.30 1840.75 1 \n",
+ "4 0 2 70.70 151.65 0 \n",
+ "\n",
+ " Partner_Yes Dependents_Yes PhoneService_Yes \\\n",
+ "0 1 0 0 \n",
+ "1 0 0 1 \n",
+ "2 0 0 1 \n",
+ "3 0 0 0 \n",
+ "4 0 0 1 \n",
+ "\n",
+ " MultipleLines_No phone service MultipleLines_Yes ... StreamingTV_Yes \\\n",
+ "0 1 0 ... 0 \n",
+ "1 0 0 ... 0 \n",
+ "2 0 0 ... 0 \n",
+ "3 1 0 ... 0 \n",
+ "4 0 0 ... 0 \n",
+ "\n",
+ " StreamingMovies_No internet service StreamingMovies_Yes \\\n",
+ "0 0 0 \n",
+ "1 0 0 \n",
+ "2 0 0 \n",
+ "3 0 0 \n",
+ "4 0 0 \n",
+ "\n",
+ " Contract_One year Contract_Two year PaperlessBilling_Yes \\\n",
+ "0 0 0 1 \n",
+ "1 1 0 0 \n",
+ "2 0 0 1 \n",
+ "3 1 0 0 \n",
+ "4 0 0 1 \n",
+ "\n",
+ " PaymentMethod_Credit card (automatic) PaymentMethod_Electronic check \\\n",
+ "0 0 1 \n",
+ "1 0 0 \n",
+ "2 0 0 \n",
+ "3 0 0 \n",
+ "4 0 1 \n",
+ "\n",
+ " PaymentMethod_Mailed check Churn_Yes \n",
+ "0 0 0 \n",
+ "1 1 0 \n",
+ "2 1 1 \n",
+ "3 0 0 \n",
+ "4 0 1 \n",
+ "\n",
+ "[5 rows x 31 columns]"
+ ]
+ },
+ "execution_count": 28,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df1.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 29,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Index(['SeniorCitizen', 'tenure', 'MonthlyCharges', 'TotalCharges',\n",
+ " 'gender_Male', 'Partner_Yes', 'Dependents_Yes', 'PhoneService_Yes',\n",
+ " 'MultipleLines_No phone service', 'MultipleLines_Yes',\n",
+ " 'InternetService_Fiber optic', 'InternetService_No',\n",
+ " 'OnlineSecurity_No internet service', 'OnlineSecurity_Yes',\n",
+ " 'OnlineBackup_No internet service', 'OnlineBackup_Yes',\n",
+ " 'DeviceProtection_No internet service', 'DeviceProtection_Yes',\n",
+ " 'TechSupport_No internet service', 'TechSupport_Yes',\n",
+ " 'StreamingTV_No internet service', 'StreamingTV_Yes',\n",
+ " 'StreamingMovies_No internet service', 'StreamingMovies_Yes',\n",
+ " 'Contract_One year', 'Contract_Two year', 'PaperlessBilling_Yes',\n",
+ " 'PaymentMethod_Credit card (automatic)',\n",
+ " 'PaymentMethod_Electronic check', 'PaymentMethod_Mailed check',\n",
+ " 'Churn_Yes'],\n",
+ " dtype='object')"
+ ]
+ },
+ "execution_count": 29,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df1.columns"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Rearranging Columns"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 30,
+ "metadata": {
+ "_kg_hide-input": true,
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:15.018322Z",
+ "iopub.status.busy": "2021-11-09T03:53:15.017423Z",
+ "iopub.status.idle": "2021-11-09T03:53:15.028617Z",
+ "shell.execute_reply": "2021-11-09T03:53:15.027469Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:15.018273Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "df1 = df1[['SeniorCitizen', 'tenure', 'MonthlyCharges', 'TotalCharges',\n",
+ " 'gender_Male', 'Partner_Yes', 'Dependents_Yes',\n",
+ " 'PhoneService_Yes', 'MultipleLines_No phone service',\n",
+ " 'MultipleLines_Yes', 'InternetService_Fiber optic',\n",
+ " 'InternetService_No', 'OnlineSecurity_No internet service',\n",
+ " 'OnlineSecurity_Yes', 'OnlineBackup_No internet service',\n",
+ " 'OnlineBackup_Yes', 'DeviceProtection_No internet service',\n",
+ " 'DeviceProtection_Yes', 'TechSupport_No internet service',\n",
+ " 'TechSupport_Yes', 'StreamingTV_No internet service', 'StreamingTV_Yes',\n",
+ " 'StreamingMovies_No internet service', 'StreamingMovies_Yes',\n",
+ " 'Contract_One year', 'Contract_Two year', 'PaperlessBilling_Yes',\n",
+ " 'PaymentMethod_Credit card (automatic)',\n",
+ " 'PaymentMethod_Electronic check', 'PaymentMethod_Mailed check','Churn_Yes']]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 31,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:15.031710Z",
+ "iopub.status.busy": "2021-11-09T03:53:15.030868Z",
+ "iopub.status.idle": "2021-11-09T03:53:15.064625Z",
+ "shell.execute_reply": "2021-11-09T03:53:15.063618Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:15.031661Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <th>tenure</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " <th>gender_Male</th>\n",
+ " <th>Partner_Yes</th>\n",
+ " <th>Dependents_Yes</th>\n",
+ " <th>PhoneService_Yes</th>\n",
+ " <th>MultipleLines_No phone service</th>\n",
+ " <th>MultipleLines_Yes</th>\n",
+ " <th>...</th>\n",
+ " <th>StreamingTV_Yes</th>\n",
+ " <th>StreamingMovies_No internet service</th>\n",
+ " <th>StreamingMovies_Yes</th>\n",
+ " <th>Contract_One year</th>\n",
+ " <th>Contract_Two year</th>\n",
+ " <th>PaperlessBilling_Yes</th>\n",
+ " <th>PaymentMethod_Credit card (automatic)</th>\n",
+ " <th>PaymentMethod_Electronic check</th>\n",
+ " <th>PaymentMethod_Mailed check</th>\n",
+ " <th>Churn_Yes</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>29.85</td>\n",
+ " <td>29.85</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>...</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>1</th>\n",
+ " <td>0</td>\n",
+ " <td>34</td>\n",
+ " <td>56.95</td>\n",
+ " <td>1889.50</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>...</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>2</th>\n",
+ " <td>0</td>\n",
+ " <td>2</td>\n",
+ " <td>53.85</td>\n",
+ " <td>108.15</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>...</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>1</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>3</th>\n",
+ " <td>0</td>\n",
+ " <td>45</td>\n",
+ " <td>42.30</td>\n",
+ " <td>1840.75</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>...</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>4</th>\n",
+ " <td>0</td>\n",
+ " <td>2</td>\n",
+ " <td>70.70</td>\n",
+ " <td>151.65</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>...</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "<p>5 rows × 31 columns</p>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " SeniorCitizen tenure MonthlyCharges TotalCharges gender_Male \\\n",
+ "0 0 1 29.85 29.85 0 \n",
+ "1 0 34 56.95 1889.50 1 \n",
+ "2 0 2 53.85 108.15 1 \n",
+ "3 0 45 42.30 1840.75 1 \n",
+ "4 0 2 70.70 151.65 0 \n",
+ "\n",
+ " Partner_Yes Dependents_Yes PhoneService_Yes \\\n",
+ "0 1 0 0 \n",
+ "1 0 0 1 \n",
+ "2 0 0 1 \n",
+ "3 0 0 0 \n",
+ "4 0 0 1 \n",
+ "\n",
+ " MultipleLines_No phone service MultipleLines_Yes ... StreamingTV_Yes \\\n",
+ "0 1 0 ... 0 \n",
+ "1 0 0 ... 0 \n",
+ "2 0 0 ... 0 \n",
+ "3 1 0 ... 0 \n",
+ "4 0 0 ... 0 \n",
+ "\n",
+ " StreamingMovies_No internet service StreamingMovies_Yes \\\n",
+ "0 0 0 \n",
+ "1 0 0 \n",
+ "2 0 0 \n",
+ "3 0 0 \n",
+ "4 0 0 \n",
+ "\n",
+ " Contract_One year Contract_Two year PaperlessBilling_Yes \\\n",
+ "0 0 0 1 \n",
+ "1 1 0 0 \n",
+ "2 0 0 1 \n",
+ "3 1 0 0 \n",
+ "4 0 0 1 \n",
+ "\n",
+ " PaymentMethod_Credit card (automatic) PaymentMethod_Electronic check \\\n",
+ "0 0 1 \n",
+ "1 0 0 \n",
+ "2 0 0 \n",
+ "3 0 0 \n",
+ "4 0 1 \n",
+ "\n",
+ " PaymentMethod_Mailed check Churn_Yes \n",
+ "0 0 0 \n",
+ "1 1 0 \n",
+ "2 1 1 \n",
+ "3 0 0 \n",
+ "4 0 1 \n",
+ "\n",
+ "[5 rows x 31 columns]"
+ ]
+ },
+ "execution_count": 31,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df1.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 32,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "(7043, 31)"
+ ]
+ },
+ "execution_count": 32,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df1.shape"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 33,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:15.067076Z",
+ "iopub.status.busy": "2021-11-09T03:53:15.066454Z",
+ "iopub.status.idle": "2021-11-09T03:53:15.080022Z",
+ "shell.execute_reply": "2021-11-09T03:53:15.078954Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:15.067027Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn.impute import SimpleImputer\n",
+ "\n",
+ "# The imputer will replace missing values with the mean of the non-missing values for the respective columns\n",
+ "\n",
+ "imputer = SimpleImputer(missing_values=np.nan, strategy=\"mean\")\n",
+ "\n",
+ "df1.TotalCharges = imputer.fit_transform(df1[\"TotalCharges\"].values.reshape(-1, 1))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Feature Scaling"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 34,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:15.082462Z",
+ "iopub.status.busy": "2021-11-09T03:53:15.082111Z",
+ "iopub.status.idle": "2021-11-09T03:53:15.103525Z",
+ "shell.execute_reply": "2021-11-09T03:53:15.102463Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:15.082399Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn.preprocessing import StandardScaler\n",
+ "scaler = StandardScaler()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 35,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Fit the scaler on the feature columns and transform them in one step\n",
+ "scaled_features = scaler.fit_transform(df1.drop('Churn_Yes', axis=1))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Train/Test Split"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 36,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:15.106000Z",
+ "iopub.status.busy": "2021-11-09T03:53:15.105329Z",
+ "iopub.status.idle": "2021-11-09T03:53:15.116525Z",
+ "shell.execute_reply": "2021-11-09T03:53:15.115285Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:15.105952Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn.model_selection import train_test_split\n",
+ "X = scaled_features\n",
+ "Y = df1['Churn_Yes']\n",
+ "X_train,X_test,Y_train,Y_test = train_test_split(X,Y,test_size = 0.3,random_state=44)"
+ ]
+ },
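+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The scaler above is fitted on the full dataset before the split. A possible alternative, sketched here as an assumption rather than the approach this notebook uses, is to split first and let a scikit-learn `Pipeline` fit the scaler on the training rows only; `stratify` keeps the churn ratio similar in both parts. The names `X_raw`, `y`, `X_tr`, `X_te`, `y_tr`, `y_te` and `pipe`, and the `max_iter=1000` setting, are introduced for illustration only."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from sklearn.linear_model import LogisticRegression\n",
+ "from sklearn.model_selection import train_test_split\n",
+ "from sklearn.pipeline import make_pipeline\n",
+ "from sklearn.preprocessing import StandardScaler\n",
+ "\n",
+ "# Hypothetical names; the notebook's X_train/X_test are left untouched\n",
+ "X_raw = df1.drop('Churn_Yes', axis=1)\n",
+ "y = df1['Churn_Yes']\n",
+ "X_tr, X_te, y_tr, y_te = train_test_split(\n",
+ "    X_raw, y, test_size=0.3, random_state=44, stratify=y)\n",
+ "\n",
+ "# The scaler is fitted inside the pipeline, on the training rows only\n",
+ "pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))\n",
+ "pipe.fit(X_tr, y_tr)\n",
+ "print(pipe.score(X_te, y_te))  # mean accuracy on the held-out rows"
+ ]
+ },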
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Prediction using Logistic Regression"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 37,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:42.228616Z",
+ "iopub.status.busy": "2021-11-09T03:53:42.227007Z",
+ "iopub.status.idle": "2021-11-09T03:53:42.319319Z",
+ "shell.execute_reply": "2021-11-09T03:53:42.318141Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:42.228565Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<style>#sk-container-id-1 {color: black;background-color: white;}#sk-container-id-1 pre{padding: 0;}#sk-container-id-1 div.sk-toggleable {background-color: white;}#sk-container-id-1 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-1 label.sk-toggleable__label-arrow:before {content: \"▸\";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-1 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-1 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-1 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: \"▾\";}#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-1 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-1 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-1 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-1 div.sk-label:hover 
label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-1 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-1 div.sk-item {position: relative;z-index: 1;}#sk-container-id-1 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-1 div.sk-item::before, #sk-container-id-1 div.sk-parallel-item::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-1 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-1 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-1 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-1 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-1 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-1 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-1 div.sk-label-container {text-align: center;}#sk-container-id-1 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. 
See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-1 div.sk-text-repr-fallback {display: none;}</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LogisticRegression()</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" checked><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label sk-toggleable__label-arrow\">LogisticRegression</label><div class=\"sk-toggleable__content\"><pre>LogisticRegression()</pre></div></div></div></div></div>"
+ ],
+ "text/plain": [
+ "LogisticRegression()"
+ ]
+ },
+ "execution_count": 37,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "from sklearn.linear_model import LogisticRegression\n",
+ "from sklearn.metrics import classification_report, accuracy_score, confusion_matrix\n",
+ "\n",
+ "logmodel = LogisticRegression()\n",
+ "logmodel.fit(X_train,Y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 38,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:42.328549Z",
+ "iopub.status.busy": "2021-11-09T03:53:42.325493Z",
+ "iopub.status.idle": "2021-11-09T03:53:42.338505Z",
+ "shell.execute_reply": "2021-11-09T03:53:42.337265Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:42.328497Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "predLR = logmodel.predict(X_test)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 39,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "array([0, 0, 0, ..., 0, 0, 0], dtype=uint8)"
+ ]
+ },
+ "execution_count": 39,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "predLR"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 40,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "5616 0\n",
+ "2937 0\n",
+ "1355 0\n",
+ "5441 1\n",
+ "3333 0\n",
+ " ..\n",
+ "2797 1\n",
+ "412 0\n",
+ "174 0\n",
+ "5761 0\n",
+ "5895 0\n",
+ "Name: Churn_Yes, Length: 2113, dtype: uint8"
+ ]
+ },
+ "execution_count": 40,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "Y_test"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 41,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:42.348885Z",
+ "iopub.status.busy": "2021-11-09T03:53:42.344785Z",
+ "iopub.status.idle": "2021-11-09T03:53:42.381860Z",
+ "shell.execute_reply": "2021-11-09T03:53:42.380863Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:42.348824Z"
+ },
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " precision recall f1-score support\n",
+ "\n",
+ " 0 0.84 0.90 0.87 1557\n",
+ " 1 0.65 0.53 0.58 556\n",
+ "\n",
+ " accuracy 0.80 2113\n",
+ " macro avg 0.74 0.71 0.73 2113\n",
+ "weighted avg 0.79 0.80 0.79 2113\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(classification_report(Y_test, predLR))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 42,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "iVBORw0KGgoAAAANSUhEUgAAA9UAAAF2CAYAAABgXbt2AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABcK0lEQVR4nO3de1wVdf7H8fcBBLwBkgJCJGamkgqGSZSmbSResiwrTQsjwy2hTFo3KQW1kjIzyrVIN9I23exibatGGUZWkhouXcxLmoplBzQSlBIU5vdHPyZPXJQjd17Px2Meeb7znZnvNPOB82G+8/1aDMMwBAAAAAAAasyhoRsAAAAAAEBTRVINAAAAAICdSKoBAAAAALATSTUAAAAAAHYiqQYAAAAAwE4k1QAAAAAA2ImkGgAAAAAAO5FUAwAAAABgJ5JqAAAAAADsRFINu1ksFs2ePbtG29x5550KCAiok/YAqD/Lli2TxWLR/v37zbIhQ4ZoyJAhDdYmoKFt3bpVV1xxhdq2bSuLxaLs7OyGbhIAoB6QVAMAAJyjkydP6pZbblF+fr6eeeYZ/etf/5K3t7dmzJihq6++Wu3bt5fFYlFGRkZDNxUAdOjQIc2ePZs//tUSp4ZuAJqu3377TU5ONbuFli5dqrKysjpqEQAADWPv3r06cOCAli5dqrvvvluSlJGRoSeffFLdu3dXnz59lJmZ2cCtBIDfHTp0SHPmzFFAQICCg4MbujlNHk+qW4CysjKdOHGi1vfr6upa46S6VatWcnFxqfW2AC1VUVFRQzcBgKS8vDxJkoeHh1kWEhKin3/+Wbt371ZcXFwDtezcnDp1SiUlJQ3dDAC1hJiuGyTVTcjs2bNlsVi0c+dO3XrrrXJzc9N5552nqVOn2iTNFotFsbGxWrFihS655BK5uLgoLS1NkvTjjz/qrrvukre3t1xcXHTJJZcoNTW1wrFOnDih2bNn6+KLL5arq6s6d+6sm266SXv37rU5zunvVB87dkwPPPCAAgIC5OLiIi8vL1177bXatm2bWaeyd6qLior04IMPyt/fXy4uLurRo4cWLFggwzBs6pWf1zvvvKPevXub7S8/N6C5K/8Z8O2332r8+PHq0KGDBg4cKEl69dVXFRISotatW8vT01Pjxo3TwYMHK+xj8+bNGjFihDp06KC2bduqb9++evbZZ831X331le68805deOGFcnV1lY+Pj+666y79/PPP9XaeQFNz5513avDgwZKkW265RRaLRUOGDFH79u3l6el5Tvt+7bXXFBISovbt28vNzU19+vSxiVlJOnr0qKZNm2b+/j3//PMVGRmpI0eOmHXy8vI0adIkeXt7y9XVVUFBQVq+fLnNfvbv3y+LxaIFCxYoOTlZ3bp1k4uLi7799ltJ0s6dO3XzzTfL09NTrq6u6t+/v959991zOj+gsTrT99qAgADdeeedFbb78/giGRkZslgsWrVqlR5++GH5+Piobdu2uv766yv8nh4yZIh69+6trKwsXXHFFWrdurW6du2qlJSUCsc5l5h+/vnnddlll0mSoqKiZLFYZLFYtGzZsnP7n9aC0f27Cbr11lsVEBCgpKQkff7553ruuef0yy+/6JVXXjHrbNiwQa+//rpiY2PVsWNHBQQEKDc3V5dffrmZnHbq1EnvvfeeJk2apMLCQj3wwAOSpNLSUl133XVKT0/XuHHjNHXqVB07dkzr16/XN998o27dulXarnvuuUdvvvmmYmNjFRgYqJ9//lmffvqpduzYoUsvvbTSbQzD0PXXX6+PPvpIkyZNUnBwsN5//31Nnz5dP/74o5555hmb+p9++qlWr16tKVOmqH379nruuec0ZswY5eTk6Lzzzqud/8FAI3fLLbeoe/fumjdvngzD0OOPP65Zs2bp1ltv1d13363Dhw9r0aJFuuqqq/S///3PfHK2fv16XXfddercubOmTp0qHx8f7dixQ2vWrNHUqVPNOt9/
/72ioqLk4+Oj7du3a8mSJdq+fbs+//xzWSyWBjxzoHH661//Kj8/P82bN0/333+/LrvsMnl7e5/zftevX6/bbrtN11xzjZ588klJ0o4dO/TZZ5+ZMXv8+HENGjRIO3bs0F133aVLL71UR44c0bvvvqsffvhBHTt21G+//aYhQ4Zoz549io2NVdeuXfXGG2/ozjvv1NGjR819lXv55Zd14sQJTZ48WS4uLvL09NT27dt15ZVXys/PTzNmzFDbtm31+uuva/To0Xrrrbd04403nvP5Ao2JPd9rq/P444/LYrHooYceUl5enpKTkxUeHq7s7Gy1bt3arPfLL79oxIgRuvXWW3Xbbbfp9ddf17333itnZ2fdddddknTOMX3jjTfq2LFjSkhI0OTJkzVo0CBJ0hVXXHEO/8daOANNRmJioiHJuP76623Kp0yZYkgyvvzyS8MwDEOS4eDgYGzfvt2m3qRJk4zOnTsbR44csSkfN26c4e7ubvz666+GYRhGamqqIclYuHBhhTaUlZWZ/5ZkJCYmmp/d3d2NmJiYas9h4sSJRpcuXczP77zzjiHJeOyxx2zq3XzzzYbFYjH27NljczxnZ2ebsi+//NKQZCxatKja4wLNQfnPgNtuu80s279/v+Ho6Gg8/vjjNnW//vprw8nJySw/deqU0bVrV6NLly7GL7/8YlP39Lgu/zlwun//+9+GJGPjxo1m2csvv2xIMvbt22eWDR482Bg8ePA5nCHQdH300UeGJOONN96odP0bb7xhSDI++uijs97n1KlTDTc3N+PUqVNV1klISDAkGatXr66wrjy2k5OTDUnGq6++aq4rKSkxwsLCjHbt2hmFhYWGYRjGvn37DEmGm5ubkZeXZ7Ova665xujTp49x4sQJm/1fccUVRvfu3c/6nICm4kzfa7t06WJMnDixQvmffxeW/2zw8/MzY80wDOP11183JBnPPvuszbaSjKefftosKy4uNoKDgw0vLy+jpKTEMIzaiemtW7cakoyXX375rP5/oHp0/26CYmJibD7fd999kqR169aZZYMHD1ZgYKD52TAMvfXWWxo1apQMw9CRI0fMJSIiQgUFBWZ3lrfeeksdO3Y093u66p5SeXh4aPPmzTp06NBZn8u6devk6Oio+++/36b8wQcflGEYeu+992zKw8PDbZ6U9+3bV25ubvr+++/P+phAU3fPPfeY/169erXKysp066232sS1j4+Punfvro8++kiS9L///U/79u3TAw88YPPOp2Qb16f/tfzEiRM6cuSILr/8ckmyeZUDQN3z8PBQUVGR1q9fX2Wdt956S0FBQZU+KS6P7XXr1snHx0e33Xabua5Vq1a6//77dfz4cX388cc2240ZM0adOnUyP+fn52vDhg269dZbdezYMfPnzM8//6yIiAh99913+vHHH8/1dIFGxZ7vtdWJjIxU+/btzc8333yzOnfubPP9XZKcnJz017/+1fzs7Oysv/71r8rLy1NWVpakc49p1D6S6iaoe/fuNp+7desmBwcHm/liu3btalPn8OHDOnr0qJYsWaJOnTrZLFFRUZL+GGRl79696tGjR40HIZs/f76++eYb+fv7a8CAAZo9e/YZk90DBw7I19fX5oeMJPXq1ctcf7oLLrigwj46dOigX375pUZtBZqy0+P7u+++k2EY6t69e4XY3rFjh01cS1Lv3r2r3Xd+fr6mTp0qb29vtW7dWp06dTKPV1BQUEdnBLRs+fn5slqt5lIea1OmTNHFF1+s4cOH6/zzz9ddd91VYRyRvXv3njGuDxw4oO7du8vBwfZrX1W/a//8HWLPnj0yDEOzZs2q8HMmMTFR0h/fIYDmwp7vtdX58/d3i8Wiiy66yOb7uyT5+vqqbdu2NmUXX3yxJJl1zzWmUft4p7oZqOzp8elPmySZ01jdfvvtmjhxYqX76du37zm149Zbb9WgQYP09ttv64MPPtBTTz2lJ598UqtXr9bw4cPPad/lHB0d
Ky03/jSoGdCcnR7fZWVlslgseu+99yqNj3bt2tVo37feeqs2bdqk6dOnKzg4WO3atVNZWZmGDRvGdHhAHbnppptsnixNnDhRy5Ytk5eXl7Kzs/X+++/rvffe03vvvaeXX35ZkZGRFQYkqk1VfYf429/+poiIiEq3ueiii+qsPUBDONP32qp6b5aWllb5fbWh/DmmUftIqpug7777zuYvTnv27FFZWVmFUbVP16lTJ7Vv316lpaUKDw+vdv/dunXT5s2bdfLkSbVq1apGbevcubOmTJmiKVOmKC8vT5deeqkef/zxKpPqLl266MMPP9SxY8dsnlbv3LnTXA+gat26dZNhGOratav5l+yq6knSN998U+XPgF9++UXp6emaM2eOEhISzPLvvvuudhsNwMbTTz9t0+PK19fX/Lezs7NGjRqlUaNGqaysTFOmTNGLL76oWbNm6aKLLlK3bt30zTffVLv/Ll266KuvvlJZWZnNk62z/V174YUXSvq9e+mZvkMAzUl132s7dOigo0ePVtjmwIEDZsyc7s+/Sw3D0J49eyo81Dp06JCKiopsnlbv3r1bkszv+uca01L1r3Si5uj+3QQtXrzY5vOiRYskqdqnwY6OjhozZozeeuutSn/5Hj582Pz3mDFjdOTIEf3jH/+oUK+qJ8KlpaUVuoZ6eXnJ19dXxcXFVbZrxIgRKi0trXCsZ555RhaLpdaecAPN1U033SRHR0fNmTOnQnwahmFOhXXppZeqa9euSk5OrvAloHy78r+s/3k/ycnJddN4AJJ+n886PDzcXMrHRPnzVHYODg7mF/Dy361jxozRl19+qbfffrvCfstjecSIEbJarVq1apW57tSpU1q0aJHatWtnTgdWFS8vLw0ZMkQvvviifvrppwrrT/8OATQHZ/O9tlu3bvr8889t5nxes2ZNpdNZStIrr7yiY8eOmZ/ffPNN/fTTTxW+6546dUovvvii+bmkpEQvvviiOnXqpJCQEEnnHtOSzKS9sj8MoOZ4Ut0E7du3T9dff72GDRumzMxMvfrqqxo/fryCgoKq3e6JJ57QRx99pNDQUEVHRyswMFD5+fnatm2bPvzwQ+Xn50v6fSCFV155RXFxcdqyZYsGDRqkoqIiffjhh5oyZYpuuOGGCvs+duyYzj//fN18880KCgpSu3bt9OGHH2rr1q16+umnq2zTqFGjdPXVV+uRRx7R/v37FRQUpA8++ED/+c9/9MADD1Q5fReA33Xr1k2PPfaY4uPjtX//fo0ePVrt27fXvn379Pbbb2vy5Mn629/+JgcHB73wwgsaNWqUgoODFRUVpc6dO2vnzp3avn273n//fbm5uemqq67S/PnzdfLkSfn5+emDDz7Qvn37Gvo0gSbrsccekyRt375dkvSvf/1Ln376qSRp5syZ1W579913Kz8/X3/5y190/vnn68CBA1q0aJGCg4PNdyenT5+uN998U7fccovuuusuhYSEKD8/X++++65SUlIUFBSkyZMn68UXX9Sdd96prKwsBQQE6M0339Rnn32m5OTkCuOaVGbx4sUaOHCg+vTpo+joaF144YXKzc1VZmamfvjhB3355Zfn8r8JaFTO5nvt3XffrTfffFPDhg3Trbfeqr179+rVV1+t8rurp6enBg4cqKioKOXm5io5OVkXXXSRoqOjber5+vrqySef1P79+3XxxRdr1apVys7O1pIlS8wepLUR0926dZOHh4dSUlLUvn17tW3bVqGhobx/ba+GGHIc9imfTufbb781br75ZqN9+/ZGhw4djNjYWOO3334z60mqcgqA3NxcIyYmxvD39zdatWpl+Pj4GNdcc42xZMkSm3q//vqr8cgjjxhdu3Y16918883G3r17bY5TPqVWcXGxMX36dCMoKMho37690bZtWyMoKMh4/vnnbfb75ym1DMMwjh07ZkybNs3w9fU1WrVqZXTv3t146qmnbKb5qe68qprSAGhuyn8GHD58uMK6t956yxg4
cKDRtm1bo23btkbPnj2NmJgYY9euXTb1Pv30U+Paa68147Rv3742U9L98MMPxo033mh4eHgY7u7uxi233GIcOnSowhR6TKkF2KpqSi1JVS5n8uabbxpDhw41vLy8DGdnZ+OCCy4w/vrXvxo//fSTTb2ff/7ZiI2NNfz8/AxnZ2fj/PPPNyZOnGgzhWZubq4RFRVldOzY0XB2djb69OlTYSqd8ul3nnrqqUrbs3fvXiMyMtLw8fExWrVqZfj5+RnXXXed8eabb57l/yWgaTjb77VPP/204efnZ7i4uBhXXnml8cUXX1Q5pda///1vIz4+3vDy8jJat25tjBw50jhw4IDN/gYPHmxccsklxhdffGGEhYUZrq6uRpcuXYx//OMfFdpYGzH9n//8xwgMDDScnJyYXuscWQyDEZ6aitmzZ2vOnDk6fPiwOnbs2NDNAQAAAFCNjIwMXX311XrjjTd08803V1t3yJAhOnLkyBnHSUDjwzvVAAAAAADYiaQaAAAAAAA7kVQDAAAAAGAn3qkGAAAAAMBOPKkGAAAAAMBOJNUAAAAAANjJqaEbcDbKysp06NAhtW/fXhaLpaGbAzRKhmHo2LFj8vX1lYND4/t7GXEMnB1iGWj6GnscS8QycDbONpabRFJ96NAh+fv7N3QzgCbh4MGDOv/88xu6GRUQx0DNEMtA09dY41giloGaOFMsN4mkun379pJ+Pxk3N7cGbg3QOBUWFsrf39+Ml8aGOAbODrEMNH2NPY4lYhk4G2cby00iqS7vkuLm5kbQA2fQWLtwEcdAzRDLQNOzceNGPfXUU/riiy8kSWvXrtX48eNt6uzYsUMPPfSQPv74Y506dUqBgYF66623dMEFF0iSTpw4oQcffFCvvfaaiouLFRERoeeff17e3t7mPnJycnTvvffqo48+Urt27TRx4kQlJSXJyensv9oTy8DZO9Pv5Mb5kgcAAADQxBQVFSkoKEgLFiyodP3evXs1cOBA9ezZUxkZGfrqq680a9Ysubq6mnWmTZum//73v3rjjTf08ccf69ChQ7rpppvM9aWlpRo5cqRKSkq0adMmLV++XMuWLVNCQkKdnx+AyjWJeaoLCwvl7u6ugoIC/pIGVKGxx0ljbx/QWDT2WGns7QMag/I4WbFihc2T6nHjxqlVq1b617/+Vel2BQUF6tSpk1auXKmbb75ZkrRz50716tVLmZmZuvzyy/Xee+/puuuu06FDh8yn1ykpKXrooYd0+PBhOTs716iNxDJQtbONE55UAwAAAHWsrKxMa9eu1cUXX6yIiAh5eXkpNDRU77zzjlknKytLJ0+eVHh4uFnWs2dPXXDBBcrMzJQkZWZmqk+fPjbdwSMiIlRYWKjt27fX2/kA+ANJNQAAAFDH8vLydPz4cT3xxBMaNmyYPvjgA91444266aab9PHHH0uSrFarnJ2d5eHhYbOtt7e3rFarWef0hLp8ffm6qhQXF6uwsNBmAVA7msRAZQAAAEBTVlZWJkm64YYbNG3aNElScHCwNm3apJSUFA0ePLhOj5+UlKQ5c+bU6TGAloon1QAAAEAd69ixo5ycnBQYGGhT3qtXL+Xk5EiSfHx8VFJSoqNHj9rUyc3NlY+Pj1knNze3wvrydVWJj49XQUGBuRw8ePBcTwnA/yOpBgAAAOqYs7OzLrvsMu3atcumfPfu3erSpYskKSQkRK1atVJ6erq5fteuXcrJyVFYWJgkKSwsTF9//bXy8vLMOuvXr5ebm1uFhP10Li4u5vRZTKMF1C66fwMAAAC14Pjx49qzZ4+OHz8uSTpw4ICys7Pl6empCy64QNOnT9fYsWN11VVX6eqrr1ZaWpr++9//KiMjQ5Lk7u6uSZMmKS4uTp6ennJzc9N9992nsLAwXX755ZKkoUOHKjAwUHfccYfmz58vq9WqmTNnKiYmRi4uLg116kCLxpNqAAAAoBZ88cUX6tevnwYNGiRJevjhh9WvXz9zDukbb7xRKSkpmj9/
vvr06aN//vOfeuuttzRw4EBzH88884yuu+46jRkzRldddZV8fHy0evVqc72jo6PWrFkjR0dHhYWF6fbbb1dkZKTmzp1bvycLwMQ81UAz0djjpLG3D2gsGnusNPb2AY1BU4iTptBGoKExTzUAAAAAAHWMpBoAAAAAADs1q4HKAmasbegmNAv7nxjZ0E1AC0csnzviGA2NOK4dxDIaGrF87ojj5o8n1QAAAAAA2ImkGgAAAAAAO5FUAwAAAABgJ5JqAAAAAADsRFINAAAAAICdSKoBAAAAALATSTUAAAAAAHYiqQYAAAAAwE52JdWLFy9WQECAXF1dFRoaqi1btlRbPzk5WT169FDr1q3l7++vadOm6cSJE3Y1GAAAAACAxqLGSfWqVasUFxenxMREbdu2TUFBQYqIiFBeXl6l9VeuXKkZM2YoMTFRO3bs0EsvvaRVq1bp4YcfPufGAwAAAADQkGqcVC9cuFDR0dGKiopSYGCgUlJS1KZNG6WmplZaf9OmTbryyis1fvx4BQQEaOjQobrtttvO+HQbAAAAAIDGrkZJdUlJibKyshQeHv7HDhwcFB4erszMzEq3ueKKK5SVlWUm0d9//73WrVunESNGnEOzAQAAAABoeE41qXzkyBGVlpbK29vbptzb21s7d+6sdJvx48fryJEjGjhwoAzD0KlTp3TPPfdU2/27uLhYxcXF5ufCwsKaNBMAAAAAgHpR56N/Z2RkaN68eXr++ee1bds2rV69WmvXrtWjjz5a5TZJSUlyd3c3F39//7puJgAAAAAANVajJ9UdO3aUo6OjcnNzbcpzc3Pl4+NT6TazZs3SHXfcobvvvluS1KdPHxUVFWny5Ml65JFH5OBQMa+Pj49XXFyc+bmwsJDEGgAAAADQ6NToSbWzs7NCQkKUnp5ulpWVlSk9PV1hYWGVbvPrr79WSJwdHR0lSYZhVLqNi4uL3NzcbBYAAAAAABqbGj2plqS4uDhNnDhR/fv314ABA5ScnKyioiJFRUVJkiIjI+Xn56ekpCRJ0qhRo7Rw4UL169dPoaGh2rNnj2bNmqVRo0aZyTUAAAAAAE1RjZPqsWPH6vDhw0pISJDValVwcLDS0tLMwctycnJsnkzPnDlTFotFM2fO1I8//qhOnTpp1KhRevzxx2vvLAAAAAAAaAB2DVQWGxurAwcOqLi4WJs3b1ZoaKi5LiMjQ8uWLTM/Ozk5KTExUXv27NFvv/2mnJwcLV68WB4eHufadgAAWrzFixcrICBArq6uCg0NNaewrEpycrJ69Oih1q1by9/fX9OmTdOJEyfqqbUAADQ/dT76NwAAqBurVq1SXFycEhMTtW3bNgUFBSkiIkJ5eXmV1l+5cqVmzJihxMRE7dixQy+99JJWrVpV7TSXAACgeiTVAAA0UQsXLlR0dLSioqIUGBiolJQUtWnTRqmpqZXW37Rpk6688kqNHz9eAQEBGjp0qG677bYzPt0GAABVI6kGAKAJKikpUVZWlsLDw80yBwcHhYeHKzMzs9JtrrjiCmVlZZlJ9Pfff69169ZpxIgRVR6nuLhYhYWFNgsAAPhDjQcqAwAADe/IkSMqLS01Bwot5+3trZ07d1a6zfjx43XkyBENHDhQhmHo1KlTuueee6rt/p2UlKQ5c+bUatsBAGhOeFINAEALkZGRoXnz5un555/Xtm3btHr1aq1du1aPPvpoldvEx8eroKDAXA4ePFiPLQYAoPEjqQYAoAnq2LGjHB0dlZuba1Oem5srHx+fSreZNWuW7rjjDt19993q06ePbrzxRs2bN09JSUkqKyurdBsXFxe5ubnZLAAqt3HjRo0aNUo9evSQJK1Zs6bKuvfcc48sFouSk5NtyvPz8zVhwgS5ubnJw8NDkyZN0vHjx23qfPXVVxo0aJBcXV3l7++v+fPn1/q5ADh7JNUAADRBzs7OCgkJUXp6ullWVlam9PR0hYWFVbrNr7/+KgcH21/9jo6OkiTDMOqusUALUVRU
pKCgIC1YsKDaem+//bY+//xz+fr6Vlg3YcIEbd++XevXr9eaNWu0ceNGTZ482VxfWFiooUOHqkuXLsrKytJTTz2l2bNna8mSJbV+PgDODu9UAwDQRMXFxWnixInq37+/BgwYoOTkZBUVFSkqKkqSFBkZKT8/PyUlJUmSRo0apYULF6pfv34KDQ3Vnj17NGvWLI0aNcpMrgHYb/jw4Ro+fHi1A/r9+OOPuu+++/T+++9r5MiRNut27NihtLQ0bd26Vf3795ckLVq0SCNGjNCCBQvk6+urFStWqKSkRKmpqXJ2dtYll1yi7OxsLVy40Cb5BlB/eFINtFCLFy9WQECAXF1dFRoaesYpdZKTk9WjRw+1bt1a/v7+mjZtmk6cOFFPrQVQmbFjx2rBggVKSEhQcHCwsrOzlZaWZg5elpOTo59++smsP3PmTD344IOaOXOmAgMDNWnSJEVEROjFF19sqFMAWpSysjLdcccdmj59ui655JIK6zMzM+Xh4WEm1JIUHh4uBwcHbd682axz1VVXydnZ2awTERGhXbt26Zdffqny2IzkD9QdnlQDLdCqVasUFxenlJQUhYaGKjk52fyF7OXlVaH+ypUrNWPGDKWmpuqKK67Q7t27deedd8pisWjhwoUNcAYAysXGxio2NrbSdRkZGTafnZyclJiYqMTExHpoGYA/e/LJJ+Xk5KT777+/0vVWq7XC72EnJyd5enrKarWadbp27WpTp/wPaVarVR06dKh034zkD9QdnlQDLdDChQsVHR2tqKgoBQYGKiUlRW3atFFqamql9Tdt2qQrr7xS48ePV0BAgIYOHarbbrvtjE+3AQDA77KysvTss89q2bJlslgs9X58RvIH6g5JNdDClJSUKCsrS+Hh4WaZg4ODwsPDlZmZWek2V1xxhbKysswk+vvvv9e6des0YsSIKo9DNzMAAP7wySefKC8vTxdccIGcnJzk5OSkAwcO6MEHH1RAQIAkycfHR3l5eTbbnTp1Svn5+eao/j4+PpWO+l++riqM5A/UHZJqoIU5cuSISktLza5i5by9vc2uZX82fvx4zZ07VwMHDlSrVq3UrVs3DRkyRA8//HCVx0lKSpK7u7u5+Pv71+p5AADQlNxxxx366quvlJ2dbS6+vr6aPn263n//fUlSWFiYjh49qqysLHO7DRs2qKysTKGhoWadjRs36uTJk2ad9evXq0ePHlV2/QZQt0iqAZxRRkaG5s2bp+eff17btm3T6tWrtXbtWj366KNVbkM3MwBAS3P8+HFlZ2frq6++kiQdOHBA2dnZysnJ0XnnnafevXvbLK1atZKPj485r3WvXr00bNgwRUdHa8uWLfrss88UGxurcePGmdNvjR8/Xs7Ozpo0aZK2b9+uVatW6dlnn1VcXFyDnTfQ0jFQGdDCdOzYUY6OjpV2Hauq29isWbN0xx136O6775Yk9enTR0VFRZo8ebIeeeSRCvPeSr93M3Nxcan9EwAAoJH64osvdPXVV5ufH374YT388MOaOHGili1bdlb7WLFihWJjY3XNNdfIwcFBY8aM0XPPPWeud3d31wcffKCYmBiFhISoY8eOSkhIYDotoAGRVAMtjLOzs0JCQpSenq7Ro0dL+n2Kj/T09CpHEP71118rJM7lc9oahlGn7QUAoKkYMmSIDMNQYWGh3N3dVVBQUO27y/v3769Q5unpqZUrV1Z7nL59++qTTz451+YCqCUk1UALFBcXp4kTJ6p///4aMGCAkpOTVVRUpKioKElSZGSk/Pz8lJSUJEkaNWqUFi5cqH79+ik0NFR79uzRrFmzNGrUKDO5BgAAAFoikmqgBRo7dqwOHz6shIQEWa1WBQcHKy0tzRy8LCcnx+bJ9MyZM2WxWDRz5kz9+OOP6tSpk0aNGqXHH3+8oU4BAAAAaBRIqoEWKjY2tsru3hkZGTafnZyclJiYqMTExHpoGQAAANB0MPo3AAAAAAB2IqkGAAAAAMBOJNUAAAAAANiJ
pBoAAAAAADuRVAMAAAAAYCe7kurFixcrICBArq6uCg0N1ZYtW6qsO2TIEFkslgrLyJEj7W40AAAAAACNQY2T6lWrVikuLk6JiYnatm2bgoKCFBERoby8vErrr169Wj/99JO5fPPNN3J0dNQtt9xyzo0HAAAAAKAh1TipXrhwoaKjoxUVFaXAwEClpKSoTZs2Sk1NrbS+p6enfHx8zGX9+vVq06YNSTUAAAAAoMmrUVJdUlKirKwshYeH/7EDBweFh4crMzPzrPbx0ksvady4cWrbtm3NWgoAAAAAQCPjVJPKR44cUWlpqby9vW3Kvb29tXPnzjNuv2XLFn3zzTd66aWXqq1XXFys4uJi83NhYWFNmgkAAAAAQL2o19G/X3rpJfXp00cDBgyotl5SUpLc3d3Nxd/fv55aCAAAAADA2atRUt2xY0c5OjoqNzfXpjw3N1c+Pj7VbltUVKTXXntNkyZNOuNx4uPjVVBQYC4HDx6sSTMBAAAAAKgXNUqqnZ2dFRISovT0dLOsrKxM6enpCgsLq3bbN954Q8XFxbr99tvPeBwXFxe5ubnZLAAAAAAANDY1eqdakuLi4jRx4kT1799fAwYMUHJysoqKihQVFSVJioyMlJ+fn5KSkmy2e+mllzR69Gidd955tdNyAAAAAAAaWI2T6rFjx+rw4cNKSEiQ1WpVcHCw0tLSzMHLcnJy5OBg+wB8165d+vTTT/XBBx/UTqsBAAAAAGgEapxUS1JsbKxiY2MrXZeRkVGhrEePHjIMw55DAQAAAADQaNXr6N8AAAAAADQnJNUAAAAAANiJpBoAAAAAADuRVAMAAAAAYCeSagAAAAAA7ERSDQAAAACAnUiqAQAAAACwE0k1AAAAAAB2IqkGAAAAasHGjRs1atQo9ejRQ5K0Zs0ac93Jkyf10EMPqU+fPmrbtq18fX0VGRmpQ4cO2ewjPz9fEyZMkJubmzw8PDRp0iQdP37cps5XX32lQYMGydXVVf7+/po/f37dnxyAKpFUAwAAALWgqKhIQUFBWrBgQYV1v/76q7Zt26ZZs2Zp27ZtWr16tXbt2qXrr7/ept6ECRO0fft2rV+/XmvWrNHGjRs1efJkc31hYaGGDh2qLl26KCsrS0899ZRmz56tJUuW1Pn5AaicU0M3AAAAAGgOhg8fruHDh6uwsLDCOnd3d61fv96m7B//+IcGDBignJwcXXDBBdqxY4fS0tK0detW9e/fX5K0aNEijRgxQgsWLJCvr69WrFihkpISpaamytnZWZdccomys7O1cOFCm+QbQP0hqUadC5ixtqGb0Czsf2JkQzcBAADUooKCAlksFnl4eEiSMjMz5eHhYSbUkhQeHi4HBwdt3rxZN954ozIzM3XVVVfJ2dnZrBMREaEnn3xSv/zyizp06FDpsYqLi1VcXGx+rizxB2Afun8DAAAA9ezEiRN66KGHdNttt8nNzU2SZLVa5eXlZVPPyclJnp6eslqtZh1vb2+bOuWfy+tUJikpSe7u7ubi7+9fm6cDtGgk1QAAAEA9OnnypG699VYZhqEXXnihXo4ZHx+vgoICczl48GC9HBdoCej+DQAAANST8oT6wIED2rBhg/mUWpJ8fHyUl5dnU//UqVPKz8+Xj4+PWSc3N9emTvnn8jqVcXFxkYuLS22dBoDT8KQaAAAAqAflCfV3332nDz/8UOedd57N+rCwMB09elRZWVlm2YYNG1RWVqbQ0FCzzsaNG3Xy5Emzzvr169WjR48q36cGULdIqgEAAIBacPz4cWVnZ+urr76SJB04cEDZ2dnKycnRyZMndfPNN+uLL77QihUrVFpaKqvVKqvVqpKSEklSr169NGzYMEVHR2vLli367LPPFBsbq3HjxsnX11eSNH78eDk7O2vSpEnavn27Vq1apWeffVZxcXENdt5AS0dSDQAAANSCL774Qv369dOgQYMkSQ8//LD69eunhIQE/fjjj3r33Xf1ww8/KDg4WJ07dzaXTZs2
mftYsWKFevbsqWuuuUYjRozQwIEDbeagdnd31wcffKB9+/YpJCREDz74oBISEphOC2hAvFMNAAAA1IIhQ4bIMAwVFhbK3d1dBQUFNu9MG4Zxxn14enpq5cqV1dbp27evPvnkk3NuL4DawZNqAAAAAADsRFINAAAAAICdSKoBAAAAALATSTUAAAAAAHYiqQYAAAAAwE52JdWLFy9WQECAXF1dFRoaqi1btlRb/+jRo4qJiVHnzp3l4uKiiy++WOvWrbOrwQAAAAAANBY1nlJr1apViouLU0pKikJDQ5WcnKyIiAjt2rVLXl5eFeqXlJTo2muvlZeXl9588035+fnpwIED8vDwqI32AwAAAADQYGqcVC9cuFDR0dGKioqSJKWkpGjt2rVKTU3VjBkzKtRPTU1Vfn6+Nm3apFatWkmSAgICzq3VAAAAAAA0AjXq/l1SUqKsrCyFh4f/sQMHB4WHhyszM7PSbd59912FhYUpJiZG3t7e6t27t+bNm6fS0tJzazkAAAAAAA2sRk+qjxw5otLSUnl7e9uUe3t7a+fOnZVu8/3332vDhg2aMGGC1q1bpz179mjKlCk6efKkEhMTK92muLhYxcXF5ufCwsKaNBMAAAAAgHpR56N/l5WVycvLS0uWLFFISIjGjh2rRx55RCkpKVVuk5SUJHd3d3Px9/ev62YCAAAAAFBjNUqqO3bsKEdHR+Xm5tqU5+bmysfHp9JtOnfurIsvvliOjo5mWa9evWS1WlVSUlLpNvHx8SooKDCXgwcP1qSZAAAAAADUixol1c7OzgoJCVF6erpZVlZWpvT0dIWFhVW6zZVXXqk9e/aorKzMLNu9e7c6d+4sZ2fnSrdxcXGRm5ubzQIAAAAAQGNT4+7fcXFxWrp0qZYvX64dO3bo3nvvVVFRkTkaeGRkpOLj48369957r/Lz8zV16lTt3r1ba9eu1bx58xQTE1N7ZwEAQAu1ePFiBQQEyNXVVaGhodqyZUu19Y8ePaqYmBh17txZLi4uuvjii7Vu3bp6ai0AAM1PjafUGjt2rA4fPqyEhARZrVYFBwcrLS3NHLwsJydHDg5/5Or+/v56//33NW3aNPXt21d+fn6aOnWqHnroodo7CwAAWqBVq1YpLi5OKSkpCg0NVXJysiIiIrRr1y55eXlVqF9SUqJrr71WXl5eevPNN+Xn56cDBw7Iw8Oj/hsPAEAzUeOkWpJiY2MVGxtb6bqMjIwKZWFhYfr888/tORQAAKjCwoULFR0dbfYWS0lJ0dq1a5WamqoZM2ZUqJ+amqr8/Hxt2rRJrVq1kiQFBATUZ5MBAGh26nz0bwAAUPtKSkqUlZWl8PBws8zBwUHh4eHKzMysdJt3331XYWFhiomJkbe3t3r37q158+aptLS0yuMUFxersLDQZgEAAH8gqQYAoAk6cuSISktLzdevynl7e8tqtVa6zffff68333xTpaWlWrdunWbNmqWnn35ajz32WJXHYZpLAACqR1INAEALUVZWJi8vLy1ZskQhISEaO3asHnnkEaWkpFS5DdNcAgBQPZJqoIVixGCgaevYsaMcHR2Vm5trU56bmysfH59Kt+ncubMuvvhiOTo6mmW9evWS1WpVSUlJpdswzSUAANUjqQZaoPIRgxMTE7Vt2zYFBQUpIiJCeXl5ldYvHzF4//79evPNN7Vr1y4tXbpUfn5+9dxyAOWcnZ0VEhKi9PR0s6ysrEzp6ekKCwurdJsrr7xSe/bsUVlZmVm2e/dude7cWc7OznXeZgAAmiOSaqAFOn3E4MDAQKWkpKhNmzZKTU2ttH75iMHvvPOOrrzySgUEBGjw4MEKCgqq55YDOF1cXJyWLl2q5cuXa8eOHbr33ntVVFRkjgYeGRmp+Ph4s/69996r/Px8TZ06Vbt379batWs1b948xcTENNQpAADQ5JFUAy1MfY0YDKDujR07VgsWLFBCQoKCg4OVnZ2ttLQ0c/CynJwc/fTTT2Z9f39/vf/++9q6
dav69u2r+++/X1OnTq10+i0AAHB27JqnGkDTVd2IwTt37qx0m++//14bNmzQhAkTtG7dOu3Zs0dTpkzRyZMnlZiYWOk2xcXFKi4uNj8zDQ9QN2JjYxUbG1vpuoyMjAplYWFh+vzzz+u4VQAAtBwk1QDO6PQRgx0dHRUSEqIff/xRTz31VJVJdVJSkubMmVPPLQUAVCdgxtqGbkKzsP+JkQ3dBACNCN2/gRamvkYMZhoeAAAAtAQk1UALU18jBjMNDwCgpdm4caNGjRqlHj16SJLWrFljs94wDCUkJKhz585q3bq1wsPD9d1339nUyc/P14QJE+Tm5iYPDw9NmjRJx48ft6nz1VdfadCgQXJ1dZW/v7/mz59ftycGoFok1UALxIjBAADUvqKiIgUFBWnBggWVrp8/f76ee+45paSkaPPmzWrbtq0iIiJ04sQJs86ECRO0fft2rV+/XmvWrNHGjRs1efJkc31hYaGGDh2qLl26KCsrS0899ZRmz56tJUuW1Pn5Aagc71QDLdDYsWN1+PBhJSQkyGq1Kjg4uMKIwQ4Of/zNrXzE4GnTpqlv377y8/PT1KlT9dBDDzXUKQAA0OgMHz5cw4cPr3RwTsMwlJycrJkzZ+qGG26QJL3yyivy9vbWO++8o3HjxmnHjh1KS0vT1q1b1b9/f0nSokWLNGLECC1YsEC+vr5asWKFSkpKlJqaKmdnZ11yySXKzs7WwoULbZJvAPWHpBpooRgxGACA+rNv3z5ZrVabKS3d3d0VGhqqzMxMjRs3TpmZmfLw8DATakkKDw+Xg4ODNm/erBtvvFGZmZm66qqrbF6/ioiI0JNPPqlffvlFHTp0qNfzAkBSDQAAANQ5q9UqSZVOaVm+zmq1ysvLy2a9k5OTPD09bep07dq1wj7K11WVVDPVJVB3eKcaAAAAaOaSkpLk7u5uLv7+/g3dJKDZIKkGAAAA6lj5tJXVTWnp4+OjvLw8m/WnTp1Sfn6+TZ3K9nH6MSrDVJdA3SGpBgAAAOpY165d5ePjYzOlZWFhoTZv3mxOaRkWFqajR48qKyvLrLNhwwaVlZUpNDTUrLNx40adPHnSrLN+/Xr16NGj2vepmeoSqDsk1QAAAEAtOH78uLKzs/XVV19Jkg4cOKDs7Gzl5OTIYrHogQce0GOPPaZ3331XX3/9tSIjI+Xr66vRo0dLknr16qVhw4YpOjpaW7Zs0WeffabY2FiNGzdOvr6+kqTx48fL2dlZkyZN0vbt27Vq1So9++yziouLa6jTBlo8kmoAAACgFnzxxRfq16+fBg0aJEl6+OGH1a9fPyUkJEiS/v73v+u+++7T5MmTddlll+n48eNKS0uTq6uruY8VK1aoZ8+euuaaazRixAgNHDjQZg5qd3d3ffDBB9q3b59CQkL04IMPKiEhgem0gAbE6N8AAABALRgyZIgMw1BhYaHc3d1VUFBg083aYrFo7ty5mjt3bpX78PT01MqVK6s9Tt++ffXJJ5/UWrsBnBueVAMAAAAAYCeSagAAAAAA7ERSDQAAAACAnexKqhcvXqyAgAC5uroqNDRUW7ZsqbLusmXLZLFYbJbTB2MAAAAAAKCpqnFSvWrVKsXFxSkxMVHbtm1TUFCQIiIiKkxUfzo3Nzf99NNP5nLgwIFzajQAAAAAAI1BjZPqhQsXKjo6WlFRUQoMDFRKSoratGmj1NTUKrexWCzy8fExF29v73NqNAAAAAAAjUGNkuqSkhJlZWUpPDz8jx04OCg8PFyZmZlVbnf8+HF16dJF/v7+uuGGG7R9+3b7WwwAAAAAQCNRo6T6yJEjKi0trfCk2dvbW1artdJtevToodTUVP3nP//Rq6++qrKyMl1xxRX64YcfqjxOcXGxCgsLbRYAAAAAABqbOh/9OywsTJGRkQoODtbgwYO1evVqderUSS+++GKV2yQlJcnd3d1c/P3967qZAAAAAADUmFNNKnfs2FGOjo7Kzc21Kc/NzZWPj89Z7aNVq1bq
16+f9uzZU2Wd+Ph4xcXFmZ8LCwtJrAGggQXMWNvQTWjy9j8xsqGbAAAAalmNnlQ7OzsrJCRE6enpZllZWZnS09MVFhZ2VvsoLS3V119/rc6dO1dZx8XFRW5ubjYLAAAAAACNTY2eVEtSXFycJk6cqP79+2vAgAFKTk5WUVGRoqKiJEmRkZHy8/NTUlKSJGnu3Lm6/PLLddFFF+no0aN66qmndODAAd199921eyYAAAAAANSzGifVY8eO1eHDh5WQkCCr1arg4GClpaWZg5fl5OTIweGPB+C//PKLoqOjZbVa1aFDB4WEhGjTpk0KDAysvbMAAAAAAKAB1DiplqTY2FjFxsZWui4jI8Pm8zPPPKNnnnnGnsMAAAAAANCo1fno3wAAAAAANFck1QAAAAAA2ImkGgAAAAAAO5FUAwAAAABgJ5JqAAAAAADsRFINAAAAAICdSKoBAAAAALATSTUAAAAAAHYiqQYAAAAAwE4k1QAAAAAA2ImkGgAAAAAAO5FUAwAAAABgJ5JqAAAAAADsRFINAAAA1IPS0lLNmjVLXbt2VevWrdWtWzc9+uijMgzDrGMYhhISEtS5c2e1bt1a4eHh+u6772z2k5+frwkTJsjNzU0eHh6aNGmSjh8/Xt+nA+D/kVQDAAAA9eDJJ5/UCy+8oH/84x/asWOHnnzySc2fP1+LFi0y68yfP1/PPfecUlJStHnzZrVt21YRERE6ceKEWWfChAnavn271q9frzVr1mjjxo2aPHlyQ5wSAElODd0AAAAAoCXYtGmTbrjhBo0cOVKSFBAQoH//+9/asmWLpN+fUicnJ2vmzJm64YYbJEmvvPKKvL299c4772jcuHHasWOH0tLStHXrVvXv31+StGjRIo0YMUILFiyQr69vw5wc0ILxpBoAAACoB1dccYXS09O1e/duSdKXX36pTz/9VMOHD5ck7du3T1arVeHh4eY27u7uCg0NVWZmpiQpMzNTHh4eZkItSeHh4XJwcNDmzZurPHZxcbEKCwttFgC1gyfVAAAAQD2YMWOGCgsL1bNnTzk6Oqq0tFSPP/64JkyYIEmyWq2SJG9vb5vtvL29zXVWq1VeXl42652cnOTp6WnWqUxSUpLmzJlTm6cD4P/xpBoAAACoB6+//rpWrFihlStXatu2bVq+fLkWLFig5cuX1/mx4+PjVVBQYC4HDx6s82MCLQVPqgEAAIB6MH36dM2YMUPjxo2TJPXp00cHDhxQUlKSJk6cKB8fH0lSbm6uOnfubG6Xm5ur4OBgSZKPj4/y8vJs9nvq1Cnl5+eb21fGxcVFLi4utXxGACSeVAMAAAD14tdff5WDg+3Xb0dHR5WVlUmSunbtKh8fH6Wnp5vrCwsLtXnzZoWFhUmSwsLCdPToUWVlZZl1NmzYoLKyMoWGhtbDWQD4M55UAwAAAPVg1KhRevzxx3XBBRfokksu0f/+9z8tXLhQd911lyTJYrHogQce0GOPPabu3bura9eumjVrlnx9fTV69GhJUq9evTRs2DBFR0crJSVFJ0+eVGxsrMaNG8fI30ADIakGAAAA6sGiRYs0a9YsTZkyRXl5efL19dVf//pXJSQkmHX+/ve/q6ioSJMnT9bRo0c1cOBApaWlydXV1ayzYsUKxcbG6pprrpGDg4PGjBmj5557riFOCYBIqgEAAIB60b59eyUnJys5ObnKOhaLRXPnztXcuXOrrOPp6amVK1fWQQsB2MOud6oXL16sgIAAubq6KjQ01Jyw/kxee+01WSwWs/sKAAAAAABNWY2T6lWrVikuLk6JiYnatm2bgoKCFBERUWEUwj/bv3+//va3v2nQoEF2NxYAAAAAgMakxkn1woULFR0draioKAUGBiolJUVt2rRRampqlduUlpZqwoQJmjNnji688MJzajAAAAAAAI1FjZLqkpISZWVlKTw8/I8dODgoPDxcmZmZVW43d+5ceXl5adKkSWd1nOLiYhUWFtosAAAAAAA0NjVKqo8c
OaLS0lJ5e3vblHt7e8tqtVa6zaeffqqXXnpJS5cuPevjJCUlyd3d3Vz8/f1r0kwAAAAAAOqFXQOVna1jx47pjjvu0NKlS9WxY8ez3i4+Pl4FBQXmcvDgwTpsJQAATReDhwIA0LBqNKVWx44d5ejoqNzcXJvy3Nxc+fj4VKi/d+9e7d+/X6NGjTLLysrKfj+wk5N27dqlbt26VdjOxcVFLi4uNWkaAAAtTvngoSkpKQoNDVVycrIiIiK0a9cueXl5Vbkdg4cCAFB7avSk2tnZWSEhIUpPTzfLysrKlJ6errCwsAr1e/bsqa+//lrZ2dnmcv311+vqq69WdnY23boBADgHDB4KAEDDq9GTakmKi4vTxIkT1b9/fw0YMEDJyckqKipSVFSUJCkyMlJ+fn5KSkqSq6urevfubbO9h4eHJFUoBwAAZ6988ND4+HizrKaDh37yySdnPE5xcbGKi4vNzwweCgCArRq/Uz127FgtWLBACQkJCg4OVnZ2ttLS0szBy3JycvTTTz/VekMB1D7exQSaLgYPBQCgcajxk2pJio2NVWxsbKXrMjIyqt122bJl9hwSQC3jXUygZTmXwUPj4uLMz4WFhSTWAACcxq6kGkDTd/q7mJKUkpKitWvXKjU1VTNmzKh0m9Pfxfzkk0909OjRemwxgNMxeCgAAI1DnU6pBaBxKn8XMzw83Cyr6buYABoWg4cCANA48KQaaIGqexdz586dlW5T/i5mdnb2WR2DwY2AusfgoQAANDySagBnZM+7mElJSZozZ04dtwxo2caOHavDhw8rISFBVqtVwcHBFQYPdXCgUxoAAHWJpBpogerjXUwGNwLqB4OHAgDQsEiqgRbo9Hcxy6fFKn8Xs7Iv5+XvYp5u5syZOnbsmJ599tlKk2UGNwIAAEBLQFINtFC8iwkAAACcO5JqoIXiXUwAAADg3JFUAy0Y72ICAAAA54bHUAAAAAAA2ImkGgAAAAAAO5FUAwAAAABgJ5JqAAAAAADsRFINAAAAAICdSKoBAAAAALATSTUAAAAAAHYiqQYAAADqyY8//qjbb79d5513nlq3bq0+ffroiy++MNcbhqGEhAR17txZrVu3Vnh4uL777jubfeTn52vChAlyc3OTh4eHJk2apOPHj9f3qQD4fyTVAAAAQD345ZdfdOWVV6pVq1Z677339O233+rpp59Whw4dzDrz58/Xc889p5SUFG3evFlt27ZVRESETpw4YdaZMGGCtm/frvXr12vNmjXauHGjJk+e3BCnBECSU0M3AAAAAGgJnnzySfn7++vll182y7p27Wr+2zAMJScna+bMmbrhhhskSa+88oq8vb31zjvvaNy4cdqxY4fS0tK0detW9e/fX5K0aNEijRgxQgsWLJCvr2/9nhQAnlQDAAAA9eHdd99V//79dcstt8jLy0v9+vXT0qVLzfX79u2T1WpVeHi4Webu7q7Q0FBlZmZKkjIzM+Xh4WEm1JIUHh4uBwcHbd68ucpjFxcXq7Cw0GYBUDtIqgEAAIB68P333+uFF15Q9+7d9f777+vee+/V/fffr+XLl0uSrFarJMnb29tmO29vb3Od1WqVl5eXzXonJyd5enqadSqTlJQkd3d3c/H396/NUwNaNJJqAAAAoB6UlZXp0ksv1bx589SvXz9NnjxZ0dHRSklJqfNjx8fHq6CgwFwOHjxY58cEWgqSagAAAKAedO7cWYGBgTZlvXr1Uk5OjiTJx8dHkpSbm2tTJzc311zn4+OjvLw8m/WnTp1Sfn6+WacyLi4ucnNzs1kA1A6SagAAAKAeXHnlldq1a5dN2e7du9WlSxdJvw9a5uPjo/T0dHN9YWGhNm/erLCwMElSWFiYjh49qqysLLPOhg0bVFZWptDQ0Ho4CwB/ZldSvXjxYgUEBMjV1VWhoaHasmVLlXVXr16t/v37y8PDQ23btlVwcLD+9a9/2d1gAAAAoCmaNm2aPv/8c82bN0979uzRypUrtWTJEsXE
xEiSLBaLHnjgAT322GN699139fXXXysyMlK+vr4aPXq0pN+fbA8bNkzR0dHasmWLPvvsM8XGxmrcuHGM/A00kBpPqbVq1SrFxcUpJSVFoaGhSk5OVkREhHbt2lVh0ARJ8vT01COPPKKePXvK2dlZa9asUVRUlLy8vBQREVErJwEAAAA0dpdddpnefvttxcfHa+7cueratauSk5M1YcIEs87f//53FRUVafLkyTp69KgGDhyotLQ0ubq6mnVWrFih2NhYXXPNNXJwcNCYMWP03HPPNcQpAZAdSfXChQsVHR2tqKgoSVJKSorWrl2r1NRUzZgxo0L9IUOG2HyeOnWqli9frk8//ZSkGgAAAC3Kddddp+uuu67K9RaLRXPnztXcuXOrrOPp6amVK1fWRfMA2KFG3b9LSkqUlZVlM3eeg4ODwsPDzbnzqmMYhtLT07Vr1y5dddVVVdZjHj0AAAAAQFNQoyfVR44cUWlpaaVz5+3cubPK7QoKCuTn56fi4mI5Ojrq+eef17XXXltl/aSkJM2ZM6cmTQMAAACAFiFgxtqGbkKzsP+JkbWyn3oZ/bt9+/bKzs7W1q1b9fjjjysuLk4ZGRlV1mcePQAAAABAU1CjJ9UdO3aUo6NjtXPnVcbBwUEXXXSRJCk4OFg7duxQUlJShfety7m4uMjFxaUmTQMAAAAAoN7V6Em1s7OzQkJCbObOKysrU3p6ujl33tkoKytTcXFxTQ4NAAAAAECjU+PRv+Pi4jRx4kT1799fAwYMUHJysoqKiszRwCMjI+Xn56ekpCRJv78f3b9/f3Xr1k3FxcVat26d/vWvf+mFF16o3TMBAAAAAKCe1TipHjt2rA4fPqyEhARZrVYFBwcrLS3NHLwsJydHDg5/PAAvKirSlClT9MMPP6h169bq2bOnXn31VY0dO7b2zgIAAAAAgAZQ46RakmJjYxUbG1vpuj8PQPbYY4/pscces+cwAAAAAAA0avUy+jcAAAAAAM0RSTUAAAAAAHYiqQYAAAAAwE4k1QAAAAAA2ImkGgAAAAAAO5FUAwAAAABgJ5JqAAAAAADsRFINAAAAAICdSKoBAAAAALATSTUAAAAAAHYiqQYAAAAAwE4k1QAAAAAA2ImkGgAAAAAAO5FUAwAAAABgJ5JqAAAAAADsRFINAAAAAICdSKoBAAAAALATSTUAAAAAAHYiqQYAAAAawBNPPCGLxaIHHnjALDtx4oRiYmJ03nnnqV27dhozZoxyc3NttsvJydHIkSPVpk0beXl5afr06Tp16lQ9tx5AOZJqAAAAoJ5t3bpVL774ovr27WtTPm3aNP33v//VG2+8oY8//liHDh3STTfdZK4vLS3VyJEjVVJSok2bNmn58uVatmyZEhIS6vsUAPw/kmoAAACgHh0/flwTJkzQ0qVL1aFDB7O8oKBAL730khYuXKi//OUvCgkJ0csvv6xNmzbp888/lyR98MEH+vbbb/Xqq68qODhYw4cP16OPPqrFixerpKSkoU4JaNFIqgEAAIB6FBMTo5EjRyo8PNymPCsrSydPnrQp79mzpy644AJlZmZKkjIzM9WnTx95e3ubdSIiIlRYWKjt27fXzwkAsOHU0A0AAAAAWorXXntN27Zt09atWyuss1qtcnZ2loeHh025t7e3rFarWef0hLp8ffm6qhQXF6u4uNj8XFhYaO8pAPgTnlQDAAAA9eDgwYOaOnWqVqxYIVdX13o9dlJSktzd3c3F39+/Xo8PNGd2JdWLFy9WQECAXF1dFRoaqi1btlRZd+nSpRo0aJA6dOigDh06KDw8vNr6AAAAQHOUlZWlvLw8XXrppXJycpKTk5M+/vhjPffcc3JycpK3t7dKSkp09OhRm+1yc3Pl4+MjSfLx8akwGnj55/I6lYmPj1dBQYG5HDx4sHZPDmjBapxUr1q1SnFxcUpMTNS2bdsUFBSkiIgI5eXlVVo/IyNDt912mz766CNlZmbK399fQ4cO1Y8//njOjQcA
AACaimuuuUZff/21srOzzaV///6aMGGC+e9WrVopPT3d3GbXrl3KyclRWFiYJCksLExff/21zXfv9evXy83NTYGBgVUe28XFRW5ubjYLgNpR46R64cKFio6OVlRUlAIDA5WSkqI2bdooNTW10vorVqzQlClTFBwcrJ49e+qf//ynysrKbH5YAAAA+9B7DGg62rdvr969e9ssbdu21XnnnafevXvL3d1dkyZNUlxcnD766CNlZWUpKipKYWFhuvzyyyVJQ4cOVWBgoO644w59+eWXev/99zVz5kzFxMTIxcWlgc8QaJlqlFSXlJQoKyvLZkRCBwcHhYeHmyMSnsmvv/6qkydPytPTs8o6xcXFKiwstFkAAIAteo8Bzc8zzzyj6667TmPGjNFVV10lHx8frV692lzv6OioNWvWyNHRUWFhYbr99tsVGRmpuXPnNmCrgZatRqN/HzlyRKWlpZWOOLhz586z2sdDDz0kX1/fClMInC4pKUlz5sypSdMAAGhxTu89JkkpKSlau3atUlNTNWPGjAr1V6xYYfP5n//8p9566y2lp6crMjKyXtoMwFZGRobNZ1dXVy1evFiLFy+ucpsuXbpo3bp1ddwyAGerXkf/fuKJJ/Taa6/p7bffrnbEQwZSAOoH3UaBpqu+eo8BAIDq1Sip7tixoxwdHSsdcbC60QYlacGCBXriiSf0wQcfqG/fvtXWZSAFoO7RbRRo2qrrPVbdXLWnO5veY7ySBQBA9WqUVDs7OyskJMRmkLHyQcfKRySszPz58/Xoo48qLS1N/fv3t7+1AGoNgw4CLdvZ9h5jblsAAKpX4+7fcXFxWrp0qZYvX64dO3bo3nvvVVFRkfk+V2RkpOLj4836Tz75pGbNmqXU1FQFBATIarXKarXq+PHjtXcWAGqkPrqN8nQLqFv11XuMV7IAAKhejZPqsWPHasGCBUpISFBwcLCys7OVlpZmdj/LycnRTz/9ZNZ/4YUXVFJSoptvvlmdO3c2lwULFtTeWQCokfroNsrTLaBu1VfvMV7JAgCgejUa/btcbGysYmNjK1335xEM9+/fb88hADRi5d1GMzIyquw2Gh8fr7i4OPNzYWEhiTVQy+Li4jRx4kT1799fAwYMUHJycoXeY35+fkpKSpL0e++xhIQErVy50uw9Jknt2rVTu3btGuw8AABoyuxKqgE0bbXRbfTDDz+sttuoi4uLXFxcaqW9ACo3duxYHT58WAkJCbJarQoODq7Qe8zB4Y9Oaaf3HjtdYmKiZs+eXZ9NBwCg2SCpBlqg07uNjh49WtIf3Uar6oUi/d5t9PHHH9f777/PoINAI0HvMQAAGhZJNdBC0W0UAAAAOHck1UALRbdRAAAA4NyRVAMtGN1GAQAAgHNT4ym1AAAAAADA70iqAQAAAACwE0k1AAAAAAB2IqkGAAAAAMBOJNUAAAAAANiJpBoAAAAAADuRVAMAAAAAYCeSagAAAAAA7ERSDQAAAACAnUiqAQAAAACwE0k1AAAAAAB2IqkGAAAAAMBOJNUAAAAAANiJpBoAAAAAADuRVAMAAAAAYCeSagAAAAAA7ERSDQAAANSTpKQkXXbZZWrfvr28vLw0evRo7dq1y6bOiRMnFBMTo/POO0/t2rXTmDFjlJuba1MnJydHI0eOVJs2beTl5aXp06fr1KlT9XkqAP4fSTUAAABQTz7++GPFxMTo888/1/r163Xy5EkNHTpURUVFZp1p06bpv//9r9544w19/PHHOnTokG666SZzfWlpqUaOHKmSkhJt2rRJy5cv17Jly5SQkNAQpwS0eE4N3QAAAACgpUhLS7P5vGzZMnl5eSkrK0tXXXWVCgoK9NJLL2nlypX6y1/+Ikl6+eWX1atXL33++ee6/PLL9cEHH+jbb7/Vhx9+KG9vbwUHB+vRRx/VQw89pNmzZ8vZ2bkhTg1osex6Ur148WIFBATI1dVVoaGh2rJlS5V1t2/frjFjxiggIEAWi0XJycn2thUA
AABoVgoKCiRJnp6ekqSsrCydPHlS4eHhZp2ePXvqggsuUGZmpiQpMzNTffr0kbe3t1knIiJChYWF2r59e6XHKS4uVmFhoc0CoHbUOKletWqV4uLilJiYqG3btikoKEgRERHKy8urtP6vv/6qCy+8UE888YR8fHzOucEAAABAc1BWVqYHHnhAV155pXr37i1JslqtcnZ2loeHh01db29vWa1Ws87pCXX5+vJ1lUlKSpK7u7u5+Pv71/LZAC1XjZPqhQsXKjo6WlFRUQoMDFRKSoratGmj1NTUSutfdtlleuqppzRu3Di5uLicc4MBAACA5iAmJkbffPONXnvttTo/Vnx8vAoKCszl4MGDdX5MoKWoUVJdUlKirKwsm+4oDg4OCg8PN7uj1Aa6pwAAAKA5i42N1Zo1a/TRRx/p/PPPN8t9fHxUUlKio0eP2tTPzc01e336+PhUGA28/HNVPUNdXFzk5uZmswCoHTVKqo8cOaLS0tJKu5tU1dXEHnRPAQAAQHNkGIZiY2P19ttva8OGDeratavN+pCQELVq1Urp6elm2a5du5STk6OwsDBJUlhYmL7++mub1y/Xr18vNzc3BQYG1s+JADA1yim16J4CAACA5igmJkavvvqqVq5cqfbt28tqtcpqteq3336TJLm7u2vSpEmKi4vTRx99pKysLEVFRSksLEyXX365JGno0KEKDAzUHXfcoS+//FLvv/++Zs6cqZiYGF63BBpAjabU6tixoxwdHSvtblKbg5C5uLjwAwEAAADNzgsvvCBJGjJkiE35yy+/rDvvvFOS9Mwzz8jBwUFjxoxRcXGxIiIi9Pzzz5t1HR0dtWbNGt17770KCwtT27ZtNXHiRM2dO7e+TgPAaWqUVDs7OyskJETp6ekaPXq0pN9HLUxPT1dsbGxdtA8AAABoNgzDOGMdV1dXLV68WIsXL66yTpcuXbRu3brabBoAO9UoqZakuLg4TZw4Uf3799eAAQOUnJysoqIiRUVFSZIiIyPl5+enpKQkSb8Pbvbtt9+a//7xxx+VnZ2tdu3a6aKLLqrFUwEAAAAAoH7VOKkeO3asDh8+rISEBFmtVgUHBystLc0cvCwnJ0cODn+8qn3o0CH169fP/LxgwQItWLBAgwcPVkZGxrmfAQAAAAAADaTGSbX0+xQAVXX3/nOiHBAQcFbdXAAAAAAAaGoa5ejfAAAAAAA0BSTVAAAAAADYiaQaAAAAAAA7kVQDAAAAAGAnkmoAAAAAAOxEUg0AAAAAgJ1IqgEAAAAAsBNJNQAAAAAAdiKpBgAAAADATiTVAAAAAADYiaQaAAAAAAA7kVQDAAAAAGAnkmoAAAAAAOxEUg0AAAAAgJ1IqgEAAAAAsBNJNQAAAAAAdiKpBgAAAADATiTVAAAAAADYiaQaAAAAAAA7kVQDAAAAAGAnkmoAAAAAAOxEUg0AAAAAgJ1IqgEAAAAAsBNJNQAAAAAAdrIrqV68eLECAgLk6uqq0NBQbdmypdr6b7zxhnr27ClXV1f16dNH69ats6uxAGoXsQw0fcQx0HLVNP4B1I0aJ9WrVq1SXFycEhMTtW3bNgUFBSkiIkJ5eXmV1t+0aZNuu+02TZo0Sf/73/80evRojR49Wt988805Nx6A/YhloOkjjoGWq6bxD6Du1DipXrhwoaKjoxUVFaXAwEClpKSoTZs2Sk1NrbT+s88+q2HDhmn69Onq1auXHn30UV166aX6xz/+cc6NB2A/Yhlo+ohjoOWqafwDqDtONalcUlKirKwsxcfHm2UODg4KDw9XZmZmpdtkZmYqLi7OpiwiIkLvvPNOlccpLi5WcXGx+bmgoECSVFhYWG37yop/PdMp4Cyc6f9zTXFdaseZrkv5esMwzriv+ohle+NY4p6pDbUdxxLXpTaczXU521jmd3LLwO/kxqk2fyfbw574J5YbDr+TG6/aiuUaJdVHjhxRaWmpvL29bcq9vb21c+fOSrexWq2V1rdarVUeJykpSXPm
zKlQ7u/vX5Pmwk7uyQ3dAlTmbK/LsWPH5O7uXm2d+ohl4rhhEceNU02uy5limd/JLQOx3DjV5u9ke9gT/8RywyGOG6/aiuUaJdX1JT4+3uYv6WVlZcrPz9d5550ni8XSgC07N4WFhfL399fBgwfl5ubW0M3B/2su18UwDB07dky+vr4N3RRJxDHqV3O6LsRy/WhO90xz0lyuS2OLY4lYRv1qLtflbGO5Rkl1x44d5ejoqNzcXJvy3Nxc+fj4VLqNj49PjepLkouLi1xcXGzKPDw8atLURs3Nza1J31zNVXO4Lmf71/D6iGXiGA2huVyXs4llfifXjuZyzzQ3zeG61MUT6nL2xD+xjIbQHK7L2cRyjQYqc3Z2VkhIiNLT082ysrIypaenKywsrNJtwsLCbOpL0vr166usD6DuEctA00ccAy2XPfEPoO7UuPt3XFycJk6cqP79+2vAgAFKTk5WUVGRoqKiJEmRkZHy8/NTUlKSJGnq1KkaPHiwnn76aY0cOVKvvfaavvjiCy1ZsqR2zwRAjRDLQNNHHAMt15niH0D9qXFSPXbsWB0+fFgJCQmyWq0KDg5WWlqaOVBCTk6OHBz+eAB+xRVXaOXKlZo5c6Yefvhhde/eXe+884569+5de2fRRLi4uCgxMbFC1xs0rJZ6XYhl+7TU+6Wxa6nXhTi2X0u9Zxo7rsvZO1P8txTcM41TS7suFqOuxvoHAAAAAKCZq9E71QAAAAAA4A8k1QAAAAAA2ImkGgAAAAAAO5FUV8Jiseidd95p6Gbg/3E9YA/um8aHawJ7cN80PlwT2IP7pnHhetSuFplUW61W3Xfffbrwwgvl4uIif39/jRo1qsLcnY3N/v37ZbFY5OXlpWPHjtmsCw4O1uzZsxumYeeoKV6Pjz/+WK1atdKnn35qU15UVKQLL7xQf/vb3xqoZS1HU7xvpOYbx1LTvCbEcsNriveNRCw3NsRyw2uK943UfGO5KV6PphzHLS6p3r9/v0JCQrRhwwY99dRT+vrrr5WWlqarr75aMTExdXbckpKSWtvXsWPHtGDBglrbX0Nqqtdj8ODBuu+++3TnnXeqqKjILP/73/+u1q1b67HHHjvXJqIaTfW+OV1zimOp6V4TYrlhNdX75nTEcu0glpu2pnrfnK45xXJTvR5NOo6NFmb48OGGn5+fcfz48QrrfvnlF8MwDEOSsXTpUmP06NFG69atjYsuusj4z3/+Y9Z7+eWXDXd3d5tt3377beP0/52JiYlGUFCQsXTpUiMgIMCwWCxnte/q7Nu3z5BkTJ8+3WjXrp2Rm5trrgsKCjISExPNz/n5+cYdd9xheHh4GK1btzaGDRtm7N69+6yOU5+a8vX47bffjF69ehkxMTGGYRjGhg0bDGdnZ+OLL74wSktLjXnz5hkBAQGGq6ur0bdvX+ONN94wt83PzzfGjx9vdOzY0XB1dTUuuugiIzU19ayOi6Z93zTHODaMpn1NiOWG05TvG2K58V0TYrnhNOX7pjnGclO+Hk01jltUUv3zzz8bFovFmDdvXrX1JBnnn3++sXLlSuO7774z7r//fqNdu3bGzz//bBjG2d9kbdu2NYYNG2Zs27bN+PLLL89q39UpD/pt27YZwcHB5s1mGBWD/vrrrzd69eplbNy40cjOzjYiIiKMiy66yCgpKTnjcepLU78ehmEYW7duNVq1amW88847RkBAgDF79mzDMAzjscceM3r27GmkpaUZe/fuNV5++WXDxcXFyMjIMAzDMGJiYozg4GBj69atxr59+4z169cb77777lkds6Vr6vdNc4tjw2j618QwiOWG0NTvG2K58V0TwyCWG0JTv2+aWyw39ethGE0zjltUUr1582ZDkrF69epq60kyZs6caX4+fvy4Icl47733DMM4+5usVatWRl5eXo32XZ3yoP/f//5npKWlGa1atTL27NljGIZt0O/evduQZHz22WfmtkeOHDFat25tvP7662c8Tn1p6tejXEJC
guHg4GCEhIQYJ0+eNE6cOGG0adPG2LRpk029SZMmGbfddpthGIYxatQoIyoq6qyPgT809fumucWxYTT9a1KOWK5fTf2+IZYb3zUpRyzXr6Z+3zS3WG7q16NcU4vjFvVOtWEYZ123b9++5r/btm0rNzc35eXl1eh4Xbp0UadOnepk3xERERo4cKBmzZpVYd2OHTvk5OSk0NBQs+y8885Tjx49tGPHjhodpy41l+sxa9YslZWVacaMGXJyctKePXv066+/6tprr1W7du3M5ZVXXtHevXslSffee69ee+01BQcH6+9//7s2bdpUo3NpyZrLfSM1jziWms81IZbrV3O5byRiubFdE2K5fjWX+0ZqHrHcXK5HU4tjp3o7UiPQvXt3WSwW7dy584x1W7VqZfPZYrGorKxMkuTg4FDhhj158mSFfbRt27bG+66JJ554QmFhYZo+fXqNt20Mmsv1cHJysvnv8ePHJUlr166Vn5+fTV0XFxdJ0vDhw3XgwAGtW7dO69ev1zXXXKOYmJhmM0BGXWou9025ph7HUvO5JsRy/Wou9005YrnxXBNiuX41l/umXFOP5eZyPZpaHLeoJ9Wenp6KiIjQ4sWLbUaUK3f06NGz2k+nTp107Ngxm31kZ2fXUivP3oABA3TTTTdpxowZNuW9evXSqVOntHnzZrPs559/1q5duxQYGFjfzaxSc7se5QIDA+Xi4qKcnBxddNFFNou/v79Zr1OnTpo4caJeffVVJScna8mSJQ3W5qakud03TT2OpeZ3TcoRy3Wrud03xPIfGss1KUcs163mdt809VhubtejXGOP4xaVVEvS4sWLVVpaqgEDBuitt97Sd999px07dui5555TWFjYWe0jNDRUbdq00cMPP6y9e/dq5cqVWrZsWd02vAqPP/64NmzYoF27dpll3bt31w033KDo6Gh9+umn+vLLL3X77bfLz89PN9xwQ4O0syrN7XpIUvv27fW3v/1N06ZN0/Lly7V3715t27ZNixYt0vLlyyVJCQkJ+s9//qM9e/Zo+/btWrNmjXr16tVgbW5qmtt909TjWGp+10QilutDc7tviOXfNaZrIhHL9aG53TdNPZab2/WQGn8ct7ik+sILL9S2bdt09dVX68EHH1Tv3r117bXXKj09XS+88MJZ7cPT01Ovvvqq1q1bpz59+ujf//53g00Mf/HFF+uuu+7SiRMnbMpffvllhYSE6LrrrlNYWJgMw9C6desqdMVoaM3tepR79NFHNWvWLCUlJalXr14aNmyY1q5dq65du0qSnJ2dFR8fr759++qqq66So6OjXnvttQZtc1PS3O6bph7HUvO7JuWI5brV3O4bYvl3jemalCOW61Zzu2+aeiw3t+tRrjHHscWoydvsAAAAAADA1OKeVAMAAAAAUFtIqhuRe+65x2aI+NOXe+65p6Gb1+JwPWAP7pvGh2sCe3DfND5cE9iD+6Zxaa7Xg+7fjUheXp4KCwsrXefm5iYvL696blHLxvWAPbhvGh+uCezBfdP4cE1gD+6bxqW5Xg+SagAAAAAA7ET3bwAAAAAA7ERSDQAAAACAnUiqAQAAAACwE0k1AAAAAAB2IqkGAAAAAMBOJNUAAAAAANiJpBoAAAAAADuRVAMAAAAAYKf/A65QFoqj0p5QAAAAAElFTkSuQmCC\n",
+ "text/plain": [
+ "<Figure size 1200x400 with 4 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# compute the classification report as a nested dict instead of parsing its printed text\n",
+ "class_names = ['Churn_No', 'Churn_Yes']\n",
+ "report = classification_report(Y_test, predLR, target_names=class_names, output_dict=True)\n",
+ "\n",
+ "# keep only the per-class rows (drop 'accuracy', 'macro avg' and 'weighted avg')\n",
+ "class_metrics = {name: report[name] for name in class_names}\n",
+ "\n",
+ "# create a bar chart for each metric\n",
+ "fig, ax = plt.subplots(1, 4, figsize=(12, 4))\n",
+ "metrics = ['precision', 'recall', 'f1-score', 'support']\n",
+ "for i, metric in enumerate(metrics):\n",
+ " ax[i].bar(class_metrics.keys(), [class_metrics[key][metric] for key in class_metrics.keys()])\n",
+ " ax[i].set_title(metric)\n",
+ "\n",
+ "# display the plot\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 43,
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [],
+ "source": [
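+ "# confusion matrix for logistic regression: rows are true classes, columns are predicted classes\n",
+ "confusion_matrix_LR = confusion_matrix(Y_test, predLR)"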
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 44,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "<Figure size 480x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# create a heatmap of the matrix using matshow()\n",
+ "\n",
+ "plt.matshow(confusion_matrix_LR)\n",
+ "\n",
+ "# add labels for the x and y axes\n",
+ "plt.xlabel('Predicted Class')\n",
+ "plt.ylabel('Actual Class')\n",
+ "\n",
+ "for i in range(2):\n",
+ "    for j in range(2):\n",
+ "        plt.text(j, i, confusion_matrix_LR[i, j], ha='center', va='center')\n",
+ "\n",
+ "\n",
+ "# Add custom labels for x and y ticks\n",
+ "plt.xticks([0, 1], [\"Not Churned\", \"Churned\"])\n",
+ "plt.yticks([0, 1], [\"Not Churned\", \"Churned\"])\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 45,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:42.390863Z",
+ "iopub.status.busy": "2021-11-09T03:53:42.388123Z",
+ "iopub.status.idle": "2021-11-09T03:53:42.405849Z",
+ "shell.execute_reply": "2021-11-09T03:53:42.404464Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:42.390782Z"
+ },
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.8062880324543611"
+ ]
+ },
+ "execution_count": 45,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "logmodel.score(X_train, Y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 46,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.8002839564600095"
+ ]
+ },
+ "execution_count": 46,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "accuracy_score(Y_test, predLR)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Prediction using Support Vector Classifier"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 47,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:42.527574Z",
+ "iopub.status.busy": "2021-11-09T03:53:42.526756Z",
+ "iopub.status.idle": "2021-11-09T03:53:43.842686Z",
+ "shell.execute_reply": "2021-11-09T03:53:43.841678Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:42.527527Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn.svm import SVC\n",
+ "\n",
+ "svc = SVC()\n",
+ "svc.fit(X_train, Y_train)\n",
+ "y_pred_svc = svc.predict(X_test)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 48,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:43.862493Z",
+ "iopub.status.busy": "2021-11-09T03:53:43.861822Z",
+ "iopub.status.idle": "2021-11-09T03:53:43.877207Z",
+ "shell.execute_reply": "2021-11-09T03:53:43.876226Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:43.862445Z"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "              precision    recall  f1-score   support\n",
+ "\n",
+ "           0       0.83      0.92      0.87      1557\n",
+ "           1       0.67      0.48      0.56       556\n",
+ "\n",
+ "    accuracy                           0.80      2113\n",
+ "   macro avg       0.75      0.70      0.71      2113\n",
+ "weighted avg       0.79      0.80      0.79      2113\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(classification_report(Y_test, y_pred_svc))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 49,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:43.844696Z",
+ "iopub.status.busy": "2021-11-09T03:53:43.844279Z",
+ "iopub.status.idle": "2021-11-09T03:53:43.858729Z",
+ "shell.execute_reply": "2021-11-09T03:53:43.857478Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:43.844652Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "confusion_matrix_svc = confusion_matrix(Y_test, y_pred_svc)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 50,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "<Figure size 480x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# create a heatmap of the matrix using matshow()\n",
+ "\n",
+ "plt.matshow(confusion_matrix_svc)\n",
+ "\n",
+ "# add labels for the x and y axes\n",
+ "plt.xlabel('Predicted Class')\n",
+ "plt.ylabel('Actual Class')\n",
+ "\n",
+ "for i in range(2):\n",
+ "    for j in range(2):\n",
+ "        plt.text(j, i, confusion_matrix_svc[i, j], ha='center', va='center')\n",
+ "\n",
+ " \n",
+ "# Add custom labels for x and y ticks\n",
+ "plt.xticks([0, 1], [\"Not Churned\", \"Churned\"])\n",
+ "plt.yticks([0, 1], [\"Not Churned\", \"Churned\"])\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 51,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.8170385395537525"
+ ]
+ },
+ "execution_count": 51,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "svc.score(X_train, Y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 52,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:43.879144Z",
+ "iopub.status.busy": "2021-11-09T03:53:43.878814Z",
+ "iopub.status.idle": "2021-11-09T03:53:43.885927Z",
+ "shell.execute_reply": "2021-11-09T03:53:43.884870Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:43.879102Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.8012304779933743"
+ ]
+ },
+ "execution_count": 52,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "accuracy_score(Y_test, y_pred_svc)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Prediction using Decision Tree Classifier"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 53,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:42.414719Z",
+ "iopub.status.busy": "2021-11-09T03:53:42.412027Z",
+ "iopub.status.idle": "2021-11-09T03:53:42.465457Z",
+ "shell.execute_reply": "2021-11-09T03:53:42.464395Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:42.414670Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn.tree import DecisionTreeClassifier\n",
+ "\n",
+ "dtc = DecisionTreeClassifier()\n",
+ "\n",
+ "dtc.fit(X_train, Y_train)\n",
+ "y_pred_dtc = dtc.predict(X_test)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 54,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:42.485884Z",
+ "iopub.status.busy": "2021-11-09T03:53:42.485243Z",
+ "iopub.status.idle": "2021-11-09T03:53:42.506139Z",
+ "shell.execute_reply": "2021-11-09T03:53:42.505038Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:42.485837Z"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "              precision    recall  f1-score   support\n",
+ "\n",
+ "           0       0.81      0.80      0.81      1557\n",
+ "           1       0.47      0.48      0.47       556\n",
+ "\n",
+ "    accuracy                           0.72      2113\n",
+ "   macro avg       0.64      0.64      0.64      2113\n",
+ "weighted avg       0.72      0.72      0.72      2113\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(classification_report(Y_test, y_pred_dtc))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 55,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:42.468239Z",
+ "iopub.status.busy": "2021-11-09T03:53:42.467658Z",
+ "iopub.status.idle": "2021-11-09T03:53:42.483494Z",
+ "shell.execute_reply": "2021-11-09T03:53:42.482335Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:42.468197Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "confusion_matrix_dtc = confusion_matrix(Y_test, y_pred_dtc)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 56,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "<Figure size 480x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# create a heatmap of the matrix using matshow()\n",
+ "\n",
+ "plt.matshow(confusion_matrix_dtc)\n",
+ "\n",
+ "# add labels for the x and y axes\n",
+ "plt.xlabel('Predicted Class')\n",
+ "plt.ylabel('Actual Class')\n",
+ "\n",
+ "for i in range(2):\n",
+ "    for j in range(2):\n",
+ "        plt.text(j, i, confusion_matrix_dtc[i, j], ha='center', va='center')\n",
+ "\n",
+ "\n",
+ "# Add custom labels for x and y ticks\n",
+ "plt.xticks([0, 1], [\"Not Churned\", \"Churned\"])\n",
+ "plt.yticks([0, 1], [\"Not Churned\", \"Churned\"])\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 57,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.9987829614604462"
+ ]
+ },
+ "execution_count": 57,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "dtc.score(X_train, Y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 58,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:42.512579Z",
+ "iopub.status.busy": "2021-11-09T03:53:42.511696Z",
+ "iopub.status.idle": "2021-11-09T03:53:42.524237Z",
+ "shell.execute_reply": "2021-11-09T03:53:42.523090Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:42.512525Z"
+ },
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.718409843823947"
+ ]
+ },
+ "execution_count": 58,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "accuracy_score(Y_test, y_pred_dtc)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Prediction using KNN Classifier"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 59,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:15.119418Z",
+ "iopub.status.busy": "2021-11-09T03:53:15.118718Z",
+ "iopub.status.idle": "2021-11-09T03:53:15.188313Z",
+ "shell.execute_reply": "2021-11-09T03:53:15.187419Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:15.119360Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<pre>KNeighborsClassifier(n_neighbors=30)</pre>"
+ ],
+ "text/plain": [
+ "KNeighborsClassifier(n_neighbors=30)"
+ ]
+ },
+ "execution_count": 59,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "from sklearn.neighbors import KNeighborsClassifier\n",
+ "\n",
+ "knn = KNeighborsClassifier(n_neighbors = 30)\n",
+ "knn.fit(X_train,Y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 60,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:15.190286Z",
+ "iopub.status.busy": "2021-11-09T03:53:15.189853Z",
+ "iopub.status.idle": "2021-11-09T03:53:15.800866Z",
+ "shell.execute_reply": "2021-11-09T03:53:15.799696Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:15.190238Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "pred_knn = knn.predict(X_test)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 61,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:15.840171Z",
+ "iopub.status.busy": "2021-11-09T03:53:15.839811Z",
+ "iopub.status.idle": "2021-11-09T03:53:40.333004Z",
+ "shell.execute_reply": "2021-11-09T03:53:40.332162Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:15.840125Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "error_rate= []\n",
+ "for i in range(1,40):\n",
+ " knn = KNeighborsClassifier(n_neighbors = i)\n",
+ " knn.fit(X_train,Y_train)\n",
+ " pred_i = knn.predict(X_test)\n",
+ " error_rate.append(np.mean(pred_i != Y_test))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 62,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:40.334926Z",
+ "iopub.status.busy": "2021-11-09T03:53:40.334639Z",
+ "iopub.status.idle": "2021-11-09T03:53:40.729899Z",
+ "shell.execute_reply": "2021-11-09T03:53:40.728891Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:40.334874Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Text(0, 0.5, 'Error Rate')"
+ ]
+ },
+ "execution_count": 62,
+ "metadata": {},
+ "output_type": "execute_result"
+ },
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "<Figure size 1000x600 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "plt.figure(figsize = (10,6))\n",
+ "plt.plot(range(1,40),error_rate,color = 'blue',linestyle = '--',marker = 'o',markerfacecolor='red',markersize = 10)\n",
+ "plt.title('Error Rate vs K')\n",
+ "plt.xlabel('K')\n",
+ "plt.ylabel('Error Rate')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 63,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:15.820436Z",
+ "iopub.status.busy": "2021-11-09T03:53:15.820173Z",
+ "iopub.status.idle": "2021-11-09T03:53:15.838086Z",
+ "shell.execute_reply": "2021-11-09T03:53:15.837096Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:15.820382Z"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " precision recall f1-score support\n",
+ "\n",
+ " 0 0.84 0.88 0.86 1557\n",
+ " 1 0.62 0.55 0.58 556\n",
+ "\n",
+ " accuracy 0.79 2113\n",
+ " macro avg 0.73 0.71 0.72 2113\n",
+ "weighted avg 0.79 0.79 0.79 2113\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(classification_report(Y_test,pred_knn))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 64,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:15.803343Z",
+ "iopub.status.busy": "2021-11-09T03:53:15.803004Z",
+ "iopub.status.idle": "2021-11-09T03:53:15.818621Z",
+ "shell.execute_reply": "2021-11-09T03:53:15.817622Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:15.803297Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "confusion_matrix_knn = confusion_matrix(Y_test,pred_knn)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 65,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "<Figure size 480x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# create a heatmap of the matrix using matshow()\n",
+ "\n",
+ "plt.matshow(confusion_matrix_knn)\n",
+ "\n",
+ "# add labels for the x and y axes\n",
+ "plt.xlabel('Predicted Class')\n",
+ "plt.ylabel('Actual Class')\n",
+ "\n",
+ "for i in range(2):\n",
+ " for j in range(2):\n",
+ " plt.text(j, i, confusion_matrix_knn[i, j], ha='center', va='center')\n",
+ "\n",
+ "# Add custom labels for x and y ticks\n",
+ "plt.xticks([0, 1], [\"Not Churned\", \"Churned\"])\n",
+ "plt.yticks([0, 1], [\"Not Churned\", \"Churned\"])\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 66,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.8008113590263691"
+ ]
+ },
+ "execution_count": 66,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "knn.score(X_train,Y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 67,
+ "metadata": {
+ "execution": {
+ "iopub.execute_input": "2021-11-09T03:53:40.732823Z",
+ "iopub.status.busy": "2021-11-09T03:53:40.731412Z",
+ "iopub.status.idle": "2021-11-09T03:53:42.225267Z",
+ "shell.execute_reply": "2021-11-09T03:53:42.224304Z",
+ "shell.execute_reply.started": "2021-11-09T03:53:40.732768Z"
+ },
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.792238523426408"
+ ]
+ },
+ "execution_count": 67,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "accuracy_score(Y_test, pred_knn)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Conclusion\n",
+    "Thank you for sticking with me until the end. If you want to go further with this dataset, try other classification models such as AdaBoost, Gradient Boosting, Stochastic Gradient Boosting (SGB), CatBoost, and XGBoost. You can also tune each model's hyperparameters with techniques like GridSearchCV. I do not cover those topics here, but feel free to explore them."
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.9"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
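
The conclusion of the notebook above suggests tuning hyperparameters with GridSearchCV, and its error-rate loop compares values of K by scoring on the test split. Below is a minimal sketch of choosing `n_neighbors` by cross-validation on the training split instead. The synthetic data from `make_classification` (a stand-in for the encoded Telco features), the grid of K values, and the scaling step are all assumptions for illustration, not part of the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic, imbalanced binary data standing in for the churn features.
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           weights=[0.73, 0.27], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Scale inside the pipeline so each CV fold is scaled using only its own training part.
pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())

# make_pipeline names each step after its lowercased class name.
param_grid = {"kneighborsclassifier__n_neighbors": list(range(1, 40, 2))}

# 5-fold cross-validation on the training split only; the test split stays untouched.
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_k = search.best_params_["kneighborsclassifier__n_neighbors"]
# The best pipeline is refit on the full training split before this final score.
test_accuracy = search.score(X_test, y_test)
print("best K:", best_k, "test accuracy:", round(test_accuracy, 3))
```

Selecting K this way keeps the test split for a single final estimate, so the reported accuracy is not influenced by the choice of K.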
diff --git a/Data Prediction/Tele Churn/.ipynb_checkpoints/tele_churn-checkpoint.ipynb b/Data Prediction/Tele Churn/.ipynb_checkpoints/tele_churn-checkpoint.ipynb
new file mode 100644
index 0000000..a603791
--- /dev/null
+++ b/Data Prediction/Tele Churn/.ipynb_checkpoints/tele_churn-checkpoint.ipynb
@@ -0,0 +1,5535 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "211755a3",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# pip install matplotlib pandas seaborn missingno plotly scikit-learn xgboost catboost lightgbm"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "be0c7624-8210-4045-9033-2176cb5211ef",
+ "metadata": {},
+ "source": [
+ "# Importing Libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 23,
+ "id": "3ce7f5d0",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ " <script type=\"text/javascript\">\n",
+ " window.PlotlyConfig = {MathJaxConfig: 'local'};\n",
+ " if (window.MathJax && window.MathJax.Hub && window.MathJax.Hub.Config) {window.MathJax.Hub.Config({SVG: {font: \"STIX-Web\"}});}\n",
+ " if (typeof require !== 'undefined') {\n",
+ " require.undef(\"plotly\");\n",
+ " requirejs.config({\n",
+ " paths: {\n",
+ " 'plotly': ['https://cdn.plot.ly/plotly-2.27.0.min']\n",
+ " }\n",
+ " });\n",
+ " require(['plotly'], function(Plotly) {\n",
+ " window._Plotly = Plotly;\n",
+ " });\n",
+ " }\n",
+ " </script>\n",
+ " "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "import warnings\n",
+ "warnings.simplefilter(action='ignore')\n",
+ "import matplotlib.pyplot as plt\n",
+ "import numpy as np\n",
+ "import pandas as pd\n",
+ "import seaborn as sns\n",
+ "import missingno as msno\n",
+ "from plotly.offline import plot, iplot, init_notebook_mode\n",
+ "init_notebook_mode(connected=True)\n",
+ "import plotly.express as px\n",
+ "import plotly.graph_objects as go\n",
+ "from plotly.subplots import make_subplots\n",
+ "from sklearn.preprocessing import MinMaxScaler, LabelEncoder, StandardScaler, RobustScaler\n",
+ "from sklearn.model_selection import GridSearchCV, cross_validate\n",
+ "from sklearn.metrics import roc_auc_score,roc_curve, classification_report, confusion_matrix, accuracy_score\n",
+ "from sklearn.metrics import RocCurveDisplay\n",
+    "from sklearn.model_selection import train_test_split\n",
+ "from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, VotingClassifier, AdaBoostClassifier\n",
+ "from sklearn.linear_model import LogisticRegression\n",
+ "from sklearn.neighbors import KNeighborsClassifier\n",
+ "from sklearn.tree import DecisionTreeClassifier\n",
+ "from sklearn.preprocessing import StandardScaler\n",
+ "from xgboost import XGBClassifier\n",
+ "from catboost import CatBoostClassifier\n",
+ "from lightgbm import LGBMClassifier\n",
+ "from sklearn.exceptions import ConvergenceWarning\n",
+ "import tkinter\n",
+ "from collections import Counter\n",
+ "\n",
+ "pd.set_option('display.max_columns', None)\n",
+ "pd.set_option('display.max_rows', None)\n",
+ "pd.set_option('display.float_format', lambda x: '%.3f' % x)\n",
+ "pd.set_option('display.width', 500)\n",
+ "\n",
+ "# %matplotlib inline"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "7a35dd24-0ced-4d2d-930c-064a0be81b50",
+ "metadata": {},
+ "source": [
+ "# Loading Dataset"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "id": "fe7fdda7",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "df1 = pd.read_csv(\"WA_Fn-UseC_-Telco-Customer-Churn.csv\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "147cc94e-f526-4f19-9101-caac1a7a8f4e",
+ "metadata": {},
+ "source": [
+    "# Exploratory Data Analysis"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "id": "eccab7dd-6d55-4d84-9ef2-10c8b9fefc93",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>customerID</th>\n",
+ " <th>gender</th>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <th>Partner</th>\n",
+ " <th>Dependents</th>\n",
+ " <th>tenure</th>\n",
+ " <th>PhoneService</th>\n",
+ " <th>MultipleLines</th>\n",
+ " <th>InternetService</th>\n",
+ " <th>OnlineSecurity</th>\n",
+ " <th>OnlineBackup</th>\n",
+ " <th>DeviceProtection</th>\n",
+ " <th>TechSupport</th>\n",
+ " <th>StreamingTV</th>\n",
+ " <th>StreamingMovies</th>\n",
+ " <th>Contract</th>\n",
+ " <th>PaperlessBilling</th>\n",
+ " <th>PaymentMethod</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " <th>Churn</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>7590-VHVEG</td>\n",
+ " <td>Female</td>\n",
+ " <td>0</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>1</td>\n",
+ " <td>No</td>\n",
+ " <td>No phone service</td>\n",
+ " <td>DSL</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Electronic check</td>\n",
+ " <td>29.850</td>\n",
+ " <td>29.85</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>1</th>\n",
+ " <td>5575-GNVDE</td>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>34</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>One year</td>\n",
+ " <td>No</td>\n",
+ " <td>Mailed check</td>\n",
+ " <td>56.950</td>\n",
+ " <td>1889.5</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>2</th>\n",
+ " <td>3668-QPYBK</td>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>2</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Mailed check</td>\n",
+ " <td>53.850</td>\n",
+ " <td>108.15</td>\n",
+ " <td>Yes</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>3</th>\n",
+ " <td>7795-CFOCW</td>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>45</td>\n",
+ " <td>No</td>\n",
+ " <td>No phone service</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>One year</td>\n",
+ " <td>No</td>\n",
+ " <td>Bank transfer (automatic)</td>\n",
+ " <td>42.300</td>\n",
+ " <td>1840.75</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>4</th>\n",
+ " <td>9237-HQITU</td>\n",
+ " <td>Female</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>2</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Fiber optic</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Electronic check</td>\n",
+ " <td>70.700</td>\n",
+ " <td>151.65</td>\n",
+ " <td>Yes</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " customerID gender SeniorCitizen Partner Dependents tenure PhoneService MultipleLines InternetService OnlineSecurity OnlineBackup DeviceProtection TechSupport StreamingTV StreamingMovies Contract PaperlessBilling PaymentMethod MonthlyCharges TotalCharges Churn\n",
+ "0 7590-VHVEG Female 0 Yes No 1 No No phone service DSL No Yes No No No No Month-to-month Yes Electronic check 29.850 29.85 No\n",
+ "1 5575-GNVDE Male 0 No No 34 Yes No DSL Yes No Yes No No No One year No Mailed check 56.950 1889.5 No\n",
+ "2 3668-QPYBK Male 0 No No 2 Yes No DSL Yes Yes No No No No Month-to-month Yes Mailed check 53.850 108.15 Yes\n",
+ "3 7795-CFOCW Male 0 No No 45 No No phone service DSL Yes No Yes Yes No No One year No Bank transfer (automatic) 42.300 1840.75 No\n",
+ "4 9237-HQITU Female 0 No No 2 Yes No Fiber optic No No No No No No Month-to-month Yes Electronic check 70.700 151.65 Yes"
+ ]
+ },
+ "execution_count": 25,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df = df1.copy()\n",
+ "df.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "id": "072cc34f-b29e-41ae-998a-3796be0dcc86",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>customerID</th>\n",
+ " <th>gender</th>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <th>Partner</th>\n",
+ " <th>Dependents</th>\n",
+ " <th>tenure</th>\n",
+ " <th>PhoneService</th>\n",
+ " <th>MultipleLines</th>\n",
+ " <th>InternetService</th>\n",
+ " <th>OnlineSecurity</th>\n",
+ " <th>OnlineBackup</th>\n",
+ " <th>DeviceProtection</th>\n",
+ " <th>TechSupport</th>\n",
+ " <th>StreamingTV</th>\n",
+ " <th>StreamingMovies</th>\n",
+ " <th>Contract</th>\n",
+ " <th>PaperlessBilling</th>\n",
+ " <th>PaymentMethod</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " <th>Churn</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>7038</th>\n",
+ " <td>6840-RESVB</td>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>24</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>One year</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Mailed check</td>\n",
+ " <td>84.800</td>\n",
+ " <td>1990.5</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>7039</th>\n",
+ " <td>2234-XADUH</td>\n",
+ " <td>Female</td>\n",
+ " <td>0</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>72</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Fiber optic</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>One year</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Credit card (automatic)</td>\n",
+ " <td>103.200</td>\n",
+ " <td>7362.9</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>7040</th>\n",
+ " <td>4801-JZAZL</td>\n",
+ " <td>Female</td>\n",
+ " <td>0</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>11</td>\n",
+ " <td>No</td>\n",
+ " <td>No phone service</td>\n",
+ " <td>DSL</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Electronic check</td>\n",
+ " <td>29.600</td>\n",
+ " <td>346.45</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>7041</th>\n",
+ " <td>8361-LTMKD</td>\n",
+ " <td>Male</td>\n",
+ " <td>1</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>4</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Fiber optic</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>Month-to-month</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Mailed check</td>\n",
+ " <td>74.400</td>\n",
+ " <td>306.6</td>\n",
+ " <td>Yes</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>7042</th>\n",
+ " <td>3186-AJIEK</td>\n",
+ " <td>Male</td>\n",
+ " <td>0</td>\n",
+ " <td>No</td>\n",
+ " <td>No</td>\n",
+ " <td>66</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Fiber optic</td>\n",
+ " <td>Yes</td>\n",
+ " <td>No</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Two year</td>\n",
+ " <td>Yes</td>\n",
+ " <td>Bank transfer (automatic)</td>\n",
+ " <td>105.650</td>\n",
+ " <td>6844.5</td>\n",
+ " <td>No</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " customerID gender SeniorCitizen Partner Dependents tenure PhoneService MultipleLines InternetService OnlineSecurity OnlineBackup DeviceProtection TechSupport StreamingTV StreamingMovies Contract PaperlessBilling PaymentMethod MonthlyCharges TotalCharges Churn\n",
+ "7038 6840-RESVB Male 0 Yes Yes 24 Yes Yes DSL Yes No Yes Yes Yes Yes One year Yes Mailed check 84.800 1990.5 No\n",
+ "7039 2234-XADUH Female 0 Yes Yes 72 Yes Yes Fiber optic No Yes Yes No Yes Yes One year Yes Credit card (automatic) 103.200 7362.9 No\n",
+ "7040 4801-JZAZL Female 0 Yes Yes 11 No No phone service DSL Yes No No No No No Month-to-month Yes Electronic check 29.600 346.45 No\n",
+ "7041 8361-LTMKD Male 1 Yes No 4 Yes Yes Fiber optic No No No No No No Month-to-month Yes Mailed check 74.400 306.6 Yes\n",
+ "7042 3186-AJIEK Male 0 No No 66 Yes No Fiber optic Yes No Yes Yes Yes Yes Two year Yes Bank transfer (automatic) 105.650 6844.5 No"
+ ]
+ },
+ "execution_count": 26,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.tail()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "id": "9c64a681",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <th>tenure</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>count</th>\n",
+ " <td>7043.000</td>\n",
+ " <td>7043.000</td>\n",
+ " <td>7043.000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>mean</th>\n",
+ " <td>0.162</td>\n",
+ " <td>32.371</td>\n",
+ " <td>64.762</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>std</th>\n",
+ " <td>0.369</td>\n",
+ " <td>24.559</td>\n",
+ " <td>30.090</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>min</th>\n",
+ " <td>0.000</td>\n",
+ " <td>0.000</td>\n",
+ " <td>18.250</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>25%</th>\n",
+ " <td>0.000</td>\n",
+ " <td>9.000</td>\n",
+ " <td>35.500</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>50%</th>\n",
+ " <td>0.000</td>\n",
+ " <td>29.000</td>\n",
+ " <td>70.350</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>75%</th>\n",
+ " <td>0.000</td>\n",
+ " <td>55.000</td>\n",
+ " <td>89.850</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>max</th>\n",
+ " <td>1.000</td>\n",
+ " <td>72.000</td>\n",
+ " <td>118.750</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " SeniorCitizen tenure MonthlyCharges\n",
+ "count 7043.000 7043.000 7043.000\n",
+ "mean 0.162 32.371 64.762\n",
+ "std 0.369 24.559 30.090\n",
+ "min 0.000 0.000 18.250\n",
+ "25% 0.000 9.000 35.500\n",
+ "50% 0.000 29.000 70.350\n",
+ "75% 0.000 55.000 89.850\n",
+ "max 1.000 72.000 118.750"
+ ]
+ },
+ "execution_count": 27,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.describe()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "0d330dc4",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>count</th>\n",
+ " <th>mean</th>\n",
+ " <th>std</th>\n",
+ " <th>min</th>\n",
+ " <th>25%</th>\n",
+ " <th>50%</th>\n",
+ " <th>75%</th>\n",
+ " <th>max</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <td>7043.000</td>\n",
+ " <td>0.162</td>\n",
+ " <td>0.369</td>\n",
+ " <td>0.000</td>\n",
+ " <td>0.000</td>\n",
+ " <td>0.000</td>\n",
+ " <td>0.000</td>\n",
+ " <td>1.000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>tenure</th>\n",
+ " <td>7043.000</td>\n",
+ " <td>32.371</td>\n",
+ " <td>24.559</td>\n",
+ " <td>0.000</td>\n",
+ " <td>9.000</td>\n",
+ " <td>29.000</td>\n",
+ " <td>55.000</td>\n",
+ " <td>72.000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <td>7043.000</td>\n",
+ " <td>64.762</td>\n",
+ " <td>30.090</td>\n",
+ " <td>18.250</td>\n",
+ " <td>35.500</td>\n",
+ " <td>70.350</td>\n",
+ " <td>89.850</td>\n",
+ " <td>118.750</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " count mean std min 25% 50% 75% max\n",
+ "SeniorCitizen 7043.000 0.162 0.369 0.000 0.000 0.000 0.000 1.000\n",
+ "tenure 7043.000 32.371 24.559 0.000 9.000 29.000 55.000 72.000\n",
+ "MonthlyCharges 7043.000 64.762 30.090 18.250 35.500 70.350 89.850 118.750"
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.describe().T"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "a30694a2-5fec-4755-871a-ed37833b7c3c",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "(7043, 21)"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.shape"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "id": "f6e03abf-0d4e-4f82-b40d-a89f7cb5e102",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Index(['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents', 'tenure', 'PhoneService', 'MultipleLines', 'InternetService', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling', 'PaymentMethod', 'MonthlyCharges', 'TotalCharges', 'Churn'], dtype='object')"
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.columns"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "id": "2962acc6-4b0b-4fab-96ba-1af16ddf8dcc",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "customerID 0\n",
+ "gender 0\n",
+ "SeniorCitizen 0\n",
+ "Partner 0\n",
+ "Dependents 0\n",
+ "tenure 0\n",
+ "PhoneService 0\n",
+ "MultipleLines 0\n",
+ "InternetService 0\n",
+ "OnlineSecurity 0\n",
+ "OnlineBackup 0\n",
+ "DeviceProtection 0\n",
+ "TechSupport 0\n",
+ "StreamingTV 0\n",
+ "StreamingMovies 0\n",
+ "Contract 0\n",
+ "PaperlessBilling 0\n",
+ "PaymentMethod 0\n",
+ "MonthlyCharges 0\n",
+ "TotalCharges 0\n",
+ "Churn 0\n",
+ "dtype: int64"
+ ]
+ },
+ "execution_count": 12,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.isnull().sum()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "76760a8d-b933-47d6-8357-de5fad75fe1d",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.duplicated().sum()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "09aeaead",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\u001b[1m******************** SHAPE ********************\u001b[0m\n",
+ "(7043, 21)\n",
+ "\u001b[1m******************** TYPES ********************\u001b[0m\n",
+ "customerID object\n",
+ "gender object\n",
+ "SeniorCitizen int64\n",
+ "Partner object\n",
+ "Dependents object\n",
+ "tenure int64\n",
+ "PhoneService object\n",
+ "MultipleLines object\n",
+ "InternetService object\n",
+ "OnlineSecurity object\n",
+ "OnlineBackup object\n",
+ "DeviceProtection object\n",
+ "TechSupport object\n",
+ "StreamingTV object\n",
+ "StreamingMovies object\n",
+ "Contract object\n",
+ "PaperlessBilling object\n",
+ "PaymentMethod object\n",
+ "MonthlyCharges float64\n",
+ "TotalCharges object\n",
+ "Churn object\n",
+ "dtype: object\n",
+ "\u001b[1m******************** NA ********************\u001b[0m\n",
+ "customerID 0\n",
+ "gender 0\n",
+ "SeniorCitizen 0\n",
+ "Partner 0\n",
+ "Dependents 0\n",
+ "tenure 0\n",
+ "PhoneService 0\n",
+ "MultipleLines 0\n",
+ "InternetService 0\n",
+ "OnlineSecurity 0\n",
+ "OnlineBackup 0\n",
+ "DeviceProtection 0\n",
+ "TechSupport 0\n",
+ "StreamingTV 0\n",
+ "StreamingMovies 0\n",
+ "Contract 0\n",
+ "PaperlessBilling 0\n",
+ "PaymentMethod 0\n",
+ "MonthlyCharges 0\n",
+ "TotalCharges 0\n",
+ "Churn 0\n",
+ "dtype: int64\n",
+ "\u001b[1m******************** DUPLICATED VALUE ********************\u001b[0m\n",
+ "0\n"
+ ]
+ }
+ ],
+ "source": [
+ "def check_df(dataframe, head=10):\n",
+ " \n",
+ " print('\\033[1m' + 20*\"*\" + ' SHAPE ' + 20*\"*\" + '\\033[0m')\n",
+ " print(dataframe.shape)\n",
+ " \n",
+ " print('\\033[1m' + 20*\"*\" + ' TYPES ' + 20*\"*\" + '\\033[0m')\n",
+ " print(dataframe.dtypes)\n",
+ " \n",
+ " print('\\033[1m' + 20*\"*\" + ' NA ' + 20*\"*\" + '\\033[0m')\n",
+ " print(dataframe.isnull().sum())\n",
+ " \n",
+ " print('\\033[1m' + 20*\"*\" + ' DUPLICATED VALUE ' + 20*\"*\" + '\\033[0m')\n",
+ " print(dataframe.duplicated().sum())\n",
+ " \n",
+ "check_df(df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "09adc81e-77f4-4ebe-8f85-e0ea6da42b7c",
+ "metadata": {},
+ "source": [
+ "### Data Cleaning"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "id": "b2f45ccf",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+    "# Basic cleanup before analysis.\n",
+    "# customerID is only an identifier, so we drop it from the dataset.\n",
+    "df = df.drop(['customerID'], axis = 1)\n",
+    "\n",
+    "# Encode the Churn target as 1/0 instead of Yes/No.\n",
+    "df[\"Churn\"] = df[\"Churn\"].replace({\"Yes\":1, \"No\":0})\n",
+    "\n",
+    "# Convert TotalCharges to numeric; values that cannot be parsed become NaN.\n",
+    "df.TotalCharges = pd.to_numeric(df.TotalCharges, errors='coerce')\n",
+    "\n",
+    "# SeniorCitizen is a 0/1 flag, so we store it as object (categorical) rather than integer.\n",
+    "df[\"SeniorCitizen\"] = df[\"SeniorCitizen\"].astype(\"O\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 22,
+ "id": "f21e4206",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "<class 'pandas.core.frame.DataFrame'>\n",
+ "RangeIndex: 7043 entries, 0 to 7042\n",
+ "Data columns (total 20 columns):\n",
+ " # Column Non-Null Count Dtype \n",
+ "--- ------ -------------- ----- \n",
+ " 0 gender 7043 non-null object \n",
+ " 1 SeniorCitizen 7043 non-null object \n",
+ " 2 Partner 7043 non-null object \n",
+ " 3 Dependents 7043 non-null object \n",
+ " 4 tenure 7043 non-null int64 \n",
+ " 5 PhoneService 7043 non-null object \n",
+ " 6 MultipleLines 7043 non-null object \n",
+ " 7 InternetService 7043 non-null object \n",
+ " 8 OnlineSecurity 7043 non-null object \n",
+ " 9 OnlineBackup 7043 non-null object \n",
+ " 10 DeviceProtection 7043 non-null object \n",
+ " 11 TechSupport 7043 non-null object \n",
+ " 12 StreamingTV 7043 non-null object \n",
+ " 13 StreamingMovies 7043 non-null object \n",
+ " 14 Contract 7043 non-null object \n",
+ " 15 PaperlessBilling 7043 non-null object \n",
+ " 16 PaymentMethod 7043 non-null object \n",
+ " 17 MonthlyCharges 7043 non-null float64\n",
+ " 18 TotalCharges 7032 non-null float64\n",
+ " 19 Churn 7043 non-null int64 \n",
+ "dtypes: float64(2), int64(2), object(16)\n",
+ "memory usage: 1.1+ MB\n"
+ ]
+ }
+ ],
+ "source": [
+ "df.info()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "id": "089edcff",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Text(0.5, 1.02, 'Count of Target Variable per Category')"
+ ]
+ },
+ "execution_count": 26,
+ "metadata": {},
+ "output_type": "execute_result"
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 800x600 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "df['Churn'].value_counts().plot(kind='barh', figsize=(8,6))\n",
+ "plt.xlabel(\"Count\", labelpad=14)\n",
+ "plt.ylabel(\"Target Variable\", labelpad=14)\n",
+ "plt.title(\"Count of Target Variable per Category\", y=1.02)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "id": "9eae1591",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Churn\n",
+ "0 5174\n",
+ "1 1869\n",
+ "Name: count, dtype: int64"
+ ]
+ },
+ "execution_count": 27,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df['Churn'].value_counts()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "id": "63180e4a",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Churn\n",
+ "0 73.463\n",
+ "1 26.537\n",
+ "Name: count, dtype: float64"
+ ]
+ },
+ "execution_count": 28,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "100*df['Churn'].value_counts()/len(df['Churn'])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 29,
+ "id": "3048dfc6",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "<class 'pandas.core.frame.DataFrame'>\n",
+ "RangeIndex: 7043 entries, 0 to 7042\n",
+ "Data columns (total 20 columns):\n",
+ " #   Column            Non-Null Count  Dtype  \n",
+ "---  ------            --------------  -----  \n",
+ " 0   gender            7043 non-null   object \n",
+ " 1   SeniorCitizen     7043 non-null   object \n",
+ " 2   Partner           7043 non-null   object \n",
+ " 3   Dependents        7043 non-null   object \n",
+ " 4   tenure            7043 non-null   int64  \n",
+ " 5   PhoneService      7043 non-null   object \n",
+ " 6   MultipleLines     7043 non-null   object \n",
+ " 7   InternetService   7043 non-null   object \n",
+ " 8   OnlineSecurity    7043 non-null   object \n",
+ " 9   OnlineBackup      7043 non-null   object \n",
+ " 10  DeviceProtection  7043 non-null   object \n",
+ " 11  TechSupport       7043 non-null   object \n",
+ " 12  StreamingTV       7043 non-null   object \n",
+ " 13  StreamingMovies   7043 non-null   object \n",
+ " 14  Contract          7043 non-null   object \n",
+ " 15  PaperlessBilling  7043 non-null   object \n",
+ " 16  PaymentMethod     7043 non-null   object \n",
+ " 17  MonthlyCharges    7043 non-null   float64\n",
+ " 18  TotalCharges      7032 non-null   float64\n",
+ " 19  Churn             7043 non-null   int64  \n",
+ "dtypes: float64(2), int64(2), object(16)\n",
+ "memory usage: 1.1+ MB\n"
+ ]
+ }
+ ],
+ "source": [
+ "df.info()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 30,
+ "id": "54fbf0bc",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "gender 0\n",
+ "SeniorCitizen 0\n",
+ "Partner 0\n",
+ "Dependents 0\n",
+ "tenure 0\n",
+ "PhoneService 0\n",
+ "MultipleLines 0\n",
+ "InternetService 0\n",
+ "OnlineSecurity 0\n",
+ "OnlineBackup 0\n",
+ "DeviceProtection 0\n",
+ "TechSupport 0\n",
+ "StreamingTV 0\n",
+ "StreamingMovies 0\n",
+ "Contract 0\n",
+ "PaperlessBilling 0\n",
+ "PaymentMethod 0\n",
+ "MonthlyCharges 0\n",
+ "TotalCharges 11\n",
+ "Churn 0\n",
+ "dtype: int64"
+ ]
+ },
+ "execution_count": 30,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.isna().sum()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 31,
+ "id": "8b1e531b",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Observations: 7043\n",
+ "Variables: 20\n",
+ "cat_cols: 17\n",
+ "num_cols: 3\n",
+ "cat_but_car: 0\n",
+ "num_but_cat: 1\n"
+ ]
+ }
+ ],
+ "source": [
+ "def grab_col_names(dataframe, cat_th=10, car_th=20):\n",
+ "    \"\"\"\n",
+ "    Return the names of the categorical, numeric, and categorical-but-cardinal variables in the dataset.\n",
+ "    Note: numeric variables with fewer than cat_th unique values are also counted as categorical.\n",
+ "\n",
+ "    Parameters\n",
+ "    ------\n",
+ "    dataframe: DataFrame\n",
+ "        The dataframe from which variable names are to be retrieved\n",
+ "    cat_th: int, optional\n",
+ "        Unique-value threshold below which a numeric variable is treated as categorical\n",
+ "    car_th: int, optional\n",
+ "        Unique-value threshold above which a categorical variable is treated as cardinal\n",
+ "\n",
+ "    Returns\n",
+ "    ------\n",
+ "    cat_cols: list\n",
+ "        Categorical variable list\n",
+ "    num_cols: list\n",
+ "        Numeric variable list\n",
+ "    cat_but_car: list\n",
+ "        Categorical but cardinal variable list\n",
+ "\n",
+ "    Notes\n",
+ "    ------\n",
+ "    cat_cols + num_cols + cat_but_car = total number of variables\n",
+ "    num_but_cat is inside cat_cols\n",
+ "    \"\"\"\n",
+ "\n",
+ "    # cat_cols, cat_but_car: object-dtype columns are categorical by default\n",
+ "    cat_cols = [col for col in dataframe.columns if dataframe[col].dtypes == \"O\"]\n",
+ "    # Numeric columns with few unique values behave like categories\n",
+ "    num_but_cat = [col for col in dataframe.columns if dataframe[col].nunique() < cat_th and\n",
+ "                   dataframe[col].dtypes != \"O\"]\n",
+ "    # Object columns with many unique values are high-cardinality, not useful categories\n",
+ "    cat_but_car = [col for col in dataframe.columns if dataframe[col].nunique() > car_th and\n",
+ "                   dataframe[col].dtypes == \"O\"]\n",
+ "    cat_cols = cat_cols + num_but_cat\n",
+ "    cat_cols = [col for col in cat_cols if col not in cat_but_car]\n",
+ "\n",
+ "    # num_cols: non-object columns that were not reclassified as categorical\n",
+ "    num_cols = [col for col in dataframe.columns if dataframe[col].dtypes != \"O\"]\n",
+ "    num_cols = [col for col in num_cols if col not in num_but_cat]\n",
+ "\n",
+ "    print(f\"Observations: {dataframe.shape[0]}\")\n",
+ "    print(f\"Variables: {dataframe.shape[1]}\")\n",
+ "    print(f'cat_cols: {len(cat_cols)}')\n",
+ "    print(f'num_cols: {len(num_cols)}')\n",
+ "    print(f'cat_but_car: {len(cat_but_car)}')\n",
+ "    print(f'num_but_cat: {len(num_but_cat)}')\n",
+ "    return cat_cols, num_cols, cat_but_car\n",
+ "\n",
+ "cat_cols, num_cols, cat_but_car = grab_col_names(df)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 32,
+ "id": "5831c338",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['tenure', 'MonthlyCharges', 'TotalCharges']"
+ ]
+ },
+ "execution_count": 32,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "num_cols"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 33,
+ "id": "02d09051",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['gender',\n",
+ " 'SeniorCitizen',\n",
+ " 'Partner',\n",
+ " 'Dependents',\n",
+ " 'PhoneService',\n",
+ " 'MultipleLines',\n",
+ " 'InternetService',\n",
+ " 'OnlineSecurity',\n",
+ " 'OnlineBackup',\n",
+ " 'DeviceProtection',\n",
+ " 'TechSupport',\n",
+ " 'StreamingTV',\n",
+ " 'StreamingMovies',\n",
+ " 'Contract',\n",
+ " 'PaperlessBilling',\n",
+ " 'PaymentMethod',\n",
+ " 'Churn']"
+ ]
+ },
+ "execution_count": 33,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cat_cols"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2c805c38",
+ "metadata": {},
+ "source": [
+ "## Data Visualization"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 34,
+ "id": "6bb2c162",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### gender ############################\n",
+ " gender Ratio\n",
+ "gender \n",
+ "Male 3555 50.476\n",
+ "Female 3488 49.524\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### SeniorCitizen ############################\n",
+ " SeniorCitizen Ratio\n",
+ "SeniorCitizen \n",
+ "0 5901 83.785\n",
+ "1 1142 16.215\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### Partner ############################\n",
+ " Partner Ratio\n",
+ "Partner \n",
+ "No 3641 51.697\n",
+ "Yes 3402 48.303\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### Dependents ############################\n",
+ " Dependents Ratio\n",
+ "Dependents \n",
+ "No 4933 70.041\n",
+ "Yes 2110 29.959\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### PhoneService ############################\n",
+ " PhoneService Ratio\n",
+ "PhoneService \n",
+ "Yes 6361 90.317\n",
+ "No 682 9.683\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### MultipleLines ############################\n",
+ " MultipleLines Ratio\n",
+ "MultipleLines \n",
+ "No 3390 48.133\n",
+ "Yes 2971 42.184\n",
+ "No phone service 682 9.683\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### InternetService ############################\n",
+ " InternetService Ratio\n",
+ "InternetService \n",
+ "Fiber optic 3096 43.959\n",
+ "DSL 2421 34.375\n",
+ "No 1526 21.667\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### OnlineSecurity ############################\n",
+ " OnlineSecurity Ratio\n",
+ "OnlineSecurity \n",
+ "No 3498 49.666\n",
+ "Yes 2019 28.667\n",
+ "No internet service 1526 21.667\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### OnlineBackup ############################\n",
+ " OnlineBackup Ratio\n",
+ "OnlineBackup \n",
+ "No 3088 43.845\n",
+ "Yes 2429 34.488\n",
+ "No internet service 1526 21.667\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### DeviceProtection ############################\n",
+ " DeviceProtection Ratio\n",
+ "DeviceProtection \n",
+ "No 3095 43.944\n",
+ "Yes 2422 34.389\n",
+ "No internet service 1526 21.667\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### TechSupport ############################\n",
+ " TechSupport Ratio\n",
+ "TechSupport \n",
+ "No 3473 49.311\n",
+ "Yes 2044 29.022\n",
+ "No internet service 1526 21.667\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### StreamingTV ############################\n",
+ " StreamingTV Ratio\n",
+ "StreamingTV \n",
+ "No 2810 39.898\n",
+ "Yes 2707 38.435\n",
+ "No internet service 1526 21.667\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### StreamingMovies ############################\n",
+ " StreamingMovies Ratio\n",
+ "StreamingMovies \n",
+ "No 2785 39.543\n",
+ "Yes 2732 38.790\n",
+ "No internet service 1526 21.667\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### Contract ############################\n",
+ " Contract Ratio\n",
+ "Contract \n",
+ "Month-to-month 3875 55.019\n",
+ "Two year 1695 24.066\n",
+ "One year 1473 20.914\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### PaperlessBilling ############################\n",
+ " PaperlessBilling Ratio\n",
+ "PaperlessBilling \n",
+ "Yes 4171 59.222\n",
+ "No 2872 40.778\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### PaymentMethod ############################\n",
+ " PaymentMethod Ratio\n",
+ "PaymentMethod \n",
+ "Electronic check 2365 33.579\n",
+ "Mailed check 1612 22.888\n",
+ "Bank transfer (automatic) 1544 21.922\n",
+ "Credit card (automatic) 1522 21.610\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "##################### Churn ############################\n",
+ " Churn Ratio\n",
+ "Churn \n",
+ "0 5174 73.463\n",
+ "1 1869 26.537\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "def cat_summary(dataframe, col_name, plot=False):\n",
+ "    # Print the count and percentage share of each category in col_name\n",
+ "    print('#####################', col_name, '############################')\n",
+ "    print(pd.DataFrame({col_name: dataframe[col_name].value_counts(),\n",
+ "                        'Ratio': 100 * dataframe[col_name].value_counts() / len(dataframe)}))\n",
+ "\n",
+ "    if plot:\n",
+ "        # Bar chart of the category counts\n",
+ "        sns.countplot(x=col_name, data=dataframe, edgecolor='black', color='#D9F9C4')\n",
+ "        plt.show(block=True)\n",
+ "\n",
+ "for col in cat_cols:\n",
+ "    cat_summary(df, col, plot=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 37,
+ "id": "03a89b6e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from plotly.subplots import make_subplots"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 38,
+ "id": "884f9b09",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.plotly.v1+json": {
+ "config": {
+ "plotlyServerURL": "https://plot.ly"
+ },
+ "data": [
+ {
+ "domain": {
+ "x": [
+ 0,
+ 0.45
+ ],
+ "y": [
+ 0,
+ 1
+ ]
+ },
+ "hole": 0.4,
+ "hoverinfo": "label+percent+name",
+ "labels": [
+ "Male",
+ "Female"
+ ],
+ "name": "Gender",
+ "textfont": {
+ "size": 16
+ },
+ "type": "pie",
+ "values": [
+ 3555,
+ 3488
+ ]
+ },
+ {
+ "domain": {
+ "x": [
+ 0.55,
+ 1
+ ],
+ "y": [
+ 0,
+ 1
+ ]
+ },
+ "hole": 0.4,
+ "hoverinfo": "label+percent+name",
+ "labels": [
+ "No",
+ "Yes"
+ ],
+ "name": "Churn",
+ "textfont": {
+ "size": 16
+ },
+ "type": "pie",
+ "values": [
+ 5174,
+ 1869
+ ]
+ }
+ ],
+ "layout": {
+ "annotations": [
+ {
+ "font": {
+ "size": 20
+ },
+ "showarrow": false,
+ "text": "Gender",
+ "x": 0.16,
+ "y": 0.5
+ },
+ {
+ "font": {
+ "size": 20
+ },
+ "showarrow": false,
+ "text": "Churn",
+ "x": 0.84,
+ "y": 0.5
+ }
+ ],
+ "autosize": true,
+ "template": {
+ "data": {
+ "bar": [
+ {
+ "error_x": {
+ "color": "#2a3f5f"
+ },
+ "error_y": {
+ "color": "#2a3f5f"
+ },
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "bar"
+ }
+ ],
+ "barpolar": [
+ {
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "barpolar"
+ }
+ ],
+ "carpet": [
+ {
+ "aaxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "baxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "type": "carpet"
+ }
+ ],
+ "choropleth": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "choropleth"
+ }
+ ],
+ "contour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "contour"
+ }
+ ],
+ "contourcarpet": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "contourcarpet"
+ }
+ ],
+ "heatmap": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmap"
+ }
+ ],
+ "heatmapgl": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmapgl"
+ }
+ ],
+ "histogram": [
+ {
+ "marker": {
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "histogram"
+ }
+ ],
+ "histogram2d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2d"
+ }
+ ],
+ "histogram2dcontour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2dcontour"
+ }
+ ],
+ "mesh3d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "mesh3d"
+ }
+ ],
+ "parcoords": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "parcoords"
+ }
+ ],
+ "pie": [
+ {
+ "automargin": true,
+ "type": "pie"
+ }
+ ],
+ "scatter": [
+ {
+ "fillpattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ },
+ "type": "scatter"
+ }
+ ],
+ "scatter3d": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter3d"
+ }
+ ],
+ "scattercarpet": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattercarpet"
+ }
+ ],
+ "scattergeo": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergeo"
+ }
+ ],
+ "scattergl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergl"
+ }
+ ],
+ "scattermapbox": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattermapbox"
+ }
+ ],
+ "scatterpolar": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolar"
+ }
+ ],
+ "scatterpolargl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolargl"
+ }
+ ],
+ "scatterternary": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterternary"
+ }
+ ],
+ "surface": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "surface"
+ }
+ ],
+ "table": [
+ {
+ "cells": {
+ "fill": {
+ "color": "#EBF0F8"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "header": {
+ "fill": {
+ "color": "#C8D4E3"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "type": "table"
+ }
+ ]
+ },
+ "layout": {
+ "annotationdefaults": {
+ "arrowcolor": "#2a3f5f",
+ "arrowhead": 0,
+ "arrowwidth": 1
+ },
+ "autotypenumbers": "strict",
+ "coloraxis": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "colorscale": {
+ "diverging": [
+ [
+ 0,
+ "#8e0152"
+ ],
+ [
+ 0.1,
+ "#c51b7d"
+ ],
+ [
+ 0.2,
+ "#de77ae"
+ ],
+ [
+ 0.3,
+ "#f1b6da"
+ ],
+ [
+ 0.4,
+ "#fde0ef"
+ ],
+ [
+ 0.5,
+ "#f7f7f7"
+ ],
+ [
+ 0.6,
+ "#e6f5d0"
+ ],
+ [
+ 0.7,
+ "#b8e186"
+ ],
+ [
+ 0.8,
+ "#7fbc41"
+ ],
+ [
+ 0.9,
+ "#4d9221"
+ ],
+ [
+ 1,
+ "#276419"
+ ]
+ ],
+ "sequential": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "sequentialminus": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ]
+ },
+ "colorway": [
+ "#636efa",
+ "#EF553B",
+ "#00cc96",
+ "#ab63fa",
+ "#FFA15A",
+ "#19d3f3",
+ "#FF6692",
+ "#B6E880",
+ "#FF97FF",
+ "#FECB52"
+ ],
+ "font": {
+ "color": "#2a3f5f"
+ },
+ "geo": {
+ "bgcolor": "white",
+ "lakecolor": "white",
+ "landcolor": "#E5ECF6",
+ "showlakes": true,
+ "showland": true,
+ "subunitcolor": "white"
+ },
+ "hoverlabel": {
+ "align": "left"
+ },
+ "hovermode": "closest",
+ "mapbox": {
+ "style": "light"
+ },
+ "paper_bgcolor": "white",
+ "plot_bgcolor": "#E5ECF6",
+ "polar": {
+ "angularaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "radialaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "scene": {
+ "xaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "yaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "zaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ }
+ },
+ "shapedefaults": {
+ "line": {
+ "color": "#2a3f5f"
+ }
+ },
+ "ternary": {
+ "aaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "baxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "caxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "title": {
+ "x": 0.05
+ },
+ "xaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ },
+ "yaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ }
+ }
+ },
+ "title": {
+ "text": "Gender and Churn Distributions"
+ }
+ }
+ },
+ "image/png": "",
+ "text/html": [
+ "<div> <div id=\"7a0eaf5e-0716-4e9a-b957-5391a4585b92\" class=\"plotly-graph-div\" style=\"height:525px; width:100%;\"></div> <script type=\"text/javascript\"> require([\"plotly\"], function(Plotly) { window.PLOTLYENV=window.PLOTLYENV || {}; if (document.getElementById(\"7a0eaf5e-0716-4e9a-b957-5391a4585b92\")) { Plotly.newPlot( \"7a0eaf5e-0716-4e9a-b957-5391a4585b92\", [{\"labels\":[\"Male\",\"Female\"],\"name\":\"Gender\",\"values\":[3555,3488],\"type\":\"pie\",\"domain\":{\"x\":[0.0,0.45],\"y\":[0.0,1.0]},\"textfont\":{\"size\":16},\"hole\":0.4,\"hoverinfo\":\"label+percent+name\"},{\"labels\":[\"No\",\"Yes\"],\"name\":\"Churn\",\"values\":[5174,1869],\"type\":\"pie\",\"domain\":{\"x\":[0.55,1.0],\"y\":[0.0,1.0]},\"textfont\":{\"size\":16},\"hole\":0.4,\"hoverinfo\":\"label+percent+name\"}], {\"template\":{\"data\":{\"histogram2dcontour\":[{\"type\":\"histogram2dcontour\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"choropleth\":[{\"type\":\"choropleth\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}],\"histogram2d\":[{\"type\":\"histogram2d\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"heatmap\":[{\"type\":\"heatmap\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"
#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"heatmapgl\":[{\"type\":\"heatmapgl\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"contourcarpet\":[{\"type\":\"contourcarpet\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}],\"contour\":[{\"type\":\"contour\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"surface\":[{\"type\":\"surface\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"mesh3d\":[{\"type\":\"mesh3d\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}],\"scatter\":[{\"fillpattern\":{\"fillmode\":\"overlay\",\"size\":10,\"solidity\":0.2},\"type\":\"scatter\"}],\"parcoords\":[{\"type\":\"parcoords\",\"line\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scatterpolargl\":[{\"type\":\"scatterpolargl\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"bar\":[{\"error_x\":{\"color\":\"#2a3f5f\"},\"error_y\":{\"color\":\"#2a3f5f\"},\"marker\":{\"line\":{\"color\":\"#E
5ECF6\",\"width\":0.5},\"pattern\":{\"fillmode\":\"overlay\",\"size\":10,\"solidity\":0.2}},\"type\":\"bar\"}],\"scattergeo\":[{\"type\":\"scattergeo\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scatterpolar\":[{\"type\":\"scatterpolar\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"histogram\":[{\"marker\":{\"pattern\":{\"fillmode\":\"overlay\",\"size\":10,\"solidity\":0.2}},\"type\":\"histogram\"}],\"scattergl\":[{\"type\":\"scattergl\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scatter3d\":[{\"type\":\"scatter3d\",\"line\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}},\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scattermapbox\":[{\"type\":\"scattermapbox\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scatterternary\":[{\"type\":\"scatterternary\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scattercarpet\":[{\"type\":\"scattercarpet\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"carpet\":[{\"aaxis\":{\"endlinecolor\":\"#2a3f5f\",\"gridcolor\":\"white\",\"linecolor\":\"white\",\"minorgridcolor\":\"white\",\"startlinecolor\":\"#2a3f5f\"},\"baxis\":{\"endlinecolor\":\"#2a3f5f\",\"gridcolor\":\"white\",\"linecolor\":\"white\",\"minorgridcolor\":\"white\",\"startlinecolor\":\"#2a3f5f\"},\"type\":\"carpet\"}],\"table\":[{\"cells\":{\"fill\":{\"color\":\"#EBF0F8\"},\"line\":{\"color\":\"white\"}},\"header\":{\"fill\":{\"color\":\"#C8D4E3\"},\"line\":{\"color\":\"white\"}},\"type\":\"table\"}],\"barpolar\":[{\"marker\":{\"line\":{\"color\":\"#E5ECF6\",\"width\":0.5},\"pattern\":{\"fillmode\":\"overlay\",\"size\":10,\"solidity\":0.2}},\"type\":\"barpolar\"}],\"pie\":[{\"automargin\":true,\"type\":\"pie\"}]},\"layout\":{\"autotypenumbers\":\"strict\",\"colorway\":[\"#636efa\",\"#EF553B\",\"#00cc96\",\"#ab63fa\",\"#FFA15A\",\"#19d3f3\",\"#FF6692\",\"#B6E880\",\"#FF97FF\",\"#FECB52\"],\"font\":{\"color\":\"#2a3f5f\"},\"hov
ermode\":\"closest\",\"hoverlabel\":{\"align\":\"left\"},\"paper_bgcolor\":\"white\",\"plot_bgcolor\":\"#E5ECF6\",\"polar\":{\"bgcolor\":\"#E5ECF6\",\"angularaxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\"},\"radialaxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\"}},\"ternary\":{\"bgcolor\":\"#E5ECF6\",\"aaxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\"},\"baxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\"},\"caxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\"}},\"coloraxis\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}},\"colorscale\":{\"sequential\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]],\"sequentialminus\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]],\"diverging\":[[0,\"#8e0152\"],[0.1,\"#c51b7d\"],[0.2,\"#de77ae\"],[0.3,\"#f1b6da\"],[0.4,\"#fde0ef\"],[0.5,\"#f7f7f7\"],[0.6,\"#e6f5d0\"],[0.7,\"#b8e186\"],[0.8,\"#7fbc41\"],[0.9,\"#4d9221\"],[1,\"#276419\"]]},\"xaxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\",\"title\":{\"standoff\":15},\"zerolinecolor\":\"white\",\"automargin\":true,\"zerolinewidth\":2},\"yaxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\",\"title\":{\"standoff\":15},\"zerolinecolor\":\"white\",\"automargin\":true,\"zerolinewidth\":2},\"scene\":{\"xaxis\":{\"backgroundcolor\":\"#E5ECF6\",\"gridcolor\":\"white\",\"linecolor\":\"white\",\"showbackground\":true,\"ticks\":\"\",\"zerolineco
lor\":\"white\",\"gridwidth\":2},\"yaxis\":{\"backgroundcolor\":\"#E5ECF6\",\"gridcolor\":\"white\",\"linecolor\":\"white\",\"showbackground\":true,\"ticks\":\"\",\"zerolinecolor\":\"white\",\"gridwidth\":2},\"zaxis\":{\"backgroundcolor\":\"#E5ECF6\",\"gridcolor\":\"white\",\"linecolor\":\"white\",\"showbackground\":true,\"ticks\":\"\",\"zerolinecolor\":\"white\",\"gridwidth\":2}},\"shapedefaults\":{\"line\":{\"color\":\"#2a3f5f\"}},\"annotationdefaults\":{\"arrowcolor\":\"#2a3f5f\",\"arrowhead\":0,\"arrowwidth\":1},\"geo\":{\"bgcolor\":\"white\",\"landcolor\":\"#E5ECF6\",\"subunitcolor\":\"white\",\"showland\":true,\"showlakes\":true,\"lakecolor\":\"white\"},\"title\":{\"x\":0.05},\"mapbox\":{\"style\":\"light\"}}},\"title\":{\"text\":\"Gender and Churn Distributions\"},\"annotations\":[{\"showarrow\":false,\"text\":\"Gender\",\"x\":0.16,\"y\":0.5,\"font\":{\"size\":20}},{\"showarrow\":false,\"text\":\"Churn\",\"x\":0.84,\"y\":0.5,\"font\":{\"size\":20}}]}, {\"responsive\": true} ).then(function(){\n",
+ " \n",
+ "var gd = document.getElementById('7a0eaf5e-0716-4e9a-b957-5391a4585b92');\n",
+ "var x = new MutationObserver(function (mutations, observer) {{\n",
+ " var display = window.getComputedStyle(gd).display;\n",
+ " if (!display || display === 'none') {{\n",
+ " console.log([gd, 'removed!']);\n",
+ " Plotly.purge(gd);\n",
+ " observer.disconnect();\n",
+ " }}\n",
+ "}});\n",
+ "\n",
+ "// Listen for the removal of the full notebook cells\n",
+ "var notebookContainer = gd.closest('#notebook-container');\n",
+ "if (notebookContainer) {{\n",
+ " x.observe(notebookContainer, {childList: true});\n",
+ "}}\n",
+ "\n",
+ "// Listen for the clearing of the current output cell\n",
+ "var outputEl = gd.closest('.output');\n",
+ "if (outputEl) {{\n",
+ " x.observe(outputEl, {childList: true});\n",
+ "}}\n",
+ "\n",
+ " }) }; }); </script> </div>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "g_labels = ['Male', 'Female']\n",
+ "c_labels = ['No', 'Yes']\n",
+ "# Create subplots: use 'domain' type for Pie subplot\n",
+ "fig = make_subplots(rows=1, cols=2, specs=[[{'type':'domain'}, {'type':'domain'}]])\n",
+ "fig.add_trace(go.Pie(labels=g_labels, values=df['gender'].value_counts(), name=\"Gender\"),\n",
+ " 1, 1)\n",
+ "fig.add_trace(go.Pie(labels=c_labels, values=df['Churn'].value_counts(), name=\"Churn\"),\n",
+ " 1, 2)\n",
+ "\n",
+ "# Use `hole` to create a donut-like pie chart\n",
+ "fig.update_traces(hole=.4, hoverinfo=\"label+percent+name\", textfont_size=16)\n",
+ "\n",
+ "fig.update_layout(\n",
+ " title_text=\"Gender and Churn Distributions\",\n",
+ " # Add annotations in the center of the donut pies.\n",
+ " annotations=[dict(text='Gender', x=0.16, y=0.5, font_size=20, showarrow=False),\n",
+ " dict(text='Churn', x=0.84, y=0.5, font_size=20, showarrow=False)])\n",
+ "fig.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 39,
+ "id": "153ebb4d",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "count 7043.000\n",
+ "mean 32.371\n",
+ "std 24.559\n",
+ "min 0.000\n",
+ "5% 1.000\n",
+ "10% 2.000\n",
+ "20% 6.000\n",
+ "30% 12.000\n",
+ "40% 20.000\n",
+ "50% 29.000\n",
+ "60% 40.000\n",
+ "70% 50.000\n",
+ "80% 60.000\n",
+ "90% 69.000\n",
+ "95% 72.000\n",
+ "99% 72.000\n",
+ "max 72.000\n",
+ "Name: tenure, dtype: float64\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "count 7043.000\n",
+ "mean 64.762\n",
+ "std 30.090\n",
+ "min 18.250\n",
+ "5% 19.650\n",
+ "10% 20.050\n",
+ "20% 25.050\n",
+ "30% 45.850\n",
+ "40% 58.830\n",
+ "50% 70.350\n",
+ "60% 79.100\n",
+ "70% 85.500\n",
+ "80% 94.250\n",
+ "90% 102.600\n",
+ "95% 107.400\n",
+ "99% 114.729\n",
+ "max 118.750\n",
+ "Name: MonthlyCharges, dtype: float64\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "count 7032.000\n",
+ "mean 2283.300\n",
+ "std 2266.771\n",
+ "min 18.800\n",
+ "5% 49.605\n",
+ "10% 84.600\n",
+ "20% 267.070\n",
+ "30% 551.995\n",
+ "40% 944.170\n",
+ "50% 1397.475\n",
+ "60% 2048.950\n",
+ "70% 3141.130\n",
+ "80% 4475.410\n",
+ "90% 5976.640\n",
+ "95% 6923.590\n",
+ "99% 8039.883\n",
+ "max 8684.800\n",
+ "Name: TotalCharges, dtype: float64\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "def num_summary(dataframe, numerical_col, plot=False):\n",
+ "    # Print descriptive statistics at a fine grid of quantiles\n",
+ "    quantiles = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]\n",
+ "    print(dataframe[numerical_col].describe(quantiles).T)\n",
+ "\n",
+ "    # Optionally draw a histogram of the column\n",
+ "    if plot:\n",
+ "        dataframe[numerical_col].hist(bins=20, alpha=0.4, color='b')\n",
+ "        plt.xlabel(numerical_col)\n",
+ "        plt.title(numerical_col)\n",
+ "        plt.show(block=True)\n",
+ "\n",
+ "for col in num_cols:\n",
+ "    num_summary(df, col, plot=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 203,
+ "id": "ff85677a",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " CHURN_MEAN\n",
+ "gender \n",
+ "Female 0.269\n",
+ "Male 0.262\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "SeniorCitizen \n",
+ "0 0.236\n",
+ "1 0.417\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "Partner \n",
+ "No 0.330\n",
+ "Yes 0.197\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "Dependents \n",
+ "No 0.313\n",
+ "Yes 0.155\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "PhoneService \n",
+ "No 0.249\n",
+ "Yes 0.267\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "MultipleLines \n",
+ "No 0.250\n",
+ "No phone service 0.249\n",
+ "Yes 0.286\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "InternetService \n",
+ "DSL 0.190\n",
+ "Fiber optic 0.419\n",
+ "No 0.074\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "OnlineSecurity \n",
+ "No 0.418\n",
+ "No internet service 0.074\n",
+ "Yes 0.146\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "OnlineBackup \n",
+ "No 0.399\n",
+ "No internet service 0.074\n",
+ "Yes 0.215\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "DeviceProtection \n",
+ "No 0.391\n",
+ "No internet service 0.074\n",
+ "Yes 0.225\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "TechSupport \n",
+ "No 0.416\n",
+ "No internet service 0.074\n",
+ "Yes 0.152\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "StreamingTV \n",
+ "No 0.335\n",
+ "No internet service 0.074\n",
+ "Yes 0.301\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "StreamingMovies \n",
+ "No 0.337\n",
+ "No internet service 0.074\n",
+ "Yes 0.299\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "Contract \n",
+ "Month-to-month 0.427\n",
+ "One year 0.113\n",
+ "Two year 0.028\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "PaperlessBilling \n",
+ "No 0.163\n",
+ "Yes 0.336\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "PaymentMethod \n",
+ "Bank transfer (automatic) 0.167\n",
+ "Credit card (automatic) 0.152\n",
+ "Electronic check 0.453\n",
+ "Mailed check 0.191\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n",
+ " CHURN_MEAN\n",
+ "Churn \n",
+ "0 0.000\n",
+ "1 1.000\n",
+ "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"
+ ]
+ }
+ ],
+ "source": [
+ "def target_summary_with_cat(dataframe, target, categorical_col):\n",
+ "    # Mean of the 0/1 target per category, i.e. the churn rate of each group\n",
+ "    print(pd.DataFrame({\"CHURN_MEAN\": dataframe.groupby(categorical_col)[target].mean()}))\n",
+ "    print(\"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\")\n",
+ "\n",
+ "for col in cat_cols:\n",
+ "    target_summary_with_cat(df, \"Churn\", col)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 204,
+ "id": "cd0bef79",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def outlier_thresholds(dataframe, col_name, q1=0.05, q3=0.95):\n",
+ "    # Note: the defaults are the 5th and 95th percentiles, not the classic quartiles\n",
+ "    quartile1 = dataframe[col_name].quantile(q1)\n",
+ "    quartile3 = dataframe[col_name].quantile(q3)\n",
+ "    interquantile_range = quartile3 - quartile1\n",
+ "    up_limit = quartile3 + 1.5 * interquantile_range\n",
+ "    low_limit = quartile1 - 1.5 * interquantile_range\n",
+ "    return low_limit, up_limit"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 205,
+ "id": "3a1bd489",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def check_outlier(dataframe, col_name):\n",
+ "    low_limit, up_limit = outlier_thresholds(dataframe, col_name)\n",
+ "    # True if at least one value falls outside the [low_limit, up_limit] band\n",
+ "    outlier_mask = (dataframe[col_name] > up_limit) | (dataframe[col_name] < low_limit)\n",
+ "    return bool(outlier_mask.any())"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 206,
+ "id": "4b42d8eb",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "tenure False\n",
+ "MonthlyCharges False\n",
+ "TotalCharges False\n"
+ ]
+ }
+ ],
+ "source": [
+ "for col in num_cols:\n",
+ "    print(col, check_outlier(df, col))"
+ ]
+ },
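+ {
+ "cell_type": "markdown",
+ "id": "outlier-threshold-worked-example",
+ "metadata": {},
+ "source": [
+ "A worked check of the thresholds above, added as an illustrative note using the `tenure` percentiles printed earlier (5% = 1, 95% = 72). Note that the function uses the 5th and 95th percentiles rather than the usual quartiles, so the band is much wider than a classic IQR fence:\n",
+ "\n",
+ "$$\n",
+ "\\begin{aligned}\n",
+ "R &= Q_{0.95} - Q_{0.05} = 72 - 1 = 71 \\\\\n",
+ "\\text{up\\_limit} &= Q_{0.95} + 1.5R = 72 + 106.5 = 178.5 \\\\\n",
+ "\\text{low\\_limit} &= Q_{0.05} - 1.5R = 1 - 106.5 = -105.5\n",
+ "\\end{aligned}\n",
+ "$$\n",
+ "\n",
+ "Every `tenure` value lies between 0 and 72, well inside this band, which is consistent with `check_outlier` reporting `False`."
+ ]
+ },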
+ {
+ "cell_type": "code",
+ "execution_count": 207,
+ "id": "0e5ca3eb",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " n_miss ratio\n",
+ "TotalCharges 11 0.160\n"
+ ]
+ }
+ ],
+ "source": [
+ "def missing_values_table(dataframe, na_name=False):\n",
+ "    # Columns that contain at least one missing value\n",
+ "    na_columns = [col for col in dataframe.columns if dataframe[col].isnull().sum() > 0]\n",
+ "\n",
+ "    n_miss = dataframe[na_columns].isnull().sum().sort_values(ascending=False)\n",
+ "    # ratio is expressed as a percentage of all rows\n",
+ "    ratio = (dataframe[na_columns].isnull().sum() / dataframe.shape[0] * 100).sort_values(ascending=False)\n",
+ "    missing_df = pd.concat([n_miss, np.round(ratio, 2)], axis=1, keys=['n_miss', 'ratio'])\n",
+ "    print(missing_df, end=\"\\n\")\n",
+ "\n",
+ "    if na_name:\n",
+ "        return na_columns\n",
+ "\n",
+ "missing_values_table(df)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 208,
+ "id": "6f73eb78",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0"
+ ]
+ },
+ "execution_count": 208,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Fill missing TotalCharges values with the column median\n",
+ "\n",
+ "df[\"TotalCharges\"] = df[\"TotalCharges\"].fillna(df[\"TotalCharges\"].median())\n",
+ "\n",
+ "df['TotalCharges'].isnull().sum()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 209,
+ "id": "137afb24",
+ "metadata": {
+ "lines_to_next_cell": 2
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "<Axes: >"
+ ]
+ },
+ "execution_count": 209,
+ "metadata": {},
+ "output_type": "execute_result"
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 2 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "sns.heatmap(df[['tenure', 'MonthlyCharges', 'TotalCharges','Churn']].corr())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "8d5436d7-9833-4e97-a620-0dbbcd219b58",
+ "metadata": {},
+ "source": [
+ "It makes sense that `tenure` and `TotalCharges` are highly correlated: tenure is the number of months of service provided, so the total amount billed grows as tenure increases."
+ ]
+ },
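+ {
+ "cell_type": "markdown",
+ "id": "pearson-correlation-note",
+ "metadata": {},
+ "source": [
+ "For reference, `DataFrame.corr()` uses the Pearson coefficient by default, so each cell of the heatmap above is\n",
+ "\n",
+ "$$\n",
+ "r_{XY} = \\frac{\\operatorname{cov}(X, Y)}{\\sigma_X \\, \\sigma_Y}\n",
+ "$$\n",
+ "\n",
+ "As a rough sketch, and assuming (this is not checked in the notebook) that each customer's bill is approximately constant from month to month, $\\text{TotalCharges} \\approx \\text{tenure} \\times \\text{MonthlyCharges}$, which is why `TotalCharges` rises almost in step with `tenure`."
+ ]
+ },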
+ {
+ "cell_type": "code",
+ "execution_count": 210,
+ "id": "2aa3d5f3",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Combine gender and senior status into a single categorical feature\n",
+ "df.loc[((df[\"gender\"] == \"Male\") & (df[\"SeniorCitizen\"] == 1)), \"SENIOR/YOUNG_GENDER\"] = \"senior_male\"\n",
+ "df.loc[((df[\"gender\"] == \"Male\") & (df[\"SeniorCitizen\"] == 0)), \"SENIOR/YOUNG_GENDER\"] = \"young_male\"\n",
+ "df.loc[((df[\"gender\"] == \"Female\") & (df[\"SeniorCitizen\"] == 1)), \"SENIOR/YOUNG_GENDER\"] = \"senior_female\"\n",
+ "df.loc[((df[\"gender\"] == \"Female\") & (df[\"SeniorCitizen\"] == 0)), \"SENIOR/YOUNG_GENDER\"] = \"young_female\"\n",
+ "\n",
+ "\n",
+ "df.loc[((df[\"gender\"] == \"Male\") & (df[\"TechSupport\"] == \"No\")),\"GENDER_SUPPORT\"] = \"no_sup_male\"\n",
+ "df.loc[((df[\"gender\"] == \"Female\") & (df[\"TechSupport\"] == \"No\")),\"GENDER_SUPPORT\"] = \"no_sup_female\"\n",
+ "\n",
+ "\n",
+ "df.loc[((df[\"Contract\"] == \"Month-to-month\")\n",
+ " & (df[\"PaymentMethod\"] == \"Electronic check\")\n",
+ " & (df[\"gender\"] == \"Male\")),\"GENDER_EC_MONTH\"] = \"male_ec_month\"\n",
+ "\n",
+ "df.loc[((df[\"Contract\"] == \"Month-to-month\")\n",
+ " & (df[\"PaymentMethod\"] == \"Electronic check\")\n",
+ " & (df[\"gender\"] == \"Female\")),\"GENDER_EC_MONTH\"] = \"female_ec_month\"\n",
+ "\n",
+ "\n",
+ "df.loc[((df[\"OnlineSecurity\"] == \"No\") & (df[\"gender\"] == \"Female\")), \"GENDER_SECURITY\"] = \"no_sec_female\"\n",
+ "df.loc[((df[\"OnlineSecurity\"] == \"Yes\") & (df[\"gender\"] == \"Female\")),\"GENDER_SECURITY\"] = \"yes_sec_female\"\n",
+ "df.loc[((df[\"OnlineSecurity\"] == \"No\") & (df[\"gender\"] == \"Male\")),\"GENDER_SECURITY\"] = \"no_sec_male\"\n",
+ "df.loc[((df[\"OnlineSecurity\"] == \"Yes\") & (df[\"gender\"] == \"Male\")),\"GENDER_SECURITY\"] = \"yes_sec_male\"\n",
+ "\n",
+ "\n",
+ "df.loc[((df[\"InternetService\"] == \"Fiber optic\")\n",
+ " & (df[\"gender\"] == \"Male\")\n",
+ " & (df[\"Dependents\"] == \"No\")),\"GENDER_FIB_DEP\"] = \"male_fib_dep_no\"\n",
+ "\n",
+ "df.loc[((df[\"InternetService\"] == \"Fiber optic\")\n",
+ " & (df[\"gender\"] == \"Female\")\n",
+ " & (df[\"Dependents\"] == \"No\")),\"GENDER_FIB_DEP\"] = \"female_fib_dep_no\"\n",
+ "\n",
+ "df.loc[(df[\"tenure\"]>=0) & (df[\"tenure\"]<=12),\"NEW_TENURE_YEAR\"] = \"0-1 Year\"\n",
+ "df.loc[(df[\"tenure\"]>12) & (df[\"tenure\"]<=24),\"NEW_TENURE_YEAR\"] = \"1-2 Year\"\n",
+ "df.loc[(df[\"tenure\"]>24) & (df[\"tenure\"]<=36),\"NEW_TENURE_YEAR\"] = \"2-3 Year\"\n",
+ "df.loc[(df[\"tenure\"]>36) & (df[\"tenure\"]<=48),\"NEW_TENURE_YEAR\"] = \"3-4 Year\"\n",
+ "df.loc[(df[\"tenure\"]>48) & (df[\"tenure\"]<=60),\"NEW_TENURE_YEAR\"] = \"4-5 Year\"\n",
+ "df.loc[(df[\"tenure\"]>60) & (df[\"tenure\"]<=72),\"NEW_TENURE_YEAR\"] = \"5-6 Year\"\n",
+ "\n",
+ "\n",
+ "df.loc[((df[\"Partner\"] == \"No\") & (df[\"Contract\"] == \"Month-to-month\")),\"PARTNER_CONTR\"] = \"no_partner_month\"\n",
+ "df.loc[((df[\"Partner\"] == \"Yes\") & (df[\"Contract\"] == \"Month-to-month\")),\"PARTNER_CONTR\"] = \"yes_partner_month\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 211,
+ "id": "39b74b87",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Observations: 7043\n",
+ "Variables: 27\n",
+ "cat_cols: 24\n",
+ "num_cols: 3\n",
+ "cat_but_car: 0\n",
+ "num_but_cat: 1\n"
+ ]
+ }
+ ],
+ "source": [
+ "cat_cols, num_cols, cat_but_car = grab_col_names(df)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 212,
+ "id": "ed4b3bb8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "le = LabelEncoder()\n",
+ "\n",
+ "# Non-numeric columns with exactly two distinct values\n",
+ "binary_cols = [col for col in df.columns if df[col].dtype not in [int, float]\n",
+ "               and df[col].nunique() == 2]\n",
+ "\n",
+ "def label_encoder(dataframe, binary_col):\n",
+ "    labelencoder = LabelEncoder()\n",
+ "    dataframe[binary_col] = labelencoder.fit_transform(dataframe[binary_col])\n",
+ "    return dataframe\n",
+ "\n",
+ "\n",
+ "for col in binary_cols:\n",
+ "    df = label_encoder(df, col)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 213,
+ "id": "2ef6b00b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def one_hot_encoder(dataframe, categorical_cols, drop_first=True):\n",
+ "    dataframe = pd.get_dummies(dataframe, columns=categorical_cols, drop_first=drop_first)\n",
+ "    return dataframe\n",
+ "\n",
+ "ohe_cols = [col for col in df.columns if 30 >= df[col].nunique() > 2]\n",
+ "df = one_hot_encoder(df, ohe_cols)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 214,
+ "id": "27af7c78",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>tenure</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>-1.277</td>\n",
+ " <td>-1.160</td>\n",
+ " <td>-0.994</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>1</th>\n",
+ " <td>0.066</td>\n",
+ " <td>-0.260</td>\n",
+ " <td>-0.173</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>2</th>\n",
+ " <td>-1.237</td>\n",
+ " <td>-0.363</td>\n",
+ " <td>-0.960</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>3</th>\n",
+ " <td>0.514</td>\n",
+ " <td>-0.747</td>\n",
+ " <td>-0.195</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>4</th>\n",
+ " <td>-1.237</td>\n",
+ " <td>0.197</td>\n",
+ " <td>-0.940</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " tenure MonthlyCharges TotalCharges\n",
+ "0 -1.277 -1.160 -0.994\n",
+ "1 0.066 -0.260 -0.173\n",
+ "2 -1.237 -0.363 -0.960\n",
+ "3 0.514 -0.747 -0.195\n",
+ "4 -1.237 0.197 -0.940"
+ ]
+ },
+ "execution_count": 214,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Standardize the numeric columns (fit on the full dataset, i.e. before the train/test split)\n",
+ "scaler = StandardScaler()\n",
+ "df[num_cols] = scaler.fit_transform(df[num_cols])\n",
+ "df[num_cols].head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 215,
+ "id": "d2dd691f",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "y = df[\"Churn\"]\n",
+ "X = df.drop([\"Churn\"], axis=1)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 217,
+ "id": "7036f544",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 226,
+ "id": "ce93de41-cf13-4b1f-bbc6-cf35402a3720",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def base_models(X, y, scoring=\"roc_auc\"):\n",
+ "    print(\"Base Models....\")\n",
+ "    classifiers = [('LR', LogisticRegression()),\n",
+ "                   ('KNN', KNeighborsClassifier()),\n",
+ "                   (\"CART\", DecisionTreeClassifier()),\n",
+ "                   (\"RF\", RandomForestClassifier()),\n",
+ "                   ('Adaboost', AdaBoostClassifier()),\n",
+ "                   ('GBM', GradientBoostingClassifier()),\n",
+ "                   ('XGBoost', XGBClassifier(use_label_encoder=False, eval_metric='logloss')),\n",
+ "                   ('LightGBM', LGBMClassifier()),\n",
+ "                   ]\n",
+ "\n",
+ "    for name, classifier in classifiers:\n",
+ "        cv_results = cross_validate(classifier, X, y, cv=3, scoring=scoring)\n",
+ "        print(f\"{scoring}: {round(cv_results['test_score'].mean(), 4)} ({name}) \")\n",
+ "\n",
+ "base_models(X, y, scoring=\"accuracy\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 232,
+ "id": "e412726e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "knn_params = {\"n_neighbors\": range(2, 50)}\n",
+ "\n",
+ "cart_params = {'max_depth': range(1, 20),\n",
+ "               \"min_samples_split\": range(2, 30)}\n",
+ "\n",
+ "rf_params = {\"max_depth\": [8, 15, None],\n",
+ "             \"max_features\": [5, 7, \"sqrt\"],\n",
+ "             \"min_samples_split\": [15, 20],\n",
+ "             \"n_estimators\": [200, 300]}\n",
+ "\n",
+ "xgboost_params = {\"learning_rate\": [0.1, 0.01],\n",
+ "                  \"max_depth\": [5, 8],\n",
+ "                  \"n_estimators\": [100, 200]}\n",
+ "\n",
+ "lightgbm_params = {\"learning_rate\": [0.01, 0.1],\n",
+ "                   \"n_estimators\": [300, 500]}\n",
+ "\n",
+ "\n",
+ "classifiers = [('KNN', KNeighborsClassifier(), knn_params),\n",
+ "               (\"CART\", DecisionTreeClassifier(), cart_params),\n",
+ "               (\"RF\", RandomForestClassifier(), rf_params),\n",
+ "               ('XGBoost', XGBClassifier(use_label_encoder=False, eval_metric='logloss'), xgboost_params),\n",
+ "               ('LightGBM', LGBMClassifier(), lightgbm_params)]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 233,
+ "id": "6568b4b0",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def hyperparameter_optimization(X, y, cv=3, scoring=\"roc_auc\"):\n",
+ "    print(\"Hyperparameter Optimization....\")\n",
+ "    best_models = {}\n",
+ "    for name, classifier, params in classifiers:\n",
+ "        print(f\"########## {name} ##########\")\n",
+ "        cv_results = cross_validate(classifier, X, y, cv=cv, scoring=scoring)\n",
+ "        print(f\"{scoring} (Before): {round(cv_results['test_score'].mean(), 4)}\")\n",
+ "\n",
+ "        gs_best = GridSearchCV(classifier, params, cv=cv, scoring=scoring, n_jobs=-1, verbose=False).fit(X, y)\n",
+ "        final_model = classifier.set_params(**gs_best.best_params_)\n",
+ "\n",
+ "        cv_results = cross_validate(final_model, X, y, cv=cv, scoring=scoring)\n",
+ "        print(f\"{scoring} (After): {round(cv_results['test_score'].mean(), 4)}\")\n",
+ "        print(f\"{name} best params: {gs_best.best_params_}\", end=\"\\n\\n\")\n",
+ "        best_models[name] = final_model\n",
+ "    return best_models"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 234,
+ "id": "23ff2b58",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Base Models....\n",
+ "roc_auc: 0.8403 (LR) \n",
+ "roc_auc: 0.7827 (KNN) \n",
+ "roc_auc: 0.6498 (CART) \n",
+ "roc_auc: 0.8182 (RF) \n",
+ "roc_auc: 0.8371 (Adaboost) \n",
+ "roc_auc: 0.8383 (GBM) \n",
+ "roc_auc: 0.8114 (XGBoost) \n",
+ "roc_auc: 0.8356 (CatBoost) \n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 998, number of negative: 2758\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000621 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265708 -> initscore=-1.016508\n",
+ "[LightGBM] [Info] Start training from score -1.016508\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000553 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000501 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "roc_auc: 0.8262 (LightGBM) \n",
+ "Hyperparameter Optimization....\n",
+ "########## KNN ##########\n",
+ "roc_auc (Before): 0.7827\n",
+ "roc_auc (After): 0.8246\n",
+ "KNN best params: {'n_neighbors': 26}\n",
+ "\n",
+ "########## CART ##########\n",
+ "roc_auc (Before): 0.6471\n",
+ "roc_auc (After): 0.8081\n",
+ "CART best params: {'max_depth': 4, 'min_samples_split': 2}\n",
+ "\n",
+ "########## RF ##########\n",
+ "roc_auc (Before): 0.816\n",
+ "roc_auc (After): 0.8391\n",
+ "RF best params: {'max_depth': 15, 'max_features': 5, 'min_samples_split': 20, 'n_estimators': 300}\n",
+ "\n",
+ "########## XGBoost ##########\n",
+ "roc_auc (Before): 0.8114\n",
+ "roc_auc (After): 0.8362\n",
+ "XGBoost best params: {'learning_rate': 0.1, 'max_depth': 5, 'n_estimators': 100}\n",
+ "\n",
+ "########## LightGBM ##########\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 998, number of negative: 2758\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.001389 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265708 -> initscore=-1.016508\n",
+ "[LightGBM] [Info] Start training from score -1.016508\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000611 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000637 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "roc_auc (Before): 0.8262\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 1496, number of negative: 4138\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000679 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 5634, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265531 -> initscore=-1.017418\n",
+ "[LightGBM] [Info] Start training from score -1.017418\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 998, number of negative: 2758\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000629 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265708 -> initscore=-1.016508\n",
+ "[LightGBM] [Info] Start training from score -1.016508\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000625 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000640 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "roc_auc (After): 0.8388\n",
+ "LightGBM best params: {'learning_rate': 0.01, 'n_estimators': 300}\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "def fit_models(X, y):\n",
+ "    base_models(X, y)\n",
+ "    best_models = hyperparameter_optimization(X, y)\n",
+ "    return best_models\n",
+ "\n",
+ "best_models = fit_models(X_train, y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 235,
+ "id": "4b9313a7",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 1496, number of negative: 4138\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.001857 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 5634, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265531 -> initscore=-1.017418\n",
+ "[LightGBM] [Info] Start training from score -1.017418\n"
+ ]
+ }
+ ],
+ "source": [
+ "lgbm_model = best_models['LightGBM'].fit(X_train, y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 236,
+ "id": "b1554003",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "              precision    recall  f1-score   support\n",
+ "\n",
+ "           0       0.84      0.92      0.88      1036\n",
+ "           1       0.68      0.51      0.58       373\n",
+ "\n",
+ "    accuracy                           0.81      1409\n",
+ "   macro avg       0.76      0.71      0.73      1409\n",
+ "weighted avg       0.80      0.81      0.80      1409\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "y_pred = lgbm_model.predict(X_test)\n",
+ "y_prob = lgbm_model.predict_proba(X_test)[:, 1]\n",
+ "print(classification_report(y_test, y_pred))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 237,
+ "id": "d3d93aca",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "lgbm_roc_auc = roc_auc_score(y_test, y_prob)\n",
+ "fpr, tpr, thresholds = roc_curve(y_test, y_prob)\n",
+ "plt.figure()\n",
+ "\n",
+ "plt.plot([0,1],[0,1],'r--')\n",
+ "plt.plot(fpr, tpr, marker='.', label='LGBM')\n",
+ "plt.xlabel('False Positive Rate')\n",
+ "plt.ylabel('True Positive Rate')\n",
+ "plt.title(\"LGBM ROC\")\n",
+ "plt.legend()\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 238,
+ "id": "e4af4a2b",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "array([1])"
+ ]
+ },
+ "execution_count": 238,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Sanity check: predict churn for a single randomly sampled customer\n",
+ "random_user = X.sample(1, random_state=42)\n",
+ "lgbm_model.predict(random_user)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 239,
+ "id": "4d3cba12",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 1000x1000 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Observing the variables that affect the success of our model\n",
+ "\n",
+ "def plot_importance(model, features, num=X.shape[1], save=False):\n",
+ "    feature_imp = pd.DataFrame({'Value': model.feature_importances_, 'Feature': features.columns})\n",
+ "    plt.figure(figsize=(10, 10))\n",
+ "    sns.set(font_scale=1)\n",
+ "    sns.barplot(x=\"Value\", y=\"Feature\", data=feature_imp.sort_values(by=\"Value\",\n",
+ "                                                                     ascending=False)[0:num])\n",
+ "    plt.title('Features')\n",
+ "    plt.tight_layout()\n",
+ "    if save:\n",
+ "        plt.savefig('importances.png')  # save before show(); afterwards the figure may be blank\n",
+ "    plt.show()\n",
+ "\n",
+ "plot_importance(lgbm_model, X_train, num=35)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 240,
+ "id": "36ff6f7c",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>gender</th>\n",
+ " <th>SeniorCitizen</th>\n",
+ " <th>Partner</th>\n",
+ " <th>Dependents</th>\n",
+ " <th>tenure</th>\n",
+ " <th>PhoneService</th>\n",
+ " <th>PaperlessBilling</th>\n",
+ " <th>MonthlyCharges</th>\n",
+ " <th>TotalCharges</th>\n",
+ " <th>Churn</th>\n",
+ " <th>MultipleLines_No phone service</th>\n",
+ " <th>MultipleLines_Yes</th>\n",
+ " <th>InternetService_Fiber optic</th>\n",
+ " <th>InternetService_No</th>\n",
+ " <th>OnlineSecurity_No internet service</th>\n",
+ " <th>OnlineSecurity_Yes</th>\n",
+ " <th>OnlineBackup_No internet service</th>\n",
+ " <th>OnlineBackup_Yes</th>\n",
+ " <th>DeviceProtection_No internet service</th>\n",
+ " <th>DeviceProtection_Yes</th>\n",
+ " <th>TechSupport_No internet service</th>\n",
+ " <th>TechSupport_Yes</th>\n",
+ " <th>StreamingTV_No internet service</th>\n",
+ " <th>StreamingTV_Yes</th>\n",
+ " <th>StreamingMovies_No internet service</th>\n",
+ " <th>StreamingMovies_Yes</th>\n",
+ " <th>Contract_One year</th>\n",
+ " <th>Contract_Two year</th>\n",
+ " <th>PaymentMethod_Credit card (automatic)</th>\n",
+ " <th>PaymentMethod_Electronic check</th>\n",
+ " <th>PaymentMethod_Mailed check</th>\n",
+ " <th>SENIOR/YOUNG_GENDER_senior_female</th>\n",
+ " <th>SENIOR/YOUNG_GENDER_young_female</th>\n",
+ " <th>GENDER_SUPPORT_no_sup_female</th>\n",
+ " <th>GENDER_SUPPORT_no_sup_male</th>\n",
+ " <th>GENDER_EC_MONTH_male_ec_month</th>\n",
+ " <th>GENDER_EC_MONTH_nan</th>\n",
+ " <th>GENDER_SECURITY_no_sec_female</th>\n",
+ " <th>GENDER_SECURITY_no_sec_male</th>\n",
+ " <th>GENDER_SECURITY_yes_sec_female</th>\n",
+ " <th>GENDER_SECURITY_yes_sec_male</th>\n",
+ " <th>GENDER_FIB_DEP_male_fib_dep_no</th>\n",
+ " <th>GENDER_FIB_DEP_nan</th>\n",
+ " <th>NEW_TENURE_YEAR_1-2 Year</th>\n",
+ " <th>NEW_TENURE_YEAR_2-3 Year</th>\n",
+ " <th>NEW_TENURE_YEAR_3-4 Year</th>\n",
+ " <th>NEW_TENURE_YEAR_4-5 Year</th>\n",
+ " <th>NEW_TENURE_YEAR_5-6 Year</th>\n",
+ " <th>PARTNER_CONTR_no_partner_month</th>\n",
+ " <th>PARTNER_CONTR_yes_partner_month</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>-1.277</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " <td>-1.160</td>\n",
+ " <td>-0.994</td>\n",
+ " <td>0</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>1</th>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0.066</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>-0.260</td>\n",
+ " <td>-0.173</td>\n",
+ " <td>0</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>2</th>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>-1.237</td>\n",
+ " <td>1</td>\n",
+ " <td>1</td>\n",
+ " <td>-0.363</td>\n",
+ " <td>-0.960</td>\n",
+ " <td>1</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>3</th>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0.514</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>-0.747</td>\n",
+ " <td>-0.195</td>\n",
+ " <td>0</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>4</th>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>-1.237</td>\n",
+ " <td>1</td>\n",
+ " <td>1</td>\n",
+ " <td>0.197</td>\n",
+ " <td>-0.940</td>\n",
+ " <td>1</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>False</td>\n",
+ " <td>True</td>\n",
+ " <td>False</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " gender SeniorCitizen Partner Dependents tenure PhoneService PaperlessBilling MonthlyCharges TotalCharges Churn MultipleLines_No phone service MultipleLines_Yes InternetService_Fiber optic InternetService_No OnlineSecurity_No internet service OnlineSecurity_Yes OnlineBackup_No internet service OnlineBackup_Yes DeviceProtection_No internet service DeviceProtection_Yes TechSupport_No internet service TechSupport_Yes StreamingTV_No internet service StreamingTV_Yes \\\n",
+ "0 0 0 1 0 -1.277 0 1 -1.160 -0.994 0 True False False False False False False True False False False False False False \n",
+ "1 1 0 0 0 0.066 1 0 -0.260 -0.173 0 False False False False False True False False False True False False False False \n",
+ "2 1 0 0 0 -1.237 1 1 -0.363 -0.960 1 False False False False False True False True False False False False False False \n",
+ "3 1 0 0 0 0.514 0 0 -0.747 -0.195 0 True False False False False True False False False True False True False False \n",
+ "4 0 0 0 0 -1.237 1 1 0.197 -0.940 1 False False True False False False False False False False False False False False \n",
+ "\n",
+ " StreamingMovies_No internet service StreamingMovies_Yes Contract_One year Contract_Two year PaymentMethod_Credit card (automatic) PaymentMethod_Electronic check PaymentMethod_Mailed check SENIOR/YOUNG_GENDER_senior_female SENIOR/YOUNG_GENDER_young_female GENDER_SUPPORT_no_sup_female GENDER_SUPPORT_no_sup_male GENDER_EC_MONTH_male_ec_month GENDER_EC_MONTH_nan GENDER_SECURITY_no_sec_female GENDER_SECURITY_no_sec_male GENDER_SECURITY_yes_sec_female GENDER_SECURITY_yes_sec_male \\\n",
+ "0 False False False False False True False False True True False False False True False False False \n",
+ "1 False False True False False False True False False False True False True False False False True \n",
+ "2 False False False False False False True False False False True False True False False False True \n",
+ "3 False False True False False False False False False False False False True False False False True \n",
+ "4 False False False False False True False False True True False False False True False False False \n",
+ "\n",
+ " GENDER_FIB_DEP_male_fib_dep_no GENDER_FIB_DEP_nan NEW_TENURE_YEAR_1-2 Year NEW_TENURE_YEAR_2-3 Year NEW_TENURE_YEAR_3-4 Year NEW_TENURE_YEAR_4-5 Year NEW_TENURE_YEAR_5-6 Year PARTNER_CONTR_no_partner_month PARTNER_CONTR_yes_partner_month \n",
+ "0 False True False False False False False False True \n",
+ "1 False True False True False False False False False \n",
+ "2 False True False False False False False True False \n",
+ "3 False True False False True False False False False \n",
+ "4 False False False False False False False True False "
+ ]
+ },
+ "execution_count": 240,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 241,
+ "id": "0d76e6a0",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from sklearn.metrics import (confusion_matrix, recall_score, precision_score, f1_score,\n",
+ "                             accuracy_score, roc_auc_score, classification_report,\n",
+ "                             ConfusionMatrixDisplay, RocCurveDisplay)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "83e60cf2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from sklearn.ensemble import AdaBoostClassifier\n",
+ "from sklearn.ensemble import BaggingClassifier\n",
+ "from sklearn.naive_bayes import BernoulliNB\n",
+ "from sklearn.calibration import CalibratedClassifierCV\n",
+ "from sklearn.naive_bayes import CategoricalNB\n",
+ "from sklearn.multioutput import ClassifierChain\n",
+ "from sklearn.naive_bayes import ComplementNB\n",
+ "from sklearn.tree import DecisionTreeClassifier\n",
+ "from sklearn.dummy import DummyClassifier\n",
+ "from sklearn.tree import ExtraTreeClassifier\n",
+ "from sklearn.ensemble import ExtraTreesClassifier\n",
+ "from sklearn.naive_bayes import GaussianNB\n",
+ "from sklearn.gaussian_process import GaussianProcessClassifier\n",
+ "from sklearn.ensemble import GradientBoostingClassifier\n",
+ "from sklearn.ensemble import HistGradientBoostingClassifier\n",
+ "from sklearn.neighbors import KNeighborsClassifier\n",
+ "from sklearn.semi_supervised import LabelPropagation\n",
+ "from sklearn.semi_supervised import LabelSpreading\n",
+ "from sklearn.discriminant_analysis import LinearDiscriminantAnalysis\n",
+ "from sklearn.svm import LinearSVC\n",
+ "from sklearn.linear_model import LogisticRegression\n",
+ "from sklearn.linear_model import LogisticRegressionCV\n",
+ "from sklearn.neural_network import MLPClassifier\n",
+ "from sklearn.multioutput import MultiOutputClassifier\n",
+ "from sklearn.naive_bayes import MultinomialNB\n",
+ "from sklearn.neighbors import NearestCentroid\n",
+ "from sklearn.svm import NuSVC\n",
+ "from sklearn.multiclass import OneVsOneClassifier\n",
+ "from sklearn.multiclass import OneVsRestClassifier\n",
+ "from sklearn.multiclass import OutputCodeClassifier\n",
+ "from sklearn.linear_model import PassiveAggressiveClassifier\n",
+ "from sklearn.linear_model import Perceptron\n",
+ "from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis\n",
+ "from sklearn.neighbors import RadiusNeighborsClassifier\n",
+ "from sklearn.ensemble import RandomForestClassifier\n",
+ "from sklearn.linear_model import RidgeClassifier\n",
+ "from sklearn.linear_model import RidgeClassifierCV\n",
+ "from sklearn.linear_model import SGDClassifier\n",
+ "from sklearn.svm import SVC\n",
+ "from sklearn.ensemble import StackingClassifier\n",
+ "\n",
+ "from xgboost import XGBClassifier\n",
+ "from catboost import CatBoostClassifier"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "90253f99",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "estimators = []\n",
+ "estimators.append(('AdaBoostClassifier', AdaBoostClassifier(random_state=13) ))\n",
+ "estimators.append(('Bagging Classifier', BaggingClassifier(random_state=13) ))\n",
+ "estimators.append(('Bernoulli NB', BernoulliNB() ))\n",
+ "estimators.append(('Decision Tree Classifier', DecisionTreeClassifier(random_state=13) ))\n",
+ "estimators.append(('Dummy Classifier', DummyClassifier(random_state=13) ))\n",
+ "estimators.append(('Extra Tree Classifier', ExtraTreeClassifier(random_state=13) ))\n",
+ "estimators.append(('Extra Trees Classifier', ExtraTreesClassifier(random_state=13) ))\n",
+ "estimators.append(('Gaussian NB', GaussianNB() ))\n",
+ "estimators.append(('Gaussian Process Classifier', GaussianProcessClassifier(random_state=13) ))\n",
+ "estimators.append(('Gradient Boosting Classifier', GradientBoostingClassifier(random_state=13) ))\n",
+ "estimators.append(('Hist Gradient Boosting Classifier', HistGradientBoostingClassifier(random_state=13) ))\n",
+ "estimators.append(('KNN', KNeighborsClassifier() ))\n",
+ "#estimators.append(('Label Propagation', LabelPropagation() ))\n",
+ "#estimators.append(('Label Spreading', LabelSpreading() ))\n",
+ "estimators.append(('LogisticRegression', LogisticRegression(max_iter=1000, random_state=13)))\n",
+ "estimators.append(('Logistic Regression CV', LogisticRegressionCV(max_iter=1000, random_state=13) ))\n",
+ "estimators.append(('MLPClassifier', MLPClassifier(max_iter=2000,random_state=13) ))\n",
+ "estimators.append(('Nearest Centroid', NearestCentroid() ))\n",
+ "estimators.append(('Passive Aggressive Classifier', PassiveAggressiveClassifier(random_state=13) ))\n",
+ "estimators.append(('Perceptron', Perceptron(random_state=13) ))\n",
+ "#estimators.append(('RadiusNeighborsClassifier', RadiusNeighborsClassifier(radius=3) ))\n",
+ "estimators.append(('RandomForest', RandomForestClassifier(max_depth=10, min_samples_leaf=1, min_samples_split=3, n_estimators=170, random_state=13) ))\n",
+ "estimators.append(('Ridge Classifier', RidgeClassifier(random_state=13) ))\n",
+ "estimators.append(('Ridge Classifier CV', RidgeClassifierCV() ))\n",
+ "estimators.append(('SGDClassifier', SGDClassifier(random_state=13) ))\n",
+ "estimators.append(('SVC', SVC(random_state=13)))\n",
+ "estimators.append(('XGB', XGBClassifier(random_state=13) ))\n",
+ "estimators.append(('CatBoost', CatBoostClassifier(logging_level='Silent', random_state=13) ))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 243,
+ "id": "feedc6f5-137a-441f-8fb8-f7870ea8c2b0",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "XGB = XGBClassifier(random_state=13)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 244,
+ "id": "2a7010bc",
+ "metadata": {
+ "lines_to_next_cell": 2
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "Stacking classifier training Accuracy: 0.81\n",
+ "Stacking classifier test Accuracy: 0.79\n"
+ ]
+ }
+ ],
+ "source": [
+ "from sklearn.ensemble import StackingClassifier\n",
+ "SC = StackingClassifier(estimators=estimators,final_estimator=XGB,cv=6)\n",
+ "SC.fit(X_train, y_train)\n",
+ "y_pred = SC.predict(X_test)\n",
+ "\n",
+ "print(f\"\\nStacking classifier training Accuracy: {SC.score(X_train, y_train):0.2f}\")\n",
+ "print(f\"Stacking classifier test Accuracy: {SC.score(X_test, y_test):0.2f}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 245,
+ "id": "370db6f1",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[[932 104]\n",
+ " [185 188]]\n"
+ ]
+ }
+ ],
+ "source": [
+ "SC_Recall = recall_score(y_test, y_pred)\n",
+ "SC_Precision = precision_score(y_test, y_pred)\n",
+ "SC_f1 = f1_score(y_test, y_pred)\n",
+ "SC_accuracy = accuracy_score(y_test, y_pred)\n",
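+ "SC_roc_auc = roc_auc_score(y_test, y_pred)  # computed from hard 0/1 labels, so this equals balanced accuracy rather than a probability-based AUC\n",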
+ "\n",
+ "cm = confusion_matrix(y_test, y_pred)\n",
+ "print(cm)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 246,
+ "id": "f9ec6524",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.002649 seconds.\n",
+ "You can set `force_col_wise=true` to remove the overhead.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 998, number of negative: 2758\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.002687 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265708 -> initscore=-1.016508\n",
+ "[LightGBM] [Info] Start training from score -1.016508\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.004307 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008726 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 998, number of negative: 2758\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.002564 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265708 -> initscore=-1.016508\n",
+ "[LightGBM] [Info] Start training from score -1.016508\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.003598 seconds.\n",
+ "You can set `force_col_wise=true` to remove the overhead.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007005 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.001559 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 998, number of negative: 2758\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.002142 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265708 -> initscore=-1.016508\n",
+ "[LightGBM] [Info] Start training from score -1.016508\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 998, number of negative: 2758\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006682 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265708 -> initscore=-1.016508\n",
+ "[LightGBM] [Info] Start training from score -1.016508\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.001670 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines\n",
+ "[LightGBM] [Info] Number of positive: 997, number of negative: 2759\n",
+ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.001828 seconds.\n",
+ "You can set `force_row_wise=true` to remove the overhead.\n",
+ "And if memory is not enough, you can set `force_col_wise=true`.\n",
+ "[LightGBM] [Info] Total Bins 676\n",
+ "[LightGBM] [Info] Number of data points in the train set: 3756, number of used features: 49\n",
+ "[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.265442 -> initscore=-1.017873\n",
+ "[LightGBM] [Info] Start training from score -1.017873\n",
+ "Cross Validation Recall scores are: [0.49832776 0.4548495 0.48160535 0.49666667 0.44147157]\n",
+ "Average Cross Validation Recall score: 0.4745841694537347\n",
+ "Cross Validation Recall standard deviation: 0.025429279829347458\n"
+ ]
+ }
+ ],
+ "source": [
+ "from statistics import stdev\n",
+ "from sklearn.model_selection import cross_val_score\n",
+ "score = cross_val_score(SC, X_train, y_train, cv=5, scoring='recall', error_score=\"raise\")\n",
+ "SC_cv_score = score.mean()\n",
+ "SC_cv_stdev = stdev(score)\n",
+ "print('Cross Validation Recall scores are: {}'.format(score))\n",
+ "print('Average Cross Validation Recall score: ', SC_cv_score)\n",
+ "print('Cross Validation Recall standard deviation: ', SC_cv_stdev)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 247,
+ "id": "2c2cc07a",
+ "metadata": {
+ "lines_to_next_cell": 2
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>Model</th>\n",
+ " <th>Recall</th>\n",
+ " <th>Precision</th>\n",
+ " <th>F1 Score</th>\n",
+ " <th>Accuracy</th>\n",
+ " <th>ROC-AUC Score</th>\n",
+ " <th>Avg CV Recall</th>\n",
+ " <th>Standard Deviation of CV Recall</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>Stacking Classifier</td>\n",
+ " <td>0.504</td>\n",
+ " <td>0.644</td>\n",
+ " <td>0.565</td>\n",
+ " <td>0.795</td>\n",
+ " <td>0.702</td>\n",
+ " <td>0.475</td>\n",
+ " <td>0.025</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Model Recall Precision F1 Score Accuracy ROC-AUC Score Avg CV Recall Standard Deviation of CV Recall\n",
+ "0 Stacking Classifier 0.504 0.644 0.565 0.795 0.702 0.475 0.025"
+ ]
+ },
+ "execution_count": 247,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "ndf = [(SC_Recall, SC_Precision, SC_f1, SC_accuracy, SC_roc_auc, SC_cv_score, SC_cv_stdev)]\n",
+ "\n",
+ "SC_score = pd.DataFrame(data = ndf, columns=['Recall','Precision','F1 Score', 'Accuracy', 'ROC-AUC Score', 'Avg CV Recall', 'Standard Deviation of CV Recall'])\n",
+ "SC_score.insert(0, 'Model', 'Stacking Classifier')  # these metrics come from SC, not the random forest\n",
+ "SC_score"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 248,
+ "id": "b03513af",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "No artists with labels found to put in legend. Note that artists whose label start with an underscore are ignored when legend() is called with no argument.\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 500x500 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "y_proba = SC.predict_proba(X_test)\n",
+ "\n",
+ "from sklearn.metrics import roc_curve\n",
+ "from sklearn.metrics import RocCurveDisplay\n",
+ "\n",
+ "def plot_auc_roc_curve(y_true, y_score):\n",
+ "    # y_score holds positive-class probabilities, not hard predictions\n",
+ "    fpr, tpr, _ = roc_curve(y_true, y_score)\n",
+ "    roc_display = RocCurveDisplay(fpr=fpr, tpr=tpr).plot()\n",
+ "    roc_display.figure_.set_size_inches(5, 5)\n",
+ "    # Diagonal reference line: expected curve of a random classifier\n",
+ "    plt.plot([0, 1], [0, 1], color='g')\n",
+ "\n",
+ "plot_auc_roc_curve(y_test, y_proba[:, 1])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 249,
+ "id": "0037e48d",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "from sklearn.metrics import precision_recall_curve\n",
+ "from sklearn.metrics import PrecisionRecallDisplay\n",
+ "\n",
+ "# Named pr_display so it does not shadow IPython's built-in display()\n",
+ "pr_display = PrecisionRecallDisplay.from_estimator(\n",
+ "    SC, X_test, y_test, name=\"Average precision\")\n",
+ "_ = pr_display.ax_.set_title(\"Stacking Classifier\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d0d4761d",
+ "metadata": {},
+ "source": [
+ "# Random Forest"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 250,
+ "id": "3c07cd30",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "rf = RandomForestClassifier(random_state=13)\n",
+ "rf.fit(X_train, y_train)\n",
+ "y_pred = rf.predict(X_test)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 251,
+ "id": "8ca4e27d",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[[932 104]\n",
+ " [191 182]]\n"
+ ]
+ }
+ ],
+ "source": [
+ "rf_Recall = recall_score(y_test, y_pred)\n",
+ "rf_Precision = precision_score(y_test, y_pred)\n",
+ "rf_f1 = f1_score(y_test, y_pred)\n",
+ "rf_accuracy = accuracy_score(y_test, y_pred)\n",
+ "rf_roc_auc = roc_auc_score(y_test, y_pred)  # hard labels: equals balanced accuracy\n",
+ "\n",
+ "cm = confusion_matrix(y_test, y_pred)\n",
+ "print(cm)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 252,
+ "id": "054732e8",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " precision recall f1-score support\n",
+ "\n",
+ " 0 0.83 0.90 0.86 1036\n",
+ " 1 0.64 0.49 0.55 373\n",
+ "\n",
+ " accuracy 0.79 1409\n",
+ " macro avg 0.73 0.69 0.71 1409\n",
+ "weighted avg 0.78 0.79 0.78 1409\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(classification_report(y_test, y_pred))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 253,
+ "id": "624052f4",
+ "metadata": {
+ "lines_to_next_cell": 2
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Cross Validation Recall scores are: [0.54180602 0.46488294 0.4548495 0.48333333 0.44816054]\n",
+ "Average Cross Validation Recall score: 0.47860646599777035\n",
+ "Cross Validation Recall standard deviation: 0.037736620837502725\n"
+ ]
+ }
+ ],
+ "source": [
+ "from statistics import stdev\n",
+ "score = cross_val_score(rf, X_train, y_train, cv=5, scoring='recall', error_score=\"raise\")\n",
+ "rf_cv_score = score.mean()\n",
+ "rf_cv_stdev = stdev(score)\n",
+ "print('Cross Validation Recall scores are: {}'.format(score))\n",
+ "print('Average Cross Validation Recall score: ', rf_cv_score)\n",
+ "print('Cross Validation Recall standard deviation: ', rf_cv_stdev)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 254,
+ "id": "412d998b",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>Model</th>\n",
+ " <th>Recall</th>\n",
+ " <th>Precision</th>\n",
+ " <th>F1 Score</th>\n",
+ " <th>Accuracy</th>\n",
+ " <th>ROC-AUC Score</th>\n",
+ " <th>Avg CV Recall</th>\n",
+ " <th>Standard Deviation of CV Recall</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>Random Forest</td>\n",
+ " <td>0.488</td>\n",
+ " <td>0.636</td>\n",
+ " <td>0.552</td>\n",
+ " <td>0.791</td>\n",
+ " <td>0.694</td>\n",
+ " <td>0.479</td>\n",
+ " <td>0.038</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Model Recall Precision F1 Score Accuracy ROC-AUC Score Avg CV Recall Standard Deviation of CV Recall\n",
+ "0 Random Forest 0.488 0.636 0.552 0.791 0.694 0.479 0.038"
+ ]
+ },
+ "execution_count": 254,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "ndf = [(rf_Recall, rf_Precision, rf_f1, rf_accuracy, rf_roc_auc, rf_cv_score, rf_cv_stdev)]\n",
+ "\n",
+ "rf_score = pd.DataFrame(data = ndf, columns=['Recall','Precision','F1 Score', 'Accuracy', 'ROC-AUC Score', 'Avg CV Recall', 'Standard Deviation of CV Recall'])\n",
+ "rf_score.insert(0, 'Model', 'Random Forest')\n",
+ "rf_score"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 255,
+ "id": "0af8cb24",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from sklearn.model_selection import GridSearchCV\n",
+ "\n",
+ "params = {\n",
+ " 'n_estimators': [130], # 'n_estimators': [120,130,150,170,190,200],\n",
+ " 'max_depth': [14], # 'max_depth': [8,10,12,14,15],\n",
+ " 'min_samples_split': [3], # 'min_samples_split': [3,4,5,6],\n",
+ " 'min_samples_leaf': [2], # 'min_samples_leaf': [1,2,3],\n",
+ " 'random_state': [13]\n",
+ "}\n",
+ "\n",
+ "grid_rf = GridSearchCV(rf, param_grid=params, cv=5, scoring='recall').fit(X_train, y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 256,
+ "id": "1fe87c68",
+ "metadata": {
+ "lines_to_next_cell": 2
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Best parameters: {'max_depth': 14, 'min_samples_leaf': 2, 'min_samples_split': 3, 'n_estimators': 130, 'random_state': 13}\n",
+ "Best score: 0.4993333333333333\n"
+ ]
+ }
+ ],
+ "source": [
+ "print('Best parameters:', grid_rf.best_params_)\n",
+ "print('Best score:', grid_rf.best_score_)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 257,
+ "id": "7a958c1b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "y_pred = grid_rf.predict(X_test)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 258,
+ "id": "f29dc0be",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[[934 102]\n",
+ " [186 187]]\n"
+ ]
+ }
+ ],
+ "source": [
+ "cm = confusion_matrix(y_test, y_pred)\n",
+ "\n",
+ "grid_rf_Recall = recall_score(y_test, y_pred)\n",
+ "grid_rf_Precision = precision_score(y_test, y_pred)\n",
+ "grid_rf_f1 = f1_score(y_test, y_pred)\n",
+ "grid_rf_accuracy = accuracy_score(y_test, y_pred)\n",
+ "grid_roc_auc = roc_auc_score(y_test, y_pred)  # hard labels: equals balanced accuracy\n",
+ "\n",
+ "print(cm)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 259,
+ "id": "16d5376c",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "score2 = cross_val_score(grid_rf, X_train, y_train, cv=5, scoring='recall')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 260,
+ "id": "122ed69d",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Cross Validation Recall scores are: [0.55518395 0.48160535 0.50501672 0.49666667 0.45819398]\n",
+ "Average Cross Validation Recall score: 0.4993333333333333\n",
+ "Cross Validation Recall standard deviation: 0.035935465636409585\n"
+ ]
+ }
+ ],
+ "source": [
+ "grid_cv_score = score2.mean()\n",
+ "grid_cv_stdev = stdev(score2)\n",
+ "\n",
+ "print('Cross Validation Recall scores are: {}'.format(score2))\n",
+ "print('Average Cross Validation Recall score: ', grid_cv_score)\n",
+ "print('Cross Validation Recall standard deviation: ', grid_cv_stdev)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 261,
+ "id": "d942d9b1",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>Model</th>\n",
+ " <th>Recall</th>\n",
+ " <th>Precision</th>\n",
+ " <th>F1 Score</th>\n",
+ " <th>Accuracy</th>\n",
+ " <th>ROC-AUC Score</th>\n",
+ " <th>Avg CV Recall</th>\n",
+ " <th>Standard Deviation of CV Recall</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>Random Forest after tuning</td>\n",
+ " <td>0.501</td>\n",
+ " <td>0.647</td>\n",
+ " <td>0.565</td>\n",
+ " <td>0.796</td>\n",
+ " <td>0.701</td>\n",
+ " <td>0.499</td>\n",
+ " <td>0.036</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Model Recall Precision F1 Score Accuracy ROC-AUC Score Avg CV Recall Standard Deviation of CV Recall\n",
+ "0 Random Forest after tuning 0.501 0.647 0.565 0.796 0.701 0.499 0.036"
+ ]
+ },
+ "execution_count": 261,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "ndf2 = [(grid_rf_Recall, grid_rf_Precision, grid_rf_f1, grid_rf_accuracy, grid_roc_auc, grid_cv_score, grid_cv_stdev)]\n",
+ "\n",
+ "grid_score = pd.DataFrame(data = ndf2, columns=\n",
+ " ['Recall','Precision','F1 Score', 'Accuracy', 'ROC-AUC Score', 'Avg CV Recall', 'Standard Deviation of CV Recall'])\n",
+ "grid_score.insert(0, 'Model', 'Random Forest after tuning')\n",
+ "grid_score"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 276,
+ "id": "9b414130",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "XGBClassifier(base_score=None, booster=None, callbacks=None,\n",
+ " colsample_bylevel=None, colsample_bynode=None,\n",
+ " colsample_bytree=None, device=None, early_stopping_rounds=None,\n",
+ " enable_categorical=False, eval_metric=None, feature_types=None,\n",
+ " gamma=None, grow_policy=None, importance_type=None,\n",
+ " interaction_constraints=None, learning_rate=None, max_bin=None,\n",
+ " max_cat_threshold=None, max_cat_to_onehot=None,\n",
+ " max_delta_step=None, max_depth=None, max_leaves=None,\n",
+ " min_child_weight=None, missing=nan, monotone_constraints=None,\n",
+ " multi_strategy=None, n_estimators=None, n_jobs=None,\n",
+ " num_parallel_tree=None, random_state=None, ...)"
+ ]
+ },
+ "execution_count": 276,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "from xgboost import XGBClassifier\n",
+ "XGBC = XGBClassifier()\n",
+ "XGBC.fit(X_train, y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 277,
+ "id": "f7a6efc4",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "y_pred = XGBC.predict(X_test)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 278,
+ "id": "81bb3cc1",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[[923 113]\n",
+ " [178 195]]\n"
+ ]
+ }
+ ],
+ "source": [
+ "XGBC_Recall = recall_score(y_test, y_pred)\n",
+ "XGBC_Precision = precision_score(y_test, y_pred)\n",
+ "XGBC_f1 = f1_score(y_test, y_pred)\n",
+ "XGBC_accuracy = accuracy_score(y_test, y_pred)\n",
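+ "# NOTE: roc_auc_score receives hard class labels here, so the result equals balanced accuracy, (TPR + TNR) / 2; pass XGBC.predict_proba(X_test)[:, 1] instead for a threshold-independent AUC.\n",
+ "XGBC_roc_auc = roc_auc_score(y_test, y_pred)\n",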
+ "\n",
+ "cm = confusion_matrix(y_test, y_pred)\n",
+ "print(cm)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 279,
+ "id": "2f2067f9",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Cross Validation Recall scores are: [0.59197324 0.47157191 0.48160535 0.51 0.46488294]\n",
+ "Average Cross Validation Recall score: 0.5040066889632107\n",
+ "Cross Validation Recall standard deviation: 0.05210215243353261\n"
+ ]
+ }
+ ],
+ "source": [
+ "score = cross_val_score(XGBC, X_train, y_train, cv=5, scoring='recall', error_score=\"raise\")\n",
+ "XGBC_cv_score = score.mean()\n",
+ "XGBC_cv_stdev = stdev(score)\n",
+ "print('Cross Validation Recall scores are: {}'.format(score))\n",
+ "print('Average Cross Validation Recall score: ', XGBC_cv_score)\n",
+ "print('Cross Validation Recall standard deviation: ', XGBC_cv_stdev)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 280,
+ "id": "fec008f5",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>Model</th>\n",
+ " <th>Recall</th>\n",
+ " <th>Precision</th>\n",
+ " <th>F1 Score</th>\n",
+ " <th>Accuracy</th>\n",
+ " <th>ROC-AUC Score</th>\n",
+ " <th>Avg CV Recall</th>\n",
+ " <th>Standard Deviation of CV Recall</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>XGBC</td>\n",
+ " <td>0.523</td>\n",
+ " <td>0.633</td>\n",
+ " <td>0.573</td>\n",
+ " <td>0.793</td>\n",
+ " <td>0.707</td>\n",
+ " <td>0.504</td>\n",
+ " <td>0.052</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Model Recall Precision F1 Score Accuracy ROC-AUC Score Avg CV Recall Standard Deviation of CV Recall\n",
+ "0 XGBC 0.523 0.633 0.573 0.793 0.707 0.504 0.052"
+ ]
+ },
+ "execution_count": 280,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "ndf = [(XGBC_Recall, XGBC_Precision, XGBC_f1, XGBC_accuracy, XGBC_roc_auc, XGBC_cv_score, XGBC_cv_stdev)]\n",
+ "\n",
+ "XGBC_score = pd.DataFrame(data = ndf, columns=['Recall','Precision','F1 Score', 'Accuracy', 'ROC-AUC Score', 'Avg CV Recall', 'Standard Deviation of CV Recall'])\n",
+ "XGBC_score.insert(0, 'Model', 'XGBC')\n",
+ "XGBC_score"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 281,
+ "id": "4ebcab97",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Step 1: Tune learning_rate and n_estimators. Each list is pinned to a single value; the trailing comments keep the wider candidate ranges.\n",
+ "params = {'learning_rate': [0.01], #[0.0001, 0.001, 0.01, 0.1, 0.2, 0.3],\n",
+ " 'subsample': [0.8],\n",
+ " 'colsample_bytree': [0.8],\n",
+ " 'n_estimators': [450] #range(50,500,50),\n",
+ " }\n",
+ "\n",
+ "grid_xgb = GridSearchCV(XGBC, param_grid=params, cv=5, scoring='recall').fit(X_train, y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 282,
+ "id": "7db0ea58",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Best parameters: {'colsample_bytree': 0.8, 'learning_rate': 0.01, 'n_estimators': 450, 'subsample': 0.8}\n",
+ "Best score: 0.5033311036789299\n"
+ ]
+ }
+ ],
+ "source": [
+ "print('Best parameters:', grid_xgb.best_params_)\n",
+ "print('Best score:', grid_xgb.best_score_)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 283,
+ "id": "79870788",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Step 2: Tune max_depth and min_child_weight, keeping the Step 1 values fixed. The trailing comments keep the wider candidate ranges.\n",
+ "params = {'max_depth': [7], #range(3,10,2),\n",
+ " 'learning_rate': [0.01],\n",
+ " 'subsample': [0.8],\n",
+ " 'colsample_bytree': [0.8],\n",
+ " # 'colsample_bylevel': np.arange(0.5, 1.0, 0.1),\n",
+ " 'min_child_weight': [5], #range(1,6,2),\n",
+ " 'n_estimators': [450],\n",
+ " # 'num_class': [10]\n",
+ " }\n",
+ "\n",
+ "grid_xgb = GridSearchCV(XGBC, param_grid=params, cv=5, scoring='recall').fit(X_train, y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 284,
+ "id": "ff199016",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Best parameters: {'colsample_bytree': 0.8, 'learning_rate': 0.01, 'max_depth': 7, 'min_child_weight': 5, 'n_estimators': 450, 'subsample': 0.8}\n",
+ "Best score: 0.5086845039018952\n"
+ ]
+ }
+ ],
+ "source": [
+ "print('Best parameters:', grid_xgb.best_params_)\n",
+ "print('Best score:', grid_xgb.best_score_)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 285,
+ "id": "ef5bb817",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "y_pred = grid_xgb.predict(X_test)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 286,
+ "id": "6f9f6f7a",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[[942 94]\n",
+ " [175 198]]\n"
+ ]
+ }
+ ],
+ "source": [
+ "grid_xgb_Recall = recall_score(y_test, y_pred)\n",
+ "grid_xgb_Precision = precision_score(y_test, y_pred)\n",
+ "grid_xgb_f1 = f1_score(y_test, y_pred)\n",
+ "grid_xgb_accuracy = accuracy_score(y_test, y_pred)\n",
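+ "# NOTE: computed from hard class labels, so this equals balanced accuracy; use grid_xgb.predict_proba(X_test)[:, 1] for a threshold-independent AUC.\n",
+ "grid_xgb_roc_auc = roc_auc_score(y_test, y_pred)\n",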
+ "\n",
+ "cm = confusion_matrix(y_test, y_pred)\n",
+ "print(cm)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 287,
+ "id": "2dbc1aa2",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Cross Validation Recall scores are: [0.56521739 0.47826087 0.5083612 0.51666667 0.47491639]\n",
+ "Average Cross Validation Recall score: 0.5086845039018952\n",
+ "Cross Validation Recall standard deviation: 0.036488594052819415\n"
+ ]
+ }
+ ],
+ "source": [
+ "score = cross_val_score(grid_xgb, X_train, y_train, cv=5, scoring='recall', error_score=\"raise\")\n",
+ "grid_xgb_cv_score = score.mean()\n",
+ "grid_xgb_cv_stdev = stdev(score)\n",
+ "print('Cross Validation Recall scores are: {}'.format(score))\n",
+ "print('Average Cross Validation Recall score: ', grid_xgb_cv_score)\n",
+ "print('Cross Validation Recall standard deviation: ', grid_xgb_cv_stdev)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 288,
+ "id": "b5281a58",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>Model</th>\n",
+ " <th>Recall</th>\n",
+ " <th>Precision</th>\n",
+ " <th>F1 Score</th>\n",
+ " <th>Accuracy</th>\n",
+ " <th>ROC-AUC Score</th>\n",
+ " <th>Avg CV Recall</th>\n",
+ " <th>Standard Deviation of CV Recall</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>Tuned XGBC</td>\n",
+ " <td>0.531</td>\n",
+ " <td>0.678</td>\n",
+ " <td>0.595</td>\n",
+ " <td>0.809</td>\n",
+ " <td>0.720</td>\n",
+ " <td>0.509</td>\n",
+ " <td>0.036</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Model Recall Precision F1 Score Accuracy ROC-AUC Score Avg CV Recall Standard Deviation of CV Recall\n",
+ "0 Tuned XGBC 0.531 0.678 0.595 0.809 0.720 0.509 0.036"
+ ]
+ },
+ "execution_count": 288,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "ndf = [(grid_xgb_Recall, grid_xgb_Precision, grid_xgb_f1, grid_xgb_accuracy, grid_xgb_roc_auc, grid_xgb_cv_score, grid_xgb_cv_stdev)]\n",
+ "\n",
+ "grid_xgb_score = pd.DataFrame(data = ndf, columns=['Recall','Precision','F1 Score', 'Accuracy', 'ROC-AUC Score', 'Avg CV Recall', 'Standard Deviation of CV Recall'])\n",
+ "grid_xgb_score.insert(0, 'Model', 'Tuned XGBC')\n",
+ "grid_xgb_score"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 289,
+ "id": "add7f3cf",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from sklearn.ensemble import VotingClassifier"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 290,
+ "id": "d869a158",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import warnings\n",
+ "# Install the filter before fitting so that it also covers warnings raised by VC_hard.fit\n",
+ "warnings.filterwarnings('ignore')\n",
+ "\n",
+ "VC_hard = VotingClassifier(estimators=estimators, voting='hard')\n",
+ "VC_hard.fit(X_train, y_train)\n",
+ "y_pred = VC_hard.predict(X_test)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 291,
+ "id": "331fd46a",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[[940 96]\n",
+ " [171 202]]\n"
+ ]
+ }
+ ],
+ "source": [
+ "VC_hard_Recall = recall_score(y_test, y_pred)\n",
+ "VC_hard_Precision = precision_score(y_test, y_pred)\n",
+ "VC_hard_f1 = f1_score(y_test, y_pred)\n",
+ "VC_hard_accuracy = accuracy_score(y_test, y_pred)\n",
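+ "# NOTE: a hard-voting ensemble exposes no predict_proba, so this ROC-AUC is computed from class labels and equals balanced accuracy.\n",
+ "VC_hard_roc_auc = roc_auc_score(y_test, y_pred)\n",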
+ "\n",
+ "cm = confusion_matrix(y_test, y_pred)\n",
+ "print(cm)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 292,
+ "id": "02646d21",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Cross Validation Recall scores are: [0.55183946 0.47826087 0.49832776 0.50666667 0.46153846]\n",
+ "Average Cross Validation Recall score: 0.4993266443701227\n",
+ "Cross Validation Recall standard deviation: 0.03422054809235207\n"
+ ]
+ }
+ ],
+ "source": [
+ "score = cross_val_score(VC_hard, X_train, y_train, cv=5, scoring='recall', error_score=\"raise\")\n",
+ "VC_hard_cv_score = score.mean()\n",
+ "VC_hard_cv_stdev = stdev(score)\n",
+ "print('Cross Validation Recall scores are: {}'.format(score))\n",
+ "print('Average Cross Validation Recall score: ', VC_hard_cv_score)\n",
+ "print('Cross Validation Recall standard deviation: ', VC_hard_cv_stdev)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 293,
+ "id": "39e6c535",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>Model</th>\n",
+ " <th>Recall</th>\n",
+ " <th>Precision</th>\n",
+ " <th>F1 Score</th>\n",
+ " <th>Accuracy</th>\n",
+ " <th>ROC-AUC Score</th>\n",
+ " <th>Avg CV Recall</th>\n",
+ " <th>Standard Deviation of CV Recall</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>Voting Classifier - Hard Voting</td>\n",
+ " <td>0.542</td>\n",
+ " <td>0.678</td>\n",
+ " <td>0.602</td>\n",
+ " <td>0.811</td>\n",
+ " <td>0.724</td>\n",
+ " <td>0.499</td>\n",
+ " <td>0.034</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Model Recall Precision F1 Score Accuracy ROC-AUC Score Avg CV Recall Standard Deviation of CV Recall\n",
+ "0 Voting Classifier - Hard Voting 0.542 0.678 0.602 0.811 0.724 0.499 0.034"
+ ]
+ },
+ "execution_count": 293,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "ndf = [(VC_hard_Recall, VC_hard_Precision, VC_hard_f1, VC_hard_accuracy, VC_hard_roc_auc, VC_hard_cv_score, VC_hard_cv_stdev)]\n",
+ "\n",
+ "VC_hard_score = pd.DataFrame(data = ndf, columns=['Recall','Precision','F1 Score', 'Accuracy', 'ROC-AUC Score', 'Avg CV Recall', 'Standard Deviation of CV Recall'])\n",
+ "VC_hard_score.insert(0, 'Model', 'Voting Classifier - Hard Voting')\n",
+ "VC_hard_score"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 294,
+ "id": "cb402589",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>Model</th>\n",
+ " <th>Recall</th>\n",
+ " <th>Precision</th>\n",
+ " <th>F1 Score</th>\n",
+ " <th>Accuracy</th>\n",
+ " <th>ROC-AUC Score</th>\n",
+ " <th>Avg CV Recall</th>\n",
+ " <th>Standard Deviation of CV Recall</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>3</th>\n",
+ " <td>Tuned XGBC</td>\n",
+ " <td>0.531</td>\n",
+ " <td>0.678</td>\n",
+ " <td>0.595</td>\n",
+ " <td>0.809</td>\n",
+ " <td>0.720</td>\n",
+ " <td>0.509</td>\n",
+ " <td>0.036</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>2</th>\n",
+ " <td>XGBC</td>\n",
+ " <td>0.523</td>\n",
+ " <td>0.633</td>\n",
+ " <td>0.573</td>\n",
+ " <td>0.793</td>\n",
+ " <td>0.707</td>\n",
+ " <td>0.504</td>\n",
+ " <td>0.052</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>1</th>\n",
+ " <td>Random Forest after tuning</td>\n",
+ " <td>0.501</td>\n",
+ " <td>0.647</td>\n",
+ " <td>0.565</td>\n",
+ " <td>0.796</td>\n",
+ " <td>0.701</td>\n",
+ " <td>0.499</td>\n",
+ " <td>0.036</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>4</th>\n",
+ " <td>Voting Classifier - Hard Voting</td>\n",
+ " <td>0.542</td>\n",
+ " <td>0.678</td>\n",
+ " <td>0.602</td>\n",
+ " <td>0.811</td>\n",
+ " <td>0.724</td>\n",
+ " <td>0.499</td>\n",
+ " <td>0.034</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>Random Forest</td>\n",
+ " <td>0.488</td>\n",
+ " <td>0.636</td>\n",
+ " <td>0.552</td>\n",
+ " <td>0.791</td>\n",
+ " <td>0.694</td>\n",
+ " <td>0.479</td>\n",
+ " <td>0.038</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Model Recall Precision F1 Score Accuracy ROC-AUC Score Avg CV Recall Standard Deviation of CV Recall\n",
+ "3 Tuned XGBC 0.531 0.678 0.595 0.809 0.720 0.509 0.036\n",
+ "2 XGBC 0.523 0.633 0.573 0.793 0.707 0.504 0.052\n",
+ "1 Random Forest after tuning 0.501 0.647 0.565 0.796 0.701 0.499 0.036\n",
+ "4 Voting Classifier - Hard Voting 0.542 0.678 0.602 0.811 0.724 0.499 0.034\n",
+ "0 Random Forest 0.488 0.636 0.552 0.791 0.694 0.479 0.038"
+ ]
+ },
+ "execution_count": 294,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "predictions = pd.concat([rf_score, grid_score, XGBC_score, grid_xgb_score, VC_hard_score], ignore_index=True, sort=False)\n",
+ "predictions.sort_values(by=['Avg CV Recall'], ascending=False)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 125,
+ "id": "271760f6",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "No artists with labels found to put in legend. Note that artists whose label start with an underscore are ignored when legend() is called with no argument.\n"
+ ]
+ },
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 500x500 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "from sklearn.metrics import roc_curve\n",
+ "from sklearn.metrics import RocCurveDisplay\n",
+ "\n",
+ "# Predicted class probabilities from the tuned XGBoost model (column 1 = positive class)\n",
+ "y_proba = grid_xgb.predict_proba(X_test)\n",
+ "\n",
+ "def plot_auc_roc_curve(y_true, y_score):\n",
+ "    # Plot the ROC curve from continuous scores, with a diagonal chance line for reference\n",
+ "    fpr, tpr, _ = roc_curve(y_true, y_score)\n",
+ "    roc_display = RocCurveDisplay(fpr=fpr, tpr=tpr).plot()\n",
+ "    roc_display.figure_.set_size_inches(5, 5)\n",
+ "    plt.plot([0, 1], [0, 1], color='g')\n",
+ "\n",
+ "plot_auc_roc_curve(y_test, y_proba[:, 1])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 126,
+ "id": "9aa7016b",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "<Figure size 640x480 with 1 Axes>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "from sklearn.metrics import PrecisionRecallDisplay\n",
+ "\n",
+ "# Precision-recall curve for the tuned XGBoost model on the held-out test set.\n",
+ "# Named pr_display to avoid shadowing IPython's built-in display().\n",
+ "pr_display = PrecisionRecallDisplay.from_estimator(\n",
+ "    grid_xgb, X_test, y_test, name=\"Average precision\")\n",
+ "_ = pr_display.ax_.set_title(\"Tuned XGBoost\")"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.5"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}