prateekiiest/titanic_survival_exploration

View on GitHub
Features/Feature_Engineering1.ipynb

Summary

Maintainability
Test Coverage
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Compound Features from raw data\n",
    "One of the most important parameters that decide irrespective of the learning model used(Perceptron,Decision Trees etc.) is the features or the input data itself.\n",
    "If we can arrange the input data in a coherent way and do not take many features that convey more or less the same information we can improve the accuracy of our model to a great extent.\n",
    "For example the Title of a person Mr. ,Mrs. carry the same information as that of Sex.So it will be redundant if we account for both in our model.Instead we can extract Titles from Names by RegularExpression matcher"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import re\n",
    "def get_title(name):\n",
    "    search=re.search(' ([A-Za-z]+)\\.',name)\n",
    "    if search:\n",
    "        return search.group(1)\n",
    "        #.group returns the string matched by the regular expression if we find a valid one\n",
    "    return \"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Mrs'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_title(\"Cumings, Mrs. John Bradley Florence Briggs Thayer\")\n",
    "#Demonstration"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We also make use of the fact that in the Names, the Title is always followed by a \".\"(which is why we use a escape sequence \"/\" to code it in our pattern)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Ticket</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Cabin</th>\n",
       "      <th>Embarked</th>\n",
       "      <th>Title</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Braund, Mr. Owen Harris</td>\n",
       "      <td>male</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>A/5 21171</td>\n",
       "      <td>7.2500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
       "      <td>female</td>\n",
       "      <td>38.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>PC 17599</td>\n",
       "      <td>71.2833</td>\n",
       "      <td>C85</td>\n",
       "      <td>C</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>Heikkinen, Miss. Laina</td>\n",
       "      <td>female</td>\n",
       "      <td>26.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>STON/O2. 3101282</td>\n",
       "      <td>7.9250</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
       "      <td>female</td>\n",
       "      <td>35.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>113803</td>\n",
       "      <td>53.1000</td>\n",
       "      <td>C123</td>\n",
       "      <td>S</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Allen, Mr. William Henry</td>\n",
       "      <td>male</td>\n",
       "      <td>35.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>373450</td>\n",
       "      <td>8.0500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived  Pclass  \\\n",
       "0            1         0       3   \n",
       "1            2         1       1   \n",
       "2            3         1       3   \n",
       "3            4         1       1   \n",
       "4            5         0       3   \n",
       "\n",
       "                                                Name     Sex   Age  SibSp  \\\n",
       "0                            Braund, Mr. Owen Harris    male  22.0      1   \n",
       "1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
       "2                             Heikkinen, Miss. Laina  female  26.0      0   \n",
       "3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
       "4                           Allen, Mr. William Henry    male  35.0      0   \n",
       "\n",
       "   Parch            Ticket     Fare Cabin Embarked  Title  \n",
       "0      0         A/5 21171   7.2500   NaN        S      0  \n",
       "1      0          PC 17599  71.2833   C85        C      0  \n",
       "2      0  STON/O2. 3101282   7.9250   NaN        S      0  \n",
       "3      0            113803  53.1000  C123        S      0  \n",
       "4      0            373450   8.0500   NaN        S      0  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "'''Now to see how much the features Sex and Title are correlated we will use Pearson's correlation matrix which shows\n",
    "correlation between two variables.One must also add a new column Title in the original sheet'''\n",
    "import pandas as pd\n",
    "train=pd.read_csv('../../titanic_data.csv')\n",
    "train.insert(len(train.columns),'Title',value=0)\n",
    "train.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "train['Title']=train['Name'].apply(get_title)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      Mr\n",
       "1     Mrs\n",
       "2    Miss\n",
       "Name: Title, dtype: object"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train['Title'].head(3)\n",
    "#To check whether everything works fine or not"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'Dr', 'Sir', 'Mme', 'Miss', 'Mlle', 'Mrs', 'Rev', 'Mr', 'Lady', 'Master', 'Capt', 'Countess', 'Col', 'Jonkheer', 'Ms', 'Don', 'Major'}\n"
     ]
    }
   ],
   "source": [
    "label=set(train['Title'])\n",
    "#Checking no. of unique titles\n",
    "print(label)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "train['Title']=train['Title'].replace('Mlle','Miss')\n",
    "train['Title']=train['Title'].replace('Ms','Miss')\n",
    "train['Title']=train['Title'].replace('Mme','Miss')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "dict={\"Mr\": 1, \"Master\": 2, \"Mrs\": 3, \"Miss\": 4, \"Col\": 5,\n",
    "      \"Countess\":6,\"Don\":7,\"Sir\":8,\"Lady\":9,\"Capt\":10,\"Rev\":11\n",
    "      ,\"Jonkheer\":12,\"Major\":13,\"Dr\":14}\n",
    "unique_title=(\"Mr\",\"Master\",\"Mrs\",\"Miss\",\"Col\",\"Countess\"\n",
    "              ,\"Don\",\"Sir\",\"Lady\",\"Capt\",\"Rev\",\"Jonkheer\"\n",
    "              ,\"Major\",\"Dr\")\n",
    "            "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "train[\"Title\"]=train['Title'].apply(lambda x:dict[x])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAD8CAYAAAB5Pm/hAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAADxJJREFUeJzt3X+s3Xddx/HnixbHz7gtu1xrW7yNaTAdkY3c1OmMUSqs\nOkL311IipMYl/afqMCSkxUTjHzU1GoREp2kG7iZMmoYfWcMQqWWEmCDjbgy2dsw2bKOt7XqBIKDJ\ntOPtH/eLOZTd3nPuvaff3g/PR9Kc7/mc7/d+37e599nvPffc21QVkqR2vaTvASRJ42XoJalxhl6S\nGmfoJalxhl6SGmfoJalxhl6SGmfoJalxhl6SGre27wEAbrjhhpqamup7DElaVR555JFvVtXEYvtd\nFaGfmppidna27zEkaVVJ8uww+/nUjSQ1ztBLUuMMvSQ1ztBLUuMMvSQ1ztBLUuMMvSQ1ztBLUuMM\nvSQ17qr4ydjlmtr7YC/nfebA7b2cV5JG4RW9JDXO0EtS4wy9JDXO0EtS4wy9JDXO0EtS4wy9JDXO\n0EtS4wy9JDXO0EtS4wy9JDXO0EtS4wy9JDXO0EtS4wy9JDXO0EtS44YKfZJnkjye5LEks93a9UmO\nJjnZ3V43sP++JKeSPJXktnENL0la3ChX9L9RVTdV1XR3fy9wrKo2A8e6+yTZAuwEbgS2A/ckWbOC\nM0uSRrCcp252ADPd9gxwx8D6oap6vqqeBk4BW5dxHknSMgwb+gL+JckjSXZ3a5NVda7bPg9Mdtvr\ngdMDx57p1iRJPRj2Pwf/1ao6m+Q1wNEkXxt8sKoqSY1y4u4fjN0Ar33ta0c5VJI0gqGu6KvqbHd7\nAfgE80/FPJdkHUB3e6Hb/SywceDwDd3apW/zYFVNV9X0xMTE0t8DSdJlLRr6JK9M8uofbgNvAZ4A\njgC7ut12AQ9020eAnUmuSbIJ2Aw8vNKDS5KGM8xTN5PAJ5L8cP9/rKpPJ/kScDjJXcCzwJ0AVXU8\nyWHgBHAR2FNVL4xleknSohYNfVV9HXjDi6x/C9i2wDH7gf3Lnk6StGz+ZKwkNc7QS1LjDL0kNc7Q\nS1LjDL0kNc7QS1LjDL0kNc7QS1LjDL0kNc7QS1LjDL0kNc7QS1LjDL0kNc7QS1LjDL0kNc7QS1Lj\nDL0kNc7QS1LjDL0kNc7QS1LjDL0kNc7QS1LjDL0kNc7QS1LjDL0kNc7QS1LjDL0kNc7QS1Ljhg59\nkjVJvpzkk93965McTXKyu71uYN99SU4leSrJbeMYXJI0nFGu6O8Gnhy4vxc4VlWbgWPdfZJsAXYC\nNwLbgXuSrFmZcSVJoxoq9Ek2ALcD9w4s7wBmuu0Z4I6B9UNV9XxVPQ2cArauzLiSpFENe0X/fuA9\nwA8G1iar6ly3fR6Y7LbXA6cH9jvTrUmSerBo6JO8FbhQVY8stE9VFVCjnDjJ7iSzSWbn5uZGOVSS\nNIJhruhvBd6W5BngEPCmJB8GnkuyDqC7vdDtfxbYOHD8hm7tR1TVwaqarqrpiYmJZbwLkqTLWTT0\nVbWvqjZU1RTz32T9bFW9AzgC7Op22wU80G0fAXYmuSbJJmAz8PCKTy5JGsraZRx7ADic5C7gWeBO\ngKo6nuQwcAK4COypqheWPakkaUlGCn1VfQ74XLf9LWDbAvvtB/YvczZJ0grwJ2MlqXGGXpIaZ+gl\nqXGGXpIaZ+glqXGGXpIaZ+glqXGGXpIaZ+glqXGGXpIaZ+glqXGGXpIaZ+glqXGGXpIaZ+glqXGG\nXpIaZ+glqXGGXpIaZ+glqXGGXpIaZ+glqXGGXpIaZ+glqXGGXpIaZ+glqXGGXpIaZ+glqXGGXpIa\nZ+glqXGLhj7Jy5I8nOQrSY4n+bNu/fokR5Oc7G6vGzhmX5JTSZ5Kcts43wFJ0uUNc0X/PPCmqnoD\ncBOwPcktwF7gWFVtBo5190myBdgJ3AhsB+5JsmYcw0uSFrdo6Gve97u7L+3+FLADmOnWZ4A7uu0d\nwKGqer6qngZOAVtXdGpJ0tCGeo4+yZokjwEXgKNV9UVgsqrOdbucBya77fXA6YHDz3Rrl77N3Ulm\nk8zOzc0t+R2QJF3eUKGvqheq6iZgA7A1yesvebyYv8ofWlUdrKrpqpqemJgY5VBJ0ghGetVNVX0H\neIj5596fS7IOoLu90O12Ftg4cNiGbk2S1INhXnUzkeTabvvlwJuBrwFHgF3dbruAB7rtI8DOJNck\n2QRsBh5e6cElScNZO8Q+64CZ7pUzLwEOV9Unk3wBOJzkLuBZ4E6Aqjqe5DBwArgI7KmqF8YzviRp\nMYuGvqq+Ctz8IuvfArYtcMx+YP+yp5MkLZs/GStJjTP0ktQ4Qy9JjTP0ktQ4Qy9JjTP0ktQ4Qy9J\njTP0ktQ4Qy9JjTP0ktQ4Qy9JjTP0ktQ4Qy9JjTP0ktQ4Qy9JjTP0ktQ4Qy9JjTP0ktQ4Qy9JjTP0\nktQ4Qy9JjTP0ktQ4Qy9JjTP0ktQ4Qy9JjTP0ktQ4Qy9JjTP0ktS4RUOfZGOSh5KcSHI8yd3d+vVJ\njiY52d1eN3DMviSnkjyV5LZxvgOSpMsb5or+IvDuqtoC3ALsSbIF2Ascq6rNwLHuPt1jO4Ebge3A\nPUnWjGN4SdLiFg19VZ2rqke77e8BTwLrgR3ATLfbDHBHt70DOFRVz1fV08ApYOtKDy5JGs5Iz9En\nmQJuBr4ITFbVue6h88Bkt70eOD1w2Jlu7dK3tTvJbJLZubm5EceWJA1r6NAneRXwMeBdVfXdwceq\nqoAa5cRVdbCqpqtqemJiYpRDJUkjGCr0SV7KfOTvr6qPd8vPJVnXPb4OuNCtnwU2Dhy+oVuTJPVg\nmFfdBPgg8GRVvW/goSPArm57F/DAwPrOJNck2QRsBh5euZElSaNYO8Q+twLvBB5P8li39l7gAHA4\nyV3As8CdAFV1PMlh4ATzr9jZU1UvrPjkkqShLBr6qvpXIAs8vG2BY/YD+5cxlyRphQxzRa+rzNTe\nB3s79zMHbu/t3JKWxl+BIEmNM/SS1DhDL0mNM/SS1DhDL0mNM/SS1DhDL0mNM/SS1DhDL0mNM/SS\n1DhDL0mNM/SS1DhDL0mNM/SS1DhDL0mNM/SS1DhDL0mNM/SS1DhDL0mNM/SS1DhDL0mNW9v3AKvZ\n1N4H+x5BkhblFb0kNc7QS1LjDL0kNc7QS1LjDL0kNW7R0Cf5UJILSZ4YWLs+ydEkJ7vb6wYe25fk\nVJKnktw2rsElScMZ5or+PmD7JWt7gWNVtRk41t0nyRZgJ3Bjd8w9Sdas2LSSpJEtGvqq+jzw7UuW\ndwAz3fYMcMfA+qGqer6qngZOAVtXaFZJ0hIs9Tn6yao6122fBya77fXA6YH9znRrkqSeLPubsVVV\nQI16XJLdSWaTzM7NzS13DEnSApYa+ueSrAPobi9062eBjQP7bejWfkxVHayq6aqanpiYWOIYkqTF\nLDX0R4Bd3fYu4IGB9Z1JrkmyCdgMPLy8ESVJy7HoLzVL8hHg14EbkpwB/hQ4ABxOchfwLHAnQFUd\nT3IYOAFcBPZU1Qtjml2SNIRFQ19Vb1/goW0L7L8f2L+coSRJK8efjJWkxhl6SWqcoZekxhl6SWqc\noZekxhl6SWqcoZekxhl6SWqcoZekxhl6SWqcoZekxhl6SWqcoZekxhl6SWqcoZekxhl6SWqcoZek\nxhl6SWqcoZekxhl6SWqcoZekxhl6SWqcoZekxhl6SWqcoZekxhl6SWqcoZekxhl6SWrc2nG94STb\ngQ8Aa4B7q+rAuM4lScsxtffB3s79zIHbx36OsVzRJ1kD/C3wW8AW4O1JtozjXJKkyxvXFf1W4FRV\nfR0gySFgB3BiTOeTtIL6usK9Ele3P4nGFfr1wOmB+2eAXxrTuXQF9fklrtrnx9d4jO05+sUk2Q3s\n7u5+P8lTfc2yiBuAb/Y9xBI5+5W3WucGZ+9F/mJZs//cMDuNK/RngY0D9zd0a/+vqg4CB8d0/hWT\nZLaqpvueYymc/cpbrXODs/flSsw+rpdXfgnYnGRTkp8CdgJHxnQuSdJljOWKvqouJvl94J+Zf3nl\nh6rq+DjOJUm6vLE9R19VnwI+Na63fwVd9U8vXYazX3mrdW5w9r6MffZU1bjPIUnqkb8CQZIaZ+gX\nkGRjkoeSnEhyPMndfc80iiRrknw5ySf7nmUUSa5N8tEkX0vyZJJf7numYSX5o+5j5YkkH0nysr5n\nWkiSDyW5kOSJgbXrkxxNcrK7va7PGReywOx/2X3MfDXJJ5Jc2+eMC3mx2Qcee3eSSnLDSp/X0C/s\nIvDuqtoC3ALsWWW/xuFu4Mm+h1iCDwCfrqpfAN7AKnkfkqwH/hCYrqrXM/8ihJ39TnVZ9wHbL1nb\nCxyrqs3Ase7+1eg+fnz2o8Drq+oXgX8H9l3poYZ0Hz8+O0k2Am8BvjGOkxr6BVTVuap6tNv+HvPB\nWd/vVMNJsgG4Hbi371lGkeSngV8DPghQVf9TVd/pd6qRrAVenmQt8ArgP3qeZ0FV9Xng25cs7wBm\nuu0Z4I4rOtSQXmz2qvpMVV3s7v4b8z+7c9VZ4O8d4K+B9wBj+aapoR9CkingZuCL/U4ytPcz/0Hz\ng74HGdEmYA74h+5pp3uTvLLvoYZRVWeBv2L+iuwc8J9V9Zl+pxrZZFWd67bPA5N9DrMMvwf8U99D\nDCvJDuBsVX1lXOcw9ItI8irgY8C7quq7fc+zmCRvBS5U1SN9z7IEa4E3An9XVTcD/8XV+/TBj+ie\nz97B/D9WPwu8Msk7+p1q6Wr+5Xir7iV5Sf6Y+add7+97lmEkeQXwXuBPxnkeQ38ZSV7KfOTvr6qP\n9z3PkG4F3pbkGeAQ8KYkH+53pKGdAc5U1Q+/cvoo8+FfDX4TeLqq5qrqf4GPA7/S80yjei7JOoDu\n9kLP84wkye8CbwV+p1bP68Z/nvmLg690n7MbgEeT/MxKnsTQLyBJmH+u+Mmqel/f8wyrqvZV1Yaq\nmmL+m4GfrapVcWVZVeeB00le1y1tY/X8autvALckeUX3sbONVfKN5AFHgF3d9i7ggR5nGUn3Hx29\nB3hbVf133/MMq6oer6rXVNVU9zl7Bnhj97mwYgz9wm4F3sn8FfFj3Z/f7nuonwB/ANyf5KvATcCf\n9zzPULqvQj4KPAo8zvzn1lX705pJPgJ8AXhdkjNJ7gIOAG9OcpL5r1Cuyv8VboHZ/wZ4NXC0+1z9\n+16HXMACs4//vKvnKxxJ0lJ4RS9JjTP0ktQ4Qy9JjTP0ktQ4Qy9JjTP0ktQ4Qy9JjTP0ktS4/wNK\n0b5rTZiDIwAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x7fd720d04cf8>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "plt.hist(train['Title'])\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see that the titles Jonkheer, Dr and others are rare we could just simply group them under a 'Rare' group"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "train['Title'] = train['Title'].replace(['Lady', 'Countess','Capt', 'Col','Don', 'Dr', 'Major', 'Rev', 'Sir', 'Jonkheer', 'Dona'], 'Rare')\n",
    "#Replacing some similar titles and replacing the rare titles by 'Rare' tag"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'From this feature by examinig the correlation matrix we can observe a few things and improve out the accuracy ,like\\nmost of the people who had a title Mr. died'"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "'''From this feature by examinig the correlation matrix we can observe a few things and improve out the accuracy ,like\n",
    "most of the people who had a title Mr. died'''"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}