Formulas: Fitting models using R-style formulas
===============================================


.. _formulas_notebook:

`Link to Notebook GitHub <https://github.com/statsmodels/statsmodels/blob/master/examples/notebooks/formulas.ipynb>`_

.. raw:: html

   
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>Since version 0.5.0, <code>statsmodels</code> allows users to fit statistical models using R-style formulas. Internally, <code>statsmodels</code> uses the <a href="http://patsy.readthedocs.org/">patsy</a> package to convert formulas and data to the matrices that are used in model fitting. The formula framework is quite powerful; this tutorial only scratches the surface. A full description of the formula language can be found in the <code>patsy</code> docs: </p>
   <ul>
   <li><a href="http://patsy.readthedocs.org/">Patsy formula language description</a></li>
   </ul>
   <h2 id="loading-modules-and-functions">Loading modules and functions</h2>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[1]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">print_function</span>
   <span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
   <span class="kn">import</span> <span class="nn">statsmodels.api</span> <span class="kn">as</span> <span class="nn">sm</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h4 id="Import-convention">Import convention</h4>
   </div>
   </div>
   </div>
   
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>You can import model functions explicitly from <code>statsmodels.formula.api</code>:</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[2]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="kn">from</span> <span class="nn">statsmodels.formula.api</span> <span class="kn">import</span> <span class="n">ols</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>Alternatively, you can just use the <code>formula</code> namespace of the main <code>statsmodels.api</code>.</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[3]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="n">sm</span><span class="o">.</span><span class="n">formula</span><span class="o">.</span><span class="n">ols</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt output_prompt">Out[3]:</div>
   
   
   <div class="output_text output_subarea output_pyout">
   <pre>
   &lt;bound method type.from_formula of &lt;class &apos;statsmodels.regression.linear_model.OLS&apos;&gt;&gt;
   </pre>
   </div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>Or you can use the following convention:</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[4]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="kn">import</span> <span class="nn">statsmodels.formula.api</span> <span class="kn">as</span> <span class="nn">smf</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>These names are just a convenient way to access each model&#39;s <code>from_formula</code> classmethod. See, for instance:</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[5]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="n">sm</span><span class="o">.</span><span class="n">OLS</span><span class="o">.</span><span class="n">from_formula</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt output_prompt">Out[5]:</div>
   
   
   <div class="output_text output_subarea output_pyout">
   <pre>
   &lt;bound method type.from_formula of &lt;class &apos;statsmodels.regression.linear_model.OLS&apos;&gt;&gt;
   </pre>
   </div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>All of the lower-case models accept <code>formula</code> and <code>data</code> arguments, whereas the upper-case ones take <code>endog</code> and <code>exog</code> design matrices. <code>formula</code> accepts a string that describes the model in terms of a <code>patsy</code> formula. <code>data</code> takes a <a href="http://pandas.pydata.org/">pandas</a> data frame or any other data structure that defines <code>__getitem__</code> for variable names, such as a structured array or a dictionary of variables.</p>
   <p><code>dir(sm.formula)</code> lists the available models.</p>
   <p>Formula-compatible models have the following generic call signature: <code>(formula, data, subset=None, *args, **kwargs)</code></p>
   </div>
   </div>
   </div>
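   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>As a minimal sketch of the call signature described above, the following fits the same formula through the lower-case function and through the <code>from_formula</code> classmethod. The data frame <code>toy</code> and its columns <code>x</code> and <code>y</code> are made up for illustration and are not part of the tutorial data; the example assumes <code>pandas</code> is installed.</p>
   <pre>
   ```python
   import numpy as np
   import pandas as pd
   import statsmodels.api as sm
   import statsmodels.formula.api as smf

   # Synthetic data (an assumption for illustration): y is exactly 1 + 2*x.
   x = np.arange(10, dtype=float)
   toy = pd.DataFrame({"x": x, "y": 1.0 + 2.0 * x})

   # Lower-case function from the formula namespace ...
   res_lower = smf.ols("y ~ x", data=toy).fit()
   # ... and the classmethod it wraps on the upper-case model class.
   res_upper = sm.OLS.from_formula("y ~ x", data=toy).fit()

   # Both routes should report the same coefficients.
   print(res_lower.params)
   print(np.allclose(res_lower.params, res_upper.params))
   ```
   </pre>
   <p>Both calls take the formula string first and the data second, so either spelling can be used interchangeably in the examples that follow.</p>
   </div>
   </div>
   </div>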
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h2 id="ols-regression-using-formulas">OLS regression using formulas</h2>
   <p>To begin, we fit the linear model described on the <a href="gettingstarted.html">Getting Started</a> page. Download the data, subset columns, and list-wise delete to remove missing observations:</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[6]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="n">dta</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">get_rdataset</span><span class="p">(</span><span class="s2">&quot;Guerry&quot;</span><span class="p">,</span> <span class="s2">&quot;HistData&quot;</span><span class="p">,</span> <span class="n">cache</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[7]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="n">df</span> <span class="o">=</span> <span class="n">dta</span><span class="o">.</span><span class="n">data</span><span class="p">[[</span><span class="s1">&#39;Lottery&#39;</span><span class="p">,</span> <span class="s1">&#39;Literacy&#39;</span><span class="p">,</span> <span class="s1">&#39;Wealth&#39;</span><span class="p">,</span> <span class="s1">&#39;Region&#39;</span><span class="p">]]</span><span class="o">.</span><span class="n">dropna</span><span class="p">()</span>
   <span class="n">df</span><span class="o">.</span><span class="n">head</span><span class="p">()</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>Fit the model:</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[8]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="n">mod</span> <span class="o">=</span> <span class="n">ols</span><span class="p">(</span><span class="n">formula</span><span class="o">=</span><span class="s1">&#39;Lottery ~ Literacy + Wealth + Region&#39;</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">)</span>
   <span class="n">res</span> <span class="o">=</span> <span class="n">mod</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="k">print</span><span class="p">(</span><span class="n">res</span><span class="o">.</span><span class="n">summary</span><span class="p">())</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h2 id="categorical-variables">Categorical variables</h2>
   <p>In the summary printed by the fit above, notice that <code>patsy</code> determined that the elements of <em>Region</em> were text strings, so it treated <em>Region</em> as a categorical variable. <code>patsy</code>&#39;s default is also to include an intercept, so one of the <em>Region</em> categories was dropped automatically.</p>
   <p>If <em>Region</em> had been an integer variable that we wanted to treat explicitly as categorical, we could have done so by using the <code>C()</code> operator: </p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[9]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="n">res</span> <span class="o">=</span> <span class="n">ols</span><span class="p">(</span><span class="n">formula</span><span class="o">=</span><span class="s1">&#39;Lottery ~ Literacy + Wealth + C(Region)&#39;</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="k">print</span><span class="p">(</span><span class="n">res</span><span class="o">.</span><span class="n">params</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
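   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>To make the idea of dropping one category concrete, here is a hand-rolled sketch of treatment (dummy) coding in plain Python. It is an illustration only, not <code>patsy</code>&#39;s implementation: the helper <code>treatment_code</code>, the sample values, and the <code>[T.level]</code> labels are assumptions chosen to resemble the kind of column names a fitted model typically reports.</p>
   <pre>
   ```python
   def treatment_code(values):
       """Return the reference level and one 0/1 column per remaining level."""
       levels = sorted(set(values))
       # The first level in sorted order acts as the reference and gets no column.
       reference, others = levels[0], levels[1:]
       columns = {}
       for level in others:
           columns["[T.%s]" % level] = [1 if v == level else 0 for v in values]
       return reference, columns

   # Hypothetical region codes, one per observation.
   region = ["C", "E", "N", "E", "C", "W"]
   reference, columns = treatment_code(region)
   print("reference level:", reference)
   for name, column in columns.items():
       print(name, column)
   ```
   </pre>
   <p>An observation in the reference level has zeros in every indicator column, so its effect is carried by the intercept.</p>
   </div>
   </div>
   </div>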
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>Patsy&#39;s more advanced features for categorical variables are discussed in: <a href="contrasts.html">Patsy: Contrast Coding Systems for categorical variables</a></p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h2 id="operators">Operators</h2>
   <p>We have already seen that &quot;~&quot; separates the left-hand side of the model from the right-hand side, and that &quot;+&quot; adds new columns to the design matrix. </p>
   <h3 id="removing-variables">Removing variables</h3>
   <p>The &quot;-&quot; sign can be used to remove columns/variables. For instance, we can remove the intercept from a model by: </p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[10]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="n">res</span> <span class="o">=</span> <span class="n">ols</span><span class="p">(</span><span class="n">formula</span><span class="o">=</span><span class="s1">&#39;Lottery ~ Literacy + Wealth + C(Region) -1 &#39;</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="k">print</span><span class="p">(</span><span class="n">res</span><span class="o">.</span><span class="n">params</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
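   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>The effect of <code>- 1</code> can also be inspected without fitting a model, by asking <code>patsy</code> for the design matrix directly. The sketch below uses a small made-up dictionary <code>toy</code> with a string-valued variable <code>g</code> and a numeric variable <code>x</code>; these names are assumptions for illustration, not part of the Guerry data.</p>
   <pre>
   ```python
   from patsy import dmatrix

   # Toy data (hypothetical): one string-valued variable and one numeric variable.
   toy = {"g": ["a", "b", "c", "a"], "x": [1.0, 2.0, 3.0, 4.0]}

   # With the default intercept, one level of g is absorbed as the reference.
   with_intercept = dmatrix("x + g", toy).design_info.column_names
   # With "- 1" the intercept is removed and every level of g gets a column.
   without_intercept = dmatrix("x + g - 1", toy).design_info.column_names

   print(with_intercept)
   print(without_intercept)
   ```
   </pre>
   <p>Comparing the two lists of column names shows that removing the intercept does not shrink the model: the categorical variable expands to one column per level instead.</p>
   </div>
   </div>
   </div>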
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="multiplicative-interactions">Multiplicative interactions</h3>
   <p>&quot;:&quot; adds a new column to the design matrix containing the elementwise product of the two columns it joins. &quot;*&quot; adds that product column and also includes the individual columns that were multiplied together:</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[11]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="n">res1</span> <span class="o">=</span> <span class="n">ols</span><span class="p">(</span><span class="n">formula</span><span class="o">=</span><span class="s1">&#39;Lottery ~ Literacy : Wealth - 1&#39;</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="n">res2</span> <span class="o">=</span> <span class="n">ols</span><span class="p">(</span><span class="n">formula</span><span class="o">=</span><span class="s1">&#39;Lottery ~ Literacy * Wealth - 1&#39;</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="k">print</span><span class="p">(</span><span class="n">res1</span><span class="o">.</span><span class="n">params</span><span class="p">,</span> <span class="s1">&#39;</span><span class="se">\n</span><span class="s1">&#39;</span><span class="p">)</span>
   <span class="k">print</span><span class="p">(</span><span class="n">res2</span><span class="o">.</span><span class="n">params</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
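   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>To see the difference between the two operators without fitting a model, one option is to inspect the design matrix columns that <code>patsy.dmatrix</code> produces. This is only a sketch: the data frame <code>toy</code> and the column names <code>a</code> and <code>b</code> are made up for illustration.</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre>import pandas as pd
   import patsy

   # Hypothetical toy data; the column names a and b are illustrative only.
   toy = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

   # "a:b" yields only the product column ("- 1" drops the intercept).
   X_colon = patsy.dmatrix("a:b - 1", toy, return_type="dataframe")

   # "a*b" is shorthand for "a + b + a:b".
   X_star = patsy.dmatrix("a*b - 1", toy, return_type="dataframe")

   print(list(X_colon.columns))
   print(list(X_star.columns))
   print(X_star["a:b"].tolist())
   </pre></div>
   
   </div>
   </div>
   </div>
   </div>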
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>The formula operators support many other constructs. Please consult the <a href="https://patsy.readthedocs.org/en/latest/formulas.html">patsy docs</a> to learn more.</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h2 id="functions">Functions</h2>
   <p>You can apply vectorized functions to the variables in your model: </p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[12]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="n">res</span> <span class="o">=</span> <span class="n">smf</span><span class="o">.</span><span class="n">ols</span><span class="p">(</span><span class="n">formula</span><span class="o">=</span><span class="s1">&#39;Lottery ~ np.log(Literacy)&#39;</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="k">print</span><span class="p">(</span><span class="n">res</span><span class="o">.</span><span class="n">params</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>Define a custom function:</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[13]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="k">def</span> <span class="nf">log_plus_1</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
       <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">log</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">+</span> <span class="mf">1.</span>
   <span class="n">res</span> <span class="o">=</span> <span class="n">smf</span><span class="o">.</span><span class="n">ols</span><span class="p">(</span><span class="n">formula</span><span class="o">=</span><span class="s1">&#39;Lottery ~ log_plus_1(Literacy)&#39;</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="k">print</span><span class="p">(</span><span class="n">res</span><span class="o">.</span><span class="n">params</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>Any function that is in the calling namespace is available to the formula.</p>
   </div>
   </div>
   </div>
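   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>The namespace lookup can be checked on its own, without a dataset, by asking <code>patsy.dmatrix</code> for the transformed column. The sketch below assumes a made-up data frame <code>toy</code> with a single hypothetical column <code>x</code>; the function <code>log_plus_1</code> mirrors the one defined above and is resolved from the calling namespace when the formula is evaluated.</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre>import numpy as np
   import pandas as pd
   import patsy

   def log_plus_1(x):
       # Vectorized: works elementwise on a whole column.
       return np.log(x) + 1.0

   # Hypothetical toy data, chosen so the logs are 0, 1 and 2.
   toy = pd.DataFrame({"x": [1.0, np.e, np.e ** 2]})

   # patsy finds log_plus_1 in the namespace where dmatrix is called.
   X = patsy.dmatrix("log_plus_1(x)", toy, return_type="dataframe")

   print(list(X.columns))
   print(X["log_plus_1(x)"].round(6).tolist())
   </pre></div>
   
   </div>
   </div>
   </div>
   </div>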
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h2 id="using-formulas-with-models-that-do-not-yet-support-them">Using formulas with models that do not (yet) support them</h2>
   <p>Even if a given <code>statsmodels</code> model does not support formulas, you can still use <code>patsy</code>&#39;s formula language to produce design matrices. Those matrices
   can then be passed to the model as the <code>endog</code> and <code>exog</code> arguments. </p>
   <p>To generate <code>numpy</code> arrays: </p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[14]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="kn">import</span> <span class="nn">patsy</span>
   <span class="n">f</span> <span class="o">=</span> <span class="s1">&#39;Lottery ~ Literacy * Wealth&#39;</span>
   <span class="n">y</span><span class="p">,</span> <span class="n">X</span> <span class="o">=</span> <span class="n">patsy</span><span class="o">.</span><span class="n">dmatrices</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">df</span><span class="p">,</span> <span class="n">return_type</span><span class="o">=</span><span class="s1">&#39;matrix&#39;</span><span class="p">)</span>
   <span class="k">print</span><span class="p">(</span><span class="n">y</span><span class="p">[:</span><span class="mi">5</span><span class="p">])</span>
   <span class="k">print</span><span class="p">(</span><span class="n">X</span><span class="p">[:</span><span class="mi">5</span><span class="p">])</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>To generate pandas data frames: </p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[15]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="n">f</span> <span class="o">=</span> <span class="s1">&#39;Lottery ~ Literacy * Wealth&#39;</span>
   <span class="n">y</span><span class="p">,</span><span class="n">X</span> <span class="o">=</span> <span class="n">patsy</span><span class="o">.</span><span class="n">dmatrices</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">df</span><span class="p">,</span> <span class="n">return_type</span><span class="o">=</span><span class="s1">&#39;dataframe&#39;</span><span class="p">)</span>
   <span class="k">print</span><span class="p">(</span><span class="n">y</span><span class="p">[:</span><span class="mi">5</span><span class="p">])</span>
   <span class="k">print</span><span class="p">(</span><span class="n">X</span><span class="p">[:</span><span class="mi">5</span><span class="p">])</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[16]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre><span class="k">print</span><span class="p">(</span><span class="n">sm</span><span class="o">.</span><span class="n">OLS</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">X</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span><span class="o">.</span><span class="n">summary</span><span class="p">())</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   
   </div>
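   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>As a self-contained sketch of this workflow, the cell below builds the design matrices with <code>patsy.dmatrices</code>, passes them to <code>sm.OLS</code>, and compares the estimates with those from the formula interface. The data are synthetic: the data frame <code>toy</code>, its columns <code>x1</code>, <code>x2</code> and <code>y</code>, and the coefficients used to generate <code>y</code> are assumptions made for illustration only. Under these assumptions, both routes should give the same parameter estimates.</p>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class="highlight"><pre>import numpy as np
   import pandas as pd
   import patsy
   import statsmodels.api as sm
   import statsmodels.formula.api as smf

   # Synthetic data with a fixed seed; not the tutorial dataset.
   rng = np.random.RandomState(0)
   toy = pd.DataFrame({"x1": rng.normal(size=50), "x2": rng.normal(size=50)})
   noise = rng.normal(scale=0.1, size=50)
   toy["y"] = 1.0 + 2.0 * toy["x1"] - 0.5 * toy["x2"] + noise

   # Route 1: build endog and exog with patsy, then use the array interface.
   y, X = patsy.dmatrices("y ~ x1 * x2", toy, return_type="dataframe")
   res_matrix = sm.OLS(y, X).fit()

   # Route 2: let the formula interface build the same matrices internally.
   res_formula = smf.ols("y ~ x1 * x2", data=toy).fit()

   print(list(X.columns))
   print(np.allclose(res_matrix.params, res_formula.params))
   </pre></div>
   
   </div>
   </div>
   </div>
   </div>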

   <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/latest/MathJax.js?config=TeX-AMS_HTML" type="text/javascript"></script>
   <script type="text/javascript">
   init_mathjax = function() {
       if (window.MathJax) {
           // MathJax loaded
           MathJax.Hub.Config({
               tex2jax: {
               // I'm not sure about the \( and \[ below. It messes with the
               // prompt, and I think it's an issue with the template. -SS
                   inlineMath: [ ['$','$'], ["\\(","\\)"] ],
                   displayMath: [ ['$$','$$'], ["\\[","\\]"] ]
               },
               displayAlign: 'left', // Change this to 'center' to center equations.
               "HTML-CSS": {
                   styles: {'.MathJax_Display': {"margin": 0}}
               }
           });
           MathJax.Hub.Queue(["Typeset",MathJax.Hub]);
       }
   }
   init_mathjax();

   // since we have to load this in a ..raw:: directive we will add the css
   // after the fact
   function loadcssfile(filename){
       var fileref=document.createElement("link")
       fileref.setAttribute("rel", "stylesheet")
       fileref.setAttribute("type", "text/css")
       fileref.setAttribute("href", filename)

       document.getElementsByTagName("head")[0].appendChild(fileref)
   }
   // loadcssfile({{pathto("_static/nbviewer.pygments.css", 1) }})
   // loadcssfile({{pathto("_static/nbviewer.min.css", 1) }})
   loadcssfile("../../../_static/nbviewer.pygments.css")
   loadcssfile("../../../_static/ipython.min.css")
   </script>