Commit

deploy: 9cc34a8
kvarada committed Dec 3, 2024
1 parent aef83b0 commit 621d72b
Showing 5 changed files with 257 additions and 10 deletions.
77 changes: 73 additions & 4 deletions _sources/lectures/notes/final-exam-review-guiding-question.ipynb
@@ -99,7 +99,8 @@
"- What are the advantages of cross-validation?\n",
"- Why is it important to look at sub-scores of cross-validation?\n",
"- What is the fundamental trade-off in supervised machine learning?\n",
"- What is the Golden rule in supervised machine learning? "
"- What is the Golden rule in supervised machine learning?\n",
"- Scenarios for data leakage "
]
},
{
@@ -113,8 +114,28 @@
"- KNNs, SVM RBFs\n",
"- Linear models \n",
"- Random forests\n",
"- Grading Boosyinh, LGBM, CatBoost\n",
"- Stacking, averaging "
"- Gradient Boosting, LGBM, CatBoost\n",
"- Stacking, averaging\n",
"\n",
"**Comparison of models**\n",
"| **Model** | Parameters and hyperparameters | **Strengths** | **Weaknesses** |\n",
"|------------------|--------------------------------|---------------------------|---------------------------|\n",
"| **Decision Trees** | | | |\n",
"| **KNNs** | | | |\n",
"| **SVM RBF** | | | |\n",
"| **Linear models** | | | | \n",
"| **Random forests** | | | | \n",
"| **Gradient boosting** | | | | \n",
"| **Stacking** | | | | \n",
"| **Averaging** | | | | \n"
]
},
{
"cell_type": "markdown",
"id": "3b43fa4c-5691-4397-a057-a881d1d94179",
"metadata": {},
"source": [
"<br><br>"
]
},
{
@@ -133,6 +154,22 @@
"- What are the various data preprocessing steps, such as scaling, OHE, ordinal encoding, and handling missing values? Why and when is each step necessary?"
]
},
{
"cell_type": "markdown",
"id": "46551fbd-cf55-418c-867d-f8c7705fe7d1",
"metadata": {},
"source": [
"**`sklearn` Transformers** \n",
"| **Transformer** | Hyperparameters | **When to use?** |\n",
"|------------------|--------------------------------|---------------------------|\n",
"| `SimpleImputer` | | | \n",
"| `StandardScaler` | | | \n",
"| `OneHotEncoder` | | | \n",
"| `OrdinalEncoder` | | | \n",
"| `CountVectorizer` | | | \n",
"| `TransformedTargetRegressor` | | |\n"
]
},
{
"cell_type": "markdown",
"id": "bf30b454-9f43-481e-9b1c-da43031fc0d8",
@@ -586,7 +623,14 @@
"\n",
"- What makes hyperparameter optimization a hard problem?\n",
"- What are two different tools provided by sklearn for hyperparameter optimization? \n",
"- What is optimization bias? "
"- What is optimization bias?\n",
"\n",
"\n",
"| **Method** | Strengths/Weaknesses | **When to use?** |\n",
"|------------------|--------------------------------|---------------------------|\n",
"| Nested for loops | | | \n",
"| Grid search | | | \n",
"| Random search | | | "
]
},
{
@@ -604,6 +648,31 @@
"- What are the advantages of RMSE or MAPE over MSE? "
]
},
{
"cell_type": "markdown",
"id": "7e11a3f7-0ec3-4306-a84e-43fe74869e20",
"metadata": {},
"source": [
"**Classification Metrics**\n",
"| **Metric** | How to generate/calculate? | **When to use?** |\n",
"|------------------|--------------------------------|---------------------------|\n",
"| Accuracy | | | \n",
"| Precision | | | \n",
"| Recall | | | \n",
"| F1-score | | | \n",
"| AP score | | | \n",
"| AUC | | | \n",
"\n",
"\n",
"**Regression Metrics**\n",
"| **Metric** | How to generate/calculate? | **When to use?** |\n",
"|------------------|--------------------------------|---------------------------|\n",
"| MSE | | | \n",
"| RMSE | | | \n",
"| r2 score | | | \n",
"| MAPE | | | "
]
},
{
"cell_type": "markdown",
"id": "a1e6c11b-ee26-4d37-87ea-2b6bd3560f60",
4 changes: 2 additions & 2 deletions lectures/101-Giulia-lectures/07_linear-models.html
@@ -1751,8 +1751,8 @@ <h4>Predicting with learned weights<a class="headerlink" href="#predicting-with-
<p>In our case, for values of the coefficient of <em>boring</em> &lt; -3.36, the prediction would be negative.</p>
<p>A linear model learns these coefficients or weights from the training data!</p>
<p>So a linear classifier is a linear function of the input <code class="docutils literal notranslate"><span class="pre">X</span></code>, followed by a threshold.</p>
<div class="amsmath math notranslate nohighlight" id="equation-86e21716-4141-4267-b699-8353676a67fc">
<span class="eqno">(2)<a class="headerlink" href="#equation-86e21716-4141-4267-b699-8353676a67fc" title="Permalink to this equation">#</a></span>\[\begin{equation}
<div class="amsmath math notranslate nohighlight" id="equation-21d4f5cc-1dfa-4e83-8a4a-f5e13992a2f9">
<span class="eqno">(2)<a class="headerlink" href="#equation-21d4f5cc-1dfa-4e83-8a4a-f5e13992a2f9" title="Permalink to this equation">#</a></span>\[\begin{equation}
\begin{split}
z =&amp; w_1x_1 + \dots + w_dx_d + b\\
=&amp; w^Tx + b
4 changes: 2 additions & 2 deletions lectures/notes/07_linear-models.html
@@ -1695,8 +1695,8 @@ <h4>Predicting with learned weights<a class="headerlink" href="#predicting-with-
<p>In our case, for values of the coefficient of <em>boring</em> &lt; -3.36, the prediction would be negative.</p>
<p>A linear model learns these coefficients or weights from the training data!</p>
<p>So a linear classifier is a linear function of the input <code class="docutils literal notranslate"><span class="pre">X</span></code>, followed by a threshold.</p>
<div class="amsmath math notranslate nohighlight" id="equation-90015cd2-dc20-466e-9266-611936d67486">
<span class="eqno">(1)<a class="headerlink" href="#equation-90015cd2-dc20-466e-9266-611936d67486" title="Permalink to this equation">#</a></span>\[\begin{equation}
<div class="amsmath math notranslate nohighlight" id="equation-f13e2e3a-42f5-4848-8fe9-cf2e92698731">
<span class="eqno">(1)<a class="headerlink" href="#equation-f13e2e3a-42f5-4848-8fe9-cf2e92698731" title="Permalink to this equation">#</a></span>\[\begin{equation}
\begin{split}
z =&amp; w_1x_1 + \dots + w_dx_d + b\\
=&amp; w^Tx + b
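
The context lines in the two `07_linear-models.html` hunks describe a linear classifier as a linear function of the input followed by a threshold, with z = w^T x + b. A minimal Python sketch of that idea follows. The weights, bias, feature values, and the threshold of 0 are made-up illustration values, not taken from the lecture notes, and `linear_score` and `predict` are hypothetical helper names rather than anything from `sklearn`.

```python
def linear_score(weights, features, bias):
    """Compute z = w_1*x_1 + ... + w_d*x_d + b for one example."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def predict(weights, features, bias, threshold=0.0):
    """Apply a threshold to the linear score to get a class label."""
    z = linear_score(weights, features, bias)
    return "positive" if z >= threshold else "negative"


# Hypothetical learned weights for two features, plus a hypothetical bias.
weights = [2.0, -3.0]
bias = 0.5

example_a = [1.0, 0.0]  # only the first feature is present
example_b = [0.0, 1.0]  # only the second feature is present

print(linear_score(weights, example_a, bias), predict(weights, example_a, bias))
print(linear_score(weights, example_b, bias), predict(weights, example_b, bias))
```

A feature with a large negative weight pulls the score below the threshold, which is the sense in which a sufficiently negative coefficient flips the prediction to negative.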
180 changes: 179 additions & 1 deletion lectures/notes/final-exam-review-guiding-question.html
@@ -521,6 +521,7 @@ <h3>ML fundamentals<a class="headerlink" href="#ml-fundamentals" title="Link to
<li><p>Why is it important to look at sub-scores of cross-validation?</p></li>
<li><p>What is the fundamental trade-off in supervised machine learning?</p></li>
<li><p>What is the Golden rule in supervised machine learning?</p></li>
<li><p>Scenarios for data leakage</p></li>
</ul>
</section>
<section id="pros-cons-parameters-and-hyperparameters-of-different-ml-models">
@@ -530,15 +531,105 @@ <h3>Pros, cons, parameters and hyperparameters of different ML models<a class="h
<li><p>KNNs, SVM RBFs</p></li>
<li><p>Linear models</p></li>
<li><p>Random forests</p></li>
<li><p>Grading Boosyinh, LGBM, CatBoost</p></li>
<li><p>Gradient Boosting, LGBM, CatBoost</p></li>
<li><p>Stacking, averaging</p></li>
</ul>
<p><strong>Comparison of models</strong></p>
<div class="pst-scrollable-table-container"><table class="table">
<thead>
<tr class="row-odd"><th class="head"><p><strong>Model</strong></p></th>
<th class="head"><p>Parameters and hyperparameters</p></th>
<th class="head"><p><strong>Strengths</strong></p></th>
<th class="head"><p><strong>Weaknesses</strong></p></th>
</tr>
</thead>
<tbody>
<tr class="row-even"><td><p><strong>Decision Trees</strong></p></td>
<td><p></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p><strong>KNNs</strong></p></td>
<td><p></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-even"><td><p><strong>SVM RBF</strong></p></td>
<td><p></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p><strong>Linear models</strong></p></td>
<td><p></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-even"><td><p><strong>Random forests</strong></p></td>
<td><p></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p><strong>Gradient boosting</strong></p></td>
<td><p></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-even"><td><p><strong>Stacking</strong></p></td>
<td><p></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p><strong>Averaging</strong></p></td>
<td><p></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
</tbody>
</table>
</div>
<p><br><br></p>
</section>
<section id="preprocessing">
<h3>Preprocessing<a class="headerlink" href="#preprocessing" title="Link to this heading">#</a></h3>
<ul class="simple">
<li><p>What are the various data preprocessing steps, such as scaling, OHE, ordinal encoding, and handling missing values? Why and when is each step necessary?</p></li>
</ul>
<p><strong><code class="docutils literal notranslate"><span class="pre">sklearn</span></code> Transformers</strong></p>
<div class="pst-scrollable-table-container"><table class="table">
<thead>
<tr class="row-odd"><th class="head"><p><strong>Transformer</strong></p></th>
<th class="head"><p>Hyperparameters</p></th>
<th class="head"><p><strong>When to use?</strong></p></th>
</tr>
</thead>
<tbody>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">SimpleImputer</span></code></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">StandardScaler</span></code></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">OneHotEncoder</span></code></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">OrdinalEncoder</span></code></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">CountVectorizer</span></code></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">TransformedTargetRegressor</span></code></p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
</tbody>
</table>
</div>
<p>Let’s bring back our quiz2 grades toy dataset.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
@@ -875,6 +966,29 @@ <h3>Hyperparameter optimization<a class="headerlink" href="#hyperparameter-optim
<li><p>What are two different tools provided by sklearn for hyperparameter optimization?</p></li>
<li><p>What is optimization bias?</p></li>
</ul>
<div class="pst-scrollable-table-container"><table class="table">
<thead>
<tr class="row-odd"><th class="head"><p><strong>Method</strong></p></th>
<th class="head"><p>Strengths/Weaknesses</p></th>
<th class="head"><p><strong>When to use?</strong></p></th>
</tr>
</thead>
<tbody>
<tr class="row-even"><td><p>Nested for loops</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p>Grid search</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-even"><td><p>Random search</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
</tbody>
</table>
</div>
</section>
<section id="evaluation-metrics">
<h3>Evaluation metrics<a class="headerlink" href="#evaluation-metrics" title="Link to this heading">#</a></h3>
@@ -886,6 +1000,70 @@ <h3>Evaluation metrics<a class="headerlink" href="#evaluation-metrics" title="Li
<li><p>What’s the main difference between AP score and F1 score?</p></li>
<li><p>What are the advantages of RMSE or MAPE over MSE?</p></li>
</ul>
<p><strong>Classification Metrics</strong></p>
<div class="pst-scrollable-table-container"><table class="table">
<thead>
<tr class="row-odd"><th class="head"><p><strong>Metric</strong></p></th>
<th class="head"><p>How to generate/calculate?</p></th>
<th class="head"><p><strong>When to use?</strong></p></th>
</tr>
</thead>
<tbody>
<tr class="row-even"><td><p>Accuracy</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p>Precision</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-even"><td><p>Recall</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p>F1-score</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-even"><td><p>AP score</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p>AUC</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
</tbody>
</table>
</div>
<p><strong>Regression Metrics</strong></p>
<div class="pst-scrollable-table-container"><table class="table">
<thead>
<tr class="row-odd"><th class="head"><p><strong>Metric</strong></p></th>
<th class="head"><p>How to generate/calculate?</p></th>
<th class="head"><p><strong>When to use?</strong></p></th>
</tr>
</thead>
<tbody>
<tr class="row-even"><td><p>MSE</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p>RMSE</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-even"><td><p>r2 score</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
<tr class="row-odd"><td><p>MAPE</p></td>
<td><p></p></td>
<td><p></p></td>
</tr>
</tbody>
</table>
</div>
</section>
<section id="ensembles">
<h3>Ensembles<a class="headerlink" href="#ensembles" title="Link to this heading">#</a></h3>
2 changes: 1 addition & 1 deletion searchindex.js

Large diffs are not rendered by default.
