Commit

Deployed ccc9ebd with MkDocs version: 1.5.3
MaartenGr committed May 12, 2024
1 parent 4f9a6ff commit 93bc639
Showing 6 changed files with 121 additions and 82 deletions.
2 changes: 1 addition & 1 deletion algorithm/algorithm.html
Expand Up @@ -2896,7 +2896,7 @@ <h2 id="visual-overview"><strong>Visual Overview</strong><a class="headerlink" h
<ol>
<li><a href="../getting_started/embeddings/embeddings.html">Embeddings</a></li>
<li><a href="../getting_started/dim_reduction/dim_reduction.html">Dimensionality Reduction</a></li>
<li><a href="../getting_started/dim_reduction/dim_reduction.html">Clustering</a></li>
<li><a href="../getting_started/clustering/clustering.html">Clustering</a></li>
<li><a href="../getting_started/vectorizers/vectorizers.html">Tokenizer</a></li>
<li><a href="../getting_started/ctfidf/ctfidf.html">Weighting Scheme</a></li>
<li><a href="../getting_started/representation/representation.html">Representation Tuning</a><ul>
30 changes: 19 additions & 11 deletions api/bertopic.html
Expand Up @@ -7933,7 +7933,11 @@ <h1 id="bertopic"><code>BERTopic</code><a class="headerlink" href="#bertopic" ti
<span class="normal">4351</span>
<span class="normal">4352</span>
<span class="normal">4353</span>
<span class="normal">4354</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="k">class</span> <span class="nc">BERTopic</span><span class="p">:</span>
<span class="normal">4354</span>
<span class="normal">4355</span>
<span class="normal">4356</span>
<span class="normal">4357</span>
<span class="normal">4358</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="k">class</span> <span class="nc">BERTopic</span><span class="p">:</span>
<span class="w"> </span><span class="sd">&quot;&quot;&quot;BERTopic is a topic modeling technique that leverages BERT embeddings and</span>
<span class="sd"> c-TF-IDF to create dense clusters allowing for easily interpretable topics</span>
<span class="sd"> whilst keeping important words in the topic descriptions.</span>
Expand Down Expand Up @@ -11548,15 +11552,20 @@ <h1 id="bertopic"><code>BERTopic</code><a class="headerlink" href="#bertopic" ti

<span class="n">cluster_indices</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">documents</span><span class="o">.</span><span class="n">Old_ID</span><span class="o">.</span><span class="n">values</span><span class="p">)</span>
<span class="n">cluster_names</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">merged_model</span><span class="o">.</span><span class="n">topic_labels_</span><span class="o">.</span><span class="n">values</span><span class="p">())[</span><span class="nb">len</span><span class="p">(</span><span class="nb">set</span><span class="p">(</span><span class="n">y</span><span class="p">)):]</span>
<span class="n">cluster_topics</span> <span class="o">=</span> <span class="p">[</span><span class="n">cluster_names</span><span class="p">[</span><span class="n">topic</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">_outliers</span><span class="p">]</span> <span class="k">for</span> <span class="n">topic</span> <span class="ow">in</span> <span class="n">documents</span><span class="o">.</span><span class="n">Topic</span><span class="o">.</span><span class="n">values</span><span class="p">]</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_outliers</span><span class="p">:</span>
<span class="n">cluster_topics</span> <span class="o">=</span> <span class="p">[</span><span class="n">cluster_names</span><span class="p">[</span><span class="n">topic</span><span class="p">]</span> <span class="k">if</span> <span class="n">topic</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span> <span class="k">else</span> <span class="s2">&quot;Outliers&quot;</span> <span class="k">for</span> <span class="n">topic</span> <span class="ow">in</span> <span class="n">documents</span><span class="o">.</span><span class="n">Topic</span><span class="o">.</span><span class="n">values</span><span class="p">]</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">cluster_topics</span> <span class="o">=</span> <span class="p">[</span><span class="n">cluster_names</span><span class="p">[</span><span class="n">topic</span><span class="p">]</span> <span class="k">for</span> <span class="n">topic</span> <span class="ow">in</span> <span class="n">documents</span><span class="o">.</span><span class="n">Topic</span><span class="o">.</span><span class="n">values</span><span class="p">]</span>

<span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">({</span>
<span class="s2">&quot;Indices&quot;</span><span class="p">:</span> <span class="n">zeroshot_indices</span> <span class="o">+</span> <span class="n">cluster_indices</span><span class="p">,</span>
<span class="s2">&quot;Label&quot;</span><span class="p">:</span> <span class="n">zeroshot_topics</span> <span class="o">+</span> <span class="n">cluster_topics</span><span class="p">}</span>
<span class="p">)</span><span class="o">.</span><span class="n">sort_values</span><span class="p">(</span><span class="s2">&quot;Indices&quot;</span><span class="p">)</span>
<span class="n">reverse_topic_labels</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">((</span><span class="n">v</span><span class="p">,</span> <span class="n">k</span><span class="p">)</span> <span class="k">for</span> <span class="n">k</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">merged_model</span><span class="o">.</span><span class="n">topic_labels_</span><span class="o">.</span><span class="n">items</span><span class="p">())</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_outliers</span><span class="p">:</span>
<span class="n">reverse_topic_labels</span><span class="p">[</span><span class="s2">&quot;Outliers&quot;</span><span class="p">]</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span>
<span class="n">df</span><span class="o">.</span><span class="n">Label</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="n">Label</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">reverse_topic_labels</span><span class="p">)</span>
<span class="n">merged_model</span><span class="o">.</span><span class="n">topics_</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="n">Label</span><span class="o">.</span><span class="n">values</span>
<span class="n">merged_model</span><span class="o">.</span><span class="n">topics_</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="n">Label</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="nb">int</span><span class="p">)</span><span class="o">.</span><span class="n">tolist</span><span class="p">()</span>

<span class="c1"># Update the class internally</span>
<span class="n">has_outliers</span> <span class="o">=</span> <span class="nb">bool</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_outliers</span><span class="p">)</span>
Expand Down Expand Up @@ -11846,8 +11855,7 @@ <h1 id="bertopic"><code>BERTopic</code><a class="headerlink" href="#bertopic" ti
<span class="k">if</span> <span class="n">partial_fit</span><span class="p">:</span>
<span class="n">X</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">vectorizer_model</span><span class="o">.</span><span class="n">partial_fit</span><span class="p">(</span><span class="n">documents</span><span class="p">)</span><span class="o">.</span><span class="n">update_bow</span><span class="p">(</span><span class="n">documents</span><span class="p">)</span>
<span class="k">elif</span> <span class="n">fit</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">vectorizer_model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">documents</span><span class="p">)</span>
<span class="n">X</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">vectorizer_model</span><span class="o">.</span><span class="n">transform</span><span class="p">(</span><span class="n">documents</span><span class="p">)</span>
<span class="n">X</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">vectorizer_model</span><span class="o">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">documents</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">X</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">vectorizer_model</span><span class="o">.</span><span class="n">transform</span><span class="p">(</span><span class="n">documents</span><span class="p">)</span>

Expand Down Expand Up @@ -12899,11 +12907,7 @@ <h2 id="bertopic._bertopic.BERTopic.__str__" class="doc doc-heading">

<details class="quote">
<summary>Source code in <code>bertopic\_bertopic.py</code></summary>
<div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">4340</span>
<span class="normal">4341</span>
<span class="normal">4342</span>
<span class="normal">4343</span>
<span class="normal">4344</span>
<div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">4344</span>
<span class="normal">4345</span>
<span class="normal">4346</span>
<span class="normal">4347</span>
Expand All @@ -12913,7 +12917,11 @@ <h2 id="bertopic._bertopic.BERTopic.__str__" class="doc doc-heading">
<span class="normal">4351</span>
<span class="normal">4352</span>
<span class="normal">4353</span>
<span class="normal">4354</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="k">def</span> <span class="fm">__str__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="normal">4354</span>
<span class="normal">4355</span>
<span class="normal">4356</span>
<span class="normal">4357</span>
<span class="normal">4358</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="k">def</span> <span class="fm">__str__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="w"> </span><span class="sd">&quot;&quot;&quot;Get a string representation of the current object.</span>

<span class="sd"> Returns:</span>
31 changes: 31 additions & 0 deletions changelog.html
Expand Up @@ -2653,6 +2653,15 @@
</label>
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>

<li class="md-nav__item">
<a href="#version-0162" class="md-nav__link">
<span class="md-ellipsis">
Version 0.16.2
</span>
</a>

</li>

<li class="md-nav__item">
<a href="#version-0161" class="md-nav__link">
<span class="md-ellipsis">
Expand Down Expand Up @@ -2968,6 +2977,15 @@
</label>
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>

<li class="md-nav__item">
<a href="#version-0162" class="md-nav__link">
<span class="md-ellipsis">
Version 0.16.2
</span>
</a>

</li>

<li class="md-nav__item">
<a href="#version-0161" class="md-nav__link">
<span class="md-ellipsis">
Expand Down Expand Up @@ -3266,6 +3284,19 @@


<h1 id="changelog">Changelog<a class="headerlink" href="#changelog" title="Permanent link">&para;</a></h1>
<h2 id="version-0162"><strong>Version 0.16.2</strong><a class="headerlink" href="#version-0162" title="Permanent link">&para;</a></h2>
<p><em>Release date: 12 May, 2024</em></p>
<h3><b>Fixes:</b></h3>

<ul>
<li>Fix issue with zero-shot topic modeling missing outliers <a href="https://github.com/MaartenGr/BERTopic/issues/1957">#1957</a></li>
<li>Bump GitHub Actions versions by <a href="https://github.com/afuetterer">@afuetterer</a> in <a href="https://github.com/MaartenGr/BERTopic/pull/1941">#1941</a></li>
<li>Drop support for Python 3.7 by <a href="https://github.com/afuetterer">@afuetterer</a> in <a href="https://github.com/MaartenGr/BERTopic/pull/1949">#1949</a></li>
<li>Add testing for Python 3.10+ in GitHub Actions by <a href="https://github.com/afuetterer">@afuetterer</a> in <a href="https://github.com/MaartenGr/BERTopic/pull/1968">#1968</a></li>
<li>Speed up fitting CountVectorizer by <a href="https://github.com/dannywhuang">@dannywhuang</a> in <a href="https://github.com/MaartenGr/BERTopic/pull/1938">#1938</a></li>
<li>Fix <code>transform</code> when using cuML HDBSCAN by <a href="https://github.com/beckernick">@beckernick</a> in <a href="https://github.com/MaartenGr/BERTopic/pull/1960">#1960</a></li>
<li>Fix wrong link in algorithm documentation by <a href="https://github.com/naeyn">@naeyn</a> in <a href="https://github.com/MaartenGr/BERTopic/pull/1970">#1970</a></li>
</ul>
<h2 id="version-0161"><strong>Version 0.16.1</strong><a class="headerlink" href="#version-0161" title="Permanent link">&para;</a></h2>
<p><em>Release date: 21 April, 2024</em></p>
<h3><b>Highlights:</b></h3>
2 changes: 1 addition & 1 deletion search/search_index.json

Large diffs are not rendered by default.
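
The outlier handling shown in the `api/bertopic.html` hunk of this commit can be sketched in isolation. The snippet below is a minimal illustration and not BERTopic's code: the helper names `map_cluster_topics` and `labels_to_ids` and the toy data are hypothetical. It only mirrors the logic visible in the diff, where topic `-1` is given an explicit `"Outliers"` label, that label is mapped back to `-1`, and the final topics are returned as a plain list of integers.

```python
def map_cluster_topics(topics, cluster_names, has_outliers):
    """Map topic ids to label names, giving -1 an explicit "Outliers" label."""
    if has_outliers:
        return [cluster_names[t] if t != -1 else "Outliers" for t in topics]
    return [cluster_names[t] for t in topics]


def labels_to_ids(labels, topic_labels, has_outliers):
    """Invert the id -> label mapping and turn labels back into integer ids."""
    reverse = {label: topic_id for topic_id, label in topic_labels.items()}
    if has_outliers:
        # Register the outlier label so it maps back to -1.
        reverse["Outliers"] = -1
    return [int(reverse[label]) for label in labels]


# Hypothetical toy data for illustration only.
cluster_names = ["0_cats_dogs", "1_stocks_bonds"]
topic_labels = {0: "0_cats_dogs", 1: "1_stocks_bonds"}
topics = [0, -1, 1, 0]

labels = map_cluster_topics(topics, cluster_names, has_outliers=True)
ids = labels_to_ids(labels, topic_labels, has_outliers=True)
print(labels)
print(ids)
```

In this sketch the round trip preserves `-1` for outlier documents instead of indexing into the list of cluster names with it, and the result is a list of Python integers rather than an array of mixed values.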
