diff --git a/docs/getting_started/guided/guided.md b/docs/getting_started/guided/guided.md
index 9233ac41..aa94316c 100644
--- a/docs/getting_started/guided/guided.md
+++ b/docs/getting_started/guided/guided.md
@@ -1,3 +1,9 @@
+!!! Note
+    Difference between Zero-shot and Guided BERTopic:
+    Guided BERTopic is similar - yet not equivalent - to [Zero-shot Topic Modeling](https://maartengr.github.io/BERTopic/getting_started/zeroshot/zeroshot.html).
+    Use Guided BERTopic to boost the importance of certain keywords. Use [Zero-shot Topic Modeling](https://maartengr.github.io/BERTopic/getting_started/zeroshot/zeroshot.html) to first try to categorize documents into predefined topics ("zero-shot topics") before clustering the remaining, unclassified documents with the default unsupervised BERTopic topic exploration algorithm.
+
+
 Guided Topic Modeling or Seeded Topic Modeling is a collection of techniques that guides the topic modeling approach by setting several seed topics to which the model will converge to. These techniques allow the user to set a predefined number of topic representations that are sure to be in documents. For example, take an IT business that has a ticket system for the software their clients use. Those tickets may typically contain information about a specific bug regarding login issues that the IT business is aware of.
 
 To model that bug, we can create a seed topic representation containing the words `bug`, `login`, `password`,
diff --git a/docs/getting_started/zeroshot/zeroshot.md b/docs/getting_started/zeroshot/zeroshot.md
index 951f6f0c..d1ffc884 100644
--- a/docs/getting_started/zeroshot/zeroshot.md
+++ b/docs/getting_started/zeroshot/zeroshot.md
@@ -1,3 +1,7 @@
+!!! Note
+    Difference between Zero-shot and Guided BERTopic:
+    Zero-shot Topic Modeling is similar - yet not equivalent - to [Guided BERTopic](https://maartengr.github.io/BERTopic/getting_started/guided/guided.html). Use [Guided BERTopic](https://maartengr.github.io/BERTopic/getting_started/guided/guided.html) to boost the importance of certain keywords. Use [Zero-shot Topic Modeling](https://maartengr.github.io/BERTopic/getting_started/zeroshot/zeroshot.html) to first try to categorize documents into predefined topics ("zero-shot topics") before clustering the remaining, unclassified documents with the default unsupervised BERTopic topic exploration algorithm.
+
 Zero-shot Topic Modeling is a technique that allows you to find topics in large amounts of documents that were predefined. When faced with many documents, you often have an idea of which topics will definitely be in there. Whether that is a result of simply knowing your data or if a domain expert is involved in defining those topics.
 
 This method allows you to not only find those specific topics but also create new topics for documents that would not fit with your predefined topics.