Getting Started
topicwizard is a pretty and opinionated Python library for topic model visualization built on Dash and Plotly. This website contains the user guide to topicwizard as well as the API reference.
Installation
Install topicwizard from PyPI:
pip install topic-wizard
Usage
Build a scikit-learn compatible topic pipeline with classical models.
Note
If you intend to investigate non-scikit-learn models, please have a look at the Compatibility page.
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import CountVectorizer
from topicwizard.pipeline import make_topic_pipeline
bow_vectorizer = CountVectorizer()
nmf = NMF(n_components=10)
model = make_topic_pipeline(bow_vectorizer, nmf)
Or build a contextually sensitive topic model with Turftopic or BERTopic:
from turftopic import KeyNMF
model = KeyNMF(n_components=10)
The easiest and most sensible way to visualize a model is with the topicwizard web application. Pass it your corpus (texts below) along with the model:
import topicwizard
topicwizard.visualize(texts, model=model)
You can also use the individual interactive plots to create just the visualizations you are interested in.
Here is an example of how you can visualize words' relations to each other in a topic model:
from topicwizard.figures import word_map
topic_data = model.prepare_topic_data(texts)
word_map(topic_data)
This will open a new browser tab in which you can investigate topic models visually.