Getting Started
Turftopic is a topic modeling library that aims to simplify and streamline the use of contextually sensitive topic models. We provide stable, minimal, and scalable implementations of several types of models, along with extensive documentation.
Basic Usage
Turftopic can be installed from PyPI.
pip install turftopic
Turftopic's models follow the scikit-learn API conventions, and as such they are quite easy to use if you are familiar with scikit-learn workflows.
Here's an example of using KeyNMF, one of our models, on the 20 Newsgroups dataset from scikit-learn.
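from turftopic import KeyNMF
from sklearn.datasets import fetch_20newsgroups

# Load the full 20 Newsgroups corpus without headers, footers, or quoted replies.
newsgroups = fetch_20newsgroups(
    subset="all",
    remove=("headers", "footers", "quotes"),
)
corpus = newsgroups.data

# Fit a KeyNMF model with 20 topics, then print the top words of each topic.
model = KeyNMF(20).fit(corpus)
model.print_topics()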
| Topic ID | Top 10 Words |
|---|---|
| 0 | armenians, armenian, armenia, turks, turkish, genocide, azerbaijan, soviet, turkey, azerbaijani |
| 1 | sale, price, shipping, offer, sell, prices, interested, 00, games, selling |
| ... | ... |
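The `KeyNMF(20).fit(corpus)` call can be chained because, under the scikit-learn convention, `fit` returns the estimator itself. The sketch below illustrates that convention with a toy class. `ToyTopicModel` and its `top_words_` attribute are hypothetical names invented for this illustration; the class only counts words and is not how Turftopic or KeyNMF works internally.

```python
from collections import Counter


class ToyTopicModel:
    """Hypothetical stand-in that mimics the scikit-learn estimator shape."""

    def __init__(self, n_components: int):
        # Hyperparameters are only stored here; no work happens until fit().
        self.n_components = n_components

    def fit(self, corpus: list[str]) -> "ToyTopicModel":
        # Learned state gets a trailing underscore by scikit-learn convention.
        counts = Counter(word for doc in corpus for word in doc.lower().split())
        self.top_words_ = [w for w, _ in counts.most_common(self.n_components)]
        # Returning self is what makes ToyTopicModel(...).fit(corpus) chainable.
        return self

    def transform(self, corpus: list[str]) -> list[list[int]]:
        # One row per document, one column per learned word.
        return [
            [doc.lower().split().count(word) for word in self.top_words_]
            for doc in corpus
        ]

    def fit_transform(self, corpus: list[str]) -> list[list[int]]:
        # Convenience method: fit on the corpus, then transform the same corpus.
        return self.fit(corpus).transform(corpus)


docs = ["cats chase mice", "dogs chase cats", "cats sleep"]
model = ToyTopicModel(2).fit(docs)
matrix = model.transform(docs)
```

Here `matrix` has one row per document and one column per learned word, which is the same general shape of output a scikit-learn style `transform` is expected to produce.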
Citation
Please cite us when using Turftopic:
@article{Kardos2025,
title = {Turftopic: Topic Modelling with Contextual Representations from Sentence Transformers},
doi = {10.21105/joss.08183},
url = {https://doi.org/10.21105/joss.08183},
year = {2025},
publisher = {The Open Journal},
volume = {10},
number = {111},
pages = {8183},
author = {Kardos, Márton and Enevoldsen, Kenneth C. and Kostkan, Jan and Kristensen-McLachlan, Ross Deans and Rocca, Roberta},
journal = {Journal of Open Source Software}
}