Getting Started

Turftopic is a topic modeling library that aims to simplify and streamline the use of contextually sensitive topic models. We provide stable, minimal and scalable implementations of several types of models, along with extensive documentation.

Build and Train Topic Models
Explore, Interpret and Visualize your Models
Modify and Fine-tune Topic Models
Choose the Right Model for your Use-Case
Explore Topics Changing over Time
Use Phrases or Lemmas for Topic Models
Extract Topics from a Stream of Documents
Find Hierarchical Order in Topics
Name Topics with Large Language Models

Basic Usage

Turftopic can be installed from PyPI.

pip install turftopic

Turftopic's models follow the scikit-learn API conventions, and as such they are quite easy to use if you are familiar with scikit-learn workflows.

Here's an example of how to use KeyNMF, one of our models, on the 20 Newsgroups dataset from scikit-learn.

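from turftopic import KeyNMF
from sklearn.datasets import fetch_20newsgroups

# Load the full 20 Newsgroups corpus without headers, footers and quoted replies.
newsgroups = fetch_20newsgroups(
    subset="all",
    remove=("headers", "footers", "quotes"),
)
corpus = newsgroups.data

# Fit a KeyNMF model with 20 topics, then print the top words of each topic.
model = KeyNMF(20).fit(corpus)
model.print_topics()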

Topic ID | Top 10 Words
0 | armenians, armenian, armenia, turks, turkish, genocide, azerbaijan, soviet, turkey, azerbaijani
1 | sale, price, shipping, offer, sell, prices, interested, 00, games, selling
....
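
Because the models follow scikit-learn conventions, a fitted model can also be used to look at documents rather than topics. The helper below is a minimal sketch, not part of Turftopic: `dominant_topics` is a hypothetical name, and it assumes that `model.fit_transform(corpus)` returns one row of topic weights per document, as scikit-learn transformers conventionally do.

```python
def dominant_topics(model, corpus):
    """Fit `model` on `corpus` and return the strongest topic index per document.

    Assumption: `model.fit_transform(corpus)` returns a matrix-like object with
    one row per document and one column per topic (scikit-learn convention).
    """
    doc_topic = model.fit_transform(corpus)
    # For each document row, pick the column index holding the largest weight.
    return [max(range(len(row)), key=lambda k: row[k]) for row in doc_topic]
```

With the corpus from the example above, this could be called as `dominant_topics(KeyNMF(20), corpus)`. For KeyNMF, the returned indices should line up with the Topic ID column printed by `print_topics()`.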