Saving and loading

Model persistence

All models in Turftopic can be serialized and saved to disk, or published to the HuggingFace Hub.

Saving locally

Turftopic models can be saved to disk with their to_disk() method:

from turftopic import SemanticSignalSeparation

model = SemanticSignalSeparation(10).fit(corpus)
model.to_disk("./local_directory/")

Publishing models

Models can also be pushed to HuggingFace Hub repositories with push_to_hub(), so that others can easily access and modify the topic models you've trained.

# The repository name is arbitrary, but a descriptive one is easier to find later
model.push_to_hub("your_user/s3_20-newsgroups_10-topics")

Loading models

You can load models from either the Hub or disk using the load_model() function:

from turftopic import load_model

model = load_model("./local_directory/")
# or from the Hub
model = load_model("your_user/s3_20-newsgroups_10-topics")

`TopicData` persistence

You can also save and load TopicData objects with Turftopic. They are serialized with joblib, so we recommend giving all TopicData files a .joblib extension.

Note on compatibility

For backwards compatibility, TopicData objects are saved with joblib as plain dict objects. If you load a saved TopicData file directly with joblib instead of using from_disk(), you get a dict rather than a TopicData object.

Save

topic_data = model.prepare_topic_data(corpus)
topic_data.to_disk("topic_data.joblib")

Load

from turftopic.data import TopicData

topic_data = TopicData.from_disk("topic_data.joblib")
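The compatibility note above describes a general pattern: only a plain dict is written to disk, and from_disk() wraps it back into an object. The following is a minimal sketch of that pattern, not Turftopic's actual implementation. DictBackedData and load_raw are hypothetical names made up for this illustration, and the standard-library pickle module stands in for joblib so that the snippet runs without extra dependencies.

```python
import pickle
import tempfile
from pathlib import Path


class DictBackedData:
    """Hypothetical stand-in for a TopicData-like container (not Turftopic code)."""

    def __init__(self, **fields):
        self.fields = dict(fields)

    def to_disk(self, path):
        # Persist only a plain dict, so the file does not depend on this class.
        with open(path, "wb") as f:
            pickle.dump(self.fields, f)

    @classmethod
    def from_disk(cls, path):
        # Load the plain dict and wrap it back into the container.
        with open(path, "rb") as f:
            return cls(**pickle.load(f))


def load_raw(path):
    # Reading the file directly, bypassing from_disk(), yields the bare dict.
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "topic_data.pkl"
    DictBackedData(corpus=["first doc", "second doc"], vocab=["doc"]).to_disk(path)
    raw = load_raw(path)
    restored = DictBackedData.from_disk(path)

print(type(raw).__name__)       # dict
print(type(restored).__name__)  # DictBackedData
```

Storing a plain dict keeps saved files readable even if the wrapping class changes between versions, which is presumably the motivation behind the compatibility behaviour described above.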