Saving and loading
Model persistence
All models in Turftopic can be serialized and saved to disk, or published to the HuggingFace Hub.
Saving locally
Turftopic models can be saved to disk using their to_disk() method:
from turftopic import SemanticSignalSeparation
# corpus is assumed to be a list of text documents (strings) defined earlier
model = SemanticSignalSeparation(10).fit(corpus)
model.to_disk("./local_directory/")
Publishing models
Models can also be pushed to HuggingFace repositories, so that others can easily access and modify the topic models you've trained.
# The repository name is arbitrary, but a descriptive one makes the model easier to identify
model.push_to_hub("your_user/s3_20-newsgroups_10-topics")
Loading models
You can load models from either the Hub or disk using the load_model() function:
from turftopic import load_model
model = load_model("./local_directory/")
# or from the Hub
model = load_model("your_user/s3_20-newsgroups_10-topics")
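A common use of saving and loading is to avoid refitting a model on every run. The following is a minimal sketch of that pattern, not part of Turftopic: load_or_train is a hypothetical helper that takes the load, train and save steps as plain callables, and it assumes that a saved model lives in a non-empty directory.

```python
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_train(
    path: str,
    load: Callable[[str], T],
    train: Callable[[], T],
    save: Callable[[T, str], None],
) -> T:
    """Return the model stored at `path`; train and save it first if absent.

    Hypothetical helper: a directory counts as a saved model
    if it exists and contains at least one entry.
    """
    directory = Path(path)
    if directory.is_dir() and any(directory.iterdir()):
        # A previous run already saved something here, so reuse it.
        return load(path)
    # Nothing saved yet: fit a fresh model and persist it for next time.
    model = train()
    save(model, path)
    return model


# Possible use with the calls shown above (left commented out because it
# needs turftopic installed and a corpus to fit on):
# model = load_or_train(
#     "./local_directory/",
#     load=load_model,
#     train=lambda: SemanticSignalSeparation(10).fit(corpus),
#     save=lambda m, p: m.to_disk(p),
# )
```

Passing the steps as callables keeps the helper independent of any particular model class, so the same function works for any model that offers to_disk() and can be read back with load_model().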
TopicData persistence
You can also save and load TopicData objects with Turftopic.
TopicData objects are saved using joblib, so we recommend giving all TopicData files a .joblib file extension:
Note on compatibility
For backwards compatibility, TopicData objects are saved using joblib as simple dict objects.
If you load a saved TopicData file directly with joblib instead of using from_disk(), you get a plain dict rather than a TopicData object.
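from turftopic.data import TopicData

# Save: compute the topic data for a corpus and write it to disk
topic_data = model.prepare_topic_data(corpus)
topic_data.to_disk("topic_data.joblib")

# Load: restore a TopicData object from the saved file
topic_data = TopicData.from_disk("topic_data.joblib")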
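To illustrate the compatibility note, here is a small sketch of inspecting a saved file with joblib alone. saved_topic_data_keys is a hypothetical helper, not a Turftopic function. It relies only on the statement above that the file holds a dict, and it makes no assumption about which keys Turftopic stores.

```python
import joblib


def saved_topic_data_keys(path: str) -> list[str]:
    """List the top-level keys of a TopicData file read with plain joblib.

    Hypothetical helper: it only checks that the file holds a dict,
    as the compatibility note describes, and returns the key names.
    """
    raw = joblib.load(path)
    if not isinstance(raw, dict):
        raise TypeError(f"expected a dict in {path!r}, got {type(raw).__name__}")
    # Convert keys to strings so the result is sortable and printable.
    return sorted(str(key) for key in raw)
```

This is useful for a quick look at a file without importing Turftopic. To get a full TopicData object back, use TopicData.from_disk() as shown above.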