StudentsTMixture

noloox.mixture.StudentsTMixture

Bases: BaseEstimator, ClusterMixin, DensityMixin

Student's t mixture model. Estimates a mixture of multivariate t-distributions over your data via the EM algorithm.

Parameters:

n_components : int, required
    The number of mixture components.
tol : float, default=1e-05
    The convergence threshold. EM iterations stop when the average gain in the lower bound falls below this threshold.
reg_covar : float, default=1e-06
    Non-negative regularization added to the diagonal of the covariance matrices, ensuring they stay positive definite.
max_iter : int, default=1000
    The maximum number of EM iterations to perform.
df : float, default=4.0
    Degrees of freedom for the t-distributions.
random_state : int, default=None
    Random state for reproducibility.

Attributes:

weights_ : array-like of shape (n_components,)
    The weight of each mixture component.
means_ : array-like of shape (n_components, n_features)
    The mean of each mixture component.
n_iter_ : int
    Number of EM steps used by the best fit to reach convergence.
converged_ : bool
    True if the best fit of EM converged, False otherwise.
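
A minimal usage sketch, assuming noloox is installed and importable from the module path above; the data here is synthetic and the hyperparameters are illustrative:

# Minimal usage sketch on synthetic two-blob data.
import numpy as np
from noloox.mixture import StudentsTMixture

rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=-3.0, scale=1.0, size=(200, 2)),   # blob 1
    rng.normal(loc=+3.0, scale=1.0, size=(200, 2)),   # blob 2
])

model = StudentsTMixture(n_components=2, df=4.0, random_state=0)
labels = model.fit_predict(X)

print(model.converged_, model.n_iter_)  # convergence diagnostics
print(model.weights_)                   # mixing proportions (sum to 1)
print(model.means_)                     # (n_components, n_features) locations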

Source code in noloox/mixture/tmm.py
class StudentsTMixture(BaseEstimator, ClusterMixin, DensityMixin):
    """Student's T Mixture Model.
    This class allows you to estimate a mixture of multivariate t-distributions over your data.

    Parameters
    ----------
    n_components: int
        The number of mixture components.
    tol: float, default=1e-5
        The convergence threshold. EM iterations stop when the average gain in the lower bound falls below this threshold.
    reg_covar: float, default=1e-6
        Non-negative regularization added to the diagonal of the covariance matrices, ensuring they stay positive definite.
    max_iter: int, default=1000
        The maximum number of EM iterations to perform.
    df: float, default=4.0
        Degrees of freedom for the t-distributions.
    random_state: int, default=None
        Random state for reproducibility.

    Attributes
    ----------
    weights_: array-like of shape (n_components,)
        The weight of each mixture component.
    means_: array-like of shape (n_components, n_features)
        The mean of each mixture component.
    n_iter_: int
        Number of EM steps used by the best fit to reach convergence.
    converged_: bool
        True if the best fit of EM converged, False otherwise.
    """

    def __init__(
        self,
        n_components: int,
        tol: float = 1e-5,
        reg_covar: float = 1e-06,
        max_iter: int = 1000,
        df=4.0,
        random_state=None,
    ):
        super().__init__()
        self.start_df_ = float(df)
        self.n_components = n_components
        self.tol = tol
        self.df = df
        self.reg_covar = reg_covar
        self.max_iter = max_iter
        self.random_state = random_state

    def _init_params(self, X):
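        # Locations start at the k-means centroids, weights start uniform,
        # and every component shares the (diagonally regularized) empirical
        # covariance of the full dataset as its initial scale matrix.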
        loc_ = (
            KMeans(self.n_components, random_state=self.random_state)
            .fit(X)
            .cluster_centers_
        )
        mix_weights_ = np.ones(self.n_components) / self.n_components
        default_scale_matrix = np.cov(X, rowvar=False)
        default_scale_matrix.flat[:: X.shape[1] + 1] += self.reg_covar
        if len(default_scale_matrix.shape) < 2:
            default_scale_matrix = default_scale_matrix.reshape(-1, 1)
        scale_ = np.stack(
            [default_scale_matrix for i in range(self.n_components)], axis=-1
        )
        scale_cholesky_ = [
            np.linalg.cholesky(scale_[:, :, i]) for i in range(self.n_components)
        ]
        scale_cholesky_ = np.stack(scale_cholesky_, axis=-1)
        return loc_, scale_, mix_weights_, scale_cholesky_

    @staticmethod
    def e_step(X, df_, loc_, scale_cholesky_, mix_weights_, sq_maha_dist):
        return _e_step(X, df_, loc_, scale_cholesky_, mix_weights_, sq_maha_dist)

    @staticmethod
    def m_step(X, resp, E_gamma, scale_, scale_cholesky_, df_, reg_covar):
        return _m_step(X, resp, E_gamma, scale_, scale_cholesky_, df_, reg_covar)

    def fit(self, X, y=None):
        """Estimate model parameters with the EM algorithm.

        Parameters
        ----------
        X: array-like of shape (n_samples, n_features)
            List of n_features-dimensional data points. Each row corresponds to a single data point.

        y: Ignored
            Not used, present for API consistency by convention.

        Returns
        -------
        self: StudentsTMixture
            The fitted mixture.
        """
        self.df_ = np.full((self.n_components), self.df, dtype=np.float64)
        loc_, scale_, mix_weights_, scale_cholesky_ = self._init_params(X)
        lower_bound = -np.inf
        sq_maha_dist = np.empty((X.shape[0], self.n_components))
        self.converged_ = True

        @jax.jit
        def step(loc_, mix_weights_, scale_cholesky_, scale_, sq_maha_dist):
            resp, E_gamma, current_bound = self.e_step(
                X, self.df_, loc_, scale_cholesky_, mix_weights_, sq_maha_dist
            )
            mix_weights_, loc_, scale_, scale_cholesky_ = self.m_step(
                X,
                resp,
                E_gamma,
                scale_,
                scale_cholesky_,
                self.df_,
                self.reg_covar,
            )
            return (
                loc_,
                mix_weights_,
                scale_cholesky_,
                scale_,
                sq_maha_dist,
                current_bound,
            )

        for n_iter in trange(self.max_iter, desc="Running EM"):
            loc_, mix_weights_, scale_cholesky_, scale_, sq_maha_dist, current_bound = (
                step(
                    loc_,
                    mix_weights_,
                    scale_cholesky_,
                    scale_,
                    sq_maha_dist,
                )
            )
            change = current_bound - lower_bound
            if abs(change) < self.tol:
                break
            lower_bound = current_bound
        else:
            self.converged_ = False
            warnings.warn(
                "tDistributionMixture did not converge, consider increasing the number of iterations."
            )
        self.n_iter_ = n_iter
        self.weights_ = mix_weights_
        self.means_ = loc_
        self.scale_cholesky_ = scale_cholesky_
        self.scale_ = scale_
        return self

    def predict_proba(self, X):
        """Evaluate the components' density for each sample.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            List of n_features-dimensional data points. Each row
            corresponds to a single data point.

        Returns
        -------
        resp : array, shape (n_samples, n_components)
            Posterior probability of each Student's t component for each sample in X.
        """
        sq_maha_dist = sq_maha_distance(X, self.means_, self.scale_cholesky_)
        loglik = get_loglikelihood(
            X, sq_maha_dist, self.df_, self.scale_cholesky_, self.weights_
        )
        weighted_loglik = loglik + np.log(self.weights_)[np.newaxis, :]
        with np.errstate(under="ignore"):
            loglik = weighted_loglik - logsumexp(weighted_loglik, axis=1)[:, np.newaxis]
        return np.exp(loglik)

    def predict(self, X):
        """Predict the labels for the data samples in X using trained model.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            List of n_features-dimensional data points. Each row
            corresponds to a single data point.

        Returns
        -------
        labels : array, shape (n_samples,)
            Component labels.
        """
        return np.argmax(self.predict_proba(X), axis=1)

    def fit_predict(self, X, y=None):
        """Estimate model parameters using X and predict the labels for X.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            List of n_features-dimensional data points. Each row
            corresponds to a single data point.

        y : Ignored
            Not used, present for API consistency by convention.

        Returns
        -------
        labels : array, shape (n_samples,)
            Component labels.
        """
        return self.fit(X).predict(X)

    def aic(self, X):
        """Akaike information criterion for the current model on the input X.

        Parameters
        ----------
        X: array of shape (n_samples, n_dimensions)
            The input samples.

        Returns
        -------
        aic: float
            The lower the better.
        """
        self.check_model()
        x = self.check_inputs(X)
        n_params = self.get_num_parameters()
        score = self.score(x)
        return 2 * n_params - 2 * score * x.shape[0]

    def bic(self, X):
        """Bayesian information criterion for the current model on the input X.

        Parameters
        ----------
        X: array of shape (n_samples, n_dimensions)
            The input samples.

        Returns
        -------
        bic: float
            The lower the better.
        """
        self.check_model()
        x = self.check_inputs(X)
        score = self.score(x)
        n_params = self.get_num_parameters()
        return n_params * np.log(x.shape[0]) - 2 * score * x.shape[0]

    def score_samples(self, X):
        """Compute the log-likelihood of each sample.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            List of n_features-dimensional data points. Each row
            corresponds to a single data point.

        Returns
        -------
        log_prob : array, shape (n_samples,)
            Log-likelihood of each sample in `X` under the current model.
        """
        sq_maha_dist = sq_maha_distance(X, self.means_, self.scale_cholesky_)
        loglik = get_loglikelihood(
            X, sq_maha_dist, self.df_, self.scale_cholesky_, self.weights_
        )
        weighted_loglik = loglik + np.log(self.weights_)[np.newaxis, :]
        return logsumexp(weighted_loglik, axis=1)

    def score(self, X, y=None):
        """Compute the per-sample average log-likelihood of the given data X.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_dimensions)
            List of n_features-dimensional data points. Each row
            corresponds to a single data point.

        y : Ignored
            Not used, present for API consistency by convention.

        Returns
        -------
        log_likelihood : float
            Per-sample average log-likelihood of `X` under the Student's t mixture model.
        """
        return np.mean(self.score_samples(X))

aic(X)

Akaike information criterion for the current model on the input X.

Parameters:

X : array of shape (n_samples, n_dimensions), required
    The input samples.

Returns:

aic : float
    The lower the better.

Source code in noloox/mixture/tmm.py
def aic(self, X):
    """Akaike information criterion for the current model on the input X.

    Parameters
    ----------
    X: array of shape (n_samples, n_dimensions)
        The input samples.

    Returns
    -------
    aic: float
        The lower the better.
    """
    self.check_model()
    x = self.check_inputs(X)
    n_params = self.get_num_parameters()
    score = self.score(x)
    return 2 * n_params - 2 * score * x.shape[0]

bic(X)

Bayesian information criterion for the current model on the input X.

Parameters:

X : array of shape (n_samples, n_dimensions), required
    The input samples.

Returns:

bic : float
    The lower the better.

Source code in noloox/mixture/tmm.py
def bic(self, X):
    """Bayesian information criterion for the current model on the input X.

    Parameters
    ----------
    X: array of shape (n_samples, n_dimensions)
        The input samples.

    Returns
    -------
    bic: float
        The lower the better.
    """
    self.check_model()
    x = self.check_inputs(X)
    score = self.score(x, perform_model_checks=False)
    n_params = self.get_num_parameters()
    return n_params * np.log(x.shape[0]) - 2 * score * x.shape[0]
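
A hedged sketch of order selection with these criteria, shown for BIC (AIC is analogous); select_by_bic is a hypothetical helper, not part of noloox, and X stands for any (n_samples, n_features) array:

# Hypothetical helper: pick n_components by BIC (lower is better).
from noloox.mixture import StudentsTMixture

def select_by_bic(X, candidates=(1, 2, 3, 4, 5), random_state=0):
    scores = {}
    for k in candidates:
        model = StudentsTMixture(n_components=k, random_state=random_state).fit(X)
        scores[k] = model.bic(X)
    # Return the candidate with the smallest BIC, plus all scores.
    return min(scores, key=scores.get), scores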

fit(X, y=None)

Estimate model parameters with the EM algorithm.

Parameters:

X : array-like of shape (n_samples, n_features), required
    List of n_features-dimensional data points. Each row corresponds to a single data point.
y : Ignored, default=None
    Not used, present for API consistency by convention.

Returns:

self : StudentsTMixture
    The fitted mixture.

Source code in noloox/mixture/tmm.py
def fit(self, X, y=None):
    """Estimate model parameters with the EM algorithm.

    Parameters
    ----------
    X: array-like of shape (n_samples, n_features)
        List of n_features-dimensional data points. Each row corresponds to a single data point.

    y: Ignored
        Not used, present for API consistency by convention.

    Returns
    -------
    self: StudentsTMixture
        The fitted mixture.
    """
    self.df_ = np.full((self.n_components), self.df, dtype=np.float64)
    loc_, scale_, mix_weights_, scale_cholesky_ = self._init_params(X)
    lower_bound = -np.inf
    sq_maha_dist = np.empty((X.shape[0], self.n_components))
    self.converged_ = True

    @jax.jit
    def step(loc_, mix_weights_, scale_cholesky_, scale_, sq_maha_dist):
        resp, E_gamma, current_bound = self.e_step(
            X, self.df_, loc_, scale_cholesky_, mix_weights_, sq_maha_dist
        )
        mix_weights_, loc_, scale_, scale_cholesky_ = self.m_step(
            X,
            resp,
            E_gamma,
            scale_,
            scale_cholesky_,
            self.df_,
            self.reg_covar,
        )
        return (
            loc_,
            mix_weights_,
            scale_cholesky_,
            scale_,
            sq_maha_dist,
            current_bound,
        )

    for n_iter in trange(self.max_iter, desc="Running EM"):
        loc_, mix_weights_, scale_cholesky_, scale_, sq_maha_dist, current_bound = (
            step(
                loc_,
                mix_weights_,
                scale_cholesky_,
                scale_,
                sq_maha_dist,
            )
        )
        change = current_bound - lower_bound
        if abs(change) < self.tol:
            break
        lower_bound = current_bound
    else:
        self.converged_ = False
        warnings.warn(
            "tDistributionMixture did not converge, consider increasing the number of iterations."
        )
    self.n_iter_ = n_iter
    self.weights_ = mix_weights_
    self.means_ = loc_
    self.scale_cholesky_ = scale_cholesky_
    self.scale_ = scale_
    return self
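
A short sketch of the convergence diagnostics that fit sets, under the same assumptions as the earlier usage sketch (X is an (n_samples, n_features) array):

from noloox.mixture import StudentsTMixture

model = StudentsTMixture(n_components=3, max_iter=2000, tol=1e-6).fit(X)
if not model.converged_:
    # The loop exhausted max_iter; retry with a larger budget or a looser tol.
    print(f"EM stopped after {model.n_iter_} iterations without converging")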

fit_predict(X, y=None)

Estimate model parameters using X and predict the labels for X.

Parameters:

X : array-like of shape (n_samples, n_features), required
    List of n_features-dimensional data points. Each row corresponds to a single data point.
y : Ignored, default=None
    Not used, present for API consistency by convention.

Returns:

labels : array of shape (n_samples,)
    Component labels.

Source code in noloox/mixture/tmm.py
def fit_predict(self, X, y=None):
    """Estimate model parameters using X and predict the labels for X.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features)
        List of n_features-dimensional data points. Each row
        corresponds to a single data point.

    y : Ignored
        Not used, present for API consistency by convention.

    Returns
    -------
    labels : array, shape (n_samples,)
        Component labels.
    """
    return self.fit(X).predict(X)
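
fit_predict is shorthand for chaining fit and predict on the same data; a quick illustration under the earlier assumptions:

from noloox.mixture import StudentsTMixture

# The one-liner and the two-step form produce the same labels by construction.
labels_a = StudentsTMixture(n_components=2, random_state=0).fit_predict(X)
labels_b = StudentsTMixture(n_components=2, random_state=0).fit(X).predict(X)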

predict(X)

Predict the labels for the data samples in X using the trained model.

Parameters:

X : array-like of shape (n_samples, n_features), required
    List of n_features-dimensional data points. Each row corresponds to a single data point.

Returns:

labels : array of shape (n_samples,)
    Component labels.

Source code in noloox/mixture/tmm.py
def predict(self, X):
    """Predict the labels for the data samples in X using trained model.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features)
        List of n_features-dimensional data points. Each row
        corresponds to a single data point.

    Returns
    -------
    labels : array, shape (n_samples,)
        Component labels.
    """
    return np.argmax(self.predict_proba(X), axis=1)

predict_proba(X)

Predict the posterior probability of each component for each sample.

Parameters:

X : array-like of shape (n_samples, n_features), required
    List of n_features-dimensional data points. Each row corresponds to a single data point.

Returns:

resp : array of shape (n_samples, n_components)
    Posterior probability of each Student's t component for each sample in X.

Source code in noloox/mixture/tmm.py
def predict_proba(self, X):
    """Evaluate the components' density for each sample.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features)
        List of n_features-dimensional data points. Each row
        corresponds to a single data point.

    Returns
    -------
    resp : array, shape (n_samples, n_components)
        Posterior probability of each Student's t component for each sample in X.
    """
    sq_maha_dist = sq_maha_distance(X, self.means_, self.scale_cholesky_)
    loglik = get_loglikelihood(
        X, sq_maha_dist, self.df_, self.scale_cholesky_, self.weights_
    )
    weighted_loglik = loglik + np.log(self.weights_)[np.newaxis, :]
    with np.errstate(under="ignore"):
        loglik = weighted_loglik - logsumexp(weighted_loglik, axis=1)[:, np.newaxis]
    return np.exp(loglik)
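
A quick sanity check, assuming a fitted model and data X as in the earlier sketches: the returned responsibilities are normalized posteriors, so each row sums to one and their argmax reproduces predict:

import numpy as np

resp = model.predict_proba(X)                  # (n_samples, n_components)
assert np.allclose(resp.sum(axis=1), 1.0)      # rows are normalized posteriors
assert np.array_equal(resp.argmax(axis=1), model.predict(X))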

score(X, y=None)

Compute the per-sample average log-likelihood of the given data X.

Parameters:

X : array-like of shape (n_samples, n_dimensions), required
    List of n_features-dimensional data points. Each row corresponds to a single data point.
y : Ignored, default=None
    Not used, present for API consistency by convention.

Returns:

log_likelihood : float
    Per-sample average log-likelihood of X under the Student's t mixture model.

Source code in noloox/mixture/tmm.py
def score(self, X, y=None):
    """Compute the per-sample average log-likelihood of the given data X.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_dimensions)
        List of n_features-dimensional data points. Each row
        corresponds to a single data point.

    y : Ignored
        Not used, present for API consistency by convention.

    Returns
    -------
    log_likelihood : float
        Per-sample average log-likelihood of `X` under the Student's t mixture model.
    """
    return np.mean(self.score_samples(X))
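
By construction, score is the mean of score_samples; a one-line check under the same assumptions as above:

import numpy as np

assert np.isclose(model.score(X), np.mean(model.score_samples(X)))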

score_samples(X)

Compute the log-likelihood of each sample.

Parameters:

X : array-like of shape (n_samples, n_features), required
    List of n_features-dimensional data points. Each row corresponds to a single data point.

Returns:

log_prob : array of shape (n_samples,)
    Log-likelihood of each sample in X under the current model.

Source code in noloox/mixture/tmm.py
def score_samples(self, X):
    """Compute the log-likelihood of each sample.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features)
        List of n_features-dimensional data points. Each row
        corresponds to a single data point.

    Returns
    -------
    log_prob : array, shape (n_samples,)
        Log-likelihood of each sample in `X` under the current model.
    """
    sq_maha_dist = sq_maha_distance(X, self.means_, self.scale_cholesky_)
    loglik = get_loglikelihood(
        X, sq_maha_dist, self.df_, self.scale_cholesky_, self.weights_
    )
    weighted_loglik = loglik + np.log(self.weights_)[np.newaxis, :]
    return logsumexp(weighted_loglik, axis=1)
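
A common use of per-sample log-likelihoods is flagging outliers; a hedged sketch, assuming a fitted model and data X as above, where the 1% cutoff is an arbitrary illustrative choice:

import numpy as np

log_prob = model.score_samples(X)
threshold = np.percentile(log_prob, 1.0)  # arbitrary 1% cutoff
outliers = X[log_prob < threshold]        # samples the mixture finds least likely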