`sklvq.solvers`.SteepestGradientDescent

class sklvq.solvers.SteepestGradientDescent(objective: sklvq.objectives._base.ObjectiveBaseClass, max_runs: int = 10, batch_size: int = 1, step_size: Union[float, numpy.ndarray] = 0.1, callback: Optional[callable] = None)

Steepest gradient descent (SGD)

Implements the steepest gradient descent optimization method. Can perform stochastic, mini-batch and batch gradient descent by changing the batch_size. Implementation is inspired by the description given in [1].

The algorithm updates the model parameters ($\mathbf{\theta}$) once per batch, as given below. When the batch_size ($M$) is smaller than the total number of samples in the data, this update is repeated multiple times per step, once for each batch.

$\mathbf{\theta} = \mathbf{\theta} - \eta(t) \cdot \sum_{i=1}^{M} \nabla e_i(\mathbf{\theta}),$

with $\nabla e_i(\mathbf{\theta})$ the gradient of the objective function for sample $i$ given the current model parameters $\mathbf{\theta}$, and $\eta(t)$ the step size at step $t$, which is decreased using a simple annealing function:

$\eta(t) = \frac{\eta_{init}} {(1 + \frac{t}{t_{max}})},$

with $t_{max}$ given by the max_runs parameter and $\eta_{init}$ by the step_size parameter.
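The two formulas above can be illustrated with a minimal sketch in plain Python. The function names `annealed_step_size` and `batch_update` are hypothetical and not part of sklvq, and whether the library counts $t$ from 0 or from 1 is an assumption here (the sketch counts from 0).

```python
def annealed_step_size(step_size_init, t, max_runs):
    """Return eta(t) = eta_init / (1 + t / t_max).

    Hypothetical helper. Assumes t counts completed runs starting at 0.
    """
    return step_size_init / (1.0 + t / max_runs)


def batch_update(theta, per_sample_gradients, eta):
    """Apply theta = theta - eta * sum_i grad e_i(theta) for one batch.

    theta is a list of floats; per_sample_gradients holds one gradient
    (a list of the same length as theta) per sample in the batch.
    """
    summed = [
        sum(gradient[j] for gradient in per_sample_gradients)
        for j in range(len(theta))
    ]
    return [p - eta * s for p, s in zip(theta, summed)]


# Step size at the first run and after max_runs runs (here max_runs = 10).
first = annealed_step_size(0.1, t=0, max_runs=10)
last = annealed_step_size(0.1, t=10, max_runs=10)

# One update for a batch of two samples with made-up gradients.
new_theta = batch_update([1.0, 2.0], [[1.0, 0.0], [1.0, 2.0]], eta=0.5)
```

Under this schedule the step size halves by the time $t$ reaches $t_{max}$, so early runs move the parameters more than late runs.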

Parameters

objective: ObjectiveBaseClass, required

This is set by the algorithm. See sklvq.models.GLVQ, sklvq.models.GMLVQ, and sklvq.models.LGMLVQ.

max_runs: int

Maximum number of runs/epochs that will be computed. Should be >= 1. Early stopping can be implemented by providing a callback function that returns True when the solver should stop.

batch_size: int

Controls the batch size and accepts a value >= 0. The value is the number of samples in each batch. A batch_size of 1 corresponds to stochastic gradient descent. A batch_size of 0 selects batch gradient descent, which uses all the samples. Any value with 1 < batch_size < n_samples gives mini-batch gradient descent.

If the number of samples is not divisible by the specified batch size, the last batch may contain fewer samples than specified.

The data is always shuffled before it is split into batches.

step_size: float or ndarray

The step size to control the learning rate of the model parameters. If the same step size should be used for all parameters (e.g., prototypes and omega) then a single float is sufficient. If separate initial step sizes should be used per model parameter then this should be specified by using a numpy array.

callback: callable

Callable with signature callable(state). If the callable returns True the solver will stop even if max_runs is not reached yet. The state object contains the following:

“variables”
Concatenated 1D ndarray of the model’s parameters
“nit”
The current iteration counter
“fun”
The objective cost
“step_size”
The current step_size(s)
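As an illustration of early stopping, the following is a minimal sketch of a callback that stops once the objective cost no longer changes by more than a tolerance. The class name `StopOnPlateau` and the `tol` parameter are hypothetical, and the sketch assumes the state object can be indexed by the keys listed above, like a dict.

```python
class StopOnPlateau:
    """Hypothetical callback: stop when the cost changes by less than tol."""

    def __init__(self, tol=1e-4):
        self.tol = tol
        self.previous_cost = None

    def __call__(self, state):
        # Assumption: state behaves like a dict with the key "fun".
        cost = state["fun"]
        stop = (
            self.previous_cost is not None
            and abs(self.previous_cost - cost) < self.tol
        )
        self.previous_cost = cost
        # Returning True asks the solver to stop before max_runs is reached.
        return stop


# Simulated sequence of states, standing in for what a solver would pass.
callback = StopOnPlateau(tol=0.01)
decisions = [callback({"fun": cost}) for cost in (1.0, 0.5, 0.499)]
```

A callable object is used rather than a plain function because the callback has to remember the cost from the previous call.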

References

[1] LeKander, M., Biehl, M., & De Vries, H. (2017). “Empirical evaluation of gradient methods for matrix learning vector quantization.” 12th International Workshop on Self-Organizing Maps and Learning Vector Quantization, Clustering and Data Visualization, WSOM 2017.

solve(data: numpy.ndarray, labels: numpy.ndarray, model: LVQBaseClass)

Solve function that gets called by the fit method of the models.

Performs the steps of the steepest gradient descent optimization method.

Parameters

data: ndarray of shape (n_samples, n_features)

The data.

labels: ndarray of size (n_samples)

The labels of the samples in the data.

model: LVQBaseClass

The initial model that will also hold the final result.
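The overall procedure described on this page (shuffle, split into batches, update per batch, anneal the step size per run, check the callback) can be sketched as follows. This is not the library's implementation. It is a self-contained toy in plain Python that uses a scalar parameter and the made-up per-sample objective $e_i(\theta) = (\theta - x_i)^2$ instead of an LVQ objective. The names `make_batches` and `solve_sketch`, the `seed` argument, and the convention that "nit" counts completed runs are all assumptions.

```python
import random


def make_batches(n_samples, batch_size, rng):
    """Shuffle the sample indices and split them into batches.

    batch_size == 0 means one batch with all samples. The last batch
    may be smaller when n_samples is not divisible by batch_size.
    """
    indices = list(range(n_samples))
    rng.shuffle(indices)
    if batch_size == 0:
        batch_size = n_samples
    return [indices[i:i + batch_size] for i in range(0, n_samples, batch_size)]


def solve_sketch(samples, theta, max_runs=10, batch_size=1,
                 step_size=0.1, callback=None, seed=0):
    """Toy steepest gradient descent on e_i(theta) = (theta - x_i)^2."""
    rng = random.Random(seed)
    for t in range(max_runs):
        # Annealed step size: eta(t) = eta_init / (1 + t / t_max).
        eta = step_size / (1.0 + t / max_runs)
        for batch in make_batches(len(samples), batch_size, rng):
            # Sum of the per-sample gradients 2 * (theta - x_i) in the batch.
            gradient_sum = sum(2.0 * (theta - samples[i]) for i in batch)
            theta = theta - eta * gradient_sum
        cost = sum((theta - x) ** 2 for x in samples)
        state = {"variables": theta, "nit": t + 1, "fun": cost, "step_size": eta}
        if callback is not None and callback(state):
            break  # early stopping requested by the callback
    return theta


# Batch gradient descent (batch_size=0) on three samples; the minimum is at 2.0.
result = solve_sketch([1.0, 2.0, 3.0], theta=0.0, batch_size=0)

# The same problem, stopped by a callback after the first run.
stopped = solve_sketch([1.0, 2.0, 3.0], theta=0.0, batch_size=0,
                       callback=lambda state: state["nit"] >= 1)
```

In the toy problem the minimizer is the mean of the samples, which makes it easy to check that the loop converges toward it within the default 10 runs.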
