sklvq.solvers.SteepestGradientDescent
class sklvq.solvers.SteepestGradientDescent(objective: sklvq.objectives._base.ObjectiveBaseClass, max_runs: int = 10, batch_size: int = 1, step_size: Union[float, numpy.ndarray] = 0.1, callback: Optional[callable] = None)
Steepest gradient descent (SGD)
Implements the steepest gradient descent optimization method. Can perform stochastic, mini-batch and batch gradient descent by changing the batch_size. Implementation is inspired by the description given in [1].
The algorithm performs the following update of the model parameters ($\mathbf{\theta}$) per batch. This process is repeated multiple times (per step) when the batch_size ($M$) is smaller than the total number of samples in the data.
$$\mathbf{\theta}_{t+1} = \mathbf{\theta}_t - \eta_t \cdot \sum_{i=1}^{M} \nabla e_i(\mathbf{\theta}_t)$$
with $\nabla e_i$ the gradient of the objective function with respect to a sample given the current model parameters $\mathbf{\theta}_t$, and $\eta_t$ the step size at step $t$, which is changed using a simple annealing function:
$$\eta_t = \frac{\eta_0}{1 + t / t_{max}}$$
with $t_{max}$ given by the max_runs parameter and $\eta_0$ by the step_size parameter.
- Parameters
- objective: ObjectiveBaseClass, required
This is set by the algorithm. See sklvq.models.GLVQ, sklvq.models.GMLVQ, and sklvq.models.LGMLVQ.
- max_runs: int
Maximum number of runs (epochs) that will be computed. Should be >= 1. Early stopping can be implemented by providing a callback function that returns True when the solver should stop.
- batch_size: int
Controls the batch size and accepts a value >= 0. The value is the number of samples in each batch. A batch_size of 1 corresponds to stochastic gradient descent, and 0 selects batch gradient descent, which uses all samples at once. Any value with 1 < batch_size < n_samples gives mini-batch gradient descent.
If the data cannot be divided evenly into batches of the specified size, the last batch may contain fewer samples than specified.
The data is always shuffled before it is split into batches.
- step_size: float or ndarray
The step size to control the learning rate of the model parameters. If the same step size should be used for all parameters (e.g., prototypes and omega) then a single float is sufficient. If separate initial step sizes should be used per model parameter then this should be specified by using a numpy array.
- callback: callable
Callable with signature callable(state). If the callable returns True, the solver stops even if max_runs has not been reached yet. The state object contains the following:
- “variables”
Concatenated 1D ndarray of the model’s parameters
- “nit”
The current iteration counter
- “fun”
The objective cost
- “step_size”
The current step_size(s)
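As an illustration of the callback contract, here is a minimal early-stopping sketch. It assumes the state can be indexed like a dict with the keys listed above and that a lower "fun" value is better. The helper name make_early_stopping and its tol and patience arguments are hypothetical and not part of sklvq.

```python
def make_early_stopping(tol=1e-4, patience=3):
    """Build a callback that stops after `patience` calls without improvement.

    Hypothetical helper: assumes state["fun"] is the objective cost and
    that a smaller cost is better.
    """
    best = float("inf")
    stale = 0

    def callback(state):
        nonlocal best, stale
        cost = state["fun"]
        if best - cost > tol:
            # The cost improved by more than tol: remember it and reset.
            best = cost
            stale = 0
        else:
            stale += 1
        # Returning True asks the solver to stop before max_runs is reached.
        return stale >= patience

    return callback
```

The returned function can then be supplied as the callback parameter of the solver. Keeping the counters in a closure means each call to make_early_stopping yields an independent callback.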
References
[1] LeKander, M., Biehl, M., & De Vries, H. (2017). “Empirical evaluation of gradient methods for matrix learning vector quantization.” 12th International Workshop on Self-Organizing Maps and Learning Vector Quantization, Clustering and Data Visualization, WSOM 2017.
solve(data: numpy.ndarray, labels: numpy.ndarray, model: LVQBaseClass)
Solve function that gets called by the fit method of the models.
Performs the steps of the steepest gradient descent optimization method.
- Parameters
- data: ndarray of shape (n_samples, n_features)
The data.
- labels: ndarray of shape (n_samples,)
The labels of the samples in the data.
- model: LVQBaseClass
The initial model that will also hold the final result.
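The steps that solve performs can be pictured with a small NumPy sketch. It is not the sklvq implementation: the function sgd_sketch, its grad_fn and cost_fn arguments, and the toy objective are all illustrative assumptions. It also assumes a scalar step size, an annealing schedule of the form eta_t = eta_0 / (1 + t / t_max), and a callback that is invoked once per run.

```python
import numpy as np


def sgd_sketch(grad_fn, cost_fn, theta0, data, max_runs=10, batch_size=1,
               step_size=0.1, callback=None, seed=0):
    """Toy steepest gradient descent; all names are illustrative, not sklvq's.

    grad_fn(theta, batch) must return the gradient summed over the batch.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    n_samples = data.shape[0]
    # batch_size == 0 means "use all samples" (batch gradient descent).
    size = n_samples if batch_size == 0 else batch_size

    for t in range(max_runs):
        # Assumed annealing schedule: eta_t = eta_0 / (1 + t / t_max).
        eta_t = step_size / (1.0 + t / max_runs)
        # Shuffle before splitting; the last batch may be smaller.
        order = rng.permutation(n_samples)
        for start in range(0, n_samples, size):
            batch = data[order[start:start + size]]
            theta = theta - eta_t * grad_fn(theta, batch)
        if callback is not None:
            state = {"variables": theta, "nit": t,
                     "fun": cost_fn(theta, data), "step_size": eta_t}
            if callback(state):
                break  # early stop requested by the callback
    return theta


# Toy objective: half the squared distance to each sample (minimum at the mean).
def grad(theta, batch):
    return np.sum(theta - batch, axis=0)


def cost(theta, data):
    return 0.5 * np.sum((data - theta) ** 2)


data = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
theta = sgd_sketch(grad, cost, theta0=[5.0, -3.0], data=data,
                   max_runs=50, batch_size=2, step_size=0.1)
```

With mini-batches the result hovers near the data mean rather than landing on it exactly, because each batch pulls the parameters toward its own mean; setting batch_size=0 in this sketch removes that noise.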