Structured Radial Basis Function Network
The s-RBFN can efficiently tackle structured multiple hypotheses prediction, multimodal regression and forecasting, with a closed-form solution whose speed has not been seen before in the literature. The model uses diversity (also known as diversification) in learning as a key component of success. I think fanatics of forecasting and prediction will enjoy the content, at least some fraction of how much I enjoyed working on it.
Multi-modal regression is important when forecasting nonstationary processes or processes driven by a complex mixture of distributions. It can be tackled with multiple hypotheses frameworks, but these come with the difficulty of combining the hypotheses efficiently in a learning model.
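To make the multimodality concrete, here is a minimal illustrative sketch (not from the paper) of a toy dataset where the conditional distribution p(y|x) has two modes, so a single regressor fitting the conditional mean would land between them:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_multimodal(n=500):
    """Toy data where p(y|x) is a two-component mixture: each target is
    generated by one of two branches of sin(x), offset up or down."""
    x = rng.uniform(-3.0, 3.0, size=n)
    mode = rng.integers(0, 2, size=n)              # hidden mode per sample
    y = np.sin(x) + np.where(mode == 0, 1.5, -1.5) + 0.1 * rng.normal(size=n)
    return x, y

x, y = sample_multimodal()   # a mean regressor fit to (x, y) sits between the modes
```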
The Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems. The predictors are regression models of any type that can form centroidal Voronoi tessellations, which are a function of their losses during training. We prove that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution, and that this is equivalent to interpolating the meta-loss of the predictors, the loss being a zero set of the interpolation error.
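As a rough sketch of the loss-driven partition, assuming a simplified winner-takes-all rule in which each training point belongs to the predictor with the smallest loss on it (the paper's centroidal Voronoi construction is more involved):

```python
import numpy as np

def loss_based_cells(predictions, y):
    """predictions: (M, n) array of M hypotheses evaluated on n points;
    y: (n,) targets. Returns, for each point, the index of the predictor
    whose squared loss on that point is smallest, i.e. a partition of
    the data into loss-defined cells."""
    losses = (predictions - y[None, :]) ** 2        # (M, n) per-point losses
    return np.argmin(losses, axis=0)                # (n,) cell index per point
```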
The model admits a fixed-point iteration algorithm that alternates between the predictors and the centers of the basis functions.
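A minimal sketch of such an alternation, under strong simplifying assumptions (linear predictors, squared loss, winner-takes-all reassignment; the paper's fixed-point iteration also couples in the basis-function centers, which this sketch omits):

```python
import numpy as np

def fit_fixed_point(x, y, M=2, iters=20, seed=0):
    """Alternate until stable: refit each predictor on its current cell,
    then reassign every point to the predictor with the lowest loss."""
    rng = np.random.default_rng(seed)
    cells = rng.integers(0, M, size=len(x))         # random initial cells
    coefs = [np.zeros(2) for _ in range(M)]
    for _ in range(iters):
        for m in range(M):
            mask = cells == m
            if mask.sum() >= 2:                     # need points to fit a line
                coefs[m] = np.polyfit(x[mask], y[mask], 1)
        preds = np.stack([np.polyval(c, x) for c in coefs])  # (M, n)
        new_cells = np.argmin((preds - y) ** 2, axis=0)
        if np.array_equal(new_cells, cells):        # fixed point reached
            break
        cells = new_cells
    return coefs, cells
```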
Diversity in learning can be controlled parametrically by truncating the tessellation formation with the losses of the individual predictors. A closed-form least-squares solution is presented which, to the authors' knowledge, is the fastest in the literature for multiple hypotheses and structured prediction. Superior generalization performance and computational efficiency are achieved using only two-layer neural networks as predictors, with diversity controlled as a key component of success.
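For intuition on why the closed-form step is fast, a sketch of ridge-regularized least squares over Gaussian basis functions; the centers and width here are placeholder assumptions, whereas in the s-RBFN they come from the predictors and the tessellation:

```python
import numpy as np

def gaussian_design(x, centers, gamma=1.0):
    """Phi[i, m] = exp(-gamma * (x_i - c_m)^2)."""
    return np.exp(-gamma * (x[:, None] - centers[None, :]) ** 2)

def closed_form_weights(x, y, centers, gamma=1.0, lam=1e-6):
    """Output weights by regularized least squares,
    w = (Phi^T Phi + lam*I)^(-1) Phi^T y: a single linear solve,
    no iterative optimization."""
    Phi = gaussian_design(x, centers, gamma)
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)
```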
A gradient-descent approach is also introduced which is loss-agnostic with respect to the predictors.
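A hedged sketch of what loss-agnostic can mean in practice: if the loss is only available as a black-box callable, finite differences give gradients without knowing its form (the paper's method is the authoritative version; this is just to fix ideas):

```python
import numpy as np

def numerical_grad(loss_fn, w, eps=1e-6):
    """Central-difference gradient of a black-box scalar loss at w."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (loss_fn(w + e) - loss_fn(w - e)) / (2 * eps)
    return g

def gradient_descent(loss_fn, w0, lr=0.1, steps=100):
    """Plain gradient descent; works for any scalar loss_fn(w)."""
    w = w0.astype(float).copy()
    for _ in range(steps):
        w -= lr * numerical_grad(loss_fn, w)
    return w
```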


[Figure: animation of the first 83 iterations of gradient descent. Surfaces are isosurfaces of the objective at the current guess, and arrows show the direction of descent; with a small, constant step size, convergence is slow.]
The expected value of the loss of the structured model with Gaussian basis functions is computed, with the finding that correlation between predictors is not an appropriate tool for diversification. Experiments show outperformance with respect to the top competitors in the literature.
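To make the expected-loss notion concrete, a minimal Monte Carlo sketch (the paper derives this expectation analytically for Gaussian basis functions; the data distribution and model below are illustrative assumptions):

```python
import numpy as np

def expected_loss_mc(model, sample_xy, n=100_000, seed=0):
    """Monte Carlo estimate of E[(y - model(x))^2] under the data
    distribution implied by sample_xy(rng, n) -> (x, y)."""
    rng = np.random.default_rng(seed)
    x, y = sample_xy(rng, n)
    return np.mean((y - model(x)) ** 2)

# Illustrative usage with a small Gaussian-basis model:
centers = np.linspace(-3.0, 3.0, 5)
w = np.ones(5)
model = lambda x: np.exp(-(x[:, None] - centers) ** 2) @ w
sample_xy = lambda rng, n: (rng.normal(size=n), rng.normal(size=n))
print(expected_loss_mc(model, sample_xy))
```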