.. _api:opt: Optimization ================================== .. contents:: :depth: 2 :local: This section includes several tools which form basic building blocks for other higher level solvers. We provide generators for: * Indicator functions * Projector function * Proximal operators for functions which are prox capable * Smooth functions for which gradients can be computed Most of the descriptions focus on convex sets :math:`C \subset \RR^n` and convex functions with signatures :math:`\RR^n \to \RR`. However, the implementations generalize well for the complex vector space :math:`\CC^n` with the real inner product :math:`\langle x, y \rangle = \Re(x^H y)`. Also, while we describe the math for vectors :math:`x \in \RR^n`, the memory layout of these vectors could be in the form of multi-dimensional arrays too. .. currentmodule:: cr.sparse.opt .. _api:opt:indicators: Indicator Function Generators ------------------------------ Let :math:`C` be a subset of :math:`\RR^n`. An indicator function :math:`I_C(x)` is defined as :math:`I_C(x) = 0` for every :math:`x \in C`. For our purposes, its extended-value extension is relevant which is defined as: .. math:: I_C(x) = \begin{cases} 0 & \text{if } x \in C \\ \infty & x \notin C \end{cases} An *indicator function generator* creates an indicator function based on the specification of the set :math:`C`. For example if :math:`C` is a Euclidean ball, then the specification of :math:`C` is based on its center and radius. .. autosummary:: :toctree: _autosummary indicator_zero indicator_singleton indicator_affine indicator_box indicator_box_affine indicator_conic indicator_l1_ball indicator_l2_ball .. _api:opt:projectors: Projection Function Generators --------------------------------- A key idea in covex optimization is Projection on Convex Sets (POCS). Given a convex set :math:`C \subset \RR^n` and a point :math:`x \in \RR^n`, the projection on to convex set :math:`C` is defined as : .. math:: P_C(x) = \text{arg} \min_{v \in C} \| x - v \|_2. i.e. find the point :math:`v` in :math:`C` which is closest to :math:`x`. If :math:`x` is inside :math:`C`, then the projection is :math:`x` itself. In general, computation of the projection for arbitrary convex sets can be very hard. However, for specific classes of convex sets, the projection can be computed efficiently. Here we provide projection function generators for some classes of convex sets. A *projection function generator* creates a projection function based on the specification of the set :math:`C`. .. autosummary:: :toctree: _autosummary proj_identity proj_zero proj_singleton proj_affine proj_box proj_conic proj_l1_ball proj_l2_ball .. _api:opt:smooth: Smooth Function Generators ---------------------------------- The *smoothness* of a function is a property measured by the number of continuous derivatives it has over some domain. We are concerned with convex functions :math:`f : \RR^n \to \RR` defined over convex sets :math:`C \in \RR^n` and consider them to be smooth if the gradient :math:`g = \nabla f : \RR^n \to \RR^n` exists over the interior of its domain. For a smooth convex function :math:`f`, we need to be able to * Compute the value :math:`f(x)` at :math:`x \in \RR^n` using the extended-value extension (i.e. :math:`f(x) = \infty \text{if} x \notin C`. * Compute the gradient :math:`g = \nabla f` at :math:`x`, :math:`g(x)` * Compute the pair :math:`g(x), f(x)` at :math:`x` While JAX provides automatic gradient computation capability, it doesn't match exactly at the boundaries of the domain :math:`C` of :math:`f` for many of our functions of interest. Hence, we choose to provide our hand-coded gradient implementations. Sometimes, the gradient computation :math:`g(x)` and function computation :math:`f(x)` have common parts, and this can be exploited in improving the algorithm efficiency. Hence, we provide an ability to compute the pair too if an algorithm desires to compute them together efficiently. We represent smooth functions by the type :py:class:`cr.sparse.opt.SmoothFunction`. .. autosummary:: :toctree: _autosummary :nosignatures: :template: namedtuple.rst SmoothFunction Let `op` be a variable of type :py:class:`cr.sparse.opt.SmoothFunction` which represents some smooth function :math:`f`. Then: * `op.func(x)` returns the function value :math:`f(x)`. * `op.grad(x)` returns the gradient of function :math:`g(x) = \nabla f(x)`. * `op.grad_val(x)` returns the pair :math:`(g(x), f(x))`. A *smooth function generator* creates an instance of :py:class:`cr.sparse.opt.SmoothFunction` based on the specification of the smooth function :math:`f`. Available smooth function generators: .. autosummary:: :toctree: _autosummary smooth_constant smooth_entropy smooth_huber smooth_linear smooth_entropy smooth_logdet smooth_quad_matrix smooth_quad_error Operations on smooth functions: .. autosummary:: :toctree: _autosummary smooth_func_translate We provide tools to build your own smooth functions. - You can provide just the definition of the smooth function and we will use JAX to compute the gradient. - You can provide definitions of the function :math:`f(x)` and the gradient :math:`g(x)`. - You can provide definitions of :math:`f(x)`, :math:`g(x)` as well as an efficient routine to compute the pair: :math:`(g(x), f(x))`. .. autosummary:: :toctree: _autosummary smooth_build smooth_build2 smooth_build3 Utilities: .. autosummary:: :toctree: _autosummary build_grad_val_func smooth_value_grad .. _api:opt:proximal: Proximal Operator Generators ---------------------------------- The *proximal operator* for a function :math:`f` is defined as .. math:: p_f(x, t) = \text{arg} \min_{z \in \RR^n} f(x) + \frac{1}{2t} \| z - x \|_2^2 The proximal operator :math:`p` is a mapping from :math:`(\RR^n, \RR) \to \RR^n` which maps a vector :math:`x` to its proximal vector :math:`z = p_f(x,t)` where :math:`t` can be thought of as a step size for computing the proximal vector. The proximal operator can be thought of as a generalization of the projection operator. If :math:`f` is an indicator function for some convex set :math:`C`, then the proximal operator is nothing but the projection function onto the set :math:`C`. If we think of :math:`f` as a cost function over :math:`\RR^n` then indicator functions are possibly the toughest cost functions. For smooth functions, the proximal operator reduces to the gradient step. We informally call a function as *prox-capable* if its proximal operator can be computed efficiently. For a prox-capable function :math:`f`, in typical proximal algorithms, we need to be able to * Compute the value :math:`f(x)` at :math:`x \in \RR^n` using the extended-value extension * Compute the proximal vector for the step size :math:`t` as :math:`p_f(x, t)` * Compute the pair :math:`p_f(x, t), f(p_f(x, t))` Note that in the last case, we first compute :math:`z = p_f(x, t)` and then compute the value :math:`v = f(z)`. We represent *prox-capable* functions by the type :py:class:`cr.sparse.opt.ProxCapable`. .. autosummary:: :toctree: _autosummary :nosignatures: :template: namedtuple.rst ProxCapable Let `op` be a variable of type :py:class:`cr.sparse.opt.ProxCapable` which represents some prox-capable function :math:`f`. Then: * `op.func(x)` returns the function value :math:`f(x)`. * `op.prox_op(x)` returns the proximal vector for a step size: :math:`z = p_f(x, t)`. * `op.prox_vec_val(x)` returns the pair :math:`z,v = p_f(x, t), f(z)`. A *proximal operator generator* creates an instance of :py:class:`cr.sparse.opt.ProxCapable` based on the specification of the prox-capable function :math:`f`. Available proximal operator generators: .. autosummary:: :toctree: _autosummary prox_zero prox_l1 prox_l2 prox_l1_pos prox_l1_ball prox_owl1 You can build your own :py:class:`cr.sparse.opt.ProxCapable` wrappers by providing the definition of the function :math:`f(x)` and its proximal operator :math:`p_f(x, t)`. .. autosummary:: :toctree: _autosummary prox_build build_from_ind_proj Simpler Projection Functions ---------------------------------- .. autosummary:: :toctree: _autosummary project_to_ball project_to_box project_to_real_upper_limit Shrinkage --------------------- .. autosummary:: :toctree: _autosummary shrink Conjugate Gradient Methods ---------------------------------- .. rubric:: Normal Conjugate Gradients on Matrices .. autosummary:: :toctree: _autosummary cg.solve_from cg.solve_from_jit cg.solve cg.solve_jit .. rubric:: Preconditioned Normal Conjugate Gradients on Linear Operators These are more general purpose. .. autosummary:: :toctree: _autosummary pcg.solve_from pcg.solve_from_jit pcg.solve pcg.solve_jit