SciPy BFGS 32-bit Optimization Bug: A Deep Dive
Hey everyone,
I've been digging into some convergence issues with the BFGS optimization algorithm in SciPy, and I've stumbled upon something that I think is pretty important, especially when dealing with 32-bit inputs. It seems like there are some inconsistencies in how BFGS behaves in these scenarios, and I wanted to share my findings and get your thoughts.
The Issue: BFGS and 32-bit Inputs
So, the main problem I've observed is that BFGS sometimes fails to converge when using 32-bit inputs, even for well-behaved functions like the Rosenbrock function with an analytical gradient. I initially thought this might be related to a previously reported issue, but after further investigation, I believe the root cause lies in how data types are handled within the _minimize_bfgs function.
Specifically, the line pk = -np.dot(Hk, gfk) seems to be the culprit. Even if all your function values and gradients are 32-bit, this line introduces a float64 variable right from the get-go. Why? Because Hk starts out as an integer identity matrix, and np.dot of an integer array and a float32 array produces a float64 result. This leaves a mix of float32 and float64 data types throughout the rest of the code, which can cause unexpected behavior.
In [3]: np.dot(np.ones((2,2),dtype=int),np.ones(2,dtype=np.float32)).dtype
Out[3]: dtype('float64')
Even if I change the initial vector x0 to be float64, BFGS still struggles to converge to the expected solution of [1, 1]. However, if I modify the problematic line to explicitly cast the result back to the original data type (pk = -np.dot(Hk, gfk).astype(gfk.dtype)), the code converges. This suggests that the data type conversion is indeed playing a significant role.
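For reference, here is a small standalone check (outside SciPy; Hk and gfk are just stand-in arrays mirroring the names used in _minimize_bfgs) showing how the promotion happens and how the explicit cast keeps everything in float32:
import numpy as np

# Stand-ins for the internal BFGS state: an integer identity matrix
# (as in the current SciPy code) and a float32 gradient.
Hk = np.eye(2, dtype=int)
gfk = np.ones(2, dtype=np.float32)

pk = -np.dot(Hk, gfk)
print(pk.dtype)  # float64 -- the int/float32 product gets promoted

pk = -np.dot(Hk, gfk).astype(gfk.dtype)
print(pk.dtype)  # float32 -- the explicit cast preserves the input dtype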
Diving Deeper: More 32-bit Challenges
Now, even with the above fix, I've noticed that purely 32-bit code still doesn't always converge. One reason for this is the use of amin/amax with values like 1e-100/1e100. These values can easily overflow or underflow when working with 32-bit floats, leading to runtime warnings:
/home/skoposov/pyenv_dir/pyenv312/lib/python3.12/site-packages/scipy/optimize/_dcsrch.py:324: RuntimeWarning: overflow encountered in cast
if stp > self.stpmax:
[3.9961147e+00 2.1209834e+04] 4.4917993e+10
/home/skoposov/pyenv_dir/pyenv312/lib/python3.12/site-packages/scipy/optimize/_dcsrch.py:385: RuntimeWarning: overflow encountered in cast
if stp == self.stpmax and f <= ftest and g <= self.gtest:
Changing these amin/amax values and modifying the initialization of the identity matrix I to use the correct data type (I = np.eye(N, dtype=gfk.dtype)) helps keep all variables as float32. However, even with these changes, the code still doesn't always converge. My suspicion is that BFGS sometimes hits a point of constant function value due to the limited precision of 32-bit floats.
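To see why those constants are a problem for float32, here is a standalone illustration, independent of the SciPy internals:
import numpy as np

# 1e100 / 1e-100 are comfortable for float64 but not representable in float32:
print(np.float32(1e100))   # inf (NumPy emits a RuntimeWarning: "overflow encountered in cast")
print(np.float32(1e-100))  # 0.0 -- underflows to zero

# The range that float32 can actually represent:
info = np.finfo(np.float32)
print(info.max)   # ~3.4e38
print(info.tiny)  # ~1.2e-38 (smallest positive normal value)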
In summary, here's what I've found:
- The use of np.dot without explicit type casting introduces float64 variables, even when working with 32-bit inputs.
- The amin/amax values can cause overflow/underflow issues with 32-bit floats.
- Even after addressing these issues, the limited precision of 32-bit floats can still hinder convergence.
I believe the existing code should avoid using the int type for the I matrix and should use amax/amin values that are safe for 32-bit floats. Alternatively, performing the line search in 64-bit precision might be a viable solution.
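As a rough illustration of the amin/amax point, here is one way such bounds could be derived from the working dtype instead of being hard-coded. This is only a sketch; the helper name is mine, not SciPy's, and the square-root margin is an arbitrary choice to leave headroom:
import numpy as np

def dtype_safe_step_bounds(dtype):
    """Hypothetical helper: line-search step bounds that are representable
    in `dtype`, instead of the hard-coded 1e-100 / 1e100."""
    info = np.finfo(dtype)
    # Stay well inside the representable range to leave headroom for arithmetic.
    return np.sqrt(info.tiny), np.sqrt(info.max)

print(dtype_safe_step_bounds(np.float32))  # roughly (1e-19, 2e19)
print(dtype_safe_step_bounds(np.float64))  # roughly (1.5e-154, 1.3e154)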
I'm curious to hear your thoughts on this. Have you encountered similar issues with BFGS and 32-bit inputs? Do you have any suggestions for how to address these inconsistencies?
Thanks, Sergey
Reproducing the Issue
To reproduce the issue, you can use the following code:
import numpy as np
import scipy.optimize._optimize as _optimize


def rosenbrock(x):
    """The Rosenbrock function, returning a float32 value."""
    a = 1.0
    b = 100.0
    ret = (a - x[0])**2 + b * (x[1] - x[0]**2)**2
    print(x, ret)
    return ret.astype(np.float32)


def rosenbrock_grad(x):
    """The analytical gradient of the Rosenbrock function, as a float32 array."""
    a = 1.0
    b = 100.0
    grad_x0 = 4 * b * x[0]**3 - 4 * b * x[0] * x[1] + 2 * x[0] - 2 * a
    grad_x1 = 2 * b * (x[1] - x[0]**2)
    return np.array([grad_x0, grad_x1], dtype=np.float32)


x0 = np.array([3.0, 21210.0], dtype=np.float32)
print(x0.dtype)
print(
    _optimize._minimize_bfgs(
        rosenbrock,
        x0,
        jac=rosenbrock_grad,
        maxiter=100000,
        gtol=1e-3))
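For comparison (this is not part of the report above, just a sanity check), one can run the same starting point through the public API entirely in float64, using SciPy's built-in Rosenbrock function and gradient. If the dtype handling is indeed the culprit, this run should end up near the known minimum:
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

# Same starting point, but everything stays in float64.
x0_64 = np.array([3.0, 21210.0], dtype=np.float64)
res = minimize(rosen, x0_64, jac=rosen_der, method="BFGS",
               options={"maxiter": 100000, "gtol": 1e-3})
print(res.x)  # expected to be close to [1., 1.]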
Expected Behavior
Ideally, the BFGS optimization should converge to the minimum of the Rosenbrock function, which is [1, 1], even when using 32-bit inputs. However, as described above, the code may fail to converge or produce inaccurate results due to the data type issues and potential overflow/underflow problems.
No Error Message
The code may not produce an explicit error message, but the optimization process may terminate prematurely or converge to an incorrect solution. You may also see runtime warnings related to overflow or underflow.
System Information
The issue was observed with the following system information:
SciPy: 1.16.3, NumPy: 2.3.4, Python: sys.version_info(major=3, minor=12, micro=3, releaselevel='final', serial=0)
Build Dependencies:
blas:
detection method: pkgconfig
found: true
include directory: /opt/_internal/cpython-3.12.11/lib/python3.12/site-packages/scipy_openblas32/include
lib directory: /opt/_internal/cpython-3.12.11/lib/python3.12/site-packages/scipy_openblas32/lib
name: scipy-openblas
openblas configuration: OpenBLAS 0.3.29.dev DYNAMIC_ARCH NO_AFFINITY Haswell MAX_THREADS=64
pc file directory: /project
version: 0.3.29.dev
lapack:
detection method: pkgconfig
found: true
include directory: /opt/_internal/cpython-3.12.11/lib/python3.12/site-packages/scipy_openblas32/include
lib directory: /opt/_internal/cpython-3.12.11/lib/python3.12/site-packages/scipy_openblas32/lib
name: scipy-openblas
openblas configuration: OpenBLAS 0.3.29.dev DYNAMIC_ARCH NO_AFFINITY Haswell MAX_THREADS=64
pc file directory: /project
version: 0.3.29.dev
pybind11:
detection method: config-tool
include directory: unknown
name: pybind11
version: 3.0.1
Compilers:
c:
commands: cc
linker: ld.bfd
name: gcc
version: 10.2.1
c++:
commands: c++
linker: ld.bfd
name: gcc
version: 10.2.1
cython:
commands: cython
linker: cython
name: cython
version: 3.1.6
fortran:
commands: gfortran
linker: ld.bfd
name: gcc
version: 10.2.1
pythran:
include directory: ../../tmp/build-env-99e64tom/lib/python3.12/site-packages/pythran
version: 0.18.0
Machine Information:
build:
cpu: x86_64
endian: little
family: x86_64
system: linux
cross-compiled: false
host:
cpu: x86_64
endian: little
family: x86_64
system: linux
Python Information:
path: /tmp/build-env-99e64tom/bin/python
version: '3.12'
Environment Details
The versions are as follows:
- SciPy: 1.16.3
- NumPy: 2.3.4
- Python: 3.12
Potential Solutions and Workarounds
Based on the analysis above, here are some possible solutions and workarounds:
- Explicit Type Casting: Ensure that the results of operations involving mixed data types are explicitly cast back to the desired data type (e.g., float32) to prevent unintended promotions to float64.
- Adjusting amin/amax Values: Modify the amin/amax values to avoid overflow/underflow when working with 32-bit floats. This means using bounds that fall within the representable range of float32.
- Data Type Consistency: Maintain consistent data types throughout the optimization. If 32-bit precision is sufficient, ensure that all variables and operations stay in float32.
- 64-bit Line Search: Consider performing the line search in 64-bit precision to improve accuracy and stability, especially for functions that are sensitive to numerical errors (a user-side sketch along these lines follows this list).
- Check the Identity Matrix: Make sure the identity matrix I is not created with an int dtype; it should use the same dtype as the other arrays.
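Until something along these lines lands in SciPy itself, a user-side workaround combining the "Explicit Type Casting" and "64-bit Line Search" ideas might look like the sketch below. The wrapper name is made up, not a SciPy API; it simply lets the optimizer work in float64 internally while the user's float32 function and gradient stay untouched. Note that this avoids the dtype-mixing and overflow issues, but it cannot recover precision that a float32 objective does not provide.
import numpy as np
from scipy.optimize import minimize

def minimize_bfgs_float32(fun, x0, jac, **kwargs):
    """Hypothetical workaround: run BFGS in float64 internally while the
    user-facing function and gradient keep working in float32, then cast
    the solution back. A sketch, not an official SciPy API."""
    fun64 = lambda x: np.float64(fun(x.astype(np.float32)))
    jac64 = lambda x: np.asarray(jac(x.astype(np.float32)), dtype=np.float64)
    res = minimize(fun64, np.asarray(x0, dtype=np.float64),
                   jac=jac64, method="BFGS", **kwargs)
    res.x = res.x.astype(np.float32)
    return res

# Example usage with the reproduction functions above:
# minimize_bfgs_float32(rosenbrock, x0, rosenbrock_grad,
#                       options={"maxiter": 100000, "gtol": 1e-3})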
I hope this write-up helps shed some light on the challenges of using BFGS with 32-bit inputs and provides some guidance for addressing these issues. Let's work together to improve the robustness and reliability of SciPy's optimization algorithms!