Searching for an Image Background Threshold with Minimum Cross-Entropy

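Separating the foreground from the background of an image by thresholding is about as fundamental as it gets in imaging data analysis. This post uses the skimage.filters.threshold_li function as an example to introduce threshold search based on minimum cross-entropy.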

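We could preview an image's pixel histogram in ImageJ and, combined with visual inspection, set a reasonable background threshold by hand. But that only works when there is a small amount of data to analyze.

A threshold set this way also relies on personal perception, so it is only an estimate. To put the threshold on a more principled footing, I recommend skimage.filters.threshold_li, which is grounded in information theory, for obtaining a more accurate and reproducible background threshold.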

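The function searches for the threshold iteratively: each iteration moves the threshold toward the value that minimizes the cross-entropy between the original image and the binarized image.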
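As a minimal sketch of how the function is typically called, the example below thresholds a small synthetic image. The image itself (a dim background with a bright square) is an assumption made up for illustration; only threshold_li comes from scikit-image.

```python
import numpy as np
from skimage.filters import threshold_li

# Synthetic 8-bit image (an assumption for illustration): a dim background
# around 30 with a bright square around 180.
rng = np.random.default_rng(0)
image = rng.normal(30, 5, size=(128, 128))
image[32:96, 32:96] = rng.normal(180, 10, size=(64, 64))
image = np.clip(image, 0, 255).astype(np.uint8)

thresh = threshold_li(image)   # Li's minimum cross-entropy threshold
foreground = image > thresh    # boolean mask; True marks foreground pixels

print(f"threshold = {thresh:.2f}, foreground fraction = {foreground.mean():.3f}")
```

Pixels above the returned value are treated as foreground, so the boolean mask can be used directly for downstream measurements.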

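Intuitively, minimizing the cross-entropy means that the thresholded image (with each class represented by its mean gray level) differs as little as possible from the original image in terms of information. It therefore retains as much of the original image's important information as possible, and in that sense the segmentation is optimal.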
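To make the criterion concrete, the sketch below evaluates Li and Lee's criterion, eta(t) = -(m1_back * log(mu_back) + m1_fore * log(mu_fore)), for every candidate threshold of a synthetic integer image and picks the minimum by brute force. Here m1 is the sum of the intensities in a class and mu is the class mean. The helper name cross_entropy_curve and the test data are assumptions for illustration, and this is not how scikit-image searches for the threshold.

```python
import numpy as np


def cross_entropy_curve(image):
    """Return candidate thresholds and Li's criterion eta(t) for each one.

    Pixels with value <= t are background, pixels > t are foreground.
    Assumes a non-negative integer image (hypothetical helper).
    """
    values = np.arange(image.max() + 1)
    hist = np.bincount(image.ravel(), minlength=values.size).astype(float)
    thresholds = values[:-1]
    eta = np.full(thresholds.shape, np.nan)
    for k, t in enumerate(thresholds):
        h_back, h_fore = hist[: t + 1], hist[t + 1:]
        m1_back = np.sum(values[: t + 1] * h_back)  # first moment, background
        m1_fore = np.sum(values[t + 1:] * h_fore)   # first moment, foreground
        if h_back.sum() == 0 or h_fore.sum() == 0 or m1_back == 0 or m1_fore == 0:
            continue  # the log of the class mean is undefined here
        mu_back = m1_back / h_back.sum()
        mu_fore = m1_fore / h_fore.sum()
        eta[k] = -m1_back * np.log(mu_back) - m1_fore * np.log(mu_fore)
    return thresholds, eta


# Synthetic intensities (an assumption): 3000 dim pixels and 1000 bright ones.
rng = np.random.default_rng(0)
image = np.concatenate([rng.normal(40, 5, 3000), rng.normal(170, 10, 1000)])
image = np.clip(image, 1, 255).astype(np.int64)

thresholds, eta = cross_entropy_curve(image)
best = thresholds[np.nanargmin(eta)]
print("brute-force minimum cross-entropy threshold:", best)
```

When the two clusters are well separated, every threshold inside the gap between them produces the same split and the same criterion value, and np.nanargmin reports the first of them.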

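Note, however, that the method can easily get stuck in a local optimum, for example when the intensity histogram has more than two peaks. If that happens, try specifying initial_guess (either a number or a custom estimator function).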
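The sketch below shows the different forms initial_guess can take, assuming a scikit-image version whose threshold_li accepts this parameter. The trimodal test data and the specific guesses (150.0 and the 95th percentile) are assumptions chosen for illustration.

```python
import numpy as np
from skimage.filters import threshold_li, threshold_otsu

# Synthetic trimodal intensities (an assumption for illustration).
rng = np.random.default_rng(1)
image = np.concatenate([
    rng.normal(20, 3, 6000),
    rng.normal(90, 5, 3000),
    rng.normal(200, 8, 1000),
]).clip(1, 255)

results = {
    "default": threshold_li(image),  # starts from the image mean
    "scalar": threshold_li(image, initial_guess=150.0),
    "quantile": threshold_li(image, initial_guess=lambda a: np.quantile(a, 0.95)),
    "otsu": threshold_li(image, initial_guess=threshold_otsu),
}

for name, t in results.items():
    print(f"{name:>8}: {t:.2f}")
```

Comparing the thresholds reached from several starting points is a quick way to check whether the default start has settled in a local optimum for a given image.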

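To use this function with more confidence, I read its source code and found the following:

If initial_guess is not set, t_next defaults to the image mean on the first iteration.
On the first iteration, t_curr always starts from a negative number, so that it differs from t_next by more than the tolerance and the loop runs at least once.
The code does not compute the cross-entropy directly. Instead, it updates t_next with a derived formula: t_next = (mean_back - mean_fore) / (np.log(mean_back) - np.log(mean_fore)). This one had me scratching my head; it comes from Li's method.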
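The update formula is the fixed-point condition obtained by setting the derivative of the cross-entropy criterion with respect to the threshold to zero: at the optimum, the threshold equals the logarithmic mean of the two class means. The sketch below iterates just that update on a synthetic set of intensities. The function name li_fixed_point and its defaults are assumptions, and the minimum-offset and histogram handling of the scikit-image implementation are left out.

```python
import numpy as np


def li_fixed_point(values, t0, tolerance=0.5, max_iter=100):
    """Iterate t <- (mu_b - mu_f) / (log(mu_b) - log(mu_f)) until it settles.

    Hypothetical helper; assumes strictly positive intensities.
    """
    t = float(t0)
    history = [t]
    for _ in range(max_iter):
        back, fore = values[values <= t], values[values > t]
        if back.size == 0 or fore.size == 0:
            break  # one class is empty, so its mean is undefined
        mu_b, mu_f = back.mean(), fore.mean()
        if mu_b <= 0 or mu_f <= 0:
            break  # log of a non-positive mean is undefined
        t_new = float((mu_b - mu_f) / (np.log(mu_b) - np.log(mu_f)))
        history.append(t_new)
        converged = abs(t_new - t) < tolerance
        t = t_new
        if converged:
            break
    return t, history


# Synthetic intensities (an assumption): two well-separated clusters.
rng = np.random.default_rng(2)
values = np.concatenate([rng.normal(40, 5, 3000), rng.normal(170, 10, 1000)]).clip(1, None)

t, history = li_fixed_point(values, values.mean())
print("thresholds visited:", np.round(history, 2))
```

Each pass only needs the two class means, which is why the library code never has to evaluate the cross-entropy itself.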

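Since the function finds the threshold iteratively, I wondered whether I could plot the iterations to help me understand how it works. With the help of an AI, I had it modify the function's source code into a new function, threshold_li_with_history, and ran some tests. The results are as follows: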

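As you can see, the cross-entropy does decrease. Once the difference between two successive thresholds is no larger than the specified tolerance, the search ends and the function returns the current threshold.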
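The thresholds visited during the search can also be collected without modifying the library, assuming a scikit-image version whose threshold_li accepts iter_callback. The sketch below records them on synthetic data (an assumption for illustration) and prints the step sizes, which are what the stopping rule compares against the tolerance. It does not record the cross-entropy values.

```python
import numpy as np
from skimage.filters import threshold_li

# Synthetic intensities (an assumption): two well-separated clusters.
rng = np.random.default_rng(3)
image = np.concatenate([rng.normal(40, 5, 3000), rng.normal(170, 10, 1000)]).clip(1, 255)

visited = []  # thresholds reported by the callback, in order
thresh = threshold_li(image, iter_callback=visited.append)

steps = np.abs(np.diff(visited))  # change in threshold between iterations
print("thresholds visited:", np.round(visited, 2))
print("step sizes:", np.round(steps, 4))
```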

The source code of the function that collects the threshold search history is as follows:

import numpy as np
# The histogram function is needed; it comes from skimage.exposure
from skimage.exposure import histogram


def threshold_li_with_history(image, *, tolerance=None, initial_guess=None, iter_callback=None):
    """Compute threshold value by Li's iterative Minimum Cross Entropy method,
    collecting thresholds and cross-entropies during iteration.

    Parameters
    ----------
    image : (M, N[, ...]) ndarray
        Grayscale input image.
    tolerance : float, optional
        Finish the computation when the change in the threshold in an iteration
        is less than this value. By default, this is half the smallest
        difference between intensity values in ``image``.
    initial_guess : float or Callable[[array[float]], float], optional
        Li's iterative method uses gradient descent to find the optimal
        threshold. If the image intensity histogram contains more than two
        modes (peaks), the gradient descent could get stuck in a local optimum.
        An initial guess for the iteration can help the algorithm find the
        globally-optimal threshold. A float value defines a specific start
        point, while a callable should take in an array of image intensities
        and return a float value. Example valid callables include
        ``numpy.mean`` (default), ``lambda arr: numpy.quantile(arr, 0.95)``,
        or even :func:`skimage.filters.threshold_otsu`.
    iter_callback : Callable[[float], Any], optional
        A function that will be called on the threshold at every iteration of
        the algorithm.

    Returns
    -------
    threshold : float
        Upper threshold value. All pixels with an intensity higher than
        this value are assumed to be foreground.
    thresholds_history : list of float
        A list containing the threshold value calculated at each iteration,
        including the initial guess threshold.
    cross_entropies_history : list of float
        A list containing the cross-entropy value calculated at each iteration,
        including the cross-entropy for the initial guess threshold.
        Note: the recorded value is Li and Lee's criterion [1]_ per pixel,
        -(Pb*ub*log(ub) + Pf*uf*log(uf)), where Pb and Pf are the proportions
        of background and foreground pixels and ub and uf are their means,
        measured after shifting the image so that its minimum is 0.
        NaN is stored when a class is empty or has a zero mean.

    References
    ----------
    .. [1] Li C.H. and Lee C.K. (1993) "Minimum Cross Entropy Thresholding"
           Pattern Recognition, 26(4): 617-625
           :DOI:`10.1016/0031-3203(93)90115-D`
    .. [2] Li C.H. and Tam P.K.S. (1998) "An Iterative Algorithm for Minimum
           Cross Entropy Thresholding" Pattern Recognition Letters, 18(8): 771-776
           :DOI:`10.1016/S0167-8655(98)00057-9`
    .. [3] Sezgin M. and Sankur B. (2004) "Survey over Image Thresholding
           Techniques and Quantitative Performance Evaluation" Journal of
           Electronic Imaging, 13(1): 146-165
           :DOI:`10.1117/1.1631315`
    .. [4] ImageJ AutoThresholder code, http://fiji.sc/wiki/index.php/Auto_Threshold

    Examples
    --------
    >>> from skimage.data import camera
    >>> image = camera()
    >>> thresh, thresh_hist, ce_hist = threshold_li_with_history(image)
    >>> print(f"Final threshold: {thresh}")
    >>> print(f"Threshold history (first 5): {thresh_hist[:5]}")
    >>> print(f"Cross-entropy history (first 5): {ce_hist[:5]}")
    """
    # Remove nan:
    image = image[~np.isnan(image)]
    if image.size == 0:
        return np.nan, [], []

    # Make sure image has more than one value; otherwise, return that value
    if np.all(image == image.flat[0]):
        val = float(image.flat[0])
        # Return the single value as the threshold; the history holds that
        # value and NaN, because CE is not well-defined for a constant image
        return val, [val], [np.nan]

    # At this point, the image only contains np.inf, -np.inf, or valid numbers
    image = image[np.isfinite(image)]
    if image.size == 0:
        # if there are no finite values in the image, return 0. This is because
        # at this point we *know* that there are *both* inf and -inf values,
        # because inf == inf evaluates to True. We might as well separate them.
        return 0.0, [0.0], [np.nan]  # CE is not well-defined here either

    # Li's algorithm requires positive image (because of log(mean))
    # Store the minimum value to offset the threshold back at the end
    image_min = np.min(image)
    image_positive = image - image_min

    if image_positive.dtype.kind in 'iu':
        tolerance = tolerance or 0.5
    else:
        # Use float tolerance based on unique values difference
        unique_values = np.unique(image_positive)
        if unique_values.size > 1:
            tolerance = tolerance or np.min(np.diff(unique_values)) / 2
        else:
            # Only one unique finite value is left after removing inf/nan;
            # handle it like the single-value case
            val = float(unique_values[0] + image_min)
            return val, [val], [np.nan]

    # Initial estimate for iteration. See "initial_guess" in the parameter list
    if initial_guess is None:
        t_next = np.mean(image_positive)
    elif callable(initial_guess):
        # Apply the guess function to the original image values (before
        # offsetting), then convert the result to the positive image range
        t_next = float(initial_guess(image)) - float(image_min)
    elif np.isscalar(initial_guess):
        # Convert scalar initial guess to new, positive image range.
        # It is kept as a float so that a fractional guess is not truncated
        # on integer images
        t_next = float(initial_guess) - float(image_min)
        # Check if initial guess is within the positive image range
        image_positive_max = np.max(image_positive)
        if not 0 < t_next < image_positive_max:
            # Also check edge cases where max is 0 (e.g., image was all image_min)
            if not (t_next == 0 and image_positive_max == 0):
                msg = (
                    f'The initial guess for threshold_li must be within the '
                    f'range of the image. Got {initial_guess} for image min '
                    f'{float(image_min)} and max {float(np.max(image))}.'
                )
                raise ValueError(msg)
    else:
        raise TypeError(
            'Incorrect type for `initial_guess`; should be '
            'a floating point value, or a function mapping an '
            'array to a floating point value.'
        )

    # Initialize history lists
    thresholds_history = []
    cross_entropies_history = []

    # The initial guess threshold, in the positive (offset) range
    initial_t = t_next

    # Helper function to calculate class means, proportions, and CE
    def calculate_metrics(threshold_val, img_arr, is_integer_type, hist_data=None, bin_centers_data=None):
        if is_integer_type:
            # Integer images: work on the histogram instead of the pixels
            hist, bin_centers = hist_data, bin_centers_data
            total_pixels = np.sum(hist)
            foreground_mask = bin_centers > threshold_val
            background_mask = ~foreground_mask
            n_back = np.sum(hist[background_mask])
            n_fore = np.sum(hist[foreground_mask])

            mean_back = np.average(bin_centers[background_mask], weights=hist[background_mask]) if n_back > 0 else 0.0
            mean_fore = np.average(bin_centers[foreground_mask], weights=hist[foreground_mask]) if n_fore > 0 else 0.0
        else:  # float type
            total_pixels = img_arr.size
            foreground_mask = img_arr > threshold_val
            background_mask = ~foreground_mask
            n_back = np.sum(background_mask)
            n_fore = np.sum(foreground_mask)

            mean_back = np.mean(img_arr[background_mask]) if n_back > 0 else 0.0
            mean_fore = np.mean(img_arr[foreground_mask]) if n_fore > 0 else 0.0

        p_back = n_back / total_pixels if total_pixels > 0 else 0.0
        p_fore = n_fore / total_pixels if total_pixels > 0 else 0.0

        # CE stays NaN when a class is empty or has a zero mean, because
        # log(0) is undefined. The iteration itself breaks in that case.
        cross_entropy = np.nan
        if mean_back > 0 and mean_fore > 0 and p_back > 0 and p_fore > 0:
            # Li and Lee's criterion is
            #   -(m1_back * log(mean_back) + m1_fore * log(mean_fore)),
            # where m1 is the sum of the intensities in a class. Dividing by
            # the pixel count turns each m1 into p * mean.
            cross_entropy = -(
                p_back * mean_back * np.log(mean_back)
                + p_fore * mean_fore * np.log(mean_fore)
            )

        return mean_back, mean_fore, p_back, p_fore, cross_entropy, n_back, n_fore

    # Calculate and store initial metrics
    is_integer = image_positive.dtype.kind in 'iu'
    hist_data, bin_centers_data = None, None
    if is_integer:
        hist_data, bin_centers_data = histogram(image_positive.reshape(-1), source_range='image')
        hist_data = hist_data.astype('float64', copy=False)  # Use float64 for weights

    # The helper returns the cross-entropy as the fifth element of its tuple
    initial_cross_entropy = calculate_metrics(
        initial_t, image_positive, is_integer, hist_data, bin_centers_data
    )[4]

    # Store the initial guess and its cross-entropy (NaN if it is undefined)
    thresholds_history.append(initial_t + image_min)
    cross_entropies_history.append(initial_cross_entropy)

    # Call callback for the initial guess threshold
    if iter_callback is not None:
        iter_callback(initial_t + image_min)

    # The initial value for t_curr must differ from t_next by more than the
    # tolerance so that the loop runs at least once. Starting two tolerances
    # below the initial guess ensures this
    t_curr = initial_t - 2 * tolerance
    t_next = initial_t  # Start the loop check with the initial guess

    # Stop the iterations when the difference between the
    # new and old threshold values is no larger than the tolerance,
    # or if the background/foreground class is empty or has a zero mean.

    while abs(t_next - t_curr) > tolerance:
        # t_curr is now the threshold from the end of the previous iteration
        t_curr = t_next

        # Split the image (or its histogram) at the current threshold
        mean_back_curr, mean_fore_curr, _, _, _, n_back_curr, n_fore_curr = calculate_metrics(
            t_curr, image_positive, is_integer, hist_data, bin_centers_data
        )

        # Break if a class is empty or has a zero mean (log(0) is undefined)
        if n_back_curr == 0 or n_fore_curr == 0 or mean_back_curr == 0 or mean_fore_curr == 0:
            break

        # Calculate the new threshold (t_next) from the class means at t_curr
        t_next = (mean_back_curr - mean_fore_curr) / (np.log(mean_back_curr) - np.log(mean_fore_curr))

        # Calculate the cross-entropy FOR the newly calculated t_next,
        # using class means and proportions based on the *new* threshold
        mean_back_next, mean_fore_next, _, _, cross_entropy, n_back_next, n_fore_next = calculate_metrics(
            t_next, image_positive, is_integer, hist_data, bin_centers_data
        )

        # If the new threshold leaves a class empty or with a zero mean, it is
        # still recorded, with NaN as its CE, to show where the iteration failed
        degenerate = (
            n_back_next == 0 or n_fore_next == 0
            or mean_back_next == 0 or mean_fore_next == 0
        )
        thresholds_history.append(t_next + image_min)
        cross_entropies_history.append(np.nan if degenerate else cross_entropy)

        # Callback on the newly calculated threshold for this iteration
        if iter_callback is not None:
            iter_callback(t_next + image_min)

        if degenerate:
            break

    # Final threshold is the last calculated t_next
    threshold = t_next + image_min

    return threshold, thresholds_history, cross_entropies_history

The code for visualizing the threshold search history is as follows:

import matplotlib.pyplot as plt

# Plot the threshold and cross-entropy curves over the iterations.
# thresh_hist and ce_hist are the histories returned by threshold_li_with_history
plt.figure(figsize=(15, 5))

# Threshold as a function of iteration number
plt.subplot(1, 3, 1)
plt.plot(thresh_hist, marker='o', linestyle='-')
plt.xlabel('Iteration Number')
plt.ylabel('Threshold Value')
plt.title('Threshold Value during Li Iteration')
plt.grid(True)

# Cross-entropy as a function of iteration number
plt.subplot(1, 3, 2)
plt.plot(ce_hist, marker='o', linestyle='-')
plt.xlabel('Iteration Number')
plt.ylabel('Cross-Entropy')
plt.title('Cross-Entropy during Li Iteration')
plt.grid(True)

# Cross-entropy as a function of threshold (from the values recorded during iteration)
plt.subplot(1, 3, 3)
plt.plot(thresh_hist, ce_hist, marker='o', linestyle='-')
plt.xlabel('Threshold Value')
plt.ylabel('Cross-Entropy')
plt.title('Cross-Entropy vs. Threshold Value (during Iteration)')
plt.grid(True)

plt.tight_layout()
plt.show()