Article

A Comprehensive Method for Example-Based Color Transfer with Holistic–Local Balancing and Unit-Wise Riemannian Information Gradient Acceleration

1 School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
2 Department of Computer Science, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
* Author to whom correspondence should be addressed.
Entropy 2024, 26(11), 918; https://doi.org/10.3390/e26110918
Submission received: 20 August 2024 / Revised: 21 October 2024 / Accepted: 27 October 2024 / Published: 29 October 2024
(This article belongs to the Topic Color Image Processing: Models and Methods (CIP: MM))

Abstract
Color transfer, an essential technique in image editing, has recently received significant attention. However, achieving a balance between holistic color style transfer and local detail refinement remains a challenging task. This paper proposes an innovative color transfer method, named BHL, which stands for Balanced consideration of both Holistic transformation and Local refinement. The BHL method employs a statistical framework to address the challenge of achieving a balance between holistic color transfer and the preservation of fine details during the color transfer process. Holistic color transformation is achieved using optimal transport theory within the generalized Gaussian modeling framework. The local refinement module adjusts color and texture details on a per-pixel basis using a Gaussian Mixture Model (GMM). To address the high computational complexity inherent in complex statistical modeling, a parameter estimation method called the unit-wise Riemannian information gradient (uRIG) method is introduced. The uRIG method significantly reduces the computational burden through the second-order acceleration effect of the Fisher information metric. Comprehensive experiments demonstrate that the BHL method outperforms state-of-the-art techniques in both visual quality and objective evaluation criteria, even under stringent time constraints. Remarkably, the BHL method processes high-resolution images in an average of 4.874 s, achieving the fastest processing time compared to the baselines. The BHL method represents a significant advancement in the field of color transfer, offering a balanced approach that combines holistic transformation and local refinement while maintaining efficiency and high visual quality.

1. Introduction

Example-based image color style transfer methods have emerged as powerful tools in computer graphics and image processing. Their significance lies in their ability to transfer the color characteristics from the example image to the source image. This allows for various key applications, such as artistic effects transformation [1], photorealistic image stylization [2], image illuminance adjustment [3,4], and underwater image enhancement [5,6].
Pioneering work in example-based image color style transfer was presented in [7]. The authors achieved the decorrelation of color channels by transforming RGB images into the Lab color space, leveraging simple statistics, such as mean and standard deviation, to linearly map the color characteristics from one image to another. Building on this foundation, the method introduced in [8] retained operations within the RGB space and utilized mean and covariance to account for the inherent correlations between the three color channels. Furthermore, the method detailed in [9] accomplished one-to-one color mapping by transferring the color palette of the example image to the source image through an iterative algorithm that transforms one probability density function into another. In [10], a linear color transformation derived from the Monge–Kantorovich theory was proposed. Following this, Ref. [11] introduced a regularized discrete optimal transport formulation for color transformation, effectively addressing challenges such as mass conservation relaxation and regularization. The method in [12] employed illuminant matching and optimal color palette mapping to achieve color transfer. Moreover, Ref. [13] tackled the limitations of relaxed optimal transport in color transfer by implementing a non-convex regularized optimal transportation method that enforced one-to-one feature matching while minimizing transport dispersion. The authors of [14] introduced a transformation between Multivariate Generalized Gaussian Distributions (MGGDs), consisting of optimal transportation of the second-order statistics and a stochastic-based shape parameter transformation.
While optimal transportation algorithms offer advantages in computational efficiency and ease of use, they have limitations. Notably, the application of a uniform processing method across all pixels hinders the ability to ensure the reasonableness of transformation results in all image regions. This may lead to artifacts, unnatural colorization, irrational luminosity relationships, biased color positioning, and color vignetting in the output image.
To overcome this limitation, an Expectation-Maximization (EM)-based segmentation method for regional color transfer was introduced in [15]. In [16], a soft color segmentation method was presented for color transfer, using a Gaussian Mixture Model (GMM) to capture broad color patterns with soft labels. The method proposed in [17] utilized an improved EM algorithm and a GMM for the automatic selection of appropriate reference colors within target regions. Focusing on local style variations, the method introduced in [18] leveraged Gaussian clustering to capture fine-grained light and color details within images. This method employed novel source-example cluster mapping policies and achieved style transfer through a combination of parametric color transfer and local chromatic adaptation, allowing for seamless image synthesis while preserving spatial and color coherence. A content-based color transfer method introduced in [19] performed high-level scene analysis for semantic region correspondence and utilized a novel optimization framework to achieve color transfer while preserving spatial layout and visual coherence. For applications in cartoon and fabric color transfer, the method presented in [20] improved color transfer vividness and enhanced detail preservation through image segmentation and the incorporation of a total generalized variation regularizer. A representative superpixel-based method for color transfer was presented in [21], utilizing a fast method that employed approximate nearest neighbor matching with enforced diversity and a fusion framework. Lastly, an L2 divergence-based method for color transfer was described in [22], offering flexibility by accommodating color correspondences and ensuring performance despite potential outlier pairs.
Building upon differential geometry concepts, Ref. [23] introduced a method for per-frame color transform interpolation that minimized curvature. In contrast, the method presented in [24] employed iterative probabilistic color mapping with self-learning filtering and multiscale detail manipulation, minimizing the Kullback–Leibler divergence to enhance color fidelity and detail preservation. To improve robustness, the method in [25] leveraged scattered point interpolation with moving least squares and probabilistic modeling in 3D RGB space, enabling robust color transfer across varying conditions. For more compelling results, the method in [26] considered scene illumination and target gamut constraints, utilizing white balancing, illuminant-aware tone mapping, and gamut-based color mapping techniques. Based on the color homography theorem, the method in [27] decomposed color transfer into chromaticity shift and shading adjustment, represented by a global shading curve. Additionally, a 3D color homography model was introduced in [28], approximating the transformation as a combination of a 3D perspective transform and mean intensity mapping. Addressing color transfer in a two-stage process, the method in [29] first prioritized similarities between source image pixel colors and dominant colors during color mapping, followed by an L0 gradient-preserving detail preservation step to refine large gradients at color region boundaries while maintaining small gradients within regions. The method in [30] tackled color transfer estimation with pixel-to-pixel correspondences using a robust feature-based method. This method utilized an optimal inlier maximization algorithm for outlier handling, combined with a novel structure tensor-based feature detector and descriptor, ensuring reliable color distribution matching across images.
Convolutional Neural Networks (CNNs) have proven remarkably adept at capturing the underlying features of images. This proficiency makes them particularly well suited for image style transfer tasks. Their advantage stems from their ability to learn complex image representations, often referred to as deep features. The most common method leverages these deep features to establish correspondences between the source and example images, subsequently implementing the style transfer [31,32,33,34,35,36].
Under this framework, a method for visual attribute transfer between images with different appearances was introduced in [37], focusing on images with similar semantic structures. This method leveraged deep image analogy and extended PatchMatch to guide semantically meaningful transfers. Similarly, the method proposed in [38] was designed for accurate and coherent color transfer in images with similar semantic structures, employing dense correspondences and local linear models. A local colorization method that allowed for customizable results by incorporating different example images was presented in [39]. Beyond CNN-based methods, a self-supervised Generative Adversarial Network (GAN) for High Dynamic Range (HDR) image color transfer was introduced in [40]. A style representation learning method for arbitrary image style transfer using contrastive learning was proposed in [41]. Meanwhile, a multichannel correlation network (MCCNet) for arbitrary video and image style transfer, which ensures temporal consistency and addresses flickering effects, was proposed in [42]. Detailed reviews of different color transfer techniques can be found in [43,44].
While deep learning methods generally achieve superior performance, they have certain limitations. Extensive training datasets and substantial computational resources are required to train these models. Additionally, their performance is hindered when the source image type is not present in the training data. Once trained, these networks may struggle to adapt to different source image sizes.
Existing methods for color transfer each have their own advantages. However, it is difficult to achieve good performance in all aspects, such as texture preservation, color brightness, and time efficiency. To overcome these challenges, this paper makes the following contributions:
1. This paper proposes a method that balances holistic and local costs, named BHL. The BHL method captures the color information of example images more comprehensively at a holistic level while better preserving the texture details of the source images.
2. A customized optimization method, based on the Riemannian information gradient and called the uRIG method, is introduced to address the high computational cost of parameter estimation for the MGGD and GMM probability models. By leveraging the second-order acceleration effect of the Riemannian information metric (matrix), the uRIG method significantly enhances the time efficiency of the BHL algorithm.
3. In the preprocessing stage, SLIC (Simple Linear Iterative Clustering) is used to sample mini-batches for the subsequent iterations. This prevents the colors of the refined local regions from becoming overly monotonous.
4. Extensive numerical experiments demonstrate that the BHL method achieves a significant advantage in time complexity over existing color transfer techniques while matching or even surpassing their visual quality.

2. Methodology

This paper aims to achieve fast and high-quality image color transfer by leveraging the complementary strengths of holistic and local methods. To this end, we formulate the engineering problem as a numerical optimization problem, as shown in the following equation:
$$\min\; C_h + C_\ell,$$
where $C_h$ and $C_\ell$ represent the costs associated with the holistic and local color transformations, respectively. Minimizing both terms in Equation (1) theoretically leads to the optimal color transfer solution.

2.1. Holistic Cost

In this work, image pixels are represented as 3-dimensional vectors (i.e., corresponding to the 3 channels in the CIE Lab color space). By statistically modeling all pixel values in the example image, its holistic color features are described by a probability distribution, denoted as $P_e$. Similarly, the color features of the source image are represented by a distribution $P_s$. The holistic color transfer is accomplished through an optimal transport map between $P_s$ and $P_e$, as optimal transport guarantees that mapping samples from $P_s$ to $P_e$ minimizes the color difference (i.e., the transport cost). As a result, the color distribution of the transformed image closely aligns with that of the example image. Mathematically, this problem is expressed as:
$$\min_{T}\; \mathbb{E}_{x \sim P_s}\bigl[\, c\bigl(x, T(x)\bigr) \,\bigr],$$
where $x$ represents a random vector following $P_s$, $T(x)$ is the optimal transport mapping to be found, and $c(x, T(x))$ denotes the transport cost. In this work, the Multivariate Generalized Gaussian Distribution (MGGD) is selected as an appropriate probability distribution to model the source and example images. The MGGD generalizes the multivariate Gaussian distribution: it inherits the advantages of the Gaussian while its adjustable shape parameter allows a more accurate fit to the true probability density of the data. The probability density function of the MGGD is defined as follows [45]:
$$p(x \mid \theta) = c\, |\Sigma|^{-\frac{1}{2}}\, g_\beta(\delta), \quad \text{with } \theta = (\mu, \Sigma, \beta),$$
where the d-dimensional vector $\mu \in \mathbb{R}^d$ is the location (mean) parameter. The $d \times d$-dimensional scatter matrix $\Sigma$ is symmetric positive definite, and $|\Sigma|$ denotes the determinant of $\Sigma$. The real positive value $\beta$ is the shape parameter. The coefficient c is the normalizing constant
$$c = \frac{\beta\, \Gamma(d/2)}{\pi^{d/2}\; 2^{\frac{d}{2\beta}}\; \Gamma\!\left(\frac{d}{2\beta}\right)},$$
where Γ is the Gamma function. The symbol δ denotes the Mahalanobis distance for simplicity
$$\delta = (x - \mu)^{\dagger}\, \Sigma^{-1}\, (x - \mu),$$
where $\dagger$ denotes the vector or matrix transpose. The real-valued function $g_\beta(\delta)$ is
$$g_\beta(\delta) = \exp\!\left( -\frac{\delta^{\beta}}{2} \right).$$
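For concreteness, the density above can be evaluated numerically. The following is a minimal NumPy sketch of the MGGD log-density in Equations (3)-(6); the function name and interface are ours, not part of the published method.

import numpy as np
from scipy.special import gammaln

def mggd_logpdf(x, mu, Sigma, beta):
    # Log-density of the MGGD, following Eqs. (3)-(6).
    # x: (n, d) array of Lab pixel vectors; mu: (d,); Sigma: (d, d) SPD; beta > 0.
    d = mu.shape[0]
    # log of the normalizing constant c
    log_c = (np.log(beta) + gammaln(d / 2) - (d / 2) * np.log(np.pi)
             - (d / (2 * beta)) * np.log(2) - gammaln(d / (2 * beta)))
    diff = x - mu
    delta = np.einsum('ni,ij,nj->n', diff, np.linalg.inv(Sigma), diff)  # Mahalanobis distances
    _, logdet = np.linalg.slogdet(Sigma)
    return log_c - 0.5 * logdet - 0.5 * delta ** beta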
To conceptually distinguish between the source image and the example image, denote the random vector corresponding to the source image as $x$ and the random vector corresponding to the example image as $e$. We define $X = \{x_1, \dots, x_M\}$ as a sample set of $x$, representing the source image (whose color is to be transferred), and $E = \{e_1, \dots, e_N\}$ as a sample set of $e$, representing the example image. It is assumed that $X$ is distributed according to an MGGD, and the same assumption holds for $E$. The transformation between two MGGDs, i.e., the minimizer of Equation (2), is given in [14]. This transformation of MGGDs incorporates the following two key elements:
  • The MK (Monge–Kantorovich) linear transportation associated with the scatter matrices of x and e ;
  • A stochastic transformation of the shape parameters of x and e .
Denote $U = \{u_1, \dots, u_M\}$, $m \in \{1, \dots, M\}$, as the image transformed by the MK mapping. The expression of the MK transportation is
$$T_{MK} : x_m \mapsto u_m = \Lambda\, (x_m - \mu_x) + \mu_e,$$
where
$$\Lambda = \Sigma_x^{-\frac{1}{2}} \left( \Sigma_x^{\frac{1}{2}}\, \Sigma_e\, \Sigma_x^{\frac{1}{2}} \right)^{\frac{1}{2}} \Sigma_x^{-\frac{1}{2}}.$$
Since the MK transportation is independent of the shape parameter $\beta_e$ of the example image, we need to perform a second transfer operation related to $\beta_x$ and $\beta_e$. Here, we denote $T_s$ as this second transfer operation and $v_m = T_s(u_m)$, $m \in \{1, \dots, M\}$, as the output of this transformation. To incorporate the shape parameter $\beta_e$ of the example image into the transportation result $v$, we first need to eliminate the influence of $\beta_x$ on the current result $u$ and then apply the influence of $\beta_e$ to obtain the output $v$ carrying the full parameter influence of $e$. Therefore, we introduce the stochastic representation of $u$ [45]:
$$u \overset{d}{=} \mu_e + \tau_x\, \Sigma_e^{\frac{1}{2}}\, r,$$
where $\overset{d}{=}$ denotes stochastic equality. The d-dimensional random vector $r$ follows a uniform distribution on the unit sphere, and $\tau_x$ is a positive random variable (independent of $r$) satisfying the following condition:
$$\tau_x^{2\beta_x} \sim \Gamma\!\left( \frac{d}{2\beta_x},\; 2 \right).$$
According to (8), the expression of $T_s$ is obtained as follows:
$$\forall m \in \{1, \dots, M\}, \quad T_s : u_m \mapsto v_m = \frac{\tau_e\, (u_m - \mu_e)}{\sqrt{(u_m - \mu_e)^{\dagger}\, \Sigma_e^{-1}\, (u_m - \mu_e)}} + \mu_e,$$
where $\tau_e$ is randomly sampled according to (9). By combining $T_{MK}$ and $T_s$, we obtain $T_h$:
$$\forall m \in \{1, \dots, M\}, \quad T_h : x_m \mapsto v_m = T_s\bigl( T_{MK}(x_m) \bigr).$$
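As an illustration, the holistic transformation $T_h$ can be sketched in code once the MGGD parameters of both images have been estimated. The helper below is our own: it applies the MK map of Equations (7) and (8) and then the stochastic shape transfer of Equations (9) and (10) as reconstructed above, sampling $\tau_e$ from the Gamma condition in (9).

import numpy as np
from scipy.linalg import sqrtm

def holistic_transfer(X, mu_x, Sigma_x, mu_e, Sigma_e, beta_e, rng=None):
    # Sketch of T_h = T_s o T_MK applied to the source pixels X of shape (M, d).
    rng = np.random.default_rng() if rng is None else rng
    d = mu_x.shape[0]
    # Monge-Kantorovich map between the two scatter matrices, Eq. (8)
    Sx_half = np.real(sqrtm(Sigma_x))
    Sx_inv_half = np.linalg.inv(Sx_half)
    Lam = Sx_inv_half @ np.real(sqrtm(Sx_half @ Sigma_e @ Sx_half)) @ Sx_inv_half
    U = (X - mu_x) @ Lam.T + mu_e                                      # Eq. (7)
    # Stochastic shape transfer: tau_e^(2*beta_e) ~ Gamma(d/(2*beta_e), scale 2), Eq. (9)
    tau_e = rng.gamma(d / (2 * beta_e), 2.0, size=len(U)) ** (1.0 / (2 * beta_e))
    diff = U - mu_e
    norm = np.sqrt(np.einsum('ni,ij,nj->n', diff, np.linalg.inv(Sigma_e), diff))
    return mu_e + (tau_e / np.maximum(norm, 1e-12))[:, None] * diff    # Eq. (10)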
Ultimately, this optimal transportation problem is equivalent to an optimization problem, namely the estimation of $\theta_x = (\mu_x, \Sigma_x, \beta_x)$ and $\theta_e = (\mu_e, \Sigma_e, \beta_e)$:
$$\min\; C_{h,x}(\theta_x) \quad \text{with} \quad C_{h,x}(\theta_x) = -\frac{1}{M} \sum_{m=1}^{M} \ln p(x_m \mid \theta_x),$$
$$\min\; C_{h,e}(\theta_e) \quad \text{with} \quad C_{h,e}(\theta_e) = -\frac{1}{N} \sum_{n=1}^{N} \ln p(e_n \mid \theta_e).$$

2.2. Local Costs

After the holistic transformation, a local transformation method is expected to adjust the details. In this work, we employ the framework introduced in [46], which achieves state-of-the-art performance. The appeal of this framework lies not only in its performance but also in its operational simplicity.
Here, the output of $T_h$ is used as the input for the local transformation. The local transformation leverages the assumption that the example image dataset $E = \{e_1, \dots, e_N\}$ follows a specific Gaussian Mixture Model (GMM). The means of this GMM are hypothesized to correspond to the source image. In this way, the matching of pixels and the color transformation between the source image and the example image can be performed simultaneously and adaptively during the parameter estimation process of the GMM, without the need for additional segmentation operations.
Denote $Z = \{z_1, \dots, z_M\}$ as the means of this GMM. The probability density function of the GMM followed by the random vector $e_n$ is defined as follows:
$$p(e_n \mid \omega) = \sum_{m=1}^{M} \frac{1}{M}\, p(e_n \mid \omega_m), \quad \text{with}\;\; \omega = \{\omega_1, \dots, \omega_M\},\;\; \omega_m = (z_m,\, \sigma_m I),$$
where $p(e_n \mid \omega_m)$ denotes the m-th Gaussian component of the GMM and M represents the total number of components in the mixture. Following [46], a vector $z_m \in \mathbb{R}^d$ is used as the mean (location parameter) of the m-th component, and a diagonal covariance matrix $\sigma_m I$ serves as its scatter matrix. Since the covariance involves only one parameter $\sigma_m$ to be estimated, we denote $\omega_m$ as $(z_m, \sigma_m)$ in the following sections for simplicity and ignore the identity matrix $I$. Additionally, all components are assigned equal weights of $1/M$ in [46]. The local cost is defined as the negative log-likelihood of the GMM:
$$\min\; C_\ell(\omega), \quad \text{with} \quad C_\ell(\omega) = -\sum_{n=1}^{N} \ln\!\left[ \frac{1}{M} \sum_{m=1}^{M} p(e_n \mid \omega_m) \right],$$
and for each component of the GMM,
$$p(e \mid \omega_m) = \frac{1}{(2\pi)^{\frac{d}{2}}\, \sigma_m^{\frac{d}{2}}}\, \exp\!\left( -\frac{1}{2\,\sigma_m}\, \| e - z_m \|^{2} \right).$$
Then, the color transformation is achieved through a GMM estimation process, utilizing $E = \{e_1, \dots, e_N\}$ as the sample data and $V = \{v_1, \dots, v_M\}$ (the output of $T_h$) as the initial value for the means $Z = \{z_1, \dots, z_M\}$. In the minimization algorithm, we set
$$\{ z_1^{(0)}, \dots, z_M^{(0)} \} = \{ v_1, \dots, v_M \}.$$
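For reference, the local cost of Equation (14) can be evaluated directly. The sketch below is our own, with names chosen for illustration; it treats $\sigma_m$ as the variance of the m-th component and is intended for mini-batches rather than full-resolution images.

import numpy as np

def local_cost(E, Z, sigma):
    # Negative log-likelihood of the equal-weight GMM, Eq. (14).
    # E: (N, d) example pixels; Z: (M, d) means; sigma: (M,) variances.
    N, d = E.shape
    sq = ((E[:, None, :] - Z[None, :, :]) ** 2).sum(-1)              # (N, M) squared distances
    log_comp = (-0.5 * d * np.log(2 * np.pi) - 0.5 * d * np.log(sigma)[None, :]
                - 0.5 * sq / sigma[None, :])                         # per-component log-density
    m = log_comp.max(axis=1, keepdims=True)                          # log-sum-exp for stability
    log_mix = m.squeeze(1) + np.log(np.exp(log_comp - m).sum(axis=1)) - np.log(Z.shape[0])
    return -log_mix.sum()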
While the Expectation-Maximization (EM) algorithm was applied to address this problem in [46], it is no longer suitable in this context. In Section 3, the uRIG method is introduced, specifically tailored to this problem.

3. Optimization Algorithm

3.1. Main Algorithm

To minimize the cost function in Equations (12) and (14), a two-stage optimization process is introduced. Figure 1 illustrates the overall workflow.
During the first stage, we employ Multivariate Generalized Gaussian Distributions (MGGDs) to independently model the source image $X = \{x_m;\; m = 1, \dots, M\}$ and the example image $E = \{e_n;\; n = 1, \dots, N\}$. We then estimate the parameters for these respective models. Utilizing these estimated parameters, we construct the holistic transformation $T_h$ and subsequently apply it to achieve holistic color transfer.
The output of the first stage, denoted as $V = \{v_m;\; m = 1, \dots, M\}$, serves as the input for the second stage. Here, a GMM with means denoted by $Z = \{z_m;\; m = 1, \dots, M\}$ is leveraged. We posit that the example image $E$ constitutes a sample set for this GMM. Through an iterative maximum likelihood estimation process initialized with $Z^{(0)} = V$, the estimated $\hat{Z}$ represents the refined result, which is also the final transformed image. The BHL method is summarized in Algorithm 1 below.
Algorithm 1 The BHL method
Require: Source image $X = \{x_1, \dots, x_M\}$, example image $E = \{e_1, \dots, e_N\}$;
Ensure: Edited image $\hat{Z} = \{\hat{z}_1, \dots, \hat{z}_M\}$;
1: $\hat{\theta}_x$ ← MGGDEstimation($X$);
2: $\hat{\theta}_e$ ← MGGDEstimation($E$);
3: $\{v_1, \dots, v_M\}$ ← TransMGGD($X$, $\hat{\theta}_x$, $\hat{\theta}_e$);
4: $\{\hat{z}_1, \dots, \hat{z}_M\}$ ← GMMEstimation($E$, $V$);
5: procedure TransMGGD($X$, $\hat{\theta}_x$, $\hat{\theta}_e$)
6:     Holistic transformation via (11);
7:     return $V = \{v_1, \dots, v_M\}$;
8: end procedure
This color transfer method necessitates solving three optimization problems: (12a), (12b), and (14) (refer to Steps 1 and 2 in Algorithm 1). Traditionally, these problems are addressed using the fixed-point iteration method for (12a) and (12b), and the Expectation-Maximization (EM) algorithm for (14). However, these methods become computationally expensive, particularly for high-resolution images, due to increased time and memory demands. To overcome this computational bottleneck, the unit-wise Riemannian information gradient (uRIG) method is introduced. The core idea of uRIG leverages two key mathematical concepts: the Riemannian manifold and the Fisher information metric [47,48]. Updates on the Riemannian manifold effectively bypass numerical instabilities often encountered with nonlinear constraints, such as positive definite matrices, leading to a more robust algorithm. Additionally, the use of the Fisher information metric (matrix) as a replacement for the Hessian matrix eliminates the need for computationally expensive numerical approximations, thereby accelerating the convergence of the estimation process [49].

3.2. Minimization of the Holistic Cost

When the shape parameter of an MGGD is known, such as in the Gaussian ($\beta = 1$) and Laplace ($\beta = 0.5$) distributions, a closed-form Fisher information metric exists, as detailed in [50,51]. However, when the shape parameter is unknown, the FIM requires solving a system of partial differential equations, which currently lacks a closed-form solution [52]. Inspired by [53], this paper proposes utilizing a unit-wise Riemannian information metric to address both Problem (12) and the subsequent Problem (14). In Problem (12), the MGGD parameters reside in a product space encompassing $\mathbb{R}^d$ for the location parameter $\mu$, the set $\mathcal{P}_d$ of $d \times d$-dimensional symmetric positive definite (SPD) matrices for the scatter matrix $\Sigma$, and $\mathbb{R}_+$ for the shape parameter $\beta$. The spaces $\mathbb{R}^d$ and $\mathbb{R}_+$ are special Riemannian manifolds with zero curvature, and $\mathcal{P}_d$ is a matrix (Riemannian) manifold with negative curvature. The product of these three spaces is then a Riemannian manifold [47]. We call a subspace of the product space $\mathbb{R}^d \times \mathcal{P}_d \times \mathbb{R}_+$ a unit when it admits a closed-form FIM.
Definition 1.
For the MGGD model, the following spaces are defined: (i) the space $\mathbb{R}^d$ for the location parameter $\mu$ is called a unit; (ii) the space $\mathcal{P}_d$ for the scatter matrix $\Sigma$ is called a unit; (iii) the space $\mathbb{R}_+$ for the shape parameter $\beta$ is called a unit.
With the manifold of interest (the parameter space) now defined, the proposition below gives the unit-wise FIM.
Proposition 1.
Denote $\Theta = \mathbb{R}^d \times \mathcal{P}_d \times \mathbb{R}_+$ as the parameter space of an MGGD and $T_\theta\Theta = T_\mu\mathbb{R}^d \times T_\Sigma\mathcal{P}_d \times T_\beta\mathbb{R}_+$ as the tangent space at $\theta \in \Theta$. The unit-wise FIM of this MGGD is
$$\langle u_\theta, v_\theta \rangle_\theta = \langle u_\mu, v_\mu \rangle_\mu + \langle u_\Sigma, v_\Sigma \rangle_\Sigma + \langle u_\beta, v_\beta \rangle_\beta,$$
with
(18a) $\langle u_\mu, v_\mu \rangle_\mu = I_\mu\, u_\mu^{\dagger}\, \Sigma^{-1}\, v_\mu$,
(18b) $\langle u_\Sigma, v_\Sigma \rangle_\Sigma = I_{\Sigma,1}\, \mathrm{tr}\!\left( \Sigma^{-1} u_\Sigma\, \Sigma^{-1} v_\Sigma \right) + I_{\Sigma,2}\, \mathrm{tr}\!\left( \Sigma^{-1} u_\Sigma \right) \mathrm{tr}\!\left( \Sigma^{-1} v_\Sigma \right)$,
(18c) $\langle u_\beta, v_\beta \rangle_\beta = I_\beta\, u_\beta\, v_\beta$,
where tr is the matrix trace. The vectors $u_\theta = (u_\mu, u_\Sigma, u_\beta)$ and $v_\theta = (v_\mu, v_\Sigma, v_\beta)$ are elements of $T_\theta\Theta$.
To simplify the notations, we use the symbols $I_\mu$, $I_{\Sigma,1}$, $I_{\Sigma,2}$, and $I_\beta$ to represent the coefficients in the three inner products above. The following remark gives the values of these information coefficients.
Remark: The three inner products involved in Proposition 1 all have closed-form expressions for the constant coefficients, which can be directly used without any numerical approximation. The information constant with respect to μ is
$$I_\mu = \frac{\beta \bigl[\, 2(\beta - 1) + d \,\bigr]\; 2^{\,1 - \frac{1}{\beta}}\; \Gamma\!\left( \frac{d + 2(\beta - 1)}{2\beta} \right)}{d\; \Gamma\!\left( \frac{d}{2\beta} \right)},$$
where d is the dimension of the image color vector and Γ ( · ) is the Gamma function. The information constants with respect to Σ are
$$I_{\Sigma,1} = \frac{d + 2\beta}{2(d + 2)} \quad \text{and} \quad I_{\Sigma,2} = \frac{\beta - 1}{2(d + 2)}.$$
The information constant with respect to β is
$$I_\beta = \frac{1}{\beta^2}\left\{ 1 + \frac{d}{2\beta^2}\,\Psi_1\!\left(\frac{d}{2\beta}\right) + \frac{d}{\beta}\left[ \ln 2 + \Psi_0\!\left(\frac{d}{2\beta}\right) \right] + \frac{d}{2\beta}\left[ (\ln 2)^2 + \Psi_0\!\left(1 + \frac{d}{2\beta}\right)\ln 4 + \Psi_0^{2}\!\left(1 + \frac{d}{2\beta}\right) + \Psi_1\!\left(1 + \frac{d}{2\beta}\right) \right] \right\},$$
where $\Psi_0$ and $\Psi_1$ are the digamma and trigamma functions. The proof of Proposition 1 and its remark can be found in Appendix A. After defining the uFIM on the parameter space $\Theta$, we are able to derive the associated Riemannian gradient based on this metric, i.e., the unit-wise Riemannian information gradient.
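Before deriving the gradient, the constants above can be checked numerically. The sketch below uses the closed forms as reconstructed here (our own helper, with $I_\beta$ omitted); for the Gaussian case $\beta = 1$ it recovers $I_\mu = 1$, $I_{\Sigma,1} = 1/2$, and $I_{\Sigma,2} = 0$.

import numpy as np
from scipy.special import gammaln

def mggd_info_coefficients(d, beta):
    # Information constants I_mu, I_{Sigma,1}, I_{Sigma,2} of the remark above (sketch).
    log_I_mu = (np.log(beta * (2 * (beta - 1) + d)) + (1 - 1 / beta) * np.log(2)
                + gammaln((d + 2 * (beta - 1)) / (2 * beta))
                - np.log(d) - gammaln(d / (2 * beta)))
    I_mu = float(np.exp(log_I_mu))
    I_Sigma_1 = (d + 2 * beta) / (2 * (d + 2))
    I_Sigma_2 = (beta - 1) / (2 * (d + 2))
    return I_mu, I_Sigma_1, I_Sigma_2

print(mggd_info_coefficients(d=3, beta=1.0))   # sanity check: approximately (1.0, 0.5, 0.0)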
Proposition 2.
The uRIG of the holistic cost in Equation (12) takes the following form:
$$\nabla C_h(\theta) = \bigl( \nabla_\mu C_h(\theta),\; \nabla_\Sigma C_h(\theta),\; \nabla_\beta C_h(\theta) \bigr),$$
where the components of the three parameters are as follows:
(21a) $\nabla_\mu C_h(\theta) = -\dfrac{1}{L} \displaystyle\sum_{i=1}^{L} \beta\, I_\mu^{-1}\, \delta_i^{\beta - 1}\, (x_i - \mu)$,
(21b) $\nabla_\Sigma C_h(\theta) = \dfrac{1}{L} \displaystyle\sum_{i=1}^{L} \left[ -\dfrac{\beta\, \delta_i^{\beta - 1}}{2\, I_{\Sigma,1}}\, (x_i - \mu)(x_i - \mu)^{\dagger} + \dfrac{I_{\Sigma,1} + I_{\Sigma,2}\, \beta\, \delta_i^{\beta}}{2\, I_{\Sigma,1}\bigl( I_{\Sigma,1} + d\, I_{\Sigma,2} \bigr)}\, \Sigma \right]$,
(21c) $\nabla_\beta C_h(\theta) = -\dfrac{1}{L} \displaystyle\sum_{i=1}^{L} \dfrac{1}{I_\beta} \left\{ \dfrac{1}{\beta}\left[ 1 + \dfrac{d}{2\beta}\left( \Psi_0\!\left(\dfrac{d}{2\beta}\right) + \ln 2 \right) \right] - \dfrac{1}{2}\, \delta_i^{\beta} \ln \delta_i \right\}$.
The values of the coefficients $I_\mu$, $I_{\Sigma,1}$, $I_{\Sigma,2}$, and $I_\beta$ are presented in (19). We recall that the symbol $\delta_i$ denotes the Mahalanobis distance, defined as $\delta_i = (x_i - \mu)^{\dagger} \Sigma^{-1} (x_i - \mu)$, and that $\Psi_0$ denotes the digamma function. The proof of Proposition 2 can be found in Appendix B.
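A mini-batch implementation of Equation (21) can be written compactly. The following NumPy sketch is ours: it assumes the information constants and the digamma value $\Psi_0(d/(2\beta))$ are supplied by the caller, and it uses the sign conventions of the gradient of the negative log-likelihood as written above.

import numpy as np

def urig_grad_holistic(Xb, mu, Sigma, beta, I_mu, I_S1, I_S2, I_beta, psi0):
    # uRIG of the holistic cost on a mini-batch Xb of shape (L, d), following Eq. (21).
    L, d = Xb.shape
    Sinv = np.linalg.inv(Sigma)
    diff = Xb - mu
    delta = np.maximum(np.einsum('ni,ij,nj->n', diff, Sinv, diff), 1e-12)   # Mahalanobis distances
    w = beta * delta ** (beta - 1)
    g_mu = -(w[:, None] * diff).mean(0) / I_mu                              # Eq. (21a)
    outer = np.einsum('n,ni,nj->ij', w, diff, diff) / L
    g_Sigma = (-outer / (2 * I_S1)
               + (I_S1 + I_S2 * beta * delta ** beta).mean()
               / (2 * I_S1 * (I_S1 + d * I_S2)) * Sigma)                    # Eq. (21b)
    g_beta = -np.mean(1 / beta * (1 + d / (2 * beta) * (psi0 + np.log(2)))
                      - 0.5 * delta ** beta * np.log(delta)) / I_beta       # Eq. (21c)
    return g_mu, g_Sigma, g_beta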
Having established the relevant metric and gradient on the manifold, we now turn to the retraction map, which plays an important role in optimization algorithms. The retraction map serves as a bridge between the tangent space and the manifold itself, enabling us to efficiently perform gradient descent.
In Euclidean space, no special treatment is usually required for gradient descent. However, on a manifold, after moving in the direction of descent (i.e., typically the gradient) in the tangent space, a ‘retraction’ operation is needed to ensure that the parameters always remain within the constrained space. Therefore, the ideal retraction map is the geodesic map that performs the ‘retraction’ operation along geodesics. In particular, the geodesic map for Euclidean space is simply vector addition. Then, for the geodesic map on P d , we employ the form introduced in [54] in this work. Since each of the three units possesses its own intrinsic geodesic map, the most natural retraction map on Θ is the product of the three geodesic maps.
Proposition 3.
The following map is a retraction on $\Theta = \mathbb{R}^d \times \mathcal{P}_d \times \mathbb{R}_+$:
$$\mathrm{Ret}_\theta : T_\theta\Theta \to \Theta, \qquad \begin{pmatrix} u_\mu \\ u_\Sigma \\ u_\beta \end{pmatrix} \mapsto \begin{pmatrix} \mu + u_\mu \\ \Sigma\, \mathrm{Exp}\!\left( \Sigma^{-1} u_\Sigma \right) \\ \beta\, \exp\!\left( u_\beta / \beta \right) \end{pmatrix},$$
where $T_\theta\Theta = T_\mu\mathbb{R}^d \times T_\Sigma\mathcal{P}_d \times T_\beta\mathbb{R}_+$ is the tangent space at the point $\theta = (\mu, \Sigma, \beta)$, and the vector $(u_\mu, u_\Sigma, u_\beta)$ is an element of $T_\theta\Theta$.
In Proposition 3, exp denotes the natural exponential function on the real number field, while Exp refers to the matrix exponential map. The proof of Proposition 3 is presented in Appendix C.
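A direct implementation of the retraction in Equation (22) is straightforward; the sketch below (ours) uses the matrix exponential from SciPy and symmetrizes the scatter update to absorb round-off.

import numpy as np
from scipy.linalg import expm

def retract_mggd(theta, v):
    # Retraction Ret_theta(v) of Eq. (22) on R^d x P_d x R_+.
    mu, Sigma, beta = theta
    v_mu, v_Sigma, v_beta = v
    S = Sigma @ expm(np.linalg.solve(Sigma, v_Sigma))      # SPD unit: Sigma Exp(Sigma^{-1} v_Sigma)
    S = 0.5 * (S + S.T)                                    # numerical symmetrization
    return mu + v_mu, S, beta * np.exp(v_beta / beta)      # Euclidean and positive-scalar units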
With the foundation of necessary components in place, we now turn our attention to the specific iterative method employed for the estimation of the MGGD.
In Algorithm 2, $\mathrm{Ret}_\theta$ is given in (22). In practice, in pursuit of time efficiency, the iteration is carried out as mini-batch stochastic gradient descent, i.e., the constant L in (21) is the mini-batch size. Its convergence analysis and the selection of the coefficient a are discussed in Section 3.4, along with the optimization of $C_\ell$.
Algorithm 2 MGGD estimation using the uRIG method
Require: Dataset $X = \{x_1, \dots, x_M\}$;
Ensure: Estimate $\hat{\theta}_x$;
1: for $k = 1, 2, 3, \dots$ do
2:     Compute the uRIG via (20);
3:     Define the learning rate $\eta_k = a / k$;
4:     Update $\theta^{(k+1)} \leftarrow \mathrm{Ret}_{\theta^{(k)}}\!\bigl( -\eta_k\, \nabla C_h(\theta^{(k)}) \bigr)$;
5: end for
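Tying the pieces together, a mini-batch version of Algorithm 2 may look as follows. This is only a sketch under our assumptions: urig_grad_holistic and retract_mggd are the helpers sketched above, and info_coeffs is a user-supplied callable returning $(I_\mu, I_{\Sigma,1}, I_{\Sigma,2}, I_\beta)$ for the current $\beta$.

import numpy as np
from scipy.special import polygamma

def estimate_mggd(X, theta0, info_coeffs, a=0.1, n_iters=200, batch=1024, seed=0):
    # Mini-batch uRIG estimation of an MGGD (sketch of Algorithm 2).
    rng = np.random.default_rng(seed)
    mu, Sigma, beta = theta0
    for k in range(1, n_iters + 1):
        idx = rng.choice(len(X), size=min(batch, len(X)), replace=False)
        I_mu, I_S1, I_S2, I_beta = info_coeffs(beta)            # constants depend on beta
        psi0 = polygamma(0, X.shape[1] / (2 * beta))            # digamma at d/(2*beta)
        g = urig_grad_holistic(X[idx], mu, Sigma, beta, I_mu, I_S1, I_S2, I_beta, psi0)
        eta = a / k                                             # learning rate of Algorithm 2
        mu, Sigma, beta = retract_mggd((mu, Sigma, beta), tuple(-eta * gi for gi in g))
    return mu, Sigma, beta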

3.3. Minimization of the Local Cost

We assumed in Section 2.2 that all the component weights of the GMM are equal to $1/M$, where M is the number of components. Because the component weights are pre-set, the EM algorithm is no longer the most suitable choice.
Instead, we adopt the uRIG method within the stochastic gradient descent framework. Specifically, each component $p(e \mid \omega_m)$ of the GMM is a Gaussian distribution, which is a special case of the MGGD with shape parameter $\beta = 1$. Therefore, we can directly apply the uRIG of the MGGD to the estimation of the GMM.
Due to the complex parameters of the GMM, we declare the following symbols for the sake of convenience in expression and understanding:
  • Parameter: $\omega = \{\omega_m\}_{m=1}^{M}$, with $\omega_m = (z_m, \sigma_m)$;
  • Parameter space: $\Omega = \prod_{m=1}^{M} \Omega_m$, with $\Omega_m = \mathbb{R}^d \times \mathbb{R}_+$;
  • Tangent space at the point $\omega$: $T_\omega\Omega = \prod_{m=1}^{M} T_{\omega_m}\Omega_m$, with $T_{\omega_m}\Omega_m = \mathbb{R}^d \times \mathbb{R}$.
Similar to the case of the MGGD, we start with the definition of a unit.
Definition 2.
For any component p ( e | z m , σ m ) of the GMM defined in Equation (13), we define the following spaces:
  • The space $\mathbb{R}^d$ for the location parameter $z_m$ is a unit;
  • The space $\mathbb{R}_+$ for the covariance parameter $\sigma_m$ is a unit.
In fact, each unit of the GMM parameter space is simply a Euclidean space; therefore, its uFIM is easier to derive. The uFIM can be obtained in the following form by taking the second differential of the log-likelihood with respect to each unit.
Proposition 4.
The unit-wise Fisher information metric of the GMM is
$$\langle u_\omega, v_\omega \rangle_\omega = \sum_{m=1}^{M} \left[ \frac{1}{\sigma_m}\, u_{z_m}^{\dagger}\, v_{z_m} + \frac{d}{2\,\sigma_m^{2}}\, u_{\sigma_m}\, v_{\sigma_m} \right].$$
The vectors $u_\omega$ and $v_\omega$ are elements of $T_\omega\Omega$, and $u_{z_m}$ and $v_{z_m}$ are elements of $T_{\omega_m}\Omega_m$.
The proof can be found in Appendix A. Then, the unit-wise information gradient can be easily derived based on the inner product in Equation (23).
Proposition 5.
The unit-wise information gradient of the GMM is
$$\nabla C_\ell(\omega) = \bigl( \nabla_{z_1} C_\ell(\omega_1),\, \nabla_{\sigma_1} C_\ell(\omega_1),\, \dots,\, \nabla_{z_m} C_\ell(\omega_m),\, \nabla_{\sigma_m} C_\ell(\omega_m),\, \dots,\, \nabla_{z_M} C_\ell(\omega_M),\, \nabla_{\sigma_M} C_\ell(\omega_M) \bigr),$$
where
$$\nabla_{z_m} C_\ell(\omega_m) = -\frac{1}{N} \sum_{n=1}^{N} o_{m,n}\, (e_n - z_m),$$
$$\nabla_{\sigma_m} C_\ell(\omega_m) = \frac{1}{N} \sum_{n=1}^{N} o_{m,n} \left( \sigma_m - \frac{\| e_n - z_m \|^{2}}{d} \right),$$
where $o_{m,n}$ represents the posterior probabilities
$$o_{m,n} = \frac{p(e_n \mid \omega_m)}{\sum_{i=1}^{M} p(e_n \mid \omega_i)}.$$
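On a mini-batch of example pixels, the posteriors and the two gradients of Proposition 5 can be computed jointly. The following sketch is ours and again treats $\sigma_m$ as the variance of the m-th component.

import numpy as np

def urig_grad_gmm(Eb, Z, sigma):
    # uRIG of the local cost on a mini-batch Eb (N, d); Z: (M, d) means; sigma: (M,) variances.
    N, d = Eb.shape
    sq = ((Eb[:, None, :] - Z[None, :, :]) ** 2).sum(-1)             # (N, M) squared distances
    log_comp = -0.5 * d * np.log(sigma)[None, :] - 0.5 * sq / sigma[None, :]
    log_comp -= log_comp.max(axis=1, keepdims=True)                  # stabilize the softmax
    o = np.exp(log_comp)
    o /= o.sum(axis=1, keepdims=True)                                # posteriors o_{m,n}
    g_Z = -(o[:, :, None] * (Eb[:, None, :] - Z[None, :, :])).mean(0)    # Eq. (25a)
    g_sigma = (o * (sigma[None, :] - sq / d)).mean(0)                    # Eq. (25b)
    return g_Z, g_sigma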
The proof of Proposition 5 is presented in Appendix B. For the means z m , the retraction map is vector addition in the Euclidean space R d . For the coefficient σ m of the covariance, we follow the same treatment as for the shape parameter β in (22). Using their product, the retraction shown below can be obtained.
Proposition 6.
The following map is a retraction on the parameter space of the GMM:
$$\mathrm{Ret}_\omega : T_\omega\Omega \to \Omega, \qquad \begin{pmatrix} u_{z_1} \\ u_{\sigma_1} \\ \vdots \\ u_{z_M} \\ u_{\sigma_M} \end{pmatrix} \mapsto \begin{pmatrix} z_1 + u_{z_1} \\ \sigma_1 \exp\!\left( \sigma_1^{-1} u_{\sigma_1} \right) \\ \vdots \\ z_M + u_{z_M} \\ \sigma_M \exp\!\left( \sigma_M^{-1} u_{\sigma_M} \right) \end{pmatrix}.$$
Algorithm 3 below gives the update rule for the parameter estimation of the GMM.
Algorithm 3 GMM estimation using the uRIG method
Require: Dataset $E = \{e_1, \dots, e_N\}$;
Ensure: Estimate $\hat{\omega}$;
1: for $k = 1, 2, 3, \dots$ do
2:     Compute the uRIG via (24);
3:     Define the learning rate $\eta_k = a / k$;
4:     Update $\omega^{(k+1)} \leftarrow \mathrm{Ret}_{\omega^{(k)}}\!\bigl( -\eta_k\, \nabla C_\ell(\omega^{(k)}) \bigr)$;
5: end for
In the above algorithm, $\nabla C_\ell(\omega^{(k)})$ and $\mathrm{Ret}_\omega$ are given in (24) and (26), respectively. For the GMM, the uRIG method is also performed in the form of mini-batch stochastic gradient descent. The convergence analysis and the selection of the coefficient a for Algorithms 2 and 3 are discussed in the following subsection.

3.4. Convergence Analysis

To simplify the presentation and enhance the clarity of the convergence analysis of Algorithms 2 and 3, we unify the notations in this subsection. In the following part of this subsection, we use ω to represent the parameters and Ω to denote the parameter space (regardless of whether it is MGGD or GMM). Consider a statistical model with M units, and let N be the number of observed samples.
Since conditions (1)–(3) below hold, the convergence of Algorithms 2 and 3, and hence the robustness of the overall BHL method (Algorithm 1), can be established, as shown in Proposition 7.
The conditions are as follows:
  • The cost $C(\omega)$ has an isolated stationary point at $\omega = \omega^*$, where $\omega^*$ is the true parameter;
  • There exists a compact and convex neighborhood $U \subset \Omega$ of $\omega^*$ such that the sequence generated by Algorithm 2 (or Algorithm 3) remains within $U$;
  • The learning rate $\eta_k = a/k$, $a > 0$, satisfies the usual conditions for stochastic approximation:
    $$\sum_{k=1}^{+\infty} \eta_k = +\infty, \qquad \sum_{k=1}^{+\infty} \eta_k^{2} < +\infty.$$
Proposition 7.
With these three conditions, we have
$$\lim_{k \to \infty} \omega^{(k)} = \omega^*.$$
The proof of Proposition 7 is shown in Appendix D.

4. Experiment

In this section, the BHL method is evaluated from two perspectives: visual assessment and objective quantitative analysis. The BHL method is benchmarked against five state-of-the-art methods known for their high performance. In terms of holistic transformation methods, we chose the representative MGGD transformation method in [14]. In the domain of local transformation methods, we chose two top-performing methods: the GMM-based transformation technique [46] and the method derived from L2 divergence [22]. Two state-of-the-art deep neural network-based methods, specifically the CAST method [41] and MCCNet [42], were also included in the comparative experiments for evaluation.
We randomly selected five groups of images for the experimental comparison, i.e., 10 different images, with 5 serving as the source images and 5 as the example images. All experiments were conducted on a regular laptop with an AMD Ryzen 7 6800H processor running at a base frequency of 3.2 GHz.

4.1. Parameter Setting

For the uRIG algorithm, the initialization $\theta^{(0)}$ of Algorithm 2 was determined using the method of moments [55], while the initialization of Algorithm 3 was obtained from the output of the holistic transformation $V$, i.e., $Z^{(0)} = V$. The coefficient a of the learning rate was estimated according to Proposition 2 in [56].
To facilitate gradient descent, a preprocessing step was introduced in [46] to select the mini-batch. Prior to commencing the iterations, for each pixel in $Z^{(0)}$, b nearest neighbors are chosen from $E$. Notably, b corresponds to the size of the mini-batch, and these selected b pixels from $E$ then comprise the mini-batch employed for gradient descent. While this method effectively captures color information, it results in a significant computational burden in the preprocessing stage due to the need to sort $M \times N$ pixels.
To address this limitation, we proposed a distinct mini-batch sampling strategy that leverages the Simple Linear Iterative Clustering (SLIC) algorithm. Initially, we applied the SLIC algorithm for superpixel segmentation of the example image $E$, partitioning it into 1000 superpixels. For each pixel in $Z^{(0)}$, we computed the distance to the mean of each superpixel in $E$. Based on these distances, the b′ nearest superpixels were selected to replace the b nearest-neighbor pixels during the iterations (typically, b′ ≪ b). This method offers two main advantages. First, it significantly reduces the sorting time, particularly for high-resolution images. Second, by utilizing all pixels within the selected superpixels, the final output becomes less sensitive to the hyperparameter b′. Specifically, extreme values of b′ mainly affect the algorithm's runtime rather than the color richness of the output. Therefore, a relatively small b′ can be chosen to reduce the time cost. In the experiments, we set b = 100 and b′ = 50.
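A possible realization of this sampling strategy with scikit-image is sketched below; the function name, its arguments, and the chunk-free distance computation are our own simplifications (for full-resolution images the distance matrix should be computed in chunks).

import numpy as np
from skimage.segmentation import slic

def superpixel_candidates(example_rgb, example_lab_pixels, Z0, n_segments=1000, b_prime=50):
    # SLIC superpixels of the example image and, for each pixel of Z0, its b' nearest superpixels.
    # example_lab_pixels: (H*W, 3) Lab pixels in the same row-major order as the label map.
    labels = slic(example_rgb, n_segments=n_segments, compactness=10, start_label=0).reshape(-1)
    n_sp = labels.max() + 1
    means = np.zeros((n_sp, example_lab_pixels.shape[1]))
    for s in range(n_sp):                                             # mean Lab color per superpixel
        means[s] = example_lab_pixels[labels == s].mean(0)
    dist = ((Z0[:, None, :] - means[None, :, :]) ** 2).sum(-1)        # (M, n_sp) distances
    nearest = np.argsort(dist, axis=1)[:, :b_prime]                   # indices of the b' closest
    return labels, means, nearest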
This method substantially reduced the preprocessing time complexity, as detailed in Section 4.4. In addition, to incorporate spatial information, the Laplacian regularization term introduced in [46,57] was also applied to Equation (14).

4.2. Quantitative Comparison

In existing research [12,18,25,46,58], the Structural Similarity Index Measure (SSIM) and Peak Signal-to-Noise Ratio (PSNR) are commonly employed to assess the textural similarity between the output image $V$ and the source image $X$. Specifically, the SSIM quantifies the level of artifacts introduced by the color transfer method, while the PSNR measures the mean squared error between the two images. The color style similarity between the output image and the example image is typically evaluated using the Fréchet Inception Distance (FID) and Perceptual Hash Value (PHV) [59,60].
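For the texture criteria, off-the-shelf implementations are available; a minimal sketch with scikit-image is shown below (8-bit images assumed; FID and PHV require their own feature extractors and are not reproduced here).

from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def texture_scores(output_rgb, source_rgb):
    # SSIM and PSNR between the transferred image and the source image (uint8 arrays).
    ssim = structural_similarity(source_rgb, output_rgb, channel_axis=-1, data_range=255)
    psnr = peak_signal_noise_ratio(source_rgb, output_rgb, data_range=255)
    return ssim, psnr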
The five methods involved in the comparative experiment have their own advantages due to their different processing techniques and optimization goals. However, no method performed well across all four quantitative evaluation criteria at the same time. For example, MCCNet achieved superior structural fidelity visually because it enhanced the edges of objects in the image (such as the edges of petals). However, excessive enhancement led to large artifacts. The L2 method provided bright-colored visual results, but because it requires local matching of the color palette, its output sometimes contained local color deviations. The results of the MGGD method and the BHL method show that their PSNR values were not relatively high. This is because the processing of these two methods requires a resampling operation combined with the parameters of the example image. Noise that differs from the source image will result in a low PSNR value [58].
To evaluate the performance of these methods, we introduced a comprehensive evaluation technique: the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) [61]. TOPSIS is a multi-criteria decision analysis technique that assesses the performance of candidate methods by calculating their distances to the ideal and negative-ideal solutions, thereby providing a comprehensive assessment.
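For completeness, the TOPSIS score used in the tables can be computed as follows. This sketch is ours and assumes equal criterion weights and vector normalization, with a boolean flag marking which criteria are benefit-type (larger is better) and which are cost-type (smaller is better).

import numpy as np

def topsis_score(scores, benefit):
    # TOPSIS closeness scores. scores: (n_methods, n_criteria); benefit: bool per criterion,
    # True when larger values are better, False when smaller values are better.
    X = scores / np.sqrt((scores ** 2).sum(axis=0, keepdims=True))    # vector normalization
    benefit = np.asarray(benefit, dtype=bool)
    ideal = np.where(benefit, X.max(axis=0), X.min(axis=0))           # ideal solution
    worst = np.where(benefit, X.min(axis=0), X.max(axis=0))           # negative-ideal solution
    d_pos = np.sqrt(((X - ideal) ** 2).sum(axis=1))
    d_neg = np.sqrt(((X - worst) ** 2).sum(axis=1))
    return d_neg / (d_pos + d_neg)                                    # in [0, 1]; higher is better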
Quantitative comparisons between the BHL method and five other methods are presented in Table 1, Table 2, Table 3, Table 4 and Table 5. When calculating the TOPSIS score, we performed the necessary order adjustments and normalization steps. The final TOPSIS scores range from 0 to 1, with higher values indicating better overall performance. As shown in Table 1, Table 2, Table 3, Table 4 and Table 5, the BHL method, which balances holistic and local information, achieved the highest TOPSIS scores. Due to its modeling and optimization techniques, the GMM algorithm achieved the best results in terms of the SSIM and PSNR. However, the performance of this algorithm varied with different hyperparameter settings, resulting in differing color effects. Consequently, the GMM method did not perform very well in the FID and PHV criteria. In contrast, the two deep learning methods, CAST and MCCNet, benefited from the robust color feature-capture capabilities of deep neural networks, achieving higher FID and PHV scores. However, these methods fell short in obtaining high textural similarity scores. The L2 divergence-based method showed severe distortions and artifacts in some images, leading to a low PHV score and, consequently, a lower comprehensive evaluation score. The MGGD method, which is based on optimal transport, performed well across all criteria. However, its lack of attention to detail resulted in numerous local artifacts, adversely affecting its evaluation score. Detailed visual comparisons are provided in the next subsection.

4.3. Visual Comparison

Figure 2, Figure 3, Figure 4, Figure 5 and Figure 6 present the visual results of the five experiments. Due to its modeling approach, the GMM method excelled at preserving the texture of the source images. This advantage was also reflected in the quantitative evaluation. However, its output was highly sensitive to hyperparameter selection, such as the number of iterations (set to 50 for all five experiments, consistent with [46]). As shown in Figure 4(3) and Figure 6(3), the colors of the example images were not accurately transferred, and the results retained the color style of the source images. Furthermore, in Figure 5(3) and Figure 7(2), significant white artifacts can be observed in the GMM results.
Compared to other methods, the L2 method produced a brighter color style in its results. For example, in Figure 3(4), the L2 result has a significantly brighter color style, but it also deviates from the color style of the example image. This is because the L2 method relies on local matching of the color palettes, and the accuracy of color transfer depends on the precision of this local matching. In Figure 7(3), green shadows are visible on the yellow petals in the L2 result. A similar issue occurs in Figure 7(10), where the petals exhibit a cyan tint.
The MGGD method shares some visual similarities with the BHL method, which also employs a similar modeling technique in its first stage. The key distinction between the MGGD and BHL methods lies in the local refinement in the subsequent steps of the BHL method. Specifically, as shown in Figure 7(4), the MGGD method produces unnatural purple hues on the water droplets, and the black area to the right of the droplets appears opaque and blurry due to the lack of local color adjustments.
Leveraging the powerful ability of convolutional neural networks to extract structural features from images, CAST and MCCNet demonstrated excellent structural fidelity. However, due to the inherent limitations of deep neural networks, such as rigidity in input and output sizes and challenges with generalization, some flaws were present in their results. For instance, in Figure 3(6,7), sharp black blocks appear in the background, which should have been blurred. Similarly, in Figure 7(5,6), the edges of the water droplets in the results of both methods appear as rigid straight lines rather than as curves.
The BHL method achieves a good balance between holistic color style and local structural details. By utilizing a more efficient training method (uRIG) and an innovative sampling strategy, it delivers a more robust color transfer effect. The advantages in terms of time efficiency are demonstrated in the subsequent section.

4.4. Time Efficiency

We also assessed the runtime efficiency of the six methods, with the results presented in Table 6. For deep learning methods, the runtime is generally divided into two phases: training and inference. In contrast, probabilistic methods estimate parameters dynamically for each individual dataset or image rather than relying on a pre-trained model. Consequently, their runtime cannot be compared to that of deep learning methods in the same manner, and the results are, therefore, presented differently.
For the four probabilistic methods, the first two rows of Table 6 display the average runtimes across five experiments. This analysis accounts for all processing steps involved in the four comparative methods, including data preprocessing, parameter estimation, and color transformation. The total number of pixels across the five image sets (i.e., the sample size for a single experiment) ranged from $5 \times 10^{5}$ to $1.6 \times 10^{6}$. The results indicate that due to the BHL method's use of the uRIG method and a novel sampling strategy based on SLIC, its average runtime across the five experiments was only 4.874 s, demonstrating a significant advantage over the other three probabilistic methods.
In contrast, for the two deep learning methods, CAST and MCCNet, the training times were 18 h and 59 h, respectively, with inference times of 0.011 s and 0.013 s. Compared to the probabilistic methods, CAST and MCCNet exhibited a substantial advantage in inference speed. However, the extended training times and the limited scope of the training datasets clearly constrained their ability to address all scenarios. For instance, in Figure 3(6), large black smudges obscure the petal contours, and in Figure 7(5,6), the water droplet contours are transformed into rigid lines. These issues did not arise with any of the four probabilistic methods.
To further validate the time efficiency benefits of the uRIG method, we conducted a simulation comparison experiment with other commonly used stochastic gradient-based optimization methods. The comparison included classic Stochastic Gradient Descent (SGD), Adam [62], and Affine-Invariant Gradient Descent (AIG) [63], which is equivalent to the classic Riemannian gradient descent method. The simulation involved 150 Monte Carlo simulations. Each experiment’s data followed a randomly generated MGGD, with the initial values provided by the method of moments. The averaged results are presented in Figure 8. The horizontal axis represents the number of iterations, while the vertical axis depicts the error calculated using the empirical Kullback–Leibler divergence.
As evident from Figure 8, the uRIG method demonstrates a clear advantage for the parameter estimation task in statistical models. This advantage becomes increasingly pronounced as the number of iterations grows, highlighting the accelerating effect of the Fisher information metric.

5. Conclusions

In conclusion, this paper presents a novel color transfer method that effectively balances holistic color style and local detail preservation within a statistical framework. By integrating optimal transport theory in the first stage for holistic color style transfer and utilizing a GMM in the second stage for local detail refinement, the BHL method addresses the inherent challenges of color transfer. The implementation of the unit-wise Riemannian information gradient (uRIG) method successfully tackles the complex optimization problems associated with these stages.
Extensive experimental results demonstrate that the BHL method significantly outperforms existing state-of-the-art techniques in both visual quality and objective evaluation criteria. The proposed method is not only effective but also efficient, with the capability to process high-resolution images in an average time of 4.874 s, making it suitable for practical applications where time constraints are critical.
Overall, the BHL method provides a robust and efficient solution for color transfer in image editing, paving the way for future advancements in the field. Future work may explore further optimization techniques and extend the method to other related image processing tasks.

Author Contributions

Conceptualization, J.Z. and N.W.; methodology, J.Z., Z.W. and S.W.; software, Z.W.; data curation, Z.W.; writing—original draft preparation, J.Z.; writing—review and editing, Z.W. and S.W.; supervision, N.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (grant no. 42301401, 62301497), and the China Postdoctoral Science Foundation (grant no. GZC20232385, 2024T170823).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Proof of Proposition 1 and Proposition 4

For the Multivariate Generalized Gaussian Distribution (MGGD), the Fisher information metric (FIM) of the scatter matrix Σ for elliptical distributions was first introduced in its general form in [50]. By substituting the parameters specific to the MGGD, the expression in Equation (18b) can be directly obtained. The FIM for the location parameter μ was previously introduced in [51] for the Gaussian distribution. Here, we derive the FIM of μ using the first two derivatives of the log-likelihood function of the MGGD. Recall the log-likelihood function of the MGGD:
$$\ell(x \mid \theta) = \ln c - \frac{1}{2} \ln |\Sigma| + \ln g_\beta(\delta),$$
where the symbol $\delta$ and the function $g_\beta$ are defined in Equations (5) and (6), respectively. The first and second derivatives of Equation (A1) with respect to $\mu$ are
(A2a) $\mathrm{d}\ell = -2\, \dfrac{\partial \ln g_\beta}{\partial \delta}\; \mathrm{d}\mu^{\dagger}\, \Sigma^{-1} (x - \mu)$,
(A2b) $\mathrm{d}^2\ell = 4\, \dfrac{\partial^2 \ln g_\beta}{\partial \delta^2}\; \mathrm{d}\mu^{\dagger}\, \Sigma^{-1} (x - \mu)(x - \mu)^{\dagger}\, \Sigma^{-1}\, \mathrm{d}\mu + 2\, \dfrac{\partial \ln g_\beta}{\partial \delta}\; \mathrm{d}\mu^{\dagger}\, \Sigma^{-1}\, \mathrm{d}\mu$.
Then, the negative expectation of the second derivative, $-\mathbb{E}_\theta[\mathrm{d}^2\ell]$, provides the FIM with respect to $\mu$. By employing the stochastic expression in Equation (8), a closed-form solution for this integral (expectation) can be obtained:
$$-\mathbb{E}_\theta\!\left[ \mathrm{d}^2\ell \right] = -\mathbb{E}_\theta\!\left[ \frac{4\delta}{d}\, \frac{\partial^2 \ln g_\beta}{\partial \delta^2} + 2\, \frac{\partial \ln g_\beta}{\partial \delta} \right] \mathrm{d}\mu^{\dagger}\, \Sigma^{-1}\, \mathrm{d}\mu = \frac{\beta \bigl[\, 2(\beta - 1) + d \,\bigr]\, 2^{\,1 - \frac{1}{\beta}}\; \Gamma\!\left( \frac{d + 2(\beta - 1)}{2\beta} \right)}{d\; \Gamma\!\left( \frac{d}{2\beta} \right)}\; \mathrm{d}\mu^{\dagger}\, \Sigma^{-1}\, \mathrm{d}\mu.$$
Finally, for the shape parameter β , which is a positive real number, its Fisher information is also a real-valued scalar. The value of the Fisher information for β is provided in [55] and can be directly used in Equation (18c).
Moving on to Gaussian Mixture Models (GMMs), the FIM of any m-th component ($m \in \{1, \dots, M\}$) is a special case of Equation (17). Setting $\mu = z_m$, $\Sigma = \sigma_m I$, and $\beta = 1$, the information coefficient and the FIM with respect to $z_m$ are
$$I_{z_m}\, u_{z_m}^{\dagger}\, (\sigma_m I)^{-1}\, v_{z_m} = \frac{u_{z_m}^{\dagger}\, v_{z_m}}{\sigma_m}, \qquad I_{z_m} = 1.$$
For the variance $\sigma_m$, the information coefficients are
$$I_{\sigma_m,1} = \frac{1}{2}, \qquad I_{\sigma_m,2} = 0.$$
Its FIM is then
$$I_{\sigma_m,1}\, \mathrm{tr}\!\left( \frac{u_{\sigma_m}\, v_{\sigma_m}}{\sigma_m^{2}}\, I \right) = \frac{d\, u_{\sigma_m}\, v_{\sigma_m}}{2\,\sigma_m^{2}}.$$
Since the parameter space is a product of these individual units, the uFIM is simply the sum of the FIMs for each unit.

Appendix B. Proof of Proposition 2 and Proposition 5

The Riemannian information gradient (RIG) of $C_h$ with respect to the scatter matrix $\Sigma$ was established in [56]. The RIG with respect to the location parameter $\mu$, associated with the uFIM, is obtained from the directional derivative
$$\bigl\langle \nabla_\mu C_h,\, u_\mu \bigr\rangle_\mu = \left. \frac{\mathrm{d}}{\mathrm{d}\varepsilon} \right|_{\varepsilon = 0} C_h(\mu + \varepsilon u_\mu) = -u_\mu^{\dagger}\, \frac{1}{L} \sum_{i=1}^{L} \beta\, \delta_i^{\beta - 1}\, \Sigma^{-1} (x_i - \mu) = \left\langle -\frac{1}{L} \sum_{i=1}^{L} \beta\, I_\mu^{-1}\, \delta_i^{\beta - 1}\, (x_i - \mu),\; u_\mu \right\rangle_\mu.$$
Solving this equation yields the RIG in Equation (21a). An analogous computation can be applied in the univariate case to obtain the RIG for parameter β .
Each individual component of a GMM is, in fact, a special case of the MGGD, i.e., a Gaussian distribution. Given that $I_{z_m} = 1$, the RIG with respect to $z_m$ is
$$\nabla_{z_m} C_\ell = -\frac{1}{N} \sum_{n=1}^{N} o_{m,n}\, I_{z_m}^{-1}\, \delta_{m,n}^{\,0}\, (e_n - z_m) = -\frac{1}{N} \sum_{n=1}^{N} o_{m,n}\, (e_n - z_m).$$
Then, for $\sigma_m$, by solving the following equation:
$$\frac{d\, \nabla_{\sigma_m} C_\ell\; v_{\sigma_m}}{2\,\sigma_m^{2}} = \frac{1}{N} \sum_{n=1}^{N} \frac{o_{m,n}}{2\,\sigma_m} \left( d - \frac{\| e_n - z_m \|^{2}}{\sigma_m} \right) v_{\sigma_m},$$
the expression in Equation (25b) can be obtained.

Appendix C. Proof of Proposition 3 and Proposition 6

In the context of the MGGD, a retraction on the manifold $\Theta$ is defined as a smooth mapping $\mathrm{Ret}_\theta$ from the tangent space $T_\theta\Theta$ onto $\Theta$, satisfying the following two specific properties (Definition 4.1.1 of [47]):
  • $\mathrm{Ret}_\theta$ maps the zero element $0_\theta$ of $T_\theta\Theta$ back to $\theta$:
    $$\forall \theta \in \Theta, \quad \mathrm{Ret}_\theta(0_\theta) = \theta.$$
  • For any element $u_\theta$ in $T_\theta\Theta$, the curve $\gamma(t) : t \mapsto \mathrm{Ret}_\theta(t\, u_\theta)$ satisfies $\dot{\gamma}(0) = u_\theta$, where
    $$\dot{\gamma}(0) = \left. \frac{\mathrm{d}}{\mathrm{d}t} \right|_{t=0} \gamma(t).$$
Setting $u_\theta = (u_\mu, u_\Sigma, u_\beta) = 0_\theta$, for any point $\theta$ we have
$$\mathrm{Ret}_\theta(0_\theta) = \bigl( \mu + 0,\;\; \Sigma\, \mathrm{Exp}(\Sigma^{-1}\, 0),\;\; \beta \exp(0/\beta) \bigr) = (\mu,\, \Sigma,\, \beta) = \theta,$$
which verifies property 1. Additionally, for the curve $\gamma(t) = \mathrm{Ret}_\theta(t\, u_\theta)$ evaluated at $t = 0$,
$$\dot{\gamma}(0) = \bigl( u_\mu,\;\; \Sigma\, \mathrm{Exp}(\Sigma^{-1}\, 0)\, \Sigma^{-1} u_\Sigma,\;\; \beta \exp(0/\beta)\, \beta^{-1} u_\beta \bigr) = (u_\mu,\, u_\Sigma,\, u_\beta) = u_\theta,$$
which confirms property 2 for the retraction in Equation (22) of the MGGD.
Similar arguments extend to the case of a GMM with multiple components ($m = 1, \dots, M$). Both properties 1 and 2 can be verified:
$$\mathrm{Ret}_\omega(0_\omega) = \bigl( \dots,\; z_m + 0,\;\; \sigma_m \exp(0/\sigma_m),\; \dots \bigr) = \bigl( \dots,\; z_m,\; \sigma_m,\; \dots \bigr) = \omega,$$
$$\dot{\gamma}(0_\omega) = \bigl( \dots,\; u_{z_m},\;\; \sigma_m \exp(0/\sigma_m)\, \sigma_m^{-1} u_{\sigma_m},\; \dots \bigr) = \bigl( \dots,\; u_{z_m},\; u_{\sigma_m},\; \dots \bigr) = u_\omega.$$
Since properties 1 and 2 hold, Equation (26) is a valid retraction on $\Omega$ in the GMM case as well.

Appendix D. Proof of Proposition 7

In this proof, we utilize the notations introduced in Section 3.4: $\omega$ denotes the parameter, and $\Omega$ denotes the parameter space. The proof is based on Remark 2 in [56]. Let $u(\omega^{(k)})$ be the current descent direction. According to Remark 2 in [56], if the following condition holds:
$$\mathbb{E}\bigl[ \bigl\langle u(\omega^{(k)}),\; \nabla C(\omega^{(k)}) \bigr\rangle \bigr] < 0, \quad \forall k > 0,$$
then, almost surely, we have
$$\lim_{k \to \infty} \omega^{(k)} = \omega^*.$$
As stated in Section 3.4, the cost $C(\omega)$ has an isolated stationary point at $\omega = \omega^*$, and there exists a compact and convex neighborhood $U \subset \Omega$ of $\omega^*$ such that the sequence generated by Algorithm 2 (or Algorithm 3) remains within $U$. For any k, the descent direction is $u(\omega^{(k)}) = -\nabla C(\omega^{(k)})$, so that
$$\mathbb{E}\bigl[ \bigl\langle u(\omega^{(k)}),\; \nabla C(\omega^{(k)}) \bigr\rangle \bigr] = -\,\mathbb{E}\bigl[ \bigl\| \nabla C(\omega^{(k)}) \bigr\|^{2} \bigr] < 0.$$
This verifies the condition of Remark 2 in [56], and therefore,
$$\mathbb{P}\Bigl( \lim_{k \to \infty} \omega^{(k)} = \omega^* \Bigr) = 1.$$

References

1. Kotovenko, D.; Sanakoyeu, A.; Lang, S.; Ommer, B. Content and style disentanglement for artistic style transfer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 4422–4431.
2. Li, Y.; Liu, M.Y.; Li, X.; Yang, M.H.; Kautz, J. A closed-form solution to photorealistic image stylization. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 453–468.
3. Nam, S.; Ma, C.; Chai, M.; Brendel, W.; Xu, N.; Kim, S.J. End-to-end time-lapse video synthesis from a single outdoor image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1409–1418.
4. Shih, Y.; Paris, S.; Durand, F.; Freeman, W.T. Data-driven hallucination of different times of day from a single outdoor photo. ACM Trans. Graph. (TOG) 2013, 32, 1–11.
5. Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An underwater image enhancement benchmark dataset and beyond. IEEE Trans. Image Process. 2019, 29, 4376–4389.
6. Li, K.; Wu, L.; Qi, Q.; Liu, W.; Gao, X.; Zhou, L.; Song, D. Beyond single reference for training: Underwater image enhancement via comparative learning. IEEE Trans. Circuits Syst. Video Technol. 2022, 33, 2561–2576.
7. Reinhard, E.; Adhikhmin, M.; Gooch, B.; Shirley, P. Color transfer between images. IEEE Comput. Graph. Appl. 2001, 21, 34–41.
8. Xiao, X.; Ma, L. Color transfer in correlated color space. In Proceedings of the 2006 ACM International Conference on Virtual Reality Continuum and Its Applications, Hong Kong, China, 14–17 June 2006; pp. 305–309.
9. Pitié, F.; Kokaram, A.C.; Dahyot, R. Automated colour grading using colour distribution transfer. Comput. Vis. Image Underst. 2007, 107, 123–137.
10. Pitie, F.; Kokaram, A. The linear Monge-Kantorovitch linear colour mapping for example-based colour transfer. In Proceedings of the 4th European Conference on Visual Media Production, London, UK, 27–28 November 2007; pp. 1–9.
11. Ferradans, S.; Papadakis, N.; Peyré, G.; Aujol, J.F. Regularized discrete optimal transport. SIAM J. Imaging Sci. 2014, 7, 1853–1882.
12. Frigo, O.; Sabater, N.; Demoulin, V.; Hellier, P. Optimal transportation for example-guided color transfer. In Proceedings of the Asian Conference on Computer Vision, Singapore, 1–5 November 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 655–670.
13. Rabin, J.; Papadakis, N. Non-convex relaxation of optimal transport for color transfer between images. In Proceedings of the Geometric Science of Information: Second International Conference (GSI 2015), Palaiseau, France, 28–30 October 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 87–95.
14. Hristova, H.; Le Meur, O.; Cozot, R.; Bouatouch, K. Transformation of the multivariate generalized Gaussian distribution for image editing. IEEE Trans. Vis. Comput. Graph. 2017, 24, 2813–2826.
15. Tai, Y.W.; Jia, J.; Tang, C.K. Local color transfer via probabilistic segmentation by expectation-maximization. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 747–754.
16. Tai, Y.W.; Jia, J.; Tang, C.K. Soft color segmentation and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 1520–1537.
17. Xiang, Y.; Zou, B.; Li, H. Selective color transfer with multi-source images. Pattern Recognit. Lett. 2009, 30, 682–689.
18. Hristova, H.; Le Meur, O.; Cozot, R.; Bouatouch, K. Style-aware robust color transfer. In Proceedings of the CAe@Expressive, Istanbul, Turkey, 20–22 June 2015; pp. 67–77.
19. Wu, F.; Dong, W.; Kong, Y.; Mei, X.; Paul, J.C.; Zhang, X. Content-based colour transfer. In Computer Graphics Forum; Wiley Online Library: New York, NY, USA, 2013; Volume 32, pp. 190–203.
20. Han, Y.; Xu, C.; Baciu, G.; Li, M.; Islam, M.R. Cartoon and texture decomposition-based color transfer for fabric images. IEEE Trans. Multimed. 2016, 19, 80–92.
21. Giraud, R.; Ta, V.T.; Papadakis, N. Superpixel-based color transfer. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 700–704.
22. Grogan, M.; Dahyot, R. L2 divergence for robust colour transfer. Comput. Vis. Image Underst. 2019, 181, 39–49.
23. Bonneel, N.; Sunkavalli, K.; Paris, S.; Pfister, H. Example-based video color grading. ACM Trans. Graph. 2013, 32, 1–12.
24. Su, Z.; Zeng, K.; Liu, L.; Li, B.; Luo, X. Corruptive artifacts suppression for example-based color transfer. IEEE Trans. Multimed. 2014, 16, 988–999.
25. Hwang, Y.; Lee, J.Y.; So Kweon, I.; Joo Kim, S. Color transfer using probabilistic moving least squares. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 3342–3349.
26. Nguyen, R.M.; Kim, S.J.; Brown, M.S. Illuminant aware gamut-based color transfer. In Computer Graphics Forum; Wiley Online Library: New York, NY, USA, 2014; Volume 33, pp. 319–328.
27. Gong, H.; Finlayson, G.D.; Fisher, R.B. Recoding color transfer as a color homography. arXiv 2016, arXiv:1608.01505.
28. Gong, H.; Finlayson, G.D.; Fisher, R.B.; Fang, F. 3D color homography model for photo-realistic color transfer re-coding. Vis. Comput. 2019, 35, 323–333.
29. Wang, D.; Zou, C.; Li, G.; Gao, C.; Su, Z.; Tan, P. L0 gradient-preserving color transfer. In Computer Graphics Forum; Wiley Online Library: New York, NY, USA, 2017; Volume 36, pp. 93–103.
30. Oskarsson, M. Robust image-to-image color transfer using optimal inlier maximization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 786–795.
31. Luan, F.; Paris, S.; Shechtman, E.; Bala, K. Deep photo style transfer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4990–4998.
32. Liu, D.; Jiang, Y.; Pei, M.; Liu, S. Emotional image color transfer via deep learning. Pattern Recognit. Lett. 2018, 110, 16–22.
33. Lee, J.; Son, H.; Lee, G.; Lee, J.; Cho, S.; Lee, S. Deep color transfer using histogram analogy. Vis. Comput. 2020, 36, 2129–2143.
34. Zhou, Y.; Barnes, C.; Shechtman, E.; Amirghodsi, S. Transfill: Reference-guided image inpainting by merging multiple color and spatial transformations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 2266–2276.
35. Xia, X.; Zhang, M.; Xue, T.; Sun, Z.; Fang, H.; Kulis, B.; Chen, J. Joint bilateral learning for real-time universal photorealistic style transfer. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part VIII 16; Springer: Berlin/Heidelberg, Germany, 2020; pp. 327–342.
36. Wan, D.; Shen, F.; Liu, L.; Zhu, F.; Huang, L.; Yu, M.; Shen, H.T.; Shao, L. Deep quantization generative networks. Pattern Recognit. 2020, 105, 107338.
37. Liao, J.; Yao, Y.; Yuan, L.; Hua, G.; Kang, S.B. Visual attribute transfer through deep image analogy. arXiv 2017, arXiv:1705.01088.
38. He, M.; Liao, J.; Chen, D.; Yuan, L.; Sander, P.V. Progressive color transfer with dense semantic correspondences. ACM Trans. Graph. (TOG) 2019, 38, 1–18.
39. He, M.; Chen, D.; Liao, J.; Sander, P.V.; Yuan, L. Deep exemplar-based colorization. ACM Trans. Graph. (TOG) 2018, 37, 1–16.
40. Huang, Y.; Qiu, S.; Wang, C.; Li, C. Learning representations for high-dynamic-range image color transfer in a self-supervised way. IEEE Trans. Multimed. 2020, 23, 176–188.
41. Zhang, Y.; Tang, F.; Dong, W.; Huang, H.; Ma, C.; Lee, T.Y.; Xu, C. Domain enhanced arbitrary image style transfer via contrastive learning. In Proceedings of the ACM SIGGRAPH 2022 Conference Proceedings, Vancouver, BC, Canada, 7–11 August 2022; pp. 1–8.
42. Kong, X.; Deng, Y.; Tang, F.; Dong, W.; Ma, C.; Chen, Y.; He, Z.; Xu, C. Exploring the temporal consistency of arbitrary style transfer: A channelwise perspective. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 8482–8496.
43. Faridul, H.S.; Pouli, T.; Chamaret, C.; Stauder, J.; Trémeau, A.; Reinhard, E. A Survey of Color Mapping and its Applications. Eurographics (State Art Rep.) 2014, 3, 1.
44. Liu, S. An overview of color transfer and style transfer for images and videos. arXiv 2022, arXiv:2204.13339.
45. Fang, K. Symmetric Multivariate and Related Distributions; CRC Press: Boca Raton, FL, USA, 2018.
46. Gu, C.; Lu, X.; Zhang, C. Example-based color transfer with Gaussian mixture modeling. Pattern Recognit. 2022, 129, 108716.
47. Absil, P.A.; Mahony, R.; Sepulchre, R. Optimization Algorithms on Matrix Manifolds; Princeton University Press: Princeton, NJ, USA, 2008.
48. Amari, S.I. Information Geometry and Its Applications; Springer: Berlin/Heidelberg, Germany, 2016; Volume 194.
49. Amari, S.I. Natural gradient works efficiently in learning. Neural Comput. 1998, 10, 251–276.
50. Berkane, M.; Oden, K.; Bentler, P.M. Geodesic estimation in elliptical distributions. J. Multivar. Anal. 1997, 63, 35–46.
51. Besson, O.; Abramovich, Y.I. On the Fisher information matrix for multivariate elliptically contoured distributions. IEEE Signal Process. Lett. 2013, 20, 1130–1133.
52. Verdoolaege, G.; De Backer, S.; Scheunders, P. Multiscale colour texture retrieval using the geodesic distance between multivariate generalized Gaussian models. In Proceedings of the 2008 15th IEEE International Conference on Image Processing, San Diego, CA, USA, 12–15 October 2008; pp. 169–172.
53. Ollivier, Y. Riemannian metrics for neural networks. Inf. Inference J. IMA 2013, 2.
54. Pennec, X.; Fillard, P.; Ayache, N. A Riemannian framework for tensor computing. Int. J. Comput. Vis. 2006, 66, 41–66.
55. Verdoolaege, G.; Scheunders, P. Geodesics on the Manifold of Multivariate Generalized Gaussian Distributions with an Application to Multicomponent Texture Discrimination. Int. J. Comput. Vis. 2011, 95, 265–286.
56. Zhou, J.; Said, S. Fast, Asymptotically Efficient, Recursive Estimation in a Riemannian Manifold. Entropy 2019, 21, 1021.
57. Burt, P.J.; Adelson, E.H. The Laplacian pyramid as a compact image code. In Readings in Computer Vision; Elsevier: Amsterdam, The Netherlands, 1987; pp. 671–679.
58. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
59. Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Adv. Neural Inf. Process. Syst. 2017, 30.
60. Liu, S.; Zhang, B.; Liu, Y.; Han, A.; Shi, H.; Guan, T.; He, Y. Unpaired stain transfer using pathology-consistent constrained generative adversarial networks. IEEE Trans. Med. Imaging 2021, 40, 1977–1989.
61. Alaoui, M. Fuzzy TOPSIS: Logic, Approaches, and Case Studies; CRC Press: Boca Raton, FL, USA, 2021.
62. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
63. Bonnabel, S. Stochastic gradient descent on Riemannian manifolds. IEEE Trans. Autom. Control 2013, 58, 2217–2229.
Figure 1. Overview of the BHL method.
Figure 2. Source image: purple flower; Example image: tomato.
Figure 3. Source image: blue flower; Example image: mountain.
Figure 4. Source image: clusters; Example image: parrot.
Figure 5. Source image: pink flower; Example image: sunflower.
Figure 6. Source image: bouquet; Example image: seaside.
Figure 7. Comparison of details with zoomed-in images.
Figure 8. Comparison of uRIG with SGD, AIG, and Adam.
Table 1. X–E: purple flower–tomato.

Method     GMM [46]   L2 [22]   MGGD [14]   CAST [41]   MCCNet [42]   BHL
SSIM↑      0.972      0.918     0.904       0.610       0.808         0.881
PSNR↑      29.4       22.9      22.1        19.0        22.9          20.3
FID↓       0.037      0.018     0.006       0.014       0.027         0.004
PHV↓       0.730      0.756     0.717       0.691       0.746         0.732
TOPSIS↑    0.260      0.231     0.596       0.211       0.149         0.809

The method with the highest TOPSIS score is highlighted in bold. An upward arrow following a criterion indicates that a higher score on this criterion represents better performance. Conversely, a downward arrow indicates that a lower score on this criterion represents better performance.
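For readers unfamiliar with the aggregate score, the following minimal sketch shows how a standard TOPSIS closeness value could be computed from the four criteria in Table 1, with the arrows mapped to benefit (↑) and cost (↓) criteria. It assumes equal criterion weights and plain vector normalization; the paper uses a fuzzy TOPSIS variant [61], so the reported scores will differ from this illustration.

```python
import numpy as np

# Illustrative standard TOPSIS on the Table 1 criteria (equal weights assumed).
scores = np.array([            # rows: GMM, L2, MGGD, CAST, MCCNet, BHL
    [0.972, 29.4, 0.037, 0.730],
    [0.918, 22.9, 0.018, 0.756],
    [0.904, 22.1, 0.006, 0.717],
    [0.610, 19.0, 0.014, 0.691],
    [0.808, 22.9, 0.027, 0.746],
    [0.881, 20.3, 0.004, 0.732],
])                             # columns: SSIM↑, PSNR↑, FID↓, PHV↓
benefit = np.array([True, True, False, False])   # ↑ = benefit, ↓ = cost

norm = scores / np.linalg.norm(scores, axis=0)   # column-wise vector normalization
ideal = np.where(benefit, norm.max(axis=0), norm.min(axis=0))
worst = np.where(benefit, norm.min(axis=0), norm.max(axis=0))
d_best = np.linalg.norm(norm - ideal, axis=1)    # distance to the ideal solution
d_worst = np.linalg.norm(norm - worst, axis=1)   # distance to the anti-ideal solution
closeness = d_worst / (d_best + d_worst)         # higher = closer to the ideal
print(np.round(closeness, 3))
```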
Table 2. X–E: blue flower–mountain.

Method     GMM [46]   L2 [22]   MGGD [14]   CAST [41]   MCCNet [42]   BHL
SSIM↑      0.902      0.782     0.809       0.488       0.574         0.706
PSNR↑      27.6       15.7      24.6        19.5        18.7          22.5
FID↓       0.066      0.119     0.012       0.020       0.020         0.007
PHV↓       0.715      0.758     0.736       0.701       0.682         0.720
TOPSIS↑    0.337      0.187     0.608       0.304       0.308         0.816

Bold and arrows are the same as in Table 1.
Table 3. X–E: clusters–parrot.

Method     GMM [46]   L2 [22]   MGGD [14]   CAST [41]   MCCNet [42]   BHL
SSIM↑      0.957      0.688     0.832       0.477       0.566         0.844
PSNR↑      30.9       16.3      19.5        17.5        16.9          19.9
FID↓       0.056      0.045     0.015       0.016       0.135         0.013
PHV↓       0.792      0.769     0.807       0.733       0.723         0.803
TOPSIS↑    0.475      0.240     0.673       0.525       0.097         0.707

Bold and arrows are the same as in Table 1.
Table 4. X–E: pink flower–sunflower.

Method     GMM [46]   L2 [22]   MGGD [14]   CAST [41]   MCCNet [42]   BHL
SSIM↑      0.616      0.542     0.404       0.292       0.277         0.434
PSNR↑      22.6       16.3      10.4        11.4        12.6          11.2
FID↓       0.175      0.053     0.007       0.007       0.032         0.003
PHV↓       0.735      0.743     0.684       0.669       0.635         0.662
TOPSIS↑    0.350      0.257     0.317       0.305       0.103         0.706

Bold and arrows are the same as in Table 1.
Table 5. X–E: bouquet–seaside.

Method     GMM [46]   L2 [22]   MGGD [14]   CAST [41]   MCCNet [42]   BHL
SSIM↑      0.775      0.600     0.524       0.318       0.298         0.532
PSNR↑      32.1       17.6      12.9        13.6        14.1          12.5
FID↓       0.186      0.046     0.003       0.012       0.029         0.003
PHV↓       0.712      0.699     0.715       0.675       0.612         0.709
TOPSIS↑    0.457      0.256     0.499       0.156       0.098         0.589

Bold and arrows are the same as in Table 1.
Table 6. Running times of different methods. For the MGGD method, parameter estimation is performed using a fixed-point iteration method.

Probabilistic methods    BHL       L2        GMM         MGGD
Running time             4.874 s   7.523 s   348.754 s   54.897 s

Deep learning methods    CAST      MCCNet
Training time            18 h      59 h
Inference time           0.011 s   0.013 s
