Article

Pseudo-Panel Decomposition of the Blinder–Oaxaca Gender Wage Gap

by Jhon James Mora * and Diana Yaneth Herrera
Departamento de Economía, Universidad Icesi, Cali 760031, Colombia
* Author to whom correspondence should be addressed.
Econometrics 2025, 13(3), 27; https://doi.org/10.3390/econometrics13030027 (registering DOI)
Submission received: 29 May 2025 / Revised: 12 July 2025 / Accepted: 15 July 2025 / Published: 19 July 2025

Abstract

This article introduces a novel approach to decomposing the Blinder–Oaxaca gender wage gap using pseudo-panel data. In many developing countries, panel data are not available; however, understanding the evolution of the gender wage gap over time requires tracking individuals longitudinally. When individuals change across time periods, estimators tend to be inconsistent and inefficient. To address this issue, and building upon the traditional Blinder–Oaxaca methodology, we propose an alternative procedure that follows cohorts over time rather than individuals. This approach enables the estimation of both the explained and unexplained components (the “endowment effect” and the “remuneration effect”, respectively) of the wage gap, along with their respective standard errors, even in the absence of true panel data. We apply this methodology to the case of Colombia, finding a gender wage gap of approximately 15% in favor of male cohorts. Without controls, this gap comprises a −5.6% explained component and a 20% unexplained component. When we control for informality, firm size, and sector, the gap comprises a −3.5% explained component and an 18.7% unexplained component.
JEL Classification:
J15; J16; J31; J70; C23

1. Introduction

Measuring wage gaps between groups over time ideally requires panel data. However, such data are often unavailable in most developing countries due to the high costs associated with tracking individuals over time. Instead, these countries typically rely on repeated cross-sectional household surveys that are representative at each point in time but do not follow the same individuals across periods. For example, Colombia conducts periodic household surveys with representative samples, but the individuals surveyed differ across survey waves.
Some researchers address this limitation by pooling cross-sectional data and including time dummies to estimate consistent parameters. However, this approach is inefficient in the presence of measurement error arising from unobserved heterogeneity that varies across time. Under these circumstances, estimators may become inconsistent, and a preferable alternative is the use of pseudo-panel data, as proposed by Deaton (1985) and further developed by Mora and Muro (2014).
Consider now the estimation of wage gaps between two groups—typically men and women—using pseudo-panel data. Measurement errors persist because gender wage gaps often reflect unobserved individual heterogeneity that varies across groups. Furthermore, these estimates may be inconsistent in the presence of selection bias.
In cross-sectional data, the Blinder–Oaxaca decomposition is commonly applied to estimate the wage gap, and adjustments for selection bias are often included. To enhance efficiency, Jann (2005) provides a method to estimate the variance–covariance matrix of the decomposition. In the context of panel data, Kröger and Hartmann (2021) present an approach that extends the Kitagawa–Oaxaca–Blinder decomposition method to analyze wage differentials over time. However, to date, no consistent and efficient methodology exists for estimating the gender wage gap and its twofold decomposition using pseudo-panel data. This is due to the fact that individuals in period t differ from those in period t − 1, and unobserved heterogeneity is not constant over time.
Wage gaps between men and women are a matter of global importance. Equal pay is one of the guiding principles of the International Labour Organization (ILO) and a key target of the United Nations Sustainable Development Goals. Moreover, Buchely (2013) argues that gender inequality in the labor market imposes inefficiencies on society, as the costs associated with women’s disadvantage are externalized through the social security system, particularly in health and pensions.
The literature on the gender wage gap is extensive. Among international studies, Paz (1998) estimates the wage gap in Greater Buenos Aires and the Northwest of Argentina using data from the Permanent Household Survey. The income disparity between women and men is 0.70 for the overall population and 0.60 among individuals with spouses. The Blinder–Oaxaca decomposition indicates that approximately 90% of the wage gap remains unexplained by differences in human capital. Di Paola and Berges (2000) analyze gender income differences in Mar del Plata, Argentina, employing the Blinder–Oaxaca decomposition and correcting for selection bias using the Heckman method. Their findings suggest that 78% of the wage gap is explained by human capital endowments, while the remaining 22% is attributable to discrimination.
Johansson et al. (2005) studied the gender wage gap in Sweden during the 1980s and 1990s using cross-sectional data from the Swedish Household Income Survey. Their results show a gap of approximately 13% in the 1980s and 15% in the 1990s, with the unexplained portion ranging between 5% and 9%. Watson (2010) analyzes the gender wage gap among full-time managers in Australia from 2001 to 2007, using data from the Household, Income and Labour Dynamics in Australia Survey. He finds that female managers earn about 27% less than their male counterparts, and that the unexplained portion of the gap—remuneration effect—ranges from 65% to 90%, depending on the decomposition method used.
Biltagy (2014) examines wage disparities in Egypt using data from the 2006 Egyptian Labour Market Panel Survey. The Blinder–Oaxaca decomposition reveals a gender wage gap of 25%, attributed entirely to discrimination against women. Blau and Kahn (2017) studied changes in the gender wage gap in the United States between 1980 and 2010 using microdata from the Panel Study of Income Dynamics. They found that the unexplained component of the wage gap—remuneration effect—declined from 49% in 1980 to 38% in 2010.
Several studies also focus on the Colombian labor market. Baquero (2001) applies the Oaxaca (1973) decomposition using data from the National Household Survey and finds a wage gap of approximately 34% in favor of men in 1999. Abadía (2005) examines statistical discrimination by gender using data from the Continuous Household Survey for the second quarter of 2003, distinguishing between public and private sector workers. While no discrimination is found in the public sector, evidence of discrimination exists in the private sector, particularly against married or cohabiting women. Bernat (2005) analyzes hourly wage differences in Colombia’s seven major cities from 2000 to 2003 and concludes, based on a Blinder–Oaxaca decomposition, that gender discrimination persists. Fernández (2006), using Quality of Life Survey data and quantile regressions for 1997–2003, shows that wage differences favoring men are concentrated in the upper percentiles of the wage distribution, while in lower percentiles, the differences tend to favor women.
This article contributes to two strands of the literature on the gender wage gap. First, it extends the Blinder (1973) and Oaxaca (1973) decomposition to pseudo-panel data, enabling analysis in contexts where traditional panel data are unavailable. Second, it adapts the correction proposed by Jann (2005) for estimating the variance–covariance matrix of the decomposition to the pseudo-panel framework. Finally, the proposed methodology is applied to the Colombian case, serving as an illustrative example for a developing country context.
The remainder of the article is organized as follows: Section 2 reviews the literature on the Blinder–Oaxaca decomposition. Section 3 presents the proposed pseudo-panel adaptation of the Blinder–Oaxaca decomposition and the corresponding variance–covariance matrix following Jann (2005). Section 4 applies this methodology to estimate the gender wage gap in Colombia. Section 5 concludes.

2. Blinder–Oaxaca Decomposition

The most widely employed technique for assessing the gender wage gap is the Blinder–Oaxaca decomposition (Blinder, 1973; Oaxaca, 1973). This method disaggregates the observed wage differential into two main components. The first component reflects differences in observable productivity-related characteristics (e.g., education, experience), while the second captures differences in the returns to those characteristics, together with disparities due to unobservable factors, including discrimination.
Consider two groups: men and women. The first step in the decomposition involves estimating wage equations for each group g∈{m, w}, where individual wages are modeled as follows:
$$\ln W_g = f(S_g, Exp_g, Exp_g^2), \quad g = \text{men}, \text{women}. \tag{1}$$
In Equation (1), $\ln W$ denotes the natural logarithm of hourly wages, $S$ represents years of schooling, $Exp$ corresponds to potential labor market experience (calculated as age minus years of schooling minus six), and $Exp^2$ is the square of potential experience. This specification follows the human capital framework proposed by Mincer (1974), who argued that the returns to education can be quantified through an income equation based on an individual’s educational attainment and work experience. The Mincer equation predicts a positive relationship between years of schooling and earnings. However, in the case of Colombia, this theoretical expectation does not fully materialize for women. Despite having, on average, higher levels of schooling than men, women continue to experience lower wages. This indicates that the returns to human capital differ significantly between men and women, suggesting the presence of structural inequalities in the labor market that may not be explained solely by differences in observable characteristics.
With respect to Equation (1), the difference in average wages between the two groups can be expressed as follows:
$$\bar{w}_m - \bar{w}_w = (\bar{X}_m - \bar{X}_w)\hat{\beta}_m + (\hat{\beta}_m - \hat{\beta}_w)\bar{X}_w \tag{2}$$
where $\bar{w}_g$ and $\bar{X}_g$ denote the mean of the logarithm of wages and the mean of the control characteristics for group $g$, respectively, and $\hat{\beta}_g$ is the estimated parameter vector from Equation (1). The wage gap can thus be decomposed into two components: the explained component or “endowment effect”, which reflects differences in observable productive characteristics between the groups, and the unexplained component or “remuneration effect”, which captures the portion of the wage differential that cannot be attributed to such characteristics and is often interpreted as a result of discrimination or other unobserved factors.
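The twofold decomposition in Equation (2) can be illustrated with a minimal Python sketch. It assumes simple OLS wage equations with an intercept, uses men's coefficients as the reference, and runs on synthetic data; the function names (`ols`, `twofold_decomposition`, `simulate`) and all numerical values are hypothetical and are not taken from the article.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for y = X b + e (X already contains a constant column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def twofold_decomposition(X_m, y_m, X_w, y_w):
    """Twofold Blinder-Oaxaca decomposition with men's coefficients as reference."""
    b_m, b_w = ols(X_m, y_m), ols(X_w, y_w)
    xbar_m, xbar_w = X_m.mean(axis=0), X_w.mean(axis=0)
    explained = (xbar_m - xbar_w) @ b_m    # endowment effect
    unexplained = xbar_w @ (b_m - b_w)     # remuneration effect
    gap = y_m.mean() - y_w.mean()          # raw difference in mean log wages
    return gap, explained, unexplained

# Synthetic (made-up) data: women have one more year of schooling on average.
rng = np.random.default_rng(0)

def simulate(n, school_mean, beta):
    s = rng.normal(school_mean, 2.0, n)
    exp_ = rng.uniform(0, 40, n)
    X = np.column_stack([np.ones(n), s, exp_, exp_ ** 2])
    y = X @ beta + rng.normal(0, 0.3, n)
    return X, y

X_m, y_m = simulate(2000, 10.0, np.array([1.0, 0.09, 0.03, -0.0005]))
X_w, y_w = simulate(2000, 11.0, np.array([0.9, 0.08, 0.03, -0.0005]))
gap, explained, unexplained = twofold_decomposition(X_m, y_m, X_w, y_w)
```

Because each regression includes a constant, the fitted mean equals the sample mean, so `explained + unexplained` reproduces `gap` up to floating-point error. In this synthetic setup the explained part is negative, mirroring a situation in which women hold the better endowments.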
In recent decades, applications of the Blinder–Oaxaca decomposition often omitted statistical inference information, such as standard errors and confidence intervals. However, interpreting decomposition results without reference to the precision of the estimates significantly limits their reliability and analytical value.
Oaxaca and Ransom (1998) and Greene (2003) proposed methods to approximate these standard errors, assuming fixed regressors. This assumption neglects a critical source of statistical uncertainty, which may lead to biased inference in most empirical applications (Jann, 2005). In particular, treating the regressors as non-stochastic tends to substantially underestimate the standard errors associated with the explained component–endowment effect of the wage gap.
In response to these limitations, Jann (2005) developed unbiased variance estimators for the components of the Blinder–Oaxaca decomposition. Suppose that
$$\tilde{Y} = \tilde{X}'\hat{\beta} \tag{3}$$
where $\tilde{X}$ is a vector of sample means and $\hat{\beta}$ is a vector of regression coefficients. The sampling variance $V(\tilde{X}'\hat{\beta})$ can be estimated as follows:
(a) If the regressors are fixed, then $\tilde{X}$ is constant and has no sampling variance. Therefore, $V(\tilde{X}'\hat{\beta}) = \tilde{X}'V(\hat{\beta})\tilde{X}$.
(b) However, in most applications, the regressors, and hence $\tilde{X}$, are stochastic. Since $\tilde{X}$ and $\hat{\beta}$ are uncorrelated (which holds as long as $\mathrm{Cov}(\epsilon, X) = 0$), the sampling variance is as follows (Jann, 2008):
$$\hat{V}(\tilde{X}'\hat{\beta}) = \tilde{X}'\hat{V}(\hat{\beta})\tilde{X} + \hat{\beta}'\hat{V}(\tilde{X})\hat{\beta} + \mathrm{trace}\{\hat{V}(\tilde{X})\hat{V}(\hat{\beta})\} \tag{4}$$
where $\mathrm{trace}\{\hat{V}(\tilde{X})\hat{V}(\hat{\beta})\}$ disappears asymptotically and $\hat{V}(\hat{\beta})$ is the variance–covariance matrix obtained from the regression.
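To see how much the fixed-regressor assumption in case (a) can understate the variance relative to case (b), the following Python sketch evaluates both expressions. The function name `var_mean_times_beta` and the numbers in the example are hypothetical; they are chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def var_mean_times_beta(xbar, V_xbar, beta, V_beta, fixed_regressors=False):
    """Variance of xbar' beta_hat.

    fixed_regressors=True  -> case (a): xbar' V(beta) xbar
    fixed_regressors=False -> case (b): adds beta' V(xbar) beta and the trace term
    """
    xbar = np.asarray(xbar, dtype=float)
    beta = np.asarray(beta, dtype=float)
    V_xbar = np.asarray(V_xbar, dtype=float)
    V_beta = np.asarray(V_beta, dtype=float)
    v = xbar @ V_beta @ xbar
    if not fixed_regressors:
        v += beta @ V_xbar @ beta + np.trace(V_xbar @ V_beta)
    return float(v)

# One-regressor illustration with made-up values.
v_fixed = var_mean_times_beta([2.0], [[0.04]], [3.0], [[0.01]], fixed_regressors=True)
v_stoch = var_mean_times_beta([2.0], [[0.04]], [3.0], [[0.01]])
# v_fixed = 2*0.01*2 = 0.04
# v_stoch = 0.04 + 3*0.04*3 + 0.04*0.01 = 0.4004
```

In this toy case the trace term contributes only 0.0004, which is in line with the statement that it vanishes asymptotically, while the term involving the variance of the means dominates.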

3. Pseudo-Panel Approach to the Blinder–Oaxaca Decomposition

In the absence of longitudinal panel data that track the same individuals over time, it is necessary to employ pseudo-panel data methods. Pseudo-panels consist of observations drawn from different individuals across various time periods—that is, the individuals observed at time t differ from those observed at time t − 1. Utilizing the pseudo-panel approach enables the consistent and efficient application of the Blinder–Oaxaca decomposition when only cohort-level tracking is feasible.
Deaton (1985) introduced the concept of pseudo-panels as a method for exploiting repeated cross-sectional surveys. This approach entails grouping individuals into synthetic cohorts based on time-invariant and exogenous characteristics, such as age and gender.
As previously noted, the foundational idea of pseudo-panels is to construct cohorts composed of individuals who exhibit similar behavioral patterns (Guillerm, 2017). For instance, in the case of Colombia, where the age of legal adulthood is 18, it is appropriate to form nine five-year cohorts spanning the working-age population, specifically individuals aged 18 to 63. These cohorts approximate different stages of the employment life cycle.
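The cohort construction can be sketched in Python as follows. This is only an illustration under stated assumptions: cohorts are fixed by age in a base year (equivalently, by birth year) so that the same group is followed across survey waves, the base year is taken to be 2016, and age 63 is absorbed into the ninth cohort. The names `cohort_id` and `cohort_means` and the record fields are hypothetical.

```python
from collections import defaultdict

def cohort_id(age, survey_year, base_year=2016, start_age=18, end_age=63,
              width=5, n_cohorts=9):
    """Return a cohort index in 0..n_cohorts-1, or None if outside the age range.

    The cohort is fixed by the person's age in the base year (an assumption),
    so membership does not change as the survey year advances.
    """
    age_in_base = age - (survey_year - base_year)
    if age_in_base < start_age or age_in_base > end_age:
        return None
    # Assumption: the last bin absorbs the extra age (63).
    return min((age_in_base - start_age) // width, n_cohorts - 1)

def cohort_means(records):
    """Average log wage per (cohort, year) cell from individual records."""
    cells = defaultdict(lambda: [0.0, 0])
    for r in records:
        c = cohort_id(r["age"], r["year"])
        if c is None:
            continue
        cell = cells[(c, r["year"])]
        cell[0] += r["lnw"]
        cell[1] += 1
    return {key: total / n for key, (total, n) in cells.items()}

# Three made-up individuals observed in two different survey waves.
sample = [
    {"age": 20, "year": 2016, "lnw": 8.0},
    {"age": 22, "year": 2016, "lnw": 9.0},
    {"age": 21, "year": 2017, "lnw": 8.5},
]
means = cohort_means(sample)
```

The resulting cell means, one per cohort and year, are the observations of the pseudo-panel; different people contribute to the same cohort in different years.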
Estimating returns to human capital using pooled cross-sectional regressions introduces an errors-in-variables problem, primarily due to time-varying unobserved individual heterogeneity. Additionally, such estimations are subject to inconsistency in the presence of selection bias.
To address these concerns, consider the following pseudo-panel specification for estimating returns to human capital (Mincer, 1974; Mora & Muro, 2014):
$$Y_{i(t)} = \beta_1 X_{i(t)} + \rho\lambda_{i(t)} + f_i + \epsilon_{i(t)} \tag{5}$$
where $Y_{i(t)}$ denotes income, $X_{i(t)}$ represents the set of explanatory variables consistent with human capital theory (namely, education and potential experience), $\lambda_{i(t)}$ accounts for potential selection bias, and $f_i$ captures individual-specific unobserved heterogeneity. The subscript $i(t)$ indicates that the data originate from independent, representative cross-sectional surveys in which individuals are observed only once in a single time period.
In this context, Deaton (1985) demonstrates that when individuals differ across time periods, estimations based on Equation (5) yield inconsistent results. To overcome this limitation, Deaton proposes a pseudo-panel estimation strategy that involves constructing cohorts based on invariant characteristics.
Building on this approach, Mora and Muro (2014) develop a methodology for addressing pseudo-panel data in the presence of selection bias. Specifically, they propose using the generalized method of moments (GMM) to account for the measurement error problem inherent in pseudo-panel data. This methodology, hereafter referred to as GMMC (GMM with correction for measurement error), leads to the following equation:
$$E[(Y_{it} - X_{it}\beta_1 - Z_{0i}\delta - \rho\lambda_{ct})\, h(Z_{0i}, Z_{1it})] = B\beta + b \tag{6}$$
where $Z_0$ denotes a matrix of cohort dummy variables; $Z_{1it}$ are instrumental variables that vary over time (they do not contain $Z_0$); $h(\cdot)$ is a known function, typically comprising time effects and cohort-by-time interaction terms, although other time-varying variables may also be incorporated; $\beta = (\beta_1, \delta, \rho)$; and $B$ and $b$ depend on the covariance matrix of the measurement errors.
Regarding the selection mechanism, a panel probit model is employed to characterize the selection process, specified as follows:
$$E[(s_{it} - Z_{1it}\gamma_t) A_t] = 0 \tag{7}$$
where $s_{it}$ is the selection process, and $A$ is a cohort mean operator, $(Z_0'Z_0)^{-1}Z_0'$.
Definition 1.
The cohort-level expression for the moment conditions specified in Equations (6) and (7) can be formulated as follows:
$$E[s_{ct} - Z_{1ct}\gamma_t] = 0; \quad t = 1, \ldots, T, \quad c = 1, \ldots, C, \tag{8}$$
$$E[(\Delta Y_{ct} - \Delta X_{ct}\beta_1 - \rho\Delta\lambda_{ct})\Delta W_{ct}] = B\beta + b. \tag{9}$$
Here, $\Delta W_{ct} = (\Delta X_{ct}, \Delta\lambda_{ct})$. Equation (8) is a system of $T$ cross-sectional linear regressions. First differences from the synthetic panel (Deaton, 1985) are used in Equation (9). By substituting $\hat{\gamma}_{ct}$ into Equation (9), we obtain the following:
$$E[(\Delta Y_{ct} - \Delta X_{ct}\beta_1 - \rho\Delta\hat{\lambda}_{ct})\Delta X_{ct}] = B\beta + b. \tag{10}$$
Finally, the GMMC estimator is as follows:
$$\hat{\beta} = \left[\sum_{c=1}^{C}(\Delta W_c'\Delta W_c + B)'\, D_c \sum_{c=1}^{C}(\Delta W_c'\Delta W_c + B)\right]^{-1}\left[\sum_{c=1}^{C}(\Delta W_c'\Delta W_c + B)'\, D_c \sum_{c=1}^{C}(\Delta W_c'\Delta Y_c - b)\right] \tag{11}$$
where $\Delta W_c = (\Delta W_{c2}, \Delta W_{c3}, \ldots, \Delta W_{cT})'$ and $\Delta Y_c = (\Delta Y_{c2}, \Delta Y_{c3}, \ldots, \Delta Y_{cT})'$. The optimal choice of $D_c$ is any consistent estimator of the inverse of the covariance matrix of $\Delta W_c'\Delta W_c$ (Hansen, 1982).
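The structure of the estimator can be illustrated with a deliberately simplified Python sketch. It is not the authors' implementation: it assumes the just-identified case, in which the weighting matrix $D_c$ drops out and the estimator reduces to $[\sum_c(\Delta W_c'\Delta W_c + B)]^{-1}\sum_c(\Delta W_c'\Delta Y_c - b)$, and it treats the measurement-error terms `B` and `b` as known inputs. The function name `gmmc_just_identified` and the simulated cohort data are hypothetical.

```python
import numpy as np

def gmmc_just_identified(dW_list, dY_list, B, b):
    """Simplified (just-identified) version of the corrected estimator.

    dW_list, dY_list: per-cohort first differences of regressors and outcome.
    B, b: measurement-error correction terms, assumed known here.
    """
    B = np.asarray(B, dtype=float)
    b = np.asarray(b, dtype=float)
    lhs = np.zeros_like(B)
    rhs = np.zeros_like(b)
    for dW, dY in zip(dW_list, dY_list):
        lhs += dW.T @ dW + B
        rhs += dW.T @ dY - b
    return np.linalg.solve(lhs, rhs)

# Noise-free synthetic cohort means with a cohort fixed effect (made-up values).
rng = np.random.default_rng(1)
beta_true = np.array([0.08, 0.5])
C, T = 9, 6
dW_list, dY_list = [], []
for c in range(C):
    W = rng.normal(size=(T, 2))            # cohort-year means of the regressors
    Y = W @ beta_true + rng.normal()       # same fixed effect added in every year
    dW_list.append(np.diff(W, axis=0))     # first differences remove the fixed effect
    dY_list.append(np.diff(Y))

beta_hat = gmmc_just_identified(dW_list, dY_list, B=np.zeros((2, 2)), b=np.zeros(2))
```

With `B` and `b` set to zero the sketch collapses to least squares on first-differenced cohort means, which recovers `beta_true` here because the synthetic data contain no noise. Non-zero `B` and `b` would shift the moment matrices to offset sampling error in the cohort means.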
The asymptotic distribution of the GMMC estimator, for $B$, $b$, and $\Delta W_c$ known, can be derived using standard assumptions and GMM theory (Mora & Muro, 2014). Following Deaton (1985), Newey and McFadden (1994), and Mora and Muro (2014), the following is a convenient expression for an upper bound of the covariance matrix $V_{\beta}^{MM}$:
$$V_{\beta}^{MM} = [M_{WW} - \Sigma]^{-1}\left[\Sigma_{WW}(\sigma_{\mu}^{2} + \sigma_{00} + \theta'\Sigma\theta - 2\sigma'\theta) + (\sigma - \Sigma\theta)(\sigma - \Sigma\theta)'\right][M_{WW} - \Sigma]^{-1} + \Pi\hat{V}\Pi'. \tag{12}$$
In Equation (12), the first additive term corresponds to the covariance matrix associated with the pseudo-panel data model (Deaton, 1985). The second term is the correction matrix designed to adjust for selectivity bias, which is essential for obtaining consistent estimators in the pseudo-panel framework. This correction term reflects an estimated regressor—rather than the true regressor—in the second stage of the two-step generalized method of moments with measurement error correction (GMMC) estimation procedure (Mora & Muro, 2014). Moreover, the covariance matrix of the parameter estimates is further adjusted for bias using the approach proposed by Newey and McFadden (1994). A comprehensive demonstration of this methodology is provided in Mora and Muro (2014).
For instance, to estimate the returns to education within each group, extending the Mincer (1974) earnings equation to the pseudo-panel context can be expressed as follows:
$$\ln W_{g,ct} = \alpha_{ct} + \beta_1 S_{g,ct} + \beta_2 Exp_{g,ct} + \beta_3 Exp_{g,ct}^{2} + \lambda\, Sel_{g,ct} + \mu_{g,ct}, \quad c = 1, \ldots, C; \ t = 1, \ldots, T; \ g = \text{men}, \text{women} \tag{13}$$
$$Sel_{g,ct} = f(Married_{g,ct},\ Head\_Household_{g,ct},\ Ch6_{g,ct},\ N\_ind_{g,ct}). \tag{14}$$
In this context, $\ln W_{ct}$ denotes the natural logarithm of hourly wages for cohort $c$ in year $t$. The variable $S$ represents years of schooling, while $Exp$ denotes potential labor market experience, calculated as age minus years of schooling minus six. $Exp^2$ is the square of potential experience, capturing the nonlinear (diminishing) returns to experience. The term $\alpha$ accounts for unobserved heterogeneity across cohorts, and $\mu$ is the error term. The inverse Mills ratio $\lambda$ is included in the wage equation to correct for selection bias, as wages are only observed for employed individuals. Excluding individuals who are not currently working (e.g., unemployed) but invested in human capital introduces selection bias in the estimation of returns to education.
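Two of the ingredients described above can be written out in a few lines of Python: potential experience and the inverse Mills ratio. The sketch assumes a Heckman-type probit first stage, so that the ratio is the standard normal density divided by the standard normal distribution function evaluated at the selection index; the cohort-level GMMC step of the article is not reproduced. The function names are hypothetical.

```python
import math

def potential_experience(age, schooling):
    """Potential experience: age minus years of schooling minus six."""
    return age - schooling - 6

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    """Inverse Mills ratio phi(z) / Phi(z) for a selection index z.

    Note: for very negative z, Phi(z) underflows to zero in floating point;
    a production implementation would need a numerically stable form.
    """
    return normal_pdf(z) / normal_cdf(z)

exp_30_11 = potential_experience(30, 11)   # 30 - 11 - 6 = 13
imr_at_zero = inverse_mills(0.0)           # phi(0) / 0.5
```

The ratio falls as the selection index rises, so observations with a high predicted probability of participation receive a small correction.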
The parameter $\beta_1$ captures the return to an additional year of education, while $\beta_2$ and $\beta_3$ represent the returns to an additional year of experience and its diminishing effect, respectively. Equations (13) and (14) are estimated using the GMMC approach that corrects for selection bias, as specified in Equation (11).
Regarding the selection equation, $Sel_{i,ct}$ is a binary indicator for labor force participation, equal to one if the individual is either employed or unemployed (i.e., actively participating in the labor market), and zero otherwise. The covariates used in the selection equation include the following: Married, a binary variable equal to one if the individual is married; Head_Household, a binary variable equal to one if the individual is the head of their household; Ch6, a count variable measuring the number of children under the age of six in the household; and N_ind, which denotes the total number of individuals residing in the household.
According to the ILO (2020b), marital status has a differential impact by gender on labor market outcomes, particularly in labor force participation, job types, and underemployment. Being the head of household entails greater financial responsibilities and thus influences the decision to participate in the labor market (Budlender, 2003). Similarly, the presence of young children1 and the overall household size are relevant determinants of labor market participation, as highlighted by Tobón and Rodríguez (2015), Cools et al. (2017), ILO (2020a), and Baranowska-Rataj and Matysiak (2022).
Definition 2.
The counterpart of the Blinder–Oaxaca (1973) decomposition in the pseudo-panel data framework for two groups is defined as follows:
$$\ln W_{m,ct} - \ln W_{w,ct} = \underbrace{(X_{m,ct} - X_{w,ct})\hat{\beta}_{m,ct}}_{A} + \underbrace{X_{w,ct}(\hat{\beta}_{m,ct} - \hat{\beta}_{w,ct})}_{B} + \underbrace{(\epsilon_{m,ct} - \epsilon_{w,ct})}_{C}, \quad m: \text{men}; \ w: \text{women}. \tag{15}$$
In Equation (15), the first term (A) represents the explained component–endowment effect, which captures the portion of the wage differential attributable to observable differences in productive characteristics between the two groups. The second term (B) corresponds to the unexplained component–remuneration effect, which reflects differences in the returns to these characteristics and is often associated with discrimination or unobserved heterogeneity. The final term (C) tends to converge to zero, as the evaluation of Equation (15) at the mean of the logarithm of the hourly wage distribution implies that the linear combination of the error terms has an expected value of zero.
For instance, the explained component—based on education, experience, and squared experience as proxies for human capital accumulation—can be expressed as follows:
$$\tilde{\beta}_{m,ct}^{S}(\bar{S}_{m,ct} - \bar{S}_{w,ct}) + \tilde{\beta}_{m,ct}^{Exp}(\overline{Exp}_{m,ct} - \overline{Exp}_{w,ct}) + \tilde{\beta}_{m,ct}^{Exp^2}(\overline{Exp^2}_{m,ct} - \overline{Exp^2}_{w,ct}) \tag{16}$$
Similarly, the unexplained component is expressed as:
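$$\alpha_{ct}^{m} - \alpha_{ct}^{w} + (\tilde{\beta}_{m,ct}^{S} - \tilde{\beta}_{w,ct}^{S})\bar{S}_{w,ct} + (\tilde{\beta}_{m,ct}^{Exp} - \tilde{\beta}_{w,ct}^{Exp})\overline{Exp}_{w,ct} + (\tilde{\beta}_{m,ct}^{Exp^2} - \tilde{\beta}_{w,ct}^{Exp^2})\overline{Exp^2}_{w,ct}. \tag{17}$$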
Definition 3.
The variance–covariance matrix counterpart for the pseudo-panel data model, following Jann (2008), for two groups is specified as follows:
For the explained component–endowment effect, the variance–covariance matrix is given by the following:
$$\hat{V}_{MM}^{explain}\{(\tilde{X}_{m,ct} - \tilde{X}_{w,ct})'\hat{\beta}_{m,ct}\} \approx (\tilde{X}_{m,ct} - \tilde{X}_{w,ct})'\hat{V}(\hat{\beta}_{m,ct})(\tilde{X}_{m,ct} - \tilde{X}_{w,ct}) + \hat{\beta}_{m,ct}'\{\hat{V}(\tilde{X}_{m,ct}) + \hat{V}(\tilde{X}_{w,ct})\}\hat{\beta}_{m,ct}. \tag{18}$$
For the unexplained component–remuneration effect, the variance–covariance matrix is given by the following:
$$\hat{V}_{MM}^{unexplain}\{\tilde{X}_{w,ct}'(\hat{\beta}_{m,ct} - \hat{\beta}_{w,ct})\} \approx \tilde{X}_{w,ct}'\{\hat{V}(\hat{\beta}_{m,ct}) + \hat{V}(\hat{\beta}_{w,ct})\}\tilde{X}_{w,ct} + (\hat{\beta}_{m,ct} - \hat{\beta}_{w,ct})'\hat{V}(\tilde{X}_{w,ct})(\hat{\beta}_{m,ct} - \hat{\beta}_{w,ct}). \tag{19}$$
The term $\mathrm{trace}\{\hat{V}(\tilde{X}_{m,ct})\hat{V}(\hat{\beta}_{m,ct})\}$ vanishes when cohorts are used as instruments and $NT/C \to \infty$.
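The two variance expressions for the explained and unexplained components can be evaluated directly once the cohort means, the coefficient vectors, and their covariance matrices are available. The Python sketch below assumes those inputs are already estimated (in the article they come from the GMMC step, which is not reproduced here); the function names and the one-regressor numbers are hypothetical.

```python
import numpy as np

def var_explained(x_m, x_w, V_xm, V_xw, b_m, V_bm):
    """Approximate variance of (x_m - x_w)' b_m, omitting the trace term."""
    d = x_m - x_w
    return float(d @ V_bm @ d + b_m @ (V_xm + V_xw) @ b_m)

def var_unexplained(x_w, V_xw, b_m, b_w, V_bm, V_bw):
    """Approximate variance of x_w' (b_m - b_w), omitting the trace term."""
    d = b_m - b_w
    return float(x_w @ (V_bm + V_bw) @ x_w + d @ V_xw @ d)

# One-regressor illustration with made-up values.
x_m, x_w = np.array([2.0]), np.array([1.0])
V_xm, V_xw = np.array([[0.04]]), np.array([[0.01]])
b_m, b_w = np.array([3.0]), np.array([2.0])
V_bm, V_bw = np.array([[0.01]]), np.array([[0.02]])

v_exp = var_explained(x_m, x_w, V_xm, V_xw, b_m, V_bm)
# 1*0.01*1 + 3*(0.04 + 0.01)*3 = 0.46
v_unexp = var_unexplained(x_w, V_xw, b_m, b_w, V_bm, V_bw)
# 1*(0.01 + 0.02)*1 + 1*0.01*1 = 0.04

se_exp, se_unexp = v_exp ** 0.5, v_unexp ** 0.5   # standard errors
```

The square roots give the standard errors that accompany each component of the decomposition.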

4. Blinder–Oaxaca Wage Gap Decomposition: The Case of Colombia

The Colombian labor market continues to exhibit significant gender disparities. For instance, according to the World Economic Forum’s (2021) Global Gender Gap Report, substantial wage differences persist between women and men in Colombia. Among 156 countries, Colombia ranks 120th on the equal pay index for comparable work, with a score of 0.56 (where 1 indicates full parity). Despite increased female labor force participation and longer average years of schooling among women, their earnings remain significantly lower than those of men. Data from the National Administrative Department of Statistics (DANE—its Spanish acronym) reveal that women’s labor force participation rate rose from approximately 46% in 1991 to 54% in 2019, while men’s participation rate remained steady at around 75%. Additionally, Piñeros (2009) notes that the educational attainment gap between men and women began to narrow in the 1970s, with women surpassing men in average years of schooling during the 1980s.
Peña et al. (2013) emphasize that the predominant emerging family structure in Colombia is the female-headed single-parent household, where gender inequalities negatively affect family income and human capital accumulation, thereby limiting social mobility for members of such families.
The gender wage gap in Colombia has been examined at various points in time using cross-sectional data (e.g., Bernat, 2005; Fernández, 2006; and Badel & Peña, 2010). For example, Fernández (2006) reports that the average wage differential was 19% in 1997 and decreased to 13% in 2003. However, a comprehensive understanding of the underlying causes for the persistence of this gap over time remains insufficient.
Badel and Peña (2010) examine the gender wage gap in Colombia’s seven largest cities using quantile regression techniques. Their findings indicate that men earn more than women, and the wage gap exhibits a U-shaped pattern, with women’s wages falling further below men’s at the extremes of the wage distribution compared to the middle. Similarly, Galvis (2011) investigates regional and gender wage differentials in Colombia employing quantile regressions. The results reveal consistent positive wage differentials favoring men. Furthermore, a Blinder–Oaxaca decomposition suggests that these wage gaps are not fully explained by observable individual characteristics; rather, they primarily arise from differences in the returns to these characteristics (e.g., education) and unobserved factors.
Mora and Arcila (2014) analyze the wage gap between Afro-descendant and White individuals in Cali, utilizing data from the 2013 Employment and Quality of Life Survey. When incorporating variables, such as migration status and perceived discrimination, into the selection equation for Afro-descendants, they estimate a wage gap of 42%, of which 9% is attributable to differences in human capital characteristics, while 33% is linked to labor market discrimination.
To the best of our knowledge, although prior studies documented the existence of the gender wage gap in Colombia (e.g., Baquero, 2001; Fernández, 2006; and Badel & Peña, 2010), the present study is the first to analyze the evolution of this gap over time. Specifically, it employs a pseudo-panel dataset combined with decomposition methods and selectivity correction techniques.
To estimate the gender wage gap over time, we constructed a pseudo-panel comprising a time series of independent and representative cross-sectional samples spanning from 2016 to 2021. This pseudo-panel is based on data from the Large Integrated Household Survey (GEIH—its Spanish acronym), a multipurpose survey conducted by Colombia’s official statistics agency, DANE. The GEIH regularly monitors the labor market and provides monthly labor statistics at the national, departmental, and major city levels.
Since the observations consist of independent cross-sectional data for each period, nine 5-year cohorts of individuals aged 18 to 63 have been defined. The sample comprises a total of 840,499 individuals. Table 1 displays the distribution of the sample by cohort and year. Each cohort includes more than 5500 individuals. The cohort representing the youngest age group has an average of 16,155 individuals per year, whereas the oldest cohort has an average of 8876 individuals per year.
Regarding the number of individuals per cohort, Mora and Muro (2014) assert that including at least 200 individuals per cohort is sufficient to test for selection bias in a pseudo-panel at the 5% significance level (p. 15). Descriptive statistics of the variables are provided in Appendix A.
Gender wage gaps in Colombia have been notable for their persistence over time. Despite increases in women’s average years of schooling and labor market participation in recent decades, empirical evidence consistently shows that men continue to receive higher remuneration than women.
Table 2 presents the results of the Blinder–Oaxaca wage gap decomposition without accounting for selection bias.2 The models considered include a pooled cross-section, a pooled cohort, and a pseudo-panel with Deaton’s correction. It is important to note that without applying Deaton’s correction, the model essentially becomes an error-in-variables model, wherein all explanatory variables (except dummy variables) are subject to measurement error (Deaton, 1985).3
When calculating the wage gap in the Colombian urban labor market, it is found that women earn, on average, 13% less than men in the pooled configuration. The endowment component is approximately −8.6%, indicating that the difference in observable characteristics favors women. In this regard, women possess superior attributes that enhance productivity (e.g., human capital and work experience) compared to men. This finding corroborates previous studies that documented women’s higher average years of schooling relative to men (Abadía, 2005; Galvis, 2011).
Regarding the unexplained component, the estimated effect is 21.6%. This suggests that if men and women had equivalent endowments, a substantial wage gap would still persist, indicating that gender differences in wages cannot be fully accounted for by productivity-related attributes or other supply-side factors.
In the pseudo-panel configurations, the wage differential is 14.5% without correction for measurement error and increases to 20.1% when such correction is applied. This implies that male cohorts earn, on average, 20% more than female cohorts in the Colombian urban labor market. The endowment effect in this setting is −2%, while the remuneration effect accounts for 22%, indicating that the unexplained component exceeds the total observed wage differential between men and women cohorts.
It is important to note that these regression results are subject to bias, as they do not adjust for selection bias, given that not all individuals participating in the labor market receive wages (Heckman, 1979).
Table 3 presents the results of the wage gap estimation with correction for selection bias (Equations (11)–(19), MH):
When adjusting for selection bias, the results from the pooled configuration become more pronounced, with the total wage differential increasing to 21%. Notably, while the endowment effect remains negative, indicating that women possess, on average, more productive characteristics, the remuneration effect rises to approximately 30%. In the pseudo-panel framework, the estimated wage gap stands at 14.6% in favor of male cohorts, with the explained component at −5.6% and the unexplained component at 20%.
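As a quick consistency check on the pseudo-panel figures just reported (a worked sum, not an additional result), the explained and unexplained components should add up to the total gap:

$$-5.6\% + 20\% = 14.4\% \approx 14.6\%$$

The small discrepancy is presumably due to rounding of the reported components; this is an assumption, since the underlying estimates are not reproduced in the text.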
Including additional controls in the human capital equation is crucial for capturing sources of heterogeneity beyond education and experience—such as labor market informality, firm size, and sector of economic activity. These factors may exert independent effects on wages, and their omission can lead to biased estimates of the returns to education.
In this context, one of the defining features of the Colombian labor market is its high level of informality (Mora & Muro, 2017; Arango & Flórez, 2020), with more than half of the workforce employed in the informal sector (Mora, 2017; Sánchez, 2020); another is labor market segmentation (Mora & Muro, 2015). Informality is linked to several structural variables, including employment status (e.g., self-employed or independent workers), firm size, and the absence of formal employment contracts or access to public welfare systems such as health care and pensions (Mora & Muro, 2017).
Moreover, reports from the Economic Studies Office of the Colombian Ministry of Commerce, Industry, and Tourism indicate that microenterprises constitute the vast majority of the country’s business structure, accounting for over 90% of all firms (Ministerio de Comercio, Industria y Turismo, 2024). According to Law 905 of 2004, microenterprises are defined as those employing no more than ten workers or possessing total assets, excluding real estate used for housing, valued below 500 monthly minimum legal wages (Congreso de la República de Colombia, 2004).
Additionally, the tertiary sector exhibits the highest rate of female labor force participation in Colombia (DANE, 2020; Confecámaras, 2024). Since the final quarter of the 20th century, there has been a sustained incorporation of women into the service sector (Maubrigades, 2020). This overrepresentation is often attributed to the prevalence of occupations traditionally regarded as ‘feminine,’ such as those in education, healthcare, and administrative support (ILO, 2018).
Accordingly, Table 4 presents the results of the wage gap estimation under three model specifications: without controls; controlling for informality; and controlling for informality, firm size, and employment in the tertiary sector.
A wage differential of approximately 15% in favor of men is observed across all model specifications, with the endowment effect remaining negative and the remuneration effect ranging between 18% and 20%.
Within this context, the explained component captures the part of the gap associated with gender differences in individuals’ observable characteristics, whereas the unexplained component reflects differences in the returns to those characteristics. The residual unexplained component is often interpreted as a proxy for labor market discrimination. However, it is important to emphasize that these estimates are indicative rather than definitive, since the unexplained portion may also encompass differences in unobservable attributes not captured by the model. That is, it could be affected by unobserved productivity factors (e.g., motivation, risk aversion, and negotiation), job conditions (e.g., part-time work, inflexibility), or employer-level factors (e.g., monopsony power, network benefits). In addition, discrimination itself is a complex, multidimensional phenomenon that encompasses taste-based discrimination (based on employer bigotry), statistical discrimination (based on group-based assumptions), and structural discrimination (institutionalized or societal norms that hinder access or mobility).
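To make the two components concrete, the following minimal sketch computes a standard twofold Blinder–Oaxaca decomposition on synthetic data. It is an illustration only and not the pseudo-panel estimator proposed in this article: it uses plain OLS on simulated individual-level data, takes the male coefficients as the reference structure (an assumption), and all function names (`ols`, `twofold_decomposition`, `make_group`) and parameter values are hypothetical.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients; X must already contain a constant column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def twofold_decomposition(X_m, y_m, X_f, y_f):
    """Return (explained, unexplained), with male coefficients as reference."""
    beta_m, beta_f = ols(X_m, y_m), ols(X_f, y_f)
    xbar_m, xbar_f = X_m.mean(axis=0), X_f.mean(axis=0)
    explained = (xbar_m - xbar_f) @ beta_m      # endowment effect
    unexplained = xbar_f @ (beta_m - beta_f)    # remuneration effect
    return explained, unexplained


def make_group(rng, n, school_mean, intercept, ret):
    """Simulate log wages from a one-regressor Mincer-style equation."""
    schooling = rng.normal(school_mean, 2.0, n)
    X = np.column_stack([np.ones(n), schooling])
    y = intercept + ret * schooling + rng.normal(0.0, 0.1, n)
    return X, y


# Hypothetical data: women have more schooling, men a higher intercept.
rng = np.random.default_rng(0)
X_m, y_m = make_group(rng, 500, school_mean=11.0, intercept=7.4, ret=0.09)
X_f, y_f = make_group(rng, 500, school_mean=12.0, intercept=7.2, ret=0.09)

explained, unexplained = twofold_decomposition(X_m, y_m, X_f, y_f)
raw_gap = y_m.mean() - y_f.mean()
print(f"gap={raw_gap:.4f} explained={explained:.4f} unexplained={unexplained:.4f}")
```

In this simulated setting the explained term is negative (the female group has more schooling) while the unexplained term is positive, which mirrors the sign pattern discussed in the text. With OLS and a constant in each regression, the two terms sum to the raw mean gap.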
Thus, after estimating the corresponding wage gaps, we evaluate the statistical significance of the difference in the gap between the models with controls and the baseline model, as well as between the two models with controls.
The difference between coefficients is assessed using the following expression (Clogg et al., 1995; Mora et al., 2022):
$$z = \frac{\widehat{GAP}_k^{\,i} - \widehat{GAP}_k^{\,j}}{\sqrt{\left(S_{\widehat{GAP}_k^{\,i}}\right)^2 + \left(S_{\widehat{GAP}_k^{\,j}}\right)^2}}$$
where $i$ and $j$ refer to different models, $k$ indexes the component (differential, explained, or unexplained), $\widehat{GAP}_k^{\,i}$ is the estimated value of the gap in Table 4, and $S_{\widehat{GAP}_k^{\,i}}$ denotes the standard error of $\widehat{GAP}_k$ for model $i$ or $j$. The null hypothesis states that $\widehat{GAP}_k^{\,i} = \widehat{GAP}_k^{\,j}$. Table 5 reports these results.
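As a hedged illustration of this test, the short sketch below evaluates the z statistic using the rounded estimates and standard errors printed in Table 4, taking the model with the informality control as $i$ and the baseline without controls as $j$. The function name `gap_z` is hypothetical, and small discrepancies from Table 5 can arise because the inputs are rounded.

```python
import math


def gap_z(gap_i, se_i, gap_j, se_j):
    """z statistic for the difference between two estimated gaps."""
    return (gap_i - gap_j) / math.sqrt(se_i ** 2 + se_j ** 2)


# Informality model (i) versus the baseline without controls (j), Table 4.
z_differential = gap_z(0.15100, 0.0000400258, 0.14617, 0.00005)
z_unexplained = gap_z(0.19076, 0.0000399651, 0.20254, 0.00005)

print(round(z_differential, 2), round(z_unexplained, 2))
```

With these inputs the statistics come out close to the 75.41 and −184.04 reported in the first column of Table 5.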
Based on this hypothesis test, the null hypothesis that the coefficients related to the wage differential are equal between the models with controls and the baseline model is rejected. Furthermore, the coefficients associated with the endowment effect and the remuneration effect are statistically different between the models with controls and the baseline model. Similarly, the null hypothesis that the respective coefficients are equal between the model that includes controls for informality, firm size, and tertiary sector and the model that controls only for informality is also rejected.
In this context, when informality is included as a control variable, the unexplained component decreases by approximately 1.2 percentage points compared to the baseline model without controls. Furthermore, when informality, firm size, and the tertiary sector are jointly included, the unexplained component is reduced by 1.5 percentage points.
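These reductions follow directly from the unexplained components reported in Table 4. As a worked check (the subscripts are labels introduced here for illustration only):

$$\hat{U}_{\text{none}} - \hat{U}_{\text{inf}} = 0.20254 - 0.19076 = 0.01178 \approx 1.2 \text{ percentage points}$$

$$\hat{U}_{\text{none}} - \hat{U}_{\text{all}} = 0.20254 - 0.18727 = 0.01527 \approx 1.5 \text{ percentage points}$$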
Finally, when selection bias is present, our methodology reveals that pooling the data leads to an overestimation of both the explained and unexplained components. This overestimation arises from the failure to account for measurement errors introduced by data aggregation, as well as from neglecting changes in unobserved individual heterogeneity.

5. Conclusions

The Blinder–Oaxaca methodology is a widely recognized approach for estimating gender wage differentials between two groups. Jann (2005, 2008) enhances this methodology by providing a correction for the variance–covariance matrix, which ensures efficiency in cross-sectional estimations. While Kröger and Hartmann (2021) discuss the decomposition effects in panel data, such extensions are not directly applicable to pseudo-panel data. This limitation arises because the individuals observed in period t are not the same as those observed in period t − 1, and unobserved individual heterogeneity may vary across time.
Many developing countries, such as Colombia, lack true panel data structures, but possess independent repeated cross-sectional data. In this context, the pseudo-panel approach provides a practical alternative for analyzing labor market outcomes over time, particularly in the absence of longitudinal tracking of individuals.
As in many other countries, gender wage disparities persist in Colombia. Our empirical findings, based on a pseudo-panel configuration with corrections for both measurement error and selection bias, consistently show wage differentials favoring men in the Colombian urban labor market. The results indicate that female cohorts earn, on average, 15% less than their male counterparts. Importantly, this wage gap is not primarily attributable to differences in observable attributes such as education or experience. Instead, the bulk of the gap is explained by differential returns to these attributes and potentially by unobservable factors, suggesting that productivity-related characteristics or other supply-side determinants cannot fully explain gender differences in earnings. It may instead reflect underlying structural or institutional dynamics within the labor market.
The Colombian labor economics literature has repeatedly documented that women have increased their labor force participation and now exhibit, on average, higher educational attainment than men. Nevertheless, the gender gap in the returns to human capital remains significant, suggesting that women are not equally rewarded for their skills and qualifications in the labor market.
The review of previous studies and empirical evidence supports the conclusion that wage differentials between men and women persist in Colombia and that human capital endowments explain only a limited portion of this gap.
To address the gender wage gap, a range of policy interventions and institutional efforts can be implemented. These include promoting equal access to education and vocational training to ensure women acquire skills aligned with labor market demands. Scholarship programs and mentorship initiatives can also encourage women’s participation in male-dominated fields of study. Furthermore, workplace-level reforms are critical. These include the enforcement of anti-discrimination policies in hiring and compensation, as well as measures to support a work–life balance, such as flexible work arrangements and remote work options. Such policies can facilitate women’s sustained engagement in the labor market and contribute to reducing the persistent gender pay gap.

Author Contributions

Conceptualization, J.J.M. and D.Y.H.; Methodology, J.J.M.; Software, J.J.M.; Validation, D.Y.H.; Formal analysis, J.J.M. and D.Y.H.; Investigation, J.J.M.; Resources, J.J.M.; Data curation, D.Y.H.; Writing—original draft preparation, J.J.M. and D.Y.H.; Writing—review and editing, J.J.M. and D.Y.H.; Visualization, J.J.M.; Supervision, J.J.M.; Project administration, J.J.M.; Funding acquisition, D.Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflict of interest.

Appendix A

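Table A1. Descriptive statistics by year.

| Year | Variable | Ci(t) | Mean | Std. dev. | Min | Max |
|---|---|---|---|---|---|---|
| 2016 | LnW_men | 9 | 8.338058 | 0.1245341 | 8.035196 | 8.432405 |
| 2016 | S_men | 9 | 10.93983 | 1.164774 | 9.069445 | 12.35118 |
| 2016 | Exp_men | 9 | 29.20804 | 14.86316 | 9.020644 | 51.70675 |
| 2016 | Exp2_men | 9 | 1076.076 | 915.6101 | 92.67495 | 2712.067 |
| 2016 | Tertiary_men | 9 | 0.7037018 | 0.0097124 | 0.6904924 | 0.7162699 |
| 2016 | Micro_men | 9 | 0.5934452 | 0.0949126 | 0.4776156 | 0.7515873 |
| 2016 | Informality_men | 9 | 0.5035461 | 0.0738853 | 0.4234486 | 0.6459062 |
| 2016 | LnW_women | 9 | 8.150793 | 0.1214262 | 7.943633 | 8.285857 |
| 2016 | S_women | 9 | 11.62916 | 1.744159 | 8.663899 | 13.75622 |
| 2016 | Exp_women | 9 | 28.5162 | 15.45154 | 7.635523 | 52.12373 |
| 2016 | Exp2_women | 9 | 1051.439 | 936.6272 | 67.30845 | 2754.966 |
| 2016 | Tertiary_women | 9 | 0.8539278 | 0.0039804 | 0.8479077 | 0.8613119 |
| 2016 | Micro_women | 9 | 0.6224241 | 0.1202269 | 0.4608745 | 0.8270415 |
| 2016 | Informality_women | 9 | 0.561172 | 0.0903399 | 0.4392944 | 0.70415 |
| 2016 | Married_men | 9 | 0.5842465 | 0.2132395 | 0.1192248 | 0.7230747 |
| 2016 | Head_men | 9 | 0.5503303 | 0.2342715 | 0.0946546 | 0.7713839 |
| 2016 | Ch6_men | 9 | 0.3380849 | 0.1352057 | 0.1891162 | 0.5493901 |
| 2016 | N_ind_men | 9 | 0.2505849 | 0.051748 | 0.2015876 | 0.3492341 |
| 2016 | Married_women | 9 | 0.5212701 | 0.1220463 | 0.2378164 | 0.622967 |
| 2016 | Head_women | 9 | 0.289074 | 0.1262674 | 0.0703748 | 0.4455163 |
| 2016 | Ch6_women | 9 | 0.1721031 | 0.086282 | 0.0880289 | 0.2974964 |
| 2016 | N_ind_women | 9 | 0.0330633 | 0.0106235 | 0.0195133 | 0.0541875 |
| 2016 | Sel_men | 9 | 0.8958672 | 0.1040352 | 0.6712098 | 0.9704566 |
| 2016 | Sel_women | 9 | 0.7036599 | 0.1388076 | 0.4290726 | 0.8216574 |
| 2017 | LnW_men | 9 | 8.408438 | 0.1203705 | 8.118181 | 8.498527 |
| 2017 | S_men | 9 | 11.00728 | 1.18347 | 9.130664 | 12.48615 |
| 2017 | Exp_men | 9 | 29.15555 | 14.88868 | 8.99544 | 51.67625 |
| 2017 | Exp2_men | 9 | 1073.002 | 916.5095 | 91.70228 | 2708.582 |
| 2017 | Tertiary_men | 9 | 0.70566 | 0.0106297 | 0.6925541 | 0.7178571 |
| 2017 | Micro_men | 9 | 0.5965516 | 0.090302 | 0.4882637 | 0.7464285 |
| 2017 | Informality_men | 9 | 0.498391 | 0.0715147 | 0.4182696 | 0.6363524 |
| 2017 | LnW_women | 9 | 8.237926 | 0.1204845 | 8.048492 | 8.371805 |
| 2017 | S_women | 9 | 11.71577 | 1.723189 | 8.829645 | 13.83775 |
| 2017 | Exp_women | 9 | 28.4318 | 15.42173 | 7.621135 | 51.89346 |
| 2017 | Exp2_women | 9 | 1045.286 | 931.609 | 66.8628 | 2730.813 |
| 2017 | Tertiary_women | 9 | 0.8479625 | 0.006805 | 0.8379213 | 0.8595017 |
| 2017 | Micro_women | 9 | 0.6190777 | 0.1176631 | 0.4600912 | 0.8078209 |
| 2017 | Informality_women | 9 | 0.5511126 | 0.0947522 | 0.4263408 | 0.6902496 |
| 2017 | Married_men | 9 | 0.5760853 | 0.212277 | 0.1159624 | 0.7180166 |
| 2017 | Head_men | 9 | 0.5390329 | 0.2303175 | 0.0942274 | 0.7532307 |
| 2017 | Ch6_men | 9 | 0.3341325 | 0.1264513 | 0.1976904 | 0.5283061 |
| 2017 | N_ind_men | 9 | 0.2541971 | 0.0552963 | 0.2006257 | 0.3659145 |
| 2017 | Married_women | 9 | 0.5153326 | 0.1216957 | 0.2327136 | 0.6176041 |
| 2017 | Head_women | 9 | 0.2943324 | 0.1281973 | 0.0743065 | 0.4519494 |
| 2017 | Ch6_women | 9 | 0.1657271 | 0.0820278 | 0.0860906 | 0.2940533 |
| 2017 | N_ind_women | 9 | 0.033518 | 0.0094911 | 0.0246975 | 0.0541904 |
| 2017 | Sel_men | 9 | 0.8946891 | 0.1060642 | 0.6594643 | 0.9698599 |
| 2017 | Sel_women | 9 | 0.70193 | 0.1403489 | 0.4323498 | 0.8267639 |
| 2018 | LnW_men | 9 | 8.445422 | 0.1180687 | 8.159978 | 8.545579 |
| 2018 | S_men | 9 | 11.13864 | 1.156181 | 9.289095 | 12.51258 |
| 2018 | Exp_men | 9 | 29.01479 | 14.84102 | 8.98698 | 51.51229 |
| 2018 | Exp2_men | 9 | 1062.879 | 911.2461 | 91.50246 | 2690.313 |
| 2018 | Tertiary_men | 9 | 0.7036599 | 0.0097255 | 0.6852112 | 0.7130123 |
| 2018 | Micro_men | 9 | 0.5978467 | 0.0876309 | 0.4915028 | 0.7424932 |
| 2018 | Informality_men | 9 | 0.5041975 | 0.0778514 | 0.4306065 | 0.6646739 |
| 2018 | LnW_women | 9 | 8.296329 | 0.1236226 | 8.101408 | 8.434343 |
| 2018 | S_women | 9 | 11.89469 | 1.723275 | 8.905832 | 13.98808 |
| 2018 | Exp_women | 9 | 28.26562 | 15.41085 | 7.662983 | 51.89826 |
| 2018 | Exp2_women | 9 | 1035.102 | 929.6148 | 67.33604 | 2730.174 |
| 2018 | Tertiary_women | 9 | 0.8424661 | 0.0076556 | 0.8275688 | 0.8565161 |
| 2018 | Micro_women | 9 | 0.6145322 | 0.1168843 | 0.4593716 | 0.8119162 |
| 2018 | Informality_women | 9 | 0.5474534 | 0.0898871 | 0.4293673 | 0.6796263 |
| 2018 | Married_men | 9 | 0.5681512 | 0.2084602 | 0.1166601 | 0.7080629 |
| 2018 | Head_men | 9 | 0.5289457 | 0.2265079 | 0.0928693 | 0.7390612 |
| 2018 | Ch6_men | 9 | 0.3327424 | 0.1287793 | 0.1768762 | 0.5322238 |
| 2018 | N_ind_men | 9 | 0.2569053 | 0.0536286 | 0.2153275 | 0.3665618 |
| 2018 | Married_women | 9 | 0.5185747 | 0.1230933 | 0.2298088 | 0.6170303 |
| 2018 | Head_women | 9 | 0.293476 | 0.1261038 | 0.0730442 | 0.4463721 |
| 2018 | Ch6_women | 9 | 0.1665284 | 0.0856539 | 0.077763 | 0.2974278 |
| 2018 | N_ind_women | 9 | 0.0337784 | 0.0097015 | 0.0231715 | 0.0525967 |
| 2018 | Sel_men | 9 | 0.8877906 | 0.1115676 | 0.6329794 | 0.9648866 |
| 2018 | Sel_women | 9 | 0.6930691 | 0.1415745 | 0.4252475 | 0.8230862 |
| 2019 | LnW_men | 9 | 8.475406 | 0.1285858 | 8.16323 | 8.568756 |
| 2019 | S_men | 9 | 11.21744 | 1.135674 | 9.473826 | 12.58708 |
| 2019 | Exp_men | 9 | 28.94681 | 14.80999 | 9.066341 | 51.34639 |
| 2019 | Exp2_men | 9 | 1057.985 | 907.84 | 92.34877 | 2672.988 |
| 2019 | Tertiary_men | 9 | 0.6947311 | 0.0125346 | 0.6779296 | 0.7167998 |
| 2019 | Micro_men | 9 | 0.5854244 | 0.080841 | 0.4913411 | 0.717611 |
| 2019 | Informality_men | 9 | 0.5012398 | 0.0759234 | 0.4296478 | 0.6667118 |
| 2019 | LnW_women | 9 | 8.338293 | 0.1190225 | 8.136816 | 8.463536 |
| 2019 | S_women | 9 | 12.0792 | 1.585988 | 9.441266 | 14.00982 |
| 2019 | Exp_women | 9 | 28.08228 | 15.26059 | 7.72789 | 51.40152 |
| 2019 | Exp2_women | 9 | 1020.683 | 915.0557 | 68.05177 | 2680.442 |
| 2019 | Tertiary_women | 9 | 0.8446864 | 0.0059577 | 0.8367268 | 0.8529325 |
| 2019 | Micro_women | 9 | 0.6066282 | 0.1097626 | 0.463798 | 0.7926582 |
| 2019 | Informality_women | 9 | 0.5471625 | 0.0936075 | 0.4327645 | 0.6959494 |
| 2019 | Married_men | 9 | 0.5667591 | 0.209994 | 0.1172997 | 0.7156888 |
| 2019 | Head_men | 9 | 0.5252956 | 0.2250406 | 0.0925015 | 0.7454165 |
| 2019 | Ch6_men | 9 | 0.331331 | 0.1293698 | 0.1777108 | 0.5234528 |
| 2019 | N_ind_men | 9 | 0.2685936 | 0.0520334 | 0.2227821 | 0.3709244 |
| 2019 | Married_women | 9 | 0.5143612 | 0.1229235 | 0.2285872 | 0.6170322 |
| 2019 | Head_women | 9 | 0.2994137 | 0.1250416 | 0.0776866 | 0.4414686 |
| 2019 | Ch6_women | 9 | 0.1577043 | 0.0800504 | 0.0756345 | 0.2802293 |
| 2019 | N_ind_women | 9 | 0.0354072 | 0.0095942 | 0.0249269 | 0.0527509 |
| 2019 | Sel_men | 9 | 0.8831719 | 0.1121991 | 0.6234337 | 0.964309 |
| 2019 | Sel_women | 9 | 0.6840926 | 0.1453998 | 0.417093 | 0.8179824 |
| 2020 | LnW_men | 9 | 8.451991 | 0.1255018 | 8.145081 | 8.555027 |
| 2020 | S_men | 9 | 11.35415 | 1.061942 | 9.707747 | 12.54652 |
| 2020 | Exp_men | 9 | 28.80064 | 14.67451 | 9.100624 | 51.03596 |
| 2020 | Exp2_men | 9 | 1045.842 | 897.0726 | 92.93246 | 2641.078 |
| 2020 | Tertiary_men | 9 | 0.6871566 | 0.0162577 | 0.6648507 | 0.714402 |
| 2020 | Micro_men | 9 | 0.6014776 | 0.0716177 | 0.5119495 | 0.7119974 |
| 2020 | Informality_men | 9 | 0.5219007 | 0.0742923 | 0.4528784 | 0.6927041 |
| 2020 | LnW_women | 9 | 8.3469 | 0.1261775 | 8.129876 | 8.501195 |
| 2020 | S_women | 9 | 12.33815 | 1.553494 | 9.696412 | 14.17378 |
| 2020 | Exp_women | 9 | 27.84481 | 15.19051 | 7.795722 | 51.12937 |
| 2020 | Exp2_women | 9 | 1004.76 | 907.1082 | 68.33904 | 2652.217 |
| 2020 | Tertiary_women | 9 | 0.8528494 | 0.0121979 | 0.840636 | 0.8795009 |
| 2020 | Micro_women | 9 | 0.6050761 | 0.1024307 | 0.467739 | 0.7743154 |
| 2020 | Informality_women | 9 | 0.5402975 | 0.093469 | 0.4122759 | 0.6745098 |
| 2020 | Married_men | 9 | 0.5629667 | 0.2038836 | 0.1207701 | 0.707759 |
| 2020 | Head_men | 9 | 0.5084341 | 0.2216666 | 0.0748535 | 0.7233106 |
| 2020 | Ch6_men | 9 | 0.318702 | 0.1272451 | 0.1639886 | 0.5214967 |
| 2020 | N_ind_men | 9 | 0.3616837 | 0.0668421 | 0.3057756 | 0.4843956 |
| 2020 | Married_women | 9 | 0.5096974 | 0.1181309 | 0.2340668 | 0.6071287 |
| 2020 | Head_women | 9 | 0.3081704 | 0.1309637 | 0.065785 | 0.4649241 |
| 2020 | Ch6_women | 9 | 0.1507033 | 0.0777763 | 0.0745363 | 0.2614055 |
| 2020 | N_ind_women | 9 | 0.0587062 | 0.0153298 | 0.0412137 | 0.0909816 |
| 2020 | Sel_men | 9 | 0.8729501 | 0.1160677 | 0.6118618 | 0.9588074 |
| 2020 | Sel_women | 9 | 0.663856 | 0.145117 | 0.3924115 | 0.7944374 |
| 2021 | LnW_men | 9 | 8.47214 | 0.1247396 | 8.172038 | 8.570444 |
| 2021 | S_men | 9 | 11.44236 | 1.049986 | 9.814035 | 12.68768 |
| 2021 | Exp_men | 9 | 28.75464 | 14.65834 | 9.186735 | 51.00526 |
| 2021 | Exp2_men | 9 | 1041.916 | 895.3744 | 93.51768 | 2635.418 |
| 2021 | Tertiary_men | 9 | 0.6910733 | 0.0142175 | 0.6661874 | 0.710273 |
| 2021 | Micro_men | 9 | 0.6018223 | 0.0715295 | 0.5060026 | 0.7103314 |
| 2021 | Informality_men | 9 | 0.526045 | 0.0730549 | 0.4581692 | 0.691379 |
| 2021 | LnW_women | 9 | 8.353196 | 0.1280188 | 8.140433 | 8.490309 |
| 2021 | S_women | 9 | 12.37938 | 1.522375 | 9.717836 | 14.1562 |
| 2021 | Exp_women | 9 | 27.79751 | 15.10841 | 7.875242 | 51.04327 |
| 2021 | Exp2_women | 9 | 999.5638 | 901.2829 | 69.90714 | 2643.216 |
| 2021 | Tertiary_women | 9 | 0.8597057 | 0.0060291 | 0.8547114 | 0.8744076 |
| 2021 | Micro_women | 9 | 0.6065002 | 0.0991885 | 0.4812614 | 0.7787197 |
| 2021 | Informality_women | 9 | 0.5441636 | 0.089926 | 0.4309872 | 0.6719087 |
| 2021 | Married_men | 9 | 0.5545357 | 0.2055342 | 0.1129931 | 0.6995674 |
| 2021 | Head_men | 9 | 0.4992236 | 0.2173066 | 0.0814947 | 0.7124131 |
| 2021 | Ch6_men | 9 | 0.2951451 | 0.1233416 | 0.1492987 | 0.4826719 |
| 2021 | N_ind_men | 9 | 0.331973 | 0.0668036 | 0.279148 | 0.4584355 |
| 2021 | Married_women | 9 | 0.5021007 | 0.1186671 | 0.2241888 | 0.5943267 |
| 2021 | Head_women | 9 | 0.3263767 | 0.1318954 | 0.0757831 | 0.4637941 |
| 2021 | Ch6_women | 9 | 0.1371094 | 0.0768648 | 0.0589467 | 0.253281 |
| 2021 | N_ind_women | 9 | 0.0539215 | 0.0152554 | 0.0368529 | 0.0838601 |
| 2021 | Sel_men | 9 | 0.8725552 | 0.118324 | 0.6102073 | 0.9583185 |
| 2021 | Sel_women | 9 | 0.6606521 | 0.1521143 | 0.371933 | 0.796875 |

Source: authors’ own calculations.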

Notes

1. The number of children under six years of age in a household is related to the participation decision but not to the wage, as in Heckman (1974).
2. Of course, there are other approaches to decomposing the gender gap (semiparametric or non-parametric methods, quantile regressions). However, these approaches would first require a counterpart for the pseudo-panel setting.
3. We use a GMMC because migration, variations across types of employment (for example, informal/formal or public/private sector jobs), or changes in marriage and child care responsibilities could violate the assumption of homoskedasticity. Under heteroskedasticity, GMM is more efficient than the IV estimator (Baum et al., 2003).

References

  1. Abadía, L. K. (2005). Discriminación salarial por sexo en Colombia: Un análisis desde la discriminación estadística. Documentos de Economía. Universidad Javeriana-Bogotá.
  2. Arango, L. E., & Flórez, L. A. (2020). Regional labour informality in Colombia and a proposal for a differential minimum wage. The Journal of Development Studies, 2020, 1841170.
  3. Badel, A., & Peña, X. (2010). Decomposing the gender wage gap with sample selection adjustment: Evidence from Colombia. Revista de Análisis Económico, 25(2), 169–191.
  4. Baquero, J. (2001). Estimación de la discriminación salarial por género para los trabajadores asalariados urbanos de Colombia (1984–1999). Informe técnico. Universidad del Rosario, Facultad de Economía.
  5. Baranowska-Rataj, A., & Matysiak, A. (2022). Family size and men’s labor market outcomes: Do social beliefs about men’s roles in the family matter? Feminist Economics, 28(2), 93–118.
  6. Baum, C. F., Schaffer, M. E., & Stillman, S. (2003). Instrumental variables and GMM: Estimation and testing. The Stata Journal, 3(1), 1–31.
  7. Bernat, L. (2005). Análisis de género de las diferencias salariales en las siete principales Áreas Metropolitanas colombianas: ¿Evidencia de discriminación? Documento PNUD. United Nations Development Programme (UNDP).
  8. Biltagy, M. (2014). Estimation of gender wage differentials using Oaxaca decomposition technique. Topics in Middle Eastern and North African Economies, 16(1), 17–42.
  9. Blau, F. D., & Kahn, L. M. (2017). The gender wage gap: Extent, trends, and explanations. Journal of Economic Literature, 55(3), 789–865.
  10. Blinder, A. S. (1973). Wage discrimination: Reduced form and structural estimates. The Journal of Human Resources, 8(4), 436–455.
  11. Buchely, L. (2013). Overcoming gender disadvantages. Social policy: Analysis of urban middle-class women in Colombia. Revista de Economía del Rosario, 16(2), 313–340.
  12. Budlender, D. (2003). The debate about household headship. Social Dynamics, 29(2), 48–72.
  13. Clogg, C. C., Petkova, E., & Haritou, A. (1995). Statistical methods for comparing regression coefficients between models. American Journal of Sociology, 100(5), 1261–1293.
  14. Confecámaras. (2024). Panorama de las mujeres en el ámbito laboral y empresarial. Available online: https://confecamaras.org.co/images/PANORAMA-MUJERES-MARZO-7-3.pdf (accessed on 5 June 2025).
  15. Congreso de la República de Colombia. (2004, August 2). Ley 905. Por medio de la cual se modifica la Ley 590 de 2000 sobre promoción del desarrollo de la micro, pequeña y mediana empresa colombiana y se dictan otras disposiciones. Diario Oficial 45628. Available online: https://www.funcionpublica.gov.co/eva/gestornormativo/norma.php?i=14501 (accessed on 1 July 2025).
  16. Cools, S., Markussen, S., & Strøm, M. (2017). Children and careers: How family size affects parents’ labor market outcomes in the long run. Demography, 54, 1773–1793.
  17. Deaton, A. (1985). Panel data from a time series of cross-sections. Journal of Econometrics, 30, 109–125.
  18. Departamento Administrativo Nacional de Estadística [DANE]. (2020). Participación de las mujeres colombianas en el mercado laboral. Comisión Legal Para la Equidad de la Mujer.
  19. Di Paola, R., & Berges, M. (2000). Sesgo de selección y estimación de la brecha por género para Mar del Plata. Nülan. Deposited Documents, 891. Universidad Nacional de Mar del Plata, Facultad de Ciencias Económicas y Sociales, Centro de Documentación.
  20. Fernández, M. (2006). Determinantes del diferencial salarial por género en Colombia, 1997–2003. Revista Desarrollo y Sociedad, 58, 165–208.
  21. Galvis, L. A. (2011). Diferenciales salariales por género y región en Colombia: Una aproximación con regresión por cuantiles. Revista de Economía del Rosario, 13(2), 235–277.
  22. Greene, W. H. (2003). Econometric analysis (5th ed.). Prentice Hall.
  23. Guillerm, M. (2017). Pseudo-panel methods and an example of application to household wealth data. Economie et Statistique/Economics and Statistics, 491–492, 109–130.
  24. Hansen, L. P. (1982). Large sample properties of generalized methods of moments estimators. Econometrica, 50(4), 1029–1054.
  25. Heckman, J. (1974). Shadow prices, market wages, and labor supply. Econometrica, 42(4), 679.
  26. Heckman, J. (1979). Sample selection bias as a specification error. Econometrica, 47(1), 153–161.
  27. International Labour Organization [ILO]. (2018). Care work and care jobs for the future of decent work. ILO. Available online: https://www.ilo.org/sites/default/files/wcmsp5/groups/public/%40dgreports/%40dcomm/%40publ/documents/publication/wcms_633135.pdf (accessed on 2 January 2025).
  28. International Labour Organization [ILO]. (2020a, March 3). Having kids sets back women’s labour force participation more so than getting married. ILO. Available online: https://ilostat.ilo.org/blog/having-kids-sets-back-womens-labour-force-participation-more-so-than-getting-married/ (accessed on 10 January 2025).
  29. International Labour Organization [ILO]. (2020b, May 15). International day of families: How marital status shapes labour market outcomes. ILO. Available online: https://ilostat.ilo.org/blog/international-day-of-families-how-marital-status-shapes-labour-market-outcomes/ (accessed on 2 January 2025).
  30. Jann, B. (2005, April 8). Standard errors for the Blinder–Oaxaca decomposition. German Stata Users Group Meetings, Stata Users Group, Berlin, Germany.
  31. Jann, B. (2008). The Blinder–Oaxaca decomposition for linear regression models. The Stata Journal, 8(4), 453–479.
  32. Johansson, M., Katz, K., & Nyman, H. (2005). Wage differentials and gender discrimination: Changes in Sweden 1981–1998. Acta Sociologica, 48(4), 341–364.
  33. Kröger, H., & Hartmann, J. (2021). Extending the Kitagawa–Oaxaca–Blinder decomposition approach to panel data. The Stata Journal: Promoting Communications on Statistics and Stata, 21(2), 360–410.
  34. Maubrigades, S. (2020). Participación y segregación ocupacional de género en los sectores económicos de América Latina durante el siglo XX. América Latina en la Historia Económica, 27(3), 1–24.
  35. Mincer, J. (1974). Schooling, experience, and earnings. National Bureau of Economic Research.
  36. Ministerio de Comercio, Industria y Turismo. (2024). Informe de tejido empresarial. Oficina Estudios Económicos. Available online: https://www.mincit.gov.co/getattachment/estudios-economicos/estadisticas-e-informes/informes-de-tejido-empresarial/2024/diciembre/oee-dv-informe-de-tejido-empresarial-diciembre-2024.pdf.aspx (accessed on 5 June 2025).
  37. Mora, J. J. (2017). La informalidad laboral colombiana en los últimos años: Análisis y perspectivas de política pública. Revista De Métodos Cuantitativos Para La Economía y La Empresa, 24, 89–128.
  38. Mora, J. J., & Arcila, A. M. (2014). Brechas salariales por etnia y ubicación geográfica en Santiago de Cali. Revista de Métodos Cuantitativos para la Economía y la Empresa, Universidad Pablo de Olavide, 18(1), 34–53.
  39. Mora, J. J., Castillo, M., & Gomez, G. (2022). Migration and overeducation of Venezuelans in the Colombian labor market. The Indian Journal of Labour Economics, 65(3), 1–65.
  40. Mora, J. J., & Muro, J. (2014). Consistent estimation in pseudo panels in the presence of selection bias. Economics, 8, 1–25.
  41. Mora, J. J., & Muro, J. (2015). Labor market segmentation in a developing country. The Indian Journal of Labour Economics, 58, 477–486.
  42. Mora, J. J., & Muro, J. (2017). Dynamic effects of the minimum wage on informality in Colombia. Labor, 31, 59–72.
  43. Newey, W. K., & McFadden, D. (1994). Large sample estimation and hypothesis testing. In R. Engle, & D. McFadden (Eds.), Handbook of econometrics (pp. 2111–2245). North-Holland.
  44. Oaxaca, R. (1973). Male-female wage differentials in urban labor markets. International Economic Review, 14(3), 693–709.
  45. Oaxaca, R., & Ransom, M. R. (1998). Calculation of approximate variances for wage decomposition differentials. Journal of Economic and Social Measurement, 24, 55–61.
  46. Paz, J. (1998). Brecha de ingresos entre géneros (Comparación entre el Gran Buenos Aires y el Noroeste Argentino). Anales de la AAEP.
  47. Peña, X., Cárdenas, J. C., Ñopo, H., Castañeda, J. L., Muñoz, J. S., & Uribe, C. (2013). Mujer y movilidad social. Documentos CEDE. Universidad de los Andes.
  48. Piñeros, L. A. (2009). Las uniones maritales, los diferenciales salariales y la brecha educativa en Colombia. Revista Desarrollo y Sociedad, 64, 55–84.
  49. Sánchez, R. M. (2020). Poverty and labor informality in Colombia. IZA Journal of Labor Policy, 10(1), 2020.
  50. Tobón, C., & Rodríguez, F. L. (2015). Factores que determinan la probabilidad de participación laboral en el área metropolitana de Medellín [tesis de Maestría, Universidad EAFIT]. Available online: https://repository.eafit.edu.co/server/api/core/bitstreams/e15372c3-81b2-4e62-b53b-94dbd9e2ef4f/content (accessed on 5 June 2025).
  51. Watson, I. (2010). Decomposing the gender pay gap in the Australian managerial labour market. Australian Journal of Labour Economics, 13(1), 49–79.
  52. World Economic Forum. (2021). Global gender gap report 2021. Insight report. Available online: https://www3.weforum.org/docs/WEF_GGGR_2021.pdf (accessed on 5 March 2024).
Table 1. Number of individuals by cohort.

| Cohort, Ci(t) | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | Total |
|---|---|---|---|---|---|---|---|
| 18–22 years old | 18,986 | 18,402 | 17,679 | 17,974 | 9138 | 14,750 | 96,929 |
| 23–27 years old | 22,024 | 21,729 | 21,909 | 22,677 | 11,746 | 18,789 | 118,874 |
| 28–32 years old | 20,119 | 20,198 | 20,255 | 21,709 | 11,638 | 18,627 | 112,546 |
| 33–37 years old | 19,545 | 19,358 | 19,324 | 20,011 | 10,667 | 17,193 | 106,098 |
| 38–42 years old | 16,405 | 16,449 | 16,954 | 18,636 | 9953 | 16,588 | 94,985 |
| 43–47 years old | 16,028 | 15,396 | 14,931 | 16,039 | 8342 | 13,827 | 84,563 |
| 48–52 years old | 15,814 | 15,448 | 15,432 | 15,601 | 8206 | 13,465 | 83,966 |
| 53–58 years old | 15,888 | 16,018 | 16,312 | 17,093 | 9198 | 14,771 | 89,280 |
| 59–63 years old | 8904 | 9353 | 9662 | 10,283 | 5600 | 9456 | 53,258 |
| Total | 153,713 | 152,351 | 152,458 | 160,023 | 84,488 | 137,466 | 840,499 |

Source: authors’ own calculations.
Table 2. Decomposition results without selection bias correction.

Without Selection Bias

| | Pool | Panel Cohort (a) | Pseudo Panel (Deaton) |
|---|---|---|---|
| Differential | 0.12987 *** | 0.14467 *** | 0.20106 *** |
| | (0.00181) | (0.02554) | (0.0000779809) |
| Explained (endowment effect) | −0.08622 *** | −0.05064 ** | −0.02020 *** |
| | (0.00100) | (0.02434) | (0.0000000016) |
| Unexplained (remuneration effect) | 0.21609 *** | 0.19531 *** | 0.22126 *** |
| | (0.00152) | (0.00887) | (0.0000779794) |
| NT, cohorts | 771,194 | 108 | 108 |

Note: *** Statistically significant at the 0.01 level; and ** statistically significant at the 0.05 level. Standard errors appear in parentheses. (a) Since two groups of nine cohorts each are observed over six time periods, the total number of cohort-period observations used in the decomposition amounts to 108.
Table 3. Decomposition results with selection bias correction.

With Selection Bias

| | Pool | Pseudo Panel (MH) |
|---|---|---|
| Differential | 0.21186 *** | 0.14617 *** |
| | (0.00203) | (0.00005) |
| Explained (endowment effect) | −0.08622 *** | −0.05636 *** |
| | (0.00100) | (0.000000025) |
| Unexplained (remuneration effect) | 0.29809 *** | 0.20254 *** |
| | (0.00179) | (0.00005) |
| NT, cohorts | 840,499 | 108 |

Note: *** Statistically significant at the 0.01 level. Standard errors appear in parentheses.
Table 4. Decomposition results with controls.

| | Without Controls | Informality | Informality, Firm Size, and the Tertiary Sector |
|---|---|---|---|
| Differential | 0.14617 *** | 0.15100 *** | 0.15186 *** |
| | (0.00005) | (0.0000400258) | (0.0000394999) |
| Explained (endowment effect) | −0.05636 *** | −0.03976 *** | −0.03541 *** |
| | (0.000000025) | (0.0000000607) | (0.0000000420) |
| Unexplained (remuneration effect) | 0.20254 *** | 0.19076 *** | 0.18727 *** |
| | (0.00005) | (0.0000399651) | (0.0000394579) |
| Cohorts | 108 | 108 | 108 |

Note: *** Statistically significant at the 0.01 level. Standard errors appear in parentheses.
Table 5. Test for the equality of regression coefficients across models.

| | Informality vs. No Controls | Informality, Firm Size, and the Tertiary Sector vs. No Controls | Informality, Firm Size, and the Tertiary Sector vs. Informality |
|---|---|---|---|
| Differential | 75.41 *** | 89.30 *** | 15.29 ** |
| Explained (endowment effect) | 252,868.74 *** | 428,623.47 *** | 58,932.00 *** |
| Unexplained (remuneration effect) | −184.04 *** | −239.74 *** | −62.14 ** |

Note: *** Statistically significant at the 0.01 level; and ** statistically significant at the 0.05 level.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Mora, J.J.; Herrera, D.Y. Pseudo-Panel Decomposition of the Blinder–Oaxaca Gender Wage Gap. Econometrics 2025, 13, 27. https://doi.org/10.3390/econometrics13030027

