Heterogeneity, transient and persistent technical efficiency of Polish crop farms

Andrzej Pisulewski

Cracow University of Economics, Dept. of Econometrics and Operational Research, ul. Rakowicka 27, 31-510 Kraków, Poland.

Jerzy Marzec

Cracow University of Economics, Dept. of Econometrics and Operational Research, ul. Rakowicka 27, 31-510 Kraków, Poland.



Accounting for heterogeneity in the measurement of farm efficiency is crucial to avoid biases related to climate and soil quality diversity in a given area. Therefore, this paper investigates the level of technical efficiency (TE) of Polish crop farms based on several stochastic frontier panel data models with different approaches to the measurement of unobserved heterogeneity and of short- and long-run inefficiency. In our study, we show that ignoring farm heterogeneity can lead to underestimation of the level of TE in conventional stochastic frontier panel data models. Moreover, we have found empirically that not accounting for heterogeneity in the Generalized True Random Effects model may lead to incorrect estimates of persistent TE. The results obtained for Polish crop farms indicate that the level of transient TE (0.81) is lower than the level of persistent TE (0.88). This result suggests that Polish farms may suffer, for example, from problems with adopting new technologies and from poor managerial skills.

Additional keywords: panel data; stochastic frontier analysis; random effects.

Abbreviations used: AIC (Akaike’s Information Criterion); EU (European Union); FADN (Farm Accountancy Data Network); GDP (Gross Domestic Product); GTRE (Generalized True Random Effects); LR (Likelihood Ratio); MSL (Maximum Simulated Likelihood); RTS (Returns to Scale); TE (Technical Efficiency); TRE (True Random Effects).

Authors' contributions: AP: design, drafting of the manuscript, econometric analysis, interpretation of the data and the results. JM: econometric analysis, interpretation of the data and the results, critical revision of the manuscript for important intellectual content.

Citation: Pisulewski, A.; Marzec, J. (2019). Heterogeneity, transient and persistent technical efficiency of Polish crop farms. Spanish Journal of Agricultural Research, Volume 17, Issue 1, e0106.

Supplementary material (Appendix) accompanies the paper on SJAR’s website.

Received: 11 Sep 2018. Accepted: 04 Mar 2019.

Copyright © 2019 INIA. This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International (CC-by 4.0) License.

Funding: National Science Centre, Poland (DEC 2013/09/N/HS4/03833); Faculty of Management at Cracow University of Economics.

Competing interests: The authors have declared that no competing interests exist.

Correspondence should be addressed to Andrzej Pisulewski:





Introduction






Agricultural production is a land-based industry. Therefore, a number of factors, including soil quality, climate conditions and other location-specific characteristics, determine the production technology of farms and should be taken into account in efficiency measurement. In Polish agriculture, these factors vary across the regions of the country. When such differences between farms are observed, i.e. measured by variables that capture the observed heterogeneity, it is possible to incorporate environmental variables (rainfall, nature of the soil, etc.) or regional dummies into the model. However, information on these factors is very rarely available for each farm.

There are other determinants that seem to be important when considering the productivity and efficiency of Polish agriculture. These include the historical and social conditions that determine the agrarian structure of the country. It is important to mention the main historical (external) factors in order to understand the motivation for this research and the interpretation of the results of further analyses. The heterogeneity of Polish agriculture stems from the fact that there are many farms which differ greatly in terms of economic size, utilised agricultural area or organisational culture, for instance in the shares of rented and owned land or of family and hired labour. Small farms were created as a result of the land reform implemented immediately after World War II, in which agricultural land from large estates was confiscated and distributed to small-scale family farmers. The socialist era in Poland (1945–1989) changed nothing in this matter because collectivization failed and 75% of agricultural land was both owned and used by small family farms. State land dominated in the northern and western parts of Poland. The privatization of state land through sale after 1989 had a certain impact on the farm structure. As a result, agricultural land in Poland is highly fragmented in both ownership and land use (Hartvigsen, 2014). These are examples of internal factors that could be taken into account, especially in the case of explicit modelling of heterogeneity.

Moreover, according to Ellman (1981), farms under socialism increase their production until the marginal product equals zero, whereas standard microeconomic theory says that firms increase their production until the value of the marginal product equals the price of the input. Thus, the optimization behaviour under socialism leads to overutilization of inputs and, in consequence, to low productivity. This is exactly the case of Poland, where, as Henningsen (2009) points out, farms were inefficient despite the fact that most of them were private. It seems that joining the European Union (EU) in 2004 did not improve the level of technical efficiency (TE) of Polish farms, since the studies conducted before Poland joined the EU (Brümmer et al., 2002; Latruffe et al., 2004, 2005) as well as the studies conducted within several years after Poland’s accession to the EU (Makieła et al., 2017; Marzec & Pisulewski, 2017, 2019) report a similarly low level of overall TE. However, overall inefficiency can be decomposed into persistent (long-run) inefficiency and transient (short-run) inefficiency. According to Njuki & Bravo-Ureta (2015), long-run inefficiency may vary across farms for various reasons, which include prior institutional and statutory regimes (e.g. minimum pricing laws that effectively set price floors for crop products, food and safety regulations, environmental regulations), while short-run inefficiency may vary because of shocks associated with new production technologies, human capital, and learning-by-doing. Therefore, given the different optimization behaviour of Polish farms before 1989, distinguishing between the persistent (long-run) and transient (short-run) inefficiency levels would make it possible to learn what measures should be taken to improve the overall TE of farms.

In the stochastic frontier literature, the problem of unobserved heterogeneity arises when the explanatory variables do not sufficiently account for the variability of the dependent variable and for the complexity of the relations in the model, which depend on factors not available to or not observed by the researcher. In the simple linear regression case, when omitted factors are not correlated with the explanatory variables, the problem of estimator bias does not appear. Unfortunately, in nonlinear models the lack of correlation between observed and unobserved factors is not sufficient to ignore the latter without consequences. Stochastic frontier analysis assumes that there are unobserved sources (factors) that could cause technological diversity among firms. Firm characteristics omitted during the model building phase can imply heterogeneous technologies described by a production function or a cost function. Consequently, these characteristics affect inefficiency, represented by the one-sided error component. One traditional way to include these hidden variables in the regression model is to add fixed or random effects that can vary across firms and over time. It is important to estimate these hidden effects and then to interpret them in light of economic theory. In stochastic frontier analysis, this is the approach of the commonly used Pitt & Lee (1981) or Battese & Coelli (1992, 1995) random effects models. However, these models fail to distinguish between cross-firm heterogeneity unrelated to inefficiency and inefficiency itself, which leads to biased efficiency estimates (Greene, 2005a,b; Farsi et al., 2005; Abdulai & Tietje, 2007). That is why there is a need for modelling unobserved heterogeneity.

The first approach to statistical modelling of unobserved heterogeneity is the use of mixed models or mixture models. Examples of these models applied in stochastic frontier analysis include the random coefficients stochastic frontier model presented by Tsionas (2002) and Greene’s (2005b) latent class stochastic frontier model, respectively. Another approach to accounting for heterogeneous technologies is the use of the extended fixed effects and random effects stochastic frontier models introduced by Greene (2005a,b). These models are labelled true fixed effects (TFE) and true random effects (TRE). There is also a growing literature on a specific type of unobserved heterogeneity, i.e. spatial heterogeneity (Schmidt et al., 2009; Areal et al., 2012; Pede et al., 2018).

However, none of the above-mentioned models measures persistent (long-run) TE. On the other hand, the Kumbhakar & Heshmati (1995) model, which distinguishes short-run and long-run inefficiency, ignores unobserved heterogeneity. With regard to the above-mentioned issues related to the Polish farm sector, we are primarily interested in measuring the level of TE while taking unobserved heterogeneity into account. Therefore, the model introduced recently by Colombi (2010) and Colombi et al. (2011, 2014), which takes into account all the above-mentioned issues (i.e. heterogeneity, transient inefficiency and persistent inefficiency), seems to be the most appropriate model for our empirical application.

The main contribution of this paper is not a methodological advance but an investigation of the practical application of recently developed models to the analysis of the efficiency of Polish crop farms. To the best of our knowledge, this is the first application of the Colombi (2010) and Colombi et al. (2011, 2014) model to Central and Eastern European agricultural production data. First, this paper shows the consequences of ignoring farm heterogeneity in efficiency analysis. Second, by distinguishing persistent and transient inefficiency for the main agricultural sector in Poland, it pinpoints the areas of agricultural policy that need improvement.

Material and methods

Stochastic frontier production function models

The stochastic frontier models were first proposed by Aigner et al. (1977) and Meeusen & van den Broeck (1977). Pitt & Lee (1981, Model II) generalised this class of models to handle cross-section and time-series data by considering the following model (hereafter referred to as the pooled model and denoted in the present study by M1):

$$y_{it} = h(x_{it}) + v_{it} - u_{it}, \qquad (1)$$

where $y_{it}$ is the natural log of the observed output for firm $i$ ($i = 1, \ldots, N$) in period $t$ ($t = 1, \ldots, T$), $h$ is a known production function, $x_{it}$ is the (row) vector of natural logs of inputs used by the firm and $\beta$ is a (column) vector of $k$ parameters. In practice, this production function is usually specified to be log-linear, i.e. $h(x_{it}) = x_{it}\beta$. Furthermore, $v_{it}$ is a normal random error term with a mean of zero and constant variance $\sigma_v^2$, representing random shocks, $v_{it} \sim N(0, \sigma_v^2)$. The component $u_{it} \geq 0$ is referred to as inefficiency, so the output-oriented TE score is calculated as $TE_{it} = \exp(-u_{it})$. The conventional assumption is that the error term and the inefficiency term are independently and identically distributed across $i$ and $t$. In the study by Aigner et al. (1977), the inefficiency term follows either a normal distribution truncated at zero (half-normal), $u_{it} \sim N^+(0, \sigma_u^2)$, or an exponential distribution. Other commonly adopted distributions are the truncated-normal and gamma distributions (Stevenson, 1980). This study considers only models with a half-normal distribution for the one-sided inefficiency component.
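To make the normal and half-normal specification concrete, the following minimal sketch simulates the two error components of the pooled model and compares the simulated mean of $TE_{it} = \exp(-u_{it})$ with the closed-form mean of a half-normal efficiency term, $E[\exp(-u)] = 2\exp(\sigma_u^2/2)\,\Phi(-\sigma_u)$. The parameter values and function names are illustrative assumptions and are not estimates from this study.

```python
import math
import random
from statistics import NormalDist, fmean


def simulate_composed_errors(n, sigma_u, sigma_v, seed=0):
    """Draw n pairs (v, u) with v ~ N(0, sigma_v^2) and u ~ N+(0, sigma_u^2)."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, sigma_v) for _ in range(n)]
    # The absolute value of a zero-mean normal draw is half-normal.
    u = [abs(rng.gauss(0.0, sigma_u)) for _ in range(n)]
    return v, u


def mean_te_half_normal(sigma_u):
    """Closed-form E[exp(-u)] for u ~ N+(0, sigma_u^2)."""
    return 2.0 * math.exp(0.5 * sigma_u ** 2) * NormalDist().cdf(-sigma_u)


# Illustrative (hypothetical) standard deviations, not taken from the paper.
v, u = simulate_composed_errors(n=100_000, sigma_u=0.3, sigma_v=0.2)
eps = [v_it - u_it for v_it, u_it in zip(v, u)]   # composed error v - u
te = [math.exp(-u_it) for u_it in u]              # TE_it = exp(-u_it)
simulated_mean_te = fmean(te)
closed_form_mean_te = mean_te_half_normal(0.3)
```

The two means should agree up to simulation noise, which illustrates how the scale parameter $\sigma_u$ alone determines the average efficiency level under the half-normal assumption.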

In crop production, the inefficiency term can be used to express errors in management, e.g. related to poor quality of seeds used by farmers or using farm-saved grains for planting, machine performance and other constraining factors. On the other hand, the random shock variable represents those effects which cannot be controlled by farms, such as environmental factors and weather conditions, etc.

The next model (M2) considered in this study is the Pitt & Lee (1981, Model I) model, which, unlike the specification in equation (1), assumes that the inefficiency level is an individual effect, i.e. $u_{it} = u_i$ ($t = 1, \ldots, T$). This assumption is justified when the panel covers a short period of time and when we want to improve the precision of inferences on individual inefficiency (treated as an unobservable firm-specific effect). This model takes the following form:

$$y_{it} = h(x_{it}) + v_{it} - u_i.$$

However, a shortcoming of the Pitt & Lee (1981, Model I) model is that inefficiency is time-invariant. Kumbhakar (1990) and Battese & Coelli (1992) allowed inefficiency to vary over time through the following specification of the inefficiency term:

$$u_{it} = f(t; \eta)\, u_i,$$

where Kumbhakar (1990) defined $f(t; \eta)$ as:

$$f(t; \eta) = \left(1 + \exp(\eta_1 t + \eta_2 t^2)\right)^{-1},$$

while Battese & Coelli (1992) formulated it as:

$$f(t; \eta) = \exp[-\eta (t - T)].$$

In the present study, we employed the Battese & Coelli (1992) model, denoted by M3. It allows for heteroscedasticity of the inefficiency term in the sense that the variance of this zero-truncated component varies over time.
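The two time-scaling functions can be sketched in a few lines. The snippet below evaluates the Kumbhakar (1990) and Battese & Coelli (1992) forms of $f(t; \eta)$ and applies the latter to a firm-level inefficiency $u_i$. The values of $\eta$, $u_i$ and $T$ are hypothetical and chosen only for illustration.

```python
import math


def kumbhakar_1990(t, eta1, eta2):
    """f(t; eta) = (1 + exp(eta1 * t + eta2 * t^2))^(-1)."""
    return 1.0 / (1.0 + math.exp(eta1 * t + eta2 * t ** 2))


def battese_coelli_1992(t, eta, T):
    """f(t; eta) = exp(-eta * (t - T)); equals 1 in the last period t = T."""
    return math.exp(-eta * (t - T))


def time_varying_inefficiency(u_i, scale):
    """u_it = f(t; eta) * u_i for a firm-specific inefficiency u_i >= 0."""
    return u_i * scale


# Hypothetical values: eight periods, eta > 0, firm-level inefficiency 0.2.
T = 8
path = [
    time_varying_inefficiency(0.2, battese_coelli_1992(t, eta=0.05, T=T))
    for t in range(1, T + 1)
]
```

With a positive $\eta$ the Battese & Coelli (1992) function declines towards one as $t$ approaches $T$, so inefficiency falls over time; a negative $\eta$ would imply rising inefficiency.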

To conclude, the above-mentioned models measure either time-invariant inefficiency (M2) or time-varying inefficiency (M1, M3). These models should be treated as reference points for building advanced models that account for heterogeneity.

Kumbhakar & Heshmati (1995) were the first to propose measuring both types of inefficiency in the same model (hereafter referred to as M4). Their model is as follows:

$$y_{it} = x_{it}\beta + v_{it} - h_i - u_{it},$$

where $h_i \sim N^+(0, \sigma_h^2)$ is interpreted as the persistent (firm-specific, time-invariant or long-run) component of inefficiency, and $u_{it} \sim N^+(0, \sigma_u^2)$ varies randomly across firms as well as over time and represents the transient (short-run) part of inefficiency. Note that this model does not include individual-specific effects to control for unobserved time-invariant heterogeneity. Kumbhakar & Heshmati (1995) proposed a multi-stage procedure to estimate the parameters of this model and TE.

Subsequently, Greene (2005a,b) reinterpreted the Kumbhakar & Heshmati (1995) model. He proposed to interpret the time-invariant component as firm-specific effects that differ across cross-sectional units, are not captured by the explanatory variables and are distinct from inefficiency. This extension of the stochastic frontier accounts for unmeasured heterogeneity as well as time-varying firm inefficiency, but not for time-invariant inefficiency. The model (hereafter referred to as M5) is called the TRE model and takes the following form:

$$y_{it} = x_{it}\beta + w_i + v_{it} - u_{it},$$

where the heterogeneous intercept $w_i \sim N(0, \sigma_w^2)$ is treated as a random variable and represents the effects of unobserved variables that are specific to firm $i$ and constant over time. In contrast to inefficiency, this random variable can take both positive and negative values.

As in the Kumbhakar & Heshmati (1995) model, this model includes an additional time-invariant random component, so the marginal probability density function of the observations ($y_{it}$) has no closed form. For this reason, as an alternative to the multi-stage estimation procedure employed by Kumbhakar & Heshmati (1995), Greene (2005a,b) proposed the maximum simulated likelihood (MSL) method. Estimating the parameters of this model requires gradient-based optimization procedures. Therefore, the Appendix [suppl] provides the first- and second-order derivatives of the simulated log-likelihood function for this model, which takes the following form:

$$\ln L_S = \sum_{i=1}^{N} \ln \left[ \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{2}{\sigma}\, \phi\!\left(\frac{\varepsilon_{it,r}}{\sigma}\right) \Phi\!\left(-\frac{\lambda\, \varepsilon_{it,r}}{\sigma}\right) \right],$$

where the composed error term $\varepsilon_{it,r} = y_{it} - x_{it}\beta - \sigma_w W_{i,r}$ has, conditional on $w_i$, a skew-normal distribution with parameters $\lambda = \sigma_u / \sigma_v$ and $\sigma = (\sigma_u^2 + \sigma_v^2)^{0.5}$. Furthermore, $W_i$ is an auxiliary standard normal variable corresponding to the individual-specific effect $w_i$ (i.e. $W_i = w_i / \sigma_w$), $\phi$ and $\Phi$ denote the standard normal probability density function and cumulative distribution function, and $R$ is the number of simulated draws of $W_i$.
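The simulated likelihood of the TRE model can be illustrated with a short sketch. The code below evaluates the simulated log-likelihood for given parameter values by averaging, over $R$ standard normal draws per firm, the product of normal and half-normal composed-error densities $(2/\sigma)\,\phi(\varepsilon/\sigma)\,\Phi(-\lambda\varepsilon/\sigma)$. It is an assumption-laden illustration rather than the estimation code used in the study: the function names and the nested-list data layout are hypothetical, and no optimizer or analytical derivatives are included.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def composed_error_density(eps, sigma_u, sigma_v):
    """Density of eps = v - u with v ~ N(0, sigma_v^2), u ~ N+(0, sigma_u^2)."""
    sigma = math.sqrt(sigma_u ** 2 + sigma_v ** 2)
    lam = sigma_u / sigma_v
    return (2.0 / sigma) * STD_NORMAL.pdf(eps / sigma) * STD_NORMAL.cdf(-lam * eps / sigma)


def simulated_loglik_tre(y, x, beta, sigma_u, sigma_v, sigma_w, draws):
    """Simulated log-likelihood of the TRE model.

    y[i][t]     : log output of firm i in period t
    x[i][t]     : list of regressors (logs of inputs) of firm i in period t
    draws[i][r] : standard normal draws W_{i,r} for firm i
    """
    total = 0.0
    for y_i, x_i, draws_i in zip(y, x, draws):
        likelihood_sum = 0.0
        for w_draw in draws_i:
            product = 1.0
            for y_it, x_it in zip(y_i, x_i):
                fitted = sum(b * value for b, value in zip(beta, x_it))
                eps = y_it - fitted - sigma_w * w_draw
                product *= composed_error_density(eps, sigma_u, sigma_v)
            likelihood_sum += product
        # Average over the R draws, then take logs and sum over firms.
        total += math.log(likelihood_sum / len(draws_i))
    return total


def standard_normal_draws(n_firms, n_draws, seed=0):
    """Fixed draws, reused at every likelihood evaluation during optimization."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_draws)] for _ in range(n_firms)]
```

In practice the draws are held fixed across iterations of the optimizer, and for long panels the product over $t$ is better accumulated in logs to avoid numerical underflow.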

Moreover, we employed the model (hereafter referred to as M6) proposed by Colombi (2010) and Colombi et al. (2011, 2014), which takes into account all the above-mentioned factors (i.e. time-varying inefficiency, time-invariant inefficiency and heterogeneity). Tsionas & Kumbhakar (2014) called it the Generalized True Random Effects (GTRE) model, which in the case of a stochastic production function takes the following form:

$$y_{it} = x_{it}\beta + w_i - h_i + v_{it} - u_{it},$$

where $h_i$ is a non-negative disturbance, in this case a half-normal random variable with parameter $\sigma_h^2$, identical to that proposed by Kumbhakar & Heshmati (1995). Following Kumbhakar et al. (2014), the model can be estimated using a multi-stage procedure, whose statistical properties are, however, difficult to establish. Alternatively, the full maximum likelihood estimator can be employed, especially since, as shown by Colombi et al. (2014), it is more efficient and less biased than the multi-stage estimator. Other methods include the Bayesian techniques presented by Tsionas & Kumbhakar (2014) or Makieła (2017). Finally, the estimation of this rather complicated model was considerably simplified by Filippini & Greene (2016), who used the MSL estimator. The simulated likelihood for this model is analogous to that for the TRE model; the only difference lies in the definition of the composed error term, which additionally includes the simulated persistent inefficiency component.
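A hedged sketch of the composed error that would enter the simulated likelihood of the GTRE model, by analogy with the TRE case, is given below. It assumes that the persistent inefficiency is simulated as $h_i = \sigma_h \lvert H_{i,r} \rvert$, where $H_{i,r}$ is a second auxiliary standard normal draw. The symbol $H_{i,r}$ is introduced here for illustration only and does not appear in the text above.

```latex
\varepsilon_{it,r} = y_{it} - x_{it}\beta - \sigma_w W_{i,r} + \sigma_h \lvert H_{i,r} \rvert ,
\qquad W_{i,r},\ H_{i,r} \sim N(0, 1).
```

The sign of the last term follows from the model equation: since $h_i$ enters the frontier with a minus sign, it is added back when solving for $v_{it} - u_{it}$.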

The commonly used estimator of technical inefficiency is the one proposed by Jondrow et al. (1982), which yields the efficiency score $\exp\{-E(u_{it} \mid \varepsilon_{it})\}$. However, Colombi (2010) and Colombi et al. (2011, 2014) followed the suggestion of Battese & Coelli (1988) to compute technical efficiency as the conditional expectation of $TE_{it}$ given the composed error, $E[\exp(-u_{it}) \mid \varepsilon_{it}]$, obtaining the following formula:

where $e_i = y_i - X_i\beta$, $A = [\iota_T \;\; I_T]$, $\Sigma = \sigma_v^2 I_T + \sigma_w^2 \iota_T \iota_T'$, $\Lambda = (V^{-1} + A'\Sigma^{-1}A)^{-1}$ and $R = -\Lambda A'\Sigma^{-1}$.

Furthermore, $y_i$ is a vector of $T$ observations on firm $i$, $X_i$ is the $T \times k$ matrix of values of the regressors, $u_i = [h_i, u_{i1}, \ldots, u_{iT}]'$, $I_T$ is the identity matrix of dimension $T$, $\iota_T$ is a column vector of ones and $0_T$ is a vector of zeros. In turn, $\tau$ is a column vector of ones and zeros that indicates which efficiency measure is calculated. For example, $\tau' = [1 \; 0 \ldots 0]$ indicates that the conditional expected value for the time-invariant inefficiency (for $h_i$) is calculated, whereas $\tau' = [0 \; 0 \ldots 1]$ leads to $E[\exp(-u_{iT}) \mid y_i]$. $\Phi_{T+1}(\mu, \Omega)$ is the probability that a $(T+1)$-variate normal random variable with expected value $\mu$ and variance matrix $\Omega$ belongs to the positive orthant. The multivariate distribution function was computed using the algorithm proposed by Geweke (1989), Hajivassiliou (1990) and Keane (1994), which is known in the econometric literature as the GHK simulator. In particular, we used the GAUSS code presented by Hajivassiliou (2000).

TE was estimated by the above estimator in models M4 and M6. TE in models M1, M2, M3 and M5 was calculated using the Battese & Coelli (1988) formula. However, in the case of the TRE model, to obtain a conditional mean of $TE_{it}$ given the observations ($y_i$, $X_i$ for $i = 1, \ldots, N$), $w_i$ must first be integrated out of the following expression: $TE_{it} = E[\exp(-u_{it}) \mid \varepsilon_{it}, W_i]$, where $\varepsilon_{it} = y_{it} - x_{it}\beta - \sigma_w W_i$. This integral cannot be computed in closed form. Therefore, following Greene (2008), we approximated it by simulation (drawing for $r = 1, \ldots, R$). As a result, this estimator of TE takes the following form:

where $e_{it} = y_{it} - x_{it}\beta$ and $L_{S,i,r}$ is the simulated likelihood for the $T$ observations on individual $i$, evaluated at the MSL estimates of $\sigma$, $\lambda$, $\sigma_w$ and $\beta$.
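For a single observation of the normal and half-normal model, the two point estimators mentioned in this section can be sketched as follows. The code uses the standard conditional distribution of $u$ given $\varepsilon$, which is normal with location $\mu_* = -\varepsilon\sigma_u^2/\sigma^2$ and scale $\sigma_* = \sigma_u\sigma_v/\sigma$, truncated at zero. It covers only the per-observation case, not the panel or simulated versions, and the function names and example values are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def conditional_moments(eps, sigma_u, sigma_v):
    """Location and scale of u | eps before truncation at zero."""
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    return mu_star, sigma_star


def te_battese_coelli(eps, sigma_u, sigma_v):
    """E[exp(-u) | eps], the Battese & Coelli (1988) type estimator."""
    mu_star, sigma_star = conditional_moments(eps, sigma_u, sigma_v)
    z = mu_star / sigma_star
    ratio = STD_NORMAL.cdf(z - sigma_star) / STD_NORMAL.cdf(z)
    return ratio * math.exp(-mu_star + 0.5 * sigma_star ** 2)


def te_jondrow(eps, sigma_u, sigma_v):
    """exp(-E[u | eps]), based on the Jondrow et al. (1982) estimator."""
    mu_star, sigma_star = conditional_moments(eps, sigma_u, sigma_v)
    z = mu_star / sigma_star
    expected_u = mu_star + sigma_star * STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)
    return math.exp(-expected_u)


# Hypothetical residuals and scale parameters, for illustration only.
scores = [(te_battese_coelli(e, 0.3, 0.2), te_jondrow(e, 0.3, 0.2))
          for e in (-0.5, 0.0, 0.3)]
```

By Jensen's inequality the first estimator is never smaller than the second, and both increase with the residual, since a larger residual signals less inefficiency.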

To conclude this section, Table 1 presents the assumptions about the stochastic specification of the six models employed in the present study. It is noteworthy that two models (M4, M5) are not nested in each other, while model M3 has an entirely different construction compared to M4, M5 or M6. Therefore, Akaike’s information criterion (AIC), which balances goodness of fit against model complexity, was used to rank the models.
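Ranking non-nested models by AIC can be sketched in a few lines, using the usual definition $AIC = 2k - 2\ln L$, where $k$ is the number of estimated parameters. The log-likelihood values and parameter counts below are hypothetical placeholders, not the results reported in this study.

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion: lower values indicate a preferred model."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) pairs for illustration.
candidates = {
    "model_a": (-120.0, 18),
    "model_b": (-95.0, 19),
    "model_c": (-94.5, 20),
}

# Sort model names from the lowest (best) to the highest AIC.
ranking = sorted(candidates, key=lambda name: aic(*candidates[name]))
```

In this made-up example the extra parameter of `model_c` improves the log-likelihood by only 0.5, which is not enough to offset the penalty, so `model_b` is ranked first.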

Table 1. Specification of the models used in this study.

Data on Polish crop farms

The dataset used for the analysis includes Polish farms specialised in crop production. The selection criterion was the FADN (Farm Accountancy Data Network) classification, where it is stated that a farm is classified as a crop farm if two-thirds of the standard output comes from general cropping (EC, 2008). The precise definition of the variables in the production function is based on other studies on the field crop sector in which FADN data were used (Latruffe et al., 2004; Bojnec & Latruffe, 2009; Zhu & Lansink, 2010). Therefore, the output (Q) is specified as the deflated total net farm revenues from sales (deflated with base year 2004) excluding the value of feed, seeds and plants produced within the farm. Price indices of agricultural production (i.e. crop and animal market prices provided by the Central Statistical Office of Poland) are used as deflators. The four factors of production are defined as follows:

1. Physical capital (K) is measured in terms of deflated book value. It includes fixed capital such as buildings and fixed equipment, as well as machines and irrigation equipment. The aggregate of this input was deflated by the price index for machinery and equipment for agriculture, and building construction.

2. Total labour (L) is measured in hours. This measure includes both hired and family labour declared by the farmer during the interview.

3. Total utilised agricultural area (A, in hectares) refers to owned and rented land.

4. Materials (M) consist of several subcategories: seeds and plants, fertilizers, crop protection, purchased feed, crop and livestock specific costs, energy and services. Originally, these subcategories are measured as the costs of resources used in farm production. In order to deflate the total reported expenditure on materials, we used price indices provided by the Central Statistical Office for each subcategory. An aggregate measure of materials is calculated by deflating the total cost of all items with a share-weighted average price index constructed using the expenditure share for all the components. Furthermore, we excluded the value of seeds and feed produced within the farm from this category to avoid double measuring these costs.

The stochastic frontier models of Polish farms specialised in field crops were estimated using a yearly data set covering a sample of 660 farms from 2004 to 2011. The sample summary statistics are presented in Table 2. The arithmetic mean area of land per farm in the sample is 43 ha. However, since FADN data are biased toward larger farms, the average area per farm in the whole population is in fact smaller: it amounted to 8.9 ha in 2010 (CSO, 2016). By contrast, crop farms in Western Europe are larger. For example, the average land area per owner is 163 ha in Germany, 71 ha in the Netherlands, 115 ha in Sweden (Zhu & Lansink, 2010) and 142.9 ha in France (Latruffe et al., 2012). Crop farms in other Central and Eastern European countries are also much larger than in Poland. For instance, the average utilised agricultural area is 144 ha in the Czech Republic (Latruffe et al., 2008) and 226.4 ha in Hungary (Latruffe et al., 2012). The average size of a private farming company in Slovakia is 1030 ha, while that of an agricultural cooperative is 1620 ha (Fandel, 2003). Farms of a similar size to Polish ones can be found in Slovenia, where the average land size is 20 ha (Bojnec & Latruffe, 2013).

Table 2. Descriptive statistics for the variables in the sample.

As shown in Table 2, during the period from 2004 to 2011 the annual average crop production per farm was about 29,000 € (at an exchange rate of 4.15 Polish zloty per euro). The average yearly revenue earned from selling crops was approximately 713 €/ha. The 5th and 95th percentiles of the empirical distribution of output per hectare show that there are large differences in profitability and in the effectiveness of the production process between farms. Moreover, the average labour productivity in Polish crop farms was 10.4 €/h. According to PORDATA (2017), gross domestic product (GDP) per hour worked in Poland was 7.2 € in 2004 and reached 12 € in 2011. Therefore, the agricultural labour productivity obtained from the sample data is similar in value to the GDP per hour worked published in official statistics. It is noteworthy that GDP per hour worked in the 28 countries that presently constitute the EU was 30.2 € in 2004 and 35.3 € in 2011 (see PORDATA, 2017). This shows that labour productivity in Poland is still very low compared to the EU average (of the 28 member states), and this also applies to agriculture. However, Bulgaria, Latvia, Lithuania and Romania have lower indicators than Poland.

Table 2 also reveals that family labour accounts for 69% of the total labour force. The share of owned land in the total utilised agricultural area is also high (about 55%). Moreover, every second farm did not use hired labour and over 29% of farms reported that they used only their own land. Another important factor is the difference between farms depending on whether they grow root crops or cereals. Root crops are generally expensive to grow since they require organic fertilizers. Cereals occupied 65% of the total reported crop acreage, and the share of crop production in total agricultural output equals 59%. Cereal crops and the use of owned land dominated in almost 75% of all the farms under study. About 85% of the farms earn revenue from selling cereals as well as other crops (including oilseeds and fodder crops). Thus, 15% of the farms covered in the sample report that they earn revenue solely from cereal production.

Consequently, we can observe a large diversity among the studied farms in terms of the type of field crops, their area and the labour productivity of this agricultural production system. Likewise, many of the selected farmers decide not to employ hired labour because they have realised that family labour is cheaper. A similar situation applies to the utilised agricultural area they own. Although the farms are small and should probably rent land from others to improve their economic performance, they do not do so. Instead, they cultivate small areas in order to receive subsidies, treating agricultural activity as an additional source of income.

The relationship between production and the use of the four main inputs (capital, labour, materials and agricultural area) is one of the main issues in this line of research. Figure 1 illustrates the dependence between farms’ output and inputs, with the data averaged over time using a geometric mean (a few outliers were removed to improve the readability of the graphs). The plots show a monotonic or linear relationship between production and all inputs except one. The absence of a visible relationship between labour and production is puzzling, since the estimate for this effect turned out to be high and statistically significant, as opposed to the relationship between production and capital. Furthermore, there are not many large farms in the sample analysed here. Therefore, small and medium-sized farms dominate in this sector in Poland.

Figure 1. Output (production) versus capital, labour, materials and agricultural area in the sample (cross-sectional data averaged over time). Source: FADN data.

In our research, the panel dataset is large and includes farms that differ in terms of the factors discussed earlier. For this reason, the sample is clearly heterogeneous. As a result, the typical modelling of the production function, based on the translog specification, may be insufficient to cope adequately with the heterogeneity of farm economic activity. In this context, the more general models discussed above can deal with heterogeneous data without additional variables that would control for heterogeneity.


The most popular functional form of the production frontier is the translog, which belongs to the family of flexible functional forms (Christensen et al., 1973). These functional forms are commonly used in applied econometrics, including production and cost analysis. The translog is a second-order local approximation of any twice-differentiable function and, importantly, it satisfies Diewert’s minimum flexibility requirement for a flexible form. In our study, the deterministic kernel of the stochastic production frontier is given in this form (with four production factors, i.e. J = 4):

where the lower-case $x$ denotes the natural logs of inputs. The translog form (12) with a linear trend in the parameters has been used, for example, by Coelli et al. (2005). The time trend included in the equation above can be treated as an additional input and, in consequence, it allows for non-neutral technical change, i.e. technical change that raises the productivity of some factors more than others. A further advantage of this form is that the elasticities with respect to inputs and the economies of scale may change over time.
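How the elasticities of a translog frontier vary with input levels and time can be sketched as follows. The snippet assumes the commonly used time-trend translog, $\ln y = \beta_0 + \sum_j \beta_j x_j + \tfrac{1}{2}\sum_j\sum_k \beta_{jk} x_j x_k + \sum_j \beta_{jt} x_j t + \beta_t t + \tfrac{1}{2}\beta_{tt} t^2$ with symmetric $\beta_{jk}$; whether this matches equation (12) term by term is an assumption, and all coefficient values are hypothetical.

```python
def translog_elasticities(x, t, beta, beta_jk, beta_jt):
    """Output elasticities d ln y / d x_j = beta_j + sum_k beta_jk x_k + beta_jt t.

    x       : logs of the J inputs (mean-corrected, so 0 is the sample mean)
    t       : time trend (mean-corrected in the same way)
    beta    : first-order coefficients, length J
    beta_jk : symmetric J x J matrix of second-order coefficients
    beta_jt : input-by-time interaction coefficients, length J
    """
    J = len(x)
    return [
        beta[j] + sum(beta_jk[j][k] * x[k] for k in range(J)) + beta_jt[j] * t
        for j in range(J)
    ]


def returns_to_scale(elasticities):
    """Returns to scale as the sum of the output elasticities."""
    return sum(elasticities)


# Hypothetical coefficients for J = 4 inputs, for illustration only.
beta = [0.10, 0.30, 0.20, 0.50]
beta_jk = [[0.0] * 4 for _ in range(4)]
beta_jt = [0.0, 0.0, 0.0, 0.01]

at_mean = translog_elasticities([0.0] * 4, 0.0, beta, beta_jk, beta_jt)
two_periods_later = translog_elasticities([0.0] * 4, 2.0, beta, beta_jk, beta_jt)
```

At the point where all mean-corrected variables equal zero, the elasticities reduce to the first-order coefficients, which is why those coefficients can be read directly as elasticities at the sample mean.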

Table 3 shows the results of the estimation of the considered models. Due to mean-correction of the data prior to estimation, the first order parameters can be interpreted as elasticities of production with respect to inputs evaluated at the geometric mean of the data.

Table 3. Estimates for parameters of stochastic frontier models (standard errors in parentheses).

The model selection based on AIC (see Table 4) suggests that the model best supported by the data is the GTRE model. The elasticity with respect to capital was found to be statistically non-significant in this model. The highest output elasticity is that of materials, approximately 0.66. The lowest of the significant elasticities is that of area, which amounts to 0.22. In addition, a 1% increase in labour use, keeping all other explanatory variables constant, leads to a 0.27% increase in production. The sum of the input elasticities is interpreted as returns to scale (RTS). The estimated RTS, at the geometric mean of the data, is approximately 1.16, which indicates increasing returns to scale.
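As a rough consistency check on the figures quoted above, the returns to scale can be written out as the sum of the four elasticities. The capital elasticity $\epsilon_K$ is not quoted in the text, so the value implied below is only an approximation derived from rounded numbers.

```latex
\mathrm{RTS} = \epsilon_K + \epsilon_L + \epsilon_A + \epsilon_M
\approx \epsilon_K + 0.27 + 0.22 + 0.66 = \epsilon_K + 1.15 ,
\qquad
\mathrm{RTS} \approx 1.16 \;\Rightarrow\; \epsilon_K \approx 0.01 .
```

An implied capital elasticity close to zero would be in line with the reported lack of statistical significance of this parameter, although the exact value depends on rounding.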

Table 4. Model selection based on Akaike’s information criterion (AIC) and on the LR test ($G^2$).

Moreover, we tested the GTRE model against the TRE and Kumbhakar & Heshmati (1995) models using the likelihood-ratio (LR) test. The comparison of model M6 with models M5 and M4 indicates that the persistent component of inefficiency is generally less preferred than the random farm effects that capture heterogeneity. AIC confirms this conclusion. As Colombi et al. (2011) point out, testing restrictions in stochastic frontier models is a non-standard problem, since the distribution of the LR statistic is a mixture of $\chi^2$ distributions, denoted by $\bar{\chi}^2$. Determining the critical value for a one-sided LR test ($G^2$) is simple for a single restriction, since for a test of size $\alpha$ it equals the critical value of the $\chi^2(1)$ distribution for a standard test of size $2\alpha$ (Coelli, 1995). The extension to multiple restrictions is more complicated (Gourieroux et al., 1982). A table of critical values for the $\bar{\chi}^2$ distribution was provided by Kodde & Palm (1986). The results indicate that the random components are statistically significant and cannot be excluded from model M6.
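The single-restriction rule described above can be sketched with the standard library alone, using the fact that a $\chi^2(1)$ variable is the square of a standard normal variable. The function names are hypothetical, and the log-likelihood values passed to `lr_statistic` in any application would come from the estimated models.

```python
from statistics import NormalDist


def chi2_one_df_quantile(p):
    """Quantile of the chi-square(1) distribution via the standard normal."""
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    return z * z


def one_sided_lr_critical_value(alpha):
    """Critical value for a one-sided LR test of a single restriction.

    For a test of size alpha, this is the chi-square(1) critical value
    of a standard test of size 2 * alpha.
    """
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    return chi2_one_df_quantile(1.0 - 2.0 * alpha)


def lr_statistic(loglik_restricted, loglik_unrestricted):
    """LR statistic G^2 = 2 * (ln L_unrestricted - ln L_restricted)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


critical_value = one_sided_lr_critical_value(0.05)
```

At the 5% level the resulting critical value is about 2.71 rather than the usual 3.84, so the one-sided test rejects the restricted model more readily than a naive $\chi^2(1)$ test would.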

The aggregate results of efficiency estimates are presented in Table 5 and they require further explanation. It should be noted that the TE scores differ across the two types of models. The estimated efficiency scores obtained in simpler models (M1, M2, M3, M5) are clearly greater than in models (M4, M6) that include the unit-specific intercept. It should be highlighted that the heterogeneous intercept term is a real-valued random variable with a distribution symmetric around zero. Therefore, it partly captures inefficiency effects. Consequently, the TRE model (M5) exhibits the highest TE scores, while in model M4 these scores are the lowest due to the presence of two inefficiency terms and the lack of random farm effects. This result shows that it is difficult to separate inefficiency from unobserved individual effects. In summary, the omission of the latent heterogeneity results in overestimation of the long-run inefficiency, but not of the short-run inefficiency. In our empirical example, the omission of the long-run inefficiency inflates the latent-heterogeneity component, but not the short-run inefficiency.

Table 5. Estimates of technical efficiency (TE) scores - Descriptive statistics from the sample.

Regarding the detailed discussion about the GTRE model, it can be observed that the level of persistent TE is 0.88, while the level of transient TE is 0.81. Therefore, the overall level of TE amounts to 0.71. The obtained overall efficiency from the GTRE model is, as mentioned and explained earlier, lower than that of the TRE model.
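The overall figure follows from the multiplicative decomposition of efficiency used in the GTRE framework, in which overall TE is the product of the persistent and transient components. The sketch below reproduces the arithmetic using the reported sample averages. It is an approximation: the mean of observation-level products need not equal the product of the means exactly. The function name is illustrative.

```python
def overall_te(persistent_te, transient_te):
    """Overall technical efficiency as the product of its two components."""
    for value in (persistent_te, transient_te):
        if not 0.0 <= value <= 1.0:
            raise ValueError("efficiency scores must lie in [0, 1]")
    return persistent_te * transient_te


# Average scores reported for the GTRE model in the text.
overall = overall_te(persistent_te=0.88, transient_te=0.81)
print(f"{overall:.2f}")  # 0.71
```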

Figure 2 shows considerable variation in the transient TE. Unfortunately, no general pattern can be distinguished in the efficiency scores. We can, however, notice that after the initial downturn in the first year of EU membership, TE improved until 2007. Subsequently, probably because of the economic crisis that affected all sectors, TE decreased.

Figure 2. Efficiency decomposition into persistent and transient components.

Some of the models have time-varying efficiency and produce NT technical efficiency scores, while others are time-invariant and produce N scores. Therefore, in order to calculate the correlation coefficient between NT and N scores, we averaged the efficiency scores from the time-varying models over the sample period for each farm i.
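The averaging step can be illustrated with a minimal sketch. The farm labels and scores below are made-up numbers chosen purely for illustration and do not come from the FADN sample; the function names are also hypothetical. Each farm's time-varying scores are first averaged over the sample period, and the resulting N averages are then correlated with the N time-invariant scores.

```python
from math import sqrt


def farm_means(scores_by_farm):
    """Average each farm's time-varying TE scores over the sample period."""
    return {farm: sum(s) / len(s) for farm, s in scores_by_farm.items()}


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# ILLUSTRATIVE DATA ONLY: three hypothetical farms observed over three years.
time_varying = {
    "farm_a": [0.80, 0.84, 0.82],
    "farm_b": [0.70, 0.74, 0.72],
    "farm_c": [0.90, 0.88, 0.92],
}
time_invariant = {"farm_a": 0.85, "farm_b": 0.80, "farm_c": 0.95}

means = farm_means(time_varying)
farms = sorted(means)  # fix the farm order so both vectors are aligned
r = pearson([means[f] for f in farms], [time_invariant[f] for f in farms])
print(f"r = {r:.3f}")
```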

Table 6 indicates a close connection between the transient efficiency scores, in particular between M1 and M4, M1 and M5, and M1 and M6. Moreover, there is a nearly perfect correlation between M4 and M5, M4 and M6, and M5 and M6. Similarly, in the case of persistent TE scores, there is a nearly perfect correlation between M4 and M2, M6 and M2, and M4 and M6.

Table 6. Pearson’s correlation coefficients between efficiency scores.


In the present study, we employed several stochastic frontier panel data models. It was revealed that the elasticities of production with respect to inputs are similar in all models except the pooled model. Therefore, our results support the conclusions drawn by Greene (2005a,b) as well as by Filippini & Greene (2016) that TRE and GTRE models produce very similar results for structural parameters to the Pitt & Lee (1981) model with time-invariant inefficiency. However, unlike Greene (2005a,b), our results are quite different from those of the pooled model (similar results were obtained by Abdulai & Tietje (2007)). Hence, all models except M1 give similar conclusions for the elasticities. Similarly, in all considered models we found increasing returns to scale at the geometric mean of the data. However, the estimates of TE are sensitive to model specification. Based on AIC, we can conclude that the GTRE model is the best among the considered models, i.e. the pooled Pitt & Lee (1981, Model II), Pitt & Lee (1981, Model I), Battese & Coelli (1992), Greene’s TRE and Kumbhakar & Heshmati (1995) models. In particular, our empirical study shows that ignoring heterogeneity in the production function leads to incorrect results of efficiency measurement. The studied dataset showed that omitting the latent heterogeneity component leads to lower overall TE scores in the pooled, Pitt & Lee (1981), Battese & Coelli (1992) and M4 models.

On the one hand, when the Kumbhakar & Heshmati (1995) model (which does not account for unobserved heterogeneity) was adopted, the persistent inefficiency was revealed to be relatively high, so this model might confound persistent inefficiency with heterogeneity. On the other hand, it was found that if the model without the long-run inefficiency is adopted, the variance of farm effects is inflated. Moreover, as pointed out by Colombi et al. (2014), the TRE model fails to account for persistent inefficiency; hence the estimated efficiencies are likely to be higher than the overall efficiencies from the GTRE model.

The differences in TE scores between the considered models presented above show that using an appropriate model is essential, since different models lead to different policy recommendations. In our empirical analysis, the models which measure overall TE and do not account for unobserved heterogeneity (M1, M2 and M3) indicate its low level. On the contrary, model M5, which also measures overall TE but accounts for heterogeneity, indicates higher efficiency of farms. The most noticeable difference is between the M4 and M6 models. While the former model indicates that policy makers should take measures to eliminate the long-run inefficiency, in the latter model (GTRE) it was found that it is the transient efficiency that contributes most to the low overall TE score. Therefore, it seems that accounting for unobserved heterogeneity is crucial in formulating proper policy recommendations. Additionally, it is noteworthy that adopting different distributional assumptions for the inefficiency term leads to lower TE scores in the M2 and M3 models (see Marzec & Pisulewski (2019) for comparison).

Moreover, the estimated level of overall TE is lower than that obtained in the previous studies on crop farms in Western Europe (Zhu & Lansink, 2010; Kumbhakar et al., 2014). Previous studies on Polish crop farms reported a similar overall average TE score, i.e. 0.73 (Latruffe et al., 2004). However, in the case of dairy farms, Marzec & Pisulewski (2017) found a higher average level of efficiency scores (0.83).

We found a higher level of persistent TE than transient TE in the GTRE model. This result contradicts the findings of Kumbhakar et al. (2014), who reported higher transient TE than persistent TE for Norwegian grain farms. Similarly, Lachaud et al. (2015) reported transient TE higher than or equal to persistent TE of agriculture across all analysed Latin American countries.

Generally, the correlation coefficients between the values of the transient TE (M1, M4, M5, M6) and persistent TE (M2, M5, M6) are relatively low. However, there are two exceptions. First, the efficiency scores from the pooled model are closely related to the Pitt & Lee (1981) estimates. A similar result was obtained by Farsi et al. (2005), who pointed out that both models, although affected by heterogeneity bias in the coefficients, have a reasonable 'mutual consistency' with regard to efficiency estimation. Second, we also found that TE scores obtained from the Battese & Coelli (1992) model (which aims to measure transient TE) are perfectly correlated with persistent TE scores from models M2, M5 and M6, while poorly correlated with transient TE scores from models M4, M5 and M6. Therefore, it seems that this specification fails to measure the transient efficiency. Similarly, Greene (2005a,b) obtained nearly identical results with the Battese & Coelli (1992) model and the Pitt & Lee (1981) model for a panel of U.S. commercial banks.

In terms of agricultural policy, we showed that the low overall TE of Polish crop farms is mainly due to the transient part. Therefore, in order to eliminate technical inefficiency, agricultural policy should focus on factors affecting the short-term inefficiency. Following Njuki & Bravo-Ureta (2015), these factors include adoption of new technologies, managerial skills, knowledge transfer and agricultural education. This conclusion is supported by an earlier study by Henningsen (2009), who pointed out that poor managerial skills and low education of Polish farmers prevent them from making optimal use of new technologies. Another factor which may decrease the transient TE is the considerable level of subsidies under the Common Agricultural Policy (CAP), which allow inefficient farms to survive. Different types of CAP subsidies may provide no incentives to improve efficiency. There are numerous studies on the impact of CAP subsidies on TE. However, these studies usually concern only one type of efficiency, i.e. transient or persistent. Therefore, which type of efficiency is affected by subsidies remains an unresolved issue. Moreover, confirming the effect of the above-mentioned factors on TE requires the use of, for example, a heteroscedastic GTRE model, such as that presented by Badunenko & Kumbhakar (2017), which explicitly accounts for exogenous factors affecting efficiency. Although the level of persistent inefficiency is lower than that of transient inefficiency, it is still substantial. In the case of Polish agriculture, it seems that its high level is connected with the unfavourable agrarian structure, which changes very slowly. Therefore, in order to improve the persistent efficiency, measures that could change the agrarian structure of Polish agriculture are required. Still, further research is needed to confirm these findings.


The first author gratefully acknowledges the financial support from the National Science Centre, Poland, based on decision no. DEC 2013/09/N/HS4/03833. The grant is being carried out at the Department of Econometrics and Operational Research, Faculty of Management, Cracow University of Economics. The second author acknowledges support from research funds granted to the Faculty of Management at Cracow University of Economics, within the framework of the subsidy for the maintenance of research potential.


Abdulai A, Tietje H, 2007. Estimating technical efficiency under unobserved heterogeneity with stochastic frontier models: application to northern German dairy farms. Eur Rev Agric Econ 34 (3): 393-416.

Aigner D, Lovell CAK, Schmidt P, 1977. Formulation and estimation of stochastic frontier production function models. J Econometrics 6 (1): 21-37.

Areal FJ, Balcombe K, Tiffin R, 2012. Integrating spatial dependence into stochastic frontier analysis. Aust J Agr Resour Ec 56 (4): 521-541.

Badunenko O, Kumbhakar SC, 2017. Economies of scale, technical change and persistent and time-varying cost efficiency in Indian banking: Do ownership, regulation and heterogeneity matter? Eur J Oper Res 260 (2): 789-803.

Battese GE, Coelli TJ, 1988. Prediction of firm-level technical efficiencies with a generalized frontier production and panel data. J Econometrics 38 (3): 387-399.

Battese GE, Coelli TJ, 1992. Frontier production functions, technical efficiency and panel data: with application to paddy farmers in India. J Prod Anal 3 (1-2): 153-169.

Battese GE, Coelli TJ, 1995. A model for technical inefficiency effects in a stochastic frontier production function for panel data. Empir Econ 20 (2): 325-332.

Bojnec Š, Latruffe L, 2009. Determinants of technical efficiency of Slovenian farms. Post-Communist Econ 21 (1): 117-124.

Bojnec Š, Latruffe L, 2013. Farm size, agricultural subsidies and farm performance in Slovenia. Land Use Pol 32: 207-217.

Brümmer B, Glauben T, Thijssen G, 2002. Decomposition of productivity growth using distance function: The case of dairy farms in three European countries. Am J Agr Econ 84 (3): 628-644.

Christensen LR, Jorgenson DW, Lau LJ, 1973. Transcendental logarithmic production frontiers. Rev Econ Stat 55 (1): 28-45.

Coelli T, 1995. Estimators and hypothesis tests for a stochastic frontier function: A Monte Carlo analysis. J Prod Anal 6 (3): 247-268.

Coelli TJ, Prasada Rao DS, O'Donnell ChJ, Battese GE, 2005. An introduction to efficiency and productivity analysis, 2nd ed. Springer Science+Business Media, NY. 349 pp.

Colombi R, 2010. A skew normal stochastic frontier model for panel data. Proc 45th Sci Meeting of the Ital Stat Soc.

Colombi R, Martini G, Vittadini G, 2011. A stochastic frontier model with short-run and long-run inefficiency random effects. Working Paper no 01-2011. Dept Econ Technol Manage, Univ Bergamo, Italy.

Colombi R, Kumbhakar SC, Martini G, Vittadini G, 2014. Closed-skew normality in stochastic frontiers with individual effects and long/short-run efficiency. J Prod Anal 42 (2): 123-136.

CSO, 2016. Statistical Yearbook of Agriculture 2016. Central Statistical Office, Warsaw, Poland.

EC, 2008. Commission Regulation no 1242/2008 of 8 December 2008 establishing a Community typology for agricultural holdings: Annex I, part C. Definition of types of farming. European Commission.

Ellman M, 1981. Agricultural productivity under socialism. World Dev 9 (9-10): 979-989.

Fandel P, 2003. Technical and scale efficiency of corporate farms in Slovakia. Agr Econ-Czech 49 (8): 375-383.

Farsi M, Filippini M, Kuenzle M, 2005. Unobserved heterogeneity in stochastic cost frontiers models: an application to Swiss nursing homes. Appl Econ 37 (18): 2127-2141.

Filippini M, Greene WH, 2016. Persistent and transient productive inefficiency: a maximum simulated likelihood approach. J Prod Anal 45 (2): 187-196.

Geweke J, 1989. Bayesian inference in econometric models using Monte Carlo integration. Econometrica 57 (6): 1317-1339.

Gourieroux C, Holly A, Monfort A, 1982. Likelihood ratio test, Wald test, and Kuhn-Tucker test in linear models with inequality constraints on the regression parameters. Econometrica 50 (1): 63-80.

Greene WH, 2005a. Fixed and random effects in stochastic frontier models. J Prod Anal 23 (1): 7-32.

Greene WH, 2005b. Reconsidering heterogeneity in panel data estimators of the stochastic frontier model. J Econometrics 126 (2): 269-303.

Greene WH, 2008. The econometric approach to efficiency analysis. In: The measurement of productive efficiency and productivity growth; Fried HO, Lovell CAK, Schmidt SS (eds.), pp: 92-250. Oxford Univ Press, UK.

Greene WH, 2012. Econometric Analysis, 7th ed. Pearson Education.

Hajivassiliou V, 1990. Smooth simulation estimation of panel data LDV models. Dept Econ, Yale Univ.

Hajivassiliou VA, 2000. Some practical issues in maximum simulated likelihood. In: Simulation-based inference in econometrics: Methods and applications; Mariano R, Schuermann T, Weeks MJ (eds.), pp: 71-99. Cambridge Univ Press, UK.

Hartvigsen M, 2014. Land reform and land fragmentation in Central and Eastern Europe. Land Use Pol 36: 330-341.

Henningsen A, 2009. Why is the Polish farm sector still so underdeveloped? Post-Communist Econ 21 (1): 47-64.

Jondrow J, Lovell CAK, Materov I, Schmidt P, 1982. On the estimation of technical inefficiency in the stochastic frontier production function model. J Econometrics 19 (2-3): 233-238.

Keane M, 1994. A computationally practical simulation estimator for panel data. Econometrica 62 (1): 95-116.

Kodde DA, Palm FC, 1986. Wald criteria for jointly testing equality and inequality restrictions. Econometrica 54 (5): 1243-1248.

Kumbhakar SC, 1990. Production frontiers, panel data, and time-varying technical inefficiency. J Econometrics 46 (1-2): 201-211.

Kumbhakar SC, Heshmati A, 1995. Efficiency measurement in Swedish dairy farms: An application of rotating panel data 1976-88. Am J Agr Econ 77 (3): 660-674.

Kumbhakar SC, Lien G, Hardaker JB, 2014. Technical efficiency in competing panel data models: A study of Norwegian grain farming. J Prod Anal 41 (2): 321-337.

Lachaud MA, Bravo-Ureta BE, Ludena CE, 2015. Agricultural productivity growth in Latin America and the Caribbean and other world regions: An analysis of climatic effects, convergence and catch-up. Int-Am Dev Bank Working Paper No. 607 (IDB-WP-607), Washington DC.

Latruffe L, Balcombe K, Davidova S, Zawalińska K, 2004. Determinants of technical efficiency of crop and livestock farms in Poland. Appl Econ 36 (12): 1255-1263.

Latruffe L, Balcombe K, Davidova S, Zawalińska K, 2005. Technical and scale efficiency of crop and livestock farms in Poland: Does specialization matter? Agr Econ 32 (3): 281-296.

Latruffe L, Davidova S, Balcombe K, 2008. Application of a double bootstrap to investigation of determinants of technical efficiency of farms in Central Europe. J Prod Anal 29 (2): 183-191.

Latruffe L, Fogarasi J, Desjeux Y, 2012. Efficiency, productivity and technology comparison for farms in Central and Western Europe: The case of field crop and dairy farming in Hungary and France. Econ Syst 36 (2): 264-278.

Makieła K, 2017. Bayesian inference and Gibbs sampling in generalized true random-effects models. Cent Eur J Econ Mod Econom 9 (1): 69-95.

Makieła K, Marzec J, Pisulewski A, 2017. Productivity change analysis in dairy farms following Polish accession to the EU – An output growth decomposition approach. Outlook Agr 46 (4): 295-301.

Marzec J, Pisulewski A, 2017. The effect of CAP subsidies on the technical efficiency of Polish dairy farms. Cent Eur J Econ Mod Econom 9 (3): 243-273.

Marzec J, Pisulewski A, 2019. The measurement of time-varying technical efficiency and productivity change in Polish crop farms. Ger J Agr Econ 68 (1): 15-27.

Meeusen W, van den Broeck J, 1977. Efficiency estimation from Cobb-Douglas production functions with composed error. Int Econ Rev 18 (2): 435-444.

Njuki E, Bravo-Ureta BE, 2015. The economic costs of environmental regulation in U.S. dairy farming: a directional distance function approach. Am J Agr Econ 97 (4): 1087-1106.

Pede VO, Areal FJ, Singbo A, McKinley J, Kajisa K, 2018. Spatial dependency and technical efficiency: An application of a Bayesian stochastic frontier model to irrigated and rainfed farmers in Bohol, Philippines. Agr Econ 49 (3): 301-312.

Pitt M, Lee L, 1981. Measurement of sources of technical inefficiency in the Indonesian weaving industry. J Dev Econ 9 (1): 43-64.

PORDATA, 2017: Labour productivity per hour (Euro) – Europe. [5 Jan 2019].

Schmidt AM, Moreira ARB, Helfand SM, Fonseca TCO, 2009. Spatial stochastic frontier models: Accounting for unobserved local determinants of inefficiency. J Prod Anal 31 (2): 101-112.

Stevenson RE, 1980. Likelihood functions for generalized stochastic frontier estimation. J Econometrics 13 (1): 57-66.

Train K, 2009. Discrete choice methods with simulation. Cambridge Univ Press, UK. 378 pp.

Tsionas EG, 2002. Stochastic frontier models with random coefficients. J Appl Econom 17 (2): 127-147.

Tsionas EG, Kumbhakar SC, 2014. Firm heterogeneity, persistent and transient technical inefficiency: A generalized true random-effects model. J Appl Econom 29 (1): 110-132.

Zhu X, Lansink AO, 2010. Impact of CAP subsidies on technical efficiency of crop farms in Germany, the Netherlands and Sweden. J Agr Econ 61 (3): 545-564.