An Efficient Compromised Imputation Method for Estimating Population Mean
1 Association of Indian Universities, New Delhi, India
1. INTRODUCTION
Imputation means replacing a missing value with another value based on a reasonable estimate. Information on a related auxiliary variable is generally used to recreate the missing values and complete the dataset. Incomplete data are usually categorized by three response mechanisms (Little and Rubin, 2002): Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR or NMAR).
Missing completely at random (MCAR): missing data are randomly distributed across the variable and unrelated to other variables.
Missing at random (MAR): missing data are not randomly distributed, but they are accounted for by other observed variables.
Missing not at random (MNAR): missing data systematically differ from the observed values.
Of these mechanisms, the present study assumes MCAR.
Auxiliary
information is important for survey practitioners because it is used to improve the performance of estimation methods. It may be used at the design stage or at the estimation stage of a survey to obtain a more efficient estimator. At the estimation stage, the ratio, product, and regression methods are traditionally used. Bahl and Tuteja (1991) introduced exponential ratio and product estimators for estimating the population mean. Many modifications of these methods have been proposed to date. For handling
missing data on the study variable, several extensions and developments have been proposed in the literature. Singh (2003) suggested product estimation for imputation. Shakti Prasad (2018) adapted the exponential product-type estimator of Bahl and Tuteja (1991) and proposed exponential estimators for imputation. Kadilar and Cingi (2008) investigated some ratio-type imputation methods and proposed three new estimators to overcome the problem of missing data. Diana and Perri (2010) proposed three regression-type estimators that were more efficient than those of Kadilar and Cingi (2008). The
present article suggests a general ratio-product exponential-type method of imputation and accordingly proposes three estimators that use the different amounts of available auxiliary information, as utilized by Ahmad et al. (2006), Kadilar and Cingi (2008), and Diana and Perri (2010). The proposed methods are then compared with traditional imputation procedures. The proposed estimators come out to be more efficient than the usual ratio, product, regression, and exponential methods for handling missing observations when estimating the population mean.
Given a finite population
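The MCAR assumption adopted in the introduction can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the paper: the population values, the sizes `n` and `r`, and the helper name `draw_mcar_sample` are all hypothetical. It draws a simple random sample without replacement, lets a randomly chosen subset of the sampled units respond, and checks that the respondent mean, averaged over many replications, stays close to the population mean.

```python
import random
import statistics

def draw_mcar_sample(population, n, r, seed=0):
    """Draw an SRSWOR sample of size n, then pick r respondents completely
    at random (MCAR); the remaining n - r sampled units are nonrespondents."""
    rng = random.Random(seed)
    sample_ids = rng.sample(range(len(population)), n)   # SRSWOR
    respondent_ids = set(rng.sample(sample_ids, r))      # response ignores y
    nonrespondent_ids = [i for i in sample_ids if i not in respondent_ids]
    return sample_ids, sorted(respondent_ids), nonrespondent_ids

# Hypothetical population of N = 200 study-variable values (assumption).
pop_rng = random.Random(42)
population = [50 + pop_rng.gauss(0, 10) for _ in range(200)]

# Average the respondent mean over many replications.
rep_means = []
for rep in range(2000):
    _, resp, _ = draw_mcar_sample(population, n=40, r=25, seed=rep)
    rep_means.append(statistics.fmean([population[i] for i in resp]))

pop_mean = statistics.fmean(population)
avg_resp_mean = statistics.fmean(rep_means)
print(round(pop_mean, 2), round(avg_resp_mean, 2))
```

Under MAR or MNAR the response step would have to depend on observed or unobserved values, and the two printed numbers would generally no longer agree.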
2. Some existing methods of imputation
1) The mean method of imputation replaces the missing observations with the mean of the observations available on the responding units, i.e.
Then the estimator of the population mean
2) The ratio method of imputation uses information on one auxiliary variable
where
This gives the resulting estimator by
The MSE of
It is noted that, in the presence of missing data, the availability of information on auxiliary variable
3) Diana and Perri (2010) proposed three estimators using different regression-type methods of imputation, such that the imputed data are given by
For these methods the resulting estimators are
They proved that the suggested estimators are more efficient than the Kadilar and Cingi (2008) estimators.
3. The proposed Estimator
With the above
imputation method, the resulting estimator of the population mean
4. First Degree Approximation to the Bias
To derive the bias and MSE expressions of the proposed estimator
Thus, we have
The expectation of these
And under simple random sampling without replacement,
where
Now representing (2.1) in terms of
We assume that the sample is large enough to make
Theorem 2.1. The conditional bias up to the first order of approximation of the estimator
where
Proof: From (2.2) we have
Taking expectation on both sides, we obtain the bias of
Letting
5. Mean Squared Error of T
We calculate the mean squared error of
Theorem 2.2. The minimum mean square
error of the proposed estimator
The optimum values of
where
and the minimum MSE of the proposed estimator is given by
Proof: Let coefficient of
Now, let
From the previous theorem
Placing these values of
Differentiating the MSE with respect to
On solving these equations, we get
and after placing the values of
6. Expressions of MSE for different choices of
Here we consider the
different forms of the proposed estimator for various values of
Case 1.
which is better than the mean estimator
and if
Case 2.
For
For
Case 3. A linear combination of the product estimator and the product-type exponential estimator,
and further
For
and for
For the purpose of
comparison of the proposed estimator, we conducted an empirical study and computed the percentage relative efficiency (PRE) of the estimators
The empirical study illustrates and compares the performance of the proposed imputation methods with the existing conventional imputation methods and with the methods proposed by Diana and Perri (2010) and Bhushan and Pandey (2010) utilizing Searl () constant with the Diana and Perri (2010) estimator, for
(i) real data described in Horvitz and Thompson (1952) and Singh (2003), described in Table 1 and Table 2.
Table 1
2) Ratio method of imputation,
3) Diana and Perri method of imputation
4) Diana and Perri method of imputation
5) Diana and Perri method of imputation
6) Proposed method of imputation,
Table 2
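The kind of comparison reported in this section can be mimicked on synthetic data. The following is a minimal sketch under stated assumptions, not the authors' computation: the population is hypothetical, the ratio method is taken in its standard form (a missing value is imputed as b times the auxiliary value, with b the ratio of respondent totals), and PRE is assumed to be defined relative to the mean method as 100 × MSE(mean method) / MSE(method). The function names `impute_mean`, `impute_ratio`, and `simulate_pre` are illustrative, and the proposed estimator and the Diana and Perri estimators are not implemented here.

```python
import random
import statistics

def impute_mean(y_resp, x_resp, x_nonresp):
    """Mean imputation: every missing y receives the respondent mean."""
    y_bar_r = statistics.fmean(y_resp)
    return [y_bar_r] * len(x_nonresp)

def impute_ratio(y_resp, x_resp, x_nonresp):
    """Ratio imputation: missing y_i receives b * x_i, b = sum(y_R) / sum(x_R)."""
    b = sum(y_resp) / sum(x_resp)
    return [b * x for x in x_nonresp]

def simulate_pre(x, y, n, r, reps=3000, seed=1):
    """Monte Carlo PRE of ratio imputation relative to mean imputation."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(y)
    sq_err = {"mean": 0.0, "ratio": 0.0}
    for _ in range(reps):
        sample = rng.sample(range(len(y)), n)   # SRSWOR, in random order
        resp, nonresp = sample[:r], sample[r:]  # MCAR: a random subset responds
        y_r = [y[i] for i in resp]
        x_r = [x[i] for i in resp]
        x_m = [x[i] for i in nonresp]
        for name, method in (("mean", impute_mean), ("ratio", impute_ratio)):
            completed = y_r + method(y_r, x_r, x_m)  # data after imputation
            sq_err[name] += (statistics.fmean(completed) - true_mean) ** 2
    mse = {name: total / reps for name, total in sq_err.items()}
    return 100.0 * mse["mean"] / mse["ratio"], mse

# Hypothetical population: y roughly proportional to x (assumption).
pop_rng = random.Random(7)
x_pop = [pop_rng.uniform(10, 50) for _ in range(300)]
y_pop = [2.0 * xi + pop_rng.gauss(0, 5) for xi in x_pop]

pre_ratio, mse = simulate_pre(x_pop, y_pop, n=60, r=40)
print(f"PRE of ratio imputation relative to mean imputation: {pre_ratio:.1f}")
```

A PRE above 100 indicates a smaller simulated MSE than the mean method. The ratio method is expected to gain only when the study and auxiliary variables are strongly and positively correlated, as in this synthetic population.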
8. Interpretations of the computational results
The following interpretations may be drawn from the above tables:
1) For the real populations, the HT data and the Singh population, where the correlation between
CONFLICT OF INTERESTS
None.
ACKNOWLEDGMENTS
None.
REFERENCES
Heitjan, D. F. and Basu, S. (1996). Distinguishing 'Missing at Random' and 'Missing Completely at Random'. The American Statistician, 50(3), 207-213. https://doi.org/10.1080/00031305.1996.10474381
Horvitz, D. G. and Thompson, D. J. (1952). A Generalization of Sampling Without Replacement From a Finite Universe. Journal of the American Statistical Association, 47(260), 663-685. https://doi.org/10.1080/01621459.1952.10483446
Lee, H., Rancourt, E., and Sarndal, C. E. (1994). Experiments with Variance Estimation from Survey Data with Imputed Values. Journal of Official Statistics, 10(3), 231-243.
Rubin, R. B. (1978). Multiple Imputation for Nonresponse in Surveys. New York: John Wiley.
Singh, S. (2003). Advanced Sampling Theory with Applications: How Michael Selected Amy. Kluwer, Dordrecht, 1(2). https://doi.org/10.1007/978-94-007-0789-4
Singh, S. and Horn, S. (2000). Compromised Imputation in Survey Sampling. Metrika, 51, 267-276. https://doi.org/10.1007/s001840000054
© IJETMR 2014-2022. All Rights Reserved.