Replaces missing values with an overall mean value. Besides the arithmetic mean, other measures such as the median and the mode are available.
na_mean(x, option = "mean", maxgap = Inf)
x | Numeric vector (vector) or time series (ts) object in which missing values shall be replaced. |
---|---|
option | Algorithm to be used. Accepts the following input: "mean" (default), "median", "mode", "harmonic", "geometric". |
maxgap | Maximum number of successive NAs on which imputation is still performed. The default (Inf) replaces all NAs without restriction. When this option is set, runs of consecutive NAs that are longer than 'maxgap' are left NA. This is mostly useful if you want to treat long runs of NAs separately afterwards. |
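
To make the 'maxgap' rule concrete, here is a minimal base R sketch of the behaviour described above. The helper `mean_impute_maxgap` is a hypothetical name introduced only for illustration; it is not part of the package and is not claimed to match its internal implementation. It assumes the mean is taken over all non-NA values and that runs longer than 'maxgap' stay NA.

```r
# Hypothetical helper (illustration only): mean imputation with a maxgap limit.
mean_impute_maxgap <- function(x, maxgap = Inf) {
  filled <- x
  # Fill every NA with the mean of all non-NA values
  filled[is.na(x)] <- mean(x, na.rm = TRUE)
  # Find runs of consecutive NA / non-NA values
  runs <- rle(is.na(x))
  # Flag NA runs that are longer than maxgap
  too_long <- runs$values & runs$lengths > maxgap
  # Expand the per-run flag to one flag per element and restore those NAs
  filled[rep(too_long, times = runs$lengths)] <- NA
  filled
}

x <- c(1, NA, 3, NA, NA, NA, 5)
mean_impute_maxgap(x, maxgap = 2)
# The single NA is replaced by 3; the run of three NAs exceeds maxgap and stays NA.
```

With the default `maxgap = Inf`, no run is ever "too long", so every NA is replaced.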
Returns a vector (vector) or time series (ts) object, depending on the input given for parameter x.
Missing values get replaced by overall mean values. The function calculates the mean, median, mode, harmonic mean, or geometric mean over all non-NA values and replaces every NA with this value. Option 'mode' replaces NAs with the most frequent value in the time series. If two or more values occur equally often, the function imputes the lowest of them. Because of how they are calculated, the geometric and harmonic means are not well defined when the input series contains negative or zero values.
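
The five statistics described above can be sketched in base R as follows. This is an illustrative approximation under stated assumptions, not the package's own code: `overall_value` is a hypothetical helper name, and the formulas used are the textbook definitions of the harmonic mean (n divided by the sum of reciprocals) and the geometric mean (exponential of the mean of logs).

```r
# Hypothetical helper (illustration only): the single value used to fill NAs.
overall_value <- function(x, option = "mean") {
  v <- x[!is.na(x)]                      # use only the non-NA values
  switch(option,
    mean = mean(v),
    median = median(v),
    mode = {
      u <- sort(unique(v))               # candidate values, ascending
      counts <- tabulate(match(v, u))    # frequency of each candidate
      u[which.max(counts)]               # first maximum = lowest value among ties
    },
    harmonic = length(v) / sum(1 / v),   # breaks down if any value is 0
    geometric = exp(mean(log(v))),       # breaks down for values <= 0
    stop("unknown option: ", option)
  )
}

x <- c(2, 4, 4, NA, 8, 8)
overall_value(x, "mode")       # 4 and 8 both occur twice, so the lower value is used
overall_value(x, "harmonic")
overall_value(x, "geometric")

# Replace every NA with the chosen statistic
x[is.na(x)] <- overall_value(x, "median")
```

The comments on the harmonic and geometric lines show where zero or negative inputs cause trouble: `1 / 0` is infinite and `log()` of a non-positive number is not a finite real value.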
In general, using the mean for imputation is mostly a suboptimal choice and should be handled with great caution.
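
One reason for this caution, offered here as general statistical background rather than something stated by the function's documentation: filling gaps with a single constant leaves the mean unchanged but shrinks the variance of the series. A small base R sketch on the example data:

```r
x <- c(2, 3, 4, 5, 6, NA, 7, 8)

# Fill the NA with the mean of the observed values
filled <- replace(x, is.na(x), mean(x, na.rm = TRUE))

var_observed <- var(x, na.rm = TRUE)  # variance of the 7 observed values
var_filled <- var(filled)             # variance after mean imputation

# The imputed value adds nothing to the sum of squared deviations,
# but it increases the number of observations, so the variance drops.
var_filled < var_observed
```

The imputed value also ignores any trend or seasonality around the gap, which matters for time series in particular.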
Steffen Moritz
# Prerequisite: Load the package and create a time series with missing values
library(imputeTS)
x <- ts(c(2, 3, 4, 5, 6, NA, 7, 8))

# Example 1: Perform imputation with the overall mean
na_mean(x)
#> Time Series:
#> Start = 1
#> End = 8
#> Frequency = 1
#> [1] 2 3 4 5 6 5 7 8

# Example 2: Perform imputation with the overall median
na_mean(x, option = "median")
#> Time Series:
#> Start = 1
#> End = 8
#> Frequency = 1
#> [1] 2 3 4 5 6 5 7 8

# Example 3: Same as example 1, just written with the pipe operator (%>% from magrittr)
x %>% na_mean()
#> Time Series:
#> Start = 1
#> End = 8
#> Frequency = 1
#> [1] 2 3 4 5 6 5 7 8