Likan Zhan

The diff function in R

Likan Zhan · 2018-07-29

diff() is a simple function in R. It takes three basic arguments: x, lag = 1, and differences = 1.

args(diff.default)
## function (x, lag = 1L, differences = 1L, ...) 
## NULL

Here x is the data to be differenced. When x is a vector, diff() computes the differences between successive elements of the sequence. For example:

(x <- cumsum(cumsum(1:10)))
##  [1]   1   4  10  20  35  56  84 120 165 220
x1 <- x[1:(length(x) - 1)]
x2 <- x[2:length(x)]
x2 - x1
## [1]  3  6 10 15 21 28 36 45 55
diff(x)
## [1]  3  6 10 15 21 28 36 45 55
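The manual computation above can also be written with head() and tail(), and it is worth noting that diff() undoes cumsum() once the first value is restored. The following is a minimal sketch using only base R; the variable names manual and rebuilt are illustrative.

```r
# Rebuild the example vector so this block runs on its own.
x <- cumsum(cumsum(1:10))

# Lag-1 differences: drop the first element, drop the last element, subtract.
manual <- tail(x, -1) - head(x, -1)
identical(manual, diff(x))
## [1] TRUE

# Putting the first value back in front and taking the cumulative sum
# recovers the original vector.
rebuilt <- cumsum(c(x[1], diff(x)))
identical(rebuilt, x)
## [1] TRUE
```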

The lag argument (default 1) is the distance between the two elements whose difference is taken. For example:

lag <- 2
x1 <- x[1:(length(x) - lag)]
x2 <- x[(lag + 1):length(x)]
x2 - x1
## [1]   9  16  25  36  49  64  81 100
diff(x, lag = 2)
## [1]   9  16  25  36  49  64  81 100
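The indexing pattern above generalizes to any lag. The sketch below wraps it in a helper called lagged_diff, which is a hypothetical name and not part of base R, and assumes 1 <= lag < length(x). It also shows that the result is lag elements shorter than the input.

```r
x <- cumsum(cumsum(1:10))

# lagged_diff() is a hypothetical helper for illustration only.
# It assumes 1 <= lag < length(x).
lagged_diff <- function(x, lag = 1L) {
  n <- length(x)
  x[(lag + 1L):n] - x[seq_len(n - lag)]
}

# The helper agrees with diff() for several lags.
for (k in 1:3) {
  stopifnot(identical(lagged_diff(x, k), diff(x, lag = k)))
}

# Each result is `lag` elements shorter than the input of length 10.
sapply(1:3, function(k) length(diff(x, lag = k)))
## [1] 9 8 7
```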

The differences argument is the number of times the differencing is applied. For example, with differences = 2 (and lag = 2):

lag <- 2
x11 <- x[1:(length(x) - lag)]
x12 <- x[(lag + 1):length(x)]
x20 <- x12 - x11
x21 <- x20[1:(length(x20) - lag)]
x22 <- x20[(lag + 1):length(x20)]
x30 <- x22 - x21
x30
## [1] 16 20 24 28 32 36
diff(x, lag = 2, differences = 2)
## [1] 16 20 24 28 32 36
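Put differently, differences = 2 is the same as calling diff() twice with the same lag. The sketch below checks this, along with the resulting length and the closed form x[i + 4] - 2 * x[i + 2] + x[i] that two lag-2 passes produce. The variable name twice is illustrative.

```r
x <- cumsum(cumsum(1:10))

# Applying a lag-2 difference twice by hand ...
twice <- diff(diff(x, lag = 2), lag = 2)

# ... matches a single call with differences = 2.
identical(twice, diff(x, lag = 2, differences = 2))
## [1] TRUE

# Two lag-2 passes combine elements that are 0, 2, and 4 positions apart.
all(twice == x[5:10] - 2 * x[3:8] + x[1:6])
## [1] TRUE

# Each pass drops `lag` elements, so the result has
# length(x) - lag * differences elements.
length(twice) == length(x) - 2 * 2
## [1] TRUE

# When lag * differences >= length(x), nothing is left to difference
# and diff() returns a zero-length vector.
length(diff(x, lag = 5, differences = 2))
## [1] 0
```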

When x is a matrix, diff() applies the computation above to each column. For example:

(x <- matrix(cumsum(cumsum(1:25)), ncol = 5))
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1   56  286  816 1771
## [2,]    4   84  364  969 2024
## [3,]   10  120  455 1140 2300
## [4,]   20  165  560 1330 2600
## [5,]   35  220  680 1540 2925
diff(x, lag = 2, differences = 1)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    9   64  169  324  529
## [2,]   16   81  196  361  576
## [3,]   25  100  225  400  625
diff(x, lag = 2, differences = 2)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]   16   36   56   76   96
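The column-wise behavior can be checked against an explicit apply() over columns. The sketch below also shows one possible way to difference along rows instead, by transposing before and after. The names m, by_column, and row_diff are illustrative.

```r
m <- matrix(cumsum(cumsum(1:25)), ncol = 5)

# Differencing each column separately with apply() ...
by_column <- apply(m, 2, diff, lag = 2)

# ... gives the same values and shape as calling diff() on the matrix.
all(by_column == diff(m, lag = 2))
## [1] TRUE
dim(by_column)
## [1] 3 5

# For differences along rows, one option is to transpose,
# difference the columns, and transpose back.
row_diff <- t(diff(t(m)))
all(row_diff == m[, 2:5] - m[, 1:4])
## [1] TRUE
```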