Learning R from Scratch -- Day 54 -- Two-Way Fixed Effects Models
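For data that vary across space, if we do not know how the features relate to each other or what they mean, fitting a Durbin model directly is a fairly general approach. The drawback is that many follow-up tests are then needed to show that the results are interpretable and statistically sound.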
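If we already know what the features mean, we can do better. Suppose the data describe the economic development of firms and include employees' research ability, company culture, current policy reforms, and changes in the wider economy. We can split these into individual effects (characteristics that do not change over time) and time effects (time trends shared by all individuals), which lets us analyze the development of firms in each region quickly and directly.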
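The split into individual and time effects described above is usually written as a two-way fixed effects regression. The following is a minimal sketch of that standard form. The symbols are chosen here for illustration: $Y_{it}$ is the outcome for individual $i$ in period $t$, $X_{it}$ the regressor, $\alpha_i$ the individual effect, $\gamma_t$ the time effect, and $\varepsilon_{it}$ the error term.

```latex
% Two-way fixed effects model
Y_{it} = \beta X_{it} + \alpha_i + \gamma_t + \varepsilon_{it},
\qquad i = 1, \dots, n, \quad t = 1, \dots, T

% Within transformation for a balanced panel:
% subtract individual means and time means, add back the grand mean
\tilde{Y}_{it} = Y_{it} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot t} + \bar{Y}_{\cdot\cdot},
\qquad
\tilde{X}_{it} = X_{it} - \bar{X}_{i\cdot} - \bar{X}_{\cdot t} + \bar{X}_{\cdot\cdot}

% Both alpha_i and gamma_t cancel, leaving a regression in beta only
\tilde{Y}_{it} = \beta \tilde{X}_{it} + \tilde{\varepsilon}_{it}
```

Because $\alpha_i$ is constant over $t$ and $\gamma_t$ is constant over $i$, both drop out after the transformation, so $\beta$ can be estimated without estimating each effect separately.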
Here is an example:
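    # Load the required packages
    library(plm)
    library(lmtest)
    library(dplyr)

    # Generate a simulated data set
    set.seed(123)
    n <- 100  # number of individuals
    t <- 5    # number of time periods

    # Build the panel structure: one row per (id, time) pair.
    # expand.grid(id, time) is an assumed reconstruction of this step.
    data <- expand.grid(id = 1:n, time = 1:t) %>%
      mutate(
        # Individual fixed effect (constant over time)
        alpha_i = rnorm(n, mean = 0, sd = 2)[id],
        # Time fixed effect (constant across individuals)
        gamma_t = rnorm(t, mean = 0, sd = 1)[time],
        # Explanatory variable
        X = rnorm(n * t, mean = 5, sd = 2),
        # Error term
        epsilon = rnorm(n * t, mean = 0, sd = 1),
        # Dependent variable (true coefficient beta = 0.8)
        Y = 0.8 * X + alpha_i + gamma_t + epsilon
      )

    # Inspect the first few rows
    head(data)

    # Two-way fixed effects model
    twoway_model <- plm(Y ~ X, data = data, index = c("id", "time"),
                        model = "within", effect = "twoways")

    # Pooled model (no fixed effects)
    pooled_model <- plm(Y ~ X, data = data, index = c("id", "time"),
                        model = "pooling")

    # Individual fixed effects model
    individual_model <- plm(Y ~ X, data = data, index = c("id", "time"),
                            model = "within", effect = "individual")

    # Time fixed effects model
    time_model <- plm(Y ~ X, data = data, index = c("id", "time"),
                      model = "within", effect = "time")

    # Show the two-way fixed effects results
    summary(twoway_model)

    # F tests: pass the model with more effects first, the restricted model second
    # 1. Two-way fixed effects vs. pooled model
    pFtest(twoway_model, pooled_model)
    # 2. Are the individual fixed effects significant?
    pFtest(individual_model, pooled_model)
    # 3. Are the time fixed effects significant?
    pFtest(time_model, pooled_model)
    # 4. Two-way fixed effects vs. individual effects only
    pFtest(twoway_model, individual_model)
    # 5. Two-way fixed effects vs. time effects only
    pFtest(twoway_model, time_model)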
Output:
    Twoways effects Within Model

    Call:
    plm(formula = Y ~ X, data = data, effect = "twoways", model = "within",
        index = c("id", "time"))

    Balanced Panel: n = 100, T = 5, N = 500

    Residuals:
         Min.   1st Qu.    Median   3rd Qu.      Max.
    -3.224723 -0.583125 -0.010202  0.599678  2.960869

    Coefficients:
      Estimate Std. Error t-value  Pr(>|t|)
    X 0.778466   0.026107  29.818 < 2.2e-16 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Total Sum of Squares:    1329.9
    Residual Sum of Squares: 409.08
    R-Squared:      0.6924
    Adj. R-Squared: 0.61141
    F-statistic: 889.127 on 1 and 395 DF, p-value: < 2.22e-16

        F test for twoways effects

    data:  Y ~ X
    F = 17.185, df1 = 103, df2 = 395, p-value < 2.2e-16
    alternative hypothesis: significant effects

        F test for individual effects

    data:  Y ~ X
    F = 13.23, df1 = 99, df2 = 399, p-value < 2.2e-16
    alternative hypothesis: significant effects

        F test for time effects

    data:  Y ~ X
    F = 6.673, df1 = 4, df2 = 494, p-value = 3.094e-05
    alternative hypothesis: significant effects

        F test for twoways effects

    data:  Y ~ X
    F = 27.637, df1 = 4, df2 = 395, p-value < 2.2e-16
    alternative hypothesis: significant effects

        F test for twoways effects

    data:  Y ~ X
    F = 16.759, df1 = 99, df2 = 395, p-value < 2.2e-16
    alternative hypothesis: significant effects
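The output shows that fixed effects belong in the model. Every F test rejects the null hypothesis of no effects with p < 0.001. The individual effects are highly significant, so unobserved heterogeneity across individuals has to be controlled for, and the same holds for the time effects. Individual and time fixed effects should therefore be included together. The estimated coefficient on X is 0.778, close to the true value of 0.8 used in the simulation. With both sets of effects removed it can be read as the effect of X on Y, although a causal reading still rests on X being exogenous, which holds here by construction. The standard error of 0.026 indicates that the estimate is precise.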
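The degrees of freedom reported by the five F tests can be checked by hand. The sketch below assumes the usual F test for nested linear models, with $RSS_r$ and $RSS_u$ the residual sums of squares of the restricted and unrestricted models and $K = 1$ regressor. The values $n = 100$, $T = 5$, $N = 500$ are taken from the output above. The notation is introduced here for illustration.

```latex
% F statistic for comparing a restricted model (r) with an unrestricted one (u)
F = \frac{(RSS_r - RSS_u) / (df_r - df_u)}{RSS_u / df_u}

% Residual degrees of freedom of each model (K = 1, n = 100, T = 5, N = 500)
\begin{align*}
df_{\text{pooled}}     &= N - K - 1         = 500 - 1 - 1           = 498 \\
df_{\text{individual}} &= N - n - K         = 500 - 100 - 1         = 399 \\
df_{\text{time}}       &= N - T - K         = 500 - 5 - 1           = 494 \\
df_{\text{two-way}}    &= N - n - T + 1 - K = 500 - 100 - 5 + 1 - 1 = 395
\end{align*}

% Numerator degrees of freedom (df1) of the five tests
\begin{align*}
\text{two-way vs. pooled:}     \quad & 498 - 395 = 103 \\
\text{individual vs. pooled:}  \quad & 498 - 399 = 99  \\
\text{time vs. pooled:}        \quad & 498 - 494 = 4   \\
\text{two-way vs. individual:} \quad & 399 - 395 = 4   \\
\text{two-way vs. time:}       \quad & 494 - 395 = 99
\end{align*}
```

These match the `df1` and `df2` values printed for each test. The last two tests each check one set of effects while the other set is already in the model.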