
R Language Stock Data Regression Analysis Examples

Published: 2022-08-28 14:31:23

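1. Basic Data Analysis in R

This article covers basic statistical data analysis in R: basic plotting, linear fitting, logistic regression, bootstrap sampling, and ANOVA (analysis of variance), with an implementation and an application for each.
Without further ado, here is the code; the comments explain each step.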
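1. Basic plots (box plot, Q-Q plot)
# basic plots
x = rnorm(100) # sample data, so the calls below run on their own
y = rnorm(100)
boxplot(x) # box plot of x
qqplot(x,y) # Q-Q plot comparing the quantiles of x and y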
2. Linear fitting
# linear regression
n = 10
x1 = rnorm(n) # variable 1
x2 = rnorm(n) # variable 2
y = rnorm(n)*3
mod = lm(y~x1+x2)
model.matrix(mod) # design matrix of the model
plot(mod) # diagnostic plots: residuals vs fitted, normal Q-Q, scale-location, residuals vs leverage (with Cook's distance contours)
summary(mod) # statistical summary of the model
hatvalues(mod) # leverage values; very important for detecting unusual samples
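The leverage values returned by hatvalues() are easier to use with a threshold. The sketch below flags observations whose leverage exceeds 2p/n, where p is the number of estimated coefficients; this cutoff is a common rule of thumb and an assumption here, not something the original post specifies. The seed is fixed only to make the run reproducible.

```r
set.seed(1)                      # fixed seed so the sketch is reproducible
n  <- 10
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rnorm(n) * 3
mod <- lm(y ~ x1 + x2)

h <- hatvalues(mod)              # leverage of each observation (diagonal of the hat matrix)
p <- length(coef(mod))           # number of estimated coefficients, including the intercept
cutoff <- 2 * p / n              # rule-of-thumb threshold (an assumption, not a hard rule)
high_leverage <- which(h > cutoff)

print(round(h, 3))
print(high_leverage)             # indices of observations worth a closer look
```

The leverages always sum to p, so a point well above the average p/n is pulling the fit toward itself more than its share.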
3. Logistic regression

# logistic regression
x <- c(0, 1, 2, 3, 4, 5)
y <- c(0, 9, 21, 47, 60, 63) # the number of successes
n <- 70 # the number of trials
z <- n - y # the number of failures
b <- cbind(y, z) # column bind: (successes, failures)
fitx <- glm(b~x,family = binomial) # binomial GLM with logit link, i.e. logistic regression
print(fitx)

plot(x,y,xlim=c(0,5),ylim=c(0,65)) # plot the points (x,y)

beta0 <- coef(fitx)[1]
beta1 <- coef(fitx)[2]
fn <- function(x) n*exp(beta0+beta1*x)/(1+exp(beta0+beta1*x)) # expected number of successes at x
curve(fn,0,5,add=TRUE) # overlay the fitted logistic curve on the same axes
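Besides plotting the curve, the fitted model can be used to predict the success probability at new x values. This is a minimal sketch using predict() with type = "response"; the new x values are hypothetical, and the manual calculation with plogis() is included only to show that it matches the logistic formula used for the curve.

```r
x <- c(0, 1, 2, 3, 4, 5)
y <- c(0, 9, 21, 47, 60, 63)     # successes out of n trials at each x
n <- 70
fitx <- glm(cbind(y, n - y) ~ x, family = binomial)

new_x <- data.frame(x = c(1.5, 2.5, 3.5))    # hypothetical new values
p_hat <- predict(fitx, newdata = new_x, type = "response")
print(cbind(new_x, p_hat, expected_successes = n * p_hat))

# Same probabilities computed by hand from the coefficients
b <- coef(fitx)
p_manual <- plogis(b[1] + b[2] * new_x$x)
```

Multiplying the predicted probability by n gives the expected number of successes, which is the quantity drawn by the curve.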
4. Bootstrap sampling

# bootstrap
# Application: resample at random, compute the ratio of the largest eigenvalue to the sum of all eigenvalues, and plot its distribution
dat = matrix(rnorm(100*5),100,5)
no.samples = 200 # resample 200 times
# theta = matrix(rep(0,no.samples*5),no.samples,5)
theta = rep(0,no.samples) # one ratio per resample
for (i in 1:no.samples)
{
  j = sample(1:100,100,replace = TRUE) # draw 100 row indices with replacement
  datrnd = dat[j,] # the resampled data set (100 rows)
  lambda = princomp(datrnd)$sdev^2 # eigenvalues (component variances)
  # theta[i,] = lambda
  theta[i] = lambda[1]/sum(lambda) # ratio of the largest eigenvalue
}

# hist(theta[,1]) # histogram of the first (largest) eigenvalue
hist(theta) # distribution of the largest eigenvalue's share
sd(theta) # standard deviation of theta

# To plot the distribution of the largest eigenvalue itself, uncomment each commented-out statement above and comment out the statement that follows it.
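The bootstrap replicates can also be summarised as an interval rather than only a histogram and a standard deviation. The following sketch, assuming the same simulated data layout as above, computes a 95% percentile interval with quantile(); the helper name top_share is hypothetical and the seed is fixed only for reproducibility.

```r
set.seed(42)
dat <- matrix(rnorm(100 * 5), 100, 5)
no.samples <- 200

# Statistic of interest: share of total variance carried by the first component
top_share <- function(m) {
  lambda <- princomp(m)$sdev^2
  unname(lambda[1] / sum(lambda))
}

theta <- replicate(no.samples, {
  j <- sample(nrow(dat), nrow(dat), replace = TRUE)  # resample rows with replacement
  top_share(dat[j, ])
})

estimate <- top_share(dat)                 # value on the original data
ci <- quantile(theta, c(0.025, 0.975))     # 95% percentile interval
print(estimate)
print(ci)
```

The percentile interval is the simplest bootstrap interval; it is used here as an illustration rather than as a recommendation over other variants.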
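5. ANOVA (analysis of variance)

# Application: test whether a single factor has an effect (suppose we feed 3 kinds of vitamins to pigs, 3 pigs per vitamin, and want to know whether the vitamin matters)
#
y = rnorm(9) # weight gain per pig (Yij: i is the treatment, j is the pig_id); normally entered by the user
#y = c(1,10,1,2,10,2,1,9,1)
Treatment <- factor(c(1,2,3,1,2,3,1,2,3)) # each {1,2,3} is a group
mod = lm(y~Treatment) # linear regression
print(anova(mod))
# Reading the table:
# Df: degrees of freedom
# Sum Sq: sum of squared deviations (between groups in the Treatment row, within groups in the Residuals row)
# Mean Sq: Sum Sq divided by Df, i.e. a variance estimate for each row
# F value: Mean Sq(Treatment)/Mean Sq(Residuals); compares the contribution of Treatment with that of the Residuals
# Pr(>F): p-value. Use it to decide whether to reject the hypothesis H0 that all group means are equal (significance level 0.05)
qqnorm(mod$residuals) # normal Q-Q plot of the model residuals
# If the Q-Q plot of the residuals looks like a straight line, the residuals are approximately normal. The original post reads this as a sign that Treatment contributes little (feeding more or less vitamin makes no difference); strictly, the Q-Q plot checks the normality assumption behind ANOVA, and the effect of Treatment is judged from the F test above.
The two figures in the original post show the results for
(left) y = c(1,10,1,2,10,2,1,9,1) and
(right) y = rnorm(9).
With data in which the pigs fed vitamin 2 gain markedly more weight, the residuals in the Q-Q plot no longer fall on a straight line, in other words they no longer look normal; the post takes this to mean that the vitamin affects the pigs' weight.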
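To make the ANOVA table less of a black box, the F value and p-value can be recomputed by hand from the between-group and within-group sums of squares. The sketch below does this for the fixed example data (entered as a plain vector) and compares the result with anova(); variable names such as ss_between are illustrative.

```r
y <- c(1, 10, 1, 2, 10, 2, 1, 9, 1)                  # the fixed example data from the text
Treatment <- factor(c(1, 2, 3, 1, 2, 3, 1, 2, 3))

k <- nlevels(Treatment)                               # number of groups
N <- length(y)                                        # total observations
group_mean <- ave(y, Treatment)                       # each observation's group mean

ss_between <- sum((group_mean - mean(y))^2)           # Sum Sq for Treatment
ss_within  <- sum((y - group_mean)^2)                 # Sum Sq for Residuals
F_manual   <- (ss_between / (k - 1)) / (ss_within / (N - k))
p_manual   <- pf(F_manual, k - 1, N - k, lower.tail = FALSE)

tab <- anova(lm(y ~ Treatment))
print(c(F_manual = F_manual, F_anova = tab[["F value"]][1]))
print(c(p_manual = p_manual, p_anova = tab[["Pr(>F)"]][1]))
```

For these data the p-value falls below 0.05, so the hypothesis of equal group means is rejected on the basis of the F test.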

2. Implementing a regression in R

> x<-c(1,2,3)
> y<-c(3,4,5)
> a<-data.frame(x,y)
> a_lm<-lm(y~x,data=a)
> summary(a_lm)

Call:
lm(formula = y ~ x, data = a)

Residuals:
         1          2          3
-7.020e-17  1.404e-16 -7.020e-17

Coefficients:
             Estimate Std. Error   t value Pr(>|t|)
(Intercept) 2.000e+00  2.627e-16 7.614e+15   <2e-16 ***
x           1.000e+00  1.216e-16 8.224e+15   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.72e-16 on 1 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 6.764e+31 on 1 and 1 DF, p-value: < 2.2e-16

Note: the three points lie exactly on y = x + 2, so the residuals are floating-point noise and the enormous t and F values only reflect an exact fit.

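3. How to fetch historical daily data for a series of stocks with R's quantmod package

Here is an example for reference:
> install.packages('quantmod') # install the quantmod package
> require(quantmod) # load the quantmod package
> getSymbols("GOOG",src="yahoo",from="2013-01-01", to='2013-04-24') # fetch Google's stock data from Yahoo Finance
> chartSeries(GOOG,up.col='red',dn.col='green') # draw a candlestick chart (red for up days, green for down days)
To fetch several stocks at once, pass a character vector of symbols to getSymbols(), for example c("GOOG","AAPL").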

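Once daily prices are available, a natural next step for the regression theme of this page is to regress a stock's daily returns on a market index's returns. The sketch below is only an illustration under stated assumptions: it uses simulated prices so that it runs offline, and the names market_px, stock_px, and true_beta are hypothetical. With downloaded quantmod data, the closing prices would presumably come from something like Cl(GOOG) instead.

```r
set.seed(123)
n_days <- 250
# Simulated daily log returns (assumption: stand-ins for real downloaded data)
market_ret <- rnorm(n_days, mean = 0.0003, sd = 0.01)
true_beta  <- 1.2
stock_ret  <- 0.0001 + true_beta * market_ret + rnorm(n_days, sd = 0.008)

# Turn returns into price paths, then recover returns the way one would from real prices
market_px <- 100 * exp(cumsum(market_ret))
stock_px  <- 50  * exp(cumsum(stock_ret))
r_market  <- diff(log(market_px))      # daily log returns of the index
r_stock   <- diff(log(stock_px))       # daily log returns of the stock

fit <- lm(r_stock ~ r_market)          # the slope is the estimated beta
print(coef(fit))
```

The slope of this regression estimates how strongly the stock moves with the market; on the simulated data it should land near the value chosen for true_beta.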
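4. How to do linear correlation and regression analysis in R

The cor() function gives the correlation coefficient between two variables, and the scatterplotMatrix() function (from the car package) draws a scatter-plot matrix.

Base R, however, has no built-in function for partial correlation.
To analyse correlation, first call cor.test() to run a Pearson correlation analysis on the variables;
it returns the simple correlation coefficient together with a t-test for judging its significance.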
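Since base R lacks a dedicated partial-correlation function, one workable approach is to correlate the residuals left after regressing each variable on the control variable. The sketch below uses simulated data in which x and y are related only through z (an assumption made for illustration) and checks the result against the closed-form expression built from the pairwise correlations.

```r
set.seed(7)
n <- 200
z <- rnorm(n)                     # control variable
x <- z + rnorm(n)                 # x and y are both driven by z
y <- z + rnorm(n)

simple <- cor.test(x, y)          # Pearson correlation with its t-test

# Partial correlation of x and y given z: correlate the residuals
rx <- resid(lm(x ~ z))
ry <- resid(lm(y ~ z))
partial <- cor(rx, ry)

# Closed-form check from the three pairwise correlations
r_xy <- cor(x, y); r_xz <- cor(x, z); r_yz <- cor(y, z)
partial_formula <- (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))

print(c(simple = unname(simple$estimate), partial = partial))
```

When testing the partial correlation for significance, the t-test has n - 3 degrees of freedom with one control variable, rather than the n - 2 used for a simple correlation.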

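5. Logistic regression example in R with multiple independent variables

Logistic regression is a regression model in which the response (dependent) variable takes categorical values such as True/False or 0/1. It models the probability of a binary response as a mathematical function of the predictor variables.
The general mathematical formula for logistic regression is:
y = 1/(1 + e^(-(a + b1*x1 + b2*x2 + b3*x3 + ...)))

The parameters used are:
y is the response variable (here, the predicted probability that it equals 1).
x1, x2, x3, ... are the predictor variables.
a and b1, b2, b3, ... are numeric constant coefficients.
The function used to create the regression model is glm().
Syntax
The basic syntax of glm() for logistic regression is:
glm(formula,data,family)

The parameters used are:
formula is the symbolic description of the relationship between the variables.
data is the data set that supplies the values of these variables.
family is an R object that specifies the details of the model; for logistic regression its value is binomial.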
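The syntax above can be tried on a data set that ships with R. The sketch below is an illustration, not the original answer's code: it fits the 0/1 column am of the built-in mtcars data on three predictors (cyl, hp, wt) and then predicts the probability for a hypothetical car whose values are made up for the example.

```r
# mtcars ships with R; am is 0/1 (automatic/manual transmission)
input <- mtcars[, c("am", "cyl", "hp", "wt")]
fit <- glm(am ~ cyl + hp + wt, data = input, family = binomial)
print(summary(fit))

# Predicted probability that am = 1 for a hypothetical car
new_car <- data.frame(cyl = 4, hp = 100, wt = 2.5)
p <- predict(fit, newdata = new_car, type = "response")
print(p)
```

Each extra predictor is simply added to the right-hand side of the formula with +, which corresponds to the b1*x1 + b2*x2 + b3*x3 terms in the logistic formula.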

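6. How to run a regression analysis on given data in R (without using the lm function)

You can start by exploring what summary(lm(y~x)) actually is. First check its data type:
> m <- lm(y~x)
> class(summary(m))
[1] "summary.lm"
# As you can see, the result of summary() on an lm fit is a "summary.lm" object, which is hardly surprising. Let's keep exploring.
Every object in R is built on a few native data structures, so what is the native data structure of summary(lm(y~x))? You can check with the mode() command.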
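The excerpt stops before showing a fit that avoids lm(). One way to do it, sketched here under the assumption that ordinary least squares is what is wanted, is to solve the normal equations directly with matrix operations. The data values are hypothetical and chosen only for illustration.

```r
# Hypothetical data for illustration
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)

X <- cbind(Intercept = 1, x = x)              # design matrix with an intercept column
beta <- solve(t(X) %*% X, t(X) %*% y)         # normal equations: (X'X) beta = X'y

fitted_y <- X %*% beta
res <- y - fitted_y
df  <- length(y) - ncol(X)                    # residual degrees of freedom
sigma2 <- sum(res^2) / df                     # residual variance estimate
se <- sqrt(diag(sigma2 * solve(t(X) %*% X)))  # standard errors of the coefficients
r2 <- 1 - sum(res^2) / sum((y - mean(y))^2)   # R-squared

print(cbind(estimate = drop(beta), std_error = se, t_value = drop(beta) / se))
print(r2)
```

These are the same quantities that summary() reports for an lm fit. For ill-conditioned data, qr.solve(X, y) is a numerically steadier way to obtain the coefficients than forming X'X.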
