Chi-Square Test: A Worked Example
The chi-square test is a statistical method for determining whether two categorical variables are significantly associated. The variables should come from the same population and be categorical, for example yes/no, male/female, or red/green.

For example, we could build a dataset recording people's ice-cream purchases and test whether a person's gender is associated with the flavor they prefer. If an association is found, we can adjust the stock of each flavor according to the gender breakdown of visitors.

## 1. Data preparation: a reference example

```
library("MASS")

# Build a contingency table from the two variables of interest.
# (The original also created a data frame of the same columns first,
# but it was immediately overwritten, so it is omitted here.)
car.data <- table(Cars93$AirBags, Cars93$Type)
print(car.data)

# Perform the Chi-Square test.
print(chisq.test(car.data))
```

The test prints the following result:

```
    Pearson's Chi-squared test

data:  car.data
X-squared = 33.001, df = 10, p-value = 0.0002723

Warning message:
In chisq.test(car.data) : Chi-squared approximation may be incorrect
```

Only the variables AirBags and Type are needed here. The goal is to find out whether the type of car sold is significantly associated with the type of air bag. With p = 0.0002723, well below 0.05, the test indicates a significant association. The warning means that some expected cell counts are small, so the chi-squared approximation may be inaccurate.

## 2. Practical application

Suppose we have the following table (region codes in parentheses are the column names used in the R output below):

| Variant | South China (HN) | Southwest China (XN) | East China (HD) | Central China (HZ) | North China (HB) | Northeast China (DB) |
| --- | --- | --- | --- | --- | --- | --- |
| GJB2:c.107T>C | 0 | 0 | 0 | 0 | 0 | 0 |
| GJB2:c.109G>A | 8810 | 643 | 1621 | 18 | 25 | 19 |
| GJB2:c.176-191del16 | 138 | 3 | 37 | 1 | 1 | 1 |
| GJB2:c.235delC | 2264 | 74 | 706 | 7 | 7 | 16 |
| GJB2:c.253T>C | 8 | 0 | 8 | 0 | 0 | 0 |
| GJB2:c.257C>G | 34 | 1 | 13 | 1 | 2 | 1 |
| GJB2:c.299-300delAT | 452 | 14 | 197 | 1 | 4 | 2 |
| GJB2:c.35delG | 3 | 0 | 0 | 0 | 0 | 0 |
| GJB2:c.416G>A | 48 | 4 | 9 | 0 | 0 | 0 |
| GJB2:c.427C>T | 14 | 0 | 9 | 0 | 0 | 0 |
| GJB2:c.512insAACG | 110 | 0 | 31 | 0 | 0 | 0 |
| GJB2:c.94C>T | 2 | 0 | 2 | 0 | 0 | 0 |
| Total samples tested | 47001 | 3483 | 34351 | 419 | 828 | 192 |

We want to know whether the mutation frequency at each site differs significantly among regions. Taking c.109G>A as an example, build a contingency table in which the `none` row is the total number of samples tested minus the number of carriers:

```
> c109GA
            HN   XN    HD  HZ  HB  DB
c.109G.A  8810  643  1621  18  25  19
none     38191 2840 32730 401 803 173
```

Run a chi-square test on c.109G>A across the six regions:

```
> chisq.test(c109GA)

    Pearson's Chi-squared test

data:  c109GA
X-squared = 3670.1, df = 5, p-value < 2.2e-16
```

As we can see, **the difference is highly significant, which means at least two of the regions differ significantly from each other. We therefore run a chi-square test on every pair of regions**:

```
# Run a chi-square test on every pair of regions with nested for loops:
for (i in 1:5) {
  for (j in (i + 1):6) {
    print(colnames(c109GA[, c(i, j)]))
    print(chisq.test(c109GA[, c(i, j)]))
  }
}

# Results for c.109G>A:
[1] "HN" "XN"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 0.15277, df = 1, p-value = 0.6959

[1] "HN" "HD"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 3491.3, df = 1, p-value < 2.2e-16

[1] "HN" "HZ"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 56.272, df = 1, p-value = 6.311e-14

[1] "HN" "HB"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 132.56, df = 1, p-value < 2.2e-16

[1] "HN" "DB"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 9.2711, df = 1, p-value = 0.002328

[1] "XN" "HD"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 1059.1, df = 1, p-value < 2.2e-16

[1] "XN" "HZ"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 52.334, df = 1, p-value = 4.683e-13

[1] "XN" "HB"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 120.64, df = 1, p-value < 2.2e-16

[1] "XN" "DB"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 8.4687, df = 1, p-value = 0.003613

[1] "HD" "HZ"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 0.084166, df = 1, p-value = 0.7717

[1] "HD" "HB"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 4.8624, df = 1, p-value = 0.02745

[1] "HD" "DB"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 10.199, df = 1, p-value = 0.001405

[1] "HZ" "HB"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 1.0054, df = 1, p-value = 0.316

[1] "HZ" "DB"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 6.3068, df = 1, p-value = 0.01203

[1] "HB" "DB"

    Pearson's Chi-squared test with Yates' continuity correction

data:  c109GA[, c(i, j)]
X-squared = 16.228, df = 1, p-value = 5.615e-05
```
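The pairwise comparison above prints 15 separate test reports, which are hard to scan. It also runs 15 tests without adjusting for multiple comparisons. The sketch below is one possible way to rebuild the `c109GA` table from the counts in the table and collect the pairwise results into a single data frame. The names `carriers`, `totals`, `region_pairs`, and `pairwise` are illustrative. The Holm adjustment via `p.adjust` is an assumption added here, not part of the original analysis.

```r
# Carriers of c.109G>A and total samples tested, copied from the table above.
carriers <- c(HN = 8810, XN = 643, HD = 1621, HZ = 18, HB = 25, DB = 19)
totals   <- c(HN = 47001, XN = 3483, HD = 34351, HZ = 419, HB = 828, DB = 192)

# 2 x 6 contingency table: carriers vs. non-carriers in each region.
c109GA <- rbind(c.109G.A = carriers, none = totals - carriers)

# All 15 unordered pairs of regions, one pair per column.
region_pairs <- combn(colnames(c109GA), 2)

# Test each 2 x 2 sub-table and keep the statistic and the raw p-value.
# chisq.test applies Yates' continuity correction to 2 x 2 tables by default.
pairwise <- do.call(rbind, lapply(seq_len(ncol(region_pairs)), function(k) {
  regions <- region_pairs[, k]
  test <- chisq.test(c109GA[, regions])
  data.frame(
    region1   = regions[1],
    region2   = regions[2],
    statistic = unname(test$statistic),
    p.value   = test$p.value
  )
}))

# Assumption: Holm correction for multiple comparisons (not in the original).
pairwise$p.adjusted <- p.adjust(pairwise$p.value, method = "holm")

print(pairwise, digits = 4)
```

Pairs whose adjusted p-value stays below the chosen threshold (for example 0.05) can then be reported as significantly different. Other methods supported by `p.adjust`, such as `"bonferroni"` or `"BH"`, could be substituted depending on how conservative the analysis needs to be.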
laihui126
January 12, 2023, 10:55