PLINK is a well-established software package for genetic analysis. In many projects, we use
plink2 for genome-wide association studies (GWAS) and other computations on the raw genotype matrix. Here, I list several tips on using this software.
Using pgenlibr to read plink2 pgen files from R
To read genotype data stored in plink2 pgen/pvar/psam files, one can use an R package called pgenlibr. Here I show some example usage of pgenlibr.
How to install pgenlibr in your R environment?
$ git clone https://github.com/chrchang/plink-ng.git
$ R
> install.packages('plink-ng/2.0/pgenlibr', repos = NULL, type='source')
Some basic usage of pgenlibr
require(pgenlibr)
pfile <- '@@@/sample_genotype_data'
pvar <- NewPvar(paste0(pfile, '.pvar.zst'))
pgen <- NewPgen(paste0(pfile, '.pgen'), pvar=pvar)

GetVariantId(pvar, 1)
GetVariantId(pvar, 10)
GetVariantCt(pvar)
GetRawSampleCt(pgen)

# read a single variant
buf <- pgenlibr::Buf(pgen)
pgenlibr::Read(pgen, buf, 1)

# read a list of variants
## var.idxs (list of variants you would like to read)
geno_mat <- ReadList(pgen, var.idxs, meanimpute=FALSE)
Running a recessive model on chrX
As illustrated on the PLINK2 website, --glm recessive is the modifier for running a GWAS scan under a recessive model. However, this does not work well for the X chromosome (chrX). To work around this limitation, we can first recode chromosome X in the pvar file so that it is treated as if it were another autosome, and then run the association analysis.
- use --output-chr 26 to change the .pvar to encode chrX as “23”
- run the original --glm command with e.g. --chr-set 40 (which specifies 40 instead of 22 autosomes).
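The two steps above might be scripted roughly as follows. This is only a sketch under stated assumptions, not a tested pipeline: the file names (sample_genotype_data, pheno.txt, covar.txt), the output prefixes, and the helper functions run and chrx_recessive are all hypothetical, and a plink2 binary is assumed to be on the PATH. To stay on the safe side, the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print a command; execute it only when DRY_RUN=0 (default: print only).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Hypothetical helper: recessive GWAS with chrX recoded as "23".
# $1 input pfile prefix, $2 phenotype file, $3 covariate file, $4 output prefix
chrx_recessive() {
  # Step 1: rewrite the fileset with numeric chromosome codes (chrX becomes "23").
  run plink2 --pfile "$1" --output-chr 26 --make-pgen --out "$4.recoded" || return 1
  # Step 2: declare 40 autosomes so that "23" is read as an autosome,
  # then run the usual recessive scan.
  run plink2 --pfile "$4.recoded" --chr-set 40 \
    --pheno "$2" --covar "$3" \
    --glm recessive hide-covar --out "$4.recessive"
}

# Placeholder file names; replace with your own.
chrx_recessive sample_genotype_data pheno.txt covar.txt chrX
```

If the input .pvar is zstd-compressed (.pvar.zst), the first --pfile needs the vzs modifier. If no covariate file is used, drop --covar and add the allow-no-covars modifier to --glm.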
GWAS of multiple quantitative traits
Since plink2/20190401, the program supports efficient linear regression of multiple quantitative phenotypes.
--glm without --adjust now detects groups of quantitative phenotypes with the same “missingness pattern”, and processes them together.
One should use this feature for efficient computation.
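As a rough illustration, a single run over a phenotype file with several quantitative columns could look like the sketch below. The file names, the output prefix, and the helper functions run and multi_trait_gwas are hypothetical. It assumes pheno.txt has a header line and one column per trait, in which case plink2 loads all phenotype columns by default. As a precaution, the script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print a command; execute it only when DRY_RUN=0 (default: print only).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Hypothetical helper: one --glm run covering every phenotype column in a file.
# $1 pfile prefix, $2 phenotype file, $3 covariate file, $4 output prefix
multi_trait_gwas() {
  # --adjust is deliberately not passed: the grouping by missingness pattern
  # is described for --glm without --adjust.
  run plink2 --pfile "$1" --pheno "$2" --covar "$3" \
    --glm hide-covar --out "$4"
}

# Placeholder file names; replace with your own.
multi_trait_gwas sample_genotype_data pheno.txt covar.txt multi_trait
```

The point of the design is to submit all traits in one invocation instead of one job per trait, so that phenotypes sharing a missingness pattern can be processed together. Results should come out as one file per phenotype under the given output prefix.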