Some tips when working with PLINK2

1 minute read

Published:

PLINK is a well-established software for genetic analysis. In many project, we used plink2 for genome-wide association study (GWAS) and other computations related to the raw genotype matrix. Here, I list several tips on the use of this software.

pgenlibr to read plink2 pgen files from R

To read genotype data stored in plink2 pgen/pvar/psam files, one can use an R package called pgenlibr. Here I show some example usage of pgenlibr.

How to install pgenlibr in your R environment?

$ git clone https://github.com/chrchang/plink-ng.git
$ R
> install.packages('plink-ng/2.0/pgenlibr', repos = NULL, type='source')

Some basic usage of pgenlibr

require(pgenlibr)

pfile <- '@@@/sample_genotype_data'

pvar <- NewPvar(paste0(pfile, '.pvar.zst'))
pgen <- NewPgen(paste0(pfile, '.pgen'), pvar=pvar)

GetVariantId(pvar, 1)
GetVariantId(pvar, 10)
GetVariantCt(pvar)
GetRawSampleCt(pgen)

# read a single variant
buf <- pgenlibr::Buf(pgen)
pgenlibr::Read(pgen, buf, 1)

# read a list of variants
## var.idxs (list of variants you would like to read)
geno_mat <- ReadList(pgen, var.idxs, meanimpute=F)

Running recessive model on chrX

As illustrated in PLINK2 website, --glm recessive will be the modifier to run GWAS scan with recessive model. However, this does not work well for X chromosome (chrX). To mitigate this limitation, we can record the chromosome X in the pvar file and pretend as if another autosome first and run thee association analysis.

  1. use --output-chr 26 to change the .pvar to encode chrX as “23”
  2. run the original --glm command with e.g. --chr-set 40 (which specifies 40 instead of 22 autosomes).