Return a dosage matrix that gives, for each genotype, the number of copies of each residue (amino acid or nucleotide) at each position.

Usage

dosage(
  mat,
  names,
  drop_constants = TRUE,
  drop_duplicates = FALSE,
  verbose = FALSE
)

Arguments

mat

A one-hot encoded matrix with one row per allele and one column for each residue (amino acid or nucleotide) at each position.

names

Input character vector with one genotype for each individual. Every allele named in a genotype must be present in rownames(mat).

drop_constants

Filter out constant amino acid positions. TRUE by default.

drop_duplicates

Filter out duplicate amino acid positions. FALSE by default.

verbose

If TRUE, print messages along the way.

Value

A matrix with one row for each input genotype, and one column for each residue at each position.

Details

Each genotype should be a single string of comma-separated allele names, for example "HLA-A*01:01,HLA-A*01:01".

By default (drop_constants = TRUE), the returned matrix is filtered to exclude positions where all input genotypes have the same allele.
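The idea behind a dosage matrix can be sketched in base R on a toy example. This is only an illustration under stated assumptions, not the package's actual implementation: the helper toy_dosage(), the allele names, and the column names below are all hypothetical. The sketch sums the one-hot rows of the alleles listed in each genotype, then drops columns whose dosage is the same for every genotype.

```r
# Toy one-hot matrix: one row per allele, one column per residue at a position.
# Allele names and column names are made up for illustration.
onehot <- matrix(
  c(1, 0, 1,
    0, 1, 1,
    1, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("A*01:01", "A*02:01", "A*03:01"),
    c("P1_A", "P1_G", "P2_K")
  )
)

genotypes <- c("A*01:01,A*01:01", "A*01:01,A*02:01", "A*02:01,A*03:01")

# Hypothetical helper (not part of the package): for each genotype, split the
# string on commas and add up the one-hot rows of its alleles.
toy_dosage <- function(mat, names) {
  res <- t(vapply(
    strsplit(names, ",", fixed = TRUE),
    function(alleles) colSums(mat[alleles, , drop = FALSE]),
    numeric(ncol(mat))
  ))
  dimnames(res) <- list(names, colnames(mat))
  res
}

d <- toy_dosage(onehot, genotypes)

# Assumed meaning of "constant": a column with the same dosage in every genotype.
keep <- apply(d, 2, function(x) length(unique(x)) > 1)
d_filtered <- d[, keep, drop = FALSE]
```

In this toy input every allele carries residue K at position 2, so the P2_K column has dosage 2 for all three genotypes and is removed, while the two P1 columns vary between genotypes and are kept.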

Examples

DRB1_file <- file.path(
  "https://github.com/ANHIG/IMGTHLA/raw",
  "5f2c562056f8ffa89aeea0631f2a52300ee0de17",
  "alignments/DRB1_prot.txt"
)
a <- read_alignments(DRB1_file)
genotypes <- c(
  "DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02",
  "DRB1*04:174,DRB1*15:152",
  "DRB1*04:56:02,DRB1*15:01:48",
  "DRB1*14:172,DRB1*04:160",
  "DRB1*04:359,DRB1*04:284:02"
)
# Store the result under a name that does not shadow the dosage() function
d <- dosage(a$onehot, genotypes)
d[, 1:5]
#>                                                 n29unk Mn29 n28unk Vn28 n27unk
#> DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02      1    2      1    2      1
#> DRB1*04:174,DRB1*15:152                              2    0      2    0      2
#> DRB1*04:56:02,DRB1*15:01:48                          2    0      2    0      2
#> DRB1*14:172,DRB1*04:160                              2    0      2    0      2
#> DRB1*04:359,DRB1*04:284:02                           2    0      2    0      2