
Here are the conventions used for alignments (EBI IMGT-HLA help page):

  • The entry for each allele is displayed with respect to the reference sequence.

  • Where identity to the reference sequence is present the base will be displayed as a hyphen (-).

  • Non-identity to the reference sequence is shown by displaying the appropriate base at that position.

  • Where an insertion or deletion has occurred this will be represented by a period (.).

  • If the sequence is unknown at any point in the alignment, this will be represented by an asterisk (*).

  • In protein alignments for null alleles, the 'Stop' codons will be represented by an X.

  • In protein alignments, sequence following the termination codon will not be marked and will appear blank.

  • These conventions are used for both nucleotide and protein alignments.
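The hyphen convention means an aligned entry only makes sense next to its reference. As a minimal sketch, assuming both strings have the same length, the helper below fills each hyphen with the reference residue and leaves periods, asterisks, and explicit residues untouched. The name `expand_aligned` and the toy sequences are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of the package): replace each "-" in an
# aligned entry with the residue found at the same position in the reference.
expand_aligned <- function(reference, aligned) {
  ref <- strsplit(reference, "")[[1]]
  aln <- strsplit(aligned, "")[[1]]
  stopifnot(length(ref) == length(aln))
  same <- aln == "-"          # positions identical to the reference
  aln[same] <- ref[same]      # "." and "*" are left as they are
  paste(aln, collapse = "")
}

# Toy sequences invented for illustration only.
reference <- "MVCLK.LPGG"
aligned   <- "--R--.--*-"
expanded  <- expand_aligned(reference, aligned)
print(expanded)
```

Only hyphens are substituted here, so indels and unknown positions stay visible in the expanded string.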

Usage

hla_alignments(
  gene = "DRB1",
  type = "prot",
  release = "latest",
  verbose = FALSE
)

Arguments

gene

The name of a gene like "DRB1"

type

The type of sequence, one of "prot" (protein), "nuc" (nucleotide), or "gen" (genomic).

release

A release name like "3.51.0". Default is "latest".

verbose

If TRUE, print messages along the way.

Value

A list with a character vector called sequences and two matrices called alleles and onehot:

  • sequences is a character vector with one sequence for each allele. The names are the allele names.

  • alleles is a matrix with one row for each allele and one column for each position. The values are the residues at each position in each allele.

  • onehot is a matrix with a one-hot encoding of the variants that distinguish the alleles, with one row for each allele and one column for each amino acid at each position.
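To illustrate how a one-hot matrix relates to a residue matrix, here is a minimal sketch on invented data. It is not the package's implementation: the helper `one_hot`, the toy allele names, and the choice to skip positions where every allele carries the same residue are all assumptions made for this example. Column names join the residue and the position name, following the pattern seen in names such as "Vn28".

```r
# Toy residue matrix: rows are alleles, columns are positions (invented data).
alleles <- matrix(
  c("M", "V",
    "M", "L",
    "M", "V"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A*01", "A*02", "A*03"), c("n29", "n28"))
)

# Hypothetical helper: one 0/1 column per residue observed at each position
# that varies across alleles. Invariant positions are skipped (an assumption).
one_hot <- function(alleles) {
  cols <- list()
  for (pos in colnames(alleles)) {
    residues <- sort(unique(alleles[, pos]))
    if (length(residues) < 2) next
    for (res in residues) {
      cols[[paste0(res, pos)]] <- as.integer(alleles[, pos] == res)
    }
  }
  if (length(cols) == 0) {
    # No variable positions: return a matrix with zero columns.
    return(matrix(0L, nrow = nrow(alleles), ncol = 0,
                  dimnames = list(rownames(alleles), NULL)))
  }
  out <- do.call(cbind, cols)
  rownames(out) <- rownames(alleles)
  out
}

oh <- one_hot(alleles)
print(oh)
```

In this toy case position n29 is invariant and is dropped, while n28 yields two indicator columns, one per observed residue.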

See also

hla_releases() to get a complete list of all release names.

Examples

# \donttest{
a <- hla_alignments("DRB1")
head(a$sequences)
#>                                                                                                                                                                                                                                                                DRB1*01:01:01:01 
#> "MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVR.LLERCIYNQEE.SVRFDSDVGEYRAVTELGRPDAEYWNSQKDLLEQRRAAVDTYCRHNYGVGESFTVQRR.VEPKVTVYPSKTQPLQHHNLLVCSVSGFYPGSIEVRWFRNGQEEKAGVVSTGLIQNGDWTFQTLVMLETVPRSGEVYTCQVEHPSVTSPLTVEWRARSESAQSKMLSGVGGFVLGLLFLGAGLFIYFRNQKGHSGLQPTGFLS" 
#>                                                                                                                                                                                                                                                                DRB1*01:01:01:02 
#> "------------------------------------------------------.-----------.----------------------------------------------------------.-----------------------------------------------------------------------------------------------------------------------------------------------" 
#>                                                                                                                                                                                                                                                                DRB1*01:01:01:03 
#> "------------------------------------------------------.-----------.----------------------------------------------------------.-----------------------------------------------------------------------------------------------------------------------------------------------" 
#>                                                                                                                                                                                                                                                                DRB1*01:01:01:04 
#> "------------------------------------------------------.-----------.----------------------------------------------------------.-----------------------------------------------------------------------------------------------------------------------------------------------" 
#>                                                                                                                                                                                                                                                                DRB1*01:01:01:05 
#> "------------------------------------------------------.-----------.----------------------------------------------------------.-----------------------------------------------------------------------------------------------------------------------------------------------" 
#>                                                                                                                                                                                                                                                                DRB1*01:01:01:06 
#> "------------------------------------------------------.-----------.----------------------------------------------------------.-----------------------------------------------------------------------------------------------------------------------------------------------" 
a$alleles[1:6,1:6]
#>                  n29 n28 n27 n26 n25 n24
#> DRB1*01:01:01:01 "M" "V" "C" "L" "K" "L"
#> DRB1*01:01:01:02 "M" "V" "C" "L" "K" "L"
#> DRB1*01:01:01:03 "M" "V" "C" "L" "K" "L"
#> DRB1*01:01:01:04 "M" "V" "C" "L" "K" "L"
#> DRB1*01:01:01:05 "M" "V" "C" "L" "K" "L"
#> DRB1*01:01:01:06 "M" "V" "C" "L" "K" "L"
a$onehot[1:6,1:6]
#>                  n29unk Mn29 n28unk Ln28 Vn28 n27unk
#> DRB1*01:01:01:01      0    1      0    0    1      0
#> DRB1*01:01:01:02      0    1      0    0    1      0
#> DRB1*01:01:01:03      0    1      0    0    1      0
#> DRB1*01:01:01:04      0    1      0    0    1      0
#> DRB1*01:01:01:05      0    1      0    0    1      0
#> DRB1*01:01:01:06      0    1      0    0    1      0
# }
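
A residue matrix shaped like alleles (one row per allele, one named column per position) can be used to find the positions where two alleles differ. The sketch below uses a small invented matrix so that it runs without downloading anything. The allele names and residues are placeholders, not real data.

```r
# Invented stand-in for a residue matrix: rows are alleles, columns positions.
alleles <- matrix(
  c("M", "V", "C",
    "M", "L", "C"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("X*01:01", "X*01:02"), c("n29", "n28", "n27"))
)

# Compare two rows position by position and keep the names that differ.
differs <- alleles["X*01:01", ] != alleles["X*01:02", ]
diff_positions <- names(which(differs))

# Show only the differing columns; drop = FALSE keeps the matrix shape.
print(alleles[, diff_positions, drop = FALSE])
```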