Skip to contents

Introduction

Kamil Slowikowski

2024-05-15

In this vignette, we explore a few different methods for visualizing the molecular structure of HLA proteins. First, we’ll look at an example of how to use the NGLVieweR R package to show HLA protein structures. Next, we’ll use PyMOL to do the same thing.

What are the PDB identifiers for each HLA gene?

Here is a list of PDB identifiers you might consider using to represent each HLA protein:

HLA-A  2xpg
HLA-B  2bvp
HLA-C  4nt6
HLA-DP 3lqz
HLA-DQ 4z7w
HLA-DR 3pdo

Also try searching the PDB website for, e.g., "HLA-DR" and see if there is a more appropriate structure for your analysis.

Using NGLVieweR

Let’s try to visualize the amino acid at PDB position 9 in the HLA-B protein structure.

We will visualize the structure of 2bvp from the Protein Data Bank (PDB).

Here is an example of how to do this with the NGLVieweR R package by Niels van der Velden:

# devtools::install_github("nvelden/NGLVieweR") # we need the latest version
library(NGLVieweR)
library(magrittr)
my_sele <- "9:A"
NGLVieweR("2bvp") %>%
  stageParameters(
    backgroundColor = "white",
    zoomSpeed = 1,
    cameraFov = 80
  ) %>%
  addRepresentation(
    type = "cartoon"
  ) %>%
  addRepresentation(
    type = "ball+stick",
    param = list(
      sele = my_sele
    )
  ) %>%
  addRepresentation(
    type = "label",
    param = list(
      sele = my_sele,
      labelType = "format",
      labelFormat = "[%(resname)s]%(resno)s", # or enter custom text
      labelGrouping = "residue", # or "atom" (eg. sele = "20:A.CB")
      color = "black",
      fontFamiliy = "sans-serif",
      xOffset = 1,
      yOffset = 0,
      zOffset = 0,
      fixedSize = TRUE,
      radiusType = 1,
      radiusSize = 5.5, # Label size
      showBackground = TRUE
      # backgroundColor="black",
      # backgroundOpacity=0.5
    )
  ) %>%
  zoomMove(
    center = my_sele,
    zoom = my_sele,
    duration = 0, # animation time in ms
    z_offSet = -20
  ) %>%
  setSpin()

In the view above, we see the blue peptide and the red HLA-B protein. The tyrosine at PDB position 9 is highlighted with a ball+stick representation, and it is also labeled with a text label. The structure is rotating so we can getter a better view.

We can use hlabud to answer some questions about this HLA-B amino acid sequence.

The first question we need to ask is:

  • Which IMGT position corresponds to the tyrosine at PDB position 9?

We need to open the PDB Sequence Annotations tool in order to figure out which IMGT number corresponds to the PDB number 9. Here is a screenshot from that tool:

PDB Sequence Annotations for 2bvp
PDB Sequence Annotations for 2bvp

Next, we can view the amino acid sequence numbering from IMGT:

library(stringr)
a$alleles[which(str_detect(rownames(a$alleles), "B*57:03")),][1,1:50]
#>      n30      n29      n28      n27      n26      n25      n24      n23 
#>      "M"      "R"      "V"      "T"      "A"      "P"      "R"      "T" 
#>      n22  n22_n21      n21      n20      n19      n18      n17      n16 
#>      "V" "......"      "L"      "L"      "L"      "L"      "W"      "G" 
#>      n15      n14      n13      n12      n11      n10       n9       n8 
#>      "A"      "V"      "A"      "L"      "T"      "E"      "T"      "W" 
#>       n7       n6       n5       n4       n3       n2       n1        1 
#>      "A"      "G"      "S"      "H"      "S"      "M"      "R"      "Y" 
#>        2        3        4        5        6        7        8        9 
#>      "F"      "Y"      "T"      "A"      "M"      "S"      "R"      "P" 
#>       10       11       12       13       14       15       16       17 
#>      "G"      "R"      "G"      "E"      "P"      "R"      "F"      "I" 
#>       18    18_19 
#>      "A"  "....."

By eye, we can see that the sequence YFYT starting at PDB position 9 corresponds to the YFYT sequence at IMGT position 3. So, we have manually confirmed that PDB position 9 matches with IMGT position 3.

Next, we might ask which HLA-B alleles have Y3?

my_alleles <- names(which(a$onehot[,"Y3"] == 1))
length(my_alleles)
#> [1] 7023
head(my_alleles, 20)
#>  [1] "B*07:02:01:01" "B*07:02:01:02" "B*07:02:01:03" "B*07:02:01:04"
#>  [5] "B*07:02:01:05" "B*07:02:01:06" "B*07:02:01:07" "B*07:02:01:08"
#>  [9] "B*07:02:01:09" "B*07:02:01:10" "B*07:02:01:11" "B*07:02:01:12"
#> [13] "B*07:02:01:13" "B*07:02:01:14" "B*07:02:01:15" "B*07:02:01:16"
#> [17] "B*07:02:01:17" "B*07:02:01:18" "B*07:02:01:19" "B*07:02:01:20"

What fraction of reported HLA-B alleles have tyrosine at IMGT position 3 (Y3)?

sum(a$onehot[,"Y3"] == 1) / nrow(a$onehot)
#> [1] 0.711406

As it turns out, almost all of the HLA-B alleles have Y3.

Using PyMOL

PyMOL is one of my favorite methods for visualizing protein structures, because it allows us to change a residue in an existing protein and visualize the new mutated protein.

It only takes few lines of PyMOL to create a nice figure.

For example, if we want to quickly highlight positions 13 and 45 in HLA-DQB1, this snippet of PyMOL code will produce the figure below.

Here is a Bash script that will:

  1. Write a PyMOL script
  2. Run the PyMOL script with the pymol command
#!/usr/bin/env bash

# Write a pymol script
cat << EOF > script.pml
fetch 7kei
show cartoon
remove solvent
remove chain D
remove chain H
color teal, chain A
color orange, chain B
color purple, chain C
color red, chain B & resi 13
color red, chain B & resi 45
label n. CA and chain B & resi 13, "%s %s" % (resi, resn)
label n. CA and chain B & resi 45, "%s %s" % (resi, resn)
png 7kei.png, width=1200, height=800, dpi=300 
EOF

# On Linux, we can just use `pymol` without making an alias

# On macOS, we need to make an alias
alias pymol=/Applications/PyMOL.app/Contents/MacOS/PyMOL

pymol -c script.pml

Here is what the PyMOL script will do:

  1. Load a structure from the Protein Data Bank (PDB). 7kei is the identifier for a published protein structure.
  2. Color the HLA-DQA1 protein teal.
  3. Color the HLA-DQB1 protein orange.
  4. Color the peptide purple.
  5. We color residues 13 and 45 in HLA-DQB1 red.
  6. Label those residues with their positions and names.
  7. Write a PNG file with a view of the structure.

In the image above, I manually rotated the structure with my mouse and added more text labels like "PDB: 7kei" after saving the file.