Nomenclature report for killer-cell immunoglobulin-like receptors (KIR) in macaque species: new genes/alleles, renaming recombinant entities and IPD-NHKIR updates

The Killer-cell Immunoglobulin-like Receptors (KIR) are encoded by a diverse group of genes characterized by allelic polymorphism, gene duplications, and recombinations that may generate recombinant entities. The number of reported macaque KIR sequences is steadily increasing, and these data illustrate a gene system that may match or exceed the complexity of the human KIR cluster. This report lists the names of quality-controlled and annotated KIR genes/alleles, with all the relevant references, for two macaque species: rhesus and cynomolgus macaques. Numerous recombinant KIR genes in these species necessitate a revision of some of the earlier-published nomenclature guidelines. In addition, this report summarizes the latest information on the Immuno Polymorphism Database (IPD)-NHKIR Database, which contains annotated KIR sequences from four non-human primate species.


Introduction
Over the last two decades, the number of human Killer-cell Immunoglobulin-like Receptor (KIR) sequences and haplotypes has increased substantially. These data shed light on a plastic gene cluster of higher primates that is characterized by allelic polymorphism and variable gene content, and that involves complex recombinations and high levels of alternative splicing (Trowsdale et al. 2001; Hsu et al. 2002; Parham 2004; Hammond et al. 2016; Bruijnesteijn et al. 2018a, b). The system of nomenclature for human KIR genes (Marsh et al. 2003) accounts for the number of domains (2D or 3D), as well as for the activating (S) or inhibitory (L) signalling potential of the intracellular tail. In addition, the KIR genes and alleles are differentiated by numbers. For example, KIR3DL1*001 defines the first allele of a gene encoding a receptor that has three extracellular domains and a long cytoplasmic tail. For non-human primate species (NHP) such as macaques, chimpanzees, and orangutans, the human KIR nomenclature rules have been applied, and when these have not been sufficient, species-specific adaptations have been added to the guidelines for the nomenclature (Robinson et al. 2018).
Jesse Bruijnesteijn, Natasja G. de Groot, and Nel Otting: curators of the IPD-NHKIR Database.
This article is part of the Topical Collection on "Nomenclature, databases and bioinformatics in Immunogenetics".
Among the human KIR characterized, only a few intragenic recombinations have been reported (Roe et al. 2017; Bruijnesteijn et al. 2018a, b). The number of such recombinant KIR could be underestimated, however, because family studies have not been a focus of the work, and recombinants may be missed in studies that mainly involve unrelated individuals. A recent study of the KIR gene transcriptome in families of rhesus (Macaca mulatta, Mamu) and cynomolgus (Macaca fascicularis, Mafa) macaques has identified numerous intragenic recombinant KIR (Bruijnesteijn et al., unpublished data). In this report on KIR nomenclature, we build on the previously reported human and NHP guidelines (Robinson et al. 2018) and focus on macaques, because the number of genes/alleles reported for these species has increased substantially.

General nomenclature guidelines for macaque KIR genes
The naming of macaque KIR genes follows the general principles that have been previously described (Marsh et al. 2003; Robinson et al. 2018). In brief, the first digit following the KIR abbreviation gives the number of immunoglobulin-like domains (denoted as "D"). In macaques, genes that encode KIR1D, KIR2D, and KIR3D structures are found (Hershberger et al. 2001). A long or short cytoplasmic tail, characteristic of inhibitory and activating receptors, respectively, is specified with an "L" or an "S" following the D, whereas "P" denotes a pseudogene. Genes that are considered to be novel, but that lack sufficient confirmation at the genomic DNA level to define a gene or lineage, are denoted by a "W" for "Workshop", which follows the designation of the cytoplasmic tail. Different KIR genes are distinguished by sequential two-digit numbering. Non-synonymous KIR alleles are distinguished by three-digit numbers that are separated from the gene digits by an asterisk. Synonymous polymorphisms in the coding sequence of a KIR gene are distinguished by a second set of two digits, which is separated from the non-synonymous three-digit number by a colon (e.g., Mamu-KIR3DL01*012:02). A third set of digits, separated from the synonymous two-digit number by a colon, defines substitutions in the introns. Optional suffixes indicating the expression status of alleles can be provided. These include indicators of no expression, referred to as "Null" alleles ("N"), low cell surface expression ("L"), soluble and secreted gene products ("S"), and cytoplasmic expression ("C"). The "A" suffix is used when there is doubt as to whether a protein is expressed, whereas "Q" indicates alleles for which the expression is "Questionable", based on the study of previously reported mutations that affect the level of expression.
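The naming rules above can be sketched as a small parser. This is a minimal illustration rather than an official IPD grammar: the regular expression, the `KirName` record, and the `parse_kir_name` function are hypothetical names introduced here. The sketch assumes a four-letter species prefix (as in Mamu and Mafa), treats the tail letter and gene number as optional so that a bare designation such as Mamu-KIR1D is accepted, and allows any number of digits in the intron field, because the guidelines do not state its width.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Illustrative pattern assembled from the guidelines; not an official grammar.
KIR_NAME = re.compile(
    r"^(?:(?P<species>[A-Z][a-z]{3})-)?"  # assumed four-letter prefix, e.g. Mamu
    r"KIR(?P<domains>[1-3])D"             # number of Ig-like domains
    r"(?P<tail>[LSP])?"                   # long tail, short tail, or pseudogene
    r"(?P<workshop>W)?"                   # provisional "Workshop" gene
    r"(?P<gene>\d{2})?"                   # sequential two-digit gene number
    r"(?:\*(?P<allele>\d{3})"             # non-synonymous allele number
    r"(?::(?P<synonymous>\d{2})"          # synonymous coding variant
    r"(?::(?P<intron>\d+))?)?"            # intron variant (width is an assumption)
    r"(?P<suffix>[NLSCAQ])?)?$"           # optional expression-status suffix
)


@dataclass(frozen=True)
class KirName:
    species: Optional[str]
    domains: int
    tail: Optional[str]
    workshop: bool
    gene: Optional[str]
    allele: Optional[str]
    synonymous: Optional[str]
    intron: Optional[str]
    suffix: Optional[str]


def parse_kir_name(name: str) -> KirName:
    """Split a macaque-style KIR designation into its named fields."""
    match = KIR_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised KIR designation: {name!r}")
    fields = match.groupdict()
    return KirName(
        species=fields["species"],
        domains=int(fields["domains"]),
        tail=fields["tail"],
        workshop=fields["workshop"] is not None,
        gene=fields["gene"],
        allele=fields["allele"],
        synonymous=fields["synonymous"],
        intron=fields["intron"],
        suffix=fields["suffix"],
    )
```

For Mamu-KIR3DL01*012:02, the parser returns species Mamu, three domains, a long tail, gene 01, allele 012, and synonymous variant 02. Because the sketch requires the two-digit gene numbering used for macaques, a human designation such as KIR3DL1*001 is rejected.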
KIR genes in the various species of macaque
More than 20 species of macaque have been distinguished (Anandam et al. 2013). At present, characterization of the KIR genes has concentrated on rhesus and cynomolgus macaques (Khakoo et al. 2000; Grendell et al. 2001; Hershberger et al. 2001; Rajalingam et al. 2001; Guethlein et al. 2002; Andersen et al. 2004; Sambrook et al. 2005; Guethlein et al. 2007; Bimber et al. 2008; Blokhuis et al. 2009a, b; Bostik et al. 2009; Abi-Rached et al. 2010; Blokhuis et al. 2010; Chaichompoo et al. 2010; Kruse et al. 2010; Colantonio et al. 2011; Hellmann et al. 2011; Moreland et al. 2011; Bimber and Evans 2015; Prall et al. 2017). Up to now, knowledge regarding the organization of KIR genes in macaques is sparse, with only two genomic assemblies of the macaque KIR region available (Sambrook et al. 2005; Graves 2019). Consequently, KIR genes that are highly similar based on phylogenetic clustering and sequence homology are considered to define a single gene or locus that is common to both species of macaque, and are therefore designated as orthologs and given matching gene names: for example, Mamu-KIR3DL01 and Mafa-KIR3DL01. In contrast, species-specific KIR genes are given different numbers in the order in which they are distinguished. To give an example, KIR3DLW13 has only been detected in cynomolgus macaques (Mafa-KIR3DLW13). At the allele level, sequences are numbered sequentially in the order in which they were defined. This procedure was applied independently to the different macaque species, without taking shared KIR alleles into account. In total, 58 rhesus and 59 cynomolgus macaque KIR genes have been defined, representing 576 and 334 KIR alleles, respectively (Tables 1 and 2).
These guidelines for naming KIR sequences will be applied to other macaque species, whose sequences will be distinguished by the use of species-specific prefixes (Table 3).

Nomenclature for recombinant macaque KIR genes
Study of rhesus and cynomolgus macaque KIR from different geographical origins has identified many recombinant KIR that are composed of segments derived from two or more different KIR genes, which were confirmed by independent PCRs or segregation analysis. According to the previous 63   (Table 4). In the future, newly discovered recombinant sequences will be assigned sequential gene (workshop) numbers. Previous designations of renamed alleles and genes will be retained and marked as deleted. An exception is made for recombinant KIR genes that are located in the centromeric region and involve the macaque framework gene KIR3DL20. The physical location of these recombinant genes has been established (Sambrook et al. 2005), which contrasts with those of the recombinant lineage II KIR genes. Recombinant sequences derived from the centromeric region are still assigned as KIR3DL20 alleles, but are shown to be "Recombinant" with a novel suffix "R" subsequent to the allele designation. For example, Mamu-KIR3DL20*030R (acc. nr. LR694489) and Mamu-KIR3DL20*044R (acc. nr. LR694507) are recombinants that consist of the first seven exons of Mamu-KIR3DL20, and of the intracellular domains of Mamu-KIR2DL04 and Mamu-KIR1D, respectively.
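As a small illustration of the "R" suffix rule for centromeric KIR3DL20 recombinants, the check below tests whether a KIR3DL20 allele designation carries that suffix. It is a sketch under stated assumptions: the pattern and the function name `is_recombinant_kir3dl20` are hypothetical, a four-letter species prefix is assumed, and only the plain three-digit allele form used in the examples above is handled.

```python
import re

# Illustrative pattern: a KIR3DL20 allele with an optional trailing "R".
# The four-letter species prefix and the plain three-digit allele number
# are assumptions based on the two examples given in the text.
KIR3DL20_ALLELE = re.compile(
    r"^(?P<species>[A-Z][a-z]{3})-KIR3DL20\*(?P<allele>\d{3})(?P<recombinant>R)?$"
)


def is_recombinant_kir3dl20(name: str) -> bool:
    """Return True if a KIR3DL20 allele designation carries the 'R' suffix."""
    match = KIR3DL20_ALLELE.match(name)
    if match is None:
        raise ValueError(f"not a KIR3DL20 allele designation: {name!r}")
    return match.group("recombinant") is not None


# The two recombinants named in the text, plus one input without the suffix
# that is used here purely for illustration.
for designation in ("Mamu-KIR3DL20*030R", "Mamu-KIR3DL20*044R", "Mamu-KIR3DL20*001"):
    print(designation, is_recombinant_kir3dl20(designation))
```

Designations of other genes raise a `ValueError` instead of returning `False`, so a recombinant that has been given its own workshop number is not silently reported as non-recombinant.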
Most recombinant KIR genes of cynomolgus macaques have been assigned novel workshop numbers (Bruijnesteijn et al., unpublished data). Three additional KIR sequence groups have been renamed based on their recombinant nature (Table 5).

Renaming other macaque KIR genes
Several rhesus and cynomolgus macaque KIR sequences that were not obvious recombinants have been renamed based on sequence comparison and phylogenetic analysis (Tables 4 and 5). For example, 13 Mamu-KIR3DSW08 alleles are readily distinguished from the other KIR3DSW08 alleles, and have been renamed as alleles of Mamu-KIR3DSW39 (Table 4).
We should stress that Mamu-KIR3DL07 and Mamu-KIR3DL11 alleles group phylogenetically into three and two clusters, respectively, and that some KIR haplotypes contain several copies of these genes (Blokhuis et al. 2010; Bruijnesteijn et al. 2018a, b). Although an argument can be made for giving these paralogous genes unique gene names, sequence comparison has yet to indicate distinctive functions, and for this reason these genes have not been renamed.

The IPD-NHKIR Database
Knowledge regarding the KIR repertoire in various NHP species has increased steadily over the past decade, escalating the need    (Bimber et al. 2008; Prall et al. 2017) (Bruijnesteijn et al., unpublished data) have now been named, meaning that the KIR data from a fifth non-human primate species should soon be available. The current version of the IPD-NHKIR Database can host genomic sequences, and contains a multiple sequence alignment tool (Maccari et al. 2017). This tool allows for single gene alignments (nucleotide or protein) as well as inter- and intra-species gene alignments from all groups within the IPD-NHKIR Database. For each allele, a nomenclature table is accessible with additional information (for example: previous designations, GenBank/ENA/DDBJ accession number, and publications). The curators of the IPD-NHKIR Database are responsible for assembling, categorizing, and providing official designations for newly identified alleles. For the NHP part of the IPD-NHKIR Database, the research group of Prof. Dr. R.E. Bontrop (Rijswijk, The Netherlands) is responsible for curation of the KIR sequences of macaque species, and for these species currently only full-length sequences are accepted for annotation, whereas curation for all other non-human primate species is the responsibility of Dr. L. A. Guethlein and Prof. Dr. P. Parham (Stanford, USA). Sequences/alleles can be submitted using the online submission tool, which is available from the IPD-NHKIR Database homepage (https://www.ebi.ac.uk/ipd/nhkir/). Submitted sequences must meet the criteria described above and have a GenBank/ENA/DDBJ accession number. In addition to newly identified KIR sequences, we encourage all scientists working in the field of non-human primate KIR to submit all the sequences determined in their cohorts that are identical to published KIR alleles. This latter approach will provide an additional and valuable quality control tool for the database of archived KIR sequences.
Although at present only one KIR sequence can be submitted at a time, we are currently developing a bulk submission tool. The IPD-NHKIR Database provides a data release twice a year, which updates the website with all novel NHP KIR sequences that have become public, and relates them to the previously deposited sequences.