Uncommon Variants in FLG2 and NOD2 Are Associated with Atopic Dermatitis in the Ethiopian Population

Loss-of-function variants in the FLG gene have been identified as the strongest cause of susceptibility to atopic dermatitis (AD) in Europeans and Asians. However, very little is known about the genetic etiology behind AD in African populations, where the prevalence of AD is notably high. We sought to investigate the genetic origins of AD by performing whole-genome sequencing in an Ethiopian family with 12 individuals and several affected in different generations. We identified 2 variants within FLG2 (p.D13Y) and NOD2 (p.A918S) genes cosegregating with AD in the affected individuals. Further genotyping analyses in both Ethiopian and Swedish AD cases and controls revealed a significant association with the FLG2 variant (p.D13Y, P < .0013) only in the Ethiopian cohort. However, the NOD2 variant (p.A918S) did not show any association in our Ethiopian cohort. Instead, 2 previously recognized NOD2 variants (p.A849V, P < .0085 and p.G908R, P < .0036) were significantly associated with AD in our Ethiopian cohort. Our study indicates that the FLG2 and NOD2 genes might be important in the etiology of AD in Ethiopians. Additional genetic and functional studies are needed to confirm the role of these genes and the associated variants in the development of AD.


INTRODUCTION
Atopic dermatitis (AD) is the most common chronic inflammatory skin disorder and is characterized by a T helper 2 cell–mediated immune response and epidermal dysfunction. The current estimated prevalence of AD is approximately 2–10% in adults and 15–30% in children (Weidinger et al, 2018). The prevalence of AD varies among populations; it is slightly higher in African American children (19.3%) than in European American children (16.1%) (Brunner and Guttman-Yassky, 2019). In Ethiopia, the prevalence of AD has been estimated to be as high as 19% in Addis Ababa and around 1–2% in Southern Ethiopia (Ait-Khaled et al, 2007).
A significant contributing factor to AD is family history; it is estimated that if 1 parent is affected, the risk in the child increases threefold, and if both parents are affected, the risk increases fivefold (Weidinger et al, 2018). Today, loss-of-function variants in the FLG gene are the most recognized susceptibility factor for AD, increasing the risk of developing AD by 3–5 times (Laughter et al, 2021). The FLG gene encodes the epidermal protein FLG, a major structural protein of the stratum corneum, essential for the epidermal barrier and for maintaining hydration (Sandilands et al, 2009). Notably, the FLG-null variants are particularly prevalent in European and Asian populations (Smith et al, 2006), whereas the common FLG-null variants were reported to be absent among AD cases in African populations, in particular in Ethiopians (Polcari et al, 2014; Winge et al, 2011).
Beyond FLG, genome-wide association studies have identified additional loci associated with AD, encompassing genes implicated in immune regulation, epidermal barrier maintenance, tissue response, and environmental sensing (Tsakok et al, 2019). However, the cumulative impact of these genes accounts for only approximately 20% of the total heritability of AD, underscoring the complexity of its genetic underpinnings.

RESULTS
Variants in FLG2 and NOD2 genes cosegregate with AD in affected individuals of an Ethiopian AD family
WGS was performed on individuals of a 3-generation Ethiopian AD family with 12 individuals (Figure 1a and b). As described in the flowchart in Figure 1c, we looked for potentially pathogenic variants that were shared among the affected individuals and not present in the unrelated family member (individual 7 in Figure 1b). Using this strategy, we identified variants in 29 genes (Table 1). Among the variants, 2 rare deleterious missense variants, rs771395865 (p.D13Y; Genome Aggregation Database, version 4.0: 0.0001869 in African/African American and Combined Annotation Dependent Depletion: 22.9) in FLG2 and rs769395722 (p.A918S; Genome Aggregation Database, version 4.0: 0.00009342 in African/African American and Combined Annotation Dependent Depletion: 25.8) in NOD2, were found in the 4 sequenced affected individuals, indicating cosegregation of these variants with AD (Figure 1b and Table 2). Both variants were validated using Sanger sequencing in the family (Figure 2). Protein modeling of these missense variants using DynaMut2 predicted the destabilization of the proteins (Figure 3).

Association of FLG2 and NOD2 variants with AD in an Ethiopian cohort
To further investigate the association of the variants rs771395865 (p.D13Y) in FLG2 and rs769395722 (p.A918S) in NOD2 identified in the family, we genotyped Ethiopian and Swedish AD cases and controls. The Ethiopian cohort consisted of 189 cases and 203 controls, whereas the Swedish cohort included 300 cases and 2000 controls. In addition to investigating the variants identified in the Ethiopian family, we also genotyped the NOD2 variants (p.A849V and p.G908R) observed previously in a study where whole-exome sequencing was performed in 22 Ethiopian patients with AD (Taylan et al, 2015). The frequencies of these variants, globally and in Africans, were extracted from the Genome Aggregation Database and Allele Frequency Aggregator databases and are described in Table 2. The FLG2 p.D13Y variant (P < .0003) as well as the NOD2 p.A849V (P < .0085) and p.G908R (P < .0036) variants were significantly associated with AD in Ethiopian cases compared with Ethiopian healthy controls. However, these variants were not found to be significant in the Swedish AD case–control comparison (Table 3), suggesting that their importance in susceptibility to AD is limited to certain populations, including Ethiopians. The NOD2 p.A918S variant (P < .469) was not found to be associated with AD in either population (Table 3).
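The case–control comparison above relies on a chi-square test of association. As a minimal sketch of how such a test can be computed on a 2 × 2 table of allele counts, the example below uses only the Python standard library. The function name `chi_square_2x2` and the minor-allele counts are hypothetical and chosen for illustration; only the cohort sizes (189 cases, 203 controls) come from the text, and the genotype counts of Table 3 are not reproduced here.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Cohort sizes from the text: 189 cases and 203 controls (2 alleles each).
case_alleles = 2 * 189
control_alleles = 2 * 203

# HYPOTHETICAL minor-allele counts, for illustration only.
case_minor = 12
control_minor = 2

stat, p_value = chi_square_2x2(
    case_minor, case_alleles - case_minor,
    control_minor, control_alleles - control_minor,
)
print(f"chi-square = {stat:.3f}, P = {p_value:.4f}")
```

For rare variants, expected cell counts can be small, in which case an exact test may be a more suitable choice than the chi-square approximation.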

Expression of FLG2 and NOD2 in skin biopsies of Ethiopian patients carrying the associated variants
To further explore the effects of the variants found in FLG2 and NOD2 on protein expression, we obtained skin biopsies from 4 Ethiopian individuals: a healthy control, a patient with AD not carrying any of the studied variants, a patient with AD carrying NOD2 p.G908R, and a patient with AD carrying the FLG2 p.D13Y and NOD2 p.A849V variants (Figure 4a). In addition, AlphaMissense, a prediction tool for pathogenicity of missense variants (Cheng et al, 2023), predicted the variants mentioned earlier (NOD2 p.A849V, p.G908R, and FLG2 p.D13Y) to be pathogenic (Table 4).
Interestingly, immunostaining of the skin tissues showed that the expression of FLG2 was reduced in the presence of the FLG2 p.D13Y variant, which also influenced the expression of FLG in the stratum granulosum (Figure 4b–d). The expression of NOD2 was lower in the presence of the p.G908R variant, but no difference was observed for the p.A849V variant compared with the epidermis and the stratum basale from lesional skin of a noncarrier patient with AD (Figure 4e).

DISCUSSION
Genetics plays an important role in the risk of developing AD as well as in predicting its severity. It has been observed that the prevalence of AD varies greatly between diverse ethnic groups in countries with multiethnic populations, and the reasons for the geographical and ethnic differences are not well-understood (Kaufman et al, 2018). Our findings provide further support for the idea that European and Ethiopian populations show differing genetic backgrounds in AD etiology. Furthermore, we propose that the FLG2 and NOD2 genes may contribute to the development of AD specifically in the Ethiopian population. This underlines the importance of including a diverse genetic background when exploring the pathogenesis of AD as well as when designing clinical trials.
Table footnote: A CADD score of 30 indicates that a variant is predicted to be among the 0.1% most deleterious possible substitutions in the human genome. PolyPhen-2 and SIFT are computational algorithms commonly used to predict whether or not a specific amino acid substitution in a protein, caused by a genetic variant, is likely to be damaging to protein function. AlphaMissense predictions may illuminate the molecular effects of missense variants, identify pathogenic variants, uncover novel disease-causing genes, and increase diagnostic accuracy for genetic diseases (https://github.com/google-deepmind/alphamissense). AlphaMissense_pred is a deep learning model based on the protein structure prediction tool AlphaFold2. AFR denotes African population.
S Wang et al. FLG2 and NOD2 Variants in Ethiopians with AD
The epidermal differentiation complex region on chromosome 1 contains several important genes that contribute to the structural and functional integrity of the epidermal barrier (Hoffjan and Stemmler, 2007). FLG and FLG2, which are located next to each other, are in the epidermal differentiation complex region. FLG is strongly associated with AD in Europeans, whereas variants in FLG2, which is structurally very similar to FLG, have previously been described in African American patients with AD and linked to more persistent forms of AD (Margolis et al, 2014). Decreased expression of FLG2 has been associated with a thinner epidermis (Berna et al, 2022). Interestingly, when we further investigated the variants found in the Ethiopian family at the population level, we discovered that the FLG2 p.D13Y variant was associated with AD in the Ethiopian population cohort.
NOD2 encodes a cytosolic receptor that is involved in bacterial recognition by antigen-presenting cells and stimulates the immune response (Negroni et al, 2018). Variants in NOD2 are associated with Crohn's disease (Hugot et al, 2001) and have also been shown to possibly confer susceptibility to atopic disorders in 2 German cohorts (Kabesch et al, 2003; Weidinger et al, 2005).
Even though we did not observe statistical significance in the Ethiopian AD cohort, and none of the family members reported being affected by Crohn's disease, the NOD2 p.A918S variant found in the Ethiopian family affects the leucine-rich repeat domain, which is important for bacterial recognition, and may thereby play a critical role in innate immunity (Figure 1d).
Our results prompted us to investigate other variants in NOD2 previously suggested to be associated with AD, such as NOD2 p.A849V and p.G908R (Kabesch et al, 2003; Weidinger et al, 2005). Both NOD2 variants showed significant association in our Ethiopian case–control cohort, suggesting that both FLG2 and NOD2 genes may play a role in the etiology of AD in the Ethiopian population.
Figure 3. Protein structural analysis of the genetic variants. The analyses were performed using the DynaMut2 server, which assesses the effects of missense variants on protein stability and flexibility. The DynaMut2 analysis indicated that the missense variants NOD2 p.A918S and FLG2 p.D13Y could cause destabilization, with predicted stability changes (ΔΔG) of −0.45 kcal/mol and −0.03 kcal/mol, respectively. Amino acid changes are highlighted with arrows.
Furthermore, reduced expression of FLG, FLG2, and NOD2 in the staining of skin biopsies from 2 Ethiopian patients with AD carrying the associated variants supports the association. Missense variants such as the ones associated with AD in Ethiopians presented in this study may affect protein folding, potentially influencing protein expression. Even though NOD2 p.A849V, p.G908R, and FLG2 p.D13Y variants were predicted to be pathogenic using in silico tools such as AlphaMissense, in vitro assays are needed to confirm the functional consequences of these variants.
Our study is limited by the cohort sizes. Because the identified variants were rare in databases (minor allele frequency < 0.01), increasing the number of participants would have improved the power of our study. However, our study suggests that the identified variants in NOD2 and FLG2 associated with AD may be specific to Ethiopians. The reason why loss-of-function variants in FLG are not associated with AD among Ethiopians remains unknown (Fernandez et al, 2017). NOD2 has been related to atopic diseases because it also seems to affect T helper 2 pathways, consistent with observations in AD (Kabesch et al, 2003). To fully understand the effects of the identified variants, additional studies are required for functional characterization.

MATERIALS AND METHODS
Saliva collection and genomic DNA extraction
In both Ethiopia and Sweden, the diagnosis of AD was confirmed by dermatologists, using clinical examination, a standardized questionnaire regarding other atopic manifestations, and the UK Working Party's diagnostic criteria (Winge et al, 2011). To gather comprehensive information regarding AD severity and associated phenotypes such as allergies, questionnaires were administered to all patients and controls (Asad et al, 2019; Taylan et al, 2015; Winge et al, 2011). Saliva samples were collected using the Oragene-Discover (OGR-600) kit from DNA Genotek. In this study, we received saliva samples from a single family residing in Ethiopia, comprising a total of 12 individuals, including patients diagnosed with AD (n = 4) (Table 5). Ethiopian individuals diagnosed with AD (n = 189) and a control group (n = 203), consisting of subjects without any history of AD, dry skin, or atopic manifestations, were recruited at the dermatology department of Black Lion University Hospital in Addis Ababa and the University of Gondar (Gondar, Ethiopia). For comparison, saliva samples were collected from patients with AD at Karolinska University Hospital (Stockholm, Sweden) (n = 300). Data for Swedish healthy controls (n = 2000) were obtained from the SweGen database (Ameur et al, 2017).
Genomic DNA was extracted with prepIT.L2P reagents (DNA Genotek), following the manufacturer's instructions. Briefly, the prepIT.L2P reagent was added to the saliva samples, and the tubes were then centrifuged. The DNA pellet was washed, centrifuged, and air dried. Then, the DNA pellet was resuspended in TE buffer, and DNA quantification was performed using a NanoDrop 1000 spectrophotometer.

WGS of an Ethiopian family, including patients with AD
WGS was performed on 3 patients with AD and 2 healthy individuals from an Ethiopian family using high-yield and high-quality DNA samples. The DNA of the male in the first generation failed to meet the WGS quality control criteria. The sequencing was carried out at the National Genomics Infrastructure at the Science for Life Laboratory (Stockholm, Sweden). For library preparation, high-quality genomic DNA samples (300 ng) were pooled and sequenced on 1 Illumina NovaSeq 6000 S4 lane (Illumina, San Diego, CA) with 2 × 150 bp paired-end reads with DNA PCR-Free kits (350 bp insert size). The resulting data were processed, and the sequence reads were aligned to the human genome build GRCh37. SNPs and insertions/deletions were detected using the Genome Analysis Toolkit pipeline. Genetic variants were annotated using a toolbox (eg, dbSNP and SnpEff). The variant selection was based on the following criteria: (i) present in several or all affected individuals, (ii) absent in the unaffected parent, and (iii) in silico prediction of pathogenicity. The impact of variants was evaluated using variant pathogenicity classifiers, including SIFT (Sorting Intolerant From Tolerant), PolyPhen-2 (Polymorphism Phenotyping, version 2), Combined Annotation-Dependent Depletion, GERP++ (Genomic Evolutionary Rate Profiling), and Google DeepMind AlphaMissense (Cheng et al, 2023). Ultimately, the selected variants were manually examined in BAM (Binary Alignment Map) files using the Integrative Genomics Viewer.
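The three selection criteria above can be sketched as a simple filter. This is an illustration under stated assumptions, not the pipeline used in the study: the `Variant` class, the `passes_filter` function, the sample IDs, the third example variant, and the CADD cutoff of 20 are all hypothetical. The allele frequencies and CADD scores of the FLG2 and NOD2 variants are the values reported in the Results, and the rarity threshold (minor allele frequency < 0.01) follows the definition given in the Discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    protein_change: str
    afr_maf: float       # allele frequency in African/African American samples
    cadd: float          # CADD score
    carriers: frozenset  # IDs of sequenced family members carrying the variant


def passes_filter(variant, affected, unaffected,
                  min_affected=None, max_maf=0.01, min_cadd=20.0):
    """Return True if the variant meets the selection criteria."""
    # By default the variant must be present in ALL affected individuals;
    # pass min_affected to relax this to "several" affected individuals.
    required = len(affected) if min_affected is None else min_affected
    shared = len(variant.carriers & affected) >= required   # criterion (i)
    absent = not (variant.carriers & unaffected)            # criterion (ii)
    predicted = variant.cadd >= min_cadd   # criterion (iii); ASSUMED cutoff
    rare = variant.afr_maf < max_maf       # rarity, as defined in the Discussion
    return shared and absent and predicted and rare


# HYPOTHETICAL sample IDs for sequenced family members.
affected = frozenset({"A1", "A2", "A3", "A4"})
unaffected = frozenset({"U1"})

variants = [
    Variant("FLG2", "p.D13Y", 0.0001869, 22.9, affected),
    Variant("NOD2", "p.A918S", 0.00009342, 25.8, affected),
    # HYPOTHETICAL common, low-scoring variant also carried by the unaffected member.
    Variant("GENE_X", "p.hypothetical", 0.15, 3.2, affected | unaffected),
]

selected = [v.gene for v in variants if passes_filter(v, affected, unaffected)]
print(selected)
```

In practice, criterion (iii) combined several classifiers (SIFT, PolyPhen-2, CADD, GERP++, and AlphaMissense); the single CADD cutoff here only stands in for that step.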

Sanger sequencing
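To confirm the variants identified through WGS in the Ethiopian family, Sanger DNA sequencing was conducted using the ABI 3730 PRISM DNA Analyzer at the KIGene Core Facility. Pairs of primers were designed and picked using Primer3web (version 4.1.0). The primers used in the study are shown in Table 6.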

SNP genotyping
The SNPs were genotyped using the QuantStudio 6/7 Flex Real-Time PCR System Instrument (Life Technologies). Allele-specific TaqMan MGB probes labeled with fluorescent dyes FAM and VIC (Applied Biosystems) were used, in accordance with the manufacturer's instructions. QuantStudio Real-Time PCR Software (Applied Biosystems) was used for allelic discrimination analysis. All the probes used in this study were either predesigned or designed by Thermo Fisher Scientific. The list of TaqMan SNP Genotyping Assays used in this study is as follows: rs769395722, ANAAUA4; rs2066845, C_11717466_20; rs104895486, C_152958082_10; rs771395865, ANKCM77; and rs12568784, C_11261511_10.

Immunohistochemistry of paraffin skin biopsies
The skin biopsies were obtained from 1 normal healthy control and lesional skin in 3 Ethiopian patients with AD. One of the patients was wild type for the variants examined in this study. Another patient was a carrier of rs771395865 (p.D13Y) in FLG2 and rs104895486 (p.A849V) in NOD2, and the third patient was a carrier of rs2066845 (p.G908R) in NOD2. After fixation and paraffin embedding, 5-μm-thick sections were mounted onto glass slides. Heat-induced antigen retrieval using a pressure cooker was performed after deparaffinization. The slides were then blocked for 40 minutes with PBS 10% goat serum (Thermo Fisher Scientific) in 1% BSA/PBS + 0.2% Triton X-100. Slides were incubated overnight with primary antibodies FLG2 (rabbit, ab122011, Abcam, diluted 1:2000), NOD2 (mouse, ab31488, Abcam, diluted 1:1000), and FLG (mouse, ab218395, Abcam, diluted 1:1000). The sections were incubated with secondary biotinylated antibodies for 40 minutes at room temperature and avidin-biotin reagents (PK4000, VECTASTAIN ABC-HRP Kit) for 30 minutes. The slides were then rinsed in PBS with Tween and exposed to ImmPACT DAB substrate (SK-4105, Vector Laboratories) for 1.5 minutes.

Quantification of immunohistochemistry
Three distinct areas (942.6 × 600.1 μm) for each slide were analyzed. To evaluate the specificity of the immunohistochemistry results, negative controls in which the primary antibody was omitted were used. In addition, the goat anti-mouse IgG secondary antibody was used to identify any false-positive staining reactions. The slides were converted to whole-slide images using an Olympus slide scanner. The entire set of slides was scanned using the Olympus OlyVIA V3.3 software, enabling comprehensive viewing and the option to incorporate scale bars for suitable figures. Then, the average intensity of staining was quantified using ImageJ software and was normalized against negative immunohistochemistry controls. After the area of interest was selected in ImageJ, the epidermis analysis included the stratum basale (the deepest portion of the epidermis), stratum spinosum, stratum granulosum, stratum lucidum, and stratum corneum (the most superficial portion of the epidermis). Specifically, for FLG2, the analysis incorporated the stratum granulosum, and for NOD2, the stratum basale was considered (Figure 4b). The results were expressed as a ratio, with a score of 1.0 indicating equal intensity and a score >1.0 representing stronger staining intensity.
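The ratio described above can be illustrated with a short calculation. The sketch below assumes that each area's mean intensity is divided by the mean intensity of the negative control, which the text does not spell out, and then summarizes the 3 areas per slide as mean and SEM. The function names and all intensity values are hypothetical.

```python
import math
import statistics


def normalized_ratios(area_intensities, negative_control_intensity):
    """Divide each area's mean staining intensity by the negative-control
    intensity (ASSUMED normalization; 1.0 means equal intensity)."""
    if negative_control_intensity <= 0:
        raise ValueError("negative-control intensity must be positive")
    return [x / negative_control_intensity for x in area_intensities]


def mean_and_sem(values):
    """Return the mean and the standard error of the mean."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# HYPOTHETICAL mean intensities for the 3 analyzed areas of one slide
# and for its negative control (arbitrary units).
areas = [150.0, 165.0, 135.0]
negative_control = 100.0

ratios = normalized_ratios(areas, negative_control)
mean_ratio, sem_ratio = mean_and_sem(ratios)
print(f"ratio = {mean_ratio:.2f} +/- {sem_ratio:.2f} (SEM)")
```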

Statistical analysis
Mendelian inheritance and SNP frequencies were checked in all samples, including families and controls in the case–control cohort. Controls were checked for adherence to the Hardy–Weinberg equilibrium. A chi-square test was carried out to test for SNP association between AD and control cohorts, and P < .05 was considered significant. One-way ANOVA was used to assess statistically significant differences between the different groups for immunohistochemistry (*P < .05, **P < .01, ***P < .001, and ****P < .0001).

Figure 4 (continued). (b) Structure and skin cell composition of the epidermis. Epidermis analysis encompassed the following layers: the stratum basale, stratum spinosum, stratum granulosum, stratum lucidum, and stratum corneum; the assessment for FLG2 specifically focused on the stratum granulosum (highlighted in pink), whereas the stratum basale (highlighted in pink) was assessed for NOD2. This figure was adapted from Skin Cancer Progression by BioRender.com (2020). Retrieved from https://app.biorender.com/biorender-templates. (c–e) Staining intensity quantification was conducted using ImageJ software (National Institutes of Health, Bethesda, MD, https://imagej.nih.gov/ij/). Quantification of FLG2 and FLG expression was performed in the entire epidermis and stratum granulosum, and that of NOD2 was performed in the entire epidermis and stratum basale. To ensure data robustness, all comparisons were normalized against negative controls. Error bars in figures represent mean ± SEM. Dots represent the quantification of each area for each slide. Statistical analysis was performed using ordinary 1-way ANOVA with Tukey's multiple comparison test in GraphPad Prism, version 9.1.2 (Dotmatics, Boston, MA). Significance levels are denoted as follows: *P < .05, **P < .01, ***P < .001, and ****P < .0001.

Figure 1. WGS revealed rare variants in the NOD2 and the FLG2 gene in Ethiopian patients with AD. (a) Flowchart of the pipeline involved in generating and verifying the WGS results. DNA was sequenced using the Illumina platform. The GATK pipeline was applied to identify SNPs and insertions/deletions. The prediction tools were used to assess the potential impact of variants on protein function, conservation, and pathogenicity. Sanger sequencing was used for validating nucleotide changes. Genotyping was applied in both Ethiopian and Swedish case–control cohorts. SIFT denotes Sorting Intolerant from Tolerant, PolyPhen-2_pred denotes Polymorphism Phenotyping v2 Prediction, CADD denotes Combined Annotation Dependent Depletion, and GERP++ denotes Genomic Evolutionary Rate Profiling. This figure was created with BioRender.com. (b) Pedigree chart from the Ethiopian family studied. Basic annotations are as follows: circles and squares denote females and males, respectively. Orange-, green-, and blue-filled shapes mean affected by AD, allergic rhinitis, and bronchial asthma, respectively. WGS was performed for the samples marked with asterisks. A black dot represents a carrier of both heterozygous variants: NOD2 p.A918S(GT) and FLG2 p.D13Y(CA); a red dot represents a carrier of the FLG2 p.D13Y variant. (c) Filtering strategy for genetic variants found in the Ethiopian family. (d) Location of the variants in NOD2 and FLG2. R denotes repeat. Figure 1a and d were created with BioRender.com. AR, allergic rhinitis; AD, atopic dermatitis; BA, bronchial asthma; GATK, Genome Analysis Toolkit; LRR, leucine-rich repeat; NBD, nucleotide binding domain; WGS, whole-genome sequencing.

Figure 2. Variant validation through Sanger sequencing. Sanger sequencing results show the 2 missense variants p.A918S in NOD2 and p.D13Y in FLG2. A denotes alanine, S denotes serine, and D denotes aspartic acid. WT, wild type.

Figure 4. Immunohistochemical staining for FLG2, NOD2, and FLG in skin biopsies from 3 Ethiopian patients and a healthy control. (a) The first patient was a noncarrier of the variants, the second was a carrier of the FLG2 p.D13Y and NOD2 p.A849V variants, and the third was a carrier of the NOD2 p.G908R variant only. Brown 3,3′-diaminobenzidine staining revealed variations in epidermal cell expression levels. Bars = 100 μm. (b) Structure and skin cell composition of the epidermis.


Table 2. Characteristics of the Variants Studied

Table 3. Genotyping of Selected Variants in Ethiopian and Swedish Case–Control Cohorts

Table 4. Pathogenicity of Missense Variants Associated with AD in Ethiopian Patients Using AlphaMissense Predictions

Table 5. Demographic and Clinical Characteristics of Individuals in the Ethiopian Family

Table 6. Primers Used for Genomic DNA Sanger Sequencing