A complete list of published work can be found on Google Scholar.
Selected publications
Sakaue, S. (2024). SCENT defines non-coding disease mechanisms using single-cell multi-omics. Nat Rev Genet, 25(9), 597.
@article{pmid38816646,
author = {Sakaue, S.},
title = {{{S}{C}{E}{N}{T} defines non-coding disease mechanisms using single-cell multi-omics}},
journal = {Nat Rev Genet},
year = {2024},
volume = {25},
number = {9},
pages = {597},
month = sep,
file = {s41576-024-00747-5.pdf},
doi = {10.1038/s41576-024-00747-5}
}
Sakaue, S., Weinand, K., Isaac, S., Dey, K. K., Jagadeesh, K., Kanai, M., Watts, G. F. M., Zhu, Z., Brenner, M. B., McDavid, A., Donlin, L. T., Wei, K., Price, A. L., & Raychaudhuri, S. (2024). Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles. Nat Genet, 56(4), 615–626.
@article{pmid38594305,
author = {Sakaue, S. and Weinand, K. and Isaac, S. and Dey, K. K. and Jagadeesh, K. and Kanai, M. and Watts, G. F. M. and Zhu, Z. and Brenner, M. B. and McDavid, A. and Donlin, L. T. and Wei, K. and Price, A. L. and Raychaudhuri, S.},
title = {Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles},
journal = {Nat Genet},
year = {2024},
volume = {56},
number = {4},
pages = {615--626},
month = apr,
file = {s41588-024-01682-1.pdf},
doi = {10.1038/s41588-024-01682-1}
}
Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer–gene maps from disease-relevant tissues. Building enhancer–gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer–gene maps. These maps were highly enriched for causal variants in expression quantitative trait loci and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer–gene maps, essential for defining noncoding variant function.
Sakaue, S., Gurajala, S., Curtis, M., Luo, Y., Choi, W., Ishigaki, K., Kang, J. B., Rumker, L., Deutsch, A. J., Schönherr, S., Forer, L., LeFaive, J., Fuchsberger, C., Han, B., Lenz, T. L., de Bakker, P. I. W., Okada, Y., Smith, A. V., & Raychaudhuri, S. (2023). Tutorial: a statistical genetics guide to identifying HLA alleles driving complex disease. Nat Protoc, 18(9), 2625–2641.
@article{pmid37495751,
author = {Sakaue, S. and Gurajala, S. and Curtis, M. and Luo, Y. and Choi, W. and Ishigaki, K. and Kang, J. B. and Rumker, L. and Deutsch, A. J. and Sch{\"o}nherr, S. and Forer, L. and LeFaive, J. and Fuchsberger, C. and Han, B. and Lenz, T. L. and de Bakker, P. I. W. and Okada, Y. and Smith, A. V. and Raychaudhuri, S.},
title = {{{T}utorial: a statistical genetics guide to identifying {H}{L}{A} alleles driving complex disease}},
journal = {Nat Protoc},
year = {2023},
volume = {18},
number = {9},
pages = {2625--2641},
month = sep,
file = {s41596-023-00853-4.pdf},
doi = {10.1038/s41596-023-00853-4}
}
The human leukocyte antigen (HLA) locus is associated with more complex diseases than any other locus in the human genome. In many diseases, HLA explains more heritability than all other known loci combined. In silico HLA imputation methods enable rapid and accurate estimation of HLA alleles in the millions of individuals that are already genotyped on microarrays. HLA imputation has been used to define causal variation in autoimmune diseases, such as type I diabetes, and in human immunodeficiency virus infection control. However, there are few guidelines on performing HLA imputation, association testing, and fine mapping. Here, we present a comprehensive tutorial to impute HLA alleles from genotype data. We provide detailed guidance on performing standard quality control measures for input genotyping data and describe options to impute HLA alleles and amino acids either locally or using the web-based Michigan Imputation Server, which hosts a multi-ancestry HLA imputation reference panel. We also offer best practice recommendations to conduct association tests to define the alleles, amino acids, and haplotypes that affect human traits. Along with the pipeline, we provide a step-by-step online guide with scripts and available software (https://github.com/immunogenomics/HLA_analyses_tutorial). This tutorial will be broadly applicable to large-scale genotyping data and will contribute to defining the role of HLA in human diseases across global populations.
Ishigaki, K., Sakaue, S., Terao, C., Luo, Y., Sonehara, K., Yamaguchi, K., Amariuta, T., Too, C. L., Laufer, V. A., Scott, I. C., Viatte, S., Takahashi, M., Ohmura, K., Murasawa, A., Hashimoto, M., Ito, H., Hammoudeh, M., Emadi, S. A., Masri, B. K., … Raychaudhuri, S. (2022). Multi-ancestry genome-wide association analyses identify novel genetic mechanisms in rheumatoid arthritis. Nat Genet, 54(11), 1640–1651.
@article{pmid36333501,
author = {Ishigaki, K. and Sakaue, S. and Terao, C. and Luo, Y. and Sonehara, K. and Yamaguchi, K. and Amariuta, T. and Too, C. L. and Laufer, V. A. and Scott, I. C. and Viatte, S. and Takahashi, M. and Ohmura, K. and Murasawa, A. and Hashimoto, M. and Ito, H. and Hammoudeh, M. and Emadi, S. A. and Masri, B. K. and Halabi, H. and Badsha, H. and Uthman, I. W. and Wu, X. and Lin, L. and Li, T. and Plant, D. and Barton, A. and Orozco, G. and Verstappen, S. M. M. and Bowes, J. and MacGregor, A. J. and Honda, S. and Koido, M. and Tomizuka, K. and Kamatani, Y. and Tanaka, H. and Tanaka, E. and Suzuki, A. and Maeda, Y. and Yamamoto, K. and Miyawaki, S. and Xie, G. and Zhang, J. and Amos, C. I. and Keystone, E. and Wolbink, G. and van der Horst-Bruinsma, I. and Cui, J. and Liao, K. P. and Carroll, R. J. and Lee, H. S. and Bang, S. Y. and Siminovitch, K. A. and de Vries, N. and Alfredsson, L. and Rantap{\"a}{\"a}-Dahlqvist, S. and Karlson, E. W. and Bae, S. C. and Kimberly, R. P. and Edberg, J. C. and Mariette, X. and Huizinga, T. and Dieud{\'e}, P. and Schneider, M. and Kerick, M. and Denny, J. C. and Matsuda, K. and Matsuo, K. and Mimori, T. and Matsuda, F. and Fujio, K. and Tanaka, Y. and Kumanogoh, A. and Traylor, M. and Lewis, C. M. and Eyre, S. and Xu, H. and Saxena, R. and Arayssi, T. and Kochi, Y. and Ikari, K. and Harigai, M. and Gregersen, P. K. and Yamamoto, K. and Louis Bridges, S. and Padyukov, L. and Martin, J. and Klareskog, L. and Okada, Y. and Raychaudhuri, S.},
title = {{{M}ulti-ancestry genome-wide association analyses identify novel genetic mechanisms in rheumatoid arthritis}},
journal = {Nat Genet},
year = {2022},
volume = {54},
number = {11},
pages = {1640--1651},
month = nov,
file = {s41588-022-01213-w.pdf},
doi = {10.1038/s41588-022-01213-w}
}
Rheumatoid arthritis (RA) is a highly heritable complex disease with unknown etiology. Multi-ancestry genetic research of RA promises to improve power to detect genetic signals, fine-mapping resolution and performance of polygenic risk scores (PRS). Here, we present a large-scale genome-wide association study (GWAS) of RA, which includes 276,020 samples from five ancestral groups. We conducted a multi-ancestry meta-analysis and identified 124 loci (P < 5 × 10⁻⁸), of which 34 are novel. Candidate genes at the novel loci suggest essential roles of the immune system (for example, TNIP2 and TNFRSF11A) and joint tissues (for example, WISP1) in RA etiology. Multi-ancestry fine-mapping identified putatively causal variants with biological insights (for example, LEF1). Moreover, PRS based on multi-ancestry GWAS outperformed PRS based on single-ancestry GWAS and had comparable performance between populations of European and East Asian ancestries. Our study provides several insights into the etiology of RA and improves the genetic predictability of RA.
Sakaue, S., Hosomichi, K., Hirata, J., Nakaoka, H., Yamazaki, K., Yawata, M., Yawata, N., Naito, T., Umeno, J., Kawaguchi, T., Matsui, T., Motoya, S., Suzuki, Y., Inoko, H., Tajima, A., Morisaki, T., Matsuda, K., Kamatani, Y., Yamamoto, K., … Okada, Y. (2022). Decoding the diversity of killer immunoglobulin-like receptors by deep sequencing and a high-resolution imputation method. Cell Genom, 2(3), 100101.
@article{pmid36777335,
author = {Sakaue, S. and Hosomichi, K. and Hirata, J. and Nakaoka, H. and Yamazaki, K. and Yawata, M. and Yawata, N. and Naito, T. and Umeno, J. and Kawaguchi, T. and Matsui, T. and Motoya, S. and Suzuki, Y. and Inoko, H. and Tajima, A. and Morisaki, T. and Matsuda, K. and Kamatani, Y. and Yamamoto, K. and Inoue, I. and Okada, Y.},
title = {{{D}ecoding the diversity of killer immunoglobulin-like receptors by deep sequencing and a high-resolution imputation method}},
journal = {Cell Genom},
year = {2022},
volume = {2},
number = {3},
pages = {100101},
month = mar,
file = {1-s2.0-S2666979X22000180-main.pdf},
doi = {10.1016/j.xgen.2022.100101}
}
The killer cell immunoglobulin-like receptor (KIR) recognizes human leukocyte antigen (HLA) class I molecules and modulates the function of natural killer cells. Despite its role in immunity, the complex genomic structure has limited a deep understanding of the KIR genomic landscape. Here we conduct deep sequencing of 16 KIR genes in 1,173 individuals. We devise a bioinformatics pipeline incorporating copy number estimation and insertion or deletion (indel) calling for high-resolution KIR genotyping. We define 118 alleles in 13 genes and demonstrate a linkage disequilibrium structure within and across KIR centromeric and telomeric regions. We construct a KIR imputation reference panel (n_reference = 689, imputation accuracy = 99.7%), apply it to biobank genotype (n_total = 169,907), and perform phenome-wide association studies of 85 traits. We observe a dearth of genome-wide significant associations, even in immune traits implicated previously to be associated with KIR (the smallest p = 1.5 × 10⁻⁴). Our pipeline presents a broadly applicable framework to evaluate innate immunity in large-scale datasets.
Sakaue, S., Kanai, M., Tanigawa, Y., Karjalainen, J., Kurki, M., Koshiba, S., Narita, A., Konuma, T., Yamamoto, K., Akiyama, M., Ishigaki, K., Suzuki, A., Suzuki, K., Obara, W., Yamaji, K., Takahashi, K., Asai, S., Takahashi, Y., Suzuki, T., … Okada, Y. (2021). A cross-population atlas of genetic associations for 220 human phenotypes. Nat Genet, 53(10), 1415–1424.
@article{pmid34594039,
author = {Sakaue, S. and Kanai, M. and Tanigawa, Y. and Karjalainen, J. and Kurki, M. and Koshiba, S. and Narita, A. and Konuma, T. and Yamamoto, K. and Akiyama, M. and Ishigaki, K. and Suzuki, A. and Suzuki, K. and Obara, W. and Yamaji, K. and Takahashi, K. and Asai, S. and Takahashi, Y. and Suzuki, T. and Shinozaki, N. and Yamaguchi, H. and Minami, S. and Murayama, S. and Yoshimori, K. and Nagayama, S. and Obata, D. and Higashiyama, M. and Masumoto, A. and Koretsune, Y. and Ito, K. and Terao, C. and Yamauchi, T. and Komuro, I. and Kadowaki, T. and Tamiya, G. and Yamamoto, M. and Nakamura, Y. and Kubo, M. and Murakami, Y. and Yamamoto, K. and Kamatani, Y. and Palotie, A. and Rivas, M. A. and Daly, M. J. and Matsuda, K. and Okada, Y.},
title = {{{A} cross-population atlas of genetic associations for 220 human phenotypes}},
journal = {Nat Genet},
year = {2021},
volume = {53},
number = {10},
pages = {1415--1424},
month = oct,
doi = {10.1038/s41588-021-00931-x},
file = {s41588-021-00931-x.pdf}
}
Current genome-wide association studies do not yet capture sufficient diversity in populations and scope of phenotypes. To expand an atlas of genetic associations in non-European populations, we conducted 220 deep-phenotype genome-wide association studies (diseases, biomarkers and medication usage) in BioBank Japan (n = 179,000), by incorporating past medical history and text-mining of electronic medical records. Meta-analyses with the UK Biobank and FinnGen (n_total = 628,000) identified 5,000 new loci, which improved the resolution of the genomic map of human traits. This atlas elucidated the landscape of pleiotropy as represented by the major histocompatibility complex locus, where we conducted HLA fine-mapping. Finally, we performed statistical decomposition of matrices of phenome-wide summary statistics, and identified latent genetic components, which pinpointed responsible variants and biological mechanisms underlying current disease classifications across populations. The decomposed components enabled genetically informed subtyping of similar diseases (for example, allergic diseases). Our study suggests a potential avenue for hypothesis-free re-investigation of human diseases through genetics.
Sakaue, S., Yamaguchi, E., Inoue, Y., Takahashi, M., Hirata, J., Suzuki, K., Ito, S., Arai, T., Hirose, M., Tanino, Y., Nikaido, T., Ichiwata, T., Ohkouchi, S., Hirano, T., Takada, T., Miyawaki, S., Dofuku, S., Maeda, Y., Nii, T., … Okada, Y. (2021). Genetic determinants of risk in autoimmune pulmonary alveolar proteinosis. Nat Commun, 12(1), 1032.
@article{pmid33589587,
author = {Sakaue, S. and Yamaguchi, E. and Inoue, Y. and Takahashi, M. and Hirata, J. and Suzuki, K. and Ito, S. and Arai, T. and Hirose, M. and Tanino, Y. and Nikaido, T. and Ichiwata, T. and Ohkouchi, S. and Hirano, T. and Takada, T. and Miyawaki, S. and Dofuku, S. and Maeda, Y. and Nii, T. and Kishikawa, T. and Ogawa, K. and Masuda, T. and Yamamoto, K. and Sonehara, K. and Tazawa, R. and Morimoto, K. and Takaki, M. and Konno, S. and Suzuki, M. and Tomii, K. and Nakagawa, A. and Handa, T. and Tanizawa, K. and Ishii, H. and Ishida, M. and Kato, T. and Takeda, N. and Yokomura, K. and Matsui, T. and Watanabe, M. and Inoue, H. and Imaizumi, K. and Goto, Y. and Kida, H. and Fujisawa, T. and Suda, T. and Yamada, T. and Satake, Y. and Ibata, H. and Hizawa, N. and Mochizuki, H. and Kumanogoh, A. and Matsuda, F. and Nakata, K. and Hirota, T. and Tamari, M. and Okada, Y.},
title = {{{G}enetic determinants of risk in autoimmune pulmonary alveolar proteinosis}},
journal = {Nat Commun},
year = {2021},
volume = {12},
number = {1},
pages = {1032},
month = feb,
file = {s41467-021-21011-y.pdf},
doi = {10.1038/s41467-021-21011-y}
}
Pulmonary alveolar proteinosis (PAP) is a devastating lung disease caused by abnormal surfactant homeostasis, with a prevalence of 6–7 cases per million population worldwide. While mutations causing hereditary PAP have been reported, the genetic basis contributing to autoimmune PAP (aPAP) has not been thoroughly investigated. Here, we conducted a genome-wide association study of aPAP in 198 patients and 395 control participants of Japanese ancestry. The common genetic variant, rs138024423 at 6p21, in the major-histocompatibility-complex (MHC) region was significantly associated with disease risk (odds ratio [OR] = 5.2; P = 2.4 × 10⁻¹²). HLA fine-mapping revealed that the common HLA class II allele, HLA-DRB1*08:03, strongly drove this signal (OR = 4.8; P = 4.8 × 10⁻¹²), followed by an additional independent risk allele at HLA-DPβ1 amino acid position 8 (OR = 0.28; P = 3.4 × 10⁻⁷). HLA-DRB1*08:03 was also associated with an increased level of anti-GM-CSF antibody, a key driver of the disease (β = 0.32; P = 0.035). Our study demonstrated a heritable component of aPAP, suggesting an underlying genetic predisposition toward abnormal antibody production.
Sakaue, S., Kanai, M., Karjalainen, J., Akiyama, M., Kurki, M., Matoba, N., Takahashi, A., Hirata, M., Kubo, M., Matsuda, K., Murakami, Y., Daly, M. J., Kamatani, Y., & Okada, Y. (2020). Trans-biobank analysis with 676,000 individuals elucidates the association of polygenic risk scores of complex traits with human lifespan. Nat Med, 26(4), 542–548.
@article{pmid32251405,
author = {Sakaue, S. and Kanai, M. and Karjalainen, J. and Akiyama, M. and Kurki, M. and Matoba, N. and Takahashi, A. and Hirata, M. and Kubo, M. and Matsuda, K. and Murakami, Y. and Daly, M. J. and Kamatani, Y. and Okada, Y.},
title = {{{T}rans-biobank analysis with 676,000 individuals elucidates the association of polygenic risk scores of complex traits with human lifespan}},
journal = {Nat Med},
year = {2020},
volume = {26},
number = {4},
pages = {542--548},
month = apr,
file = {s41591-020-0785-8.pdf},
doi = {10.1038/s41591-020-0785-8}
}
While polygenic risk scores (PRSs) are poised to be translated into clinical practice through prediction of inborn health risks, a strategy to utilize genetics to prioritize modifiable risk factors driving health outcomes is warranted. To this end, we investigated the association of the genetic susceptibility to complex traits with human lifespan in collaboration with three worldwide biobanks (n_total = 675,898; BioBank Japan (n = 179,066), UK Biobank (n = 361,194) and FinnGen (n = 135,638)). In contrast to observational studies, in which discerning cause and effect can be difficult, PRSs could help to identify the driver biomarkers affecting human lifespan. A high systolic blood pressure PRS was trans-ethnically associated with a shorter lifespan (hazard ratio = 1.03 [1.02–1.04], P_meta = 3.9 × 10⁻¹³) and parental lifespan (hazard ratio = 1.06 [1.06–1.07], P = 2.0 × 10⁻⁸⁶). The obesity PRS showed distinct effects on lifespan in Japanese and European individuals (P_heterogeneity = 9.5 × 10⁻⁸ for BMI). The causal effect of blood pressure and obesity on lifespan was further supported by Mendelian randomization studies. Beyond genotype-phenotype associations, our trans-biobank study offers a new value of PRSs in prioritization of risk factors that could be potential targets of medical treatment to improve population health.
Sakaue, S., Hirata, J., Kanai, M., Suzuki, K., Akiyama, M., Lai Too, C., Arayssi, T., Hammoudeh, M., Al Emadi, S., Masri, B. K., Halabi, H., Badsha, H., Uthman, I. W., Saxena, R., Padyukov, L., Hirata, M., Matsuda, K., Murakami, Y., Kamatani, Y., & Okada, Y. (2020). Dimensionality reduction reveals fine-scale structure in the Japanese population with consequences for polygenic risk prediction. Nat Commun, 11(1), 1569.
@article{pmid32218440,
author = {Sakaue, S. and Hirata, J. and Kanai, M. and Suzuki, K. and Akiyama, M. and Lai Too, C. and Arayssi, T. and Hammoudeh, M. and Al Emadi, S. and Masri, B. K. and Halabi, H. and Badsha, H. and Uthman, I. W. and Saxena, R. and Padyukov, L. and Hirata, M. and Matsuda, K. and Murakami, Y. and Kamatani, Y. and Okada, Y.},
title = {{{D}imensionality reduction reveals fine-scale structure in the {J}apanese population with consequences for polygenic risk prediction}},
journal = {Nat Commun},
year = {2020},
volume = {11},
number = {1},
pages = {1569},
month = mar,
file = {s41467-020-15194-z.pdf},
doi = {10.1038/s41467-020-15194-z}
}
The diversity in our genome is crucial to understanding the demographic history of worldwide populations. However, we have yet to know whether subtle genetic differences within a population can be disentangled, or whether they have an impact on complex traits. Here we apply dimensionality reduction methods (PCA, t-SNE, PCA-t-SNE, UMAP, and PCA-UMAP) to biobank-derived genomic data of a Japanese population (n = 169,719). Dimensionality reduction reveals fine-scale population structure, conspicuously differentiating adjacent insular subpopulations. We further elucidate the demographic landscape of these Japanese subpopulations using population genetics analyses. Finally, we perform phenome-wide polygenic risk score (PRS) analyses on 67 complex traits. Differences in PRS between the deconvoluted subpopulations are not always concordant with those in the observed phenotypes, suggesting that the PRS differences might reflect biases from the uncorrected structure, in a trait-dependent manner. This study suggests that such an uncorrected structure can be a potential pitfall in the clinical application of PRS.
Preprints
Sakaue, S., Accelerating Medicines Partnership®: RA/SLE Network, & Raychaudhuri, S. (2025). Early and late RNA eQTL are driven by different genetic mechanisms. In bioRxiv. Cold Spring Harbor Laboratory. https://www.biorxiv.org/content/early/2025/02/26/2025.02.24.639351
@unpublished{Sakaue2025.02.24.639351,
author = {Sakaue, Saori and {Accelerating Medicines Partnership{\textregistered}: RA/SLE Network} and Raychaudhuri, Soumya},
title = {Early and late RNA eQTL are driven by different genetic mechanisms},
elocation-id = {2025.02.24.639351},
year = {2025},
doi = {10.1101/2025.02.24.639351},
publisher = {Cold Spring Harbor Laboratory},
url = {https://www.biorxiv.org/content/early/2025/02/26/2025.02.24.639351},
eprint = {https://www.biorxiv.org/content/early/2025/02/26/2025.02.24.639351.full.pdf},
journal = {bioRxiv},
file = {2025.02.24.639351v1.full.pdf}
}
Understanding the genetic regulation of RNA abundance is essential to defining disease mechanisms. However, conventional expression quantitative trait loci (eQTL) studies quantify RNA molecules across the transcript lifecycle. While most eQTL likely affect transcription by altering promoter or enhancer function within the nucleus, it is also possible that they modulate any process after transcription, including chemical modifications and RNA stability in the cytosol. To elucidate distinct eQTL mechanisms of early versus late RNA, we compared eQTL from mature cellular RNA and nascent nuclear RNA in the brain and the kidney. Across tissues, we identified different causal variants for cellular and nuclear eQTL for the same eGene. Cellular eQTL were enriched in transcribed regions (P = 3.3 × 10⁻¹²⁶), suggesting the importance of post-transcriptional regulation. Conversely, nuclear eQTL were enriched in distal regulatory elements (P = 7.0 × 10⁻³²), highlighting the role of DNA transcriptional regulation. For example, we identified stop-gain eQTL variants likely acting through nonsense-mediated decay in cellular eQTL that had no effect in nuclear eQTL. Cellular eQTL were enriched for loci with multiple causal variants in linkage disequilibrium within the transcribed regions, where they may in concert affect RNA stability. We also identified examples of nuclear eQTL variants within enhancers that had no effect in cellular eQTL. We show that such eQTL (e.g., TUBGCP4) sometimes uniquely colocalize with disease alleles (schizophrenia). This study reveals key differences in the genetic mechanisms of cellular and nuclear eQTL.