Case study: RABEP1 xQTL and AD GWAS¶

This notebook documents the analysis of xQTL case study on a targeted gene, RABEP1.

  • Section 0: Sanity check
  • Section 1: Fine-mapping for xQTL and GWAS
  • Section 2: Multi-context colocalization with Bellenguez 2022
  • Section 3: Refinement of colocalized loci with other AD GWAS
  • Section 4: Assessment of multi-context xQTL effect sizes
  • Section 5: Multi-context causal TWAS (including conventional TWAS and MR)
  • Section 6: Context specific multi-gene fine-mapping
  • Section 7: Epigenomic QTL and their target regions
  • Section 8: Context focused validation in other xQTL data
  • Section 9: Non-linear effects of xQTL
  • Section 10: in silico functional studies in iPSC model
  • Section 11: Functional annotations of selected loci
  • Section 12: Candidate loci as trans-xQTL

Overview¶

FunGen-xQTL resource contains 67 xQTL contexts as well as 9 AD GWAS fine-mapped genome-wide. The overarching goal for case studies is to use these resource to raise questions and learn more about gene targets of interest.

Overall, a case study consists of the following aspects:

  • Check the basic information of the gene
  • Check the existing xQTL and integrative analysis results, roughly including
    • Summary table for univariate fine-mapping
    • Marginal association results
    • Multi-gene and multi-context fine-mapping
    • Multi-context colocalization with AD GWAS
    • TWAS, MR and causal TWAS
    • Integration with epigenetic QTL
    • Quantile QTL
    • Interaction QTL
    • Validation:
      • Additional xQTL data in FunGen-xQTL
      • Additional AD GWAS data-set already generated by FunGen-xQTL
    • In silico functional studies
      • Additional iPSC data-sets
    • Functional annotations of variants, particularly in relevant cellular contexts
  • Creative thinking: generate hypothesis, search in literature, raise questions to discuss

Computing environment setup¶

In [109]:
region_p
No description has been provided for this image
In [121]:
pip_p
No description has been provided for this image

Section 0: Sanity check¶

Check the basic information of the gene¶

  • To gain a preliminary understanding of this gene’s expression—specifically, whether it is cell-specific—can help us quickly determine if our results are consistent with expectations.

Useful websites:

  1. check gene function, (immune) cell type specificity, tissue specifity, protein location: https://www.proteinatlas.org
  2. check gene position and structure: https://www.ncbi.nlm.nih.gov/gene/
  3. other collective information: https://www.genecards.org

Check the existing results which are inputs to this analysis¶

In [1]:
source('/data/interactive_analysis/rf2872/codes/cb_plot.R')
source('/data/interactive_analysis/rf2872/codes/utilis.R')
for(file in list.files("/data/colocalization/colocboost/R", pattern = ".R", full.names = T)){
          source(file)
        }
gene_name = 'RABEP1'

dir.create(paste0('plots/', gene_name), recursive = T)

get basic target gene information

In [2]:
target_gene_info <- get_gene_info(gene_name = gene_name)
target_gene_info
$gene_info
A data.table: 1 x 14
region_id#chrstartendTSSLD_matrix_idLD_sumstats_idLD_sumstats_id_oldTADB_indexTADB_idgene_startgene_endsliding_windowsgene_name
<chr><chr><dbl><dbl><int><chr><chr><chr><chr><chr><int><int><chr><chr>
ENSG00000029725chr17428226484000005282264chr17:3524642-4611485,chr17:4611485-5782462,chr17:5782462-7411566,chr17:7411566-920191317_3524642-4611485,17_4611485-5782462,17_5782462-7411566,17_7411566-920191317_3524642-4611485,17_4611485_5782462,17_5782462_7411566,17_7411566_9201913TADB_1182,TADB_1183chr17_1059843_6175034,chr17_3710595_939782652822655386340chr17:0-9397826,chr17:1059843-10729687,chr17:3710595-13143830,chr17:6187187-17353469,chr17:8250471-20190245RABEP1
$target_LD_ids
A matrix: 1 x 4 of type chr
chr17:3524642-4611485chr17:4611485-5782462chr17:5782462-7411566chr17:7411566-9201913
$target_sums_ids
A matrix: 1 x 4 of type chr
17_3524642-461148517_4611485-578246217_5782462-741156617_7411566-9201913
$gene_region
'chr17:4282264-8400000'
$target_TAD_ids
A matrix: 1 x 2 of type chr
chr17_1059843_6175034chr17_3710595_9397826
In [3]:
gene_id = target_gene_info$gene_info$region_id
chrom = target_gene_info$gene_info$`#chr`

Take a quick look for the expression of target gene in ROSMAP bulk data, we don't want them to be too low

In [4]:
source('/data/interactive_analysis/rf2872/codes/utilis.R')
expression_in_rosmap_bulk(target_gene_info)
No description has been provided for this image

Section 1: Fine-mapping for xQTL and GWAS¶

see notebook

In [5]:
region_p

Bellenguez et al GWAS signals has many overlap with CS from other xQTL sources. This motivates us to look further. The figure above shows the ranges of CS to give us a loci level idea. Below, we show the variants in those CS, color-coding the variants that are shared between them in the same color. In particular, AD GWAS signals are also captured by a few xQTL data, although at this point we don't have formal statistical (colocalization) evidences for these observations yet.

In [6]:
pip_p

Section 2: Multi-context colocalization with Bellenguez 2022¶

This is done using ColocBoost. The most updated version of ColocBoost results are under path s3://statfungen/ftp_fgc_xqtl/analysis_result/ColocBoost/2024_9/

In [7]:
cb_res <- readRDS(paste0("/data/analysis_result/ColocBoost/2024_9/",gene_id,"_res.rds") )
In [8]:
#save colocboost results
cb_res_table <- get_cb_summary(cb_res) 

saveRDS(cb_res_table, paste0(gene_name, "_colocboost_res.rds"))
In [9]:
cb <- plot_cb(cb_res = cb_res, cex.pheno = 1.5, x.phen = -0.2)
No description has been provided for this image
In [ ]:
options(repr.plot.width=6, repr.plot.height=6)

ggplot(RABEP1_int_res, aes(x = variant_id, y = qvalue_interaction)) +
  geom_point(alpha = 0.7, size = 6) +
  labs(title = "qvalue for RABEP1 csets in interaction association nalysis",
       x = "Gene Name",
       y = "qvalue_interaction",
       size = "qvalue_interaction") +
  theme_minimal(base_size = 14) +
  theme(panel.background = element_blank(),
        panel.grid.major = element_line(color = "grey80"),
        legend.position = NULL,
        axis.text.x = element_text(angle = 45, hjust = 1))  + ylim(0,1)
  # scale_color_manual(values = colorRampPalette(brewer.pal(8, "Set1"))(length(unique(flat_var$gene_name))))
ggsave('plots/RABEP1/sec11.interaction_association_RABEP1_lessPIP25.pdf', height = 5, width = 8) 
In [11]:
# colocalized variants
cb_res_table
A data.frame: 2 x 8
colocalized phenotypespurity# variantshighest VCPcolocalized indexcolocalized variantsmax_abs_z_variantcset_id
<chr><dbl><dbl><dbl><chr><chr><chr><chr>
Ast; Exc; Inh; AC; PCC; pQTL0.8004688340.15981375933; 5403; 5463; 5430; 5924; 5932; 5472; 6046; 5442; 5436; 6010; 5480; 6022; 5460; 5971; 5525; 6254; 6280; 5883; 5911; 5898; 5817; 5596; 6076; 6077; 5844; 5499; 5528; 5559; 5592; 5564; 5570; 5506; 5593 chr17:5359454:A:G; chr17:5358271:G:GTTA; chr17:5378694:T:C; chr17:5359234:T:C; chr17:5278766:C:T; chr17:5367119:T:C; chr17:5276110:C:T; chr17:5405283:G:A; chr17:5410487:TG:T; chr17:5372623:G:A; chr17:5280229:G:A; chr17:5267575:T:C; chr17:5271449:G:A; chr17:5356361:C:A; chr17:5298778:T:C; chr17:5345931:G:A; chr17:5374303:C:A; chr17:5334512:A:G; chr17:5350822:C:G; chr17:5273595:G:A; chr17:5282369:T:C; chr17:5272779:T:A; chr17:5298656:ATT:AT; chr17:5339674:CT:C; chr17:5286950:A:G; chr17:5283597:G:A; chr17:5275572:CT:CTT; chr17:5293087:C:T; chr17:5298227:G:A; chr17:5294580:CT:C; chr17:5383759:TTTATC:T; chr17:5383760:TTA:*; chr17:5287167:G:A; chr17:5291886:A:G chr17:5383759:TTTATC:Tcoloc_sets:Y2_Y5_Y6_Y8_Y9_Y11:CS1
Mic; Oli; AD_Bellenguez_20220.5972950680.43338585215; 5154; 5317; 5160; 5158; 5171; 5967; 5474; 5628; 5165; 5166; 5666; 5704; 5882; 5865; 5565; 5588; 5877; 5899; 5860; 5841; 5835; 5567; 5848; 5913; 5518; 5866; 5763; 5938; 5703; 5838; 5765; 5766; 5787; 5770; 5727; 5070; 5934; 5049; 5046; 5925; 5690; 5691; 5700; 5611; 5612; 5949; 5715; 5863; 4795; 4713; 5742; 4912; 5125; 5224; 5286; 5302; 5140; 5710; 5635; 5681; 5615; 5634; 5483; 5724; 5743; 5711; 5550chr17:5229833:T:C; chr17:5243359:T:TA; chr17:5235685:G:A; chr17:5241401:A:G; chr17:5235009:C:T; chr17:5251170:CT:CTTT; chr17:5252624:C:T; chr17:5255419:T:C; chr17:5233752:G:A; chr17:5232065:A:C; chr17:5214511:G:A; chr17:5215656:T:G; chr17:5220566:G:A; chr17:5237104:G:T; chr17:5319346:A:C; chr17:5300901:A:G; chr17:5301435:G:A; chr17:5317030:G:A; chr17:5317045:C:T; chr17:5318016:G:A; chr17:5363707:G:T; chr17:5319369:TA:T; chr17:5341776:G:A; chr17:5306542:T:C; chr17:5316491:T:A; chr17:5302275:C:T; chr17:5365888:T:C; chr17:5337701:A:G; chr17:5306215:CAGAA:AAGAA; chr17:5279135:G:A; chr17:5281172:T:G; chr17:5322499:C:T; chr17:5324584:A:G; chr17:5324697:G:A; chr17:5319350:CAAAAA:C; chr17:5327572:G:A; chr17:5286121:T:C; chr17:5290954:A:G; chr17:5293114:C:T; chr17:5236512:G:A; chr17:5236513:C:T; chr17:5305217:T:TG; chr17:5360990:G:A; chr17:5328117:G:A; chr17:5328323:A:G; chr17:5313703:A:G; chr17:5359952:G:A; chr17:5318880:G:C; chr17:5330522:C:T; chr17:5345387:G:T; chr17:5342744:A:T; chr17:5342855:A:G; chr17:5328907:C:T; chr17:5323543:A:AT; chr17:5296937:C:T; chr17:5344437:C:A; chr17:5351138:GAC:G; chr17:5341558:G:A; chr17:5358448:A:G; chr17:5318776:C:T; chr17:5338555:C:T; chr17:5337649:T:A; chr17:5293315:C:T; chr17:5340167:A:G; chr17:5138409:A:G; chr17:5356470:C:G; chr17:5125400:G:GT; chr17:5177049:A:Cchr17:5229833:T:C coloc_sets:Y1_Y3_Y15:MergeCS1
In [12]:
# effect sign for each coloc sets
get_effect_sign_csets(cb_res)
$`coloc_sets:Y2_Y5_Y6_Y8_Y9_Y11:CS1`
A data.frame: 34 x 7
variantsAstExcInhACPCCpQTL
<chr><dbl><dbl><dbl><dbl><dbl><dbl>
chr17:5359454:A:Gchr17:5359454:A:G -3.715741-4.979460-4.995939 15.20998 15.31915 12.31771
chr17:5267575:T:Cchr17:5267575:T:C -4.731993-4.394953-4.696897 12.97605 14.08966 11.76727
chr17:5276110:C:Tchr17:5276110:C:T 4.455271 4.787557 4.721195-14.13540-14.48959-12.16723
chr17:5271449:G:Achr17:5271449:G:A 4.278002 4.625743 4.659380-14.04984-14.60270-12.38393
chr17:5358271:G:GTTAchr17:5358271:G:GTTA -3.645044-4.958211-5.036522 15.22943 15.14271 11.96296
chr17:5359234:T:Cchr17:5359234:T:C -3.948051-4.776894-4.927096 15.06744 14.97976 12.21780
chr17:5278766:C:Tchr17:5278766:C:T -3.946444-4.802468-4.831753 14.11431 14.75761 12.21210
chr17:5378694:T:Cchr17:5378694:T:C -3.359199-4.363433-4.323898 16.02462 16.04294 12.50741
chr17:5273595:G:Achr17:5273595:G:A 4.195881 4.639293 4.682148-13.97326-14.42146-12.31565
chr17:5272779:T:Achr17:5272779:T:A 4.192765 4.618376 4.691182-14.01101-14.39611-12.31565
chr17:5372623:G:Achr17:5372623:G:A -3.256514-4.102272-4.198589 15.82085 15.96749 13.08882
chr17:5280229:G:Achr17:5280229:G:A -3.907854-4.683766-4.596392 14.18343 14.76471 12.30127
chr17:5374303:C:Achr17:5374303:C:A -3.331431-3.959980-4.301004 15.78624 15.80720 13.05944
chr17:5275572:CT:CTTchr17:5275572:CT:CTT 4.298248 4.830544 4.738207-13.71224-14.09672-12.03223
chr17:5367119:T:Cchr17:5367119:T:C -3.990546-4.675436-4.630625 14.83747 15.25827 11.95764
chr17:5286950:A:Gchr17:5286950:A:G 4.485925 4.975570 4.522980-14.14359-14.50978-11.29176
chr17:5405283:G:Achr17:5405283:G:A -3.216193-3.993719-4.548877 15.97231 15.73064 12.68335
chr17:5410487:TG:Tchr17:5410487:TG:T -3.216193-3.993719-4.548877 15.97231 15.73064 12.68335
chr17:5345931:G:Achr17:5345931:G:A -4.318291-5.095105-4.745683 14.17232 14.46912 11.38249
chr17:5356361:C:Achr17:5356361:C:A -4.305448-5.069807-4.768804 14.22344 14.52214 11.23184
chr17:5350822:C:Gchr17:5350822:C:G -4.305448-5.069807-4.768804 14.07878 14.50681 11.28134
chr17:5334512:A:Gchr17:5334512:A:G -4.180774-5.034225-4.521822 14.37477 14.48051 11.40504
chr17:5298778:T:Cchr17:5298778:T:C -4.245913-4.944436-4.865619 14.18081 14.51864 11.09431
chr17:5383759:TTTATC:Tchr17:5383759:TTTATC:T-3.185448-3.933164-3.893642 15.94556 16.10926 12.78796
chr17:5383760:TTA:*chr17:5383760:TTA:* -3.185448-3.933164-3.893642 15.94556 16.10926 12.78796
chr17:5339674:CT:Cchr17:5339674:CT:C -4.064495-4.906945-4.763990 14.21078 14.52214 11.35655
chr17:5282369:T:Cchr17:5282369:T:C -4.179875-4.976664-4.703051 14.20808 14.55017 11.14404
chr17:5287167:G:Achr17:5287167:G:A 4.251462 4.924872 4.618349-14.22344-14.52214-11.23184
chr17:5291886:A:Gchr17:5291886:A:G 4.251462 4.924872 4.618349-14.22344-14.52214-11.23184
chr17:5298227:G:Achr17:5298227:G:A -4.252675-4.924840-4.908130 13.72524 13.86328 10.89662
chr17:5293087:C:Tchr17:5293087:C:T -4.252675-4.924840-4.908130 13.72524 13.86328 10.84194
chr17:5294580:CT:Cchr17:5294580:CT:C -4.252675-4.924840-4.908130 13.67972 13.78415 10.60261
chr17:5283597:G:Achr17:5283597:G:A -4.180482-4.989203-5.049722 13.65083 13.82760 10.66898
chr17:5298656:ATT:ATchr17:5298656:ATT:AT -4.122928-4.952047-4.659619 13.70835 14.51708 10.25043
$`coloc_sets:Y1_Y3_Y15:MergeCS1`
A data.frame: 68 x 4
variantsMicOliAD_Bellenguez_2022
<chr><dbl><dbl><dbl>
chr17:5241401:A:Gchr17:5241401:A:G 9.546647-5.6539626.983607
chr17:5233752:G:Achr17:5233752:G:A 9.501270-5.4790107.016393
chr17:5255419:T:Cchr17:5255419:T:C 9.519035-5.5248936.918699
chr17:5235685:G:Achr17:5235685:G:A 9.565053-5.3768217.000000
chr17:5235009:C:Tchr17:5235009:C:T 9.536039-5.1667856.943089
chr17:5237104:G:Tchr17:5237104:G:T 9.314725-5.1735946.991803
chr17:5365888:T:Cchr17:5365888:T:C 7.940102-6.4288846.454545
chr17:5279135:G:Achr17:5279135:G:A 8.259632-6.2993696.270492
chr17:5305217:T:TGchr17:5305217:T:TG 8.078104-6.1537656.622429
chr17:5236512:G:Achr17:5236512:G:A 9.173383-4.8731387.182540
chr17:5236513:C:Tchr17:5236513:C:T 9.173383-4.8731387.024390
chr17:5313703:A:Gchr17:5313703:A:G 8.078104-6.1537656.352459
chr17:5318880:G:Cchr17:5318880:G:C 8.078104-6.1537656.344262
chr17:5345387:G:Tchr17:5345387:G:T 8.078104-6.1537656.336066
chr17:5342744:A:Tchr17:5342744:A:T 8.078104-6.1537656.336066
chr17:5293114:C:Tchr17:5293114:C:T 8.078104-6.1537656.319672
chr17:5296937:C:Tchr17:5296937:C:T 8.078104-6.1537656.311475
chr17:5344437:C:Achr17:5344437:C:A 8.078104-6.1537656.303279
chr17:5351138:GAC:Gchr17:5351138:GAC:G8.078104-6.1537656.296000
chr17:5341558:G:Achr17:5341558:G:A 8.078104-6.1537656.286885
chr17:5338555:C:Tchr17:5338555:C:T 8.078104-6.1537656.270492
chr17:5337649:T:Achr17:5337649:T:A 8.078104-6.1537656.262295
chr17:5293315:C:Tchr17:5293315:C:T 8.078104-6.1537656.252033
chr17:5340167:A:Gchr17:5340167:A:G 8.078104-6.1537656.254098
chr17:5356470:C:Gchr17:5356470:C:G 8.078104-6.1537656.195122
chr17:5286121:T:Cchr17:5286121:T:C 8.140700-6.1584286.040650
chr17:5342855:A:Gchr17:5342855:A:G 8.108684-5.9879216.327869
chr17:5327572:G:Achr17:5327572:G:A 7.768881-6.1718616.360656
chr17:5360990:G:Achr17:5360990:G:A 7.722492-6.0960356.504132
chr17:5318776:C:Tchr17:5318776:C:T 7.834962-5.9372876.278689
...............
chr17:5215656:T:Gchr17:5215656:T:G 9.234237-3.9917497.115702
chr17:5214511:G:Achr17:5214511:G:A 9.234237-3.9917497.115702
chr17:5358448:A:Gchr17:5358448:A:G 8.445012-4.8396566.372881
chr17:5317030:G:Achr17:5317030:G:A 6.638342-6.5976364.690265
chr17:5317045:C:Tchr17:5317045:C:T 6.638342-6.5976364.681416
chr17:5318016:G:Achr17:5318016:G:A 6.638342-6.5976364.637168
chr17:5300901:A:Gchr17:5300901:A:G 6.638342-6.5976364.637168
chr17:5301435:G:Achr17:5301435:G:A 6.638342-6.5976364.628319
chr17:5363707:G:Tchr17:5363707:G:T 6.638342-6.5976364.522124
chr17:5319369:TA:Tchr17:5319369:TA:T 6.460354-6.5293034.903622
chr17:5341776:G:Achr17:5341776:G:A 6.619266-6.5325674.584071
chr17:5138409:A:Gchr17:5138409:A:G 8.404479-3.9251106.568182
chr17:5125400:G:GTchr17:5125400:G:GT 8.406748-3.9085656.492424
chr17:5324584:A:Gchr17:5324584:A:G 6.419290-6.1952264.654867
chr17:5177049:A:Cchr17:5177049:A:C 8.213958-3.5793296.457364
chr17:5229833:T:Cchr17:5229833:T:C -9.591174 4.3326700.000000
chr17:5243359:T:TAchr17:5243359:T:TA 9.582396-5.3416470.000000
chr17:5251170:CT:CTTTchr17:5251170:CT:CTTT 9.405356-5.2212700.000000
chr17:5252624:C:Tchr17:5252624:C:T 9.519035-5.5248930.000000
chr17:5232065:A:Cchr17:5232065:A:C 9.275668-4.0229350.000000
chr17:5319346:A:Cchr17:5319346:A:C 6.365804-6.6190170.000000
chr17:5306542:T:Cchr17:5306542:T:C 6.590958-6.5089510.000000
chr17:5316491:T:Achr17:5316491:T:A 6.958697-6.4934510.000000
chr17:5302275:C:Tchr17:5302275:C:T 6.701620-6.4812600.000000
chr17:5306215:CAGAA:AAGAAchr17:5306215:CAGAA:AAGAA 7.921957-6.3956470.000000
chr17:5281172:T:Gchr17:5281172:T:G 8.259632-6.2993690.000000
chr17:5322499:C:Tchr17:5322499:C:T 6.419290-6.1952260.000000
chr17:5324697:G:Achr17:5324697:G:A 6.419290-6.1952260.000000
chr17:5319350:CAAAAA:Cchr17:5319350:CAAAAA:C 8.103984-6.1890820.000000
chr17:5290954:A:Gchr17:5290954:A:G 8.078104-6.1537650.000000
In [13]:
# LD between coloc sets
get_between_purity_simple(cb_res, gene.name = gene_id, path = '/data/colocalization/QTL_data/eQTL/')
A matrix: 1 x 5 of type chr
coloc_csets_1coloc_csets_2min_abs_cormax_abs_cormedian_abs_cor
coloc_sets:Y2_Y5_Y6_Y8_Y9_Y11:CS1coloc_sets:Y1_Y3_Y15:MergeCS10.282658619274760.4143122319920440.360688534423188

Here, different colors refer to different 95% Colocalization Sets (CoS, a metric developed in ColocBoost indicating that there is 95% probabilty that this CoS captures a colocalization event). We only included ROSMAP data for this particular ColocBoost analysis. In this case, we observe cell specific eQTL, bulk sQTL colocalization on ROSMAP data with AD as two separate CoS, suggesting two putative causal signals. We did not detect colocalization with pQTL of statistical significance although from Section 1 there are some overlap with pQTL signals in fine-mapping CS, the overlapped variants in CS have small PIP.

Section 3: Refinement of colocalized loci with other AD GWAS¶

Here we refine the colocalization with other AD GWAS to iron out any heterogeniety between studies (heterogeniety can come from many sources), to get additional candidate loci from these more heterogenous sources as candidates to study.

In [14]:
AD_cohorts <- c('AD_Jansen_2021', 'AD_Bellenguez_EADB_2022', 'AD_Bellenguez_EADI_2022',
             'AD_Kunkle_Stage1_2019', 'AD_Wightman_Excluding23andMe_2021',
             'AD_Wightman_ExcludingUKBand23andME_2021', 'AD_Wightman_Full_2021')
cb_ad <- plot_cb(cb_res = cb_res, cex.pheno = 1.5, x.phen = -0.2, add_gwas = TRUE, gene_id = gene_id, cohorts = AD_cohorts)
No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.
No description has been provided for this image
In [15]:
pdf('plots/RABEP1/sec3.colocboost_res_allad.pdf', width = 10, height = 5)
replayPlot(cb_ad$p)
dev.off()
pdf: 2

Section 4: Assessment of multi-context xQTL effect sizes¶

Option 1: ColocBoost + MASH¶

Use colocboost variants and check for mash posterior contrast to see if the effect size are shared or specific or even opposite. Advantage is that colocboost result is AD GWAS informed; issue is that marginal posterior effects is not always the joint

In [16]:
mash_p <- mash_plot(gene_name = 'RABEP1')
for (plot in mash_p) {
    print(plot)
}

Option 2: mvSuSiE¶

Use mvSuSiE multicontext fine-mapping results --- the bubble plot to check posterior effects. Issue is that we don't have this results yet, and this is limited to one cohort at a time, without information from AD.

We should go for option 1 by default and if we want to make claim about opposite effect size we double-check with mvSuSiE multicontext analysis.

Section 5: Multi-context causal TWAS (including conventional TWAS and MR)¶

The most updated version of cTWAS analysis are under path s3://statfungen/ftp_fgc_xqtl/analysis_result/cTWAS/

TWAS results¶

We report TWAS from all contexts and methods from the pipeline. Here we will filter it down to the best performing methods and only keep contexts that are significant.

In [17]:
plot_TWAS_res(gene_id = gene_id, gene_name = gene_name)
No description has been provided for this image

MR results¶

This is only available for genes that are deemed significant in TWAS and have summary statistics available for effect size and standard errors in GWAS, in addition to z-scores --- current version does not support z-scores although we will soon also support z-scores in MR using MAF from reference panel.

cTWAS results¶

To be updated soon.

Section 6: Context specific multi-gene fine-mapping¶

A quick analysis: using the xQTL-AD summary table (flatten table)¶

We extract from xQTL-AD summary table the variants to get other genes also have CS with the variants shared by target gene and AD.

In [18]:
multigene_flat <- get_multigene_multicontext_flatten('Fungen_xQTL_allQTL.overlapped.gwas.export.RABEP1.rds', sQTL = 'no_MSBB')
multigene_flat
A data.frame: 150 x 6
gene_id#chrstartendgene_namecontexts
<chr><chr><int><int><chr><chr>
ENSG00000004660chr1738930523893053CAMKK1 MiGA_THA_eQTL
ENSG00000004975chr1772345167234517DVL2 MiGA_THA_eQTL
ENSG00000005100chr1754689815468982DHX33 MiGA_GTS_eQTL,ROSMAP_AC_sQTL,ROSMAP_PCC_sQTL
ENSG00000006047chr1772946387294639YBX2 ROSMAP_PCC_sQTL
ENSG00000007168chr1725932092593210PAFAH1B1OPC_Kellis_eQTL
ENSG00000040531chr1736364583636459CTNS MiGA_GFM_eQTL,MiGA_GTS_eQTL,MiGA_SVZ_eQTL,MiGA_THA_eQTL,ROSMAP_DLPFC_sQTL
ENSG00000040633chr1772397217239722PHF23 MiGA_SVZ_eQTL,MiGA_THA_eQTL,ROSMAP_AC_sQTL
ENSG00000065320chr1790215099021510NTN1 MiGA_THA_eQTL
ENSG00000072778chr1772171247217125ACADVL MiGA_SVZ_eQTL,MiGA_THA_eQTL,Ast_10_Kellis_eQTL,ROSMAP_PCC_sQTL
ENSG00000072818chr1773365287336529ACAP1 MiGA_SVZ_eQTL,MiGA_THA_eQTL,BM_36_MSBB_eQTL,ROSMAP_DLPFC_sQTL
ENSG00000072849chr1754868105486811DERL2 MiGA_GTS_eQTL,MiGA_THA_eQTL
ENSG00000074356chr1738462453846246NCBP3 MiGA_SVZ_eQTL,ROSMAP_PCC_sQTL
ENSG00000074370chr1739644633964464ATP2A3 MiGA_GTS_eQTL
ENSG00000074755chr1741430294143030ZZEF1 MiGA_GTS_eQTL,ROSMAP_PCC_sQTL
ENSG00000091592chr1756194235619424NLRP1 ROSMAP_PCC_sQTL,ROSMAP_DLPFC_sQTL,STARNET_eQTL
ENSG00000091622chr1765565546556555PITPNM3 MiGA_SVZ_eQTL,ROSMAP_AC_sQTL,ROSMAP_PCC_sQTL
ENSG00000091640chr1749678164967817SPAG7 Knight_eQTL,BM_10_MSBB_eQTL,BM_22_MSBB_eQTL,Exc_DeJager_eQTL,DLPFC_DeJager_eQTL,PCC_DeJager_eQTL,Exc_mega_eQTL,Inh_mega_eQTL,DLPFC_Bennett_pQTL,monocyte_ROSMAP_eQTL,ROSMAP_DLPFC_sQTL,ROSMAP_PCC_sQTL
ENSG00000108509chr1749876744987675CAMTA2 MiGA_GTS_eQTL,MiGA_THA_eQTL,ROSMAP_AC_sQTL,ROSMAP_DLPFC_sQTL,ROSMAP_PCC_sQTL
ENSG00000108515chr1749480914948092ENO3 MiGA_SVZ_eQTL,AC_DeJager_eQTL,ROSMAP_DLPFC_sQTL,ROSMAP_PCC_sQTL
ENSG00000108518chr1749490604949061PFN1 MiGA_SVZ_eQTL,ROSMAP_PCC_sQTL
ENSG00000108523chr1749400074940008RNF167 MiGA_THA_eQTL,ROSMAP_AC_sQTL,ROSMAP_DLPFC_sQTL,ROSMAP_PCC_sQTL
ENSG00000108528chr1749400524940053SLC25A11ROSMAP_AC_sQTL,ROSMAP_DLPFC_sQTL,ROSMAP_PCC_sQTL
ENSG00000108556chr1749344374934438CHRNE MiGA_THA_eQTL,BM_36_MSBB_eQTL,AC_DeJager_eQTL,ROSMAP_DLPFC_sQTL
ENSG00000108559chr1754196755419676NUP88 MiGA_GFM_eQTL,MiGA_THA_eQTL,BM_10_MSBB_eQTL,BM_22_MSBB_eQTL,BM_44_MSBB_eQTL,AC_DeJager_eQTL,ROSMAP_AC_sQTL,ROSMAP_PCC_sQTL,ROSMAP_DLPFC_sQTL,STARNET_eQTL
ENSG00000108561chr1754488295448830C1QBP MiGA_SVZ_eQTL,Exc_DeJager_eQTL,Inh_DeJager_eQTL,Oli_Kellis_eQTL,Exc_Kellis_eQTL,Inh_Kellis_eQTL,Exc_mega_eQTL,Inh_mega_eQTL,Oli_mega_eQTL,ROSMAP_AC_sQTL,STARNET_eQTL
ENSG00000108590chr1766516336651634MED31 MiGA_THA_eQTL,BM_10_MSBB_eQTL
ENSG00000108839chr1769960486996049ALOX12 MiGA_GFM_eQTL,MiGA_SVZ_eQTL,BM_44_MSBB_eQTL
ENSG00000108947chr1777052017705202EFNB3 MiGA_SVZ_eQTL,BM_22_MSBB_eQTL,ROSMAP_DLPFC_sQTL
ENSG00000108961chr1782886538288654RANGRF MiGA_GTS_eQTL,Inh_Kellis_eQTL
ENSG00000125434chr1782953998295400SLC25A35MiGA_GFM_eQTL
..................
ENSG00000183914chr1777173537717354DNAH2 MiGA_SVZ_eQTL,BM_10_MSBB_eQTL,Exc_mega_eQTL,ROSMAP_AC_sQTL,ROSMAP_PCC_sQTL,ROSMAP_DLPFC_sQTL
ENSG00000184619chr1783767038376704KRBA2 MiGA_GTS_eQTL,BM_10_MSBB_eQTL,BM_44_MSBB_eQTL,PCC_DeJager_eQTL,ROSMAP_AC_sQTL,STARNET_eQTL
ENSG00000185245chr1749322764932277GP1BA MiGA_SVZ_eQTL,MiGA_THA_eQTL
ENSG00000185722chr1742639944263995ANKFY1 ROSMAP_AC_sQTL,ROSMAP_DLPFC_sQTL,ROSMAP_PCC_sQTL
ENSG00000196388chr1749976094997610INCA1 MiGA_GTS_eQTL,PCC_DeJager_eQTL,ROSMAP_PCC_sQTL
ENSG00000196689chr1736094103609411TRPV1 ROSMAP_PCC_sQTL
ENSG00000198844chr1783102408310241ARHGEF15 MiGA_GTS_eQTL,MSBB_BM36_pQTL,ROSMAP_AC_sQTL,ROSMAP_PCC_sQTL,ROSMAP_DLPFC_sQTL
ENSG00000198920chr1766407106640711KIAA0753 AC_DeJager_eQTL,ROSMAP_DLPFC_sQTL
ENSG00000215041chr1773293927329393NEURL4 DLPFC_Bennett_pQTL,ROSMAP_AC_sQTL,ROSMAP_DLPFC_sQTL,ROSMAP_PCC_sQTL
ENSG00000219200chr1770124167012417RNASEK MiGA_SVZ_eQTL,DLPFC_DeJager_eQTL,STARNET_eQTL
ENSG00000220205chr1781635458163546VAMP2 MiGA_GFM_eQTL,ROSMAP_AC_sQTL,ROSMAP_PCC_sQTL
ENSG00000221882chr1733863163386317OR3A2 BM_36_MSBB_eQTL
ENSG00000239697chr1775490577549058TNFSF12 MiGA_GTS_eQTL,MiGA_SVZ_eQTL
ENSG00000248871chr1775490987549099TNFSF12-TNFSF13ROSMAP_PCC_sQTL
ENSG00000256806chr1766517616651762C17orf100 Exc_DeJager_eQTL,Exc_Kellis_eQTL,Exc_mega_eQTL
ENSG00000257950chr1736961933696194P2RX5-TAX1BP3 ROSMAP_PCC_sQTL
ENSG00000258315chr1770144947014495C17orf49 MiGA_SVZ_eQTL,ROSMAP_DLPFC_sQTL,ROSMAP_PCC_sQTL
ENSG00000259224chr1774813317481332SLC35G6 MiGA_GTS_eQTL,MiGA_THA_eQTL
ENSG00000261915chr1773191737319174AC026954.2 MiGA_SVZ_eQTL,ROSMAP_AC_sQTL,ROSMAP_DLPFC_sQTL,ROSMAP_PCC_sQTL
ENSG00000262165chr1748070124807013AC233723.1 ROSMAP_AC_sQTL,ROSMAP_DLPFC_sQTL,ROSMAP_PCC_sQTL
ENSG00000262302chr1772620887262089AC003688.1 ROSMAP_PCC_sQTL,ROSMAP_DLPFC_sQTL,STARNET_eQTL
ENSG00000262304chr1736362483636249AC027796.3 ROSMAP_PCC_sQTL
ENSG00000262481chr1774040967404097TMEM256-PLSCR3 ROSMAP_PCC_sQTL
ENSG00000262481chr1774040967404097TMEM256 ROSMAP_PCC_sQTL
ENSG00000262730chr1779306217930622AC104581.2 ROSMAP_DLPFC_sQTL
ENSG00000263620chr1781629748162975AC129492.3 ROSMAP_PCC_sQTL
ENSG00000263809chr1783831868383187AC135178.3 ROSMAP_AC_sQTL
ENSG00000277957chr1775632867563287SENP3-EIF4A1 MiGA_SVZ_eQTL,ROSMAP_AC_sQTL,ROSMAP_DLPFC_sQTL
ENSG00000282936chr1766404556640456AC004706.3 MiGA_GTS_eQTL,ROSMAP_PCC_sQTL
ENSG00000286190chr1755010055501006AC055839.2 MiGA_GTS_eQTL,MiGA_THA_eQTL,BM_10_MSBB_eQTL

Other genes implicated are PROC and HS6ST1 in MiGA cohort which may share causal eQTL with RABEP1. Further look into the data-set --- using these genes as targets and repeating what we did above for RABEP1 --- might be needed to establish a more certain conclusion.

Alternatively, we may be able to apply a multi-gene statistical fine-mapping test on RABEP1 region to find these genes, as you will see in the section below.

A statistically solid approach: mvSuSiE multi-gene analysis¶

This multi-gene fine-mapping analysis was done for each xQTL context separately. We will need to check the results for all contexts where this gene has an xQTL, in order to identify if there are other genes also sharing the same xQTL with this target gene. We included other genes in the same TAD window along with this gene, and extended it into a sliding window approach to include multiple TADs just in case. You need to check the sliding windows belongs to that gene (TSS) on analysis repo.

In [19]:
sliding_windows <- target_gene_info$gene_info$sliding_windows %>% strsplit(., ',') %>% unlist %>% as.character
sliding_windows
  1. 'chr17:0-9397826'
  2. 'chr17:1059843-10729687'
  3. 'chr17:3710595-13143830'
  4. 'chr17:6187187-17353469'
  5. 'chr17:8250471-20190245'

The most updated version of mvSuSiE multi-gene results are under path s3://statfungen/ftp_fgc_xqtl/analysis_result/mvsusie_multi_gene_test/multi_gene/ Currently it is still WIP. You can revisit this later when we prompt you to. Here is an example for RABEP1:

In [20]:
mnm_gene <- list()
for (window in sliding_windows) {
    mnm_gene_tmp <- NULL
    mnm_gene_tmp <- tryCatch(
        readRDS(paste0('/data/analysis_result/mvsusie_multi_gene/multi_gene/ROSMAP_multi_gene.', window, '.mnm.rds')),
        error = function(e) NULL
    )
    
    if (!is.null(mnm_gene_tmp)) {
        if(target_gene_info$gene_info$region_id %in% mnm_gene_tmp$mvsusie_fitted$condition_names){
        tryCatch({
            p <- mvsusieR::mvsusie_plot(mnm_gene_tmp$mvsusie_fitted, sentinel_only = F, add_cs = T)
            print(p)  # This ensures the plot is displayed in JupyterLab
        }, error = function(e) NULL)
        } else {
            message('There is mnm result for sliding window ',window,', but not include target gene ', gene_name, ' in CS')
        }
        mnm_gene <- append(mnm_gene, list(mnm_gene_tmp))
    }
}
$pip_plot
$effect_plot
No description has been provided for this image
$z_plot
NULL

$effects
                        L1
ENSG00000141480 0.07508825
ENSG00000141503 0.07508825
ENSG00000161929 0.07508825
ENSG00000029725 0.07508825

$pip_plot
No description has been provided for this image
$effect_plot
No description has been provided for this image
$z_plot
NULL

$effects
                        L1
ENSG00000141480 0.07625814
ENSG00000141503 0.07625814
ENSG00000161929 0.07625814
ENSG00000029725 0.07625814
ENSG00000129226 0.07625814

$pip_plot
No description has been provided for this image
$effect_plot
No description has been provided for this image
$z_plot
NULL

$effects
                       L1           L3            L2
ENSG00000161929 0.5258283 1.367291e-06 -2.173809e-05
ENSG00000029725 0.5258283 1.367291e-06 -2.173809e-05
ENSG00000129226 0.5258283 1.367291e-06 -2.173809e-05
ENSG00000007237 0.5258283 1.367291e-06 -2.173809e-05

No description has been provided for this image

In this case, there is no statistical evidence for RABEP1 sharing any of its xQTL with other genes in ROSMAP Microglia data we looked into; although we have not analyzed MiGA this way yet (which showed some potential signals from the quick analysis above).

Section 7: Epigenomic QTL and their target regions¶

fsusie, see notebook

Generate a crude plot to determined whether the story is interesting¶

This is a crude version of the case study plot which shows the fsusie Effect (colored line), the gene body (black arrow), the epi-QTL (large dots with the same color as the effects) and ADGWAS cs position (small red dots).

Only produce the refine plot if we can see either:

  1. There are sharing snp between epi-QTL and AD CS
  2. There are the AD CS located within one of the effect range
  3. The crude plot suggest something interesting
In [21]:
options(repr.plot.width = 40, repr.plot.height = 40)

 ggplot() + theme_bw() +  facet_grid(cs_coverage_0.95+study + region ~ ., labeller = labeller(.rows = function(x) gsub("([_:,-])", "\n", x)), scale = "free_y") +

      theme(text = element_text(size = 20), strip.text.y = element_text(size = 25, angle = 0.5)) +
     # xlim(view_win) +
      ylab("Estimated effect") +
   #   geom_line(data = haQTL_df %>% mutate(study = "haQTL effect") %>% filter(CS == 5),
    #            aes_string(y = "fun_plot", x = "x", col = "CS"), size = 4, col = "#00AEEF") +
  geom_line(data = effect_of_interest ,
                aes_string(y = "fun_plot", x = "x", col = "cs_coverage_0.95"), size = 4) +  
    geom_point(data = effect_of_interest ,
                aes_string(y = "pip", x = "pos", col = "cs_coverage_0.95"), size = 4) +
    theme(text = element_text(size = 40), strip.text.y = element_text(size = 15, angle = 0.5), 
            axis.text.x = element_text(size = 40), axis.title.x = element_text(size = 40)) +
      xlab("Position") +
      ylab("Estimated\neffect") +
      geom_segment(arrow = arrow(length = unit(1, "cm")), aes(x = gene_start, xend = gene_end, y = 1, yend = 1), size = 6,
                  data = tar_gene_info$gene_info, alpha = 0.3) +
      geom_text(aes(x = (gene_start + gene_end) / 2, y = 1 , label = gene_name), size = 10, 
              data = tar_gene_info$gene_info)+
        geom_point(aes(x = pos, y = pip  ) ,color = "red", data = flatten_table%>%filter( str_detect(study,"AD_") , cs_coverage_0.95 != 0  )%>%mutate(AD_study = study%>%str_replace_all("_","\n" ))%>%select(-study,-region,-cs_coverage_0.95) ) 

Section 8: Context focused validation in other xQTL data¶

see notebook

add fake version for now, so you don't have to refer to above link

Background: our "discovery set" is ROSMAP but we have additional "validation" sets including:

  • STARNET
  • MiGA
  • KnightADRC
  • MSBB
  • metaBrain
  • UKB pQTL

TODO:

  • Get from Carlos WashU CSF based resource (pQTL and metabolomic QTL)

This section shows verification of findings from these data-sets. In principle we should check them through sections 1-6 more formally. In practice we will start with colocalization via colocboost --- since our study is genetics (variant and loci level) focused. We can selectively follow them up for potentially intereting validations. We therefore only demonstrate validation via colocboost as a starting point.

In [22]:
finempping_contexts <- readRDS(paste0(gene_name, '_finemapping_contexts.rds')) # from sec1
In [23]:
finempping_contexts <- get_norosmap_contexts(finempping_contexts)
In [24]:
cb_ad <- plot_cb(cb_res = cb_res, cex.pheno = 1.5, x.phen = -0.2, add_QTL = TRUE, cohorts = finempping_contexts, gene_id = gene_id)
No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.No pvalue cutoff. Extract all variants names.
No description has been provided for this image

In conclusion from what's shown above, when we check the association signals in STARNET and MiGA on colocalization established from ROSMAP and AD GWAS, we see additional evidences.

Section 9: Non-linear effects of xQTL¶

see notebook

APOE interaction¶

In [25]:
options(repr.plot.width=6, repr.plot.height=6)

ggplot(RABEP1_int_res, aes(x = variant_id, y = qvalue_interaction)) +
  geom_point(alpha = 0.7, size = 6) +
  labs(title = "qvalue for RABEP1 csets in interaction association nalysis",
       x = "Gene Name",
       y = "qvalue_interaction",
       size = "qvalue_interaction") +
  theme_minimal(base_size = 14) +
  theme(panel.background = element_blank(),
        panel.grid.major = element_line(color = "grey80"),
        legend.position = NULL,
        axis.text.x = element_text(angle = 45, hjust = 1))  + ylim(0,1)
  # scale_color_manual(values = colorRampPalette(brewer.pal(8, "Set1"))(length(unique(flat_var$gene_name))))
ggsave('plots/RABEP1/sec11.interaction_association_RABEP1_lessPIP25.pdf', height = 5, width = 8) 

In conclusion, there is no interaction QTL with APOE identified.

Section 10: in silico functional studies in iPSC model¶

see notebook

In [26]:
vars_p
In [27]:
apoe_p

Section 11: Functional annotations of selected loci¶

see notebook

TODO

  • Touch base with Ryan on the snATAC annotations
  • Run this by Pavel to see if there are additional comments on how we do this
In [28]:
func_p

Section 12: Candidate loci as trans-xQTL¶

see notebook

In [29]:
options(repr.plot.width=12, repr.plot.height=6)
if(!is.null(flat_var)){
    ggplot(flat_var, aes(x = gene_name, y = pip, size = pip)) +
      geom_point(alpha = 0.7) +
      labs(title = paste0("PIP values for trans fine mapped Genes in ", gene_name ," csets with AD"),
           x = "Gene Name",
           y = "PIP",
           size = "PIP",
           color = "CS Coverage 0.95 Min Corr") +
      theme_minimal(base_size = 14) +
      theme(panel.background = element_blank(),
            panel.grid.major = element_line(color = "grey80"),
            legend.position = NULL,
            axis.text.x = element_text(angle = 45, hjust = 1))  
      # scale_color_manual(values = colorRampPalette(brewer.pal(8, "Set1"))(length(unique(flat_var$gene_name))))
    ggsave(paste0('plots/RABEP1/sec12.trans_fine_mapping_',gene_name,'.pdf'), height = 5, width = 8)
} else{
    message('There are no detectable trans signals for ', gene_name)
}

Creative thinking: generate hypothesis, search in literature, raise questions to discuss¶

You can now generate some preliminary hypotheses based on the above results. Next, you should search for evidence in the literature to support or refine these hypotheses and identify additional analyses needed to confirm them.