Here we want to examine the relationship between the WGCNA modules and the levels of compounds measured by steroidomics. We work on mean values registered for the replicates of each exposure condition. For steroidomics, we select the values measured in CTL04, since the WGCNA has been performed on this line.

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, collapse = TRUE)

1. Environment Set Up

1.1 Values of RMarkdown parameters

for (i in 1:length(params))
  print(paste('Parameter:', names(params)[i], ' - Value:', params[[i]], '- Class:', class(params[[i]])))
## [1] "Parameter: InputFolder  - Value: ~/DataDir/steroidomics/ - Class: character"

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(ggrepel)
library(viridis)
## Loading required package: viridisLite

InputFolder <- params$InputFolder

2. Data Upload

2.1 Objects from WGCNA first step

We need FinalMEs to retrieve the module eigengene associated to each sample and the metadata present in DummyTraits.

load('~/DataDir/bulkRNASeq/8.WGCNA/CTL04/1_NetworkGeneration/NetworkGeneration.RData')
rm(adjacency, dissTOM, SE_WGCNA, VstSel)

2.2 Steroidomics data

Steroidomics <- openxlsx::read.xlsx("./Normalized_Counts_Neuralorganoids_Steroidomics.xlsx")
head(Steroidomics)
##                Sample Timepoint    Sex   Line Condition dht pregnenolone
## 1  1_1CTL04EDay 50CTL     DAY50 Female CTL04E       CTL   0     202.4586
## 2  1_2CTL04EDay 50CTL     DAY50 Female CTL04E       CTL   0     304.2842
## 3  1_3CTL04EDay 50CTL     DAY50 Female CTL04E       CTL   0     172.5682
## 4 1_4CTL04EDay 50DMSO     DAY50 Female CTL04E      DMSO   0     205.2690
## 5 1_5CTL04EDay 50DMSO     DAY50 Female CTL04E      DMSO   0     213.8746
## 6 1_6CTL04EDay 50DMSO     DAY50 Female CTL04E      DMSO   0     359.4976
##   testosterone androstenedione corticosterone cortisol progesterone cortisone
## 1            0               0       527.3886 2.027865            0 1.8435876
## 2            0               0       669.2955 1.187653            0 0.4624972
## 3            0               0       506.3620 1.658182            0 1.3603737
## 4            0               0       402.3395 0.000000            0 0.0000000
## 5            0               0       475.0151 0.000000            0 0.0000000
## 6            0               0       688.0275 0.000000            0 0.0000000
##   DHEA Androsterone Androstanedione E1 17bE2 E3 17aE2
## 1    0            0               0  0     0  0     0
## 2    0            0               0  0     0  0     0
## 3    0            0               0  0     0  0     0
## 4    0            0               0  0     0  0     0
## 5    0            0               0  0     0  0     0
## 6    0            0               0  0     0  0     0

3. Organize module eigengene data

Calculate the mean value for each module eigengene in each exposure condition.

ModuleEig <- data.frame(merge(FinalMEs, DummyTraits[, 1:3], by='row.names'))
## Loading required package: S4Vectors
## Loading required package: stats4
## Loading required package: BiocGenerics
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:dplyr':
## 
##     combine, intersect, setdiff, union
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, aperm, append, as.data.frame, basename, cbind,
##     colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
##     get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
##     match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
##     Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
##     table, tapply, union, unique, unsplit, which.max, which.min
## 
## Attaching package: 'S4Vectors'
## The following objects are masked from 'package:dplyr':
## 
##     first, rename
## The following objects are masked from 'package:base':
## 
##     expand.grid, I, unname

MeanEig <-  dplyr::group_by(ModuleEig, Condition) %>% 
  summarize(Greenyellow = mean(MEgreenyellow, na.rm=TRUE), 
            Grey60 = mean(MEgrey60, na.rm=TRUE),
            Turquoise = mean(MEturquoise, na.rm=TRUE), 
            Brown = mean(MEbrown, na.rm=TRUE), 
            Tan = mean(MEtan, na.rm=TRUE), 
            Black = mean(MEblack, na.rm=TRUE), 
            Green = mean(MEgreen, na.rm=TRUE), 
            Magenta = mean(MEmagenta, na.rm=TRUE), 
            Blue = mean(MEblue, na.rm=TRUE), 
            Pink = mean(MEpink, na.rm=TRUE), 
            Red = mean(MEred, na.rm=TRUE), 
            DarkRed = mean(MEdarkred, na.rm=TRUE), 
            Midnightblue = mean(MEmidnightblue, na.rm=TRUE), 
            Lightgreen = mean(MElightgreen, na.rm=TRUE), 
            Lightyellow = mean(MElightyellow, na.rm=TRUE), 
            Cyan = mean(MEcyan, na.rm=TRUE), 
            Purple = mean(MEpurple, na.rm=TRUE), 
            Royalblue = mean(MEroyalblue, na.rm=TRUE))

4. Organize steroidomics data

Select CTL04.
Select compounds with quantification higher than 0 in at least 8 samples.
Calculate the mean value of each compound in each exposure condition.
Recode the exposure nomenclature to be coherent with WGCNA.

lapply(Steroidomics[Steroidomics$Line=='CTL04E', 6:20], function(x) sum(x>0))
## $dht
## [1] 3
## 
## $pregnenolone
## [1] 48
## 
## $testosterone
## [1] 15
## 
## $androstenedione
## [1] 9
## 
## $corticosterone
## [1] 48
## 
## $cortisol
## [1] 8
## 
## $progesterone
## [1] 27
## 
## $cortisone
## [1] 7
## 
## $DHEA
## [1] 0
## 
## $Androsterone
## [1] 3
## 
## $Androstanedione
## [1] 0
## 
## $E1
## [1] 16
## 
## $`17bE2`
## [1] 18
## 
## $E3
## [1] 0
## 
## $`17aE2`
## [1] 7

The compounds that are quantified in at least 1/6 (8) samples are: pregnenolone, testosterone, androstenedione, corticosterone, cortisol, progesterone, E1, 17bE2.

MeanSteroid <-  dplyr::group_by(Steroidomics[Steroidomics$Line=='CTL04E', ], Condition) %>% 
  summarize(Pregnenolone = mean(pregnenolone, na.rm=TRUE), 
            Testosterone = mean(testosterone, na.rm=TRUE), 
            Androstenedione = mean(androstenedione, na.rm=TRUE), 
            Corticosterone = mean(corticosterone, na.rm=TRUE), 
            Cortisol = mean(cortisol, na.rm=TRUE), 
            Progesterone = mean(progesterone, na.rm=TRUE),
            E1 = mean(E1, na.rm=TRUE), 
            `17bE2` = mean(`17bE2`, na.rm=TRUE))

MeanSteroid <- MeanSteroid %>% mutate(Condition=replace(Condition, Condition=='AHR AG', "AhHyd_Ag")) %>%
  mutate(Condition=replace(Condition, Condition=='AHR ANT', "AhHyd_Inh")) %>%
  mutate(Condition=replace(Condition, Condition=='AND AG', "Andr_Ag")) %>%
  mutate(Condition=replace(Condition, Condition=='AND ANT', "Andr_Inh")) %>%
  #mutate(Condition=replace(Condition, Condition=='CTL', "CTL")) %>%
  #mutate(Condition=replace(Condition, Condition=='DMSO', "DMSO")) %>%
  mutate(Condition=replace(Condition, Condition=='EST AG', "Estr_Ag")) %>%
  mutate(Condition=replace(Condition, Condition=='EST ANT', "Estr_Inh")) %>%
  mutate(Condition=replace(Condition, Condition=='GC AG', "CG_Ag")) %>%
  mutate(Condition=replace(Condition, Condition=='GC ANT', "GC_Inh")) %>%
  mutate(Condition=replace(Condition, Condition=='LX AG', "LivX_Ag")) %>%
  mutate(Condition=replace(Condition, Condition=='LX ANT', "LivX_Inh")) %>%
  mutate(Condition=replace(Condition, Condition=='RET AG', "Ret_Ag")) %>%
  mutate(Condition=replace(Condition, Condition=='RET ANT', "Ret_Inh")) %>%
  mutate(Condition=replace(Condition, Condition=='TH AG', "Thyr_Ag")) %>%
  mutate(Condition=replace(Condition, Condition=='TH ANT', "Thyr_Inh"))

5. Module-compound correlation

5.1 Calculate Spearman correlation

MeanEigCorr <- data.frame(MeanEig[,-1])
row.names(MeanEigCorr) <- as.vector(as.matrix(MeanEig[,1]))

MeanSteroidCorr <- data.frame(MeanSteroid[,-1])
row.names(MeanSteroidCorr) <- as.vector(as.matrix(MeanSteroid[,1]))

nSteroids <- ncol(MeanSteroidCorr) 
nSamples <- nrow(MeanSteroidCorr)

# spearman correlation
SpearmanCor <- WGCNA::cor(MeanEigCorr, MeanSteroidCorr, use = 'p', method='spearman')
## 
SpearmanPvalue <- WGCNA::corPvalueStudent(SpearmanCor, nSamples)

5.2 Visualize correlation matrix

textMatrix <- paste(signif(SpearmanCor,2), '\n(', signif(SpearmanPvalue, 1), ')', sep = '')
dim(textMatrix) <- dim(SpearmanCor)
WGCNA::labeledHeatmap(Matrix = SpearmanCor, xLabels=names(MeanSteroidCorr), yLabels=names(MeanEigCorr), ySymbols=names(MeanEigCorr), 
                      colorLabels=FALSE, colors=viridis(50)[15:50], textMatrix = textMatrix, setStdMargins = FALSE, 
                      cex.text = 0.5, zlim = c(-1,1), main = paste('Module-trait relationships'))

6. Scatterplots

Represented by scatterplots the correlations for the modules and steroids with a correlation coefficient >= 0.6 and a pvalue <=0.01

Merged <- inner_join(MeanEig, MeanSteroid, by='Condition')

scatterPlot <- function(myData, compound, module){
  
  ggplot(data=myData, aes(x=log2(!!sym(compound)+0.01), y=(!!sym(module)), col=Condition)) +
  geom_point(size=2) +
    geom_label_repel(label=myData$Condition) +
    theme_minimal() +
    theme(legend.position="none")
  }

Pregnenolone

scatterPlot(Merged, compound='Pregnenolone', module='Magenta')

scatterPlot(Merged, compound='Pregnenolone', module='Turquoise')

Testosterone

Grey60

scatterPlot(Merged, compound='Testosterone', module='Grey60')

Tan

scatterPlot(Merged, compound='Testosterone', module='Tan')

### Lightgreen

scatterPlot(Merged, compound='Testosterone', module='Lightgreen')

Androstenedione

Tan

scatterPlot(Merged, compound='Androstenedione', module='Tan')

Lightgreen

scatterPlot(Merged, compound='Androstenedione', module='Lightgreen')

E1

Magenta

scatterPlot(Merged, compound='E1', module='Magenta')

7. SessionInfo

SessionInfo <- sessionInfo()
Date <- date()

Date
## [1] "Wed Aug 13 12:59:17 2025"
SessionInfo
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] S4Vectors_0.36.1    BiocGenerics_0.44.0 viridis_0.6.2      
## [4] viridisLite_0.4.1   ggrepel_0.9.3       ggplot2_3.4.1      
## [7] dplyr_1.1.0        
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-7           matrixStats_0.63.0     bit64_4.0.5           
##  [4] doParallel_1.0.17      RColorBrewer_1.1-3     httr_1.4.5            
##  [7] GenomeInfoDb_1.34.9    dynamicTreeCut_1.63-1  tools_4.2.1           
## [10] backports_1.4.1        bslib_0.4.2            utf8_1.2.3            
## [13] R6_2.5.1               rpart_4.1.19           Hmisc_4.8-0           
## [16] DBI_1.1.3              colorspace_2.1-0       nnet_7.3-18           
## [19] withr_2.5.0            tidyselect_1.2.0       gridExtra_2.3         
## [22] preprocessCore_1.60.2  bit_4.0.5              compiler_4.2.1        
## [25] WGCNA_1.72-1           cli_3.6.1              Biobase_2.58.0        
## [28] htmlTable_2.4.1        labeling_0.4.2         sass_0.4.5            
## [31] scales_1.2.1           checkmate_2.1.0        stringr_1.5.0         
## [34] digest_0.6.31          foreign_0.8-84         rmarkdown_2.20        
## [37] XVector_0.38.0         base64enc_0.1-3        jpeg_0.1-10           
## [40] pkgconfig_2.0.3        htmltools_0.5.4        highr_0.10            
## [43] fastmap_1.1.1          htmlwidgets_1.6.1      rlang_1.1.1           
## [46] impute_1.72.3          rstudioapi_0.14        RSQLite_2.3.0         
## [49] farver_2.1.1           jquerylib_0.1.4        generics_0.1.3        
## [52] jsonlite_1.8.4         zip_2.2.2              RCurl_1.98-1.10       
## [55] magrittr_2.0.3         GO.db_3.16.0           GenomeInfoDbData_1.2.9
## [58] Formula_1.2-5          interp_1.1-3           Matrix_1.5-3          
## [61] Rcpp_1.0.10            munsell_0.5.0          fansi_1.0.4           
## [64] lifecycle_1.0.3        stringi_1.7.12         yaml_2.3.7            
## [67] zlibbioc_1.44.0        grid_4.2.1             blob_1.2.3            
## [70] parallel_4.2.1         crayon_1.5.2           deldir_1.0-6          
## [73] lattice_0.20-45        Biostrings_2.66.0      splines_4.2.1         
## [76] KEGGREST_1.38.0        knitr_1.42             pillar_1.8.1          
## [79] fastcluster_1.2.3      codetools_0.2-19       glue_1.6.2            
## [82] evaluate_0.20          latticeExtra_0.6-30    data.table_1.14.8     
## [85] png_0.1-8              vctrs_0.6.2            foreach_1.5.2         
## [88] gtable_0.3.1           cachem_1.0.7           xfun_0.37             
## [91] openxlsx_4.2.5.2       survival_3.5-3         tibble_3.2.1          
## [94] iterators_1.0.14       AnnotationDbi_1.60.0   memoise_2.0.1         
## [97] IRanges_2.32.0         cluster_2.1.4

Steroidomics: relationship between steroid concentrations and WGCNA gene modules

Date: August 13, 2025