Project name: USDA-NRSP-8-gigas-rDNA
Funding source: USDA-NRSP-8
Github repo: https://github.com/mattgeorgephd/USDA-NRSP-8-gigas-rDNA
Species: Crassostrea gigas
variable: ploidy


Power analysis

Determine the sample size needed for whole genome sequencing, given publicly available data on single-copy gene variation within C. gigas across locations.

Mac was able to pull publicly available data; her analysis is here. Here are her results.

From this analysis, we can see the overall variation for each type:

type  mean  SD
mito  31    9
ribo  335   105

Broken down by region, the variation wasn’t much better:

type  country       mean   SD
mito  China         26.7   7.6
mito  Japan         38.7   8.4
mito  South Africa  32.4   6.8
ribo  China         323.9  129.7
ribo  Japan         361.1  100.7
ribo  South Africa  331    53.9
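
One way to put the overall and regional numbers on the same footing is the coefficient of variation (SD divided by mean). The sketch below only re-enters the values from the two tables above; the data frame name `variation` and the label "all" for the overall rows are placeholders chosen for illustration, not part of the original analysis.

```r
# Means and SDs copied from the two tables above.
# "all" marks the overall (not region-specific) rows; this label is an assumption.
variation <- data.frame(
  type    = c("mito", "ribo", "mito", "mito", "mito", "ribo", "ribo", "ribo"),
  country = c("all", "all", "china", "japan", "south africa",
              "china", "japan", "south africa"),
  mean    = c(31, 335, 26.7, 38.7, 32.4, 323.9, 361.1, 331),
  sd      = c(9, 105, 7.6, 8.4, 6.8, 129.7, 100.7, 53.9)
)

# Coefficient of variation: spread relative to the mean
variation$cv <- variation$sd / variation$mean

print(variation)
```

If the regional CVs are close to the overall ones, splitting by region does little to reduce the relative spread.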

I used the pwr package to determine the number of samples that we need to sequence given the variation observed, where sigma is the observed standard deviation and delta is the difference we want to detect:

library(pwr)

# Set parameters
alpha   <- 0.05  # significance level
power   <- 0.80  # desired power
sigma   <- 8.4   # standard deviation of control group
delta   <- 20    # difference between treatment and control group
n       <- NULL  # sample size to be determined

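# Perform power analysis (pwr.t.test defaults to a two-sample, two-sided t-test)
# d is the standardized effect size: the difference divided by the standard deviation
pwr.t.test(d = delta/sigma, sig.level = alpha, power = power, n = n)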
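
As a rough sanity check on the call above, the standardized effect size and a normal-approximation sample size can be computed by hand. This is only a sketch: the formula n ≈ 2(z_{1-alpha/2} + z_{power})^2 / d^2 is the standard large-sample approximation for two equal-sized groups, not what pwr.t.test computes internally, and the variable name `n_approx` is introduced here for illustration.

```r
# Same parameters as the power analysis above
alpha <- 0.05   # significance level
power <- 0.80   # desired power
sigma <- 8.4    # standard deviation
delta <- 20     # difference to detect

# Standardized effect size passed to pwr.t.test as d
d <- delta / sigma

# Normal approximation to the per-group sample size (two-sided, two-sample)
n_approx <- 2 * (qnorm(1 - alpha / 2) + qnorm(power))^2 / d^2

print(d)
print(n_approx)
```

The t-based answer is somewhat larger than this approximation, and the gap is most noticeable when the required n is only a handful of samples.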

Here are the results for mito copy number (best case scenario):

delta  SD   n required (each group)
20     6.8  3.13
10     6.8  8.3
5      6.8  30.02
20     8.4  4.0
10     8.4  12.11
5      8.4  45.3
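
The whole table can be generated in one pass rather than by editing sigma and delta by hand. The sketch below swaps in base R's power.t.test (from the stats package) so that it runs without extra packages; it assumes the same two-sample, two-sided design and should give values very close to, though not necessarily identical with, the pwr.t.test results above. The names `grid`, `n_exact` and `n_round` are illustrative.

```r
# All combinations of delta and SD used in the table above
# (delta varies fastest, so the row order matches the table)
grid <- expand.grid(delta = c(20, 10, 5), sd = c(6.8, 8.4))

# Solve for the per-group sample size at alpha = 0.05 and power = 0.80
grid$n_exact <- mapply(
  function(delta, sd) {
    power.t.test(delta = delta, sd = sd, sig.level = 0.05, power = 0.80,
                 type = "two.sample", alternative = "two.sided")$n
  },
  grid$delta, grid$sd
)

# Fractional animals cannot be sequenced, so round up
grid$n_round <- ceiling(grid$n_exact)

print(grid)
```

Rounding up with ceiling() gives the number of individuals to actually sequence per group; for values that sit right at a whole number, it is worth checking the unrounded result before committing to a sample size.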