fluEvidenceSynthesis (R Package)

23 January 2016

Introduction

fluEvidenceSynthesis (the R-package)

Availability

Open source
Underlying code written in C/C++ for speed
R interface
- Load data in R and call C code from R
- Both high and low level interface
Available on GitHub: https://github.com/MJomaba/flu-evidence-synthesis

Load in R

library(fluEvidenceSynthesis)

Main Process

Inference
1. Epidemiological model
2. Data
3. Parameter inference
Vaccination scenarios
Cost effectiveness (under construction)

Stratified by age and risk

Data is stratified by age groups. Vaccination data is subdivided into 7 age groups, while monitoring data is subdivided into 5 age groups.

Vaccination data: <1, 1-4, 5-14, 15-24, 25-44, 45-64, 65+
Monitoring data: <5, 5-14, 15-44, 45-64, 65+

Further subdivided into 3 risk groups

Low risk
High risk
Pregnant women (currently mostly unused)

Inference

inference.results <- inference(age_sizes = age_sizes$V1,
                               vaccine_calendar = vaccine_calendar,
                               polymod_data = as.matrix(polymod_uk),
                               ili = ili$ili,
                               mon_pop = ili$total.monitored,
                               n_pos = confirmed.samples$positive,
                               n_samples = confirmed.samples$total.samples,
                               initial = initial.parameters,
                               nbatch = 1000,
                               nburn = 1000, blen = 10)

Inference inputs

Population size by age
Contact data (POLYMOD)
Initial model parameters

Influenza related data

By week and age group

Influenza Like Illness
Virological data
Vaccination uptake rates
- Efficacy

MCMC

Length of run, length of burn-in
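To illustrate what the run length and burn-in control, here is a minimal Metropolis sampler in base R. It is a sketch, not the sampler used by the package. It assumes that nburn is the number of initial iterations that are discarded, nbatch the number of stored samples, and blen the number of iterations per stored sample. The target density log_target and the function run_metropolis are hypothetical.

```r
set.seed(1)

# Hypothetical one-parameter target: log density of a normal distribution
log_target <- function(x) dnorm(x, mean = 2, sd = 1, log = TRUE)

# Assumed meaning of the arguments (not taken from the package source):
#   nburn  - iterations discarded at the start
#   nbatch - number of stored samples
#   blen   - iterations between stored samples
run_metropolis <- function(initial, nburn, nbatch, blen, step = 1) {
  current <- initial
  samples <- numeric(nbatch)
  for (i in seq_len(nburn + nbatch * blen)) {
    proposal <- current + rnorm(1, sd = step)
    # Accept with probability min(1, target(proposal) / target(current))
    if (log(runif(1)) < log_target(proposal) - log_target(current)) {
      current <- proposal
    }
    kept <- i - nburn
    # Skip the burn-in, then store every blen-th state
    if (kept > 0 && kept %% blen == 0) {
      samples[kept / blen] <- current
    }
  }
  samples
}

samples <- run_metropolis(initial = 0, nburn = 1000, nbatch = 1000, blen = 10)
length(samples)
```

With these settings the chain runs for 1000 + 1000 * 10 iterations and keeps 1000 samples.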

Population size by age

Vector of population sizes by single year of age. The first value is everyone younger than 1, the second everyone aged 1, and so on. The final value includes everyone at or above the last age (here 85+).

age_sizes$V1

##  [1] 625100 632100 648500 639800 645800 663500 668700 690300 700100 688200
## [11] 681700 689100 671500 661100 665000 638000 633700 635500 637500 636400
## [21] 623200 587000 586300 613300 636900 664500 711100 755000 780700 775700
## [31] 800400 808500 829700 834400 840200 833400 818800 797500 779200 749000
## [41] 733400 723100 700500 677700 661600 667500 661400 654100 660200 680200
## [51] 702500 755300 802200 647200 635300 624000 596200 539000 504600 526300
## [61] 528600 523600 510700 496500 482400 463900 460700 463000 464400 453500
## [71] 437100 421000 416400 408100 393000 380300 368700 369200 369000 352500
## [81] 240500 202100 208400 203400 197000 990400
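The per-year vector can be collapsed into the 7 vaccination age groups. The sketch below uses base R only and a hypothetical population vector pop_by_age with 86 entries (ages 0 to 84, plus 85+), so the numbers are illustrative rather than the values printed above.

```r
# Hypothetical population vector: one entry per year of age, last entry is 85+
pop_by_age <- rep(1000, 86)
ages <- 0:85

# Lower bounds of the 7 vaccination age groups
breaks <- c(0, 1, 5, 15, 25, 45, 65, Inf)
labels <- c("<1", "1-4", "5-14", "15-24", "25-44", "45-64", "65+")

# Assign each age to its group: intervals are closed on the left, [lower, upper)
groups <- cut(ages, breaks = breaks, right = FALSE, labels = labels)

# Sum the population within each group
group_sizes <- tapply(pop_by_age, groups, sum)
group_sizes
```

Replacing pop_by_age with age_sizes$V1 should give the group sizes for the real data, assuming the vector has the same layout.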

Contact data

Matrix holding results for each subject and their contact pattern.

pander(head(polymod_uk,n=4))

Wknd	AG2	AG3	AG4	AG5	AG6
0	0	2	0	3	2
1	2	1	2	3	2
0	1	0	0	6	2
1	1	0	0	2	2

Initial parameter values

Starting values are not that important
- Could start with the mean value of the priors

We model 5 age groups and assume that the youngest two (<5 and 5-14) share parameter values, as do the middle two (15-44 and 45-64). The elderly (65+) have their own parameter values. This reduces the complexity of the model.

Ascertainment probability for three age groups ($\epsilon_i$)
Outside infection ($\psi$)
Transmissibility ($q$)
Susceptibility for three age groups ($\sigma_i$)
Initial number of infections ($I$)

initial.parameters <- c(0.01188150, 0.01831852, 0.05434378,
                        1.049317e-05, 0.1657944,
                        0.3855279, 0.9269811, 0.5710709,
                        -0.1543508 )
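As a sketch of how the nine values relate to the five modelled age groups, the vector can be given names and expanded per age group. The names are borrowed from the column names of the inference output (eps1, ..., I0). The mapping vector group_to_parameter is a hypothetical helper, not part of the package.

```r
initial.parameters <- c(0.01188150, 0.01831852, 0.05434378,
                        1.049317e-05, 0.1657944,
                        0.3855279, 0.9269811, 0.5710709,
                        -0.1543508)

# Assumed order: ascertainment (3), outside infection, transmissibility,
# susceptibility (3), initial number of infections
names(initial.parameters) <- c("eps1", "eps2", "eps3", "psi", "q",
                               "susc1", "susc2", "susc3", "I0")

# Which of the three values each of the 5 age groups uses:
# the two youngest share one, the two middle share one, the elderly have their own
group_to_parameter <- c(1, 1, 2, 2, 3)

epsilon_by_group <- unname(
  initial.parameters[c("eps1", "eps2", "eps3")][group_to_parameter])
susceptibility_by_group <- unname(
  initial.parameters[c("susc1", "susc2", "susc3")][group_to_parameter])

epsilon_by_group
susceptibility_by_group
```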

Influenza Like Illness

Counts by week and age group

pander(head(ili$ili,n=2),emphasize.rownames=F)

V1	V2	V3	V4	V5
1	1	24	11	3
5	3	33	14	4

pander(head(ili$total.monitored,n=2),emphasize.rownames=F)

V1	V2	V3	V4	V5
32723	73891	234246	139598	86733
33210	74865	236759	139329	86764
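To make the two tables comparable across age groups, the counts can be divided by the monitored population. The sketch below retypes the two weeks shown above into plain matrices (ili_counts and monitored are illustrative names) and computes the ILI rate per 100,000 monitored people.

```r
# ILI counts for the two weeks shown above (rows: weeks, columns: age groups)
ili_counts <- matrix(c(1, 1, 24, 11, 3,
                       5, 3, 33, 14, 4),
                     nrow = 2, byrow = TRUE)

# Monitored population for the same weeks and age groups
monitored <- matrix(c(32723, 73891, 234246, 139598, 86733,
                      33210, 74865, 236759, 139329, 86764),
                    nrow = 2, byrow = TRUE)

# Rate per 100,000 monitored people, element by element
ili_rate <- 1e5 * ili_counts / monitored
round(ili_rate, 1)
```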

Virological confirmation

Counts by week and age group for this strain

pander(confirmed.samples$positive[13:14,],emphasize.rownames=F)

V1	V2	V3	V4	V5
2	1	9	3	1
3	2	13	10	1

pander(confirmed.samples$total.samples[13:14,],emphasize.rownames=F)

V1	V2	V3	V4	V5
10	3	16	4	3
5	4	22	12	2
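The two tables together give the fraction of tested samples that were positive for this strain. The sketch below retypes the two weeks shown above into plain matrices (positive and total are illustrative names) and computes that fraction per week and age group.

```r
# Positive samples for weeks 13 and 14 (rows: weeks, columns: age groups)
positive <- matrix(c(2, 1,  9,  3, 1,
                     3, 2, 13, 10, 1),
                   nrow = 2, byrow = TRUE)

# Total samples tested in the same weeks and age groups
total <- matrix(c(10, 3, 16,  4, 3,
                   5, 4, 22, 12, 2),
                nrow = 2, byrow = TRUE)

# Sanity check: there cannot be more positives than samples tested
stopifnot(all(positive <= total))

proportion_positive <- positive / total
round(proportion_positive, 2)
```

The sample sizes are small, so these proportions are very uncertain on their own.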

Vaccine calendar

vaccine_calendar <- list(
  # Vaccine efficacy for each of the 7 age groups
  "efficacy" = c(0.7,0.7,0.7,0.7,0.7,0.7,0.3),
  # 3 rows of 21 values: the values are written row by row,
  # so the matrix must be filled with byrow=TRUE
  "calendar" = matrix(c(
    0,0,0,0,0,0, 0.02,0,0,0,0,0, 0.02, 0.02,0,0,0,0,0,0, 0.02,
    0,0,0,0,0,0,0.005,0,0,0,0,0,0.005,0.005,0,0,0,0,0,0,0.005,
    0,0,0,0,0,0,0.005,0,0,0,0,0,0.005,0.005,0,0,0,0,0,0,0.005
    ),ncol=21,byrow=TRUE),
  "dates" =  c(as.Date("2010-10-01"), as.Date("2010-11-01"),
               as.Date("2010-12-01"), as.Date("2011-01-01"))
)

Calendar holds vaccination rates per day for:

7 Low risk age groups
7 High risk
7 Pregnant women groups
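As a rough check on a calendar, daily rates can be turned into cumulative coverage per group. The sketch below assumes that each row of the calendar applies to the period between two consecutive dates and that a rate is the fraction of the group vaccinated per day. Both are assumptions about the layout, not statements from the package documentation. The matrix is filled row by row (byrow = TRUE) so that each row has 21 values.

```r
calendar <- matrix(c(
  0,0,0,0,0,0, 0.02,0,0,0,0,0, 0.02, 0.02,0,0,0,0,0,0, 0.02,
  0,0,0,0,0,0,0.005,0,0,0,0,0,0.005,0.005,0,0,0,0,0,0,0.005,
  0,0,0,0,0,0,0.005,0,0,0,0,0,0.005,0.005,0,0,0,0,0,0,0.005
  ), ncol = 21, byrow = TRUE)

dates <- as.Date(c("2010-10-01", "2010-11-01", "2010-12-01", "2011-01-01"))

# Length of each period in days (one period per calendar row)
days <- as.numeric(diff(dates))

# Multiply each row by its period length, then add up the periods per group
coverage <- colSums(calendar * days)

# Label the 21 columns: 7 age groups within each of 3 risk groups
age_groups <- c("<1", "1-4", "5-14", "15-24", "25-44", "45-64", "65+")
names(coverage) <- paste(rep(c("low", "high", "pregnant"), each = 7),
                         rep(age_groups, times = 3))
coverage[coverage > 0]
```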

Inference result

pander(names(inference.results))

batch, llikelihoods and contact.ids

pander(head(inference.results$batch,n=3),emphasize.rownames=F)

eps1	eps2	eps3	psi	q	susc1	susc2	susc3	I0
0.009664	0.01574	0.05236	0.000165	0.1654	0.3851	0.9281	0.5735	-0.1549
0.009222	0.01578	0.05191	0.0001281	0.1657	0.3854	0.9284	0.5741	-0.1551
0.009602	0.01574	0.05255	8.838e-05	0.1656	0.3854	0.9284	0.574	-0.1546

Contact IDs

We have about 900 data points from the POLYMOD study on UK contact patterns. We could take these as given and derive the contact matrix. Instead we resample them at each MCMC step:

Resample with replacement (bootstrap)
Recalculate contact matrix with the resampled data set
Preserves uncertainty in the data set
After inference we have parameter samples, but also the (resampled) contact data used for each sample
"Posterior" for contact data

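The resampling step can be sketched in base R as follows. The survey data here are simulated and the function contact_matrix is hypothetical. It only averages reported contacts by participant age group, which is simpler than what the package does.

```r
set.seed(42)
n <- 900

# Hypothetical survey: age group of each participant (1..5) and the number of
# contacts they reported with each of the 5 age groups
participant_group <- sample(1:5, n, replace = TRUE)
contacts <- matrix(rpois(n * 5, lambda = 2), nrow = n, ncol = 5)

# Mean number of contacts: rows are participant age groups,
# columns are the age groups of the people they contacted
contact_matrix <- function(ids) {
  totals <- rowsum(contacts[ids, , drop = FALSE], participant_group[ids])
  counts <- as.vector(table(participant_group[ids]))
  totals / counts
}

# Bootstrap: draw n participant ids with replacement, then recalculate
ids <- sample(n, n, replace = TRUE)
resampled_matrix <- contact_matrix(ids)
round(resampled_matrix, 2)
```

Storing ids alongside each parameter sample is what makes it possible to rebuild the matching contact matrix later.
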
Posterior sample

Random draws from the posterior density function of the parameters (given the data).

Think of drawing from a normal distribution
Random draws are distributed according to the probability density

For each draw calculate outcome (i.e. number of cases)

Gives posterior distribution of outcomes

[Figure: posterior distribution of the outcome (axis label: qaly)]

Posterior is important

(co)variance
Outliers
Skewness

Risk assessment

Average outcome
Worst case

The inference method returns these posterior samples for all model parameters (transmission rate, susceptibility of the population, etc.).
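A small simulation shows why the whole posterior matters rather than just the mean parameter value. The draws and the outcome function below are hypothetical. The point is only that a nonlinear outcome evaluated at the posterior mean differs from the mean of the outcomes, and that skewness and the upper tail are only visible from the full sample.

```r
set.seed(7)

# Hypothetical posterior draws of a single parameter
draws <- rnorm(1000, mean = 1.5, sd = 0.3)

# Hypothetical nonlinear outcome (e.g. number of cases) for one parameter value
outcome <- function(r) 1000 * exp(r)

# One outcome per posterior draw: the posterior distribution of the outcome
outcomes <- outcome(draws)

summary_stats <- c(
  at_posterior_mean = outcome(mean(draws)),
  mean_outcome      = mean(outcomes),
  worst_case_95     = unname(quantile(outcomes, 0.95))
)

# Sample skewness of the outcome distribution
skewness <- mean((outcomes - mean(outcomes))^3) / sd(outcomes)^3

summary_stats
skewness
```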

Evaluate Vaccination scenarios

Vaccination strategies

vaccinationScenario(age_sizes = age_sizes[,1],
                    vaccine_calendar = vaccine_calendar,
                    polymod_data = as.matrix(polymod_uk),
                    contact_ids = inference.results$contact.ids[1000,],
                    parameters = inference.results$batch[1000,])

Input

Population size by age
Contact data
Results from inference
Vaccination strategy being evaluated

Vaccination Output

The disease burden for a given strategy and set of fitted model parameters (21 values, matching 7 age groups × 3 risk groups), e.g.: 101547, 632398, 1606080, 2923985, 4976025, 2751776, 569981, 2178, 36806, 174496, 278627, 504179, 616371, 466348, 0, 0, 0, 0, 0, 0 and 0
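The 21 values are easier to read as a table. The sketch below assumes they are ordered like the vaccine calendar (7 low risk age groups, then 7 high risk, then 7 pregnant women groups). That ordering is an assumption, not something stated for this output.

```r
burden <- c(101547, 632398, 1606080, 2923985, 4976025, 2751776, 569981,
            2178, 36806, 174496, 278627, 504179, 616371, 466348,
            0, 0, 0, 0, 0, 0, 0)

age_groups  <- c("<1", "1-4", "5-14", "15-24", "25-44", "45-64", "65+")
risk_groups <- c("low risk", "high risk", "pregnant")

# matrix() fills column by column, so each column holds one risk group
burden_matrix <- matrix(burden, nrow = 7,
                        dimnames = list(age_groups, risk_groups))

by_risk <- colSums(burden_matrix)  # total per risk group
by_age  <- rowSums(burden_matrix)  # total per age group

burden_matrix
by_risk
```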

Use all posterior parameters to get complete posterior distribution of disease burden.

Possible outcomes for all parameter samples (1000)
Average/worst case etc.
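Evaluating a scenario for every posterior sample follows a simple pattern: loop over the samples, compute the burden for each, then summarise. The sketch below is runnable without the package because it uses simulated parameters and a hypothetical scenario_stub in place of vaccinationScenario. With the package loaded, the stub would presumably be replaced by the call shown earlier, using row i of inference.results$batch and inference.results$contact.ids.

```r
set.seed(3)

# Hypothetical stand-in for inference.results$batch: one row per sample
batch <- cbind(q    = rnorm(1000, mean = 0.165, sd = 0.001),
               susc = rnorm(1000, mean = 0.57,  sd = 0.01))

# Hypothetical stand-in for vaccinationScenario(): returns 21 burden values
scenario_stub <- function(parameters) {
  total <- 1e6 * parameters[["q"]] * parameters[["susc"]]
  rep(total / 21, 21)
}

# Total burden for each posterior sample
total_burden <- apply(batch, 1, function(p) sum(scenario_stub(p)))

# Average and worst case over all samples
c(average = mean(total_burden), worst = max(total_burden))
```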

Cost effectiveness analysis

See Marc and Dominic