We combined the EUR and YRI data sets from:
Paper: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3918453/
DataFtp: http://jungle.unige.ch/~lappalainen/geuvadis/

To generate the positive set we started with the FDR5 data set and accepted only autosomal SNPs associated with exon and gene expression levels. Specifically, we excluded from both the positive and control sets:
- trQTLs
- non-SNV variants such as indels
- variants in annotated CDS

The resulting positive set had 6,520 unique, noncoding, autosomal positions.

The control set for the paper was generated from all dbSNP v137 SNPs. We also verified against a second control set consisting of all the SNPs tested as QTL candidates by Tuuli Lappalainen (9,7943,317 autosomal SNVs), less any variant that had a p < .05 correlation with transcription from an exon or gene. This removed all identified genic and exonic QTLs, as well as some positions that showed a correlation with expression but not a strong enough one to pass a statistical test. This left 7,645,533 positions.

-- Brad
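The filtering described above can be sketched roughly as follows. This is an illustration only, not the script that was actually used: the `Variant` record, its field names, the `qtl_type` labels, and both function names are hypothetical, and chromosome names are assumed to be plain strings such as "1" to "22" without a "chr" prefix.

```python
from dataclasses import dataclass

# Assumption: autosomes are named "1".."22" (no "chr" prefix).
AUTOSOMES = {str(n) for n in range(1, 23)}


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one tested variant (not the real file layout)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    qtl_type: str      # hypothetical labels: "gene", "exon", "trqtl", or "none"
    in_cds: bool       # True if the position falls in annotated CDS
    min_pvalue: float  # smallest p-value against any exon or gene


def is_snv(v: Variant) -> bool:
    # A single-nucleotide variant has one-base REF and ALT alleles.
    return (len(v.ref) == 1 and len(v.alt) == 1
            and v.ref in "ACGT" and v.alt in "ACGT")


def passes_shared_filters(v: Variant) -> bool:
    # Exclusions applied to both the positive and the control set.
    return (v.chrom in AUTOSOMES
            and is_snv(v)
            and not v.in_cds
            and v.qtl_type != "trqtl")


def _sorted_positions(variants):
    # Collapse to unique (chrom, pos) pairs, ordered by chromosome number.
    unique = {(v.chrom, v.pos) for v in variants}
    return sorted(unique, key=lambda cp: (int(cp[0]), cp[1]))


def positive_positions(fdr5_hits):
    """Unique noncoding autosomal SNV positions that are exon or gene QTLs."""
    kept = [v for v in fdr5_hits
            if passes_shared_filters(v) and v.qtl_type in {"gene", "exon"}]
    return _sorted_positions(kept)


def control_positions(tested, alpha=0.05):
    """Tested positions with no p < alpha correlation with any exon or gene."""
    kept = [v for v in tested
            if passes_shared_filters(v) and v.min_pvalue >= alpha]
    return _sorted_positions(kept)
```

Deduplicating on (chrom, pos) mirrors the "unique positions" wording: a SNP associated with both an exon and a gene is counted once.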