Type 2 diabetes (T2D) is phenotypically and genetically diverse and is a major global health concern, affecting more than 76 million Indians and over 408 million individuals worldwide (http://www.idf.org/atlasmap/atlasmap). The risk of an individual developing T2D reflects environmental influence against a background of genetic predisposition. Owing to the complex etiology of disease progression, the identification of genetic markers was slow until 2007. The advent of new technology in the form of microarray chips has led to the development of high-throughput genome-wide association studies (GWAS). Nevertheless, only a few variants in genes such as KCNJ11, PPARG, SLC30A8, and TCF7L2 were reported to be linked with T2D [1–3]. However, a GWAS in a French population by Sladek et al., 2007 reported variants in TCF7L2, SLC30A8 and HHEX as new loci for T2D. Subsequent GWAS in various populations have identified SNPs in several novel genes such as IGF2BP2, PPARG, FTO, CDKN2A, CDKAL1, KCNQ1 and JAZF1 as associated with T2D [5–9]. Palmer et al., 2012 reported variants of SLC44A3, RBM43, RND3, GALNTL4, TMEM45B and BARX2 as new susceptibility loci in an African-American population. To date, GWAS have identified >70 susceptibility loci associated with T2D [10–19]. In addition, two well-replicated genes, PPARG and KCNJ11, which were initially shown to be associated with T2D through candidate gene studies, were also confirmed through GWAS in European populations [7, 9]. Though the association of many common variants established by GWAS has been replicated in several Caucasian populations, the results are conflicting in Asian populations in general and in the Indian population in particular. Among all the association studies on T2D in the Indian population, only the variants in TCF7L2 (rs7903146, rs12243326 and rs4506565) have been consistently replicated and shown to be most promising [3, 20, 21].
meta-analyses in European-derived populations, power calculations (Tables S8 and S9) show that this study has greater than 80% statistical power to detect effects for common variants (MAF = 0.20) consistent with published effect sizes (OR = 1.28) for T2DM (e.g. transcription factor 7-like 2 (TCF7L2) and potassium voltage-gated channel, KQT-like subfamily, member 1 (KCNQ1), with ORs of 1.3–1.4) and more modest power (~70%) to detect effects for less common variants (MAF = 0.10). The power to detect and replicate moderate contributions to T2DM susceptibility should increase with meta-analysis of these GWAS data together with other GWAS currently being conducted in African-American populations. In addition, this study reports results from directly genotyped SNPs only. Effective imputation of additional SNPs would undoubtedly improve coverage of the African-American genome. While the recent development of imputation methods shows encouraging progress, rigorous empirical testing continues. A potential bias of the current study design is that the GWAS was conducted in an African-American population of individuals with type 2 diabetes and nephropathy; however, there is no specific reason why this African-American population should differ substantially from African Americans with T2DM without ESRD. For example, TCF7L2 is strongly associated in our studies of African-American T2DM-ESRD subjects [28,40]. It should also be noted that, although every precaution was taken to account for population structure, there may be residual population substructure, as with any GWAS or candidate gene study. The major strength of this study is the genotyping and replication in four additional populations, which provides support for the evidence of association observed. In addition, the study design, which includes individuals with T2DM and ESRD, allows for the identification of ESRD loci that are distinct from those presented herein (Table S10).
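The dependence of power on allele frequency described above can be illustrated with a rough calculation. The sketch below is not the method behind Tables S8 and S9; it assumes a simple two-sided two-proportion z-test on allele counts, and the function names, sample sizes, and significance level are hypothetical choices made only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_test_power(control_freq, odds_ratio, n_cases, n_controls, alpha):
    """Approximate two-sided power of a z-test comparing allele frequencies.

    Assumption: normal approximation, each individual contributes two
    independent alleles (i.e. Hardy-Weinberg equilibrium holds).
    """
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    se = sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(p1 - p0) / se
    # Probability of exceeding the critical value in either tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical design: 1,000 cases, 1,000 controls, alpha = 0.05.
    for maf in (0.20, 0.10):
        power = allelic_test_power(maf, 1.28, 1000, 1000, 0.05)
        print(f"MAF={maf:.2f}  OR=1.28  power={power:.2f}")
```

For a fixed odds ratio and sample size, the less common variant yields a smaller standardized frequency difference, which is why power drops as MAF decreases.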
We exploited eSNP/eQTL in multiple human tissues. Given that (1) disease-related human tissues are often difficult to obtain for research purposes; (2) eQTL analysis requires a large sample size to reach the statistical power necessary to observe subtle changes in gene expression; and (3) all of the selected candidate genes were expressed in bone tissues, we believe that performing eQTL analysis in multiple tissues, although not a replacement for eQTL analysis in bone tissue, does provide complementary information. Genetic control of biological functions may be tissue-specific. Analysis of cis-eQTL in the tissue type directly relevant to the phenotype has generally been shown to be more informative than the same analysis in unrelated tissue types (such as blood). However, studies have found that cis-eQTLs are conserved across tissues when genes are actively expressed in those tissues [10,12,13,41–44]. eQTL analyses in liver, adipose, brain and muscle tissues from the same individual mice suggested that, of the genes exhibiting significant cis-eQTL associations in one tissue, 63–88% (depending on tissue type) also exhibit cis-eQTL associations in another tissue. Two recent studies, quantifying allele-specific gene expression in four human cell lines (lymphoblastoid cells, two primary fibroblasts and primary keratinocytes) from the same individuals, observed that only 2.3–10% of the mRNA-associated SNPs showed tissue-specific cis-expression across these cell lines [43,44]. They also found that the variation of allelic ratios in gene expression among different cell lines was primarily explained by genetic variation, much more so than by specific tissue types or growth conditions. Among the highly heritable transcripts (within the upper 25th percentile for heritability), 70% of expression transcripts that had a significant cis-eQTL in adipose tissue also had a significant cis-eQTL in blood cells. Comparing eQTL in human primary fibroblasts, Epstein-Barr
The second and third simulations attempt to average over the possible outcomes of our future efforts to map causal mutations, to reveal the likely gains in our ability to stratify individuals on the basis of risk. These use the methodology above, under both prior distributions, to average over the posterior distribution of the allele frequency and effect size at the causal SNPs underlying reported GWAS loci for the three diseases. These adjusted estimates are also shown in Table 1. Across diseases we see a significant increase in the risk associated with carrying multiple risk variants. In particular, the biggest differences in risk are for individuals in the extreme tail. It is these individuals who carry the stronger, likely rarer, risk alleles that are currently insufficiently characterised by the most significant signal of association in some regions identified as important in disease. For example, the risk of an individual in the top 0.1% of the population for genetic risk, typed at the causal loci underlying currently known GWAS loci, will likely be increased by a factor of 3–6.5, 5–12, or 25–50, compared to an average individual, for breast cancer, type 2 diabetes, and Crohn’s disease, respectively. These are notably greater increases in risk than current predictions based on the hit SNPs from GWAS loci, which would be 2.4, 3.5 and 20, respectively.
Our study does not represent a common replication attempt to identify lipid loci in an independent population. Rather, this investigation was carefully carried out in this unique family-based cohort using a conservative statistical approach that applies score-based statistics to map quantitative lipid traits in a non-randomly ascertained dataset. Exceeding our expectations, this study has identified linkage regions, primarily for HDL cholesterol (10q21.1–21.2) and total cholesterol (22q13.32), that were previously reported for lipid traits or CVD. The most interesting aspect of this study is that some of these linkage signals also harbor important candidate loci (e.g., KIAA1462, PCDH15, PPARα, SLC16A9, and CELSR1) implicated in lipid traits in recent GWAS and meta-analysis studies, and some of these regions also overlap with prior linkage studies [55,56,57]. Therefore, our findings suggest that these regions might contain some novel genes for blood lipids rather than representing chance findings, and perhaps some of the loci may have larger effects in this Khatri Sikh cohort. Notably, the presence of an HDL cholesterol signal on chromosome 10q21.2 is particularly important in view of low HDL cholesterol-associated CVD risk in Asian Indian men in general, and may strongly relate to gene-environment interaction that is enhanced by a rapidly emerging western lifestyle [58,59]. Further fine mapping with more efficacious strategy using SNP-based
The power of our study to detect small effects in uncommon variants was low. Evidence from many recent studies now suggests that in type 2 diabetes the effects are likely in the range of OR 1.15–1.5. It is clear that much larger studies than that reported here are required for such effects (Figure 4), in particular when adjusting to a lower Type 1 error rate of 0.01% to compensate for multiple testing. The significance of the associations we report has been described without adjustment for the number of tests undertaken, and thus the group of positive associations is likely to contain a proportion that is falsely positive. There is no consensus about the ideal method for adjusting the probability of an observation occurring by chance for multiple testing. The simple Bonferroni correction would constitute overadjustment because the 152 genetic markers in this study are not independent. In addition, in the false-discovery rate method (Benjamini and Hochberg 1995), it is assumed that all N tests are carried out simultaneously, which may not correspond to reality if groups genotype one set of SNPs, as in this study, but then report results for additional SNPs at a later date. It is not clear whether the number of tests N should reflect the number to date or the number one might potentially undertake by continuing to work through projects like these. An alternative Bayesian approach leading to a ‘genome-wide’ significance level for association, such as has been done for whole-genome linkage studies (Lander and Kruglyak 1995), might be preferable. However, this also runs into difficulties. In studies that are not based on fine-mapping of linkage intervals, but rather on candidate genes selected on the basis of data from other studies, including previous reports of association, it is unclear what level of prior probability of association should be used.
As a result of this uncertainty about the appropriate method of correction for multiple testing, our preferred strategy is to report the number of tests done and to encourage readers to interpret the significance tests in that light, acknowledging that the results will require replication in other cohorts.
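To make the two corrections discussed above concrete, the following sketch contrasts a Bonferroni threshold with the Benjamini-Hochberg step-up procedure. It is illustrative only and was not applied in this study; the function names and the example p-values are hypothetical, and only the figure of 152 markers comes from the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values, q):
    """Return a list of booleans marking tests rejected at FDR level q.

    Step-up rule: find the largest rank k with p_(k) <= (k / m) * q and
    reject the k smallest p-values. Assumes all m tests are analysed together.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # 152 markers, as in the study; alpha = 0.05 is an assumed nominal level.
    print(f"Bonferroni threshold: {bonferroni_threshold(0.05, 152):.6f}")
    # Hypothetical p-values, for illustration only.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, 0.05))
```

Note that the Benjamini-Hochberg cutoff depends on the total number of tests m, which is exactly the quantity the text identifies as ambiguous when SNPs are genotyped and reported in stages.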
The “thrifty genotype” hypothesis proposes that the high prevalence of type 2 diabetes (T2D) in Native Americans and admixed Latin Americans has a genetic basis and reflects an evolutionary adaptation to a past low calorie/high exercise lifestyle. However, identification of the gene variants underpinning this hypothesis remains elusive. Here we assessed the role of Native American ancestry, socioeconomic status (SES) and 21 candidate gene loci in susceptibility to T2D in a sample of 876 T2D cases and 399 controls from Antioquia (Colombia). Although mean Native American ancestry is significantly higher in T2D cases than in controls (32% vs. 29%), this difference is confounded by the correlation of ancestry with SES, which is a stronger predictor of disease status. Nominally significant association (P<0.05) was observed for markers in TCF7L2, RBMS1, CDKAL1, ZNF239, KCNQ1 and TCF1, and a significant bias (P<0.05) towards OR>1 was observed for markers selected from previous T2D genome-wide association studies, consistent with a role for Old World variants in susceptibility to T2D in Latin Americans. No association was found with the only known Native American-specific gene variant previously associated with T2D in a Mexican sample (rs9282541 in ABCA1). An admixture mapping scan with 1,536 ancestry informative markers (AIMs) did not identify genome regions with significant deviation of ancestry in Antioquia. Exclusion analysis indicates that this scan rules out ~95% of the genome as harboring loci with ancestry risk ratios >1.22 (at P < 0.05).
The SNP rs339331 resides close to two genes, GPRC6A and RFX6. FASTSNP showed that rs339331 may lie in an intronic enhancer of the GPRC6A gene. The prostate does not express GPRC6A under normal conditions. Interestingly, however, GPRC6A is highly expressed in the Leydig cells of the testis, and mice deficient in Gprc6a show male feminization and a metabolic manifestation of higher circulating estradiol and reduced levels of testosterone. These two hormones are critical for the initiation and progression of prostate cancer. As further supporting evidence for the significant association, GPRC6A is functionally important in regulating non-genomic effects of androgens in multiple tissues. rs339331 resides in chromosome region 6q22, which was found to be a susceptibility locus for prostate cancer in US Whites.
The findings corroborate studies on the relevance of type 2 diabetes mellitus in Brazil and worldwide in recent decades. The YLD rate per 1,000 inhabitants is more than half the rate of the entire group that includes infectious and parasitic diseases, maternal causes, perinatal causes, and nutritional deficiencies. The findings thus have implications for planning actions in the Brazilian health system. Since diabetes is a primary care-sensitive condition, it is hoped that strengthening primary care by including relatively simple preventive and curative measures will positively impact the diagnosis and follow-up of individuals with diabetes, thus preventing diabetes mellitus and chronic complications or delaying the latter’s progression, helping to enhance care and quality of life for these patients.
Investigations of the TNFA -308 G/A polymorphism in patients with insulin resistance, diabetes, and obesity have yielded controversial results. Chang et al. (2005) observed a relationship between this polymorphism, insulin resistance, and gestational diabetes (GD). In agreement, a systematic review identified an association between this polymorphism and the risk of developing obesity and of producing high insulin levels, suggesting that this gene may also be involved in the pathogenesis of metabolic syndrome (Sookoian et al., 2005), although a recent study did not find an association of this polymorphism with GD (Montazeri et al., 2010). The substitution of base G by A in this polymorphism has been associated with higher production of TNFA, with the AA genotype associated with the highest production of this cytokine (Wilson et al., 1997). Recent studies have suggested an association between this polymorphism and susceptibility to type 1 diabetes mellitus (T1DM) (Das et al., 2006; Shin et al., 2008; Settin et al., 2009). This polymorphism also appears to be associated with complications in people with type 2 diabetes, but not in cases of T1DM (Lindholm et al., 2008).
SNP genotyping was performed using the Amplification Refractory Mutation System-PCR (ARMS-PCR) method. The PCR reaction was carried out in 20 µl of solution containing standard 10x PCR buffer, 200 ng DNA, 0.2 µM of each primer, 1.5 mM MgCl2, 200 µM dNTP mix, and 0.5 U Taq DNA polymerase. Cycling conditions were initial denaturation (94ºC, 3 min) followed by 40 cycles of 94ºC for 30 sec, 64ºC for 30 sec, and 72ºC for 30 sec, and a final extension at 72ºC for 10 min. The two alleles were amplified in separate PCR reactions with allele-specific primers for the A and G alleles (Table I). The amplified DNA fragments were separated on a 2% (w/v) agarose gel and viewed after staining with ethidium bromide. A 100 bp DNA ladder was used as a marker to estimate the size of the PCR products. The samples were genotyped and classified into one of the three possible genotypes: AA, AG, or GG. To confirm the ARMS-PCR results, the DNA of 3 samples (one sample from each of the AA, AG and GG genotypes) was sequenced by the Sanger sequencing method and analyzed with the CodonCode Aligner software (version 6.0.2).
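The genotype classification described above follows directly from which of the two allele-specific reactions yields a band on the gel. A minimal sketch of that decision logic is given below; the function name and the boolean band-scoring inputs are hypothetical, and treating a sample with no band in either reaction as a failed amplification is an assumption rather than something stated in the protocol.

```python
from collections import Counter


def call_arms_genotype(band_in_a_reaction, band_in_g_reaction):
    """Classify a sample from the two allele-specific ARMS-PCR reactions.

    Each argument is True if a product of the expected size was seen in the
    reaction run with the A-specific or G-specific primer, respectively.
    Returns "AA", "AG", "GG", or None when neither reaction amplified
    (assumed here to indicate a failed PCR that should be repeated).
    """
    if band_in_a_reaction and band_in_g_reaction:
        return "AG"
    if band_in_a_reaction:
        return "AA"
    if band_in_g_reaction:
        return "GG"
    return None


if __name__ == "__main__":
    # Hypothetical gel readings: (A-reaction band, G-reaction band) per sample.
    gel_readings = [(True, False), (True, True), (False, True), (False, False)]
    calls = [call_arms_genotype(a, g) for a, g in gel_readings]
    print(calls)
    print(Counter(c for c in calls if c is not None))
```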
In addition to examining the role of host genetics in determining gut microbiome composition, we also identified bacterial taxa that are differentially abundant by sex. In the Hutterites, at least four bacterial taxa differ in abundance between the sexes each season, including genus Scardovia, genus Gordonibacter, genus Anaerotruncus, and phylum Proteobacteria. These abundance differences are directionally consistent across seasons, and there are a number of hypotheses for these observations. One potential explanation is that inherent biological differences between the sexes (for instance, hormone levels) could drive the observed bacterial abundance differences. Alternatively, the division of labor could drive sex-specific differences between men and women in Hutterite society. For example, Hutterite men typically work in the income-generating jobs, which vary by colony. Younger men might work in the fields, barns, or machine shops, while older men take on positions of leadership in the colonies. In contrast, Hutterite women perform family, domestic, and food preparation jobs, including cooking, cleaning, gardening, and sewing. It is possible that men and women are exposed to different environmental microbes due to differences in their daily activities. A similar notion was suggested previously in a study of Hadza hunter-gatherers, where sex differences in the relative abundances of three taxa of the gut microbiome were observed. The authors attributed those differences to the division of labor between men and women in that society (men tend to forage further from camp and for different food sources than women, who remain near camp to stay with the children).
Although genome-wide association studies have identified many risk loci associated with colorectal cancer, the molecular basis of these associations is still unclear. We aimed to infer biological insights and highlight candidate genes of interest within GWAS risk loci. We used an in silico pipeline based on functional annotation, quantitative trait loci mapping of cis-acting genes, PubMed text-mining, protein-protein interaction studies, genetic overlaps with cancer somatic mutations and knockout mouse phenotypes, and functional enrichment analysis to prioritize the candidate genes at the colorectal cancer risk loci. Based on these analyses, we observed that these genes were the targets of approved therapies for colorectal cancer, and suggested that drugs approved for other indications may be repurposed for the treatment of colorectal cancer. This study highlights the use of publicly available data as a cost-effective solution to derive biological insights, and provides empirical evidence that the molecular basis of colorectal cancer can provide important leads for the discovery of new drugs.
Thus, an interesting alternative for controlling GNI infestation in small ruminants is the selection of genetically resistant animals, together with the identification of chromosome regions controlling the resistance mechanisms. Once identified, these regions can be used in breeding programs through marker-assisted selection and gene introgression approaches. Molecular genetics has made great advances due to the development of dense panels of single nucleotide polymorphism (SNP) markers. Given this large amount of genomic information, some markers may lie close to relevant genes, making it possible to relate them to measurable traits through genome-wide association studies (GWAS). In relation to GNI, one general trait that can be used in GWAS is the fecal egg count (FEC), a simple procedure that can be performed on herds to approximate the parasite load that animals are carrying.
The genetic improvement of reproductive traits such as the number of teats is essential to the success of the pig industry. In contrast to most SNP association studies, which consider continuous phenotypes under Gaussian assumptions, this trait is characterized as a discrete variable, which could potentially follow other distributions, such as the Poisson. Therefore, to handle the complexity of a count-data random regression that considers all SNPs simultaneously as covariates in a GWAS model, Bayesian inference tools become necessary. Another point that currently deserves to be highlighted in GWAS is the genetic dissection of complex phenotypes through candidate gene networks derived from significant SNPs. We present a full Bayesian treatment of SNP association analysis for number of teats, assuming alternatively Gaussian and Poisson distributions for this trait. Under this framework, significant SNP effects were identified by hypothesis tests using 95% highest posterior density intervals. These SNPs were used to construct an associated candidate gene network aiming to explain the genetic mechanism behind this reproductive trait. The Bayesian model comparisons based on the deviance posterior distribution indicated the superiority of the Gaussian model. In general, our results suggest the presence of 19 significant SNPs, which mapped to 13 genes. In addition, we predicted gene interactions through networks that are consistent with known mammalian breast biology (e.g., development of prolactin receptor signaling, and cell proliferation), captured known regulatory binding sites, and provided candidate genes for the trait (e.g., TINAGL1 and ICK).
First and Second Screenings, and Replication Study
In the first screening, we performed a GWAS using a discovery cohort of 286 cases and 557 controls, each of which had passed the sample quality control (QC) criteria. We applied SNP QC [minor allele frequency (MAF) ≥0.05, call rate ≥0.98, Hardy-Weinberg equilibrium p-value ≥0.001 in controls, and visual cluster removal] and selected 531,009 SNPs for the first screening. We generated a quantile-quantile plot to inspect possible population stratification effects and obtained a genomic inflation factor (λ) of 1.046, indicating no substantial population substructure (Figure S1). However, none of the SNPs reached genome-wide significance in the first screening. The resulting Manhattan plot is shown in Figure 1.
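The SNP QC thresholds and the genomic inflation factor described above can be expressed compactly in code. The sketch below is not the pipeline used in the study: the function names are hypothetical, the Hardy-Weinberg check is assumed to be a 1-df chi-square goodness-of-fit test (the text does not say which test was used), and λ is assumed to be the median of the 1-df association chi-square statistics divided by the median expected under the null. Visual cluster inspection is a manual step and is not represented.

```python
from math import erfc, sqrt
from statistics import NormalDist, median


def hwe_p_value(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2.0))


def passes_snp_qc(maf, call_rate, hwe_p_controls):
    """Apply the thresholds quoted in the text: MAF, call rate, HWE in controls."""
    return maf >= 0.05 and call_rate >= 0.98 and hwe_p_controls >= 0.001


def genomic_inflation(chi2_stats):
    """Lambda: median observed 1-df chi-square over its null median (~0.455)."""
    null_median = NormalDist().inv_cdf(0.75) ** 2
    return median(chi2_stats) / null_median


if __name__ == "__main__":
    # Hypothetical genotype counts in controls and hypothetical QC metrics.
    p_hwe = hwe_p_value(140, 280, 137)
    print(passes_snp_qc(maf=0.31, call_rate=0.995, hwe_p_controls=p_hwe))
    # Hypothetical association statistics for a handful of SNPs.
    print(round(genomic_inflation([0.1, 0.3, 0.46, 0.9, 2.5]), 3))
```

A value of λ close to 1, such as the 1.046 reported here, means the bulk of the test statistics follow the null distribution, which is why it is read as little evidence of stratification.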
adenocarcinoma were obtained from Biobank Japan (http://biobankjp.org) at the Institute of Medical Science, The University of Tokyo, and from the National Cancer Center Hospital, respectively. The control samples consisted of Japanese volunteers recruited through the Osaka-Midosuji Rotary Club, Osaka, Japan (n = 906), and staff members of Keio University, Japan, who participated in its health-check program (n = 677). In addition, individuals who were registered in Biobank Japan as subjects with various diseases other than cancer (n = 3,728) (pulmonary tuberculosis, chronic hepatitis B, keloid, drug-induced skin rash, peripheral artery disease, arrhythmia, stroke and myocardial infarction) were used as controls. All samples were collected after written informed consent had been obtained. This project was approved by the ethics committees of the Institute of Medical Science, The University of Tokyo, the National Cancer Center, and Keio University. Individuals who had a clinical history of diabetes mellitus (a possible confounding factor for pancreatic cancer) were excluded from these control sets. For sample quality control, we excluded five cases with a call rate <0.98. After performing principal component analysis, we excluded 10 cases and 102 controls as outliers that did not belong to the major Japanese cluster (Hondo cluster) (Figure S1). We eventually performed the association study based on 991 cases and 5,209 controls (Table S1). Power calculation showed that our study
Figure 3. Regional association plots for three pancreatic cancer risk loci. (a) 6p25.3 region, SNP rs9502893 is located 25 kb upstream of gene FOXQ1. (b) 12p11.21 region, SNP rs708224 is located in the second intron of gene BICD1. (c) 7q36.2 region, SNP rs6464375 is located in the first intron of gene DPP6 transcript variant 3. Each of the marker SNPs is marked by a blue diamond. SNPs that are genotyped on the Illumina platform are plotted as diamonds; imputed SNPs are plotted as circles.
The color intensity reflects the extent of LD with the marker SNP, red (r 2
Obesity and type 2 diabetes are highly prevalent worldwide [1,4,7]. Obesity-associated insulin resistance is a major risk factor leading to type 2 diabetes [4,6,8]. Evidence has shown that genetic loci related to obesity could contribute to the risk for type 2 diabetes [10,11,13–16,18,20–26]. For example, allele A of SNP rs9939609 in the FTO gene was reported to be associated with both increased BMI in various populations and elevated risk for type 2 diabetes [31–33]. During recent decades, genetic studies have identified multiple genetic susceptibility loci related to obesity [2,9]. Although many studies have attempted to investigate the relationship between some obesity-related genetic loci and type 2 diabetes in different ethnicities, their associations are still far from fully understood [11–27,34]. Notably, previous studies conducted in Chinese populations have shown inconsistent results [11–14,29,30]. Thus, it is worthwhile to examine the associations between obesity-related SNPs and type 2 diabetes in a large sample of a Han Chinese population.
New sources of genetic diversity must be incorporated into plant breeding programs if they are to continue increasing grain yield and quality, and tolerance to abiotic and biotic stresses. Germplasm collections provide a source of genetic and phenotypic diversity, but characterization of these resources is required to increase their utility for breeding programs. We used a barley SNP iSelect platform with 7,842 SNPs to genotype 2,417 barley accessions sampled from the USDA National Small Grains Collection of 33,176 accessions. Most of the accessions in this core collection are categorized as landraces or cultivars/breeding lines and were obtained from more than 100 countries. Both STRUCTURE and principal component analysis identified five major subpopulations within the core collection, mainly differentiated by geographical origin and spike row number (an inflorescence architecture trait). Different patterns of linkage disequilibrium (LD) were found across the barley genome, and many regions of high LD contained loci for traits involved in domestication and breeding selection. The genotype data were used to define ‘mini-core’ sets of accessions capturing the majority of the allelic diversity present in the core collection. These ‘mini-core’ sets can be used for evaluating traits that are difficult or expensive to score. Genome-wide association studies (GWAS) of ‘hull cover’, ‘spike row number’, and ‘heading date’ demonstrate the utility of the core collection for locating genetic factors determining important phenotypes. The GWAS results were referenced to a new barley consensus map containing 5,665 SNPs. Our results demonstrate that GWAS and high-density SNP genotyping are effective tools for plant breeders interested in accessing genetic diversity in large germplasm collections.
This unique structure of dog breeds has been useful in determining the genetic basis for many desirable and deleterious traits, an effort facilitated by new genetic technologies emanating from the sequencing of the canine genome. Whole-genome association studies that utilize single nucleotide polymorphism (SNP) markers have been used to identify the molecular causes of various traits and conditions, including genetic mutations within breeds that cause coat color variations, hairlessness and defects in spinal development. Trait identification involves single breeds that segregate the trait of interest, followed by fine-structure mapping using additional breeds that segregate the trait. This two-stage mapping approach has been very successful and requires modest numbers of individuals. Precise mapping of major loci responsible for trait variation has been accomplished with relatively small numbers of dogs as compared to the large numbers of individuals required for trait identification