Suicide and suicide attempts are complex behaviors that result from the interaction of different factors, including genetic variants that increase the predisposition to suicidal behaviors. Copy number variations (CNVs) are deletions or duplications of a segment of DNA usually larger than one kilobase. These structural genetic changes, although quite rare, have been associated with genetic liability to mental disorders, such as autism, schizophrenia, and bipolar disorder. No genome-wide level studies have been published investigating the potential role of CNVs in suicidal behaviors. Based on single-nucleotide polymorphism array data, we followed the Penn-CNV standards to detect CNVs in 1,608 subjects, comprising 475 suicide and suicide attempt cases and 1,133 controls. Although the initial algorithms determined the presence of CNVs on chromosomes 6 and 12 in seven and eight cases, respectively, compared with none of the controls, visual inspection of the raw data did not support this finding. Furthermore, we were unable to validate these findings by CNV-specific real-time polymerase chain reaction. Additionally, rare CNV burden analysis did not find an association between the frequency or length of rare CNVs and suicidal behavior in our sample population. Although our findings suggest CNVs do not play an important role in the etiology of suicidal behaviors, they are not inconsistent with the strong evidence from the literature suggesting that other genetic variants account for a portion of the total phenotypic variability in suicidal behavior.
Findings from GWAS provide valuable genetic information on trait architecture or candidate loci for subsequent validation. Preliminary GWAS analysis should be complemented by statistical procedures that help prioritize GWAS results, such as pathway analysis to rank genes and pathways within a biological context. Additional follow-up analyses and experiments may even be required to pinpoint the causal genes. In the present study, the SNPs strongly associated with the traits analyzed and/or the SNPs whose allelic variants contributed the larger phenotypic effects should be prioritized as candidate genomic regions for marker development to support selection activities, especially for quality-related traits that are difficult to measure or assess. The final objective is to develop the expedited tools necessary to implement routine quality selection (such as for “breadability”) in maize breeding programs. As an example, by using marker-assisted selection, a few nutritional trait-associated genes or QTLs (for maize protein quality, oil content and provitamin A levels) have recently been introgressed into elite maize lines to improve their quality.
and risk of glucose intolerance and type 2 diabetes, SGK1 with insulin secretion in type 2 diabetes. Frequency of eating and meal time are important indicators of eating behavior in humans. The QTL for NVD was homologous to the HSA 17q21 region, which contains many obesity candidate genes including PPY, PON1 and 2, GAST, PNMT, STAT3 and HCRT. Moreover, some of these genes have been found to play very important roles in controlling feed intake in both human and animal models. For instance, the HCRT gene encodes a hypothalamic neuropeptide precursor protein that gives rise to two mature neuropeptides, orexin A and orexin B, which stimulate feed intake in rats. Peptide YY (PYY) also plays a very important role in energy homeostasis by balancing food intake, acting as an “ileal brake” that leads to a sensation of fullness and satiety. Other homologous regions, including HSA 4q22–24, HSA 13q31–32 and HSA 17p13, also contain a number of candidate genes for obesity/metabolic syndrome and eating behavior in both humans and animals. For instance, the microsomal triglyceride transfer protein (MTTP) gene located in HSA 4q24 was found to be a candidate gene for obesity in humans. Inhibition of this gene by JTT-30 was also found to suppress food intake in rats. The function of the MTTP gene in feed intake may be due to its involvement in the gut leptin-melanocortin pathway. Although pigs and humans have similar genetic structure, comparative genomic mapping between these species is limited in the accuracy of the homologous regions it identifies. This limitation can be overcome by fine mapping or meta-analysis of QTL in each species and by taking systems biology approaches that link genomic regions with phenotypes through transcriptomics to detect potential causal genes ([5–6]).
Nevertheless, the results of comparative QTL mapping from this study are useful for understanding the genetic background of eating behavior in humans (more QTL for traits) as well as in pigs (more candidate genes with functional validations).
We obtained Illumina NGS data for four taurine (three unrelated Angus, one Holstein) and one indicine (Nelore) cattle (Supplemental Table S1). Additionally, we simulated NGS reads using Sanger sequence reads of the sequenced cow, L1 Dominette 01449, a Hereford cow of European descent, and named the result DTTRACE. The amount of sequence data for each animal varied from 4× (Hereford and Holstein) to nearly 20× (Angus and Nelore) coverage, allowing sufficient power to detect CNVs >20 kbp in length (Table 1). Since two of our animals (Holstein and Hereford) were sequenced primarily as single sequence reads, and we aimed to provide absolute genome-wide gene copy number estimates in this study, we used an RD detection method similar to that previously described (Alkan et al. 2009). See Methods for full details of mrsFAST alignment and WSSD CNV discovery parameters. Based on sequence RD against the reference genome (Alkan et al. 2009; Sudmant et al. 2010), we detected a total of 1265 unique CNV regions (CNVRs) across all analyzed individual animals (average length = 49.1 kbp), amounting to 55.6 Mbp of variable sequence or 2.1% of the cattle genome (Fig. 1). A full list of CNV calls can be found in Supplemental Table S2. As expected, the “uncharacterized chromosome” (chrUn), which consists of sequence that cannot be uniquely mapped to the genome, contains much variable polymorphic sequence (Liu et al. 2009). Our analysis indicated that 36.7 Mbp of chrUn (944 regions) may be copy number variable between individuals. Due to the shorter
We performed a genome-wide analysis of 164 urothelial carcinoma samples and 27 bladder cancer cell lines to identify copy number changes associated with disease characteristics, and examined the association of amplification events with stage and grade of disease. Multiplex inversion probe (MIP) analysis, a recently developed genomic technique, was used to study 80 urothelial carcinomas to identify mutations and copy number changes. Selected amplification events were then analyzed in a validation cohort of 84 bladder cancers by multiplex ligation-dependent probe assay (MLPA). In the MIP analysis, 44 regions of significant copy number change were identified using GISTIC. Nine gene-containing regions of amplification were selected for validation in the second cohort by MLPA. Amplification events at these 9 genomic regions were found to correlate strongly with stage, being seen in only 2 of 23 (9%) Ta grade 1 or 1–2 cancers, in contrast to 31 of 61 (51%) Ta grade 3 and T2 grade 2 cancers, p < 0.001. These observations suggest that analysis of genomic amplification of these 9 regions might help distinguish non-invasive from invasive urothelial carcinoma, although further study is required. Both MIP and MLPA methods perform well on formalin-fixed paraffin-embedded DNA, enhancing their potential clinical use. Furthermore, several of the amplified genes identified here (ERBB2, MDM2, CCND1) are potential therapeutic targets.
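The stage comparison above (2 of 23 versus 31 of 61 cancers with amplification) can be checked with a 2×2 contingency test. The following is a minimal sketch that assumes a two-sided Fisher's exact test; the excerpt does not state which test the authors used, and the function and variable names are illustrative only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x column-1 events in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


# Counts reported in the text: tumors with amplification / total in each group.
low_amp, low_total = 2, 23      # Ta grade 1 or 1-2
high_amp, high_total = 31, 61   # Ta grade 3 and T2 grade 2

p_value = fisher_exact_two_sided(
    low_amp, low_total - low_amp,
    high_amp, high_total - high_amp,
)
print(f"{low_amp / low_total:.0%} vs {high_amp / high_total:.0%}, p = {p_value:.2g}")
```

The exact p-value depends on the test chosen, so this sketch is only a plausibility check on the reported proportions, not a reproduction of the published analysis.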
First, we note that all tested algorithms were able to detect large-scale genomic aberrations ranging from a 14 Mb deletion to a whole chromosome triplication. We therefore conclude that the SNP-Array under study can be used in cytogenetic research. Yet, as depicted in Figure 1, CNV-finding algorithms may vary considerably in sensitivity and specificity, and we therefore recommend a combination of different algorithms to facilitate interpretation of findings. Given this varying accuracy of different CNV-detection algorithms in large-scale genomic aberrations, caution is warranted in interpreting findings from whole genome CNV screenings based on SNP-genotyping data. Common CNVs may comprise only a few markers, which significantly decreases the signal-to-noise ratio compared with large-scale genomic aberrations. In order to improve reliability of CNV detection in single individuals, we suggest that detection algorithms should be capable of exploiting prior knowledge about CNV base rates in a given genomic region. To provide information on localization of CNVRs and probability measures

Figure 5. CNVR-length, gene content and frequency distributions. CNVRs are plotted according to CNVR-map (color), length (y-axis), frequency of CNVs per CNVR (x-axis; at least 2 overlapping CNVs had to be present to form a CNVR in the population specific maps, a CNVR in the CP-Map is constructed of at least 262 CNV events) and number of RefSeq genes affected (circle size).
We constructed a 400K WG tiling oligoarray for the horse and applied it for the discovery of copy number variations (CNVs) in 38 normal horses of 16 diverse breeds, and the Przewalski horse. Probes on the array represented 18,763 autosomal and X-linked genes, and intergenic, sub-telomeric and chrY sequences. We identified 258 CNV regions (CNVRs) across all autosomes, chrX and chrUn, but not in chrY. CNVs comprised 1.3% of the horse genome with chr12 being most enriched. American Miniature horses had the highest and American Quarter Horses the lowest number of CNVs in relation to Thoroughbred reference. The Przewalski horse was similar to native ponies and draft breeds. The majority of CNVRs involved genes, while 20% were located in intergenic regions. Similar to previous studies in horses and other mammals, molecular functions of CNV-associated genes were predominantly in sensory perception, immunity and reproduction. The findings were integrated with previous studies to generate a composite genome-wide dataset of 1476 CNVRs. Of these, 301 CNVRs were shared between studies, while 1174 were novel and require further validation. Integrated data revealed that to date, 41 out of over 400 breeds of the domestic horse have been analyzed for CNVs, of which 11 new breeds were added in this study. Finally, the composite CNV dataset was applied in a pilot study for the discovery of CNVs in 6 horses with XY disorders of sexual development. A homozygous deletion involving AKR1C gene cluster in chr29 in two affected horses was considered possibly causative because of the known role of AKR1C genes in testicular androgen synthesis and sexual development. While the findings improve and integrate the knowledge of CNVs in horses, they also show that for effective discovery of variants of biomedical importance, more breeds and individuals need to be analyzed using comparable methodological approaches.
important complement to the CNV map of the pig genome. Validation of 12 of these CNVRs produced a confirmation rate (66.67%) similar to that of previous CNV studies based on SNP arrays. Functional annotation revealed that the CNVRs identified have important molecular functions, may play an important role in phenotypic variation, and are often related to disease susceptibility. However, only large CNVRs (average length 158.37 kb) were identified using this SNP panel. According to the size distribution statistics of human CNVs in the Database of Genomic Variants (http://dgvbeta.tcag.ca/dgv/app/home?ref=NCBI36/hg18), CNVRs are most abundant in the 1 to 10 kb range, and the number of CNVRs decreases gradually as the CNVR length becomes larger or smaller than this range. Thus, the number of CNVs identified in this study is likely to be a great underestimate of the true number of CNVs in these pig genomes. Follow-up studies, using improved SNP arrays as well as other technologies, such as aCGH and next-generation sequencing, should be carried out to attain a high-resolution CNV map.
and some other tumors. To date, this region has not been reported to be associated with any vascular diseases or phenotypes. Our initial association screen also identified other BAVM-associated CNVRs mapping to chr15q11, 6q16 and 16p11. However, the association with BAVM did not persist in multivariate models adjusting for age, sex and the top 3 principal components, utilizing CNV calls from both PennCNV and Birdsuite. Further, none of these CNVRs overlapped genes associated with BAVM using the gene-based approach in both algorithms. The chromosome 15q11.2 CNVR identified in our study did not overlap the linkage region on 15q11-q13 reported in non-HHT familial BAVM patients. The small deletion on 6q16.3 showed poor concordance between CNV-calling algorithms and did not overlap any genes. Due to the highly repetitive

Table 1. BAVM-associated CNVRs (PennCNV).
1) Discovery sample. The discovery sample consisted of 2,286 unrelated Caucasian subjects who were recruited in the Midwestern US in Kansas City, Missouri and Omaha, Nebraska. All identified subjects were of European origin. Subjects with certain conditions were excluded, including chronic disorders involving vital organs (heart, lung, liver, kidney, brain), serious metabolic diseases (diabetes, hypo- and hyper-parathyroidism, hyperthyroidism, etc.), skeletal diseases (Paget disease, osteogenesis imperfecta, rheumatoid arthritis, etc.), chronic use of drugs affecting bone metabolism (hormone replacement therapy, corticosteroid therapy, anticonvulsant drugs), and malnutrition conditions (such as chronic diarrhea, chronic ulcerative colitis, etc.).
existence of mutations in glutamate receptors was described in prior exome sequencing studies, and our data not only confirmed that GRIN2A was mutated in melanoma (5 out of 28 cases) but also showed that GRIN2B was recurrently mutated (Figure 2). In addition, a number of mutations have also been found in other metabotropic glutamate receptors, such as GRM1 and GRM3-8. Specifically, out of 23 nonsynonymous mutations from GRM genes, one nonsense and four missense mutations were from GRM3, previously shown to harbor activating mutations in melanomas. The observed mutation rate was 0.22 to 143 mutations per Mbp in the TCGA dataset compared to 3 to 155 mutations per Mbp in our 15 whole genome sequenced samples. In addition to the similar distribution of mutation rates, we also observed recurrent single nucleotide variants including S225F and G394E in EPHA7 and G114E and R136* in EPHA3 from both datasets. The comparison of the number of mutations in significant genes between this study and the TCGA report is shown in Table S15 in File S1.
The overall system design of microPIR is illustrated in Figure 1. From the bottom of this figure, three data sources, 1) target site prediction, 2) supporting genomic information for predicted targets, and 3) other genome annotations, were preprocessed/collected and incorporated into the local MySQL databases. MySQL version 5.5.1 was employed to manage all the predicted target entries as well as their supporting information. To make the database accessible for public use, the web interface, including submission forms and graphical outputs, was constructed using Python scripts and toolkits from Python Webware (http://www.webwareforpython.org). The web interface offers three main functionalities. First, the search feature allows users to locate miRNA target sites of interest. Data can be queried through the web form, which formulates the corresponding SQL queries to MySQL using the Python scripting language. We used the Python MySQLdb module to connect (sending and receiving SQL queries/outputs) to the MySQL database backend. Second, the viewing feature enables users to graphically view the location of miRNA target sites within a specific locus via a genome browser interface. Third, the statistics module summarizes the information contained within this database for the genomic region of interest. The link-out module is provided for users to cross-check with other related databases, e.g. gene ontology and original data sources. Furthermore, for the sake of convenience, a primer design section is included to assist in validating potential target genes. The whole microPIR framework runs on our 12-core database server (2 AMD 6-core (2.8 GHz) processors with 64 Gigabytes of RAM and 2 Terabytes of hard disk space).
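The search path described above, in which a web form is turned into an SQL query, can be sketched as follows. This is a minimal illustration and not the microPIR code: the table name `target_sites`, its columns, and the example rows are hypothetical, and the standard-library `sqlite3` module stands in for MySQLdb so that the sketch is self-contained. With MySQLdb the same parameterized pattern applies, with `%s` placeholders in place of `?`.

```python
import sqlite3


def find_target_sites(conn, mirna, chrom=None):
    """Return predicted target sites for one miRNA, optionally limited to a chromosome."""
    sql = ("SELECT mirna, gene, chrom, site_start, site_end "
           "FROM target_sites WHERE mirna = ?")
    params = [mirna]
    if chrom is not None:
        sql += " AND chrom = ?"
        params.append(chrom)
    sql += " ORDER BY chrom, site_start"
    # Form values are passed as parameters, never concatenated into the SQL string.
    return conn.execute(sql, params).fetchall()


# Hypothetical schema and rows, used only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target_sites "
    "(mirna TEXT, gene TEXT, chrom TEXT, site_start INTEGER, site_end INTEGER)"
)
conn.executemany(
    "INSERT INTO target_sites VALUES (?, ?, ?, ?, ?)",
    [
        ("miR-example-1", "GENE_A", "chr1", 100, 121),
        ("miR-example-1", "GENE_B", "chr2", 500, 522),
        ("miR-example-2", "GENE_A", "chr1", 300, 321),
    ],
)

hits = find_target_sites(conn, "miR-example-1", chrom="chr1")
all_hits = find_target_sites(conn, "miR-example-1")
print(hits)
```

Passing user input as query parameters rather than formatting it into the SQL text is the usual way to keep a public search form safe from SQL injection.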
Cyberbullying is a new form of violence that is expressed through electronic media and has given rise to concern for parents, educators and researchers. In this paper, an association between cyberbullying and adolescent mental health will be assessed through a systematic review of two databases: PubMed and Virtual Health Library (BVS). The prevalence of cyberbullying ranged from 6.5% to 35.4%. Previous or current experiences of traditional bullying were associated with victims and perpetrators of cyberbullying. Daily use of three or more hours of Internet, web camera, text messages, posting personal information and harassing others online were associated with cyberbullying. Cybervictims and cyberbullies had more emotional and psychosomatic problems and social difficulties, and did not feel safe and cared for in school. Cyberbullying was associated with moderate to severe depressive symptoms, substance use, suicidal ideation and suicide attempts. Health professionals should be aware of the violent nature of interactions occurring in the virtual environment and its harm to the mental health of adolescents.
Information on resistance in this diverse group of Andean bean lines will be useful in future breeding efforts to develop anthracnose resistant cultivars depending on the prevailing races in a region. Finding resistance in adapted Andean lines with favorable agronomic and seed traits could have important implications and applications for breeders within target countries. Not only does it help maintain bean diversity through additional resistance options, but it also allows for a more rapid introgression of resistance into future Andean bean cultivars. A lack of information on the physical position of markers linked to the major resistance genes in the published literature prevents a final determination of co-localization between results from the GWAS and the presumed location of many anthracnose resistance genes. In this study, new sources of anthracnose resistance in Andean beans were discovered on Pv02, Pv10, and Pv11, as well as a unique location on Pv04. Breeders will need to identify the most effective resistance gene or allele at these loci prior to pyramiding genes from different chromosomes for more durable resistance. The physical position and the candidate genes identified in the current study will serve as a basis for developing functional markers to facilitate this effort. The resistance deployed in the MSU breeding program has largely been assumed to be controlled by the Co-1 gene and that assumption was confirmed in this study. A major putative QTL for resistance to anthracnose in both Andean and Mesoamerican beans was identified on Pv01 adjacent to SNP ss715645251 at 50.30 Mb within the 58 kb region (50.26–50.32 Mb) where the Co-x was mapped. It is likely that this region corresponds to the major Co-1 resistance cluster, including the Co-x resistance gene.
The identification of an InDel marker (50.22 Mb) tightly linked to four alleles at the Co-1 locus will be especially useful for the continued effort of breeders in developing countries as it can be utilized for marker assisted breeding in labs where resources are limiting.
Computational goals of the proposed study were to identify the most relevant features and address the class imbalance problem. The genomic characteristics of the DMRs are used as features for the learners. Active learning intelligently chooses the best instances / features to learn from [53, 54]. The approach uses Generalized Query Based Active Learning (GQAL), which can not only choose the best features to learn from but also select the most relevant features for a given instance. This is accomplished by constructing intelligent queries, removing irrelevant features from each query so that an Oracle (e.g., a human expert) can answer it easily. This approach allows the learner to label multiple instances at the same time instead of labeling one instance per query. In addition, instead of using a global feature reduction (where a set of features is removed at the beginning of training), GQAL uses a subset of features at each iteration through local feature selection. This exploits the full potential of the features and maximizes the use of each feature subset for learning. The GQAL approach has been tested on 13 datasets besides epigenetics and compared with 3 other classifiers (KNN, SVM and NB) and later with AdaBoost, Decision Trees, RandomForest and Logistics, and GQAL was found to be the most efficient for the epigenetic dataset. The current study combines these two approaches into a single sequential computational tool.
Sequencing the DNA library resulted in a total number of 12,302,376 read pairs of 2 × 76 nt length. Reads were mapped to the reference genome of strain PAO1, which was obtained from the Pseudomonas genome database. The 3′-ends of reads were trimmed using the Perl script Trim.pl (http://bioinformatics.ucdavis.edu/index.php/Trim.pl) with the adaptive window option and a quality threshold of 10 to remove sequences of low read quality. Read pairs containing one or two reads that were trimmed to a length of less than 20 nt were discarded, leaving 11,991,338 read pairs (97.5%). The free license version of Novoalign (www.novocraft.com) was used for mapping, because this software includes a gapped alignment algorithm, which improves the detection of indels. 11,824,266 read pairs (96.1% of the original reads) were mapped to unique locations, resulting in a median read depth (genome coverage) of 237. Single nucleotide polymorphisms (SNPs) and indels were detected with the MAQ software using its built-in functions “cns2snp” and “indelpe”, respectively. Initially, only SNPs with a minimal consensus quality of 30 and indels that were supported by at least 50% of the reads overlapping the indel position were considered for further analysis as potential true positives. All positively filtered SNPs and indels were checked by visual inspection for correct base/indel calling. Genomic regions showing a read depth of less than 30 were also checked by visual inspection for the occurrence of larger indels that cannot be detected by the combination of Novoalign and MAQ alone. Furthermore, positives were confirmed by Sanger sequencing, and the resulting sequences were aligned with the sequences extracted from www.pseudomonas.com.
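The pair-level length filter and the retention percentages reported above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the Trim.pl logic: the helper names are hypothetical, reads are represented as plain strings, and quality trimming is assumed to have already happened.

```python
MIN_READ_LENGTH = 20  # pairs with any read shorter than this are discarded


def keep_pair(read1, read2, min_length=MIN_READ_LENGTH):
    """Keep a pair only if both trimmed reads reach the minimum length."""
    return len(read1) >= min_length and len(read2) >= min_length


def filter_pairs(pairs, min_length=MIN_READ_LENGTH):
    """Return the read pairs that survive the length filter."""
    return [pair for pair in pairs if keep_pair(pair[0], pair[1], min_length)]


def percent(part, whole):
    """Share of `whole` represented by `part`, in percent."""
    return 100.0 * part / whole


# Toy trimmed pairs: the second pair has one read shorter than 20 nt.
toy_pairs = [("A" * 76, "C" * 60), ("G" * 76, "T" * 12), ("A" * 20, "C" * 20)]
surviving = filter_pairs(toy_pairs)

# Read-pair counts reported in the text.
total_pairs = 12_302_376
pairs_after_trimming = 11_991_338
uniquely_mapped_pairs = 11_824_266

kept_pct = round(percent(pairs_after_trimming, total_pairs), 1)
mapped_pct = round(percent(uniquely_mapped_pairs, total_pairs), 1)
print(len(surviving), kept_pct, mapped_pct)
```

Both percentages are computed against the original number of read pairs, which matches the way the text reports the mapped fraction.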
High-throughput allelotyping of 906,000 SNPs was performed in triplicate on Affymetrix Human SNP Array 6.0 (Santa Clara, California, USA) at Instituto Gulbenkian de Ciência's Microarray Core Facility using standard protocols. After thorough quality control, probe intensity data were transferred to the R statistical platform (http://www.r-project.org) and normalized across chips using the SNPMaP package. SNPMaP identified and removed 38,338 SNPs performing poorly (e.g. located in sex chromosomes, CNV regions and mitochondria), and calculated the Relative Allele Scores (RAS), the pooling equivalent of relative allele frequencies. RAS usually correspond to the ratio of the A probe to the sum of the A and B probes (where A is the major allele and B is the minor allele). However, with Affymetrix arrays, each SNP is assayed as quartets of perfect match (PM) and mismatch (MM) probes and the RAS score is corrected for the non-specific hybridisation (mismatch probes). The RAS for the sense strand is therefore the median(s i (s) ), where s i (s) (median of relative allele signal for the i th probe
The requirement for large amounts of good quality DNA for whole-genome applications prohibits their use for small, laser capture micro-dissected (LCM), and/or rare clinical samples, which are also often formalin-fixed and paraffin-embedded (FFPE). Whole-genome amplification of DNA from these samples could, potentially, overcome these limitations. However, little is known about the artefacts introduced by amplification of FFPE-derived DNA with regard to genotyping, and subsequent copy number and loss of heterozygosity (LOH) analyses. Using a ligation adaptor amplification method, we present data from a total of 22 Affymetrix SNP 6.0 experiments, using matched paired amplified and non-amplified DNA from 10 LCM FFPE normal and dysplastic oral epithelial tissues, and an internal method control. An average of 76.5% of SNPs were called in both matched amplified and non-amplified DNA samples, and concordance was a promising 82.4%. Paired analysis for copy number, LOH, and both combined showed that copy number changes were reduced in amplified DNA but were 99.5% concordant when detected; amplifications were the changes most likely to be ‘missed’; only 30% of non-amplified LOH changes were identified in amplified pairs; and when copy number and LOH were combined, ~50% of gene changes detected in the unamplified DNA were also detected in the amplified DNA and, within these changes, 86.5% were concordant for both copy number and LOH status. However, changes are also introduced, as ~20% of changes in the amplified DNA are not detected in the non-amplified DNA. An integrative network biology approach revealed that changes in amplified DNA of dysplastic oral epithelium localize to topologically critical regions of the human protein-protein interaction network, suggesting their functional implication in the pathobiology of this disease.
Taken together, our results support the use of amplification of FFPE-derived DNA, provided sufficient samples are used to increase power and compensate for increased error rates.
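The paired call-rate and concordance figures above can be illustrated with a small calculation. This is a sketch under stated assumptions: the excerpt does not define the exact denominators, so the fraction of SNPs called in both samples is taken over the SNPs shared by a pair, and concordance over the SNPs called in both. The `NO_CALL` marker, SNP identifiers, and genotypes are hypothetical.

```python
NO_CALL = "NC"  # hypothetical marker for a SNP without a genotype call


def paired_call_stats(amplified, unamplified):
    """Return (fraction called in both samples, concordance among those calls)."""
    shared = sorted(amplified.keys() & unamplified.keys())
    both_called = [
        snp for snp in shared
        if amplified[snp] != NO_CALL and unamplified[snp] != NO_CALL
    ]
    concordant = sum(1 for snp in both_called if amplified[snp] == unamplified[snp])
    called_fraction = len(both_called) / len(shared) if shared else 0.0
    concordance = concordant / len(both_called) if both_called else 0.0
    return called_fraction, concordance


# Toy genotype calls for one matched amplified / non-amplified pair.
amplified_calls = {"snp1": "AA", "snp2": "AB", "snp3": NO_CALL, "snp4": "BB", "snp5": "AB"}
unamplified_calls = {"snp1": "AA", "snp2": "AA", "snp3": "AB", "snp4": "BB", "snp5": "AB"}

called_fraction, concordance = paired_call_stats(amplified_calls, unamplified_calls)
print(f"called in both: {called_fraction:.1%}, concordant: {concordance:.1%}")
```

Averaging these two values across all matched pairs would give study-level summaries of the kind reported in the abstract.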
qPCR was used to estimate haploid gene numbers by comparing the rate of amplification of a multicopy gene to that of a single-copy gene, as described previously. The multicopy genes we amplified using qPCR were: rPokeyA and rPokeyB, total 28S genes (t28S), 28S genes lacking an insert in the Pokey TTAA insertion site (u28S), and 18S genes. The single-copy reference genes are Gtp (a member of the RAB subfamily of small GTPases) and Tif (a transcription initiation factor) (Fig. 2, S4 Table in S2 File). Amplification efficiencies for the primer pairs were estimated by generating standard curves and calculating the Percent Amplification Efficiency (PAE) as described in Eagle and Crease. The PAE was estimated at least three times for each primer pair and any values outside the 95% confidence interval were omitted from calculation of the mean value. qPCR was performed using the 1X PerfeCTa SYBR Green FastMix with ROX (Quanta BioSciences, Gaithersburg, MD, USA) on 10 ng of DNA extracted from isolates of the four MAL-FG sampled at seven time points, 16 MAL-87 isolates, and the 21 NP isolates (S1 Table in S1 File). All reactions were run in triplicate on a StepOnePlus Real-Time PCR System (Applied Biosystems, Foster City, CA, USA). The StepOne software was used to set the baseline, and C_T values were obtained
Nazaryan‑Petersen et al (47) studied 21 clustered CNV carriers with congenital developmental disorders, intellectual disability or autism. Using whole genome sequencing to study the structures of the rearrangement first investigated by CMA, they identified a total of 83 breakpoint junctions (BPJs). Their results indicated 8 cases with deletions that frequently had additional structural rearrangements, such as insertions and inversions typical to chromothripsis, 7 cases with duplications, and 6 cases with combinations of duplications and deletions showing interspersed duplications and BPJs enriched with microhomology. Some rearrangements also indicated both a breakage‑fusion‑bridge cycle process and haltered formation of a ring chromosome, and 2 cases showed rearrangements mediated by Alu and long interspersed nuclear elements (LINE). The authors concluded that various mechanisms may be involved in the formation of clustered CNVs: replication‑independent canonical NHEJ and alt‑NHEJ, microhomology‑mediated break‑induced replication (MMBIR)/fork stalling and template switching, and breakage‑fusion‑bridge cycle and Alu‑ and LINE‑mediated pathways. They suggested that 7 cases were chromothripsis and 10 cases were chromoanasynthesis events (47). The primary difference between chromoanasynthesis and chromothripsis is the presence of copy gains, such as duplication and triplication, in addition to deletions and copy‑neutral chromosomal regions (7).