7 Conclusions and future work
7.2 Future work
The analysis of certain genes from the standpoint of their repeats may help tailor medication to the patient rather than to the disease, a principle that is already under discussion. In [186], the Canadian Institutes of Health Research presented its plan to advance patient-oriented research. The plan focuses on several aspects of public health, namely:
An individual's greater or lesser predisposition to disease risk (including understanding genetic factors, discovering new biological markers, etc.).
Accelerating and improving screening and diagnostic mechanisms.
Patient prognosis under given conditions.
Seeking the best strategies for tailoring therapy to the patient.
In this field, integrating information systems built on public databases may pose a major challenge, particularly in building DNA databases used for multiple purposes (medical, forensic, biotechnological, among others); custom software will clearly be needed to carry out the specialised analysis tasks.
CAG repeats in the human HTT gene, which are responsible for Huntington's disease [152, 187-188] when their number exceeds 35 (the normal range being 10 to 34), may come to be detected by analysing the individual's genetic heritage.
In this way, it will be possible to determine whether the values obtained confer a greater susceptibility to the disease on the individual, as well as on his or her descendants [9]; once again, information systems will remain the necessary pillar supporting this analysis.
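As a minimal sketch of the kind of check described above, the following Python fragment counts the longest uninterrupted run of CAG units in a DNA string and compares it with the threshold of 35 quoted in the text. The function names, the plain-string input and the use of a regular expression are assumptions made for illustration; this is not the implementation developed in this work, and it ignores interrupted repeats, sequencing errors and the position of the run within the gene.

```python
import re

# Threshold taken from the text: disease is associated with more than 35 repeats.
CAG_THRESHOLD = 35


def longest_cag_run(sequence: str) -> int:
    """Return the number of CAG units in the longest uninterrupted CAG run."""
    # Each match is a maximal stretch of back-to-back "CAG" units.
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def exceeds_threshold(sequence: str, threshold: int = CAG_THRESHOLD) -> bool:
    """True when the longest CAG run is strictly longer than the threshold."""
    return longest_cag_run(sequence) > threshold


# Hypothetical toy fragment: 12 CAG units, an interruption, then 3 more units.
fragment = "ATG" + "CAG" * 12 + "CCG" + "CAG" * 3
repeat_count = longest_cag_run(fragment)   # 12
flagged = exceeds_threshold(fragment)      # False
```

A real analysis would work on the annotated repeat region of the gene rather than on an arbitrary string, and would need to handle approximate repeats, which is the subject of the algorithms discussed earlier in this work.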
Another field of application that may be explored in this context is data mining. Several tools currently mine genomic data, notably to detect patterns in DNA segments that perform a given function. I refer more specifically to motifs, an area that, although not entirely new, still holds considerable computational potential [189]. Inferring association rules between motifs and creating clustering tools, among other methodologies, may in future help detect early a predisposition to certain health problems, at the level of both the individual and public health. To that end, very powerful equipment will necessarily have to be produced, not only for sequencing human genomes but also for analysing the data in useful time, which is currently possible only within a very restricted group of international research communities. This restricted access to technology, together with various legislative barriers, means that real data are not always available for testing the implemented algorithms; in many cases public data already processed by third parties are used instead, which limits the value of the results obtained. These are some examples of areas in which the applications developed may be used, either through the current implementations or through the inclusion of specific modules for that purpose.
Finally, the algorithms developed, mainly those for detecting exact and approximate sequences, may be refined and optimised, notably by including methods based on suffix arrays.
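To illustrate what a suffix-array-based method might look like, the sketch below builds a suffix array by plain sorting and then locates every exact occurrence of a pattern with two binary searches. The function names and the toy sequence are hypothetical. The construction shown is the naive one, chosen for brevity; it is not the construction algorithm of Manber and Myers [126] and would be too slow for whole genomes.

```python
def build_suffix_array(text: str) -> list[int]:
    """Naive construction: sort suffix start positions by the suffix text."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: list[int], pattern: str) -> list[int]:
    """Return the sorted start positions of every exact match of pattern."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


# Hypothetical toy sequence used only to exercise the functions.
sequence = "GATTACAGATTACA"
sa = build_suffix_array(sequence)
hits = find_occurrences(sequence, sa, "ATTA")   # [1, 8]
```

Once the array is built, each query costs a logarithmic number of prefix comparisons, which is the property that makes the structure attractive for repeated searches over the same genome.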
References
[1] F. Sanger, et al., "The nucleotide sequence of bacteriophage [phi] X174," Journal
of molecular biology, vol. 125, pp. 225-246, 1978.
[2] J. Barrett, et al., "Genome-wide association defines more than 30 distinct
susceptibility loci for Crohn's disease," Nature genetics, vol. 40, pp. 955-962, 2008.
[3] M. Morley, et al., "Genetic analysis of genome-wide variation in human gene
expression," NATURE, vol. 430, pp. 743-747, 2004.
[4] H. Rheinberger, et al., "Three tRNA binding sites on Escherichia coli ribosomes,"
Proc Natl Acad Sci USA, vol. 78, pp. 5310-5314, 1981.
[5] M. Fardilha, et al., "A importância do mecanismo de “splicing” alternativo para a
identificação de novos alvos terapêuticos," Acta Urológica, vol. 25, pp. 39-47, 2008.
[6] F. Lee. (2010, 28-08-2010). Molecular Biology Web Book. Available: http://www.web-books.com/MoBio/Free/Ch7F3.htm
[7] K. A. Freed, et al., "Detection of CAG repeats in pre-eclampsia/eclampsia using the
repeat expansion detection method," Mol. Hum. Reprod., vol. 11, pp. 481-487, July 1, 2005.
[8] P. Ferro, et al., "The androgen receptor CAG repeat: a modifier of
carcinogenesis?," Molecular and Cellular Endocrinology, vol. 193, pp. 109-120, 2002.
[9] Pearson. (2007, 9-02-2009). Repeat Disease Database. Available: http://www.cepearsonlab.com/rdd.php
[10] S. Subramanian, et al., "Triplet repeats in human genome: distribution and their
association with genes and other genomic regions," bioinformatics, vol. 19, p. 549, 2003.
[11] Y. Haberman, et al., "Trinucleotide repeats are prevalent among cancer-related
genes," Trends in Genetics, vol. 24, pp. 14-18, 2008.
[12] A. Chapman, "England's Leonardo: Robert Hooke (1635-1703) and the art of experiment in Restoration England," 1996, pp. 239-276.
[13] G. Mendel, Experiments in plant hybridisation: Cosimo, Inc., 2008.
[14] L. Wong, The practical bioinformatician: World Scientific Pub Co Inc, 2004.
[15] A. Nakabachi, et al., "The 160-Kilobase Genome of the Bacterial Endosymbiont
Carsonella," Science, vol. 314, p. 267, October 13, 2006.
[16] S. G. Gregory, et al., "A physical map of the mouse genome," NATURE, vol. 418, pp. 743-50, Aug 15 2002.
[17] P. V. Baranov, et al., "Codon size reduction as the origin of the triplet genetic code," PLoS One, vol. 4, p. e5708, 2009.
[18] Delgado, Jr., "The genial gene: deconstructing Darwinian selfishness," Choice:
Current Reviews for Academic Libraries, vol. 47, pp. 135-135, 2009.
[19] T. G. Boyer, et al., "Genome mining for human cancer genes: wherefore art thou?,"
Trends in Molecular Medicine, vol. 7, pp. 187-189, 2001.
[20] I. Rigoutsos and G. Stephanopoulos, Systems Biology. Volume I: Genomics: Oxford University Press, 2006.
[21] J.-M. Claverie, "GENE NUMBER: What If There Are Only 30,000 Human Genes?," Science, vol. 291, pp. 1255-1257, February 16, 2001.
[22] L. Wong, The practical bioinformatician (duplicate of [14]): World Scientific Publishing Co. Pte. Ltd., 2004.
[23] L. Duret, et al., "Strong conservation of non-coding sequences during vertebrates
evolution: potential involvement in post-transcriptional regulation of gene expression," Nucl. Acids Res., vol. 21, pp. 2315-2322, May 25, 1993.
[24] L. Flanking, "Primer on Molecular Genetics," 1992.
[25] J. S. Andersen, et al., "Nucleolar proteome dynamics," NATURE, vol. 433, pp. 77-83, 2005.
[26] J. Collinge, "PRION DISEASES OF HUMANS AND ANIMALS: Their Causes and Molecular Basis," Annual Review of Neuroscience, vol. 24, pp. 519-550, 2001.
[27] E. Keedwell and A. Narayanan, Intelligent bioinformatics: the application of artificial intelligence techniques to bioinformatics problems: John Wiley & Sons Inc, 2005.
[28] P. d. Oliveira. (2008, 29-08-2010). Manual de genética. Available: http://home.dbio.uevora.pt/~oliveira/Bio/Manual/i4.htm
[29] R. Belshaw, et al., "Long-term reinfection of the human genome by endogenous
retroviruses," Proceedings of the National Academy of Sciences of the United
States of America, vol. 101, p. 4894, 2004.
[30] T. Ogawa and T. Okazaki, "Discontinuous DNA Replication," Annual Review of
Biochemistry, vol. 49, pp. 421-457, 1980.
[31] J. Finsterer, "Bulbar and spinal muscular atrophy (Kennedy’s disease): a review,"
European Journal of Neurology, vol. 16, pp. 556-561, 2009.
[32] T. A. Kunkel and D. A. Erie, "DNA Mismatch Repair," Annual Review of
Biochemistry, vol. 74, pp. 681-710, 2005.
[33] E. Hoffman, "Skipping toward personalized molecular medicine," The New
England journal of medicine, vol. 357, p. 2719, 2007.
[34] R. Roeder, "The role of general initiation factors in transcription by RNA polymerase II," Trends in biochemical sciences, vol. 21, pp. 327-334, 1996.
[35] P. Cramer, et al., "Structural basis of transcription: RNA polymerase II at 2.8
angstrom resolution," Science, vol. 292, p. 1863, 2001.
[36] M. Kimura, "Evolutionary rate at the molecular level," NATURE, vol. 217, pp. 624-626, 1968.
[37] M. Arnold, Evolution through genetic exchange: Oxford University Press, USA,
2006.
[38] X. Xia, "How optimized is the translational machinery in Escherichia coli, Salmonella typhimurium and Saccharomyces cerevisiae?," Genetics, vol. 149, pp. 37-44, 1998.
[39] H. Dong, et al., "Co-variation of tRNA abundance and codon usage in Escherichia coli at different growth rates," J Mol Biol, vol. 260, pp. 649-663, 1996.
[40] S. Boycheva, et al., "Codon pairs in the genome of Escherichia coli," bioinformatics, vol. 19, pp. 987-998, 2003.
[41] G. Moura, et al., "Comparative context analysis of codon pairs on an ORFeome scale," Genome Biol, vol. 6, p. R28, 2005.
[42] S. Kanaya, et al., "Codon usage and tRNA genes in eukaryotes: correlation of codon usage diversity with translation efficiency and with CG-dinucleotide usage as assessed by multivariate analysis," J Mol Evol, vol. 53, pp. 290-298, 2001.
[43] D. Wilson and K. Nierhaus, "The E-site story: the importance of maintaining two tRNAs on the ribosome during protein synthesis," Cell Mol Life Sci, vol. 63, pp. 2725-2737, 2006.
[44] F. Wettstein and H. Noll, "Binding of transfer ribonucleic acid to ribosomes engaged in protein synthesis: number and properties of ribosomal binding sites," J Mol Biol, vol. 11, pp. 35-53, 1965.
[45] K. Nierhaus, "Decoding errors and the involvement of the E-site," Biochimie, vol. 88, pp. 1013-1019, 2006.
[46] K. Nierhaus, "The allosteric three-site model for the ribosomal elongation cycle: features and future," Biochemistry, vol. 29, pp. 4997-5008, 1990.
[47] A. Korostelev, et al., "Crystal structure of a 70S ribosome-tRNA complex reveals functional interactions and rearrangements," Cell, vol. 126, pp. 1065-1077, 2006.
[48] A. Shah, et al., "Computational identification of putative programmed translational frameshift sites," bioinformatics, vol. 18, pp. 1046-1053, 2002.
[49] C. Bertrand, et al., "Influence of the stacking potential of the base 3' of tandem shift codons on -1 ribosomal frameshifting used for gene expression," RNA, vol. 8, pp. 16-28, 2002.
[50] J. George Chin and C. S. Lansing, "Capturing and supporting contexts for scientific data sharing via the biological sciences collaboratory," presented at the Proceedings of the 2004 ACM conference on Computer supported cooperative work, Chicago, Illinois, USA, 2004.
[51] M. Crochemore, et al., Algorithms on strings: Cambridge Univ Pr, 2007.
[52] A. Srikantha, et al., "A fast algorithm for exact sequence search in biological
sequences using polyphase decomposition," bioinformatics, vol. 26, pp. i414-i419, September 15, 2010.
[53] S. Offner, "Using the NCBI Genome Databases to Compare the Genes for Human & Chimpanzee Beta Hemoglobin," The american biology Teacher, vol. 72, pp. 252-256, 2010.
[54] D. L. Wheeler, et al., "Database resources of the National Center for Biotechnology
Information," Nucl. Acids Res., p. gkl1031, December 14, 2006.
[55] NCBI. (2010, 23-08-2010). National Center for Biotechnology Information. Available: http://www.ncbi.nlm.nih.gov/
[56] (05-09-2010). The Broad Institute of MIT and Harvard. Available: http://www.broadinstitute.org/
[57] B. J. Haas, et al., "Genome sequence and analysis of the Irish potato famine pathogen Phytophthora infestans," NATURE, vol. 461, pp. 393-398, 2009.
[58] G. R. Cochrane and M. Y. Galperin, "The 2010 Nucleic Acids Research Database Issue and online Database Collection: a community of data resources," Nucl. Acids
[59] D. A. Benson, et al., "GenBank," Nucl. Acids Res., vol. 37, pp. D26-31, January 1, 2009.
[60] A. Hamosh, et al., "Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders," Nucleic Acids Research, vol. 33, p. D514, 2005.
[61] OMIM. (2009). OMIM, Online Mendelian Inheritance in Man. Available: http://www.ncbi.nlm.nih.gov/omim/
[62] KEGG. (2010, 20-03-2010). KEGG: Kyoto Encyclopedia of Genes and Genomes. Available: http://www.kegg.com
[63] M. Kanehisa, et al., "KEGG for representation and analysis of molecular networks
involving diseases and drugs," Nucl. Acids Res., vol. 38, pp. D355-360, January 1, 2010.
[64] T. Hubbard, et al., "The Ensembl genome database project," Nucl. Acids Res., vol. 30, pp. 38-41, January 1, 2002.
[65] L. Benuskova and R. Scurr. (2010, 19-10-2010). Global Alignment. Available: http://www.cs.otago.ac.nz/cosc348/alignments/Lecture05_GlobalAlignment.pdf
[66] V. Bafna, et al., "Approximation algorithms for multiple sequence alignment,"
Theoretical Computer Science, vol. 182, pp. 233-244, 1997.
[67] S. Needleman and C. Wunsch, "A general method applicable to the search for similarities in the amino acid sequence of two proteins," Journal of molecular
biology, vol. 48, pp. 443-453, 1970.
[68] D. S. Hirschberg, "Serial computations of Levenshtein distances," in Pattern
matching algorithms, ed: Oxford University Press, 1997, pp. 123-141.
[69] T. Smith and M. Waterman, "Identification of common molecular subsequences,"
J. Mol. Biol., vol. 147, pp. 195-197, 1981.
[70] S. Altschul, et al., "Basic local alignment search tool," Journal of molecular
biology, vol. 215, pp. 403-410, 1990.
[71] J. Stoye, "Multiple sequence alignment with the divide-and-conquer method,"
Gene, vol. 211, pp. GC45-GC56, 1998.
[72] D. Powell, et al., "A versatile divide and conquer technique for optimal string
alignment," Information Processing Letters, vol. 70, pp. 127-139, 1999.
[73] E. Ukkonen, "Algorithms for approximate string matching," Information and
control, vol. 64, pp. 100-118, 1985.
[74] D. Sokol, et al., "Tandem repeats over the edit distance," bioinformatics, vol. 23, pp. e30-e35, January 15, 2007.
[75] B. Ma, et al., "PatternHunter: faster and more sensitive homology search,"
bioinformatics, vol. 18, pp. 440-445, March 1, 2002.
[76] M. Li, et al., "PatternHunter II: Highly sensitive and fast homology search,"
GENOME INFORMATICS SERIES, pp. 164-175, 2003.
[77] W. J. Kent, "BLAT—The BLAST-Like Alignment Tool," Genome Research, vol. 12, pp. 656-664, April 1, 2002.
[78] D. Higgins and P. Sharp, "CLUSTAL: a package for performing multiple sequence alignment on a microcomputer," GENE, vol. 73, pp. 237-244, 1988.
[79] M. A. Larkin, et al., "Clustal W and Clustal X version 2.0," bioinformatics, vol. 23, pp. 2947-8, Nov 1 2007.
[80] A. Budd. (2009, 11-09-2010). Multiple Sequence Alignments - Exercices and
http://www.embl.de/~seqanal/courses/commonCourseContent/commonMsaExercises.html
[81] A. L. Delcher, et al., "Alignment of whole genomes," Nucl. Acids Res., vol. 27, pp. 2369-2376, January 1, 1999.
[82] A. L. Delcher, et al., "Fast algorithms for large-scale genome alignment and
comparison," Nucleic Acids Research, vol. 30, pp. 2478-2483, June 1, 2002.
[83] S. Kurtz, et al., "Versatile and open software for comparing large genomes,"
Genome Biology, vol. 5, p. R12, 2004.
[84] D. Russell, et al., "Grammar-based distance in progressive multiple sequence
alignment," BMC Bioinformatics, vol. 9, p. 306, 2008.
[85] D. Lipman and W. Pearson, "Rapid and sensitive protein similarity searches,"
Science, vol. 227, p. 1435, 1985.
[86] X. Huang, et al., "A space-efficient algorithm for local similarities," Computer
applications in the biosciences : CABIOS, vol. 6, pp. 373-381, October 1, 1990.
[87] R. Edgar, "MUSCLE: Multiple sequence alignment with high score accuracy and high throughput," Nucleic Acids Res, vol. 32, pp. 1792-1797, 2004.
[88] T. Treangen, et al., "A novel heuristic for local multiple alignment of interspersed
DNA repeats," IEEE/ACM Transactions on Computational Biology and
Bioinformatics (TCBB), vol. 6, pp. 180-189, 2009.
[89] C. Notredame, et al., "T-Coffee: a novel algorithm for multiple sequence alignment," J Mol Biol, vol. 302, pp. 205-217, 2000.
[90] K. Katoh, et al., "MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform," Nucleic Acids Res, vol. 30, pp. 3059-3066, 2002.
[91] T. Lassmann and E. Sonnhammer, "Kalign - an accurate and fast multiple sequence alignment algorithm," BMC Bioinformatics, vol. 6, p. 298, 2005.
[92] C. Do, et al., "ProbCons: probabilistic consistency-based multiple sequence
alignment," Genome Research, vol. 15, p. 330, 2005.
[93] L. Parida, et al., "MUSCA: an algorithm for constrained alignment of multiple data
sequences," GENOME INFORMATICS SERIES, pp. 112-119, 1998.
[94] A. Subramanian, et al., "DIALIGN-TX: greedy and progressive approaches for
segment-based multiple sequence alignment," Algorithms for Molecular Biology, vol. 3, p. 6, 2008.
[95] J. S. Papadopoulos and R. Agarwala, "COBALT: constraint-based alignment tool for multiple protein sequences," bioinformatics, vol. 23, pp. 1073-1079, May 1, 2007.
[96] W. R. Pearson. (2010, 20-10-2010). FASTA Tools. Available: http://www.ebi.ac.uk/Tools/sss/fasta/help/index-protein.html#program
[97] R. Kolpakov and G. Kucherov, "Finding Approximate Repetitions under Hamming Distance," in Algorithms — ESA 2001. vol. 2161, F. auf der Heide, Ed., ed: Springer Berlin / Heidelberg, 2001, pp. 170-181.
[98] S. Henikoff and J. Henikoff, "Amino acid substitution matrices from protein blocks," Proceedings of the National Academy of Sciences of the United States of
America, vol. 89, p. 10915, 1992.
[99] A. Elofsson. (2000, 29-09-2010). Scoring Matrices and gap penalties. Available: http://bioinfo.se/kurser/swell/substmatrix.html
[100] (2007, 30-09-2010). Scoring Matrix. Available:
[101] M. Dayhoff and R. Schwartz, "A model of evolutionary change in proteins," 1978.
[102] G. Gonnet, et al., "Exhaustive matching of the entire protein sequence database," Science, vol. 256, p. 1443, 1992.
[103] NCBI. (2010, 20-10-2010). The Statistics of Sequence Similarity Scores. Available: http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html
[104] E. Rocha, "Folhas de BioInformática e Análise de sequências," Centre national de la recherche scientifique, 2001.
[105] S. Henikoff and J. G. Henikoff, "Performance evaluation of amino acid substitution matrices," Proteins, vol. 17, pp. 49-61, Sep 1993.
[106] W. R. Pearson, "Comparison of methods for searching protein sequence databases,"
Protein Sci, vol. 4, pp. 1145-60, Jun 1995.
[107] R. Mott, Smith–Waterman Algorithm: John Wiley & Sons, Ltd, 2001.
[108] C. Bio, "Bioinformatics explained: Smith-Waterman," 2007.
[109] G. Barton, "Protein Sequence Alignment and Database Scanning," in Protein
Structure prediction - a practical approach, M. J. E. Sternberg, Ed., ed: Oxford
University Press, 1996.
[110] S. F. Altschul, et al., "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs," Nucleic Acids Res, vol. 25, pp. 3389-402, Sep 1 1997.
[111] MPI. (10-04-2010). mpiBLAST. Available: http://www.mpiblast.org
[112] NVidia. (10-04-2010). BLASTp on Tesla. Available: http://www.nvidia.com/object/blastp_on_tesla.html
[113] Mitrion. (10-04-2010). OpenBio. Available: http://mitc-openbio.sourceforge.net/
[114] A. Biocomputing. (14-04-2010). AB-BLAST. Available: http://www.advbiocomp.com/blast.html
[115] TimeLogic. (14-04-2010). Tera BLAST. Available: http://www.timelogic.com/decypher_blast.html
[116] TimeLogic. (15-09-2010). TimeLogic Biocomputing Solutions. Available: http://www.timelogic.com/decypher_citations.html
[117] K. Thompson, "Programming Techniques: Regular expression search algorithm,"
Commun. ACM, vol. 11, pp. 419-422, 1968.
[118] J. Morris and V. Pratt, "A linear pattern-matching algorithm," Technical Report 40, University of California, Berkeley, 1970.
[119] D. Gusfield, Algorithms on strings, trees and sequences: computer science and
computational biology: Oxford University, 1999.
[120] D. E. Knuth, et al., "Fast Pattern Matching in Strings," SIAM Journal on
Computing, vol. 6, pp. 323-350, 1977.
[121] I. Simon, "String matching algorithms and automata," in Results and Trends in
Theoretical Computer Science. vol. 812, J. Karhumäki, et al., Eds., ed: Springer
Berlin / Heidelberg, 1994, pp. 386-395.
[122] A. Apostolico and Z. Galil, Pattern matching algorithms. New York: Oxford University Press, 1997.
[123] P. Weiner, "Linear pattern matching algorithms," 1973, pp. 1-11.
[124] E. McCreight, "A space-economical suffix tree construction algorithm," Journal of
the ACM (JACM), vol. 23, pp. 262-272, 1976.
[125] E. Ukkonen, "On-line construction of suffix trees," Algorithmica, vol. 14, pp. 249-260, 1995.
[126] U. Manber and G. Myers, "Suffix arrays: a new method for on-line string searches," 1990, pp. 319-327.
[127] A. Aho and M. Corasick, "Efficient string matching: an aid to bibliographic search," Communications of the ACM, vol. 18, pp. 333-340, 1975.
[128] R. S. Boyer and J. S. Moore, "A fast string searching algorithm," Commun. ACM, vol. 20, pp. 762-772, 1977.
[129] M. Crochemore and W. Rytter, Text algorithms: Oxford University Press, Inc., 1994.
[130] R. Karp and M. Rabin, "Efficient randomized pattern-matching algorithms," IBM Journal of Research and Development, vol. 31, p. 249, 1987.
[131] X. Wang, et al., "Collisions for hash functions MD4, MD5, HAVAL-128 and RIPEMD."
[132] C. Charras and T. Lecroq, Handbook of exact string matching algorithms: Citeseer, 2004.
[133] E. Lander, et al., "Initial sequencing and analysis of the human genome," NATURE,
vol. 409, pp. 860-921, 2001.
[134] G. Moura, et al., "Large scale comparative codon-pair context analysis unveils general rules that fine-tune evolution of mRNA primary structure," PLoS One, vol. 2, p. e847, 2007.
[135] M. Santos and M. Tuite, "The CUG codon is decoded in vivo as serine and not leucine in Candida albicans," Nucleic Acids Res, vol. 23, pp. 1481-1486, 1995.
[136] C. Marck, et al., "The RNA polymerase III-dependent family of genes in hemiascomycetes: comparative RNomics, decoding strategies, transcription and evolutionary implications," Nucleic Acids Research, vol. 34, p. 1816, 2006.
[137] S. K. Shin and G. L. Sanders, "Denormalization strategies for data retrieval from data warehouses," Decision Support Systems, vol. 42, pp. 267-282, 2006.
[138] B. Louie, et al., "Data integration and genomic medicine," Journal of Biomedical
Informatics, vol. 40, pp. 5-16, 2007.
[139] J. Han and M. Kamber, Data Mining: Concepts and Techniques, 2nd ed.: Morgan Kaufmann Publishers, 2006.
[140] R. M. Wideman, "Software Development and Linearity (Or, why some project management methodologies don’t work)," Projects & Profits, 2003.
[141] S. Tripp and B. Bichelmeyer, "Rapid prototyping: An alternative instructional design strategy," Educational Technology Research and Development, vol. 38, pp. 31-44, 1990.
[142] G. R. Moura, et al., "Codon-triplet context unveils unique features of the Candida albicans protein coding genome," BMC Genomics, vol. 8, p. 444, 2007.
[143] J. P. Lousado, et al., "Exploiting Codon-Triplets Association for Genome Primary Structure Analysis," presented at the Biocomputation, Bioinformatics, and Biomedical Technologies, 2008. BIOTECHNO '08. International Conference on, Bucharest, Romania, 2008.
[144] J. P. Lousado, et al., "GeneSplit - Uma Aplicação para o Estudo de Associações de Codões e de Aminoácidos em ORFeomas," in CISTI 2008: 3ª Conferencia Ibérica
de Sistemas y Tecnologías de la Información, OURENSE, 2008.
[145] R. A. George, et al., "Analysis of protein sequence and interaction data for candidate disease gene prediction," Nucl. Acids Res., vol. 34, p. e130, November 14, 2006.
[146] S. Ali, et al., "Analysis of the evolutionarily conserved repeat motifs in the genome
of the highly endangered central Indian swamp deer Cervus duvauceli branderi,"
GENE, vol. 223, pp. 361–367, 1998.
[147] Z. Fu and T. Jiang, "Clustering of main orthologs for multiple genomes.," J
Bioinform Comput Biol, vol. 6, pp. 573-84, Jun 2008.
[148] N. C. Jones and P. A. Pevzner, "Comparative genomics reveals unusually long motifs in mammalian genomes," Bioinformatics, vol. 22, pp. e236-242, July 15, 2006.
[149] M. Brameier and C. Wiuf, "Ab initio identification of human microRNAs based on structure motifs," BMC Bioinformatics, vol. 8, p. 478, 2007.
[150] T. a. Bowen, et al., "Repeat sizes at CAG/CTG loci CTG18.1, ERDA1 and TGC13-7a in schizophrenia," Psychiatric Genetics, vol. 10, pp. 33-37, 2000.
[151] T. V. Pestova, et al., "A conserved AUG triplet in the 5' nontranslated region of poliovirus can function as an initiation codon in vitro and in vivo," Virology, vol. 204, pp. 729-37, Nov 1 1994.
[152] Y. O. Herishanu, et al., "Huntington disease in subjects from an Israeli Karaite community carrying alleles of intermediate and expanded CAG repeats in the HTT gene: Huntington disease or phenocopy?," Journal of the Neurological Sciences, vol. 277, pp. 143-146, 2009.
[153] V. Bogaerts, et al., "Genetic findings in Parkinson's disease and translation into treatment: a leading role for mitochondria?," Genes Brain Behav, vol. 7, pp. 129-51, Mar 2008.
[154] M. A. Mena, et al., "On the pathogenesis and neuroprotective treatment of Parkinson disease: what have we learned from the genetic forms of this disease?,"
Curr Med Chem, vol. 15, pp. 2305-20, 2008.
[155] B. A. Tarini, et al., "Parents Interest in Predictive Genetic Testing for Their Children When a Disease Has No Treatment," Pediatrics, vol. 124, pp. e432-e438, Aug 24 2009.
[156] W. Hsueh, "Genetic discoveries as the basis of personalized therapy: rosiglitazone treatment of Alzheimer's disease," Pharmacogenomics J, vol. 6, pp. 222-4, Jul-Aug 2006.
[157] M. P. Gabriela Moura, Raquel Silva, Isabel Miranda, Vera Afreixo, Gaspar Dias, Adelaide Freitas, José L Oliveira, and Manuel AS Santos, "Comparative context analysis of codon pairs on an ORFeome scale," Genome Biology, vol. 6, 2005.
[158] D. B. Gordon, et al., "TAMO: a flexible, object-oriented framework for analyzing transcriptional regulation using DNA-sequence motifs," Bioinformatics, vol. 21, pp.