
UNIVERSIDADE FEDERAL DO RIO GRANDE DO NORTE INSTITUTO METRÓPOLE DIGITAL

PROGRAMA DE PÓS-GRADUAÇÃO EM BIOINFORMÁTICA

PEDRO IGOR CÂMARA DE OLIVEIRA

PLANNING NEW TRYPANOSOMA CRUZI CYP51 INHIBITORS USING QSAR STUDIES

NATAL - RN 2019


Master's defense presented to the Programa de Pós-Graduação em Bioinformática of the Universidade Federal do Rio Grande do Norte.

Concentration area: Bioinformatics

Research line: Development of products and processes

Advisor: Prof. Dr. Euzébio Guimarães Barbosa


Universidade Federal do Rio Grande do Norte - UFRN

Library System - SISBI

Cataloging-in-Publication. UFRN - Biblioteca Central Zila Mamede

Oliveira, Pedro Igor Câmara de.

Planning new Trypanosoma cruzi CYP51 inhibitors using QSAR studies / Pedro Igor Câmara de Oliveira. - 2019.

47 leaves: ill.

Dissertation (master's) - Universidade Federal do Rio Grande do Norte, Centro de Ciências da Saúde, Programa de Pós-Graduação em Bioinformática, Natal, RN, 2019.

Advisor: Prof. Dr. Euzébio Guimarães Barbosa.

1. Quantitative Structure Activity Relationship - Dissertation. 2. Trypanosoma cruzi - Dissertation. 3. CYP51 - Dissertation. 4. Drug design - Dissertation. 5. Chagas disease - Dissertation. I. Barbosa, Euzébio Guimarães. II. Title.

RN/UF/BCZM CDU 616.937

Prepared by Ana Cristina Cavalcanti Tinôco - CRB-15/262


Natal, June 7, 2019.

EXAMINATION COMMITTEE

___________________________________________
Prof. Dr. Euzébio Guimarães Barbosa
Universidade Federal do Rio Grande do Norte
(Chair)

___________________________________________
Prof. Dr. Marcus Tullius Scotti
Universidade Federal da Paraíba
(External Examiner)

___________________________________________
Prof. Dr. Paulo Marcos da Matta Guedes
Universidade Federal do Rio Grande do Norte

ACKNOWLEDGMENTS

I thank CAPES for funding my scholarship throughout my master's degree. I thank NPAD for the investment in the supercomputer, which sped up the computational tasks supporting this work and many others from our research group. I thank PPGBIOINFO, PPGCF, PPGBIOQ and PPGCSA for the courses offered and for their institutional support.

I thank my family for their unconditional support, understanding and encouragement, which were essential for me to meet all my commitments throughout the master's degree. I am especially grateful to my mother, Antônia Claudia, my grandmother, Francisca Nogueira, my uncle, Raimundo, my father, Francisco, and my brothers, Paulinho and Kaká.

I thank my girlfriend, Letícia Sales, for her companionship and for supporting all my simultaneous activities, which often interrupted us even in moments of rest. I also thank her for helping to prepare and revise several figures in this work and for making my slides look professional.

I thank my advisor, Euzébio Guimarães, for always being very approachable to me and my friends in the laboratory, and for creating a work environment that always kept a very good balance between productivity and fun. I also thank him for having the sensitivity to know the right moment to act as a brother, a friend and a leader, always guiding and helping us to obtain the best possible results.

I thank my friends from LABQFC: Thaynã, Rodolfo, Rita, Amanda, Marcel, Priscilla, Vanessa, Paulo, Estela, Natália, Sofia, Macau and Thaís. You taught me a great deal throughout my master's degree, helped me become a better person, and were like a family to me during these last years at the Faculdade de Farmácia.

I thank all the colleagues and professors of PPGBIOINFO and BioME for teaching me a little of the wonders we can achieve in health care using computing and programming. I especially thank my friends Diego Morais, Paulo Toscano and Marcel Câmara for their help with coursework and for showing me that many of our problems can be solved with programming.

I thank the researchers of LABIOPAR: Antônia Claudia Câmara, Egler Chiari, Lúcia Galvão, Andressa Noronha, Ramon Brito, Daniela Nunes, Kiev Martins, George Sampaio, Vicente Toscano, Cléber Mesquita, Giovani Lavieri, Pedro Ramon, Raniery Santana and Nathan Honorato. I am sure they have inspired not only me but also many other students who have passed through there, and many who are yet to come, to keep believing in the importance of research on Chagas disease.

SUMMARY

Chagas disease kills about 10,000 people per year, and approximately 8 million people are infected with Trypanosoma cruzi. The main reference drug for treating the disease, benznidazole, has been in use since the 1970s. In recent years, many CYP51 inhibitors have been tested against this parasite enzyme. One of them, posaconazole, even reached clinical trials, which unfortunately did not show results superior to those of benznidazole. Nevertheless, there are still indications that CYP51 is a very good potential target for treating T. cruzi infection. The search for new effective molecules that could possibly cure the chronic phase of the disease is essential. 2D and 3D QSAR (Quantitative Structure Activity Relationship) studies were used in this work to create three models for predicting biological activity, based on the chemical structures of 197 compounds bearing pyridine and azole groups that were published in the literature and have already undergone in vivo or in vitro tests. After the analysis of the models, new analogues that have not yet been synthesized were suggested in this work, their biological activity was predicted, and their synthetic accessibility was assessed.

Keywords: QSAR, Trypanosoma cruzi, CYP51, Structure-based drug design, Chagas disease.

ABSTRACT

Chagas disease kills over 10,000 people per year, and approximately 8 million people are infected by Trypanosoma cruzi. The reference drug for treatment of the disease, benznidazole, has been the same since the 1970s. In recent years, many CYP51 inhibitors have been tested against this parasite target. One of them, posaconazole, was even tested in clinical trials, which unfortunately could not demonstrate superior efficacy compared to benznidazole. Nevertheless, there is still considerable evidence that CYP51 is a promising target for treating T. cruzi infection. The search for new effective molecules that can cure the chronic phase of the disease is essential. 2D and 3D Quantitative Structure-Activity Relationship (QSAR) studies were conducted in this work to create three QSAR models using the chemical structures of 197 published pyridine and azole compounds that had already been tested either in vivo or in vitro. After analysis of the models, new analogues not yet synthesized were suggested, and their biological activity and synthetic accessibility were assessed.


LIST OF ABBREVIATIONS

CYP51 Lanosterol 14α-demethylase

WHO World Health Organization

RN Rio Grande do Norte

HTS High-throughput screening

QSAR Quantitative structure-activity relationship

pEC50 Negative logarithm of the EC50, the concentration that induces half of the maximal effect produced by a substance. It is used to express the potency of a substance.

pIC50 Negative logarithm of the IC50, the concentration that induces half of the maximal inhibitory effect produced by a substance. It is used to express the potency of a substance.

CONTENTS

1 INTRODUCTION
1.1 Chagas disease
1.2 Currently available treatments for Chagas disease
1.3 CYP51: one good molecule leads to another
1.4 QSAR models for planning new inhibitors
2 RATIONALE FOR THE STUDY
3 OBJECTIVES
3.1 General
3.2 Specific
4 INVITATION
ARTICLE
5 DISCUSSION
6 CONCLUSION
7 REFERENCES

1 INTRODUCTION

1.1 Chagas disease

It is estimated that Trypanosoma cruzi has been infecting wild mammals for about 10 million years. In comparison, humans arrived in the Americas between 26,000 and 12,000 years ago (WHO | HISTORY OF CHAGAS DISEASE, 2016). This shows how well established the parasite's cycle is in the sylvatic environment.

Approximately 8 million people are currently infected with T. cruzi, the etiological agent of Chagas disease (WHO CHAGAS DISEASE, 2018). The disease is endemic to the Americas, where its transmission is predominantly vector-borne; however, because of migration between countries, it has also spread to other continents (COURA; VIÑAS, 2010).

Chagas disease is endemic in the state of Rio Grande do Norte (RN). Several epidemiological studies show that people are currently infected in this state, and there is evidence that the sylvatic cycle is maintained close to where the rural population lives (BARBOSA-SILVA et al., 2016; BRITO et al., 2012; CÂMARA et al., 2013). In the most recent national serological survey, samples were collected from almost 105,000 children in rural areas of several Brazilian states, and in RN an infection was detected in a newborn who was probably infected by vector-borne transmission, since the mother was seronegative (LUQUETTI et al., 2011). The insect vector was recently found close to the houses of rural residents and even inside homes in RN (BARBOSA-SILVA et al., 2016), which shows that vector-borne transmission remains active in this state.

The disease is generally curable in the acute phase, with a 71.5% chance of cure, and the earlier the treatment, the better the prognosis; congenital cases are treated with a cure rate of up to 97.9% (GUEDES, P.M.M. et al., 2011). However, because the acute phase is commonly asymptomatic or presents nonspecific symptoms such as fever, malaise and skin irritation, the diagnosis is often not made during this period. Three to four months later, the infected individual enters the chronic phase of the infection, in which the cure rate drops to an average of 5.9% (GUEDES, P.M.M. et al., 2011). The chronic phase can be classified as indeterminate, cardiac, digestive or cardiodigestive.

In the chronic phase, between 20 and 30% of infected individuals develop cardiac lesions. These result from inflammatory reactions, and there is still no consensus on whether the reactions are caused by persistence of the parasite in the organism or by autoimmune mechanisms triggered by the pathogen. The disease is also the most frequent cause of cardiomyopathy in Latin America. The cardiac lesions can lead to heart failure and sudden death in people in the chronic phase of the infection (WANDERLEY DE SOUZA, TÉCIA MARIA ULISSES DE CARVALHO, 2013).

1.2 Currently available treatments for Chagas disease

To date there are only two drugs for the treatment of Chagas disease, nifurtimox and benznidazole. In Brazil, only benznidazole is used, a drug that came into use around 1974 (COURA; DE CASTRO, 2002). In other words, the drug currently in use was developed more than 40 years ago, and several adverse effects resulting from its use have been reported.

These adverse effects are frequent and vary in severity. Topical manifestations may occur, such as dermatitis with skin eruptions. More severe adverse symptoms may also appear, such as lymphadenopathy, muscle and joint pain, and even bone marrow alterations that can lead to agranulocytosis and thrombocytopenic purpura (PINAZO et al., 2010b). Because of this range of adverse effects, elderly patients, and especially those with the indeterminate form of the disease, are often left untreated.

Nifurtimox also has a high potential to cause adverse effects, and its use is not approved in Brazil. The toxic effects it can cause include testicular damage, ovarian toxicity, neurotoxicity and deleterious effects in several other tissues, such as those of the colon, esophagus and breast (CASTRO; MONTALTO; BARTEL, 2006).

1.3 CYP51: one good molecule leads to another

When starting a search for new treatments against a disease, the possible targets available must be taken into account. A pharmacological target is generally a molecule, such as a protein or an enzyme, whose action one wishes to alter or block completely. In the case of T. cruzi, a target that has been very promising, and for which several studies and reviews have already been published, is CYP51.

This enzyme belongs to the cytochrome P450 family, whose members are found in living organisms of several kingdoms, including eukaryotes and prokaryotes (SUETH-SANTIAGO et al., 2015). These enzymes have many substrates, some of which are sterols, whose primary function is the maintenance of cell membranes. In the animal kingdom the sterol used is cholesterol, whereas fungi and T. cruzi use ergosterol. CYP51 is also known as lanosterol 14α-demethylase: it demethylates the carbon at position 14 of lanosterol, one of the metabolic precursors of ergosterol. This difference between the sterol used by humans and the one used by the parasite was the main motivation for continuing to investigate selective inhibitors of this enzyme.

Because this biosynthetic pathway is shared by fungi and some protozoa, several antifungals have been tested against T. cruzi. One of the most promising was posaconazole, whose performance in experimental infections in mice was often superior to that of benznidazole. These tests showed that posaconazole led to a higher survival rate of the animals and, in some cases, even to their cure (MOLINA et al., 2000). Unfortunately, when posaconazole was later tested in clinical trials in humans, it performed worse than benznidazole (MOLINA et al., 2014). Nevertheless, many molecules active against this metabolic pathway of the parasite continue to be synthesized and tested, in the hope of finding a drug more effective than those currently approved for the disease. In addition, other nitroheterocyclic azole compounds were synthesized and tested, and proved promising against T. cruzi in vitro (PAPADOPOULOU et al., 2014, 2015b, 2015a; SURYADEVARA et al., 2013). After reports that the parasite developed resistance to this class of drugs in the laboratory, aminopyridine analogues were also synthesized and tested, with good results in vitro (CALVET et al., 2014; CHOI et al., 2013, 2014; VIEIRA et al., 2014a, 2014b).

Other classes of drugs have also been tested against T. cruzi, such as anticancer drug candidates like tipifarnib, analogues of which were created and tested against the parasite (BUCKNER et al., 2012; KRAUS et al., 2009, 2010). Analogues of natural products, such as piperine extracted from black pepper (Piper nigrum), were also synthesized and tested against the parasite (FRANKLIM et al., 2013).

1.4 QSAR models for planning new inhibitors

Published compounds can be screened against a target in several ways. One approach for testing many compounds in a short time in an automated fashion is high-throughput screening (HTS). However, one drawback of an in vitro methodology such as this is the high cost of setting up the testing infrastructure, which is widely used in the pharmaceutical industry (MACARRON et al., 2011).

All the molecules used in this work to build the QSAR (quantitative structure-activity relationship) models have defined stereochemistry, were previously tested by other authors using in vivo or in vitro methods, and had their biological activity values determined experimentally (CALVET et al., 2014; CHOI et al., 2013, 2014; PAPADOPOULOU et al., 2014, 2015a, 2015b; SURYADEVARA et al., 2013; VIEIRA et al., 2014a, 2014b). With these biological activity data and with optimized computational models of the molecules, we developed hybrid QSAR models with classical (2D) and 3D descriptors to try to predict the biological activity of three sets of molecules, grouped according to the main group that interacts with CYP51.

QSAR is a methodology that establishes a quantitative relationship between the structure and the activity of the molecules used in the model. These models are used to try to build a statistically significant correlation between the chemical structure of the molecules and their biological activities (VERMA; KHEDKAR; COUTINHO, 2010).

For 2D-QSAR models, the descriptors do not depend on the spatial positioning of the molecules; they are calculated predominantly from the intrinsic chemical properties of each molecule. For 3D-QSAR models, on the other hand, it is essential that the molecules used to calculate the descriptors are well aligned with respect to the active site, because the descriptors depend on the chemical interactions in the space the molecules occupy relative to the receptor (VERMA; KHEDKAR; COUTINHO, 2010).

QSAR models are widely used to try to predict the biological activity of new molecules similar to those used to train the model (DE ARAÚJO SANTOS et al., 2015). From the analysis of the models, we can try to predict the activity of new analogues that have not yet been synthesized and thus rationalize the synthesis of new compounds.
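To make the idea of a quantitative structure-activity relationship concrete, the sketch below fits a one-descriptor linear model by ordinary least squares and reports its coefficient of determination. It is only an illustration of the general principle: the text does not state which regression method was used, the function names are hypothetical, and the descriptor and activity values are invented toy numbers, not data from this work.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y ~ slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def r_squared(x, y, slope, intercept):
    """Fraction of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Toy values (assumed, for illustration only): one descriptor per molecule
# and the corresponding activity on a negative-logarithm scale.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [5.1, 5.9, 7.1, 7.9]

slope, intercept = fit_line(descriptor, activity)
r2 = r_squared(descriptor, activity, slope, intercept)

# Predicted activity for a new, untested analogue with descriptor value 5.0.
predicted = slope * 5.0 + intercept
```

A real model would use many descriptors and would be validated on compounds held out from training; the single-descriptor case only shows how a fitted relationship turns a structural property into an activity prediction.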

2 RATIONALE FOR THE STUDY

Chagas disease is recognized as a neglected disease, yet it has considerable clinical importance and is endemic not only in the state of RN but across the Americas. It is estimated that about 10,000 people die from this disease every year (WHO CHAGAS DISEASE, 2018). The only treatment available for the disease in Brazil has been the same since the 1970s (COURA; DE CASTRO, 2002). The proportion of individuals cured in the chronic phase currently leaves much to be desired, reaching an average of only 5.9% (GUEDES, P.M.M. et al., 2011). It is essential to continue studying T. cruzi in order to help develop more effective and specific treatments against the parasite in the chronic phase as well, preferably with fewer side effects than the treatments currently available. The QSAR models used in this study can help in obtaining other analogues effective against CYP51, based on what has already been published in the scientific literature.

3 OBJECTIVES

3.1 General

To build QSAR models trained with compounds already published in the scientific literature and to plan new CYP51 inhibitors.

3.2 Specific

• Understand how the inhibition of T. cruzi CYP51 occurs;

• Use computational tools to draw and align the existing compounds;

• Use these published and tested compounds to obtain QSAR models;

• Employ the QSAR models to predict in silico the activity of the published compounds;

• Plan new CYP51 inhibitors and predict their biological activities using the QSAR models generated.

4 INVITATION

We invite the committee to evaluate the article we recently submitted to the journal Molecular Diversity (the publication guidelines will be sent as an attachment to this work). Molecular Diversity is our initial target for publication. We would appreciate evaluations and criticism aimed at improving the chances of publication, thereby strengthening the work not only of our research group but also of the Programa de Pós-Graduação em Bioinformática (PPGBIOINFO) at UFRN. Three hybrid QSAR models were created. The analogues planned in this work had their biological activity values predicted by the QSAR models. These analogues can be synthesized and tested in the future.

Planning new Trypanosoma cruzi CYP51 inhibitors using QSAR studies

Pedro Igor Camara de Oliveira¹, Paulo Henrique de Santana Miranda², Estela Mariana Guimarães Lourenço², Priscilla Suene de Santana Nogueira Silverio¹, Euzebio Guimaraes Barbosa¹,²*

Universidade Federal do Rio Grande do Norte, UFRN. Faculdade de Farmácia, Rua Gen. Gustavo Cordeiro de Faria, S/N - Petrópolis, Natal - RN, 59012-570.

1. Programa de Pós-Graduação em Bioinformática

2. Programa de Pós-Graduação em Ciências Farmacêuticas

Oliveira, P.I.C. ORCID: https://orcid.org/0000-0001-7801-171X
Miranda, P.H.S ORCID: https://orcid.org/0000-0002-7893-8408
Lourenço, E.M.G ORCID: https://orcid.org/0000-0003-2708-4526
Silverio, P. S. S. N. ORCID: https://orcid.org/0000-0003-1687-6907
Barbosa, E.G. ORCID: https://orcid.org/0000-0002-7685-9618

*Corresponding author: (euzebiogb@imd.ufrn.br) +558433429831

Acknowledgements: CAPES – Coordenação de Aperfeiçoamento de Pessoal de Nível Superior for the granted scholarship. NPAD – Núcleo de Processamento de Alto Desempenho.


Abstract

Chagas disease kills over 10,000 people per year, and approximately 8 million people are infected by Trypanosoma cruzi. The reference drug for treatment of the disease, benznidazole, has been the same since the 1970s. In recent years, many CYP51 inhibitors have been tested against this parasite target. One of them, posaconazole, was even tested in clinical trials, which unfortunately could not demonstrate superior efficacy compared to benznidazole. Nevertheless, there is still considerable evidence that CYP51 is a promising target for treating T. cruzi infection. The search for new effective molecules that can cure the chronic phase of the disease is essential. 2D and 3D Quantitative Structure-Activity Relationship (QSAR) studies were conducted in this work to create three QSAR models using the chemical structures of 197 published pyridine and azole compounds that had already been tested either in vivo or in vitro. After analysis of the models, new analogues not yet synthesized were suggested, and their biological activity and synthetic accessibility were assessed.

Keywords: QSAR; Trypanosoma cruzi; CYP51; Structure-based drug design; Chagas disease.

Introduction

Chagas disease kills over 10 thousand people every year due to complications of its clinical symptoms. About 8 million people are currently infected by Trypanosoma cruzi, most of them in Latin America (WHO CHAGAS DISEASE, 2018). Even though its vector-borne transmission is restricted to the Americas, migration between countries means that the disease is no longer confined to its endemic areas (COURA; VIÑAS, 2010). Only two drugs are approved for the treatment of the disease, benznidazole and nifurtimox, both in use since the early 1970s (COURA; DE CASTRO, 2002). Nevertheless, these drugs are still far from ideal because they cause multiple adverse effects and are considered to have low efficacy in the chronic phase of the disease. Studies show that infections treated during the acute phase can reach a cure rate of 71.5%, whereas the cure rate for the chronic phase averages only 5.9% (GUEDES, P.M.M. et al., 2011).

This absence of therapeutic alternatives motivates the search for new drugs. Quantitative structure-activity relationship (QSAR) is a method that could bring us closer to an alternative drug against T. cruzi. It consists of building statistically relevant mathematical models that link the biological activity to the chemical properties of the molecules under study. It is also considered very useful in drug design because it can considerably reduce the cost of synthesizing new compounds and testing them in vivo or in vitro, both of which are expensive (VERMA; KHEDKAR; COUTINHO, 2010).

An existing antifungal drug, posaconazole, showed promising activity in vivo during the acute and chronic phase of the disease in murine models (MOLINA et al., 2000). There were also other studies, including an encouraging case report where this drug helped in resolving the infection better than benznidazole for an immunosuppressed patient that had a reactivation of the infection in the chronic phase of the disease (PINAZO et al., 2010a). After these successful reports, a clinical trial of posaconazole was financed by the Ministry of Health of Spain. This clinical trial unfortunately proved that posaconazole was more prone to treatment failure when compared to benznidazole, with the former drug showing only suppressive activity (MOLINA et al., 2014).

Posaconazole and other azole-based compounds studied in the last few years are known for their antifungal activity based on the inhibition of the sterol 14-alpha-demethylase (CYP51) (LEPESHEVA et al., 2010). The inhibition of CYP51 has been shown to reduce the ability of trypomastigotes to invade heart cells and also to strongly inhibit the multiplication of amastigotes, the predominant form in the chronic phase of the disease (LEPESHEVA et al., 2007). Many pre-clinical tests indicate that CYP51 is still a promising target with many published active compounds (ANDRIANI et al., 2013; BUCKNER et al., 2012; CALVET et al., 2014; CHOI et al., 2013, 2014; DE VITA et al., 2016; FERREIRA DE ALMEIDA FIUZA et al., 2018; FRANKLIM et al., 2013; FRIGGERI et al., 2014; KRAUS et al., 2009, 2010; LEPESHEVA et al., 2008; PAPADOPOULOU et al., 2014, 2015b, 2015a; SURYADEVARA et al., 2013; VIEIRA et al., 2014a, 2014b). Compounds that effectively inhibit this enzyme hinder the formation of ergosterol, which is used by T. cruzi and differs from the cholesterol used by most animal cells (SUETH-SANTIAGO et al., 2015).

Posaconazole initially seemed like a good treatment option, but its cost is too high for the treatment of Chagas disease and its structure is not optimized for interaction with the parasite’s CYP51. Given this, the objective of this study was to find structural characteristics of the existing inhibitors that could help plan more suitable compounds. Several inhibitors have had their CYP51 inhibition activities determined in vitro and published in the literature. Most of these T. cruzi CYP51 inhibitors were selected here to create predictive QSAR models, which may be useful for determining the structure-activity relationship of the compounds. The models were interpreted and used to plan new binders that can ultimately be synthesized and tested against T. cruzi.

Methodology

Molecular models’ creation and alignment

The dataset we initially evaluated comprised 376 compounds published in the literature (CALVET et al., 2014; CHOI et al., 2013, 2014; PAPADOPOULOU et al., 2014, 2015b, 2015a; SURYADEVARA et al., 2013; VIEIRA et al., 2014a, 2014b). The compounds were previously tested either in cell cultures infected by amastigotes or against the amastigotes themselves, the most prevalent form of the parasite in the chronic phase of the disease. Some of these compounds could not be used because they did not have a defined stereochemistry or did not have a biological activity reported as an exact value. Nevertheless, more than half of them (197), which had closely related chemical structures, were used to create three local QSAR models (CALVET et al., 2014; CHOI et al., 2013, 2014; PAPADOPOULOU et al., 2014, 2015b, 2015a; SURYADEVARA et al., 2013; VIEIRA et al., 2014a, 2014b). The construction of a global model was not attempted due to the diversity of biological tests in the literature.

Every structure found in the literature was drawn in 2D using MarvinSketch 14.11.3.0 (ChemAxon), with protonation assigned at pH 7.4 to simulate physiological conditions, and converted to 3D in the same program. All the molecules were carefully curated and, when necessary, edited after visual inspection in Avogadro (HANWELL et al., 2012). The molecular geometries were optimized with the PM7 semiempirical method in MOPAC (JAMES J. P. STEWART, STEWART COMPUTATIONAL CHEMISTRY, COLORADO SPRINGS, CO, [s.d.]) using implicit COSMO solvation (KLAMT; SCHUURMANN, 1993).

The molecules were also aligned to reference crystallized ligands available in the PDB before the 3D-QSAR models were constructed. The aminopyridyl set (CALVET et al., 2014; CHOI et al., 2013, 2014; VIEIRA et al., 2014a, 2014b) was aligned to the crystallized inhibitor available under PDB code 4BY0 (CHOI et al., 2014). The dialkylimidazole (SURYADEVARA et al., 2013) and imidazole (PAPADOPOULOU et al., 2014, 2015b, 2015a) sets were both aligned to the crystallized ligand available under PDB code 3ZG2 (HARGROVE et al., 2013). All the receptors were previously aligned to 4UVR (VIEIRA et al., 2014a), which was chosen as the reference receptor because the molecules reported in that article had high biological activity values. These alignments were performed automatically with pharmACOphore (KORB et al., 2010). After all the molecules were aligned, they were inspected visually and, in some cases, given minor adjustments in UCSF Chimera (PETTERSEN et al., 2004) to ensure realistic bioactive conformations.

In order to build the QSAR models, both pEC50 and pIC50 values were used to calibrate the biological activity vector (y). This was done because some of the tests were performed directly on amastigotes, with activity reported as pIC50, whereas the tests reported as pEC50 were performed in cell cultures. When using data derived from tests in cell cultures, it is necessary to consider the ability of the compounds to permeate cell membranes, which explains why pEC50 values were used where cell cultures were involved (ARNOTT; PLANEY, 2012).
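Both pIC50 and pEC50 are negative base-10 logarithms of a concentration expressed in mol/L, so the conversion from a reported IC50 or EC50 can be sketched as follows. The function names are hypothetical, and the assumption that source values are given in micromolar is ours; the units used in the original publications are not stated here.

```python
import math

def p_activity(concentration_molar):
    """Return -log10 of a concentration in mol/L (pIC50 or pEC50)."""
    if concentration_molar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(concentration_molar)

def micromolar_to_p(value_um):
    """Convert an IC50 or EC50 given in micromolar to the p-scale."""
    return p_activity(value_um * 1e-6)
```

On this scale a tenfold increase in potency raises the value by one unit, which is why an IC50 of 1 µM corresponds to a pIC50 of 6 and an IC50 of 10 nM to a pIC50 of 8.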

Activity cliffs

Activity cliff analyses (STUMPFE; BAJORATH, 2012) were also performed to help interpret the models. The objective was to identify significant differences in biological activity caused by small changes in the chemical structure of very similar molecules. Open Babel (O’BOYLE et al., 2011) was used to calculate the Tanimoto Similarity (TS) (BENDER; GLEN, 2004) between the molecules in order to compare them and find the ones that were most closely related chemically.
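For readers unfamiliar with the Tanimoto coefficient, the following minimal sketch computes it for two fingerprints represented as sets of "on" bit positions: the number of shared bits divided by the number of bits set in either fingerprint. This is a plain-Python illustration of the definition, not the Open Babel implementation, and the fingerprints shown are invented.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / union

# Invented bit positions, for illustration only.
mol_1 = {1, 2, 3, 4}
mol_2 = {3, 4, 5, 6}
similarity = tanimoto(mol_1, mol_2)  # 2 shared bits out of 6 distinct bits
```

The value ranges from 0 (no bits in common) to 1 (identical fingerprints), so pairs with TS close to 1 are the candidates examined for activity cliffs.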

The Structure-Activity Landscape Index (SALI) (MAGGIORA, 2006) was calculated as shown in equation 2 below.

$$\mathrm{SALI} = \frac{|A_a - A_b|}{1 - \mathrm{TS}} \qquad (2)$$

The terms $A_a$ and $A_b$ represent the biological activities of molecules a and b, and TS stands for the Tanimoto similarity between them. SALI was essential to determine the activity cliffs because it links the structural similarity of the molecules to their biological activity.
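As an illustration of equation 2, the following minimal Python sketch computes the Tanimoto similarity and SALI for one pair of molecules. It is not the workflow used in this study (the similarities here were obtained with Open Babel): the fingerprints are hypothetical sets of on-bit indices, the activities are invented, and the function names `tanimoto` and `sali` are chosen only for this example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 1.0  # assumption: two empty fingerprints are treated as identical
    return len(a & b) / union


def sali(act_a, act_b, ts):
    """SALI = |A_a - A_b| / (1 - TS); infinite when the fingerprints are identical."""
    if ts >= 1.0:
        return float("inf")
    return abs(act_a - act_b) / (1.0 - ts)


# Hypothetical fingerprints and activities, for illustration only
fp_1 = {1, 4, 7, 9, 12}
fp_2 = {1, 4, 7, 9, 15}
ts = tanimoto(fp_1, fp_2)  # 4 shared bits out of 6 distinct bits
print(round(ts, 3), round(sali(8.2, 6.1, ts), 2))
```

Because the denominator vanishes when TS = 1, pairs with identical fingerprints but different activities give an unbounded SALI in this sketch, which is why the identical-fingerprint case is handled explicitly.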

QSAR descriptor computation and filtering

To obtain the 3D descriptors, the Lennard-Jones (LJ), Electrostatic (QQ), Hydrogen bond (HB) and Hydrophobic (HF) descriptors were calculated as presented in Table 1. This approach was based on a previously described method (MARTINS et al., 2009). The molecular interaction field descriptors were calculated on a grid with a resolution of 1 Å, creating a matrix with a very large number of columns. To simplify the data and make modeling feasible, data points too far from the aligned molecules were removed by applying a variance cutoff of 0.02. Furthermore, intercorrelated descriptors were removed (at a 0.98 level), keeping only those best correlated with the dependent variable y.
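The two filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the R code used in the study: the descriptor names and values are invented, the population variance is used (the text does not say which variance estimator was applied), and `filter_descriptors` is a hypothetical helper name.

```python
from statistics import pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def filter_descriptors(columns, y, var_cutoff=0.02, corr_cutoff=0.98):
    """columns: dict mapping descriptor name -> values (one per molecule)."""
    # 1) drop near-constant columns, e.g. grid points far from every molecule
    kept = [k for k, v in columns.items() if pvariance(v) > var_cutoff]
    # 2) visit columns from most to least correlated with y, so that within an
    #    intercorrelated group only the one best correlated with y survives
    kept.sort(key=lambda k: abs(pearson(columns[k], y)), reverse=True)
    selected = []
    for k in kept:
        if all(abs(pearson(columns[k], columns[s])) < corr_cutoff for s in selected):
            selected.append(k)
    return selected


# Invented data: four molecules, four grid-point descriptors
y = [5.0, 6.0, 7.0, 8.0]
columns = {
    "LJ_1": [0.0, 1.0, 2.0, 3.0],
    "LJ_2": [0.0, 1.0, 2.0, 3.1],    # almost a copy of LJ_1
    "QQ_1": [0.01, 0.0, 0.01, 0.0],  # nearly invariant
    "HF_1": [1.0, 0.0, 1.0, 0.0],
}
print(filter_descriptors(columns, y))
```

In this toy case the nearly invariant column and the redundant copy are discarded, while the column best correlated with y is retained from the intercorrelated pair.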

Table 1. Potentials used to compute all the molecular interaction field descriptors.

| Descriptor (Abbreviation) | Equation | Description |
|---|---|---|
| Lennard-Jones (LJ) | $LJ_i = \sum_{a=1}^{n} 4\varepsilon \left[ \left( \frac{\sigma}{r_{a-p}} \right)^{12} - \left( \frac{\sigma}{r_{a-p}} \right)^{6} \right]$ | $LJ_i$ is the potential used to compute the van der Waals interactions of a molecule i with a probe p, separated by the distance $r_{a-p}$ and summed over all n atoms of the molecule. The values ε and σ are parameterized from Sybyl atom types in a .mol2 molecular file according to the GAFF force field (WANG et al., 2004). |
| Electrostatic (QQ) | $QQ_i = \sum_{a=1}^{n} \frac{q_a}{r_{a-p}}$ | $QQ_i$ is the potential used to compute the Coulombic interactions of a molecule i with a probe p, separated by the distance $r_{a-p}$ and summed over all n atoms of the molecule. The charge of each atom was computed with the OpenBabel (O’BOYLE et al., 2011) default charge. |
| Hydrogen bond (HB) and Hydrophobic (HF) atom perception | $HB_i \ \text{or} \ HF_i = \sum_{a=1}^{n} e^{-10 (r_{a-p} - 1)^2}$ | $HB_i$ and $HF_i$ are Gaussian functions employed to detect the presence of a hydrogen bond (HB) forming atom or a hydrophobic (HF) atom at around 1 Å from a probe p, using the distance $r_{a-p}$ and summed over all n atoms of the molecule. |
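The three potentials of Table 1 can be evaluated at a single grid point as in the sketch below. It is a simplified illustration rather than the program used in this work: a single ε and σ are applied to every atom (whereas the study used GAFF parameters per atom type), the coordinates and charges are invented, and the function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points given as (x, y, z) tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def lennard_jones(atoms, probe, epsilon, sigma):
    """Sum over atoms of 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    total = 0.0
    for xyz in atoms:
        r = distance(xyz, probe)
        total += 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    return total


def coulomb(atoms, charges, probe):
    """Sum over atoms of q_a / r_(a-p), with no unit conversion factor."""
    return sum(q / distance(xyz, probe) for xyz, q in zip(atoms, charges))


def gaussian_presence(atoms, probe):
    """Sum over atoms of exp(-10 (r - 1)^2); pass only HB-forming or hydrophobic atoms."""
    return sum(math.exp(-10.0 * (distance(xyz, probe) - 1.0) ** 2) for xyz in atoms)


# Invented two-atom example: the atoms lie 1 Å and 2 Å from the probe
atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
probe = (1.0, 0.0, 0.0)
print(lennard_jones(atoms, probe, epsilon=1.0, sigma=1.0))
print(coulomb(atoms, [0.5, -0.5], probe))
print(gaussian_presence(atoms, probe))
```

Repeating this evaluation for every point of the 1 Å grid and every aligned molecule yields one descriptor column per grid point and potential type.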

The PaDEL program (YAP, [s.d.]) was also employed to generate classical (2D) descriptors for the same aligned compounds described earlier. The number of descriptors obtained was greatly reduced through a series of filters using R version 3.4.4 (R CORE TEAM, 2018) and the caret package (KUHN et al., [s.d.]). These filters removed columns with missing values, columns that had the same value for all the samples (invariant ones) and those with a high intercorrelation (redundancy) above 0.98, similarly to what has been described previously (BARBOSA; FERREIRA, 2012; DE ARAÚJO SANTOS et al., 2015).

QSAR models were attempted using 2D descriptors, 3D descriptors and a combination of both types to predict the biological activity, as previously described (DE ARAÚJO SANTOS et al., 2015).

QSAR model validation and testing

All the descriptor data were filtered in R (R CORE TEAM, 2018), and the remaining descriptors were submitted to several approaches to obtain an optimized set. Ordered predictors selection (OPS) (MARTINS; FERREIRA, 2013) was the first approach used to reduce the data dimensionality. The stepwise multiple linear regression (S-MLR) method implemented in the NanoBRIDGES software (AMBURE et al., 2015) was then used to build a linear regression model from the set of independent variables output by the OPS reduction. The obtained models were evaluated based on the determination coefficient, $R^2$, and the leave-one-out coefficient, $Q^2_{LOO}$. A subset of compounds was also uniformly separated to compose an external dataset to test the predictive power of each QSAR model. Both the training set (70%) and the test set (30%) were chosen to have a similar molecular diversity and range of biological activity. The external predictive power ($Q^2_{ext}$), the determination coefficient ($R^2$) and the leave-one-out coefficient ($Q^2_{LOO}$) were computed according to equation 3 below. The models were considered robust and useful for prediction only if $R^2 > Q^2_{LOO}$ and higher than 0.70 (KIRALJ; FERREIRA, 2009). The full list of descriptors employed in each model is given in the supplementary material.

$$X^2 = 1 - \frac{\sum (y_{exp} - y_{pred})^2}{\sum (y_{exp} - \bar{y}_{exp})^2} \qquad (3)$$

In this equation, $y_{exp}$ is the experimental dependent variable, $\bar{y}_{exp}$ is its mean, and $y_{pred}$ is the value predicted by the QSAR model built with all samples (X = R), by leave-one-out cross-validation (X = $Q_{LOO}$) or for the samples of the test set (X = $Q_{ext}$).
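To make equation 3 concrete, the sketch below computes $R^2$ and $Q^2_{LOO}$ for a toy model. It deliberately uses a one-descriptor least-squares line instead of the multiple linear regression employed in this study, the data are invented, and the names `fit_line`, `x_squared` and `r2_and_q2_loo` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with a single descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b


def x_squared(y_exp, y_pred):
    """Equation 3: 1 - sum (y_exp - y_pred)^2 / sum (y_exp - mean(y_exp))^2."""
    mean = sum(y_exp) / len(y_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pred))
    ss_tot = sum((e - mean) ** 2 for e in y_exp)
    return 1.0 - ss_res / ss_tot


def r2_and_q2_loo(x, y):
    """R^2 from the model built with all samples; Q^2_LOO from leave-one-out predictions."""
    a, b = fit_line(x, y)
    r2 = x_squared(y, [a + b * xi for xi in x])
    loo_pred = []
    for i in range(len(x)):
        # refit without sample i, then predict sample i
        a_i, b_i = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        loo_pred.append(a_i + b_i * x[i])
    return r2, x_squared(y, loo_pred)


# Invented descriptor values and activities
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [5.1, 5.9, 7.2, 7.8, 9.1, 9.9]
r2, q2_loo = r2_and_q2_loo(x, y)
print(round(r2, 3), round(q2_loo, 3))
```

The same `x_squared` function gives $Q^2_{ext}$ when it is applied to the experimental and predicted activities of the external test set.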

Validation is an essential step in QSAR modeling, necessary to show that the models are statistically relevant and were not obtained by chance (KIRALJ; FERREIRA, 2009). To show that the models were not obtained by chance, they were subjected to y-randomization: each model was rebuilt 50 times using scrambled biological activities (the dependent variable). The y-randomization results are plotted as $R^2$ vs $Q^2_{LOO}$ for every repetition and for the real model built with the real y. The $R^2$ and $Q^2_{LOO}$ values for the random y’s must be far from the real ones. Lastly, to test the robustness of the models, leave-N-out (LNO) cross-validation was used. LNO constructs several models by gradually removing N samples, up to 50% of the number of samples. New models were built and $Q^2_{LNO}$ was compared to $Q^2_{LOO}$. The difference between these values should not oscillate by more than 0.1 with 25% of the samples left out (KIRALJ; FERREIRA, 2009). The LNO results were analyzed with box plots, which made it possible to account for anomalous $Q^2_{LNO}$ values arising from rare events in which the information of the samples left out is not represented in the compounds used to build the models. This phenomenon becomes more frequent as N increases.
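The y-randomization procedure can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a one-descriptor least-squares line rather than the models of this study, it reports only $R^2$ (not $Q^2_{LOO}$) for each scrambled y, the data are invented, and the function names and the fixed random seed are choices made for this example.

```python
import random


def r_squared(x, y):
    """R^2 of a one-descriptor least-squares line (the squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def y_randomization(x, y, n_rounds=50, seed=0):
    """Return the real R^2 and the R^2 values obtained with scrambled activities."""
    rng = random.Random(seed)  # fixed seed only to make the example reproducible
    scrambled_scores = []
    for _ in range(n_rounds):
        y_scrambled = list(y)
        rng.shuffle(y_scrambled)  # breaks the link between descriptors and activity
        scrambled_scores.append(r_squared(x, y_scrambled))
    return r_squared(x, y), scrambled_scores


# Invented descriptor values and activities
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [5.0, 5.6, 5.9, 6.7, 7.1, 7.4, 8.2, 8.5, 9.1, 9.4]
real_r2, random_r2 = y_randomization(x, y)
print(round(real_r2, 3), round(max(random_r2), 3))
```

A model is taken as not being a chance correlation when the statistics of the real model lie clearly apart from the cloud of values produced by the scrambled activities.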

Design of new CYP51 inhibitors

Novel inhibitors were designed based on the structural features of the compounds used to build the local QSAR models, as well as on their synthetic pathways, in order to ensure synthetic accessibility. Binding site features were also considered in the design of the new compounds. Substituents were searched in the SciFinder database so that only existing fragments were considered. Every fragment was introduced in the pharmacophore region of the corresponding local model and drawn using Avogadro (AVOGADRO, [s.d.]). The charges were adjusted for pH 7.4 and checked with MarvinSketch 14.11.3.0 (MARVINSKETCH, 2014). After construction, the compounds were submitted to molecular geometry optimization at the PM7 semiempirical level in MOPAC (STEWART, [s.d.]) using implicit COSMO solvation (KLAMT; SCHUURMANN, 1993).

Every optimized molecule was aligned using as reference the most active molecule of each local model. These alignments were achieved via an ad hoc shell script using the ant colony optimization methodology employed in the pharmACOphore algorithm (KORB et al., 2010). All aligned molecules were visualized with UCSF Chimera (PETTERSEN et al., 2004) in order to ensure the alignment quality.

After all the molecules were aligned, the 2D molecular descriptors were calculated using the PaDEL program (YAP, [s.d.]). The 3D descriptors were calculated using the same procedure described previously (Table 1). For each local model, only the 2D and 3D descriptors present in that model were selected. These selections were made in R version 3.4.4 (R CORE TEAM, 2018) using the dplyr package (WICKHAM; FRANÇOIS, 2019). The selected descriptors were then used to predict the activity with the local QSAR models.

Results and discussion

CYP51 inhibitors databank analysis and alignment of the molecules

A total of 376 molecules were considered initially, but some of them were not adequate for building QSAR models (ANDRIANI et al., 2013; BUCKNER et al., 2012; DE VITA et al., 2016; FERREIRA DE ALMEIDA FIUZA et al., 2018; FRANKLIM et al., 2013; FRIGGERI et al., 2014; KRAUS et al., 2009, 2010; LEPESHEVA et al., 2008) for one or more of the following reasons: few molecules with the same scaffold were tested, the stereochemistry was not defined, or the biological activity of inactive compounds was reported without a precise value. The molecules were separated based on their chemical similarity using Gephi (BASTIAN; HEYMANN; JACOMY, 2009), as illustrated in Fig. 1. This analysis helped identify the potencies and types of molecules available that were likely to lead to QSAR models. A series was only considered for modeling if it had over 30 compounds with measured potencies and a defined stereochemical configuration.

Fig. 1 Clusters of molecules separated based on their chemical similarity and colored by their graph modularity. The size of each sphere reflects how high its pEC50 or pIC50 is. Each local model is identified by a letter: A stands for aminopyridyl (CALVET et al., 2014; CHOI et al., 2013, 2014; VIEIRA et al., 2014a, 2014b), B for dialkylimidazole (SURYADEVARA et al., 2013) and C for imidazole (PAPADOPOULOU et al., 2014, 2015b, 2015a). One compound of each series is shown next to each red circle, representing its main scaffold.

A total of 197 molecules were considered suitable for building QSAR models and were used to create three local models. The molecules of each local model (Fig. 1) had a similar scaffold and were characterized by the same type of group interacting with the T. cruzi CYP51 heme: i. the aminopyridyl (A) model (CALVET et al., 2014; CHOI et al., 2013, 2014; VIEIRA et al., 2014a, 2014b) was characterized by a pyridine group interacting with the CYP51 heme and was built using 84 molecules; ii. the dialkylimidazole (B) model (SURYADEVARA et al., 2013) was characterized by an imidazole group and was built using 67 molecules; and iii. the imidazole (C) model (PAPADOPOULOU et al., 2014, 2015b, 2015a) was characterized by either an imidazole or a triazole group and was built using 46 molecules.

All of these molecules were aligned to a crystallized ligand of their own series when one was available in the PDB, or otherwise to the most similar crystallized ligand found in this database. The spatial coordinates of the bioactive conformation were estimated by automatic alignment using flexible superimposition with pharmACOphore, which is based on ant colony optimization (KORB et al., 2010), and additional manual adjustments were performed in UCSF Chimera (PETTERSEN et al., 2004) to achieve a more precise alignment. The local group A compounds were aligned to the bound ligand available in PDB code 4BY0 (CHOI et al., 2014), whereas the local group B and C compounds were both aligned to the bound ligand in the crystallized CYP51 receptor available in PDB code 3ZG2 (HARGROVE et al., 2013). The molecules used in the models were aligned as seen in Fig. 2. The alignment proved to be a crucial step in creating good 3D-QSAR models: building these models with only the automated alignment was attempted and was unsuccessful, and the models improved considerably when the alignment was optimized.

Fig. 2 Alignment of the molecules of each local model. Aminopyridyl (A) series of molecules aligned to the heme group of CYP51 (PDB: 4BY0 (CHOI et al., 2014)). The dialkylimidazole (B) and imidazole (C) series of molecules aligned to the heme group of CYP51 (PDB: 3ZG2 (HARGROVE et al., 2013)).

Activity cliffs

An activity cliff analysis was performed to examine differences in biological activity caused by small modifications in the chemical structure of the molecules and to aid the interpretation of the QSAR models. For instance, in set A (CALVET et al., 2014; CHOI et al., 2013, 2014; VIEIRA et al., 2014a, 2014b) we found two molecules composed of the same atoms that differed only as enantiomers (Fig. 3). In this case, the R-enantiomer presented a higher potency (pEC50 = 10.77) than its S-enantiomer (pEC50 = 6.01). For this series, the R-enantiomers are more potent than the S-enantiomers. The presence of halogen atoms also seems to improve the potency of the molecules in this series. For local model B (SURYADEVARA et al., 2013), the addition of a methyl substituent at the second carbon of the imidazole group apparently impedes the interaction with the CYP51 heme and also clashes with the Ala91 residue. This modification leaves the molecule with practically no activity, as also observed by the authors who tested the compounds (SURYADEVARA et al., 2013). The use of a piperidine as a substituent on the phenyl connected to an amine reduced the biological activity. The C model (PAPADOPOULOU et al., 2014, 2015b, 2015a) showed a significant activity cliff in which nitrotriazole groups significantly improved the biological activity compared to nitroimidazole groups, which were also reported as toxic to the host cells (PAPADOPOULOU et al., 2014).

Fig. 3 Activity cliffs 2D representations. The first three molecules in the upper part of the figure are from local model A. The other 4 compounds are from local model C.

2D QSAR models

The PM7-optimized compounds (STEWART, [s.d.]) were used to build 2D-QSAR models. The optimized structures were submitted to PaDEL (YAP, [s.d.]) to calculate thousands of descriptors. Unfortunately, even after several attempts, the models built using only 2D descriptors were not effective in predicting the biological activity. Even though the output from NanoBRIDGES MLR (AMBURE et al., 2015) presented adequate values of $R^2$ and $Q^2_{LOO}$, the descriptors did not predict the biological activity of the test set accurately. The best result obtained using mostly 2D descriptors was for the dialkylimidazole molecules, for which a $Q^2_{ext}$ of 0.53 was obtained using 11 descriptors, and many outliers were still observed.

3D-QSAR models

The 3D descriptors calculated from the aligned compounds were thoroughly filtered in R, eliminating descriptors with poor-quality data, such as missing values and columns that were invariant across the samples. After this filtering, the remaining descriptors were submitted to QSAR Modelling OPS (MARTINS; FERREIRA, 2013) and NanoBRIDGES MLR (AMBURE et al., 2015) to select those most representative of the biological activity. The selected descriptors were then used to predict the activity of the molecules through the regression vector generated by the MLR.

These models were considered inadequate (marked as –, in Table 2), but after many attempts we have obtained models with mixed 2D and 3D descriptors that predict the molecules’ biological activity much better than using these descriptors separately as shown in Table 2.

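Table 2 Determination coefficient ($R^2$), leave-one-out coefficient ($Q^2_{LOO}$) and external predictive power ($Q^2_{ext}$) for all the models using 2D descriptors, 3D descriptors and a hybrid approach (2D + 3D, Hybrid).

| Descriptors | $R^2$ (A) | $Q^2_{LOO}$ (A) | $Q^2_{ext}$ (A) | $R^2$ (B) | $Q^2_{LOO}$ (B) | $Q^2_{ext}$ (B) | $R^2$ (C) | $Q^2_{LOO}$ (C) | $Q^2_{ext}$ (C) |
|---|---|---|---|---|---|---|---|---|---|
| 2D | 0.35 | 0.32 | - | 0.66 | 0.62 | - | 0.64 | 0.58 | - |
| 3D | - | - | - | 0.88 | 0.68 | 0.79 |
| Hybrid | 0.92 | 0.79 | 0.80 | 0.91 | 0.86 | 0.68 | 0.93 | 0.88 | 0.85 |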

Mixed 2D and 3D-QSAR models

The best predictions of the biological activity were obtained using a mix of 2D and 3D descriptors. These hybrid 2D+3D models performed much better than the single-type descriptor models described before in predicting the biological activity of the external set molecules. The models are interpreted below for each local model. All the 3D descriptors are displayed along with the most active compound of each local model. All the models were validated using an LNO test to assess their robustness, and y-randomization tests showed that the models were not obtained by chance.

Local model A

The A model was built using 84 compounds ($R^2$ = 0.92, $Q^2_{LOO}$ = 0.79, $Q^2_{ext}$ = 0.80) and needed various descriptors to be able to predict the biological activity. This model comprised two classical (2D) descriptors and 12 3D descriptors. The descriptors were also classified as positive or negative, based on the sign of the PLS regression vector. There were one positive LJ, three negative LJ, two positive HF, three negative HF, one positive HB, one positive QQ and one negative QQ descriptors, as shown in Fig. 4. The pyridine group is the main part that interacts with the heme group of T. cruzi CYP51, and it is surrounded by all the negative HF descriptors (3-5). This portion of the molecule and the pyrrolidine-containing part remain unchanged for most of the molecules in the series.

The portion of the molecule that has the greatest influence on the biological activity is surrounded by eight of the twelve 3D descriptors (2, 6-12). Molecules with groups so small that they only reach as far as the first negative LJ descriptor (6), or so large that they reach past the negative QQ descriptor (12), tend to have worse biological activity than the reference molecule displayed in Fig. 4, which is the most potent molecule in the series with a pEC50 = 10.77.

Fig. 4 Local model A. A representation of the most potent molecule of the aminopyridyl series is displayed above. The 3D descriptors are represented as spheres, the 2D descriptors are written in the figure, and all of them were used to predict the activity of this series. The 3D descriptors are represented as positive LJ (6), negative LJ (8, 9 and 10), negative HF (3, 4 and 5), positive HF (1 and 2), positive QQ (11) and negative QQ (12). The graph on the right shows the distribution of the molecules in the aminopyridyl model, comparing the predicted to the experimental pEC50 values of the training and external sets of molecules. The training set is displayed in red and the external set is displayed in blue.

Local model B

The B model (SURYADEVARA et al., 2013) had only a few outliers, with $R^2$ = 0.90, $Q^2_{LOO}$ = 0.86 and $Q^2_{ext}$ = 0.68. Seven 2D descriptors and five 3D descriptors were used to construct this model. There are three positive LJ descriptors and two negative HF descriptors, represented in blue and grey respectively and displayed near the most potent molecule of the series (pEC50 = 9.7) in Fig. 5. The HF descriptors (4, 5) tend to indicate the presence of hydrophobic parts of the molecule and are therefore located near predominantly hydrophobic groups.

Fig. 5 Local model B. A representation of the most potent molecule of the dialkylimidazole series is displayed above. The 3D descriptors are represented as spheres, the 2D descriptors are written in the figure, and all of them were used to predict the activity of this series. The 3D descriptors are represented as positive LJ descriptors (1, 2 and 3) and negative HF descriptors (4 and 5). The graph on the right shows the distribution of the molecules in the dialkylimidazole model, comparing the predicted to the experimental pEC50 values of the training and external sets of molecules. The training set is displayed in red and the external set is displayed in blue.

The three LJ descriptors (1-3) seem to indicate the presence or absence of nearby chemical groups. One of them (3) lies right next to the imidazole group, which is essential for the interaction with the parasite’s CYP51. The addition of small groups to the sides of the imidazole seems to be worse than the addition of bulkier hydrophobic groups. The presence of a phenyl group attached to the pyridine ring also improves the biological activity; this phenyl ring lies right next to an LJ descriptor (2).

Local model C

The C model (PAPADOPOULOU et al., 2014, 2015b, 2015a) has $R^2$ = 0.93, $Q^2_{LOO}$ = 0.88 and $Q^2_{ext}$ = 0.85 and is composed of one 3D descriptor and five 2D descriptors. The single 3D descriptor is a negative QQ descriptor (Fig. 6). The most potent molecule in the series is displayed in the figure and has a pIC50 = 8.1. As with many of the 3D descriptors, the negative QQ descriptor in this case does not seem to have a linear relation to the biological activity. The molecules that contain nitrotriazole groups have much better biological activity than those with nitroimidazole groups (PAPADOPOULOU et al., 2015b). It was also observed that a methyl substituent on the imidazole group interferes with the interaction with the heme group.

Fig. 6 Local model C. A representation of the most potent molecule of the imidazole series is displayed above. The 3D descriptor is represented as a sphere, the 2D descriptors are written in the figure, and all of them were used to predict the activity of this series. The graph on the right shows the distribution of the molecules in the imidazole model, comparing the predicted to the experimental pIC50 values of the training and external sets of molecules. The training set is displayed in red and the external set is displayed in blue.

Validation of the QSAR models

All the models were validated by LNO tests. The A model withstood the removal of up to 18 samples, the B model the removal of 21 samples and the C model the removal of 16 samples, which indicates that robust models were obtained, as shown in Fig. 7. Additionally, a y-randomization test was performed for all the models, demonstrating that they were not obtained by chance, since the real models lie far from the models generated with randomized activities (Fig. 8).

Fig. 7 Leave-N-out validation applied to models A, B and C, displayed in this order from top to bottom. The figure shows that the models are able to withstand the removal of several samples.

Fig. 8 Y-randomization of models A, B and C, displayed in this order from left to right. The randomly generated models lie far from the real models.

Design of new CYP51 inhibitors

Local model A

After planning and predicting the activity of the analogues, seven compounds with particularly high CYP51 inhibitor potential were selected and are displayed in Table 3.

R Predicted (pEC50) 9.26 8.57 8.63 8.55 9.18 8.34

Every newly designed inhibitor has an R group that follows the structural features necessary for promising activity. The most difficult part of constructing the new inhibitors was ensuring that this group had the size indicated as necessary for good inhibitory activity. Accordingly, all the compounds in Table 3 had an R group large enough to explore the area indicated by the first negative LJ descriptor (6) but not so large as to reach past the negative QQ descriptor (12), as illustrated in Fig. 4. In addition, the high predicted values obtained for these molecules, which share similar chemical features, suggest good precision of the activity predictions made by local model A.

None of the compounds showed a predicted pEC50 higher than that of the most potent molecule in the series (pEC50 = 10.77). However, it is important to highlight the presence of a triazole group in place of the benzene ring that is present in most of the molecules of this local model. The triazole group is widely known in the literature for its bioactivity and has been used in the structures of several new chemical entities, particularly because it can make relevant intermolecular interactions with the amino acid residues of various receptors (TOTOBENAZARA; BURKE, 2015; ZHANG et al., 2014). In addition, because of the large number of studies, many existing fragments contain this group, which favors the synthetic accessibility of the compounds.

Local model B

In this model, the activity prediction pointed to six promising newly designed inhibitors, as shown in Table 4.

Table 4 Six most promising compounds designed based on the structures of the local model B

R Predicted (pEC50) 8.8 8.8 11.8 11.3 12.2

Every compound had the imidazole group preserved in its structure, since its presence was confirmed to be essential for the inhibitory activity against the parasite’s CYP51. In addition, large groups with benzene rings and halogen atoms as substituents were included in both R regions. These chemical characteristics follow the trends indicated by the descriptors of local model B, which explains the high pEC50 values predicted for the analogues.

Three of the six compounds showed a higher predicted pEC50 than the most potent molecule of the series (pEC50 = 9.7). However, QSAR models are generally interpolative and have difficulty extrapolating, which implies that an activity prediction is more reliable when the value lies between the lowest and highest activities of the series (SAHIGARA et al., 2012). Even so, the predicted pEC50 values are in accordance with the chemical structure characteristics of the compounds, which makes them very interesting molecules for synthesis and subsequent in vitro tests.

Local model C

Six interesting molecules with particularly high predicted pIC50 values were also designed based on this local model (Table 5).

Table 5 Six most promising compounds designed based on the structures of the local model C

R Predicted (pIC50) 7.6 7.5 7.3 8.5 9.5 7.6

These compounds were planned with nitrotriazole groups, which were shown to be an important structural feature for the biological activity (PAPADOPOULOU et al., 2015b). Also, increasing the number of halogen atoms led to a direct improvement in the biological activity, which may be related to a better capacity of the molecules to enter the parasite’s cells.

Similarly to the compounds constructed based on model B, two of the six planned molecules showed a higher predicted pIC50 than the most potent molecule of the series (pIC50 = 8.1). As mentioned before, predictions of biological activity are generally most reliable when they lie in the interval between the lowest and highest values of the series, because QSAR models are interpolative (SAHIGARA et al., 2012). Nevertheless, the compounds remain promising candidates for future synthesis.
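The range check discussed above can be illustrated with the predicted values of Table 5. The sketch below is a minimal example and not a full applicability-domain analysis: it only compares predicted activities with the highest activity of the series (pIC50 = 8.1), the lowest activity is left optional because it is not quoted in the text, and the function name `extrapolated` is hypothetical.

```python
def extrapolated(predicted, highest_training_value, lowest_training_value=None):
    """Return the predictions lying outside the activity range of the series."""
    flagged = []
    for value in predicted:
        above = value > highest_training_value
        below = lowest_training_value is not None and value < lowest_training_value
        if above or below:
            flagged.append(value)
    return flagged


table_5_predictions = [7.6, 7.5, 7.3, 8.5, 9.5, 7.6]  # predicted pIC50 values from Table 5
print(extrapolated(table_5_predictions, highest_training_value=8.1))  # prints [8.5, 9.5]
```

The two flagged values correspond to the two molecules predicted above the most potent compound of the series, which are the predictions that should be read with the most caution.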

Conclusions

The search for new drugs to treat Chagas disease is still relevant because, in over 40 years, no new effective treatment has been developed against T. cruzi and the existing treatments are not effective in the chronic phase of the disease. In this study, we demonstrated that the interpretation of an activity cliff analysis paired with models generated using both classical 2D and 3D QSAR descriptors worked better for understanding and predicting the biological activity than using only one type of descriptor. The models were internally validated and their predictive power was assessed on an external dataset. The obtained models were employed to predict the inhibitory activity of several hypothetical T. cruzi CYP51 inhibitors. These compounds were generated through the interpretation of the descriptors present in each model. All designed analogs were predicted to be potent and are expected to be synthetically feasible. These results could guide future in vitro or in vivo tests.

References

AMBURE, P.; AHER, R. B.; GAJEWICZ, A.; PUZYN, T.; ROY, K. “NanoBRIDGES” software: Open access tools to perform QSAR and nano-QSAR modeling. Chemometrics and Intelligent Laboratory Systems, [s. l.], v. 147, p. 1–13, 2015. Available at: <http://dx.doi.org/10.1016/j.chemolab.2015.07.007>

ANDRIANI, G.; AMATA, E.; BEATTY, J.; CLEMENTS, Z.; COFFEY, B. J.; COURTEMANCHE, G.; DEVINE, W.; ERATH, J.; JUDA, C. E.; WAWRZAK, Z.; WOOD, J. T.; LEPESHEVA, G. I.; RODRIGUEZ, A.; POLLASTRI, M. P. Antitrypanosomal Lead Discovery: Identification of a Ligand-Efficient Inhibitor of Trypanosoma cruzi CYP51 and Parasite Growth. Journal of Medicinal Chemistry, [s. l.], v. 56, n. 6, p. 2556–2567, 2013. Available at: <https://doi.org/10.1021/jm400012e>

ARNOTT, J. A.; PLANEY, S. L. The influence of lipophilicity in drug discovery and design. Expert Opinion on Drug Discovery, [s. l.], v. 7, n. 10, p. 863–875, 2012. Available at: <http://www.tandfonline.com/doi/full/10.1517/17460441.2012.714363>

Avogadro: an open-source molecular builder and visualization tool. [s.d.]. Available at: <http://avogadro.cc/>

BARBOSA-SILVA, A. N.; DA CÂMARA, A. C. J.; MARTINS, K.; NUNES, D. F.; DE OLIVEIRA, P. I. C.; DE AZEVEDO, P. R. M.; CHIARI, E.; GALVÃO, L. M. da C. Characteristics of triatomine infestation and natural Trypanosoma cruzi infection in the state of Rio grande do norte, Brazil. Revista da Sociedade Brasileira de Medicina Tropical, [s. l.], v. 49, n. 1, p. 57–67, 2016.

Molecular Informatics, [s. l.], v. 31, n. 1, p. 75–84, 2012.

BASTIAN, M.; HEYMANN, S.; JACOMY, M. Gephi: An Open Source Software for Exploring and Manipulating Networks. In: ICWSM 2009, Anais... [s.l: s.n.]

BENDER, A.; GLEN, R. C. Molecular similarity: A key technique in molecular informatics. Organic and Biomolecular Chemistry, [s. l.], v. 2, n. 22, p. 3204–3218, 2004.

BRITO, C. R. do N.; SAMPAIO, G. H. F.; CÂMARA, A. C. J. Da; NUNES, D. F.; AZEVEDO, P. R. M. De; CHIARI, E.; GALVÃO, L. M. da C. Seroepidemiology of Trypanosoma cruzi infection in the semiarid rural zone of the State of Rio Grande do Norte, Brazil. Revista da Sociedade Brasileira de Medicina Tropical, [s. l.], v. 45, n. 3, p. 346–352, 2012. Available at: <http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0037-86822012000300013&lng=en&tlng=en>

BUCKNER, F. S.; BAHIA, M. T.; SURYADEVARA, P. K.; WHITE, K. L.; SHACKLEFORD, D. M.; CHENNAMANENI, N. K.; HULVERSON, M. A.; LAYDBAK, J. U.; CHATELAIN, E.; SCANDALE, I.; VERLINDE, C. L. M. J.; CHARMAN, S. A.; LEPESHEVA, G. I.; GELB, M. H. Pharmacological characterization, structural studies, and in vivo activities of anti-chagas disease lead compounds derived from tipifarnib. Antimicrobial Agents and Chemotherapy, [s. l.], v. 56, n. 9, p. 4914–4921, 2012.

CALVET, C. M.; VIEIRA, D. F.; CHOI, J. Y.; KELLAR, D.; CAMERON, M. D.; SIQUEIRA-NETO, J. L.; GUT, J.; JOHNSTON, J. B.; LIN, L.; KHAN, S.; MCKERROW, J. H.; ROUSH, W. R.; PODUST, L. M. 4-Aminopyridyl-Based CYP51 Inhibitors as Anti-Trypanosoma cruzi Drug Leads with Improved Pharmacokinetic Profile and in Vivo Potency. Journal of Medicinal Chemistry, [s. l.], v. 57, p. 6989–7005, 2014.

CASTRO, J. A.; MONTALTO, M.; BARTEL, L. C. Toxic side effects of drugs used to treat Chagas’ disease (American trypanosomiasis). Human & Experimental Toxicology, [s. l.], v. 25, n. 8, p. 471–479, 2006.

CHOI, J. Y.; CALVET, C. M.; GUNATILLEKE, S. S.; RUIZ, C.; CAMERON, M. D.; MCKERROW, J. H.; PODUST, L. M.; ROUSH, W. R. Rational development of 4-aminopyridyl-based inhibitors targeting trypanosoma cruzi CYP51 as anti-chagas agents. Journal of Medicinal Chemistry, [s. l.], v. 56, n. 19, p. 7651–7668, 2013.

CHOI, J. Y.; CALVET, C. M.; VIEIRA, D. F.; GUNATILLEKE, S. S.; CAMERON, M. D.; MCKERROW, J. H.; PODUST, L. M.; ROUSH, W. R. R-configuration of 4-aminopyridyl-based inhibitors of CYP51 confers superior efficacy against Trypanosoma cruzi. ACS Medicinal Chemistry Letters, [s. l.], v. 5, n. 4, p. 434–439, 2014.

COURA, J. R.; DE CASTRO, S. L. A critical review on chagas disease chemotherapy. Memorias do Instituto Oswaldo Cruz, [s. l.], v. 97, n. 1, p. 3–24, 2002.

COURA, J. R.; VIÑAS, P. A. Chagas disease: a new worldwide challenge. Nature, [s. l.], v. 465, p. S6, 2010. Available at: <http://dx.doi.org/10.1038/nature09221>

DA CÂMARA, A. C. J.; LAGES-SILVA, E.; SAMPAIO, G. H. F.; D’ÁVILA, D. A.; CHIARI, E.; DA CUNHA GALVÃO, L. M. Homogeneity of Trypanosoma cruzi I, II, and III populations and the overlap of wild and domestic transmission cycles by Triatoma brasiliensis in northeastern Brazil. Parasitology Research, [s. l.], v. 112, n. 4, p. 1543–1550, 2013.

DE ARAÚJO SANTOS, R. A.; BRAZ, C. A.; GHASEMI, J. B.; SAFAVI-SOHI, R.; BARBOSA, E. G. Mixed 2D-3D-LQTA-QSAR study of a series of Plasmodium falciparum dUTPase inhibitors. Medicinal Chemistry Research, [s. l.], v. 24, n. 3, p. 1098–1111, 2015.

DE VITA, D.; MORACA, F.; ZAMPERINI, C.; PANDOLFI, F.; DI SANTO, R.; MATHEEUSSEN, A.; MAES, L.; TORTORELLA, S.; SCIPIONE, L. In vitro screening of 2-(1H-imidazol-1-yl)-1-phenylethanol derivatives as antiprotozoal agents and docking studies on Trypanosoma cruzi CYP51. European Journal of Medicinal Chemistry, [s. l.], v. 113, p. 28–33, 2016.

FERREIRA DE ALMEIDA FIUZA, L.; PERES, R. B.; SIMÕES-SILVA, M. R.; DA SILVA, P. B.; BATISTA, D. da G. J.; DA SILVA, C. F.; NEFERTITI SILVA DA GAMA, A.; KRISHNA REDDY, T. R.; SOEIRO, M. de N. C. Identification of Pyrazolo[3,4-e][1,4]thiazepin based CYP51 inhibitors as potential Chagas disease therapeutic alternative: In vitro and in vivo evaluation, binding mode prediction and SAR exploration. European Journal of Medicinal Chemistry, [s. l.], v. 149, p. 257–268, 2018.

FRANKLIM, T. N.; FREIRE-DE-LIMA, L.; DE NAZARETH SÁ DINIZ, J.; PREVIATO, J. O.; CASTRO, R. N.; MENDONÇA-PREVIATO, L.; DE LIMA, M. E. F. Design, synthesis and trypanocidal evaluation of novel 1,2,4-triazoles-3-thiones derived from natural piperine. Molecules, [s. l.], v. 18, n. 6, p. 6366–6382, 2013.

FRIGGERI, L.; HARGROVE, T. Y.; RACHAKONDA, G.; WILLIAMS, A. D.; WAWRZAK, Z.; DI SANTO, R.; DE VITA, D.; WATERMAN, M. R.; TORTORELLA, S.; VILLALTA, F.; LEPESHEVA, G. I. Structural basis for rational design of inhibitors targeting Trypanosoma cruzi Sterol 14α-demethylase: Two regions of the enzyme molecule potentiate its inhibition. Journal of Medicinal Chemistry, [s. l.], v. 57, n. 15, p. 6704–6717, 2014.

WICKHAM, H.; FRANÇOIS, R.; HENRY, L.; MÜLLER, K. dplyr: A Grammar of Data Manipulation. R package version 0.8.0.1, 2019.

HANWELL, M. D.; CURTIS, D. E.; LONIE, D. C.; VANDERMEERSCH, T.; ZUREK, E.; HUTCHISON, G. R. Avogadro: an advanced semantic chemical editor, visualization, and analysis platform. Journal of Cheminformatics, [s. l.], v. 4, n. 1, p. 17, 2012. Available at: <https://doi.org/10.1186/1758-2946-4-17>

HARGROVE, T. Y.; WAWRZAK, Z.; ALEXANDER, P. W.; CHAPLIN, J. H.; KEENAN, M.; CHARMAN, S. A.; PEREZ, C. J.; WATERMAN, M. R.; CHATELAIN, E.; LEPESHEVA, G. I. Complexes of Trypanosoma cruzi Sterol 14α-Demethylase (CYP51) with Two Pyridine-based Drug Candidates for Chagas Disease: Structural basis for pathogen selectivity. Journal of Biological Chemistry, [s. l.], v. 288, n. 44, p. 31602–31615, 2013.

STEWART, J. J. P. MOPAC2016. Colorado Springs, CO: Stewart Computational Chemistry, [s.d.]. Available at: <http://openmopac.net>

KIRALJ, R.; FERREIRA, M. Basic validation procedures for regression models in QSAR and QSPR studies: theory and application. Journal of the Brazilian Chemical Society, [s. l.], v. 20, n. 4, p. 770–787, 2009. Available at: <http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0103-50532009000400021&lng=en&nrm=iso>

KLAMT, A.; SCHUURMANN, G. COSMO: a new approach to dielectric screening in solvents with explicit expressions for the screening energy and its gradient. Journal of the Chemical Society, Perkin Transactions 2, [s. l.], n. 5, p. 799–805, 1993.

KORB, O.; MONECKE, P.; HESSLER, G.; STÜTZLE, T.; EXNER, T. E. PharmACOphore: Multiple flexible ligand alignment based on ant colony optimization. Journal of Chemical Information and Modeling, [s. l.], v. 50, n. 9, p. 1669–1681, 2010.
