INITIAL STUDY ON SARS-COV-2 MAIN PROTEASE INHIBITION MECHANISM OF SOME POTENTIAL DRUGS USING MOLECULAR DOCKING SIMULATION

Infection with the new coronavirus SARS-CoV-2 (the cause of COVID-19) is a worldwide emergency; however, no antiviral treatment or vaccine is available to date. The 3C-like protease (3CLpro) is the main protease of SARS-CoV-2 and is involved in processing the polypeptide translated from the genomic RNA into the protein components required for virus replication. The crystal structure of this protease was rapidly resolved and has recently been made publicly available in the Protein Data Bank. Scientists have made many efforts against the disease, including the use of several commercial medicines known from HIV therapy or as antimalarial/antibiotic agents, such as arbidol, chloroquine, hydroxychloroquine, azithromycin, darunavir, remdesivir and lopinavir/ritonavir. These drugs have exhibited significant efficacy in clinical use; however, an atomic-level understanding of how these compounds inhibit the SARS-CoV-2 protease is still lacking. Therefore, a docking protocol was employed to rapidly estimate the binding affinity and binding pose of six drugs on the main protease of SARS-CoV-2. The results may help to shed light on the interaction mechanism of these compounds with the protein and thus suggest an efficient approach to drug discovery and treatment.


INTRODUCTION
Severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome coronavirus (MERS-CoV) and the new coronavirus (SARS-CoV-2) belong to the genus Betacoronavirus, whose members contain a single positive-stranded RNA genome of 26 to 32 kb and cause a wide array of respiratory, gastrointestinal and neurological diseases in human hosts [1,2]. In December 2019, the World Health Organisation (WHO) officially announced a cluster of cases detected in Wuhan city, Hubei province, China [3], and the infection has since spread to more than 213 countries and territories. As of July 22, 2020, the total number of confirmed COVID-19 infections was 15,096,315 and the number of deaths had reached 619,520 [4], with no sign of decline in the near future. The rapid increase in the number of infected patients urges the scientific community to find vaccines or drugs against this disease. However, this process is time-consuming and may take many years to complete, because it follows the traditional research pathway and the safety testing of newly developed vaccines and drugs is a major concern. In the meantime, an alternative therapy to treat infections and reduce deaths is needed.
Scientists worldwide have made many research efforts to find an efficient way to treat the disease, focusing especially on potential drug targets. The spike protein of SARS-CoV-2 was quickly identified as a novel target for drug development because of its high affinity for the human cell receptor ACE2, which initiates the molecular events that release the viral genome into the cell [5,6]. In addition, coronaviruses usually encode two or three viral proteases. In the case of SARS-CoV, the two identified proteases are a papain-like cysteine protease (PLpro) [7] and a chymotrypsin-like cysteine protease (3CLpro) [8], also known as the main protease. Multiple functional domains that are active in coronavirus replication are present in nonstructural protein 3 (nsp3), the largest protein encoded by the coronavirus genome [9]. The 3C-like protease (3CLpro) has been shown to be involved in processing the polypeptide translated from the genomic RNA into the structural and non-structural protein components required for replication and packaging of new virus particles. Notably, the crystal structure of the COVID-19 main protease in complex with a peptidomimetic inhibitor (PDB ID: 6LU7) was recently determined by scientists from China and made available in the Protein Data Bank [10]. The RNA genome of the new coronavirus has been reported to share up to 82% identity with that of SARS-CoV. Thanks to untiring research efforts, several medicines known from HIV therapy or as antimalarial/antibiotic agents, including arbidol, chloroquine, hydroxychloroquine, azithromycin, darunavir, remdesivir and lopinavir/ritonavir, have been used in the treatment of SARS-CoV infections and have shown high efficacy [11-13].
Computer-aided drug design is now routinely used to search for new drug candidates. One of its most popular methods is molecular docking, which focuses on analyzing the interaction between receptors and compounds [14]. This technique explores the mechanism by which a compound inhibits its target receptor, based on the binding free energy and the amino acids participating in the interaction, and thereby helps scientists to shorten the time and reduce the cost of new drug development. Its application has recently resulted in many publications that predict potentially bioactive compounds against specific biological targets of SARS-CoV-2 [15-19]. With the hope of quickly identifying candidate drugs for COVID-19 therapy, we conducted molecular docking with the AutoDock4 tool to simulate the interaction between some potential medicines and the main protease of SARS-CoV-2. The results may reveal the mechanism by which the target protease is inhibited and thus provide an additional opportunity to find promising drugs quickly.

Protein preparation
The crystal structure of the COVID-19 main protease in complex with the peptidomimetic inhibitor N3 (PDB ID: 6LU7) was obtained from the Protein Data Bank [10]. AutoDockTools (MGLTools) was used to prepare the protein for docking simulations [20]. To turn the protein molecule into a free receptor, the water molecules and the standard inhibitor were removed. Polar hydrogen atoms, default Kollman charges and solvation parameters were then assigned to the protein atoms [21]. The resulting atomic coordinates of the protein were exported to a PDBQT file for use with AutoGrid and AutoDock.
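The removal of water molecules and the co-crystallised inhibitor can be illustrated with a minimal sketch in plain Python. This is not the AutoDockTools procedure used in the study; it only filters PDB records by the fixed-column fields for record name and chain identifier. The function name `extract_receptor` and the choice of chain "A" for the protease are assumptions made for illustration, and the step does not add hydrogens, charges or solvation parameters.

```python
def extract_receptor(pdb_lines, chain_id="A"):
    """Keep only polymer ATOM records of one chain (hypothetical helper).

    HETATM records (waters, hetero groups of a ligand) and ATOM records of
    other chains (for example a co-crystallised peptide-like inhibitor)
    are dropped. chain_id="A" is an assumption, not taken from the paper.
    """
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()          # columns 1-6: record name
        if record != "ATOM":
            continue                       # skip HETATM, headers, etc.
        if len(line) < 22 or line[21] != chain_id:
            continue                       # column 22: chain identifier
        kept.append(line)
    return kept


# Toy records (truncated after the residue number) for illustration only.
example = [
    "ATOM      1  N   SER A   1",
    "ATOM   2380  N   ALA C   2",
    "HETATM 2400  O   HOH A 401",
]
receptor = extract_receptor(example)
print(len(receptor), "receptor atom record(s) kept")
```

In practice the filtered coordinates would still need to be processed by AutoDockTools to obtain the PDBQT file described above.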

Protein structure validation
Before the molecular docking studies, the COVID-19 main protease crystal structure (PDB ID: 6LU7) was cross-checked against a known structure of the SARS-CoV main protease published in 2005 (PDB ID: 2BX4) [28] to determine whether the new coronavirus carries amino acid mutations within the active site. The active site of SARS-CoV 3CLpro is located in the cleft between domains I and II of the protein and is composed of essential amino acids such as His41, Met49, Cys145, His163, Glu166 and His172 [29]. The validation was carried out using plugins from Chimera 1.13.1 [30] and PyMOL [25].

Molecular docking studies using AutoDock4.2.6
Four software packages, PyMOL [25], Discovery Studio Visualizer [31], LigPlus [32] and Maestro [33], were used to analyze the results, including the hydrogen-bond distances between each hydrogen atom and its presumed binding partner. The grid box, which encloses the amino acids of the binding active site, had dimensions of 50 × 60 × 60 (x × y × z) with a grid spacing of 0.375 Å (Figure 2). AutoGrid was used to precalculate the binding affinity maps for each ligand atom type, and AutoDock was used to perform the molecular docking simulation. The parameters of the Lamarckian Genetic Algorithm (LGA) were: 50 runs; elitism of 1; mutation rate of 0.02; population size of 300; crossover rate of 0.80; 27,000 generations; and 50,000,000 energy evaluations. The root-mean-square (RMS) cluster tolerance was set to 2.0 Å in each run. Default parameters were used for the step sizes of translations, quaternions and torsions. The ligand conformation with the lowest free energy of binding, chosen from the most favored cluster, was selected for further analysis.
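The grid and search settings above can be collected in a short Python sketch. It assumes that the reported 50 × 60 × 60 values are AutoGrid `npts` values, in which case each box edge is `npts × spacing` and AutoGrid adds one central grid point per axis. The keyword names follow the AutoDock 4.2 docking parameter file format as commonly documented; the helper names (`box_edges`, `grid_point_counts`, `dpf_lines`) are hypothetical, and the output is only a fragment, not a complete parameter file.

```python
# Values reported in the text; their reading as AutoGrid "npts" is an assumption.
NPTS = (50, 60, 60)      # x, y, z
SPACING = 0.375          # grid spacing in Å

# LGA settings reported in the text, keyed by AutoDock 4.2 DPF keywords.
LGA_PARAMS = {
    "ga_run": 50,
    "ga_elitism": 1,
    "ga_mutation_rate": 0.02,
    "ga_pop_size": 300,
    "ga_crossover_rate": 0.80,
    "ga_num_generations": 27000,
    "ga_num_evals": 50000000,
    "rmstol": 2.0,       # RMS cluster tolerance in Å
}


def box_edges(npts, spacing):
    """Edge lengths of the grid box in Å (intervals times spacing)."""
    return tuple(n * spacing for n in npts)


def grid_point_counts(npts):
    """Grid points per axis; AutoGrid adds a central point to npts."""
    return tuple(n + 1 for n in npts)


def dpf_lines(params):
    """Render the settings as 'keyword value' lines (fragment only)."""
    return [f"{key} {value}" for key, value in params.items()]


print("Box edges (Å):", box_edges(NPTS, SPACING))
print("Grid points per axis:", grid_point_counts(NPTS))
print("\n".join(dpf_lines(LGA_PARAMS)))
```

Under this reading the box measures 18.75 × 22.5 × 22.5 Å, which gives a quick check that the box is large enough to enclose the active-site cleft.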

Protein structure validation study
The structure validation between the COVID-19 and SARS-CoV main protease models was performed with Chimera 1.13.1. Figure 3 shows the comparison of the two models, which have high identity (91.3%). For further analysis, an amino acid sequence alignment of the two proteases was conducted. According to the alignment report, the two proteins differ by only 17 amino acids, and no mutation occurs among the essential amino acids within the active site of the COVID-19 protease in comparison with that of SARS-CoV (Figure 4). Therefore, this model can be relied on for the molecular docking studies in the next step.

Molecular docking studies
Molecular docking simulation is an important method for understanding the interactions between a ligand and the active site of a protein or enzyme, and it is widely used in drug discovery in the pharmaceutical industry. In an attempt to find potential compounds for COVID-19 treatment, AutoDock4.2.6 with the Lamarckian Genetic Algorithm was used to analyze the docking of 6 medicines currently in clinical use into the binding pocket of the main protease, which plays an important role in the propagation of the virus. The results may reveal pivotal information on the inhibition mechanism of the compounds toward the target protein and thus provide helpful suggestions on structural characteristics for further drug development. All docked compounds were compared with each other and ranked from the lowest to the highest binding energy (Table 1). The peptidomimetic inhibitor N3 has a binding energy of -13.13 kcal/mol, which is taken as the threshold value for this docking study; any ligand whose docking energy is close to this value is regarded as a potential inhibitor. The standard inhibitor formed 4 hydrogen bonds with residues Cys145, His163, Glu166 and Arg188, three of which are considered essential amino acids for the protease-inhibiting function. Ligand efficiency (LE) is a useful metric for the selection of lead compounds in drug discovery. It measures the binding energy of the ligand per atom and is calculated according to Equation 1. It has been estimated that most hits or lead compounds can be considered for further structure optimization when their LE value ranges from 0.25 to 0.6 [34].
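A minimal sketch of the ligand-efficiency calculation, assuming the common definition LE = -ΔG / N, where ΔG is the binding free energy in kcal/mol and N is the number of non-hydrogen atoms of the ligand. The binding energies for lopinavir and darunavir are those reported in this study; the heavy-atom counts are not given in the text and are derived here from the molecular formulas C37H48N4O5 and C27H37N3O7S, so they should be read as assumptions. The function names are hypothetical.

```python
def ligand_efficiency(delta_g, heavy_atoms):
    """LE = -ΔG / N (assumed definition; kcal/mol per non-hydrogen atom)."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be a positive integer")
    return -delta_g / heavy_atoms


def in_lead_range(le, low=0.25, high=0.6):
    """True if LE lies in the 0.25 to 0.6 range quoted for hits and leads."""
    return low <= le <= high


# Binding energies (kcal/mol) as reported in this study.
BINDING_ENERGY = {"lopinavir": -12.92, "darunavir": -12.39}

# Non-hydrogen atom counts from molecular formulas (assumption, not in the text).
HEAVY_ATOMS = {"lopinavir": 46, "darunavir": 38}

le_values = {
    name: round(ligand_efficiency(BINDING_ENERGY[name], HEAVY_ATOMS[name]), 2)
    for name in BINDING_ENERGY
}
for name, le in le_values.items():
    print(f"{name}: LE = {le:.2f}, within lead range: {in_lead_range(le)}")
```

With these inputs the sketch gives 0.28 for lopinavir and 0.33 for darunavir, consistent with the LE values reported for these two compounds.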
Remarkably, remdesivir, an adenosine nucleotide analog that was recently used in the treatment of COVID-19 and reported to be highly effective, showed the lowest binding free energy (-14.12 kcal/mol). However, the docking conformation of this drug does not fit well in the active-site pocket of the protease. Only 2 hydrogen bonds were formed, with amino acids Glu166 and Thr190, supported by weak interactions with His41, Met49, Met165, Pro168 and Gln189 (Figure 5). This is an interesting result, since remdesivir is known to be an inhibitor that targets the viral RNA polymerase, thereby causing mistakes in proofreading by the viral exoribonuclease. Therefore, this dock pose analysis indicates that remdesivir does not interact specifically with the main protease target.
According to Table 1, the two HIV protease inhibitors lopinavir and ritonavir showed slightly weaker binding affinity (-12.92 and -12.15 kcal/mol, respectively). The specificity of the two compounds toward the target protease is demonstrated by the docking conformations in Figure 5. Lopinavir formed 4 hydrogen bonds with the main protease, involving Cys145, His164, Glu166 and Gln189. Meanwhile, ritonavir, known as an additional agent that increases the plasma concentration of lopinavir, formed only 2 hydrogen bonds, with residues Ser144 and Cys145. In addition, the ligand efficiency values of lopinavir and ritonavir were 0.28 and 0.31, respectively, which fall within the acceptable range. This result indicates that the structures of these two compounds remain promising for further optimization to improve bioactivity. Darunavir is another well-known HIV-1 protease inhibitor that prevents the formation of mature infectious virus particles. Our results show that darunavir has the potential to inhibit the protease, with an estimated binding energy in the middle of the ranking (-12.39 kcal/mol) and an LE value of 0.33. This drug formed five hydrogen bonds, with His41, Cys145, His164, Glu166 and Thr190, three of which are important residues; the interaction was further strengthened by hydrophobic interactions with Met49, Gly143, His163 and Met165. This information suggests that darunavir is a specific inhibitor of the SARS-CoV-2 main protease.
Chloroquine, a common anti-malarial agent, is effective in COVID-19 treatment as it increases the endosomal pH, a factor essential for virus fusion. The lower binding affinity of this drug (-12.96 kcal/mol) relative to the standard inhibitor indicates that it does not fit well in the binding site of the protease; thus, the mechanism of this medicine is yet to be investigated. Dock pose analysis supports this statement: chloroquine formed only 2 hydrogen bonds, with Gly143 and His164, which are not essential amino acids of the active site. Arbidol, known as a broad-spectrum antiviral medicine, showed the least potential to dock in the active site of the COVID-19 main protease. The dock score of the drug was -11.14 kcal/mol, with only one hydrogen bond, formed with Arg188, which is located outside the important pocket. This suggests that the structure of this candidate is not appropriate for inhibiting the protease. On the other hand, the high effectiveness of this medicine in the treatment of patients infected with the new coronavirus suggests that its mechanism is based on another potential target.
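The dock pose comparison above can be summarised with a short Python sketch that ranks the compounds by binding energy and counts how many hydrogen-bonded residues belong to the essential active-site residues listed in the validation section. All energies and residues are taken from the text; the data layout and the function name `summarize` are hypothetical, and weak or hydrophobic contacts are deliberately left out.

```python
# Essential active-site residues of 3CLpro as listed in the text.
ESSENTIAL = {"His41", "Met49", "Cys145", "His163", "Glu166", "His172"}

# Binding energy (kcal/mol) and hydrogen-bonded residues as reported in the text.
DOCKING = {
    "N3 (reference)": (-13.13, ["Cys145", "His163", "Glu166", "Arg188"]),
    "remdesivir": (-14.12, ["Glu166", "Thr190"]),
    "lopinavir": (-12.92, ["Cys145", "His164", "Glu166", "Gln189"]),
    "ritonavir": (-12.15, ["Ser144", "Cys145"]),
    "darunavir": (-12.39, ["His41", "Cys145", "His164", "Glu166", "Thr190"]),
    "chloroquine": (-12.96, ["Gly143", "His164"]),
    "arbidol": (-11.14, ["Arg188"]),
}


def summarize(docking, essential):
    """Return (name, energy, n_hbonds, essential_hits) sorted by energy."""
    rows = []
    for name, (energy, hbonds) in docking.items():
        hits = sorted(set(hbonds) & essential)   # H-bond partners that are essential
        rows.append((name, energy, len(hbonds), hits))
    return sorted(rows, key=lambda row: row[1])  # most negative energy first


summary = summarize(DOCKING, ESSENTIAL)
for name, energy, n_hbonds, hits in summary:
    print(f"{name:15s} {energy:7.2f}  H-bonds: {n_hbonds}  essential: {hits}")
```

Read this way, a favorable energy alone does not indicate specific binding: remdesivir ranks first by energy but contacts only one essential residue through hydrogen bonding, whereas darunavir contacts three.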

CONCLUSIONS
In this study, the chymotrypsin-like cysteine protease (3CLpro), known as the main protease of COVID-19, was cross-checked against that of SARS-CoV using sequence alignment and structure overlay of the two models. The results show that the two models have up to 91.3% identity and that the binding active site of the new coronavirus is conserved; thus, it is considered a potential target for drug development.
To investigate the inhibition mechanism on the SARS-CoV-2 main protease and thereby suggest a pathway for the development of potential compounds, molecular docking studies were carried out on some medicines currently used in the clinical treatment of COVID-19. The results reveal that lopinavir, ritonavir and darunavir possess specific structural characteristics that fit well in the binding site of the target protease through interactions with essential amino acids, hence inhibiting the replication function of the virus. These results could be a helpful suggestion for scientists conducting further structure-based research on known drugs in order to develop compounds with the desired bioactivity for treatment.