Tautomerism of 4-phenyl-2,4-dioxobutanoic acid. Insights from pH ramping NMR study and quantum chemical calculations

Aryldiketo acids (ADKs) exhibit a variety of biological activities, mainly due to their high affinity toward divalent metal ions. The metal complexation ability of ADKs, as well as their interactions with proteins, depends on the tautomeric form present in solution. The main aim of this study was to fully explore the tautomeric preferences of 4-phenyl-2,4-dioxobutanoic acid (4PDA), as a representative ADK, in aqueous media at different pH values. 1D and 2D NMR spectroscopy in combination with quantum chemical calculations was applied in order to better understand the tautomeric preferences of 4PDA. The data in highly acidic media are especially interesting since there are no such findings in the literature, owing to the low solubility of ADKs in molecular form. At low pH values, where 4PDA is unionized, the most abundant tautomeric form is the enol with the keto group closer to the phenyl ring. At higher pH values, a mixture of two 4PDA ionic forms coexists in solution. Their ratio calculated from NMR data fits the values predicted using two experimentally determined pKa values. Based on the complexity of the 1H NMR spectrum of the monoanionic 4PDA form, coexistence of two stable rotamers was assumed. In alkaline media, 4PDA is mostly present in dianionic form. As the π-electrons of the dianion are delocalized over the entire keto-enol moiety, spectral distinction between tautomers was not possible. Quantum chemical calculations were used to predict the relative stability of tautomers. The predictions were in good accordance with experimental results only when an explicit water molecule was included in the calculations.


Introduction
Molecular properties and biological activities of aryldiketo acids (ADKs), an important class of molecules with widespread biological activities [1][2][3][4][5][6], are affected by keto-enol tautomerism. Tautomerism is of particular interest in studies of small organic molecules recognition properties, including protein-ligand interactions, since different tautomers of molecules of this class have different hydrogen-bond acceptor/ hydrogen-bond donor patterns, as well as different metal complexation abilities [7].
ADKs act by functional sequestration of Mg 2+ ion in the active center of HIV-1 integrase (IN) [8], an enzyme
We regret to inform that Branko Drakulić has passed away since completion of this work.
ADKs simultaneously exist in two enolate forms (Scheme 1, I and III), conformationally locked by the pseudo-ring, and one diketo form (II) having two rotatable bonds responsible for their conformational flexibility [15,17]. As a continuation of our group's work on structure-property relationships [18][19][20] and biological activities of ADKs [6], we aimed to further explore their tautomeric preferences.
It was shown previously that the enol form of 1,3-diketones is thermodynamically favored compared to diketo form due to stabilization via intramolecular hydrogen bonding. In the ground state of avobenzone (1,3-diketone used in sunscreen products as a UV light absorber), the presence of other forms such as non-chelated enol and its rotamers is also hypothesized [21,22].
Keto-enol tautomerization of 1,3-diketones has been studied extensively in various chemical systems using a range of analytical techniques (NMR, IR, HPLC, gas electron diffraction) [15,23,24]. It is enhanced in polar, protic solvents. Density functional theory (DFT) calculations showed that tautomerization is highly enhanced in water, as two water molecules assist the process via a transition state analogous to that of the E2 mechanism [25]. Tautomerization may be difficult without highly activated proton relays of water hydrogen bonds, i.e., the methylene C-H bond in the keto form cannot be cleaved readily by nucleophilic sources other than a water molecule connected by hydrogen-bond networks. The newly formed enol O-H bond participates in the intermolecular hydrogen bond rather than in the intramolecular one. NMR spectroscopy and DFT calculations of phenindione (a cyclic 1,3-diketone) and its derivatives showed that the predominant tautomer of these compounds in DMSO solution is the enol form [26].
In a quantum chemical study of the structure and stability of diketo acid HIV-1 IN inhibitors, 5-CITEP and L-731,988, enol forms were more stable than the diketo form, with the energy differences for these molecules ranging from 15 to 28 kJ/mol [17]. The two enol forms had similar energies, with only 3 kJ/mol difference. Because practically no energy barrier between these two enol forms was observed, it was proposed that they can interconvert easily, and a delocalized transition state (six-membered ring) was suggested.
A detailed study of keto-enol tautomerism of 11 4-alkyl- and 4-aryl-2,4-diketobutanoic acids has shown that enolate I (Scheme 1) is the predominant form (98%) in an aprotic solvent (CDCl 3 ) [15]. The equilibrium ratios of aqueous solution structures of aliphatic 2,4-diketobutanoic acids (2,4-diketo, 2-enol-4-keto, and 2-hydrate-4-keto) were markedly affected by solution pH value within a wide pH range (1.5-10.5). At pH 7.5 the ratio of these structures was approximately 4:5:1; at low pH values the 2-hydrate predominated (≈ 50%), and at high pH values the 2-enolate carboxylate was dominant (≈ 80%) while the 2-hydrate was not detected. As aromatic 2,4-diketobutanoic acids are less soluble in acidic media, a tautomerization study was performed only in solutions with pH ≥ 5.5. The authors reported that three tautomeric forms (I-III, Scheme 1) were not distinguished in the NMR spectra due to the formation of the pseudodienolate form (IV) and fast interconversion between the two enolate forms. To the best of our knowledge, no experimental data on keto-enol tautomerism of ADKs in aqueous solution with pH ≤ 5.5 have been published so far.
In order to fully explore a potential of ADKs as biologically active molecules with metal complexing ability, it is important to have knowledge about their tautomeric forms in aqueous media at different pH values. NMR experiments at a wide range of pH values, starting from extreme acidic to basic conditions, are performed for assessing the tautomeric preferences of 4PDA as a representative scaffold for active ADKs. Results are explained and rationalized with the aid of theoretical calculations and previously determined acid-base properties of the compound of interest.

Experimental
Materials and methods
Synthesis and characterization of 4PDA were already described [20]. The same sample was used in all NMR measurements described herein. All other chemicals were purchased from Fluka, Aldrich, or Merck, having > 98% purity, and were used as received. 1 H, 13 C, COSY, HMQC, and HMBC NMR spectra of 4PDA in aqueous solution at different pH values were acquired using Bruker Avance 500/125 MHz NMR spectrometer. pH values were measured using Corning 120 pH-meter equipped with Corning Ag/AgCl microelectrode.
1D and 2D NMR spectra of 4PDA
All NMR spectra of 4PDA were acquired at t = 25 ± 1°C and constant ionic strength (I = 0.1 M (NaNO 3 )). TSP was used as the internal standard for spectra calibration, and chemical shifts (δ) are given in ppm. Measured pH values are converted to pD according to the relation pD = pH measured + 0.4 [27]. The sample was dissolved in an appropriate solvent (CF 3 COOD for pH < 0; D 2 O/CD 3 COOD for pH 2.09; deuterated acetate buffer for pH 4.41; and carbonate buffer for pH 7.80 and 9.20), and spectra were calibrated using TMS as an internal standard.
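The pH-to-pD correction quoted above is a constant offset. A minimal Python sketch (the function name `pd_from_ph` is hypothetical and not part of the original work) shows how the measured pH values of the buffered samples map onto the pD values used in the discussion of the spectra:

```python
def pd_from_ph(ph_measured: float, offset: float = 0.4) -> float:
    """Convert a pH-meter reading taken in D2O to pD (pD = pH_measured + 0.4)."""
    return ph_measured + offset


# Measured pH values of the buffered samples listed in the text
for ph in (2.09, 4.41, 7.80, 9.20):
    print(f"pH {ph:.2f} -> pD {pd_from_ph(ph):.2f}")
```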
The 1D 1 H spectrum was acquired using a sweep width of 10,330 Hz and 65,536 data points, giving a digital resolution of 0.157 Hz/point and an acquisition time of 3.17 s. A pulse width of 11.50 μs and a relaxation delay of 2 s were used for 256 transients.
The 1D 13 C spectrum was acquired using a sweep width of 30,030 Hz and 131,072 data points giving a digital resolution of 0.229 Hz/point and an acquisition time of 2.18 s. Pulse width of 10 μs and relaxation delay of 2 s were used for 6000 transients.
2D HMQC (pulse program inv4gpqf) and HMBC (pulse program inv4gplplrndqf, optimized for a coupling constant of 10 Hz) spectra were acquired with 1024 points in the direct dimension (sweep width 4882 Hz) and 256 complex points in the indirect dimension (sweep width 31,440 Hz), giving digital resolutions of 4.76 and 122 Hz/point, respectively. Digital resolution in the indirect dimension improved to 30.7 Hz/point after zero filling to 1024 data points.
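The quoted digital resolutions and acquisition times follow directly from the sweep widths and point counts. The sketch below is a consistency check under two stated assumptions: digital resolution is taken as sweep width divided by the number of points, and the acquisition time follows the convention AQ = N / (2 × SW), with N counting real plus imaginary points. The function names are hypothetical.

```python
def digital_resolution(sweep_width_hz: float, n_points: int) -> float:
    """Digital resolution in Hz/point, taken here as SW / N."""
    return sweep_width_hz / n_points


def acquisition_time(sweep_width_hz: float, n_points: int) -> float:
    """Acquisition time in s, assuming N counts real + imaginary points: AQ = N / (2 * SW)."""
    return n_points / (2.0 * sweep_width_hz)


# 1D spectra: (label, sweep width in Hz, number of points), values from the text
for label, sw, n in (("1H", 10330.0, 65536), ("13C", 30030.0, 131072)):
    print(f"{label}: {digital_resolution(sw, n):.3f} Hz/point, "
          f"AQ = {acquisition_time(sw, n):.2f} s")

# Indirect dimension of the 2D spectra before and after zero filling
print(f"256 points:  {digital_resolution(31440.0, 256):.1f} Hz/point")
print(f"1024 points: {digital_resolution(31440.0, 1024):.1f} Hz/point")
```

The 1 H value evaluates to about 0.1576 Hz/point, so the 0.157 Hz/point quoted in the text appears to be truncated rather than rounded.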

Solubility studies
Intrinsic solubility (solubility of the molecular form (H 2 A)) of 4PDA was determined by the “shake-flask” method in 1 M HCl. An excess of the accurately weighed compound was added to 5.0 mL of 1 M HCl; five samples were prepared. All samples were stirred and thermostated overnight at t = 25 ± 1°C. After equilibration, samples were filtered, aliquots were appropriately diluted with HCl, and the concentration of 4PDA in the saturated solution was determined spectrophotometrically at the wavelength of the absorption maximum, λ max = 308.8 nm. Conformity with Beer's law had previously been verified.

Computational chemistry studies
The full geometry optimizations of different 4PDA tautomers, in neutral (H 2 A) and monoanionic (HA − ) forms, were performed at the MP2 level of theory, using the 6-31G(d,p) basis set, calculating the force constants at every point (Opt=Calcall), and tightening the cutoff for forces and step size to determine the convergence (Opt=Tight).
The NMR shifts of the three tautomers of molecular form of 4PDA with the carboxyl H in out orientation were predicted applying the gauge-independent atomic orbital (GIAO) method with MP2/6-311++G(d,p) by SP calculations on fully optimized geometries [28]. Implicit water solvation model (IEF-PCM) was applied. The GIAO predicted magnetic shielding of tetramethylsilane (TMS) was taken as a reference.
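Referencing a GIAO calculation to TMS amounts to subtracting the computed isotropic shielding of the nucleus of interest from that of the reference. The sketch below illustrates only this arithmetic; the function name and both shielding values are hypothetical placeholders, not results from this study.

```python
def chemical_shift(sigma_ref_ppm: float, sigma_nucleus_ppm: float) -> float:
    """Chemical shift (ppm) relative to a reference: delta = sigma_ref - sigma_nucleus."""
    return sigma_ref_ppm - sigma_nucleus_ppm


# Hypothetical isotropic shieldings (ppm), for illustration only
sigma_tms_h = 31.50      # placeholder for the computed 1H shielding of TMS
sigma_h_nucleus = 24.30  # placeholder for the computed shielding of a proton of interest

print(f"delta = {chemical_shift(sigma_tms_h, sigma_h_nucleus):.2f} ppm")
```

Both shieldings must come from calculations at the same level of theory and with the same solvation model for the difference to be meaningful.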
For the optimization of the systems comprising one water molecule and different tautomeric forms of 4PDA, we used optimized geometries of tautomers and manually added one explicit water molecule in the proximity of carboxyl OH. For each tautomer, the system was initially optimized holding 4PDA rigid and allowing movement of the water molecule by semiempirical MO PM6 method in MOPAC2016 [29]; then full optimization without constraints was performed by MP2/6-31G(d,p). The influence of solvent was also simulated applying the implicit (water) solvation model, IEF-PCM, as default in Gaussian09 [30].
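The original input files are not given in the text. As a rough sketch, the route sections implied by the description could be assembled as below. The `Opt=CalcAll` and `Opt=Tight` options are named in the text; the `SCRF=(IEFPCM,Solvent=Water)` and `NMR=GIAO` keywords are assumed here as the usual Gaussian syntax for the stated solvation model and shielding method, and the helper `gaussian_route` is hypothetical.

```python
def gaussian_route(method: str = "MP2",
                   basis: str = "6-31G(d,p)",
                   optimize: bool = True,
                   implicit_water: bool = False,
                   nmr: bool = False) -> str:
    """Build a Gaussian route line for the kinds of calculations described in the text."""
    parts = [f"# {method}/{basis}"]
    if optimize:
        # Force constants at every point and tightened convergence cutoffs (from the text)
        parts.append("Opt=(CalcAll,Tight)")
    if implicit_water:
        # Assumed keyword form for the IEF-PCM water model
        parts.append("SCRF=(IEFPCM,Solvent=Water)")
    if nmr:
        # Assumed keyword form for GIAO shielding calculations
        parts.append("NMR=GIAO")
    return " ".join(parts)


print(gaussian_route())                                   # gas-phase optimization
print(gaussian_route(implicit_water=True))                # optimization in implicit water
print(gaussian_route(basis="6-311++G(d,p)", optimize=False,
                     implicit_water=True, nmr=True))      # single-point GIAO shieldings
```

Atomic coordinates, charge, and multiplicity would follow the route line in an actual input file and are deliberately omitted here.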

Results and discussion
NMR study of aqueous 4PDA solutions at different pH values
Brecker et al. [15] used the NMR spectroscopy to study the aqueous structures of 4PDA in phosphate buffer at pH 7.5. The authors reported broad signals of C γ H atom, as well as broad signals of C β , C γ , and C δ carbon atoms (for atom labeling see Table 1). Interconversion between two enolic forms is very fast, and therefore, it is impossible to ascribe signals to any distinct tautomer. Signal broadening could be explained by the finding of Guthrie et al. [31] that enolization rate constants of 2,4-diketo acids are significantly increasing as pH value increases.
Maurin et al. [32] studied NMR spectra of 4PDA in pure water, and in buffers at pH 7.5 and 10.0. In water, they observed only one species, and attributed the signals to fully protonated 4PDA in the enol I tautomeric form (Scheme 1). However, as 4PDA in aqueous media acts as a weak diprotic acid (pK a1 = 2.06; pK a2 = 7.56) that is sparingly soluble in water [20], it will partially dissociate, giving a mixture of molecular and monoanionic forms. Therefore, we recorded NMR spectra in a highly acidic medium (Figs. 1, 2, and 3, Fig. S1) in order to suppress the dissociation and obtain signals of the pure molecular (H 2 A) form of 4PDA.
The low intrinsic solubility of 4PDA ((1.08 ± 0.05) × 10 −3 M) explains why long (overnight) signal acquisition was necessary for recording 1D and 2D NMR spectra in the highly acidic medium. Full structure-spectra assignments were achieved using COSY, HMQC (ESM, Fig. S1), and HMBC spectra (Fig. 3) of 4PDA recorded in CF 3 COOD. Experimental and calculated 1 H and 13 C NMR chemical shifts of 4PDA are given in Table 1.
Singlet at 7.23 ppm in 1 H NMR spectrum (Fig. 1) is the signal of the vinyl group H atom (H γ , Table 1). As the exchange rate between H γ and D atom from CF 3 COOD is fast, the integral of this signal is smaller than expected. Furthermore, the lack of a singlet around 4.5 ppm in 1 H NMR spectrum (this part of the spectra is not shown in Fig.  1), as well as lack of a signal around 50 ppm in 13 C NMR spectrum (Fig. 2), both expected for > CH 2 group (C γ ) in diketo form II, indicate that diketo form does not exist in solution at concentrations detectable by NMR. The presence of hydrate form is also excluded as no signals characteristic for this form were observed. The C γ atom of hydrate form would also give a signal around 50 ppm in 13 C NMR spectrum, as well as the singlet at chemical shift below 4 ppm in 1 H NMR spectrum. This finding is in accordance with findings of Brecker et al. [15], who found this form in significant amount only in 4-alkyl-2,4-dioxobutanoic acids, but not in 4-phenyl-2,4-dioxobutanoic acid.
Due to H/D exchange, signal at 95 ppm in 13 C NMR spectrum consists of C-H singlet (C-H decoupled spectrum) as well as C-D triplet, as shown in an inset of Fig. 2. This proves that the 1 H signal at 7.23 ppm is not an impurity but the real signal of C-H group of enolic tautomer. Signals at 7.33 ppm (s, 1H), 7.80 ppm (t, 1H), and 8.12 ppm (d, 2H), although very weak, indicate the existence of enol III form, but its concentration is negligible. In highly acidic media (pH < 0), enolization rate constant is very low [31], allowing the coexistence of two enol tautomers in solution.
The HMBC spectrum (Fig. 3) provides important data about the major tautomeric form of 4PDA in CF 3 COOD solution. Strong coupling between the ortho-phenyl hydrogens and the C δ keto group carbon atom (circled signal) indicates that enol I is the major form of 4PDA in solution. If enol III were the predominant form, no coupling between ortho-hydrogens and C β would be visible in the HMBC spectrum. This confirms the predominance of the enol I form of 4PDA in highly acidic solution.
The calculated 1 H NMR spectrum of diketo tautomer II has signals at 3.89 and 5.14 ppm, assigned to the H atoms of the > CH 2 group; the corresponding C γ atom has a signal at 36.38 ppm in the 13 C spectrum. The asymmetric structure of the most stable geometry of 4PDA in diketo form II results in different chemical environments for the two H atoms of the methylene group. As a consequence, two signals appear in the calculated 1 H NMR spectrum.
As mentioned, signals characteristic for the diketo tautomer were not observed in the experimentally obtained NMR spectra of 4PDA in CF 3 COOD. During routine characterization of some ADKs in aprotic solvents (CDCl 3 or DMSO-d 6 ), signals characteristic for the diketo tautomer were present in 1 H and 13 C NMR spectra. The H atoms of the > CH 2 group give one broad signal at 4.2 to 4.5 ppm, depending on the substitution pattern on the phenyl ring, while in the 13 C NMR spectrum, a signal at ~50 ppm could be found [18][19][20].
Calculated NMR shifts of enolic hydrogens (H γ ) are different for two enol tautomers (Table 1). Thus, the low intensity signal at 7.33 ppm in experimentally obtained 1 H NMR spectrum ( Fig. 1) could be ascribed to the second, less abundant, enol tautomer III. 1 H NMR spectra of 4PDA were also acquired in solutions with different acidity, within pD range of 1-10. As the acidity decreases, 1 H NMR spectrum becomes more complicated because the carboxylic group dissociates, leading to coexistence of H 2 A and HA − forms in solution. NMR spectrum of 4PDA in D 2 O/CD 3 COOD mixture (pD = 2.49) is shown in Fig. 4.
Two forms of 4PDA (H 2 A and HA − ) were present in this solution, as expected from the experimentally obtained pK a1 value (2.06) [20]. Separated doublets at 7.99 (H 2 A) and 8.02 ppm (HA − ) correspond to the ortho-H atoms. The molar ratio of H 2 A to HA − in solution (n H2A /n HA− = 0.78) was calculated from the corresponding peak areas and was in good agreement with the ratio (n H2A /n HA− = 0.93) calculated using the Henderson-Hasselbalch equation. Two overlapped triplets at ~7.7 ppm correspond to the para-H atoms of the two ionization forms, while the meta-H atoms (at ~7.6 ppm) are overlapped and could not be distinguished. The signal of the H γ atom was observed at 7.04 ppm, but its intensity was even smaller than in CF 3 COOD, as the exchange rate between H γ and D atoms in D 2 O/CD 3 COOD is faster than in pure CF 3 COOD. The lack of corresponding signals of enol III and diketo form II confirmed that enol I is the dominant form of 4PDA in solution.
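The Henderson-Hasselbalch estimate can be reproduced with a short calculation. In the sketch below (function name hypothetical), the acid-to-base ratio is 10^(pKa − pH). The text does not state which acidity value was inserted; the arithmetic shows that the quoted ratio of 0.93 corresponds to the measured pH of 2.09 rather than to the converted pD of 2.49.

```python
def acid_to_base_ratio(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: [acid]/[conjugate base] = 10**(pKa - pH)."""
    return 10.0 ** (pka - ph)


PKA1 = 2.06  # first dissociation constant of 4PDA, from the text

# Ratio n(H2A)/n(HA-) using the measured pH and, for comparison, the converted pD
print(f"pH 2.09: {acid_to_base_ratio(PKA1, 2.09):.2f}")
print(f"pD 2.49: {acid_to_base_ratio(PKA1, 2.49):.2f}")
```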
NMR spectra were also recorded in deuterated acetate buffer, pD 4.81 (Fig. 5), where 4PDA, according to the pK a2 value (7.56, [20]), is expected to be present in a pure HA − form. No signals of the diketo tautomer were observed. Signals of ortho-H (d, 7.99 ppm), para-H (t, 7.69 ppm), and meta-H (t, 7.58 ppm) are all doubled, with meta-H being the least resolved, indicating the coexistence of another species in solution. Although there is a possibility of free rotation around the C γ -C δ bond in enol I, the intramolecular H-bond “locks” the pseudo 6-membered ring, making the conformer/rotamer in which the enolic oxygen on C β and the keto oxygen on C δ are proximal to each other the predominant one. Deprotonation of the carboxyl group enables the formation of an intramolecular hydrogen bond between the carboxylate oxygen and the enolic -OH group on C β , forming a pseudo 5-membered ring. This facilitates rotation around the C γ -C δ bond, so different conformers/rotamers, in which the keto oxygen on C δ and the enolic oxygen on C β are distal from each other, can exist in solution. Therefore, we assume that the presence of two (stable) rotamers around the C γ -C δ bond causes the duplication of signals of aromatic protons in the 1 H NMR spectrum. A similar explanation is given in the literature to justify the complex 1 H NMR spectrum of ADK methyl ester sodium salts. The formation and coexistence of Z,Z- and E,Z-isomers is described as a consequence of rotation around the C β -C γ and C γ -C δ bonds [33].
At pD 8.20, two forms of 4PDA (HA − and A 2− ) are observed in a solution (Fig. 6), and the ratio of two forms was again in a good agreement with the one predicted from pK a values.
In carbonate buffer at pD = 9.60, the dianionic (A 2− ) form should be the dominant species (Fig. 7). Ortho-H atoms appeared as a doublet at 7.72 ppm, and meta- and para-H atoms as overlapped triplets around 7.40 ppm. As the π-electrons of the dianion are delocalized over the entire keto-enol moiety, spectral distinction between tautomers is not possible. In addition, two weak but visible signals (t, δ = 7.57 ppm and d, δ = 7.82 ppm) indicated the presence of HA − with an abundance < 10%, which is in accordance with the measured pD and pK a2 value.
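The expected speciation at each acidity can be estimated from the two pK a values with the standard expressions for a diprotic acid. The following sketch (function name hypothetical) evaluates the mole fractions of H 2 A, HA − , and A 2− . It uses the measured pH values as input, which is an assumption, since the text does not specify whether pH or pD was used for the predictions.

```python
def diprotic_fractions(ph: float, pka1: float = 2.06, pka2: float = 7.56):
    """Return mole fractions (H2A, HA-, A2-) of a diprotic acid at a given pH."""
    h = 10.0 ** (-ph)
    ka1 = 10.0 ** (-pka1)
    ka2 = 10.0 ** (-pka2)
    denom = h * h + ka1 * h + ka1 * ka2
    return (h * h / denom, ka1 * h / denom, ka1 * ka2 / denom)


# Measured pH values of the buffered samples (pD = pH + 0.4)
for ph in (2.09, 4.41, 7.80, 9.20):
    h2a, ha, a2 = diprotic_fractions(ph)
    print(f"pH {ph:.2f}: H2A {h2a:6.1%}  HA- {ha:6.1%}  A2- {a2:6.1%}")
```

With these inputs the monoanion is essentially the only species at pH 4.41 and falls to a few percent at pH 9.20, in line with the assignments made from the spectra.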
A summary of 1 H NMR shifts at all studied pD values, and calculated vs. predicted ratio of different protonation forms of 4PDA is given in Table 2.
Several observations important for understanding the solution chemistry of 4PDA should be stressed. The area under the 1 H NMR signal of the enolic hydrogen atom (H γ ) is smaller than expected since it exchanges with the 2 D from the solvent. With increasing pD value, the intensity of the H γ signal decreases. This results in the absence of the H γ signal in solutions with high pD values (8.20 and 9.60). As the compound's carboxylic group dissociates, the electron density at the nucleus increases and the enolic hydrogen atom becomes more shielded.
At pD 2.49, 8.20, and 9.60, 4PDA was found in solution as a mixture of two ionization forms, whose ratio of NMR peak areas correlates well with the ratio calculated from pK a values (Table 2). When a mixture of two ionization forms is present in solution, NMR signals of ortho-H atoms are resolved much better than the signals of para-H atoms, while meta-H signals of the two forms are overlapped. A transfer of negative charge from the 4PDA (di)anion to the aromatic ring is most effective in the ortho-position, because of the influence of both inductive and resonance effects. Since the resonance effect is transferred only through ortho- and para-positions of the phenyl
Going from the molecular to the dianionic form, chemical shifts of all aromatic H atoms move upfield. This shows that the electron density of the dioxobutanoic moiety is, to some extent, inductively transferred to the aromatic ring.
At pD 4.81, we hypothesized the coexistence of two stable rotamers in a solution, and the trend in spectral resolution of

A computational study of the stability of 4PDA tautomeric forms
The stability of different tautomers of 4PDA in molecular form (H 2 A) was investigated using quantum chemical calculations employing the MP2 Hamiltonian and the 6-31G(d,p) basis set. For all three tautomers, forms with the carboxyl hydrogen oriented toward the surroundings (out orientation) and with the carboxyl hydrogen oriented inside (in orientation) were considered. The inclusion of an implicit solvation model (H 2 O) led to the stabilization of all tautomeric forms studied. Such stabilization was larger for the diketo than for the enol tautomers of 4PDA. Larger dipole moments of the diketo compared to the enol tautomers, calculated with the implicit solvation model (ESM, Table S1), indicate a possibility for more favorable electrostatic interactions between the polar solvent and the diketo tautomers.
An additional intramolecular hydrogen bond appears in the geometries with the carboxyl hydrogen in the in orientation when compared to the out forms, and hence the possibility for interaction of the carboxyl hydrogen with a polar solvent is reduced. Although the implicit solvation model had a larger influence on forms with the carboxyl hydrogen oriented toward the surroundings (out orientation), the intramolecular hydrogen bonding made forms with the carboxyl hydrogen oriented in more stable, both in calculations with and without the implicit solvent model applied. Geometries of the lowest energy forms of 4PDA with the carboxyl hydrogen in both orientations are given in Fig. 8 a-f. Relative energies are given with respect to the apparently most stable tautomer (enol III in), for calculations in a vacuum and in the implicit solvation model.
Energy differences between the two enol tautomers are 7.23 kJ/mol in vacuum and 6.20 kJ/mol in the implicit solvent model, for the in orientation of the carboxyl hydrogen (enol I in vs. enol III in). Diketo forms were less stable, but this difference was significantly reduced upon the inclusion of implicit
Even though calculations suggested that forms with the in orientation of the carboxylic H were more stable than the corresponding out forms, it was reasonable to assume the existence of stable out forms in a real situation, i.e., in a protic solvent like water. It is well known that a PCM model is not good for simulating solvents with hydrogen-bonding properties. The existence of tautomer III in, which was predicted as the most stable form, could be questioned, since no experimental data point to the existence of tautomer III in detectable amounts for ADKs or structurally similar compounds. Literature data on quantum chemical calculations of L-731,988, an aryldiketo acid derivative having N-benzyl substituted pyrrole as the aroyl moiety, also describe an analog of enol III as the most stable one when solvation effects were accounted for using the PCM model of water [17].
As implicit solvation model accounts for dipolar interactions, influencing mainly C-heteroatom and heteroatom-H bonds polarization, we added one explicit water molecule in the proximity of carboxyl group, and optimized geometry of such system for all six forms. Results are shown in Fig. 9.
The enol I with carboxyl H in out orientation + water couple (enol I w ) appeared as the most stable of all six forms, followed by the enol III w (ΔE = 2.82 kJ/mol) and diketo II w (ΔE = 9.67 kJ/mol) forms. All tautomers with carboxyl H in the in orientation (I w in, III w in, and II w in) appeared significantly less stable compared to their out counterparts. Thus, the inclusion of one explicit water molecule in the model, which allowed the carboxyl H to establish its (intermolecular) hydrogen-bond-donating ability, provided results in much better agreement with experiments.
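To give a feel for what these energy differences mean, relative energies can be converted into Boltzmann populations. The sketch below is illustrative only and rests on strong assumptions that are not part of the original analysis: the computed electronic energy differences are treated as if they were free energy differences, only the three out + water forms are considered, and the temperature is set to 298.15 K. The function name is hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def boltzmann_populations(rel_energies_kj_mol, temperature_k=298.15):
    """Return normalized Boltzmann weights for a list of relative energies (kJ/mol)."""
    rt = R_KJ_PER_MOL_K * temperature_k
    weights = [math.exp(-e / rt) for e in rel_energies_kj_mol]
    total = sum(weights)
    return [w / total for w in weights]


# Relative energies from the text: enol I w, enol III w, diketo II w
labels = ("enol I w", "enol III w", "diketo II w")
pops = boltzmann_populations([0.0, 2.82, 9.67])
for label, p in zip(labels, pops):
    print(f"{label:12s} {p:6.1%}")
```

Such populations reproduce the stability order enol I > enol III > diketo II, but they should be read qualitatively: the NMR data indicate only a negligible amount of enol III, and entropic and bulk solvent contributions are not captured by this estimate.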

Conclusions
The complexation ability of aryldiketo acids with divalent metal ions, which could be responsible for their mode of biological action, and the hydrolytic C-C bond cleavage by β-ketolases depend on the predominant tautomeric form. To the best of our knowledge, no experimental data on keto-enol tautomerism of ADKs in aqueous solution with pH ≤ 5.5 have been published so far. In a highly acidic medium, 4-phenyl-2,4-dioxobutanoic acid (4PDA) predominantly exists as an enolic form with the keto group closer to the phenyl ring. At higher pH values, where 4PDA exists as a mixture of two species with different ionization states, the NMR pattern becomes more complex. The ratio of the two forms is in good agreement with the ratio predicted from the Henderson-Hasselbalch equation. An exception was found in the 1 H NMR spectrum of the 4PDA monoanion, where signals are doubled, and the coexistence of two stable rotamers is hypothesized. The ab initio MP2/6-31G(d,p) method accurately predicted the relative stability of tautomers only when an explicit water molecule was included in the calculations.
Our combined experimental and computational evidence strongly suggests that the enol I form is the most prominent form in the biologically relevant environments that these molecules can be exposed to. These observations could be taken into account in future efforts to develop drug-like molecules based on this class of compounds. Furthermore, a similar approach can be used to assess the prevalence of tautomers for other classes of molecules with a diketo moiety. Findings from this study may be important for further development of this type of compound as drugs, since absorption, distribution, and other pharmacokinetic properties depend on the tautomeric form of a compound.
Funding Information Ministry of Education, Science, and Technological Development of Serbia supported this work, Grant No. 172035.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.