Monday, November 4, 2019

Comparing NUS schedules on different compounds

Non-uniform sampling (NUS) speeds up NMR experiments by collecting only a fraction of the data traditionally acquired and predicting the rest. The data to be collected are defined by a sampling schedule, and many different sampling schedules can be defined for a given level of sampling. A previous post showed that different schedules produce different results. If the best performing schedule for one compound also performs well for other compounds, then that schedule could be set as the default to give the best possible performance. To determine whether this is true, I compared NUS spectra obtained using the same sampling schedules on three different compounds: strychnine, ethyl benzene and cholesteryl acetate.

For all three compounds a fully sampled, multiplicity-edited HSQC spectrum was used as the data source. For each compound, 1000 sampling schedules at each of four levels of sampling (50.0, 25.0, 12.5 and 6.25%) were used to select raw data, which was then processed with NMRPipe. To assess quality, each reconstructed spectrum was subtracted from a synthetic spectrum without noise, and the root mean square (rms) of the resulting difference matrix was calculated. While this number cannot be used to compare different levels of sampling, it should allow comparison of schedules at the same level of sampling.
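As a rough illustration of the quality measure, the rms of a difference matrix could be computed along the following lines. This is only a sketch and not the code used for this post: the function name rms_difference and the representation of a spectrum as a list of rows of intensities are assumptions made for the example.

```python
import math


def rms_difference(reconstructed, reference):
    """Root mean square of the point-by-point difference of two 2D spectra.

    Both arguments are assumed (hypothetically) to be lists of rows of
    intensities with identical dimensions.
    """
    if len(reconstructed) != len(reference):
        raise ValueError("spectra must have the same number of rows")
    total = 0.0
    count = 0
    for row_rec, row_ref in zip(reconstructed, reference):
        if len(row_rec) != len(row_ref):
            raise ValueError("spectra must have the same number of columns")
        for rec_value, ref_value in zip(row_rec, row_ref):
            # Accumulate the squared difference at each point
            total += (rec_value - ref_value) ** 2
            count += 1
    if count == 0:
        raise ValueError("spectra must not be empty")
    return math.sqrt(total / count)
```

A lower value means the reconstruction is closer to the noise-free reference. Because the number and size of reconstruction artefacts change with the sampling level, the value is only meaningful when comparing schedules at the same level.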

The rms values of the reconstructed spectra of each compound at each sampling level were used to sort the sampling schedules from best (1) to worst (1000). For each of the reconstructed ethyl benzene and cholesteryl acetate spectra, the sampling schedule used was found in the strychnine list and its rank reported. The ethyl benzene and cholesteryl acetate ranks were then plotted against the strychnine rank. If the schedules that performed best for strychnine also worked well for ethyl benzene and cholesteryl acetate, then the plot should show a positive correlation. In the graphs below the ethyl benzene data is shown in blue and the cholesteryl acetate in red.
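The ranking and pairing step can be sketched as below. The dictionaries mapping a schedule identifier to its rms are a hypothetical data layout, and the function names are made up for the example. The comparison in this post was done by eye from the plots; the Spearman rank correlation at the end is my own addition as one possible way to put a number on the same comparison.

```python
def rank_schedules(rms_by_schedule):
    """Map each schedule id to its rank, 1 being the lowest (best) rms."""
    ordered = sorted(rms_by_schedule, key=rms_by_schedule.get)
    return {schedule: rank for rank, schedule in enumerate(ordered, start=1)}


def paired_ranks(reference_rms, other_rms):
    """Return (reference rank, other rank) for every schedule.

    Both dictionaries are assumed to contain the same schedule ids.
    """
    if set(reference_rms) != set(other_rms):
        raise ValueError("both compounds must use the same schedules")
    reference_rank = rank_schedules(reference_rms)
    other_rank = rank_schedules(other_rms)
    return [(reference_rank[s], other_rank[s]) for s in reference_rank]


def spearman_rho(pairs):
    """Spearman rank correlation for paired ranks that contain no ties."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two schedules")
    sum_d_squared = sum((a - b) ** 2 for a, b in pairs)
    return 1.0 - 6.0 * sum_d_squared / (n * (n * n - 1))
```

Plotting the pairs gives the kind of graph described above: points near the diagonal mean a schedule performs similarly for both compounds, and a rho close to 1 summarises the same thing in one number.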


At 50.0 and 25.0% sampling the sampling schedules that work best for strychnine also work well for ethyl benzene (blue points). At 12.5% sampling, however, the correlation is nearly gone. For cholesteryl acetate (red points) there is no correlation in the effectiveness of the schedules at 50.0, 25.0 or 12.5% sampling.

The 6.25% sampling data shows different behaviour, but the behaviour appears to be the same for both ethyl benzene and cholesteryl acetate. While there is little correlation among the best performing schedules, the worst performing schedules (those with the highest rank numbers, at the top right of the graph) do correlate. Examination of these schedules revealed that they did not sample the last quarter of the raw data, indicating that at very low levels of sampling it is critical to include a point from the last quarter of the full sampling range.
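A check for this could be run before a schedule is used. The sketch below assumes a schedule is a list of zero-based increment indices and that the number of increments in the fully sampled experiment is known; both the function name and this format are assumptions for illustration.

```python
def samples_last_quarter(schedule, total_points):
    """Return True if any sampled index lies in the last quarter.

    schedule: iterable of zero-based increment indices (assumed format).
    total_points: number of increments in the fully sampled experiment.
    """
    if total_points <= 0:
        raise ValueError("total_points must be positive")
    # index >= 3/4 of total_points, written with integers to avoid rounding
    return any(4 * index >= 3 * total_points for index in schedule)
```

For example, with 128 increments in the full experiment, a schedule whose largest index is 90 would be rejected, while one that includes index 100 would pass.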

Why do the sampling schedules behave so differently for cholesteryl acetate? It's likely to be due to the complexity of the cholesteryl acetate spectrum. Cholesteryl acetate shows more HSQC peaks than ethyl benzene or strychnine, and these peaks show more complex splitting patterns. The number of lines (the components of a multiplet) is much greater for cholesteryl acetate than for the other two compounds, making any differences in the reconstructed spectra much more obvious.

                   ethyl benzene   strychnine   cholesteryl acetate
molecular weight          106.17       334.41                 428.7
# HSQC peaks                   5           18                    30
# HSQC lines                  15           21                   118

This study shows that for high levels of sampling and simple compounds one may be able to select a high performing NUS schedule in advance. For more complex compounds and lower levels of sampling, however, this cannot be done. In these cases the schedule must be used before its effectiveness can be assessed, and so a default high performing schedule cannot be chosen.
