# Signal Deconvolution and Noise Factor Analysis Based on a Combination of Time–Frequency Analysis and Probabilistic Sparse Matrix Factorization

## Abstract

$T_2^*$ relaxation time. Noise factor analysis of NMR datasets identified correlations between SNR and acquisition parameters, revealing the major experimental factors that can lower SNR.

## 1. Introduction

$T_2$ relaxation time, a physical parameter independent of field inhomogeneity. In reality, however, because of magnetic field inhomogeneity, the decay constant of the FID is $T_2^*$, an instrument-dependent parameter, rather than $T_2$. STFT can extract time-varying behavior from FIDs, allowing the analysis of dynamic chemical shifts of atoms in flexible proteins [26]. In addition, it has been reported that STFT can extract $T_2^*$ information from FIDs and improve the results of discriminant analysis [27]. Applying the same idea to covariance NMR [28], $T_2^*$-weighted covariance NMR improves the sensitivity and resolution of signals based on differences in $T_2^*$, which are determined by dividing each FID in the $t_1$ dimension of 2D-NMR to create a series of sub-FIDs [29]. In an alternative approach, matrix factorization (MF) is commonly used to extract signal components and separate peaks in spectra [30]. For example, a noise reduction method using principal component analysis (PCA), one of the most commonly used multivariate analysis methods for extracting features from data, has been applied to solid-state CP-MAS NMR data measured with various parameters [31]. Therefore, the quality and amount of information from FIDs can be maximized by applying corrections based on different characteristics. Nevertheless, all these methods require multiple FIDs obtained by adding either spectral dimensions or multiple sample conditions or parameters. There are also computational approaches such as CORE (COmponent-REsolved), a previously introduced multi-component spectral separation method that uses diffusion coefficients to separate the NMR signals of different compounds in PFG-NMR [32,33,34]. However, this technique requires a specific NMR probe with a coil for generating PFG.

$T_2^*$ on the time axis, determined by performing STFT for each frequency component, is useful for separating signals based on MF instead of ROI [22,23,24]. Our method, which focuses on the relaxation time, exploits the attenuation behavior of the FID signal and requires no hardware upgrade. Lastly, we have developed a function for collecting acquisition parameters, as a measure of experimental factors, from a directory of NMR data, and investigated the relationship between signal-to-noise ratio (SNR) and acquisition parameters. A researcher performing NMR must select parameters for each experiment, and normally chooses a reasonable set of parameters based on experience. We show that these parameters can be characterized in terms of their correlation with SNR by a statistical analysis of accumulated NMR datasets. Therefore, this method will be useful for determining optimal acquisition parameters.

## 2. Results and Discussion

#### 2.1. Signal Deconvolution Method

$S_{\mathrm{signal}}(t)$ and $S_{\mathrm{noise}}(t)$ are sets of ideal signals and signals from different types of noise, respectively (Equation (1) and Supplementary Equation (S1)) [45]. The relaxation process can then be described as the exponential decay of the transverse magnetization $S\left(t\right)$ (Supplementary Equation (S2)) [46]. The shorter the relaxation time $T_2^*$, the more rapid the decay. If an FID has more than one component, it is the sum of the contributions from each component (Supplementary Equation (S3)).
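To make this model concrete, the following is a minimal sketch of an FID built as a sum of exponentially decaying components plus noise. It assumes the textbook form (a complex sinusoid multiplied by $e^{-t/T_2^*}$) rather than reproducing the Supplementary Equations, and the function name `synth_fid`, the frequencies, the $T_2^*$ values, and the noise level are illustrative choices, not values from this study.

```python
import numpy as np

def synth_fid(freqs_hz, t2_star_s, amps, n_points=4096, dwell_s=1e-4,
              noise_sd=0.05, seed=0):
    """Return (t, clean, noisy): a sum of decaying complex sinusoids,
    without and with additive complex Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_points) * dwell_s
    clean = np.zeros(n_points, dtype=complex)
    for f, t2, a in zip(freqs_hz, t2_star_s, amps):
        # One component: oscillation at f, envelope exp(-t / T2*)
        clean += a * np.exp(2j * np.pi * f * t) * np.exp(-t / t2)
    noise = noise_sd * (rng.standard_normal(n_points)
                        + 1j * rng.standard_normal(n_points))
    return t, clean, clean + noise

# Two hypothetical components: a slowly and a rapidly decaying line.
t, clean, noisy = synth_fid([500.0, 1200.0], [0.20, 0.02], [1.0, 1.0])
```

The component with the shorter $T_2^*$ contributes only to the first part of the FID, which is the property the deconvolution relies on.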

$\mathrm{SNR}_{\mathrm{denoised}}$) to the original SNR ($\mathrm{SNR}_{\mathrm{original}}$), which is calculated as follows (Equation (6) and Supplementary Equation (S18)):
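Written out under the assumption that the relative SNR is the plain ratio described in the sentence above (the exact form of Equation (6) is the one given in the original article):

$$
\text{Relative SNR} = \frac{\mathrm{SNR}_{\mathrm{denoised}}}{\mathrm{SNR}_{\mathrm{original}}}
$$

A value of 1 would mean no change, and a value of 10 would correspond to a tenfold improvement.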

$^{1}$H-NMR. STFT of the original FID adds a time axis to the frequency axis of the conventional FT spectrum (Figure 1a). The STFT spectrogram is three-dimensional, showing the frequency, time, and intensity of signal and noise. The spectrogram matrix was separated into signal and noise components based on their relaxation-time patterns using PSMF (Figure 1b). Each component was then converted into a signal FID and time-domain noise data by inverse STFT (Figure 1c). Lastly, the time-domain data were converted into the denoised spectrum and the noise spectrum by standard FT (Figure 1d). For the sucrose data, the SNR of the denoised spectrum was improved about tenfold relative to the original data. Because SNR grows with the square root of the number of scans, a 100-fold longer acquisition time would be required to obtain the same SNR for the sucrose sample without denoising. We compared signal and spectral quality between the original FT spectrum and the denoised data (Supplementary Figure S2 and Table S1). There was almost no difference between them.
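The four steps (STFT, factorization, inverse STFT, FT) can be sketched as follows. This is an illustration under several assumptions, not the authors' implementation: the STFT uses non-overlapping rectangular windows so that it is exactly invertible, a plain multiplicative-update NMF stands in for PSMF, the signal component is picked as the one whose time profile decays most, and each part is rebuilt with a soft mask on the complex spectrogram. All function names and the synthetic FID are hypothetical.

```python
import numpy as np

def stft_frames(fid, win):
    """Crude STFT: FFT of consecutive, non-overlapping rectangular windows.
    Returns a complex (frequency x time) matrix."""
    n_frames = len(fid) // win
    frames = fid[: n_frames * win].reshape(n_frames, win)
    return np.fft.fft(frames, axis=1).T

def istft_frames(spec):
    """Exact inverse of stft_frames: inverse FFT per frame, then concatenate."""
    return np.fft.ifft(spec.T, axis=1).reshape(-1)

def nmf(v, rank, n_iter=300, seed=0):
    """Multiplicative-update NMF, used here only as a stand-in for PSMF."""
    rng = np.random.default_rng(seed)
    w = rng.random((v.shape[0], rank)) + 1e-3   # frequency patterns
    h = rng.random((rank, v.shape[1])) + 1e-3   # time-varying coefficients
    for _ in range(n_iter):
        h *= (w.T @ v) / (w.T @ w @ h + 1e-12)
        w *= (v @ h.T) / (w @ h @ h.T + 1e-12)
    return w, h

def deconvolve(fid, win, rank=2):
    """Split an FID into a 'signal' FID and a 'noise' FID."""
    spec = stft_frames(fid, win)
    w, h = nmf(np.abs(spec), rank)
    # Assumption: the signal component is the one whose time profile decays
    # most strongly between the first and the last window.
    k = int(np.argmax(h[:, 0] / (h[:, -1] + 1e-12)))
    # Assumption: rebuild each part with a soft mask that keeps the phase.
    mask = np.outer(w[:, k], h[k]) / (w @ h + 1e-12)
    return istft_frames(spec * mask), istft_frames(spec * (1.0 - mask))

# Illustrative FID: one decaying line plus complex Gaussian noise.
rng = np.random.default_rng(1)
t = np.arange(4096) * 1e-4
fid = np.exp(2j * np.pi * 800.0 * t) * np.exp(-t / 0.1)
fid = fid + 0.05 * (rng.standard_normal(t.size) + 1j * rng.standard_normal(t.size))

signal_fid, noise_fid = deconvolve(fid, win=256)   # 256 / 4096 = 6.25% window
denoised_spectrum = np.fft.fftshift(np.fft.fft(signal_fid))
noise_spectrum = np.fft.fftshift(np.fft.fft(noise_fid))
```

Because the two masks sum to one, the signal FID and the noise FID add back to the original FID, so nothing is discarded by the split itself. How well the two parts correspond to true signal and noise depends on the factorization and on the window width.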

$T_2^*$, it will not be possible to choose an optimal filter for all lines simultaneously by applying commonly used apodization. Apodization such as exponential filtering attenuates both signal and noise. In contrast, the method that we propose enables signal and noise to be extracted from an FID based on each pattern of $T_2^*$ relaxation time.
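The limitation of a single apodization filter can be illustrated with a short sketch. It assumes the standard exponential window $e^{-\pi \cdot \mathrm{LB} \cdot t}$ and the usual matched-filter rule that the window should follow the signal envelope, so that $\mathrm{LB} = 1/(\pi T_2^*)$. The function names and the two $T_2^*$ values are hypothetical.

```python
import numpy as np

def exponential_apodization(fid, dwell_s, lb_hz):
    """Multiply an FID by exp(-pi * LB * t), i.e. add LB Hz of line broadening."""
    t = np.arange(len(fid)) * dwell_s
    return fid * np.exp(-np.pi * lb_hz * t)

def matched_lb_hz(t2_star_s):
    """Line broadening whose decay equals the signal envelope exp(-t / T2*)."""
    return 1.0 / (np.pi * t2_star_s)

# Two hypothetical lines with a tenfold difference in T2*.
lb_long = matched_lb_hz(0.20)    # slowly decaying line
lb_short = matched_lb_hz(0.02)   # rapidly decaying line

# The same weights multiply every point of the FID, so late-time signal
# and late-time noise are scaled by the same factor.
flat = np.ones(1024, dtype=complex)
weighted = exponential_apodization(flat, dwell_s=1e-3, lb_hz=lb_long)
```

The two matched values differ by a factor of ten, so any single choice of LB is a compromise when lines with different $T_2^*$ share one FID.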

$^{1}$H-NMR data. We also examined the effect of the number of components in PSMF on signal deconvolution, which showed that signal components were properly extracted with two components (Supplementary Figure S5). When the number of components was increased, only the noise components were separated more finely. Based on this result, the number of components was set to 2 in the signal deconvolution method for noise reduction. For more complex data, such as the NMR signal of a mixture, it may be possible to characterize multiple components by separating them with an arbitrary number of components.

#### 2.2. Noise Reduction in NMR Data Measured by Various Pulse Sequences

$T_2^*$, diffusion-edited, which detects proteins and lipids with relatively short $T_2^*$, and WATERGATE, which detects both of these. For the analysis of this extensive data, the ratio of the time width to the FID length was set to 6.3% for CPMG and WATERGATE and 12.5% for diffusion-edited (a 1024-point window for FIDs of 16384 and 8192 points, respectively), and the initial three values of the noise component were added as a signal component. For CPMG and WATERGATE, the improvement rate was 3.7-fold and 3.3-fold, respectively, whereas it was only 2.2-fold for diffusion-edited NMR data (Figure 3a). When the relative SNRs of the three pulse sequences were compared for 10 representative samples, the values for diffusion-edited data tended to be lower than those for CPMG and WATERGATE, as in the large-scale data (Figure 3b, Supplementary Table S2), since the time width for diffusion-edited (12.5%) is larger than that of the other two pulse sequences (6.3%). The SNR of any NMR dataset is related to the acquisition parameters (Supplementary Figures S6–S8). In NMR data acquired with CPMG and WATERGATE, the SNR is related to several acquisition parameters, such as receiver gain (RG), number of scans (NS), relaxation delay time (D1), spectral width (SW), and offset of the transmitter frequency (O1), whereas in diffusion-edited NMR, the SNR is particularly related to the gradient pulse in the z-axis (GPZ). In diffusion-edited NMR, signals from small molecules with long $T_2^*$ relaxation times are suppressed. We therefore considered that, if the GPZ setting was insufficient, signals of small molecules would remain, resulting in a difference in relative SNR. The peak SNR depends on $T_2^*$ because an FID with a large $T_2^*$ yields a sharp line with a higher SNR at the peak [38]. Thus, it seems likely that the diffusion-edited NMR data contain many broad signals derived from macromolecules, resulting in less improvement than for CPMG and WATERGATE, which have many sharp signals.
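The kind of analysis that relates SNR to acquisition parameters can be sketched as a table of per-dataset parameters followed by a correlation of each column with SNR. The sketch below is an assumption about the general approach, not the authors' function. The records are made up for illustration, and their SNR values are generated as $10\sqrt{\mathrm{NS}}$ by construction rather than measured. In practice the values would be read from each dataset's acquisition-parameter file.

```python
import numpy as np

def correlate_with_snr(records, target="SNR"):
    """Pearson correlation between each numeric parameter and the target."""
    y = np.array([r[target] for r in records], dtype=float)
    result = {}
    for key in records[0]:
        if key == target:
            continue
        x = np.array([r[key] for r in records], dtype=float)
        if x.std() == 0 or y.std() == 0:
            continue  # a constant column has no defined correlation
        result[key] = float(np.corrcoef(x, y)[0, 1])
    return result

# Made-up records (NOT data from this study): NS and RG vary, D1 is fixed,
# and SNR is synthesized as 10 * sqrt(NS).
records = [
    {"NS": ns, "RG": rg, "D1": 2.0, "SNR": 10.0 * np.sqrt(ns)}
    for ns, rg in [(8, 90), (16, 64), (32, 128), (64, 45), (128, 101)]
]
correlations = correlate_with_snr(records)
```

Parameters that never vary across the collection, such as `D1` here, are skipped, since they cannot explain differences in SNR. The remaining coefficients are the edge weights one would draw in a correlation network.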

#### 2.3. Application of Signal Deconvolution Method in Diffusion-Edited NMR

$T_2^*$ (Figure 4a,b). By extracting each component and performing standard FT, the SNR of the denoised spectrum was improved about threefold compared with the original data. In addition, we obtained individual spectra for the short and long $T_2^*$ components (Figure 4c,d). Thus, the diffusion-edited spectrum was separated into signals from macromolecules and small molecules according to the length of the $T_2^*$ relaxation time. The composition of molecules in these signals is related to the GPZ value of the acquisition parameters (Supplementary Figures S8 and S9). We consider that insufficient GPZ is the main factor affecting the relative SNR of diffusion-edited NMR data because, if GPZ is insufficient, relatively more signals from small molecules are contained in the measured signals. Knowing this composition will help to evaluate the data quality of diffusion-edited NMR.

#### 2.4. Noise Factor Analysis in Data Measured by Low- and High-Field NMR at Multiple Institutions

$T_2^*$ gives, on Fourier transformation, a line width of $1/(\pi T_2^*)$, or approximately $1/(3T_2^*)$. Thus, data acquisition beyond about $3T_2^*$ provides little gain in resolution but causes a considerable deterioration in SNR. In addition, the spectral width should be set wide enough to prevent aliasing of NMR signals. Otherwise, other signals, namely noise, may still fold into the spectrum, so that the final SNR deteriorates.
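The line width quoted above follows from the Fourier transform of a single exponentially decaying component. As a worked sketch, assuming an FID of the form $S(t) = S_0\, e^{i 2\pi \nu_0 t}\, e^{-t/T_2^*}$ for $t \ge 0$ (the symbols $S_0$ and $\nu_0$ are introduced here for illustration):

$$
\int_0^{\infty} S(t)\, e^{-i 2\pi \nu t}\, dt
= \frac{S_0\, T_2^*}{1 + i\, 2\pi T_2^* (\nu - \nu_0)},
\qquad
\mathrm{Re} = \frac{S_0\, T_2^*}{1 + 4\pi^2 T_2^{*2} (\nu - \nu_0)^2}
$$

The real part is a Lorentzian that falls to half its maximum where $2\pi T_2^* \lvert \nu - \nu_0 \rvert = 1$, so

$$
\mathrm{FWHM} = 2 \cdot \frac{1}{2\pi T_2^*} = \frac{1}{\pi T_2^*} \approx \frac{0.32}{T_2^*} \approx \frac{1}{3 T_2^*}
$$

At $t = 3T_2^*$ the envelope has already fallen to $e^{-3} \approx 0.05$ of its initial value, so points acquired later contain almost only noise.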

## 3. Materials and Methods

#### 3.1. Signal Deconvolution Method

#### 3.2. Noise Factor Analysis Method

#### 3.3. NMR Data Acquisition

$^{1}$H-NMR data were recorded using a Bruker Avance II 700 spectrometer equipped with a 5-mm inverse CryoProbe operating at 700.153 MHz for $^{1}$H. The $^{1}$H-NMR data comprised 2386 datasets acquired with the CPMG pulse sequence, 2760 with the WATERGATE pulse sequence, and 975 with the 1D LED experiment using bipolar gradients (diffusion-edited) pulse sequence [58,59,60,61]. For these large datasets, a summary of the sample and acquisition parameters (sample title, solvent, acquisition time, acquisition points, and original SNR) is available at http://dmar.riken.jp/NMRinformatics/. The datasets used to compare the relative SNRs of the three pulse sequences for 10 representative samples are listed in Supplementary Table S2. To demonstrate the denoising method, data for sucrose and citric acid were acquired using the presaturation pulse sequence (program name: “zgpr”). To demonstrate the separation of signals in the diffusion-edited spectrum, $^{1}$H-NMR data for fish muscle were measured with a diffusion-edited pulse sequence. Lastly, 48 sets of $^{1}$H-NMR data (glucose, sucrose, citric acid, and lactic acid) were collected from the following five sites: RIKEN, NUIS, BMRB, BML, and HMDB. The data were measured with NMR spectrometers of 60, 500, 600, and 700 MHz manufactured by Bruker, Varian, and Nanalysis (Supplementary Table S3).

## 4. Conclusions

$T_2^*$ length, recycle delay, sample molecular weight, or measurement temperature. The ratio of the time width to the effective average signal region of FIDs must be adjusted according to the $T_2^*$ length. Therefore, additional effort is needed when using this method for fast-relaxing systems such as solid-state NMR and quadrupolar nuclei. In the case of 2D-NMR, the method must be applied by splitting each $t_1$-dimensional FID into a series of sub-FIDs. Noise factor analysis of accumulated NMR datasets might facilitate the investigation of experimental factors related to a lower SNR. Therefore, these methods will help to determine optimal acquisition parameters, to cleanse data, including data management and noise reduction in accumulated NMR datasets, and to promote data-driven studies of molecular complexity using NMR.

## Supplementary Materials

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Takeuchi, K.; Baskaran, K.; Arthanari, H. Structure determination using solution NMR: Is it worth the effort? J. Magn. Reson. **2019**, 306, 195–201.
2. Jimenez, B.; Holmes, E.; Heude, C.; Tolson, R.F.M.; Harvey, N.; Lodge, S.L.; Chetwynd, A.J.; Cannet, C.; Fang, F.; Pearce, J.T.M.; et al. Quantitative Lipoprotein Subclass and Low Molecular Weight Metabolite Analysis in Human Serum and Plasma by $^{1}$H NMR Spectroscopy in a Multilaboratory Trial. Anal. Chem. **2018**, 90, 11962–11971.
3. Chikayama, E.; Yamashina, R.; Komatsu, K.; Tsuboi, Y.; Sakata, K.; Kikuchi, J.; Sekiyama, Y. FoodPro: A Web-Based Tool for Evaluating Covariance and Correlation NMR Spectra Associated with Food Processes. Metabolites **2016**, 6, 36.
4. Singh, K.; Kumar, S.P.; Blümich, B. Monitoring the mechanism and kinetics of a transesterification reaction for the biodiesel production with low field $^{1}$H NMR spectroscopy. Fuel **2019**, 243, 192–201.
5. Kikuchi, J.; Ito, K.; Date, Y. Environmental metabolomics with data science for investigating ecosystem homeostasis. Prog. Nucl. Magn. Reson. Spectrosc. **2017**, 104, 56–88.
6. Wishart, D.S. NMR metabolomics: A look ahead. J. Magn. Reson. **2019**, 306, 155–161.
7. Maeda, H.; Yanagisawa, Y. Future prospects for NMR magnets: A perspective. J. Magn. Reson. **2019**, 306, 80–85.
8. Kovacs, H.; Moskau, D.; Spraul, M. Cryogenically cooled probes—A leap in NMR technology. Prog. Nucl. Magn. Reson. Spectrosc. **2005**, 46, 131–155.
9. Clos, L.J., II; Jofre, M.F.; Ellinger, J.; Westler, W.M.; Markley, J.L. NMRbot: Python scripts enable high-throughput data collection on current Bruker BioSpin NMR spectrometers. Metabolomics **2013**, 9, 558–563.
10. Ardenkjær-Larsen, J.H.; Fridlund, B.; Gram, A.; Hansson, G.; Hansson, L.; Lerche, M.H.; Servin, R.; Thaning, M.; Golman, K. Increase in signal-to-noise ratio of >10,000 times in liquid-state NMR. Proc. Natl. Acad. Sci. USA **2003**, 100, 10158–10163.
11. Kazimierczuk, K.; Orekhov, V. Non-uniform sampling: Post-Fourier era of NMR data collection and processing. Magn. Reson. Chem. **2015**, 53, 921–926.
12. Pines, A.; Gibby, M.G.; Waugh, J.S. Proton-Enhanced Nuclear Induction Spectroscopy. A Method for High Resolution NMR of Dilute Spins in Solids. J. Chem. Phys. **1972**, 56, 1776–1777.
13. Morris, G.A.; Freeman, R. Enhancement of nuclear magnetic resonance signals by polarization transfer. J. Am. Chem. Soc. **1979**, 101, 760–762.
14. Blümich, B. Low-field and benchtop NMR. J. Magn. Reson. **2019**, 306, 27–35.
15. Meiboom, S.; Gill, D. Modified Spin-Echo Method for Measuring Nuclear Relaxation Times. Rev. Sci. Instrum. **1958**, 29, 688.
16. Piotto, M.; Saudek, V.; Sklenar, V. Gradient-tailored excitation for single-quantum NMR spectroscopy of aqueous solutions. J. Biomol. NMR **1992**, 2, 661–665.
17. Vilén, E.M.; Klinger, M.; Sandström, C. Application of diffusion-edited NMR spectroscopy for selective suppression of water signal in the determination of monomer composition in alginates. Magn. Reson. Chem. **2011**, 49, 584–591.
18. Chandrakumar, N. Chapter 3 1D Double Quantum Filter NMR Studies. Annu. Rep. NMR Spectrosc. **2009**, 67, 265–329.
19. Lopez, J.; Cabrera, R.; Maruenda, H. Ultra-Clean Pure Shift $^{1}$H-NMR applied to metabolomics profiling. Sci. Rep. **2019**, 9, 6900.
20. Gouilleux, B.; Rouger, L.; Giraudeau, P. Ultrafast 2D NMR: Methods and Applications. Annu. Rep. NMR Spectrosc. **2018**, 93, 75–144.
21. Castañar, L.; Poggetto, G.D.; Colbourne, A.; Morris, G.A.; Nilsson, M. The GNAT: A new tool for processing NMR data. Magn. Reson. Chem. **2018**, 56, 546–558.
22. Morris, G.A.; Barjat, H.; Home, T.J. Reference deconvolution methods. Prog. Nucl. Magn. Reson. Spectrosc. **1997**, 31, 197–257.
23. Taylor, H.S.; Haiges, R.; Kershaw, A. Increasing Sensitivity in Determining Chemical Shifts in One Dimensional Lorentzian NMR Spectra. J. Phys. Chem. A **2013**, 117, 3319–3331.
24. Krishnamurthy, K. CRAFT (complete reduction to amplitude frequency table)—Robust and time-efficient Bayesian approach for quantitative mixture analysis by NMR. Magn. Reson. Chem. **2013**, 51, 821–829.
25. Ibrahim, M.; Pardi, C.; Brown, T.; McDonald, P.J. Active elimination of radio frequency interference for improved signal-to-noise ratio for in-situ NMR experiments in strong magnetic field gradients. J. Magn. Reson. **2018**, 287, 99–109.
26. Langmead, C.J.; Donald, B.R. Extracting structural information using time-frequency analysis of protein NMR data. In Proceedings of the Fifth Annual International Conference on Computing Machinery, Montreal, QC, Canada, 22–25 April 2001; pp. 164–175.
27. Hirakawa, K.; Koike, K.; Kanawaku, Y.; Moriyama, T.; Sato, N.; Suzuki, T.; Furihata, K.; Ohno, Y. Short-time Fourier Transform of Free Induction Decays for the Analysis of Serum Using Proton Nuclear Magnetic Resonance. J. Oleo Sci. **2019**, 68, 369–378.
28. Short, T.; Alzapiedi, L.; Brüschweiler, R.; Snyder, D. A covariance NMR toolbox for MATLAB and OCTAVE. J. Magn. Reson. **2010**, 209, 75–78.
29. Manu, V.; Gopinath, T.; Wang, S.; Veglia, G. $T_2^*$ weighted Deconvolution of NMR Spectra: Application to 2D Homonuclear MAS Solid-State NMR of Membrane Proteins. Sci. Rep. **2019**, 9, 8225.
30. Yamada, S.; Ito, K.; Kurotani, A.; Yamada, Y.; Chikayama, E.; Kikuchi, J. InterSpin: Integrated Supportive Webtools for Low- and High-Field NMR Analyses Toward Molecular Complexity. ACS Omega **2019**, 4, 3361–3369.
31. Kusaka, Y.; Hasegawa, T.; Kaji, H. Noise Reduction in Solid-State NMR Spectra Using Principal Component Analysis. J. Phys. Chem. A **2019**, 123, 10333–10338.
32. Stilbs, P. Automated CORE, RECORD, and GRECORD processing of multi-component PGSE NMR diffusometry data. Eur. Biophys. J. **2012**, 42, 25–32.
33. Stilbs, P. RECORD processing–A robust pathway to component-resolved HR-PGSE NMR diffusometry. J. Magn. Reson. **2010**, 207, 332–336.
34. Stilbs, P.; Paulsen, K.; Griffiths, P. Global Least-Squares Analysis of Large, Correlated Spectral Data Sets: Application to Component-Resolved FT-PGSE NMR Spectroscopy. J. Phys. Chem. **1996**, 100, 8180–8189.
35. Kikuchi, J.; Yamada, S. NMR window of molecular complexity showing homeostasis in superorganisms. Analyst **2017**, 142, 4161–4172.
36. Pupier, M.; Nuzillard, J.-M.; Wist, J.; Schlörer, N.E.; Kuhn, S.; Erdélyi, M.; Steinbeck, C.; Williams, A.; Butts, C.P.; Claridge, T.D.W.; et al. NMReDATA, a standard to report the NMR assignment and parameters of organic compounds. Magn. Reson. Chem. **2018**, 56, 703–715.
37. Halouska, S.; Powers, R. Negative impact of noise on the principal component analysis of NMR data. J. Magn. Reson. **2006**, 178, 88–95.
38. Becker, E.D.; Ferretti, J.A.; Gambhir, P.N. Selection of optimum parameters for pulse Fourier transform nuclear magnetic resonance. Anal. Chem. **1979**, 51, 1413–1420.
39. Mo, H.; Harwood, J.; Zhang, S.; Xue, Y.; Santini, R.; Raftery, D. A quantitative measure of NMR signal receiving efficiency. J. Magn. Reson. **2009**, 200, 239–244.
40. Mo, H.; Harwood, J.S.; Raftery, D. A quick diagnostic test for NMR receiver gain compression. Magn. Reson. Chem. **2010**, 48, 782–786.
41. Mo, H.; Harwood, J.S.; Raftery, D. Receiver gain function: The actual NMR receiver gain. Magn. Reson. Chem. **2010**, 48, 235–238.
42. Mo, H.; Harwood, J.; Raftery, D. NMR quantitation: Influence of RF inhomogeneity. Magn. Reson. Chem. **2011**, 49, 655–658.
43. Liu, H.; Dong, H.; Ge, J.; Bai, B.; Yuan, Z.; Zhao, Z. Research on a secondary tuning algorithm based on SVD & STFT for FID signal. Meas. Sci. Technol. **2016**, 27, 105006.
44. Zitnik, M.; Zupan, B. NIMFA: A python library for nonnegative matrix factorization. J. Mach. Learn. Res. **2012**, 13, 849–853.
45. Liu, H.; Dong, H.; Ge, J.; Liu, Z.; Yuan, Z.; Zhu, J.; Zhang, H. A fusion of principal component analysis and singular value decomposition based multivariate denoising algorithm for free induction decay transversal data. Rev. Sci. Instrum. **2019**, 90, 035116.
46. Keeler, J. Understanding NMR Spectroscopy; Apollo, University of Cambridge Repository: Cambridge, UK, 2004.
47. Dueck, D.; Morris, Q.; Frey, B.J. Multi-way clustering of microarray data using probabilistic sparse matrix factorization. Bioinformatics **2005**, 21, 144–151.
48. Claridge, T. MNova: NMR data processing, analysis, and prediction software. J. Chem. Inf. Model. **2009**, 49, 1136–1137.
49. Ulrich, E.L.; Akutsu, H.; Doreleijers, J.F.; Harano, Y.; Ioannidis, Y.E.; Lin, J.; Livny, M.; Mading, S.; Maziuk, D.; Miller, Z.; et al. BioMagResBank. Nucleic Acids Res. **2007**, 36, D402–D408.
50. Ludwig, C.; Easton, J.; Lodi, A.; Tiziani, S.; Manzoor, S.E.; Southam, A.; Byrne, J.J.; Bishop, L.M.; He, S.; Arvanitis, T.N.; et al. Birmingham Metabolite Library: A publicly accessible database of 1-D $^{1}$H and 2-D $^{1}$H J-resolved NMR spectra of authentic metabolite standards (BML-NMR). Metabolomics **2011**, 8, 8–18.
51. Wishart, D.S.; Feunang, Y.D.; Marcu, A.; Guo, A.C.; Liang, K.; Vázquez-Fresno, R.; Sajed, T.; Johnson, D.; Li, C.; Karu, N.; et al. HMDB 4.0: The human metabolome database for 2018. Nucleic Acids Res. **2018**, 46, D608–D617.
52. Larive, C.K.; Jayawickrama, D.; Őrfi, L. Quantitative Analysis of Peptides with NMR Spectroscopy. Appl. Spectrosc. **1997**, 51, 1531–1536.
53. Helmus, J.J.; Jaroniec, C.P. Nmrglue: An open source Python package for the analysis of multidimensional NMR data. J. Biomol. NMR **2013**, 55, 355–367.
54. Laurberg, H.; Christensen, M.G.; Plumbley, M.; Hansen, L.K.; Jensen, S.H. Theorems on Positive Data: On the Uniqueness of NMF. Comput. Intell. Neurosci. **2008**, 2008, 1–9.
55. Kim, H.; Park, H. Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis. Bioinformatics **2007**, 23, 1495–1502.
56. Lee, D.D.; Seung, H.S. Learning the parts of objects by non-negative matrix factorization. Nature **1999**, 401, 788–791.
57. Demchak, B.; Hull, T.; Reich, M.; Liefeld, T.; Smoot, M.; Ideker, T.; Mesirov, J.P. Cytoscape: The network visualization tool for GenomeSpace workflows. F1000Research **2014**, 3, 151.
58. Yoshida, S.; Date, Y.; Akama, M.; Kikuchi, J. Comparative metabolomic and ionomic approach for abundant fishes in estuarine environments of Japan. Sci. Rep. **2014**, 4.
59. Misawa, T.; Wei, F.; Kikuchi, J. Application of Two-Dimensional Nuclear Magnetic Resonance for Signal Enhancement by Spectral Integration Using a Large Data Set of Metabolic Mixtures. Anal. Chem. **2016**, 88, 6130–6134.
60. Asakura, T.; Sakata, K.; Date, Y.; Kikuchi, J. Regional feature extraction of various fishes based on chemical and microbial variable selection using machine learning. Anal. Methods **2018**, 10, 2160–2168.
61. Wei, F.; Fukuchi, M.; Ito, K.; Sakata, K.; Asakura, T.; Date, Y.; Kikuchi, J. Large-Scale Evaluation of Major Soluble Macromolecular Components of Fish Muscle from Conventional $^{1}$H NMR Spectral Database. Molecules **2020**, 25, 1966.

**Figure 1.** The free induction decay (FID) signal deconvolution method and its application to $^{1}$H-NMR data for sucrose. (**a**) The spectrogram was obtained by applying short-time Fourier transform (STFT) to the original FID. (**b**) The matrix obtained after STFT was applied to probabilistic sparse matrix factorization (PSMF), which separated it into signal and noise components. (**c**) The signal and noise components were converted into a noise-removed FID signal (orange) and a time-domain noise signal (blue) by using inverse short-time Fourier transform. (**d**) Finally, the noise-removed FID and the time-domain noise signal were converted to a frequency-domain spectrum by applying standard Fourier transform. As compared with the original FID, the signal-to-noise ratio of the denoised FID was improved about tenfold.

**Figure 2.** Comparison of four matrix factorization (MF) methods in signal deconvolution. Shown are spectral patterns of signal deconvolution for sucrose $^{1}$H-NMR data using (**a**) PSMF, (**b**) NMF, (**c**) PMF, and (**d**) SNMF. The signal components are shown in orange and the noise components are shown in blue.

**Figure 3.** Relative SNR in data measured by three pulse sequences. (**a**) Relationship between acquisition time and the relative SNR after application of the noise reduction method to large-scale data measured by three pulse sequences: CPMG (blue), WATERGATE (red), and diffusion-edited (yellow). The upper part of the figure shows the number of spectra and the average relative SNR for each pulse sequence. (**b**) Comparison of the efficiency of SNR improvement for data measured by three pulse sequences, CPMG (blue), WATERGATE (red), and diffusion-edited (yellow), among NMR spectra derived from sample IDs 1 to 10. The acquisition time and the average relative SNR for each pulse sequence are shown in the upper part of the figure.

**Figure 4.** Application of the signal deconvolution method to diffusion-edited spectra. (**a**) Spectral patterns showing signals from small molecules (orange) and macromolecules (olive) separated by the length of the $T_2^*$ relaxation time, and noise (blue). (**b**) Time-varying coefficients of each component in MF. (**c**) Denoised spectrum (gray), and spectrum of the short $T_2^*$ component (olive). (**d**) Denoised spectrum (gray), and spectrum of the long $T_2^*$ component (orange).

**Figure 5.** Analysis of experimental factors based on a correlation network of SNR and experimental parameters. The network diagram was drawn by setting positive correlations to red, negative correlations to blue, and the magnitude of the correlation coefficient to the edge thickness. Abbreviations: SNR, signal-to-noise ratio; calcSNR, calculated SNR; Cstd, concentration of standard compound; Ccomp, concentration of compound; Water+, positive intensity of water signal peak to standard peak; Water–, negative intensity of water signal peak to standard peak; Intensity, intensity of standard signal; FWHM, full width at half maximum; Area, area of standard signal; RG, receiver gain; NS, number of scans; D1, relaxation delay time; SW, spectral width; AT, acquisition time; TD, time-domain data size; O1, offset of transmitter frequency; TE, temperature; BF1, basic transmitter frequency for channel F1 in Hertz; PROBHD, probe head type (4 for a cryoprobe, 0 otherwise).

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Yamada, S.; Kurotani, A.; Chikayama, E.; Kikuchi, J.
Signal Deconvolution and Noise Factor Analysis Based on a Combination of Time–Frequency Analysis and Probabilistic Sparse Matrix Factorization. *Int. J. Mol. Sci.* **2020**, *21*, 2978.
https://doi.org/10.3390/ijms21082978
