Chemists Focus on Probes, Biologists on Cells—But Who Talks about Probe-Cell Interactions? A Critical Account of the Suboptimal Reporting of Novel Fluorescent Imaging Probes, Using Lipid Droplet Stains as a Case Study

Abstract: Many current reports in the scientific literature describe novel fluorescent probes intended to provide information on various structures or properties of live cells by using microscopic imaging. Unfortunately, many such reports fail to provide key information regarding the staining process. It is often the case that neither the necessary minimum technical detail (probe concentration, solvent and cosolute, temperature and time of staining, and details of post-staining washes) nor a discussion of the proposed staining mechanism is provided. Such omissions make it unnecessarily difficult for biomedical end-users to try out reported novel probes in their own laboratories. The validity of these criticisms is explored and demonstrated by a detailed analysis of 75 non-cherry-picked articles describing novel fluorescent probes for the detection of lipid droplets in live cells. This dataset also suggests that papers from journals with high journal impact factors or from better-known research groups are no more likely to provide better protocol information or discussion of the mechanism than papers from less prestigious sources. Comments on possible reasons for this suboptimal reporting are offered. The use of a suitable information/feature checklist, following best practice in many leading chemical and biological journals, is suggested as a mechanism for ameliorating this situation, with a draft checklist being provided.


Problems with the Reporting of Novel Fluorescent Imaging Probes
At this time there is a substantial global effort directed at synthesizing and trialing small-molecule fluorescent compounds (hereafter "probes") to be used for imaging applications in biomedicine. Such compounds permit investigators to detect biological structures, substances, and processes within living cells. This burgeoning field has been described many times; see [1][2][3] for recent representative reviews.
However, there is a largely unacknowledged problem with the scientific reports of such new imaging probes: most accounts fail to provide the key technical information concerning the staining process which would permit a biomedical end-user to apply the novel probe in their own laboratory without further experimentation. In addition to this reporting problem, two further factors concerning the translation of a novel probe from a chemical to a biological laboratory should be mentioned. First is an experimental design problem: using only a single cell line to evaluate staining by a novel probe will not give a life science investigator confidence that the proposed staining procedure is of general applicability. This is because when a probe is applied to different cell types, the localization patterns can differ between them. For instance, when JCS cells were incubated with merocyanine 540, the dye was seen in the plasma membrane and mitochondria, whereas under the same staining conditions, M1 leukemia cells accumulated this dye in the lysosomes [4]. Second is an interpretive problem: many articles describing new probes fail to provide any significant analysis of the mechanism of staining. The overall effect is that the critical application of a novel probe by a biomedical end-user may well be difficult and, in consequence, not be attempted.
Of course, reports of staining procedures do vary widely in quality. Both the technical aspects of probe/cell application and the interactions involved are considered by some investigators. Nevertheless, factors such as the staining time and staining temperature, the identity of the staining solvent and of any cosolutes, the details of post-staining treatments, and even the concentration of the probe, are not always provided. This is despite the fact that these factors can all substantially influence the localization and accumulation of probes in living cells, and that this has been clear for a considerable period; see, for instance, [5][6][7].
This account addresses the reasons why such omissions of reporting so significantly limit both the probe development process and the likely take-up of new probes by biomedical investigators. Readers may consider that the propositions that "most accounts" fail to provide sufficient information concerning key variables, or that "many" articles fail to provide accounts of staining mechanisms, are exaggerations. Consequently, these criticisms are here empirically assessed in a test case of publications describing novel probes targeting lipid droplets (LDs). Finally, possible reasons why this situation arises are considered, and suggestions are provided as to how it might be remedied.

Using Publications Describing Novel LD Probes as a Case Study
Blanket criticisms, such as those sketched above, demand empirical evidence. In this review, the dataset used for such testing comprised 75 articles describing novel fluorochromes used to target LDs in live cells. The papers comprising this dataset, which became available due to an ongoing project involving the use of QSAR to assist the design of LD-selective probes, are listed in Appendix A. It should be noted that in all but four of the articles, the titles contain words or phrases indicating that the new compounds were intended to be applied as imaging agents. Moreover, in more than half of these articles, the title explicitly addresses some diagnostic, therapeutic, or biological concern.
To avoid cherry-picking, the dataset was assembled as follows. Papers were identified using a simple search with Google Scholar for "fluorescent probe + lipid droplet". The first 75 papers in the listing meeting the following criteria were then added to the dataset. First, the paper reported a fluorochrome which had not been used before to stain LDs. Second, the paper contained a microscopic image of a live cell containing LDs stained with the probe. Third, a structural formula of the probe was provided. Fourth, the identity of the cell line or lines used as target cells was stated. This procedure resulted in just over 100 individual probes being described in the 75-article dataset.
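The selection rule just described (take search hits in listing order and keep the first 75 that satisfy all four criteria) can be sketched in a few lines of Python. This is only an illustration of the rule as stated; the `Paper` record, its field names, and the `select_first` helper are hypothetical and are not taken from the original study.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Paper:
    """Hypothetical record for one search hit; field names are illustrative."""
    title: str
    novel_ld_fluorochrome: bool  # criterion 1: probe not previously used to stain LDs
    live_cell_image: bool        # criterion 2: image of a live cell with stained LDs
    structural_formula: bool     # criterion 3: structural formula of the probe given
    cell_lines_stated: bool      # criterion 4: identity of the cell line(s) stated


def meets_criteria(paper: Paper) -> bool:
    """True only if all four inclusion criteria are satisfied."""
    return (paper.novel_ld_fluorochrome and paper.live_cell_image
            and paper.structural_formula and paper.cell_lines_stated)


def select_first(papers: Iterable[Paper], n: int = 75) -> List[Paper]:
    """Keep the first n qualifying papers, preserving the search-listing order."""
    selected: List[Paper] = []
    for paper in papers:
        if meets_criteria(paper):
            selected.append(paper)
            if len(selected) == n:
                break
    return selected
```

Because the loop never reorders or ranks the hits, the only way a paper is excluded is by failing a stated criterion, which is the sense in which the procedure avoids cherry-picking.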
This present account focuses on the availability, and salience, in the published report of key technical information concerning the application of novel probes, and on the availability of information concerning the staining mechanism.

Why Are the Technical Factors concerning Probe-Cell Interactions So Important?
Although the dataset discussed here is restricted to LD probes, the author is well aware that similar issues arise with, for example, fluorescent probes for lysosomes, mitochondria, and nucleic acids. While acknowledging that no systematic review of these other cases has yet been carried out, and noting that the technical factors influencing the staining of different cellular targets may vary with the nature of the target, it is considered likely that conclusions drawn from the LD dataset are generalizable.
That said, when considering the practical factors influencing the uptake and uptake selectivity of LD probes, the starting point is that a partitioning process occurs; in the present instance, partitioning between aqueous solutions and hydrophobic targets. Therefore, the following phenomena must be considered, as they have long been known to influence such partitioning.
Cosolutes in that solution: such as inorganic salts [10] or serum albumin [9].
How long partitioning was carried out for: see [8,9].
Post-partitioning washing of the target material: see [11,12]. This last step is important since it could constitute a second "reverse" partitioning process in which staining is reduced or even eliminated, especially in smaller LDs.
Consequently, all such factors must be specified when reporting a staining procedure, so that a biomedical end-user wishing to try out a new probe is given the complete information required and does not need to explore technical variations. Moreover, such accounts must be both explicit and easy to find in the published report. Note that in many journals, such material may be placed in the supporting information.
Additional technical factors may be worth considering in certain cases. For instance, if the new probe is an acid or base with a pKa near pH 7, then the pH of the solution becomes important. If uptake into the LDs is due to the partitioning of an electrically neutral species, then any pH shift may be expected to alter the concentration of the relevant non-ionic species. Since this is not a universal issue, it will not be considered further here, although for relevant compounds it should be addressed as, indeed, it is in one of the papers of the dataset [13]. Another factor of restricted relevance is that some staining systems only stain LDs after suitable pre-staining treatments, an example being the requirement for pre-treatment with nystatin reported by Guo et al. [14]. Again, while this does need stating, it is not a common feature of the current dataset.
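To make the pH point concrete, the fraction of a probe present as the electrically neutral species can be estimated from the Henderson-Hasselbalch relation. The Python sketch below assumes a monoprotic acid or base in ideal dilute solution; the function name `neutral_fraction` and the example pKa value are illustrative assumptions, not values from the dataset.

```python
def neutral_fraction(pka: float, ph: float, kind: str) -> float:
    """Fraction of a monoprotic probe present as the uncharged species.

    kind="acid": HA <-> A- + H+   (neutral form is HA)
    kind="base": BH+ <-> B + H+   (neutral form is B)
    Assumes ideal dilute solution (Henderson-Hasselbalch).
    """
    if kind == "acid":
        return 1.0 / (1.0 + 10.0 ** (ph - pka))
    if kind == "base":
        return 1.0 / (1.0 + 10.0 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


# Hypothetical weak base with pKa 7.0: a one-unit drop in pH cuts the
# neutral (partitioning) fraction from one half to about one eleventh.
for ph in (8.0, 7.0, 6.0):
    print(ph, round(neutral_fraction(7.0, ph, "base"), 3))
```

For such a compound, a staining solution at pH 6 rather than pH 7 would offer roughly five times less of the neutral species, which is why the pH needs reporting when the pKa lies near 7.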

The Protocol Reporting Index (PRI) as an Indicator of Quality
The previous section explored and justified the range of features expected to routinely influence probe uptake and selective retention, and these features are listed in Table 1. To facilitate subsequent analysis, a score was allocated to each factor. The sum of these gave an overall score, here termed the protocol reporting index, or PRI. This can be considered a simple indicator of the usefulness of the published account for potential biological end-users.

Table 1. Determination of the protocol reporting index (PRI).
Using the criteria given in the table, the PRI score can range from 0 to 9. Regarding post-staining treatment, "no detail" implies that only the washing solution was specified, whereas "detail" would also include wash time and temperature.

As well as the factors mentioned previously, Table 1 includes an additional factor likely to influence biologists considering trying out a novel fluorochrome as an imaging probe, namely, whether the recommended staining procedure is provided as a clear, stand-alone protocol, rather than the relevant information being distributed between different sections of the text and the supplementary information. In the present account, the "additional technical factors", mentioned previously as sometimes being relevant, are not further discussed. The PRI scoring system described above and in Table 1 thus gives rise to a ten-point scale, with possible scores running from zero to nine. As every factor listed in Table 1 has been considered significant by the authors of at least one of the 75 articles of the dataset, these criteria represent the collective opinion of workers in this field. Of course, nearly all this information would have been known to the investigators; the problem being addressed here is largely (albeit not entirely) one of suboptimal reporting. Note at this point that the previous discussion implies that the minimum score required for a report is seven, as any with a lower score would fail to provide the technical detail required to duplicate the procedure.
The overall PRI score was then compiled for each paper in the 75-member dataset by checking whether there was, or was not, an explicit mention of each of the factors listed in Table 1. Various biases could have arisen during this process. Thus, items could be overlooked, giving rise to a score which was erroneously low. On the other hand, since section headings such as "Co-localization assay" or "Cell culture and co-staining" were accepted as identifying the staining protocol, which was perhaps overgenerous in some cases, the PRI scores may, in some other instances, have been erroneously high. Similar overgenerosity may have arisen when assessing the consideration of the staining mechanism.
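The scoring step can be sketched as a simple sum over reported items. The per-item weights of Table 1 are not reproduced in this text, so the weights below are assumptions chosen only so that the total runs from 0 to 9 (seven yes/no items worth one point each, plus 0, 1, or 2 points for post-staining washing); the class, field names, and wash labels are likewise hypothetical.

```python
from dataclasses import dataclass

# Assumed weights for the post-staining wash item (not taken from Table 1):
# nothing reported, wash solution only ("no detail"), or solution plus
# time and temperature ("detail").
WASH_SCORES = {"not_reported": 0, "no_detail": 1, "detail": 2}


@dataclass
class ProtocolReport:
    """What one article explicitly reports; field names are illustrative."""
    multiple_cell_lines: bool
    standalone_protocol: bool
    concentration: bool
    staining_time: bool
    staining_temperature: bool
    solvent: bool
    cosolutes: bool
    wash: str = "not_reported"


def pri(report: ProtocolReport) -> int:
    """Protocol reporting index as the sum of individual item scores."""
    binary_items = [
        report.multiple_cell_lines,
        report.standalone_protocol,
        report.concentration,
        report.staining_time,
        report.staining_temperature,
        report.solvent,
        report.cosolutes,
    ]
    return sum(binary_items) + WASH_SCORES[report.wash]


def is_replicable(report: ProtocolReport, threshold: int = 7) -> bool:
    """Apply the minimum score of seven stated in the text."""
    return pri(report) >= threshold
```

How an article stating that no wash is needed should be scored is not specified here and would have to follow Table 1.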

How Good Is the Reporting of Key Protocol Factors?
As a direct way of addressing this crucial question, consider the PRI frequency diagram shown in Figure 1. Since, as noted above, the minimum PRI score for a document allowing the straightforward replication of the staining procedure is seven, it is apparent that only a quarter of the papers met this criterion. This strongly supports the earlier pessimistic characterization of the reporting of novel probes. As well as this global assessment, however, it is possible to provide some more fine-grained comments on the information provided in this dataset. Surprisingly, one article provided no information whatsoever of the type here considered necessary for the potential biological end-user. At the other end of the quality spectrum, two articles did achieve the highest possible PRI score of nine. As the modal value of the dataset was six, it is the case that most papers do not provide sufficient information to permit the straightforward application of the new compounds in other laboratories.
So, what information was provided regarding the staining process? Nearly all (73/75) articles reported the concentration of the probe used. The next most widely reported factor was the staining time, although about one in ten articles failed to provide this. The staining solvent and cosolutes were each reported in only half of the papers, with the staining temperature being noted in only a third of the documents.
Information concerning post-staining washing calls for a specific mention. As noted previously, this step could result in the extraction of the probe from the LDs, hence the need for an explicit description of the process. Eight articles in the dataset noted that post-staining washing was not needed, obviously a technically advantageous feature. In almost half of the papers, no statement was made regarding washing or no washing, which is an unfortunate lack of detail. This left 34 papers which stated that washing was necessary. Oddly, two of these merely stated that fact, but provided no further information. The remaining 32 articles all provided information regarding the nature of the wash solution. However, none of these provided any indication of the temperature, nor of the time of washing, although 18 did state the number of times washing was carried out.
Overall, it is clear that, from the viewpoint of a biologist seeking novel staining probes, the answer to the query "How useful is the reporting of key technical information?" is "Not good enough". This is surprising, since the papers in the dataset did typically provide a great deal of information regarding the synthesis and optical properties of the new probes. Moreover, most of the information contributing to the PRI score must be known to the experimenters, or the work reported could not have been carried out.
Consequently, the following question arises. When a biologist browses the literature seeking novel probes, what sources of information are likely to be most informative? Perhaps high-quality journals or better-known research groups should receive most attention.

Do High-Quality Journals or Papers from Better-Known Research Groups Provide Better Staining Protocol Reporting?
When pursuing this question, the initial task was to operationalize "quality" and "better-known". In the present review, journal impact factors and the numbers of citations are, therefore, regarded as such indicators for a journal and for an investigator, respectively. Although both criteria have been disputed (e.g., [15]), it is likely that most biologists consider the notions of "quality journals" and "well-known investigators" as common sense, making these appropriate measures in the present context.
To test for a possible relationship between the good reporting of staining protocol factors and journal quality, the following procedure was carried out. The mean PRI score of the papers published in each of the 34 journals contributing to the 75-article dataset was calculated. A scatter plot was made of these scores versus the corresponding journal impact factors (obtained from Journal Citation Reports); see Figure 2. An inspection of this figure indicates that there is no trend to higher PRI values (i.e., to better reporting) at higher journal impact factors. Consequently, basing a search for novel probes on journal quality, so defined, would not help a biologist seeking well-documented novel probes.
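The per-journal averaging described above is straightforward to reproduce. The source assesses the trend by visual inspection of the scatter plot; the Spearman rank correlation in the sketch below is one conventional way to attach a number to "no trend" and is an addition for illustration, not the method used in this review. Function names are hypothetical, and only the Python standard library is used.

```python
from collections import defaultdict
from statistics import mean


def mean_pri_by_journal(records):
    """records: iterable of (journal_name, pri_score) pairs, one per article."""
    grouped = defaultdict(list)
    for journal, score in records:
        grouped[journal].append(score)
    return {journal: mean(scores) for journal, scores in grouped.items()}


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

In use, the mean PRI per journal would be paired with that journal's impact factor and passed to `spearman_rho`; a value near zero would be consistent with the absence of a trend seen in Figure 2.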
Next, to test for a possible relationship between the good reporting of staining protocol factors and the prominence of the research group involved, the following procedure was adopted. The corresponding authors of each of the 75 articles were taken as proxies for the chief investigators of the group responsible for the work. Papers whose corresponding authors had Google Scholar user profiles were then identified, providing a subset of 43 articles. This proportion is within the range expected for academic investigators (cf. [16]). The total number of citations given in Google Scholar for each such corresponding author was determined, and if an article had more than one individual named as the corresponding author, then the person with the larger number of citations was listed. A scatter plot was made of the PRI values for each article versus the listed corresponding author's total citation count; see Figure 3. An inspection of this plot indicates that there is no trend to higher PRI values (i.e., to better reporting) at higher total citation numbers. Consequently, basing a search on the prominence of the putative chief investigator would not help a biologist searching for well-documented novel probes for LDs.
This latter conclusion is supported by another line of evidence. Consider the work emerging from the group of the most prolific, and most cited, author in the dataset. The total citation number of 153,152 for this corresponding author resulted in the articles coming from this group falling into a distinct zone of the scatter plot shown in Figure 3. This allows us to see that, even when originating from a single group, a dozen articles can have PRI values ranging from 1 to 7, an extremely diverse set. Consequently, even a productive laboratory which reports considerable protocol information on one occasion may, in another article, be very frugal indeed with the information provided.
It may be concluded that neither journal quality, nor scientific reputation, nor the high output of authors, can provide guidance for a biologist's search. This is very unsatisfactory. However, before discussing possible reasons and possible solutions for these problems, a short digression will be made into the informative concept of probe design.

The Oddly Biased Use of the Term "Probe Design"
Successful imaging probes must possess features of two types. They need to possess a range of desirable optical properties, such as particular λabs and λem values, sometimes to have photon cross-sections which fall into particular ranges, to have large Stokes shifts, and often to be solvatochromic. In addition, probes must of course localize at specific cellular sites. Producing an effective imaging probe is, therefore, a challenge for a synthetic chemist. Perhaps as a result, it is not unusual for probes to be described as "designed," or "engineered," or an equivalent term. It is, therefore, intriguing that the final, crucial requirement (for a probe to possess features resulting in specific cellular localization) was apparently not considered part of the design process by many authors. Of the 33 articles claiming "design" or similar, only 20 explicitly consider the requirement that a probe must be hydrophobic if it is to accumulate in the LDs. Moreover, even within this subgroup, the widely used numerical indicator of hydrophobicity, namely, a partition coefficient, was considered by only a few articles. It is perhaps not surprising that there is no positive relationship between claims for design and higher PRI values; see Figure 4.
Still considering the usage of the term "design", note that almost a third of the articles failed to provide any explicit account of how the probes achieved the selective staining of LDs. The proportion which did give a full account (that is, considered both the necessity of probe hydrophobicity and, when appropriate, of probe solvatochromism) was about a third. Overall, therefore, about two thirds of the articles provided no, or only partial, information regarding the supposed staining mechanism. Again, this is in marked contrast to the often extensive analyses of the mechanisms underlying, for instance, solvatochromism.

A Possible Explanation for the Poor Reporting of Protocol Information
A simplistic, albeit plausible, reason for the curious omission of key technical information, and for the equally curious lack of analysis of the mechanism of action of the probes, is the following. Chemists focus on probe chemistry, biologists focus on cell structure and properties, but only a minority of investigators (about a third, see above) regard probe-cell interactions as being of significance. Perhaps this is because the interactive process falls into the conceptual and interdisciplinary gap between the two fields. The present author must admit to finding this stance puzzling, as over a considerable time period he has been publishing work aimed at elucidating precisely such dye-biostructure interactions (e.g., [17,18]).
If the above suggestion is correct, at least in part, then fixing the problem may be thought to be difficult. Against that view, there is some good news. Nearly all the factors in the list defining the PRI are available to the authors of the articles, otherwise the work reported could not have been carried out. Moreover, only one factor contributing to the PRI (namely, the number of cell lines studied) involves experimental design. Thus, the problem is overwhelmingly one of improving reporting.
Taking an optimistic stance, some suggestions regarding how to ameliorate the situation can be offered. Before attempting this, an obvious limitation of the present critique must be acknowledged, namely, that the case study only addresses articles describing probes for a single organelle. Whilst it is the present author's impression that similar problems in reporting arise with papers describing other novel organelle-targeted probes (in particular, those targeting lysosomes, mitochondria, and nucleic acids), the question of the generalizability of the critique remains to be addressed empirically. However, it is unlikely that the poor reporting of protocol and of mechanistic information is, coincidentally, restricted to novel probes for a single organelle, namely LDs.

Some Suggestions for Remedial Action
For published accounts of novel fluorescent probes to be of greater value to biomedical end-users, authors must change their reporting practices. The necessary changes are straightforward and simple to state: the more difficult problem is how to motivate authors to make the necessary changes. Fortunately, however, most chemists and many biologists will already be familiar with journal publishing procedures which could be applied to this problem.
Many major chemical and biological journals currently require submitted manuscripts to contain information which increases clarity for readers and enables replicability by other investigators. Failure to provide such information is then cause to ask for revision of the manuscript. The American Chemical Society has been one pioneer of such an approach; see, for instance, the "Compound characterization checklist" in the Journal of Organic Chemistry. The life science journals published by Cell Press provide another example, with their "STAR" system, which aims to enable "structured, transparent, accessible reporting".
Such approaches make considerable use of checklists of required information and features. Inspired by such strategies, a draft checklist for the reports of staining LDs with novel fluorescent probes, based on the key protocol factors and other features discussed above, is offered below.
For each probe, check if the following features and information are provided in the manuscript. Respond using Y for yes, N for no, and X for not relevant.
Was the application of the novel probe evaluated using more than one cell type?
Is a stand-alone staining protocol provided, with a clear heading?
Is the probe concentration in the staining solution specified?
Is the staining time specified?
Is the staining temperature specified?
Is the solvent for the staining solution specified?
Are cosolutes in the staining solution specified?
Was post-staining washing required? If Y, respond to the following queries.
    Were washing solvents and any cosolutes specified?
    Were wash times, number of washes, and washing temperature specified?
Hopefully, such a checklist, if used by authors, editors, and reviewers, would facilitate the effective translation of novel probes from the laboratories of origin to the laboratories of biomedical end-users. However, it should be noted that this checklist is tailored for probes of LDs. A variant checklist might be needed to deal with reports of probes targeting a larger range of organelles. For instance, when using probes targeting lysosomes, the pH of the staining solvent and the presence or absence of cosolutes such as BSA become significant.
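For editors or reviewers who wish to apply the draft checklist mechanically, its Y/N/X responses and the conditional washing queries can be encoded as below. This is a minimal sketch: the item keys, the `unmet_items` function, and the choice to treat every "N" on a reporting item as an unmet requirement are assumptions made for illustration, not part of the checklist itself.

```python
VALID = {"Y", "N", "X"}  # yes, no, not relevant

# Items that should be answered for every probe (keys are illustrative).
MAIN_ITEMS = {
    "multiple_cell_types": "Evaluated using more than one cell type?",
    "standalone_protocol": "Stand-alone staining protocol with a clear heading?",
    "probe_concentration": "Probe concentration in the staining solution specified?",
    "staining_time": "Staining time specified?",
    "staining_temperature": "Staining temperature specified?",
    "staining_solvent": "Solvent for the staining solution specified?",
    "staining_cosolutes": "Cosolutes in the staining solution specified?",
}
WASH_GATE = "wash_required"  # "Was post-staining washing required?"
# Follow-up queries, asked only when the gate question is answered Y.
WASH_ITEMS = {
    "wash_solution": "Washing solvents and any cosolutes specified?",
    "wash_conditions": "Wash times, number of washes, and temperature specified?",
}


def _answer(responses, key):
    """Fetch one response, rejecting missing or malformed answers."""
    value = responses.get(key)
    if value not in VALID:
        raise ValueError(f"{key}: expected one of Y, N, X, got {value!r}")
    return value


def unmet_items(responses):
    """Return the keys of checklist items answered N, in checklist order."""
    unmet = [key for key in MAIN_ITEMS if _answer(responses, key) == "N"]
    if _answer(responses, WASH_GATE) == "Y":
        unmet += [key for key in WASH_ITEMS if _answer(responses, key) == "N"]
    return unmet
```

An empty list from `unmet_items` would indicate that nothing on the checklist was answered N; answering N to the washing gate question is not itself treated as a deficiency, since a probe that needs no wash is advantageous.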

1. Approximately three quarters of scientific reports describing novel fluorescent probes for LDs do not provide sufficient technical information (e.g., concentration, solvent and cosolute, time, post-stain washing procedure) to permit the direct replication of the recommended staining process by another investigator.
2. Approximately two thirds of such reports also fail to provide a full account of the supposed staining mechanism of the probe.
3. Consequently, a trial application of such probes by biomedical investigators is made harder than need be the case.
4. This suboptimal reporting is anomalous, as extensive accounts are typically provided of syntheses and, where appropriate, of the optical properties of the probes. Moreover, the technical protocol information was known to the authors as they carried out the work described.
5. A possible explanation for such inadequate reporting is that chemists focus on probe chemistry, biologists on cell properties. The failure of either party to address the interaction of probe and cell is thus a "not my problem" problem.
6. It is very unlikely that the omissions discussed are restricted to probes for LDs, and consequently, such omissions are probably of general concern for the fluorescent imaging probe field.
7. Correcting such errors of omission requires authors to alter their reporting habits; facilitating such changes is a task for journal editors.
8. One way to achieve this would be to adopt the best practice already in place in many chemistry and biology journals, namely, to devise and make use of checklists of required content, for use by authors, editors, and reviewers.

Figure 1. A frequency diagram showing the overall PRI scores for the 75 articles of the test case.

Figure 2. Scatter plot of mean PRI score for all papers published in a journal vs. corresponding journal impact factor for the 75 articles of the LD test case.

Figure 3. Scatter plot of overall PRI scores vs. total citation count of most-cited corresponding author for the 75 articles of the LD test case.

Figure 4. A comparison of the PRI scores of articles stating a probe was "designed" with the scores of articles making no such claim. Note that this plot only considers 74 papers, as one applied a compound not synthesized by the authors.