Simple Soil Tests for On-Site Evaluation of Soil Health in Orchards

Standard commercial soil tests typically quantify nitrogen, phosphorus, potassium, pH, and salinity. These factors alone are not sufficient to predict the long-term effects of management on soil health. The goal of this study was to assess the effectiveness and use of simple physical, biological, and chemical soil health indicator tests that can be completed on-site. Analyses were conducted on soil samples collected from three experimental peach orchards located on the Utah State Horticultural Research Farm in Kaysville, Utah. All simple tests were correlated to comparable lab analyses using Pearson’s correlation. The highest positive correlations were found between Solvita® respiration and microbial biomass (R = 0.88), followed by our modified slake test and microbial biomass (R = 0.83). Both Berlese funnel and pit count methods of estimating soil macro-organism diversity were fairly predictive of soil health. Overall, simple commercially available chemical tests were weak indicators of soil nutrient concentrations compared to laboratory tests. Modified slake tests, Solvita® respiration and soil organism biodiversity counts may be efficient and cost-effective tools for monitoring soil health on-site.


Introduction
Soil health or quality is typically defined as the ability of soil to function while maintaining or improving water and air quality, and supporting biota [1,2] (pp. 3-21; pp. 23-35). It is assessed using a suite of physical, biological, and chemical tests. Maintaining soil health is essential for agricultural sustainability and the long-term viability of all land-based natural systems [3]. In the U.S., cropland loses an average of 16 metric tons of soil per hectare per year [4]. Maintaining soil health can prevent loss in system productivity while also improving long-term financial outcomes for farmers. For example, researchers in Iowa were able to increase yield by 3-12% and reduce input costs by 41-89% [5]. Despite attempts, little progress has been made in increasing grower involvement in soil testing and soil health maintenance [6,7]. Soil health tests are not always convenient, affordable, reliable, or feasible for interested individuals [8], and there is an ongoing lack of education [7].
Numerous simple soil health tests have been developed over the years, in particular soil health cards and test kits such as the Natural Resources Conservation Service (NRCS) soil health test kit. Soil health cards provide a user-friendly visual tool; however, when used alone, they can be subjective and incomplete [8]. The NRCS test kit is one of the most comprehensive soil health test kits available, yet many of the tests are time consuming and confusing for a novice soil tester [8]. Submitting soil samples to an analytical laboratory is the most straightforward testing method for growers. However, most laboratories do not offer biological and physical soil tests, and when they do, the cost is often prohibitive [8]. A few innovative U.S. laboratories offer affordable soil health tests. For example, at least 20 soil labs in the U.S. offer Solvita® respiration tests, including at least one lab in the Intermountain West [9]. The Cornell Soil Health Testing Laboratory offers a complete soil health test [10]. However, sample shipping costs can be a limitation, and sample deterioration during shipment can limit the accuracy of results.
There is no definitive list of soil health tests, although tests should generally include the combined assessment of soil physical, biological and chemical parameters [1]. Specific tests will likely vary based on the laboratory and the problems frequently encountered in a given region. A few common soil health measurements include aggregate stability, texture, organic matter, nitrogen (N), potassium (K), phosphorus (P), pH, microbial biomass, soil respiration and enzyme assays. The NRCS recently released a minimum list of soil health tests with recommended methods to provide a standard against which other methods can be compared [11].
Soil structure or aggregation is one of the most important physical soil health attributes. Aggregate stability is the ability of primary soil particles to remain attached under disruptive forces. Aggregate stability tests are useful in addressing a soil's potential for erosion, in particular when comparing the same soil type among management systems [12,13] (pp. 425-442). Researchers have largely focused their efforts on improving the reproducibility of laboratory aggregate stability tests. A chief criticism is the lack of a universally accepted method to measure soil structure [14,15].
According to Lal and Shukla [16], aggregate stability tests generally fall into three categories: (1) ease of dispersion by turbidimetric techniques [17], (2) evaluation of aggregate strength based on raindrop impact [18], and (3) aggregate stability by wet sieving [13,19] (pp. 425-442). All three categories of soil aggregate tests have been modified for on-site use. As rainfall simulators are often bulky and complicated to build, the most effective on-site aggregate testing options for growers are turbidimetric tests or wet sieving/slake tests. Herrick et al. [20] developed an inexpensive stability test kit constructed from simple tools. It could test up to 18 samples in 10 min. The kit was made of two boxes (21 × 10 × 3.5 cm) with eighteen equal sections and eighteen 2.5 cm sieves (1.5 mm mesh) for holding the soil aggregates. The rating system was based on a scale of 1-6. This test was found to be highly sensitive to a variety of plant and soil conditions [20]. The NRCS incorporated a modified version of the slake test developed by Herrick et al. [20] into their field test kit.
Soil organisms and their diversity are also important indicators of soil health as they are responsible for organic matter breakdown and nutrient release, and may rapidly respond to shifts in management practices [21] (pp. 419-435). The rate of organic matter turnover and mineralization potential is an important factor to consider when determining nutrient application rates in efficient systems [22]. The most common simple biological tests are counting earthworms or measuring soil respiration in a given volume of soil; however, earthworms are not native to all soils, and soil respiration can be highly affected by weather and by management practices such as irrigation that affect soil moisture and temperature [8]. Litterbag tests are uncommon in agricultural applications; however, they may provide an inexpensive, simple, and perhaps more reliable option for determining soil microorganism activity than a soil respiration test. Litterbag tests can quantify decomposition rates over an extended period, rather than measuring only current field conditions [23].
Other tests to measure soil biological health include assessment of soil arthropods. Heteroptera (known as 'true bugs', with distinctive wings and piercing-sucking mouthparts) and Collembola (known as 'springtails', which are wingless and lack metamorphosis) have been cited as important indicators of ecological health and/or change [24-26] (pp. 225-264). The Berlese funnel test is commonly used to measure abundance of soil arthropods in the laboratory [27-29]. There are no published studies using in-field versions tailored for growers; however, foldable or collapsible Berlese funnels have been constructed for lightweight transportation [30,31]. Hence, a Berlese funnel could possibly be further modified as a convenient, affordable test for growers.
Unlike physical and biological tests, chemical tests such as organic matter, N, P, K, and pH are available from most commercial laboratories. However, the accuracy of commonly available on-site chemical test kits is uncertain. Accurate on-site tests might increase adoption of soil testing by growers.
The goal of this study was to assess the effectiveness and use of simple physical, biological and chemical soil health indicator tests that can be completed on-site. A number of potential soil tests were initially screened for ease and time of use in addition to availability of materials. Twelve simple tests for measuring soil physical, biological and chemical properties were then correlated to comparable laboratory analyses for their ability to distinguish between soils of known soil health characteristics. Tests that compared favorably with the corresponding lab analysis were taught to orchardists through demonstrations. Finally, survey response data were collected on grower perceptions of the tests.

Selection of Simple Soil Testing Strategies
Simple soil tests were selected based on the accessibility of the test or test components in terms of cost, availability, ease of use and reasonable time commitment. Emphasis was placed on tests that could be easily constructed from materials costing under $20 and completed in less than an hour. Many different types of test kits are available online; the NRCS test kit is one of the most comprehensive, although expensive. The slake test included in the NRCS test kit was selected as a simple measure of aggregate stability, and two additional slake tests were developed as further simplifications. We refer to these modified slake tests as the surface test and the hose test. The simple biological tests used were a litterbag test [23]; the Solvita® respiration test (Woods End Laboratories, Mt Vernon, ME, USA), which measures CO₂ evolved from a given volume of soil over 24 h; a simplified Berlese funnel test [30,31]; an earthworm abundance test [8]; and a soil biodiversity test measuring arthropods, earthworms and organism diversity in soils. The soil biodiversity test was modified from the more common earthworm soil test to include observation of a wider diversity of organisms. The chemical tests chosen were the Rapidtest kit (Luster Leaf Products Inc., Woodstock, IL, USA), the LaMotte test kit (LaMotte Company, Chestertown, MD, USA), the Hanna pH meter (Hanna Instruments, Cluj-Napoca, Romania) and the Mosser test kit (Mosser Lee, Millston, WI, USA); all were either available locally or readily available online.

Experimental Field Sites
Soil samples were collected from replicated plots in three experimental peach orchards (one conventional, one integrated, and one organic) located on the Utah State Horticultural Research Farm in Kaysville, Utah. Together, the integrated and organic orchards comprised 11 orchard floor treatments (four replicates per treatment) with documented differences in soil health [32,33]. Full descriptions of management, soil health and tree growth response to treatment can be found in Culumber et al. [32] and Reeve et al. [33]. In general, soil health was linked to changes in soil organic matter and level of disturbance, with treatments containing legume cover crops ranking highest and tillage and conventional orchard floor management ranking lowest.
The integrated orchard (Table 1) consisted of five tree-row treatments, all with grass alleyways: conventional fertilizer and herbicide (CfH); conventional fertilizer and herbicide, transitioned to compost as organic fertilizer after tree establishment (CfHO); compost as organic fertilizer plus herbicide (OfH); conventional fertilizer with paper mulch and reduced herbicide (CfM); and compost with paper mulch (OfM). In 2014, 16-16-16 and 46-0-0 fertilizers were applied to the CfH and CfM plots at rates of 28.8 and 130 g N per tree, respectively. Glyphosate herbicide was applied twice per year to CfH and CfHO and once a year to CfM at a rate of 1.5% in spray volumes of 234-281 L ha⁻¹. Organic fertilizers were applied to CfHO, OfH, and OfM in the form of steer manure compost (Miller's, Hyrum, Utah) and feather meal (NatureSafe 13-0-0) at rates of 20 and 137 g N per tree, respectively. Pesticide applications were made uniformly across all treatments as follows: copper sulfate and horticultural oil were used to treat coryneum blight and were applied once in the spring and once in the fall in 2014. Flubendiamide and spinosad were used to treat peach twig borer. Flubendiamide was applied once in the spring and once in the summer, while spinosad was used once in the summer. Imidacloprid and potassium salt of fatty acids were applied in the spring to treat green peach aphids in 2014. Tebuconazole and trifloxystrobin were used in the spring of 2014 to treat mildew.
In 2015, a nearby conventional orchard (located on the same farm and the same soil type) was used instead of the integrated orchard, which was removed in 2014. The conventional orchard had a grass alleyway with some clover. The conventional orchard received 30-8-8 and 46-0-0 fertilizers at rates of 45 and 104 g N per tree, respectively. Alion herbicide was used at a rate of 366 mL ha⁻¹. Copper sulfate and horticultural oil were used to treat coryneum and were applied once in the fall. Horticultural oil, tebuconazole and trifloxystrobin were used to treat coryneum and were applied once in the spring. Trifloxystrobin, difenoconazole and cyprodinil were used to treat mildew. Spinosad was used to treat peach twig borer.
The organic orchard (Table 1) included six understory treatments: straw mulch in the tree row with a grass alleyway (StGr), straw mulch in the tree row with a legume (birdsfoot trefoil, Lotus corniculatus L.) alleyway (StTr), living mulch (mowed weeds) in the tree row with a grass alleyway (LmGr), living mulch in the tree row with a legume alleyway (LmTr), woven plastic mulch (5 oz., DeWitt, Sikeston, MO) in the tree row with a grass alleyway (WfGr) and tilled tree rows with a grass alleyway (TiGr). All treatments had steer manure compost (Miller's, Hyrum, Utah) and feather meal (NatureSafe 13-0-0) applied at rates of 13.6 and 136 g N per tree, respectively, in both 2014 and 2015. In the tillage treatment, compost was applied under the drip line. In the straw and living mulch treatments, the compost was applied to a 30 cm tillage strip separating the tree row from the alleyway. Pesticide applications were made uniformly across all treatments as follows: spinosad was applied to treat peach twig borer twice in 2014 and twice in 2015. Copper oxychloride/hydrochloride and paraffinic oil were used once in the spring of 2014, paraffinic oil was used once in the spring of 2015, and both organic treatments were used twice in the fall of 2014 and 2015 to treat coryneum. Potassium salt of fatty acids was used to treat green peach aphids once in the spring of 2014.
All eleven treatments were used to correlate the simple chemical tests to the laboratory tests, but only four of the treatments were used for the biological and physical tests: StGr, StTr, TiGr and CfH. Each treatment consisted of four replicates in a randomized incomplete block design (RIBD). Six subsamples were randomly collected from each of the four replicates per treatment with a 2.5 cm soil corer or shovel (as described below) to a depth of 10 or 30 cm and pooled for analysis. Samples were collected for different tests on two different dates to spread out the workload and minimize the time that soil was stored. The soil was collected on the same date for all paired comparisons, for example simple vs. lab-based slaking tests. All physical tests were completed in August, either in the field or on air-dried soil transported to the laboratory. All chemical tests were conducted in July on fresh or dried soil as described below. All biological tests were conducted in June, with the exception of the Berlese funnel test, which was conducted in August. All lab-based biological tests were completed on fresh soil within two weeks of collection.

Simple Physical Tests
Simple physical tests were conducted on soil collected in August in both years, two to three days after an irrigation event. Soil was collected with a shovel from the top 10 cm of each replicate, transported to the lab and air-dried. The NRCS slake test was completed as described in NRCS [34]. Sieves were removed from the NRCS tray and one air-dried soil aggregate measuring 1 cm was placed in each sieve. The empty compartments in the tray were filled with distilled water. Sieves were lowered into the compartments and soaked in the distilled water for five minutes. After five minutes, the sieves were lowered and raised from the water four more times. Sieves were placed on a dry surface and aggregates were examined and rated according to the seven-point slake test scale included in the instructions. Zero was recorded if all soil disintegrated from the sieve upon first contact with the water. Six was recorded if 75% to 100% of soil aggregates remained intact after five dipping cycles [34].
Two modified slake tests were also developed. The first modified slake test, the surface structure test, was conducted by taking a 20 cm diameter kitchen sieve with a 1.5 mm mesh, filled to the rim with un-sieved soil (approximately 1.8 kg dry weight equivalent) from the designated plot, with rocks and large pieces of organic material removed. Soil was collected with a shovel from the top 10 cm in early August as described above and the tests were performed immediately in the field. A picture and notes were taken to document the general appearance and structure of the soil. The sieve was soaked in a bucket of water for five minutes. The sieve was raised and submerged four times, allowing water to drain (about five seconds) in between. The sieve was removed and another picture and more notes were taken documenting the soil surface structure. An estimate was recorded of the percent of soil structure remaining intact in the sieve.
The second modified slake test, the hose test, was conducted on the same sieve of soil directly after completing the surface structure test. A hose was turned on with one and three-quarter turns of the knob (hose pressure 80 psi, flow rate 24.7 L per minute) to maintain the same water pressure on all of the samples. The sieve was held about one half meter from the hose and sprayed for one minute in a circular motion, while maintaining an equal distribution of water flow over all surface points of the soil in the sieve. The mass of soil remaining at the end of one minute was recorded after air-drying.
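The mass remaining after the hose test can be expressed as a percentage of the starting soil so that sieves are comparable. The minimal sketch below assumes the approximately 1.8 kg dry weight equivalent given for the surface structure test as the starting mass; the function name and the example mass are hypothetical and are not taken from the study data.

```python
def percent_soil_retained(retained_dry_g, initial_dry_g=1800.0):
    """Percent of the starting soil (dry mass) still on the sieve after the hose test.

    initial_dry_g defaults to the ~1.8 kg dry weight equivalent per sieve
    (an assumption; weigh the actual sieve load where possible).
    """
    if initial_dry_g <= 0:
        raise ValueError("initial_dry_g must be positive")
    if retained_dry_g < 0 or retained_dry_g > initial_dry_g:
        raise ValueError("retained_dry_g must lie between 0 and initial_dry_g")
    return 100.0 * retained_dry_g / initial_dry_g


# Hypothetical example: 450 g of air-dried soil left on the sieve
retained = percent_soil_retained(450.0)
print(f"{retained:.1f}% of the soil was retained")
```

Weighing each filled sieve before wetting, rather than relying on the approximate default, would likely give a more comparable percentage between plots.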

Laboratory Physical Tests
The machine aggregate stability test as described by Kemper and Rosenau [13] (pp. 425-442) was used to correlate to the simple slaking tests. Four grams of air-dried soil (collected as described under Section 2.1.2) were placed in sieves in a mechanical sieving device (Make: 8.13.01; Model: 33255301; Giesbeck, Netherlands) and pre-moistened with steam to 4.75 g soil wet weight (19.5% water content). The instrument submerged the sieves and soil into water, and raised and lowered them at regular intervals for three minutes. The soil that was lost during the sieving process was oven dried at 40 °C and weighed. The process was repeated in a 0.2% sodium hexametaphosphate, (NaPO₃)₆, solution. The soil removed from the sieves by the (NaPO₃)₆ solution represented the stable aggregates.
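As a worked illustration of how the two weighed fractions can be combined into a single value, a minimal sketch follows. The symbols m_water (oven-dry mass lost during sieving in water) and m_disp (oven-dry mass subsequently removed in the sodium hexametaphosphate solution) are introduced here for illustration only, and the expression assumes no correction for sand or coarse fragments; it is not necessarily the exact calculation used in this study.

```latex
\[
\text{stable aggregates (\%)} = 100 \times \frac{m_{\text{disp}}}{m_{\text{water}} + m_{\text{disp}}}
\]
```

For instance, hypothetical masses of m_water = 1.0 g and m_disp = 3.0 g would give 100 × 3.0 / (1.0 + 3.0) = 75% stable aggregates.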

Simple Biological Tests
The earthworm and biodiversity tests were conducted in the field two to three days after an irrigation event during August in 2014 and 2015. To determine earthworm/biodiversity counts, a 30 × 30 × 30 cm hole was dug in each designated test plot. The soil from the hole was placed in a bucket and visually inspected one handful at a time for earthworms and other macroscopic soil organisms. The total number of earthworms and macro organisms, as well as the number of different kinds of organisms, were recorded.
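The three values recorded per pit (earthworm count, total macro-organism count, and number of different kinds) can be tallied from a simple field list. The sketch below is one hedged way to do this; the function name, the organism labels and the example observations are hypothetical.

```python
from collections import Counter


def summarize_pit_count(observations):
    """Summarize one pit from a list with one organism label per individual found."""
    tally = Counter(observations)
    return {
        "earthworms": tally.get("earthworm", 0),   # earthworm abundance
        "total_organisms": sum(tally.values()),    # all macro-organisms counted
        "kinds": len(tally),                       # number of different kinds
    }


# Hypothetical field notes from a single 30 x 30 x 30 cm pit
notes = ["earthworm", "earthworm", "beetle", "centipede", "earthworm"]
print(summarize_pit_count(notes))
```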
The Berlese funnel tests were also conducted in the field in August, two to three days after an irrigation event, in 2014 and 2015. The methods for construction of on-site Berlese funnel tests were modified and simplified from known laboratory and field methods [27,28,30,31]. A shovel of topsoil, about 15 to 20 cm in depth, excluding the top 2 cm of soil, from each designated plot was placed in a 20-L bucket. A 20 × 20 cm piece of cheesecloth was folded in half and taped to the inside of a 12 × 40 cm funnel with masking tape, approximately 10 cm below the opening of the funnel, to function as a sieve. The spout of the funnel was placed in a glass jar, and the space between the funnel and the jar was sealed with aluminum foil. One large handful of gently mixed soil (approximately 250 g dry weight equivalent) from the original shovelful was placed on top of the cheesecloth in the funnel. The funnels were left in the sun for three hours at an average temperature that afternoon of 28.9 °C. The funnels were removed from the jars. The contents of the jars were poured onto a piece of paper, and the number and type of organisms were recorded.
The Solvita® respiration test was conducted in late June in both years, two to three days after an irrigation event. Soil was sampled with a 2.5 cm corer to a depth of 10 cm in 2014 and 30 cm in 2015 and transported to the lab on ice for immediate analysis. The Solvita® test kit included plastic jars, lids, and CO₂ reactive probes. Each jar was marked with the required soil volume, which came to approximately 64 g of field moist soil. The CO₂ probe was removed from its metallic pouch and placed in the soil within the jar with the color indicator side facing upward. The jars were sealed with lids and placed in a dark cupboard at room temperature (20 °C) for 24 h, after which the probe color was matched to the test kit indicator sheet. The corresponding soil respiration number was recorded.
Litterbags were filled with three different substrates to measure decomposition rates: dried peach leaves, dried straw, and dried alfalfa, with eight replicates per plot. The dried straw and alfalfa materials were cut into 2.5 cm segments. Two and one half grams of each litter type was placed separately into a labeled nylon bag and the bag was sealed by tying a knot at the end. The nylon bags were buried 8 cm below the surface on June 21, 2014 and the location was marked with a landscape flag labeled with the litter type. One nylon bag of each litter type from each plot was retrieved at 1, 2, 3, 4, 6, 8, 12, and 48 weeks after burial. These methods were a modification of those used by Keuskamp et al. [23].
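One common way to turn litterbag masses into a decomposition rate is a single-exponential decay model, M_t = M_0 · exp(-k t). The sketch below estimates k by least squares through the origin on ln(M_t / M_0). It is an illustration under that assumed model, not the analysis used in this study; the function name and the example masses are hypothetical.

```python
from math import exp, log


def decay_constant(weeks, mass_remaining_g, initial_mass_g=2.5):
    """Estimate k (per week) assuming M_t = M_0 * exp(-k * t).

    Fits ln(M_t / M_0) = -k * t by least squares through the origin.
    initial_mass_g defaults to the 2.5 g of litter placed in each bag.
    """
    if len(weeks) != len(mass_remaining_g) or not weeks:
        raise ValueError("weeks and mass_remaining_g must be non-empty and equal in length")
    if any(m <= 0 for m in mass_remaining_g) or initial_mass_g <= 0:
        raise ValueError("masses must be positive")
    numerator = sum(t * log(m / initial_mass_g) for t, m in zip(weeks, mass_remaining_g))
    denominator = sum(t * t for t in weeks)
    if denominator == 0:
        raise ValueError("at least one retrieval time must be greater than zero")
    return -numerator / denominator


# Retrieval schedule from the text; masses are synthetic (generated with k = 0.05 per week)
retrieval_weeks = [1, 2, 3, 4, 6, 8, 12, 48]
synthetic_masses = [2.5 * exp(-0.05 * t) for t in retrieval_weeks]
print(round(decay_constant(retrieval_weeks, synthetic_masses), 3))
```

Because the fit passes through the origin, the 48-week retrieval carries far more weight than the early ones; a fit restricted to the first 12 weeks may be preferable where early decomposition is of interest.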

Laboratory Biological Tests
Soil samples for the laboratory analyses were taken using a 2.5 cm corer at the end of June in 2014 and 2015, to a depth of 10 cm in 2014 and 30 cm in 2015, two days after an irrigation event. They were transported to the lab on ice and were analyzed during the first two weeks of July. Mineralizable carbon (RMC), basal respiration (BR), and microbial biomass (Cmic) determined by substrate-induced respiration were measured with an infrared CO₂ analyzer (Model 6251, LICOR Biosciences) on days 12, 13, and 14 of an incubation at 20 °C and 22% moisture as described by Anderson and Domsch [35] and Davidson et al. [36]. Dehydrogenase enzyme activity (DHA), measured as the reduction of triphenyl tetrazolium chloride in 2.5 g dry weight equivalent of soil at 22% moisture, was determined as described by Tabatabai [37] (pp. 778-826).

Simple Chemical Tests
Soil samples were taken the last week of July each year for both laboratory and simple test kit chemical analyses. Soil was sampled with a 2.5 cm corer to a depth of 30 cm and transported to the lab on ice for immediate analysis. Instructions were followed according to the respective manuals for testing N, P, K, and pH with the Rapidtest kit, LaMotte test kit, and Mosser test kit. Instructions were also followed according to the manual for the testing of pH with the Hanna pH meter.

Laboratory Chemical Tests
For the laboratory chemical analysis, soils were collected as described in Section 2.1.6, passed through a 4 mm sieve, stored at 4 °C and processed within 10 days for measuring available N. Available N was measured by nitrate and ammonium extraction using 1 M potassium chloride and analyzed on a Lachat analyzer (QuikChem 8500, Hach Company, Loveland, CO, USA) using the sulfanilamide and phenate methods, respectively, as described in the manufacturer's protocols. The Olsen [38] sodium bicarbonate extraction method was used to measure P and K after sieving soils at 4 mm and air-drying for two weeks.

Statistical Analysis
Each simple test was compared to a relevant laboratory-based test using Pearson's correlation (SAS PROC CORR). Pearson's correlation coefficients are reported without P values because the analyses did not meet the assumptions required for P values; individual observations were not independent of treatment and/or replicated blocks. Results from the litterbag tests were analyzed with SAS PROC GLM as a randomized block design with two factors, treatment and litter type, with time as a repeated measure (SAS Institute, Cary, NC, USA).
The estimated percentages of stable soil aggregates from the simple slake tests were correlated with the percent stable soil aggregates from the mechanized slake test. They were also correlated with the biological laboratory measures (RMC, BR, Cmic, and DHA), as the physical qualities of the soil are often directly linked to biological activity in the soil.
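The analyses in this study were run in SAS PROC CORR. For readers without SAS, the coefficient itself can be sketched in a few lines of plain Python, as shown below. The function name and the paired values are hypothetical and serve only to illustrate the calculation; they are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when either variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical plot-level pairs: simple test score vs. laboratory measurement
simple_test = [2.0, 3.0, 4.0, 5.0]
lab_test = [110.0, 150.0, 190.0, 230.0]
print(round(pearson_r(simple_test, lab_test), 2))
```

As in the text above, only the coefficient is computed here; no P value is derived, because plot-level observations of this kind are not independent of treatment or block.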

Training Sessions with Growers, and Collection of Feedback
Soil health training opportunities were presented to local farmers. Seven orchardists volunteered to be trained in soil quality and on-site soil quality tests, which included the modified slake tests, NRCS slake test, Solvita® soil respiration, and earthworm abundance/biodiversity test. At the end of each training, they provided feedback on a prepared questionnaire. A demonstration of the same simple on-site soil health tests was taught at a summer field tour. Questionnaires were filled out at this event. Finally, a questionnaire was distributed through a USU orchardist listserv to obtain general feedback from Utah orchardists on their knowledge and interest in soil quality and testing methods.

Physical Tests
There was no relationship detected between the machine aggregate stability test and any of the simple slake testing methods, although several of the simple tests correlated well with the biological tests (Table 2). The machine aggregate stability test categorized the tillage management system as having the strongest aggregate stability (Supplementary Figures S1-S3), which is the opposite of what would be expected [39,40]. This can occur when air-dried soil is stored for several months to years prior to testing [12,41]. In our study, soils were stored air-dried for three months prior to testing in 2014 and one month in 2015, with no change in results. Soils at the research site contain calcium carbonates below 30 cm, and trace amounts in the topsoil may explain the unexpected finding. Aggregate stability in calcium carbonate containing soils is not always correlated with organic matter [42]. Kemper and Koch [12] reported that a factor necessary for obtaining reproducible results was sieving out soil particles with a diameter of less than 1 mm. This step was not done in this study, which could also have influenced the results. The challenge of comparing results from different stability tests has been a persistent one, as the degree of variability between and within methods is large and can lead to weak comparisons [15]. Physical soil properties were more visible on a larger scale, with informative results. In our first modified slake test, the surface soil test, soil aggregates in good quality soils would hold together tightly on wetting, indicating good aggregate structure. However, poorer quality soil would smooth out and gloss over on wetting, showing weak soil structure. With the smaller on-site slake tests such as the NRCS test, these visual cues were absent. Kheyrodin [43] recognized visual cues as being important indicators of changing or threatened soil health.
In both years, the best physical test correlation was between the surface soil test and microbial biomass (Table 2 and Figure 1). The results were consistent with results from the Solvita® test (Table 3 and Figure 2), and easily distinguished between orchard floor management practices that build soil health (such as addition of organic matter) and the soil management practices that typically diminish soil health (such as tillage; Figure 1a,b). In 2014, the hose test clearly distinguished between most treatments, even moderately differentiating soil health in the tree row with a trefoil alleyway from the tree row with a grass alleyway (Figure 1c). Previous research has shown that treatments with a trefoil alleyway had the best soil health [32]. In 2015, though, the hose test results were less conclusive (R = 0.42; Table 2). The difference in sensitivity between years could have been influenced by sampling depth.

Biological Test Results
Results from the Solvita® soil respiration test kit had the highest correlations with laboratory tests in both years (Table 3). The results coincide with those of Haney et al. [44], where Solvita® soil respiration tests strongly correlated with the titration method of measuring soil CO₂ respiration (R² = 0.82) and with infrared gas analysis of CO₂ (R² = 0.79). In the first year (2014), Solvita® soil respiration was able to differentiate between the two treatments with documented higher soil health and the treatments with lower soil health (differentiated SG and ST from HN and TG, Figure 2a) [32]. In the second year (2015), similar treatments were differentiated with less precision (Figure 2b), again likely due to the greater sampling depth used in 2015.
Soil microbial activity is heavily concentrated in surface soils, so limiting soil sampling to the top 10-15 cm would maximize the likelihood of differentiation among soils and management histories. It is also possible that precision of the Solvita® test could be improved by lessening the amount of time that the soil probes were incubated, as many of the organically managed soils maxed out at the upper range of the test within a few hours of the 24 h incubation time specified in the instructions. The drawback with the Solvita® test is that soil respiration is highly affected by weather and management practices such as irrigation, making it difficult to compare biological activity over time in a given location [8]. In our study, we controlled for potential differences in soil moisture between treatments and years by timing the test two to three days after an irrigation event.
The earthworm abundance test, although often recommended by the NRCS as well as others, proved to have little relationship with laboratory soil biological testing measures (Table 3).In 2014, the earthworm abundance test showed some differentiation among treatments (StTr often had the best soil health parameters, followed by StGr, TiGr and then CfH), when correlated to DHA (Figure 3a).In 2015, no correlation with DHA was found and there were only weak correlations with other The earthworm abundance test, although often recommended by the NRCS as well as others, proved to have little relationship with laboratory soil biological testing measures (Table 3).In 2014, the earthworm abundance test showed some differentiation among treatments (StTr often had the best soil health parameters, followed by StGr, TiGr and then CfH), when correlated to DHA (Figure 3a).In 2015, no correlation with DHA was found and there were only weak correlations with other biological measurements (Table 2).Conversely, previous work at this site has shown that DHA, RMC, BR and Cmic as measured in the laboratory have consistently differentiated between all treatments [32].The earthworm test weakly correlated the second year with laboratory measured Cmic (R = 0.32).The correlation of the number of different organisms found in the 30 cm 3 pit was higher (Table 2, R = 0.69 correlation with BR in 2014, Figure 3b), and could potentially be improved with more repetitions.Earthworms have beneficial effects on soil health, but numbers may not necessarily reflect laboratory biological indicators.According to Pelosi et al. [45], earthworm abundance is highly variable due to climatic conditions, and multiple years of assessments are required to obtain meaningful soil health implications.Our results suggest that total macro organism counts are a more reliable soil health indicator.
The results for the on-site Berlese funnel tests also corresponded fairly poorly to the laboratory tests (the best correlation was with BR in 2015, R = 0.68, Table 3). It was assumed that the heat of the sun over the space of a few hours would cause the soil arthropods to descend into the jar from the sieve [30]. The sieves used may have been too deep, allowing the organisms to remain in a comfortable environment for the duration of the test. A longer test period may also have improved the results. It is important to choose a sunny day with temperatures over 25 °C for this type of test. However, the ability of the Berlese funnel method to distinguish soil health did compare favorably with the total organism counts obtained with the pit method. Provided time and resources are invested in making the funnels, the test is less labor intensive than sifting through soil to count organisms by hand. Provided further modifications improve accuracy, this test holds potential for use as a simple on-site soil health test.
Litterbag tests failed to distinguish between treatments (Supplementary Table S1). The results were likely affected by loss of litterbag contents through perforations in the nylon material caused by roots and rocks. Nylon was chosen to prevent decomposition; however, a stronger material, such as the commercially available synthetic teabags used by Keuskamp et al. [23], might have produced more consistent results. Keuskamp et al. [23] found that the tea bags prevented root penetration and did not decompose after 3 months in the field. More replicates may also be needed for each litter type and excavation date to improve accuracy. The requirement for a precise weighing scale, and the time needed to dry the litterbags, remove adhered soil from the outside, and transfer the contents onto the scale, are also disadvantages in terms of ease of adoption by growers. Burying cotton underpants to compare decomposition potential between different sites has become a popular way to demonstrate soil microbial activity in extension settings [46].

Chemical Tests
Simple N tests produced the highest correlations among chemical tests in both years (Table 4).
Although not very precise, the results roughly corresponded to laboratory-measured soil N concentrations. The exception was the LaMotte simple N test in the organic orchard. It is possible that the diminished accuracy was an effect of organic materials, such as humic acids, on the chemical solution. The LaMotte simple K tests correlated quite well, with better results in the organic orchard than in the conventional orchard (Table 4). The highest concentration of K recorded in the laboratory corresponded to the highest concentration of K recorded using the LaMotte simple test, in particular for the StTr and StGr treatments. The test was less accurate in distinguishing K levels in the other four treatments, which could also be an effect of the narrow test scale (Supplementary Figure S4).
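To illustrate how coarse the LaMotte potassium scale is, the sketch below maps a kit score to the three categories given in the caption of Supplementary Figure S4 (low for scores 1-2, medium for scores 3-5, high for scores of 6 or more). The function name `lamotte_k_category` is hypothetical, and the handling of scores below 1 is an assumption, since the caption does not describe them.

```python
def lamotte_k_category(score):
    """Map a LaMotte potassium kit score to (category, lbs-per-acre range).

    Categories follow the caption of Supplementary Figure S4.
    Rejecting scores below 1 is an assumption of this sketch.
    """
    if score < 1:
        raise ValueError("scores below 1 are not described in the caption")
    if score <= 2:
        return ("low", "0-120")
    if score <= 5:
        return ("medium", "120-200")
    return ("high", "200+")


# Every possible score collapses into one of only three reported ranges.
for score in range(1, 8):
    category, lbs_per_acre = lamotte_k_category(score)
    print(f"score {score}: {category} ({lbs_per_acre} lbs per acre)")
```

Because every soil at or above 200 lbs per acre falls into the same "high" category, treatments with similar K levels cannot be separated on this scale, which may help explain the weaker differentiation noted above.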
The Mosser N test correlated best with laboratory tests of all the simple tests (Table 3); however, the Mosser K test showed poor correlation (Supplementary Figure S5). The test correctly identified the StTr treatment as having the greatest levels of K; however, the overall scale shows that the concentration of K was often underestimated and not measured very precisely. Samples that were rated with the lowest concentrations of K on the Mosser scale were measured above 150 ppm in the laboratory, which is typically considered a sufficient/high level. With a maximum of 180 ppm, the scale also could not register excessive nutrient levels. The correlations with soil P and pH were poor, regardless of the test used. The test kits came in packages of N, P, K and pH tests. Purchasing a kit only to use one or two particular tests is not the most efficient use of a product.
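The effect of the 180 ppm ceiling can be sketched as follows. The function name `kit_reading_ceiling` and the laboratory values in the example are hypothetical placeholders, not measurements from this study; only the 180 ppm scale maximum comes from the text. The sketch shows the best reading a capped scale could report, not how the kit actually behaved.

```python
MOSSER_K_MAX_PPM = 180  # upper end of the kit scale, as reported in the text


def kit_reading_ceiling(lab_k_ppm, scale_max=MOSSER_K_MAX_PPM):
    """Return the highest reading a capped scale could report for a lab value."""
    return min(lab_k_ppm, scale_max)


# Placeholder laboratory K values in ppm (NOT data from this study).
lab_values_ppm = [140, 175, 220, 310]

for lab in lab_values_ppm:
    best = kit_reading_ceiling(lab)
    status = "off scale" if lab > MOSSER_K_MAX_PPM else "within scale"
    print(f"lab {lab} ppm -> kit can report at most {best} ppm ({status})")
```

Any two soils above the ceiling would be reported identically, so a capped scale cannot flag excessive K even if the kit were otherwise accurate.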
Information on the extraction solutions was not provided with the Rapid test kit. However, the N simple tests for the LaMotte and the Mosser test kits were based on standardized colorimetric tests [47,48] (pp.1123-1184). The tests used zinc to reduce nitrate to nitrite. The nitrite then reacts with a color agent, allowing the concentration of N to be determined by observation. The Mosser potassium simple test used sodium tetraphenylboron, which reacts with nonexchangeable K to form a white precipitate. The cloudiness of the sample is then recorded [49] (pp.551-574). The LaMotte K simple test did not match any standardized K laboratory procedures [49] (pp.551-574). The Mosser and LaMotte simple P tests used modified versions of a colorimetric procedure for measuring P [50] (pp.869-920). The Mosser test used ammonium molybdate, which reacts with P, producing a complex that reduces to a blue color in the presence of ascorbic acid. The LaMotte simple test used sodium molybdate instead of ammonium molybdate. No information was found as to whether these colorimetric methods work better in acidic or alkaline soils, or are affected by humic acids. These soil attributes could potentially affect results.

Grower Feedback
Results from the grower surveys (Supplementary Figures S6-S9, Supplementary Tables S2-S4) showed that growers are interested in soil health and in learning more about it. Most orchard growers in Utah do test their soil; however, the majority of them only complete macronutrient laboratory tests. Growers, for the most part, are satisfied with current testing methods, yet essentially half of the growers surveyed acknowledged only some or limited knowledge of soil health. Hence, they may not be fully aware of the potential benefits of assessing soil health over the long term. Other studies conducted in the US and Australia show that routine soil testing is still relatively rare, with interest in and knowledge of soil health generally limited to specialized grower groups such as organic or no-till farmers [7]. Simple on-site tests provide a plausible avenue for farmers to improve understanding of their soil health without the difficulty or cost associated with laboratory testing. In terms of user friendliness and cost of simple on-site physical and biological tests, modified slake tests and soil biodiversity/earthworm abundance counts consistently ranked as most preferred among our grower collaborators (Figure 4). Comments provided to the researchers indicated that growers particularly appreciated the hands-on nature of the tests.

More education and research are needed to improve grower knowledge and adoption of soil testing in order to improve land management decisions. The development and promotion of simple, user-friendly soil health assessment tools could help fill that gap. Further work could also help to refine these simple soil tests, for example by determining whether the precision of the tests could be improved with more repetitions or through modifications of the protocols to improve sensitivity. Limiting testing to the top 10-15 cm would likely improve the ability of the tests to differentiate between soils of different management histories. Further information on test performance in a wider range of soil types and environments is also needed. Complete compilations of before and after pictures from various soil types, management systems and environments would provide a useful reference to aid in interpretation of the modified slake tests.

Conclusions
A number of potential soil tests were initially screened for ease and time of use in addition to availability of materials. Twelve simple tests for measuring soil physical, biological and chemical properties were then correlated to comparable laboratory analyses to assess their ability to distinguish between soils of known soil health characteristics. The soils were collected from replicated plots in three experimental orchards with documented treatment effects on soil health. Not all of the simple tests evaluated proved to be accurate indicators of soil health. However, the two modified slake tests, the Solvita® respiration test and soil organism counts accurately differentiated the majority of orchard floor treatments based on soil health. Based on these findings, these four tests were selected for on-farm demonstrations and grower trainings. Feedback from growers was also collected. The sieve and bucket test was ranked highest by growers in terms of visual impact and ease of use. Due to the variable nature of on-site chemical tests, we recommend that growers continue conducting chemical tests through laboratories. An increase in the level of soil health testing will help growers improve on-farm decision making and contribute significantly to overall agricultural sustainability.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2071-1050/11/21/6009/s1:
Figure S1. NRCS slake test correlated with machine aggregate slake test in 2015. CfH = conventional fertilizers and herbicide with a grass alleyway, StGr = straw mulch in the tree row with a grass alleyway, StTr = straw mulch in the tree row with a legume (birdsfoot trefoil, Lotus corniculatus) alleyway, TiGr = tillage in the tree rows with a grass alleyway.
Figure S2. Hose test correlated with machine aggregate slake test in 2015. CfH = conventional fertilizers and herbicide with a grass alleyway, StGr = straw mulch in the tree row with a grass alleyway, StTr = straw mulch in the tree row with a legume (birdsfoot trefoil, Lotus corniculatus) alleyway, TiGr = tillage in the tree rows with a grass alleyway.
Figure S3. Soil surface test correlated with machine aggregate slake test in 2015. CfH = conventional fertilizers and herbicide with a grass alleyway, StGr = straw mulch in the tree row with a grass alleyway, StTr = straw mulch in the tree row with a legume (birdsfoot trefoil, Lotus corniculatus) alleyway, TiGr = tillage in the tree rows with a grass alleyway.
Figure S4. LaMotte potassium test correlated with laboratory measured Olsen potassium in the integrated orchard. The LaMotte potassium scale is interpreted as: 0-120 lbs per acre for low (1-2), 120-200 lbs per acre for medium (3-5), 200+ lbs per acre for high (6+). HC = herbicides plus compost for nitrogen, HN = NPK fertilizers and herbicides with a grass alleyway, HNC = NPK fertilizers and herbicides, and converted to organic practices after tree establishment, PC = paper mulch, organic herbicide and compost for nitrogen, PR = paper mulch with reduced herbicide in addition to NPK fertilizers.
Figure S5. Mosser nitrogen test correlated with laboratory nitrate nitrogen in the organic orchard. LG = living mulch (low-growing shallow rooted alyssum, Lobularia maritima) in the tree row with a grass alleyway, LT = living mulch in the tree row with a legume alleyway, SG = straw mulch in the tree row with a grass alleyway, ST = straw mulch in the tree row with a legume (birdsfoot trefoil, Lotus corniculatus) alleyway, TG = tilled tree rows with a grass alleyway, WG = woven plastic mulch in the tree row with a grass alleyway.
Figure S6. Grower perceptions of their soil testing knowledge. Response to the question: How do you rate your knowledge on soil testing? Responses (101 received out of 400 mailed) from survey emailed to USU grower listserv. The majority of respondents were men between the ages of 55 and 64, although women represented 43% of respondents. The greatest number of respondents owned farms of 0.4-2 hectares.
Figure S7. What soil tests growers use to test their soil. Responses (101 received out of 400 mailed) from survey emailed to USU grower listserv. The majority of respondents were men between the ages of 55 and 64, although women represented 43% of respondents. The greatest number of respondents owned farms of 0.4-2 hectares.
Figure S8. Growers indicate the usefulness, affordability and ease of current soil testing strategies. The question was: To what extent do you agree that the following qualities are common traits among current soil tests? Answers are indicated in percentages. Responses (101 received out of 400 mailed) from survey emailed to USU grower listserv. The majority of respondents were men between the ages of 55 and 64, although women represented 43% of respondents. The greatest number of respondents owned farms of 0.4-2 hectares.
Figure S9. Percent of respondents interested to learn more with researchers on soil quality tests. Responses (101 received out of 400 mailed) from survey emailed to USU grower listserv. The majority of respondents were men between the ages of 55 and 64, although women represented 43% of respondents. The greatest number of respondents owned farms of 0.4-2 hectares.
Table S1. Pearson correlations between litterbag tests at number of weeks of burial and laboratory biological tests. BR = basal respiration, Cmic = microbial biomass C, DHA = dehydrogenase enzyme assay.
Table S2. Growers' perceptions of what healthy soil means to them. Responses (101 received out of 400 mailed) from survey emailed to USU grower listserv. The majority of respondents were men between the ages of 55 and 64, although women represented 43% of respondents. The greatest number of respondents owned farms of 0.4-2 hectares.
Table S3. Growers indicate why they test their soils. Responses (101 received out of 400 mailed) from survey emailed to USU grower listserv. The majority of respondents were men between the ages of

Figure 1. (a) Soil surface test correlated with microbial biomass in 2014. (b) Soil surface test correlated with microbial biomass in 2015. (c) Hose test correlated with microbial biomass in 2014. CfH = conventional fertilizers and herbicide with a grass alleyway, StGr = straw mulch in the tree row with a grass alleyway, StTr = straw mulch in the tree row with a legume (birdsfoot trefoil, Lotus corniculatus) alleyway, TiGr = tillage in the tree rows with a grass alleyway.

Figure 2. (a) Solvita® respiration correlated with microbial biomass in 2014. (b) Solvita® respiration correlated with microbial biomass in 2015. CfH = conventional fertilizers and herbicide with a grass alleyway, StGr = straw mulch in the tree row with a grass alleyway, StTr = straw mulch in the tree row with a legume (birdsfoot trefoil, Lotus corniculatus) alleyway, TiGr = tillage in the tree rows with a grass alleyway.

Figure 3. (a) Earthworm abundance test correlated with dehydrogenase enzyme assay as measured by reduction of triphenylformazan (TPF) per gram of soil per hour in 2014. (b) Biodiversity test correlated with laboratory measured soil basal respiration in 2014. CfH = conventional fertilizers and herbicide with a grass alleyway, StGr = straw mulch in the tree row with a grass alleyway, StTr = straw mulch in the tree row with a legume (birdsfoot trefoil, Lotus corniculatus) alleyway, TiGr = tillage in the tree rows with a grass alleyway.

Figure 4. Proportion of growers indicating which simple soil tests they would be most likely to use. The results are from one-on-one meetings with farmers (7) and a survey (21) conducted at a field demonstration of the tests. Some growers indicated more than one option.

Table 1. Summary of the treatments in the integrated and organic orchards. The treatments were laid out as a random incomplete block design (RIBD) with two factors (fertilizer and weed control in the integrated orchard and tree-row and alley in the organic orchard) with four replicates.

Table 2. Pearson's correlations between in-field aggregate stability tests and laboratory physical and biological tests in 2014 and 2015 (n = 4).
Note: DHA = dehydrogenase enzyme assay, MicC = microbial biomass carbon as measured by substrate reduced respiration.

Table 3. Pearson's correlations (n = 4) between in-field biological tests and laboratory biological tests in 2014 and 2015.

Table 4. Pearson's correlations (n = 4) between in-field chemical tests and laboratory chemical tests on conventional and organic orchard soils in 2014 and 2015.
Note: N = available soil nitrogen, P = available phosphorus, K = available potassium.
