Some Ultra-Processed Foods Are Needed for Nutrient Adequate Diets: Linear Programming Analyses of the Seattle Obesity Study

Typical diets include an assortment of unprocessed, processed, and ultra-processed foods, along with culinary ingredients. Linear programming (LP) can be used to generate nutritionally adequate food patterns that meet pre-defined nutrient guidelines. The present LP models were set to satisfy 22 nutrient standards, while minimizing deviation from the mean observed diet of the Seattle Obesity Study (SOS III) sample. Component foods from the Fred Hutch food frequency questionnaire comprised the market basket. LP models generated optimized 2000 kcal food patterns by selecting from all foods, unprocessed foods only, ultra-processed foods only, or some other combination. Optimized patterns created using all foods contained less fat, sugar, and salt, and more vegetables compared to the SOS III mean. Ultra-processed foods were the main sources of added sugar, saturated fat and sodium. Ultra-processed foods also contributed most vitamin E, thiamin, niacin, folate, and calcium, and were the main sources of plant protein. LP models failed to create optimal diets using unprocessed foods only and ultra-processed foods only: no mathematical solution was obtained. Relaxing the vitamin D criterion led to optimized diets based on unprocessed or ultra-processed foods only. However, food patterns created using unprocessed foods were significantly more expensive compared to those created using foods in the ultra-processed category. This work demonstrates that foods from all NOVA categories can contribute to a nutritionally adequate diet.


Introduction
The NOVA food classification scheme [1] assigns foods to four categories: ultra-processed, processed, unprocessed, and culinary ingredients. Accounting for >60% of energy in the American diet [2,3], ultra-processed foods have been linked to a variety of adverse health outcomes, including obesity [4], diabetes [5], hypertension [6], depression [7], cancer [8], and all-cause mortality [9]. There is also an economic dimension [10,11]. In the Seattle Obesity Study III (SOS III) [12], percent energy from ultra-processed foods was associated with lower diet quality but also with significantly lower food spending [10,11]. Study participants in the bottom decile of estimated diet cost ($216/person/month) derived 67.5% energy from ultra-processed foods as compared to 48.7% for those in the top decile ($370/person/month) [11]. The consumption of lower-cost ultra-processed foods depends on household socioeconomic status [10,11]. Unobserved socioeconomic factors may confound any observed relation between diet quality and health [13,14].
Ultra-processed foods are formally defined as industrial creations that contain chemical ingredients (flavors, stabilizers, or emulsifiers) not used in normal home cooking along with added fat, sugar, and salt [1]. Classed as ultra-processed are sugary beverages, sweets, desserts, pizza, and breakfast cereals, but also commercial yogurts, juices, and whole grain breads. Based on past analyses of 384 component foods of the Fred Hutch food frequency questionnaire, the ultra-processed classification captured not only fats and sweets (73%), the intended target, but virtually all grain foods (91%), along with most beans, nuts, and seeds (70%) [11]. Grains and cereals are rarely eaten raw; some degree of processing is usually involved. Assigned to the unprocessed category were most fruit, vegetables, meat, poultry, and fish [11].
Many foods classified as ultra-processed, including breads, ready to eat cereals, and some beverages, are fortified with vitamins and minerals [15]. It is possible that incorporating a proportion of such foods in the habitual diet allows for micronutrient requirements to be met at a more affordable cost. The mathematical technique of linear programming (LP) permits the construction of theoretical dietary patterns that satisfy multiple nutrient requirements under a variety of constraints [16][17][18][19]. Such models can then be used to test the feasibility of proposed dietary guidance. For example, one LP study [20] tested whether a stringent sodium reduction goal (1500 mg/day at the time) was compatible with nutrient-adequate diets. Whereas the 2300 mg/day sodium goal was feasible, the more stringent 1500 mg/day sodium goal was incompatible with nutrient adequacy and no mathematical solution was obtained [20]. Modeled food patterns have also been created to minimize diet cost [21], or to minimize diet-associated greenhouse gas emissions [18].
This project sought to determine whether nutrient-adequate food patterns could be created using unprocessed foods only, or using ultra-processed foods only. The relative cost of the optimized food patterns was of special interest. Nutrient composition and cost data for component foods of the Fred Hutch FFQ served as input. All foods were assigned into NOVA categories. Observed dietary intakes for 857 male and female adults came from the Seattle Obesity Study III [12]. The goal was to create optimized 2000 kcal/d food patterns that met standards for 22 nutrients while respecting existing food habits of SOS III participants.

Study Design and Participants
The SOS III was a population-based study of adult men and women living in King, Pierce, and Yakima Counties in Washington State [12]. County-specific recruitment schemes used address-based sampling stratified by residential property values along with community outreach to ensure broad representation by socioeconomic status and race/ethnicity. Recruitment and in-person data collection were conducted from July 2016-May 2017 by local staff at each site.
Eligible participants were aged 21-59 years, were the principal household food shoppers, had no mobility issues, and were not pregnant or breastfeeding. Written consent was provided during the in-person visit before study procedures began. Data were collected in English and Spanish (in Yakima County). All study procedures were approved by the Institutional Review Boards (IRBs) of the respective sites. Participants with implausible energy intakes were excluded from analyses: 12 participants reporting >5000 kcal and 3 participants reporting <500 kcal. The present analytical sample was based on 857 male and female respondents.
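The energy-intake exclusion rule can be expressed as a simple filter. The sketch below is illustrative only: the function name and the sample intakes are hypothetical, and it assumes that intakes exactly at 500 or 5000 kcal are retained, since only values below 500 or above 5000 are described as excluded.

```python
def plausible_energy(kcal, low=500, high=5000):
    """Return True when a reported energy intake lies inside the retained range."""
    return low <= kcal <= high


# Hypothetical reported intakes (kcal), not SOS III data.
reported = [1850, 5200, 430, 2400, 3100]
retained = [k for k in reported if plausible_energy(k)]
excluded = [k for k in reported if not plausible_energy(k)]
print(retained, excluded)
```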

FFQ Component Foods
The Fred Hutch Food Frequency Questionnaire (FFQ) was used to collect dietary intake data [22]. The FFQ consists of a list of 126 line-item foods and 384 component foods that are part of the FFQ structure but are not visible to the respondent. Each food's energy and nutrient content are calculated based on a weighted average of component foods. A nutrient database containing the nutrient content of each of the 384 component foods was derived from the USDA Food and Nutrient Database for Dietary Studies [10,23,24,25]. For the present calculations, we omitted drinking water, tea, coffee, diet soda, beer, wine and whiskey. The remaining 360 unique foods were then aggregated into seven major food groups: 1. fats/sugary beverages/non-grain sweets; 2. dairy; 3. fruits and fruit juices; 4. vegetables; 5. beans/nuts/seeds; 6. grains; 7. meats/poultry/fish.
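As a minimal sketch of the weighted-average step, the function below computes one nutrient value for a line item from its component foods. The weights and nutrient values are made up for illustration; the actual FFQ weighting scheme is not described here, so the normalization by total weight is an assumption.

```python
def line_item_nutrient(components):
    """Weighted average nutrient value for one FFQ line item.

    components: list of (weight, nutrient_per_100g) pairs.
    Weights are normalized here, so they need not sum to 1 (an assumption).
    """
    total_weight = sum(w for w, _ in components)
    if total_weight <= 0:
        raise ValueError("component weights must sum to a positive value")
    return sum(w * v for w, v in components) / total_weight


# Hypothetical line item with three component foods (weight, kcal per 100 g).
example = [(0.5, 60.0), (0.3, 45.0), (0.2, 120.0)]
energy_per_100g = line_item_nutrient(example)
print(round(energy_per_100g, 1))
```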
The same foods were also assigned to the four NOVA food processing categories as shown in Table 1. Fresh, dry, or frozen foods that had been subjected to minimal or no processing were treated as unprocessed. These included fresh meat, fish, fruits (such as apple, banana, apricots), salad, milk, vegetables (broccoli, green beans, potato), eggs, legumes, dried fruit (raisins and prunes), unsalted nuts, and seeds. Culinary ingredients were sugar, animal fats (butter) and oils (olive oil, canola oil, corn oil), and salt [21]. Based on published NOVA criteria, the addition of fat, sugar and salt to wholesome fresh foods transformed them into processed foods. Among processed foods were cheese, ham, canned fruits, and canned beans. Among FFQ component foods that were classified as ultra-processed were commercial breads, jams and jelly, ready to eat and other breakfast cereals, sweet snacks (cookies and cakes), pizza, potato chips or tortilla chips, soft drinks (sodas and fruit drinks), French fries, sauces (ketchup, mayonnaise), desserts (ice cream, frozen yogurt, sherbet), fast food meals, juices and soups.

Table 1. NOVA food processing levels.
Food Processing Level | Description
1 | Unprocessed/minimally processed
2 | Processed culinary ingredients
3 | Processed foods
4 | Ultra-processed foods

Energy and Nutrient Intakes and Estimated Diet Cost
Estimates of individual-level daily diet cost were obtained by joining dietary intake data with county-specific retail prices for 360 FFQ component foods. Retail prices were obtained from large supermarkets in King, Pierce and Yakima counties following standard and published procedures [10,11]. Retail prices, converted to dollars per 100 g edible portion, were added to the nutrient database to parallel nutrient values, which are expressed as amounts (g/mg/IU) per 100 g edible portion. In this way, each of the 360 foods in the nutrient database was associated with 45 nutrient vectors and a single cost vector. The procedures for estimating diet costs from the FFQ have been described previously [26]. The calculated cost of each modeled food pattern was expressed per 2000 kcal/d.
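The cost calculation can be sketched as follows, assuming intakes in g/d and prices and energy per 100 g edible portion as described above. The function name, the food names, and all numbers are hypothetical and serve only to show the scaling to a 2000 kcal basis.

```python
def cost_per_2000_kcal(intake_g, price_per_100g, kcal_per_100g):
    """Scale the cost of a food pattern to a 2000 kcal/d basis.

    Each argument is a dict keyed by food name: intake in g/d,
    price in dollars per 100 g, energy in kcal per 100 g.
    """
    cost = sum(intake_g[f] / 100.0 * price_per_100g[f] for f in intake_g)
    kcal = sum(intake_g[f] / 100.0 * kcal_per_100g[f] for f in intake_g)
    if kcal <= 0:
        raise ValueError("pattern provides no energy")
    return cost * 2000.0 / kcal


# Hypothetical two-food pattern; prices and energy values are illustrative only.
intake = {"bread": 200.0, "apple": 300.0}
price = {"bread": 0.50, "apple": 0.40}
energy = {"bread": 250.0, "apple": 50.0}
daily_cost = cost_per_2000_kcal(intake, price, energy)
print(round(daily_cost, 2))
```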

Linear Programming to Generate a Nutrient-Adequate Diet
Linear programming (LP) models used to optimize dietary patterns have an objective function, a set of nutritional goals, and a set of consumption constraints. The objective function typically measures the deviation from current eating behaviors as the model seeks a nutritionally adequate diet. Foods in different amounts are then selected from the market basket to create a food pattern that satisfies a minimum set, or an optimal range, of nutrient goals [16].
The linear objective function can vary depending on study purpose: some studies have minimized the deviation from current eating behaviors [17][18][19][20][27][28][29], while other studies have maximized or minimized total energy [30] or minimized cost [16]. Table 2 shows observed data from SOS III for the entire adult sample (n = 857). The 22 unique nutrient constraints that needed to be met are also shown in Table 2. We label each of these nutrients from 1 to 22 in the order displayed in this table. To develop our linear program, we first defined the following variables. We use the keywords "observed" and "optimized" to refer to quantities from our dataset and from our linear program respectively. The observed quantities were computed from averaging consumption over all participants in the SOS III dataset after excluding select outliers.
For i = 1 to 7 and j = 1 to 22, let:
(1) $g_i^o$ = observed consumption of food group i, in grams
(2) $g_i^p$ = optimized consumption of food group i, in grams
(3) $n_{ij}^o$ = observed amount of nutrient j supplied by food group i
(4) $n_{ij}^p$ = optimized amount of nutrient j supplied by food group i
The objective function is important and helps govern the solution output from the model. We defined the following three objective functions:

Grams Objective Function
This objective function minimizes the relative difference between the observed and optimized gram quantities from each of the food groups. Formally, this is defined as:
$$\min \sum_{i=1}^{7} \frac{\left| g_i^p - g_i^o \right|}{g_i^o}$$
where $g_i^o$ and $g_i^p$ are the observed and optimized gram quantities of food group $i$. This has been commonly utilized in previous studies to ensure that the model-generated diet has food group weights that are similar to the average diet. However, the total weight of foods in each food group may not assure optimal nutrient distribution, since each major food group contains a variety of different food subgroups and categories. This may result in a model output that differs substantially from the SOS III mean.

Nutrient Objective Function
This objective function minimizes the relative difference between the observed and optimized nutrient quantities from each of the food groups. Formally, this is defined as:
$$\min \sum_{i=1}^{7} \sum_{j=1}^{22} \frac{\left| n_{ij}^p - n_{ij}^o \right|}{n_{ij}^o}$$
where $n_{ij}^o$ and $n_{ij}^p$ are the observed and optimized amounts of nutrient $j$ supplied by food group $i$. This objective function is more precise than the grams objective function; by minimizing the nutrient differences in each food group, the model output using this objective function should be closer to the actual food basket consumed by the average person.

Grams and Nutrient Objective Function
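This objective function combines the previous two objective functions, utilizing both nutrient and gram quantities. Formally, this is defined as:
$$\min \left( \sum_{i=1}^{7} \frac{\left| g_i^p - g_i^o \right|}{g_i^o} + \sum_{i=1}^{7} \sum_{j=1}^{22} \frac{\left| n_{ij}^p - n_{ij}^o \right|}{n_{ij}^o} \right)$$
where $g_i^o$, $g_i^p$ are the observed and optimized gram quantities of food group $i$, and $n_{ij}^o$, $n_{ij}^p$ are the observed and optimized amounts of nutrient $j$ supplied by food group $i$. Because it draws on both gram and nutrient quantities, the resulting diet should remain similar to the average consumed diet. We used this objective function in our modeling.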

Modeling
The objective function was supplemented with nutrient constraints on the modeled food patterns, as defined in Table 2. Thus, the output of the linear programming was a set of foods and their respective quantities that satisfied the nutritional requirements and optimized the current objective function.
We used linear programming to construct optimized diets under the three objective functions identified above, restricting the market basket to foods at specified processing levels. Table 3 lists the combinations of food processing levels attempted.
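To illustrate how such a model can be set up, the following sketch solves a toy version with SciPy's `linprog`. It is not the authors' implementation: it uses only the gram-based deviation term, linearizes each absolute relative deviation with an auxiliary variable (a standard LP device, assumed here), and every food, nutrient value, and bound other than the 2000 kcal target, the 2300 mg sodium limit, and the 20 mcg vitamin D floor is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 4-food basket; every number below is illustrative, not SOS III data.
nova = np.array([4, 4, 1, 1])                  # NOVA level of each food
group = np.array([0, 0, 1, 1])                 # food group index of each food
kcal = np.array([380.0, 270.0, 200.0, 40.0])   # kcal per 100 g
vit_d = np.array([3.0, 0.0, 10.0, 0.0])        # mcg per 100 g
sodium = np.array([400.0, 350.0, 60.0, 30.0])  # mg per 100 g
g_obs = np.array([3.0, 5.0])                   # observed group amounts, 100 g units


def optimize_pattern(levels, vit_d_min=20.0):
    """Minimize summed relative gram deviation over food groups for one basket."""
    keep = np.isin(nova, levels)
    n_f, n_g = int(keep.sum()), len(g_obs)
    # G[i, f] = 1/g_obs[i] when retained food f belongs to group i, else 0.
    G = (group[keep][None, :] == np.arange(n_g)[:, None]) / g_obs[:, None]
    I = np.eye(n_g)
    pad = np.zeros(n_g)
    # Variables: food amounts x (100 g units) followed by deviations d.
    c = np.concatenate([np.zeros(n_f), np.ones(n_g)])
    A_ub = np.vstack([
        np.hstack([G, -I]),                    #  (G x - 1) <= d
        np.hstack([-G, -I]),                   # -(G x - 1) <= d
        np.concatenate([-vit_d[keep], pad]),   # vitamin D >= vit_d_min
        np.concatenate([sodium[keep], pad]),   # sodium <= 2300 mg
    ])
    b_ub = np.concatenate([np.ones(n_g), -np.ones(n_g), [-vit_d_min, 2300.0]])
    A_eq = np.concatenate([kcal[keep], pad])[None, :]  # energy = 2000 kcal
    return linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[2000.0],
                   bounds=(0, None), method="highs")


all_foods = optimize_pattern([1, 2, 3, 4])
upf_strict = optimize_pattern([4])
upf_relaxed = optimize_pattern([4], vit_d_min=10.0)
print(all_foods.success, upf_strict.success, upf_relaxed.success)
```

With these toy numbers, the ultra-processed-only basket cannot reach 20 mcg of vitamin D within 2000 kcal, because the only vitamin D source in that basket supplies 3 mcg per 100 g and 380 kcal per 100 g. Lowering `vit_d_min` to 10 mcg makes the same basket feasible. The `success` flag on the result is how a "no mathematical solution" outcome would surface in this sketch.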

Table 3. Market basket combinations attempted, by food processing level.
Food Processing Levels Included | Diet Description
1 | Unprocessed/minimally processed foods
1,2 | Unprocessed/minimally processed foods and processed ingredients
1,2,3 | All foods excluding ultra-processed foods
1,2,3,4 | All foods
4,3,2 | All foods excluding unprocessed/minimally processed foods
4,3 | Ultra-processed and highly processed foods
4 | Ultra-processed foods
For each of these diets, we first determined whether a solution was feasible; if it was not, then there was no way to create a nutrient-adequate diet using the current market basket of Fred Hutch FFQ component foods. If a solution was found, we divided the objective function by 161, the number of items in our objective function summation, to determine the model's average deviation from the SOS III diet.
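The basket restriction and the normalization step can be sketched in a few lines. The divisor of 161 is taken from the text; reading it as 7 gram terms plus 7 x 22 nutrient terms is an inference from the variable ranges, not something stated explicitly. Food names and NOVA assignments in the example are hypothetical.

```python
# NOVA levels included in each market basket combination attempted.
BASKETS = [(1,), (1, 2), (1, 2, 3), (1, 2, 3, 4), (4, 3, 2), (4, 3), (4,)]

N_GROUPS, N_NUTRIENTS = 7, 22
# Assumed breakdown: 7 gram terms plus 7 x 22 nutrient terms.
N_TERMS = N_GROUPS + N_GROUPS * N_NUTRIENTS


def basket_foods(food_levels, levels):
    """Return the names of foods whose NOVA level belongs to the basket."""
    return [name for name, lvl in food_levels.items() if lvl in levels]


def average_deviation(objective_value):
    """Mean relative deviation per term of the combined objective."""
    return objective_value / N_TERMS


# Hypothetical foods and NOVA assignments, for illustration only.
foods = {"apple": 1, "olive oil": 2, "cheese": 3, "breakfast cereal": 4}
for levels in BASKETS:
    print(levels, basket_foods(foods, levels))
```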

SOS III Observed Diets
We first describe the population data from the SOS III dataset in Table 4. The population was approximately evenly spread among age groups, and was primarily women. There were comparable proportions of non-Hispanic white and Hispanic people. Most of the population were college graduates or in higher education. Finally, household incomes were relatively uniformly distributed.
Table 5 shows that LP solutions were obtained only under specific market basket conditions. First, an LP solution was obtained when the market basket contained foods from all four NOVA categories: ultra-processed, processed, unprocessed, and culinary ingredients. A solution was also obtained using a market basket that contained both processed and ultra-processed foods. However, in this analysis based on FFQ component foods, a market basket limited to only ultra-processed foods did not allow for the creation of nutrient-adequate food patterns. Furthermore, limiting the market basket to unprocessed foods alone did not yield a mathematical solution. Nutrient-adequate food patterns were only viable when both processed and ultra-processed foods were included.
Table 6 displays the nutrient composition of the three models that arrived at a mathematical solution and compares them to the SOS III average. Note that Models 2 and 3 produced the same output, so their results are condensed into one column. All three models produced a food pattern with exactly 2000 kcal, similar to the SOS III average of 1953 kcal. Energy density of the observed diet was 1.16 kcal/g. Energy density in Model 1 (all food groups) was 1.14 kcal/g. However, in Models 2 & 3, which excluded unprocessed foods, energy density was 1.51 kcal/g. Lower energy density has been used as a proxy indicator of higher diet quality.

Model Outputs
The modeled food patterns were of higher quality than the observed diet of SOS III participants. All three models were slightly lower in the amounts of fats and sweets and of grains compared to the SOS III average. In addition, all three models contained a significantly smaller quantity of meat, poultry, and fish as compared to the observed SOS III diets. Whereas Model 1 significantly increased the amount of vegetables compared to the SOS III average, Models 2 & 3 significantly decreased the quantity of vegetables. A similar pattern emerged for milk and dairy: Model 1 increased, and Models 2 & 3 decreased, the quantity of milk and dairy in comparison to the SOS III average. Model 1 had quantities of beans, nuts, and seeds comparable to the SOS III average. However, Models 2 & 3 had almost double the consumption compared to the SOS III observed diets. Model 1 had similar fruit quantities to the SOS III average, while Models 2 & 3 increased the amounts of fruits. Finally, the calculated cost of the dietary patterns created by all three models was comparable to that of the observed SOS III diet.
The nutrient composition and cost of the models in comparison with the SOS III mean diet are described next in Table 7. The optimized food patterns were more nutrient dense than the observed SOS III diet. First, the modeled food patterns were approximately equicaloric. Second, the amounts of total fat, saturated fat, MUFA, and PUFA were reduced, as were the amounts of total and added sugars and sodium. That was consistent with the imposition of specific constraints on the nutrients to limit. Third, each modeled food pattern was higher in protein, fiber, vitamin A (RAE), vitamin D, vitamin E, vitamin C, thiamin, riboflavin, niacin, calcium, iron, and substantially higher in potassium, as compared to the SOS III mean. The modeled food patterns were slightly lower in zinc and vitamin B-12. Most importantly, all three models were within each of the 22 nutrient constraints defined above. Table 8 compares the percent composition of energy and nutrients from unprocessed and ultra-processed foods from the SOS III mean and from Model 1. Table A1 shows the full percent splits of energy and nutrients by NOVA food processing categories for all models.

A Focus on Foods in the Ultra-Processed Category
In the SOS III sample and in all the models, more than half of dietary energy came from ultra-processed foods. Since ultra-processed foods were more energy dense, they contributed less than half of the weight of the observed diet and modeled food patterns. As shown by the observed diets and modeled food patterns (Model 1), ultra-processed foods accounted for the bulk of added sugar (65.0%), total sugar (98.5%), sodium (87.1%), carbohydrates (63.4%) and saturated fat (46.1%). Protein was more evenly distributed among unprocessed and ultra-processed foods in the SOS III observed diets, while in Model 1 it was slightly skewed towards unprocessed foods. Substantial differences were observed in protein by food source. In the SOS III diets, most of the animal protein (76.5%) and cholesterol (68.3%) came from unprocessed meat, poultry and fish. By contrast, most plant protein came from ultra-processed foods. In addition to saturated fat, ultra-processed foods accounted for the bulk of MUFA and PUFA. The same distributions were observed in the modeled food patterns (Model 1).
Whereas fiber came mostly from ultra-processed foods in the SOS III sample, Model 1 had an equal contribution of fiber from unprocessed and ultra-processed foods.
Although ultra-processed foods were the principal dietary sources of added sugar, sodium and saturated fat, they also provided substantial amounts of vitamin E, thiamin, niacin, folate, and calcium. These micronutrients mostly came from ultra-processed foods both in the observed SOS III diets and in the modeled food patterns.
On the other hand, some vitamins came largely from unprocessed foods. Those included vitamin C, vitamin D, vitamin B-6 and vitamin B-12. That finding held for both the observed SOS III diets and for the modeled food patterns. Notably, in the SOS III, 56.7% of vitamin B-12 came from unprocessed foods; that percentage increased to 84.6% in Model 1. By contrast, vitamin A came mostly from unprocessed foods in SOS III diets, but from ultra-processed foods in modeled food patterns.
Iron came primarily from ultra-processed foods in both the SOS III diets and the modeled food patterns, while zinc was more evenly split. Although potassium and water were evenly split in SOS III diets, they came mainly from unprocessed foods in Model 1. For the observed SOS III diets, estimated diet cost was evenly split between unprocessed and ultra-processed foods. In Model 1, ultra-processed foods slightly edged out the unprocessed foods with 53.7% of the cost. In Models 2 & 3, much of the cost came from the ultra-processed foods.
Table 9 shows the top five foods in each of the food groups, ordered by energy, for Model 1 and for Models 2 & 3. Models 2 & 3 contained fewer foods, particularly in the milk and dairy and the meat, poultry, and fish groups. In addition, many of the foods contained in this pattern were highly fortified foods. The food pattern from Model 1 contained a more balanced assortment of foods, with high quantities of vegetables, grains, and meats. However, it also contained some fortified ultra-processed foods, such as the vitamin C-fortified drink. The full food patterns for these models are shown in Tables A2 and A3.

Lowering Vitamin D Requirements
Section 3.2 demonstrated that an LP solution was obtained under only three market basket conditions. An overly high vitamin D requirement in Section 3.2 could have prevented LP solutions for the other models. We therefore reduced the vitamin D requirement by 50%, from 20 mcg to 10 mcg, and kept all other requirements the same. After re-running the models, we obtained Table 10, which shows that model-generated patterns were feasible for all market basket combinations.
Table 11 shows the gram and energy distributions, split by various categories, for the ultra-processed-exclusive food pattern (Model 4) and the unprocessed-exclusive food pattern (Model 5), both with the reduced vitamin D requirement. The ultra-processed pattern was more energy dense: both models provided 2000 kcal, but Model 4 weighed only 1327.3 g compared with 2562.8 g for Model 5. Notably, both models contained fewer fats and sweets than the SOS III mean; Model 5 contained no foods in this category. While Model 4 contained less milk and dairy than the SOS III mean in grams, it had comparable energy. Similarly, Model 5 had almost double the gram quantity of milk and dairy, but an energy quantity nearly identical to the SOS III mean. Model 5 significantly increased the quantity of vegetables, achieving almost a threefold increase compared with the SOS III mean, whereas Model 4 decreased the gram quantity of vegetables but raised the total energy. Both models had quantities of beans, nuts, and seeds similar to the population mean. Model 4 had a higher caloric intake of grains than the population mean, while Model 5 was similar. Model 4 had a significantly lower amount of meat, poultry, and fish than the SOS III mean, while Model 5 had a slightly higher quantity. In the ultra-processed-only diet (Model 4), slightly more of the protein came from plant sources. For the unprocessed-only diet (Model 5), this was reversed.
Model 4, at a cost of $9.20, was close to the population mean of $9.10 and was significantly cheaper than Model 5, which had a cost of $18.48.
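The reported weights and costs can be turned into energy densities and a cost ratio with simple arithmetic. The sketch below uses only the figures quoted in this section; the dictionary layout and variable names are illustrative.

```python
# Figures reported for the reduced vitamin D models.
models = {
    "Model 4 (ultra-processed only)": {"kcal": 2000.0, "grams": 1327.3, "cost": 9.20},
    "Model 5 (unprocessed only)": {"kcal": 2000.0, "grams": 2562.8, "cost": 18.48},
}


def energy_density(kcal, grams):
    """Energy density in kcal per gram of food pattern."""
    return kcal / grams


density = {name: energy_density(m["kcal"], m["grams"]) for name, m in models.items()}
cost_ratio = (models["Model 5 (unprocessed only)"]["cost"]
              / models["Model 4 (ultra-processed only)"]["cost"])

for name, value in density.items():
    print(f"{name}: {value:.2f} kcal/g")
print(f"Model 5 / Model 4 cost ratio: {cost_ratio:.2f}")
```

On these figures, Model 4 comes out at about 1.51 kcal/g and Model 5 at about 0.78 kcal/g, and the unprocessed-only pattern costs roughly twice as much per 2000 kcal.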
Table 12 compares the specific foods chosen by the ultra-processed-only model (Model 4) and the unprocessed-only model (Model 5). Model 4 contained a large quantity of foods, including a high quantity of potato chips and fortified drinks. Model 5 contained more fresh fruits and vegetables. Both models utilized fish as the primary source of meat, although Model 4 supplemented protein with tofu, while Model 5 used high quantities of fresh fish. Model 5 contained no fats and sweets. The full food patterns for these models are shown in Tables A4 and A5.

Discussion
In this work, we used linear programming to generate nutritionally adequate food patterns similar to diets from a sample of people in three counties in Washington. First, we demonstrated that a combination of unprocessed and ultra-processed foods was required for the model to generate a feasible food pattern. Notably, models consisting of just unprocessed foods or just ultra-processed foods did not have any solutions. This was a surprising result, as ultra-processed foods have historically been disparaged due to their negative health effects [11]. However, we show they are an essential component of any food pattern that seeks to be nutritionally adequate.
The present analyses clearly show that foods that fall into the NOVA ultra-processed category were the principal sources of added sugar, sodium and saturated fat. In that respect, the present results are consistent with past reports [3,4]. On the other hand, those same foods were also among the major contributors of vitamin E, thiamin, niacin, folate, and calcium, and were the main sources of plant protein. That was due to several factors. First, vitamin E is contained in oils used in food processing. Second, ultra-processed white bread has been enriched with B-vitamins and folic acid. Third, calcium is provided not only by yogurt and pizza, but also by fortified beverages. Plant proteins are provided by grain-based products, most of which fall into the ultra-processed category. The present data point to the need for further studies on the contribution of processed fortified foods to the American diet.
In the linear programming analyses, vitamin D was the limiting nutrient. Lowering vitamin D requirements to 10 mcg allowed for the creation of modeled food patterns from the unprocessed and the ultra-processed market baskets respectively. The present finding that the current vitamin D requirements could not be met by any combination of fresh, unprocessed foods suggests that the NOVA scheme misclassifies dairy products, placing many in the processed and ultra-processed categories. Vitamin D and calcium requirements are met more easily by the inclusion of dairy in the habitual diet. Fortification of processed foods may play an additional role.
Nutritionally adequate food patterns constructed using a market basket of ultra-processed foods were higher in plant proteins than in animal proteins. By contrast, food patterns constructed using unprocessed foods only were higher in animal proteins from meat and fish. This would suggest that the major sources of desirable plant proteins are foods that fall into the ultra-processed category, including extruded grains and legumes. Modeled food patterns from the ultra-processed food market basket were also more energy dense. This was due to vegetables being replaced by grains and fortified beverages. These food patterns were also lower in cost, consistent with past data on the low cost of many ultra-processed foods [10].
Although modeled food patterns based on unprocessed foods only were feasible (see Model 5), such patterns incorporated high amounts of fish and vegetables to meet all nutrient constraints. The amounts of fish and vegetables were significantly above the observed diets of the SOS III cohort. Furthermore, modeled food patterns from unprocessed foods only were prohibitively expensive at a daily cost of $18.48.
The work here has several limitations. First, the SOS III dataset consists primarily of women. The optimized food pattern we generate based on the SOS III sample could be improved by using a more representative sample of men and women. Second, the food patterns we generated were limited to the 360 foods extracted from the FFQ. The FFQ component foods that were included in the models were, of necessity, a very limited proxy for the much broader US food supply. Third, the component foods do not contain indicators of fortification, preventing an investigation of foods by fortification status.
Future research directions should include the application of linear programming models with modified nutrient constraints, as the quality of the modeled food pattern may change significantly based on specific constraints. In addition, food patterns should be varied by changing the objective function in the linear programming models. In this work, we prioritized developing a food pattern consistent with what was observed in the population we studied. In future work, other methods could be employed, such as changing the objective function to minimize cost or to minimize differences between the generated food patterns' nutrient values and the respective RDAs. In addition, the recent increase in vitamin D requirements may justify further exploring the role of fortified foods in the American diet. Together, these findings point to the role of food manufacturers and their potential for the reformulation of the nutrient density of the food supply.

Data Availability Statement:
The datasets generated and/or analyzed during the current study are not publicly available because the data are part of an ongoing study.

Conflicts of Interest:
Adam Drewnowski has received grants, honoraria, and consulting fees from numerous food, beverage, and ingredient companies and from other commercial and nonprofit entities with an interest in diet quality and nutrient density of foods. The University of Washington receives research funding from public and private sectors. None of the other authors has any conflict of interest to declare.

Abbreviations
The following abbreviations are used in this manuscript:

LP | Linear Programming
SOS III | Seattle Obesity Study III

Appendix A
The appendix consists of five tables. Table A1 contains the percent composition of nutrients from the SOS III mean and in Models 1, 2, and 3. The subsequent tables list the foods contained in the patterns generated by different models, including cost, processing level, mass, and energy information stratified by food group. From the initial experimentation, Tables A2 and A3 describe the food patterns for Model 1 and Models 2 & 3 respectively. Tables A4 and A5 describe the food patterns for Models 4 and 5 respectively, when the vitamin D requirement is dropped to 10 mcg.