Dietary Patterns and Breast Cancer Risk in Black Urban South African Women: The SABC Study

A total of 396 breast cancer cases and 396 population-based controls from the South African Breast Cancer study (SABC), matched on age and demographic settings, were included. Validated questionnaires were used to collect dietary and epidemiological data. Dietary patterns were derived using principal component analysis with a covariance matrix from 33 food groups. Odds ratios and 95% confidence intervals were estimated using conditional logistic regression. A traditional, a cereal-dairy breakfast and a processed food dietary pattern were identified, which together explained 40.3% of the total variance in the diet. After adjusting for potential confounders, the traditional dietary pattern and cereal-dairy breakfast dietary pattern were inversely associated with breast cancer risk (highest tertile versus lowest tertile) (OR = 0.72, 95%CI: 0.57–0.89, p-trend = 0.004 and OR = 0.73, 95%CI: 0.59–0.90, p-trend = 0.004, respectively). The processed food dietary pattern was not significantly associated with breast cancer risk. The results of this study show that a traditional dietary pattern and a cereal-dairy breakfast dietary pattern may reduce the risk of developing breast cancer in this population.


Introduction
Diets in South Africa have shifted from nutritious traditional meals towards diets characterized by higher consumption of nutrient-poor, energy-dense foods [1–4]. Diets comprising more nutrient-poor and energy-dense foods have been associated with an increased risk of obesity and other noncommunicable diseases, such as breast cancer [5,6]. Breast cancer is the most frequently diagnosed cancer in South African women, and mortality rates are rapidly increasing [7]. A lack of early cancer screening and costly cancer treatment contribute to high mortality rates in South Africa [8]. Preventing breast cancer is therefore a priority to reduce high incidence rates and the burden on the public healthcare system in South Africa [8].
It is already established that modifiable lifestyle factors such as diet, body weight and physical activity play a crucial role in cancer prevention [9]. However, research on the association between diet and breast cancer risk in South Africa is limited. A previous study conducted in black women from Soweto, South Africa, showed that higher adherence to an adapted version of the 2018 World Cancer Research Fund/American Institute for Cancer Research's Cancer Prevention Recommendations was inversely associated with breast cancer risk [10]. Another study conducted in black women from Soweto investigated the association between the degree of food processing and breast cancer risk. In this study, higher intakes of minimally processed or unprocessed food were inversely associated with breast cancer risk, while processed and ultra-processed foods did not show any significant association with breast cancer risk [11]. While these studies contribute valuable insights into the diets of black women from Soweto, more research is required to understand the association between diet and breast cancer risk in South Africa.
There are various methods to investigate the association between dietary intake and cancer risk, and dietary pattern analysis has emerged as a complementary method to investigating individual nutrients or foods [12], as it allows for an investigation of the effects of overall diets [13]. Dietary patterns can be derived either a priori or a posteriori [14]. The a priori method refers to the use of a scoring system (healthy eating/diet quality index) to calculate adherence to a predefined dietary pattern, whereas the a posteriori method (a data-driven approach) refers to the use of statistical modelling techniques such as principal component analysis or exploratory factor analysis to derive dietary patterns empirically [12,13].
Studies investigating a posteriori dietary patterns in association with noncommunicable diseases in the adult South African population are limited. One of these studies showed that dietary patterns comprising predominantly processed foods were positively associated with the risk of being overweight or obese [15]. Obesity is particularly of concern since it is associated with the risk of several noncommunicable diseases, including postmenopausal breast cancer [9]. The association between a posteriori dietary patterns and breast cancer risk in black South African women has not yet been investigated. The aim of this study is, therefore, to determine the association between data-driven dietary patterns and breast cancer risk in black urban women residing in Soweto, South Africa.

Study Population
The subjects included in this study were part of the South African Breast Cancer (SABC) study [16–18]. Breast cancer cases were black women with newly diagnosed breast cancer, recruited at the Chris Hani Baragwanath Academic Hospital prior to any cancer treatment. Cases were recruited as soon as possible after the cancer diagnosis. Controls were healthy (not admitted to hospital) black women who were unrelated to the breast cancer cases, had no history of cancer diagnosis and were matched to the cases only by age (±5 years) and area of residence. The inclusion and exclusion criteria for breast cancer cases and controls and the recruitment of breast cancer cases have been described elsewhere [16,17]. A total of 396 cases and 396 controls were included in the current analyses.

Personal Information and Lifestyle Status/History
Trained investigators and fieldworkers conducted face-to-face interviews at the time of recruitment using previously validated questionnaires [19,20]. Information regarding socioeconomic and demographic characteristics (income, education and other household amenities) was self-reported. Detailed information on health history, ethnicity, reproductive risk factors, breast health, family history of cancer, physical activity and smoking habits was further collected with a questionnaire. Anthropometric measurements (height, sitting height, weight and waist circumference) were performed according to a standardized protocol, and body mass index (BMI) was calculated as kg/m².

Habitual Dietary Intake
Participants were asked about their habitual dietary intake over the past month, and dietary intake data were collected as soon as possible after breast cancer diagnosis (at recruitment) and before any cancer treatment. A validated and culture-specific quantified food frequency questionnaire (QFFQ) was used, together with food models, food portion pictures and household utensils alongside the South African Food Composition Tables, to determine habitual dietary intake [21–24]. A detailed description of the QFFQ and the method used to determine the daily intakes is given elsewhere [10]. The nutrient and energy intakes (EI) were calculated by multiplying the daily intake of each food item by its nutrient and energy content (expressed per 100 g in the South African Food Composition Tables) and then summing the contributions of all food items [24].
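The intake calculation described above can be illustrated with a minimal sketch. The food names and composition values below are hypothetical placeholders, not values from the South African Food Composition Tables, and the function name is introduced here only for illustration. Because the composition values are expressed per 100 g, the daily amount in grams is divided by 100 before multiplying.

```python
# Hypothetical composition values per 100 g of food (illustration only;
# real values would come from the South African Food Composition Tables).
FOOD_COMPOSITION = {
    "maize_meal_porridge": {"energy_kj": 450.0, "protein_g": 2.5},
    "brown_bread": {"energy_kj": 1000.0, "protein_g": 9.0},
}


def daily_nutrient_intake(daily_grams, composition):
    """Sum nutrient contributions over all foods.

    daily_grams: mapping of food item -> grams consumed per day.
    composition: mapping of food item -> nutrient content per 100 g.
    """
    totals = {}
    for food, grams in daily_grams.items():
        for nutrient, per_100g in composition[food].items():
            # Scale the per-100 g value to the amount actually eaten.
            contribution = grams / 100.0 * per_100g
            totals[nutrient] = totals.get(nutrient, 0.0) + contribution
    return totals


# Hypothetical participant: 300 g porridge and 120 g bread per day.
intake = daily_nutrient_intake(
    {"maize_meal_porridge": 300.0, "brown_bread": 120.0},
    FOOD_COMPOSITION,
)
print(intake)
```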

Categorizing of Food Groups to Determine Dietary Patterns
All individual foods and beverages contained in the QFFQ were categorized into 33 food groups (measured in grams/day) based on similarity of the nutrient content (e.g., protein, saturated fat, unsaturated fat, type of carbohydrate, added sugar, fibre or micronutrients). Certain individual foods were classified as individual groups on their own since they were consumed often within the population (bread, maize meal, organ/offal meat and peanuts/peanut butter).

Ethical Approval
The International Agency for Research on Cancer and the University of the Witwatersrand Committee for Research on Human Subjects granted ethical approval for the South African Breast Cancer study (M140980). Permission to conduct research at Chris Hani Baragwanath Academic Hospital was obtained from the Gauteng Province Medical Advisory Committee. All subjects gave written informed consent prior to participation.

Statistical Analysis
Descriptive analyses were performed, and differences between cases and controls were assessed using the paired-samples t-test (normally distributed data, presented as mean ± standard deviation) and the Wilcoxon signed-rank test (non-normally distributed data, presented as median and 25th and 75th percentiles) for continuous variables and the paired Chi-square test for categorical variables (presented as percentages). BMI was calculated from measured weight and height (kg/m²) according to World Health Organization specifications.
Principal component analysis with a covariance matrix was used to derive a number of independent linear combinations of the food groups, which were retained as habitual dietary patterns. This method reduces correlated foods or food groups into a smaller set of principal components (dietary patterns) [25]. Although it is preferred that dietary patterns be uncorrelated, an individual's diet may consist of two different patterns at once [26]. For example, a pattern could be characterized by high loadings of vegetables and fruits, together with a pattern characterized by high loadings of refined grains or highly processed foods. For this reason, principal component analysis was the best fit for our data. Normality of food group variables was tested using P-P plots. When food variables were not normally distributed, log-transformation was performed to achieve normality. The extraction of principal components was followed by orthogonal (varimax) rotation to enhance the interpretability of the dietary patterns [27]. Three components were retained based on a minimum eigenvalue of 1.0, visual inspection of the scree plot, the percentage variance explained and interpretability of the components. Each component was defined by a subset of at least three food groups with an absolute factor loading of at least 0.21 (that is, ≤ −0.21 or ≥ 0.21) [27]. If a food group had a factor loading ≥ 0.21 in more than one pattern, only the pattern with the highest factor loading was considered, since individuals tend to follow the pattern with the highest score. To assess the suitability of applying principal component analysis to our study sample, the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity were calculated. The obtained KMO value was 0.953 (values close to 1 indicate very good inter-correlation), and Bartlett's test of sphericity was significant (p < 0.001), indicating that the food group variables were sufficiently inter-correlated for principal component analysis.
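As a rough illustration of the derivation steps (log-transformation, principal component analysis on the covariance matrix, varimax rotation), the following sketch uses NumPy on simulated data. It is not the authors' analysis code: the simulated intakes, the use of log(1 + x) as the transformation, and the function names are assumptions made for this example, and the number of components is passed in directly rather than chosen from eigenvalues, the scree plot and interpretability. Pattern scores are not computed here.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a (variables x components) loading matrix."""
    n_vars, n_comp = loadings.shape
    rotation = np.eye(n_comp)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag(np.sum(rotated**2, axis=0))
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated**3 - rotated @ col_ss / n_vars)
        )
        rotation = u @ vt
        new_criterion = s.sum()
        # Stop once the criterion no longer improves meaningfully.
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation


def derive_patterns(intakes, n_components):
    """PCA on the covariance matrix of log-transformed intakes, then varimax."""
    logged = np.log1p(intakes)  # assumed transform; handles zero intakes
    centered = logged - logged.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    # Loadings: eigenvectors scaled by the square root of their eigenvalues.
    unrotated = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    return unrotated, varimax(unrotated), explained[:n_components]


# Simulated intakes (grams/day): 200 participants x 6 food groups.
rng = np.random.default_rng(0)
simulated = rng.lognormal(mean=3.0, sigma=1.0, size=(200, 6))
unrotated, rotated, explained = derive_patterns(simulated, n_components=2)
print(np.round(rotated, 2))
print(np.round(explained, 3))
```

Because the rotation is orthogonal, it redistributes variance between the retained components without changing the total variance they explain for each food group.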

Determining the Association between Dietary Patterns and Breast Cancer Risk
Conditional logistic regression models were used to compute odds ratios and the associated 95% confidence intervals for the association between each dietary pattern and breast cancer risk. Each identified dietary pattern was divided into tertiles based on the 33rd and 66th percentiles of its distribution among controls, and the highest tertile was compared with the lowest. The association with breast cancer risk was also estimated per one standard deviation increase in each dietary pattern (continuous variable). Analysis was stratified by hormonal breast cancer receptor subtype, menopausal status (pre vs. post) and obesity (BMI < 30 kg/m² vs. BMI ≥ 30 kg/m²). For the latter two variables, unconditional logistic regression was used.
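The tertile assignment can be sketched as follows: cut-points are taken from the controls only and then applied to all participants. The scores are simulated, the function name is hypothetical, and assigning values that fall exactly on a cut-point to the lower tertile is an assumption of this example.

```python
import numpy as np


def tertiles_from_controls(scores, is_control):
    """Assign tertiles (1 = lowest, 3 = highest) using control-based cut-points."""
    scores = np.asarray(scores, dtype=float)
    is_control = np.asarray(is_control, dtype=bool)
    # 33rd and 66th percentiles of the pattern score among controls only.
    cuts = np.percentile(scores[is_control], [33, 66])
    # right=True: a score equal to a cut-point falls in the lower tertile.
    return np.digitize(scores, cuts, right=True) + 1, cuts


# Simulated example: 100 controls with scores 0..99, followed by 3 cases.
control_scores = np.arange(100, dtype=float)
case_scores = np.array([-5.0, 40.0, 120.0])
all_scores = np.concatenate([control_scores, case_scores])
control_flag = np.concatenate([np.ones(100, dtype=bool), np.zeros(3, dtype=bool)])

tertile, cuts = tertiles_from_controls(all_scores, control_flag)
print(cuts)
print(tertile[100:])
```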
A three-stage sequential model was used to obtain odds ratios and the associated 95% confidence intervals. Factors that changed the crude odds ratio by more than 10% were considered confounders. The following potential confounders were examined in the analysis: age (continuous), ethnicity (Zulu/Pedi/Swazi, Xhosa, Sotho, Tshwane, Venda, Tsonga and Ndebele), individual income (R1-R3000, R3001-R6000 and R6001+), level of education (none/primary school, high school and college/postgraduate/diploma), smoking (smokers and non-smokers), height (continuous), waist circumference (continuous), habitual physical activity/d (active and less active), age at menarche (continuous), full-term pregnancy (yes/no), age at first pregnancy (<24 vs. >24 years of age), age at menopause (<48 vs. >48 years of age), time since menopause, parity (≤3 children vs. >3 children), ever breast-feeding (yes/no), duration of exclusive breast-feeding (months), use of exogenous hormones including hormonal birth control to avoid pregnancy (oral contraceptives and injections) and hormone replacement therapy/combined hormone replacement therapy after menopause, family history of breast cancer (yes/no), alcohol consumption, HIV positivity (yes/no), misreporting of energy (under-reporting vs. over-reporting) and total energy intake in kJ (continuous). Only ethnicity, individual income per month, waist circumference, physical activity and menopausal status changed the crude estimate by more than 10% and were therefore included in model 2.
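The 10% change-in-estimate rule can be written as a short check. The sketch below assumes the change is measured relative to the crude odds ratio, which the text does not state explicitly, and the odds ratios used are hypothetical values, not results from this study.

```python
def percent_change(crude_or, adjusted_or):
    """Absolute percentage change of the adjusted OR relative to the crude OR."""
    return abs(adjusted_or - crude_or) / crude_or * 100.0


def is_confounder(crude_or, adjusted_or, threshold=10.0):
    """Flag a covariate if adjusting for it shifts the OR by more than threshold %."""
    return percent_change(crude_or, adjusted_or) > threshold


# Hypothetical odds ratios for illustration only.
print(is_confounder(0.80, 0.70))  # shift of about 12.5%
print(is_confounder(0.80, 0.78))  # shift of about 2.5%
```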
Model 3 included all adjustments made in model 2 and additional dietary factors (total energy intake per day, ever alcohol consumption and mutual adjustment for all dietary patterns) to evaluate the additional impact of dietary factors on the association with breast cancer risk in the respective food groups. A sensitivity analysis excluding HIV-positive breast cancer cases and controls did not alter the results (results not shown).

Results
Table 1 presents the distribution of selected characteristics between breast cancer cases and control participants. Ethnicity differed significantly between case and control participants, with cases including more Ndebele-speaking people and controls including more Sotho-speaking people. Breast cancer cases had a significantly lower waist circumference (93.3 cm ± 13.8 cm) compared with controls (95.8 cm ± 13.7 cm) and a lower percentage of HIV-positive women (16.5% vs. 22.6%). Considering dietary factors, the percentage of non-alcohol consumers was higher in cases (80.8%) than in controls (69.4%). Additionally, in breast cancer cases, oestrogen positivity (ER+) (75.3%) and progesterone positivity (PR+) (66.4%) were the dominant hormonal breast cancer tumour receptors, while triple-negative breast cancer accounted for 16.2% of all tumour types.
Table 1. Selected characteristics of the study participants by case-control status (means ± standard deviations for parametric data, median and 25th; 75th percentiles for nonparametric data and n (%) for categorical variables).

Table 2 presents the factor loadings of each retained dietary pattern as well as the percentage variance explained. Three components were retained based on a minimum eigenvalue of 1.0, visual inspection of the scree plot, the percentage variance explained and interpretability of the components. Each component was defined by an absolute factor loading of at least 0.21 (that is, ≤ −0.21 or ≥ 0.21). The three components explained 40.3% of the total variance in consumption. Component one, explaining 23.7% of the total variance, predominantly comprised poultry, organ-and-offal meat, mono- and polyunsaturated fats (vegetable oils and margarine), soup powders and vegetables (non-starchy and starchy vegetables) and was named the traditional pattern. Component two explained 9.2% of the total variance and comprised milk, plain yoghurt, unsweetened breakfast cereals, sorghum porridge (oats and maltabella) and fruit juice, while being negatively correlated with maize meal porridge and saturated fats. Component two was named the cereal-dairy breakfast pattern. Component three explained 7.4% of the total variance and comprised cheese, sweetened dairy products, candy/sugar, fast foods, alcoholic beverages, sugar-sweetened beverages, fruit spreads or preserved fruits (jam and canned fruit in syrup), and crackers/potato crisps and was named the processed food pattern.
Table A1 (Appendix A) presents the nutrient profiles of each dietary pattern per day (comparing the highest tertiles). The traditional dietary pattern had the lowest total energy content (median = 7356 kJ, 6070 kJ-8925 kJ), followed by the cereal-dairy breakfast pattern (median = 8234 kJ, 6544 kJ-10 931 kJ), and the processed food dietary pattern showed the highest total energy content (median = 12 325 kJ, 9589 kJ-15 418 kJ). The processed food dietary pattern had the highest content of saturated fat (median = 27.7 g, 20.3 g-37.1 g) and added sugar (median = 72.8 g, 48.3 g-106.4 g) while showing the lowest content of dietary fibre (mean = 21.7 g ± 8.7 g). The protein-to-carbohydrate-to-fat ratio of each dietary pattern is as follows (calculated as percentages, using each macronutrient's energy content (kJ/d), divided by total energy intake from total protein, carbohydrate and fat): traditional dietary pattern = 1:5.3:2.8, cereal-dairy breakfast pattern = 1:5.1:2.3 and processed food dietary pattern = 1:4.8:2.5. The processed food dietary pattern also showed the lowest micronutrient content compared with the traditional and cereal-dairy breakfast dietary patterns.
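The protein-to-carbohydrate-to-fat ratio can be reproduced from macronutrient energy values as in the following sketch. The energy inputs are hypothetical, the function name is introduced for illustration, and expressing the ratio relative to protein (protein = 1) is inferred from the 1:x:y form reported above.

```python
def macro_ratio(protein_kj, carbohydrate_kj, fat_kj):
    """Return energy shares (%) and the protein:carbohydrate:fat ratio."""
    total = protein_kj + carbohydrate_kj + fat_kj
    # Each macronutrient's share of energy from protein + carbohydrate + fat.
    shares = {
        "protein": protein_kj / total * 100.0,
        "carbohydrate": carbohydrate_kj / total * 100.0,
        "fat": fat_kj / total * 100.0,
    }
    # Ratio expressed relative to protein, rounded to one decimal place.
    ratio = (
        1.0,
        round(carbohydrate_kj / protein_kj, 1),
        round(fat_kj / protein_kj, 1),
    )
    return shares, ratio


# Hypothetical daily energy (kJ/d) from each macronutrient.
shares, ratio = macro_ratio(1200.0, 6000.0, 3000.0)
print(shares)
print(ratio)
```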

The Association between Dietary Patterns and Breast Cancer Risk
The association between the three retained dietary patterns, comparing the highest with the lowest tertiles of the respective dietary patterns, and breast cancer risk is presented in Table 3 (Table A2).

Discussion
In this black urban population of South African women, a traditional, a cereal-dairy breakfast and a processed food dietary pattern were identified, which together explained 40.3% of the total variance in the diet. After adjusting for potential confounders, the traditional dietary pattern (characterized by poultry, organ-and-offal meat, mono- and polyunsaturated fats, soup powders and vegetables) showed inverse associations with breast cancer risk overall, in postmenopausal women, in women with PR+ breast cancer and in women with a BMI < 30 kg/m². The cereal-dairy breakfast pattern (characterized by milk, plain yoghurt, unsweetened breakfast cereals, sorghum porridge and fruit juice, while being negatively correlated with maize meal porridge and saturated fats) also showed inverse associations with breast cancer risk overall, in postmenopausal women and in women with a BMI < 30 kg/m². No significant association was observed between the processed food dietary pattern (characterized by cheese, sweetened dairy products, candy/sugar, fast foods, alcoholic beverages, sugar-sweetened beverages, fruit spreads and crackers/potato crisps) and breast cancer risk.
The a posteriori approach in our study did not identify the same prudent dietary pattern that was observed in previous studies, which also used a posteriori approaches [37,38]. This is probably because our population has many constraints hindering their ability to access and afford a prudent dietary pattern. Different dietary patterns across populations and different study populations under investigation (i.e., black women from low and middle incomes compared with populations from Asia, Europe and America) may further contribute to the different prudent dietary patterns observed in our study. However, the patterns identified in our population, which most resembled the prudent patterns (traditional and cereal-dairy breakfast patterns), were also inversely associated with breast cancer risk. A subcategory analysis of both the traditional dietary pattern and the cereal-dairy breakfast dietary pattern showed inverse associations with breast cancer risk overall, in postmenopausal women, for women with PR+ breast cancer tumours and for women with a BMI < 30 kg/m².
While the amount of foods consumed differed, the traditional dietary pattern in our study contained similar food groups (poultry, vegetables and unprocessed grains) to the prudent dietary patterns identified in other studies using a posteriori approaches. However, the traditional dietary pattern in our study did not contain any fruits while also containing food groups that are not usually included in a prudent dietary pattern, such as organ and offal meat, soup powders, and mono- and polyunsaturated fats such as margarine (excluding fatty fish).
Organ and offal meat are more affordable meat options in South Africa and are often chosen over costlier lean meat cuts, especially red meat, in lower-income households [40]. Organ meat such as the liver can be a good source of protein and certain key micronutrients such as iron, which was previously associated with a reduced breast cancer risk in this population [18]. However, organ and offal meat have a higher saturated fat content compared with lean meats [24] and may therefore be considered as less healthy meats in the context of noncommunicable disease prevention. In this population, soup powders and margarine are often used in the preparation of homemade dishes such as meat stews and vegetable dishes or as a sauce eaten together with unprocessed grains. However, due to their high sodium content, these foods are generally considered less healthy foods and are both classified as ultra-processed foods, which have previously been linked to an increased risk for noncommunicable diseases such as breast cancer [41,42].
Of the three dietary patterns, the traditional dietary pattern had the lowest total energy, saturated fat and added sugar content while having the highest amounts of dietary fibre, vitamins and minerals. The traditional diet's lower energy content suggests that organ/offal meats and margarine were consumed in smaller portion sizes and less frequently. This, together with the higher amounts of fibre and micronutrients in the traditional dietary pattern, may explain why the traditional dietary pattern was inversely associated with breast cancer risk in this study.
In our study, a cereal-dairy breakfast dietary pattern was inversely associated with breast cancer risk. This may be related to the negative saturated fat loading in this dietary pattern together with the relatively high calcium content of the diet, which was the highest of all three identified dietary patterns. Limited evidence suggests a protective association between diets high in calcium and breast cancer risk; however, this evidence is inconclusive and warrants further investigation [9].
Westernized or unhealthy dietary patterns are often characterized by consumption of fast and deep-fried foods, processed meats, saturated fats, sugar-sweetened beverages, alcoholic beverages and other highly processed foods. In general, findings from previous studies investigating the association between 'Westernized' or unhealthy dietary patterns and breast cancer risk have been inconclusive. For example, a systematic review and meta-analysis, conducted in 2010 and including 17 case-control and cohort studies, did not show any significant association between the highest versus lowest categories of Western/unhealthy dietary patterns and breast cancer risk (OR = 1.09, 95% CI: 0.98-1.22, p = 0.12) [37]. However, a more recent systematic review and meta-analysis conducted in 2019 and including 34 case-control and cohort studies showed a 14% increased risk for developing breast cancer when the highest intake category of the Westernized/unhealthy dietary pattern was compared with the lowest intake category (OR = 1.14, 95%CI: 1.02-1.28, p < 0.001) [38].
In contrast, no significant association between the processed food dietary pattern and breast cancer risk was observed in our study. The results of this study are in line with a former study conducted in black women from Soweto, which investigated the association between ultra-processed food consumption (identified using the NOVA food classification system) and breast cancer risk [11,42]. In the latter study, no significant association was observed between higher ultra-processed food consumption and breast cancer risk [11]. Compared with the highest category of the traditional dietary pattern and the cereal-dairy breakfast pattern, the processed food dietary pattern had the highest total energy, total fat, saturated fat and added sugar content and the lowest fibre and micronutrient content. Although such a dietary pattern was not directly associated with breast cancer risk in our population, following a processed food dietary pattern may reduce the overall quality of the diet and may increase the risk of being obese, which is a major risk factor for many chronic diseases, and should therefore not be encouraged [9].
The strengths of this study include the fact that cases were recruited prior to any breast cancer treatment, that the questionnaires used to obtain data were validated, and that data collection was standardized and administered by trained personnel. The limitations include the relatively limited sample size of this study; the nature of a case-control study design, which is prone to differential bias among cases; and the use of a QFFQ to collect dietary data, which relies on the memory of participants and is therefore more prone to recall bias. Dietary intake and physical activity were measured over the past month, when the habitual dietary intake or physical activity of case participants could have changed due to illness; this may have contributed to random misclassification and underestimation of dietary intake. In addition, although dietary intakes were captured throughout the year (in different participants), seasonal variability of foods (not adjusted for) may have influenced the reporting of usual dietary intakes. Ideally, large-scale longitudinal studies should confirm the results of this case-control study, which was conducted in the absence of any South African cohort study.

Conclusions
The results of this study show that a traditional dietary pattern and a cereal-dairy breakfast dietary pattern, characterized by lower total energy, saturated fat and added sugar contents and higher fibre, calcium and other key micronutrient contents, may reduce the risk of developing breast cancer in this population. Food groups associated with these dietary patterns may play key roles in breast cancer prevention interventions. Following a processed food dietary pattern was not associated with breast cancer risk in our study. However, the higher total energy, saturated fat and added sugar content and lower dietary fibre and key micronutrient content of this processed food dietary pattern may increase the risk of being overweight and obese and, ultimately, breast cancer risk.
Author Contributions: I.J., formal statistical analysis and writing-original draft; C.T.-K., writing-review and editing, conceptualization, investigation and supervision of the study; M.W., writing-reviewing and editing; H.C., South African principal investigator of the SABC study, and co-responsibility for methodology and resources; M.J., South African project administration of the SABC study; R.L., formal analysis of dietary intake; I.R., head principal investigator of SABC, and writing-reviewing and editing; C.B., overseeing formal statistical analysis; S.R., supervision of the overall SABC study project, and review and editing; I.H., writing-review and editing, and overseeing formal statistical analysis. All authors have read and agreed to the published version of the manuscript.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available from the corresponding author upon request. The data are not publicly available since access to the SABC data is subject to the approval of the SABC Steering Committee.
International Agency for Research on Cancer, Tracy Lignini and Robyn Smith. We also acknowledge the contribution towards dietary data collection, coding of QFFQ's and scientific input of H.H. Vorster (posthumous) in this study.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Table A1 footnotes: CHO, carbohydrate; n/a, not applicable. * Nonparametric data presented as median (25th-75th percentiles). † Parametric data presented as mean ± SD. ‡ p-value for significance of differences in nutrient value between each dietary pattern, comparing the highest tertile of each dietary pattern (Wilcoxon signed-rank test for nonparametric data and paired t-test for parametric data). § Calculated as percentages, using each macronutrient's energy content (kJ/d), divided by total energy intake from total protein + carbohydrate + fat.